Statistics and Probability Letters 83 (2013) 543–550
Statistics and Probability Letters journal homepage: www.elsevier.com/locate/stapro
Limiting spectral distribution of normalized sample covariance matrices with p/n → 0

Junshan Xie

College of Mathematics and Information Science, Henan University, Kaifeng, 475000, PR China
Article info

Article history: Received 21 June 2012; received in revised form 21 October 2012; accepted 22 October 2012; available online 7 November 2012

MSC: primary 60F15; secondary 62H99

Abstract

We consider a type of normalized sample covariance matrix without independence in columns, and derive the limiting spectral distribution when the number of variables p and the sample size n satisfy p → ∞, n → ∞, and p/n → 0. This result is a supplement to the corresponding result for the case p/n → c ∈ (0, ∞), which was obtained by Bai and Zhou (2008). © 2012 Elsevier B.V. All rights reserved.

Keywords: Normalized sample covariance matrices; Limiting spectral distribution; Stieltjes transform
1. Introduction and main result

The limiting spectral distribution of sample covariance matrices is one of the major topics in both random matrix theory and multivariate statistics. Let $X$ be a $p \times n$ complex random matrix, let $X_k \in \mathbb{C}^p$ ($k = 1, 2, \ldots, n$) denote the $k$th column of $X$, and assume that the $X_k$ are independent and identically distributed (i.i.d.) random vectors. Then $X$ can be viewed as the sample data matrix which comes from a $p$-dimensional population with sample size $n$, and the sample covariance matrix can be defined as

$$S_n = \frac{1}{n} X X^* = \frac{1}{n} \sum_{k=1}^{n} X_k X_k^*, \tag{1.1}$$

where the symbol $X^*$ means the conjugate transpose of the matrix $X$. If we denote the eigenvalues of $S_n$ by $\lambda_1, \lambda_2, \ldots, \lambda_p$, then the empirical spectral distribution (ESD) of $S_n$ is

$$F^{S_n}(x) = \frac{1}{p} \sum_{i=1}^{p} I_{[\lambda_i, +\infty)}(x). \tag{1.2}$$

When the entries of $X$ are i.i.d. complex or real-valued random variables with zero mean and variance one, there are many results concerning the limiting spectral distribution of $S_n$. As a cornerstone in random matrix theory, Marchenko and Pastur (1967) proved that the ESD $F^{S_n}(x)$ almost surely converges to $F_c(x)$ provided that $p/n \to c \in (0, \infty)$, where $F_c(x)$ is the distribution function for the Marchenko–Pastur law with parameter $c > 0$; that is, $F_c(x)$ has density

$$f_c(x) = \begin{cases} \dfrac{\sqrt{(b - x)(x - a)}}{2\pi x c}, & \text{if } a \le x \le b, \\ 0, & \text{otherwise,} \end{cases}$$
E-mail address:
[email protected]. 0167-7152/$ – see front matter © 2012 Elsevier B.V. All rights reserved. doi:10.1016/j.spl.2012.10.014
and a point mass $1 - \frac{1}{c}$ at the origin if $c > 1$, where $a = (1 - \sqrt{c})^2$ and $b = (1 + \sqrt{c})^2$. In this case, both the sample size $n$ and the dimension $p$ tend to infinity at the same speed.

In the framework of $p/n \to c \in (0, \infty)$, there are a number of results in which the independence of the entries of $X$ is weakened. The seminal paper of Marchenko and Pastur (1967) established the limiting spectral distribution of sample covariance matrices with independent rows. Yin and Krishnaiah (1985) considered the case where the independent rows have a spherically symmetric distribution. Götze and Tikhomirov (2006) studied sample covariance matrices satisfying certain martingale-type conditions without any assumption on the independence of the entries. Aubrun (2006) obtained the Marchenko–Pastur law for matrices with independent rows distributed uniformly on the $\ell_p^n$ balls. This was generalized by Pajor and Pastur (2009) to matrices with independent rows distributed according to an arbitrary isotropic log-concave measure. Adamczak (2011) studied a class of sample covariance matrices with uncorrelated entries in which each normalized row and normalized column converges to one in probability. More recently, O'Rourke (2012) obtained the corresponding result when the entries of the sample covariance matrices are dependent and satisfy some conditional moment conditions. For the corresponding results on sample covariance matrices generated by some other types of data, one can refer to Karoui (2009), Hui and Pan (2010), Pfaffel and Schlemm (2011), and Yao (2012).

There is a remarkable work due to Bai and Zhou (2008), who established a general criterion for the Marchenko–Pastur-type equation for a wider class of sample covariance matrices without independence in columns. In their paper, they assume that the entries $X_{jk}$ ($j = 1, 2, \ldots, p$) of the complex-valued random vector $X_k$ satisfy, for all $k$, $E \bar{X}_{jk} X_{lk} = t_{lj}$ ($j, l = 1, \ldots, p$), that the non-negative definite matrix $T = (t_{jl})_{p \times p}$ has uniformly bounded spectral norm, and that the spectral distribution $F^T$ tends to a non-random probability distribution $H$. They further assume that, when $p/n \to c \in (0, \infty)$, for any non-random $p \times p$ matrix $B = (b_{jk})$ with bounded norm,

$$E |X_k^* B X_k - \operatorname{tr} BT|^2 = o(n^2), \tag{1.3}$$

where $\operatorname{tr}(\cdot)$ is the trace of a matrix. Then $F^{S_n}$ almost surely converges to a probability distribution whose Stieltjes transform $m = m(z)$, $z \in \mathbb{C}^+ := \{z \in \mathbb{C} : \operatorname{Im} z > 0\}$, satisfies

$$m = \int \frac{1}{t(1 - c - czm) - z} \, dH(t).$$
Meanwhile, the case that $p/n \to 0$ as both $n$ and $p$ tend to infinity is also an interesting one in modern statistics. We define the real-valued normalized sample covariance matrix by

$$A_{p,n} = \frac{1}{2\sqrt{np}} (XX^t - nI_p), \tag{1.4}$$

where $X^t$ is the transpose of $X$, $I_p$ is the $p \times p$ identity matrix, and the i.i.d. real-valued entries of $X$ satisfy the additional condition that $E X_{ij}^4 < \infty$. When considering the case that $n = n(p) \to \infty$ and $p/n \to 0$ as $p \to \infty$, Bai and Yin (1988) proved that the empirical distribution $F^{A_{p,n}}$ almost surely converges to the semicircle law $F_{sc}(x)$ with density

$$f_{sc}(x) = \begin{cases} \dfrac{2}{\pi} \sqrt{1 - x^2}, & \text{if } |x| \le 1, \\ 0, & \text{otherwise.} \end{cases}$$
This density is just the limiting spectral density function of a Hermitian Wigner random matrix whose diagonal entries are i.i.d. random variables and whose off-diagonal elements are also i.i.d. random variables with unit variance.

Based on the work of Bai and Zhou (2008), this paper further considers the limiting spectral properties of normalized sample covariance matrices under the setting of $n \to \infty$, $p \to \infty$, and $p/n \to 0$. A corresponding result, which is similar to that of Bai and Yin (1988), can be stated as follows.

Theorem 1.1. Suppose that the following hold.

(a) The number of variables $p$ and the sample size $n$ satisfy $p \to \infty$, $p/n \to 0$, and $p^3/n \to \infty$ as $n \to \infty$.

(b) For all $1 \le k \le n$, $E \bar{X}_{jk} X_{lk} = t_{lj}$, $j, l = 1, 2, \ldots, p$, and for any non-random matrix $B = (b_{jk})_{p \times p}$ with bounded norm, when $n \to \infty$,

$$E |X_k^* B X_k - \operatorname{tr} BT|^2 = o\!\left(\frac{p^3}{n}\right), \tag{1.5}$$

where $T = (t_{lj})_{p \times p}$.

(c) The spectral norm of the matrix $T$ is uniformly bounded and the empirical spectral distribution $F^T$ tends to a probability distribution $H$.

If we denote $R_n = \sqrt{\frac{n}{p}} \left( \frac{1}{n} XX^* - T \right)$, then the empirical spectral distribution $F^{R_n}$ almost surely converges weakly to a probability distribution $F$, whose Stieltjes transform $s(z)$, $z \in \mathbb{C}^+$, satisfies

$$s(z) = -\int \frac{dH(t)}{z + t\tilde{s}(z)}, \tag{1.6}$$

where $\tilde{s}(z)$ is the unique solution in $\mathbb{C}^+$ to

$$\tilde{s}(z) = -\int \frac{t \, dH(t)}{z + t\tilde{s}(z)}. \tag{1.7}$$
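To make the pair of equations (1.6) and (1.7) concrete, the following is a minimal numerical sketch, not part of the original argument. It assumes that $H$ is a finite mixture of point masses and solves (1.7) by plain fixed-point iteration; the function names (`solve_s_tilde`, `stieltjes_s`, `semicircle_s`) and the choice of iteration are illustrative assumptions. For $H = \delta_1$ the two equations reduce to $s^2 + zs + 1 = 0$, whose root in $\mathbb{C}^+$ is the Stieltjes transform of the semicircle law supported on $[-2, 2]$; this is the law of Bai and Yin (1988) up to the factor $1/2$ in (1.4).

```python
import cmath

def solve_s_tilde(z, atoms, weights, iters=2000, tol=1e-14):
    """Fixed-point iteration for (1.7), assuming H = sum_i weights[i] * delta_{atoms[i]}."""
    s = 1j  # any starting point in the upper half-plane
    for _ in range(iters):
        new = -sum(w * t / (z + t * s) for t, w in zip(atoms, weights))
        if abs(new - s) < tol:
            return new
        s = new
    return s

def stieltjes_s(z, atoms, weights):
    """Evaluate s(z) from (1.6) once s~(z) has been computed from (1.7)."""
    st = solve_s_tilde(z, atoms, weights)
    return -sum(w / (z + t * st) for t, w in zip(atoms, weights))

def semicircle_s(z):
    """Root of s^2 + z s + 1 = 0 with positive imaginary part (case H = delta_1)."""
    root = cmath.sqrt(z * z - 4)
    candidates = [(-z + root) / 2, (-z - root) / 2]
    return max(candidates, key=lambda s: s.imag)

if __name__ == "__main__":
    z = 0.5 + 1.0j
    print(stieltjes_s(z, [1.0], [1.0]), semicircle_s(z))
```

The iteration maps $\mathbb{C}^+$ into itself when the atoms are non-negative, which is why a starting point such as $i$ is a natural choice; other solvers (for example Newton's method) would serve equally well.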
Remark 1.1. If the random vector $X_k \in \mathbb{R}^p$ satisfies $E\{X_k X_k^t\} = I_p$, we have $T = I_p$ and

$$H(x) = \begin{cases} 1, & x \ge 1, \\ 0, & x < 1. \end{cases}$$

Then $F^{R_n}$ almost surely converges to the semicircle law, which coincides with the result of Bai and Yin (1988).

A similar topic has been studied by Pan and Gao (2010), who also obtained the limiting spectral distribution of normalized sample covariance matrices under the assumption that $p/n \to 0$. They considered the case that the random vector $X_k$ can be expressed as $X_k = T_0^{1/2} Y_k$, where $T_0$ is a symmetric non-negative definite matrix, and $Y_k$ consists of i.i.d. real-valued entries with $EY_{11} = 0$, $EY_{11}^2 = 1$, and $EY_{11}^4 < \infty$. Denoting $Y = (Y_1, \ldots, Y_n)$, define

$$\hat{S} = \sqrt{\frac{n}{p}} \left( \frac{1}{n} T_0^{1/2} Y Y^t T_0^{1/2} - T_0 \right).$$

When $p \to \infty$, $n \to \infty$, and $p/n \to 0$, they proved that $F^{\hat{S}}$ almost surely converges to the same distribution $F$ as above, whose Stieltjes transform is determined by (1.6) and (1.7). Recently, Bao (2012) considered the same problem by using the Stein method, and also obtained a similar conclusion. Although our result is similar to those appearing in Pan and Gao (2010) and Bao (2012), the normalized sample covariance matrix in this paper has a more general dependence structure. Our result can be regarded as a supplement to the main result of Bai and Zhou (2008) for the case that $p/n \to c \in (0, \infty)$.

One of the main tools to deal with the limiting spectral properties of random matrices is the Stieltjes transform. For any real-valued function $G(x)$ with bounded variation, its Stieltjes transform $s_G(z)$ is defined on $\mathbb{C}^+$ as

$$s_G(z) = \int_{-\infty}^{+\infty} \frac{1}{x - z} \, dG(x).$$
Note that the Stieltjes transform of the empirical spectral distribution of the underlying matrix $R_n$ is

$$s_{R_n}(z) = \frac{1}{p} \sum_{i=1}^{p} \frac{1}{\lambda_i(R_n) - z} = \frac{1}{p} \operatorname{tr}(R_n - zI_p)^{-1}.$$
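The trace formula above can be tried out numerically. The sketch below is an illustration under stated assumptions, not the paper's method: it takes i.i.d. symmetric $\pm 1$ entries (so that $T = I_p$), builds $R_n = \sqrt{n/p}\,(\frac{1}{n}XX^t - I_p)$ for modest $p$ and $n$, evaluates $\frac{1}{p}\operatorname{tr}(R_n - zI_p)^{-1}$ with a small Gauss-Jordan inverse, and compares it with the root in $\mathbb{C}^+$ of $s^2 + zs + 1 = 0$, which is what (1.6) and (1.7) give for $H = \delta_1$. All function names are hypothetical helpers introduced here.

```python
import cmath
import random

def inverse(A):
    """Gauss-Jordan inverse of a small square complex matrix (list of lists)."""
    m = len(A)
    M = [[complex(v) for v in row] + [complex(i == j) for j in range(m)]
         for i, row in enumerate(A)]
    for c in range(m):
        piv = max(range(c, m), key=lambda r: abs(M[r][c]))  # partial pivoting
        M[c], M[piv] = M[piv], M[c]
        d = M[c][c]
        M[c] = [v / d for v in M[c]]
        for r in range(m):
            if r != c:
                f = M[r][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [row[m:] for row in M]

def normalized_cov(p, n, rng):
    """R_n = (XX^t - n I_p) / sqrt(np) for a p x n matrix X of i.i.d. +-1 entries."""
    X = [[rng.choice((-1.0, 1.0)) for _ in range(n)] for _ in range(p)]
    scale = 1.0 / (n * p) ** 0.5
    R = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i, p):
            dot = sum(a * b for a, b in zip(X[i], X[j]))
            if i == j:
                dot -= n
            R[i][j] = R[j][i] = scale * dot
    return R

def esd_stieltjes(R, z):
    """(1/p) tr (R - z I_p)^{-1}, the Stieltjes transform of the ESD of R."""
    p = len(R)
    G = inverse([[R[i][j] - (z if i == j else 0) for j in range(p)] for i in range(p)])
    return sum(G[i][i] for i in range(p)) / p

def semicircle_s(z):
    """Root of s^2 + z s + 1 = 0 with positive imaginary part."""
    root = cmath.sqrt(z * z - 4)
    return max([(-z + root) / 2, (-z - root) / 2], key=lambda s: s.imag)

if __name__ == "__main__":
    R = normalized_cov(20, 2000, random.Random(0))
    print(esd_stieltjes(R, 2j), semicircle_s(2j))
```

With $p = 20$ and $n = 2000$ the two values should be close but not equal, since both $1/p$ and $\sqrt{p/n}$ are still visible at this size.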
The Stieltjes continuity theorem (see Geronimo and Hill, 2003 or Bai and Silverstein, 2010) reveals that the vague convergence of a sequence of probability measures is equivalent to the convergence of the sequence of their Stieltjes transforms towards the corresponding transform of the limiting measure.

The next section gives the proof of the main result. In the rest of the paper, the notation $\|\cdot\|$ means the spectral norm of a matrix or the Euclidean norm of a vector. The symbol $\xrightarrow{a.s.}$ means almost sure convergence. We also use $C$ to denote various positive universal constants which may be different from one line to the next, and sometimes may depend on the imaginary part of $z$.

2. The proof

Our proof employs the Stieltjes transform method, which has been widely used in Bai and Silverstein (2010). From the Stieltjes continuity theorem, one can get the vague convergence of $\{F^{R_n}\}$ by showing pointwise convergence of its Stieltjes transforms. We first show the convergence at fixed $z$ of the Stieltjes transforms of $F^{R_n}$, and then give a tightness result to go from vague to weak convergence.

We now turn to the actual proof. Denote

$$R_n = \sqrt{\frac{n}{p}} (S_n - T) = \sqrt{\frac{n}{p}} \left( \frac{1}{n} \sum_{k=1}^{n} X_k X_k^* - T \right), \qquad R^{-1}(z) = (R_n - zI_p)^{-1}.$$
We will finish the proof by taking the following three steps.

(I) For any fixed $z = u + iv \in \mathbb{C}^+$, $\frac{1}{p} \operatorname{tr} R^{-1}(z) - \frac{1}{p} E[\operatorname{tr} R^{-1}(z)] \xrightarrow{a.s.} 0$.

(II) For any fixed $z = u + iv \in \mathbb{C}^+$, $\frac{1}{p} E[\operatorname{tr} R^{-1}(z)] \to s(z)$, where $s(z)$ is determined by Eqs. (1.6) and (1.7).

(III) From the vague convergence to the weak convergence of $F^{R_n}$.
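For the first step, let $\mathcal{F}_k$ ($1 \le k \le n$) denote the $\sigma$-field generated by $X_1, X_2, \ldots, X_k$, let $E_k(\cdot) = E(\cdot \mid \mathcal{F}_k)$ denote the conditional expectation with respect to $\mathcal{F}_k$, and let $E_0(\cdot)$ denote the unconditional expectation. Also write

$$R_{(k)} = \sqrt{\frac{n}{p}} \left( \frac{1}{n} \sum_{j \ne k} X_j X_j^* - T \right), \qquad R_{(k)}^{-1}(z) = (R_{(k)} - zI_p)^{-1}.$$

Then, we have

$$\begin{aligned}
\frac{1}{p} \operatorname{tr} R^{-1}(z) - \frac{1}{p} E[\operatorname{tr} R^{-1}(z)]
&= \frac{1}{p} \sum_{k=1}^{n} \left( E_k[\operatorname{tr} R^{-1}(z)] - E_{k-1}[\operatorname{tr} R^{-1}(z)] \right) \\
&= \frac{1}{p} \sum_{k=1}^{n} (E_k - E_{k-1}) \left[ \operatorname{tr} R^{-1}(z) - \operatorname{tr} R_{(k)}^{-1}(z) \right] \\
&= -\frac{1}{p\sqrt{np}} \sum_{k=1}^{n} (E_k - E_{k-1}) \operatorname{tr}\left[ R^{-1}(z) X_k X_k^* R_{(k)}^{-1}(z) \right],
\end{aligned}$$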
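where the last equality used the definition of $R_{(k)}$ and a well-known resolvent identity, which states that, for non-singular square matrices $\Sigma_1$ and $\Sigma_2$, $\Sigma_1^{-1} - \Sigma_2^{-1} = -\Sigma_1^{-1} (\Sigma_1 - \Sigma_2) \Sigma_2^{-1}$. Since

$$R^{-1}(z) X_k = \frac{R_{(k)}^{-1}(z) X_k}{1 + \frac{1}{\sqrt{np}} X_k^* R_{(k)}^{-1}(z) X_k}, \tag{2.1}$$

and since $\operatorname{tr}(\Sigma_1 \Sigma_2) = \operatorname{tr}(\Sigma_2 \Sigma_1)$ for any matrices $\Sigma_1$ and $\Sigma_2$ such that both $\Sigma_1 \Sigma_2$ and $\Sigma_2 \Sigma_1$ are square, we can see that

$$\frac{1}{p} \operatorname{tr} R^{-1}(z) - \frac{1}{p} E[\operatorname{tr} R^{-1}(z)] = -\frac{1}{p} \sum_{k=1}^{n} (E_k - E_{k-1}) \frac{\frac{1}{\sqrt{np}} X_k^* [R_{(k)}^{-1}(z)]^2 X_k}{1 + \frac{1}{\sqrt{np}} X_k^* R_{(k)}^{-1}(z) X_k}.$$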
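The rank-one reduction behind this display, namely $\operatorname{tr} R^{-1}(z) - \operatorname{tr} R_{(k)}^{-1}(z) = -\,c\, X_k^* [R_{(k)}^{-1}(z)]^2 X_k \,/\, (1 + c\, X_k^* R_{(k)}^{-1}(z) X_k)$ with $c = 1/\sqrt{np}$, can be checked numerically on a small example. The sketch below is only a sanity check under assumptions made here: a random Hermitian matrix stands in for $R_{(k)}$, a random complex vector stands in for $X_k$, and the helper names are hypothetical.

```python
import random

def inverse(A):
    """Gauss-Jordan inverse of a small square complex matrix (list of lists)."""
    m = len(A)
    M = [[complex(v) for v in row] + [complex(i == j) for j in range(m)]
         for i, row in enumerate(A)]
    for c in range(m):
        piv = max(range(c, m), key=lambda r: abs(M[r][c]))  # partial pivoting
        M[c], M[piv] = M[piv], M[c]
        d = M[c][c]
        M[c] = [v / d for v in M[c]]
        for r in range(m):
            if r != c:
                f = M[r][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [row[m:] for row in M]

def resolvent(R, z):
    """(R - z I)^{-1}."""
    m = len(R)
    return inverse([[R[i][j] - (z if i == j else 0) for j in range(m)] for i in range(m)])

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def inner(x, y):
    """x^* y, with x^* the conjugate transpose."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

def rank_one_check(p=5, c=0.3, z=0.4 + 0.7j, seed=1):
    rng = random.Random(seed)
    B = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(p)] for _ in range(p)]
    # Hermitian stand-in for R_(k) and a random vector standing in for X_k.
    Rk = [[(B[i][j] + B[j][i].conjugate()) / 2 for j in range(p)] for i in range(p)]
    x = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(p)]
    # R = R_(k) + c x x^*, the rank-one update corresponding to adding column k.
    R = [[Rk[i][j] + c * x[i] * x[j].conjugate() for j in range(p)] for i in range(p)]
    G, Gk = resolvent(R, z), resolvent(Rk, z)
    lhs = trace(G) - trace(Gk)
    Gx = matvec(Gk, x)
    rhs = -c * inner(x, matvec(Gk, Gx)) / (1 + c * inner(x, Gx))
    return lhs, rhs

if __name__ == "__main__":
    print(rank_one_check())
```

Because $\operatorname{Im}(x^* R_{(k)}^{-1}(z) x) > 0$ whenever $\operatorname{Im} z > 0$, the denominator cannot vanish, which is the same fact the proof relies on later in (2.8).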
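Noting that

$$E_k \left[ \frac{\operatorname{tr} [R_{(k)}^{-1}(z)]^2 T}{1 + \frac{1}{\sqrt{np}} \operatorname{tr} R_{(k)}^{-1}(z) T} \right] = E_{k-1} \left[ \frac{\operatorname{tr} [R_{(k)}^{-1}(z)]^2 T}{1 + \frac{1}{\sqrt{np}} \operatorname{tr} R_{(k)}^{-1}(z) T} \right], \tag{2.2}$$

some calculations lead to

$$\frac{1}{p} \operatorname{tr} R^{-1}(z) - \frac{1}{p} E[\operatorname{tr} R^{-1}(z)] := -\frac{1}{p} \sum_{k=1}^{n} (E_k - E_{k-1})(\eta_1 + \eta_2), \tag{2.3}$$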
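where

$$\eta_1 = \frac{1}{1 + \frac{1}{\sqrt{np}} \operatorname{tr}[R_{(k)}^{-1}(z) T]} \left( \frac{1}{\sqrt{np}} X_k^* [R_{(k)}^{-1}(z)]^2 X_k - \frac{1}{\sqrt{np}} \operatorname{tr} [R_{(k)}^{-1}(z)]^2 T \right), \tag{2.4}$$

and

$$\eta_2 = \frac{\frac{1}{\sqrt{np}} X_k^* [R_{(k)}^{-1}(z)]^2 X_k \left( \frac{1}{\sqrt{np}} \operatorname{tr} R_{(k)}^{-1}(z) T - \frac{1}{\sqrt{np}} X_k^* R_{(k)}^{-1}(z) X_k \right)}{\left( 1 + \frac{1}{\sqrt{np}} \operatorname{tr}[R_{(k)}^{-1}(z) T] \right) \left( 1 + \frac{1}{\sqrt{np}} X_k^* R_{(k)}^{-1}(z) X_k \right)}. \tag{2.5}$$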
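On the one hand, there exists a positive constant $M$ such that

$$\left| \frac{1}{\sqrt{np}} \operatorname{tr}[R_{(k)}^{-1}(z) T] \right| \le \sqrt{\frac{p}{n}} \, \frac{M}{v},$$

so we can see that

$$\left| \frac{1}{1 + \frac{1}{\sqrt{np}} \operatorname{tr}[R_{(k)}^{-1}(z) T]} \right| \le \frac{1}{1 - \sqrt{\frac{p}{n}} \frac{M}{v}} \to 1 \quad \text{as } n \to \infty. \tag{2.6}$$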
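By the Burkholder inequality for martingale differences (see Lemma 2.11 in Bai and Silverstein, 2010) and assumption (1.5), we have

$$\begin{aligned}
E \left| \frac{1}{p} \sum_{k=1}^{n} (E_k - E_{k-1}) \eta_1 \right|^2
&\le \frac{1}{p^2} \sum_{k=1}^{n} E|\eta_1|^2 \\
&\le C \frac{1}{p^3} E \left| X_k^* [R_{(k)}^{-1}(z)]^2 X_k - \operatorname{tr} [R_{(k)}^{-1}(z)]^2 T \right|^2 \\
&= o\!\left( \frac{1}{p} \right),
\end{aligned} \tag{2.7}$$

where the last equality used the assumption that $p/n \to 0$. On the other hand, since

$$\left| \frac{\frac{1}{\sqrt{np}} X_k^* [R_{(k)}^{-1}(z)]^2 X_k}{1 + \frac{1}{\sqrt{np}} X_k^* R_{(k)}^{-1}(z) X_k} \right| \le \frac{\frac{1}{\sqrt{np}} \|X_k^* R_{(k)}^{-1}(z)\|^2}{\operatorname{Im}\left( 1 + \frac{1}{\sqrt{np}} X_k^* R_{(k)}^{-1}(z) X_k \right)} = \frac{\|X_k^* R_{(k)}^{-1}(z)\|^2}{\operatorname{Im}\left( X_k^* R_{(k)}^{-1}(z) X_k \right)} \le \frac{1}{v}, \tag{2.8}$$

using essentially the same argument as for the term $\eta_1$, we can deduce that

$$E \left| \frac{1}{p} \sum_{k=1}^{n} (E_k - E_{k-1}) \eta_2 \right|^2 \le C \frac{1}{p^3} E \left| X_k^* [R_{(k)}^{-1}(z)]^2 X_k - \operatorname{tr} [R_{(k)}^{-1}(z)]^2 T \right|^2 = o\!\left( \frac{1}{p} \right). \tag{2.9}$$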
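Combining the relations (2.7) and (2.9), we can get

$$E \left| \frac{1}{p} \operatorname{tr} R^{-1}(z) - \frac{1}{p} E[\operatorname{tr} R^{-1}(z)] \right|^2 = o\!\left( \frac{1}{p} \right),$$

which with the Borel–Cantelli Lemma can lead to

$$\frac{1}{p} \operatorname{tr} R^{-1}(z) - \frac{1}{p} E[\operatorname{tr} R^{-1}(z)] \xrightarrow{a.s.} 0,$$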
which completes the proof of step I.

We now consider the second step. For each $1 \le k \le n$, we denote

$$a_n = -\sqrt{\frac{n}{p}} + \sqrt{\frac{n}{p}} \, \frac{1}{1 + \frac{1}{\sqrt{np}} \operatorname{tr}[R_{(k)}^{-1}(z) T]}. \tag{2.10}$$
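Using the facts that

$$(R_n - zI_p) - (a_n T - zI_p) = \frac{1}{\sqrt{np}} \sum_{j=1}^{n} X_j X_j^* - \sqrt{\frac{n}{p}} \, \frac{1}{1 + \frac{1}{\sqrt{np}} \operatorname{tr}[R_{(k)}^{-1}(z) T]} \, T \tag{2.11}$$

and

$$X_j^* R^{-1}(z) = \frac{X_j^* R_{(j)}^{-1}(z)}{1 + \frac{1}{\sqrt{np}} X_j^* R_{(j)}^{-1}(z) X_j}, \tag{2.12}$$

we have

$$(a_n T - zI_p)^{-1} - (R_n - zI_p)^{-1} = \frac{1}{\sqrt{np}} \sum_{j=1}^{n} \frac{(a_n T - zI_p)^{-1} X_j X_j^* R_{(j)}^{-1}(z)}{1 + \frac{1}{\sqrt{np}} X_j^* R_{(j)}^{-1}(z) X_j} - \sqrt{\frac{n}{p}} \, \frac{1}{1 + \frac{1}{\sqrt{np}} \operatorname{tr}[R_{(k)}^{-1}(z) T]} \, (a_n T - zI_p)^{-1} T R^{-1}(z).$$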
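Multiplying by $T^l$ and taking $\frac{1}{p} \operatorname{tr}(\cdot)$ on both sides of the above, we obtain

$$\begin{aligned}
\frac{1}{p} \operatorname{tr}[(a_n T - zI_p)^{-1} T^l] - \frac{1}{p} \operatorname{tr}[(R_n - zI_p)^{-1} T^l]
&= \frac{1}{p} \sum_{j=1}^{n} \left( \frac{\frac{1}{\sqrt{np}} X_j^* R_{(j)}^{-1}(z) T^l (a_n T - zI_p)^{-1} X_j}{1 + \frac{1}{\sqrt{np}} X_j^* R_{(j)}^{-1}(z) X_j} - \frac{\frac{1}{\sqrt{np}} \operatorname{tr}[(a_n T - zI_p)^{-1} T R^{-1}(z) T^l]}{1 + \frac{1}{\sqrt{np}} \operatorname{tr} R_{(j)}^{-1}(z) T} \right) \\
&:= \frac{1}{p} \sum_{j=1}^{n} (d_{j1} + d_{j2} + d_{j3}),
\end{aligned}$$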
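where

$$d_{j1} = \frac{\frac{1}{\sqrt{np}} X_j^* R_{(j)}^{-1}(z) T^l (a_n T - zI_p)^{-1} X_j - \frac{1}{\sqrt{np}} \operatorname{tr}[R_{(j)}^{-1}(z) T^l (a_n T - zI_p)^{-1} T]}{1 + \frac{1}{\sqrt{np}} X_j^* R_{(j)}^{-1}(z) X_j},$$

$$d_{j2} = \frac{\frac{1}{\sqrt{np}} \operatorname{tr}[R_{(j)}^{-1}(z) T^l (a_n T - zI_p)^{-1} T] - \frac{1}{\sqrt{np}} \operatorname{tr}[R^{-1}(z) T^l (a_n T - zI_p)^{-1} T]}{1 + \frac{1}{\sqrt{np}} X_j^* R_{(j)}^{-1}(z) X_j},$$

$$d_{j3} = \frac{1}{\sqrt{np}} \operatorname{tr}[R^{-1}(z) T (a_n T - zI_p)^{-1} T^l] \cdot \frac{\frac{1}{\sqrt{np}} \operatorname{tr}[R_{(j)}^{-1}(z) T] - \frac{1}{\sqrt{np}} X_j^* R_{(j)}^{-1}(z) X_j}{\left( 1 + \frac{1}{\sqrt{np}} X_j^* R_{(j)}^{-1}(z) X_j \right) \left( 1 + \frac{1}{\sqrt{np}} \operatorname{tr} R_{(j)}^{-1}(z) T \right)}.$$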
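Since

$$\left| \frac{1}{z \left( 1 + \frac{1}{\sqrt{np}} X_j^* R_{(j)}^{-1}(z) X_j \right)} \right| \le \frac{1}{\operatorname{Im}\left( z + z \frac{1}{\sqrt{np}} X_j^* R_{(j)}^{-1}(z) X_j \right)} \le \frac{1}{v},$$

we can see that

$$\left| \frac{1}{1 + \frac{1}{\sqrt{np}} X_j^* R_{(j)}^{-1}(z) X_j} \right| \le \frac{|z|}{v}.$$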
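Then it follows by assumption (1.5) that

$$E \left| \frac{1}{p} \sum_{j=1}^{n} d_{j1} \right|^2 \le C \frac{n}{p^3} E \left| X_j^* R_{(j)}^{-1}(z) T^l (a_n T - zI_p)^{-1} X_j - \operatorname{tr}[R_{(j)}^{-1}(z) T^l (a_n T - zI_p)^{-1} T] \right|^2 = o(1), \tag{2.13}$$

and

$$E \left| \frac{1}{p} \sum_{j=1}^{n} d_{j3} \right|^2 \le C \frac{1}{p^2} E \left| X_j^* R_{(j)}^{-1}(z) X_j - \operatorname{tr}[R_{(j)}^{-1}(z) T] \right|^2 = o(1), \tag{2.14}$$

where the last step used the assumption that $p/n \to 0$ again. For any Hermitian matrix $A$ and each $1 \le j \le n$, it holds that

$$\left| \operatorname{tr} R^{-1}(z) A - \operatorname{tr} R_{(j)}^{-1}(z) A \right| \le \frac{\|A\|}{v}. \tag{2.15}$$

Together with the assumption that $p^3/n \to \infty$, we know that

$$\frac{1}{p} \sum_{j=1}^{n} E|d_{j2}| \le C \left( \frac{n}{p^3} \right)^{1/2} = o(1). \tag{2.16}$$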
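The rate $(n/p^3)^{1/2}$ in (2.16) comes from simple bookkeeping, which may be sketched as follows. The sketch assumes that a bound of the type (2.15) applies to the matrix $A = T^l (a_n T - zI_p)^{-1} T$ with some constant $C_0$ in place of $\|A\|$; the symbols $A$ and $C_0$ are introduced here only for this illustration.

```latex
|d_{j2}|
  \le \frac{|z|}{v}\cdot\frac{1}{\sqrt{np}}\,
      \Bigl|\operatorname{tr}\bigl[(R_{(j)}^{-1}(z)-R^{-1}(z))A\bigr]\Bigr|
  \le \frac{|z|}{v}\cdot\frac{1}{\sqrt{np}}\cdot\frac{C_0}{v},
\qquad
\frac{1}{p}\sum_{j=1}^{n} E|d_{j2}|
  \le \frac{n}{p}\cdot\frac{C_0\,|z|}{v^{2}\sqrt{np}}
  = \frac{C_0\,|z|}{v^{2}}\Bigl(\frac{n}{p^{3}}\Bigr)^{1/2}.
```

This makes visible why the condition $p^3/n \to \infty$ in Theorem 1.1(a) is exactly what is needed for this term to vanish.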
Combining (2.13), (2.14) and (2.16), we can get that, for each non-negative integer $l$, when $n \to \infty$,

$$\left| E \frac{1}{p} \operatorname{tr}[(a_n T - zI_p)^{-1} T^l] - E \frac{1}{p} \operatorname{tr}[(R_n - zI_p)^{-1} T^l] \right| \to 0. \tag{2.17}$$
By the definition of $a_n$, we can see that

$$a_n = -\frac{\frac{1}{p} \operatorname{tr}[R_{(k)}^{-1}(z) T]}{1 + \frac{1}{\sqrt{np}} \operatorname{tr}[R_{(k)}^{-1}(z) T]}.$$
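For completeness, the algebra linking this expression to the definition (2.10) is short. Writing $\tau = \frac{1}{\sqrt{np}} \operatorname{tr}[R_{(k)}^{-1}(z) T]$ (a shorthand introduced here only for this computation), one has:

```latex
a_n = -\sqrt{\frac{n}{p}} + \sqrt{\frac{n}{p}}\,\frac{1}{1+\tau}
    = -\sqrt{\frac{n}{p}}\,\frac{\tau}{1+\tau}
    = -\frac{\frac{1}{p}\operatorname{tr}[R_{(k)}^{-1}(z)T]}
            {1+\frac{1}{\sqrt{np}}\operatorname{tr}[R_{(k)}^{-1}(z)T]},
\qquad\text{since}\quad
\sqrt{\frac{n}{p}}\cdot\frac{1}{\sqrt{np}} = \frac{1}{p}.
```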
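Together with relation (2.15), (2.6) leads to

$$\frac{1}{p} E \operatorname{tr}[(a_n T - zI_p)^{-1} T^l] - \frac{1}{p} E \operatorname{tr}\left[ \left( -\frac{1}{p} \operatorname{tr}\left( R^{-1}(z) T \right) T - zI_p \right)^{-1} T^l \right] \to 0. \tag{2.18}$$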
Let $E[\frac{1}{p} \operatorname{tr} R^{-1}(z)] = s_p(z)$ and $E[\frac{1}{p} \operatorname{tr} R^{-1}(z) T] = \tilde{s}_p(z)$. From the relations (2.17) and (2.18), taking $l = 0, 1$, respectively, we can deduce that

$$s_p(z) = -\int \frac{dF^T(t)}{z + t\tilde{s}_p(z)} + o(1), \qquad \tilde{s}_p(z) = -\int \frac{t \, dF^T(t)}{z + t\tilde{s}_p(z)} + o(1).$$

By the assumptions of Theorem 1.1, we know that, for each fixed $z \in \mathbb{C}^+$, $\{\tilde{s}_p(z)\}$ is bounded. Thus, for any subsequence $\{p'\}$, there is a subsequence $\{p''\}$ of $\{p'\}$ such that $\{\tilde{s}_{p''}(z)\}$ tends to a limit, say $\tilde{s}(z) \in \mathbb{C}^+$. Using the fact that $F^T \to H$, we can see that

$$\tilde{s}(z) + \int \frac{t}{z + \tilde{s}(z) t} \, dH(t) = 0,$$

and along this subsequence,

$$\int \frac{dF^T(t)}{z + t\tilde{s}_p(z)} \to \int \frac{dH(t)}{z + t\tilde{s}(z)}.$$

Then there exists a limit $s(z)$ such that

$$s(z) + \int \frac{1}{z + \tilde{s}(z) t} \, dH(t) = 0,$$

which means that $(s(z), \tilde{s}(z))$ is the solution of Eqs. (1.6) and (1.7). When

$$H(t) = \begin{cases} 1, & t \ge 0, \\ 0, & t < 0, \end{cases}$$

the above two equations can be used to determine that $s(z) = -\frac{1}{z}$. In what follows, we will always consider the case that $H(t)$ is not a degenerate distribution at zero.

For the sake of completeness, we follow an argument similar to that of Bai and Silverstein (2010) or Pan and Gao (2010) to prove that there is a unique solution in $\mathbb{C}^+$ to the equation characterizing $\tilde{s}(z)$. To this end, denote $\tilde{s}(z) = m_1 + i m_2$ and suppose that there exist two solutions $\tilde{s}(z), \tilde{s}_0(z) \in \mathbb{C}^+$ of Eq. (1.7). Then

$$\tilde{s}(z) - \tilde{s}_0(z) = -\int \frac{t \, dH(t)}{z + t\tilde{s}(z)} + \int \frac{t \, dH(t)}{z + t\tilde{s}_0(z)} = (\tilde{s}(z) - \tilde{s}_0(z)) \int \frac{t^2 \, dH(t)}{(z + t\tilde{s}(z))(z + t\tilde{s}_0(z))}.$$
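Considering the imaginary part of both sides of (1.7) for the solution $\tilde{s}(z) = m_1 + i m_2$, and using the fact that $z = u + iv$, we get

$$m_2 = \int \frac{tv + t^2 m_2}{|z + t\tilde{s}(z)|^2} \, dH(t) > m_2 \int \frac{t^2}{|z + t\tilde{s}(z)|^2} \, dH(t),$$

which implies that

$$1 > \int \frac{t^2}{|z + t\tilde{s}(z)|^2} \, dH(t).$$

The same argument applied to the solution $\tilde{s}_0(z)$ gives the analogous bound with $\tilde{s}_0(z)$ in place of $\tilde{s}(z)$.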
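The inequality just derived can be observed numerically. The following sketch is an illustration only: it assumes a two-atom $H$ chosen here for the example, solves (1.7) by fixed-point iteration (one possible solver, not taken from the paper), and evaluates the quantity $\int t^2 / |z + t\tilde{s}(z)|^2 \, dH(t)$, which should be strictly below one. It also checks the identity $m_2 \bigl(1 - \int t^2/|z + t\tilde{s}|^2 \, dH\bigr) = v \int t/|z + t\tilde{s}|^2 \, dH$ obtained from the imaginary part of (1.7). Function names are hypothetical.

```python
def solve_s_tilde(z, atoms, weights, iters=5000, tol=1e-15):
    """Fixed-point iteration for (1.7), assuming H = sum_i weights[i] * delta_{atoms[i]}."""
    s = 1j
    for _ in range(iters):
        new = -sum(w * t / (z + t * s) for t, w in zip(atoms, weights))
        if abs(new - s) < tol:
            return new
        s = new
    return s

def contraction_factor(z, atoms, weights, s):
    """Integral of t^2 / |z + t s|^2 with respect to the discrete H."""
    return sum(w * t * t / abs(z + t * s) ** 2 for t, w in zip(atoms, weights))

def imaginary_part_identity(z, atoms, weights, s):
    """Both sides of m2 * (1 - factor) = v * int t / |z + t s|^2 dH(t)."""
    m2, v = s.imag, z.imag
    lhs = m2 * (1 - contraction_factor(z, atoms, weights, s))
    rhs = v * sum(w * t / abs(z + t * s) ** 2 for t, w in zip(atoms, weights))
    return lhs, rhs

if __name__ == "__main__":
    z = 0.3 + 0.8j
    atoms, weights = [0.5, 2.0], [0.5, 0.5]  # illustrative choice of H
    s = solve_s_tilde(z, atoms, weights)
    print(contraction_factor(z, atoms, weights, s))
    print(imaginary_part_identity(z, atoms, weights, s))
```

The gap between the factor and one shrinks as $v = \operatorname{Im} z$ decreases, which matches the role of $v > 0$ in the strict inequality.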
As $H(t)$ is not a degenerate distribution at the point of zero, we have $\int \frac{t^2}{|z + t\tilde{s}_0(z)|^2} \, dH(t) \ne 0$, and $m_2 > 0$. It follows by Hölder's inequality that

$$\left| \int \frac{t^2 \, dH(t)}{(z + t\tilde{s}(z))(z + t\tilde{s}_0(z))} \right|^2 \le \int \frac{t^2}{|z + t\tilde{s}(z)|^2} \, dH(t) \int \frac{t^2}{|z + t\tilde{s}_0(z)|^2} \, dH(t) < 1,$$
which leads to a contradiction. The contradiction implies that $\tilde{s}(z) = \tilde{s}_0(z)$, and hence Eq. (1.7) has at most one solution. The proof of step II is completed.

Finally, we prove the weak convergence of $\{F^{R_n}\}$ in the last step. By step I and step II, we obtain that, for each $z \in \mathbb{C}^+$, $\frac{1}{p} \operatorname{tr} R^{-1}(z) \xrightarrow{a.s.} s(z)$. Applying the Vitali Lemma, we conclude that, with probability one, on any compact subset of $\mathbb{C}^+$, $\frac{1}{p} \operatorname{tr} R^{-1}(z) \to s(z)$, where $s(z)$ satisfies Eqs. (1.6) and (1.7). By the Stieltjes continuity theorem, we can get that, almost surely, $F^{R_n}$ vaguely converges to the distribution function $F$. To get the weak convergence of $\{F^{R_n}\}$, we only need to prove the tightness of $\{F^{R_n}\}$. Writing $\zeta = \int x \, dF^{R_n}(x)$, we can get

$$\zeta = \frac{1}{p} \operatorname{tr} R_n = \sqrt{\frac{n}{p^3}} \left( \frac{1}{n} \sum_{k=1}^{n} X_k^* X_k - \operatorname{tr} T \right),$$

and

$$E\zeta^2 \le C \frac{1}{np^3} \sum_{k=1}^{n} E|X_k^* X_k - \operatorname{tr} T|^2 = o\!\left( \frac{1}{n} \right).$$
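As a rough illustration of how small $\zeta$ is in a simple special case, the sketch below simulates $\zeta = \sqrt{n/p^3}\,\bigl(\frac{1}{n}\sum_k X_k^* X_k - \operatorname{tr} T\bigr)$ under the assumption, made here only for the example, that the entries are i.i.d. standard Gaussian, so that $T = I_p$ and $\operatorname{tr} T = p$. In that case $\sum_k X_k^* X_k$ is chi-square with $np$ degrees of freedom and $E\zeta^2 = 2/p^2$ exactly. The function names are hypothetical, and this Gaussian case need not satisfy (1.5) for every growth rate of $n$.

```python
import math
import random

def zeta(p, n, rng):
    """zeta = sqrt(n/p^3) * ((1/n) * sum_k |X_k|^2 - p) for i.i.d. N(0,1) entries."""
    total = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n * p))
    return math.sqrt(n / p ** 3) * (total / n - p)

def mean_square_zeta(p, n, reps, seed=0):
    """Monte Carlo estimate of E[zeta^2] over `reps` independent replicates."""
    rng = random.Random(seed)
    return sum(zeta(p, n, rng) ** 2 for _ in range(reps)) / reps

if __name__ == "__main__":
    p, n = 10, 200
    print(mean_square_zeta(p, n, reps=50), 2 / p ** 2)
```

The Monte Carlo estimate should be of the same order as $2/p^2$; it is a consistency check on the scaling $\sqrt{n/p^3}$ rather than a verification of the general bound.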
Then we conclude that $\zeta$ is almost surely bounded by some positive constant, say $K$. It follows by the Markov inequality that, for any $\varepsilon > 0$, there exists a sufficiently large number $M_1$ such that

$$F^{R_n}[M_1, +\infty) \le \frac{K}{M_1} < \varepsilon, \quad \text{a.s.}$$

The Markov inequality also leads to the fact that there exists a sufficiently large number $M_2$ such that

$$F^{R_n}(-\infty, -M_2] < \varepsilon, \quad \text{a.s.},$$
which proves the almost sure tightness of $F^{R_n}$. The proof of Theorem 1.1 is then completed.

Acknowledgment

The author would like to thank the referee for many valuable comments and suggestions.

References

Adamczak, R., 2011. On the Marchenko–Pastur and circular laws for some classes of random matrices with dependent entries. Electron. J. Probab. 16, 1065–1095.
Aubrun, G., 2006. Random points in the unit ball of $\ell_p^n$. Positivity 10, 755–759.
Bai, Z.D., Silverstein, J.W., 2010. Spectral Analysis of Large Dimensional Random Matrices. In: Mathematics Monograph Series, vol. 15. Science Press, Beijing.
Bai, Z.D., Yin, Y.Q., 1988. Convergence to the semicircle law. Ann. Probab. 16, 863–875.
Bai, Z.D., Zhou, W., 2008. Large sample covariance matrices without independent structure in columns. Statist. Sinica 18, 425–442.
Bao, Z.G., 2012. Strong convergence of ESD for the generalized sample covariance matrices when p/n → 0. Statist. Probab. Lett. 82, 894–901.
Geronimo, J.S., Hill, T.P., 2003. Necessary and sufficient condition that the limit of Stieltjes transforms is a Stieltjes transform. J. Approx. Theory 121, 54–60.
Götze, F., Tikhomirov, A., 2006. Limit theorems for spectra of positive random matrices under dependence. J. Math. Sci. 39, 935–941.
Hui, J., Pan, G.M., 2010. Limiting spectral distribution for large sample covariance matrices with m-dependent elements. Commun. Stat. Theory Methods 39, 935–941.
Karoui, N.E., 2009. Concentration of measure and spectra of random matrices: with applications to correlation matrices, elliptical distributions and beyond. Ann. Appl. Probab. 19 (6), 2362–2405.
Marchenko, V., Pastur, L., 1967. Distribution of eigenvalues for some sets of random matrices. Math. USSR-Sb. 1, 457–483.
O'Rourke, S., 2012. A note on the Marchenko–Pastur law for a class of random matrices with dependent entries. Preprint. arxiv:1201.3354.
Pajor, A., Pastur, L., 2009. On the limiting empirical measure of the sum of rank one matrices with log-concave distribution. Studia Math. 195, 11–29.
Pan, G.M., Gao, J.T., 2010. Asymptotics theory for sample covariance matrix under cross-sectional dependence. Preprint. Nanyang Technological University, Singapore.
Pfaffel, O., Schlemm, E., 2011. Eigenvalue distribution of large sample covariance matrices of linear processes. Probab. Math. Statist. 31, 313–329.
Yao, J.F., 2012. A note on a Marchenko–Pastur type theorem for time series. Statist. Probab. Lett. 82, 22–28.
Yin, Y.Q., Krishnaiah, P.R., 1985. Limit theorem for eigenvalues of the sample covariance matrix when underlying distribution is isotropic. Teor. Veroyatnost. i Primenen. 30, 810–816.