Journal of Statistical Planning and Inference 142 (2012) 2241–2256
Likelihood ratio tests for covariance matrices of high-dimensional normal distributions

Dandan Jiang a,1, Tiefeng Jiang b,2,*, Fan Yang b,c

a School of Mathematics, Jilin University, Changchun 130012, China
b School of Statistics, University of Minnesota, 224 Church Street, Minneapolis, MN 55455, United States
c Boston Scientific, 1 Scimed Place, Maple Grove, MN 55311, United States
Article history: Received 27 August 2011; received in revised form 27 February 2012; accepted 28 February 2012; available online 7 March 2012.

Abstract. For a random sample of size $n$ obtained from a $p$-variate normal population, the likelihood ratio test (LRT) for the covariance matrix being equal to a given matrix is considered. By using the Selberg integral, we prove that the LRT statistic converges to a normal distribution under the assumption $p/n\to y\in(0,1]$. The result for $y=1$ is quite different from the case $y\in(0,1)$. Another test is studied: given two sets of random observations of sample sizes $n_1$ and $n_2$ from two $p$-variate normal distributions, we study the LRT for testing that the two normal distributions have equal covariance matrices. It is shown through a corollary of the Selberg integral that the LRT statistic has an asymptotic normal distribution under the assumption $p/n_1\to y_1\in(0,1]$ and $p/n_2\to y_2\in(0,1]$. The case $\max\{y_1,y_2\}=1$ is quite different from the case $\max\{y_1,y_2\}<1$. © 2012 Elsevier B.V. All rights reserved.

Keywords: High-dimensional data; testing on covariance matrices; Selberg integral; Gamma function
1. Introduction

In their pioneering work, Bai et al. (2009) studied two likelihood ratio tests (LRTs) by using random matrix theory, and derived the limiting distributions of the LRT statistics. This paper has two purposes. We first use the Selberg integral, a different method, to revisit the two problems. We then prove two theorems that cover the critical cases not studied in Bai et al. (2009). We now review the two tests and present our results.

Let $x_1,\ldots,x_n$ be i.i.d. $\mathbb{R}^p$-valued random variables with normal distribution $N_p(\mu,\Sigma)$, where $\mu\in\mathbb{R}^p$ is the mean vector and $\Sigma$ is the covariance matrix. Consider the test
$$H_0:\ \Sigma=I_p \quad\text{vs}\quad H_a:\ \Sigma\neq I_p, \eqno(1.1)$$
with $\mu$ unspecified. Any test $H_0:\Sigma=\Sigma_0$ with known non-singular $\Sigma_0$ and unspecified $\mu$ can be reduced to (1.1) by transforming the data to $y_i=\Sigma_0^{-1/2}x_i$ for $i=1,2,\ldots,n$ (then $y_1,\ldots,y_n$ are i.i.d. with distribution $N_p(\tilde\mu,I_p)$, where $\tilde\mu=\Sigma_0^{-1/2}\mu$). Recall
$$\bar x=\frac1n\sum_{i=1}^n x_i \quad\text{and}\quad S=\frac1n\sum_{i=1}^n(x_i-\bar x)(x_i-\bar x)^*. \eqno(1.2)$$
* Corresponding author.
E-mail addresses: [email protected] (D. Jiang), [email protected] (T. Jiang), [email protected] (F. Yang).
1 Supported in part by NSFC 11101181 and RFDP 20110061120005.
2 Supported in part by NSF #DMS-0449365.
doi:10.1016/j.jspi.2012.02.057
Of course $S$ is a $p\times p$ matrix. After scaling and taking the logarithm, an LRT statistic for (1.1) can be chosen in the form
$$L_n^*=\operatorname{tr}(S)-\log|S|-p=\frac1n\sum_{i=1}^p(\lambda_i-n\log\lambda_i)+p\log n-p, \eqno(1.3)$$
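As a concrete illustration (this sketch is ours, not part of the paper; the function name and sample sizes are our own choices), the statistic in (1.3) can be computed directly from an $n\times p$ data matrix:

```python
import numpy as np

def lrt_stat(x):
    """L_n* = tr(S) - log|S| - p of (1.3) for an n-by-p data matrix x."""
    n, p = x.shape
    xc = x - x.mean(axis=0)                 # center at the sample mean
    S = xc.T @ xc / n                       # MLE of the covariance (divisor n)
    return np.trace(S) - np.linalg.slogdet(S)[1] - p

rng = np.random.default_rng(0)
x = rng.standard_normal((200, 50))          # H0: N_p(mu, I_p) with n=200, p=50
stat = lrt_stat(x)
```

The second expression in (1.3), through the eigenvalues of $nS$, gives the same value and serves as an internal consistency check.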
where $\lambda_1,\ldots,\lambda_p$ are the eigenvalues of $nS$; see, for example, p. 355 of Muirhead (1982). The notation $\log$ stands for the natural logarithm $\log_e$ throughout the paper.

For fixed $p$, it is known from classical multivariate analysis that a (constant) linear transform of $nL_n^*$ converges to $\chi^2_{p(p+1)/2}$ as $n\to\infty$; see, e.g., p. 359 of Muirhead (1982). When $p$ is large, particularly when $n\to\infty$ and $p/n\to y\in(0,1)$, there are results on improving this convergence; see, e.g., Bai and Saranadasa (1996). A dimension $p$ that is large and proportional to the sample size $n$ is common in modern data. The failure of a similar LRT in the high-dimensional case (large $p$) was observed by Dempster (1958) as early as 1958. It is for this reason that Bai et al. (2009) studied the statistic $L_n^*$ in (1.3) when both $n$ and $p$ are large and proportional to each other. We now state our results.

Theorem 1. Let $x_1,\ldots,x_n$ be i.i.d. random vectors with normal distribution $N_p(\mu,\Sigma)$. Let $L_n^*$ be as in (1.3). Assume $H_0$ in (1.1) holds. If $n>p=p_n$ and $\lim_{n\to\infty}p/n=y\in(0,1]$, then $(L_n^*-\mu_n)/\sigma_n$ converges in distribution to $N(0,1)$ as $n\to\infty$, where
$$\mu_n=\Big(n-p-\frac32\Big)\log\Big(1-\frac pn\Big)+p-y \quad\text{and}\quad \sigma_n^2=-2\Big[\frac pn+\log\Big(1-\frac pn\Big)\Big].$$

A simulation study was made for the quantity $(L_n^*-\mu_n)/\sigma_n$ of Theorem 1. We chose $p/n=0.9$ in Fig. 1 with different values of $n$. The figure shows that the approximation becomes more accurate as $n$ increases. To see the convergence for the case $y=1$, we chose an extreme scenario with $p=n-4$ in Fig. 2. As $n$ increases, the convergence also appears quite decent.

Now, note that $\sigma_n^2\to-2y-2\log(1-y)$ if $p/n\to y\in(0,1)$. We immediately obtain the following corollary.

Corollary 1.1. Let $x_1,\ldots,x_n$ be i.i.d. random vectors with normal distribution $N_p(\mu,\Sigma)$. Let $L_n^*$ be as in (1.3). Assume $H_0$ in (1.1) holds. If $n>p=p_n$ and $\lim_{n\to\infty}p/n=y\in(0,1)$, then $L_n^*-\mu_n$ converges in distribution to $N(0,\sigma^2)$ as $n\to\infty$, where $\sigma^2=-2y-2\log(1-y)$ and
$$\mu_n=(n-p)\log\Big(1-\frac pn\Big)+p-y-\frac32\log(1-y).$$

Looking at Theorem 1, it is clear that $\sigma_n^2\sim-2\log(1-(p/n))$ as $p/n\to1$. We then get the following.

Corollary 1.2. Assume all the conditions of Theorem 1 hold with $y=1$. Let $r_n=(-\log(1-(p/n)))^{1/2}$. Then
$$\frac{L_n^*-p-(p-n+1.5)\,r_n^2}{\sqrt2\,r_n} \ \text{ converges in distribution to } N(0,1) \ \text{ as } n\to\infty.$$

The above result covers the critical case $y=1$, which is not treated in Bai et al. (2009). In fact, the random matrix tool of Bai and Silverstein (2004) is used to derive the results in Bai et al. (2009), and that tool fails when $y=1$. For a practical testing procedure, we would use Theorem 1 directly instead of Corollaries 1.1 and 1.2, which deal with the cases $y\in(0,1)$ and $y=1$ separately: for a real data set it can be hard to judge whether $p/n$ tends to 1 or to a number less than 1.
Fig. 1. Histograms were constructed from 10,000 simulations of the normalized likelihood ratio statistic $(L_n^*-\mu_n)/\sigma_n$ of Theorem 1 under the null hypothesis $\Sigma=I_p$ with $p/n=0.9$. The curve on top of each histogram is the standard normal density.
Fig. 2. Histograms were constructed from 10,000 simulations of the normalized likelihood ratio statistic $(L_n^*-\mu_n)/\sigma_n$ of Theorem 1 under the null hypothesis $\Sigma=I_p$ with $p=n-4$. The curve on top of each histogram is the standard normal density.
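The experiment behind Fig. 1 can be sketched as follows (a minimal re-implementation of ours, with our own choices of replication count and seed), using $\mu_n$ and $\sigma_n$ from Theorem 1:

```python
import numpy as np

def normalized_lrt_sample(n, p, reps=2000, seed=1):
    """Draw reps copies of (L_n* - mu_n)/sigma_n under H0: Sigma = I_p."""
    rng = np.random.default_rng(seed)
    rn2 = -np.log1p(-p / n)                    # r_n^2 = -log(1 - p/n)
    mu = (p - n + 1.5) * rn2 + p - p / n       # mu_n of Theorem 1 (y = p/n here)
    sigma = np.sqrt(2.0 * (rn2 - p / n))       # sigma_n^2 = -2[p/n + log(1 - p/n)]
    z = np.empty(reps)
    for k in range(reps):
        x = rng.standard_normal((n, p))
        xc = x - x.mean(axis=0)
        S = xc.T @ xc / n
        L = np.trace(S) - np.linalg.slogdet(S)[1] - p
        z[k] = (L - mu) / sigma
    return z

z = normalized_lrt_sample(n=100, p=90)         # p/n = 0.9 as in Fig. 1
```

A histogram of `z` should lie close to the standard normal curve, qualitatively matching Fig. 1.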
Now we study another likelihood ratio test. For two $p$-dimensional normal distributions $N_p(\mu_k,\Sigma_k)$, $k=1,2$, where $\Sigma_1$ and $\Sigma_2$ are non-singular and unknown, we wish to test
$$H_0:\ \Sigma_1=\Sigma_2 \quad\text{vs}\quad H_a:\ \Sigma_1\neq\Sigma_2, \eqno(1.4)$$
with $\mu_1$ and $\mu_2$ unspecified. The data are given as follows: $x_1,\ldots,x_{n_1}$ is a random sample from $N_p(\mu_1,\Sigma_1)$; $y_1,\ldots,y_{n_2}$ is a random sample from $N_p(\mu_2,\Sigma_2)$; and the two sets of random vectors are independent. The two relevant sample covariance matrices are
$$A=\frac1{n_1}\sum_{i=1}^{n_1}(x_i-\bar x)(x_i-\bar x)^* \quad\text{and}\quad B=\frac1{n_2}\sum_{i=1}^{n_2}(y_i-\bar y)(y_i-\bar y)^*, \eqno(1.5)$$
where
$$\bar x=\frac1{n_1}\sum_{i=1}^{n_1}x_i \quad\text{and}\quad \bar y=\frac1{n_2}\sum_{i=1}^{n_2}y_i. \eqno(1.6)$$
Let $N=n_1+n_2$ and $c_k=n_k/N$ for $k=1,2$. The likelihood ratio test statistic is $T_N=-2\log\Lambda_1$, where
$$\Lambda_1=\frac{|A|^{n_1/2}\,|B|^{n_2/2}}{|c_1A+c_2B|^{N/2}}. \eqno(1.7)$$
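For illustration (our own sketch, not from the paper), $T_N=-2\log\Lambda_1$ is best computed on the log scale so that the determinants in (1.7) cannot overflow:

```python
import numpy as np

def two_sample_lrt(x, y):
    """T_N = -2 log Lambda_1 of (1.7) for samples x (n1 x p) and y (n2 x p)."""
    n1, p = x.shape
    n2 = y.shape[0]
    N = n1 + n2
    ld = lambda M: np.linalg.slogdet(M)[1]           # log-determinant
    A = (x - x.mean(0)).T @ (x - x.mean(0)) / n1     # (1.5)
    B = (y - y.mean(0)).T @ (y - y.mean(0)) / n2
    c1, c2 = n1 / N, n2 / N
    return -(n1 * ld(A) + n2 * ld(B) - N * ld(c1 * A + c2 * B))

rng = np.random.default_rng(2)
x = rng.standard_normal((60, 10))
y = rng.standard_normal((80, 10))
tn = two_sample_lrt(x, y)
```

Concavity of the log-determinant gives $T_N\ge0$, with equality when $A=B$.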
See, e.g., Section 8.2 of Muirhead (1982). The second main result of this paper is as follows.

Theorem 2. Let $n_i>p$ for $i=1,2$ and let $T_N$ be as in (1.7). Assume $H_0$ in (1.4) holds. If $n_1\to\infty$, $n_2\to\infty$ and $p\to\infty$ with $p/n_i\to y_i\in(0,1]$ for $i=1,2$, then
$$\frac1{\sigma_n}\Big(\frac{T_N}{N}-\mu_n\Big) \ \text{ converges in distribution to } N(0,1),$$
where
$$\mu_n=\Big(p-N+\frac52\Big)\log\Big(1-\frac pN\Big)-\sum_{i=1}^2\frac{(p-n_i+\frac32)n_i}{N}\log\Big(1-\frac p{n_i}\Big),$$
$$\sigma_n^2=2\log\Big(1-\frac pN\Big)-2\sum_{i=1}^2\frac{n_i^2}{N^2}\log\Big(1-\frac p{n_i}\Big). \eqno(1.8)$$
We performed simulations for the statistic $(T_N/N-\mu_n)/\sigma_n$ of Theorem 2. In Fig. 3 we chose $p/n_1=p/n_2=0.9$; the picture shows that the convergence is quite robust as $n_1$, $n_2$ and $p$ increase, even though the ratio 0.9 is close to 1. To see the convergence for the case $\max\{y_1,y_2\}=1$, we chose an extreme situation with $p=n_1-4=n_2-4$ in Fig. 4. The convergence also looks good, although it is not as fast as in the case $p/n_1=p/n_2=0.9$.
Fig. 3. Histograms were constructed from 10,000 simulations of the normalized likelihood ratio statistic $(T_N/N-\mu_n)/\sigma_n$ of Theorem 2 under the null hypothesis $\Sigma_1=\Sigma_2$ with $p/n_1=p/n_2=0.9$. The curve on top of each histogram is the standard normal density.

Fig. 4. Histograms were constructed from 10,000 simulations of the normalized likelihood ratio statistic $(T_N/N-\mu_n)/\sigma_n$ of Theorem 2 under the null hypothesis $\Sigma_1=\Sigma_2$ with $p=n_1-4=n_2-4$. The curve on top of each histogram is the standard normal density.
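The Fig. 3 experiment can be sketched as follows (our own minimal re-implementation; replication count and seed are our choices, and $\mu_n$, $\sigma_n$ are taken from (1.8) as reconstructed above):

```python
import numpy as np

def theorem2_normalized(n1, n2, p, reps=1000, seed=3):
    """Draw reps copies of (T_N/N - mu_n)/sigma_n of Theorem 2 under H0."""
    rng = np.random.default_rng(seed)
    N = n1 + n2
    r2 = lambda m: -np.log1p(-p / m)             # -log(1 - p/m) > 0
    mu = -(p - N + 2.5) * r2(N) + sum(
        (p - ni + 1.5) * (ni / N) * r2(ni) for ni in (n1, n2))
    sigma = np.sqrt(2.0 * (-r2(N) + sum(
        (ni / N) ** 2 * r2(ni) for ni in (n1, n2))))
    ld = lambda M: np.linalg.slogdet(M)[1]
    z = np.empty(reps)
    for k in range(reps):
        x = rng.standard_normal((n1, p))
        y = rng.standard_normal((n2, p))
        A = (x - x.mean(0)).T @ (x - x.mean(0)) / n1
        B = (y - y.mean(0)).T @ (y - y.mean(0)) / n2
        tn_over_n = -(n1 * ld(A) + n2 * ld(B) - N * ld((n1 * A + n2 * B) / N)) / N
        z[k] = (tn_over_n - mu) / sigma
    return z

z2 = theorem2_normalized(100, 100, 90)           # p/n1 = p/n2 = 0.9 as in Fig. 3
```

A histogram of `z2` should be close to the standard normal curve, qualitatively matching Fig. 3.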
By the notation of Theorem 2, $p/N=((n_1/p)+(n_2/p))^{-1}\to y_1y_2/(y_1+y_2)$ and $n_i/N\to y_i^{-1}/(y_1^{-1}+y_2^{-1})$ for $i=1,2$. We easily get the following corollary.

Corollary 1.3. Let $n_i>p$ for $i=1,2$ and let $T_N$ be as in (1.7). Assume $H_0$ in (1.4) holds. If $n_1\to\infty$, $n_2\to\infty$ and $p\to\infty$ with $p/n_i\to y_i\in(0,1)$ for $i=1,2$, then $T_N/N-\nu_n$ converges in distribution to $N(\mu,\sigma^2)$, where
$$\mu=\tfrac12\big[5\log(1-y)-3\gamma_1\log(1-y_1)-3\gamma_2\log(1-y_2)\big],$$
$$\sigma^2=2\big[\log(1-y)-\gamma_1^2\log(1-y_1)-\gamma_2^2\log(1-y_2)\big],$$
$$\nu_n=(p-N)\log\Big(1-\frac pN\Big)-\frac{(p-n_1)n_1}{N}\log\Big(1-\frac p{n_1}\Big)-\frac{(p-n_2)n_2}{N}\log\Big(1-\frac p{n_2}\Big), \eqno(1.9)$$
with $\gamma_1=y_2(y_1+y_2)^{-1}$, $\gamma_2=y_1(y_1+y_2)^{-1}$ and $y=y_1y_2(y_1+y_2)^{-1}$.

Our method of proving the above results is quite different from that of Bai et al. (2009). The random matrix theories, developed by Bai and Silverstein (2004) for Wishart matrices and Zheng (2008) for F-matrices, are used in Bai et al. (2009). Those tools are universal in the sense that no normality assumption is needed. However, the requirements $y<1$ as in Corollary 1.1 and $\max\{y_1,y_2\}<1$ as in Corollary 1.3 are crucial. Technically, the critical cases $y=1$ and $\max\{y_1,y_2\}=1$ are more challenging. Under the normality assumption, without relying on random matrix theory in the spirit of Bai and Silverstein (2004) and Zheng (2008), we are able to use analytic tools. In fact, the Selberg integral is used in the proofs of both theorems. Through the Selberg integral, closed forms of the moment generating functions of the two likelihood ratio statistics are obtained. We then analyze these moment generating functions to derive central limit theorems for the two statistics. In particular, our results cover the cases $y\le1$ and $\max\{y_1,y_2\}\le1$. As shown in Corollary 1.2, the result for $y=1$ and the result for $y\in(0,1)$ are quite different; the same applies to the second test. We also develop a tool on the product of a series of Gamma functions (Proposition 2.1), which is powerful in analyzing the moment generating functions of the two log-likelihood ratio statistics studied in this paper.

The rest of the paper is organized as follows. In Section 2 we derive a tool to study the product of a series of Gamma functions. The proofs of the main theorems stated above are given in Section 3.
2. Auxiliary results

Proposition 2.1. Let $n>p=p_n$ and $r_n=(-\log(1-(p/n)))^{1/2}$. Assume that $p/n\to y\in(0,1]$ and $t=t_n=O(1/r_n)$ as $n\to\infty$. Then, as $n\to\infty$,
$$\log\prod_{i=n-p}^{n-1}\frac{\Gamma(\frac i2-t)}{\Gamma(\frac i2)}=pt(1+\log2)-pt\log n+r_n^2\big(t^2+(p-n+1.5)t\big)+o(1).$$

The proposition is proved through the following three lemmas.

Lemma 2.1. Let $b:=b(x)$ be a real-valued and bounded function defined on $(0,\infty)$. Then
$$\log\frac{\Gamma(x+b)}{\Gamma(x)}=b\log x+\frac{b^2-b}{2x}+O\Big(\frac1{x^2}\Big),$$
as $x\to+\infty$, where $\Gamma(x)$ is the gamma function.

Proof. Recall the Stirling formula (see, e.g., p. 368 of Gamelin, 2001, or (37) on p. 204 of Ahlfors, 1979):
$$\log\Gamma(z)=\Big(z-\frac12\Big)\log z-z+\frac12\log2\pi+\frac1{12z}+O\Big(\frac1{x^3}\Big),$$
as $x=\operatorname{Re}(z)\to+\infty$. It follows that
$$\log\frac{\Gamma(x+b)}{\Gamma(x)}=(x+b)\log(x+b)-x\log x-b-\frac12\big(\log(x+b)-\log x\big)+\frac1{12}\Big(\frac1{x+b}-\frac1x\Big)+O\Big(\frac1{x^3}\Big), \eqno(2.1)$$
as $x\to+\infty$. First, use the fact that $\log(1+t)=t-(t^2/2)+O(t^3)$ as $t\to0$ to get
$$(x+b)\log(x+b)-x\log x=(x+b)\Big(\log x+\log\Big(1+\frac bx\Big)\Big)-x\log x=(x+b)\Big(\log x+\frac bx-\frac{b^2}{2x^2}+O(x^{-3})\Big)-x\log x=b\log x+b+\frac{b^2}{2x}+O\Big(\frac1{x^2}\Big),$$
as $x\to+\infty$. Evidently,
$$\log(x+b)-\log x=\log\Big(1+\frac bx\Big)=\frac bx+O\Big(\frac1{x^2}\Big) \quad\text{and}\quad \frac1{x+b}-\frac1x=O\Big(\frac1{x^2}\Big),$$
as $x\to+\infty$. Plugging these two assertions into (2.1), we have
$$\log\frac{\Gamma(x+b)}{\Gamma(x)}=b\log x+\frac{b^2-b}{2x}+O\Big(\frac1{x^2}\Big),$$
as $x\to+\infty$. □
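Lemma 2.1 can be spot-checked numerically with the standard library's log-gamma function (our own quick check; the variable names are ours):

```python
from math import lgamma, log

def lemma21_rhs(x, b):
    """Two-term expansion b*log(x) + (b^2 - b)/(2x) from Lemma 2.1."""
    return b * log(x) + (b * b - b) / (2.0 * x)

x, b = 500.0, -0.7                     # b plays the role of -t in the later proofs
exact = lgamma(x + b) - lgamma(x)      # log Gamma(x+b)/Gamma(x)
err = abs(exact - lemma21_rhs(x, b))   # should be O(1/x^2)
```

Doubling $x$ should shrink `err` roughly by a factor of 4, consistent with the $O(x^{-2})$ remainder.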
Lemma 2.2. Let $n>p=p_n$. Assume that $\lim_{n\to\infty}p/n=y\in(0,1)$ and $\{t_n;\,n\ge1\}$ is bounded. Then, as $n\to\infty$,
$$\log\prod_{i=n-p}^{n-1}\frac{\Gamma(\frac i2-t_n)}{\Gamma(\frac i2)}=pt_n(1+\log2)-t_nn\log n+t_n(n-p)\log(n-p)-\Big(t_n^2+\frac{3t_n}2\Big)\log(1-y)+o(1). \eqno(2.2)$$

Proof. Since $p/n\to y\in(0,1)$, we have $n-p\to+\infty$ as $n\to\infty$. By Lemma 2.1, there exists an integer $C_1\ge2$ such that
$$\log\frac{\Gamma(\frac i2-t)}{\Gamma(\frac i2)}=-t\log\frac i2+\frac{t^2+t}i+\varphi(i) \quad\text{and}\quad |\varphi(i)|\le\frac{C_1}{i^2},$$
for all $i\ge n-p$ when $n$ is sufficiently large, where here and later in this proof we write $t$ for $t_n$ for short. Notice $-t\log(i/2)=t\log2-t\log i$. Then
$$\log\prod_{i=n-p}^{n-1}\frac{\Gamma(\frac i2-t)}{\Gamma(\frac i2)}=pt\log2-t\sum_{i=n-p}^{n-1}\log i+(t^2+t)\sum_{i=n-p}^{n-1}\frac1i+\sum_{i=n-p}^{n-1}\varphi(i)$$
$$=pt\log2+(t^2+t)\sum_{i=n-p}^{n-1}\frac1i-t\log\frac{n!}{(n-p)!}+t\log\frac n{n-p}+O\Big(\frac1n\Big)=pt\log2+(t^2+t)\sum_{i=n-p}^{n-1}\frac1i-t\log\frac{n!}{(n-p)!}-t\log(1-y)+o(1), \eqno(2.3)$$
since $\sum_{i=n-p}^{n-1}\varphi(i)=O(1/n)$ and $\log(n/(n-p))\to-\log(1-y)$ as $n\to\infty$. First,
$$\sum_{i=n-p}^{n-1}\frac1i\le\sum_{i=n-p}^{n-1}\int_{i-1}^{i}\frac1x\,dx=\int_{n-p-1}^{n-1}\frac1x\,dx.$$
By working on the lower bound similarly, we have
$$\log\frac n{n-p}=\int_{n-p}^{n}\frac1x\,dx\le\sum_{i=n-p}^{n-1}\frac1i\le\int_{n-p-1}^{n-1}\frac1x\,dx=\log\frac{n-1}{n-p-1}.$$
This implies, by the assumption $p/n\to y$, that
$$\sum_{i=n-p}^{n-1}\frac1i\to-\log(1-y), \eqno(2.4)$$
as $n\to\infty$. Second, by the Stirling formula (see, e.g., p. 210 of Freitag and Busam, 2005), there are $\theta_n,\theta_n'\in(0,1)$ such that
$$\log\frac{n!}{(n-p)!}=\log\frac{\sqrt{2\pi n}\,n^ne^{-n+(\theta_n/12n)}}{\sqrt{2\pi(n-p)}\,(n-p)^{n-p}e^{-n+p+(\theta_n'/12(n-p))}}=n\log n-(n-p)\log(n-p)-p+\frac12\log\frac n{n-p}+o(1)=n\log n-(n-p)\log(n-p)-p-\frac12\log(1-y)+o(1),$$
as $n\to\infty$. Joining this with (2.3) and (2.4), we arrive at
$$\log\prod_{i=n-p}^{n-1}\frac{\Gamma(\frac i2-t)}{\Gamma(\frac i2)}=pt\log2-(t^2+t)\log(1-y)-t\log(1-y)-tn\log n+t(n-p)\log(n-p)+tp+\frac t2\log(1-y)+o(1)$$
$$=pt(1+\log2)-\Big(t^2+\frac{3t}2\Big)\log(1-y)-tn\log n+t(n-p)\log(n-p)+o(1),$$
as $n\to\infty$. The proof is then completed. □
Lemma 2.3. Let $n>p=p_n$ and $r_n=(-\log(1-(p/n)))^{1/2}$. Assume that $\lim_{n\to\infty}p/n=1$ and $t=t_n=O(1/r_n)$ as $n\to\infty$. Then, as $n\to\infty$,
$$\log\prod_{i=n-p}^{n-1}\frac{\Gamma(\frac i2-t)}{\Gamma(\frac i2)}=pt(1+\log2)-pt\log n+r_n^2\big(t^2+(p-n+1.5)t\big)+o(1).$$
Proof. Obviously $\lim_{n\to\infty}r_n=+\infty$; hence $\{t_n;\,n\ge2\}$ is bounded. By Lemma 2.1, there exist integers $C_1\ge2$ and $C_2\ge2$ such that
$$\log\frac{\Gamma(\frac i2-t)}{\Gamma(\frac i2)}=-t\log\frac i2+\frac{t^2+t}i+\varphi(i) \quad\text{and}\quad |\varphi(i)|\le\frac{C_1}{i^2}, \eqno(2.5)$$
for all $i\ge C_2$. We will use (2.5) to estimate $\prod_{i=n-p}^{n-1}\Gamma(\frac i2-t)/\Gamma(\frac i2)$. However, when $n-p$ is small, say, 2 or 3 (which is possible since $p/n\to1$), the identity (2.5) cannot be applied directly to every term of the product. We next use a truncation to solve this problem, thanks to the fact that $\Gamma(\frac i2-t)/\Gamma(\frac i2)\to1$ as $n\to\infty$ for fixed $i$. Fix $M\ge C_2$. Write
$$a_i=\frac{\Gamma(\frac i2-t)}{\Gamma(\frac i2)} \ \text{ for } i\ge1 \quad\text{and}\quad g_n=\begin{cases}1 & \text{if } n-p\ge M,\\[2pt] \prod_{i=n-p}^{M-1}a_i & \text{if } n-p<M.\end{cases}$$
Then
$$\prod_{i=n-p}^{n-1}\frac{\Gamma(\frac i2-t)}{\Gamma(\frac i2)}=g_n\prod_{i=(n-p)\vee M}^{n-1}\frac{\Gamma(\frac i2-t)}{\Gamma(\frac i2)}. \eqno(2.6)$$
Easily,
$$\Big(\min_{1\le i\le M}(1\wedge a_i)\Big)^M\le g_n\le\Big(\max_{1\le i\le M}(1\vee a_i)\Big)^M,$$
for all $n\ge1$. Note that, for each $i\ge1$, $a_i\to1$ as $n\to\infty$ since $\lim_{n\to\infty}t_n=0$. Thus, since $M$ is fixed, the two bounds above go to 1 as $n\to\infty$. Consequently, $\lim_{n\to\infty}g_n=1$. This and (2.6) imply
$$\log\prod_{i=n-p}^{n-1}\frac{\Gamma(\frac i2-t)}{\Gamma(\frac i2)}=\log\prod_{i=(n-p)\vee M}^{n-1}\frac{\Gamma(\frac i2-t)}{\Gamma(\frac i2)}+o(1), \eqno(2.7)$$
as $n\to\infty$. By (2.5), as $n$ is sufficiently large,
$$\log\prod_{i=(n-p)\vee M}^{n-1}\frac{\Gamma(\frac i2-t)}{\Gamma(\frac i2)}=\sum_{i=(n-p)\vee M}^{n-1}\Big(-t\log\frac i2+\frac{t^2+t}i+\varphi(i)\Big),$$
with $|\varphi(i)|\le C_1i^{-2}$ for $i\ge C_2$. Writing $-t\log(i/2)=t\log2-t\log i$, it follows that
$$\log\prod_{i=(n-p)\vee M}^{n-1}\frac{\Gamma(\frac i2-t)}{\Gamma(\frac i2)}=\big(n-(n-p)\vee M\big)t\log2-t\sum_{i=(n-p)\vee M}^{n-1}\log i+(t^2+t)\sum_{i=(n-p)\vee M}^{n-1}\frac1i+\sum_{i=(n-p)\vee M}^{n-1}\varphi(i)=:A_n+B_n+C_n+D_n, \eqno(2.8)$$
as $n$ is sufficiently large. Now we analyze the four terms. By distinguishing the cases $n-p>M$ and $n-p\le M$, we get
$$|A_n-pt\log2|\le(t\log2)\,|n-p-M|\,I(n-p\le M)\le(M\log2)t. \eqno(2.9)$$
Now we estimate $B_n$. By the same argument as in (2.9), we get
$$\Big|\sum_{i=(n-p)\vee M}^{n-1}h(i)-\sum_{i=n-p}^{n-1}h(i)\Big|\le\sum_{i=1}^M|h(i)| \eqno(2.10)$$
for $h(x)=\log x$ or $h(x)=1/x$ on $x\in(0,\infty)$. By the Stirling formula (see, e.g., Freitag and Busam, 2005, p. 210), $n!=\sqrt{2\pi n}\,n^ne^{-n+\theta_n/12n}$ with $\theta_n\in(0,1)$ for all $n\ge1$. It follows that for some $\theta_n,\theta_n'\in(0,1)$,
$$\sum_{i=n-p}^{n-1}\log i=\log\frac{n!}{(n-p)!}+\log\frac{n-p}n=\log\frac{\sqrt{2\pi n}\,n^ne^{-n+(\theta_n/12n)}}{\sqrt{2\pi(n-p)}\,(n-p)^{n-p}e^{-n+p+(\theta_n'/12(n-p))}}+\log\frac{n-p}n=n\log n-(n-p)\log(n-p)-p+\frac12\log\frac{n-p}n+R_n,$$
with $|R_n|\le1$ as $n$ is sufficiently large. Recall $B_n=-t\sum_{i=(n-p)\vee M}^{n-1}\log i$. We know from (2.10) that
$$\Big|B_n+tn\log n-t(n-p)\log(n-p)-tp-\frac t2\log\frac n{n-p}\Big|\le Ct, \eqno(2.11)$$
where $C$ here and later stands for a constant that can be different from line to line.

Now we estimate $C_n$. Recall the identity $s_n:=\sum_{i=1}^n(1/i)=\log n+c_n$ for all $n\ge1$ with $\lim_{n\to\infty}c_n=c$, where $c\approx0.577$ is the Euler constant. Thus $|(s_n-s_{n-p})-\log(n/(n-p))|\le c_n+c_{n-p}$. Moreover,
$$\sum_{i=n-p+1}^{n}\frac1i=s_n-s_{n-p} \quad\text{and}\quad \Big|\sum_{i=n-p}^{n-1}\frac1i-\sum_{i=n-p+1}^{n}\frac1i\Big|\le1.$$
Therefore,
$$\Big|\sum_{i=n-p}^{n-1}\frac1i-\log\frac n{n-p}\Big|\le C.$$
Consequently, since $C_n=(t^2+t)\sum_{i=(n-p)\vee M}^{n-1}(1/i)$, we know from (2.10) that
$$\Big|C_n-(t^2+t)\log\frac n{n-p}\Big|\le(t^2+t)C. \eqno(2.12)$$
Finally, it is easy to see from the second fact in (2.5) that
$$|D_n|\le C_1\sum_{i=M}^{\infty}\frac1{i^2}, \eqno(2.13)$$
for all $n\ge2$. Now, recalling $t=t_n\to0$ as $n\to\infty$, we have from (2.7)-(2.9), (2.11) and (2.12) that, for fixed integer $M>0$,
$$A_n+B_n+C_n+D_n=pt\log2-tn\log n+t(n-p)\log(n-p)+tp-\frac t2\log\frac{n-p}n+(t^2+t)\log\frac n{n-p}+D_n+o(1).$$
Write $\log(n-p)=\log n+\log(1-(p/n))=\log n-r_n^2$, so that $-tn\log n+t(n-p)\log(n-p)=-pt\log n-(n-p)t\,r_n^2$, and note that $-\frac t2\log\frac{n-p}n+(t^2+t)\log\frac n{n-p}=r_n^2(t^2+\frac{3t}2)$. Then
$$A_n+B_n+C_n+D_n=\underbrace{pt(1+\log2)-pt\log n+r_n^2\Big(t^2+\Big(p-n+\frac32\Big)t\Big)}_{E_n}+D_n+o(1),$$
as $n\to\infty$. From (2.13) we have that
$$\limsup_{n\to\infty}\big|(A_n+B_n+C_n+D_n)-E_n\big|\le C_1\sum_{i=M}^{\infty}\frac1{i^2},$$
for any $M\ge C_2$. Recalling (2.7) and (2.8) and letting $M\to\infty$, we eventually obtain the desired conclusion. □

Proof of Proposition 2.1. The conclusion for the case $y=1$ follows from Lemma 2.3. If $y\in(0,1)$, then $\lim_{n\to\infty}r_n=(-\log(1-y))^{1/2}$, and hence $\{t_n;\,n\ge1\}$ is bounded. It follows that
$$pt(1+\log2)-pt\log n+r_n^2\big(t^2+(p-n+1.5)t\big)=pt(1+\log2)-pt\log n-t(p-n)\log\Big(1-\frac pn\Big)-\Big(t^2+\frac{3t}2\Big)\log\Big(1-\frac pn\Big).$$
The last term above is identical to $-(t^2+(3t/2))\log(1-y)+o(1)$ since $p/n\to y$ as $n\to\infty$. Moreover,
$$-pt\log n-t(p-n)\log\Big(1-\frac pn\Big)=-pt\log n+t(n-p)\big(\log(n-p)-\log n\big)=-nt\log n+t(n-p)\log(n-p).$$
The above assertions give
$$pt(1+\log2)-pt\log n+r_n^2\big(t^2+(p-n+1.5)t\big)=pt(1+\log2)-nt\log n+t(n-p)\log(n-p)-\Big(t^2+\frac{3t}2\Big)\log(1-y)+o(1),$$
as $n\to\infty$. This is exactly the right-hand side of (2.2). □
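Proposition 2.1 can be spot-checked numerically by evaluating both sides at finite $n$ (our own check; the tolerances below only reflect the size of the $o(1)$ remainder at these parameter values):

```python
from math import lgamma, log, log1p

def lhs(n, p, t):
    """log prod_{i=n-p}^{n-1} Gamma(i/2 - t)/Gamma(i/2), computed exactly."""
    return sum(lgamma(i / 2.0 - t) - lgamma(i / 2.0) for i in range(n - p, n))

def rhs(n, p, t):
    """The asymptotic expression of Proposition 2.1 (without the o(1) term)."""
    rn2 = -log1p(-p / n)
    return p * t * (1 + log(2.0)) - p * t * log(n) + rn2 * (t * t + (p - n + 1.5) * t)

gap_bulk = abs(lhs(2000, 1000, 0.3) - rhs(2000, 1000, 0.3))    # p/n = 1/2
gap_edge = abs(lhs(2000, 1900, 0.1) - rhs(2000, 1900, 0.1))    # p/n close to 1
```

Both gaps are small even though the individual terms are of order $10^3$, which is consistent with the $o(1)$ error in the proposition.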
3. Proof of main results

We first prove Theorem 1. To do that, we need some preparation. Assume that $x_1,\ldots,x_n$ are $\mathbb{R}^p$-valued random variables. Recall
$$S=\frac1n\sum_{i=1}^n(x_i-\bar x)(x_i-\bar x)^* \quad\text{where}\quad \bar x=\frac1n\sum_{i=1}^n x_i. \eqno(3.1)$$
The following is from Theorem 3.1.2 and Corollary 3.2.19 in Muirhead (1982).

Lemma 3.1. Assume $n>p$. Let $x_1,\ldots,x_n$ be i.i.d. $\mathbb{R}^p$-valued random variables with distribution $N_p(\mu,I_p)$. Then $nS$ and $Z^*Z$ have the same distribution, where $Z:=(z_{ij})_{(n-1)\times p}$ and the $z_{ij}$'s are i.i.d. with distribution $N(0,1)$. Further, the eigenvalues $\lambda_1,\ldots,\lambda_p$ of $nS$ have joint density function
$$f(\lambda_1,\ldots,\lambda_p)=\mathrm{Const}\cdot\prod_{1\le i<j\le p}|\lambda_i-\lambda_j|\,\prod_{i=1}^p\lambda_i^{(n-p-2)/2}\,e^{-(1/2)\sum_{i=1}^p\lambda_i},$$
for all $\lambda_1>0,\lambda_2>0,\ldots,\lambda_p>0$.

Recall the $\beta$-Laguerre ensemble:
$$f_{\beta,a}(\lambda_1,\ldots,\lambda_p)=c_L^{\beta,a}\prod_{1\le i<j\le p}|\lambda_i-\lambda_j|^{\beta}\,\prod_{i=1}^p\lambda_i^{a-q}\,e^{-(1/2)\sum_{i=1}^p\lambda_i}, \eqno(3.2)$$
for all $\lambda_1>0,\lambda_2>0,\ldots,\lambda_p>0$, where
$$c_L^{\beta,a}=2^{-pa}\prod_{j=1}^{p}\frac{\Gamma(1+\frac{\beta}2)}{\Gamma(1+\frac{\beta}2j)\,\Gamma(a-\frac{\beta}2(p-j))}, \eqno(3.3)$$
$\beta>0$, $p\ge2$, $a>(\beta/2)(p-1)$ and $q=1+(\beta/2)(p-1)$. See, e.g., Dumitriu and Edelman (2002) and Jiang for further details. It is known that $f_{\beta,a}(\lambda_1,\ldots,\lambda_p)$ is a probability density function, i.e., $\int_{[0,\infty)^p}f_{\beta,a}(\lambda_1,\ldots,\lambda_p)\,d\lambda_1\cdots d\lambda_p=1$; see (17.6.5) of Mehta (2004) (which is essentially a corollary of the Selberg integral in (3.23) below). Evidently, the density function in Lemma 3.1 corresponds to the $\beta$-Laguerre ensemble in (3.2) with
$$\beta=1,\quad a=\tfrac12(n-1) \quad\text{and}\quad q=1+\tfrac12(p-1). \eqno(3.4)$$
Lemma 3.2. Let $n>p$ and $L_n^*$ be as in (1.3). Assume $\lambda_1,\ldots,\lambda_p$ have density function $f_{\beta,a}(\lambda_1,\ldots,\lambda_p)$ as in (3.2) with $a=(\beta/2)(n-1)$ and $q=1+(\beta/2)(p-1)$. Then
$$Ee^{tL_n^*}=2^{-pt}\,e^{(\log n-1)pt}\Big(1-\frac{2t}n\Big)^{p(t-(\beta/2)(n-1))}\prod_{j=0}^{p-1}\frac{\Gamma(a-t-\frac{\beta}2j)}{\Gamma(a-\frac{\beta}2j)},$$
for any $t\in(-\frac12\beta,\ \frac12(\beta\wedge n))$.

Proof. Recall
$$L_n^*=\frac1n\sum_{j=1}^p(\lambda_j-n\log\lambda_j)+p\log n-p.$$
We then have
$$Ee^{tL_n^*}=e^{(\log n-1)pt}\int_{[0,\infty)^p}e^{(t/n)\sum_{j=1}^p\lambda_j}\prod_{j=1}^p\lambda_j^{-t}\,f_{\beta,a}(\lambda_1,\ldots,\lambda_p)\,d\lambda_1\cdots d\lambda_p$$
$$=e^{(\log n-1)pt}\,c_L^{\beta,a}\int_{[0,\infty)^p}e^{-(\frac12-\frac tn)\sum_{j=1}^p\lambda_j}\prod_{j=1}^p\lambda_j^{(a-t)-q}\prod_{1\le k<l\le p}|\lambda_k-\lambda_l|^{\beta}\,d\lambda_1\cdots d\lambda_p. \eqno(3.5)$$
For $t\in(-\frac12\beta,\frac12(\beta\wedge n))$, we know $\frac12-\frac tn>0$. Make the change of variables $\mu_j=(1-(2t/n))\lambda_j$ for $1\le j\le p$. It follows that the above is identical to
$$e^{(\log n-1)pt}\,c_L^{\beta,a}\Big(1-\frac{2t}n\Big)^{-p(a-t-q)-(\beta/2)p(p-1)-p}\int_{[0,\infty)^p}e^{-(1/2)\sum_{j=1}^p\mu_j}\prod_{j=1}^p\mu_j^{(a-t)-q}\prod_{1\le k<l\le p}|\mu_k-\mu_l|^{\beta}\,d\mu_1\cdots d\mu_p. \eqno(3.6)$$
Since $t\in(-\frac12\beta,\frac12(\beta\wedge n))$ and $n-p\ge1$, we know
$$t<\frac{\beta}2\le\frac{\beta}2(n-p)=\frac{\beta}2(n-1)-\frac{\beta}2(p-1)=a-\frac{\beta}2(p-1).$$
That is, $a-t>(\beta/2)(p-1)$. Therefore the integral in (3.6) is equal to $1/c_L^{\beta,a-t}$ by (3.2) and (3.3). It then follows from (3.5) and (3.6) that
$$Ee^{tL_n^*}=e^{(\log n-1)pt}\Big(1-\frac{2t}n\Big)^{-p(a-t-q)-(\beta/2)p(p-1)-p}\frac{c_L^{\beta,a}}{c_L^{\beta,a-t}}=e^{(\log n-1)pt}\Big(1-\frac{2t}n\Big)^{-p(a-t-q)-(\beta/2)p(p-1)-p}\,2^{-pt}\prod_{j=1}^p\frac{\Gamma(a-t-\frac{\beta}2(p-j))}{\Gamma(a-\frac{\beta}2(p-j))}.$$
Now use $a=(\beta/2)(n-1)$ and $q=1+(\beta/2)(p-1)$ to see that $p(a-t-q)+(\beta/2)p(p-1)+p=p(a-t)$, so the exponent equals $p(t-(\beta/2)(n-1))$; re-indexing the product with $j\mapsto p-j$ yields
$$Ee^{tL_n^*}=2^{-pt}\,e^{(\log n-1)pt}\Big(1-\frac{2t}n\Big)^{p(t-(\beta/2)(n-1))}\prod_{j=0}^{p-1}\frac{\Gamma(a-t-\frac{\beta}2j)}{\Gamma(a-\frac{\beta}2j)}.$$
The proof is completed. □
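Lemma 3.2 gives a closed form that can be compared with a Monte Carlo estimate of the moment generating function at small $n$, $p$ (our own check; sample sizes, seed and replication count are our choices):

```python
import numpy as np
from math import lgamma, log, exp

def mgf_exact(n, p, t):
    """E exp(t L_n*) from Lemma 3.2 with beta = 1, a = (n-1)/2 (|t| < 1/2)."""
    val = ((log(n) - 1.0) * p * t - p * t * log(2.0)
           + (p * t - p * (n - 1) / 2.0) * np.log1p(-2.0 * t / n)
           + sum(lgamma(i / 2.0 - t) - lgamma(i / 2.0) for i in range(n - p, n)))
    return exp(val)

def mgf_monte_carlo(n, p, t, reps=100000, seed=7):
    """Monte Carlo estimate of E exp(t L_n*) under H0: Sigma = I_p."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((reps, n, p))
    xc = x - x.mean(axis=1, keepdims=True)
    S = xc.transpose(0, 2, 1) @ xc / n               # reps stacked p-by-p matrices
    L = np.trace(S, axis1=1, axis2=2) - np.linalg.slogdet(S)[1] - p
    return float(np.exp(t * L).mean())
```

For example, `mgf_exact(10, 3, 0.2)` and `mgf_monte_carlo(10, 3, 0.2)` should agree up to Monte Carlo error.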
Let $\{Z,Z_n;\,n\ge1\}$ be a sequence of random variables. It is known that $Z_n$ converges to $Z$ in distribution if
$$\lim_{n\to\infty}Ee^{tZ_n}=Ee^{tZ}<\infty, \eqno(3.7)$$
for all $t\in(-t_0,t_0)$, where $t_0>0$ is a constant. See, e.g., p. 408 of Billingsley (1986).

Proof of Theorem 1. First, since $\log(1-x)<-x$ for all $0<x<1$, we know $\sigma_n^2>0$ for all $n>p\ge1$. Now, by assumption, it is easy to see
$$\lim_{n\to\infty}\sigma_n^2=\begin{cases}-2[y+\log(1-y)] & \text{if } y\in(0,1),\\ +\infty & \text{if } y=1.\end{cases} \eqno(3.8)$$
The limit is always positive. Consequently,
$$d_0:=\inf\{\sigma_n;\,n>p\ge1\}>0.$$
To finish the proof, by (3.7) it is enough to show that
$$E\exp\Big(\frac{L_n^*-\mu_n}{\sigma_n}s\Big)\to e^{s^2/2}=Ee^{sN(0,1)}, \eqno(3.9)$$
as $n\to\infty$ for all $s$ such that $|s|<d_0/2$. Fix such an $s$ and set $t=t_n=s/\sigma_n$; then $|t_n|<1/2$ for all $n>p\ge1$. In Lemma 3.2, take $\beta=1$ and $a=(n-1)/2$ as in (3.4):
$$Ee^{tL_n^*}=2^{-pt}\,e^{(\log n-1)pt}\Big(1-\frac{2t}n\Big)^{pt-(np/2)+(p/2)}\prod_{j=0}^{p-1}\frac{\Gamma(\frac{n-j-1}2-t)}{\Gamma(\frac{n-j-1}2)}.$$
Letting $i=n-j-1$, we get
$$Ee^{tL_n^*}=2^{-pt}\,e^{(\log n-1)pt}\Big(1-\frac{2t}n\Big)^{pt-(np/2)+(p/2)}\prod_{i=n-p}^{n-1}\frac{\Gamma(\frac i2-t)}{\Gamma(\frac i2)}, \eqno(3.10)$$
for $n>p$. Then
$$\log Ee^{tL_n^*}=pt(\log n-1-\log2)+p\Big(t+\frac{1-n}2\Big)\log\Big(1-\frac{2t}n\Big)+\log\prod_{i=n-p}^{n-1}\frac{\Gamma(\frac i2-t)}{\Gamma(\frac i2)}.$$
Now use the identity $\log(1-x)=-x-(x^2/2)+O(x^3)$ as $x\to0$ to obtain
$$p\Big(t+\frac{1-n}2\Big)\log\Big(1-\frac{2t}n\Big)=p\Big(t+\frac{1-n}2\Big)\Big(-\frac{2t}n-\frac{2t^2}{n^2}+O\Big(\frac1{n^3}\Big)\Big)=-\frac pnt^2+pt-yt+o(1),$$
as $n\to\infty$. Recall $r_n=(-\log(1-(p/n)))^{1/2}$. We know $t=t_n=s/\sigma_n=O(1/r_n)$ as $n\to\infty$. By Proposition 2.1,
$$\log\prod_{i=n-p}^{n-1}\frac{\Gamma(\frac i2-t)}{\Gamma(\frac i2)}=pt(1+\log2)-pt\log n+r_n^2\big(t^2+(p-n+1.5)t\big)+o(1),$$
as $n\to\infty$. Joining all the assertions from (3.10) onward, we obtain
$$\log Ee^{tL_n^*}=\Big(r_n^2-\frac pn\Big)t^2+\big[p+r_n^2(p-n+1.5)-y\big]t+o(1), \eqno(3.11)$$
as $n\to\infty$. Noticing
$$p+r_n^2(p-n+1.5)-y=\Big(n-p-\frac32\Big)\log\Big(1-\frac pn\Big)+p-y=\mu_n,$$
and, from the definition of $\sigma_n$ and the notation $t=s/\sigma_n$, that $(r_n^2-(p/n))t^2=s^2/2$, it follows from (3.11) that
$$\log E\exp\Big(\frac{L_n^*-\mu_n}{\sigma_n}s\Big)=\log Ee^{tL_n^*}-\mu_nt\to\frac{s^2}2,$$
as $n\to\infty$. This implies (3.9). The proof is completed. □
Now we start to prove Theorem 2. The following lemma says that the distribution of $\Lambda_1$ in (1.7) does not depend on the mean vectors or covariance matrices of the population distributions from which the random samples $x_i$ and $y_j$ come.

Lemma 3.3. Let $\Lambda_1$ be defined as in (1.7) with $n_1>p$ and $n_2>p$. Then, under $H_0$ in (1.4), $\Lambda_1$ and
$$\tilde\Lambda_1:=\frac{(n_1+n_2)^{(n_1+n_2)p/2}}{n_1^{n_1p/2}\,n_2^{n_2p/2}}\,|C|^{n_1/2}\,|I-C|^{n_2/2} \eqno(3.12)$$
have the same distribution, where
$$C=(U^*U+V^*V)^{-1/2}(U^*U)(U^*U+V^*V)^{-1/2}, \eqno(3.13)$$
with $U=(u_{ij})_{(n_1-1)\times p}$ and $V=(v_{ij})_{(n_2-1)\times p}$, and $\{u_{ij},v_{kl}\}$ are i.i.d. random variables with distribution $N(0,1)$.

Proof. Recall that $x_1,\ldots,x_{n_1}$ is a random sample from $N_p(\mu_1,\Sigma_1)$, $y_1,\ldots,y_{n_2}$ is a random sample from $N_p(\mu_2,\Sigma_2)$, and the two sets of random variables are independent. Under $H_0$ in (1.4), $\Sigma_1=\Sigma_2=\Sigma$ and $\Sigma$ is non-singular. Set
$$\tilde x_i=\Sigma^{-1/2}x_i \quad\text{and}\quad \tilde y_j=\Sigma^{-1/2}y_j,$$
for $1\le i\le n_1$ and $1\le j\le n_2$. Then $\{\tilde x_i;\,1\le i\le n_1\}$ are i.i.d. with distribution $N_p(\tilde\mu_1,I_p)$, where $\tilde\mu_1=\Sigma^{-1/2}\mu_1$; $\{\tilde y_j;\,1\le j\le n_2\}$ are i.i.d. with distribution $N_p(\tilde\mu_2,I_p)$, where $\tilde\mu_2=\Sigma^{-1/2}\mu_2$. Further, $\{\tilde x_i\}$ and $\{\tilde y_j\}$ are obviously independent. Similarly to (1.5) and (1.6), define
$$\tilde A=\frac1{n_1}\sum_{i=1}^{n_1}(\tilde x_i-\bar{\tilde x})(\tilde x_i-\bar{\tilde x})^* \quad\text{and}\quad \tilde B=\frac1{n_2}\sum_{i=1}^{n_2}(\tilde y_i-\bar{\tilde y})(\tilde y_i-\bar{\tilde y})^*, \eqno(3.14)$$
where
$$\bar{\tilde x}=\frac1{n_1}\sum_{i=1}^{n_1}\tilde x_i \quad\text{and}\quad \bar{\tilde y}=\frac1{n_2}\sum_{i=1}^{n_2}\tilde y_i. \eqno(3.15)$$
It is easy to check that
$$A=\Sigma^{1/2}\tilde A\,\Sigma^{1/2} \quad\text{and}\quad B=\Sigma^{1/2}\tilde B\,\Sigma^{1/2}. \eqno(3.16)$$
By Lemma 3.1,
$$n_1\tilde A\stackrel{d}{=}U^*U \quad\text{and}\quad n_2\tilde B\stackrel{d}{=}V^*V, \eqno(3.17)$$
where $U=(u_{ij})_{(n_1-1)\times p}$, $V=(v_{ij})_{(n_2-1)\times p}$, and $\{u_{ij},v_{kl};\,i,j,k,l\ge1\}$ are i.i.d. random variables with distribution $N(0,1)$. Reviewing (1.7),
$$\Lambda_1=\frac{|A|^{n_1/2}|B|^{n_2/2}}{|c_1A+c_2B|^{N/2}}=\frac{N^{Np/2}}{n_1^{n_1p/2}n_2^{n_2p/2}}\cdot\frac{|n_1A|^{n_1/2}|n_2B|^{n_2/2}}{|n_1A+n_2B|^{N/2}}=\frac{N^{Np/2}}{n_1^{n_1p/2}n_2^{n_2p/2}}\cdot\frac{|n_1\tilde A|^{n_1/2}|n_2\tilde B|^{n_2/2}}{|n_1\tilde A+n_2\tilde B|^{N/2}}, \eqno(3.18)$$
since $|n_1A|=|n_1\tilde A|\,|\Sigma|$, $|n_2B|=|n_2\tilde B|\,|\Sigma|$ and
$$|n_1A+n_2B|=|\Sigma^{1/2}(n_1\tilde A+n_2\tilde B)\Sigma^{1/2}|=|n_1\tilde A+n_2\tilde B|\,|\Sigma|$$
by (3.16); hence the factor $|\Sigma|^{(n_1+n_2)/2}$ in the numerator cancels $|\Sigma|^{N/2}$ in the denominator. Define $\tilde C=(n_1\tilde A+n_2\tilde B)^{-1/2}(n_1\tilde A)(n_1\tilde A+n_2\tilde B)^{-1/2}$. We see from the independence of $n_1\tilde A$ and $n_2\tilde B$ and the independence of $U^*U$ and $V^*V$ that
$$\tilde C\stackrel{d}{=}C, \eqno(3.19)$$
where $C$ is as in (3.13). It is obvious that
$$|\tilde C|=|n_1\tilde A|\,|n_1\tilde A+n_2\tilde B|^{-1} \quad\text{and}\quad |I-\tilde C|=|n_2\tilde B|\,|n_1\tilde A+n_2\tilde B|^{-1}.$$
Hence we have from (3.18) that
$$\Lambda_1=\frac{N^{Np/2}}{n_1^{n_1p/2}n_2^{n_2p/2}}\,|\tilde C|^{n_1/2}\,|I-\tilde C|^{n_2/2}. \eqno(3.20)$$
Finally, we get the desired conclusion from (3.19) and (3.20). □
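The reduction (3.18)-(3.20) is a deterministic matrix identity for any non-singular $\Sigma$, so it can be verified numerically on a single data set (our own sketch; dimensions, seed and the particular $\Sigma$ are our choices):

```python
import numpy as np

def logdet(M):
    return np.linalg.slogdet(M)[1]

def sqrtm_spd(M):
    """Symmetric square root of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(w)) @ V.T

def log_lambda1_direct(A, B, n1, n2):
    """log Lambda_1 computed straight from (1.7)."""
    N = n1 + n2
    return 0.5 * (n1 * logdet(A) + n2 * logdet(B)
                  - N * logdet((n1 * A + n2 * B) / N))

def log_lambda1_via_C(A, B, n1, n2, Sigma):
    """log Lambda_1 through (3.20), with C-tilde built from A-tilde, B-tilde."""
    N, p = n1 + n2, A.shape[0]
    Si = np.linalg.inv(sqrtm_spd(Sigma))
    At, Bt = Si @ A @ Si, Si @ B @ Si               # invert (3.16)
    Mi = np.linalg.inv(sqrtm_spd(n1 * At + n2 * Bt))
    C = Mi @ (n1 * At) @ Mi
    const = 0.5 * p * (N * np.log(N) - n1 * np.log(n1) - n2 * np.log(n2))
    return const + 0.5 * n1 * logdet(C) + 0.5 * n2 * logdet(np.eye(p) - C)

rng = np.random.default_rng(4)
p, n1, n2 = 4, 30, 40
Sigma = np.eye(p) + 0.3 * np.ones((p, p))           # a common covariance under H0
x = rng.standard_normal((n1, p)) @ sqrtm_spd(Sigma)
y = rng.standard_normal((n2, p)) @ sqrtm_spd(Sigma)
A = (x - x.mean(0)).T @ (x - x.mean(0)) / n1
B = (y - y.mean(0)).T @ (y - y.mean(0)) / n2
```

The two routes should agree to floating-point accuracy, illustrating why the distribution of $\Lambda_1$ is free of $\Sigma$.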
Let $\lambda_1,\ldots,\lambda_p$ be the eigenvalues of the $\beta$-Jacobi ensemble (or $\beta$-MANOVA matrix), that is, they have joint probability density function
$$f(\lambda_1,\ldots,\lambda_p)=c_J^{\beta,a_1,a_2}\prod_{1\le i<j\le p}|\lambda_i-\lambda_j|^{\beta}\,\prod_{i=1}^p\lambda_i^{a_1-q}(1-\lambda_i)^{a_2-q}, \eqno(3.21)$$
for $0\le\lambda_1,\ldots,\lambda_p\le1$, where $a_1,a_2>(\beta/2)(p-1)$ are parameters, $q=1+(\beta/2)(p-1)$, and
$$c_J^{\beta,a_1,a_2}=\prod_{j=1}^{p}\frac{\Gamma(1+\frac{\beta}2)\,\Gamma(a_1+a_2-\frac{\beta}2(p-j))}{\Gamma(1+\frac{\beta}2j)\,\Gamma(a_1-\frac{\beta}2(p-j))\,\Gamma(a_2-\frac{\beta}2(p-j))}. \eqno(3.22)$$
The fact that $f(\lambda_1,\ldots,\lambda_p)$ is a probability density function follows from the Selberg integral (see, e.g., Forrester and Warnaar, 2008; Mehta, 2004):
$$\int_{[0,1]^p}\prod_{1\le i<j\le p}|\lambda_i-\lambda_j|^{\beta}\,\prod_{i=1}^p\lambda_i^{a_1-q}(1-\lambda_i)^{a_2-q}\,d\lambda_1\cdots d\lambda_p=\frac1{c_J^{\beta,a_1,a_2}}. \eqno(3.23)$$
It is known that the eigenvalues of $C$ defined in (3.13) have density function $f(\lambda_1,\ldots,\lambda_p)$ in (3.21) with
$$\beta=1,\quad a_1=\tfrac12(n_1-1),\quad a_2=\tfrac12(n_2-1) \quad\text{and}\quad q=1+\tfrac12(p-1). \eqno(3.24)$$
See, for example, Constantine (1963) and Muirhead (1982) for this fact. Lemma 3.4. Let TN be as in (1.7). Assume n1 4 p and n2 4 p. Then EetT N ¼ C n1 ,n2 U n ðtÞ V 1,n ðtÞ1 V 2,n ðtÞ1 , for all t oð1=2Þð1ðp=ðn1 4n2 ÞÞÞ, where C n1 ,n2 ¼
n1n1 pt nn22 pt ðn1 þ n2 Þ
V 1,n ðtÞ ¼
nY 1 1
ðn1 þ n2 Þpt
,
Gð12 iÞ 1 Gð2 in1 tÞ i ¼ n1 p
U n ðtÞ ¼
and
N2 Y
Gð12 iÞ
i ¼ Np1
Gð12 iNtÞ
V 2,n ðtÞ ¼
,
nY 2 1
Gð12 iÞ : 1 Gð2 in2 tÞ i ¼ n2 p
Proof. From (1.7), etT N ¼ ðL1 Þ2t for any t 2 R. Therefore, by Lemma 3.3, 0 1 p Y n1 t n2 t n1 t tT N n2 t A @ , 9IC9 Þ ¼ C n1 ,n2 E lj ð1lj Þ Ee ¼ C n1 ,n2 Eð9C9 j¼1 1 ,a2 where l1 , . . . , lp are the eigenvalues of C in (3.13). Write caJ 1 ,a2 ¼ c1,a . By (3.22) and (3.24), J Z p Y Y a n tq lj 1 1 ð1lj Þa2 n2 tq 9li lj 9 dl1 lp EetT N ¼ C n1 ,n2 caJ 1 ,a2
½0;1p j ¼ 1
1riojrp
ð3:25Þ
D. Jiang et al. / Journal of Statistical Planning and Inference 142 (2012) 2241–2256
¼ C n1 ,n2
caJ 1 ,a2 cJa1 n1 t,a2 n2 t
2253
,
ð3:26Þ
since f ðl1 , . . . , lp Þ is a probability density function. Of course, recalling ai ¼ 12 ðni 1Þ for i¼ 1, 2 and the assumption that t o 12 ð1p=ðn1 4n2 ÞÞ, we know a1 n1 t 4 12 ðp1Þ
and
a2 n2 t 4 12ðp1Þ,
which are required in (3.21). From (3.26), we see 2 EetT N
p Y
1 2
Now, use ai ¼
Gða1 þ a2 12 ðpjÞÞ
31 2
p 6Y 7 7 ¼ C n1 ,n2 6 4 5 1 1 G ða þ a Nt ðpjÞÞ 1 2 2 j¼1 j ¼ 1 G a1 n1 t ðpjÞ 2 ¼: C n ,n U~ n ðtÞ V~ 1,n ðtÞ1 V~ 2,n ðtÞ1 : 1
a1 12
G a1 ðpjÞ
p 6Y 6 4
1 2
31
7 7 5 1 j ¼ 1 G a2 n2 t ðpjÞ 2
ð3:27Þ
2
1 2 ðni 1Þ
G a2 ðpjÞ
for i¼1, 2 again to have
1 2
ðpjÞ ¼ ðn1 p þ j1Þ;
a2 12 ðpjÞ ¼ 12 ðn2 p þ j1Þ;
a1 þ a2 12 ðpjÞ ¼ 12ðNp þ j2Þ:
Thus, by setting i ¼ Np þ j2 for j ¼ 1; 2, . . . ,p, we have U~ n ðtÞ ¼
p Y
Gða1 þ a2 12 ðpjÞÞ ¼ 1 G ða 1 þ a2 Nt 2 ðpjÞÞ j¼1
N 2 Y
Gð12 iÞ ¼ U n ðtÞ: 1 G ð 2 iNtÞ i ¼ Np1
Similarly, V~ i,n ðtÞ ¼ V i,n ðtÞ for i ¼1, 2. These combining with (3.27) yield the desired result.
&
Lemma 3.5. Let TN be as in (1.7). Assume ni 4 p and p=ni -yi 2 ð0; 1 for i¼1, 2. Recall s in (1.8). Then, 0 o sn o 1 for all n1 Z2,n2 Z2, and E expfðT N =ðNsn ÞÞtg o1 for all t 2 R as n1 and n2 are sufficiently large. 2 n
Proof. First, we claim that
s2 :¼ 2½logð1yÞg21 logð1y1 Þg22 logð1y2 Þ 4 0, 1
ð3:28Þ 1
1
for all y1 ,y2 2 ð0; 1Þ, where g1 ¼ y2 ðy1 þ y2 Þ , g2 ¼ y1 ðy1 þy2 Þ and y ¼ y1 y2 ðy1 þ y2 Þ . 00 In fact, consider hðxÞ ¼ logð1xÞ for x o1. Then, h ðxÞ ¼ ð1xÞ2 40 for x o 1. That is, h(x) is a convex function. Take 2 2 2 g3 ¼ 2y1 y2 =ðy1 þ y2 Þ . Then, g1 þ g2 þ g3 ¼ 1. Hence, by the convexity, g21 logð1y1 Þg22 logð1y2 Þ ¼ g21 logð1y1 Þg22 logð1y2 Þg3 logð10Þ ologð1ðg21 y1 þ g22 y2 þ g3 0ÞÞ ¼ logð1yÞ, where the strict inequality comes since y1 a0 and y2 a0. Now, taking yi ¼ p=ni 2 ð0; 1Þ for i¼1, 2 in (3.28), we get y2 n1 y1 n2 y y p , g2 ¼ and y ¼ 1 2 ¼ : g1 ¼ ¼ ¼ N y1 þy2 N y1 þ y2 N y1 þ y2 Evidently, n1 =N,n2 =N,p=N 2 ð0; 1Þ. Then, by (3.28), we know 0 o sn o 1 for all n1 Z 2,n2 Z 2. Second, noting that t 1 p 1 p 1 1 o t: ¼ 1, Nsn , N sn 2 n1 4n2 2 n1 4n2 to prove the second part, it suffices to show from Lemma 3.4 that p 1 N sn ¼ þ 1: lim n1 ,n2 -1 n1 4n2
ð3:29Þ
Case 1: $y_1 < 1$, $y_2 < 1$. Recall $s^2$ in (3.28). Evidently, $s_n^2 \to s^2 \in (0,\infty)$ as $n_1, n_2 \to +\infty$. Hence, (3.29) follows since $1 - (p/(n_1\wedge n_2)) \to 1 - y_1\vee y_2 > 0$.

Case 2: $\max\{y_1, y_2\} = 1$. This implies $s_n^2 \to +\infty$ as $n_1, n_2 \to \infty$ because $\log(1-(p/N)) \to \log(1-y) \in (-\infty, 0)$ and the sum of the last two terms on the right-hand side of (1.8) goes to $+\infty$. Further, the given conditions say that $n_i - 1 \ge p$, and hence, $1 - (p/n_i) \ge 1/n_i \ge 1/N$ for $i = 1,2$. Thus,

$$\left(1 - \frac{p}{n_1\wedge n_2}\right)Ns_n = \min\left\{1 - \frac{p}{n_1},\ 1 - \frac{p}{n_2}\right\}Ns_n \ge s_n \to +\infty,$$

as $n_1, n_2 \to \infty$. We get (3.29). The proof is completed. $\Box$
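The two claims of Lemma 3.5 are easy to check numerically. The sketch below assumes $s_n^2$ takes the form $2[\log(1-p/N) - (n_1/N)^2\log(1-p/n_1) - (n_2/N)^2\log(1-p/n_2)]$, the finite-sample analogue of (3.28) with $y_i = p/n_i$; the exact definition is in (1.8), which lies outside this excerpt.

```python
import math

def s_n_squared(p, n1, n2):
    # Finite-sample analogue of s^2 in (3.28), with y_i = p/n_i and y = p/N.
    # Requires p < n1 and p < n2 so that all logarithms are defined.
    N = n1 + n2
    return 2 * (math.log(1 - p / N)
                - (n1 / N) ** 2 * math.log(1 - p / n1)
                - (n2 / N) ** 2 * math.log(1 - p / n2))

# positivity, as claimed in the first part of the proof
print(s_n_squared(50, 100, 200) > 0)   # True

# Case 2 behaviour: s_n^2 grows as p/n_i approaches 1
print(s_n_squared(99, 100, 100) > s_n_squared(50, 100, 100))  # True
```

The second comparison illustrates why the critical case $\max\{y_1,y_2\} = 1$ forces $s_n^2 \to +\infty$, which is exactly what makes (3.29) hold in Case 2.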
Proof of Theorem 2. From Lemma 3.5, we assume, without loss of generality, that $E\exp\{(T_N/(Ns_n))t\} < \infty$ for all $n_1 \ge 2$, $n_2 \ge 2$ and $t \in \mathbb{R}$. Fix $t \in \mathbb{R}$. Set $t_n = t_{n_1,n_2} = t/s_n$ for $n_1, n_2 \ge 2$. From the condition $p/n_i \to y_i$ for $i = 1,2$, as $p\wedge n_1\wedge n_2 = p \to \infty$ by the assumption $n_1 > p$ and $n_2 > p$ (we will simply say "$p \to \infty$" in similar situations later), we know
that $s_n^2$ has a positive limit (possibly $+\infty$) as $p \to \infty$. It follows that $\{t_n;\ n_1, n_2 \ge 2\}$ is bounded. By Lemma 3.4,

$$\log E\exp\left\{\frac{T_N}{Ns_n}t\right\} = \log V_{1,n}\left(\frac{t_n}{N}\right) + \log V_{2,n}\left(\frac{t_n}{N}\right) + \log U_n\left(\frac{t_n}{N}\right) + \frac{pt_n}{N}(n_1\log n_1 + n_2\log n_2 - N\log N). \tag{3.30}$$
Set $\gamma_1 = y_2(y_1+y_2)^{-1}$, $\gamma_2 = y_1(y_1+y_2)^{-1}$ and $y = y_1y_2(y_1+y_2)^{-1}$. Easily,

$$\frac{n_i}{N} \to \gamma_i \in (0,1), \qquad \frac{p}{N-1} \to y \in (0,1) \qquad \text{and} \qquad 2\log\left(1-\frac{p}{N}\right) \to 2\log(1-y) \in (-\infty, 0),$$

as $p \to \infty$. Then, from (1.8) we know that

$$t_n\frac{n_i}{N} = \frac{n_i}{N}\cdot\frac{t}{s_n} = O\left(\left(-\log\left(1-\frac{p}{n_i}\right)\right)^{-1/2}\right) \qquad \text{and} \qquad t_n = O\left(\left(-\log\left(1-\frac{p}{N-1}\right)\right)^{-1/2}\right), \tag{3.31}$$

for $i = 1,2$ as $p \to \infty$. Replacing "$t$" in Proposition 2.1 with "$n_1t_n/N$", we have

$$\log V_{1,n}\left(\frac{t_n}{N}\right) = \log\prod_{i=n_1-p}^{n_1-1}\frac{\Gamma\big(\frac{i}{2}-\frac{n_1t_n}{N}\big)}{\Gamma\big(\frac{i}{2}\big)} = \frac{n_1pt_n}{N}(1+\log 2) - \frac{n_1pt_n}{N}\log n_1 + r_{n,1}^2\left(\frac{n_1^2}{N^2}t_n^2 + (p-n_1+1.5)\frac{n_1}{N}t_n\right) + o(1), \tag{3.32}$$

as $p \to \infty$, where

$$r_{n,i} := \left(-\log\left(1-\frac{p}{n_i}\right)\right)^{1/2}, \qquad i = 1,2. \tag{3.33}$$
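The Proposition 2.1-type expansion behind (3.32)-(3.35) can be spot-checked numerically. The sketch below (not from the paper; the tolerance is heuristic) compares the exact Gamma-ratio sum with the closed form $pt(1+\log 2) - pt\log n + r_n^2(t^2 + (p-n+1.5)t)$, $r_n^2 = -\log(1-p/n)$, for a generic argument $t$ in place of $n_1t_n/N$:

```python
import math

def gamma_ratio_log(n, p, t):
    # exact value of log prod_{i=n-p}^{n-1} Gamma(i/2 - t) / Gamma(i/2)
    return sum(math.lgamma(i / 2 - t) - math.lgamma(i / 2)
               for i in range(n - p, n))

def expansion(n, p, t):
    # right-hand side of the Proposition 2.1-type expansion; its error is o(1)
    r2 = -math.log(1 - p / n)  # r_n^2 as in (3.33)
    return (p * t * (1 + math.log(2)) - p * t * math.log(n)
            + r2 * (t ** 2 + (p - n + 1.5) * t))

# the discrepancy is the o(1) term; it is already small at moderate sizes
print(abs(gamma_ratio_log(400, 200, 0.1) - expansion(400, 200, 0.1)))
```

Here both sides are of order $10^2$ in absolute value, so the agreement is relative as well as absolute.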
Similarly,

$$\log V_{2,n}\left(\frac{t_n}{N}\right) = \log\prod_{i=n_2-p}^{n_2-1}\frac{\Gamma\big(\frac{i}{2}-\frac{n_2t_n}{N}\big)}{\Gamma\big(\frac{i}{2}\big)} = \frac{n_2pt_n}{N}(1+\log 2) - \frac{n_2pt_n}{N}\log n_2 + r_{n,2}^2\left(\frac{n_2^2}{N^2}t_n^2 + (p-n_2+1.5)\frac{n_2}{N}t_n\right) + o(1), \tag{3.34}$$

as $p \to \infty$. By the same argument, by using (3.31) we see

$$\log U_n\left(\frac{t_n}{N}\right) = \log\prod_{i=(N-1)-p}^{N-2}\frac{\Gamma\big(\frac{i}{2}\big)}{\Gamma\big(\frac{i}{2}-t_n\big)} = -\big[pt_n(1+\log 2) - pt_n\log(N-1) + R_n^2(t_n^2 + (p-N+2.5)t_n)\big] + o(1), \tag{3.35}$$

as $p \to \infty$, where

$$R_n := \left(-\log\left(1-\frac{p}{N-1}\right)\right)^{1/2}. \tag{3.36}$$
From (3.32) and (3.34),

$$\log V_{i,n}\left(\frac{t_n}{N}\right) + \frac{pt_n}{N}n_i\log n_i = \frac{n_ipt_n}{N}(1+\log 2) + r_{n,i}^2\left(\frac{n_i^2}{N^2}t_n^2 + (p-n_i+1.5)\frac{n_i}{N}t_n\right) + o(1)$$
$$= \frac{n_ipt_n}{N}(1+\log 2) + \frac{n_i^2r_{n,i}^2}{N^2}t_n^2 + \frac{(p-n_i+1.5)n_ir_{n,i}^2}{N}t_n + o(1), \tag{3.37}$$

as $p \to \infty$ for $i = 1,2$. Since $\{t_n\}$ is bounded, use $\log(1+x) = x + O(x^2)$ as $x \to 0$ to see

$$pt_n\log N - pt_n\log(N-1) = pt_n\log\left(1 + \frac{1}{N-1}\right) = yt_n + o(1),$$

as $p \to \infty$, where $\lim p/(N-1) = y_1y_2/(y_1+y_2) = y < 1$. Therefore, by (3.35) and the fact $N = n_1 + n_2$,

$$-\log U_n\left(\frac{t_n}{N}\right) + pt_n\log N = pt_n(1+\log 2) + yt_n + R_n^2(t_n^2 + (p-N+2.5)t_n) + o(1)$$
$$= \frac{n_1pt_n + n_2pt_n}{N}(1+\log 2) + R_n^2t_n^2 + (y + (p-N+2.5)R_n^2)t_n + o(1), \tag{3.38}$$

as $p \to \infty$. Joining (3.30) with (3.37) and (3.38), we obtain

$$\log Ee^{t_nT_N/N} = \left(\frac{n_1^2}{N^2}r_{n,1}^2 + \frac{n_2^2}{N^2}r_{n,2}^2 - R_n^2\right)t_n^2 + \rho_nt_n + o(1), \tag{3.39}$$
as $p \to \infty$, where

$$\rho_n = \frac{1}{N}\big((p-n_1+1.5)n_1r_{n,1}^2 + (p-n_2+1.5)n_2r_{n,2}^2\big) - (p-N+2.5)R_n^2 - y. \tag{3.40}$$
By using the fact $\log(1+x) = x + O(x^2)$ again, we have that

$$\log\left(\frac{N-1}{N}\cdot\frac{N-p}{N-p-1}\right) = \log\left(1-\frac{1}{N}\right) - \log\left(1-\frac{1}{N-p}\right) = \frac{p}{N(N-p)} + O\left(\frac{1}{N^2}\right),$$

as $p \to \infty$. Reviewing (3.36), we have

$$R_n^2 = -\log\left(1-\frac{p}{N-1}\right) = -\log\left(1-\frac{p}{N}\right) + \log\left(\frac{N-1}{N}\cdot\frac{N-p}{N-p-1}\right) = r_n^2 + \frac{p}{N(N-p)} + O\left(\frac{1}{N^2}\right), \tag{3.41}$$

as $p \to \infty$, where

$$r_n = \left(-\log\left(1-\frac{p}{N}\right)\right)^{1/2}.$$

In particular, since $\{t_n\}$ is bounded,

$$R_n^2t_n^2 = r_n^2t_n^2 + o(1), \tag{3.42}$$
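The expansion (3.41) is easy to check numerically (a sketch; only the size of the leading term is verified, not the $O(1/N^2)$ bound itself):

```python
import math

def R_n2(p, N):
    # R_n^2 from (3.36)
    return -math.log(1 - p / (N - 1))

def r_n2(p, N):
    # r_n^2, with r_n as defined below (3.41)
    return -math.log(1 - p / N)

p, N = 400, 1000
leading = p / (N * (N - p))  # the p/(N(N-p)) term in (3.41)
# the remainder is small, of the order 1/N^2
print(abs(R_n2(p, N) - r_n2(p, N) - leading))
```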
as $p \to \infty$. By (3.41), recalling $p/N \to y$, we get

$$(p-N+2.5)R_n^2 = (p-N+2.5)r_n^2 - \frac{p}{N} + o(1) = (p-N+2.5)r_n^2 - y + o(1),$$

as $p \to \infty$. Plug this into (3.40) to have that

$$\rho_n = \frac{1}{N}\big((p-n_1+1.5)n_1r_{n,1}^2 + (p-n_2+1.5)n_2r_{n,2}^2\big) - (p-N+2.5)r_n^2 + o(1), \tag{3.43}$$

as $p \to \infty$. Now plug the above and (3.42) into (3.39); since $\{t_n\}$ is bounded, we have

$$\log Ee^{t_nT_N/N} = \left(\frac{n_1^2}{N^2}r_{n,1}^2 + \frac{n_2^2}{N^2}r_{n,2}^2 - r_n^2\right)t_n^2 + \mu_nt_n + o(1), \tag{3.44}$$

as $p \to \infty$, with

$$\mu_n = \frac{1}{N}\big((p-n_1+1.5)n_1r_{n,1}^2 + (p-n_2+1.5)n_2r_{n,2}^2\big) - (p-N+2.5)r_n^2.$$
Using $t_n = t/s_n$ and the definition of $s_n$, we get

$$\left(\frac{n_1^2}{N^2}r_{n,1}^2 + \frac{n_2^2}{N^2}r_{n,2}^2 - r_n^2\right)t_n^2 = -\left(\frac{n_1^2}{N^2}\log\left(1-\frac{p}{n_1}\right) + \frac{n_2^2}{N^2}\log\left(1-\frac{p}{n_2}\right) - \log\left(1-\frac{p}{N}\right)\right)\frac{t^2}{s_n^2} \to \frac{t^2}{2},$$

as $p \to \infty$. This and (3.44) conclude that

$$\log E\exp\left\{\left(\frac{T_N}{N} - \mu_n\right)t_n\right\} = \log Ee^{t_nT_N/N} - \mu_nt_n \to \frac{t^2}{2},$$

as $p \to \infty$, which is equivalent to

$$E\exp\left\{\left(\frac{T_N}{N} - \mu_n\right)\frac{t}{s_n}\right\} \to e^{t^2/2} = Ee^{tN(0,1)},$$

as $p \to \infty$ for any $t \in \mathbb{R}$. The proof is completed by using (3.7). $\Box$
Acknowledgment

We thank Danning Li very much for her check of our proofs and many good suggestions. We also thank an anonymous referee for very helpful comments for revision.

References

Ahlfors, L.V., 1979. Complex Analysis, 3rd ed. McGraw-Hill, Inc.
Bai, Z., Saranadasa, H., 1996. Effect of high dimension: by an example of a two sample problem. Statistica Sinica 6, 311-329.
Bai, Z., Silverstein, J., 2004. CLT for linear spectral statistics of large dimensional sample covariance matrices. Annals of Probability 32, 553-605.
Bai, Z., Jiang, D., Yao, J., Zheng, S., 2009. Corrections to LRT on large-dimensional covariance matrix by RMT. Annals of Statistics 37 (6B), 3822-3840.
Billingsley, P., 1986. Probability and Measure, 2nd ed. Wiley Series in Probability and Mathematical Statistics. Wiley, New York.
Constantine, A., 1963. Some non-central distribution problems in multivariate analysis. Annals of Mathematical Statistics 34, 1270-1285.
Dempster, A., 1958. A high-dimensional two sample significance test. Annals of Mathematical Statistics 29, 995-1010.
Dumitriu, I., Edelman, A., 2002. Matrix models for beta-ensembles. Journal of Mathematical Physics 43 (11), 5830-5847.
Forrester, P., Warnaar, S., 2008. The importance of the Selberg integral. Bulletin of the American Mathematical Society 45 (4), 489-534.
Freitag, E., Busam, R., 2005. Complex Analysis. Springer.
Gamelin, T.W., 2001. Complex Analysis, 1st ed. Springer.
Jiang, T. Limit theorems on beta-Jacobi ensembles. arXiv:0911.2262.
Mehta, M.L., 2004. Random Matrices, 3rd ed. Pure and Applied Mathematics (Amsterdam), vol. 142. Elsevier/Academic Press, Amsterdam.
Muirhead, R.J., 1982. Aspects of Multivariate Statistical Theory. Wiley, New York.
Zheng, S., 2008. Central limit theorem for linear spectral statistics of large dimensional F-matrix. Preprint, Northeast Normal University, Changchun, China.