Statistics & Probability Letters 7 (1989) 425-430 North-Holland
April 1989
AN ADJUSTMENT FOR A TEST CONCERNING A PRINCIPAL COMPONENT SUBSPACE
James R. SCHOTT

Department of Statistics, University of Central Florida, Orlando, FL, USA

Received October 1987
Revised February 1988
Abstract: A Bartlett adjustment factor is obtained for a statistic used to test the hypothesis that a specified set of m orthonormal vectors spans the same subspace as does the set of the first m principal component vectors. The adjusted and unadjusted statistics are compared via a simulation study.

Keywords: asymptotic expansion, Bartlett adjustment factor, latent root, latent vector, principal component subspace.
1. Introduction

Suppose $y_1,\ldots,y_N$ is a random sample from a $p$-variate normal distribution with covariance matrix $\Lambda$. In some situations, particularly when $p$ is large, one may wish to reduce the number of variables under consideration from $p$ to $m$. Let $w_i$ be the normalized latent vector of $\Lambda$ corresponding to the $i$-th largest latent root $\lambda_i$, so that

$$V_m = \{u \in R^p : (w_1 w_1' + \cdots + w_m w_m')u = u\}$$

is the subspace spanned by the first $m$ principal component vectors. Then the variance of the new set of variables $v_i' = (a_1'y_i,\ldots,a_m'y_i)$ will be maximized by choosing $a_1,\ldots,a_m$ as any orthonormal basis of $V_m$.

Typically $\Lambda$ is unknown, so we cannot obtain $a_1,\ldots,a_m$. However, by using the sample covariance matrix $S$ we can obtain an estimate of $V_m$. If $x_i$ represents the normalized latent vector of $S$ corresponding to $l_i$, the $i$-th largest latent root, we can estimate $V_m$ by

$$\hat V_m = \{u \in R^p : (x_1 x_1' + \cdots + x_m x_m')u = u\}.$$

Then for any orthonormal basis $\beta_1,\ldots,\beta_m$ of $\hat V_m$ we could use the new set of variables $w_i' = (\beta_1'y_i,\ldots,\beta_m'y_i)$. Often we find that $\beta_1,\ldots,\beta_m$ closely resembles another set of orthonormal vectors $\gamma_1,\ldots,\gamma_m$ which may yield linear combinations of the original variables that are easier to use or interpret. For instance, with $m = 2$, if the subspace generated by $\gamma_1 = (2^{-1/2}, 2^{-1/2}, 0,\ldots,0)'$ and $\gamma_2 = (2^{-1/2}, -2^{-1/2}, 0,\ldots,0)'$ is not too different from $\hat V_2$, one may wish to work with the new variables $z_i' = (2^{-1/2}y_{i1} + 2^{-1/2}y_{i2},\; 2^{-1/2}y_{i1} - 2^{-1/2}y_{i2})$. Whether this is a reasonable course of action can be determined by computing the p-value of the test of the null hypothesis $H_0(m)$: $V_m = V_\gamma$, where

$$V_\gamma = \{u \in R^p : (\gamma_1\gamma_1' + \cdots + \gamma_m\gamma_m')u = u\}.$$

Tyler (1981) developed an asymptotic procedure for testing $H_0(m)$. Let $\Gamma = (\gamma_1,\ldots,\gamma_m)$, $X = (x_1,\ldots,x_m)$, $D_i = \mathrm{diag}(l_1/(l_1 - l_i)^2,\ldots,l_m/(l_m - l_i)^2)$, and $n = N - 1$. Then the test rejects $H_0(m)$ if $\mathrm{rank}\{(x_1x_1' + \cdots + x_mx_m')\Gamma\} < m$, or if $\mathrm{rank}\{(x_1x_1' + \cdots + x_mx_m')\Gamma\} = m$ and

$$T_m = n\sum_{i=m+1}^{p} l_i^{-1}\,\mathrm{tr}\{\Gamma'x_i x_i'\Gamma(\Gamma'XD_iX'\Gamma)^{-1}\} \tag{1}$$

exceeds the $1 - \alpha$ quantile of the chi-squared distribution with $m(p - m)$ degrees of freedom. Tyler (1981) has shown that, under $H_0(m)$, $\mathrm{rank}\{(x_1x_1' + \cdots + x_mx_m')\Gamma\} \to m$ in probability and $T_m \to \chi^2_{m(p-m)}$ in distribution as $n \to \infty$.
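The statistic in (1) is straightforward to compute. The following Python sketch (my illustration, not part of the paper; numpy and scipy, and the function name tyler_statistic, are assumptions of the sketch) evaluates $T_m$ and the p-value from its chi-squared limit.

```python
import numpy as np
from scipy import stats

def tyler_statistic(Y, Gamma):
    """Tyler's (1981) T_m of equation (1).

    Y     : (N, p) data matrix, rows are observations.
    Gamma : (p, m) matrix whose orthonormal columns span the hypothesised subspace.
    Returns (T_m, p-value from the chi-squared limit with m(p - m) df).
    """
    N, p = Y.shape
    m = Gamma.shape[1]
    n = N - 1
    S = np.cov(Y, rowvar=False)               # sample covariance matrix
    roots, vecs = np.linalg.eigh(S)
    roots, vecs = roots[::-1], vecs[:, ::-1]  # latent roots l_1 >= ... >= l_p
    X = vecs[:, :m]                           # x_1, ..., x_m
    T = 0.0
    for i in range(m, p):                     # i = m + 1, ..., p in the paper
        D_i = np.diag(roots[:m] / (roots[:m] - roots[i]) ** 2)
        M = Gamma.T @ X @ D_i @ X.T @ Gamma   # Gamma' X D_i X' Gamma
        u = Gamma.T @ vecs[:, i]              # Gamma' x_i
        T += u @ np.linalg.solve(M, u) / roots[i]   # tr{.} = u' M^{-1} u
    T *= n
    return T, stats.chi2.sf(T, df=m * (p - m))
```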
A slightly different problem was considered by Mallows (1961). He obtained the likelihood ratio test of the null hypothesis that $\gamma_1,\ldots,\gamma_m$ are latent vectors of $\Lambda$; there is no assertion here that these vectors correspond to the largest roots. Recently, Srivastava (1987) derived a useful expression for the asymptotic distribution of this likelihood ratio statistic. In yet another related problem, James (1977) considered tests of the null hypothesis that the subspace spanned by $\gamma_1,\ldots,\gamma_m$ is also spanned by $m$ of the principal components. Tyler's test procedure is a generalization of a test, proposed by Anderson (1963), for a single principal component. This author, in an earlier paper (Schott, 1987), has shown that a Bartlett adjustment factor greatly improves the performance of Anderson's test criterion in terms of the type I error probability. The purpose of this paper is to obtain such an adjustment factor for $T_m$ and to compare the performances of the unadjusted and adjusted statistics.
2. The adjusted statistic

2.1. Bartlett adjustment factors

Let $Q$ be the likelihood ratio statistic for testing some null hypothesis $H_0$ against $H_1$, and suppose the null distribution of $T = -2\log Q$ is asymptotically $\chi^2_d$. Bartlett (1937) used the following approach to improve the chi-squared approximation to the null distribution of $T$. Suppose that under the null hypothesis

$$E(T) = d\{1 + c/n + O(n^{-3/2})\},$$

where $c$ is a constant and $n$ is the sample size or some related quantity. Then the adjusted statistic $T^* = (1 - c/n)T$ has an expected value that comes closer to that of $\chi^2_d$ than does the expected value of $T$. In fact, Lawley (1956b) has shown that all moments of $T^*$ agree with those of $\chi^2_d$ with error $O(n^{-3/2})$. This assures us that $T^*$ is improving the chi-squared approximation.
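As a concrete illustration of this idea (a toy example of mine, not from the paper), consider the likelihood ratio test of $H_0$: $\sigma^2 = 1$ for $N(0,\sigma^2)$ data, for which $T = -2\log Q = n(s^2 - \log s^2 - 1)$ with $s^2$ the maximum likelihood estimate of the variance; a standard expansion gives $E(T) = 1 + (1/3)/n + O(n^{-2})$, so $c = 1/3$ here. A quick Monte Carlo check:

```python
import numpy as np
from scipy import stats

# Toy illustration of a Bartlett adjustment (not the T_m of this paper):
# the LR test of H0: sigma^2 = 1 for N(0, sigma^2) data with known mean 0.
rng = np.random.default_rng(0)
n, reps = 10, 200_000
x = rng.standard_normal((reps, n))
s2 = (x**2).mean(axis=1)                  # MLE of sigma^2 under known mean 0
T = n * (s2 - np.log(s2) - 1.0)           # T = -2 log Q
T_adj = (1.0 - 1.0 / (3.0 * n)) * T       # Bartlett-adjusted statistic
crit = stats.chi2.ppf(0.95, df=1)
print(T.mean(), T_adj.mean())             # approx 1.033 versus approx 1.000
print((T > crit).mean(), (T_adj > crit).mean())  # sizes near the nominal 0.05
```

The adjusted mean sits essentially on the chi-squared mean of 1, which is exactly the effect the factor $(1 - c/n)$ is designed to have.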
2.2. The adjustment factor when m = 1

In studying the null distribution of $T_m$, we may assume without loss of generality that $\Lambda = \mathrm{diag}(\lambda_1,\ldots,\lambda_p)$ and that each element of $\gamma_i$ is 0 except for the $i$-th element, which is 1. In this case the expression for $T_m$ given in (1) can be simplified to

$$T_m = n\sum_{i=m+1}^{p} l_i^{-1}\Big\{\sum_{j=1}^{m} x_{ji}^2\, z_{jj}(i) + 2\sum_{j=1}^{m}\sum_{k>j}^{m} x_{ji}x_{ki}\, z_{jk}(i)\Big\}, \tag{2}$$

where $z_{jk}(i)$ denotes the $(j,k)$-th element of $(\Gamma'XD_iX'\Gamma)^{-1}$. In particular,

$$T_1 = n\, x_{11}^{-2}\sum_{i=2}^{p}\frac{x_{1i}^2\,(l_1 - l_i)^2}{l_1 l_i}.$$
The mean of $T_m$, ignoring terms of order $n^{-3/2}$, can be derived by making use of expansions for $l_i$ and $x_{ji}$ in terms of the elements $a_{kl} = s_{kl} - \lambda_k\delta_{kl}$, where $s_{kl}$ is the $(k,l)$-th element of $S$ and $\delta_{kl}$ is the Kronecker delta. These expansions are valid as long as $\lambda_i$ is a distinct root; an expansion for $l_i$ is given by Lawley (1956a) and one for $x_{ji}$ is given by Sugiura (1976). In this section we use these expansions to obtain an expression for $E(T_1)$.
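The leading terms of the root expansion are the standard perturbation ones, and a numerical look is easy (my own sketch; only the first- and second-order terms of Lawley's expansion are used):

```python
import numpy as np

# Check l_1 = lam_1 + a_11 + sum_{t != 1} a_1t^2 / (lam_1 - lam_t) + ...
rng = np.random.default_rng(1)
lam = np.array([10.0, 4.0, 2.0, 1.0])
p, n = 4, 2000
Y = rng.standard_normal((n + 1, p)) * np.sqrt(lam)   # N(0, diag(lam)) sample
S = np.cov(Y, rowvar=False)
A = S - np.diag(lam)                                 # a_kl = s_kl - lam_k * delta_kl
l1 = np.linalg.eigvalsh(S)[-1]                       # largest latent root l_1
l1_approx = lam[0] + A[0, 0] + sum(A[0, t]**2 / (lam[0] - lam[t])
                                   for t in range(1, p))
print(l1, l1_approx)    # agree up to higher-order terms for separated roots
```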
We will write $T_1 = n x_{11}^{-2}(W_{12} + \cdots + W_{1p})$, where $W_{ji} = x_{ji}^2(l_j - l_i)^2/(l_j l_i)$. We need to expand $T_1$ in terms of the $a_{kl}$'s up to and including fourth-order terms. Now

$$x_{11}^{-2} = 1 + \sum_{i=2}^{p}\frac{a_{1i}^2}{(\lambda_1 - \lambda_i)^2} + \text{higher-order terms},$$

while the expansion of $W_{ji}$ begins

$$W_{ji} = \frac{a_{ji}^2}{\lambda_j\lambda_i}\{1 + \cdots\} + \cdots,$$

the omitted terms being the third- and fourth-order products of the $a_{kl}$'s (products such as $a_{ji}a_{jt}a_{ti}$, $a_{ji}a_{jt}a_{ti}a_{jj}$, $a_{ji}a_{jt}a_{tk}a_{ki}$ and $a_{jt}a_{ti}a_{jk}a_{ki}$, with $t, k \neq i, j$), each divided by the appropriate products of $\lambda_j\lambda_i$, $(\lambda_j - \lambda_i)$, $(\lambda_j - \lambda_t)$, $(\lambda_i - \lambda_t)$ and $(\lambda_i - \lambda_k)$.
The following expectations in the $a_{kl}$'s typify those needed here and later:

$$E(a_{21}^2) = \lambda_1\lambda_2/n, \qquad E(a_{21}^2 a_{31}^2) = \lambda_1^2\lambda_2\lambda_3/n^2, \qquad E(a_{21}^4) = 3\lambda_1^2\lambda_2^2/n^2,$$
$$E(a_{11}^2 a_{21}^2) = 2\lambda_1^3\lambda_2/n^2, \qquad E(a_{11} a_{21}^2) = 2\lambda_1^2\lambda_2/n^2, \qquad E(a_{21} a_{31} a_{32}) = \lambda_1\lambda_2\lambda_3/n^2.$$
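These Wishart moments are easy to confirm by simulation; a minimal sketch for the first of them (the $n^{-2}$ formulas can be checked the same way with more replications):

```python
import numpy as np

# Monte Carlo check of E(a_21^2) = lam_1 * lam_2 / n.
rng = np.random.default_rng(2)
lam1, lam2, n, reps = 5.0, 2.0, 50, 50_000
Z = rng.standard_normal((reps, n + 1, 2)) * np.sqrt([lam1, lam2])
Zc = Z - Z.mean(axis=1, keepdims=True)               # centre each sample
s21 = (Zc[..., 0] * Zc[..., 1]).sum(axis=1) / n      # sample covariance s_21
print((s21**2).mean(), lam1 * lam2 / n)              # both approx 0.2
```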
All other terms arising have expectations of order $n^{-3}$ or less. Using the results above, it can be shown, after quite a bit of simplification, that

$$E(x_{11}^{-2} W_{1i}) = \frac{1}{n} + \frac{3(\lambda_1^2 + \lambda_i^2)}{n^2(\lambda_1 - \lambda_i)^2} + \frac{1}{n^2}\sum_{t \neq 1, i}\left\{\frac{1}{2} + \frac{\lambda_1^3\lambda_t + \lambda_1^2\lambda_t^2 - 2\lambda_1^2\lambda_i\lambda_t}{(\lambda_1 - \lambda_t)^2(\lambda_i - \lambda_1)(\lambda_i - \lambda_t)}\right\}. \tag{3}$$
Consequently, we get

$$E(T_1) = (p-1)\{1 + c_1/n + O(n^{-3/2})\},$$

where

$$c_1 = 3 + \tfrac12(p-2) + \frac{6}{p-1}\sum_{i \neq 1}\frac{\lambda_1\lambda_i}{(\lambda_1 - \lambda_i)^2} + \frac{1}{p-1}\sum_{i \neq 1}\sum_{t \neq 1, i}\frac{\lambda_1^3\lambda_t + \lambda_1^2\lambda_t^2 - 2\lambda_1^2\lambda_i\lambda_t}{(\lambda_1 - \lambda_t)^2(\lambda_i - \lambda_1)(\lambda_i - \lambda_t)}.$$

We note that if $\lambda_1$ is much larger than the remaining roots, $c_1$ will be approximately equal to $c^* = p + 1$.
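Taking the expression for $c_1$ displayed above as given, the limiting remark is easy to see numerically (my own sketch; treat it as an evaluation of the display, not an independent derivation):

```python
import numpy as np

def c1(lam):
    """Evaluate c_1 as displayed above (distinct roots assumed)."""
    lam = np.asarray(lam, dtype=float)
    p = lam.size
    single = 6.0 / (p - 1) * sum(lam[0] * lam[i] / (lam[0] - lam[i])**2
                                 for i in range(1, p))
    double = 0.0
    for i in range(1, p):
        for t in range(1, p):
            if t != i:
                num = (lam[0]**3 * lam[t] + lam[0]**2 * lam[t]**2
                       - 2 * lam[0]**2 * lam[i] * lam[t])
                den = ((lam[0] - lam[t])**2 * (lam[i] - lam[0])
                       * (lam[i] - lam[t]))
                double += num / den
    return 3.0 + 0.5 * (p - 2) + single + double / (p - 1)

rest = [2.0, 1.5, 1.2, 1.0]
for lam1 in (5.0, 50.0, 5000.0):
    print(lam1, c1([lam1] + rest))   # approaches c* = p + 1 = 6 as lam1 grows
```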
2.3. The adjustment factor when m >= 2

Obtaining the expected value of $T_m$ when $m \geq 2$ requires more work in getting expansions for the $z_{jk}(i)$'s. Put $X = (X_1'\; X_2')'$, where $X_1$ is $m \times m$. Then, if we let $\alpha_{jk}$ denote the cofactor of $X_1$ corresponding to the element $x_{jk}$, we have

$$z_{jk}(i) = |X_1|^{-2}\sum_{h=1}^{m}\alpha_{jh}\alpha_{kh}\,\frac{(l_h - l_i)^2}{l_h}.$$
It is straightforward to show that

$$\alpha_{jk}^2 = \frac{a_{jk}^2}{(\lambda_j - \lambda_k)^2} + \text{higher-order terms} \qquad \text{for } j \neq k,$$

$$\alpha_{jj}^2 = 1 + 2\sum_{\substack{h=1\\ h\neq j}}^{m}\sum_{\substack{l>h\\ l\neq j}}^{m}\frac{a_{lh}^2}{(\lambda_l - \lambda_h)^2} - \sum_{\substack{h=1\\ h\neq j}}^{m}\sum_{\substack{l=1\\ l\neq h}}^{p}\frac{a_{lh}^2}{(\lambda_l - \lambda_h)^2} + \text{higher-order terms},$$

$$|X_1|^{-2} = 1 + \sum_{h=1}^{m}\sum_{l=m+1}^{p}\frac{a_{lh}^2}{(\lambda_l - \lambda_h)^2} + \text{higher-order terms}.$$
Consequently, we get

$$l_i^{-1}x_{ji}^2 z_{jj}(i) = \big\{|X_1|^{-2}\alpha_{jj}^2 W_{ji}\big\} + \sum_{\substack{l=1\\ l\neq j}}^{m}\frac{a_{ji}^2 a_{lj}^2(\lambda_l - \lambda_i)^2}{\lambda_i\lambda_l(\lambda_i - \lambda_j)^2(\lambda_l - \lambda_j)^2} + \text{higher-order terms}, \tag{4}$$

where $W_{ji}$ is as defined in Section 2.2. Note that the expectation of the term in the first set of braces in (4) is given by (3), after changing the subscript 1 to $j$. Also, for $j \neq k$ we have

$$\alpha_{jk} = \frac{a_{jk}}{\lambda_k - \lambda_j} + \frac{a_{kk}a_{jk}}{(\lambda_k - \lambda_j)^2} + \sum_{\substack{l=1\\ l\neq j,k}}^{m}\frac{a_{jl}a_{lk}}{(\lambda_j - \lambda_l)(\lambda_l - \lambda_k)} + \sum_{l=m+1}^{p}\frac{a_{jl}a_{lk}}{(\lambda_j - \lambda_l)(\lambda_l - \lambda_k)} + \text{higher-order terms},$$
so that, for $j \neq k$,

$$l_i^{-1}x_{ji}x_{ki}z_{jk}(i) = -G_{jki} - G_{kji} + \sum_{\substack{l=1\\ l\neq j,k}}^{m}\frac{a_{ji}a_{ki}a_{jl}a_{kl}(\lambda_l - \lambda_i)^2}{\lambda_i\lambda_l(\lambda_j - \lambda_l)(\lambda_k - \lambda_l)(\lambda_i - \lambda_j)(\lambda_i - \lambda_k)} + \text{higher-order terms}, \tag{5}$$
where

$$G_{\alpha\beta\gamma} = \frac{a_{\alpha\gamma}a_{\beta\gamma}a_{\alpha\beta}(\lambda_\alpha - \lambda_\gamma)}{\lambda_\alpha\lambda_\gamma(\lambda_\alpha - \lambda_\beta)(\lambda_\beta - \lambda_\gamma)}\{1 + \cdots\} + \sum_{\substack{l=1\\ l\neq\alpha,\beta,\gamma}}^{p}\frac{a_{\alpha\gamma}a_{\beta\gamma}a_{\alpha l}a_{l\beta}(\lambda_\alpha - \lambda_\gamma)}{\lambda_\alpha\lambda_\gamma(\lambda_\alpha - \lambda_\beta)(\lambda_\beta - \lambda_\gamma)(\lambda_l - \lambda_\beta)} + \cdots,$$

the remaining terms being third- and fourth-order products such as $a_{\alpha\gamma}a_{\beta\gamma}a_{\alpha\beta}a_{\beta\beta}$ and $a_{\alpha\gamma}a_{\beta l}a_{l\gamma}a_{\alpha\beta}$ (with $l \neq \alpha,\beta,\gamma$), each divided by the appropriate products of $\lambda_\alpha\lambda_\gamma$, $(\lambda_\alpha - \lambda_\beta)$, $(\lambda_\beta - \lambda_\gamma)$, $(\lambda_\alpha - \lambda_\gamma)$ and $(\lambda_l - \lambda_\gamma)$.
Now calculating the expected values of (4) and (5), and then using these in (2), we obtain after simplification

$$E(T_m) = m(p-m)\{1 + c_m/n + O(n^{-3/2})\},$$

where

$$c_m = 3 + \tfrac12(p-m-1) + \{m(p-m)\}^{-1}\Bigg[\sum_{i=m+1}^{p}\sum_{j=1}^{m}\frac{\lambda_j\{(m-1)\lambda_j + 6\lambda_i\}}{(\lambda_j - \lambda_i)^2} + \sum_{i=m+1}^{p}\sum_{j=1}^{m}\sum_{t\neq i,j}\frac{\lambda_j^3\lambda_t + \lambda_j^2\lambda_t^2 - 2\lambda_j^2\lambda_i\lambda_t}{(\lambda_j - \lambda_t)^2(\lambda_i - \lambda_j)(\lambda_i - \lambda_t)}\Bigg].$$

If $\lambda_m$ is much larger than $\lambda_{m+1}$, $c_m$ will be approximately equal to $c^* = p + 1$.
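In code, applying either adjustment to a computed value of $T_m$ is a one-liner. The sketch below is a hypothetical helper of mine (scipy assumed for the chi-squared tail); it accepts the exact $c_m$ or falls back to $c^* = p + 1$.

```python
from scipy import stats

def bartlett_adjusted_test(T, n, p, m, c=None):
    """Adjust a computed T_m and return (adjusted statistic, p-value).

    c : the adjustment constant c_m; if None, the simplified c* = p + 1 is used.
    """
    if c is None:
        c = p + 1                        # c*, adequate when lam_m >> lam_{m+1}
    T_adj = (1.0 - c / n) * T
    return T_adj, stats.chi2.sf(T_adj, df=m * (p - m))
```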
3. Comparison of adjusted and unadjusted statistics

The result obtained by Lawley (1956b) essentially guarantees that adjusting a likelihood ratio statistic by a Bartlett adjustment factor improves the chi-squared approximation to the null distribution of the test statistic. The test criterion $T_m$ is not a likelihood ratio criterion, so Lawley's result does not apply here. Consequently, we compare the performances of the adjusted and unadjusted statistics via a simulation study. Simulated probabilities of type I error were obtained for the unadjusted statistic $T_m$, the adjusted statistic, denoted by $T_m^*$, and a third statistic, $T_m^{**}$, in which the simplified adjustment factor $c^*$ was used. Each estimate is based on 1000 simulations. The nominal significance level used was 0.05 and, for simplicity, $\lambda_{m+1},\ldots,\lambda_p$ were all taken to be 1. Table 1 contains some of the results for $m = 2, 4$ and $p = 10, 15$. These results clearly indicate that the adjustment is successful.
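A reduced version of this simulation can be sketched as follows (my own code, not the author's; it uses the canonical form $\Lambda = \mathrm{diag}(\lambda_1,\ldots,\lambda_p)$, $\Gamma = (e_1,\ldots,e_m)$, and the simplified factor $c^*$, so its second printed value is comparable to the $T_2^{**}$ entries of Table 1):

```python
import numpy as np
from scipy import stats

def tyler_T(Y, m):
    """T_m of (1) in the canonical form Gamma = (e_1, ..., e_m)."""
    N, p = Y.shape
    S = np.cov(Y, rowvar=False)
    l, V = np.linalg.eigh(S)
    l, V = l[::-1], V[:, ::-1]
    X1 = V[:m, :m]                               # Gamma' X
    T = 0.0
    for i in range(m, p):
        D_i = np.diag(l[:m] / (l[:m] - l[i])**2)
        u = V[:m, i]                             # Gamma' x_i
        T += u @ np.linalg.solve(X1 @ D_i @ X1.T, u) / l[i]
    return (N - 1) * T

rng = np.random.default_rng(4)
p, m, N, reps = 10, 2, 26, 2000                  # the n = 25 row of Table 1
lam = np.array([25.0, 10.0] + [1.0] * (p - 2))
n, c_star = N - 1, p + 1
crit = stats.chi2.ppf(0.95, df=m * (p - m))
rej = rej_adj = 0
for _ in range(reps):
    Y = rng.standard_normal((N, p)) * np.sqrt(lam)
    T = tyler_T(Y, m)
    rej += T > crit
    rej_adj += (1 - c_star / n) * T > crit
print(rej / reps, rej_adj / reps)  # cf. T_2 and T_2** at (25, 10), n = 25
```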
Table 1
Simulated type I error probabilities when the nominal significance level is 0.05

p = 10, m = 2
(lam1, lam2)          n:    15      20      25      30      50
(10, 10)     T2           0.882   0.728   0.575   0.440   0.249
             T2*          0.029   0.046   0.068   0.067   0.056
             T2**         0.225   0.182   0.166   0.126   0.090
(15, 10)     T2           0.886   0.725   0.544   0.441   0.259
             T2*          0.028   0.064   0.070   0.072   0.074
             T2**         0.213   0.170   0.137   0.126   0.092
(25, 10)     T2           0.856   0.661   0.546   0.437   0.241
             T2*          0.035   0.068   0.076   0.103   0.075
             T2**         0.166   0.141   0.133   0.136   0.100
(50, 10)     T2           0.884   0.682   0.516   0.445   0.228
             T2*          0.047   0.089   0.076   0.079   0.071
             T2**         0.172   0.171   0.123   0.119   0.087
(100, 25)    T2           0.839   0.645   0.472   0.419   0.225
             T2*          0.087   0.100   0.091   0.094   0.082
             T2**         0.134   0.127   0.098   0.108   0.086

p = 10, m = 4
(lam1, lam2, lam3, lam4)  n:    15      20      25      30      50
(10, 10, 10, 10)   T4         0.952   0.822   0.679   0.577   0.322
                   T4*        0.029   0.065   0.073   0.081   0.061
                   T4**       0.294   0.232   0.190   0.153   0.100
(25, 10, 10, 10)   T4         0.952   0.828   0.659   0.585   0.322
                   T4*        0.020   0.059   0.085   0.081   0.079
                   T4**       0.210   0.194   0.151   0.158   0.109
(25, 25, 10, 10)   T4         0.942   0.809   0.702   0.560   0.276
                   T4*        0.051   0.089   0.084   0.079   0.058
                   T4**       0.226   0.174   0.161   0.125   0.076
(50, 10, 10, 10)   T4         0.945   0.823   0.694   0.563   0.319
                   T4*        0.039   0.076   0.099   0.080   0.081
                   T4**       0.244   0.215   0.187   0.145   0.110
(100, 25, 10, 10)  T4         0.944   0.816   0.624   0.549   0.298
                   T4*        0.065   0.101   0.074   0.074   0.079
                   T4**       0.221   0.194   0.131   0.131   0.105

p = 15, m = 2
(lam1, lam2)          n:    20      25      30      35      50
(15, 15)     T2           0.972   0.921   0.793   0.709   0.477
             T2*          0.015   0.057   0.082   0.094   0.071
             T2**         0.222   0.178   0.170   0.158   0.112
(25, 15)     T2           0.977   0.903   0.769   0.687   0.480
             T2*          0.038   0.088   0.089   0.087   0.079
             T2**         0.192   0.197   0.156   0.132   0.113
(35, 15)     T2           0.976   0.906   0.781   0.694   0.446
             T2*          0.057   0.080   0.075   0.111   0.084
             T2**         0.186   0.151   0.137   0.151   0.106
(50, 15)     T2           0.978   0.909   0.792   0.669   0.476
             T2*          0.045   0.094   0.084   0.077   0.058
             T2**         0.172   0.181   0.146   0.118   0.088
(100, 25)    T2           0.962   0.872   0.756   0.666   0.428
             T2*          0.095   0.106   0.105   0.079   0.089
             T2**         0.171   0.138   0.133   0.099   0.100

p = 15, m = 4
(lam1, lam2, lam3, lam4)  n:    20      25      30      35      50
(15, 15, 15, 15)   T4         0.997   0.980   0.909   0.849   0.639
                   T4*        0.019   0.041   0.079   0.088   0.078
                   T4**       0.245   0.207   0.194   0.174   0.127
(30, 15, 15, 15)   T4         0.999   0.981   0.921   0.857   0.614
                   T4*        0.030   0.065   0.087   0.100   0.076
                   T4**       0.229   0.235   0.194   0.181   0.114
(30, 30, 15, 15)   T4         0.998   0.983   0.931   0.831   0.602
                   T4*        0.045   0.085   0.105   0.077   0.067
                   T4**       0.247   0.204   0.193   0.137   0.101
(50, 15, 15, 15)   T4         0.998   0.973   0.925   0.848   0.596
                   T4*        0.042   0.081   0.087   0.090   0.080
                   T4**       0.235   0.217   0.195   0.163   0.111
(100, 30, 15, 15)  T4         0.999   0.972   0.914   0.814   0.584
                   T4*        0.054   0.089   0.088   0.101   0.072
                   T4**       0.239   0.195   0.171   0.149   0.102
References

Anderson, T.W. (1963), Asymptotic theory for principal component analysis, Ann. Math. Statist. 34, 122-148.
Bartlett, M.S. (1937), Properties of sufficiency and statistical tests, Proc. Roy. Soc. A 160, 268-282.
James, A.T. (1977), Tests for a prescribed subspace of principal components, in: P.R. Krishnaiah, ed., Multivariate Analysis IV (North-Holland, Amsterdam) pp. 73-77.
Lawley, D.N. (1956a), Tests of significance for the latent roots of covariance and correlation matrices, Biometrika 43, 128-136.
Lawley, D.N. (1956b), A general method for approximating to the distribution of the likelihood ratio criteria, Biometrika 43, 295-303.
Mallows, C.L. (1961), Latent vectors of random symmetric matrices, Biometrika 48, 133-149.
Schott, J.R. (1987), An improved chi-squared test for a principal component, Statist. Probab. Lett. 5, 361-365.
Srivastava, M.S. (1987), Tests for covariance structure in familial data and principal component, in: A.K. Gupta, ed., Advances in Multivariate Statistical Analysis (D. Reidel Publishing Co., Boston) pp. 341-352.
Sugiura, N. (1976), Asymptotic expansions of the distributions of the latent roots and latent vector of the Wishart and multivariate F matrices, J. Multivariate Anal. 6, 500-525.
Tyler, D.E. (1981), Asymptotic inference for eigenvectors, Ann. Statist. 9, 725-736.