Computational North-Holland
Statistics
& Data Analysis
15 (1993) 179-198
179
Jackknife estimation of the eigenvalues of the covariance matrix Mario Romanazzi Statistical Laboratory,
University of Venice “Ca’ Foscari”, Z-30123 Venice, Italy
Received May 1991 Revised September 1991 Abstract: The properties of the jackknife statistic for the eigenvalues of the covariance and the correlation matrix are studied, using von Mises expansions. Influence functions of the eigenvalues, up to the third order for the covariance matrix, and up to the second order for the correlation matrix, are given. An explicit expression of the infinitesimal jackknife estimator is obtained. It is shown that, for small sample sizes, the jackknife standard error may be a biased estimator of the standard error of the sample eigenvalues. Due to the characteristics of the influence functions, the jackknife eigenvalues and the jackknife standard errors are non-robust estimators. Monte Carlo simulations show that the jackknife eigenvalues are more heavily affected than the sample eigenvalues by contaminations of the data. Keywords:
Eigenvalue;
Influence
function;
Jackknife;
Robustness.
1. Introduction
The jackknife is a standard method for distribution-free bias reduction and variance estimation. Let en be an estimator of the parameter 8 based on i.i.d. random observations X,, . . . , X,, with unknown distribution function F. The standa_rd jackkniff procedure starts with, the computation of pseudovalues Pi = n0, - (n - l)Onn(+ i = 1,. . . , n, where On(i)isJhe statistic e^, evaluated from all the X, except Xi. The jackknife estimator_ 8,, = II- ‘E,P, is the arithmetic mean of ,the pseudovalues. The va$ance of 8,, (and e^,) is estimated by the statistic V,, = [n(n - l)]-l[Ci(Pi - 8,,)*], i.e., the sample variance of the pseudovalues divided by n. Recently, Nagao (1985, 1988) investigated the asymptotic distribution of the jackknife statistic for the eigenvalues of covariance and correlation matrices. Under mild conditions for F, he showed that (i) the jackknife statistic for a simple eigenvalue has an asymptotically normal distribution and (ii) the jackknife estimator of the variance is consistent. Correspondence to: M. Romanazzi, Statistical 3246 Dorsoduro, I-30123 Venice, Italy. 0167-9473/93/$06.00
0 1993 - Elsevier
Science
Laboratory,
Publishers
University
of Venice
B.V. All rights reserved
“Ca’ Foscari”,
180
M. Romanazzi / Jackknife eigenualues
The use of the jackknife procedure in this context is indeed appealing because it provides an alternative to the normal-based asymptotic theory which can be imprecise for small sample sizes and non-normal or contaminated data. However, practical application of the jackknife must face several difficulties. First of all, in addition to the asymptotic properties, the finite sample behaviour of the jackknife should be investigated. For example, since the pseudovalues are correlated, fJJK can be a biased estimator of the variance of iJK. Another problem is the well-known sensitivity of the jackknife to outlying observations. A trimmed mean of the pseudovalues has been suggested to obtain a robust version of iJ, (Hinkley, 1978; Hinkley and Wang, 1980). Finally, the calculation of the pseudovalues requires n eigenanalyses, one for each observation, and this implies some computational effort. In this paper we examine to what extent these problems can be overcome in the estimation of the eigenvalues of covariance and correlation matrices. In Section 2 we obtain a characterization of the pseudovalues in terms of von Mises functionals which allows the asymptotic theory and some finite sample properties of the jackknife estimators to be derived easily. This also gives a practical method for the computation of the pseudovalues which greatly reduces the calculations required. The theoretical results are illustrated with numerical examples in Section 3. Section 4 gives the extension of the jackknife to the estimation of the eigenvalues of correlation matrices. Some brief conclusions are outlined in Section 5.
2. Jackknife estimation
of the eigenvalues
of covariance
matrices
2.1. Differentiability properties of the eigenvalues Suppose F is an unknown cumulative distribution function on Rp, p > 2, and let = lx dF( x ) and LXF) = /[x - p.( F)][x - p( F>lT dF(x) be the mean vector and the covariance matrix expressed as functionals of F. For j, j’ E 11,. . , , p}, we denote with Aj = hj( F) the jth largest eigenvalue of n(F) and with aj = aj(F) the associated eigenvector normalized in such a way that aTaj, = 1, a:aj = 0,
j-0’)
j #j’.
For an arbitrary cumulative
distribution
F = (1 - E)F + EG. Then a($> = R(F)
0, = [P(G) -
function
+ •0~ + &*a,,
G and 0 < E < 1, define where
/-@‘)Ib(G) - /-WIT+ on(G)-fin(F),
and 02
=
-2b(G)
-
,Q’>]
b.(G)
-
,4F)IT.
For a given E E [O, 11, let hi(~) > * . . 2 hp(e) be the ordered eigenvalues of L?(F) and let aj(e), j = 1,. . . , p, be the corresponding orthonormal eigenvectors. In general, the ranking of the eigenvalues depends on the intensity of the
181
M. Romanazzi / Jackknife eigenualues
perturbation, i.e., the value of E. Suppose A, E {A,, . . . , hp} is a simple eigenvalue of R(F) with eigenvector a,, and consider the corresponding perturbation A”(E) E (A@, . . . , A&E)} with eigenvector U&E). It is known (e.g., Magnus and Neudecker, 1988; Romanazzi, 1989) that ho(e) and U&E) are infinitely differentiable functions of E in an interval centered at E = 0. In particular, denoting with A$) and al:) the kth derivatives of A&E) and Q&E) evaluated at E = 0, we have A$) = aTR 0 1a 0, A($ = a&a,
up
=
-
T,R,a,;
+ 2a;fR,a(,‘)
ah*)= - 27’,( 0, - A$‘r)ai) - ToL?2ao - ( af’Tu~))ao; A($ = 3[ a~fi2af) + a;fR,a~f’ + A$bzf)Taf)], where a convenient
expression of the matrix To is
To = CCo,(Aj- Ao)-lajaT, with CCojdenoting summation over all j E (1,. . . , p] for which Aj + A,. Then, for E E [0, 11, we can write the Taylor expansion A,(E) = A, + A(& + $A$+* + ;A(;‘e3 + . . . .
(24
Now, A,(O) = A,(F), A,(l) = A,,(G), and, for k = 1, 2,. . . , AT’ is the kth Gateaux differential of A,, at F in the direction of G. Thus from (2.1) we obtain a Taylor expansion for the functional A,(G): A,(G) =
A,(F) + ++
/f&i
/I ( t,
Xl,
F) d[G(xl) -&)I x27 . F)fj
+i///t3(xl>
~27 ~3;
d[G(x,)
-
F(xi)l
F)hd[G(xi) -F(xi)]
+ . ** -
(2.2)
F) is the kth order influence function of A,(F), i.e., a Here tk(xl,...,xk; function symmetric in its arguments, satisfying for any distribution function G:
““$”IEEO= /
.a-
/t,(x,,...,x,;
F)fi
d[G(xJ
-F(x,)].
i=l
We recall that, from general results on von Mises expansions, EF[tl(x; F)] = 0 and E,[t,(x; F>]* is the asymptotic variance of the sample estimator of A,(F) (e.g., Hampel, 1974). Let c,(x) = a;f[x - p(F)] and, for r E (1, 2}, let d,(x, y; r> = [x p(F)ITT,J[y - p(F)], where, for notational simplicity, the dependence on F is omitted. It can be shown that t,(x; F) = [c,,(x)]‘-
A,,
(2.3)
182
M. Romanazzi / Jackknife eigenualues
(2.4)
hW[40
+co(Y
2.2.
7
z; 1) + (c,(X))*d,(Y,
2;
4).
P-5)
Properties df the jackknife estimator of the eigenualues
Let Fn be the empirical cumulative distribution function of a random sample IX,, . . . , X,,) from F, and let I$;) be the empirical cumulative distribution f?nction evaluated from all the X, except Xi. Note that it can be written F& = [l + (n - l)-‘IF, - (n - 1)-‘6x,, where 6, is the distribution with unit mass at Xi. Putting G = @,, in (2.2), we get F, + $K2CiCht*(Xjy
A(@) = A, + n-l Eitr(xi; + $z-3CiCJkt3(Xi,
Furthermore, h,(&,)
x,,
x,;
F) + . * * .
putting F = iti and G = @& in (2.21, we obtain =A#)
- (n - l)-‘tl(x,;
- i(n - 1)-3t,(xi,
$!) + i(n - 1)-*t2(xj,
of the pseudovalues
po,i =%( @a) - (n - %( fiaci,) =A,(&) + tl(xi; q - gn - l)-lt*(xi, + gn =
rC’C,P,
l)-*qx;,
xi, xi; q + **.
and the
xi; in) )
(2.6)
j
++2(n jth
xi; q
xi, xi; fin) + . * * .
This gives the following von Mises expansions jackknife estimator of h,:
i,;,,
Xh; F,
1)*]pit3(xi,
xi, xi; Fn)+ ***.
Let X = ,u(& be the vector of the sample means. Also, let ij = Ai be the largest eigenvalue of the sample covariance matrix S = OR(&), and let
M. Romanazzi / Jackknife eigenvalues
183
gj 2 ai be the associated eigenvector. We denote tj(x> 7 kT(x -Xl the jth sample principal component of x and, for YE {l, 2), we put d&x, y; r> = C&Aj - h^,)-‘2j(x);j(y). From (2.3), (2.4) and (2.5) it follows: t,(x; &) = [&)I2 t,(x,
- h”(),
x; Fxj = -2[c^,(x)]2[1
+&,
t,(x, x, x; gn) = 6[~0(~)]z(&(~,
x; l)], x; 1) + [d^,(% x; 1)12 -
[qx)]2&(x, x; 2)).
The function tJx_; cn) is unbounded and the expression of d^,(x, x; Y,>suggests that t,(x, x; F,) may also assume extreme values, especially when A,, for p}, is not well separated from Ai. As shown by (2.61, the some jE{l,..., pseudovalues directly reflect the behayiour of the influence functions on the sample observations. This implies that AOiJK,the sample mean of the pseudovalues, and nVwJK, the sample variance, are vulnerable estimators. The distribution of t,(x; F) may be asymmetric; for example, when F is normal, EF[tl(x; F>13 = Shi. This suggests that robust alternatives to the jackknife, like trimmed or Winsorized means of the pseudovalues, are not advisable, since they can lead to substantial underestimation, particularly for the largest eigenvalues. The quantities PO,,- h^, = -(n - l)(&ci, - h”,), where loci) = A,(@n&, give estimates of the influence function t,(x; F) at Xi, i = 1,. . . , n, which are useful diagnostics to recognize outlying observa$ons (Critchley, 1985). Using the above expressions of t,(x; F,), t,(x, x;_F,) and tJ(x, x, x; fn), we obtain up to third-order approximate expansions of Ajci)= Aj( F,& only depending on Aj and ii, j = 1,. .:, p. As we shall see in the next se$on, second-order approximate expansions Ajci) of the “deleted” eigenvalues Aj(+ and the corresponding A(approximate) sample influence function values SICi(Aj> = -t;n l)(Ajci) - Aj), will often be adequate. For any distribution F with finite fourth cumulants we obtain
E,[ ho(&)]
= A, - n-‘[ A,+~~O,(Aj-A,)-l(A,Aj+~:ui”)]
+O(n-2’ I,
(2.7) where ~(2oji)is the bivariate cumulant of order 4 of the joint distribution of c,(x) and cj(x) = LZ;[~- p(F)], j = 1,. . . , p. When F is a normal distribution, .(pjj) = 0 and (2.7) coincides with the expression first given by Girshick (1939). From standard properties of sample principal components it follows that Cit,(Xi;
~~) = 0,
E&(Xi,
xi; FJ = -2(&
+ Ci[c^l)(xj)]2~0(xi,
xi; l)>
184
M. Romanazzi / Jackknife eigenvalues
and then a second-order h,;,,=A,(Q
approximate
- i[+z
expansion of &..k is
- l)]-lc,c,(X,,
= [l + @I - 1))‘I&
xi; Fn)
+ [n(n - l)] -‘Z&(X,)]*&(X,,
Xi; 1).
(2.8)
Expression (2.8) is equivalent to the infinitesimal jackknife of Jaeckel (see Miller, 1974, p. 10-11; note, however, that the coefficient in-’ in equation (4.14) of the above mentioned paper is to be read $“). The first-order term is the obvious correction to achieve the eigenvalues of the unbiased estimator of the covariance matrix S, = [n/(n - l>]S. As regards the second-order term, it is negative for the largest sample eigenvalue and positive for the smallest; otherwise it can be of either sign. For i, h E 11,. . . , n}, i # h, let @II = var[ tl( Xi;
F)] ,
al*=cov[tl(X~;
F)7 f*(Xi,
xi;
F)]>
F),
xh>
xhi;
%=var[t2(Xi,
Xh;
F)],
and a13=cov[t,(Xi;
t3(Xi7
F)]”
Using results from Hinkley (1978) and Knott and Frangos (1983), we obtain nMSE[ ho( Fe)] = oI1 + n -‘(+a** + Cl2 + Q) nMSE(&,,,) nE(&,)
+ O(K2),
= (T,~+ $~-~a,, + O(n-*),
= (TI1+ yt-l
((722 + Cl2 + %)
+
w-*).
(2.9)
Here a,,/n
is the asymptotic variance of h,(@n) and i,.,,. For any distribution its expression is uI1 = 2At + K$“, where ~‘4” is the fourth cumulant of c,,(x) (see also Waternaux, 1976). Under normality,
F with finite fourth cumulants,
uI1 = 2A2,, Cl2
=
uZ2 = 4hz,[ 1 + Cco,(Aj - Aa)-*A;] >
-4A’,[l + CCo,(Aj- h,)-‘Aj],
Cl3
=
-4A3,CCO)(Aj- Ao)-*Ai,
and then ~MSE[A~(~~)]~2AZ,(l~~~1[1+~~~~(A~~A~)~2A~])+O(~~2)~ nMSE(i,;,,) nE(fo;r,)
= 2Az,{1 + II -I[1 + CCo,(Aj- A,))*A;]) = 2A’o+ O(n-*).
+ O(n-*), (2.10)
This shows that to O(n-*> MSE(AO;rK)> MSE[A,(F,)], under normality. Exp;fessions (2.9) and (2.10) also suggest that, for small or moderate sample sizes, V,;,, could be a biased estimator of the variance of &,.rk.
185
M. Romanazzi / Jackknife eigenualues
3. Numerical examples 3.1. Finite sample behaviour of the jackknife We used IMSL routines on the IBM 3090 computer at the University of Venice to generate a pseudo-random sample of size y1= 19 from a trivariate normal distribution with mean vector p = (OOO)T and covariance matrix R = (wij>, where wii = 1, i = 1, 2, 3, wr2 = 0.8 and or3 = wz3 = 0. The eigenvalues of the covariance matrix are A, = 1.8, A, = 1, A, = 0.2 and the corresponding orthonorma1 eigenvectors are a, = (4fi
+fi
a3= (i\/z -+fiO)T.
a*= (OOl)T,
OjT,
Expanding an example by Hinkley (1978), to the basic set of 19 observations we added another data point xzo = (u -u O)T, with u = 0, 1, 1.5, 2, 2.5, so obtaining five different data sets (see Figure 1). For u > 1 xzo, which lies on the direction defined by a3, is widely separate from the rest of the data and appears to be an outlier.
T //i /
/”
-+----__-_ li:i /
1.5
,’
---_i_:_~m
-0.6 0.4
u-2.5 1.4
-
--_-_i__ 1
--<<
I’ ,’
!
/”
x2 .
2.4 3.4
Fig. 1. Five trivariate samples of size n = 20. Nominal distribution of X,, . . . , X,, is trivariate normal with N(0, 1) marginals, E(X,,X,,) = 0.8, E(X,,X,,) = E(X,,X,,) = 0; x2” = (u - u OIT, u = 0, 1, 1.5, 2, 2.5.
1.96 1.99 2.08 1.85 2.18
- 0.086 - 0.028 0.113 - 0.073 - 0.091
1.93 1.40 -0.149 3.81 - 2.00
- 0.964 1.174 0.491 1.404 0
- 1.411 0.904 - 1.782 1.154
- 1.396 0.731 - 1.289 1.175 IA
-U
2.13 1.90 2.14 2.02 2.16
0.049 - 0.071 0.011 - 0.003 - 0.040
0.017 2.56 - 0.985 - 0.968 - 0.967
- 1.14 3.20 - 1.30 0.864 - 1.72
- 0.304 - 2.494 - 0.314 0.976 - 0.307
0.883 - 1.027 - 0.991 0.475 - 0.384
0.513 - 1.298 - 0.573 1.031 - 0.699 - 0.964 - 0.904 0.743 - 0.918 - 0.931
-
0.984 0.850 1.04 1.04 1.03 1.03 1.03 0.946 1.03 1.03
2.13 1.90 2.13 2.02 2.16 1.96 1.99 2.07 1.87 2.17
0.0897 0.0964 0.0919 0.0927 0.0948 0.0973 0.0941 0.0860 0.0966 0.0976
0.983 0.832 1.04 1.04 1.04 1.04 1.03 0.943 1.03 1.04
0.960 1.00 1.03 1.03 1.03
2.17 2.17 2.17 2.17 2.07
0.826 1.03 0.982 0.972 0.729
j=2
0.0977 0.0977 0.0973 0.0673 0.0816
0.958 1.00 1.03 1.03 1.03
2.17 2.18 2.18 2.18 2.07
- 0.092 - 0.093 - 0.084 0.454 0.195
0.480 0.325 0.788 0.798 0.893
- 1.89 -2.01 - 2.02 -2.02 - 0.097
1.313 0.888 0.559 0.359 0.980
- 0.372 - 0.357 - 0.337 0.016 0.079
‘j(i) j=l 2.11 2.16 1.79 2.06 2.14
j=3 0.0969 0.0841 0.0968 0.0952 0.0923
0.817 1.03 0.973 0.970 0.717
2.11 2.17 1.79 2.06 2.14
- 0.361 - 0.347 -0.166 - 0.984 0.930
- 0.079 0.152 - 0.077 - 0.047 - 0.009
3.02 - 0.791 0.049 0.239 4.86
- 0.848 - 1.82 5.16 0.199 -1.42
1.180 0.228 - 1.948 - 0.109 - 1.532
- 1.634 - 0.323 - 1.618 0.813 0.842
j=2
‘Ki,
0.0975 0.0945 0.0871 0.0968 0.0978
0.0904 0.0967 0.0924 0.0931 0.0951
0.0827
0.0979 0.0974 0.0691
0.0979
0.0955 0.0935
0.0971 0.0850 0.0970
j=3
eigenvalues ijci, and second-order approximations h,(i, for
j=l
- 1.578 - 1.007 - 1.591 1.264 1.350
j=3
j=2
j=l
j=3
j=2
SIC;(A j)
j=l
xij
Table 1 Influence function values SIC,(Aj) of the eigenvalues of the covariance matrix, “deleted” two trivariate data sets of size n = 20
S f E
8.
> % $ %
8 3 2 !? 3;. \
R
- 0.372 ~ 0.357 - 0.337 0.016 0.079
0.883 - 1.027 - 0.991 0.475 - 0.384
- 1.411 0.904 - 1.782 1.154 -u
- 0.361 - 0.347 -0.166 - 0.984 0.930
0.513 - 1.298 - 0.573 1.031 - 0.699
- 1.396 0.731 - 1.289 1.175 u
- 0.964 1.174 0.491 1.404 0
- 0.304 - 2.494 - 0.314 0.976 - 0.307
1.313 0.888 0.559 0.359 0.980 1.91 2.02 2.03 2.01 0.085
1.95 1.29 - 0.147 3.69 - 1.88
- 1.19 3.23 - 1.31 0.853 - 1.70
-
- 0.821 - 1.79 5.17 0.191 - 1.42
- 1.634 - 0.323 - 1.618 0.813 0.842
- 1.578 - 1.007 - 1.591 1.264 1.350
1.180 0.228 - 1.948 - 0.109 - 1.532
SIC,(hj) j=l
j=3
j=2
XI,
j=l
u=2
Table 1 (continued)
0.509 0.306 0.786 0.748 0.908
- 0.966 - 0.881 0.639 - 0.898 - 0.903
- 0.117 2.48 - 0.989 - 0.972 - 0.978
-
2.98 - 0.755 0.072 0.210 4.83
j=2 0.437 0.125 0.473 0.465 0.453
-
0.473 0.265 0.250 0.333 6.82
0.048 0.328 0.429 0.463 0.341
- 0.465 - 0.461 - 0.472 0.241 - 0.342
-
j=3
L
1.04 1.04 1.04 0.953 1.04 1.04
2.17 1.96 2.00 2.08 1.87 2.18
1.03 1.04
2.18 2.08
1.04 1.04
1.01 1.03
2.19 2.19
2.15 2.03
0.961
2.18
0.996 0.843
0.976 0.722
2.06 2.15
2.14 1.90
0.824 1.03 0.976
j=2
2.12 2.17 1.79
j=l
0.499 0.488 0.486 0.491 0.0976
0.492
0.497 0.499
0.475 0.489
0.460 0.492
0.499 0.499
0.499
0.499 0.498
0.497 0.480 0.499
j=3
‘j(O
1.97 2.00 2.08 1.88 2.17
2.16
2.14 2.03
2.13 1.90
2.18 2.08
2.18 2.18
2.17
2.06 2.15
2.12 2.17 1.80
j=l
1.04 1.03 0.955 1.04 1.04
1.04
1.04 1.04
0.995 0.858
1.03 1.04
1.00 1.03
0.962
0.978 0.735
0.832 1.03 0.985
j=2
0.498 0.487 0.486 0.490 0.114
0.491
0.496 0.497
0.475 0.490
0.460 0.491
0.497 0.498
0.497
0.497 0.497
0.496 0.480 0.498
j=3
%
E 5 D
2 ss z F,v 5. B
z? 9 2 e !z. \
188
M. Romanazzi / Jackknife eigenualues
Table 1 gives the data, the sample values of the influence function SIC,(_A,), the “deleted” eigenvalues ijci) and_ their sefond-order approximations hj(+ and V,, are given in Table 2 together j = 1, 2, 3. The sample values of h, A,, with their second-order approximations II,, and_fJ,. The following results are clear. The values of AjciJ are good approximations of ijci), except for the observations with very high values of the influence function. The SIC values are useful to discover influential observations and how they affect the eigenvalues. The jackknife and the infinitesimal jackknife are even more sensitive than the usual estimator of the eigenvalues to contamination of the data. Finally, as xZO becomes more extreme, the estimates of the variance of the third eigenvalue provided by V,, and V,, become too large to be of any practical value. 3.2. Monte Carlo results To investigate the characteristics of the distribution of the jackknife estimators, Ni replications of the estimation procedure were performed on pseudo-random samples of size nj (i = 1, 2, 3) from the normal distribution described in Section 3.1. We put N1 = 1000, N, = 500, N3 = 250, n, = 30, ~1~= 60 and n3 = 90. For sample sizeAni we evaluated the mean and the variance from the Ni replications the 5% of li, &, VJK, /I,, and vJ,. As a robust version ofAh,,, we considered trimmed mean of the pseudovalues, denoted by A,,;(,,,,,. The mean and the variance of the following sample estimators of (+ii, uZ2, (T,* and c,3 were also obtained:
t*(&
xn.1;
&j
=t*(X,1.
Xn+l,
x,+1;
x,;
kj
and t&L
Ej = t3(&
x,,
xi;
q.
For y1i = 30, Ni = 1000, the estimation procedure was also performed on distribution (1 - E)F + EG, pseudo-random samples from the “contaminated” where F is the previous normal distribution and G is a trivariate normal distribution with mean vector p = (00 O>T and covariance matrix 0, = diag(l1 10). We tried three different levels of contamination choosing pi = 0.05, = 0.07 E = 0.1. The results of the Monte Carlo simulations are given in Tables 3’and 4.
189
M. Romanazzi / Jackknife eigenvalues Table 2 Eigenvalues Figure 1 Ll
of the covariance
h^
matrix
and jackknife
statistics
sets represented
in
I’,,
L
j=l
j=2
j=3
j=l
j=2
j=3
0 1 1.5 2 2.5
2.07 2.07 2.07 2.07 2.08
0.985 0.986 0.987 0.989 0.995
0.0926 0.185 0.305 0.473 0.686
1.95 1.95 1.95 1.95 1.95
1.09 1.10 1.09 1.09 1.00
0.101 0.194 0.317 0.499 0.810
U
L
0 1 1.5 2 2.5
for the data
CK
j=l
j=2 0.254 0.253 0.251 0.250 0.246
0.146 0.146 0.145 0.141 0.107
j=3 0.00103 0.00887 0.0443 0.142 0.342
SIC&,)
j=l
j=2
j=3
j=l
j=2
j=3
j=l
1.97 1.97 1.97 1.97 1.97
1.07 1.07 1.07 1.06 0.999
0.0981 0.191 0.315 0.495 0.787
0.236 0.234 0.233 0.232 0.230
0.129 0.129 0.128 0.126 0.117
0.000921 0.00797 0.0399 0.130 0.353
~ -
2.00 1.95 1.92 1.88 1.84
j=2
j=3
-0.931 - 0.912 - 0.896 - 0.903 ~ 1.46
- 0.091 1.57 3.74 6.82 11.3
From Table 3 we draw the following conclusions. (1) For all value! of ~1, iJK is practically unbiased and its MSE is higher than the MSE of A. (2) The characteristics of the empirical distribution of iJK are very similar to those of i,, for all values of YE.The two estimators are practically the same for y1> 60. (3) For n = 30, pJK, pr, and ~?,r/n underestimate the variance of the empirical distribution of ALK and Ark, but they provide reasonably accurate estimates of the variance of A. (4) The estimates of (TV*,(T,~ and (T,~ are very imprecise for small or moderate values of yt. Therefore it does not appear feasible to try to reduce the bias of fr, by taking into account the correlation of the pseudovalues. (5) For all values of n, the trimmed jackknife is substantially biased. As regards the results given in Table 4, we note the following points. The sampling from the contaminated distribution (1 - E)F + EG should inflate the second and, to a minor extent, the third eigenvalue, while the first should remain almost unchanged. The behaviour of A,, and hJK reflects this pattern exactly, while the effects of the contamination on A are less important but they involve all the eigenvalues. The variances of the empirical distributions of A, A,, and hJK are considerably increased for all the eigenvalues and to a greater extent for he jackknife estimators. These effects are by no means negligible even for E = 0.05. Also, as the pa_rent distribution becomes more contaminated, the empirical distributions of A,, and A,, become more and more positively skewed. The jackknife estimators of the variance are damaged- even more seriously; the large variances of the empirical distributions of VJK, pJ, and h all/n suggest that these estimators, when used on contaminated samples, can produce totally unreliable results. This is confirmed by the empirical error rates of nominal 90% normal-based confidence intervals of the eigenvalues, derived
190
M. Romanazzi / Jackknife eigenualues
Table 3 Empirical moments from N replications
of the jackknife statistics for the eigenvalues of the covariance of the estimation procedure on samples of n observations Statistic
j = 1 Mean
n=30,
N=lOOO
Mean
Variance
Mean
Variance
1.88
0.216
0.924
0.0567
0.182
0.00233
L
1.78
0.253
1.01
0.0923
0.199
0.00297
4,
6.68
&K
1.80
GK
6.49
*.I,:a(0.05)
21.6 0.246 18.6
2.11
2.20
0.0810
0.00322
0.994
0.0855
0.195
0.00281
2.02
2.66
0.0732
0.00256
0.174 0.0573 0.363 - 0.350 --0.113
0.00248 0.00148 0.111 0.0981 0.102
1.55 6.25 37.0 1.26 - 30.7
0.204 15.7 3191 342 3765
0.881 1.52 25.5 - 9.88 - 20.6
0.0780 0.950 3180 242 6014
h^
1.85
0.100
0.975
0.0277
0.195
0.00130
&K
1.81
0.110
1.01
0.0350
0.204
0.00145
nr’,,
6.73
2.08
0.942
0.0848
0.00174
1.01
0.0342
0.203
0.00143
2.01
0.930
0.0814
0.00160
0.166 0.0719 0.461 - 0.451 -0.115
0.00107 0.00123 0.0874 0.0797 0.113
n
1
ff22 1 g12 . Cl1
N=500
Variance
j=3
i
Ull
n=60,
j=2
matrix. Results
h c d
1.81 6.56
11.2 0.109 10.4
n
1.47 6.48 35.7 2.09 - 35.1
0.0836 9.53 1156 191 2514
0.828 1.77 25.2 - 11.0 - 14.3
0.0274 0.652 1185 136 817
i
1.83
0.0679
0.980
0.0194
0.194
0.00074
AJK
1.80
0.0729
1.00
0.0243
0.200
0.00079
nQJK
6.64
7.89
2.08
0.693
0.0822
0.00093
LK
1.80
0.0728
1.00
0.0243
0.199
0.00079
6.52
7.55
2.04
0.730
0.0801
0.00089
0.164 0.0738 0.506 - 0.463 -0.111
0.00055 0.00075 0.0705 0.0379 0.114
*J,; n
Cl1 1
fl22
n=90,
N=250
(0.05) a
h
A
c
Cl2 n PI3
d
‘JK;;O.“S) n
VI1 1
h
u22 ,. u12
c
-
PI3
a ’ ’ d
Nominal Nominal Nominal Nominal
values: values: values: values:
”
1.48 6.48 35.4 2.68 - 40.7
0.0555 7.18 1067 123 3586
0.825 1.87 25.1 - 11.2 - 12.8
0.0182 0.493 923 72.2
u,,(Ar) = 6.48, u,,(Az) = 2, o,,(A~) = 0.08. = 0.612. u,,(A,) = 4.86, u,,(A,) = - 12, u12(A~) = - 0.54. uJAl) = -38.3, u,&A2) = - 12.5, u,,(A,) = -0.0725. u,,(A,) = 33.4, u2,(Az) = 24.5, u,,(A,)
from (h^,, - h)/p/,/2 and (A,, - AI/p;‘,/’ as pivots. The results in Table 5 show that, for the first and the third eigenvalue, yhose estimates are not distorted very much, the large variances given by V,, and pJK produce confidence
M. Romanazzi / Jackknife eigenvalues Table 4 Empirical moments of the jackknife statistics contamination of the nominal distribution. procedure on samples of 30 observations Statistic
j=2 Variance
1.97 1.77
Variance
Mean
Variance
0.114
0.204
0.00414
1.12
0.250
1.30
123
1.82
j=3
Mean
0.248
8.37
c=0.07
for the eigenvalues of the covariance matrix under Results from 1000 replications of the estimation
;=1 Mean
191
0.277
5.66
0.282
71.1
1.25
0.216
0.227
0.00596
0.149
0.0503
0.221
0.00535
10.7
461
4.99
66.0
0.128
0.0356
1.50 11.0
0.194 434
1.06 3.09
0.152 13.3
0.191 0.0934
0.00356 0.0148
h^
2.15
0.416
1.29
0.161
0.225
0.00536
L
1.87
0.376
1.53
0.395
0.252
0.00776
0.215
0.127
0.244
0.00697
n nYlK
14.2
819
1.95
A JK
8.12
0.487
1.46
III?JK
19.4
1652
7.73
LK;
1.50 19.4
0.205 1432
1.24 4.05
2.25
0.513
1.94
0.487
(0.05)
A
p11
A JK n
17.4
6K
1165
2.04
&K &K I * JK; (0.05) A Ull
108 0.351 405
0.188
0.102
0.221 19.0
0.206 0.138
0.00402 0.0462
1.38
0.180
0.247
0.00702
1.67
0.456
0.278
0.0105
0.284
0.218
0.269
0.00931
0.244
0.151
0.225 0.175
0.00529 0.0652
8.54
0.576
116
1.57
22.7
1848
7.18
1.52 23.2
0.216 1659
1.38 3.97
0.354 136 0.284 11.4
intervals which are almost uninformative but still include the true parameter, so that the empirical error rate is gradually reduced. On the contrary, for the second eigenvalue, whose variances are less inflated, the empirical error rate
Table 5 Empirical error rate of nominal 90% confidence intervals of the eigenvalues matrix for various levels of contamination of the parent distribution. Results tions of the estimation procedure on samples of 30 observations
of the covariance from 1000 replica-
_ &K
E=O E = 0.05 E = 0.07 E = 0.1
AJK
j=l
;=2
j=3
j=l
j=2
j=3
16.8 17.4 14.9 14.3
18.2 19.0 26.1 35.4
14.4 14.8 12.9 14.6
16.2 15.4 12.7 10.3
18.3 17.8 23.1 32.8
16.6 16.0 14.3 14.9
192
M. Romanazzi / Jackknife eigenvalues
shows a constant rise as the amount of contamination is increased. From Table 4 it is apparent that the trimmed jackknife behaves satisfactorily on the third eigenvalue, it is moderately distorted on the second and it consistently underestimates the first, as for E = 0.
4. Jackknife estimation 4.1. Differentiability
of the eigenvalues
properties
of correlation
matrices
of the eigenvalues
Let L(F) = diag[fi(F)]. The eigenvalues Aj = Aj(F) of the correlation matrix R(F) = L(F)-“2R(F)L(F)-“2 are the roots of the generalized eigenequation det[R(F) - tL(F)] = 0. Further, if aj = ai is a p-vector satisfying O(F>aj = AjL(F)aj, with aTL(F)aj = 1, then pi = pj(F> = [ L(F>]‘/2aj is a unit-length eigenvector of R(F) associated with Aj. Let F be perturbed to F = (1 - E)F + EG and let hi(e) and ai satisfy fl(e>aj(e) = Aj(c)L(c)aj(e), with aj(c)TL(e)aj(c) = 1, j = 1,. . . , p. Here L(E) = diag[fl(e)] = L(F) + EL, + ie2L2, with Li = diag(Ri), i = 1, 2. If A, E ]A,, **. , Ap) is a simple root of det[R( F) - tL(F)] = 0 and a, is the corresponding generalized eigenvector, their perturbations A&E) E {Al(~), . . . , Ap(~)] and a,,(E) are infinitely differentiable functions in an interval centered at E = 0 (Romanazzi, 1989). Putting Ci = fli - A,L,, i = 1, , we obtain A$) = aTC 0 1 a 0,
ah” = - ToCIao - +(a~L,a,)a,;
A(i)= a;f(C, - A$)L,)ao + 2a;fC,ab’), @
= - T,( C, - 2A$)Ll)ao
- 2T,[ C, - A($L( F)] ah’)
- 2[ azL,ab’) + $z;fL,a, A(i) = - Gaul,
+ A$)L,)a,
+ u(o’)=L(F)ah’)] a,; + 3aTC,ar)
+ 3aiC,a#’
+ 3A(i1’af)TL( F)ab’),
with To = C,,,( Aj - A,) - ‘aj$. From these results we obtain a Taylor expansion for the functional A,(G)
=Ao(F)
+;
+g//&b
+
//
/qh;
(
6?2Xl,X
A,(G):
F) d[G(xd -F(x,)l 2;
x2,
F)& x3;
d[G(q) -F(q)] F)n;=l
d[G(Xi) -J’(xi)]
+ . .. >
(4.1)
193
M. Romanazzi / Jackknife eigenoalues
where (4.2)
-
2Cc,,,( Aj - A,)-
’ [P,‘uP,‘u- A,&diag( UuT)Pj]
x [ P~uP~L’ - A,#zdiag( - ql(x;
F)P;fdiag(~~T)P,,
L:UT)P,] - q,(y;
F)P~diag(UUT)PO,
(4.3)
with u=
[L(F)]-1’2[x-#)],
u = [L(F)]
-1’2[ y -p(F)].
The expression of q&x, y, z; F) iz too long and so it is not given here. As in Section 2.2, putting G = F,, in (4.11, we obtain A@) Moreover, A,(&,)
=A,+n-’ putting
Ciq1(Xi;
F) + ~n~2CiChq2(Xi,
X,; F) + . . . .
F = &, G = incij in (4.11, we get
=A,(&)
- (IZ - l))‘q,(X,;
Fn) + ;(n - l)-2q2(X,,
Xi; fin) + *. . ,
and then Po,i=A,(~~)+q,(Xj;
+f(n-l)-‘q2(Xi,
Xi; pn)+
..*
and /i O,JK=AO(&) - ;[+z
- l)]-‘Ciq2(X,,
X,; Fn) + a*. ,
(4.4)
since &q,(X,; finI = 0. Once again ql(x; in) and q2(x, x; Fn), whose expressions follow from (4.2) and (4.3) by replacing population parameters with the corresponding sample estimates, depend only on the eigenvalues and eigenvectors_ of the sample correlation matrix. Thus, second-order approximations of A,(Fnc,,) and Poi, as well as (approximate) estimates of the influence function values P,,i - A,i14’,), i=l Also, taking the first two terms on the right -., n, are easily obtained. hand side of (4.41, gives the infinitesimal jackknife estimator, /IO,JK, of A,. The behaviour of the influence functions q,(x; inI and q2(x,’ x; fJ is more complicated than their covariance matrix analogues. However, two important features are confirmed, i.e., ql(x; &) is unbounded and q2(x, x; &) can give extreme values for very close eigenvalues. For a normal distribution with mean vector p(F) and covariance matrix R(F)
= A, + ‘1-‘A,,(2/3r]~(
Rt21 - A,$)
194
M. Romanazzi / Jackknife eigenvalues
where @I=&, *PO and R [*I= R * R with * denoting Hadamard product. Under the same hypothesis, the asymdtotic variance of A,(&> is (~~r/n, where @II =E,[q,(X;
F)]* = 2n2,[1 +P[;1T(R[21-
2h,I)/3a1].
This expression was first derived by Girshick in terms of the vector &, = h1,/*& (Girshick, 1939, Eq. 3.25). We also obtain EF[ ql( x; F)13 = 8A3,(1- 3h,p~j’p~’ -CJ
+ 3hOCiCh~;j&$-~h h
c k p*.p* 01 Oh p*Ok Y ih r ik Y hk ) 9
under normality, and this suggests that the distribution substantially skewed.
of ql(X; F) may be
4.2. Numerical examples To explore the finite sample behaviour of the jackknife estimators we performed an experiment similar to that described in Section 3.1. A pseudo-random sample of size y1= 19 was drawn from a trivariate normal distribution with mean vector p=(OOO)Ta n d covariance matrix R = (wij), where wI1 = 4, 02* = 1, 033 = 0.25, 0 12= 1.6 and w13 = w23 = 0. The correlation matrix R is identical with the covariance matrix of the example discussed in Section 3.1. To simulate contamination of the theoretical distribution, we added to the basic set of 19 observations a data point x2o = (2~4 -u OIT, with u = 0, 1, 1.5, 2, 2.5. On the five data sets so obtained the eigenvalues of the sample correlation matrix together with their jackknifed versions were evaluated.
Table 6 Eigenvalues 4.2 u
0
1 1.5 2 2.5 a u
I 1 1.5 2 2.5 a
matrix and jackknife
statistics
j=l
j=2
j=3
j=l
j=2
j=3
1.87 1.78 1.68 1.55 1.41
0.995 0.995 0.995 0.995 0.994
0.135 0.224 0.328 0.456 0.594
1.74 1.65 1.53 1.36 1.14
1.12 1.13 1.13 1.15 1.16
0.135 0.221 0.335 0.496 0.696
j=2
j=3
j=l
j=2
j=3
1.09 1.09 1.10 1.11 1.13
0.135 0.222 0.334 0.484 0.660
0.00825 0.0176 0.0461 0.103 0.185
0.00795 0.00903 0.0107 0.0139 0.0203
0.00350 0.0130 0.0417 0.0985 0.179
L 1.77 1.68 1.56 1.40 1.21
a Jackknifed
for the data sets described
in section
?I,
i
j=l
0
of the correlation
j=l
j=2 0.0112 0.0214 0.0537 0.122 0.230
0.0112 0.0125 0.0143 0.0175 0.0229
j=3 0.00410 0.0149 0.0480 0.117 0.225
SIC,,(hj)
eigenvalues
not ordered.
j=l -0.002 -1.61 -3.45 -5.59 -7.71
j=2
j=3
= 0 0.006 0.005 0.001 - 0.010
0.002 1.61 3.44 5.59 7.72
M. Romanazzi / Jackknife eigenualues
195
The results summarized in Table 6 confirm tha! li‘,, and are strongly affected by outlying observations, even more than A. The same remark applies to the jackknife estimators of the variance, which rapidly deteriorate as xZO is shifted farther from the main body of the data. For the highest values of U, confidence intervals for the first and the third eigenvalue turn out to be practically uninformative. The characteristics of the distribution of the jackknife estimators were checked along the lines of Section 3.2. A$ replications of the estimation procedure were performed on pseudo-random samples of size ~1, (i = 1, 2, 3) from the normal Table 7 Empirical moments from N replications
of the jackknife statistics for the eigenvalues of the correlation of the estimation procedure on samples of IZ observations Statistic
j=2
j = 1 Variance
Mean N=30,
N=lOOO
matrix. Results
;=3
Mean
Variance
Mean
Variance
1.85
0.0116
0.945
0.00630
0.199
0.00507
‘JK
1.80
0.0171
0.994
0.0115
0.204
0.00580
“JK
0.568
0.206
0.407
0.136
0.169
0.0171
0.0100
0.203
0.00559
0.119
0.152
0.0134
i n
‘JK
1.81
0.0155
0.982
ltVJK
0.528
0.175
0.375
1
A JK; (0.05) A a @II n @22 . 012
1.80 0.439 9.01 - 1.58
0.0103 0.0742 140 4.95
0.208 0.118 0.848 - 0.747
0.00611 0.00749 0.992 0.516
0.00433
0.969
0.00227
0.198
0.00212
A JK
1.80
0.00531
0.996
0.00322
0.200
0.00220
“JK
0.385
0.0844
0.245
0.0604
0.144
0.00544
A JK
1.81
0.00516
0.993
0.00308
0.200
0.00219
0.375
0.0804
0.237
0.0574
* JK, (0.05) A a fl11 A u22 n Cl2
N=250
0.972 0.293 8.36 - 1.52
1.83
“JK n
N=90,
0.0161 0.107 152 7.45
1.79 0.348 12.7 - 0.881
0.137
0.00492
0.00531 0.0632 156 5.42
0.979 0.208 12.1 - 1.14
0.00309 0.0455 152 3.51
0.201 0.121 0.954 - 0.797
0.00237 0.00364 0.544 0.303
i
1.82
0.00300
0.977
0.00116
0.198
0.00172
‘JK
1.80
0.00340
0.995
0.00152
0.200
0.00176
“JK
0.334
0.0542
0.194
0.0416
0.142
0.00388
‘JK
1.81
0.00336
0.994
0.00149
0.200
0.00176
0.327
0.0521
0.190
0.0400
0.138
0.00364
0.200 0.126 1.00 - 0.892
0.00181 0.00292 0.510 0.320
nvJK n * JK, (0.05) n a Ull . u22 n u12
a Nominal values: u,,(h,) ability 1.
= (or,
1.79 0.312 12.6 - 0.794
0.00327 0.0441 90.3 3.71
= 0.130, u,r(h2)
0.983 0.173 12.0 - 1.11
0.00152 0.0340 90.6 2.91
= 0: under normality,
q&Y; F) = 0 with prob-
196
M. Romanazzi / Jackknife eigenualues
distribution specified above. The main results are gjven in Table 7 and can be summarized as follows. (1) For IE = 30, the MSE of A,, is greater than the MSE of i. (2) &k and A,, are equivalent for sample sizes n > 60. (3) For a!1 values of are positively biased for the variance of A. (4) The II, PJKY V,, and gll/n trimmed jackknife underestimates the second eigenvalue, but it behaves satisfactorily on the first and the thirp. *We also note-that, for the second eigenvalue, the empirical distributions of A, AJK, A,, and hJK;(0,05)are markedly skewed: for yt = 90 the skewness index of the four estimators is about - 2. This may indicate that approximate normality of the jackknife statistic for the eigenvalues of the correlation matrix requires higher sample sizes than those needed for the covariance matrix. To check the sensitivity of the jackknife estimators under contamination of the underlying distribution, pseudo-random samples were drawn from the mixture distribution (1 - E)F + EG, where F is the previous normal distribution and G is a trivariate normal distribution with mean vector p = (00 OjT and covariante matrix L?, = 161. We chose Ed = 0.05, E* = 0.07, Ed = 0.1, and we performed 1000 trials with y1= 30 for:ach value of E. The results given in Table 8 suggest the following conclusions. A,, and, to a lesser extent, A,, are vulnerable estimators. The large variances of the empirical distributions of ,VJK and _I/k prove that they can give unreliable estimates of the variance of A,, and iJK, when using contaminated data. The trimmed jackknife behaves better than A,,; however, it is more affected than A by the contamination.
5. Conclusions When the parent distribution is multivariate normal, the jackknife procedure reduces the bias of the usual estimator of the eigenvalues of the covariance (or correlation) matrix and gives an estimate of its standard error. Both theoretical considerations and Monte Carlo simulations showed that, for small or moderate sample sizes, PJK can have substantial bias, so giving confidence inteyals whose confidence level is different from the nominal one. Correction of V,, to take into account the correlation of the pseudovalues seems difficult, because the estimates of uZ2, (TV*and c13 are imprecise. The infinitesimal jackknife AJK, and the related estimator of the variance pJK, are fast and easy statistics to calculate and they provide good approximations of /i,, and fJ, even for y1= 20. As known from other applications, e.g., estimation of the linear correlation coefficient (Hinkley, 19781, the jackknife method proved highly sensitive to contamination of the data. In these situations the point estimate of the variance large and the resulting confidence intervals given by fJ, is often exaggeratedly too wide to be of practical interest. An empirical method to overcome these undesirable features @ to carry^out a preliminary screening of the sample followed by calculation of A,, and V,, on
M. Romanazzi / Jackknife eigenualues Table 8 Empirical moments of the jackknife statistics contamination of the nominal distribution. procedure on samples of 30 observations Statistic
j=l Mean
E =
0.05
i
197
for the eigenvalues of the correlation matrix under Results from 1000 replications of the estimation
j=2 Variance
Mean
j=3 Variance
Mean
Variance
1.79
0.0416
0.973
0.0212
0.240
0.0165
A JK
1.66
0.124
1.04
0.0697
0.299
0.0432
&K
1.70
4.29
0.973
1.35
0.591
1.76
&K
1.73
0.0983
1.02
0.0639
0.255
0.0272
&K A
1.61
4.31
1.14
2.64
0.423
0.750
1.71 0.996
0.0523 1.30
1.01 0.680
0.0330 0.649
0.265 0.235
0.0175 0.125
1.73
0.0527
0.974
0.0281
0.295
0.0273 0.113
I
A,,: ,. fill
E = 0.07
(0.05)
h^ &K
1.54
0.176
1.04
0.124
0.426
.?IK
2.81
8.10
1.74
4.09
1.46
7.01
1.65
0.134
1.01
0.0961
0.336
0.0646
2.63
8.89
1.96
6.44
1.15
11.5
1.62 1.47
0.0788 2.28
1.02 0.997
0.0526 1.22
0.354 0.486
0.0501 0.529
i
1.66
0.0514
0.977
0.0249
0.365
0.0313
A JK
1.43
0.194
1.02
0.122
0.547
0.151
&K
3.18
7.77
1.78
2.80
2.07
8.84
&K
1.55
0.184
1.02
0.132
0.426
0.0730
nqK A
3.07
8.64
2.31
7.16
1.57
9.60
1.52 1.76
0.0996 2.56
1.02 1.18
0.0568 1.28
0.447 0.719
0.0737 0.758
ncK n A,,; A Cl1
l =O.l
*.I,; . g11
(0.05)
(0.05)
the sample with the deviant observations removed. The values SIC,(hj> are useful diagnostics for recognition of extreme observations. In the present context, Hinkley’s suggestion of a trimmed mean of the pseudovalues to “robustify” the classical jackknife can lead to a poor estimator, because the distribution of the first derivative of the eigenvalues is asymmetric. Probably, to obtain truly robust estimators of the eigenvalues, the starting point should be a robust estimate of the covariance or correlation matrix, for example one of those suggested by Devlin et al. (1975).
Acknowledgements
This work was supported in part by a grant from the Ministry for the University and Scientific Research and in part by a National Research Council grant.
198
M. Romanazzi / Jackknife eigenvalues
References Critchley, F., Influence in principal components analysis, Biometrika, 72 (1985) 627-636. Devlin, S.J., R. Gnanadesikan and J.R. Kettenring, Robust estimation and outlier detection with correlation coefficients, Biometrika, 62 (1975) 531-54.5. Girshick, M.A., On the sampling theory of the roots of determinantal equations, Ann. Math. Statist., 10 (1939) 203-224. Hampel, F.R., The influence curve and its role in robust estimation, J. Amer. Statist. Assoc., 69 (1974) 383-393. Hinkley, D.V., lmproving the jackknife with special reference to correlation estimation, Biometrika, 65 (1978) 13-21. Hinkley, D.V. and H.L. Wang, A trimmed jackknife, J. Roy. Statist. Sot. Ser. B, 42 (1980) 347-356. Knott, M. and C.C. Frangos, Variance estimation for the jackknife using von Mises expansions, Biometrika, 70 (1983) 501-504. Magnus, J.R. and H. Neudecker, Matrix Differential Calculus (John Wiley, New York, 1988). Miller, R.G., The jackknife - a review, Biometrika, 61 (1974) l-15. Nagao, H., On the limiting distributions of the jackknife statistics for eigenvalues of a sample covariance matrix, Comm. Statist. A - Theory Methods, 14 (1985) 1547-1567. Nagao, H., On the jackknife statistics for eigenvalues and eigenvectors of a correlation matrix, Ann. Inst. Statist. Math., 40 (1988) 477-489. Romanazzi, M., Derivatives of eigenvalues and eigenvectors in the generalized symmetric eigenproblem dependent on a parameter. Submitted for publication. (1989) Waternaux, C.M., Asymptotic distribution of the sample roots for a nonnormal population, Biometrika, 63 (1976) 639-645.