Statistics & Probability Letters 15 (1992) 21-26 North-Holland
3 September 1992
Robust statistics for test-of-independence and related structural models
Yutaka Kano
University of Osaka Prefecture, Osaka, Japan
Received September 1991
Abstract:
Recent research on asymptotic robustness shows that the likelihood ratio (LR) test statistic for test-of-independence based on normal theory remains valid in a general case where only independence is assumed. In contrast, under elliptical populations the LR statistic is correct only if a kurtosis adjustment is made. Thus, the LR statistic itself is available for the first case, whereas a certain correction is needed for the second framework, which is seriously inconvenient for practitioners. In this article, we propose an alternative adjustment to the LR statistic which can be utilized for both of the distribution families. The theory is derived in the context of general linear latent variate models.
AMS 1980 Subject Classifications: Primary 62F05; Secondary 62F35.
Keywords: Covariance structures; elliptical distributions; factor analysis; goodness-of-fit statistics; multivariate kurtosis.
1. Test-of-independence and robust statistic
Let an observable random $p$-vector $x$ be partitioned into $[x_1', \ldots, x_G']'$ with $x_g$ being of $k_g \times 1$ $(g = 1, \ldots, G)$, and let the population and sample covariance matrices $\Sigma$ and $S$ be partitioned into $[\Sigma_{gh}]$ and $[S_{gh}]$ in accord with the partition of $x$. The test-of-independence is then described as

\[ H_0\colon\ \Sigma_{gh} = 0 \ \text{ for } g \neq h \quad \text{versus} \quad H_1\colon\ \Sigma > 0. \tag{1.1} \]

The likelihood ratio test statistic is equivalent to

\[ -2\log\lambda = n\Bigl\{ \sum_{g=1}^{G} \log|S_{gg}| - \log|S| \Bigr\}, \tag{1.2} \]

where $n$ is the sample size minus one; see, e.g., Anderson (1984, Section 9). One way of getting the distribution of $-2\log\lambda$ is to use general asymptotic theory, by which one obtains

\[ -2\log\lambda \xrightarrow{\,L\,} \chi^2_{p^*-q} \tag{1.3} \]

as $n \to \infty$, where $p^* = \tfrac12 p(p+1)$ and $q = \sum_{g=1}^{G} \tfrac12 k_g(k_g+1)$, so that $p^* - q = \sum_{g < h} k_g k_h$. Recently, Anderson (1989) showed that (1.3) remains true for a quite general family of distributions in which only independence among the $x_g$'s is required. This class of distributions will be labeled $C_I$. It should be noted that the fourth-order moments of $x$ need not be finite in his framework.

Correspondence to: Yutaka Kano, Department of Mathematical Sciences, College of Engineering, University of Osaka Prefecture, Sakai, Osaka 591, Japan.
0167-7152/92/$05.00 © 1992 - Elsevier Science Publishers B.V. All rights reserved
Elliptical distributions have been used to investigate robustness properties of normal theory statistics (e.g., Mardia, 1974). In this context elliptical distributions are characterized by a kurtosis parameter $\eta_E$, in such a way that the kurtosis parameters of the marginal variables are all the same. Muirhead and Waternaux (1980) showed that in an elliptical population

\[ -2\log\lambda / \hat\eta_E \xrightarrow{\,L\,} \chi^2_{p^*-q}, \tag{1.4} \]

where $\hat\eta_E$ is a consistent estimator of the kurtosis parameter $\eta_E$. Many authors (e.g., Bentler, 1989; Browne, 1982) have recommended $\hat\eta_E$ based on (rescaled) Mardia's multivariate kurtosis (Mardia, 1970):

\[ \eta_M = E\bigl[\{(x-\mu)'\Sigma^{-1}(x-\mu)\}^2\bigr] / \{p(p+2)\}, \tag{1.5} \]

where $\mu$ is the population mean vector. The class of elliptical distributions will be denoted by $C_E$. The result to be obtained here is applicable not only to the LR statistic but also to statistics that have the properties in (1.3) and (1.4), such as Nagao's (1973).

Notice that the convergence in (1.3) is not true for $C_E$ with $\eta_E \neq 1$; that in (1.4) is not generally true for the first class $C_I$, either. This leaves practitioners uncertain about which treatment, division by $\hat\eta_E$ or no adjustment, should be applied. In this article we suggest the LR statistic adjusted by
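\[ \eta_{NEW} = \sum_{g < h}^{G} \frac{E\bigl[(x_g-\mu_g)'\Sigma_{gg}^{-1}(x_g-\mu_g)\,(x_h-\mu_h)'\Sigma_{hh}^{-1}(x_h-\mu_h)\bigr]}{\tfrac12 k_g k_h G(G-1)}. \tag{1.6} \]

Then we can establish

\[ -2\log\lambda / \hat\eta_{NEW} \xrightarrow{\,L\,} \chi^2_{p^*-q} \tag{1.7} \]

for both $C_I$ and $C_E$. The key property of $\eta_{NEW}$ for (1.7) to hold is that

\[ \eta_{NEW} = \begin{cases} 1 & \text{for } C_I, \\ \eta_E & \text{for } C_E. \end{cases} \tag{1.8} \]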
The Mardia multivariate kurtosis $\eta_M$ in (1.5) does not have this property. In the next section, we shall provide a general theory for how to construct such an $\eta_{NEW}$ for general linear latent variate models. There is a close relationship between our result and the work of Satorra and Bentler (1988a,b), in which they proposed a scaling correction to LR statistics that possesses the property in (1.8) under the assumption that the fourth-order moments are finite. Their scaling correction involves operations on the huge matrix of fourth-order moments, although it is applicable to general covariance structures $\Sigma(\theta)$. The new formula proposed here is much simpler and does not require the fourth-order moments to be finite, while a certain assumption on covariance structures is made (see (2.4)).
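The statistics of this section can be sketched numerically as follows. This is a minimal illustration rather than the author's implementation: the function names are hypothetical, the expectations in (1.5) and (1.6) are replaced by plain sample averages with divisor $N$ (an assumption), and the data are simulated independent normal blocks.

```python
import numpy as np

def block_slices(sizes):
    """Index slices of the consecutive blocks x_1, ..., x_G."""
    edges = np.cumsum([0] + list(sizes))
    return [slice(a, b) for a, b in zip(edges[:-1], edges[1:])]

def mahalanobis_sq(Z, S):
    """Row-wise quadratic forms z_i' S^{-1} z_i."""
    return np.einsum("ij,jk,ik->i", Z, np.linalg.inv(S), Z)

def lr_independence(X, sizes):
    """LR statistic (1.2) and its degrees of freedom p* - q."""
    N, p = X.shape
    n = N - 1                                   # sample size minus one
    S = np.cov(X, rowvar=False)                 # sample covariance, divisor n
    logdet = lambda M: np.linalg.slogdet(M)[1]
    stat = n * (sum(logdet(S[b, b]) for b in block_slices(sizes)) - logdet(S))
    df = (p * p - sum(k * k for k in sizes)) // 2   # equals sum over g < h of k_g k_h
    return stat, df

def eta_mardia_hat(X):
    """Sample analogue of the rescaled Mardia kurtosis (1.5)."""
    p = X.shape[1]
    d = mahalanobis_sq(X - X.mean(axis=0), np.cov(X, rowvar=False))
    return np.mean(d ** 2) / (p * (p + 2))

def eta_new_hat(X, sizes):
    """Sample analogue of (1.6): average over pairs g < h of mean(d_g d_h) / (k_g k_h)."""
    Xc = X - X.mean(axis=0)
    S = np.cov(X, rowvar=False)
    d = [mahalanobis_sq(Xc[:, b], S[b, b]) for b in block_slices(sizes)]
    G = len(sizes)
    pairs = [(g, h) for g in range(G) for h in range(g + 1, G)]
    return sum(np.mean(d[g] * d[h]) / (sizes[g] * sizes[h]) for g, h in pairs) / len(pairs)

# Illustrative data only: three independent normal blocks of sizes 2, 2 and 1.
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 5))
sizes = [2, 2, 1]
stat, df = lr_independence(X, sizes)
print("LR statistic:", stat, "df:", df)
print("adjusted as in (1.7):", stat / eta_new_hat(X, sizes))
print("adjusted as in (1.4) with Mardia-based estimate:", stat / eta_mardia_hat(X))
```

Both estimators are invariant under a common rescaling or shift of the data, because they depend on the observations only through Mahalanobis-type quadratic forms.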
2. General formula

In a general linear latent variate model, an observable random $p$-vector $x$ is expressed in the form

\[ x = \mu + \sum_{g=0}^{G} \Lambda_g(\gamma) z_g, \tag{2.1} \]
where $\mu$ is a general mean vector and the $z_g$'s are latent $k_g$-vectors satisfying

\[ E(z_g) = 0, \quad E(z_g z_h') = 0 \ (g \neq h), \quad E(z_0 z_0') = \Phi_0(\tau), \quad E(z_g z_g') = \Phi_g \ (g = 1, \ldots, G). \]

Here $\gamma$ and $\tau$ are $r$- and $s$-vectors of parameters, respectively. Under this model, the covariance matrix $\Sigma$ of $x$ is represented as

\[ \Sigma = \Lambda_0(\gamma)\Phi_0(\tau)\Lambda_0(\gamma)' + \sum_{g=1}^{G} \Lambda_g(\gamma)\Phi_g\Lambda_g(\gamma)', \]

and the parameter vector to be estimated is $\theta = [\gamma', \tau', v(\Phi_1)', \ldots, v(\Phi_G)']'$. Here $v(A)$ denotes the $p^*$-vector of distinct elements of a $p \times p$ symmetric matrix $A$, with $p^* = \tfrac12 p(p+1)$. A similar notation is $\operatorname{vec}(A)$, which represents the $p^2$-vector obtained by stacking the $p$ columns of $A$.

The linear latent variate model was originally defined by Browne and Shapiro (1988), and was slightly modified by Anderson (1987, 1989). A main concern of this article is to investigate the likelihood ratio test statistic for testing goodness-of-fit:
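\[ H_0\colon\ \Sigma = \Sigma(\theta) \quad \text{versus} \quad H_1\colon\ \Sigma > 0. \tag{2.2} \]

The test-of-independence described in Section 1 can be regarded as a linear latent variate model. Consider

\[ x = \mu + \sum_{g=1}^{G} \Lambda_g z_g, \qquad \Lambda_g = [0, \ldots, 0, I_{k_g}, 0, \ldots, 0]', \tag{2.3} \]

where $\Lambda_g$ is the $p \times k_g$ matrix with the identity in the $g$th block. Note that the $\Lambda$-matrices are known. Under the model, we have

\[ \Sigma = \operatorname{diag}(\Phi_1, \ldots, \Phi_G) \]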
with the $\Phi_g$'s free parameters. Thus, the hypothesis in (2.2) for the model in (2.3) is identical with that in (1.1).

The factor analysis model is also a typical example of the general linear latent variate models, in which

\[ x = \mu + \Lambda_1(\gamma) z_1 + e_1 z_2 + \cdots + e_p z_{p+1}, \]

where $e_j$ is the $j$th vector of the canonical basis in $\mathbb{R}^p$. We then have

\[ \operatorname{Cov}(x) = \Lambda_1(\gamma)\Phi_1\Lambda_1(\gamma)' + \Psi \]

with $\Psi = \operatorname{diag}(\Phi_2, \ldots, \Phi_{p+1})$.

Let

\[ F_{ML}(S, \Sigma(\theta)) = \log|\Sigma(\theta)| - \log|S| + \operatorname{tr}(\Sigma^{-1}(\theta)S) - p. \]
Under the normality assumption, the likelihood ratio (LR) statistic for testing (2.2) is represented in the form

\[ -2\log\lambda = n \cdot \min_{\theta} F_{ML}(S, \Sigma(\theta)) = n \cdot F_{ML}(S, \Sigma(\hat\theta)), \]
where $\hat\theta$ is the maximum likelihood estimator (MLE). For alternative test statistics, see Satorra (1989) and Satorra and Bentler (1991). In the case of test-of-independence, the LR statistic becomes (1.2).

When the normality assumption is met, the usual asymptotic chi-squaredness in (1.3) holds under the null hypothesis, where $q = \dim(\theta)$. Recent robustness research shows that the chi-squaredness is basically correct not only under normality but under more general distributional assumptions as well. Browne and Shapiro (1988) showed that the chi-squaredness in (1.3) holds true if the latent variates $z_g$'s are independent, the fourth-order moments of the $z_g$'s are finite, and the fourth-order cumulants of $z_0$ are zero. Anderson (1989) and Amemiya and Anderson (1990) showed that the convergence in (1.3) remains true even if the condition of finite fourth-order moments is dropped from the Browne and Shapiro assumptions. (Historically, Amemiya and Anderson's result is earlier than Browne and Shapiro's.) The class will be called $C_I$.

Browne (1982, 1984), Satorra and Bentler (1986), Shapiro and Browne (1987), and Tyler (1983) have investigated the asymptotic behavior of normal theory LR statistics in elliptical populations, and showed that the LR statistic adjusted by a kurtosis parameter is asymptotically chi-squared for a general covariance structure model $\Sigma(\theta)$ that satisfies an assumption of invariance under a constant scaling factor (ICSF): for any $\theta$ and $\alpha > 0$, there exists a $\theta^*$ such that $\alpha\Sigma(\theta) = \Sigma(\theta^*)$.

In order to construct a new formula for $\eta$ which possesses the property in (1.8), we assume that there exist two matrices $A$ and $B$, of $p \times m_1$ and $p \times m_2$, such that

\[ A'\Lambda_g(\gamma) = 0 \quad \text{or} \quad B'\Lambda_g(\gamma) = 0 \quad \text{for each } g \text{ with } g = 0, \ldots, G. \tag{2.4} \]
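It follows from this assumption that

\[ (A \otimes B)' \operatorname{vec}(\Sigma) = \operatorname{vec}(B'\Sigma A) = 0 \tag{2.5} \]

and

\[ A'x \text{ and } B'x \text{ are independent under } C_I. \tag{2.6} \]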
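The step from (2.4) to (2.5) and (2.6) can be spelled out as follows. This is a sketch that assumes the covariance structure $\Sigma = \sum_g \Lambda_g \Phi_g \Lambda_g'$ implied by (2.1) and writes $\Lambda_g$ for $\Lambda_g(\gamma)$; the index sets $I_A$ and $I_B$ are notation introduced here for illustration only.

```latex
% (2.5): every term of B' Sigma A contains either B' Lambda_g or Lambda_g' A,
% and one of the two vanishes for each g by (2.4).
\begin{align*}
B'\Sigma A &= \sum_{g=0}^{G} (B'\Lambda_g)\,\Phi_g\,(A'\Lambda_g)' = 0, \\
(A \otimes B)'\operatorname{vec}(\Sigma)
  &= (A' \otimes B')\operatorname{vec}(\Sigma)
   = \operatorname{vec}(B'\Sigma A) = 0 .
\end{align*}

% (2.6): let I_A = \{ g : A'\Lambda_g \neq 0 \} and I_B = \{ g : B'\Lambda_g \neq 0 \}.
% By (2.4) these two index sets are disjoint, and
\begin{align*}
A'x = A'\mu + \sum_{g \in I_A} A'\Lambda_g z_g, \qquad
B'x = B'\mu + \sum_{g \in I_B} B'\Lambda_g z_g ,
\end{align*}
% so A'x and B'x are functions of disjoint sets of z_g's,
% which are mutually independent under C_I.
```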
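Define

\[ \eta_{NEW} = E\bigl[(x-\mu)'A(A'\Sigma A)^{-1}A'(x-\mu) \cdot (x-\mu)'B(B'\Sigma B)^{-1}B'(x-\mu)\bigr] / m_1 m_2. \tag{2.7} \]

Let $x_1, \ldots, x_N$ be a sample of size $N$, and let $\hat A$ and $\hat B$ be consistent estimators of $A$ and $B$. Then $\eta_{NEW}$ can be consistently estimated by

\[ \hat\eta_{NEW} = \frac{1}{N} \sum_{i=1}^{N} (x_i-\bar x)'\hat A(\hat A'S\hat A)^{-1}\hat A'(x_i-\bar x) \cdot (x_i-\bar x)'\hat B(\hat B'S\hat B)^{-1}\hat B'(x_i-\bar x) / m_1 m_2, \]

where $\bar x$ and $S$ are the sample mean vector and sample covariance matrix. It is easily verified that $\eta_{NEW} = 1$ for $C_I$ in view of (2.6). Notice that existence of the fourth-order moments of $x$ is not assumed. To investigate the property of $\eta_{NEW}$ under $C_E$, we rewrite

\[ \eta_{NEW} = \operatorname{tr}\bigl[\{(A'\Sigma A)^{-1} \otimes (B'\Sigma B)^{-1}\}\,(A \otimes B)'\,E\{\operatorname{vec}((x-\mu)(x-\mu)')\operatorname{vec}((x-\mu)(x-\mu)')'\}\,(A \otimes B)\bigr] / m_1 m_2. \tag{2.8} \]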
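Under the elliptical population, the expectation in (2.8), which is the fourth-order moment, is expressed as

\[ E\{\operatorname{vec}((x-\mu)(x-\mu)')\operatorname{vec}((x-\mu)(x-\mu)')'\} = \eta_E\{(I_{p^2} + K_{pp})(\Sigma \otimes \Sigma) + \operatorname{vec}(\Sigma)\operatorname{vec}(\Sigma)'\} \tag{2.9} \]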
(see, e.g., Bentler, 1983, (3.12)), where $K_{pp}$ is the commutation matrix (see Magnus and Neudecker, 1988, Chapter 3). Noting that $K_{pp}(A \otimes B) = (B \otimes A)K_{m_2 m_1}$, substitution of (2.9) into (2.8) and use of (2.5) leads to $\eta_{NEW} = \eta_E$. As a consequence, it follows that $\eta_{NEW}$ defined in (2.7) enjoys the property in (1.8). For a general linear latent variate model, whether such a correction to LR statistics as in (1.7) can be made depends on whether matrices $A$ and $B$ satisfying (2.4) exist.

Consider the case of test-of-independence described in (2.3). Define

\[ A = [0, \ldots, 0, I_{k_g}, 0, \ldots, 0]', \qquad B = [0, \ldots, 0, I_{k_h}, 0, \ldots, 0]', \]

with the identity matrices in the $g$th and $h$th blocks, respectively, for $1 \le g < h \le G$. Then we easily verify that they meet (2.4), and the $\eta_{NEW}$ in (2.7) becomes

\[ E\bigl[(x_g-\mu_g)'\Sigma_{gg}^{-1}(x_g-\mu_g)\,(x_h-\mu_h)'\Sigma_{hh}^{-1}(x_h-\mu_h)\bigr] / k_g k_h \quad (= \eta_{gh}, \text{ say}). \]
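The reduction of (2.8) to $\eta_E$ can also be checked numerically. The sketch below assumes the standard elliptical fourth-moment matrix $\eta_E\{(I + K_{pp})(\Sigma \otimes \Sigma) + \operatorname{vec}(\Sigma)\operatorname{vec}(\Sigma)'\}$ for (2.9), a block-diagonal $\Sigma$ as under the test-of-independence model, and block-selection matrices for $A$ and $B$; all function names and the numerical values are hypothetical.

```python
import numpy as np

def commutation(p):
    """K_pp with K_pp vec(M) = vec(M') for a p x p matrix M (column-stacking vec)."""
    K = np.zeros((p * p, p * p))
    for i in range(p):
        for j in range(p):
            K[i * p + j, j * p + i] = 1.0
    return K

def selector(sizes, g):
    """p x k_g matrix [0, ..., I, ..., 0]' with the identity in block g (0-based)."""
    p, start = sum(sizes), sum(sizes[:g])
    E = np.zeros((p, sizes[g]))
    E[start:start + sizes[g], :] = np.eye(sizes[g])
    return E

def eta_new_elliptical(Sigma, A, B, eta_E):
    """Evaluate (2.8) with the assumed elliptical fourth-moment matrix (2.9)."""
    p = Sigma.shape[0]
    vec_sigma = Sigma.reshape(-1, 1, order="F")            # column-stacking vec
    M4 = eta_E * ((np.eye(p * p) + commutation(p)) @ np.kron(Sigma, Sigma)
                  + vec_sigma @ vec_sigma.T)
    P = np.linalg.inv(A.T @ Sigma @ A)
    Q = np.linalg.inv(B.T @ Sigma @ B)
    AB = np.kron(A, B)
    return np.trace(np.kron(P, Q) @ AB.T @ M4 @ AB) / (A.shape[1] * B.shape[1])

# Block-diagonal Sigma with arbitrary positive definite blocks (illustrative values).
rng = np.random.default_rng(0)
sizes = [2, 3, 1]
Sigma = np.zeros((sum(sizes), sum(sizes)))
start = 0
for k in sizes:
    L = rng.standard_normal((k, k))
    Sigma[start:start + k, start:start + k] = L @ L.T + k * np.eye(k)
    start += k

A, B = selector(sizes, 0), selector(sizes, 2)
print(eta_new_elliptical(Sigma, A, B, eta_E=1.7))   # agrees with eta_E up to rounding
```

The check works because $B'\Sigma A = 0$ removes both the commutation term and the $\operatorname{vec}(\Sigma)\operatorname{vec}(\Sigma)'$ term, leaving $\eta_E \operatorname{tr}(I_{m_1} \otimes I_{m_2}) / m_1 m_2$.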
Although $\eta_{gh}$ itself is available, a symmetrized version in terms of $x_1, \ldots, x_G$ may be preferable, which was given by (1.6).

For the factor analysis model, we may choose $A$ and $B$ such that

\[ A'[\Lambda_1(\gamma), e_1, \ldots, e_h] = 0 \quad \text{and} \quad B'[e_{h+1}, \ldots, e_p] = 0, \]

where $h$ is an arbitrary integer with $1 \le h < p - k_1$. Symmetrization is possible for this case as well.
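For the factor analysis case, one way to obtain such $A$ and $B$ and to evaluate the sample analogue of (2.7) is sketched below. The helper names, the use of an SVD to get an orthonormal null-space basis, the divisor $N$ in the sample average, and the simulated loadings and Laplace unique factors are all assumptions made for illustration; in practice the loading matrix would be replaced by a consistent estimate.

```python
import numpy as np

def null_basis(C, tol=1e-10):
    """Orthonormal basis of {a : C'a = 0}, the orthogonal complement of col(C)."""
    U, s, _ = np.linalg.svd(C, full_matrices=True)
    rank = int(np.sum(s > tol))
    return U[:, rank:]

def eta_new_hat(X, A, B):
    """Sample analogue of (2.7) with the sample mean and covariance plugged in."""
    Xc = X - X.mean(axis=0)
    S = np.cov(X, rowvar=False)
    U, W = Xc @ A, Xc @ B
    dA = np.einsum("ij,jk,ik->i", U, np.linalg.inv(A.T @ S @ A), U)
    dB = np.einsum("ij,jk,ik->i", W, np.linalg.inv(B.T @ S @ B), W)
    return np.mean(dA * dB) / (A.shape[1] * B.shape[1])

# Hypothetical factor model: p = 6 observed variables, k = 2 common factors, h = 2.
rng = np.random.default_rng(0)
p, k, h, N = 6, 2, 2, 5000
Lambda = rng.standard_normal((p, k))        # stands in for the loading matrix
I = np.eye(p)

A = null_basis(np.hstack([Lambda, I[:, :h]]))   # A'[Lambda, e_1, ..., e_h] = 0
B = null_basis(I[:, h:])                        # B'[e_{h+1}, ..., e_p] = 0

# Independent latent variates: normal common factors, Laplace unique factors.
X = rng.standard_normal((N, k)) @ Lambda.T + rng.laplace(size=(N, p))
print(A.shape, B.shape, eta_new_hat(X, A, B))   # the estimate should be near 1
```

Here $A'x$ involves only the unique factors $h+1, \ldots, p$, while $B'x$ involves the common factors and the unique factors $1, \ldots, h$, so the two quadratic forms are independent and the population value of (2.7) is 1.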
Acknowledgment
The author thanks Professors P.M. Bentler of UCLA, Los Angeles, H. Nagao of the University of Osaka Prefecture, Osaka, and A. Satorra of the University Pompeu Fabra, Barcelona, for instructive comments and discussions.
References

Amemiya, Y. and T.W. Anderson (1990), Asymptotic chi-square tests for a large class of factor analysis models, Ann. Statist. 18, 1453-1463.
Anderson, T.W. (1984), Introduction to Multivariate Statistical Analysis (Wiley, New York, 2nd ed.).
Anderson, T.W. (1987), Multivariate linear relations, in: T. Pukkila and S. Puntanen, eds., Proc. 2nd Inter. Tampere Conf. Statist. (University of Tampere, Finland) pp. 9-36.
Anderson, T.W. (1989), Linear latent variable models and covariance structures, J. Econometrics 41, 91-119.
Bentler, P.M. (1983), Some contributions to efficient statistics in structural models: Specification and estimation of moment structures, Psychometrika 48, 493-517.
Bentler, P.M. (1989), EQS Structural Equations Program Manual (BMDP Statistical Software, Los Angeles, CA).
Browne, M.W. (1982), Covariance structures, in: D.M. Hawkins, ed., Topics in Applied Multivariate Analysis (Cambridge Univ. Press, Cambridge) pp. 72-141.
Browne, M.W. (1984), Asymptotically distribution-free methods for the analysis of covariance structures, British J. Math. Statist. Psych. 37, 62-83.
Browne, M. and A. Shapiro (1988), Robustness of normal theory methods in the analysis of linear latent variate models, British J. Math. Statist. Psych. 41, 193-208.
Magnus, J.R. and H. Neudecker (1988), Matrix Differential Calculus with Applications in Statistics and Econometrics (Wiley, New York).
Mardia, K.V. (1970), Measures of multivariate skewness and kurtosis with applications, Biometrika 57, 519-530.
Mardia, K.V. (1974), Applications of some measures of multivariate skewness and kurtosis in testing normality and robustness studies, Sankhyā Ser. B 36, 115-128.
Muirhead, R.J. and C.M. Waternaux (1980), Asymptotic distributions in canonical correlation analysis and other multivariate procedures for nonnormal populations, Biometrika 67, 31-43.
Nagao, H. (1973), On some test criteria for covariance matrix, Ann. Statist. 1, 700-709.
Satorra, A. (1989), Alternative test criteria in covariance structure analysis, Psychometrika 54, 131-151.
Satorra, A. and P.M. Bentler (1986), Some robustness properties of goodness of fit statistics in covariance structure analysis, in: Proc. Bus. Econ. Statist. Sect. (Amer. Statist. Assoc., Providence, RI) pp. 549-554.
Satorra, A. and P.M. Bentler (1988a), Scaling corrections for statistics in covariance structure analysis, UCLA Statistical Series No. 2 (Los Angeles, CA).
Satorra, A. and P.M. Bentler (1988b), Scaling corrections for chi-square statistics in covariance structure analysis, in: Proc. Bus. Econ. Statist. Sect. (Amer. Statist. Assoc., Providence, RI) pp. 308-313.
Satorra, A. and P.M. Bentler (1991), Goodness-of-fit test under IV estimation: Asymptotic robustness of NT test statistic, in: R. Gutierrez and M.J. Valderrama, eds., Applied Stochastic Models and Data Analysis (World Scientific, London) pp. 555-567.
Shapiro, A. and M. Browne (1987), Analysis of covariance structures under elliptical distributions, J. Amer. Statist. Assoc. 82, 1092-1097.
Tyler, D.E. (1983), Robustness and efficiency properties of scatter matrices, Biometrika 70, 411-420.