Robust statistics for test-of-independence and related structural models

Statistics & Probability Letters 15 (1992) 21-26, North-Holland, 3 September 1992

Yutaka Kano

University of Osaka Prefecture, Osaka, Japan

Received September 1991

Abstract:

Recent research on asymptotic robustness shows that the likelihood ratio (LR) test statistic for the test-of-independence based on normal theory remains valid in a general case where only independence is assumed. In contrast, under elliptical populations the LR statistic is correct if a kurtosis adjustment is made. Thus, the LR statistic itself is available for the first case, whereas a certain correction is needed for the second, which is seriously inconvenient for practitioners. In this article, we propose an alternative adjustment to the LR statistic which can be utilized for both of the distribution families. The theory is derived in the context of general linear latent variate models.

AMS 1980 Subject Classifications: Primary 62F05; Secondary 62F35.

Keywords: Covariance structures; elliptical distributions; factor analysis; goodness-of-fit statistics; multivariate kurtosis.

1. Test-of-independence and robust statistic

Let an observable random $p$-vector $x$ be partitioned into $[x_1', \ldots, x_G']'$ with $x_g$ being of $k_g \times 1$ $(g = 1, \ldots, G)$, and let the population and sample covariance matrices $\Sigma$ and $S$ be partitioned into $[\Sigma_{gh}]$ and $[S_{gh}]$ in accord with the partition of $x$. The test-of-independence is then described as

$$H_0: \Sigma_{gh} = 0 \ \text{for } g \neq h \quad \text{versus} \quad H_1: \Sigma > 0. \tag{1.1}$$

The likelihood ratio test statistic is equivalent to

$$-2\log\lambda = n\left\{\sum_{g=1}^{G} \log|S_{gg}| - \log|S|\right\}, \tag{1.2}$$

where $n$ is the sample size minus one; see, e.g., Anderson (1984, Section 9). One way of getting the distribution of $-2\log\lambda$ is to use general asymptotic theory, by which one obtains

$$-2\log\lambda \xrightarrow{L} \chi^2_{p^*-q} \tag{1.3}$$

as $n \to \infty$, where $p^* = \tfrac12 p(p+1)$ and $q = \sum_{g=1}^{G} \tfrac12 k_g(k_g+1)$, so that $p^* - q = \sum_{g<h} k_g k_h$. Recently, Anderson (1989) showed that (1.3) remains true for a quite general family of distributions in which only independence among the $x_g$'s is required. This class of distributions will be labeled $C_I$. It should be noted that the fourth-order moments of $x$ need not be finite in his framework.

Correspondence to: Yutaka Kano, Department of Mathematical Sciences, College of Engineering, University of Osaka Prefecture, Sakai, Osaka 591, Japan.

0167-7152/92/$05.00 © 1992 - Elsevier Science Publishers B.V. All rights reserved
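As a rough numerical illustration of (1.2) and (1.3), the following Python sketch computes the LR statistic and its degrees of freedom from a data matrix. The function name `lr_independence` and the argument `block_sizes` (the list $k_1, \ldots, k_G$) are hypothetical names introduced here for illustration; they do not come from the paper. The statistic depends on $S$ only through a ratio of determinants, so the choice of divisor in the sample covariance does not affect it.

```python
import numpy as np

def lr_independence(X, block_sizes):
    """Return (-2 log lambda, degrees of freedom) for the test of independence.

    X is an N x p data matrix; block_sizes lists k_1, ..., k_G (assumed names).
    """
    X = np.asarray(X, dtype=float)
    N, p = X.shape
    if sum(block_sizes) != p:
        raise ValueError("block sizes must sum to the number of columns")
    n = N - 1                                # sample size minus one, as in (1.2)
    S = np.cov(X, rowvar=False)              # sample covariance matrix
    stat = -np.linalg.slogdet(S)[1]          # -log|S|
    start = 0
    for k in block_sizes:
        S_gg = S[start:start + k, start:start + k]
        stat += np.linalg.slogdet(S_gg)[1]   # + log|S_gg|
        start += k
    # p* - q equals the number of off-diagonal-block covariances, sum_{g<h} k_g k_h
    df = (p * p - sum(k * k for k in block_sizes)) // 2
    return n * stat, df
```

Under normality, or more generally under the independence class discussed above, the returned statistic would be referred to a chi-squared distribution with `df` degrees of freedom.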

Elliptical distributions have been used to investigate the robustness properties of normal theory statistics (e.g., Mardia, 1974). In this context elliptical distributions are characterized by a kurtosis parameter $\eta_E$ in such a way that the kurtosis parameters of the marginal variables are the same. Muirhead and Waternaux (1980) showed that in an elliptical population

$$-2\log\lambda / \hat\eta_E \xrightarrow{L} \chi^2_{p^*-q}, \tag{1.4}$$

where $\hat\eta_E$ is a consistent estimator of the kurtosis parameter $\eta_E$. Many authors (e.g., Bentler, 1989; Browne, 1982) have recommended $\hat\eta_E$ based on (rescaled) Mardia's multivariate kurtosis (Mardia, 1970):

$$\eta_M = E\left[\{(x-\mu)'\Sigma^{-1}(x-\mu)\}^2\right] / \{p(p+2)\}, \tag{1.5}$$

where $\mu$ is the population mean vector. The class of elliptical distributions will be denoted by $C_E$. The result to be obtained here is applicable not only to the LR statistic but also to statistics that have the properties in (1.3) and (1.4), such as Nagao's (1973).

Notice that the convergence in (1.3) is not true for $C_E$ with $\eta_E \neq 1$; that in (1.4) is not generally true for the first class $C_I$, either. This fact may leave practitioners unsure which treatment, division by $\hat\eta_E$ or no adjustment, should be applied. In this article we suggest the LR statistic adjusted by

$$\eta_{NEW} = \sum_{g<h}^{G} \frac{E\left[(x_g-\mu_g)'\Sigma_{gg}^{-1}(x_g-\mu_g)\,(x_h-\mu_h)'\Sigma_{hh}^{-1}(x_h-\mu_h)\right]}{\tfrac12 k_g k_h G(G-1)}. \tag{1.6}$$

Then we can establish

$$-2\log\lambda / \hat\eta_{NEW} \xrightarrow{L} \chi^2_{p^*-q} \tag{1.7}$$

for both $C_I$ and $C_E$. The key property of $\eta_{NEW}$ for (1.7) to hold is that

$$\eta_{NEW} = \begin{cases} 1 & \text{for } C_I, \\ \eta_E & \text{for } C_E. \end{cases} \tag{1.8}$$

The Mardia multivariate kurtosis $\eta_M$ in (1.5) does not have this property. In the next section, we shall provide a general theory for how to construct such an $\eta_{NEW}$ for general linear latent variate models.

There is a close relationship between our result and the work of Satorra and Bentler (1988a,b), in which they proposed a scaling correction to LR statistics that possesses the property in (1.8) under the assumption that the fourth-order moments are finite. The scaling correction involves operations on the huge matrix of fourth-order moments, although it is applicable to general covariance structures $\Sigma(\theta)$. The new formula proposed here is much simpler and does not require the fourth-order moments to be finite, while a certain assumption on covariance structures is made (see (2.4)).
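To make the contrast between the adjustment in (1.6) and the rescaled Mardia kurtosis in (1.5) concrete, here is a minimal Python sketch of their sample analogues, with population moments replaced by sample moments. The function names `eta_new_hat` and `eta_mardia_hat`, the helper `_block_quadratic_forms`, and the simple averaging over observations are assumptions made for illustration, not formulas quoted from the paper.

```python
import numpy as np

def _block_quadratic_forms(X, block_sizes):
    """Per-observation forms d_g = (x_g - xbar_g)' S_gg^{-1} (x_g - xbar_g), one array per block."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)
    S = np.cov(X, rowvar=False)
    forms, start = [], 0
    for k in block_sizes:
        Z = Xc[:, start:start + k]
        S_gg = S[start:start + k, start:start + k]
        forms.append(np.einsum("ij,ij->i", Z @ np.linalg.inv(S_gg), Z))
        start += k
    return forms

def eta_new_hat(X, block_sizes):
    """Sample analogue of (1.6): average of eta_gh over all pairs g < h."""
    forms = _block_quadratic_forms(X, block_sizes)
    G = len(block_sizes)
    total = 0.0
    for g in range(G):
        for h in range(g + 1, G):
            total += np.mean(forms[g] * forms[h]) / (block_sizes[g] * block_sizes[h])
    return total / (G * (G - 1) / 2)

def eta_mardia_hat(X):
    """Sample analogue of the rescaled Mardia kurtosis in (1.5)."""
    X = np.asarray(X, dtype=float)
    p = X.shape[1]
    d = _block_quadratic_forms(X, [p])[0]    # full Mahalanobis-type form
    return np.mean(d ** 2) / (p * (p + 2))
```

With independent but non-normal blocks (for instance, independent uniform coordinates), `eta_new_hat` should settle near 1 while `eta_mardia_hat` generally does not, which is the behavior that property (1.8) asks for.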

2. General formula

In a general linear latent variate model, an observable random $p$-vector $x$ is expressed in the form

$$x = \mu + \sum_{g=0}^{G} A_g(\gamma) z_g, \tag{2.1}$$

where $\mu$ is a general mean vector and the $z_g$'s are latent $k_g$-vectors satisfying

$$E(z_g) = 0, \quad E(z_g z_h') = 0 \ (g \neq h), \quad E(z_0 z_0') = \Phi_0(\tau), \quad E(z_g z_g') = \Phi_g \ (g = 1, \ldots, G).$$

Here $\gamma$ and $\tau$ are $r$- and $s$-vectors of parameters, respectively. Under this model, the covariance matrix $\Sigma$ of $x$ is represented as

$$\Sigma(\theta) = A_0(\gamma)\Phi_0(\tau)A_0(\gamma)' + \sum_{g=1}^{G} A_g(\gamma)\Phi_g A_g(\gamma)',$$

and the parameter vector to be estimated is $\theta = [\gamma', \tau', v(\Phi_1)', \ldots, v(\Phi_G)']'$. Here $v(A)$ denotes a $p^*$-vector of distinct elements of a $p \times p$ symmetric matrix $A$ with $p^* = \tfrac12 p(p+1)$. A similar notation is $\mathrm{vec}(A)$, which represents a $p^2$-vector obtained by stacking the $p$ columns of $A$. The linear latent variate model was originally defined by Browne and Shapiro (1988), and was slightly modified by Anderson (1987, 1989).

A main concern of this article is to investigate the likelihood ratio test statistic for testing goodness-of-fit:

$$H_0: \Sigma = \Sigma(\theta) \quad \text{versus} \quad H_1: \Sigma > 0. \tag{2.2}$$

The test-of-independence described in Section 1 can be regarded as a linear latent variate model. Consider

$$x = \mu + \sum_{g=1}^{G} A_g z_g, \qquad A_g = [0, \ldots, 0, I_{k_g}, 0, \ldots, 0]', \tag{2.3}$$

where $A_g$ is the $p \times k_g$ matrix whose $g$th block is the identity matrix $I_{k_g}$. Note that the $A$-matrices are known. Under the model, we have

$$\Sigma = \mathrm{diag}(\Phi_1, \ldots, \Phi_G)$$

with the $\Phi_g$'s free parameters. Thus, the hypothesis in (2.2) for the model in (2.3) is identical with that in (1.1).

The factor analysis model is also a typical example of the general linear latent variate models, in which

$$x = \mu + A_1(\gamma) z_1 + e_1 z_2 + \cdots + e_p z_{p+1},$$

where $e_j$ is the $j$th vector of the canonical base in $\mathbb{R}^p$. We then have

$$\mathrm{Cov}(x) = A_1(\gamma)\Phi_1 A_1(\gamma)' + \Psi$$

with $\Psi = \mathrm{diag}(\Phi_2, \ldots, \Phi_{p+1})$.

Let

$$F_{ML}(S, \Sigma(\theta)) = \log|\Sigma(\theta)| - \log|S| + \mathrm{tr}(\Sigma^{-1}(\theta)S) - p.$$
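The notation of this section can be sketched in a few lines of Python. The covariance matrix of the model is assembled as the sum of $A_g \Phi_g A_g'$ over $g$, which follows from the latent vectors being uncorrelated. The helper names `vec`, `v`, `model_covariance` and `f_ml` are hypothetical and chosen only to mirror the symbols in the text; the ordering used inside `v` (column by column over the lower triangle) is one common convention and is an assumption here.

```python
import numpy as np

def vec(A):
    """Stack the columns of A into a single vector."""
    return np.asarray(A).reshape(-1, order="F")

def v(A):
    """Distinct elements of a symmetric matrix, lower triangle taken column by column."""
    A = np.asarray(A)
    i, j = np.triu_indices(A.shape[0])
    return A[j, i]                       # (j, i) with j >= i walks the lower triangle

def model_covariance(A_list, Phi_list):
    """Sigma = sum_g A_g Phi_g A_g' for mutually uncorrelated latent vectors z_g."""
    return sum(A @ Phi @ A.T for A, Phi in zip(A_list, Phi_list))

def f_ml(S, Sigma):
    """Normal-theory discrepancy log|Sigma| - log|S| + tr(Sigma^{-1} S) - p."""
    p = S.shape[0]
    logdet_Sigma = np.linalg.slogdet(Sigma)[1]
    logdet_S = np.linalg.slogdet(S)[1]
    return logdet_Sigma - logdet_S + np.trace(np.linalg.solve(Sigma, S)) - p
```

For the test-of-independence model, evaluating `f_ml` at the block-diagonal part of `S` gives the sum of $\log|S_{gg}|$ minus $\log|S|$, which is the quantity multiplied by $n$ in (1.2).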

Under the normality assumption, the likelihood ratio (LR) statistic for testing (2.2) is represented in the form

$$-2\log\lambda = n \cdot \min_{\theta} F_{ML}(S, \Sigma(\theta)) = n \cdot F_{ML}(S, \Sigma(\hat\theta)),$$

where $\hat\theta$ is the maximum likelihood estimator (MLE). For alternative test statistics, see Satorra (1989) and Satorra and Bentler (1991). In the case of the test-of-independence, the LR statistic becomes (1.2).

When the normality assumption is met, the usual asymptotic chi-squaredness in (1.3) holds under the null hypothesis, where $q = \dim(\theta)$. Recent robustness research shows that the chi-squaredness is basically correct not only under normality but under more general distributional assumptions as well. Browne and Shapiro (1988) showed that the chi-squaredness in (1.3) holds true if the latent variates $z_g$'s are independent, the fourth-order moments of the $z_g$'s are finite, and the fourth-order cumulants of $z_0$ are zero. Anderson (1989) and Amemiya and Anderson (1990) showed that the convergence in (1.3) remains true even if the condition of finite fourth-order moments is dropped from the Browne and Shapiro assumptions. (Historically, Amemiya and Anderson's result is earlier than Browne and Shapiro's.) The class will be called $C_I$.

Browne (1982, 1984), Satorra and Bentler (1986), Shapiro and Browne (1987), and Tyler (1983) have investigated the asymptotic behavior of normal theory LR statistics in elliptical populations, and showed that the LR statistic adjusted by a kurtosis parameter is asymptotically chi-squared for a general covariance structure model $\Sigma(\theta)$ that satisfies an assumption of invariance under a constant scaling factor (ICSF): for any $\theta$ and $\alpha > 0$, there exists a $\theta^*$ such that $\alpha\Sigma(\theta) = \Sigma(\theta^*)$.

In order to construct a new formula for $\eta$ which possesses the property in (1.8), we assume that there exist two matrices $A$ and $B$ of $p \times m_1$ and $p \times m_2$ such that

$$A'A_g(\gamma) = 0 \quad \text{or} \quad B'A_g(\gamma) = 0 \quad \text{for each } g \text{ with } g = 0, \ldots, G. \tag{2.4}$$

It follows from this assumption that

$$(A \otimes B)'\,\mathrm{vec}(\Sigma) = \mathrm{vec}(B'\Sigma A) = 0 \tag{2.5}$$

and

$$A'x \text{ and } B'x \text{ are independent under } C_I. \tag{2.6}$$

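Define

$$\eta_{NEW} = E\left[(x-\mu)'A(A'\Sigma A)^{-1}A'(x-\mu) \cdot (x-\mu)'B(B'\Sigma B)^{-1}B'(x-\mu)\right] / (m_1 m_2). \tag{2.7}$$

Let $x_1, \ldots, x_N$ be a sample of size $N$, and let $\hat A$ and $\hat B$ be consistent estimators of $A$ and $B$. The $\eta_{NEW}$ can be consistently estimated by

$$\hat\eta_{NEW} = \frac{1}{N m_1 m_2}\sum_{i=1}^{N} (x_i-\bar x)'\hat A(\hat A'S\hat A)^{-1}\hat A'(x_i-\bar x) \cdot (x_i-\bar x)'\hat B(\hat B'S\hat B)^{-1}\hat B'(x_i-\bar x),$$

where $\bar x$ and $S$ are the sample mean vector and sample covariance matrix. It is easily verified that $\eta_{NEW} = 1$ for $C_I$ in view of (2.6). Notice that existence of the fourth-order moments of $x$ is not assumed. To investigate the property of $\eta_{NEW}$ under $C_E$, we rewrite

$$\eta_{NEW} = \mathrm{tr}\left[\{(A'\Sigma A)^{-1} \otimes (B'\Sigma B)^{-1}\}(A \otimes B)'\,E\{\mathrm{vec}((x-\mu)(x-\mu)')\,\mathrm{vec}((x-\mu)(x-\mu)')'\}(A \otimes B)\right] / (m_1 m_2). \tag{2.8}$$

Under the elliptical population, the expectation in (2.8), which is the fourth-order moment, is expressed as

$$\eta_E\{(I_{p^2} + K_{pp})(\Sigma \otimes \Sigma) + \mathrm{vec}(\Sigma)\,\mathrm{vec}(\Sigma)'\} \tag{2.9}$$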

(see, e.g., Bentler, 1983, (3.12)), where $K_{pp}$ is the commutation matrix (see Magnus and Neudecker, 1988, Chapter 3). Noting that $K_{pp}(A \otimes B) = (B \otimes A)K_{m_2 m_1}$, substitution of (2.9) into (2.8) and use of (2.5) leads to $\eta_{NEW} = \eta_E$. As a consequence, it follows that $\eta_{NEW}$ defined in (2.7) enjoys the property in (1.8).

For a general linear latent variate model, whether such a correction to LR statistics as in (1.7) can be made depends on whether matrices $A$ and $B$ satisfying (2.4) exist. Consider the case of the test-of-independence described in (2.3). Define

$$A = [0, \ldots, 0, I_{k_g}, 0, \ldots, 0]' \quad \text{and} \quad B = [0, \ldots, 0, I_{k_h}, 0, \ldots, 0]',$$

so that $A'x = x_g$ and $B'x = x_h$, for $1 \le g < h \le G$. Then we easily verify that they meet (2.4), and the $\eta_{NEW}$ in (2.7) becomes

$$E\left[(x_g-\mu_g)'\Sigma_{gg}^{-1}(x_g-\mu_g)\,(x_h-\mu_h)'\Sigma_{hh}^{-1}(x_h-\mu_h)\right] / (k_g k_h) \quad (= \eta_{gh}, \text{ say}).$$

Although $\eta_{gh}$ itself is available, a symmetrized version in terms of $x_1, \ldots, x_G$ may be preferable, which was given by (1.6).

For the factor analysis model, we may choose $A$ and $B$ such that

$$A'[A_1(\gamma), e_1, \ldots, e_h] = 0 \quad \text{and} \quad B'[e_{h+1}, \ldots, e_p] = 0,$$

where $h$ is an arbitrary integer with $1 \le h < p - k_1$. Symmetrization is possible for this case as well.
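The argument that $\eta_{NEW}$ reduces to $\eta_E$ under an elliptical population can be checked numerically at the population level. The sketch below builds the commutation matrix, forms the elliptical fourth-moment matrix of (2.9), and evaluates the trace expression in (2.8). All function names are hypothetical, and the example matrices (a block-diagonal $\Sigma$ with selection matrices for $A$ and $B$, and the value 1.25 for the kurtosis parameter) are arbitrary illustrative choices, not taken from the paper.

```python
import numpy as np

def commutation_matrix(m, n):
    """K with K @ vec(M) = vec(M.T) for an m x n matrix M (column-stacking vec)."""
    K = np.zeros((m * n, m * n))
    for i in range(m):
        for j in range(n):
            # M[i, j] sits at position j*m + i in vec(M) and at i*n + j in vec(M.T)
            K[i * n + j, j * m + i] = 1.0
    return K

def elliptical_fourth_moment(Sigma, eta_E):
    """Fourth-order moment matrix of (2.9) for kurtosis parameter eta_E."""
    p = Sigma.shape[0]
    vS = Sigma.reshape(-1, 1, order="F")
    K = commutation_matrix(p, p)
    return eta_E * ((np.eye(p * p) + K) @ np.kron(Sigma, Sigma) + vS @ vS.T)

def eta_new_population(A, B, Sigma, M4):
    """Evaluate the trace form (2.8) for a given fourth-moment matrix M4."""
    P = np.linalg.inv(A.T @ Sigma @ A)
    Q = np.linalg.inv(B.T @ Sigma @ B)
    AB = np.kron(A, B)
    m1, m2 = A.shape[1], B.shape[1]
    return np.trace(np.kron(P, Q) @ AB.T @ M4 @ AB) / (m1 * m2)

# Illustrative check: two uncorrelated blocks (sizes 2 and 1) of a 3 x 3 Sigma.
Sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.0],
                  [0.0, 0.0, 3.0]])
A = np.eye(3)[:, :2]    # selects the first block, so A'x = x_1
B = np.eye(3)[:, 2:]    # selects the second block, so B'x = x_2
eta_E = 1.25            # arbitrary illustrative kurtosis parameter
M4 = elliptical_fourth_moment(Sigma, eta_E)
print(eta_new_population(A, B, Sigma, M4))   # agrees with eta_E up to rounding
```

Because $B'\Sigma A = 0$ here, the commutation-matrix term and the $\mathrm{vec}(\Sigma)$ term drop out of the trace, leaving only the contribution of $\Sigma \otimes \Sigma$.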

Acknowledgment

The author thanks Professors P.M. Bentler of UCLA, Los Angeles, H. Nagao of the University of Osaka Prefecture, Osaka, and A. Satorra of the University Pompeu Fabra, Barcelona, for instructive comments and discussions.

References

Amemiya, Y. and T.W. Anderson (1990), Asymptotic chi-square tests for a large class of factor analysis models, Ann. Statist. 18, 1453-1463.
Anderson, T.W. (1984), Introduction to Multivariate Statistical Analysis (Wiley, New York, 2nd ed.).
Anderson, T.W. (1987), Multivariate linear relations, in: T. Pukkila and S. Puntanen, eds., Proc. 2nd Inter. Tampere Conf. Statist. (University of Tampere, Finland) pp. 9-36.
Anderson, T.W. (1989), Linear latent variable models and covariance structures, J. Econometrics 41, 91-119.
Bentler, P.M. (1983), Some contributions to efficient statistics in structural models: Specification and estimation of moment structures, Psychometrika 48, 493-517.
Bentler, P.M. (1989), EQS Structural Equations Program Manual (BMDP Statistical Software, Los Angeles, CA).
Browne, M.W. (1982), Covariance structures, in: D.M. Hawkins, ed., Topics in Applied Multivariate Analysis (Cambridge Univ. Press, Cambridge) pp. 72-141.
Browne, M.W. (1984), Asymptotically distribution-free methods for the analysis of covariance structures, British J. Math. Statist. Psych. 37, 62-83.
Browne, M. and A. Shapiro (1988), Robustness of normal theory methods in the analysis of linear latent variate models, British J. Math. Statist. Psych. 41, 193-208.
Magnus, J.R. and H. Neudecker (1988), Matrix Differential Calculus with Applications in Statistics and Econometrics (Wiley, New York).
Mardia, K.V. (1970), Measures of multivariate skewness and kurtosis with applications, Biometrika 57, 519-530.
Mardia, K.V. (1974), Applications of some measures of multivariate skewness and kurtosis in testing normality and robustness studies, Sankhyā Ser. B 36, 115-128.
Muirhead, R.J. and C.M. Waternaux (1980), Asymptotic distributions in canonical correlation analysis and other multivariate procedures for nonnormal populations, Biometrika 67, 31-43.
Nagao, H. (1973), On some test criteria for covariance matrix, Ann. Statist. 1, 700-709.
Satorra, A. (1989), Alternative test criteria in covariance structure analysis, Psychometrika 54, 131-151.
Satorra, A. and P.M. Bentler (1986), Some robustness properties of goodness of fit statistics in covariance structure analysis, in: Proc. Bus. Econ. Statist. Sect. (Amer. Statist. Assoc., Providence, RI) pp. 549-554.
Satorra, A. and P.M. Bentler (1988a), Scaling corrections for statistics in covariance structure analysis, UCLA Statistical Series No. 2 (Los Angeles, CA).
Satorra, A. and P.M. Bentler (1988b), Scaling corrections for chi-square statistics in covariance structure analysis, in: Proc. Bus. Econ. Statist. Sect. (Amer. Statist. Assoc., Providence, RI) pp. 308-313.
Satorra, A. and P.M. Bentler (1991), Goodness-of-fit test under IV estimation: Asymptotic robustness of NT test statistic, in: R. Gutierrez and M.J. Valderrama, eds., Applied Stochastic Models and Data Analysis (World Scientific, London) pp. 555-567.
Shapiro, A. and M. Browne (1987), Analysis of covariance structures under elliptical distributions, J. Amer. Statist. Assoc. 82, 1092-1097.
Tyler, D.E. (1983), Robustness and efficiency properties of scatter matrices, Biometrika 70, 411-420.