Journal of Statistical Planning and Inference 8 (1983) 27-41
North-Holland

ESTIMATION OF VARIANCE COMPONENTS IN AN UNBALANCED ONE-WAY CLASSIFICATION

Shoutir Kishore CHATTERJEE and Kalyan DAS

Department of Statistics, Calcutta University, Calcutta, India

Received 12 July 1982; revised manuscript received 30 December 1982
Recommended by P.K. Sen

Abstract: In the unbalanced one-way random effects model the weighted least squares approach with estimated weights is used to develop a relatively simple estimator of variance components. As the number of classes increases, the proposed estimator is seen not only to be best asymptotically normal but also to be asymptotically equivalent to the maximum likelihood estimator.

AMS Subject Classification: 62J10.

Key words and phrases: One-way classification; Random effects; Variance components; Best asymptotically normal estimator.

1. Introduction

Consider an unbalanced one-way classification model. The $j$-th observation in the $i$-th class is

    $y_{ij} = \mu + b_i + e_{ij}$,  $j = 1, \dots, n_i$,  $i = 1, \dots, k$,    (1.1)

where the $b_i$'s and $e_{ij}$'s are independently normally distributed with

    $E(b_i) = 0$,  $\mathrm{Var}(b_i) = \sigma_1$,  $E(e_{ij}) = 0$,  $\mathrm{Var}(e_{ij}) = \sigma_0$.    (1.2)

(To avoid notational complication we use $b_i$ instead of $a_i$.) Writing $n = \sum_{i=1}^k n_i$,

    $y_{n\times 1} = (y_{11}, \dots, y_{1n_1}, \dots, y_{k1}, \dots, y_{kn_k})'$,
    $\varepsilon_{n\times 1} = (1, 1, \dots, 1)'$,
    $\Sigma = \sigma_0 I_n + \sigma_1\,\mathrm{diag}(E_{n_1}, \dots, E_{n_k})$,

where $E_v$ is a $v \times v$ matrix with all elements unity, we then have

    $y \sim N_n(\mu\varepsilon, \Sigma)$.    (1.3)

Our object is to estimate $\sigma = (\sigma_0, \sigma_1)'$.
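As a concrete illustration (ours, not the paper's), the following Python sketch simulates data from model (1.1)-(1.3); the class sizes $n_i$ and the parameter values are arbitrary choices for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical unbalanced design and true parameters (illustrative choices only).
n_i = np.array([2, 5, 3, 8, 4, 6, 2, 7])      # class sizes n_1, ..., n_k
mu, sigma0, sigma1 = 10.0, 2.0, 1.5           # mu, Var(e_ij) = sigma_0, Var(b_i) = sigma_1

k = len(n_i)
b = rng.normal(0.0, np.sqrt(sigma1), size=k)  # random effects b_i
y = [mu + b[i] + rng.normal(0.0, np.sqrt(sigma0), size=n_i[i])  # y_ij = mu + b_i + e_ij
     for i in range(k)]

# The statistics the paper works with throughout:
ybar = np.array([yi.mean() for yi in y])              # class means ybar_i
W = sum(((yi - yi.mean()) ** 2).sum() for yi in y)    # within sum of squares W
```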



For this problem, in the balanced case of equal $n_i$'s, the classical analysis of variance (ANOVA) estimators have well-known optimum properties and are generally accepted as the standard estimators. However, when the classification is unbalanced, i.e. the $n_i$'s are unequal, the ANOVA estimators are no longer optimal either in the small-sample or in the large-sample sense (Seely (1975), Das (1978)). Alternative estimators such as the maximum likelihood estimators (MLE), restricted maximum likelihood estimators (REMLE), minimum norm quadratic unbiased estimators (MINQUE), etc., have been proposed and studied in the unbalanced situation. While these estimators are mostly (MLE and REMLE generally, and MINQUE when correct weights are used) best asymptotically normal (BAN) in the traditional sense (Miller (1977), Das (1979), Brown (1976)), their computation is quite tedious. In an earlier paper, one of the authors (Das (1978)) considered some simple estimators of $\sigma_0$, $\sigma_1$ based on a decomposition of the total sum of squares due to Lamotte (1976) and, assuming that the $n_i$'s have only a finite number of possible values, showed that these are BAN. In the present paper, we develop some alternative estimators which look more natural and are BAN under broader assumptions. As the present estimators are not based on any special decomposition, they seem capable of generalization to more complex situations.

2. MLE's, their asymptotic properties and some auxiliary results

In this section we consider the MLE's of the parameters and state briefly the conditions under which they are BAN in the traditional sense. The results follow as a particular case of a general theorem due to Miller (1977). The conditions appearing in this section will be used afterwards in developing alternative BAN estimators. Let us use the notation $\theta = (\theta_0, \theta_1, \theta_2)' = (\sigma_0, \sigma_1, \mu)' = (\sigma', \mu)'$. The parameter space of $\theta$ is the open set

    $\Theta = \{\theta \mid \sigma_0 > 0,\ \sigma_1 > 0,\ -\infty < \mu < \infty\}$.    (2.1)

From (1.3) the log likelihood function is given by

    $\lambda(\theta) = \log L = \mathrm{const} - \tfrac{1}{2}\log|\Sigma| - \tfrac{1}{2}(y-\mu\varepsilon)'\Sigma^{-1}(y-\mu\varepsilon)$.

Substituting

    $\Sigma^{-1} = \dfrac{1}{\sigma_0}\Big[I_n - \sigma_1\,\mathrm{diag}\Big(\dfrac{E_{n_1}}{\sigma_0+n_1\sigma_1}, \dots, \dfrac{E_{n_k}}{\sigma_0+n_k\sigma_1}\Big)\Big]$,
    $|\Sigma| = \prod_{i=1}^k \sigma_0^{n_i-1}(\sigma_0+n_i\sigma_1)$,

and writing

    $\bar y_i = \dfrac{1}{n_i}\sum_j y_{ij}$, $i = 1, \dots, k$,   $W = \sum_i \sum_j (y_{ij}-\bar y_i)^2$,

we obtain

    $\lambda(\theta) = \mathrm{const} - \tfrac{1}{2}\sum_{i=1}^k (n_i-1)\log\sigma_0 - \tfrac{1}{2}\sum_{i=1}^k \log(\sigma_0+n_i\sigma_1) - \dfrac{W}{2\sigma_0} - \tfrac{1}{2}\sum_{i=1}^k \dfrac{n_i(\bar y_i-\mu)^2}{\sigma_0+n_i\sigma_1}$.    (2.2)

Hence,

    $\dfrac{\partial\lambda(\theta)}{\partial\theta_0} = -\dfrac{n-k}{2\sigma_0} - \dfrac{1}{2}\sum_{i=1}^k \dfrac{1}{\sigma_0+n_i\sigma_1} + \dfrac{W}{2\sigma_0^2} + \dfrac{1}{2}\sum_{i=1}^k \dfrac{n_i(\bar y_i-\mu)^2}{(\sigma_0+n_i\sigma_1)^2}$,

    $\dfrac{\partial\lambda(\theta)}{\partial\theta_1} = -\dfrac{1}{2}\sum_{i=1}^k \dfrac{n_i}{\sigma_0+n_i\sigma_1} + \dfrac{1}{2}\sum_{i=1}^k \dfrac{n_i^2(\bar y_i-\mu)^2}{(\sigma_0+n_i\sigma_1)^2}$,    (2.3)

    $\dfrac{\partial\lambda(\theta)}{\partial\theta_2} = \sum_{i=1}^k \dfrac{n_i(\bar y_i-\mu)}{\sigma_0+n_i\sigma_1}$,

and

    $\dfrac{\partial^2\lambda(\theta)}{\partial\theta_0^2} = \dfrac{n-k}{2\sigma_0^2} + \dfrac{1}{2}\sum_{i=1}^k \dfrac{1}{(\sigma_0+n_i\sigma_1)^2} - \dfrac{W}{\sigma_0^3} - \sum_{i=1}^k \dfrac{n_i(\bar y_i-\mu)^2}{(\sigma_0+n_i\sigma_1)^3}$,

    $\dfrac{\partial^2\lambda(\theta)}{\partial\theta_0\,\partial\theta_1} = \dfrac{1}{2}\sum_{i=1}^k \dfrac{n_i}{(\sigma_0+n_i\sigma_1)^2} - \sum_{i=1}^k \dfrac{n_i^2(\bar y_i-\mu)^2}{(\sigma_0+n_i\sigma_1)^3}$,

    $\dfrac{\partial^2\lambda(\theta)}{\partial\theta_1^2} = \dfrac{1}{2}\sum_{i=1}^k \dfrac{n_i^2}{(\sigma_0+n_i\sigma_1)^2} - \sum_{i=1}^k \dfrac{n_i^3(\bar y_i-\mu)^2}{(\sigma_0+n_i\sigma_1)^3}$,    (2.4)

    $\dfrac{\partial^2\lambda(\theta)}{\partial\theta_0\,\partial\theta_2} = -\sum_{i=1}^k \dfrac{n_i(\bar y_i-\mu)}{(\sigma_0+n_i\sigma_1)^2}$,
    $\dfrac{\partial^2\lambda(\theta)}{\partial\theta_1\,\partial\theta_2} = -\sum_{i=1}^k \dfrac{n_i^2(\bar y_i-\mu)}{(\sigma_0+n_i\sigma_1)^2}$,
    $\dfrac{\partial^2\lambda(\theta)}{\partial\theta_2^2} = -\sum_{i=1}^k \dfrac{n_i}{\sigma_0+n_i\sigma_1}$.
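As a numerical aid (ours, not the paper's), the log likelihood (2.2) and the score (2.3) translate directly into Python; `n_i`, `ybar`, `W` are the hypothetical quantities of the earlier snippet. A general-purpose optimizer applied to `loglik`, or a root-finder applied to `score`, then yields the MLE of (2.5) below.

```python
import numpy as np

def loglik(theta, n_i, ybar, W):
    """Log likelihood lambda(theta) of (2.2), up to the additive constant."""
    s0, s1, mu = theta
    d = s0 + n_i * s1                            # sigma_0 + n_i sigma_1
    return (-0.5 * (n_i - 1).sum() * np.log(s0)
            - 0.5 * np.log(d).sum()
            - W / (2.0 * s0)
            - 0.5 * (n_i * (ybar - mu) ** 2 / d).sum())

def score(theta, n_i, ybar, W):
    """Score vector (2.3): derivatives of lambda w.r.t. sigma_0, sigma_1, mu."""
    s0, s1, mu = theta
    d = s0 + n_i * s1
    r2 = n_i * (ybar - mu) ** 2                  # n_i (ybar_i - mu)^2
    g0 = (-(n_i - 1).sum() / (2.0 * s0) - 0.5 * (1.0 / d).sum()
          + W / (2.0 * s0 ** 2) + 0.5 * (r2 / d ** 2).sum())
    g1 = -0.5 * (n_i / d).sum() + 0.5 * (n_i * r2 / d ** 2).sum()
    g2 = (n_i * (ybar - mu) / d).sum()
    return np.array([g0, g1, g2])
```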

In the following we shall use the same notation $\theta$ for the parameter as well as its true value. The MLE $\hat\theta_k$ of $\theta$ is obtained by solving the likelihood equations

    $\dfrac{\partial\lambda(\theta)}{\partial\theta_i} = 0$,  $i = 0, 1, 2$.    (2.5)

To establish the asymptotic properties of $\hat\theta_k$ we require to impose some conditions on the sequence $n_1, n_2, \dots$, generated with increasing $k$. Specifically, we assume the following.


Assumption 2.1. The sequence $\{n_i\}$ is such that, for $\theta \in \Theta$,

(i) $\lim_{k\to\infty} \dfrac{1}{k}\sum_{i=1}^k n_i = l_0$ exists, $l_0 > 1$;

(ii) $\lim_{k\to\infty} \dfrac{1}{k}\sum_{i=1}^k \dfrac{n_i}{\sigma_0+n_i\sigma_1} = l_1(\sigma) = l_1$ exists;    (2.6)

(iii) $\lim_{k\to\infty} \dfrac{1}{k}\sum_{i=1}^k \dfrac{n_i^s}{(\sigma_0+n_i\sigma_1)^2} = l_{2s}(\sigma) = l_{2s}$ exists for $s = 0, 1, 2$.

Condition (i) implies that it will not do if an overwhelming number of the values of the $n_i$'s is unity. Then from (2.4) and Assumption 2.1 we get

    $\lim_{k\to\infty}\Big(-\dfrac{1}{k}\,E\dfrac{\partial^2\lambda(\theta)}{\partial\theta_i\,\partial\theta_j}\Big) = C = (c_{ij})$,  $i, j = 0, 1, 2$,    (2.7)

where

    $c_{00} = \dfrac{l_0-1}{2\sigma_0^2} + \dfrac{1}{2}l_{20}$,  $c_{01} = \dfrac{1}{2}l_{21}$,  $c_{02} = 0$,
    $c_{11} = \dfrac{1}{2}l_{22}$,  $c_{12} = 0$,  $c_{22} = l_1$.    (2.8)

As $n_i \geq 1$ for every $i$, (2.6) implies

    $l_1 \geq \dfrac{1}{\sigma_0+\sigma_1}$  and  $l_{22} \geq \dfrac{1}{(\sigma_0+\sigma_1)^2} > 0$.

It is now straightforward to check that $C$ in (2.7) is positive definite whatever $\theta \in \Theta$. In the following we shall write

    $C = \begin{pmatrix} C_1 & 0 \\ 0 & c_{22} \end{pmatrix}$,    (2.9)

where $C_1 = (c_{ij})_{i,j=0,1}$ is, of course, positive definite. We now see that, as $k \to \infty$, the requirements of Theorem 3.1 in Miller (1977) are met. Hence, with probability tending to 1, there exists a consistent solution $\hat\theta_k = (\hat\sigma_{0k}, \hat\sigma_{1k}, \hat\mu_k)' = (\hat\sigma_k', \hat\mu_k)'$ of the maximum likelihood equations (2.5) such that

    $\sqrt{k}(\hat\theta_k-\theta) = \sqrt{k}\,((\hat\sigma_{0k}-\sigma_0), (\hat\sigma_{1k}-\sigma_1), (\hat\mu_k-\mu))' \to_L N_3(0, C^{-1})$.    (2.10)

Hence, marginally,

    $\sqrt{k}(\hat\sigma_k-\sigma) = \sqrt{k}\,((\hat\sigma_{0k}-\sigma_0), (\hat\sigma_{1k}-\sigma_1))' \to_L N_2(0, C_1^{-1})$.    (2.11)

The estimator $\hat\sigma_k = (\hat\sigma_{0k}, \hat\sigma_{1k})'$ is BAN in the sense implied by (2.11).
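To see what the limiting covariance in (2.11) looks like for a given design, $C_1$ can be approximated by the finite-$k$ averages behind (2.8). The sketch below is an illustration under the assumptions of the earlier snippets, not part of the paper.

```python
import numpy as np

def approx_asymptotic_cov(n_i, s0, s1):
    """Approximate C_1^{-1}/k of (2.11) via finite-k versions of the limits in (2.8)."""
    k = len(n_i)
    d = s0 + n_i * s1                      # sigma_0 + n_i sigma_1
    l0 = n_i.mean()                        # (1/k) sum n_i           ~ l_0
    l20 = (1.0 / d ** 2).mean()            # (1/k) sum 1/d_i^2       ~ l_20
    l21 = (n_i / d ** 2).mean()            # (1/k) sum n_i/d_i^2     ~ l_21
    l22 = (n_i ** 2 / d ** 2).mean()       # (1/k) sum n_i^2/d_i^2   ~ l_22
    C1 = np.array([[(l0 - 1.0) / (2.0 * s0 ** 2) + 0.5 * l20, 0.5 * l21],
                   [0.5 * l21,                                0.5 * l22]])
    return np.linalg.inv(C1) / k
```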


In the remainder of this section we consider some auxiliary results leading to a representation of $\hat\sigma_k$ which will be useful later. For any two sequences of (real or vector valued) random variables $X_k$ and $Y_k$, we shall use the notation $X_k \simeq Y_k$ to mean $X_k - Y_k \to_P 0$. In the following, $\tilde\sigma_0$, $\tilde\sigma_1$, $\tilde\mu$ stand for any consistent estimators of $\sigma_0$, $\sigma_1$, $\mu$, and $\tilde\theta = (\tilde\sigma_0, \tilde\sigma_1, \tilde\mu)'$ and $\tilde\sigma = (\tilde\sigma_0, \tilde\sigma_1)'$. (We keep the subscript $k$ suppressed for the sake of simplicity.)

Lemma 2.1. Let $p \geq 1$, $0 \leq q \leq p$ and $0 \leq r \leq 2$ be integers. Then, as $k \to \infty$,

    $\dfrac{1}{k}\sum_{i=1}^k \dfrac{n_i^q(\bar y_i-\mu)^r}{(\tilde\sigma_0+n_i\tilde\sigma_1)^p} \simeq \dfrac{1}{k}\sum_{i=1}^k \dfrac{n_i^q(\bar y_i-\mu)^r}{(\sigma_0+n_i\sigma_1)^p}$.

Proof. We treat the case $r = 2$, the cases $r = 0, 1$ being similar. The difference between the two sides can be written as

    $(\sigma_0-\tilde\sigma_0)\,\dfrac{1}{k}\sum_{i=1}^k \dfrac{n_i^q(\bar y_i-\mu)^2}{(\tilde\sigma_0+n_i\tilde\sigma_1)(\sigma_0+n_i\sigma_1)}\Big[\dfrac{1}{(\tilde\sigma_0+n_i\tilde\sigma_1)^{p-1}} + \dfrac{1}{(\tilde\sigma_0+n_i\tilde\sigma_1)^{p-2}(\sigma_0+n_i\sigma_1)} + \cdots + \dfrac{1}{(\sigma_0+n_i\sigma_1)^{p-1}}\Big]$
    $+\ (\sigma_1-\tilde\sigma_1)\,\dfrac{1}{k}\sum_{i=1}^k \dfrac{n_i^{q+1}(\bar y_i-\mu)^2}{(\tilde\sigma_0+n_i\tilde\sigma_1)(\sigma_0+n_i\sigma_1)}\Big[\dfrac{1}{(\tilde\sigma_0+n_i\tilde\sigma_1)^{p-1}} + \cdots + \dfrac{1}{(\sigma_0+n_i\sigma_1)^{p-1}}\Big]$.

As $\tilde\sigma_i - \sigma_i \to_P 0$, $i = 0, 1$, it is enough to prove that

    $\dfrac{1}{k}\sum_{i=1}^k \dfrac{n_i^q(\bar y_i-\mu)^2}{(\tilde\sigma_0+n_i\tilde\sigma_1)(\sigma_0+n_i\sigma_1)}\Big[\dfrac{1}{(\tilde\sigma_0+n_i\tilde\sigma_1)^{p-1}} + \cdots + \dfrac{1}{(\sigma_0+n_i\sigma_1)^{p-1}}\Big]$    (2.12)

and

    $\dfrac{1}{k}\sum_{i=1}^k \dfrac{n_i^{q+1}(\bar y_i-\mu)^2}{(\tilde\sigma_0+n_i\tilde\sigma_1)(\sigma_0+n_i\sigma_1)}\Big[\dfrac{1}{(\tilde\sigma_0+n_i\tilde\sigma_1)^{p-1}} + \cdots + \dfrac{1}{(\sigma_0+n_i\sigma_1)^{p-1}}\Big]$    (2.13)

are $O_p(1)$ (stochastically bounded).


Let us consider

    $\dfrac{1}{k}\sum_{i=1}^k \dfrac{n_i^d(\bar y_i-\mu)^2}{\sigma_0+n_i\sigma_1}\,t_i$,    (2.14)

where $d \leq 1$ and

    $t_i = \dfrac{n_i^p}{\tilde\sigma_0+n_i\tilde\sigma_1}\Big[\dfrac{1}{(\tilde\sigma_0+n_i\tilde\sigma_1)^{p-1}} + \dfrac{1}{(\tilde\sigma_0+n_i\tilde\sigma_1)^{p-2}(\sigma_0+n_i\sigma_1)} + \cdots + \dfrac{1}{(\sigma_0+n_i\sigma_1)^{p-1}}\Big]$,  $i = 1, \dots, k$.    (2.15)

Writing

    $P = \dfrac{1}{\tilde\sigma_1}\Big[\dfrac{1}{\tilde\sigma_1^{p-1}} + \dfrac{1}{\tilde\sigma_1^{p-2}\sigma_1} + \cdots + \dfrac{1}{\sigma_1^{p-1}}\Big]$,

and noting that $\tilde\sigma_0+n_i\tilde\sigma_1 \geq n_i\tilde\sigma_1$ and $\sigma_0+n_i\sigma_1 \geq n_i\sigma_1$, clearly

    $t_i \leq P$,  $i = 1, \dots, k$.    (2.16)

But, as $\tilde\sigma_0$, $\tilde\sigma_1$ are consistent, $P$ is $O_p(1)$. Also, since $n_i(\bar y_i-\mu)^2/(\sigma_0+n_i\sigma_1) \sim \chi^2_1$,

    $E\Big[\dfrac{1}{k}\sum_{i=1}^k \dfrac{n_i^d(\bar y_i-\mu)^2}{\sigma_0+n_i\sigma_1}\Big] = \dfrac{1}{k}\sum_{i=1}^k n_i^{d-1}$.    (2.17)

Since $d \leq 1$, by Assumption 2.1(i) the right-hand member of (2.17) is uniformly bounded. Hence, by Tchebycheff's Lemma, $(1/k)\sum_i n_i^d(\bar y_i-\mu)^2/(\sigma_0+n_i\sigma_1)$ is also $O_p(1)$. Therefore, by (2.16), we get that (2.14) is $O_p(1)$. Now taking $d = q-p$ and $d = q-p+1$ respectively, we get that (2.12) and (2.13) are both $O_p(1)$. $\square$

Lemma 2.2. If, in addition, $\tilde\mu$ is consistent for $\mu$, then, for $p$, $q$, $r$ as in Lemma 2.1,

    $\dfrac{1}{k}\sum_{i=1}^k \dfrac{n_i^q(\bar y_i-\tilde\mu)^r}{(\tilde\sigma_0+n_i\tilde\sigma_1)^p} \simeq \dfrac{1}{k}\sum_{i=1}^k \dfrac{n_i^q(\bar y_i-\mu)^r}{(\sigma_0+n_i\sigma_1)^p}$.

Proof. The case $r = 0$ has been tackled in Lemma 2.1. For $r \geq 1$ consider

    $\dfrac{1}{k}\sum_{i=1}^k \dfrac{n_i^q(\bar y_i-\tilde\mu)^r}{(\tilde\sigma_0+n_i\tilde\sigma_1)^p} - \dfrac{1}{k}\sum_{i=1}^k \dfrac{n_i^q(\bar y_i-\mu)^r}{(\sigma_0+n_i\sigma_1)^p}
    = \Big[\dfrac{1}{k}\sum_{i=1}^k \dfrac{n_i^q(\bar y_i-\mu)^r}{(\tilde\sigma_0+n_i\tilde\sigma_1)^p} - \dfrac{1}{k}\sum_{i=1}^k \dfrac{n_i^q(\bar y_i-\mu)^r}{(\sigma_0+n_i\sigma_1)^p}\Big] + \sum_{s=0}^{r-1}\binom{r}{s}(\mu-\tilde\mu)^{r-s}\,\dfrac{1}{k}\sum_{i=1}^k \dfrac{n_i^q(\bar y_i-\mu)^s}{(\tilde\sigma_0+n_i\tilde\sigma_1)^p}$.    (2.18)

Here the first term on the right converges in probability to zero by Lemma 2.1. Also, proceeding as in the latter part of the proof of Lemma 2.1, it can be shown that each of the terms

    $\dfrac{1}{k}\sum_{i=1}^k \dfrac{n_i^q(\bar y_i-\mu)^s}{(\tilde\sigma_0+n_i\tilde\sigma_1)^p}$,  $s = 0, 1, \dots, r-1$,

is $O_p(1)$. Hence, as $\tilde\mu$ is consistent for $\mu$, all the terms on the right of (2.18) converge in probability to zero. $\square$

Lemma 2.3. As $k \to \infty$,

    $-\dfrac{1}{k}\dfrac{\partial^2\lambda(\tilde\theta)}{\partial\theta_i\,\partial\theta_j} \to_P c_{ij}$,  $i, j = 0, 1, 2$,    (2.19)

where $C = (c_{ij})$ is defined by (2.7) and (2.8).

Proof. Write

    $-\dfrac{1}{k}\dfrac{\partial^2\lambda(\tilde\theta)}{\partial\theta_i\,\partial\theta_j} - c_{ij} = \Big[\dfrac{1}{k}\dfrac{\partial^2\lambda(\theta)}{\partial\theta_i\,\partial\theta_j} - \dfrac{1}{k}\dfrac{\partial^2\lambda(\tilde\theta)}{\partial\theta_i\,\partial\theta_j}\Big] + \Big[\dfrac{1}{k}E\dfrac{\partial^2\lambda(\theta)}{\partial\theta_i\,\partial\theta_j} - \dfrac{1}{k}\dfrac{\partial^2\lambda(\theta)}{\partial\theta_i\,\partial\theta_j}\Big] + \Big[-\dfrac{1}{k}E\dfrac{\partial^2\lambda(\theta)}{\partial\theta_i\,\partial\theta_j} - c_{ij}\Big]$.    (2.20)

Denote the three terms on the right of (2.20) by $\varphi_1$, $\varphi_2$, $\varphi_3$, respectively. From (2.7), $\varphi_3 \to 0$ as $k \to \infty$. To show that $\varphi_2 \to_P 0$, by Tchebycheff's inequality, it would be enough to show that

    $\mathrm{Var}_\theta\Big(\dfrac{1}{k}\dfrac{\partial^2\lambda(\theta)}{\partial\theta_i\,\partial\theta_j}\Big) \to 0$.

It is straightforward to deduce this from the expression (2.4). As an example, taking $i = 0$, $j = 1$: since $n_i(\bar y_i-\mu)^2$ is distributed as $(\sigma_0+n_i\sigma_1)\chi^2_1$,

    $\mathrm{Var}\Big(\dfrac{1}{k}\sum_{i=1}^k \dfrac{n_i^2(\bar y_i-\mu)^2}{(\sigma_0+n_i\sigma_1)^3}\Big) = \dfrac{2}{k^2}\sum_{i=1}^k \dfrac{n_i^2}{(\sigma_0+n_i\sigma_1)^4} \leq \dfrac{2}{k(\sigma_0+\sigma_1)^2\sigma_1^2} \to 0$  as $k \to \infty$.

That $\varphi_1 \to_P 0$ follows from (2.4) and Lemma 2.2. Again for $i = 0$, $j = 1$,

    $\dfrac{1}{k}\dfrac{\partial^2\lambda(\tilde\theta)}{\partial\theta_0\,\partial\theta_1} - \dfrac{1}{k}\dfrac{\partial^2\lambda(\theta)}{\partial\theta_0\,\partial\theta_1} = \Big[\dfrac{1}{2k}\sum_{i=1}^k \dfrac{n_i}{(\tilde\sigma_0+n_i\tilde\sigma_1)^2} - \dfrac{1}{2k}\sum_{i=1}^k \dfrac{n_i}{(\sigma_0+n_i\sigma_1)^2}\Big] - \Big[\dfrac{1}{k}\sum_{i=1}^k \dfrac{n_i^2(\bar y_i-\tilde\mu)^2}{(\tilde\sigma_0+n_i\tilde\sigma_1)^3} - \dfrac{1}{k}\sum_{i=1}^k \dfrac{n_i^2(\bar y_i-\mu)^2}{(\sigma_0+n_i\sigma_1)^3}\Big] \to_P 0$

by Lemma 2.2. $\square$

Theorem 2.1. If $\hat\theta_k$ is a consistent solution of the maximum likelihood equations (2.5), then, as $k \to \infty$,

    $\sqrt{k}(\hat\theta_k-\theta) \simeq \dfrac{1}{\sqrt{k}}\,C^{-1}\dfrac{\partial\lambda(\theta)}{\partial\theta}$,

where $\partial\lambda(\theta)/\partial\theta = (\partial\lambda(\theta)/\partial\theta_0, \partial\lambda(\theta)/\partial\theta_1, \partial\lambda(\theta)/\partial\theta_2)'$ and $C$ is defined by (2.7).

Proof. It is enough to show that

    $\dfrac{1}{\sqrt{k}}\dfrac{\partial\lambda(\theta)}{\partial\theta} \simeq C\,\sqrt{k}(\hat\theta_k-\theta)$.    (2.21)

Expanding the left-hand members of the likelihood equations (2.5) around the true value $\theta$,

    $0 = \dfrac{\partial\lambda(\hat\theta_k)}{\partial\theta} = \dfrac{\partial\lambda(\theta)}{\partial\theta} + \dfrac{\partial^2\lambda(\theta^*)}{\partial\theta\,\partial\theta'}(\hat\theta_k-\theta)$,    (2.22)

where $\theta^*$ is a convex linear combination of $\hat\theta_k$ and $\theta$. From (2.22),

    $\dfrac{1}{\sqrt{k}}\dfrac{\partial\lambda(\theta)}{\partial\theta} = \Big(-\dfrac{1}{k}\dfrac{\partial^2\lambda(\theta^*)}{\partial\theta\,\partial\theta'}\Big)\sqrt{k}(\hat\theta_k-\theta)$.    (2.23)

As $\theta^*$ is consistent, by Lemma 2.3,

    $-\dfrac{1}{k}\dfrac{\partial^2\lambda(\theta^*)}{\partial\theta\,\partial\theta'} \to_P C$.

Since $\sqrt{k}(\hat\theta_k-\theta)$ is $O_p(1)$, this implies

    $\Big(-\dfrac{1}{k}\dfrac{\partial^2\lambda(\theta^*)}{\partial\theta\,\partial\theta'}\Big)\sqrt{k}(\hat\theta_k-\theta) \simeq C\,\sqrt{k}(\hat\theta_k-\theta)$.    (2.24)

From (2.23) and (2.24), (2.21) is immediate. $\square$

Corollary 2.1. As $k \to \infty$,

    $\sqrt{k}(\hat\sigma_k-\sigma) \simeq \dfrac{1}{\sqrt{k}}\,C_1^{-1}\dfrac{\partial\lambda(\theta)}{\partial\sigma}$,  where  $\dfrac{\partial\lambda(\theta)}{\partial\sigma} = \Big(\dfrac{\partial\lambda(\theta)}{\partial\theta_0}, \dfrac{\partial\lambda(\theta)}{\partial\theta_1}\Big)'$.

This follows from Theorem 2.1 by using (2.9).

Before concluding this section we prove a result (Lemma 2.5) which will be useful in the next section. But for that we first need an auxiliary result.

Lemma 2.4. As $k \to \infty$, for $0 \leq r \leq 2$,

    $\dfrac{1}{\sqrt{k}}\sum_{i=1}^k \dfrac{n_i^r\{n_i(\bar y_i-\mu)^2-(\sigma_0+n_i\sigma_1)\}}{(\sigma_0+n_i\sigma_1)(\tilde\sigma_0+n_i\tilde\sigma_1)}\Big[\dfrac{1}{\tilde\sigma_0+n_i\tilde\sigma_1} + \dfrac{1}{\sigma_0+n_i\sigma_1}\Big]$  is $O_p(1)$.    (2.25)

Proof. Define

    $t_i = n_i^{r-2}\,\dfrac{n_i(\bar y_i-\mu)^2-(\sigma_0+n_i\sigma_1)}{\sigma_0+n_i\sigma_1}$,
    $w_i = \dfrac{n_i^2}{\tilde\sigma_0+n_i\tilde\sigma_1}\Big[\dfrac{1}{\tilde\sigma_0+n_i\tilde\sigma_1} + \dfrac{1}{\sigma_0+n_i\sigma_1}\Big]$,    (2.26)

so that the left-hand member of (2.25) becomes $\sum_{i=1}^k t_i w_i/\sqrt{k}$. Let $(\alpha_1, \alpha_2, \dots, \alpha_k)$ be the permutation of $(1, 2, \dots, k)$ such that $n_{\alpha_1} \geq n_{\alpha_2} \geq \cdots \geq n_{\alpha_k}$. Then $w_{\alpha_1} \geq w_{\alpha_2} \geq \cdots \geq w_{\alpha_k}$. Hence by Abel's well-known Lemma, writing

    $h = \min\{t_{\alpha_1},\ t_{\alpha_1}+t_{\alpha_2},\ \dots,\ t_{\alpha_1}+t_{\alpha_2}+\cdots+t_{\alpha_k}\}$,
    $H = \max\{t_{\alpha_1},\ t_{\alpha_1}+t_{\alpha_2},\ \dots,\ t_{\alpha_1}+t_{\alpha_2}+\cdots+t_{\alpha_k}\}$,

we have

    $h\,w_{\alpha_1} \leq \sum_{i=1}^k t_{\alpha_i}w_{\alpha_i} \leq H\,w_{\alpha_1}$.    (2.27)

By (2.27), for any number $A > 0$,

    $\Big\{\dfrac{1}{\sqrt{k}}\Big|\sum_{i=1}^k t_i w_i\Big| \geq A\Big\} \subset \Big\{\max(|t_{\alpha_1}|, |t_{\alpha_1}+t_{\alpha_2}|, \dots, |t_{\alpha_1}+\cdots+t_{\alpha_k}|) \geq \dfrac{A\sqrt{k}}{w_{\alpha_1}}\Big\}$.    (2.28)

In (2.28),

    $w_{\alpha_1} = \dfrac{n_{\alpha_1}^2}{\tilde\sigma_0+n_{\alpha_1}\tilde\sigma_1}\Big[\dfrac{1}{\tilde\sigma_0+n_{\alpha_1}\tilde\sigma_1} + \dfrac{1}{\sigma_0+n_{\alpha_1}\sigma_1}\Big] \leq \dfrac{1}{\tilde\sigma_1}\Big(\dfrac{1}{\tilde\sigma_1}+\dfrac{1}{\sigma_1}\Big) = w^{(k)}$ (say).    (2.29)

Now $w^{(k)} \to_P 2/\sigma_1^2$. Then for any $\varepsilon, \delta > 0$, for sufficiently large $k$,

    $P\{w_{\alpha_1} \leq 2/\sigma_1^2+\varepsilon\} \geq 1-\delta$.    (2.30)

Using (2.28) and (2.29) in (2.27), by a standard technique, for large $k$,

    $P\Big\{\dfrac{1}{\sqrt{k}}\Big|\sum_i t_i w_i\Big| < A\Big\} \geq P\Big\{\max(|t_{\alpha_1}|, |t_{\alpha_1}+t_{\alpha_2}|, \dots, |t_{\alpha_1}+\cdots+t_{\alpha_k}|) < \dfrac{A\sqrt{k}}{2/\sigma_1^2+\varepsilon}\Big\} - \delta$.    (2.31)

Now $t_{\alpha_1}, t_{\alpha_2}, \dots, t_{\alpha_k}$ are independent random variables with mean zero and

    $\mathrm{Var}\big(\sum_i t_{\alpha_i}\big) = \mathrm{Var}\big(\sum_i t_i\big) = 2\sum_{i=1}^k n_i^{2(r-2)} \leq 2k$

(since $r \leq 2$). Hence, applying Kolmogorov's inequality to the first term on the right of (2.31), we get that for large $k$

    $P\Big\{\dfrac{1}{\sqrt{k}}\Big|\sum_i t_i w_i\Big| < A\Big\} \geq 1 - \dfrac{2(2/\sigma_1^2+\varepsilon)^2}{A^2} - \delta$.    (2.32)

The bound in (2.32) can be taken arbitrarily close to 1 by choosing $\varepsilon$, $\delta$, $A$ appropriately. This proves the lemma. $\square$

Lemma 2.5. If $\tilde\mu$ is such that $\sqrt{k}(\tilde\mu-\mu)$ is $O_p(1)$, then

(i) $\dfrac{1}{\sqrt{k}}\sum_{i=1}^k \dfrac{n_i\{n_i(\bar y_i-\tilde\mu)^2-(\sigma_0+n_i\sigma_1)\}}{(\tilde\sigma_0+n_i\tilde\sigma_1)^2} \simeq \dfrac{1}{\sqrt{k}}\sum_{i=1}^k \dfrac{n_i\{n_i(\bar y_i-\mu)^2-(\sigma_0+n_i\sigma_1)\}}{(\sigma_0+n_i\sigma_1)^2}$;

(ii) $\dfrac{1}{\sqrt{k}}\sum_{i=1}^k \dfrac{n_i(\bar y_i-\tilde\mu)^2-(\sigma_0+n_i\sigma_1)}{(\tilde\sigma_0+n_i\tilde\sigma_1)^2} \simeq \dfrac{1}{\sqrt{k}}\sum_{i=1}^k \dfrac{n_i(\bar y_i-\mu)^2-(\sigma_0+n_i\sigma_1)}{(\sigma_0+n_i\sigma_1)^2}$.

Proof. We give the proof of (i), the proof of (ii) being quite similar. The difference between the two sides of (i) can be written as

    $\dfrac{1}{\sqrt{k}}\sum_{i=1}^k n_i\{n_i(\bar y_i-\mu)^2-(\sigma_0+n_i\sigma_1)\}\Big[\dfrac{1}{(\tilde\sigma_0+n_i\tilde\sigma_1)^2} - \dfrac{1}{(\sigma_0+n_i\sigma_1)^2}\Big] + \dfrac{2(\mu-\tilde\mu)}{\sqrt{k}}\sum_{i=1}^k \dfrac{n_i^2(\bar y_i-\mu)}{(\tilde\sigma_0+n_i\tilde\sigma_1)^2} + \dfrac{(\mu-\tilde\mu)^2}{\sqrt{k}}\sum_{i=1}^k \dfrac{n_i^2}{(\tilde\sigma_0+n_i\tilde\sigma_1)^2}$.    (2.33)

Now, the first term in (2.33) can be simplified to

    $(\sigma_0-\tilde\sigma_0)\,\dfrac{1}{\sqrt{k}}\sum_{i=1}^k \dfrac{n_i^2(\bar y_i-\mu)^2-n_i(\sigma_0+n_i\sigma_1)}{(\tilde\sigma_0+n_i\tilde\sigma_1)(\sigma_0+n_i\sigma_1)}\Big[\dfrac{1}{\tilde\sigma_0+n_i\tilde\sigma_1} + \dfrac{1}{\sigma_0+n_i\sigma_1}\Big]$
    $+\ (\sigma_1-\tilde\sigma_1)\,\dfrac{1}{\sqrt{k}}\sum_{i=1}^k \dfrac{n_i^3(\bar y_i-\mu)^2-n_i^2(\sigma_0+n_i\sigma_1)}{(\tilde\sigma_0+n_i\tilde\sigma_1)(\sigma_0+n_i\sigma_1)}\Big[\dfrac{1}{\tilde\sigma_0+n_i\tilde\sigma_1} + \dfrac{1}{\sigma_0+n_i\sigma_1}\Big]$.    (2.34)

Remembering that $\tilde\sigma_0$, $\tilde\sigma_1$ are consistent, and applying Lemma 2.4 with $r = 1, 2$, we deduce that both the terms in (2.34) converge in probability to zero. The second term in (2.33) can be rewritten as

    $2\sqrt{k}(\mu-\tilde\mu)\Big[\dfrac{1}{k}\sum_{i=1}^k \dfrac{n_i^2(\bar y_i-\mu)}{(\sigma_0+n_i\sigma_1)^2} + \dfrac{1}{k}\sum_{i=1}^k n_i^2(\bar y_i-\mu)\Big\{\dfrac{1}{(\tilde\sigma_0+n_i\tilde\sigma_1)^2} - \dfrac{1}{(\sigma_0+n_i\sigma_1)^2}\Big\}\Big]$.    (2.35)

In (2.35), within square brackets, both terms converge in probability to zero (the first by Tchebycheff's Lemma and the second by Lemma 2.1). As $\sqrt{k}(\tilde\mu-\mu)$ is $O_p(1)$, this means that (2.35) itself converges in probability to zero. By Lemma 2.1 the third term in (2.33) is equivalent in probability to

    $\sqrt{k}(\mu-\tilde\mu)^2\,\dfrac{1}{k}\sum_{i=1}^k \dfrac{n_i^2}{(\sigma_0+n_i\sigma_1)^2}$.

That this converges to zero follows from Assumption 2.1(iii) and the properties of $\tilde\mu$. Thus all the terms in (2.33) converge in probability to zero, so that (i) is true. $\square$

Note. Although Lemma 2.5 is superficially similar to Lemma 2.2, the proof is considerably more involved. The reason is that, whereas the factor $1/k$ appears in Lemma 2.2, we have $1/\sqrt{k}$ in Lemma 2.5.

3. An alternative estimator

In this section we propose an alternative BAN estimator which is easier to compute than the MLE. Furthermore, we show that the proposed estimator is asymptotically equivalent to the MLE in the sense that the difference of the two estimates of each of $\sigma_0$ and $\sigma_1$, when appropriately normalized, converges in probability to zero. Defining $\bar y_i$ and $W$ as in Section 2, we have

    $E\{n_i(\bar y_i-\mu)^2\} = \sigma_0+n_i\sigma_1$,  $\mathrm{Var}\{n_i(\bar y_i-\mu)^2\} = 2(\sigma_0+n_i\sigma_1)^2$,  $i = 1, 2, \dots, k$,
    $E(W) = (n-k)\sigma_0$,  $\mathrm{Var}(W) = 2(n-k)\sigma_0^2$.    (3.1)

Let $\tilde\theta = (\tilde\sigma_0, \tilde\sigma_1, \tilde\mu)'$ be any consistent estimator of $\theta$, $\tilde\mu$ being such that $\sqrt{k}(\tilde\mu-\mu) = O_p(1)$. In particular $\tilde\sigma_0$, $\tilde\sigma_1$ may be the ANOVA estimators of $\sigma_0$, $\sigma_1$ and $\tilde\mu$ may be the grand mean. We now follow the weighted least squares approach based on (3.1), with $\mu$ replaced by $\tilde\mu$ and $\sigma_0$, $\sigma_1$ in the weights replaced by $\tilde\sigma_0$, $\tilde\sigma_1$. Set up

    $S = \sum_{i=1}^k \dfrac{\{n_i(\bar y_i-\tilde\mu)^2-(\sigma_0+n_i\sigma_1)\}^2}{2(\tilde\sigma_0+n_i\tilde\sigma_1)^2} + \dfrac{\{W-(n-k)\sigma_0\}^2}{2(n-k)\tilde\sigma_0^2}$.    (3.2)

Suppose, minimizing $S$ with respect to $\sigma_0$ and $\sigma_1$, we get a solution $\bar\sigma_k = (\bar\sigma_{0k}, \bar\sigma_{1k})'$. This satisfies the normal equation

    $B(\tilde\theta)\,\bar\sigma_k = g(\tilde\theta)$,    (3.3)

where

    $B(\tilde\theta) = \dfrac{1}{2k}\begin{pmatrix} \sum_{i=1}^k \dfrac{1}{(\tilde\sigma_0+n_i\tilde\sigma_1)^2} + \dfrac{n-k}{\tilde\sigma_0^2} & \sum_{i=1}^k \dfrac{n_i}{(\tilde\sigma_0+n_i\tilde\sigma_1)^2} \\[2mm] \sum_{i=1}^k \dfrac{n_i}{(\tilde\sigma_0+n_i\tilde\sigma_1)^2} & \sum_{i=1}^k \dfrac{n_i^2}{(\tilde\sigma_0+n_i\tilde\sigma_1)^2} \end{pmatrix}$,    (3.4)

    $g(\tilde\theta) = \dfrac{1}{2k}\begin{pmatrix} \sum_{i=1}^k \dfrac{n_i(\bar y_i-\tilde\mu)^2}{(\tilde\sigma_0+n_i\tilde\sigma_1)^2} + \dfrac{W}{\tilde\sigma_0^2} \\[2mm] \sum_{i=1}^k \dfrac{n_i^2(\bar y_i-\tilde\mu)^2}{(\tilde\sigma_0+n_i\tilde\sigma_1)^2} \end{pmatrix}$.    (3.5)
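Computationally, (3.2)-(3.5) reduce to a single 2 x 2 linear solve, which is what makes the proposal cheaper than iterative maximization of (2.2). The following Python sketch (ours; the moment-based starting values for $\tilde\sigma_0$, $\tilde\sigma_1$ are one assumed choice of consistent initial estimators, not the paper's prescription) implements it using the `n_i`, `ybar`, `W` of the earlier snippets.

```python
import numpy as np

def proposed_estimator(n_i, ybar, W):
    """Weighted least squares estimator of (sigma_0, sigma_1): solve (3.3)."""
    n, k = n_i.sum(), len(n_i)
    mu = (n_i * ybar).sum() / n                 # grand mean as mu~

    # One assumed choice of consistent starting values sigma~_0, sigma~_1:
    s0 = W / (n - k)                            # ANOVA estimator of sigma_0
    s1 = max(((ybar - mu) ** 2).mean() - s0 * (1.0 / n_i).mean(), 1e-8)

    d2 = (s0 + n_i * s1) ** 2                   # estimated weights (sigma~_0 + n_i sigma~_1)^2
    r = n_i * (ybar - mu) ** 2                  # n_i (ybar_i - mu~)^2

    # Normal equations (3.3)-(3.5); the common factor 1/(2k) cancels.
    B = np.array([[(1.0 / d2).sum() + (n - k) / s0 ** 2, (n_i / d2).sum()],
                  [(n_i / d2).sum(), (n_i ** 2 / d2).sum()]])
    g = np.array([(r / d2).sum() + W / s0 ** 2,
                  (n_i * r / d2).sum()])
    return np.linalg.solve(B, g)                # (sigma_0k, sigma_1k)
```

Because $S$ is quadratic in $(\sigma_0, \sigma_1)$ once the weights are frozen at the initial estimates, a single solve suffices; no iteration is involved.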

We shall study the asymptotic behaviour of $\bar\sigma_{0k}$ and $\bar\sigma_{1k}$. Denoting the true value by $\sigma$ and defining

    $h(\tilde\theta) = \sqrt{k}\,[g(\tilde\theta) - B(\tilde\theta)\sigma] = \dfrac{1}{2\sqrt{k}}\begin{pmatrix} \sum_{i=1}^k \dfrac{n_i(\bar y_i-\tilde\mu)^2-(\sigma_0+n_i\sigma_1)}{(\tilde\sigma_0+n_i\tilde\sigma_1)^2} + \dfrac{W-(n-k)\sigma_0}{\tilde\sigma_0^2} \\[2mm] \sum_{i=1}^k \dfrac{n_i\{n_i(\bar y_i-\tilde\mu)^2-(\sigma_0+n_i\sigma_1)\}}{(\tilde\sigma_0+n_i\tilde\sigma_1)^2} \end{pmatrix}$,    (3.6)

we can rewrite (3.3) as

    $B(\tilde\theta)\,\sqrt{k}(\bar\sigma_k-\sigma) = h(\tilde\theta)$.    (3.7)

We next note that, as $k \to \infty$,

    $B(\tilde\theta) \to_P C_1$,    (3.8)


and

    $h(\tilde\theta) \simeq h(\theta)$,    (3.9)

where $h(\theta)$ is obtained by replacing $\tilde\theta$ in (3.6) by the true value $\theta$. (3.8) follows from (2.7) by an application of Lemma 2.1 with $r = 0$; (3.9) is immediate from Lemma 2.5. Since $C_1$ is positive definite, (3.8) implies, as $k \to \infty$, that $B(\tilde\theta)$ is positive definite in probability. Therefore, with probability tending to 1 as $k \to \infty$, from (3.7) we get

    $\sqrt{k}(\bar\sigma_k-\sigma) = B^{-1}(\tilde\theta)\,h(\tilde\theta)$.    (3.10)

Also, since $B^{-1}(\tilde\theta)$ and $h(\theta)$ are bounded in probability, (3.8)-(3.10) give

    $\sqrt{k}(\bar\sigma_k-\sigma) \simeq B^{-1}(\tilde\theta)\,h(\theta) \simeq C_1^{-1}h(\theta)$.    (3.11)

Now from (2.3) and (3.6) it is easy to check that

    $h(\theta) = \dfrac{1}{\sqrt{k}}\,\dfrac{\partial\lambda(\theta)}{\partial\sigma}$.

Hence we can rewrite (3.11) as

    $\sqrt{k}(\bar\sigma_k-\sigma) \simeq \dfrac{1}{\sqrt{k}}\,C_1^{-1}\,\dfrac{\partial\lambda(\theta)}{\partial\sigma}$.    (3.12)

From Corollary 2.1 and (3.12) we arrive at the following conclusion immediately.

Theorem 3.1. As $k \to \infty$,

    $\sqrt{k}(\bar\sigma_k-\sigma) \simeq \sqrt{k}(\hat\sigma_k-\sigma)$,

where $\bar\sigma_k$, $\hat\sigma_k$ are respectively the proposed estimator and the maximum likelihood estimator.

From (2.11) and Theorem 3.1 we deduce:

Theorem 3.2. As $k \to \infty$,

    $\sqrt{k}(\bar\sigma_k-\sigma) \to_L N_2(0, C_1^{-1})$.

Thus $\bar\sigma_k$ is also a BAN estimator.

Note. Instead of proving Theorem 3.2 via Theorem 3.1, we could deduce it directly from the representation (3.12) and the fact that $(1/\sqrt{k})(\partial\lambda(\theta)/\partial\sigma)$ is asymptotically distributed as $N_2(0, C_1)$.

4. Concluding remarks

It has been generally known that the use of weighted least squares with estimated weights leads to BAN estimators of the parameters of linear models (see e.g. Malinvaud (1973), Chapter 9). We have, however, used the approach for the estimation of variance components. Unlike Das (1978), here no attempt has been made to apply the technique on any set of minimal sufficient statistics. Instead, a simple set of sufficient statistics has been used for the purpose. This makes the technique capable of extension to more complex models. Some of this is intended to be presented in a subsequent communication. Further insight into the technique is obtained by rewriting (3.3) as

    $\bar\sigma_k = \tilde\sigma + B^{-1}(\tilde\theta)\,[g(\tilde\theta) - B(\tilde\theta)\tilde\sigma]$.    (4.1)

This shows that $\bar\sigma_k$ is obtained by applying a correction to the initial estimate $\tilde\sigma$. In view of (2.7) and (3.8), $B(\tilde\theta)$ can be regarded as an estimate of the (marginal) information matrix of $\sigma_0$, $\sigma_1$. Thus (4.1) resembles a well-known result for the i.i.d. case (see e.g. Zacks (1971)). The proof for the i.i.d. case, however, is not valid under the present model.

Before concluding this section we make some remarks about our choice of normalizing factors. Throughout our discussion, for the sake of simplicity, we have used for both $\sigma_0$ and $\sigma_1$ a common normalizing factor $\sqrt{k}$ to derive the asymptotic distribution of the proposed estimator. As noted by Weiss (1971, 1973), when the observations are not i.i.d., it is not automatic that a common normalizing factor would serve for the estimates of all the parameters. In the present context one might think that it would be more natural to consider the asymptotic distribution of

    $(\sqrt{n-k}\,(\bar\sigma_{0k}-\sigma_0),\ \sqrt{k}\,(\bar\sigma_{1k}-\sigma_1))$  or  $(\sqrt{n-k}\,(\hat\sigma_{0k}-\sigma_0),\ \sqrt{k}\,(\hat\sigma_{1k}-\sigma_1))$.

(In fact, if one applies Miller's (1977) result in the original form, one gets the asymptotic distribution of the MLE in this form.) However, this poses no new problem. By Assumption 2.1, $(n-k)/k \to (l_0-1)$, so that (2.11) and the conclusions of Theorems 3.1 and 3.2 remain valid if we replace $\sqrt{k}(\bar\sigma_{0k}-\sigma_0)$, $\sqrt{k}(\hat\sigma_{0k}-\sigma_0)$ by $\sqrt{n-k}\,(\bar\sigma_{0k}-\sigma_0)$, $\sqrt{n-k}\,(\hat\sigma_{0k}-\sigma_0)$. Then the only modification is that $C_1^{-1}$ is replaced by $\mathrm{diag}((l_0-1)^{1/2}, 1)\,C_1^{-1}\,\mathrm{diag}((l_0-1)^{1/2}, 1)$.

References

Brown, K.G. (1976). Asymptotic behaviour of MINQUE-type estimators of variance components. Ann. Statist. 4, 746-754.

Das, K. (1978). BAN estimators of variance components for the unbalanced one-way classification. Cal. Stat. Assoc. Bull. 27, 97-118.

Das, K. (1979). Asymptotic optimality of restricted maximum likelihood estimates for the mixed model. Cal. Stat. Assoc. Bull. 28, 125-142.

Lamotte, L.R. (1976). Invariant quadratic estimators in the random one-way ANOVA model. Biometrics 32, 793-804.

Malinvaud, E. (1973). Statistical Methods of Econometrics. North-Holland, Amsterdam.

Miller, J.J. (1977). Asymptotic properties of maximum likelihood estimates in the mixed model of the analysis of variance. Ann. Statist. 5, 746-762.

Seely, J. (1975). An example of an inadmissible analysis of variance estimator for a variance component. Biometrika 62, 689-693.

Weiss, L. (1971). Asymptotic properties of maximum likelihood estimators in some non-standard cases. J. Amer. Statist. Assoc. 66, 345-350.

Weiss, L. (1973). Asymptotic properties of maximum likelihood estimators in some non-standard cases, II. J. Amer. Statist. Assoc. 68, 428-430.

Zacks, S. (1971). The Theory of Statistical Inference. Wiley, New York, 250-251.