Journal of Statistical Planning and Inference 8 (1983) 27-41
North-Holland

ESTIMATION OF VARIANCE COMPONENTS IN AN UNBALANCED ONE-WAY CLASSIFICATION

Shoutir Kishore CHATTERJEE and Kalyan DAS
Department of Statistics, Calcutta University, Calcutta, India

Received 12 July 1982; revised manuscript received 30 December 1982
Recommended by P.K. Sen

Abstract: In the unbalanced one-way random effects model the weighted least squares approach with estimated weights is used to develop a relatively simple estimator of variance components. As the number of classes increases, the proposed estimator is seen not only to be best asymptotically normal but also to be asymptotically equivalent to the maximum likelihood estimator.

AMS Subject Classification: 62J10.

Key words and phrases: One-way classification; Random effects; Variance components; Best asymptotically normal estimator.
1. Introduction

Consider an unbalanced one-way classification model subject to random effects. The j-th observation in the i-th class is

    yᵢⱼ = μ + bᵢ + eᵢⱼ,  j = 1, ..., nᵢ,  i = 1, ..., k,   (1.1)

where the bᵢ's and eᵢⱼ's are independently normally distributed with

    E(bᵢ) = 0,  Var(bᵢ) = σ₁,  E(eᵢⱼ) = 0,  Var(eᵢⱼ) = σ₀.   (1.2)

(To avoid notational complication we use bᵢ instead of aᵢ.) Writing n = Σᵢ₌₁ᵏ nᵢ,

    y_{n×1} = (y₁₁, ..., y₁ₙ₁, ..., y_{k1}, ..., y_{knₖ})',
    ε_{n×1} = (1, 1, ..., 1)',  Σ = σ₀Iₙ + σ₁ diag(E_{n₁}, ..., E_{nₖ}),

where E_v is a v×v matrix with all elements unity, we then have

    y ~ Nₙ(με, Σ).   (1.3)

Our object is to estimate σ = (σ₀, σ₁)'.

0378-3758/83/$3.00 © 1983, Elsevier Science Publishers B.V. (North-Holland)
S.K. Chatterjee, K. Das / Estimation of variance components
For this problem, in the balanced case of equal nᵢ's, the classical analysis of variance (ANOVA) estimators have well-known optimum properties and are generally accepted as the standard estimators. However, when the classification is unbalanced, i.e. the nᵢ's are unequal, the ANOVA estimators are no longer optimal either in the small-sample or in the large-sample sense (Seely (1975), Das (1978)). Alternative estimators such as the maximum likelihood estimators (MLE), restricted maximum likelihood estimators (REMLE), minimum norm quadratic unbiased estimators (MINQUE), etc., have been proposed and studied in the unbalanced situation. While these estimators are mostly (MLE and REMLE generally, and MINQUE when correct weights are used) best asymptotically normal (BAN) in the traditional sense (Miller (1977), Das (1979), Brown (1976)), their computation is quite tedious. In an earlier paper, one of the authors (Das (1978)) considered some simple estimators of σ₀, σ₁ based on a decomposition of the total sum of squares due to LaMotte (1976) and, assuming that the nᵢ's have only a finite number of possible values, showed that these are BAN. In the present paper, we develop some alternative estimators which look more natural and are BAN under broader assumptions. As the present estimators are not based on any special decomposition, they seem capable of generalization to more complex situations.
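To fix ideas, the model (1.1)-(1.2) and the classical ANOVA estimators mentioned above (the unbalanced Henderson Method 1 form) can be sketched numerically. All numerical settings below (μ, σ₀, σ₁, the class sizes nᵢ) are arbitrary illustrative choices, not values from the paper, and the data are generated through the sufficient statistics ȳᵢ and W rather than the raw yᵢⱼ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative settings (not from the paper): unbalanced class sizes n_i,
# overall mean mu, variance components sigma0 (error) and sigma1 (class).
k = 2000
n_i = rng.integers(2, 8, size=k)
mu, sigma0, sigma1 = 5.0, 2.0, 1.0
n = n_i.sum()

# Under (1.1)-(1.2): ybar_i ~ N(mu, sigma1 + sigma0/n_i) independently, and
# the within-class sum of squares W ~ sigma0 * chi^2_{n-k}, independent of ybar.
ybar = mu + rng.normal(0.0, np.sqrt(sigma1 + sigma0 / n_i))
W = sigma0 * rng.chisquare(n - k)

# ANOVA-type estimators: sigma0 from the within mean square; sigma1 from the
# between sum of squares B, using E(B) = (k-1)*sigma0 + c*sigma1.
grand_mean = (n_i * ybar).sum() / n
s0_hat = W / (n - k)
B = (n_i * (ybar - grand_mean) ** 2).sum()
c = n - (n_i ** 2).sum() / n
s1_hat = (B - (k - 1) * s0_hat) / c
print(s0_hat, s1_hat)  # consistent for (2.0, 1.0) as k grows
```

These ANOVA estimates are exactly the kind of consistent pilot estimators σ̃₀, σ̃₁ that the weighted least squares construction of Section 3 takes as its starting point.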
2. MLE's, their asymptotic properties and some auxiliary results
In this section we consider the MLE's of the parameters and state briefly the conditions under which they are BAN in the traditional sense. The results follow as a particular case of a general theorem due to Miller (1977). The conditions appearing in this section will be used afterwards in developing alternative BAN estimators.

Let us use the notation θ = (θ₀, θ₁, θ₂)' = (σ₀, σ₁, μ)' = (σ', μ)'. The parameter space of θ is the open set

    Θ = {θ : σ₀ > 0, σ₁ > 0, μ ∈ ℝ}.   (2.1)

From (1.3) the log likelihood function is given by

    λ(θ) = log L = const − ½ log|Σ| − ½ (y − με)'Σ⁻¹(y − με).

Substituting

    Σ⁻¹ = (1/σ₀) [ Iₙ − σ₁ diag( E_{n₁}/(σ₀ + n₁σ₁), ..., E_{nₖ}/(σ₀ + nₖσ₁) ) ],

    |Σ| = ∏ᵢ₌₁ᵏ σ₀^{nᵢ−1}(σ₀ + nᵢσ₁),
and writing

    ȳᵢ = (1/nᵢ) Σⱼ₌₁^{nᵢ} yᵢⱼ,  i = 1, ..., k,   W = Σᵢ Σⱼ (yᵢⱼ − ȳᵢ)²,

we obtain

    λ(θ) = const − ½ Σᵢ₌₁ᵏ (nᵢ − 1) log σ₀ − ½ Σᵢ₌₁ᵏ log(σ₀ + nᵢσ₁)
           − W/(2σ₀) − ½ Σᵢ₌₁ᵏ nᵢ(ȳᵢ − μ)²/(σ₀ + nᵢσ₁).   (2.2)

Hence,
    ∂λ(θ)/∂θ₀ = −(n − k)/(2σ₀) − ½ Σᵢ 1/(σ₀ + nᵢσ₁) + W/(2σ₀²) + ½ Σᵢ nᵢ(ȳᵢ − μ)²/(σ₀ + nᵢσ₁)²,

    ∂λ(θ)/∂θ₁ = −½ Σᵢ nᵢ/(σ₀ + nᵢσ₁) + ½ Σᵢ nᵢ²(ȳᵢ − μ)²/(σ₀ + nᵢσ₁)²,   (2.3)

    ∂λ(θ)/∂θ₂ = Σᵢ nᵢ(ȳᵢ − μ)/(σ₀ + nᵢσ₁),
and

    ∂²λ(θ)/∂θ₀² = (n − k)/(2σ₀²) + ½ Σᵢ 1/(σ₀ + nᵢσ₁)² − W/σ₀³ − Σᵢ nᵢ(ȳᵢ − μ)²/(σ₀ + nᵢσ₁)³,

    ∂²λ(θ)/∂θ₀∂θ₁ = ½ Σᵢ nᵢ/(σ₀ + nᵢσ₁)² − Σᵢ nᵢ²(ȳᵢ − μ)²/(σ₀ + nᵢσ₁)³,

    ∂²λ(θ)/∂θ₀∂θ₂ = −Σᵢ nᵢ(ȳᵢ − μ)/(σ₀ + nᵢσ₁)²,   (2.4)

    ∂²λ(θ)/∂θ₁² = ½ Σᵢ nᵢ²/(σ₀ + nᵢσ₁)² − Σᵢ nᵢ³(ȳᵢ − μ)²/(σ₀ + nᵢσ₁)³,

    ∂²λ(θ)/∂θ₁∂θ₂ = −Σᵢ nᵢ²(ȳᵢ − μ)/(σ₀ + nᵢσ₁)²,

    ∂²λ(θ)/∂θ₂² = −Σᵢ nᵢ/(σ₀ + nᵢσ₁).
In the following we shall use the same notation θ for the parameter as well as its true value. The MLE θ̂ₖ of θ is obtained by solving the likelihood equations

    ∂λ(θ)/∂θᵢ = 0,  i = 0, 1, 2.   (2.5)
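As a numerical cross-check of the reduction used above, the following sketch (all numbers arbitrary and illustrative) evaluates λ(θ) through (ȳᵢ, W) as in (2.2), with the additive constant written out explicitly, and compares it with the full Nₙ(με, Σ) log density of (1.3) computed directly:

```python
import numpy as np

rng = np.random.default_rng(1)

# Arbitrary illustrative data from model (1.1)-(1.2).
n_i = np.array([2, 3, 5, 4])
mu, s0, s1 = 1.0, 2.0, 0.5
y = [mu + rng.normal(0, np.sqrt(s1)) + rng.normal(0, np.sqrt(s0), m)
     for m in n_i]

ybar = np.array([v.mean() for v in y])            # class means
W = sum(((v - v.mean()) ** 2).sum() for v in y)   # within sum of squares
n = n_i.sum()

def loglik(t0, t1, t2):
    # lambda(theta) as in (2.2), with const = -(n/2) log(2 pi) made explicit
    a = t0 + n_i * t1
    return (-0.5 * n * np.log(2 * np.pi)
            - 0.5 * (n - len(n_i)) * np.log(t0) - 0.5 * np.log(a).sum()
            - W / (2 * t0) - 0.5 * (n_i * (ybar - t2) ** 2 / a).sum())

# Direct evaluation from y ~ N_n(mu*eps, Sigma) with Sigma as in (1.3).
yy = np.concatenate(y)
Sigma = s0 * np.eye(n)
pos = np.cumsum(np.concatenate(([0], n_i)))
for i in range(len(n_i)):
    Sigma[pos[i]:pos[i + 1], pos[i]:pos[i + 1]] += s1  # s1 * E_{n_i} block
resid = yy - mu
direct = (-0.5 * n * np.log(2 * np.pi)
          - 0.5 * np.linalg.slogdet(Sigma)[1]
          - 0.5 * resid @ np.linalg.solve(Sigma, resid))
print(np.isclose(loglik(s0, s1, mu), direct))  # True
```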
To establish the asymptotic properties of θ̂ₖ we need to impose some conditions on the sequence n₁, n₂, ..., generated with increasing k. Specifically, we assume the following.
Assumption 2.1. The sequence {nᵢ} is such that, for θ ∈ Θ:

(i) lim_{k→∞} (1/k) Σᵢ₌₁ᵏ nᵢ = l₀ exists, l₀ > 1;

(ii) lim_{k→∞} (1/k) Σᵢ₌₁ᵏ nᵢ/(σ₀ + nᵢσ₁) = l₁(σ) = l₁ exists;   (2.6)

(iii) lim_{k→∞} (1/k) Σᵢ₌₁ᵏ nᵢˢ/(σ₀ + nᵢσ₁)² = l₂ₛ(σ) = l₂ₛ exists for s = 0, 1, 2.
Assumption 2.1(i) implies that it will not do if an overwhelming number of the nᵢ's equal unity. Then from (2.4) and Assumption 2.1 we get

    lim_{k→∞} ( −(1/k) E[∂²λ(θ)/∂θᵢ∂θⱼ] ) = C = (c_{ij}),  i, j = 0, 1, 2,   (2.7)

where

    c₀₀ = ½ l₂₀ + (l₀ − 1)/(2σ₀²),  c₀₁ = ½ l₂₁,  c₀₂ = 0,
    c₁₁ = ½ l₂₂,  c₁₂ = 0,  c₂₂ = l₁.   (2.8)

As nᵢ ≥ 1, (2.6) implies

    l₁ ≥ 1/(σ₀ + σ₁) > 0  and  l₂₂ ≥ 1/(σ₀ + σ₁)² > 0.
It is now straightforward to check that C in (2.7) is positive definite whatever θ ∈ Θ. In the following we shall write

    C = ( C₁   0
          0   c₂₂ ),   (2.9)

where C₁ = (c_{ij})_{i,j=0,1} is, of course, positive definite. We now see that, as k→∞, the requirements of Theorem 3.1 in Miller (1977) are met. Hence, with probability tending to 1, there exists a consistent solution θ̂ₖ = (σ̂₀ₖ, σ̂₁ₖ, μ̂ₖ)' = (σ̂ₖ', μ̂ₖ)' of the maximum likelihood equations (2.5) such that

    √k(θ̂ₖ − θ) = √k((σ̂₀ₖ − σ₀), (σ̂₁ₖ − σ₁), (μ̂ₖ − μ))' →_d N₃(0, C⁻¹).   (2.10)

Hence, marginally,

    √k(σ̂ₖ − σ) = √k((σ̂₀ₖ − σ₀), (σ̂₁ₖ − σ₁))' →_d N₂(0, C₁⁻¹).   (2.11)

The estimator σ̂ₖ = (σ̂₀ₖ, σ̂₁ₖ)' is BAN in the sense implied by (2.11).
In the remainder of this section we consider some auxiliary results leading to a representation of θ̂ₖ which will be useful later. For any two sequences of (real or vector valued) random variables Xₖ and Yₖ, we shall use the notation Xₖ ≅ Yₖ to mean Xₖ − Yₖ →_p 0. In the following, σ̃₀, σ̃₁, μ̃ stand for any consistent estimators of σ₀, σ₁, μ, and θ̃ = (σ̃₀, σ̃₁, μ̃)' and σ̃ = (σ̃₀, σ̃₁)'. (We keep the subscript k suppressed for the sake of simplicity.)
Lemma 2.1. Let p, q, r be non-negative integers with q ≤ p. Then, as k→∞,

    (1/k) Σᵢ₌₁ᵏ nᵢ^q (ȳᵢ − μ)^r [ 1/(σ̃₀ + nᵢσ̃₁)^p − 1/(σ₀ + nᵢσ₁)^p ] →_p 0.

Proof. By the algebraic identity

    1/(σ̃₀+nᵢσ̃₁)^p − 1/(σ₀+nᵢσ₁)^p = [(σ₀ − σ̃₀) + nᵢ(σ₁ − σ̃₁)] [ 1/((σ̃₀+nᵢσ̃₁)(σ₀+nᵢσ₁)^p) + 1/((σ̃₀+nᵢσ̃₁)²(σ₀+nᵢσ₁)^{p−1}) + ⋯ + 1/((σ̃₀+nᵢσ̃₁)^p(σ₀+nᵢσ₁)) ],

the left-hand member above equals (σ₀ − σ̃₀) times

    (1/k) Σᵢ nᵢ^q (ȳᵢ − μ)^r [ 1/((σ̃₀+nᵢσ̃₁)(σ₀+nᵢσ₁)^p) + ⋯ + 1/((σ̃₀+nᵢσ̃₁)^p(σ₀+nᵢσ₁)) ]   (2.12)

plus (σ₁ − σ̃₁) times

    (1/k) Σᵢ nᵢ^{q+1} (ȳᵢ − μ)^r [ 1/((σ̃₀+nᵢσ̃₁)(σ₀+nᵢσ₁)^p) + ⋯ + 1/((σ̃₀+nᵢσ̃₁)^p(σ₀+nᵢσ₁)) ].   (2.13)

As σ̃ᵢ − σᵢ →_p 0, i = 0, 1, it is enough to prove that (2.12) and (2.13) are O_p(1) (stochastically bounded).

Let us consider the bracketed factor

    tᵢ = 1/((σ̃₀+nᵢσ̃₁)(σ₀+nᵢσ₁)^p) + ⋯ + 1/((σ̃₀+nᵢσ̃₁)^p(σ₀+nᵢσ₁)),  i = 1, ..., k.   (2.14)

Writing ρ = max(σ₀/σ̃₀, σ₁/σ̃₁), we have (σ₀+nᵢσ₁)/(σ̃₀+nᵢσ̃₁) ≤ ρ for every i, so that

    tᵢ ≤ (ρ + ρ² + ⋯ + ρ^p)/(σ₀+nᵢσ₁)^{p+1}.   (2.15)

But, as σ̃₀, σ̃₁ are consistent, ρ is O_p(1). Also, writing ν_r for the r-th absolute moment of N(0, 1), r ≥ 0, since (nᵢ/(σ₀+nᵢσ₁))^{1/2}(ȳᵢ − μ) is distributed as N(0, 1),

    E[ (1/k) Σᵢ nᵢ^q |ȳᵢ − μ|^r (σ₀+nᵢσ₁)^{−(p+1)} ] = ν_r (1/k) Σᵢ nᵢ^{q−r/2} (σ₀+nᵢσ₁)^{r/2−p−1}.   (2.16)

By Assumption 2.1(i) and the inequalities nᵢσ₁ ≤ σ₀+nᵢσ₁ ≤ (σ₀+σ₁)nᵢ, the right-hand member of (2.16) is uniformly bounded. Hence, by Tchebycheff's Lemma,

    (1/k) Σᵢ nᵢ^q |ȳᵢ − μ|^r (σ₀+nᵢσ₁)^{−(p+1)}   (2.17)

is also O_p(1). Therefore, by (2.15), we get that (2.12) is O_p(1); replacing q by q+1 throughout (note q+1 ≤ p+1), the same argument shows that (2.13) is also O_p(1). □
Lemma 2.2. Let p, q, r be as in Lemma 2.1. Then, as k→∞,

    (1/k) Σᵢ nᵢ^q (ȳᵢ − μ̃)^r / (σ̃₀+nᵢσ̃₁)^p ≅ (1/k) Σᵢ nᵢ^q (ȳᵢ − μ)^r / (σ₀+nᵢσ₁)^p.

Proof. The case r = 0 has been tackled in Lemma 2.1. For r ≥ 1 consider

    (1/k) Σᵢ nᵢ^q (ȳᵢ−μ̃)^r / (σ̃₀+nᵢσ̃₁)^p − (1/k) Σᵢ nᵢ^q (ȳᵢ−μ)^r / (σ₀+nᵢσ₁)^p
        = [ (1/k) Σᵢ nᵢ^q (ȳᵢ−μ)^r / (σ̃₀+nᵢσ̃₁)^p − (1/k) Σᵢ nᵢ^q (ȳᵢ−μ)^r / (σ₀+nᵢσ₁)^p ]
          + Σ_{s=0}^{r−1} (r choose s) (μ − μ̃)^{r−s} (1/k) Σᵢ nᵢ^q (ȳᵢ−μ)^s / (σ̃₀+nᵢσ̃₁)^p.   (2.18)

Here the first term on the right converges in probability to zero by Lemma 2.1. Also, proceeding as in the latter part of the proof of Lemma 2.1, it can be shown that each of the terms

    (1/k) Σᵢ nᵢ^q (ȳᵢ−μ)^s / (σ̃₀+nᵢσ̃₁)^p,  s = 0, 1, ..., r−1,

is O_p(1). Hence, as μ̃ is consistent for μ, all the terms on the right of (2.18) converge in probability to zero. □

Lemma 2.3. As k→∞,

    −(1/k) ∂²λ(θ̃)/∂θᵢ∂θⱼ →_p c_{ij},  i, j = 0, 1, 2,   (2.19)
where C is defined by (2.7) and (2.8).

Proof. Write

    −(1/k) ∂²λ(θ̃)/∂θᵢ∂θⱼ = −(1/k)[ ∂²λ(θ̃)/∂θᵢ∂θⱼ − ∂²λ(θ)/∂θᵢ∂θⱼ ]
        − (1/k)[ ∂²λ(θ)/∂θᵢ∂θⱼ − E ∂²λ(θ)/∂θᵢ∂θⱼ ] − (1/k) E ∂²λ(θ)/∂θᵢ∂θⱼ.   (2.20)

Denote the three terms on the right of (2.20) by φ₁, φ₂, φ₃, respectively. From (2.7), φ₃ → c_{ij} as k→∞. To show that φ₂ →_p 0, by Tchebycheff's inequality, it would be enough to show that

    Var( (1/k) ∂²λ(θ)/∂θᵢ∂θⱼ ) → 0.

It is straightforward to deduce this from the expression (2.4). For example, taking i = 0, j = 1, since nᵢ(ȳᵢ−μ)² is distributed as (σ₀+nᵢσ₁)χ₁²,

    Var( (1/k) ∂²λ(θ)/∂θ₀∂θ₁ ) = (2/k²) Σᵢ nᵢ²/(σ₀+nᵢσ₁)⁴ ≤ 2/(k(σ₀+σ₁)²σ₁²) → 0

as k→∞. That φ₁ →_p 0 follows from (2.4) and Lemma 2.2. For example, for i = 0, j = 1,

    φ₁ = ½ [ (1/k) Σᵢ nᵢ/(σ₀+nᵢσ₁)² − (1/k) Σᵢ nᵢ/(σ̃₀+nᵢσ̃₁)² ]
         + (1/k) Σᵢ nᵢ²(ȳᵢ−μ̃)²/(σ̃₀+nᵢσ̃₁)³ − (1/k) Σᵢ nᵢ²(ȳᵢ−μ)²/(σ₀+nᵢσ₁)³ →_p 0

by Lemma 2.2. □
Theorem 2.1. If θ̂ₖ is a consistent solution of the maximum likelihood equations (2.5) then, as k→∞,

    √k(θ̂ₖ − θ) ≅ C⁻¹ (1/√k) ∂λ(θ)/∂θ,

where

    ∂λ(θ)/∂θ = ( ∂λ(θ)/∂θ₀, ∂λ(θ)/∂θ₁, ∂λ(θ)/∂θ₂ )'

and C is defined by (2.7).

Proof. It is enough to show that

    (1/√k) ∂λ(θ)/∂θ ≅ C √k(θ̂ₖ − θ).   (2.21)

Expanding the left-hand members of the likelihood equations (2.5) around the true value θ,

    0 = ∂λ(θ̂ₖ)/∂θ = ∂λ(θ)/∂θ + ( ∂²λ(θ*)/∂θ∂θ' )(θ̂ₖ − θ),   (2.22)

where θ* is a convex linear combination of θ̂ₖ and θ. From (2.22),

    (1/√k) ∂λ(θ)/∂θ = ( −(1/k) ∂²λ(θ*)/∂θ∂θ' ) √k(θ̂ₖ − θ).   (2.23)

As θ* is consistent, by Lemma 2.3,

    −(1/k) ∂²λ(θ*)/∂θᵢ∂θⱼ →_p c_{ij},  i, j = 0, 1, 2.

Since √k(θ̂ₖ − θ) is O_p(1), this implies

    ( −(1/k) ∂²λ(θ*)/∂θ∂θ' ) √k(θ̂ₖ − θ) ≅ C √k(θ̂ₖ − θ).   (2.24)

From (2.23) and (2.24), (2.21) is immediate. □

Corollary 2.1. As k→∞,

    √k(σ̂ₖ − σ) ≅ C₁⁻¹ (1/√k) ∂λ(θ)/∂σ,  where ∂λ(θ)/∂σ = ( ∂λ(θ)/∂θ₀, ∂λ(θ)/∂θ₁ )'.
This follows from Theorem 2.1 by using (2.9).

Before concluding this section we prove a result (Lemma 2.5) which will be useful in the next section. But for that we first need an auxiliary result.

Lemma 2.4. As k→∞, for 0 ≤ r ≤ 2,

    (1/√k) Σᵢ₌₁ᵏ nᵢ^r [ nᵢ(ȳᵢ−μ)² − (σ₀+nᵢσ₁) ] / [ (σ₀+nᵢσ₁)(σ̃₀+nᵢσ̃₁) ] · [ 1/(σ̃₀+nᵢσ̃₁) + 1/(σ₀+nᵢσ₁) ]  is O_p(1).   (2.25)

Proof. Define

    tᵢ = nᵢ^{r−2} [ nᵢ(ȳᵢ−μ)² − (σ₀+nᵢσ₁) ] / (σ₀+nᵢσ₁),
    wᵢ = ( nᵢ/(σ̃₀+nᵢσ̃₁) ) [ nᵢ/(σ₀+nᵢσ₁) + nᵢ/(σ̃₀+nᵢσ̃₁) ],   (2.26)

so that the left-hand member of (2.25) becomes Σᵢ₌₁ᵏ tᵢwᵢ/√k. Let (α₁, α₂, ..., αₖ) be the permutation of (1, 2, ..., k) such that n_{α₁} ≤ n_{α₂} ≤ ⋯ ≤ n_{αₖ}. Then w_{α₁} ≤ w_{α₂} ≤ ⋯ ≤ w_{αₖ}. Hence by Abel's well-known lemma, writing

    h = min( t_{αₖ}, t_{αₖ}+t_{αₖ₋₁}, ..., t_{αₖ}+t_{αₖ₋₁}+⋯+t_{α₁} ),
    H = max( t_{αₖ}, t_{αₖ}+t_{αₖ₋₁}, ..., t_{αₖ}+t_{αₖ₋₁}+⋯+t_{α₁} ),

we have

    h w_{αₖ} ≤ Σᵢ tᵢwᵢ ≤ H w_{αₖ}.   (2.27)
By (2.27), for any number A > 0,

    P{ (1/√k) |Σᵢ tᵢwᵢ| ≥ A w_{αₖ} } ≤ P{ max( |t_{αₖ}|, |t_{αₖ}+t_{αₖ₋₁}|, ..., |t_{αₖ}+⋯+t_{α₁}| ) ≥ A√k }.   (2.28)

In (2.28), since nᵢ/(σ₀+nᵢσ₁) ≤ 1/σ₁ and nᵢ/(σ̃₀+nᵢσ̃₁) ≤ 1/σ̃₁,

    w_{αₖ} = ( n_{αₖ}/(σ̃₀+n_{αₖ}σ̃₁) ) [ n_{αₖ}/(σ₀+n_{αₖ}σ₁) + n_{αₖ}/(σ̃₀+n_{αₖ}σ̃₁) ] ≤ (1/σ̃₁)(1/σ₁ + 1/σ̃₁) = w₍ₖ₎  (say).   (2.29)

Now w₍ₖ₎ →_p 2/σ₁². Then for any ε, δ > 0, for sufficiently large k,

    P{ w₍ₖ₎ ≤ 2/σ₁² + ε } ≥ 1 − δ.   (2.30)

Using (2.28) and (2.29) in (2.27), by a standard technique, for large k,

    P{ (1/√k) |Σᵢ tᵢwᵢ| ≥ A(2/σ₁² + ε) } ≤ P{ max( |t_{αₖ}|, |t_{αₖ}+t_{αₖ₋₁}|, ..., |t_{αₖ}+⋯+t_{α₁}| ) ≥ A√k } + δ.   (2.31)

Now, t_{α₁}, t_{α₂}, ..., t_{αₖ} are independent random variables with mean zero and

    Var( Σᵢ t_{αᵢ} ) = Var( Σᵢ tᵢ ) = 2 Σᵢ nᵢ^{2(r−2)} ≤ 2k  (since r ≤ 2).

Hence, applying Kolmogorov's inequality to the first term on the right of (2.31), we get that for large k

    P{ (1/√k) |Σᵢ tᵢwᵢ| ≤ A(2/σ₁² + ε) } ≥ 1 − δ − 2/A².   (2.32)

The bound in (2.32) can be taken arbitrarily close to 1 by choosing ε, δ, A appropriately. This proves the lemma. □

Lemma 2.5. If μ̃ is such that √k(μ̃ − μ) is O_p(1), then, as k→∞,

(i) (1/√k) Σᵢ [ nᵢ²(ȳᵢ−μ̃)² − nᵢ(σ₀+nᵢσ₁) ] / (σ̃₀+nᵢσ̃₁)² ≅ (1/√k) Σᵢ [ nᵢ²(ȳᵢ−μ)² − nᵢ(σ₀+nᵢσ₁) ] / (σ₀+nᵢσ₁)²;

(ii) (1/√k) { [W − (n−k)σ₀]/σ̃₀² + Σᵢ [ nᵢ(ȳᵢ−μ̃)² − (σ₀+nᵢσ₁) ] / (σ̃₀+nᵢσ̃₁)² } ≅ (1/√k) { [W − (n−k)σ₀]/σ₀² + Σᵢ [ nᵢ(ȳᵢ−μ)² − (σ₀+nᵢσ₁) ] / (σ₀+nᵢσ₁)² }.
Proof. We give the proof of (i), the proof of (ii) being quite similar. The difference between the two sides of (i) can be written as

    (1/√k) Σᵢ [ nᵢ²(ȳᵢ−μ)² − nᵢ(σ₀+nᵢσ₁) ] [ 1/(σ̃₀+nᵢσ̃₁)² − 1/(σ₀+nᵢσ₁)² ]
        − 2√k(μ̃−μ) (1/k) Σᵢ nᵢ²(ȳᵢ−μ)/(σ̃₀+nᵢσ̃₁)²
        + √k(μ̃−μ)² (1/k) Σᵢ nᵢ²/(σ̃₀+nᵢσ̃₁)².   (2.33)

Now, the first term in (2.33) can be simplified to

    (σ₀−σ̃₀) (1/√k) Σᵢ [ nᵢ²(ȳᵢ−μ)² − nᵢ(σ₀+nᵢσ₁) ] / [ (σ̃₀+nᵢσ̃₁)(σ₀+nᵢσ₁) ] · [ 1/(σ̃₀+nᵢσ̃₁) + 1/(σ₀+nᵢσ₁) ]
      + (σ₁−σ̃₁) (1/√k) Σᵢ [ nᵢ³(ȳᵢ−μ)² − nᵢ²(σ₀+nᵢσ₁) ] / [ (σ̃₀+nᵢσ̃₁)(σ₀+nᵢσ₁) ] · [ 1/(σ̃₀+nᵢσ̃₁) + 1/(σ₀+nᵢσ₁) ].   (2.34)

Remembering that σ̃₀, σ̃₁ are consistent, and applying Lemma 2.4 with r = 1, 2, we deduce that both the terms in (2.34) converge in probability to zero. The second term in (2.33) can be rewritten as

    −2√k(μ̃−μ) [ (1/k) Σᵢ nᵢ²(ȳᵢ−μ)/(σ₀+nᵢσ₁)² + (1/k) Σᵢ nᵢ²(ȳᵢ−μ) { 1/(σ̃₀+nᵢσ̃₁)² − 1/(σ₀+nᵢσ₁)² } ].   (2.35)

In (2.35), within square brackets, both terms converge in probability to zero (the first by Tchebysheff's Lemma and the second by Lemma 2.1). As √k(μ̃−μ) is O_p(1), this means that (2.35) itself converges in probability to zero. By Lemma 2.1 the third term in (2.33) is equivalent in probability to

    √k(μ̃−μ)² (1/k) Σᵢ nᵢ²/(σ₀+nᵢσ₁)².

That this converges to zero follows from Assumption 2.1(iii) and the properties of μ̃. Thus all the terms in (2.33) converge in probability to zero, so that (i) is true. □
Note. Although Lemma 2.5 is superficially similar to Lemma 2.2, the proof is considerably more involved. The reason is that, whereas the normalizing factor 1/k appears in Lemma 2.2, we have 1/√k in Lemma 2.5.
3. An alternative estimator

In this section we propose an alternative BAN estimator which is easier to compute than the MLE. Furthermore, we show that the proposed estimator is asymptotically equivalent to the MLE in the sense that the difference of the two estimates of each of σ₀ and σ₁, when appropriately normalized, converges in probability to zero.

Defining ȳᵢ and W as in Section 2, we have

    E{nᵢ(ȳᵢ−μ)²} = σ₀ + nᵢσ₁,  Var{nᵢ(ȳᵢ−μ)²} = 2(σ₀+nᵢσ₁)²,  i = 1, 2, ..., k,
    E(W) = (n−k)σ₀,  Var(W) = 2(n−k)σ₀².   (3.1)

Let θ̃ = (σ̃₀, σ̃₁, μ̃)' be any consistent estimator of θ, μ̃ being such that √k(μ̃−μ) = O_p(1). In particular σ̃₀, σ̃₁ may be the ANOVA estimators of σ₀, σ₁ and μ̃ may be the grand mean. We now follow the weighted least squares approach based on (3.1), with μ replaced by μ̃ and σ₀, σ₁ in the weights replaced by σ̃₀, σ̃₁. Set up

    S = Σᵢ₌₁ᵏ { nᵢ(ȳᵢ−μ̃)² − (σ₀+nᵢσ₁) }² / ( 2(σ̃₀+nᵢσ̃₁)² ) + { W − (n−k)σ₀ }² / ( 2(n−k)σ̃₀² ).   (3.2)

Suppose, minimizing S with respect to σ₀ and σ₁, we get a solution σ̄ₖ = (σ̄₀ₖ, σ̄₁ₖ)'. This satisfies the normal equation

    B(θ̃) σ̄ₖ = g(θ̃),   (3.3)

where

    B(θ̃) = (1/2k) ( (n−k)/σ̃₀² + Σᵢ 1/(σ̃₀+nᵢσ̃₁)²    Σᵢ nᵢ/(σ̃₀+nᵢσ̃₁)²
                     Σᵢ nᵢ/(σ̃₀+nᵢσ̃₁)²               Σᵢ nᵢ²/(σ̃₀+nᵢσ̃₁)² ),   (3.4)

    g(θ̃) = (1/2k) ( W/σ̃₀² + Σᵢ nᵢ(ȳᵢ−μ̃)²/(σ̃₀+nᵢσ̃₁)² ,  Σᵢ nᵢ²(ȳᵢ−μ̃)²/(σ̃₀+nᵢσ̃₁)² )'.   (3.5)

We shall study the asymptotic behaviour of σ̄₀ₖ and σ̄₁ₖ. Denoting the true value by σ and defining

    h(θ̃) = √k [ g(θ̃) − B(θ̃)σ ]
          = (1/(2√k)) ( [W−(n−k)σ₀]/σ̃₀² + Σᵢ [nᵢ(ȳᵢ−μ̃)² − (σ₀+nᵢσ₁)]/(σ̃₀+nᵢσ̃₁)² ,
                         Σᵢ [nᵢ²(ȳᵢ−μ̃)² − nᵢ(σ₀+nᵢσ₁)]/(σ̃₀+nᵢσ̃₁)² )',   (3.6)

we can rewrite (3.3) as

    B(θ̃) √k(σ̄ₖ − σ) = h(θ̃).   (3.7)

We next note that, as k→∞,

    B(θ̃) →_p C₁   (3.8)

and

    h(θ̃) ≅ h(θ),   (3.9)

where h(θ) is obtained by replacing θ̃ in (3.6) by the true value θ. (3.8) follows from (2.7) by an application of Lemma 2.1 with r = 0; (3.9) is immediate from Lemma 2.5. Since C₁ is positive definite, (3.8) implies, as k→∞, that B(θ̃) is positive definite in probability. Therefore, with probability tending to 1 as k→∞, from (3.7) we get

    √k(σ̄ₖ − σ) = B⁻¹(θ̃) h(θ̃).   (3.10)

Also, since B⁻¹(θ̃) and h(θ) are bounded in probability, (3.8)-(3.10) give

    √k(σ̄ₖ − σ) ≅ B⁻¹(θ̃) h(θ) ≅ C₁⁻¹ h(θ).   (3.11)

Now from (2.3) and (3.6) it is easy to check that

    h(θ) = (1/√k) ∂λ(θ)/∂σ.
Hence we can rewrite (3.11) as

    √k(σ̄ₖ − σ) ≅ C₁⁻¹ (1/√k) ∂λ(θ)/∂σ.   (3.12)

From Corollary 2.1 and (3.11) we arrive at the following conclusion immediately.

Theorem 3.1. As k→∞,

    √k(σ̄ₖ − σ) ≅ √k(σ̂ₖ − σ),

where σ̄ₖ, σ̂ₖ are respectively the proposed estimator and the maximum likelihood estimator.

From (2.11) and Theorem 3.1 we deduce:

Theorem 3.2. As k→∞,

    √k(σ̄ₖ − σ) →_d N₂(0, C₁⁻¹).

Thus σ̄ₖ is also a BAN estimator.

Note. Instead of proving (3.12) via Theorem 3.1 we could deduce it directly from the representation (3.11) and the fact that (1/√k)(∂λ(θ)/∂σ) is asymptotically distributed as N₂(0, C₁).
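The computations of this section reduce to forming B(θ̃) and g(θ̃) and solving the 2×2 system (3.3). A minimal numerical sketch follows; all settings are arbitrary illustrative choices, the data are again generated through the sufficient statistics (ȳᵢ, W), and the ANOVA estimates and the grand mean serve as the pilot θ̃.

```python
import numpy as np

rng = np.random.default_rng(2)

# Illustrative truth (not from the paper) and simulated sufficient statistics.
k = 3000
n_i = rng.integers(2, 9, size=k)
mu, s0, s1 = 0.0, 1.5, 0.8
n = n_i.sum()
ybar = mu + rng.normal(0.0, np.sqrt(s1 + s0 / n_i))  # ybar_i ~ N(mu, s1+s0/n_i)
W = s0 * rng.chisquare(n - k)                        # W ~ s0 * chi^2_{n-k}

# Pilot estimators: ANOVA estimates and the weighted grand mean.
mu_t = (n_i * ybar).sum() / n
s0_t = W / (n - k)
s1_t = (((n_i * (ybar - mu_t) ** 2).sum() - (k - 1) * s0_t)
        / (n - (n_i ** 2).sum() / n))

# Normal equation (3.3): B(theta_t) sigma_bar = g(theta_t), per (3.4)-(3.5).
a_t = s0_t + n_i * s1_t
b01 = (n_i / a_t ** 2).sum()
Bmat = np.array([[(n - k) / s0_t ** 2 + (1 / a_t ** 2).sum(), b01],
                 [b01, (n_i ** 2 / a_t ** 2).sum()]]) / (2 * k)
g = np.array([W / s0_t ** 2 + (n_i * (ybar - mu_t) ** 2 / a_t ** 2).sum(),
              (n_i ** 2 * (ybar - mu_t) ** 2 / a_t ** 2).sum()]) / (2 * k)
sigma_bar = np.linalg.solve(Bmat, g)
print(sigma_bar)  # close to the truth (1.5, 0.8) for large k
```

Only the class means, the class sizes and W enter, so the cost is O(k) plus one 2×2 solve, in contrast with the iterative maximization needed for the MLE.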
4. Concluding remarks
It has been generally known that the use of weighted least squares with estimated weights leads to BAN estimators of the parameters of linear models (see e.g. Malinvaud (1973), Chapter 9). We have, however, used the approach for the estimation of variance components. Unlike Das (1978), here no attempt has been made to apply the technique on any set of minimal sufficient statistics. Instead, a simple set of sufficient statistics has been used for the purpose. This makes the technique capable of extension to more complex models. Some of this is intended to be presented in a subsequent communication.

Further insight into the technique is obtained by rewriting (3.3) as

    σ̄ₖ = σ̃ + B⁻¹(θ̃) [ g(θ̃) − B(θ̃)σ̃ ].   (4.1)

This shows that σ̄ₖ is obtained by applying a correction to the initial estimate σ̃. In view of (2.7) and (3.8), B(θ̃) can be regarded as an estimate of the (marginal) information matrix of σ₀, σ₁. Thus (4.1) resembles a well-known result for the i.i.d. case (see e.g. Zacks (1971)). The proof for the i.i.d. case, however, is not valid under the present model.

Before concluding this section we make some remarks about our choice of normalizing factors. Throughout our discussion, for the sake of simplicity, we have used for both σ₀ and σ₁ a common normalizing factor √k to derive the asymptotic distribution of the proposed estimator. As noted by Weiss (1971, 1973), when the observations are not i.i.d., it is not automatic that a common normalizing factor would serve for the estimates of all the parameters. In the present context one might think that it would be more natural to consider the asymptotic distribution of

    ( √(n−k)(σ̄₀ₖ − σ₀), √k(σ̄₁ₖ − σ₁) )  or  ( √(n−k)(σ̂₀ₖ − σ₀), √k(σ̂₁ₖ − σ₁) ).

(In fact, if one applies Miller's (1977) result in the original form, one gets the asymptotic distribution of the MLE in this form.) However, this poses no new problem. By Assumption 2.1, (n−k)/k → (l₀ − 1), so that (2.11) and the conclusions of Theorems 3.1 and 3.2 remain valid if we replace √k(σ̄₀ₖ − σ₀), √k(σ̂₀ₖ − σ₀) by √(n−k)(σ̄₀ₖ − σ₀), √(n−k)(σ̂₀ₖ − σ₀). Then the only modification is that C₁⁻¹ is replaced by diag((l₀−1)^{1/2}, 1) C₁⁻¹ diag((l₀−1)^{1/2}, 1).
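The equivalence between the correction form (4.1) and the direct solution of (3.3) is a two-line algebraic identity; with stand-in numbers (not from the paper) for B(θ̃), g(θ̃) and the pilot σ̃:

```python
import numpy as np

# Stand-ins for B(theta_tilde), g(theta_tilde) and the pilot sigma_tilde.
B = np.array([[0.9, 0.3],
              [0.3, 1.4]])
g = np.array([1.2, 0.7])
sigma_t = np.array([1.0, 0.5])

# (4.1): pilot plus correction ...    vs.  ... solving (3.3) directly.
one_step = sigma_t + np.linalg.solve(B, g - B @ sigma_t)
direct = np.linalg.solve(B, g)
print(np.allclose(one_step, direct))  # True
```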
References

Brown, K.G. (1976). Asymptotic behaviour of MINQUE-type estimators of variance components. Ann. Statist. 4, 746-754.
Das, K. (1978). BAN estimators of variance components for the unbalanced one-way classification. Cal. Stat. Assoc. Bull. 27, 97-118.
Das, K. (1979). Asymptotic optimality of restricted maximum likelihood estimates for the mixed model. Cal. Stat. Assoc. Bull. 28, 125-142.
LaMotte, L.R. (1976). Invariant quadratic estimators in the random, one-way ANOVA model. Biometrics 32, 793-804.
Malinvaud, E. (1973). Statistical Methods of Econometrics. North-Holland, Amsterdam.
Miller, J.J. (1977). Asymptotic properties of maximum likelihood estimates in the mixed model of the analysis of variance. Ann. Statist. 5, 746-762.
Seely, J. (1975). An example of an inadmissible analysis of variance estimator for a variance component. Biometrika 62, 689-693.
Weiss, L. (1971). Asymptotic properties of maximum likelihood estimators in some nonstandard cases. J. Amer. Statist. Assoc. 66, 345-350.
Weiss, L. (1973). Asymptotic properties of maximum likelihood estimators in some nonstandard cases. J. Amer. Statist. Assoc. 68, 428-430.
Zacks, S. (1971). The Theory of Statistical Inference. Wiley, New York, 250-251.