Journal of Statistical Planning and Inference 39 (1994) 43-55
North-Holland

Shrinkage domination of some usual estimators of the common mean of several multivariate normal populations

Sanat K. Sarkar

Speakman Hall (066-00), Department of Statistics, School of Business and Management, Temple University, Philadelphia, PA 19122, USA

Received 2 November 1992
Abstract: Given $n$ mutually independent $p$-dimensional random vectors $X_1, \dots, X_n$, where $X_i \sim N_p(\mu, \sigma_i^2 I)$, the problem of estimating the common mean vector $\mu$ is considered assuming that the $\sigma_i^2$'s are unknown. A class of Stein-type shrinkage estimators uniformly dominating a usual estimator based only on the $X_i$'s is given. This strengthens a similar result due to George (1991) and Krishnamoorthy (1991) for the special case $n = 2$. As a consequence of this result, a class of Stein-type shrinkage estimators uniformly dominating a usual estimator based on the $X_i$'s and some estimators of the $\sigma_i^2$'s distributed independently of the $X_i$'s is also obtained.

AMS Subject Classification: Primary 62H12; secondary 62C99, 62J07.

Key words: Multivariate normal populations; common mean; usual estimators; Stein-type shrinkage domination.
1. Introduction

Suppose that we have $n$ mutually independent $p$-dimensional random vectors $X_1, \dots, X_n$ distributed as $N_p(\mu, \sigma_i^2 I)$, $i = 1, \dots, n$, respectively. We consider the problem of estimating the common mean vector $\mu$ when the $\sigma_i^2$'s are unknown, and the loss incurred in using $\hat\mu$ as an estimator is assumed to be
$$L(\hat\mu, \mu) = \|\hat\mu - \mu\|^2. \tag{1.1}$$

First, we consider this estimation problem based only on the $X_i$'s. When the proportions $\eta_i = \sigma_i^{-2}\big/\sum_{j=1}^n \sigma_j^{-2}$ are known, the estimator
$$\bar X_\eta = \sum_{i=1}^n \eta_i X_i \tag{1.2}$$
Correspondence to: Sanat K. Sarkar, Speakman Hall (066-00), Department of Statistics, School of Business and Management, Temple University, Philadelphia, PA 19122, USA.

0378-3758/94/$07.00 © 1994 Elsevier Science B.V. All rights reserved.
SSDI 0378-3758(93)E0044-H
is optimum in several senses. It is, however, inadmissible when $p > 2$, and is uniformly dominated by its Stein-type shrinkage versions
$$\bar X_{\eta,s} = \left(1 - \frac{c\,\min_i\eta_i\,\sum_{i=1}^n \eta_i\|X_i - \bar X_\eta\|^2}{\|\bar X_\eta\|^2}\right)\bar X_\eta, \qquad 0 \le c \le \frac{2(p-2)}{p+2}. \tag{1.3}$$
The proof of this result follows easily by applying the standard technique of proving this kind of dominance result [see, for example, Stein (1981), Berger (1985) or Brandwein and Strawderman (1990)] and making use of the following facts. There exist independent random variables $V_2, \dots, V_n$, where $V_i \sim \chi^2_p$ (central chi-squared with $p$ degrees of freedom), for $i = 2, \dots, n$, independently of $\bar X_\eta \sim N_p(\mu, a_1 I)$, such that
$$\sum_{i=1}^n \eta_i\|X_i - \bar X_\eta\|^2 = \sum_{i=2}^n a_i V_i,$$
for some nonnegative constants $a_i$, $i = 1, \dots, n$. Also,
$$\frac{a_1\,E\big(\sum_{i=2}^n a_i V_i\big)}{E\big(\sum_{i=2}^n a_i V_i\big)^2} \ge \frac{a_1}{(p+2)\sum_{i=2}^n a_i},$$
using Bhattacharya's (1984) inequality, and
$$\frac{a_1}{\sum_{i=2}^n a_i} \ge \min_{1\le i\le n}\eta_i$$
(see Section 2).

With unknown $\eta_i$'s, an estimator of the form
$$\bar X_\pi = \sum_{i=1}^n \pi_i X_i \tag{1.4}$$
is often a natural choice, where the $\pi_i$'s, satisfying $0 < \pi_i < 1$ and $\sum_{i=1}^n \pi_i = 1$, are known constants. The fact that its Stein-type shrinkage version
$$\bar X_{\pi,s} = \left(1 - \frac{c\,\min_i\pi_i\,\sum_{i=1}^n \pi_i\|X_i - \bar X_\pi\|^2}{\|\bar X_\pi\|^2}\right)\bar X_\pi, \qquad 0 \le c \le \frac{2(p-2)}{p+2}, \tag{1.5}$$
dominates $\bar X_\pi$ for $p > 2$ over the restricted parameter space $\{(\mu, \sigma_1^2, \dots, \sigma_n^2): \eta_i = \pi_i,\ i = 1, \dots, n\}$ does not immediately suggest that the same phenomenon holds for $\bar X_{\pi,s}$ over the entire parameter space; the primary reason being that, here, $\sum_{i=1}^n \pi_i\|X_i - \bar X_\pi\|^2$ and $\bar X_\pi$ become heavily dependent. George (1991) treated this problem for $n = 2$ and showed
that, in spite of this dependence, $\bar X_{\pi,s}$ uniformly dominates $\bar X_\pi$ for $p > 2$; however, only for $c$ in an interval smaller than that in (1.5). He, of course, suggested a better interval for $c$ through extensive simulation, which is again slightly smaller than that in (1.5).
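Before turning to the analytical results, the dominance phenomenon can be illustrated numerically. The following Monte Carlo sketch is ours, not from the paper: it compares the risk of $\bar X_\eta$ with a shrinkage version of the form (1.3), with known variances, at $\mu = 0$ where the improvement is largest. All variable names and parameter choices are our own.

```python
import numpy as np

# Monte Carlo sketch (ours, not from the paper): risk of the weighted
# mean Xbar_eta versus a Stein-type shrinkage version of the form (1.3),
# with known variances, evaluated at mu = 0 where shrinkage helps most.
rng = np.random.default_rng(0)
p, n, reps = 10, 3, 20000
sigma2 = np.array([1.0, 2.0, 4.0])           # known variances sigma_i^2
eta = (1 / sigma2) / (1 / sigma2).sum()      # proportions eta_i of (1.2)
mu = np.zeros(p)                             # true common mean
c = 2 * (p - 2) / (p + 2)                    # upper end of the c-range

X = mu + rng.normal(size=(reps, n, p)) * np.sqrt(sigma2)[None, :, None]
xbar = np.einsum('i,rip->rp', eta, X)        # Xbar_eta, one row per replication
s = np.einsum('i,ri->r', eta, ((X - xbar[:, None, :]) ** 2).sum(axis=2))
xshr = (1 - c * eta.min() * s / (xbar ** 2).sum(axis=1))[:, None] * xbar

risk_plain = ((xbar - mu) ** 2).sum(axis=1).mean()
risk_shrunk = ((xshr - mu) ** 2).sum(axis=1).mean()
print(risk_plain, risk_shrunk)
```

With these settings the plain risk is close to its theoretical value $p\,(\sum_i \sigma_i^{-2})^{-1} \approx 5.71$, and the shrinkage version shows a markedly smaller estimated risk.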
Krishnamoorthy (1991b), in a recent paper, improved George's result by analytically proving that this dominance result indeed holds for $c$ in the interval given in (1.5). In this article, we strengthen Krishnamoorthy's (1991b) result, and hence George's (1991). We not only treat the problem in a more general set-up, but also give a wider class of shrinkage estimators dominating $\bar X_\pi$. More specifically, it is shown that
$$\bar X_{\pi,s}(\phi) = \left[1 - \frac{\min_i\pi_i\,\phi(\|\bar X_\pi\|^2)\sum_{i=1}^n \pi_i\|X_i - \bar X_\pi\|^2}{\|\bar X_\pi\|^2}\right]\bar X_\pi, \tag{1.6}$$
for a certain class of $\phi$, which includes that considered in George (1991) and Krishnamoorthy (1991b), dominates $\bar X_\pi$ uniformly if $p > 2$. Furthermore, our method of proof is much more general, involving some new identities and inequalities.

We also consider the problem of estimating $\mu$ when additional observations are available to estimate the $\sigma_i^2$'s. Let $\hat\sigma_1^2, \dots, \hat\sigma_n^2$, which are distributed independently of $(X_1, \dots, X_n)$, represent some estimates of $\sigma_1^2, \dots, \sigma_n^2$, respectively. Several authors have considered this problem. For $p = 1$, Graybill and Deal (1959), Bement and Williams (1969), Khatri and Shah (1974), Norwood and Hinkelmann (1977), and Bhattacharya (1984) derived conditions for the most commonly used estimator, namely,
i:
&Xi,
9i=&+
i=l
i3]?,
(1.7)
j=l
with unbiased $\hat\sigma_i^2$'s to dominate each $X_i$. Brown and Cohen (1974), Cohen and Sackrowitz (1974), and Bhattacharya (1980) discussed properties of some alternative estimators when $p = 1$. Shinozaki (1978), Swamy and Mehta (1979), Yancey et al. (1984), and Kubokawa (1990) considered equivalent problems in a regression setting. In a more general multivariate setting, Chiou and Cohen (1985), Krishnamoorthy (1991a), and Loh (1990) also treated this problem. In this paper, we provide a class of Stein-type shrinkage versions of $\bar X_{\hat\eta}$ uniformly dominating it for $p > 2$. This dominance result follows as a direct consequence of the previous result showing dominance of $\bar X_{\pi,s}(\phi)$ over $\bar X_\pi$ and, interestingly, does not depend on the distributional properties of the $\hat\eta_i$'s and $\hat\sigma_i^2$'s.

2. Shrinkage domination

We first present in this section the main result establishing uniform dominance of (1.6) over $\bar X_\pi$.
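To fix ideas before the formal statement, here is a small sketch (ours) of the usual estimator (1.4) and its shrinkage version (1.6); the function names are our own, and $\phi(x) = ax/(x+b)$ is one of the admissible choices mentioned after Theorem 1.

```python
import numpy as np

# Sketch (ours) of the usual estimator (1.4) and its shrinkage
# version (1.6); X holds one p-dimensional observation per row.
def usual_mean(X, pi):
    """Xbar_pi = sum_i pi_i X_i."""
    return np.einsum('i,ip->p', pi, X)

def shrinkage_mean(X, pi, phi):
    """Stein-type shrinkage version (1.6) of Xbar_pi."""
    xbar = usual_mean(X, pi)
    s = pi @ ((X - xbar) ** 2).sum(axis=1)   # sum_i pi_i ||X_i - Xbar_pi||^2
    norm2 = (xbar ** 2).sum()
    return (1 - pi.min() * phi(norm2) * s / norm2) * xbar

p = 8
a, b = 2 * (p - 2) / (p + 2), 1.0            # phi(x) = a x / (x + b)
phi = lambda x: a * x / (x + b)
pi = np.array([0.5, 0.3, 0.2])
X = np.vstack([np.full(p, 1.0), np.full(p, 1.2), np.full(p, 0.8)])
print(shrinkage_mean(X, pi, phi))
```

The shrinkage multiplier here is strictly less than one whenever the observations disagree, so the estimate is pulled toward the origin relative to $\bar X_\pi$.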
Theorem 1. Let $\phi(x)$ be a function satisfying (i) $0 \le \phi(x) \le 2(p-2)/(p+2)$; (ii) $\phi'(x) = \mathrm{d}\phi(x)/\mathrm{d}x \ge 0$; (iii) $\phi''(x) = \mathrm{d}^2\phi(x)/\mathrm{d}x^2 \le 0$; (iv) $\phi(x)/x$ is convex and $\phi''(x)$ is nondecreasing. Then, for any $\phi$ satisfying these conditions, $\bar X_{\pi,s}(\phi)$ uniformly dominates $\bar X_\pi$ under the loss function (1.1) if $0 < \pi_i < 1$, $i = 1, \dots, n$, and $p > 2$.

The most obvious choice for $\phi$ satisfying the conditions stated in Theorem 1 is a constant function. Another choice is a class of functions of the form $\phi(x) = ax/(x+b)$, for some $a > 0$ and $b > 0$. This latter $\phi$ leads to shrinkage estimators of the type considered by Shinozaki (1984) in estimating some nonnormal location parameters.

We need some supporting results that will facilitate the proof of the theorem. Towards this end, first, we have two lemmas related to the central chi-square distribution.

Lemma 1. Let $\phi$ be a differentiable function such that $E[\phi(\chi^2_\nu)] < \infty$, for any $\nu > 0$. Then, if $\nu > 2$,
$$E\left[\frac{\phi(\chi^2_\nu)}{\chi^2_\nu}\right] = \frac{1}{\nu-2}\left\{E[\phi(\chi^2_\nu)] - 2E[\phi'(\chi^2_\nu)]\right\}. \tag{2.1}$$

Proof. The proof of this lemma follows directly from the following identity:
$$E[\phi(\chi^2_\nu)] = (\nu-2)\,E[\chi_\nu^{-2}\,\phi(\chi^2_\nu)] + 2\,E[\phi'(\chi^2_\nu)]$$
[see, for example, Efron and Morris (1976), or Haff (1977) for a more general identity for the Wishart distribution]. $\square$

Lemma 2. Let $g(x) = \phi(x)/x$, for some nonnegative function $\phi$ such that $E[\phi(\chi^2_\nu)] < \infty$, for any $\nu > 0$. Then, for $\nu > 2$,
$$E^2[g(\chi^2_{\nu+2})] \le \frac{\nu+2}{\nu}\,E[g(\chi^2_\nu)]\,E[g(\chi^2_{\nu+4})]. \tag{2.2}$$
Proof. Since $f_\nu(x)/x = f_{\nu-2}(x)/(\nu-2)$, $f_\nu$ being the density of $\chi^2_\nu$, we have
$$\frac{E[g(\chi^2_{\nu+2})]}{E[g(\chi^2_\nu)]} = \frac{1}{\nu}\left[\int x^{-1}h_\nu(x)\,dx\right]^{-1},$$
where
$$h_\nu(x) = \frac{\phi(x)\,f_\nu(x)}{E[\phi(\chi^2_\nu)]}.$$
It is easy to check that $h_\nu(x)$ is TP$_2$ in $(x, \nu)$. Hence, from Karlin (1968), the expectation of $1/X$ with respect to this density is decreasing in $\nu > 2$,
which implies
$$\frac{E[g(\chi^2_{\nu+2})]}{E[g(\chi^2_\nu)]} \le \frac{\nu+2}{\nu}\cdot\frac{E[g(\chi^2_{\nu+4})]}{E[g(\chi^2_{\nu+2})]}.$$
Hence, the lemma is proved. $\square$
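Both lemmas can be sanity-checked against closed-form chi-square moments. The following check is ours: it uses $\phi(x) = x^2$, for which $E[\chi^2_\nu] = \nu$ and $E[(\chi^2_\nu)^2] = \nu(\nu+2)$, to verify the identity quoted in the proof of Lemma 1 and the inequality (2.2) with $g(x) = \phi(x)/x = x$.

```python
# Closed-form check (ours) of the identity behind Lemma 1 and of the
# inequality (2.2), using phi(x) = x^2, so that phi'(x) = 2x and
# g(x) = phi(x)/x = x.  Exact moments: E[chi2_v] = v, E[chi2_v^2] = v(v+2).
for v in (3, 5, 10, 50):
    lhs = v * (v + 2)                    # E[phi(chi2_v)]
    rhs = (v - 2) * v + 2 * (2 * v)      # (v-2) E[phi/chi2_v] + 2 E[phi']
    assert lhs == rhs                    # the Lemma 1 identity holds exactly

    # (2.2): E^2[g(chi2_{v+2})] <= ((v+2)/v) E[g(chi2_v)] E[g(chi2_{v+4})]
    assert (v + 2) ** 2 <= (v + 2) / v * v * (v + 4)
print("Lemma 1 identity and (2.2) hold for phi(x) = x^2")
```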
Now, we present some useful lemmas related to expectations of certain functions of a $2p$-dimensional random vector $(X', Y')'$ having the following distribution:
$$\begin{pmatrix} X \\ Y \end{pmatrix} \sim N_{2p}\left[\begin{pmatrix} \theta \\ 0 \end{pmatrix},\ \begin{pmatrix} I & \rho I \\ \rho I & I \end{pmatrix}\right], \tag{2.3}$$
where $\theta$ is a $p \times 1$ vector and $-1 < \rho < 1$. First, we have the following identity.
Lemma 3. Let $\phi$ be a differentiable function such that $E[\phi(\chi^2_\nu)] < \infty$, for any $\nu > 0$. Then, if $p > 2$,
$$E\left[\frac{(X-\theta)'X\,\phi(\|X\|^2)}{\|X\|^2}\,\|Y\|^2\right] = E[\psi(\|X\|^2)\,\|Y\|^2] + 2\rho^2\,E[\psi(\|X\|^2)], \tag{2.4}$$
where
$$\psi(x) = p\,g(x) + 2x\,g'(x), \qquad g(x) = \phi(x)/x.$$
Proof. Let $g(x) = \phi(x)/x$. Observing that $Y$ can be represented as $Y = \rho X + (1-\rho^2)^{1/2}Z$, for some $Z$ distributed as $N_p[-\rho(1-\rho^2)^{-1/2}\theta,\ I]$, independently of $X$, and applying Stein's (1981) identity, namely, $E[(X_i-\theta_i)h(X_i)] = E[h'(X_i)]$ for any differentiable function $h$ satisfying $E|h'(X_i)| < \infty$, we see that the left-hand side of (2.4) is equal to
$$E\sum_{i=1}^p\left\{g(\|X\|^2)\|Y\|^2 + 2g'(\|X\|^2)X_i^2\|Y\|^2 + 2\rho X_i[\rho X_i + (1-\rho^2)^{1/2}Z_i]\,g(\|X\|^2)\right\}$$
$$= p\,E[g(\|X\|^2)\|Y\|^2] + 2E[g'(\|X\|^2)\|X\|^2\|Y\|^2] + 2\rho^2E[g(\|X\|^2)\|X\|^2] + 2\rho(1-\rho^2)^{1/2}E[g(\|X\|^2)X'Z]$$
$$= E[\{p\,g(\|X\|^2) + 2g'(\|X\|^2)\|X\|^2\}\,\|Y\|^2] + 2\rho^2E[(X-\theta)'X\,g(\|X\|^2)],$$
since $E[g(\|X\|^2)X'Z] = -\rho(1-\rho^2)^{-1/2}E[\theta'X\,g(\|X\|^2)]$. Applying Stein's identity again, we note that
$$E[(X-\theta)'X\,g(\|X\|^2)] = E[p\,g(\|X\|^2) + 2g'(\|X\|^2)\|X\|^2]. \tag{2.5}$$
The lemma then follows because
$$\psi(x) = p\,g(x) + 2x\,g'(x). \qquad \square$$
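Identity (2.4) can be verified in closed form for the linear choice $\phi(x) = x$, where $g \equiv 1$ and $\psi(x) = p$. Marginally $Y \sim N_p(0, I)$ under (2.3), so the right-hand side of (2.4) equals $p^2 + 2\rho^2 p$, and the left-hand side can be expanded directly using $Y = \rho(X-\theta) + (1-\rho^2)^{1/2}Z_0$ with $Z_0 \sim N_p(0, I)$ independent of $X$. The arithmetic check below is ours.

```python
# Closed-form check (ours) of identity (2.4) for phi(x) = x, i.e. psi = p.
# LHS: E[(X-theta)'X ||Y||^2], expanded with W = X - theta ~ N_p(0, I) and
#      Y = rho*W + sqrt(1-rho^2)*Z0; odd moments vanish, and
#      sum_{i,j} E[W_i^2 W_j^2] = p^2 + 2p gives
#      LHS = rho^2 (p^2 + 2p) + (1 - rho^2) p^2.
# RHS: E[psi ||Y||^2] + 2 rho^2 E[psi] = p*p + 2*rho^2*p.
for p in (3, 4, 10):
    for rho in (0.0, 0.3, 0.9):
        lhs = rho ** 2 * (p ** 2 + 2 * p) + (1 - rho ** 2) * p ** 2
        rhs = p * p + 2 * rho ** 2 * p
        assert abs(lhs - rhs) < 1e-12
print("identity (2.4) verified for phi(x) = x")
```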
The next lemma gives some important inequalities.

Lemma 4. Let $\phi$ be a function satisfying $E[\phi(\chi^2_\nu)] < \infty$, for any $\nu > 0$, and condition (ii) of Theorem 1. Then,

(a)
$$E\left[\frac{\|X-\theta\|^2\,\phi(\|X\|^2)}{\|X\|^2}\right] \le \begin{cases} p\,E\left[\dfrac{\phi(\|X\|^2)}{\|X\|^2}\right] + 2E[\phi'(\|X\|^2)] & \text{if } p \ge 4,\\[3mm] (p+2)\,E\left[\dfrac{\phi(\|X\|^2)}{\|X\|^2}\right] + 2E[\phi'(\|X\|^2)] & \text{if } p = 3; \end{cases} \tag{2.6}$$

(b)
$$E\left[\frac{\|X-\theta\|^4\,\phi(\|X\|^2)}{\|X\|^2}\right] \le \begin{cases} (p+2)\,E\left[\dfrac{\|X-\theta\|^2\,\phi(\|X\|^2)}{\|X\|^2}\right] + 2E[\|X-\theta\|^2\phi'(\|X\|^2)] & \text{if } p \ge 4,\\[3mm] (p+2)\,E\left[\dfrac{\|X-\theta\|^2\,\phi(\|X\|^2)}{\|X\|^2}\right] + 2E\left[\dfrac{\phi(\|X\|^2)}{\|X\|^2}\right] + 2E[\|X-\theta\|^2\phi'(\|X\|^2)] & \text{if } p = 3. \end{cases} \tag{2.7}$$
Proof. Let
$$f_\lambda(x) = (2\pi)^{-p/2}\lambda^{p/2}\exp\left(-\frac{\lambda}{2}\|x-\theta\|^2\right),$$
the density of $N_p(\theta, \lambda^{-1}I)$. Consider the following identity:
$$\int\frac{\phi(\lambda\|x\|^2)}{\|x\|^2}\,f_\lambda(x)\,dx = \lambda\,E\left\{\frac{E[\phi(\chi^2_{p-2+2K})]}{p-2+2K}\right\}, \tag{2.8}$$
where $K \sim \mathrm{Poisson}(\lambda\|\theta\|^2/2)$. Note that, under the assumed conditions on $\phi$, this identity is defined for $p \ge 3$. Let
$$A_\lambda(K) = \frac{2K}{p-4+2K}\,E[\phi(\chi^2_{p-4+2K})],$$
so that the right-hand side of (2.8) equals $\|\theta\|^{-2}E[A_\lambda(K)]$.
Since $\phi$ is increasing, $A_\lambda(K)$ is increasing in $K$ ($K = 0, 1, \dots$) for $p \ge 4$; i.e., $\Delta A_\lambda(K) = A_\lambda(K+1) - A_\lambda(K) \ge 0$ if $p \ge 4$, for $K = 0, 1, \dots$. However, for $p = 3$,
$$\Delta A_\lambda(K) = \frac{2(K+1)}{2K+1}\,E[\phi(\chi^2_{2K+1})] - \frac{2K}{2K-1}\,E[\phi(\chi^2_{2K-1})], \tag{2.9}$$
for $K = 0, 1, \dots$. Hence, for all $\lambda > 0$,
$$\frac{\partial}{\partial\lambda}\int\frac{\phi(\lambda\|x\|^2)}{\|x\|^2}\,f_\lambda(x)\,dx = \frac{\partial}{\partial\lambda}\,\frac{E[A_\lambda(K)]}{\|\theta\|^2} = \frac{1}{2}\,E[\Delta A_\lambda(K)]\ \begin{cases} \ \ge 0 & \text{if } p \ge 4,\\[1mm] \ \ge -E\left\{\dfrac{E[\phi(\chi^2_{1+2K})]}{1+2K}\right\} & \text{if } p = 3.\end{cases} \tag{2.10}$$
The inequalities in (2.6) then follow by taking $\lambda = 1$, since direct differentiation of the left-hand side gives
$$\frac{\partial}{\partial\lambda}\int\frac{\phi(\lambda\|x\|^2)}{\|x\|^2}\,f_\lambda(x)\,dx\bigg|_{\lambda=1} = E[\phi'(\|X\|^2)] + \frac{p}{2}\,E\left[\frac{\phi(\|X\|^2)}{\|X\|^2}\right] - \frac{1}{2}\,E\left[\frac{\|X-\theta\|^2\,\phi(\|X\|^2)}{\|X\|^2}\right].$$
To prove the inequalities in (2.7), we take derivatives of both sides of the identity in (2.8) twice with respect to $\lambda$ and then consider $\lambda = 1$. This yields
$$\frac{1}{4}\,E\left[\frac{\|X-\theta\|^4\,\phi(\|X\|^2)}{\|X\|^2}\right] - \frac{p}{2}\,E\left[\frac{\|X-\theta\|^2\,\phi(\|X\|^2)}{\|X\|^2}\right] + \frac{p(p-2)}{4}\,E\left[\frac{\phi(\|X\|^2)}{\|X\|^2}\right]$$
$$\qquad + E[\|X\|^2\phi''(\|X\|^2)] + p\,E[\phi'(\|X\|^2)] - E[\|X-\theta\|^2\phi'(\|X\|^2)] = \frac{\|\theta\|^2}{4}\,E[\Delta^2A_\lambda(K)]\Big|_{\lambda=1}. \tag{2.11}$$
Differentiating both sides of the following identity
$$\int\phi'(\lambda\|x\|^2)\,f_\lambda(x)\,dx = E\{E[\phi'(\chi^2_{p+2K})]\}$$
once with respect to $\lambda$ and then putting $\lambda = 1$, we also obtain
$$E[\|X\|^2\phi''(\|X\|^2)] - \frac{1}{2}\,E[\|X-\theta\|^2\phi'(\|X\|^2)] + \frac{p}{2}\,E[\phi'(\|X\|^2)] = \frac{\|\theta\|^2}{2}\,E\{\Delta E[\phi'(\chi^2_{p+2K})]\}\Big|_{\lambda=1} \tag{2.12}$$
$$= \|\theta\|^2\,E\{E[\phi''(\chi^2_{p+2+2K})]\}\Big|_{\lambda=1},$$
because of Lemma 1. From (2.11) and (2.12), we get
$$E\left[\frac{\|X-\theta\|^4\,\phi(\|X\|^2)}{\|X\|^2}\right] = 2p\,E\left[\frac{\|X-\theta\|^2\,\phi(\|X\|^2)}{\|X\|^2}\right] - p(p-2)\,E\left[\frac{\phi(\|X\|^2)}{\|X\|^2}\right]$$
$$\qquad + 2E[\|X-\theta\|^2\phi'(\|X\|^2)] - 2p\,E[\phi'(\|X\|^2)] + \Omega, \tag{2.13}$$
where
$$\Omega = \|\theta\|^2\,E[\Delta^2A_\lambda(K)]\big|_{\lambda=1} - 4\|\theta\|^2\,E\{E[\phi''(\chi^2_{p+2+2K})]\}\big|_{\lambda=1}. \tag{2.14}$$
We now observe, using Lemma 1 again, that
$$\frac{1}{4}\Delta^2A_\lambda(K) = \frac{1}{4}\Delta^2E[\phi(\chi^2_{p-4+2K})] - \frac{p-4}{4}\,\Delta^2E\left[\frac{\phi(\chi^2_{p-2+2K})}{\chi^2_{p-2+2K}}\right]$$
$$= E[\phi''(\chi^2_{p+2K})] - (p-4)\,E[g''(\chi^2_{p+2+2K})] \le E[\phi''(\chi^2_{p+2+2K})], \tag{2.15}$$
if $p \ge 4$, because of the assumptions on $\phi$. Application of (2.15), and also of the first inequality in (2.6), to (2.13) and (2.14) yields the first inequality in (2.7).

For $p = 3$, we need to write $\Delta^2A_\lambda(K)$ in a different way, because $E[\phi''(\chi^2_{p-2+2K})] < \infty$ is guaranteed only for $p \ge 4$. For this $p$, using Lemma 1 and the conditions on $\phi$, we obtain a corresponding upper bound (2.16) on $\Delta^2A_\lambda(K)$. Since $\phi(x)/x$ is convex in $x > 0$, $\phi''(x) - 2\phi'(x)/x + 2\phi(x)/x^2 \ge 0$, from which we note that the right-hand side of (2.16), when $K = 0$, is less than or equal to 0 (2.17). Also, when $K \ge 1$, the right-hand side of (2.16) is less than or equal to the corresponding expression in (2.18). Thus, for $p = 3$, $\Omega \le 0$ (2.19); the last inequality in (2.19) holds because the difference appearing in (2.20) is nonnegative, using Lemma 2. The second inequality in (2.7) now follows by applying (2.19) and the second inequality in (2.6) to (2.13) and (2.14). Thus, the lemma is proved. $\square$

Lemma 5. For $\phi$ satisfying the conditions in Lemma 4, and $(X', Y')'$ distributed as in (2.3) with $p \ge 3$,
$$\frac{p+2}{p-2}\,E\left[\frac{\phi^2(\|X\|^2)\,\|Y\|^4}{\|X\|^2}\right] \le 2\,E\left[\frac{(X-\theta)'X\,\phi(\|X\|^2)}{\|X\|^2}\,\|Y\|^2\right]. \tag{2.21}$$

Proof. Using Lemma 3 and the following identity,
$$E[\|Y\|^2 \mid X] = \rho^2\|X-\theta\|^2 + p(1-\rho^2),$$
we first see that the right-hand side of (2.21) is
$$2\,E[\psi(\|X\|^2)\,\|Y\|^2] + 4\rho^2\,E[\psi(\|X\|^2)]$$
$$= 2(p-2)\left\{\rho^2\,E\left[\frac{\|X-\theta\|^2\,\phi(\|X\|^2)}{\|X\|^2}\right] + [\,p(1-\rho^2)+2\rho^2\,]\,E\left[\frac{\phi(\|X\|^2)}{\|X\|^2}\right]\right\}$$
$$\qquad + 4\rho^2\,E[\|X-\theta\|^2\phi'(\|X\|^2)] + 4[\,p(1-\rho^2)+2\rho^2\,]\,E[\phi'(\|X\|^2)]. \tag{2.22}$$
Using the same identity in the left-hand side of (2.21), and then applying Lemma 4, we now see that the difference between the left-hand and right-hand sides of (2.21) is less than or equal to 0 if $p \ge 3$. This proves the lemma. $\square$
We are now ready to prove the theorem.

Proof of Theorem 1. Let $P$ denote an $n \times n$ orthogonal matrix with the first column $(\pi_1^{1/2}, \dots, \pi_n^{1/2})'$. Using this orthogonal matrix, we now define $Y_1, \dots, Y_n$ as follows:
$$(Y_1, \dots, Y_n) = (\pi_1^{1/2}X_1, \dots, \pi_n^{1/2}X_n)P.$$
It is easy to check that $Y_1, \dots, Y_n$ are jointly normal, with $\mathrm{Cov}(Y_i, Y_j) = \gamma_{ij}I$, where $\Gamma = ((\gamma_{ij})) = P'\,\mathrm{diag}(\pi_1\sigma_1^2, \dots, \pi_n\sigma_n^2)\,P$. Also,
$$Y_1 = \sum_{i=1}^n \pi_i X_i = \bar X_\pi, \qquad \sum_{i=2}^n \|Y_i\|^2 = \sum_{i=1}^n \pi_i\|X_i - \bar X_\pi\|^2.$$
In terms of these $Y_i$'s, our problem reduces to proving that
$$\left[1 - \min_i\pi_i\,\frac{\phi(\|Y_1\|^2)\sum_{i=2}^n\|Y_i\|^2}{\|Y_1\|^2}\right]Y_1 \tag{2.24}$$
dominates $Y_1$ uniformly under the conditions stated in the theorem. The risk-difference between (2.24) and $Y_1$ under the assumed loss function is
$$E\left[\Big(\min_i\pi_i\Big)^2\,\frac{\phi^2(\|Y_1\|^2)\big(\sum_{i=2}^n\|Y_i\|^2\big)^2}{\|Y_1\|^2} - 2\min_i\pi_i\,\frac{(Y_1-\mu)'Y_1\,\phi(\|Y_1\|^2)\sum_{i=2}^n\|Y_i\|^2}{\|Y_1\|^2}\right]. \tag{2.25}$$
Let $Z_i = \gamma_{ii}^{-1/2}Y_i$, $i = 1, \dots, n$, and $\theta = \gamma_{11}^{-1/2}\mu$, where $\Gamma = ((\gamma_{ij}))$. Then, for $i = 2, \dots, n$, $(Z_1', Z_i')'$ is distributed like (2.3), with the above $\theta$ and $\rho = \rho_{1i} = \gamma_{1i}/(\gamma_{11}\gamma_{ii})^{1/2}$, for each $i$. Since $0 \le \phi \le 2(p-2)/(p+2)$, the first term within the parentheses in (2.25) is less than or equal to
$$\min_i\pi_i\,\sum_{i=2}^n\gamma_{ii}\,\frac{\phi^2(\gamma_{11}\|Z_1\|^2)\,\|Z_i\|^4}{\|Z_1\|^2}. \tag{2.26}$$
The first inequality leading to (2.26), namely
$$\Big(\sum_{i=2}^n\gamma_{ii}\|Z_i\|^2\Big)^2 \le \Big(\sum_{i=2}^n\gamma_{ii}\Big)\sum_{i=2}^n\gamma_{ii}\|Z_i\|^4,$$
follows by using the Cauchy-Schwarz inequality; whereas the second inequality is obtained as follows:
$$\min_i\pi_i\,\frac{\sum_{i=2}^n\gamma_{ii}}{\gamma_{11}} = \min_i\pi_i\,\frac{\sum_{i=1}^n\gamma_{ii}-\gamma_{11}}{\gamma_{11}} \le 1,$$
since $\min_i\pi_i\,\sum_{i=1}^n\gamma_{ii} = \min_i\pi_i\,\sum_{i=1}^n\pi_i\sigma_i^2 \le \sum_{i=1}^n\pi_i^2\sigma_i^2 = \gamma_{11}$.
The second term within the parentheses in (2.25) reduces to the following:
$$2\min_i\pi_i\,\sum_{i=2}^n\gamma_{ii}\,E\left[\frac{(Z_1-\theta)'Z_1\,\phi(\gamma_{11}\|Z_1\|^2)\,\|Z_i\|^2}{\|Z_1\|^2}\right]. \tag{2.27}$$
Using (2.26) and (2.27) in (2.25), we obtain that the risk-difference is less than or equal to
$$\min_i\pi_i\,\sum_{i=2}^n\gamma_{ii}\,E\left[\frac{\phi^2(\gamma_{11}\|Z_1\|^2)\,\|Z_i\|^4}{\|Z_1\|^2} - \frac{2(Z_1-\theta)'Z_1\,\phi(\gamma_{11}\|Z_1\|^2)\,\|Z_i\|^2}{\|Z_1\|^2}\right] \le 0, \tag{2.28}$$
from Lemma 5, applied, for each $i$, to the pair $(Z_1', Z_i')'$, noting that $\phi(\gamma_{11}x)$ satisfies the conditions imposed on $\phi(x)$. Thus, the theorem is proved. $\square$
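The two algebraic facts about the transformed vectors $Y_i$ used at the start of the proof can be checked numerically by completing $(\pi_1^{1/2}, \dots, \pi_n^{1/2})'$ to an orthogonal matrix; the QR-based construction below is our own illustration.

```python
import numpy as np

# Numerical check (ours): with P orthogonal and first column sqrt(pi),
#   Y_1 = sum_i pi_i X_i = Xbar_pi, and
#   sum_{i>=2} ||Y_i||^2 = sum_i pi_i ||X_i - Xbar_pi||^2.
rng = np.random.default_rng(1)
n, p = 4, 5
pi = np.array([0.4, 0.3, 0.2, 0.1])
X = rng.normal(size=(n, p))

M = np.column_stack([np.sqrt(pi), rng.normal(size=(n, n - 1))])
P, _ = np.linalg.qr(M)                  # first column of P is +/- sqrt(pi)
if P[0, 0] < 0:
    P[:, 0] *= -1                       # fix the sign; P stays orthogonal

Y = (np.sqrt(pi)[:, None] * X).T @ P    # columns of Y are Y_1, ..., Y_n
xbar = pi @ X
assert np.allclose(Y[:, 0], xbar)
assert np.isclose((Y[:, 1:] ** 2).sum(),
                  pi @ ((X - xbar) ** 2).sum(axis=1))
print("transformation identities verified")
```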
The proof of Theorem 1 clearly indicates that this result would still hold if the $\pi_i$'s were random with a distribution independent of $(X_1, \dots, X_n)$. Thus, as an immediate consequence of this theorem, we have the following corollary.

Corollary 1. For any $\phi$ satisfying the conditions stated in Theorem 1,
$$\bar X_{\hat\eta,s}(\phi) = \left[1 - \frac{\min_i\hat\eta_i\,\phi(\|\bar X_{\hat\eta}\|^2)\sum_{i=1}^n\hat\eta_i\|X_i - \bar X_{\hat\eta}\|^2}{\|\bar X_{\hat\eta}\|^2}\right]\bar X_{\hat\eta} \tag{2.29}$$
dominates $\bar X_{\hat\eta}$ uniformly if $p > 2$.
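Corollary 1 can be read as: plug the estimated weights $\hat\eta_i$ of (1.7) into the shrinkage recipe (1.6). A minimal sketch follows; it is ours, and the constant choice $\phi \equiv (p-2)/(p+2)$ is one admissible $\phi$ under Theorem 1.

```python
import numpy as np

# Sketch (ours) of the adaptive estimator (2.29): weights come from
# independent variance estimates as in (1.7), and the shrinkage
# multiplier of (1.6) is applied with these random weights.
def adaptive_shrinkage_mean(X, sigma2_hat, phi):
    eta_hat = (1 / sigma2_hat) / (1 / sigma2_hat).sum()
    xbar = eta_hat @ X                      # Graybill-Deal-type mean (1.7)
    s = eta_hat @ ((X - xbar) ** 2).sum(axis=1)
    norm2 = (xbar ** 2).sum()
    return (1 - eta_hat.min() * phi(norm2) * s / norm2) * xbar

p = 6
phi = lambda x: (p - 2) / (p + 2)           # a constant, admissible phi
rng = np.random.default_rng(2)
X = rng.normal(loc=1.0, size=(3, p))
sigma2_hat = np.array([0.9, 2.1, 3.8])      # independent variance estimates
print(adaptive_shrinkage_mean(X, sigma2_hat, phi))
```

As the corollary emphasizes, nothing in this construction uses the distribution of the variance estimates beyond their independence from the $X_i$'s.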
References

Bement, T.R. and J.S. Williams (1969). Variance of weighted regression estimates when sampling errors are independent and heteroscedastic. J. Amer. Statist. Assoc. 64, 1369-1382.
Berger, J.O. (1985). Statistical Decision Theory, 2nd ed. Springer, New York.
Bhattacharya, C.G. (1980). Estimation of a common mean and recovery of interblock information. Ann. Statist. 8, 205-211.
Bhattacharya, C.G. (1984). Two inequalities with an application. Ann. Inst. Statist. Math. 36, 129-134.
Brandwein, A.C. and W.E. Strawderman (1990). Stein estimation: the spherically symmetric case. Statist. Sci. 5, 356-369.
Brown, L.D. and A. Cohen (1974). Point and confidence estimation of a common mean and recovery of interblock information. Ann. Statist. 2, 963-976.
Chiou, W.J. and A. Cohen (1985). On estimating a common multivariate normal mean vector. Ann. Inst. Statist. Math. 37, 499-506.
Cohen, A. and H.B. Sackrowitz (1974). On estimating the common mean of two normal distributions. Ann. Statist. 2, 1274-1282.
Efron, B. and C. Morris (1976). Families of minimax estimators of the mean of a multivariate normal distribution. Ann. Statist. 4, 11-21.
George, E.I. (1991). Shrinkage domination in a multivariate common mean problem. Ann. Statist. 19, 952-960.
Graybill, F.A. and R.D. Deal (1959). Combining unbiased estimators. Biometrics 15, 543-550.
Haff, L.R. (1977). Minimax estimators for a normal precision matrix. J. Multivariate Anal. 7, 374-385.
Karlin, S. (1968). Total Positivity, Vol. 1. Stanford University Press, Stanford, CA.
Khatri, C.G. and K.R. Shah (1974). Estimation of location parameters from two linear models under normality. Comm. Statist.-Theory Methods 3, 647-663.
Krishnamoorthy, K. (1991a). Estimation of a common multivariate normal mean vector. Ann. Inst. Statist. Math. (to appear).
Krishnamoorthy, K. (1991b). On a shrinkage estimator of a normal common mean vector. J. Multivariate Anal. (to appear).
Kubokawa, T. (1990). Minimax estimation of common coefficients of several regression models under quadratic loss. J. Statist. Plann. Inference 24, 337-345.
Loh, W.-L. (1990). Estimating the common mean of two multivariate normal distributions. Ann. Statist. 19, 297-313.
Norwood, T.E. and K. Hinkelmann (1977). Estimating the common mean of several normal populations. Ann. Statist. 5, 1047-1050.
Shinozaki, N. (1978). A note on estimating the common mean of k normal distributions and the Stein problem. Comm. Statist.-Theory Methods 7, 1421-1432.
Shinozaki, N. (1984). Simultaneous estimation of location parameters under quadratic loss. Ann. Statist. 12, 322-335.
Stein, C. (1981). Estimation of the mean of a multivariate normal distribution. Ann. Statist. 9, 1135-1151.
Swamy, P.A.V.B. and J.S. Mehta (1979). Estimation of common coefficients in two regression equations. J. Econometrics 10, 1-14.
Yancey, T.A., G.G. Judge and S. Miyazaki (1984). Some improved estimators in the case of possible heteroscedasticity. J. Econometrics 25, 133-150.