Shrinkage domination of some usual estimators of the common mean of several multivariate normal populations


Journal of Statistical Planning and Inference 39 (1994) 43-55. North-Holland.

Sanat K. Sarkar
Department of Statistics, School of Business and Management, Speakman Hall (066-00), Temple University, Philadelphia, PA 19122, USA

Received 2 November 1992

Abstract

Given $n$ mutually independent $p$-dimensional random vectors $X_1, \ldots, X_n$, where $X_i \sim N_p(\mu, \sigma_i^2 I)$, the problem of estimating the common mean vector $\mu$ is considered assuming that the $\sigma_i^2$'s are unknown. A class of Stein-type shrinkage estimators uniformly dominating a usual estimator based only on the $X_i$'s is given. This strengthens a similar result due to George (1991) and Krishnamoorthy (1991) for the special case $n = 2$. As a consequence of this result, a class of Stein-type shrinkage estimators uniformly dominating a usual estimator based on the $X_i$'s and some estimators of the $\sigma_i^2$'s distributed independently of the $X_i$'s is also obtained.

AMS Subject Classification: Primary 62H12; secondary 62C99, 62J07.

Key words: Multivariate normal populations; common mean; usual estimators; Stein-type shrinkage domination.

1. Introduction

Suppose that we have $n$ mutually independent $p$-dimensional random vectors $X_1, \ldots, X_n$ distributed as $N_p(\mu, \sigma_i^2 I)$, $i = 1, \ldots, n$, respectively. We consider the problem of estimating the common mean vector $\mu$ when the $\sigma_i^2$'s are unknown, and the loss incurred in using $\hat\mu$ as an estimator is assumed to be

$$ L(\hat\mu, \mu) = \|\hat\mu - \mu\|^2. \qquad (1.1) $$

First, we consider this estimation problem based only on the $X_i$'s. When the proportions $\eta_i = \sigma_i^{-2} / \sum_{j=1}^n \sigma_j^{-2}$ are known, the estimator

$$ \bar X_\eta = \sum_{i=1}^n \eta_i X_i \qquad (1.2) $$


is optimum in several senses. It is, however, inadmissible when $p > 2$, and is uniformly dominated by its Stein-type shrinkage versions

$$ \bar X_{\eta,S} = \left[1 - \frac{c \min_i \eta_i \sum_{i=1}^n \eta_i \|X_i - \bar X_\eta\|^2}{\|\bar X_\eta\|^2}\right] \bar X_\eta, \qquad 0 \le c \le \frac{2(p-2)}{p+2}. \qquad (1.3) $$

The proof of this result follows easily by applying the standard technique of proving this kind of dominance result [see, for example, Stein (1981), Berger (1985) or Brandwein and Strawderman (1990)] and making use of the following facts. There exist independent random variables $V_2, \ldots, V_n$, where $V_i \sim \chi^2_p$ (central chi-squared with $p$ degrees of freedom), for $i = 2, \ldots, n$, independently of $\bar X_\eta \sim N_p(\mu, a_1 I)$, such that

$$ \sum_{i=1}^n \eta_i \|X_i - \bar X_\eta\|^2 = \sum_{i=2}^n a_i V_i $$

for some nonnegative constants $a_i$, $i = 1, \ldots, n$. Also,

$$ \frac{a_1\, E\bigl(\sum_{i=2}^n a_i V_i\bigr)}{E\bigl(\sum_{i=2}^n a_i V_i\bigr)^2} \ge \frac{a_1}{(p+2)\sum_{i=2}^n a_i}, $$

using Bhattacharya's (1984) inequality, and

$$ \frac{a_1}{\sum_{i=2}^n a_i} \ge \min_i \eta_i $$

(see Section 2).
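As a quick illustration (not part of the original paper), the following Python sketch computes the weighted estimator (1.2) and the shrinkage version (1.3) from simulated data; the simulation settings, the value of c, and the function name are arbitrary choices made only for this example.

import numpy as np

rng = np.random.default_rng(0)

def shrink_known_weights(X, sigma2, c):
    # X: (n, p) array, row i ~ N_p(mu, sigma2[i] * I); the eta_i are known here.
    eta = (1.0 / sigma2) / np.sum(1.0 / sigma2)            # eta_i = sigma_i^{-2} / sum_j sigma_j^{-2}
    x_eta = eta @ X                                        # the estimator (1.2)
    s = np.sum(eta * np.sum((X - x_eta) ** 2, axis=1))     # sum_i eta_i ||X_i - X_eta||^2
    return x_eta, (1.0 - c * eta.min() * s / np.sum(x_eta ** 2)) * x_eta   # (1.2) and (1.3)

n, p = 3, 6
mu = np.zeros(p)
sigma2 = np.array([1.0, 2.0, 4.0])
X = np.stack([rng.normal(mu, np.sqrt(s2), size=p) for s2 in sigma2])
c = 2 * (p - 2) / (p + 2)                                  # upper end of the interval in (1.3)
print(shrink_known_weights(X, sigma2, c))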

With unknown $\eta_i$'s, an estimator of the form

$$ \bar X_\pi = \sum_{i=1}^n \pi_i X_i \qquad (1.4) $$

is often a natural choice, where the $\pi_i$'s, satisfying $0 < \pi_i < 1$ and $\sum_{i=1}^n \pi_i = 1$, are fixed constants. The dominance of

$$ \bar X_{\pi,S} = \left[1 - \frac{c \min_i \pi_i \sum_{i=1}^n \pi_i \|X_i - \bar X_\pi\|^2}{\|\bar X_\pi\|^2}\right] \bar X_\pi, \qquad 0 \le c \le \frac{2(p-2)}{p+2}, \qquad (1.5) $$

over $\bar X_\pi$ for $p > 2$ over the restricted parameter space $\{(\mu, \sigma_1^2, \ldots, \sigma_n^2): \eta_i = \pi_i,\ i = 1, \ldots, n\}$ does not immediately suggest that the same phenomenon holds for $\bar X_{\pi,S}$ over the entire parameter space; the primary reason being that, here, $\sum_{i=1}^n \pi_i \|X_i - \bar X_\pi\|^2$ and $\bar X_\pi$ become heavily dependent. George (1991) treated this problem for $n = 2$ and showed


that, in spite of this dependence, $\bar X_{\pi,S}$ uniformly dominates $\bar X_\pi$ for $p > 2$; however, for $c$ in an interval smaller than that in (1.5). He, of course, suggested a better interval for $c$ through extensive simulation, which is again slightly smaller than that in (1.5).

Krishnamoorthy (1991b), in a recent paper, improved George's result by analytically proving that this dominance result indeed holds for $c$ in the interval given in (1.5). In this article, we have strengthened Krishnamoorthy's (1991b) result, and hence George's (1991). We not only treat the problem in a more general set-up, but also give a wider class of shrinkage estimators dominating $\bar X_\pi$. More specifically, it is shown that

$$ \left[1 - \frac{\min_i \pi_i\, \phi(\|\bar X_\pi\|^2) \sum_{i=1}^n \pi_i \|X_i - \bar X_\pi\|^2}{\|\bar X_\pi\|^2}\right] \bar X_\pi, \qquad (1.6) $$

for a certain class of $\phi$, which includes that considered in George (1991) and Krishnamoorthy (1991b), dominates $\bar X_\pi$ uniformly if $p > 2$. Furthermore, our method of proof is much more general, involving some new identities and inequalities.

We also consider the problem of estimating $\mu$ when additional observations are available to estimate the $\sigma_i^2$'s. Let $\hat\sigma_1^2, \ldots, \hat\sigma_n^2$, which are distributed independently of $(X_1, \ldots, X_n)$, represent some estimates of $\sigma_1^2, \ldots, \sigma_n^2$, respectively. Several authors have considered this problem. For $p = 1$, Graybill and Deal (1959), Bement and Williams (1969), Khatri and Shah (1974), Norwood and Hinkelmann (1977), and Bhattacharya (1984) derived conditions for the most commonly used estimator, namely,

$$ \bar X_{\hat\eta} = \sum_{i=1}^n \hat\eta_i X_i, \qquad \hat\eta_i = \hat\sigma_i^{-2} \Big/ \sum_{j=1}^n \hat\sigma_j^{-2}, \qquad (1.7) $$

with unbiased 6;‘s to dominate each Xi. Brown and Cohen (1974), Cohen and Sackrowitz (1974), and Bhattacharya (1980) discussed properties of some alternative estimators when p = 1. Shinozaki (1978), Swamy and Mehta (1979), Yancey et al. (1984), and Kubokawa

(1990) considered

equivalent

problems

in a regression

setting.
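For concreteness, a minimal Python sketch of the combined estimator (1.7) is given below; the variance estimates supplied to it are hypothetical and, as in the paper, only assumed to be independent of the $X_i$'s.

import numpy as np

def combined_estimator(X, sigma2_hat):
    # Graybill-Deal-type combination (1.7) with estimated weights.
    eta_hat = (1.0 / sigma2_hat) / np.sum(1.0 / sigma2_hat)   # estimated eta_i
    return eta_hat @ X                                         # X: (n, p) array of the X_i's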

In a more general multivariate setting, Chiou and Cohen (1985), Krishnamoorthy (1991a), and Loh (1990) also treated this problem. In this paper, we provide a class of Stein-type shrinkage versions of $\bar X_{\hat\eta}$ uniformly dominating it for $p > 2$. This dominance result follows as a direct consequence of the previous result showing dominance of the estimators (1.6) over $\bar X_\pi$ and, interestingly, does not depend on the distributional properties of $\bar X_{\hat\eta}$ and the $\hat\eta_i$'s.

2. Shrinkage domination

We first present in this section the main result establishing uniform dominance of (1.6) over $\bar X_\pi$.


Theorem 1. Let $\phi(x)$ be a function satisfying (i) $0 \le \phi(x) \le 2(p-2)/(p+2)$, (ii) $\phi'(x) = \partial\phi(x)/\partial x \ge 0$, $\phi''(x) = \partial^2\phi(x)/\partial x^2 \le 0$, $\phi(x)/x$ is convex, and $\phi''(x)$ is nondecreasing. Then, for any $\phi$ satisfying these conditions, the estimator (1.6) uniformly dominates $\bar X_\pi$ under the loss function (1.1) if $0 < \pi_i < 1$, $i = 1, \ldots, n$, and $p > 2$.

The most obvious choice for $\phi$ satisfying the conditions stated in Theorem 1 is a constant function. Another choice is a class of functions of the form $\phi(x) = ax/(x+b)$, for some $a > 0$ and $b > 0$. This latter $\phi$ leads to shrinkage estimators of the type considered by Shinozaki (1984) in estimating some nonnormal location parameters.

We need some supporting results that will facilitate the proof of the theorem. Towards this end, first, we have two lemmas related to the central chi-square distribution.

Lemma 1. Let $\phi$ be a differentiable function such that $E[\phi(\chi^2_\nu)] < \infty$, for any $\nu > 0$. Then, if $\nu > 2$,

$$ E\left[\frac{\phi(\chi^2_\nu)}{\chi^2_\nu}\right] = \frac{1}{\nu - 2}\left\{E[\phi(\chi^2_\nu)] - 2E[\phi'(\chi^2_\nu)]\right\}. \qquad (2.1) $$

Proof. The proof of this lemma follows directly from the following identity:

$$ E[\phi(\chi^2_\nu)] = (\nu - 2)\, E[\chi_\nu^{-2}\phi(\chi^2_\nu)] + 2E[\phi'(\chi^2_\nu)] $$

[see, for example, Efron and Morris (1976), or Haff (1977) for a more general identity for the Wishart distribution]. □

Lemma 2. Let $g(x) = \phi(x)/x$, for some nonnegative function $\phi$ such that $E[\phi(\chi^2_\nu)] < \infty$, for any $\nu > 0$. Then, for $\nu > 2$,

$$ E^2[g(\chi^2_{\nu+2})] \le \frac{\nu+2}{\nu}\, E[g(\chi^2_\nu)]\, E[g(\chi^2_{\nu+4})]. \qquad (2.2) $$

Proof. Note that

$$ \frac{E[g(\chi^2_{\nu+2})]}{E[g(\chi^2_\nu)]} = \frac{E[\chi^2_\nu\, g(\chi^2_\nu)]}{\nu\, E[g(\chi^2_\nu)]} = \frac{1}{\nu}\left[\int x^{-1} h_\nu(x)\, dx\right]^{-1}, $$

where

$$ h_\nu(x) = \frac{\phi(x)\, f_\nu(x)}{E[\phi(\chi^2_\nu)]}, $$

$f_\nu$ being the density of $\chi^2_\nu$. It is easy to check that $h_\nu(x)$ is TP$_2$ in $(x, \nu)$. Hence, from Karlin (1968), the expectation of $1/X$ with respect to this density is decreasing in $\nu > 2$, which implies

$$ \frac{E[g(\chi^2_{\nu+4})]}{E[g(\chi^2_{\nu+2})]} \ge \frac{\nu}{\nu+2}\,\frac{E[g(\chi^2_{\nu+2})]}{E[g(\chi^2_\nu)]}. $$

Hence, the lemma is proved. □
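The chi-square identity quoted in the proof of Lemma 1 is easy to check by simulation; a short Python sketch follows, with an arbitrarily chosen test function phi.

import numpy as np

rng = np.random.default_rng(1)
nu = 8
x = rng.chisquare(nu, size=1_000_000)

phi = lambda t: t / (t + 1.0)             # arbitrary smooth test function
dphi = lambda t: 1.0 / (t + 1.0) ** 2     # its derivative

lhs = phi(x).mean()
rhs = (nu - 2) * (phi(x) / x).mean() + 2 * dphi(x).mean()
print(lhs, rhs)                           # the two averages should agree up to Monte Carlo error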

Now, we present some useful lemmas related to expectations of certain functions of a $2p$-dimensional random vector $(X', Y')'$ having the following distribution:

$$ \begin{pmatrix} X \\ Y \end{pmatrix} \sim N_{2p}\left[\begin{pmatrix} \theta \\ 0 \end{pmatrix},\; \begin{pmatrix} I & \rho I \\ \rho I & I \end{pmatrix}\right], \qquad (2.3) $$

where $\theta$ is a $p \times 1$ vector and $-1 < \rho < 1$. First, we have the following identity.

Lemma 3. Let $\phi$ be a differentiable function such that $E[\phi(\chi^2_\nu)] < \infty$, for any $\nu > 0$. Then, if $p > 2$,

$$ E\left[\frac{(X-\theta)'X\,\phi(\|X\|^2)}{\|X\|^2}\,\|Y\|^2\right] = E[\psi(\|X\|^2)\,\|Y\|^2] + 2\rho^2 E[\psi(\|X\|^2)], \qquad (2.4) $$

where

$$ \psi(x) = (p-2)\,\frac{\phi(x)}{x} + 2\phi'(x). $$

Proof. Let $g(x) = \phi(x)/x$. Observing that $Y$ can be represented as $Y = \rho X + (1-\rho^2)^{1/2} Z$, for some $Z$ distributed as $N_p[-\rho(1-\rho^2)^{-1/2}\theta,\, I]$, independently of $X$, and applying Stein's (1981) identity, namely, $E[(X_i - \theta_i)h(X_i)] = E[h'(X_i)]$, for any differentiable function $h$ satisfying $E|h'(X_i)| < \infty$, we see that the left-hand side of (2.4) is equal to

$$ E\sum_{i=1}^p\left\{g(\|X\|^2)\,\|Y\|^2 + 2X_i^2\, g'(\|X\|^2)\,\|Y\|^2 + 2\rho X_i\bigl[\rho X_i + (1-\rho^2)^{1/2}Z_i\bigr]g(\|X\|^2)\right\} $$
$$ = p\,E[g(\|X\|^2)\|Y\|^2] + 2E[g'(\|X\|^2)\|X\|^2\|Y\|^2] + 2\rho^2 E[g(\|X\|^2)\|X\|^2] + 2\rho(1-\rho^2)^{1/2}E[g(\|X\|^2)X'Z] $$
$$ = E[\{p\,g(\|X\|^2) + 2g'(\|X\|^2)\|X\|^2\}\,\|Y\|^2] + 2\rho^2 E[(X-\theta)'X\,g(\|X\|^2)]. $$

Applying Stein's identity again, we note that

$$ E[(X-\theta)'X\,g(\|X\|^2)] = E[p\,g(\|X\|^2) + 2g'(\|X\|^2)\|X\|^2]. \qquad (2.5) $$

The lemma then follows because $\psi(x) = p\,g(x) + 2x\,g'(x)$. □
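Stein's (1981) identity, which drives the proof above, can likewise be verified numerically; in the following sketch the test function h and the value of theta are arbitrary.

import numpy as np

rng = np.random.default_rng(2)
theta = 0.7
x = rng.normal(theta, 1.0, size=1_000_000)    # X_i ~ N(theta_i, 1)

h = np.tanh                                   # a differentiable h with E|h'(X_i)| finite
dh = lambda t: 1.0 / np.cosh(t) ** 2

print(((x - theta) * h(x)).mean(), dh(x).mean())   # E[(X_i - theta_i) h(X_i)] vs E[h'(X_i)]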

The next lemma gives some important inequalities.

Lemma 4. Let $\phi$ be a function satisfying $E[\phi(\chi^2_\nu)] < \infty$, for any $\nu > 0$, and condition (ii) of Theorem 1. Then,

$$ E\left[\frac{\|X-\theta\|^2\,\phi(\|X\|^2)}{\|X\|^2}\right] \le p\,E\left[\frac{\phi(\|X\|^2)}{\|X\|^2}\right] + 2E[\phi'(\|X\|^2)] \quad \text{if } p \ge 4 \qquad (2.6) $$

(the case $p = 3$ requires a slightly modified bound; see the proof), and

$$ E\left[\frac{\|X-\theta\|^4\,\phi(\|X\|^2)}{\|X\|^2}\right] \le (p+2)\,E\left[\frac{\|X-\theta\|^2\,\phi(\|X\|^2)}{\|X\|^2}\right] + 2E[\|X-\theta\|^2\,\phi'(\|X\|^2)] \quad \text{if } p \ge 4, \qquad (2.7) $$

again with a correspondingly modified bound if $p = 3$.

Proof. Let

$$ f_\lambda(x) = (2\pi)^{-p/2}\,\lambda^{p/2}\exp\left(-\tfrac{\lambda}{2}\|x-\theta\|^2\right), $$

the density of $N_p(\theta, \lambda^{-1}I)$. Consider the following identity:

$$ \int \frac{\phi(\lambda\|x\|^2)}{\|x\|^2}\, f_\lambda(x)\,dx = \lambda\, E\left\{\frac{E[\phi(\chi^2_{p-2+2K})]}{p-2+2K}\right\}, \qquad (2.8) $$

where $K \sim \mathrm{Poisson}(\lambda\|\theta\|^2/2)$. Note that under the assumed conditions on $\phi$, this identity is defined for $p \ge 3$. Let

$$ A_\lambda(K) = \frac{2K}{p-4+2K}\, E[\phi(\chi^2_{p-4+2K})]. $$

Since $\phi$ is increasing, $A_\lambda(K)$ is increasing in $K$ ($K = 0, 1, \ldots$) for $p \ge 4$, i.e., $\Delta A_\lambda(K) = A_\lambda(K+1) - A_\lambda(K) \ge 0$ if $p \ge 4$, for $K = 0, 1, \ldots$. However, for $p = 3$,

$$ \Delta A_\lambda(K) = \frac{2(K+1)}{2K+1}\, E[\phi(\chi^2_{2K+1})] - \frac{2K}{2K-1}\, E[\phi(\chi^2_{2K-1})] \qquad (2.9) $$

for $K = 0, 1, \ldots$. Hence, for all $\lambda > 0$,

$$ \frac{\partial}{\partial\lambda}\int \frac{\phi(\lambda\|x\|^2)}{\|x\|^2}\, f_\lambda(x)\,dx = \frac{1}{\|\theta\|^2}\,\frac{\partial}{\partial\lambda}E[A_\lambda(K)] = \frac{1}{2}\,E[\Delta A_\lambda(K)], \qquad (2.10) $$

which is nonnegative if $p \ge 4$; for $p = 3$ it is bounded below with the help of (2.9). The inequalities in (2.6) then follow by taking $\lambda = 1$.

To prove the inequalities in (2.7), we take derivatives of both sides of the identity in (2.8) twice with respect to $\lambda$ and then consider $\lambda = 1$. This yields an identity, (2.11), whose right-hand side is $\|\theta\|^{-2}(\partial^2/\partial\lambda^2)E[A_\lambda(K)]\big|_{\lambda=1} = \tfrac14\|\theta\|^2 E[\Delta^2 A_\lambda(K)]\big|_{\lambda=1}$. Differentiating both sides of the following identity

$$ \int \phi'(\lambda\|x\|^2)\, f_\lambda(x)\,dx = E\{E[\phi'(\chi^2_{p+2K})]\} $$

once with respect to $\lambda$ and then putting $\lambda = 1$, we also obtain

$$ E[\|X\|^2\phi''(\|X\|^2)] - \tfrac12 E[\|X-\theta\|^2\phi'(\|X\|^2)] + \tfrac{p}{2}E[\phi'(\|X\|^2)] = \frac{d}{d\lambda}E\{E[\phi'(\chi^2_{p+2K})]\}\Big|_{\lambda=1} = \|\theta\|^2\, E[\phi''(\chi^2_{p+2+2K})]\Big|_{\lambda=1}, \qquad (2.12) $$

because of Lemma 1. From (2.11) and (2.12), we get

$$ E\left[\frac{\|X-\theta\|^4\,\phi(\|X\|^2)}{\|X\|^2}\right] = 2p\,E\left[\frac{\|X-\theta\|^2\,\phi(\|X\|^2)}{\|X\|^2}\right] - p(p-2)\,E\left[\frac{\phi(\|X\|^2)}{\|X\|^2}\right] + 2E[\|X-\theta\|^2\phi'(\|X\|^2)] - 2pE[\phi'(\|X\|^2)] + \Omega, \qquad (2.13) $$

where

$$ \Omega = \|\theta\|^2\, E[\Delta^2 A_\lambda(K)]\Big|_{\lambda=1} - 4\|\theta\|^2\, E\{E[\phi''(\chi^2_{p+2+2K})]\}\Big|_{\lambda=1}. \qquad (2.14) $$

We now observe, using Lemma 1 again, that

$$ \tfrac12\,\Delta^2 A_\lambda(K) \le E[\phi''(\chi^2_{p+2+2K})] \qquad (2.15) $$

if $p \ge 4$, because of the assumptions on $\phi$. Application of (2.15) and also of the first inequality in (2.6) to (2.13) and (2.14) yields the first inequality in (2.7).

For $p = 3$, we need to write $\Delta^2 A_\lambda(K)$ in a different way, because the corresponding expectation is guaranteed to be finite only for $p \ge 4$. For this $p$, using Lemma 1 and the conditions on $\phi$, we obtain the bound (2.16). Since $\phi(x)/x$ is convex in $x > 0$,

$$ \phi''(x) - \frac{2\phi'(x)}{x} + \frac{2\phi(x)}{x^2} \ge 0, $$

from which we note that $E[\phi''(\chi^2_3)]$ can be bounded below in terms of $E[\phi'(\chi^2_3)]$ and $E[\phi(\chi^2_3)]$. Hence, the right-hand side of (2.16) is, when $K = 0$, less than or equal to zero (2.17); also, when $K \ge 1$, the right-hand side of (2.16) is less than or equal to the bound in (2.18). Thus, for $p = 3$, we obtain (2.19); the last inequality in (2.19) is true because the corresponding expression (2.20) is nonnegative,

using Lemma 2. The second inequality in (2.7) now follows by applying (2.19) and the second inequality in (2.6) to (2.13) and (2.14). Thus, the lemma is proved. □

Lemma 5. For $\phi$ satisfying the conditions in Lemma 4, the inequality (2.21) holds.

Proof. Using Lemma 3 and the following identity

$$ E[\|Y\|^2 \mid X] = \rho^2\|X-\theta\|^2 + p(1-\rho^2), $$

we first see that the right-hand side of (2.21) is bounded as in (2.22), an expression involving $E[\|X-\theta\|^2\phi(\|X\|^2)/\|X\|^2]$, $E[\phi(\|X\|^2)/\|X\|^2]$, $E[\phi'(\|X\|^2)]$ and $E[\|X-\theta\|^2\phi'(\|X\|^2)]$. Using the identity in the left-hand side of (2.21), we now see that the difference between the left-hand and right-hand sides of (2.21) is less than or equal to the expression in (2.23) if $p \ge 3$, using Lemma 4. This proves the lemma. □

We are now ready to prove the theorem.

Proof of Theorem 1. Let $P$ denote an $n \times n$ orthogonal matrix with the first column $(\pi_1^{1/2}, \ldots, \pi_n^{1/2})'$. Using this orthogonal matrix we now define $Y_1, \ldots, Y_n$ as follows:

$$ (Y_1, \ldots, Y_n) = (\pi_1^{1/2}X_1, \ldots, \pi_n^{1/2}X_n)\,P. $$

It is easy to check that $(Y_1, \ldots, Y_n)$ are jointly normal with $E(Y_1) = \mu$, $E(Y_i) = 0$ for $i = 2, \ldots, n$, and $\mathrm{Cov}(Y_i, Y_j) = \gamma_{ij}I$, where $\Gamma = ((\gamma_{ij})) = P'\,\mathrm{diag}(\pi_1\sigma_1^2, \ldots, \pi_n\sigma_n^2)\,P$. Also,

$$ Y_1 = \sum_{i=1}^n \pi_i X_i = \bar X_\pi, \qquad \sum_{i=2}^n \|Y_i\|^2 = \sum_{i=1}^n \pi_i\|X_i - \bar X_\pi\|^2. $$

In terms of these $Y_i$'s, our problem reduces to proving that

$$ \left[1 - \frac{\min_i \pi_i\,\phi(\|Y_1\|^2)\sum_{i=2}^n\|Y_i\|^2}{\|Y_1\|^2}\right]Y_1 \qquad (2.24) $$

dominates $Y_1$ uniformly under the conditions stated in the theorem. The risk-difference between (2.24) and $Y_1$ under the assumed loss function is

$$ E\left\{\min_i\pi_i\left(\frac{\min_i\pi_i\,\phi^2(\|Y_1\|^2)\bigl(\sum_{i=2}^n\|Y_i\|^2\bigr)^2}{\|Y_1\|^2} - \frac{2\,(Y_1-\mu)'Y_1\,\phi(\|Y_1\|^2)\sum_{i=2}^n\|Y_i\|^2}{\|Y_1\|^2}\right)\right\}. \qquad (2.25) $$

Let $Z_i = \gamma_{ii}^{-1/2}Y_i$, $i = 1, \ldots, n$, and $\theta = \gamma_{11}^{-1/2}\mu$. Then, for $i = 2, \ldots, n$, $(Z_1', Z_i')'$ is distributed like (2.3) with the above $\theta$ and $\rho = \rho_{1i} = \gamma_{1i}/(\gamma_{11}\gamma_{ii})^{1/2}$. For $0 \le \phi \le 2(p-2)/(p+2)$, the expectation of the first term within the parentheses in (2.25) is equal to

$$ \min_i\pi_i\, E\left[\frac{\phi^2(\gamma_{11}\|Z_1\|^2)\bigl(\sum_{i=2}^n\gamma_{ii}\|Z_i\|^2\bigr)^2}{\gamma_{11}\|Z_1\|^2}\right], $$

which is bounded above as in (2.26). The first inequality in (2.26) follows by using the Cauchy-Schwarz inequality, whereas the second inequality is obtained as follows:

$$ \frac{\sum_{i=2}^n\gamma_{ii}}{\gamma_{11}} = \frac{\sum_{i=1}^n\gamma_{ii} - \gamma_{11}}{\gamma_{11}} \le \frac{1-\min_i\pi_i}{\min_i\pi_i}, $$

since $\gamma_{11} = \sum_{i=1}^n\pi_i^2\sigma_i^2 \ge \min_i\pi_i\sum_{i=1}^n\pi_i\sigma_i^2 = \min_i\pi_i\sum_{i=1}^n\gamma_{ii}$. The expectation of the second term within the parentheses in (2.25) reduces to the following:

$$ 2\sum_{i=2}^n\gamma_{ii}\, E\left[\frac{(Z_1-\theta)'Z_1\,\phi(\gamma_{11}\|Z_1\|^2)\,\|Z_i\|^2}{\|Z_1\|^2}\right]. \qquad (2.27) $$

Using (2.26) and (2.27) in (2.25), we obtain the bound (2.28), which is less than or equal to zero from Lemma 5. Thus, the theorem is proved. □
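Although the paper reports no simulations, a small Monte Carlo experiment along the following lines (with arbitrarily chosen pi_i and sigma_i^2, and a constant phi at the upper end of condition (i)) can be used to observe the risk improvement asserted by Theorem 1.

import numpy as np

rng = np.random.default_rng(3)
n, p, reps = 3, 6, 20_000
mu = np.ones(p)
sigma = np.sqrt(np.array([1.0, 2.0, 4.0]))
pi = np.array([0.5, 0.3, 0.2])                        # fixed weights, 0 < pi_i < 1, sum = 1
phi_const = 2 * (p - 2) / (p + 2)                     # constant phi satisfying condition (i)

risk_usual = risk_shrunk = 0.0
for _ in range(reps):
    X = mu + sigma[:, None] * rng.standard_normal((n, p))
    x_pi = pi @ X                                     # the usual estimator (1.4)
    s = np.sum(pi * np.sum((X - x_pi) ** 2, axis=1))  # sum_i pi_i ||X_i - X_pi||^2
    shrunk = (1.0 - pi.min() * phi_const * s / np.sum(x_pi ** 2)) * x_pi   # estimator (1.6)
    risk_usual += np.sum((x_pi - mu) ** 2)
    risk_shrunk += np.sum((shrunk - mu) ** 2)

print(risk_usual / reps, risk_shrunk / reps)          # estimated risks under the loss (1.1)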

The proof of Theorem 1 clearly indicates that this result would still hold if the $\pi_i$'s were random with a distribution independent of $(X_1, \ldots, X_n)$. Thus, as an immediate consequence of this theorem, we have the following corollary.

Corollary 1. For any $\phi$ satisfying the conditions stated in Theorem 1,

$$ \bar X_{\hat\eta,S} = \left[1 - \frac{\min_i\hat\eta_i\,\phi(\|\bar X_{\hat\eta}\|^2)\sum_{i=1}^n \hat\eta_i\|X_i - \bar X_{\hat\eta}\|^2}{\|\bar X_{\hat\eta}\|^2}\right]\bar X_{\hat\eta} \qquad (2.29) $$

dominates $\bar X_{\hat\eta}$ uniformly if $p > 2$.
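A corresponding Python sketch of the estimator (2.29), which simply replaces the fixed weights pi_i of (1.6) by the data-based weights of (1.7), is given below; the sigma2_hat argument is again hypothetical and assumed independent of the X_i's.

import numpy as np

def shrink_estimated_weights(X, sigma2_hat, phi):
    # Shrinkage version (2.29) of the combined estimator (1.7).
    eta_hat = (1.0 / sigma2_hat) / np.sum(1.0 / sigma2_hat)
    x_hat = eta_hat @ X
    s = np.sum(eta_hat * np.sum((X - x_hat) ** 2, axis=1))   # sum_i eta_hat_i ||X_i - X_hat||^2
    return (1.0 - eta_hat.min() * phi(np.sum(x_hat ** 2)) * s / np.sum(x_hat ** 2)) * x_hat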

References

Bement, T.R. and J.S. Williams (1969). Variance of weighted regression estimates when sampling errors are independent and heteroscedastic. J. Amer. Statist. Assoc. 64, 1369-1382.
Berger, J.O. (1985). Statistical Decision Theory, 2nd ed. Springer, New York.
Bhattacharya, C.G. (1980). Estimation of a common mean and recovery of interblock information. Ann. Statist. 8, 205-211.
Bhattacharya, C.G. (1984). Two inequalities with an application. Ann. Inst. Statist. Math. 36, 129-134.
Brandwein, A.C. and W.E. Strawderman (1990). Stein estimation: the spherically symmetric case. Statist. Sci. 5, 356-369.
Brown, L.D. and A. Cohen (1974). Point and confidence estimation of a common mean and recovery of interblock information. Ann. Statist. 2, 963-976.
Chiou, W.J. and A. Cohen (1985). On estimating a common multivariate normal mean vector. Ann. Inst. Statist. Math. 37, 499-506.
Cohen, A. and H.B. Sackrowitz (1974). On estimating the common mean of two normal distributions. Ann. Statist. 2, 1274-1282.
Efron, B. and C. Morris (1976). Families of minimax estimators of the mean of a multivariate normal distribution. Ann. Statist. 4, 11-21.
George, E.I. (1991). Shrinkage domination in a multivariate common mean problem. Ann. Statist. 19, 952-960.
Graybill, F.A. and R.D. Deal (1959). Combining unbiased estimators. Biometrics 15, 543-550.
Haff, L.R. (1977). Minimax estimators for a normal precision matrix. J. Mult. Anal. 7, 374-385.
Karlin, S. (1968). Total Positivity, Vol. 1. Stanford University Press, Stanford, CA.
Khatri, C.G. and K.R. Shah (1974). Estimation of location parameters from two linear models under normality. Comm. Statist.-Theory Methods 3, 647-663.
Krishnamoorthy, K. (1991a). Estimation of a common multivariate normal mean vector. Ann. Inst. Statist. Math. (to appear).
Krishnamoorthy, K. (1991b). On a shrinkage estimator of a normal common mean vector. J. Mult. Anal. (to appear).
Kubokawa, T. (1990). Minimax estimation of common coefficients of several regression models under quadratic loss. J. Statist. Plann. Inf. 24, 337-345.
Loh, W.-L. (1990). Estimating the common mean of two multivariate normal distributions. Ann. Statist. 19, 297-313.
Norwood, T.E. and K. Hinkelmann (1977). Estimating the common mean of several normal populations. Ann. Statist. 5, 1047-1050.
Shinozaki, N. (1978). A note on estimating the common mean of k normal distributions and the Stein problem. Comm. Statist.-Theory Methods 7, 1421-1432.
Shinozaki, N. (1984). Simultaneous estimation of location parameters under quadratic loss. Ann. Statist. 12, 322-335.
Stein, C. (1981). Estimation of the mean of a multivariate normal distribution. Ann. Statist. 9, 1135-1151.
Swamy, P.A.V.B. and J.S. Mehta (1979). Estimation of common coefficients in two regression equations. J. Econometrics 10, 1-14.
Yancey, T.A., G.G. Judge and S. Miyazaki (1984). Some improved estimators in the case of possible heteroscedasticity. J. Econometrics 25, 133-150.
George, E.I. (1991). Shrinkage domination in a multivariate common mean problem. Ann. Statist. 19, 952-960. Graybill, F.A. and R.D. Deal (1959). Combining unbiased estimators. Biometrics 15, 543-550. Haff, L.R. (1977). Minimax estimators for a normal precision matrix. J. Mult. Anal. 7, 374385. Karlin, S. (1968). Total Positiuity, Vol. 1. Stanford University Press, Stanford, CA. Khatri, C.G. and K.R. Shah (1974). Estimation of location parameters from two linear models under normality. Comm. Statist.-Theory Methods 3, 647-663. Krishnamoorthy, K. (1991a). Estimation of a common multivariate normal mean vector. Ann. Inst. Statist. Math. (to appear). Krishnamoorthy, K. (1991b). On a shrinkage estimator of a normal common mean vector. J. Mult. Anal. (to appear). Kubokawa, T. (1990). Minimax estimation of common coefficients of several regression models under quadratic loss. J. Statist. Plann. Inf 24, 337-345. Loh, W.-L. (1990). Estimating the common mean of two multivariate normal distributions. Ann. Statist. 19, 2977313. Norwood, T.E. and K. Hinkelmann (1977). Estimating the common mean of several normal populations. Ann. Statist. 5, 1047-1050. Shinozaki, N. (1978). A note on estimating the common mean of k normal distributions and the Stein problem. Comm. Statist.-Theory Methods 7, 1421-1432. Shinozaki, N. (1984). Simultaneous estimation of location parameters under quadratic loss. Ann. Statist. 12, 322-335. Stein, C. (1981). Estimation of the mean of a multivariate normal distribution. Ann. Statist. 9, 1135-1151. Swamy, P.A.V.B. and J.S. Mehta (1979). Estimation of common coefficients in two regression equations. J. Econometrics 10, 1-14. Yancey, T.A., G.G. Judge and S. Miyazaki (1984). Some improved estimators in the case of possible hateroscedasticity. J. Econometrics 25, 133-150.