Journal of Statistical Planning and Inference 15 (1987) 347-363 North-Holland
SIMULTANEOUS ESTIMATION OF PARAMETERS UNDER ENTROPY LOSS†

Dipak K. DEY
University of Connecticut, Storrs, CT 06268, USA

Malay GHOSH*
University of Florida, Gainesville, FL 32611, USA

C. SRINIVASAN**
University of Kentucky, Lexington, KY 40506, USA

Received 18 May 1984; revised manuscript received 26 April 1986
Recommended by J.O. Berger
Abstract: The problem considered is simultaneous estimation of scale parameters and their reciprocals from p independent gamma distributions under a scale invariant loss function first introduced in James and Stein (1961). Under mild restrictions on the shape parameters, the best scale invariant estimators are shown to be admissible for p = 2. For p ≥ 3, a general technique is developed for improving upon the best scale invariant estimators. Improvement on the generalized Bayes estimators of a vector involving certain powers of the scale parameters is also obtained.
AMS Subject Classification: Primary 62C15, 62F10; Secondary 62H99. Key words and phrases: Entropy loss; Simultaneous estimation; Gamma scale parameters; Best invariant estimators; Admissibility; Inadmissible estimators; Differential inequalities; Trimmed estimators.
† The order of the authors' names is alphabetical, and does not indicate their relative contributions to the paper.
* Research supported by NSF Grant Number DMS-8218091.
** Research supported by NSF Grant Number MCS-8212968.

1. Introduction

Recently, there has been considerable interest in the simultaneous estimation of parameters from several independent distributions other than normal. This problem is addressed quite extensively in Berger (1980), with particular emphasis on gamma distributions. Berger (1980) proposed a number of estimators improving on the gamma mean vector under different quadratic losses, and an estimator improving on the natural parameter vector of several gamma distributions under squared error loss. Estimators alternative to Berger's in the gamma case have recently been proposed by Das Gupta (1986). Berger's (1980) results have been generalized in several directions by Ghosh and Parsian (1980), Ghosh, Hwang and Tsui (1984), and Ghosh and Dey (1984). The first objective of this paper is the simultaneous estimation of the mean vector, and of the vector of reciprocals of the means, from $p$ independent gamma distributions. Let $X_1, \ldots, X_p$ be independently distributed, $X_i$ having pdf
$$f_{\theta_i}(x_i) = \exp(-\theta_i x_i)\,x_i^{\alpha_i - 1}\theta_i^{\alpha_i}/\Gamma(\alpha_i), \qquad x_i > 0, \tag{1.1}$$
where $\theta_i\,(>0)$ is unknown, but $\alpha_i\,(>0)$ is known. For estimating $\theta = (\theta_1, \ldots, \theta_p)$ by $a = (a_1, \ldots, a_p)$, consider the loss
$$L(\theta, a) = \sum_{i=1}^{p}\left(a_i\theta_i^{-1} - \log(a_i\theta_i^{-1}) - 1\right), \tag{1.2}$$
while for estimating the mean vector $\theta^{-1} = (\theta_1^{-1}, \ldots, \theta_p^{-1})$, consider the loss
$$L(\theta^{-1}, a) = \sum_{i=1}^{p}\left(a_i\theta_i - \log(a_i\theta_i) - 1\right). \tag{1.3}$$
These losses can be described as entropy losses; it is easy to show that they correspond to an entropy measure of distance between the distributions indexed by $\theta$ and $a$. An analogous loss was considered in James and Stein (1961) for the estimation of the variance-covariance matrix of a multinormal distribution. Under the loss (1.3), the best invariant estimator of $\theta^{-1}$ is
$$\delta^0(X) = (X_1/\alpha_1, \ldots, X_p/\alpha_p)$$
(see, for example, Exercise 3 of Ferguson (1967), p. 181). It is also easy to check that $\delta^0(X)$ is the UMVUE of $\theta^{-1}$. Further, if $\min_{1\le i\le p}\alpha_i > 1$, the best invariant estimator of $\theta$ under the loss (1.2) is
$$\delta^+(X) = ((\alpha_1 - 1)X_1^{-1}, \ldots, (\alpha_p - 1)X_p^{-1}).$$
For $p = 1$, it follows from Stein (1959) or Brown (1966) that $\delta^0$ and $\delta^+$ are admissible for $\theta^{-1}$ and $\theta$, respectively. In Section 2, for $p = 2$ we prove the admissibility of $\delta^0(X)$ for estimating $\theta^{-1}$ under the loss (1.3) when $\min(\alpha_1, \alpha_2) > 4$. In this section, we also prove for $p = 2$ the admissibility of $\delta^+(X)$ for estimating $\theta$ under the loss (1.2) when $\min(\alpha_1, \alpha_2) > 5$. The results are obtained by verifying the conditions of Brown and Fox (1974). In Section 3, we prove the inadmissibility of $\delta^0(X)$ for estimating $\theta^{-1}$ when $p \ge 3$, and several estimators improving on $\delta^0(X)$ are obtained. This problem was first considered in Das Gupta (1984), and our results generalize those of Das Gupta in several directions. In this section we also prove the inadmissibility of $\delta^+(X)$ for estimating $\theta$ when $p \ge 3$. Finally, in this section, we obtain a general result showing the inadmissibility of the generalized Bayes
estimators of $(\theta_1^{b_1}, \ldots, \theta_p^{b_p})$ with respect to the (possibly improper) prior with pdf $\prod_{i=1}^{p}\theta_i^{k_i}$, under losses
$$L(\theta, a) = \sum_{i=1}^{p}\theta_i^{m_i}\left(a_i\theta_i^{-b_i} - \log(a_i\theta_i^{-b_i}) - 1\right), \tag{1.4}$$
provided certain relationships exist among the $m_i$'s, $\alpha_i$'s, $k_i$'s and $b_i$'s. Similar results under quadratic loss were obtained by Das Gupta (1986).
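To make the setup concrete, the following minimal sketch (not part of the original paper; the function and variable names are illustrative) evaluates the entropy losses (1.2) and (1.3) and the best invariant estimators $\delta^0(X) = (X_1/\alpha_1, \ldots, X_p/\alpha_p)$ and $\delta^+(X) = ((\alpha_1 - 1)/X_1, \ldots, (\alpha_p - 1)/X_p)$ on simulated gamma data.

```python
import numpy as np

rng = np.random.default_rng(0)

def entropy_loss_theta(a, theta):
    """Loss (1.2) for estimating theta = (theta_1, ..., theta_p) by a."""
    r = a / theta
    return np.sum(r - np.log(r) - 1.0)

def entropy_loss_mean(a, theta):
    """Loss (1.3) for estimating the mean vector theta^{-1} by a."""
    r = a * theta
    return np.sum(r - np.log(r) - 1.0)

alpha = np.array([3.0, 5.0, 7.0])      # known shape parameters alpha_i
theta = np.array([0.5, 1.0, 2.0])      # unknown rate parameters theta_i in (1.1)
x = rng.gamma(shape=alpha, scale=1.0 / theta)   # one observation per coordinate

delta0 = x / alpha                      # best invariant / UMVU estimator of theta^{-1}
delta_plus = (alpha - 1.0) / x          # best invariant estimator of theta (needs alpha_i > 1)

print("loss (1.3) at delta0:", entropy_loss_mean(delta0, theta))
print("loss (1.2) at delta+:", entropy_loss_theta(delta_plus, theta))
```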
2. The admissibility result
Consider two independent gamma random variables $X_1$ and $X_2$ with pdf's given in (1.1). The following theorem is proved.

Theorem 2.1. Under the loss (1.3), $\delta^0(X) = (X_1/\alpha_1, X_2/\alpha_2)$ is an admissible estimator of $\theta^{-1} = (\theta_1^{-1}, \theta_2^{-1})$ if $\min(\alpha_1, \alpha_2) > 4$.
Proof. Use the transformation $Y_i = \log X_i$, $\eta_i = -\log\theta_i$, $t_i = \log a_i$, $\delta^*(Y_i) = Y_i - \log\alpha_i$. Then the loss (1.3) can be rewritten as
$$W(t - \eta) = \sum_{i=1}^{2}\left[\exp(t_i - \eta_i) - (t_i - \eta_i) - 1\right], \tag{2.1}$$
where $t = (t_1, t_2)$ and $\eta = (\eta_1, \eta_2)$. Theorem 2.1 is proved by verifying the conditions of Brown and Fox (1974), stated and proved below as lemmas.

Lemma 2.1. The best invariant estimator of $\eta$ under the loss (2.1) exists and is unique.
The proof is immediate since the loss is strictly convex. For the rest of the argument, define $y = (y_1, y_2)$, $|y| = (y_1^2 + y_2^2)^{1/2}$ and $\delta^*(y) = (\delta^*_1(y_1), \delta^*_2(y_2))$.

Lemma 2.2.
$$\int |y|^2\,W(\delta^*(y))\,p(y)\,dy < \infty, \tag{2.2}$$
where $p(y)$ denotes the pdf of $Y = (Y_1, Y_2)$ when $\eta = 0$.

Proof. The left-hand side of (2.2) can be expressed as
$$(\Gamma(\alpha_1)\Gamma(\alpha_2))^{-1}\int\left(\sum_{i=1}^{2}\log^2 x_i\right)\left[\sum_{i=1}^{2}\left(x_i\alpha_i^{-1} - \log x_i + \log\alpha_i - 1\right)\right]\exp\left(-\sum_{i=1}^{2}x_i\right)\prod_{i=1}^{2}x_i^{\alpha_i - 1}\,dx_1\,dx_2, \tag{2.3}$$
which is finite when $\alpha_i > 0$ ($i = 1, 2$).  □
Denote by $p(y - \eta)$ the pdf of $Y$ when $\eta$ is the true parameter. Then we have the following lemma.

Lemma 2.3. Suppose $\delta(Y)$ is any estimator such that $R(\eta, \delta) \le R(\eta, \delta^*) = R_0$ for all $\eta$. Then there must exist a sequence $\{\delta_L\}$ of estimators such that
$$R(\eta, \delta_L) \le R(\eta, \delta) \quad \text{for all } |\eta| \le L; \tag{2.4}$$
for $L \le M$ and $|y| \le \tfrac{1}{2}L$,
$$\int W(\delta_L(y) - y + z)\,p(z)\,dz \le \int W(\delta_M(y) - y + z)\,p(z)\,dz; \tag{2.5}$$
and
$$\int_{|y| < L/2} W(\delta_L(y) - \eta)\,p(y - \eta)\,dy \to 0 \quad \text{as } L \to \infty, \text{ uniformly in } |\eta| > L. \tag{2.6}$$

Proof. Let $\delta(y) = (\delta^1(y), \delta^2(y))$ be given as in the hypothesis of the lemma. Define $\delta_L(y) = (\delta_L^1(y), \delta_L^2(y))$ as follows:
$$\delta_L^i(y) = \begin{cases}\delta^i(y) & \text{if } -(\alpha_i + 1)\exp(\tfrac{1}{2}L) \le \delta^i(y) \le KL,\\ 0 & \text{otherwise},\end{cases} \tag{2.7}$$
where $K\,(>0)$ will be determined later. To prove (2.4), it suffices to show that $W(\delta_L(y) - \eta) \le W(\delta(y) - \eta)$, i.e.,
$$\sum_{i=1}^{2}\left[\exp(\delta_L^i(y) - \eta_i) - (\delta_L^i(y) - \eta_i)\right] \le \sum_{i=1}^{2}\left[\exp(\delta^i(y) - \eta_i) - (\delta^i(y) - \eta_i)\right] \tag{2.8}$$
for $|\eta| \le L$. We prove (2.8) coordinatewise; since the argument is the same for both coordinates, it is given only for $i = 1$. The inequality is trivial when $-(\alpha_1 + 1)\exp(\tfrac{1}{2}L) \le \delta^1(y) \le KL$, since then $\delta_L^1(y) = \delta^1(y)$. When $\delta^1(y) > KL$, $\delta_L^1(y) = 0$, and $W(\delta_L^1(y) - \eta_1) \le W(\delta^1(y) - \eta_1)$ reduces to $[\exp(\delta^1(y)) - 1]/\delta^1(y) \ge \exp(\eta_1)$. Since the function $\{\exp(u) - 1\}/u$ is increasing in $u$ for $u > 0$, for $\delta^1(y) > KL$ we have $[\exp(\delta^1(y)) - 1]/\delta^1(y) > [\exp(KL) - 1]/(KL)$, while for $|\eta_1| \le L$, $\exp(\eta_1) \le \exp(L)$; so for all large $L$, $[\exp(KL) - 1]/(KL) > \exp(L)$, which ensures the desired inequality. Again, for $\delta^1(y) < -(\alpha_1 + 1)\exp(\tfrac{1}{2}L)$, $W(\delta_L^1(y) - \eta_1) \le W(\delta^1(y) - \eta_1)$ reduces to $[1 - \exp(\delta^1(y))]/[-\delta^1(y)] \le \exp(\eta_1)$; since $\exp(\eta_1) \ge \exp(-L)$ for $|\eta_1| \le L$, the inequality $W(\delta_L^1(y) - \eta_1) \le W(\delta^1(y) - \eta_1)$ holds true for all sufficiently large $L$.
To prove (2.5), it suffices to show that for $|y| \le \tfrac{1}{2}L$ and $L \le M$,
$$\int\sum_{i=1}^{2}\{\exp(\delta_L^i - y_i + z_i) - (\delta_L^i - y_i + z_i)\}\,p(z)\,dz \le \int\sum_{i=1}^{2}\{\exp(\delta_M^i - y_i + z_i) - (\delta_M^i - y_i + z_i)\}\,p(z)\,dz. \tag{2.9}$$
Once again, we prove the inequality (2.9) coordinatewise. Since the argument is the same for both coordinates, the proof is given only for $i = 1$. It suffices to consider the two cases $KL < \delta^1(y) \le KM$ and $-(\alpha_1 + 1)\exp(\tfrac{1}{2}M) \le \delta^1(y) < -(\alpha_1 + 1)\exp(\tfrac{1}{2}L)$, since otherwise $\delta_L^1(y) = \delta_M^1(y)$. In the first case $\delta_L^1(y) = 0$ and $\delta_M^1(y) = \delta^1(y)$, so that (2.9) for $i = 1$ becomes
$$\int\{\exp(z_1 - y_1) - (z_1 - y_1)\}\,p(z)\,dz \le \int\{\exp(\delta^1(y) + z_1 - y_1) - (\delta^1(y) + z_1 - y_1)\}\,p(z)\,dz. \tag{2.10}$$
Noting that $\int\exp(z_1)\,p(z)\,dz = \alpha_1$, the inequality (2.10) reduces to
$$[\exp(\delta^1(y)) - 1]/\delta^1(y) \ge \alpha_1^{-1}\exp(y_1). \tag{2.11}$$
Now, for $KL < \delta^1(y)$, $[\exp(\delta^1(y)) - 1]/\delta^1(y) > [\exp(KL) - 1]/(KL)$, while $\alpha_1^{-1}\exp(y_1) \le \alpha_1^{-1}\exp(\tfrac{1}{2}L)$, so the desired inequality follows for all sufficiently large $L$. For $-(\alpha_1 + 1)\exp(\tfrac{1}{2}M) \le \delta^1(y) < -(\alpha_1 + 1)\exp(\tfrac{1}{2}L)$, the inequality (2.11) reduces to $[1 - \exp(\delta^1(y))]/[-\delta^1(y)] \le \alpha_1^{-1}\exp(y_1)$. Since $\alpha_1^{-1}\exp(y_1) \ge \alpha_1^{-1}\exp(-\tfrac{1}{2}L)$, while $[1 - \exp(\delta^1(y))]/[-\delta^1(y)] \le 1/(-\delta^1(y)) \le (\alpha_1 + 1)^{-1}\exp(-\tfrac{1}{2}L)$, the inequality follows.

To prove (2.6), once again we prove the result for each coordinate. Consider the case $i = 1$. Now, for $|y| < \tfrac{1}{2}L$, define $\gamma_L^1(y) = \delta_L^1(y) - y_1$ and $z_i = y_i - \eta_i$, $i = 1, 2$. Then
$$W(\delta_L^1(y) - \eta_1) = \exp(\gamma_L^1(y) + z_1) - (\gamma_L^1(y) + z_1) - 1. \tag{2.12}$$
Again, from the definition of $\delta_L^1$, one gets
$$\exp(\gamma_L^1(y)) \le \exp(\tfrac{1}{2}L + KL), \tag{2.13}$$
and
$$-\gamma_L^1(y) - 1 \le (\alpha_1 + 1)\exp(\tfrac{1}{2}L) + \tfrac{1}{2}L - 1. \tag{2.14}$$
Also, since $|y| < \tfrac{1}{2}L$ and $|\eta| > L$ together imply that $|z| = |y - \eta| > \tfrac{1}{2}L$, one gets the inequality
$$\int_{|y| < L/2}\int_{|z| > L/2} W(\delta_L^1(y) - \eta_1)\,p(z)\,dz\,dy \le \int_{|y| < L/2} dy\int_{|z| > L/2}\exp(\tfrac{1}{2}L + KL)\exp(z_1)\,p(z)\,dz$$
$$\quad + \left((\alpha_1 + 1)\exp(\tfrac{1}{2}L) + \tfrac{1}{2}L - 1\right)\int_{|y| < L/2} dy\int_{|z| > L/2} p(z)\,dz + \int_{|y| < L/2} dy\int_{|z| > L/2}|z|\,p(z)\,dz$$
$$\le \tfrac{1}{4}\pi L^2\exp(\tfrac{1}{2}L + KL)\int_{|z| > L/2}\exp(z_1)\,p(z)\,dz + \tfrac{1}{4}\pi L^2\left((\alpha_1 + 1)\exp(\tfrac{1}{2}L) + \tfrac{1}{2}L\right)\int_{|z| > L/2} p(z)\,dz$$
$$\quad + \tfrac{1}{4}\pi L^2\int_{|z| > L/2}|z|\,p(z)\,dz. \tag{2.15}$$
Now it suffices to verify that the right-hand side of (2.15) tends to zero as $L \to \infty$ for $\alpha_1 > 4$ and $\alpha_2 > 4$. First observe that
$$\int_{\substack{|z| > L/2 \\ z_i > 0,\ i = 1, 2}}\exp(z_1)\,p(z)\,dz = \int_{\substack{\sum\log^2 x_i > L^2/4 \\ \log x_i > 0,\ i = 1, 2}} i(x_1, x_2)\,dx_1\,dx_2 \le \sum_{j=1}^{2}\int_{\log x_j > L/\sqrt{8}} i(x_1, x_2)\,dx_1\,dx_2 \le C_1\exp\left(-\tfrac{1}{2}\exp(L/\sqrt{8})\right), \tag{2.16}$$
where
$$i(x_1, x_2) = \exp(-x_1 - x_2)\,x_1^{\alpha_1}x_2^{\alpha_2 - 1}/\{\Gamma(\alpha_1)\Gamma(\alpha_2)\}$$
and $C_1$ is an absolute constant which may depend on $\alpha_1$ and $\alpha_2$ but not on $L$. The corresponding integrals over the other quadrants of $z$ can be handled similarly, resulting in the conclusion
$$\int_{|z| > L/2}\exp(z_1)\,p(z)\,dz = O(e^{-aL/2}), \tag{2.17}$$
where $a = \min(\alpha_1 + 1, \alpha_2)$. Thus the first term on the right-hand side of (2.15) can be bounded as follows:
$$L^2\exp(\tfrac{1}{2}L + KL)\int_{|z| > L/2}\exp(z_1)\,p(z)\,dz \le CL^2\exp(\tfrac{1}{2}L + KL - \tfrac{1}{2}aL), \tag{2.18}$$
where $C$ is an absolute constant depending only on $\alpha_1$, $\alpha_2$ but not on $L$. Consequently, if $a > 4$, the right-hand side of (2.18) goes to zero (for $K$ sufficiently close
to $\tfrac{1}{2}$). Similarly, the other terms on the right-hand side of (2.15) go to zero for the same choice of $K$ as $L \to \infty$.  □

Lemma 2.4. There exists a non-increasing function $K_1(v) : (0, \infty) \to (0, \infty)$ such that
$$\int K_1(v)\,dv < \infty, \tag{2.19}$$
and for any invariant estimator $\delta(Y) = \delta^*(Y) + t$ with $t = (t_1, t_2)$, the following inequality holds:
$$\int_{|y| \le v}[W(\delta^*(y)) - W(\delta(y))]\,p(y)\,dy \le K_1(v)\left\{\int[W(\delta(y)) - W(\delta^*(y))]\,p(y)\,dy\right\}^{1/2}.$$
Proof. First note that, since $\delta(y) = \delta^*(y) + t$,
$$W(\delta(y)) - W(\delta^*(y)) = \sum_{i=1}^{2}\left[\exp(\delta^*_i(y_i))\{\exp(t_i) - 1\} - t_i\right]. \tag{2.20}$$
Now use the fact that when $\eta_1 = \eta_2 = 0$, i.e., when $\theta_1 = \theta_2 = 1$, $E[\exp(\delta^*_i(Y_i))] = 1$. Then it follows from (2.20) that
$$\int[W(\delta(y)) - W(\delta^*(y))]\,p(y)\,dy = \sum_{i=1}^{2}(\exp(t_i) - 1 - t_i). \tag{2.21}$$
Hence, from (2.20) and (2.21), it suffices to show that
$$\int_{|y| \le v}\sum_{i=1}^{2}\{t_i + \exp(\delta^*_i(y_i)) - \exp(\delta^*_i(y_i) + t_i)\}\,p(y)\,dy \le K_1(v)\left\{\sum_{i=1}^{2}(\exp(t_i) - 1 - t_i)\right\}^{1/2}. \tag{2.22}$$
We will obtain an upper bound for each summand on the left-hand side of (2.22), and then combine them to get the desired inequality. Once again, we obtain a bound only when $i = 1$, as the bound when $i = 2$ is similar. Thus, it suffices to show that
$$\int_{|y| \le v}[t_1 + \exp(\delta^*_1(y_1)) - \exp(\delta^*_1(y_1) + t_1)]\,p(y)\,dy \le K_2(v)(\exp(t_1) - 1 - t_1)^{1/2}, \tag{2.23}$$
since $(\exp(t_j) - 1 - t_j)^{1/2} \le \{\sum_{i=1}^{2}(\exp(t_i) - 1 - t_i)\}^{1/2}$ for each $j = 1, 2$. We consider below the different cases.

Case I. $t_1 > 0$. Let $T_0 > 0$ be large enough that $t_1 + 1 \le \exp(\tfrac{1}{2}t_1)$ if $t_1 \ge T_0$.

Case IA. $t_1 \ge T_0 > 0$.
Using $\int\exp[\delta^*_1(y_1)]\,p(y)\,dy = 1$ and $t_1 + 1 \le \exp(\tfrac{1}{2}t_1)$, the left-hand side of (2.23) is
$$\le t_1 + 1 - \exp(\tfrac{1}{2}t_1)\int_{|y| \le v}\exp(\delta^*_1(y_1))\,p(y)\,dy = t_1 + 1 - \exp(\tfrac{1}{2}t_1) + \exp(\tfrac{1}{2}t_1)\int_{|y| > v}\exp(\delta^*_1(y_1))\,p(y)\,dy$$
$$\le c_1(\exp(t_1) - 1 - t_1)^{1/2}\int_{|y| > v}\exp(\delta^*_1(y_1))\,p(y)\,dy, \tag{2.24}$$
where $c_1$ is a constant which might depend on $T_0$.

Case IB. $0 < t_1 < T_0$. Then the left-hand side of (2.23) is
$$= t_1 + 1 - \exp(t_1) - t_1\int_{|y| > v}p(y)\,dy + [\exp(t_1) - 1]\int_{|y| > v}\exp(\delta^*_1(y_1))\,p(y)\,dy$$
$$\le [\exp(t_1) - 1]\int_{|y| > v}\exp(\delta^*_1(y_1))\,p(y)\,dy = \frac{\exp(t_1) - 1}{\{\exp(t_1) - 1 - t_1\}^{1/2}}\,\{\exp(t_1) - 1 - t_1\}^{1/2}\int_{|y| > v}\exp(\delta^*_1(y_1))\,p(y)\,dy$$
$$\le c_2(\exp(t_1) - 1 - t_1)^{1/2}\int_{|y| > v}\exp(\delta^*_1(y_1))\,p(y)\,dy, \tag{2.25}$$
where $c_2$ is a constant which might depend on $T_0$.

Case IIA. $t_1 \le -1$. In this case, the left-hand side of (2.23) is
$$\le 1 - |t_1| + |t_1|\int_{|y| > v}p(y)\,dy \le \int_{|y| > v}p(y)\,dy \le c_3\int_{|y| > v}p(y)\,dy\,\{\exp(t_1) - t_1 - 1\}^{1/2}. \tag{2.26}$$
The last inequality can be checked by expanding $\exp(t_1)$ and observing that $t_1 \le -1$.

Case IIB. $-1 < t_1 < 0$. Here the left-hand side of (2.23) is
$$= [t_1 + 1 - \exp(t_1)] - t_1\left(\int_{|y| > v}p(y)\,dy - \int_{|y| > v}\exp(\delta^*_1(y_1))\,p(y)\,dy\right) + (\exp(t_1) - t_1 - 1)\int_{|y| > v}\exp(\delta^*_1(y_1))\,p(y)\,dy$$
$$\le \left[\int_{|y| > v}p(y)\,dy + \int_{|y| > v}\exp(\delta^*_1(y_1))\,p(y)\,dy\right](|t_1| + \exp(t_1) - t_1 - 1)$$
$$\le c_4\left[\int_{|y| > v}p(y)\,dy + \int_{|y| > v}\exp(\delta^*_1(y_1))\,p(y)\,dy\right](\exp(t_1) - t_1 - 1)^{1/2}. \tag{2.27}$$
Hence, taking $c = \max(c_1, c_2, c_3, c_4)$ and
$$K_2(v) = c\left[\int_{|y| > v}p(y)\,dy + \int_{|y| > v}\exp(\delta^*_1(y_1))\,p(y)\,dy\right],$$
(2.23) follows from (2.24)-(2.27). A similar argument shows that
$$\int_{|y| \le v}[t_2 + \exp(\delta^*_2(y_2)) - \exp(\delta^*_2(y_2) + t_2)]\,p(y)\,dy \le K_3(v)(\exp(t_2) - 1 - t_2)^{1/2}, \tag{2.28}$$
where
$$K_3(v) = d\left[\int_{|y| > v}p(y)\,dy + \int_{|y| > v}\exp(\delta^*_2(y_2))\,p(y)\,dy\right]$$
and $d$ is a generic constant. Adding the coordinates and taking $K_1(v) = K_2(v) + K_3(v)$, the lemma follows if we can show that $K_1(v)$ is non-increasing in $v$ and $\int K_1(v)\,dv < \infty$. The first fact is immediate from the definitions of $K_2(v)$ and $K_3(v)$, while the second can be verified easily by going back to the $X$ variables.  □
Lemma 2.5. There exists a non-increasing function $K_4(v) : (0, \infty) \to (0, \infty)$ such that $\int K_4(v)\,dv < \infty$ and
$$\int_{|y| > v}[W(\delta^*(y)) - W(\delta(y))]^+\,p(y)\,dy \le K_4(v)\left\{\int[W(\delta(y)) - W(\delta^*(y))]\,p(y)\,dy\right\}^{1/2}$$
for every invariant estimator $\delta(y)$ satisfying $\delta(y) = \delta^*(y) + t$ for some fixed $t = (t_1, t_2)$.
Proof. The proof will be given for the one-coordinate case. Since $W(\delta^*(y)) = \exp(\delta^*(y)) - 1 - \delta^*(y)$, it follows that
$$W(\delta^*(y)) - W(\delta(y)) = \exp(\delta^*(y)) - \exp(\delta^*(y) + t) + t = \exp(\delta^*(y))[1 - \exp(t)] + t.$$
Thus
$$[W(\delta^*(y)) - W(\delta(y))]^+ = \{\exp(\delta^*(y))[1 - \exp(t)] + t\}^+. \tag{2.29}$$

Case I. $t > 0$. From (2.29) it follows that
$$\int_{|y| > v}[W(\delta^*(y)) - W(\delta(y))]^+\,p(y)\,dy = \int_{|y| > v}[\exp(\delta^*(y))(1 - e^t) + t]^+\,p(y)\,dy$$
$$\le \int_{|y| > v} t\,p(y)\,dy \qquad (\text{since } e^t > 1 \text{ for } t > 0)$$
$$\le \sqrt{2}\,(e^t - 1 - t)^{1/2}\int_{|y| > v}p(y)\,dy \qquad (\text{since } e^t - 1 - t \ge \tfrac{1}{2}t^2 \text{ for } t > 0)$$
$$= K_5(v)\left[\int[W(\delta(y)) - W(\delta^*(y))]\,p(y)\,dy\right]^{1/2},$$
where
$$K_5(v) = \sqrt{2}\int_{|y| > v}p(y)\,dy.$$

Case II. $t < 0$. In this case, it follows from (2.29) that
$$[W(\delta^*(y)) - W(\delta(y))]^+ \le \exp(\delta^*(y))(1 - e^t),$$
so that
$$\int_{|y| > v}[W(\delta^*(y)) - W(\delta(y))]^+\,p(y)\,dy \le (1 - e^t)\int_{|y| > v}\exp(\delta^*(y))\,p(y)\,dy$$
$$\le \text{const.}\,(e^t - 1 - t)^{1/2}\int_{|y| > v}\exp(\delta^*(y))\,p(y)\,dy = K_6(v)\left[\int[W(\delta(y)) - W(\delta^*(y))]\,p(y)\,dy\right]^{1/2},$$
where
$$K_6(v) = c\int_{|y| > v}\exp(\delta^*(y))\,p(y)\,dy,$$
$c$ being a generic constant. Defining $K_4(v) = K_5(v) + K_6(v)$, the proof is complete.  □
Now assumption (4) of Brown and Fox (1974) follows from Lemma 2.4 and Lemma 2.5 by taking $K(v) = K_1(v) + K_4(v)$. Thus $\delta^0(X) = (X_1/\alpha_1, X_2/\alpha_2)$ is an admissible estimator of $(\theta_1^{-1}, \theta_2^{-1})$ under the loss (1.3).  □

Next we prove the admissibility of $((\alpha_1 - 1)/X_1, (\alpha_2 - 1)/X_2)$ for estimating $(\theta_1, \theta_2)$ under the loss (1.2). Specifically, the following theorem is proved.
Theorem 2.2. Under the loss (1.2), $\delta^+(X) = ((\alpha_1 - 1)X_1^{-1}, (\alpha_2 - 1)X_2^{-1})$ is an admissible estimator of $(\theta_1, \theta_2)$ if $\min(\alpha_1, \alpha_2) > 5$.

The proof of this theorem is very similar to the proof of Theorem 2.1, and we omit essentially all the details. The key points to note are that we use the transformation $Y_i = \log X_i^{-1}$, $\eta_i = \log\theta_i$, $t_i = \log a_i$, $\delta^*(Y_i) = Y_i + \log(\alpha_i - 1)$ ($i = 1, 2$) to change the scale problem into a location problem. Then the loss is the same as the one given in (2.1). The proofs of Lemmas 2.1-2.4 follow the same pattern, except that $i(x_1, x_2)$ should now be redefined as
$$i(x_1, x_2) = \exp(-x_1 - x_2)\,x_1^{\alpha_1 - 2}x_2^{\alpha_2 - 1}/\{\Gamma(\alpha_1)\Gamma(\alpha_2)\}.$$
Accordingly, $a$ in (2.17) should be redefined as $a = \min(\alpha_1 - 1, \alpha_2)$. Thus, to ensure (2.6) in Lemma 2.3, we need $\min(\alpha_1 - 1, \alpha_2 - 1) > 4$, i.e., $\min(\alpha_1, \alpha_2) > 5$.
3. Inadmissibility results

In this section, we first prove inadmissibility results concerning the simultaneous estimation of the means and the reciprocals of the means of independent gamma variables with known shape parameters under the losses (1.3) and (1.2). Suppose $\delta^*(X) = \delta^0(X) + \phi(X)$ is a competing estimator of $\delta^0(X)$, the best invariant UMVUE of $\theta^{-1}$. Then, if the conditions given in Lemma 1 of Berger (1980) hold, it follows that
$$\Delta^*(\theta) = R(\delta^*, \theta) - R(\delta^0, \theta) = E_\theta\Delta_0(X), \tag{3.1}$$
where
$$\Delta_0(x) = \sum_{i=1}^{p}\left[\phi_i^{(1)}(x) + (\alpha_i - 1)\phi_i(x)/x_i - \log(1 + \alpha_i\phi_i(x)/x_i)\right], \tag{3.2}$$
with $\phi_i^{(1)}(x) = \partial\phi_i(x)/\partial x_i$. Putting $\phi_i(x) = x_i\psi_i(x)$, one gets
$$\Delta_0(x) = \sum_{i=1}^{p}\left[x_i\psi_i^{(1)}(x) + \alpha_i\psi_i(x) - \log(1 + \alpha_i\psi_i(x))\right]. \tag{3.3}$$
Hence, if we can find a solution $\psi = (\psi_1, \ldots, \psi_p)$ to the differential inequality $\Delta_0(x) \le 0$ for all $x$, then $\delta^*(X)$ dominates $\delta^0(X)$. The following elementary inequality will be used repeatedly.

Lemma 3.1. For $0 < u < 1$ and $x \ge -u$,
$$\log(1 + x) \ge x - \frac{(3 - u)x^2}{6(1 - u)}. \tag{3.4}$$
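As a quick sanity check of the inequality in Lemma 3.1 as reconstructed in (3.4), the following sketch (illustrative only, not from the paper) evaluates the gap $\log(1 + x) - \{x - (3 - u)x^2/[6(1 - u)]\}$ on a grid of $x \ge -u$; the gap should be nonnegative.

```python
import numpy as np

def lower_bound(x, u):
    """Right-hand side of (3.4): x - (3 - u) x^2 / (6 (1 - u))."""
    return x - (3.0 - u) * x ** 2 / (6.0 * (1.0 - u))

for u in (0.1, 0.25, 0.5, 0.9):
    x = np.linspace(-u, 10.0, 200001)        # the inequality is claimed for x >= -u
    gap = np.log1p(x) - lower_bound(x, u)
    print(f"u = {u}: minimum gap on the grid = {gap.min():.3e}")   # expected to be >= 0
```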
We obtain several solutions to the differential inequality $\Delta_0(x) \le 0$, where $\Delta_0(x)$ is given in (3.3). The next few theorems provide different solutions to $\Delta_0(x) \le 0$, and these solutions correspond to different classes of shrinkage estimators.

Theorem 3.1. Suppose
$$S = \sum_{i=1}^{p}(\log X_i - \mu_i)^2 \quad\text{and}\quad s = \sum_{i=1}^{p}(\log x_i - \mu_i)^2.$$
Consider an estimator $\delta^*(X) = (\delta_1^*(X), \ldots, \delta_p^*(X))$ given componentwise as
$$\delta_i^*(X) = \frac{X_i}{\alpha_i} - \frac{X_i\,\tau(S)}{b + S}(\log X_i - \mu_i), \qquad i = 1, \ldots, p, \tag{3.5}$$
with $b > 36(p - 2)^2/25a^2$, where $\tau(s)$ is a function satisfying:
(i) $0 < \tau(s) < 6(p - 2)/5a^2$ with $a = \max(\alpha_1, \ldots, \alpha_p)$,
(ii) $\tau(s)$ is nondecreasing in $s$, and
(iii) $E\tau'(S) < \infty$.
Then $\delta^*(X)$ dominates $\delta^0(X)$ for $p \ge 3$, in terms of risk.

Proof.
Define $\psi_i(x) = -[\tau(s)/(b + s)](\log x_i - \mu_i)$, $i = 1, \ldots, p$. First, observe that
$$|\alpha_i\psi_i(x)| \le \frac{a\,\tau(s)}{2\sqrt{b}} \le \frac{(p - 2)a}{2\sqrt{b}}\cdot\frac{6}{5a^2} < \frac{1}{2}.$$
Now, using Lemma 3.1 with $u = \tfrac{1}{2}$, it follows that
$$\log(1 + \alpha_i\psi_i(x)) \ge \alpha_i\psi_i(x) - \tfrac{5}{6}\alpha_i^2\psi_i^2(x).$$
Finally, from (3.3), it follows that $\psi = (\psi_1, \ldots, \psi_p)$ is a solution of $\Delta_0(x) \le 0$, which completes the proof of the theorem.  □

A Monte Carlo simulation study was performed to compute the risks of $\delta^*(X)$ given in (3.5) with $\mu_i = 0$, $\tau(s) = 3(p - 2)/5a^2$ and $b = 1.45(p - 2)^2/a^2$ for several values of $p$ and $a$. The percentage improvements in risk of $\delta^*(X)$ over the standard estimator $\delta^0(X)$ were computed for different ranges of the parameter $\theta_i$, and the improvements seemed to be in the 1% to 8% range.
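The following sketch is a minimal version of such a risk comparison (it is not the authors' original simulation; the sample size, seed and the choice $\theta_i = 1$ are illustrative assumptions). It estimates the risks of $\delta^0$ and of the shrinkage estimator (3.5) under the loss (1.3), with $\mu_i = 0$, $\tau(s) = 3(p - 2)/(5a^2)$ and $b = 1.45(p - 2)^2/a^2$.

```python
import numpy as np

rng = np.random.default_rng(1)

def entropy_loss_mean(a, theta):
    """Loss (1.3) for estimating theta^{-1} by a, summed over coordinates."""
    r = a * theta
    return np.sum(r - np.log(r) - 1.0, axis=-1)

def delta_star(x, alpha, mu=0.0):
    """Shrinkage estimator (3.5) with tau = 3(p-2)/(5 a^2) and b = 1.45 (p-2)^2 / a^2."""
    p = alpha.size
    a = alpha.max()
    tau = 3.0 * (p - 2) / (5.0 * a ** 2)
    b = 1.45 * (p - 2) ** 2 / a ** 2
    resid = np.log(x) - mu
    s = np.sum(resid ** 2, axis=-1, keepdims=True)
    return x / alpha - x * tau * resid / (b + s)

p, n_rep = 5, 200000
alpha = np.full(p, 4.0)
theta = np.full(p, 1.0)
x = rng.gamma(shape=alpha, scale=1.0 / theta, size=(n_rep, p))

loss0 = entropy_loss_mean(x / alpha, theta)             # best invariant estimator delta^0
loss1 = entropy_loss_mean(delta_star(x, alpha), theta)  # shrinkage estimator (3.5)
r0, r1 = loss0.mean(), loss1.mean()
print(f"risk(delta0) = {r0:.4f}  risk(delta*) = {r1:.4f}  improvement = {100 * (r0 - r1) / r0:.2f}%")
```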
The class of estimators given in (3.5) shrinks (or expands) the best invariant estimator towards an arbitrary preassigned point. A recommended way of finding an improved estimator is to choose the point $\mu$ towards which the estimator shrinks or expands adaptively. The following theorem gives the class of adaptive estimators and emphasizes the role of the geometric mean in shrinking. This is also evidenced in Ghosh, Hwang and Tsui (1984) in the estimation of gamma parameters under quadratic loss (see specifically their Example 2).

Theorem 3.2. Suppose
$$S = \sum_{i=1}^{p}\left(\log X_i - p^{-1}\sum_{j=1}^{p}\log X_j\right)^2.$$
Consider an estimator $\delta^a(X) = (\delta_1^a(X), \ldots, \delta_p^a(X))$ given componentwise as
$$\delta_i^a(X) = \frac{X_i}{\alpha_i} - \frac{X_i\,\tau(S)}{b + S}\left(\log X_i - p^{-1}\sum_{j=1}^{p}\log X_j\right), \qquad i = 1, \ldots, p, \tag{3.6}$$
with $b > 36(p - 3)^2/25a^2$, where $\tau(s)$ is a function satisfying:
(i) $0 < \tau(s) < 6(p - 3)/5a^2$ with $a = \max(\alpha_1, \ldots, \alpha_p)$,
(ii) $\tau(s)$ is nondecreasing in $s$, and
(iii) $E\tau'(S) < \infty$.
Then $\delta^a(X)$ dominates $\delta^0(X)$ for $p \ge 4$, in terms of risk.
The proof is omitted because of its similarity to that of Theorem 3.1. The following theorem gives a trimmed shrinkage estimator, which is very useful when some of the $\theta_i$'s could possibly arise from a flat-tailed prior (see Stein (1981) or Ghosh and Dey (1984)).

Theorem 3.3. Let $b_i = (\log X_i - \mu_i)^2$, where the $\mu_i$'s are certain specified constants, and let $b_{(1)} < \cdots < b_{(p)}$ denote the ordered $b_i$'s. Suppose
$$S = \sum_{i:\, b_i \le b_{(l)}} b_i + (p - l)\,b_{(l)},$$
where $0 < l < p$. Assume that $\tau(s)$ is a function satisfying:
(i) $0 < \tau(s) < 2(l - 2)$,
(ii) $\tau(s)$ is nondecreasing in $s$, and
(iii) $E\tau'(S) < \infty$.
Then the trimmed estimator $\delta^T(X)$, defined componentwise as
$$\delta_i^T(X) = \begin{cases}\dfrac{X_i}{\alpha_i} - \dfrac{X_i\,\tau(S)}{S}(\log X_i - \mu_i) & \text{if } b_i \le b_{(l)},\\[2mm] \dfrac{X_i}{\alpha_i} - \dfrac{X_i\,\tau(S)}{S}\,b_{(l)}^{1/2}\,\mathrm{sgn}(\log X_i - \mu_i) & \text{if } b_i > b_{(l)},\end{cases} \tag{3.7}$$
where $\mathrm{sgn}\,u = 1$, $0$ or $-1$ according as $u >$, $=$ or $< 0$, dominates $\delta^0(X)$ in terms of risk.
The proof is again omitted because of its similarity to that of Theorem 3.1. The integer $l$ should be chosen to be some appropriate fraction of $p$, say $l = [ap]$, where $0 < a < 1$ and $[y]$ denotes the smallest integer greater than or equal to $y$. One possible way is to choose $l$ adaptively; that is, let the data choose the trimming point which maximizes the estimated expected improvement. This can be achieved by the Monte Carlo simulation method and will give rise to an adaptive trimmed estimator of the mean vector. For further reference, see Ghosh and Dey (1984).
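A minimal sketch of the trimmed estimator (3.7), as reconstructed above, with the trimming integer $l$ fixed in advance (the helper name `delta_trimmed` and the toy constants are illustrative, not from the paper):

```python
import numpy as np

def delta_trimmed(x, alpha, mu, l, tau):
    """Trimmed shrinkage estimator (3.7) of theta^{-1}.

    x, alpha, mu are length-p arrays, 0 < l < p is the trimming integer and
    tau is a constant chosen to satisfy condition (i) of Theorem 3.3.
    """
    resid = np.log(x) - mu                 # log-scale residuals
    b = resid ** 2
    b_l = np.sort(b)[l - 1]                # b_(l), the l-th smallest squared residual
    s = np.sum(np.minimum(b, b_l))         # S = sum_{b_i <= b_(l)} b_i + (p - l) b_(l)
    shrink = np.where(b <= b_l, resid, np.sqrt(b_l) * np.sign(resid))
    return x / alpha - x * tau * shrink / s

rng = np.random.default_rng(2)
alpha = np.full(8, 5.0)
x = rng.gamma(shape=alpha, scale=1.0)
print(delta_trimmed(x, alpha, mu=0.0, l=6, tau=1.0))
```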
Next, we consider estimation of $\theta$ and aim at improving on the best scale invariant estimator
$$\delta^+(X) = \left(\frac{\alpha_1 - 1}{X_1}, \ldots, \frac{\alpha_p - 1}{X_p}\right).$$
Consider the competing estimator given by $\delta_\phi^+(X) = \delta^+(X) + \phi(X)$ with $\phi(X) = (\phi_1(X), \ldots, \phi_p(X))$. Then, using Lemma 2 of Berger (1980) and assuming that $\phi$ satisfies the needed regularity conditions, the risk difference is given by
$$R(\delta_\phi^+, \theta) - R(\delta^+, \theta) = E_\theta\Delta^+(X), \tag{3.8}$$
where
$$\Delta^+(x) = \sum_{i=1}^{p}\left[g_i(x) - \log\left(1 + \frac{\phi_i(x)\,x_i}{\alpha_i - 1}\right)\right], \tag{3.9}$$
with $g_i^{(1)}(x) = \partial g_i(x)/\partial x_i$ and $g_i$ related to $\phi_i$ through $\phi_i(x) = g_i^{(1)}(x) + (\alpha_i - 1)g_i(x)/x_i$, $i = 1, \ldots, p$. The following theorem gives an improved estimator of $\theta$.
Theorem 3.4. Consider the estimator $\delta_c^+(X) = \delta^+(X) + \phi(X)$, $\phi(X) = (\phi_1(X), \ldots, \phi_p(X))$, with
$$\phi_i(X) = \frac{c(\alpha_i - 1)}{X_iD}\left[(\alpha_i - 1)\log X_i + 1 - \frac{2\log^2 X_i}{D}\right], \qquad i = 1, \ldots, p, \tag{3.10}$$
where $D = b + \sum_{j=1}^{p}\log^2 X_j$, and $b$ and $c$ are constants such that
(i) $0 < c < 3(p - 2)/5\{(a - 1)^2 + 4\}$, and
(ii) $c(a - 1)/2\sqrt{b} + 3c/b < \tfrac{1}{2}$ for fixed $c$,
where $a = \max_{1\le i\le p}\alpha_i$. Then $\delta_c^+(X)$ dominates $\delta^+(X)$, in terms of risk, for $p \ge 3$.

Proof.
The proof follows by observing that
$$g_i(x) = \frac{c(\alpha_i - 1)\log x_i}{b + \sum_{j=1}^{p}\log^2 x_j}, \qquad i = 1, \ldots, p,$$
is a solution to $\Delta^+(x) \le 0$.  □
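A minimal sketch of the estimator of Theorem 3.4 (names and the particular choices of $b$ and $c$ are illustrative; they are only meant to satisfy conditions (i)-(ii) above):

```python
import numpy as np

def delta_c_plus(x, alpha, b, c):
    """Estimator of Theorem 3.4: delta^+(X) plus the correction phi(X) of (3.10)."""
    logx = np.log(x)
    D = b + np.sum(logx ** 2)
    phi = c * (alpha - 1.0) / (x * D) * ((alpha - 1.0) * logx + 1.0 - 2.0 * logx ** 2 / D)
    return (alpha - 1.0) / x + phi

rng = np.random.default_rng(3)
alpha = np.full(6, 5.0)
x = rng.gamma(shape=alpha, scale=1.0)

p, a = alpha.size, alpha.max()
c = 0.5 * 3 * (p - 2) / (5 * ((a - 1.0) ** 2 + 4.0))   # half the upper bound in condition (i)
b = 100.0                                              # large enough so condition (ii) holds here
print(delta_c_plus(x, alpha, b, c))
```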
In the remainder of this section, we develop certain generalized Bayes estimators of $(\theta_1^{b_1}, \ldots, \theta_p^{b_p})$ with respect to the (possibly improper) prior with pdf $\prod_{i=1}^{p}\theta_i^{u_i}$, and study their admissibility. First note that, assuming the loss
$$L(\theta, a) = \sum_{i=1}^{p}\theta_i^{m_i}\left\{a_i\theta_i^{-b_i} - \log(a_i\theta_i^{-b_i}) - 1\right\}, \tag{3.11}$$
with $m_i \ne 0$ for all $i = 1, \ldots, p$, the generalized Bayes estimator of $(\theta_1^{b_1}, \ldots, \theta_p^{b_p})$ with respect to the aforesaid prior is given by $\hat\delta(x) = (\hat\delta_1(x_1), \ldots, \hat\delta_p(x_p))$, where
$$\hat\delta_i(x_i) = c_i x_i^{-b_i}, \qquad c_i = \Gamma(\alpha_i + u_i + m_i + 1)/\Gamma(\alpha_i + u_i + m_i - b_i + 1), \qquad i = 1, \ldots, p. \tag{3.12}$$
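The constants $c_i$ and the generalized Bayes estimator (3.12) are straightforward to compute; the sketch below (illustrative only, using the $c_i$ formula as reconstructed above) also reproduces the two special cases noted at the end of this section, namely $c_i = \alpha_i^{-1}$ and $c_i = \alpha_i - 1$.

```python
from math import gamma

def c_i(alpha_i, u_i, m_i, b_i):
    """Constant in (3.12): Gamma(alpha+u+m+1) / Gamma(alpha+u+m-b+1)."""
    return gamma(alpha_i + u_i + m_i + 1.0) / gamma(alpha_i + u_i + m_i - b_i + 1.0)

def gen_bayes(x_i, alpha_i, u_i, m_i, b_i):
    """Generalized Bayes estimator (3.12) of theta_i^{b_i}: c_i * x_i^{-b_i}."""
    return c_i(alpha_i, u_i, m_i, b_i) * x_i ** (-b_i)

# b_i = -1, u_i = -1, m_i = 0: c_i = 1/alpha_i, i.e. X_i/alpha_i as an estimator of theta_i^{-1}
print(c_i(4.0, -1.0, 0.0, -1.0), 1.0 / 4.0)
# b_i = 1, u_i = -1, m_i = 0: c_i = alpha_i - 1, i.e. (alpha_i - 1)/X_i as an estimator of theta_i
print(c_i(4.0, -1.0, 0.0, 1.0), 4.0 - 1.0)
```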
It is assumed that the constants $\alpha_i$, $u_i$, $m_i$ and $b_i$ are such that
(i) $\alpha_i > m_i/p$,
(ii) $\alpha_i + \tfrac{1}{2}m_i - m_i/2p > b_i$,
(iii) $\alpha_i + u_i + m_i - |b_i| + 1 > 0$, and
(iv) $c_i\Gamma(\tfrac{1}{2}m_i - m_i/2p + \alpha_i - b_i) < \Gamma(\tfrac{1}{2}m_i - m_i/2p + \alpha_i)$
hold for all $i = 1, \ldots, p$. For example, when $m_1 = \cdots = m_p = 1$ and $\alpha_1 = \cdots = \alpha_p = \alpha_0 > 1$, conditions (i)-(iv) hold when $u_i = -\tfrac{1}{2}p - 1$ and $\max_{1\le i\le p}b_i < 1$. The following theorem gives an improved estimator of $(\theta_1^{b_1}, \ldots, \theta_p^{b_p})$.

Theorem 3.5. Consider the estimator $\delta(X) = (\delta_1(X), \ldots, \delta_p(X))$ with $\delta_i(x) = \hat\delta_i(x_i)(1 + \phi_i(x))$, where
$$\phi_i(x) = d\,x_i^{m_i/2}\prod_{j=1}^{p}x_j^{-m_j/2p}, \qquad i = 1, \ldots, p, \tag{3.13}$$
and $d$ is a positive constant. Then $\delta(X)$ dominates $\hat\delta(X)$ in terms of risk.

Proof. The risk difference is clearly given as
$$\Delta = R(\theta, \delta) - R(\theta, \hat\delta) = E\left[\sum_{i=1}^{p}\theta_i^{m_i - b_i}c_iX_i^{-b_i}\phi_i(X) - \sum_{i=1}^{p}\theta_i^{m_i}\log(1 + \phi_i(X))\right]$$
$$\le E\left[\sum_{i=1}^{p}c_i\theta_i^{m_i - b_i}X_i^{-b_i}\phi_i(X) - \sum_{i=1}^{p}\theta_i^{m_i}\phi_i(X) + \tfrac{1}{2}\sum_{i=1}^{p}\theta_i^{m_i}\phi_i^2(X)\right]. \tag{3.14}$$
Direct calculation on the right-hand side of (3.14) then yields
$$\Delta \le dq\left(\prod_{j=1}^{p}\theta_j^{m_j/2p}\right)\sum_{i=1}^{p}\theta_i^{m_i/2}\,\Gamma^{-1}(\alpha_i - m_i/2p)\left\{c_i\Gamma(m_i/2 - m_i/2p + \alpha_i - b_i) - \Gamma(m_i/2 - m_i/2p + \alpha_i)\right\}$$
$$\quad + d^2q_0\left(\prod_{j=1}^{p}\theta_j^{m_j/p}\right)\sum_{i=1}^{p}\Gamma(\alpha_i + m_i - m_i/p)/\Gamma(\alpha_i - m_i/p)$$
$$\le -dq\varepsilon\left(\sum_{i=1}^{p}\theta_i^{m_i/2}\right)\left(\prod_{j=1}^{p}\theta_j^{m_j/2p}\right) + d^2q_0v\left(\prod_{j=1}^{p}\theta_j^{m_j/p}\right), \tag{3.15}$$
where
$$q = \prod_{j=1}^{p}\Gamma(\alpha_j - m_j/2p)/\Gamma(\alpha_j), \qquad q_0 = \prod_{j=1}^{p}\Gamma(\alpha_j - m_j/p)/\Gamma(\alpha_j),$$
$$\varepsilon = \min_{1\le i\le p}\left[\Gamma(m_i/2 - m_i/2p + \alpha_i) - c_i\Gamma(m_i/2 - m_i/2p + \alpha_i - b_i)\right], \qquad v = \sum_{i=1}^{p}\Gamma(\alpha_i + m_i - m_i/p)/\Gamma(\alpha_i - m_i/p).$$
Using the arithmetic-geometric mean inequality, it follows from (3.15) that
$$\Delta \le \left(\prod_{j=1}^{p}\theta_j^{m_j/2p}\right)(-dq\varepsilon + d^2q_0v)\sum_{i=1}^{p}\theta_i^{m_i/2}. \tag{3.16}$$
Now choose $d > 0$ such that $d < (q\varepsilon)/(q_0v)$. Then it follows from (3.16) that $\Delta < 0$, which completes the proof of the theorem.  □

The above technique of proof is borrowed from Das Gupta (1986). Note that the method involves a direct evaluation of the risk rather than setting up a differential inequality and then solving it. It should be noted, however, that the inadmissibility of $(X_1/\alpha_1, \ldots, X_p/\alpha_p)$ for estimating $(\theta_1^{-1}, \ldots, \theta_p^{-1})$ under the loss (1.3), or the inadmissibility of $((\alpha_1 - 1)/X_1, \ldots, (\alpha_p - 1)/X_p)$ for estimating $(\theta_1, \ldots, \theta_p)$
under the loss (1.2), does not follow from this general result. The reason is that, for estimating $(\theta_1^{-1}, \ldots, \theta_p^{-1})$ under the loss (1.3), the best invariant unbiased estimator $(X_1/\alpha_1, \ldots, X_p/\alpha_p)$ is generalized Bayes with respect to the prior with pdf $\prod_{i=1}^{p}\theta_i^{-1}$. Thus $b_1 = \cdots = b_p = -1$, $u_1 = \cdots = u_p = -1$, $m_1 = \cdots = m_p = 0$. This leads to $c_i = \alpha_i^{-1}$ for all $i = 1, \ldots, p$. Accordingly, condition (iv) does not hold, since both its left-hand and right-hand sides equal $\Gamma(\alpha_i)$. Similarly, for estimating $(\theta_1, \ldots, \theta_p)$ under the loss (1.2), the best invariant unbiased estimator $((\alpha_1 - 1)/X_1, \ldots, (\alpha_p - 1)/X_p)$ is generalized Bayes with respect to the prior $\prod_{i=1}^{p}\theta_i^{-1}$. Here $u_i = -1$, $m_i = 0$, $b_i = 1$ for all $i = 1, \ldots, p$, leading to $c_i = \alpha_i - 1$ for all $i = 1, \ldots, p$. One can notice the analogy of these two cases with those of Das Gupta (1986), where his method cannot be used to prove the inadmissibility of the best invariant unbiased estimators of $(\theta_1^{-1}, \ldots, \theta_p^{-1})$ or of $(\theta_1, \ldots, \theta_p)$.
References

Berger, J. (1980). Improving on inadmissible estimators in continuous exponential families with applications to simultaneous estimation of gamma scale parameters. Ann. Statist. 8, 545-571.
Brown, L.D. (1966). On the admissibility of invariant estimators of one or more location parameters. Ann. Math. Statist. 37, 1087-1136.
Brown, L.D. (1968). Inadmissibility of the usual estimators of scale parameters in problems with unknown location and scale parameters. Ann. Math. Statist. 39, 29-48.
Brown, L.D. and M. Fox (1974). Admissibility of procedures in two-dimensional location parameter problems. Ann. Statist. 2, 248-266.
Das Gupta, A. (1984). Admissibility in the gamma distribution: Two examples. Sankhya Ser. A 46, 395-407.
Das Gupta, A. (1986). Simultaneous estimation in the multiparameter gamma distribution under weighted quadratic losses. Ann. Statist. 14, 206-219.
Ferguson, T.S. (1967). Mathematical Statistics: A Decision Theoretic Approach. Academic Press, New York.
Ghosh, M. and A. Parsian (1980). Admissible and minimax multiparameter estimation in exponential families. J. Multivariate Anal. 10, 551-564.
Ghosh, M., J.T. Hwang and K.W. Tsui (1984). Construction of improved estimators in multiparameter estimation for continuous exponential families. J. Multivariate Anal. 14, 212-220.
Ghosh, M. and D.K. Dey (1984). Trimmed estimates in simultaneous estimation of parameters in exponential families. J. Multivariate Anal. 15, 183-200.
Haff, L.R. (1980). Empirical Bayes estimation of the multivariate normal covariance matrix. Ann. Statist. 8, 586-597.
Hudson, H.M. (1978). A natural identity for exponential families with applications in multiparameter estimation. Ann. Statist. 6, 473-484.
Hwang, J.T. (1982). Improving upon standard estimators in discrete exponential families with applications to Poisson and negative binomial cases. Ann. Statist. 10, 857-867.
James, W. and C. Stein (1961). Estimation with quadratic loss. Proc. Fourth Berkeley Symp. Math. Statist. Probab., Vol. 1, 361-379. University of California Press.
Stein, C. (1959). Admissibility of Pitman's estimator of a single location parameter. Ann. Math. Statist. 30, 970-979.
Tsui, K.W. (1979). Multiparameter estimation of discrete exponential distributions. Canad. J. Statist. 7, 193-200.