A note on universal admissibility of scale parameter estimators


STATISTICS & PROBABILITY LETTERS ELSEVIER

Statistics & Probability Letters 38 (1998) 59-67

Debashis Kushary

Department of Mathematical Sciences, Rutgers University, Camden, NJ 08102, USA

Received 1 January 1997; revised 1 July 1997

Abstract

The notion of universal admissibility of estimators was introduced and developed by Hwang (1985) and Brown and Hwang (1989). In several models, commonly used estimators of scale parameters are shown to be inadmissible under specified loss functions. Here we focus on the scale and location-scale invariant estimation of the scale parameter under the universal admissibility criterion. For a one-parameter gamma distribution, we characterize the class of universally admissible estimators. For the two-parameter normal and exponential models, we derive the condition for universal inadmissibility of the estimators of the scale parameter.

© 1998 Elsevier Science B.V. All rights reserved

AMS Classification: primary 62F10; secondary 62C15

Keywords: Normal distribution; Gamma distribution; Exponential distribution; Universal admissibility

1. Introduction and summary

If an estimator $\delta_2$ dominates another estimator $\delta_1$ under a specified loss function $L_1$, it always raises the question of how well $\delta_2$ performs against $\delta_1$ under a different loss function $L_2$. In other words, is the superiority of $\delta_2$ over $\delta_1$ robust with respect to the loss function? To answer this question, Hwang (1985) introduced the concepts of universal domination and stochastic domination. Then Brown and Hwang (1989) developed the concept further in the context of estimating a multivariate normal mean. Similar criteria were dealt with in a less formal manner by Brown (1968), Rukhin (1987) and Cohen and Sackrowitz (1970). Recently, Cohen and Kushary (1998) have proven universal admissibility of the maximum likelihood estimator under constrained spaces for several different models.

All the above references considered estimation of means or, in general, estimation of location parameters. In this paper, we consider estimation of a scale parameter under similar conditions. Suppose $X = (X_1, X_2, \ldots, X_n)$ is a random sample of observations from a distribution with scale parameter $\sigma$ and $\delta_1(X)$ is an estimator of $\sigma$. Most researchers assume the loss function $L_0(\delta, \sigma) = (\delta - \sigma)^2/\sigma^2$ (or the squared error loss function). If the estimator $\delta_1$ is dominated by another estimator $\delta_2$ under $L_0$, we ask the question, does it remain inferior to $\delta_2$ for the class of loss functions which consists of all the non-decreasing functions of


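$|(\delta - \sigma)/\sigma|$. In other words, is it true that there exists a $\delta_2$ such that

$$E_\sigma\, L\!\left(\left|\frac{\delta_2(X) - \sigma}{\sigma}\right|\right) \le E_\sigma\, L\!\left(\left|\frac{\delta_1(X) - \sigma}{\sigma}\right|\right) \tag{1.1}$$

for all nondecreasing functions $L(\cdot)$ and $\forall \sigma > 0$. If such a $\delta_2$ exists, then the estimator $\delta_1$ will be called universally inadmissible (u-inadmissible). While estimating location parameters, Hwang (1985) referred to this notion as universal domination. He also defined the concept of stochastic domination and proved its equivalence to universal domination. In the present context, stochastic domination, which is equivalent to (1.1), can be stated as

$$P_\sigma\!\left(\left|\frac{\delta_2(X)}{\sigma} - 1\right| \le c\right) \ge P_\sigma\!\left(\left|\frac{\delta_1(X)}{\sigma} - 1\right| \le c\right) \quad \forall c > 0,\ \sigma > 0. \tag{1.2}$$

For the proof of the equivalence of (1.1) and (1.2) the reader is referred to Hwang (1985). We exploit (1.2) to derive the u-admissibility property of estimators. In this paper we consider the following models:

I. Let $X = (X_1, X_2, \ldots, X_n)$ be a random sample from a gamma distribution with density

$$f_G(x|\alpha, \lambda) = \frac{1}{\lambda^{\alpha}\Gamma(\alpha)}\, x^{\alpha - 1} e^{-x/\lambda} \quad \text{for } x > 0, \tag{1.3}$$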

where $\alpha$ is a known positive shape parameter and $\lambda$ is the unknown positive scale parameter. Let $Y = \sum X_i$, which is the sufficient statistic for $\lambda$. Consider the class of estimators, $\mathcal{C}_G$, which are constant multiples of $Y$, i.e. $\delta(X) = aY$ where $a$ is a positive constant. More precisely, the class of estimators is defined as $\mathcal{C}_G = \{\delta_a(X): \delta_a(X) = aY \text{ for some } a > 0\}$. Under the loss function $L_0$, it can be easily shown (see Berger, 1985, p. 255) that the only admissible estimator within the class $\mathcal{C}_G$ is $\delta_{a_0}$ where $a_0 = 1/(n\alpha + 1)$. In Section 2, we consider the same class and derive necessary and sufficient conditions for u-admissibility of estimators within this class.

II. Let $X = (X_1, X_2, \ldots, X_n)$ be a random sample from a normal distribution with unknown mean $\mu$ and unknown variance $\sigma^2$. Let $\bar{X} = \sum X_i/n$, $s^2 = \sum (X_i - \bar{X})^2$ and $Z = (\sqrt{n}\,\bar{X}/s)^2$. We consider two classes of estimators for $\sigma^2$, namely the location-scale invariant class and the scale invariant class. Properties of the gamma distribution are used to derive the necessary and sufficient conditions for u-admissibility of estimators within the location-scale invariant class. Scale invariant estimation of a normal variance has been studied extensively in the context of point and interval estimation. Maatta and Casella (1990) present a wonderful review of the developments of variance estimation in a decision-theoretic framework. In the case of point estimation of the variance, Stein (1964) first showed that the "usual" estimator of a normal variance is inadmissible in the scale invariant class. Brown (1968) and Brewster and Zidek (1974) improved on Stein's result, but they considered a more general set-up which includes a larger class of loss functions than scaled squared error loss. In the context of interval estimation, Cohen (1972) first showed that the "usual" confidence interval can be improved upon in terms of probability of coverage while keeping the same length. Later Shorrock (1990) and Goutis (1989) obtained better confidence intervals than that of Cohen (1972). In Section 3, we look at the universal admissibility of the scale invariant estimators (see Maatta and Casella, 1990 for the definition). We show that the unbiased estimator of the normal variance is universally inadmissible.

III. Let $X = (X_1, X_2, \ldots, X_n)$ be a random sample from an exponential distribution with unknown location parameter $\mu$ and unknown scale parameter $\sigma$. As in the normal case we look at two classes of estimators, the location-scale invariant class and the scale invariant class. Arnold (1970) first proved the inadmissibility of the usual scale estimator for the exponential distribution, which is similar to Stein's (1964) result for the normal case. Zidek (1973) considered the estimation of the exponential scale under strictly bowl-shaped loss functions and presented some useful and strong results. Brewster


(1972) provided improvement over the usual estimator by using smooth estimators. In Section 4, we prove universal admissibility results for the estimation of $\sigma$ which are similar to those of the normal case. In this paper we focus on estimation without a specified loss function. We identify some commonly used estimators which are universally inadmissible.
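The scaled squared error benchmark quoted for model I can be checked with exact arithmetic. The sketch below assumes only the standard moments of a gamma variable, $E[Y/\lambda] = k$ and $E[(Y/\lambda)^2] = k(k+1)$ with $k = n\alpha$; the function names and the value $k = 6$ are illustrative choices, not taken from the paper.

```python
from fractions import Fraction

def scaled_sq_risk(a, k):
    """Risk E[(a*Y/lam - 1)^2] when Y/lam ~ Gamma(shape=k, scale=1).

    Uses E[Y/lam] = k and E[(Y/lam)^2] = k*(k + 1), so the risk
    does not depend on lam.
    """
    a = Fraction(a)
    k = Fraction(k)
    return a * a * k * (k + 1) - 2 * a * k + 1

def best_multiplier(k):
    """Minimizer of the quadratic risk in a, namely a0 = 1/(k + 1)."""
    return 1 / (Fraction(k) + 1)

k = 6  # hypothetical value of n*alpha, e.g. n = 3 and alpha = 2
a0 = best_multiplier(k)
print(a0, scaled_sq_risk(a0, k))                 # multiplier and its risk
print(scaled_sq_risk(Fraction(1, k), k))         # risk of the unbiased choice a = 1/k
```

Because the risk is a convex quadratic in $a$, any other multiplier, including the unbiased choice $a = 1/(n\alpha)$, has strictly larger risk under $L_0$.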

2. Gamma distribution

Let $X = (X_1, X_2, \ldots, X_n)$ be a random sample from a gamma distribution with density given in (1.3). [In some places, for convenience, we suppress the parameter values while writing $f_G$ and $F_G$ (the corresponding distribution function).] Since $Y = \sum X_i$ is the sufficient statistic, which also follows a gamma distribution with density $f_G(y|n\alpha, \lambda)$, we consider estimating $\lambda$ by an estimator $\delta_a(X)$ in $\mathcal{C}_G$. It is well known (see Berger, 1985) that, under scaled squared error loss, the value of $a$ which minimizes the risk of $\delta_a$ is $(n\alpha + 1)^{-1}$. In other words, within the class $\mathcal{C}_G$, $\delta_a$ is admissible if and only if $a = (n\alpha + 1)^{-1}$. Here we focus on the same class of estimators and provide the range of $a$ for which $\delta_a$ is u-admissible within the class $\mathcal{C}_G$. Before we state and prove the result in Theorem 2.3, we need the following technical lemmas.

Lemma 2.1. If $f_G(x|\alpha, \lambda)$ is the density function of a gamma distribution as defined in (1.3) and if $\alpha > 1$, then

$$f_G(x + y) \ge f_G(x - y) \quad \forall\, 0 < x \le (\alpha - 1)\lambda \text{ and } y \ge 0.$$

Proof. The above inequality is trivially true for $y \ge x$ since gamma is a positive random variable. Now, suppose $0 \le y < x$ and define

$$r_x(y) = \frac{f_G(x + y)}{f_G(x - y)}.$$

To complete the proof it is enough to show that for any fixed $x$, $r_x(y)$ is an increasing function of $y$ for all $y \in [0, x)$. Using the expression of $f_G$ from (1.3), $r_x(y)$ can be written as

$$r_x(y) = e^{-2y/\lambda} \left(\frac{x + y}{x - y}\right)^{\alpha - 1}.$$

Hence,

$$\frac{d r_x(y)}{dy} = -\frac{2}{\lambda}\, e^{-2y/\lambda} \left(\frac{x + y}{x - y}\right)^{\alpha - 1} + e^{-2y/\lambda} (\alpha - 1) \left(\frac{x + y}{x - y}\right)^{\alpha - 2} \frac{(x - y) + (x + y)}{(x - y)^2}$$

$$= \frac{2}{\lambda}\, e^{-2y/\lambda}\, \frac{(x + y)^{\alpha - 2}}{(x - y)^{\alpha}} \left\{ y^2 + x\big(\lambda(\alpha - 1) - x\big) \right\}.$$

Observing that $0 < x \le (\alpha - 1)\lambda$, the expression inside the braces is non-negative, so $r_x(y)$ is increasing on $[0, x)$ and $r_x(y) \ge r_x(0) = 1$.
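The inequality in Lemma 2.1 can be spot-checked numerically on a grid. The following is a minimal sketch, assuming the lemma's range is $0 < x \le (\alpha - 1)\lambda$ (the mode of the density) and $y \ge 0$; the helper names, the grid size and the tolerance are illustrative assumptions.

```python
import math

def gamma_pdf(x, alpha, lam):
    """Gamma density as in (1.3); returns 0 for x <= 0."""
    if x <= 0:
        return 0.0
    log_f = ((alpha - 1) * math.log(x) - x / lam
             - alpha * math.log(lam) - math.lgamma(alpha))
    return math.exp(log_f)

def lemma_2_1_holds(alpha, lam, grid=50):
    """Check f(x+y) >= f(x-y) for x in (0, mode] and y in [0, 3*mode]."""
    mode = (alpha - 1) * lam
    for i in range(1, grid + 1):
        x = mode * i / grid
        for j in range(0, 3 * grid + 1):
            y = mode * j / grid
            # small tolerance guards against floating-point rounding
            if gamma_pdf(x + y, alpha, lam) < gamma_pdf(x - y, alpha, lam) - 1e-12:
                return False
    return True

print(lemma_2_1_holds(2.5, 1.3))
# Beyond the mode the inequality can fail: alpha = 2, lam = 1, x = 3, y = 1
print(gamma_pdf(4.0, 2.0, 1.0) < gamma_pdf(2.0, 2.0, 1.0))
```

The second print illustrates why the restriction on $x$ matters: to the right of the mode the density is decreasing, so reflecting around $x$ reverses the inequality.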

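Lemma 2.2. For any fixed $c > 0$,

$$D(a) = F_G\!\left(\frac{1 - c}{a}, \frac{1 + c}{a} \,\Big|\, \alpha, \lambda\right) \tag{2.1}$$

is a decreasing function of $a$ if $a > (\alpha\lambda)^{-1}$, where $F_G(a, b|\alpha, \lambda) = F_G(b|\alpha, \lambda) - F_G(a|\alpha, \lambda)$.

Proof. Since the distribution function is a continuous function of $a$, the lemma is proven by showing that the derivative of $D(a)$ with respect to $a$ is negative for $a > (\alpha\lambda)^{-1}$.

$$\frac{d\,D(a)}{da} = -\frac{1}{a^2}\left[(1 + c)\, f_G\!\left(\frac{1 + c}{a} \,\Big|\, \alpha, \lambda\right) - (1 - c)\, f_G\!\left(\frac{1 - c}{a} \,\Big|\, \alpha, \lambda\right)\right].$$

Using Lemma 2.1 (applied to the density $f_G(\cdot|\alpha + 1, \lambda)$, which is proportional to $x f_G(x|\alpha, \lambda)$, with $x = 1/a$ and $y = c/a$) it is easily seen that the quantity inside the bracket is positive for $a > (\alpha\lambda)^{-1}$, hence the derivative is negative and it completes the proof.

Theorem 2.3.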

Within the class $\mathcal{C}_G$, $\delta_a(X)$ is u-admissible if and only if $0 < a \le (n\alpha)^{-1}$.

Proof. "If" part. Let $\delta_{a_1} = a_1 Y$ for some $0 < a_1 \le (n\alpha)^{-1}$. To prove the u-admissibility of $\delta_{a_1}$ within $\mathcal{C}_G$, we show that given any other $\delta_{a_2} \in \mathcal{C}_G$, $\exists$ a $c > 0$ for which (1.2) is violated. Suppose $\delta_{a_2} = a_2 Y$ dominates $\delta_{a_1}$ universally, i.e.

$$P\!\left[\left|\frac{\delta_{a_2}}{\lambda} - 1\right| \le c\right] \ge P\!\left[\left|\frac{\delta_{a_1}}{\lambda} - 1\right| \le c\right] \quad \forall c > 0,\ \lambda > 0.$$

This inequality can be rewritten as

$$F_G\!\left(\frac{1 - c}{a_1}, \frac{1 + c}{a_1} \,\Big|\, n\alpha, 1\right) \le F_G\!\left(\frac{1 - c}{a_2}, \frac{1 + c}{a_2} \,\Big|\, n\alpha, 1\right). \tag{2.2}$$

Now, suppose $0 < a_1 < a_2$. Choosing $c > 1$ we see that (2.2) fails to hold as gamma is a non-negative random variable. If $0 < a_2 < a_1$, define

$$R(c) = \frac{F_G((1 + c)/a_1) - F_G((1 - c)/a_1)}{F_G((1 + c)/a_2) - F_G((1 - c)/a_2)}. \tag{2.3}$$

We claim that for sufficiently small values of $c$, $R(c)$ exceeds 1, which contradicts (2.2). To see this fact we show that $\lim_{c \to 0} R(c) > 1$. Then $\exists$ an $r > 0$ such that $\forall c \in (0, r)$ (2.2) will be violated, and that will conclude this part. We note that the numerator and denominator of (2.3) are continuous functions of $c$ but both become 0 when $c = 0$. Hence, to find the limit of $R$ as $c \to 0$ we differentiate both of them with respect to $c$ and then put in $c = 0$. We see that

$$\lim_{c \to 0} R(c) = \frac{\left[(1/a_1) f_G((1 + c)/a_1 | n\alpha, 1) + (1/a_1) f_G((1 - c)/a_1 | n\alpha, 1)\right]_{c = 0}}{\left[(1/a_2) f_G((1 + c)/a_2 | n\alpha, 1) + (1/a_2) f_G((1 - c)/a_2 | n\alpha, 1)\right]_{c = 0}} = \frac{f_G(1/a_1 | n\alpha + 1, 1)}{f_G(1/a_2 | n\alpha + 1, 1)}.$$


Notice that the above gamma distribution with density $f_G(\cdot\,|n\alpha + 1, 1)$ has a unique mode at $n\alpha$ and $n\alpha \le 1/a_1 < 1/a_2$. Then $f_G(1/a_1 | n\alpha + 1, 1) > f_G(1/a_2 | n\alpha + 1, 1)$ and hence $\lim_{c \to 0} R(c) > 1$, which completes this part.

"Only if" part. To prove this part, we are going to show that if $a > (n\alpha)^{-1}$ then $\delta_a(\cdot)$ is u-inadmissible. This is proven by virtue of the fact that $D(a) = P(|\delta_a/\lambda - 1| < c)$ is a decreasing function of $a$ for $a > (n\alpha)^{-1}$ (Lemma 2.2). In other words, it means that for any $a_2 > (n\alpha)^{-1}$, $\delta_{a_2}$ is universally dominated by $\delta_{a_1}$ where $(n\alpha)^{-1} \le a_1 < a_2$.
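Theorem 2.3 can be illustrated numerically when $k = n\alpha$ is a positive integer, because the gamma distribution function then has the closed Erlang form $1 - e^{-x}\sum_{i<k} x^i/i!$. The sketch below is only an illustration under that assumption; the value $k = 4$, the grid of $c$ values and the function names are hypothetical choices.

```python
import math

def erlang_cdf(x, k):
    """CDF of Gamma(shape=k, scale=1) for a positive integer k."""
    if x <= 0:
        return 0.0
    term, total = 1.0, 1.0            # i = 0 term of sum_{i<k} x^i / i!
    for i in range(1, k):
        term *= x / i
        total += term
    return 1.0 - math.exp(-x) * total

def coverage(a, c, k):
    """D(a) = P(|a*Y/lam - 1| <= c) with Y/lam ~ Gamma(k, 1), k = n*alpha."""
    return erlang_cdf((1 + c) / a, k) - erlang_cdf((1 - c) / a, k)

k = 4                                  # hypothetical n*alpha
cs = [0.05 * j for j in range(1, 41)]  # grid of c values in (0, 2]

# a = 1/3 > 1/k: dominated for every c on the grid by a = 1/k
dominated = all(coverage(1 / 4, c, k) >= coverage(1 / 3, c, k) - 1e-12 for c in cs)

# a = 1/5 and a = 1/4 both lie in (0, 1/k]: each wins for some c
quarter_wins = any(coverage(1 / 4, c, k) > coverage(1 / 5, c, k) for c in cs)
fifth_wins = any(coverage(1 / 5, c, k) > coverage(1 / 4, c, k) for c in cs)
print(dominated, quarter_wins, fifth_wins)
```

A finite grid cannot prove domination, but it shows the two regimes of the theorem: multipliers above $(n\alpha)^{-1}$ lose coverage for every $c$ tried, while two multipliers inside $(0, (n\alpha)^{-1}]$ trade places as $c$ varies.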

3. Normal distribution

Let $X = (X_1, X_2, \ldots, X_n)$ be a random sample from a normal distribution with unknown mean $\mu$ and variance $\sigma^2$. Define $\bar{X} = \sum X_i/n$, $s^2 = \sum (X_i - \bar{X})^2$ and $Z = (\sqrt{n}\,\bar{X}/s)^2$. Consider the problem of estimating $\sigma^2$. We consider two classes of estimators, namely the location-scale invariant class $\mathcal{C}_N$ and the scale invariant class $\mathcal{S}_N$, i.e.

$\mathcal{C}_N = \{\delta_a(X): \delta_a(X) = a s^2, \text{ for some constant } a > 0\}$,

$\mathcal{S}_N = \{\delta_\phi(X): \delta_\phi(X) = \phi(Z) s^2, \text{ for some positive function } \phi \text{ of } Z\}$.

Note that $\mathcal{C}_N$ is a proper subset of $\mathcal{S}_N$. We study the u-admissibility of the location-scale invariant estimators within $\mathcal{C}_N$ and find the necessary and sufficient condition on $a$ for which $\delta_a(X)$ is u-admissible. Then we study the estimators in $\mathcal{S}_N$. It turns out that some estimators which are u-admissible in $\mathcal{C}_N$ become u-inadmissible in $\mathcal{S}_N$.

It is known that $s^2/2$ follows a gamma distribution with shape parameter $(n - 1)/2$ and scale parameter $\sigma^2$. Hence we can apply Theorem 2.3 to characterize the u-admissible estimators within $\mathcal{C}_N$. The result can be stated as follows.

Theorem 3.1. A location-scale invariant estimator $\delta_a$ of the normal variance is u-admissible within the class $\mathcal{C}_N$ if and only if $0 < a \le 1/(n - 1)$.

Proof. Letting $Y = s^2/2$ and $b = 2a$, we see that $\delta_a(X) = bY$. Since $Y$ follows a gamma distribution with density $f_G(y|(n - 1)/2, \sigma^2)$, we apply Theorem 2.3 to conclude that $\delta_a(X)$ is u-admissible if and only if $0 < b \le 2/(n - 1)$, which is the same as $0 < a \le 1/(n - 1)$. It concludes the proof. □

Next, we consider the scale invariant estimators of the normal variance, i.e. the class $\mathcal{S}_N$. Standard distribution theory shows that the joint density function of $Z$ and $s^2$ is

$$f(z, s^2) = \sum_{l = 0}^{\infty} p_l(\tau)\, f_F(z|l)\, f_G(s^2|\alpha^*, \lambda^*), \tag{3.1}$$

where $f_G$ is defined in (1.3) and

$$\tau = \frac{n\mu^2}{2\sigma^2}, \qquad p_l(\tau) = \frac{e^{-\tau}\tau^l}{l!}, \qquad \alpha^* = \frac{n + 2l}{2}, \qquad \lambda^* = \frac{2}{1 + z}$$

and

$$f_F(z|l) = \frac{\Gamma((n + 2l)/2)}{\Gamma((2l + 1)/2)\,\Gamma((n - 1)/2)} \cdot \frac{z^{(2l + 1)/2 - 1}}{(1 + z)^{(2l + n)/2}}.$$


To derive the result we use the joint distribution in a similar way as done by Shorrock (1990) and Strawderman (1974). Consider an auxiliary random variable $L$ which follows a Poisson distribution with mean $\tau$. Shorrock (1990) constructed a better confidence interval by centering it in an appropriate location. But in proving the next theorem we have used (1.2) where $c$ is arbitrary, hence we could not apply the technique of Shorrock (1990). The result is as follows:

Theorem 3.2. Let $\delta(X) = \phi(Z) s^2$ be an estimator of the normal variance. If $\phi(Z) > (1 + Z)/n$ with some positive probability then $\delta$ is u-inadmissible by virtue of being dominated by $\delta^*(X) = \phi^*(Z) s^2$ where $\phi^*(Z) = \min\{\phi(Z), (1 + Z)/n\}$.

Proof. We prove that for all nonnegative values of $\tau$, $\sigma^2$ and $c$,

$$P\!\left(\left|\frac{\delta^*(X)}{\sigma^2} - 1\right| \le c\right) \ge P\!\left(\left|\frac{\delta(X)}{\sigma^2} - 1\right| \le c\right).$$

We observe that

$$P\!\left(\left|\frac{\delta}{\sigma^2} - 1\right| \le c\right) - P\!\left(\left|\frac{\delta^*}{\sigma^2} - 1\right| \le c\right) = E\!\left[F_G\!\left(\frac{1 - c}{\phi(Z)}, \frac{1 + c}{\phi(Z)} \,\Big|\, \alpha^*, \lambda^*\right) - F_G\!\left(\frac{1 - c}{\phi^*(Z)}, \frac{1 + c}{\phi^*(Z)} \,\Big|\, \alpha^*, \lambda^*\right)\right], \tag{3.2}$$

where the expectation is taken with respect to the joint distribution of $Z$ and the auxiliary random variable $L$ (defined before). The above equality can be easily checked by using (3.1) and by conditioning over $Z$ and $L$. Now it is enough to show that the quantity inside the bracket in (3.2) is non-positive for any given values of $Z = z$ and $L = l$. We notice that for any given value of $Z = z$, whenever $\phi(z) \le (1 + z)/n$ then $\phi(z) = \phi^*(z)$ and hence the quantity is zero. But if $\phi(z) > (1 + z)/n$ then for a given $L = l$, i.e. $\alpha^* = (n + 2l)/2$, $\lambda^* = 2/(1 + z)$,

$$\phi(z) > \phi^*(z) = \frac{1 + z}{n} \ge \frac{1 + z}{n + 2l} = (\alpha^*\lambda^*)^{-1}.$$

We apply Lemma 2.2 to get

$$F_G\!\left(\frac{1 - c}{\phi(z)}, \frac{1 + c}{\phi(z)} \,\Big|\, \alpha^*, \lambda^*\right) < F_G\!\left(\frac{1 - c}{\phi^*(z)}, \frac{1 + c}{\phi^*(z)} \,\Big|\, \alpha^*, \lambda^*\right),$$

which makes the quantity inside the bracket in (3.2) negative. Hence, the quantity inside the bracket is always non-positive and in some cases negative. It concludes the proof.

Remark 3.3. Using the above theorem we can see that the most commonly used unbiased estimator of the variance, $\delta(X) = s^2/(n - 1)$ (see Lehmann, 1983), is u-inadmissible in the class $\mathcal{S}_N$, but it is u-admissible in $\mathcal{C}_N$.

Remark 3.4. The above results can be extended to the estimation of $\sigma^{2m}$ for $m > 0$, which was considered by Strawderman (1974).

Remark 3.5. The above results can also be extended to the variance estimation in linear regression model as discussed in Hwang (1990).


Remark 3.6. Observe that even though $\delta^*(X)$ dominates $\delta(X)$ universally, the u-admissibility of $\delta^*(X)$ is not clear. Similar results occur for many dominating estimators in variance estimation (see Shorrock, 1990; Maatta and Casella, 1990).
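The key step in the proof of Theorem 3.2, applied to the unbiased estimator of Remark 3.3, can be checked in a special case. The sketch below assumes $\mu = 0$ (so $\tau = 0$ and $L = 0$ with probability one) and an even sample size $n$, in which case $s^2/\sigma^2$ given $Z = z$ is gamma with integer shape $n/2$ and scale $2/(1 + z)$ and its distribution function has a closed form. The value $n = 10$, the grid and all function names are illustrative assumptions.

```python
import math

def erlang_cdf(x, m):
    """CDF of Gamma(shape=m, scale=1) for a positive integer m."""
    if x <= 0:
        return 0.0
    term, total = 1.0, 1.0
    for i in range(1, m):
        term *= x / i
        total += term
    return 1.0 - math.exp(-x) * total

def phi_star(phi, z, n):
    """Truncated multiplier min{phi, (1 + z)/n} from Theorem 3.2."""
    return min(phi, (1 + z) / n)

def cond_coverage(phi, c, z, n):
    """P(|phi*s^2/sigma^2 - 1| <= c | Z = z), assuming mu = 0 and n even."""
    m = n // 2
    rate = (1 + z) / 2.0   # s^2/sigma^2 | Z = z ~ Gamma(m, scale 2/(1 + z))
    return (erlang_cdf(rate * (1 + c) / phi, m)
            - erlang_cdf(rate * (1 - c) / phi, m))

n = 10
phi = 1.0 / (n - 1)                    # unbiased estimator s^2/(n - 1)
cs = [0.05 * j for j in range(1, 41)]  # grid of c values in (0, 2]
gaps = [cond_coverage(phi_star(phi, z, n), c, z, n) - cond_coverage(phi, c, z, n)
        for z in (0.0, 0.05, 0.1)      # values of z below 1/(n - 1)
        for c in cs]
print(min(gaps) >= -1e-12, max(gaps) > 0)
```

For $z \ge 1/(n - 1)$ the two multipliers coincide, so the conditional coverages are equal there; the improvement comes entirely from small values of $Z$.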

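4. Exponential distribution

Suppose that $X_1, X_2, \ldots, X_n$ are independent and identically distributed random variables with common density $f_E(x|\mu, \sigma) = (1/\sigma) e^{-(x - \mu)/\sigma}$ for $x \ge \mu$. The goal is to estimate $\sigma$. Let

$$M = \min X_i = X_{(1)}, \qquad S = \sum (X_i - M) \qquad \text{and} \qquad Z = \frac{M}{S}.$$

Consider two classes of estimators $\mathcal{C}_E$ and $\mathcal{S}_E$ defined as follows:

$\mathcal{C}_E = \{\delta_a(X): \delta_a(X) = aS \text{ for some constant } a > 0\}$,

$\mathcal{S}_E = \{\delta_\phi(X): \delta_\phi = \phi(Z) S \text{ for some positive function } \phi \text{ of } Z\}$.

It is known that $S/\sigma$ follows a gamma distribution with shape parameter $(n - 1)$ and scale parameter 1. We can use Theorem 2.3 to draw the following conclusion.

Theorem 4.1. Within the class $\mathcal{C}_E$ an estimator $\delta_a(X) = aS$ is u-admissible if and only if $0 < a \le 1/(n - 1)$.

Theorem 4.2. An estimator $\delta_\phi(X) = \phi(Z) S$ in the class $\mathcal{S}_E$ is u-inadmissible if $\phi(Z) > (1 + nZ)/n$ for some $Z > 0$.

Proof. Observe that $M$ is exponentially distributed with density $f_E(x|\mu, \sigma/n)$ and $S$ independently follows a gamma distribution with density $f_G(s|n - 1, \sigma)$. Hence the joint distribution of $M$, $S$ is

$$f_{M,S}(x, s) = \frac{n}{\Gamma(n - 1)\sigma^n}\, e^{-n(x - \mu)/\sigma}\, s^{n - 2} e^{-s/\sigma},$$

where $x > \mu$, $s > 0$. Then the joint distribution of $Z\,(= M/S)$ and $S$ is

$$f_{Z,S}(z, s) = \frac{n}{\Gamma(n - 1)\sigma^n}\, e^{-n(zs - \mu)/\sigma}\, s^{n - 1} e^{-s/\sigma} \quad \text{for } (z, s) \in A \cup B,$$

where

$$A = \left\{(z, s): z < 0 \text{ and } 0 < s < \frac{\mu}{z}\right\} \quad \text{and} \quad B = \left\{(z, s): z > 0 \text{ and } s > \frac{\mu}{z}\right\}.$$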


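To prove the theorem, it is sufficient to show that if $\delta^*(X) = \phi^*(Z) S$, where

$$\phi^*(Z) = \begin{cases} \phi(Z) & \text{for } Z < 0, \\ \min(1/n, \phi(Z)) & \text{for } Z > 0, \end{cases}$$

then

$$P\!\left(\left|\frac{\delta(X)}{\sigma} - 1\right| \le c\right) - P\!\left(\left|\frac{\delta^*(X)}{\sigma} - 1\right| \le c\right) \le 0 \quad \forall c > 0,\ \mu > 0 \text{ and } \sigma > 0.$$

Now the left-hand side of the above inequality is

$$E_Z\!\left[P\!\left[\frac{1 - c}{\phi(Z)} \le \frac{S}{\sigma} \le \frac{1 + c}{\phi(Z)} \,\Big|\, Z\right] - P\!\left[\frac{1 - c}{\phi^*(Z)} \le \frac{S}{\sigma} \le \frac{1 + c}{\phi^*(Z)} \,\Big|\, Z\right]\right].$$

To complete the proof we are going to show that for $Z > 0$ the difference is non-positive. For a given positive value of $Z = z$ the conditional distribution of $S$ is a truncated gamma with density

$$f(s|Z = z) = \begin{cases} K s^{n - 1} e^{-s(1 + nz)/\sigma} & \text{for } s > \max(0, \mu/z), \\ 0 & \text{otherwise.} \end{cases}$$

The constant $K$ depends on $\mu$, $\sigma$ and $z$. Even though the conditional density is a truncated gamma, we can still use Lemma 2.2 because truncation only affects the constant $K$ and the support of $S$. Now for any fixed $Z = z > 0$ with $\phi(z) > \phi^*(z)$, we see that for $0 < c < 1$

$$\frac{1 - c}{\phi(z)} < \frac{1 - c}{\phi^*(z)}, \quad \text{so} \quad \max\!\left(\frac{\mu}{z\sigma}, \frac{1 - c}{\phi(z)}\right) \le \max\!\left(\frac{\mu}{z\sigma}, \frac{1 - c}{\phi^*(z)}\right).$$

Hence, we get

$$G\!\left(\frac{(1 - c)\sigma}{\phi(z)}, \frac{(1 + c)\sigma}{\phi(z)} \,\Big|\, Z = z\right) \le G\!\left(\frac{(1 - c)\sigma}{\phi^*(z)}, \frac{(1 + c)\sigma}{\phi^*(z)} \,\Big|\, Z = z\right)$$

by using Lemma 2.2 for $\phi(Z) > (1 + nZ)/n$. Here $G(x|Z = z)$ represents the conditional distribution function of $S$ given $Z = z$ and $G(a, b|Z = z) = G(b|Z = z) - G(a|Z = z)$.

Corollary. Observe that the unbiased estimator (i.e. $\delta_a(X) = S/(n - 1)$) is u-inadmissible because if $0 < z < 1/[n(n - 1)]$ then $1/(n - 1) > (1 + nz)/n$.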
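The arithmetic behind the corollary is the equivalence $1/(n - 1) > (1 + nz)/n \iff z < 1/[n(n - 1)]$. The short sketch below verifies it with exact rational arithmetic; the function names and the choice $n = 5$ are illustrative.

```python
from fractions import Fraction

def exceeds_bound(n, z):
    """True when the multiplier 1/(n - 1) exceeds (1 + n*z)/n."""
    return Fraction(1, n - 1) > (1 + n * Fraction(z)) / n

def threshold(n):
    """Largest z (exclusive) for which the unbiased multiplier exceeds the bound."""
    return Fraction(1, n * (n - 1))

n = 5
t = threshold(n)                          # 1/20 for n = 5
print(exceeds_bound(n, t - Fraction(1, 1000)))   # just below the threshold
print(exceeds_bound(n, t))                        # at the threshold
```

At the threshold itself the two sides are equal, so the strict inequality holds only for $z$ strictly below $1/[n(n - 1)]$.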

Acknowledgements

I would like to thank Professor Arthur Cohen for many valuable discussions and comments and Professor Dinesh Bhoj for continuous support and valuable suggestions which led to an improved presentation of the article. I would also like to thank the anonymous referee for his/her careful reading and valuable suggestions.

References

Arnold, B.C., 1970. Inadmissibility of the usual scale estimate for a shifted exponential distribution. J. Amer. Statist. Assoc. 65, 1260-1266.


Berger, J.O., 1985. Statistical Decision Theory and Bayesian Analysis, 2nd edn. Springer, New York.
Brewster, J.F., 1972. Alternative estimators for scale parameter of the exponential distribution with unknown location. Ann. Statist. 2, 553-557.
Brewster, J.F., Zidek, J.V., 1974. Improving on equivariant estimators. Ann. Statist. 2, 21-38.
Brown, L., 1968. Inadmissibility of usual estimators of scale parameters in problems with unknown location and scale parameters. Ann. Math. Statist. 39, 29-48.
Brown, L., Hwang, J.T., 1989. Universal domination and stochastic domination in U-admissibility and U-inadmissibility of least squares estimators. Ann. Statist. 17, 252-267.
Cohen, A., 1972. Improved confidence intervals for the variance of a normal distribution. J. Amer. Statist. Assoc. 67, 382-387.
Cohen, A., Sackrowitz, H., 1970. Estimation of the last mean of a monotone sequence. Ann. Math. Statist. 41, 2021-2034.
Cohen, A., Kushary, D., 1998. Universal admissibility of maximum likelihood estimators in constrained spaces. Statistics and Decision, to appear.
Goutis, C., 1989. Improved invariant set estimation of a normal variance with generalization. Ph.D. Dissertation, Cornell University, Cornell.
Hwang, J.T., 1985. Universal domination and stochastic domination: Estimation simultaneously under a broad class of loss functions. Ann. Statist. 13, 295-314.
Hwang, J.T., 1990. Comment: "How much can the improvements be realized?" Statist. Sci. 5, 110-111.
Lehmann, E.L., 1983. Theory of Point Estimation. Wiley, New York.
Maatta, J.M., Casella, G., 1990. Decision-theoretic variance estimation (with discussion). Statist. Sci. 5, 90-101.
Rukhin, A.L., 1987. How much better are better estimators of a normal variance? J. Amer. Statist. Assoc. 82, 925-928.
Rukhin, A.L., Strawderman, W.E., 1982. Estimating a quantile of an exponential distribution. J. Amer. Statist. Assoc. 77, 159-160.
Shorrock, G., 1990. Improved confidence intervals for a normal variance. Ann. Statist. 18, 972-980.
Stein, C., 1964. Inadmissibility of the usual estimator for the variance of a normal distribution with unknown mean. Ann. Inst. Statist. Math. 16, 155-160.
Strawderman, W.E., 1974. Minimax estimation of powers of the variance of a normal population under squared error loss. Ann. Statist. 2, 190-198.
Zidek, J.V., 1973. Estimating the scale parameter of the exponential distribution with unknown location. Ann. Statist. 1, 264-278.