Estimation of the variance and its applications

T. Kubokawa, K. Morita, S. Makita and K. Nagakura

Department of Mathematical Engineering and Information Physics, University of Tokyo, Bunkyo-ku, Tokyo 113, Japan

Journal of Statistical Planning and Inference 35 (1993) 319-333. North-Holland.

Received 4 September 1990; revised manuscript received 14 April 1992

Abstract: For the variance of a normal distribution with an unknown mean, three types of truncated estimators superior to the best affine equivariant estimator are treated and their efficiencies are compared numerically. As an application, the simultaneous estimation of a multivariate normal mean is considered and it is demonstrated that using an improved estimator of the variance leads to the improvement on the James-Stein estimator for the mean vector. Also simulation results for the relative risk improvement are given.

AMS Subject Classification: Primary 62F10; secondary 62J07.

Key words and phrases: Point estimation; variance; multivariate normal mean; shrinkage; James-Stein estimator; efficiency comparison.

1. Introduction

In many applications such as designs of experiments and linear regression models, one has the models with the following canonical form: the statistics X, Y and S are mutually independent random variables and

    X ~ N_p(θ, σ²I_p),   Y ~ N_q(ξ, σ²I_q)   and   S ~ σ²χ²_n,

where θ and σ² are unknown parameters of interest and ξ is a nuisance unknown parameter. In this paper we want to treat two problems of estimation of the variance σ² and the mean vector θ. For estimation of σ², it is desired to find a superior estimator δ = δ(S, X, Y) in the sense of minimizing the risk function

    R(ω, δ) = E_ω[(δ/σ² − 1)²]   for ω = (θ, ξ, σ²).

Let Z = (X′, Y′)′ and μ = (θ′, ξ′)′. This problem is invariant under the group of affine transformations

    (S, Z) → (a²S, aΓZ + b),   (σ², μ) → (a²σ², aΓμ + b)

for any positive constant a > 0, any vector b and any (p + q) × (p + q) orthogonal matrix Γ.

Correspondence to: Prof. T. Kubokawa, Dept. of Mathematical Engineering and Information Physics, University of Tokyo, Bunkyo-ku, Tokyo 113, Japan.

0378-3758/93/$06.00 © 1993 Elsevier Science Publishers B.V. All rights reserved


Any affine equivariant estimator δ_A(S, Z), satisfying δ_A(a²S, aΓZ + b) = a²δ_A(S, Z), can be expressed as δ_A(S, Z) = cS for c > 0. Then the best affine equivariant estimator in terms of risk is

    δ₀ = (n + 2)⁻¹S,      (1.1)

with the risk R(ω, δ₀) = 2/(n + 2). For improving on δ₀, Stein (1964) considered a subgroup described by

    (S, Z) → (a²S, aΓZ),   (σ², μ) → (a²σ², aΓμ),

and tried to look for a better estimator among a broader class of scale equivariant estimators δ_φ(S, Z) = Sφ(‖Z‖²/S) for a positive valued function φ(·) and the Euclidean norm ‖·‖. In fact, he derived the improved estimator

    δ₁ = min{ S/(n + 2), (S + ‖X‖² + ‖Y‖²)/(n + p + q + 2) },      (1.2)

which may be viewed as a preliminary test procedure in the sense that the decision whether to employ δ₀ or (S + ‖X‖² + ‖Y‖²)(n + p + q + 2)⁻¹ as an estimator of σ² depends on a test statistic for testing H₀: μ = 0 vs. H₁: μ ≠ 0. The above type of domination results in point and confidence estimation and various extensions have been studied by several authors. For the bibliography, see Maatta and Casella (1990). Among these, Sinha (1976) and Gelfand and Dey (1988) considered the estimator

    δ₂ = min{ δ₀, (S + ‖X‖²)/(n + p + 2), (S + ‖X‖² + ‖Y‖²)/(n + p + q + 2) },      (1.3)

and verified that δ₂ has a smaller risk than min{δ₀, (S + ‖X‖²)(n + p + 2)⁻¹}. While the improvement by δ₁ is possible for small ‖X‖² + ‖Y‖², the superiority of δ₂ arises when either of ‖X‖², ‖X‖² + ‖Y‖² is small. In fact, if ‖X‖² is small and ‖Y‖² is very large, then ‖X‖² is available, but the effect of ‖X‖² in δ₁ disappears. This implies that δ₂ is superior to δ₁. However, when ‖X‖² is very large, even if ‖Y‖² is small, the effects of ‖Y‖² in both δ₁ and δ₂ are gone. To eliminate this undesirable property, it may be reasonable to consider the estimator of the form

    δ₃ = min{ δ₀, (S + ‖X‖²)/(n + p + 2), (S + ‖Y‖²)/(n + q + 2), (S + ‖X‖² + ‖Y‖²)/(n + p + q + 2) },      (1.4)

which was suggested by George (1990), but no dominance properties for δ₃ have been established. It seems difficult to present an exact decision-theoretic result. In Section 2, we derive the asymptotic risk expansions for the estimators δ₁, δ₂ and δ₃ and compare numerically their second order terms. From the expansion for δ₃, it is analytically demonstrated that δ₃ is asymptotically better than δ₀. Section 3 presents the results of Monte Carlo simulation for the relative risk improvement. As expected, if one has the prior knowledge that λ = ‖θ‖²/(2σ²) or τ = ‖ξ‖²/(2σ²) is small, the corresponding truncated estimator is desired; without any information about λ
and τ, the estimator δ₃ is desirable. It is also revealed that the relative risk improvements are getting greater for larger dimension p or q. In Section 4, we deal with the problem of estimating the mean vector θ simultaneously and consider an application of the estimation of variance. Stein (1956), James and Stein (1961) showed that for p ≥ 3, the usual estimator X is dominated by the shrinkage estimator

    θ̂_JS = (1 − (p − 2)δ₀/‖X‖²)X

relative to the loss function ‖θ̂ − θ‖²/σ². Since the estimator δ₀ of σ² can be improved on by using the information contained in X and Y, we have one question whether θ̂_JS can be further improved on by employing better estimators of σ² instead of δ₀. This is a conjecture of George (1990). The answer is affirmative and it can be proved that θ̂_JS is dominated by

    X − ((p − 2)/‖X‖²) min{ δ₀, (S + ‖Y‖²)/(n + q + 2), (S + ‖X‖² + ‖Y‖²)/(n + p + q) } X,

for instance. That is, using the improved estimator of the variance leads to the improvement on the James-Stein estimator of the mean vector. Also numerical investigations of relative risk improvements are given. Finally two examples in designs of experiments are stated.
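For concreteness, the four variance estimators (1.1)-(1.4) can be written as a short sketch (the function names and sample values below are ours, not the paper's; S, X2 and Y2 stand for S, ‖X‖² and ‖Y‖²):

```python
def delta0(S, n):
    # Best affine equivariant estimator (1.1): S/(n + 2).
    return S / (n + 2)

def delta1(S, X2, Y2, n, p, q):
    # Stein's truncated estimator (1.2).
    return min(S / (n + 2), (S + X2 + Y2) / (n + p + q + 2))

def delta2(S, X2, Y2, n, p, q):
    # Sinha / Gelfand-Dey estimator (1.3).
    return min(S / (n + 2), (S + X2) / (n + p + 2),
               (S + X2 + Y2) / (n + p + q + 2))

def delta3(S, X2, Y2, n, p, q):
    # Estimator (1.4) suggested by George (1990): it also truncates on
    # (S + ||Y||^2)/(n + q + 2), so the effect of Y can survive even
    # when ||X||^2 is large.
    return min(S / (n + 2), (S + X2) / (n + p + 2),
               (S + Y2) / (n + q + 2), (S + X2 + Y2) / (n + p + q + 2))
```

Since each estimator only adds truncation branches, δ₃ ≤ δ₂ ≤ δ₁ ≤ δ₀ pointwise; with ‖X‖² large and ‖Y‖² small, only δ₃ is pulled below δ₀, which is exactly the undesirable case for δ₁ and δ₂ described above.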

2. Asymptotic properties of the truncated estimators

For investigating the nature of the risk improvements about the estimators given in Section 1, we shall derive their asymptotic risk expansions and compare their second order terms. Define the asymptotic risk difference for the estimators δ₀ and δ by

    ARD(δ₀, δ) = lim_{n→∞} n²{R(ω, δ₀) − R(ω, δ)}.      (2.1)

Note that ‖X‖² ~ σ²χ²_{p+2J} and ‖Y‖² ~ σ²χ²_{q+2K}, where J and K follow Poisson laws with means λ = ‖θ‖²/(2σ²) and τ = ‖ξ‖²/(2σ²), respectively.
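The Poisson-mixture representation just noted can be checked in simulation; the sketch below (illustrative values of our own) compares the Monte Carlo mean of ‖X‖² with σ²(p + 2λ), the mean implied by ‖X‖² ~ σ²χ²_{p+2J} with J ~ Poisson(λ):

```python
import math
import random

random.seed(0)

# X ~ N_p(theta, sigma2 * I_p) with lam = ||theta||^2 / (2 * sigma2);
# the representation ||X||^2 ~ sigma2 * chi^2_{p+2J}, J ~ Poisson(lam),
# implies E||X||^2 = sigma2 * (p + 2 * lam).
p, lam, sigma2 = 2, 1.5, 2.0
theta = [math.sqrt(2.0 * sigma2 * lam / p)] * p  # any theta of this norm works
sd = math.sqrt(sigma2)

reps = 20000
mc_mean = sum(
    sum((t + sd * random.gauss(0.0, 1.0)) ** 2 for t in theta)
    for _ in range(reps)
) / reps

exact_mean = sigma2 * (p + 2.0 * lam)  # = 10.0 for these values
```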
Proposition 2.1. Let u₁ = p − ‖X‖²/σ² and u₂ = q − ‖Y‖²/σ². Then,

    ARD(δ₀, δ₁) = E[(u₁ + u₂)(u₁ + u₂ + 4J + 4K) I(u₁ + u₂ > 0)],      (2.2)

    ARD(δ₀, δ₂) = E[u₁(u₁ + 2u₂ + 4J + 4K) I(u₁ ≥ 0, u₂ < 0)
                    + (u₁ + u₂)(u₁ + u₂ + 4J + 4K) I(u₁ + u₂ > 0, u₂ ≥ 0)],      (2.3)

    ARD(δ₀, δ₃) = E[u₁(u₁ + 2u₂ + 4J + 4K) I(u₁ ≥ 0, u₂ < 0)
                    + u₂(u₂ + 2u₁ + 4J + 4K) I(u₁ < 0, u₂ ≥ 0)
                    + (u₁ + u₂)(u₁ + u₂ + 4J + 4K) I(u₁ > 0, u₂ > 0)],      (2.4)

where I(·) designates the indicator function.

The proof is given in the Appendix. Noting that

    u₁(u₁ + 2u₂ + 4J + 4K) + u₂(u₂ + 2u₁ + 4J + 4K) − 2u₁u₂ = (u₁ + u₂)(u₁ + u₂ + 4J + 4K)

and that E[u₁ | J] = −2J and E[u₂ | K] = −2K, we can rewrite (2.4) as

    ARD(δ₀, δ₃) = E[u₁(u₁ + 2u₂ + 4J + 4K) I(u₁ > 0) + u₂(u₂ + 2u₁ + 4J + 4K) I(u₂ > 0)
                    − 2u₁u₂ I(u₁ ≥ 0, u₂ ≥ 0)]
                = E[u₁(u₁ + 4J) I(u₁ > 0) + u₂(u₂ + 4K) I(u₂ > 0) − 2u₁u₂ I(u₁ ≥ 0, u₂ ≥ 0)]
                = E[u₁(u₁ + 4J) I(u₁ > 0, u₂ ≤ 0) + u₂(u₂ + 4K) I(u₁ ≤ 0, u₂ > 0)
                    + {(u₁ − u₂)² + 4Ju₁ + 4Ku₂} I(u₁ > 0, u₂ > 0)]
                > 0.      (2.5)

Hence we get

Table 1
The asymptotic risk differences ARD(δ₀, δ) for δ = δ₁, δ₂, δ₃ when p = q = 2, tabulated over λ = 0, 1, 3, 5 and τ = 0, 1, 2, 3, 4, 5.
Proposition 2.2. The estimator δ₃ is asymptotically better than δ₀.

Table 1 provides the numerical values of the asymptotic risk differences ARD(δ₀, δ) for δ = δ₁, δ₂ and δ₃ when p = q = 2. Table 1 reveals that (1) the asymptotic risk reduction of δ₁ is great (resp. small) when both λ and τ are small (resp. large), (2) δ₂ and δ₃ are better than δ₁ when λ is small and τ is large, (3) δ₃ is better than δ₁ and δ₂ for large λ, (4) ARD(δ₀, δ₂) is concave in τ when λ is small, and (5) ARD(δ₀, δ₁) and ARD(δ₀, δ₃) are decreasing in τ. Although the maximum reduction for δ₃ is smaller than those for δ₁ and δ₂, δ₃ is superior in a wide area of the unknown parameters λ and τ.

3. Simulation results

In this section we present the results of Monte Carlo simulation for the relative risk improvement which is defined by

    RRI(δ) = 100 × {R(ω, δ₀) − R(ω, δ)}/R(ω, δ₀),

where δ = δ₁, δ₂ and δ₃ given in Section 1. This is done in the two cases of (n = 1, p = q = 1) and (n = 1, p = q = 10). Using a VAX8600 computer with the ULTRIX-32 operating system at the University of Tokyo, uniform deviates are generated by the linear congruential method stated in Fushimi (1989). Tables 2 and 3 report the average values of the relative risk improvements based on 10000 replications. In the tables, we see that for the small sample size (n = 1), the estimators δ₁, δ₂ and δ₃ have the similar risk properties as described below Proposition 2.2. In the

Table 2
The relative risk improvements in estimation of the variance for p = q = 1 and n = 1 (in percents), for δ₁, δ₂ and δ₃ over a grid of λ and τ values.

Table 3
The relative risk improvements in estimation of the variance for p = q = 10 and n = 1 (in percents), for δ₁, δ₂ and δ₃ over a grid of λ and τ values.
sequel, as expected, if one has the prior knowledge that both λ and τ are small, then δ₁ is desired, and if it is known that λ only is small, the estimator δ₂ should be chosen. When one has no information about λ and τ, the estimator δ₃ may be desirable. From the comparison of Tables 2 and 3, it is also revealed that one can get great relative risk improvements for large dimension p or q.
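A compact Monte Carlo sketch of this computation is given below. It uses Python's built-in generator rather than the Fushimi (1989) congruential method, and our own replication count, so the resulting percentages are only indicative:

```python
import random

random.seed(1)

# Paired Monte Carlo estimate of RRI(delta) = 100*{R(w,d0) - R(w,d)}/R(w,d0)
# in the canonical model with n = p = q = 1 at theta = xi = 0 (lambda = tau = 0),
# taking sigma^2 = 1 without loss of generality.
n, p, q = 1, 1, 1
reps = 50000

loss0 = loss1 = loss3 = 0.0
for _ in range(reps):
    S = random.gauss(0.0, 1.0) ** 2    # S ~ chi^2_1
    X2 = random.gauss(0.0, 1.0) ** 2   # ||X||^2 at theta = 0
    Y2 = random.gauss(0.0, 1.0) ** 2   # ||Y||^2 at xi = 0
    d0 = S / (n + 2)
    d1 = min(d0, (S + X2 + Y2) / (n + p + q + 2))
    d3 = min(d0, (S + X2) / (n + p + 2), (S + Y2) / (n + q + 2),
             (S + X2 + Y2) / (n + p + q + 2))
    loss0 += (d0 - 1.0) ** 2
    loss1 += (d1 - 1.0) ** 2
    loss3 += (d3 - 1.0) ** 2

r0, r1, r3 = loss0 / reps, loss1 / reps, loss3 / reps
rri1 = 100.0 * (r0 - r1) / r0  # delta1 provably dominates delta0
rri3 = 100.0 * (r0 - r3) / r0
```

Note that r0 should be close to the exact value R(ω, δ₀) = 2/(n + 2) = 2/3, and both RRI values come out positive at λ = τ = 0.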

4. An application to simultaneous estimation of a mean vector

The improvements in estimation of the variance are discussed in the previous sections. As an application of the variance estimation, George (1990) suggested the problem of simultaneous estimation for the multinormal mean vector. Following his suggestion, we shall demonstrate that using the improved estimator of the variance leads to the improvement on the James-Stein estimator. In the model given in Section 1, suppose that we want to estimate the mean vector θ relative to the loss ‖θ̂ − θ‖²/σ². Stein (1956), James and Stein (1961) showed that for p ≥ 3, the estimator X of θ is dominated by the shrinkage estimator

    θ̂_JS = (1 − (p − 2)δ₀/‖X‖²)X

with δ₀ = (n + 2)⁻¹S. Since then, as one of the most famous instances of inadmissibility, this Stein-rule estimation theory has been studied in a considerable literature. Looking at the James-Stein estimator θ̂_JS in the model given in Section 1, we notice that the statistic S is utilized for improving on X, while the statistic Y is still neglected. Then we have the question: Can Y be used for dominating the James-Stein estimator? Noting that the random variables X, Y and S have a common parameter σ² in their distributions, following the results in the previous sections,

we think of an idea that Y may be available for estimation of the variance σ². In the next subsections, we shall verify that θ̂_JS is further dominated by using an improved estimator based on S and Y for the variance σ² instead of δ₀.

4.1. Use of the statistic Y for estimation of the variance

Consider the estimator

    θ̂(f) = (1 − (p − 2)Sf(‖Y‖²/S)/‖X‖²)X,      (4.1)

where f(·) is a positive valued function. Assume that

    E[{Sf(‖Y‖²/S)/σ² − 1}²] ≤ E[{δ₀/σ² − 1}²]   for all ω,      (4.2)

that is, Sf(‖Y‖²/S) is an improved estimator of σ². We shall show that θ̂_JS is dominated by θ̂(f). For the purpose, the following Stein identity is useful:

    E[(Z − μ)h(Z)] = σ²E[h′(Z)],      (4.3)

where Z is a random variable having N(μ, σ²) and h(·) is an absolutely continuous function. By using the identity (4.3), the risk function of θ̂(f) is written as

    R(ω, θ̂(f)) = p + E[(p − 2)²(Sf)²/(σ²‖X‖²) − 2(p − 2)X′(X − θ)Sf/(σ²‖X‖²)]
                = p − (p − 2)²E[σ²/‖X‖²] + (p − 2)²E[(σ²/‖X‖²)(Sf/σ² − 1)²].

Since X is independent of (S, Y), the last expectation factorizes as E[σ²/‖X‖²]E[(Sf/σ² − 1)²], which together with (4.2) yields

Theorem 4.1. Let p ≥ 3 and assume that (4.2) holds. Then the estimator θ̂(f) given by (4.1) is better than θ̂_JS.

Theorem 4.1 implies that any improved estimator of the variance gives an improved version of the James-Stein rule. The usual choice for the function f(·) is f(u) = min{(n + 2)⁻¹, (n + q + 2)⁻¹(1 + u)}, giving

    θ̂₁ = X − ((p − 2)/‖X‖²) min{ δ₀, (S + ‖Y‖²)/(n + q + 2) } X.      (4.4)

Other choices are the smooth function given by Brewster and Zidek (1974), several forms stated in the previous sections and so on. Theorem 4.1 is further extended in the next subsection.
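A sketch of the James-Stein rule and its Theorem 4.1 improvement with this choice of f follows (function names and test values are ours, not the paper's; p ≥ 3 is assumed):

```python
def js_estimator(X, S, n):
    # James-Stein rule with delta0 = S/(n + 2); requires p = len(X) >= 3.
    p = len(X)
    X2 = sum(x * x for x in X)
    return [(1.0 - (p - 2) * (S / (n + 2)) / X2) * x for x in X]

def improved_js(X, Y, S, n):
    # Estimator (4.4): Theorem 4.1 with f(u) = min{1/(n+2), (1+u)/(n+q+2)},
    # i.e. shrink by the smaller of delta0 and (S + ||Y||^2)/(n + q + 2).
    p, q = len(X), len(Y)
    X2 = sum(x * x for x in X)
    Y2 = sum(y * y for y in Y)
    var_est = min(S / (n + 2), (S + Y2) / (n + q + 2))
    return [(1.0 - (p - 2) * var_est / X2) * x for x in X]
```

Since the truncated variance estimate never exceeds δ₀, the improved rule never shrinks more strongly than the James-Stein rule.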

4.2. Use of the statistics Y and X for estimation of the variance

In the estimation of the variance σ², we try to employ not only Y but also X, and consider the estimator

    θ̂(g) = (1 − (p − 2)Sg(‖Y‖²/S, ‖X‖²/S)/‖X‖²)X,      (4.5)

where g(·, ·) is a positive-valued function. In this case, it should be noted that the estimator for the variance is not independent of the estimator X for the mean. Assume that g(u, v) is absolutely continuous with respect to v. Then from the Stein identity (4.3),

    R(ω, θ̂(g)) = p + E[(p − 2)²(Sg)²/(σ²‖X‖²) − 2(p − 2)X′(X − θ)Sg/(σ²‖X‖²)]
                = p − (p − 2)²E[σ²/‖X‖²] + (p − 2)²E[(σ²/‖X‖²)(Sg/σ² − 1)²] − 4(p − 2)E[g′₍₂₎],      (4.6)

where g′₍₂₎ = (∂/∂v)g(u, v). Here we observe that for n ≥ 3, the weighted quadratic risk E[(σ²/‖X‖²)(Sg/σ² − 1)²] is minimized in g by the conditional expectation ratio

    g°_{λ,τ}(a, b) = E[χ²_{n−2} | U = a, V = b]/E[(χ²_{n−2})² | U = a, V = b],      (4.7)

where U = ‖Y‖²/S and V = ‖X‖²/S are distributed as χ²_q(τ)/χ²_{n−2} and χ²_p(λ)/χ²_{n−2} under the weighted law, and

    g°_{λ,τ}(a, b) = ∫₀^∞ t³f_{n−2}(t)f_p(bt; λ)f_q(at; τ) dt / ∫₀^∞ t⁴f_{n−2}(t)f_p(bt; λ)f_q(at; τ) dt,      (4.8)

where f_p(x; λ) designates the density of χ²_p(λ) and f_{n−2}(x) = f_{n−2}(x; 0), namely the density of χ²_{n−2}. Since f_p(x; λ)/f_p(x) is increasing in x, we have that

    ∫₀^∞ t³f_{n−2}(t)f_p(bt; λ)f_q(at; τ) dt / ∫₀^∞ t⁴f_{n−2}(t)f_p(bt; λ)f_q(at; τ) dt
        ≤ ∫₀^∞ t³f_{n−2}(t)f_p(bt)f_q(at) dt / ∫₀^∞ t⁴f_{n−2}(t)f_p(bt)f_q(at) dt
        = ∫₀^∞ t^{(n+p+q)/2−1}e^{−(1+a+b)t/2} dt / ∫₀^∞ t^{(n+p+q+2)/2−1}e^{−(1+a+b)t/2} dt
        = (1 + a + b)/(n + p + q),      (4.9)

which, together with (4.8), shows that

    g°_{λ,τ}(U, V) ≤ (1 + U + V)/(n + p + q).      (4.10)

Hence from (4.7), if we set

    g*(U, V) = min{ g(U, V), (1 + U + V)/(n + p + q) },      (4.11)

then for all ω,

    E[(σ²/‖X‖²){Sg*(‖Y‖²/S, ‖X‖²/S)/σ² − 1}²] ≤ E[(σ²/‖X‖²){Sg(‖Y‖²/S, ‖X‖²/S)/σ² − 1}²].      (4.12)

Therefore we get the following theorem.

Theorem 4.2. Let n, p ≥ 3 and let g*(u, v) be given by (4.11). Assume the following conditions:
(a) g(u, v) and g*(u, v) are absolutely continuous with respect to v;
(b) E[(∂/∂v){g*(U, v) − g(U, v)}|_{v=V}] ≥ 0 for all ω.
Then the estimator θ̂(g*) is better than θ̂(g).

Corollary 4.3. Let n, p ≥ 3 and put

    θ̂₂ = X − ((p − 2)/‖X‖²) min{ δ₀, (S + ‖Y‖²)/(n + q + 2), (S + ‖X‖² + ‖Y‖²)/(n + p + q) } X.      (4.13)

Then θ̂₂ dominates θ̂₁, being better than θ̂_JS.

In the case where the statistic Y does not exist, the same arguments give

Corollary 4.4. Let n, p ≥ 3 and put

    θ̂₃ = X − ((p − 2)/‖X‖²) min{ δ₀, (S + ‖X‖²)/(n + p) } X.      (4.14)

Then θ̂₃ is better than θ̂_JS.


From the results given in Sections 2 and 3, one may consider the estimator

    θ̂₄ = X − ((p − 2)/‖X‖²) min{ δ₀, (S + ‖X‖²)/(n + p), (S + ‖Y‖²)/(n + q + 2), (S + ‖X‖² + ‖Y‖²)/(n + p + q) } X.      (4.15)

For the five estimators θ̂_JS, θ̂₁, θ̂₂, θ̂₃ and θ̂₄ in the case of p = 4, q = 2 and n = 1, Table 4 provides the simulated values of the relative risk improvements defined by

    RRI(θ̂) = 100 × {R(ω, X) − R(ω, θ̂)}/R(ω, X),

where R(ω, θ̂) = E[‖θ̂ − θ‖²/σ²].

Table 4
The relative risk improvements in estimation of the mean vector for p = 4, q = 2 and n = 1 (in percents), for θ̂_JS, θ̂₁, θ̂₂, θ̂₃ and θ̂₄ over a grid of λ and τ values.

Table 4 reveals that

    θ̂_JS ≺ θ̂₁ ≺ θ̂₂ ≺ θ̂₄,

where θ̂ ≺ θ̂* means that θ̂* is better than θ̂. Also, if λ is small, then θ̂₃ is desirable, and if λ is large, the estimator θ̂₄ is a good choice. The above relations between the estimators in Table 4 suggest that Theorem 4.2 and Corollaries 4.3 and 4.4 hold without the condition n ≥ 3. It is also expected that the relative risk improvements are getting greater for larger dimension p or q.

Remark 4.1. Table 4 reveals that θ̂₃ has a substantially smaller risk when λ is small, or θ is close to the origin. Also a simulation result we tried for n = 10, p = 4 and q = 2 shows that the maximum relative risk improvement for θ̂₃ is 44.75 percent at λ = 0.

Thereby θ̂₃, shrunken toward the origin, is a good estimator when it is known that θ is around the origin. In general, based on vague prior information as to θ, there are cases where θ may be assumed to be close to a linear subspace V ⊂ R^p of dimension dim V = r. Sclove, Morris and Radhakrishnan (1972) considered shrinking X toward the subspace V. Let P_V X denote the projection of X onto V. The resulting shrinkage estimator is given by

    θ̂_S(V) = X − ((p − r − 2)/‖X − P_V X‖²) min{ δ₀, (S + ‖X − P_V X‖²)/(n + p − r) } (X − P_V X),      (4.16)

which dominates X for p − r ≥ 3 and presents, of course, a great risk gain for θ ∈ V. For instance, if V is thought to be {u: u = μe} for an unknown scalar μ and e = (1, ..., 1)′, then θ̂_S(V) is the Lindley (1962) type estimator

    θ̂_L = X − ((p − 3)/‖X − X̄e‖²) min{ δ₀, (S + ‖X − X̄e‖²)/(n + p − 1) } (X − X̄e),

where X̄ = Σᵢ₌₁^p Xᵢ/p (George, 1986b). George (1986a) further generalized the shrinkage estimators to the cases where several subspace targets V₁, ..., V_k are taken, and considered shrinking X towards V₁, ..., V_k. Such procedures are called multiple shrinkage estimators, which will also be considered for (4.16).

Remark 4.2. The same way of thinking as in Remark 4.1 is applicable to the estimation of the variance. Let W be a linear subspace of R^q with dim W = t. If ξ is guessed to be close to

W, then the estimator δ_S(W) shrunken to W is proposed as

    δ_S(W) = min{ δ₀, (S + ‖Y − P_W Y‖²)/(n + q − t + 2) },      (4.17)

which is better than δ₀, and the maximum improvement is attained at ‖ξ − P_W ξ‖²/(2σ²) = 0. For instance, putting W = {v: v = μe} yields min{δ₀, (S + ‖Y − Ȳe‖²)(n + q + 1)⁻¹} for q ≥ 2. When several subspace targets W₁, ..., W_k are chosen, a combined shrinkage estimator Σᵢ₌₁^k pᵢδ_S(Wᵢ) may be considered as one of the improved procedures, where the pᵢ's are positive constants satisfying Σᵢ₌₁^k pᵢ = 1. If both θ and ξ are thought to be close to guessed subspaces, an appropriate combination of (4.16) and (4.17) will provide a substantial improvement.

We conclude this section with examples.
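As a small illustration before the examples, the Remark 4.2 estimator with W = {v: v = μe} can be sketched as follows (the function name and test values are ours; q ≥ 2 is assumed):

```python
def variance_shrunk_to_equal_means(S, Y, n):
    # Remark 4.2 with W = {v : v = mu * e}, e = (1, ..., 1)', i.e. t = 1:
    # min{ S/(n+2), (S + ||Y - Ybar*e||^2)/(n + q + 1) }, for q >= 2.
    q = len(Y)
    ybar = sum(Y) / q
    resid2 = sum((y - ybar) ** 2 for y in Y)
    return min(S / (n + 2), (S + resid2) / (n + q + 1))
```

When Y lies in W the residual term vanishes and the estimator is pulled furthest below δ₀; when the residual is large, the estimator falls back to δ₀.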


Example 4.1 (one way analysis of variance). Consider the one way layout with X_ij (i = 1, ..., m; j = 1, 2) independent normal variables with means μᵢ and variances σ² (Cox and Hinkley, 1974, p. 17). Let X̄ᵢ = (X_i1 + X_i2)/2 for i = 1, ..., m and S = Σᵢ Σⱼ (X_ij − X̄ᵢ)². Then X = (X̄₁, ..., X̄_m)′ and S are independent random variables such that

    X ~ N_m(μ, (σ²/2)I_m),   S ~ σ²χ²_m,

where μ = (μ₁, ..., μ_m)′. This is the situation where the statistic Y does not exist. When we want to estimate μ for m ≥ 3, by the Stein effect, X can be improved on by the James-Stein rule. However, the estimator δ₀ has a disturbance to a certain extent since the degrees of freedom in S are small. In this case it seems meaningful to modify δ₀ by using the statistic X, so that employing the estimator θ̂₃ may be better. In some practical applications, the treatment contrasts μ₂ − μ₁, ..., μ_m − μ₁ are the parameters of interest. For example, this is the situation where μ₁ is the effect of a control treatment, μ₂, ..., μ_m are the effects of new competing treatments, and we want to know the treatment differences μᵢ − μ₁. Let Z = (X̄₂ − X̄₁, ..., X̄_m − X̄₁)′, Y = (2/m)^{1/2}(X̄₁ + ⋯ + X̄_m), θ = (μ₂ − μ₁, ..., μ_m − μ₁)′ and ξ = (2/m)^{1/2}(μ₁ + ⋯ + μ_m). Then Z and Y are independent variables such that Z ~ N_{m−1}(θ, σ²I_{m−1}) and Y ~ N(ξ, σ²), which, together with S, is the model given in Section 1.

Example 4.2 (balanced incomplete block designs). Consider a balanced incomplete block design (BIBD)

    y_ij = μ + αᵢ + βⱼ + ε_ij   for (i, j) ∈ D,

where D is defined by a BIBD, αᵢ (i = 1, ..., t) and βⱼ (j = 1, ..., b) are fixed effects of treatments and blocks, respectively, such that Σαᵢ = Σβⱼ = 0, and the ε_ij are independent random errors following N(0, σ²). Let r = # (the number) of replications and k = # of cells per block. Then Hirotsu (1976) derived the canonical form as

    X ~ N_{t−1}(θ, σ²I_{t−1}),   Y ~ N_b(ξ, σ²I_b),   S ~ σ²χ²_{bk−b−t+1},

where X, Y, S are minimal sufficient and mutually independent and θ corresponds to a vector of treatment contrasts. Here θ is of interest and (ξ, σ²) are nuisance. This is the model treated in the present paper. For the designs (r, t, b, k) = (3, 4, 6, 2), (4, 5, 10, 2), we have (t − 1, b, bk − b − t + 1) = (3, 6, 3), (4, 10, 6), respectively. In these cases, the dimension of Y is greater than the degrees of freedom in S and the information contained in Y is available for improvement in the estimation of θ.
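The dimension count at the end of the example can be verified directly (a trivial helper of our own; r does not enter the three dimensions and is kept only to mirror the design notation):

```python
def canonical_dims(r, t, b, k):
    # For a BIBD with r replications, t treatments, b blocks and k cells
    # per block, the canonical form has X of dimension t - 1, Y of
    # dimension b, and S with bk - b - t + 1 degrees of freedom.
    return (t - 1, b, b * k - b - t + 1)
```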


Appendix

Proof of Proposition 2.1. Let

    v = (S + ‖X‖² + ‖Y‖²)/σ²,   w₁ = S/(S + ‖X‖²)   and   w₂ = (S + ‖X‖²)/(S + ‖X‖² + ‖Y‖²).

Then the random variables v, w₁ and w₂ are, given J and K, conditionally mutually independent and

    v ~ χ²_{n+p+q+2J+2K},   w₁ ~ beta(n/2, (p + 2J)/2)   and   w₂ ~ beta((n + p + 2J)/2, (q + 2K)/2).

Proof of (2.2). Using the random variables v, w₁ and w₂, we express δ₁/σ² as

    δ₁/σ² = min{ w₁w₂v/(n + 2), v/(n + p + q + 2) }
          = (w₁w₂v/(n + 2))I₁₁ + (v/(n + p + q + 2))I₁₂,      (A.1)

where I₁₁ = I{w₁w₂ ≤ (n + 2)/(n + p + q + 2)} and I₁₂ = 1 − I₁₁. Letting T_n = n{w₁w₂ − (n + 2)/(n + p + q + 2)}, we can see that T_n → u₁ + u₂ a.s. as n → ∞. Note that δ₀/σ² = (n + 2)⁻¹w₁w₂v(I₁₁ + I₁₂), and that given J and K, v is conditionally independent of (w₁, w₂). Since w₁w₂ = T_n/n + (n + 2)/(n + p + q + 2),

    R(ω, δ₀) − R(ω, δ₁) = E[{(w₁w₂v/(n + 2) − 1)² − (v/(n + p + q + 2) − 1)²}I₁₂],

and evaluating this by means of E[v | J, K] = n + p + q + 2J + 2K and E[v² | J, K] = (n + p + q + 2J + 2K)(n + p + q + 2J + 2K + 2) gives

    R(ω, δ₀) − R(ω, δ₁) = n⁻²E[{T_n² + 4(J + K)T_n}I₁₂] + o(n⁻²),      (A.2)

which yields (2.2) by noting that I₁₂ → I(u₁ + u₂ ≥ 0) a.s. as n → ∞.

Proof of (2.3). Similar to (A.1),

    δ₂/σ² = (w₁w₂v/(n + 2))I₂₁ + (w₂v/(n + p + 2))I₂₂ + (v/(n + p + q + 2))I₂₃,

where I₂₁ = I{w₁ ≤ (n + 2)/(n + p + 2), w₁w₂ ≤ (n + 2)/(n + p + q + 2)}, I₂₂ = I{w₁ > (n + 2)/(n + p + 2), w₂ ≤ (n + p + 2)/(n + p + q + 2)} and I₂₃ = I{w₁w₂ > (n + 2)/(n + p + q + 2), w₂ > (n + p + 2)/(n + p + q + 2)}. Note that I₂₁ → I(u₁ ≤ 0, u₁ + u₂ ≤ 0), I₂₂ → I(u₁ ≥ 0, u₂ < 0) and I₂₃ → I(u₁ + u₂ > 0, u₂ ≥ 0) a.s. as n → ∞. Letting U_n = n{w₂ − (n + p + 2)/(n + p + q + 2)}, we can see that U_n → u₂ a.s. as n → ∞. By the same arguments as in (A.2),

    R(ω, δ₀) − R(ω, δ₂) = n⁻²E[(T_n − U_n)(T_n + U_n + 4J + 4K)I₂₂ + T_n(T_n + 4J + 4K)I₂₃] + o(n⁻²),

which shows (2.3).

Proof of (2.4). Write δ₃/σ² as

    δ₃/σ² = (w₁w₂v/(n + 2))I₃₁ + (w₂v/(n + p + 2))I₃₂ + ((w₁w₂ + 1 − w₂)v/(n + q + 2))I₃₃ + (v/(n + p + q + 2))I₃₄,

where I₃₁, ..., I₃₄ are the indicators, expressed through w₁ and w₂ analogously to I₂₁, I₂₂ and I₂₃, of the regions on which the four branches of δ₃ respectively attain the minimum. Note that I₃₁ → I(u₁ ≤ 0, u₂ ≤ 0), I₃₂ → I(u₁ ≥ 0, u₂ < 0), I₃₃ → I(u₁ < 0, u₂ ≥ 0) and I₃₄ → I(u₁ > 0, u₂ > 0) a.s. as n → ∞. Letting V_n = n{w₁w₂ + 1 − w₂ − (n + q + 2)/(n + p + q + 2)}, we see that V_n → u₁ a.s. as n → ∞. Similar to (A.2),

    R(ω, δ₀) − R(ω, δ₃) = n⁻²E[(T_n − U_n)(T_n + U_n + 4J + 4K)I₃₂ + (T_n − V_n)(T_n + V_n + 4J + 4K)I₃₃ + T_n(T_n + 4J + 4K)I₃₄] + o(n⁻²),

which establishes (2.4).

Acknowledgements

The authors are grateful to the Editor, an Associate Editor and the referees for their valuable comments and helpful suggestions.


References

Brewster, J.F. and J.V. Zidek (1974). Improving on equivariant estimators. Ann. Statist. 2, 21-38.
Cox, D.R. and D.V. Hinkley (1974). Theoretical Statistics. Chapman and Hall, London.
Fushimi, M. (1989). Random Numbers. Univ. Tokyo Press, Tokyo (in Japanese).
Gelfand, A.E. and D.K. Dey (1988). Improved estimation of the disturbance variance in a linear regression model. J. Econometrics 39, 387-395.
George, E.I. (1986a). Minimax multiple shrinkage estimation. Ann. Statist. 14, 188-205.
George, E.I. (1986b). Combining minimax shrinkage estimators. J. Amer. Statist. Assoc. 81, 437-445.
George, E.I. (1990). Comment on the paper by J.M. Maatta and G. Casella. Statist. Science 5, 90-120.
Hirotsu, C. (1976). Analysis of Variance. Kyoiku Shuppan, Tokyo (in Japanese).
James, W. and C. Stein (1961). Estimation with quadratic loss. Proc. Fourth Berkeley Symp. Math. Statist. Probab. 1, 361-379, Univ. California Press.
Lindley, D.V. (1962). Contribution to the discussion of the paper by C.M. Stein. J. Roy. Statist. Soc. Ser. B 24, 285-287.
Maatta, J.M. and G. Casella (1990). Developments in decision-theoretic variance estimation. Statist. Science 5, 90-120.
Sclove, S.L., C. Morris and R. Radhakrishnan (1972). Nonoptimality of preliminary-test estimators for the multinormal mean. Ann. Math. Statist. 43, 1481-1490.
Sinha, B.K. (1976). On improved estimators of the generalized variance. J. Mult. Anal. 6, 617-625.
Stein, C. (1956). Inadmissibility of the usual estimator for the mean of a multivariate normal distribution. Proc. Third Berkeley Symp. Math. Statist. Probab. 1, 197-206, Univ. California Press.
Stein, C. (1964). Inadmissibility of the usual estimator for the variance of a normal distribution with unknown mean. Ann. Inst. Statist. Math. 16, 155-160.