Journal of Statistical Planning and Inference 44 (1995) 277-289
Fixed-width sequential confidence interval for the mean of a gamma distribution

Yoshikazu Takada a,*, Yasushi Nagata b

a Department of Mathematics, Faculty of Science, Kumamoto University, Kumamoto 860, Japan
b Faculty of Economics, Okayama University, Okayama 700, Japan

Received 4 May 1992; revised 17 December 1992
Abstract This paper considers a sequential procedure for setting a fixed-width confidence interval for the mean of a gamma distribution. Instead of a coverage probability, an average coverage probability is considered and its asymptotic expansion is obtained, from which it turns out that the interval with bias correction performs better than that with no bias correction.
AMS Subject Classification: 62L12
Key words: Gamma distribution; Stopping time; Bias correction; Average coverage probability; Asymptotic expansion
* Corresponding author.
0378-3758/95/$09.50 © 1995 Elsevier Science B.V. All rights reserved
SSDI 0378-3758(94)00053-O

1. Introduction

Let $X_1, X_2, \ldots$ be independent and identically distributed (i.i.d.) random variables with common density

$$f(x;\theta) = \frac{1}{\Gamma(\lambda)}\Big(\frac{\lambda}{\theta}\Big)^{\lambda} x^{\lambda-1}\exp(-\lambda x/\theta), \quad x>0, \qquad (1.1)$$

where $\theta>0$ is unknown and $\lambda>0$ is known. The mean and variance of the gamma distribution are $\theta$ and $\sigma^2=\theta^2/\lambda$. We consider the problem of finding a confidence interval for $\theta$ of width at most $2h$ $(h>0)$ and confidence coefficient $\gamma$ $(0<\gamma<1)$. Given a sample size $n$, the natural estimator for $\theta$ is $\bar X_n=(X_1+\cdots+X_n)/n$. Then for large $n$

$$P_\theta(|\bar X_n-\theta|\le h)\approx 2\Phi(n^{1/2}h/\sigma)-1,$$
where $\Phi$ denotes the standard normal distribution function. The problem is solved asymptotically by taking a sample size $n$ which satisfies $n\ge a=(c^2/h^2)\sigma^2$, where $2\Phi(c)-1=\gamma$, and using the confidence interval $(\bar X_n-h,\ \bar X_n+h)$. Since $\sigma^2$ is unknown, we consider the following stopping time:

$$t=\inf\{n\ge m;\ n\ge (c^2/h^2)\,\ell_n\hat\sigma_n^2\}, \qquad (1.2)$$

where $m$ is the initial sample size, $\hat\sigma_n^2=\bar X_n^2/\lambda$ and $\ell_n\ge 1$. The constant sequence $\{\ell_n\}$ is introduced to avoid underestimation at termination. The confidence intervals are of the form

$$I_t=(\hat\theta_t-h,\ \hat\theta_t+h), \qquad (1.3)$$

with $\hat\theta_t=\bar X_t+b(\bar X_t)/t$ for some function $b(\cdot)$, since $\bar X_t$ is a biased estimator for $\theta$ (see Theorem 2) and an appropriate choice of $b(\cdot)$ may improve the coverage probability. In Section 2 the asymptotic expansion for the expected sample size and the asymptotic bias of $\bar X_t$ are obtained. By Anscombe's theorem it is not difficult to show that

$$P_\theta(\theta\in I_t)\to\gamma \quad\text{as } h\to 0.$$

But unlike the case of the normal distribution (see Woodroofe, 1982, Ch. 10), it is difficult to obtain the asymptotic expansion for the coverage probability, because the event $\{t=n\}$ and $\bar X_n$ are not independent. In Section 3, instead of the coverage probability, we consider its average coverage probability with respect to some density $\xi$, that is,

$$\int P_\theta(\theta\in I_t)\,\xi(\theta)\,d\theta=P^\xi(\theta\in I_t),$$

where $P^\xi$ denotes probability in the Bayesian model in which $\theta$ has prior density $\xi$ and $X_1, X_2, \ldots$ are conditionally i.i.d. with common distribution (1.1). See Woodroofe (1986, 1987) and Meslem (1987a, b). Meslem (1987a) obtained in his Ph.D. thesis the asymptotic expansion of the average coverage probability of the confidence interval with no bias correction. We shall extend his result to the case of confidence intervals with bias correction. From the expansion it turns out that the interval with bias correction is better than that with no bias correction. In Section 4 a numerical example is studied to see whether the bias-corrected interval improves the coverage probability of the interval with no bias correction.
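As a small numerical illustration of the fixed-sample-size solution described above, the following sketch computes $c$ from $2\Phi(c)-1=\gamma$ and the asymptotically optimal sample size $a=(c^2/h^2)\sigma^2$ with $\sigma^2=\theta^2/\lambda$. The function name and the parameter values are illustrative assumptions, not part of the paper.

```python
from statistics import NormalDist


def fixed_sample_size(theta, lam, h, gamma):
    """Return (c, a): the normal quantile c and a = (c^2 / h^2) * sigma^2."""
    c = NormalDist().inv_cdf((1.0 + gamma) / 2.0)  # solves 2*Phi(c) - 1 = gamma
    sigma2 = theta ** 2 / lam                      # gamma variance theta^2 / lambda
    return c, (c ** 2 / h ** 2) * sigma2


# Illustrative values only: exponential case (lambda = 1), theta = 1, h = 0.2.
c, a = fixed_sample_size(theta=1.0, lam=1.0, h=0.2, gamma=0.9)
```

Since $a$ depends on the unknown $\theta$ through $\sigma^2$, it cannot be used directly, which is what motivates the stopping time (1.2).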
2. Expected sample size and asymptotic bias

Let $Z_n=n\{\ell_n(\bar X_n/\theta)^2\}^{-1}$. Then it follows from (1.2) that

$$t=\inf\{n\ge m;\ Z_n\ge a\}. \qquad (2.1)$$

$Z_n$ may be written in the form

$$Z_n=T_n+\eta_n,$$

where

$$T_n=\sum_{i=1}^{n}\{1-2(Y_i-1)\}$$

with $Y_i=X_i/\theta$ $(i=1,2,\ldots)$ and

$$\eta_n=3U_n^{-4}\,n(\bar Y_n-1)^2(1+d_n/n)+d_n\{1-2(\bar Y_n-1)\}$$

with $\ell_n^{-1}=1+d_n/n$ and $|U_n-1|\le|\bar Y_n-1|$. This is the form considered in the non-linear renewal theorem. See Woodroofe (1982). Suppose that $\ell_n=1+\ell_0/n+o(1/n)$ as $n\to\infty$. Then it is easy to see that $\eta_n\Rightarrow 3\lambda^{-1}\chi_1^2-\ell_0$, where $\Rightarrow$ denotes convergence in distribution and $\chi_1^2$ the $\chi^2$ random variable with one degree of freedom.

Lemma 1. If $0<\delta<1$, then

$$P(t\le\delta a)=O(h^{m\lambda}) \quad\text{as } h\to 0.$$

Proof. The stopping time (1.2) can be rewritten in the form of (1.1) of Woodroofe (1977). Then the lemma follows from Lemma 2.3 of Woodroofe (1977). □

The asymptotic expansion of the expected sample size follows from Theorem 4.5 of Woodroofe (1982) (see also Theorem 2.4 of Woodroofe, 1977). We omit the details of checking the conditions of the theorem, but note that (4.16) holds by Lemma 1.

Theorem 1. If $m\lambda>2$, then

$$E(t)=a+\rho-3/\lambda+\ell_0+o(1) \quad\text{as } h\to 0,$$

where $\rho$ denotes the asymptotic mean of the random variable $R_a=Z_t-a$.
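The decomposition of $Z_n$ used in this section can be checked with a second-order Taylor expansion of $y\mapsto y^{-2}$ around $y=1$. The following is a sketch of that step, writing $\bar Y_n=\bar X_n/\theta$, using the Lagrange form of the remainder with intermediate point $U_n$, and denoting the remainder term by $\eta_n$ (a symbol adopted here for illustration).

```latex
\begin{align*}
\bar Y_n^{-2} &= 1 - 2(\bar Y_n - 1) + 3U_n^{-4}(\bar Y_n - 1)^2,
  \qquad |U_n - 1| \le |\bar Y_n - 1|, \\
Z_n &= n\,\ell_n^{-1}\,\bar Y_n^{-2} = (n + d_n)\,\bar Y_n^{-2} \\
    &= \underbrace{n - 2n(\bar Y_n - 1)}_{T_n}
     + \underbrace{3U_n^{-4}\,n(\bar Y_n - 1)^2(1 + d_n/n)
     + d_n\{1 - 2(\bar Y_n - 1)\}}_{\eta_n}.
\end{align*}
```

Because $n(\bar Y_n-1)^2\Rightarrow\lambda^{-1}\chi_1^2$ (the variance of $Y_i$ is $1/\lambda$), $U_n\to 1$ and $d_n\to-\ell_0$, the remainder converges in distribution to $3\lambda^{-1}\chi_1^2-\ell_0$, whose mean $3/\lambda-\ell_0$ is the correction term appearing in Theorem 1.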
The sample mean $\bar X_t$ is not unbiased for $\theta$ since the sampling distribution is affected by the stopping time. Next we compute the asymptotic bias.

Lemma 2. On $\{t>m\}$,

$$(\bar Y_t+1)/(\ell_t\bar Y_t^2)\le K_1(a/t)^{1/2}+K_2(a/t)$$

and

$$\ldots\le K_3(a/t),$$

where $K_1$, $K_2$ and $K_3$ are some constants.

Proof. Observe that on $\{t>m\}$

$$t/\{\ell_t\bar Y_t^2\}\ge a>(t-1)/\{\ell_{t-1}\bar Y_{t-1}^2\}.$$

Then it follows that

$$\bar Y_t^2\ge (t-1)/(4\ell_{t-1}a)$$

and

$$\bar Y_t\le\{t/(\ell_t a)\}^{1/2}. \qquad (2.2)$$

Using these inequalities,

$$(\bar Y_t+1)/(\ell_t\bar Y_t^2)\le 4[\{t/(\ell_t a)\}^{1/2}+1]/\{\ell_t(t-1)/(\ell_{t-1}a)\}$$
$$=4(\ell_{t-1}/\ell_t)[\{t/(\ell_t a)\}^{1/2}+1]\{a/(t-1)\}$$
$$\le 8(\ell_{t-1}/\ell_t)[\{t/(\ell_t a)\}^{1/2}+1](a/t)$$
$$\le K_1(a/t)^{1/2}+K_2(a/t)$$

for some constants $K_1$ and $K_2$, and

$$\ldots\le K_3(a/t)$$

for some constant $K_3$. Then the proof is completed. □
Theorem 2. If $m\lambda>2$, then

$$E(\bar X_t)=\theta-(2\theta/\lambda)/a+o(1/a) \quad\text{as } h\to 0.$$

Proof. Let $S_n=\sum_{i=1}^{n}X_i$. Then by Wald's lemma and (2.1)

$$E(\bar X_t-\theta)=E\{(S_t-t\theta)/t\}$$
$$=(1/a)E\{(a/t-1)(S_t-t\theta)\}$$
$$=(1/a)E\{((\ell_t\bar Y_t^2)^{-1}-1)(S_t-t\theta)\}+(1/a)E\{(a/t-(\ell_t\bar Y_t^2)^{-1})(S_t-t\theta)\}$$
$$=-(\theta/a)\{E[\{(\bar Y_t+1)/(\ell_t\bar Y_t^2)\}t(\bar Y_t-1)^2]+E[\{t(\ell_t-1)/\ell_t\}(\bar Y_t-1)]+E[R_a(\bar Y_t-1)]\}. \qquad (2.3)$$

Observe that

$$\{(\bar Y_t+1)/(\ell_t\bar Y_t^2)\}t(\bar Y_t-1)^2\Rightarrow(2/\lambda)\chi_1^2.$$

From Lemma 1, $(a/t)^p$, $h>0$, is uniformly integrable for $1<p<m\lambda/2$. See Lemma 1 of Chow and Yu (1981). Then using Lemma 2 and Theorem 2.3 of Woodroofe (1977), by Hölder's inequality we have

$$E[\{(\bar Y_t+1)/(\ell_t\bar Y_t^2)\}t(\bar Y_t-1)^2]=2/\lambda+o(1). \qquad (2.4)$$

It follows easily from Lemma 2 and (2.2) that

$$E[\{t(\ell_t-1)/\ell_t\}(\bar Y_t-1)]=o(1) \qquad (2.5)$$

and

$$E\{R_a(\bar Y_t-1)\}=o(1). \qquad (2.6)$$

Then substituting (2.4)-(2.6) into (2.3), the proof is completed. □

Theorem 2 suggests that the interval $I_t$ with $\hat\theta_t=\bar X_t+(2\bar X_t/\lambda)/t$ may be better than that with $\hat\theta_t=\bar X_t$. In the next section we shall show that this is asymptotically true with respect to the average coverage probability.
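To make the bias correction suggested by Theorem 2 concrete, the sketch below evaluates the leading bias term $-(2\theta/\lambda)/a$ and the corrected estimator $\bar X_t+(2\bar X_t/\lambda)/t$. The function names and the numerical inputs are hypothetical and chosen only for illustration.

```python
def asymptotic_bias(theta, lam, a):
    """Leading term of E(Xbar_t) - theta from Theorem 2: -(2*theta/lam)/a."""
    return -(2.0 * theta / lam) / a


def bias_corrected_mean(xbar_t, t, lam):
    """Corrected estimator Xbar_t + (2*Xbar_t/lam)/t, with theta replaced by Xbar_t."""
    return xbar_t + (2.0 * xbar_t / lam) / t


# Illustrative inputs: theta = 1, lambda = 1, a = 50; a run that stopped at t = 50.
bias = asymptotic_bias(theta=1.0, lam=1.0, a=50.0)
corrected = bias_corrected_mean(xbar_t=0.96, t=50, lam=1.0)
```

The correction replaces the unknown $\theta$ and $a$ in the bias term by $\bar X_t$ and $t$, so it is computable from the data at the stopping time.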
3. Average coverage probability

It is easy to see that the gamma family (1.1) is a one-parameter exponential distribution

$$F_\omega(dx)=\exp[\omega x-\psi(\omega)]\,\Lambda(dx) \quad (-\infty<\omega<0)$$

with $\omega=-\lambda/\theta$, $\psi(\omega)=-\lambda\log(-\omega)$ and $\Lambda(dx)=x^{\lambda-1}dx/\Gamma(\lambda)$ for $x>0$. Throughout this section $\xi$ denotes a density of $\omega$ such that for some $-\infty<\omega_0<\omega_1<0$ and $q\ge 2$

$$\xi(\omega)=(\omega-\omega_0)^q(\omega_1-\omega)^q\,\xi_0(\omega), \quad \omega_0<\omega<\omega_1, \qquad (3.1)$$
where to is positive and q times continuously differentiable on o
and and
(3.2)
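where … and

$$\rho_2(\omega,z)=-\tfrac12 z\phi(z)\xi_2(\omega)+\tfrac16(3z+z^3)\phi(z)\xi_1(\omega)\kappa_3+\tfrac1{24}\kappa_4(3z+z^3)\phi(z)-\tfrac1{72}\kappa_3^2(15z+5z^3+z^5)\phi(z),$$

with $\phi=\Phi'$, $\xi_1=\xi'/(\sigma\xi)$, $\xi_2=\xi''/(\sigma^2\xi)$, $\kappa_3=2\lambda^{-1/2}$ and $\kappa_4=6/\lambda$. See (13) of Woodroofe (1986). Note that

$$|\hat\theta_t-\theta|\le h \quad\text{iff}\quad U(-h,\hat\omega_t)\le\omega\le U(h,\hat\omega_t),$$

where $\hat\omega_t=-\lambda/\hat\theta_t$ and

$$U(x,\omega)=\lambda/(\lambda/\omega-x) \quad\text{for } x>\lambda/\omega.$$

Let

$$C_t^+=t^{1/2}\hat\sigma_t[U(h,\hat\omega_t)-\omega_t] \qquad (3.3)$$

and

$$C_t^-=-t^{1/2}\hat\sigma_t[U(-h,\hat\omega_t)-\omega_t].$$

Then we have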
liI&-dl
iff -C;
distributions
are unaffected
by optional
stoppings,
it follows
from
PQ&8l
Lemma 3. If $m\lambda>2$, then for $0<\delta<1$, $P^\xi(t\le\delta a)=o(h^2)$ as $h\to 0$.
(3.4)
Proof. It is easy to see that for 8i
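Then the proof follows from Lemma 1. □

Define $B_a$ by

$$B_a=\{t\ge\delta a,\ \omega_0+\log t/t^{1/2}\le\omega_t\le\omega_1-\log t/t^{1/2}\}$$

for $0<\delta<1$. Then from Lemma 1 of Woodroofe (1987) and Lemma 3

$$P^\xi(B_a^c)=o(h^2) \quad\text{as } h\to 0,$$

where $B_a^c$ denotes the complement of $B_a$. Hence we have

$$P^\xi(|\hat\theta_t-\theta|\le h)=P^\xi(|\hat\theta_t-\theta|\le h,\ B_a)+o(h^2). \qquad (3.5)$$

It follows from (3.4) that

$$P^\xi(|\hat\theta_t-\theta|\le h,\ B_a)=\int_{B_a}[\Phi(C_t^+)-\Phi(-C_t^-)]\,dP^\xi$$
$$+h^2\int_{B_a}(th^2)^{-1/2}h^{-1}[\rho_1(\omega_t,C_t^+)-\rho_1(\omega_t,C_t^-)]\,dP^\xi$$
$$+h^2\int_{B_a}(th^2)^{-1}[\rho_2(\omega_t,C_t^+)+\rho_2(\omega_t,C_t^-)]\,dP^\xi+o(h^2)$$
$$=\beta_0(h)+h^2\beta_1(h)+h^2\beta_2(h)+o(h^2) \quad\text{(say)},$$

where we used the fact that

$$\int_{B_a}(a/t)^{3/2}R(\omega_t,C_t^\pm)\,dP^\xi=o(1).$$

See (16) and Lemma 1 of Woodroofe (1986).

Lemma 4. Suppose that $\hat\theta_t=\bar X_t+b\hat\sigma_t/t$ for some constant $b$. Then as $h\to 0$,

$$\beta_1(h)=\phi(c)\int(c\theta^2)^{-1}\{2(c^2-b\lambda^{1/2})(c^2-9)/3\}\,\xi(\omega)\,d\omega+o(1)$$

and

$$\beta_2(h)=\phi(c)\int(c\theta^2)^{-1}r_c\,\xi(\omega)\,d\omega+o(1), \qquad (3.6)$$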
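where $r_c=-12+\tfrac52(3+c^2)-(15+5c^2+c^4)/9$.

Proof. Observe that

$$th^2/\hat\sigma_t^2=(c^2/a)\ell_tZ_t=(c^2/a)\ell_t(R_a+a)=c^2+(c^2/a)R_a+(c^2/a)(\ell_t-1)(R_a+a).$$

Let $y_t=(th^2)^{1/2}/\hat\sigma_t-c$. Then

$$y_t=[c+(th^2)^{1/2}/\hat\sigma_t]^{-1}\{th^2/\hat\sigma_t^2-c^2\}$$
$$=\{[c+(th^2)^{1/2}/\hat\sigma_t]a\}^{-1}\{c^2R_a+(a/t)c^2t(\ell_t-1)+c^2(\ell_t-1)R_a\}. \qquad (3.7)$$

It follows from (3.3) that

$$C_t^+=t^{1/2}\hat\sigma_t[U(0,\hat\omega_t)+hU_1(0,\hat\omega_t)+(h^2/2)U_2(0,\hat\omega_t)+(h^3/6)U_3(h^+,\hat\omega_t)-\omega_t],$$

where $|h^+|\le h$ and $U_i(x,\omega)=(\partial^i/\partial x^i)U(x,\omega)$ $(i=1,2,3)$. Hence we have

$$C_t^+=c+t^{1/2}\hat\sigma_t(\hat\omega_t-\omega_t)+[t^{1/2}\hat\sigma_thU_1(0,\omega_t)-c]$$
$$+t^{1/2}\hat\sigma_th[U_1(0,\hat\omega_t)-U_1(0,\omega_t)]+t^{1/2}\hat\sigma_t[(h^2/2)U_2(0,\hat\omega_t)+(h^3/6)U_3(h^+,\hat\omega_t)]. \qquad (3.8)$$

Since

$$t^{1/2}\hat\sigma_thU_1(0,\omega_t)-c=y_t \qquad (3.9)$$

and

$$\hat\omega_t-\omega_t=\lambda b\hat\sigma_t/(t\bar X_t\hat\theta_t), \qquad (3.10)$$

it follows from (3.8) that

$$C_t^+=c+h\{\lambda b\hat\sigma_t^2/[(th^2)^{1/2}\bar X_t\hat\theta_t]+(th^2)^{1/2}\hat\sigma_tU_2(0,\hat\omega_t)/2\}$$
$$+h^2\{y_t/h^2+[b/(th^2)^{1/2}][\hat\sigma_t^2/(\bar X_t\hat\theta_t)](\hat\omega_t+\omega_t)+(th^2)^{1/2}\hat\sigma_tU_3(h^+,\hat\omega_t)/6\}. \qquad (3.11)$$

Likewise

$$C_t^-=c-h\{\lambda b\hat\sigma_t^2/[(th^2)^{1/2}\bar X_t\hat\theta_t]+(th^2)^{1/2}\hat\sigma_tU_2(0,\hat\omega_t)/2\}$$
$$+h^2\{y_t/h^2+[b/(th^2)^{1/2}][\hat\sigma_t^2/(\bar X_t\hat\theta_t)](\hat\omega_t+\omega_t)+(th^2)^{1/2}\hat\sigma_tU_3(h^-,\hat\omega_t)/6\} \qquad (3.12)$$

with $|h^-|\le h$. Hence as $h\to 0$, $(C_t^\pm-c)/h$ converges almost surely to $\pm(b\lambda^{1/2}-c^2)/(c\sigma\lambda^{1/2})$. By expanding $\rho_1(\omega_t,C_t^\pm)$ around $c$, we have

$$\rho_1(\omega_t,C_t^+)=\rho_1(\omega_t,c)+\rho_1'(\omega_t,c)(C_t^+-c)+\tfrac12\rho_1''(\omega_t,C_t^{+\prime})(C_t^+-c)^2$$

and

$$\rho_1(\omega_t,C_t^-)=\rho_1(\omega_t,c)+\rho_1'(\omega_t,c)(C_t^--c)+\tfrac12\rho_1''(\omega_t,C_t^{-\prime})(C_t^--c)^2,$$

where $C_t^{\pm\prime}$ are intermediate points. Hence the integrand in $\beta_1(h)$ is equal to

$$(th^2)^{-1/2}h^{-1}\{\rho_1'(\omega_t,c)(C_t^+-C_t^-)+\tfrac12[\rho_1''(\omega_t,C_t^{+\prime})(C_t^+-c)^2-\rho_1''(\omega_t,C_t^{-\prime})(C_t^--c)^2]\},$$

which converges almost surely to

$$2(b\lambda^{1/2}-c^2)\,\omega^2\rho_1'(\omega,c)/(c^2\lambda^{3/2})$$

as $h\to 0$. Thus the dominated convergence theorem implies that as $h\to 0$,

$$\beta_1(h)\to\int 2(b\lambda^{1/2}-c^2)\,\omega^2\rho_1'(\omega,c)/(c^2\lambda^{3/2})\,\xi(\omega)\,d\omega=\phi(c)\int(c\theta^2)^{-1}\{2(c^2-b\lambda^{1/2})(c^2-9)/3\}\,\xi(\omega)\,d\omega.$$

Likewise it can be shown that as $h\to 0$,

$$\beta_2(h)\to\int 2(c^2\sigma^2)^{-1}\rho_2(\omega,c)\,\xi(\omega)\,d\omega=\phi(c)\int(c\theta^2)^{-1}r_c\,\xi(\omega)\,d\omega.$$

Hence the proof is completed. □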
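Theorem 3. Suppose that $m\lambda>2$ and $\hat\theta_t=\bar X_t+b\hat\sigma_t/t$; then as $h\to 0$,

$$P^\xi(\theta\in I_t)=\gamma+h^2\phi(c)\int(c\theta^2)^{-1}\{\lambda(\rho+\ell_0)-4b\lambda^{1/2}+2c^2-(c^2-b\lambda^{1/2})^2+2(c^2-b\lambda^{1/2})(c^2-9)/3+r_c\}\,\xi(\omega)\,d\omega+o(h^2),$$

where $r_c=-12+\tfrac52(3+c^2)-(15+5c^2+c^4)/9$.

Proof. It follows from (3.5), (3.6) and Lemma 4 that

$$P^\xi(\theta\in I_t)=\beta_0(h)+h^2\phi(c)\int(c\theta^2)^{-1}\{2(c^2-b\lambda^{1/2})(c^2-9)/3+r_c\}\,\xi(\omega)\,d\omega+o(h^2). \qquad (3.13)$$

Now we proceed to $\beta_0(h)$. By expanding $\Phi(C_t^\pm)$ around $c$, we have

$$\Phi(C_t^\pm)=\Phi(c)+\phi(c)(C_t^\pm-c)-C_t^{\pm\prime}\phi(C_t^{\pm\prime})(C_t^\pm-c)^2/2,$$

where $C_t^{\pm\prime}$ are intermediate points. Then

$$\Phi(C_t^+)-\Phi(-C_t^-)=\Phi(C_t^+)+\Phi(C_t^-)-1$$
$$=\gamma+\phi(c)(C_t^+-c+C_t^--c)-\tfrac12[C_t^{+\prime}\phi(C_t^{+\prime})(C_t^+-c)^2+C_t^{-\prime}\phi(C_t^{-\prime})(C_t^--c)^2].$$

Thus we have

$$\beta_0(h)=\gamma+h^2\int_{B_a}\phi(c)h^{-2}[(C_t^+-c)+(C_t^--c)]\,dP^\xi$$
$$-(h^2/2)\int_{B_a}[C_t^{+\prime}\phi(C_t^{+\prime})h^{-2}(C_t^+-c)^2+C_t^{-\prime}\phi(C_t^{-\prime})h^{-2}(C_t^--c)^2]\,dP^\xi$$
$$=\gamma+h^2\beta_{01}(h)-h^2\beta_{02}(h) \quad\text{(say)}. \qquad (3.14)$$

It follows from (3.7), (3.11) and (3.12) that as $h\to 0$,

$$h^{-2}[(C_t^+-c)+(C_t^--c)]\Rightarrow 2[\lambda(R+\ell_0)/(2c\theta^2)-2b/(c\sigma^2\lambda^{1/2})]+2c^2/(c\theta^2)$$
$$=(c\theta^2)^{-1}[\lambda(R+\ell_0)-4\lambda^{1/2}b+2c^2],$$

where $R_a\Rightarrow R$ and the distribution of $R$ does not depend on $\theta$. Then by Lemma 2 and the dominated convergence theorem, as $h\to 0$

$$\beta_{01}(h)\to\phi(c)\int(c\theta^2)^{-1}[\lambda(\rho+\ell_0)-4\lambda^{1/2}b+2c^2]\,\xi(\omega)\,d\omega.$$

Likewise as $h\to 0$

$$\beta_{02}(h)\to\phi(c)\int(c\theta^2)^{-1}(b\lambda^{1/2}-c^2)^2\,\xi(\omega)\,d\omega.$$

Hence it follows from (3.14) that

$$\beta_0(h)=\gamma+h^2\phi(c)\int(c\theta^2)^{-1}[\lambda(\rho+\ell_0)-4\lambda^{1/2}b+2c^2-(b\lambda^{1/2}-c^2)^2]\,\xi(\omega)\,d\omega+o(h^2).$$

Thus the result follows from (3.13). □

The following corollary is immediately obtained.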
Corollary. Suppose that $m\lambda>2$ and $\hat\theta_t=\bar X_t+b\hat\sigma_t/t$; then

$$P^\xi(\theta\in I_t)\ge\gamma+o(h^2) \quad\text{as } h\to 0$$

for all $\xi$ of the form (3.1) if

$$\lambda(\rho+\ell_0)-4b\lambda^{1/2}+2c^2-(c^2-b\lambda^{1/2})^2+2(c^2-b\lambda^{1/2})(c^2-9)/3+r_c\ge 0.$$

Let $I_t$ and $I_{tc}$ be intervals with $\hat\theta_t=\bar X_t$ and $\hat\theta_t=\bar X_t+(2\bar X_t/\lambda)/t$, respectively. Then it follows from Theorem 3 that

$$P^\xi(\theta\in I_{tc})-P^\xi(\theta\in I_t)=h^2(8/3)c\,\phi(c)\int\theta^{-2}\xi(\omega)\,d\omega+o(h^2),$$

which shows that the confidence interval with bias correction asymptotically improves the average coverage probability of the confidence interval with no bias correction. From the corollary, if $\ell_0$ satisfies

$$\lambda(\rho+\ell_0)\ge\tfrac49c^4-\tfrac{11}{18}c^2+\tfrac{37}{6}, \qquad (3.15)$$

then

$$P^\xi(\theta\in I_{tc})\ge\gamma+o(h^2) \quad\text{as } h\to 0$$

for all $\xi$ of the form (3.1).
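The gain from bias correction can be checked by direct arithmetic on the bracketed integrand of Theorem 3. The sketch below evaluates that bracket as a function of $u=b\lambda^{1/2}$; the case $u=2$ corresponds to the correction $(2\bar X_t/\lambda)/t$ and $u=0$ to no correction. The function names are hypothetical, and the coefficient $5/2$ in `r_c` is an assumed reading of the constant $r_c$; the gain computed below does not depend on it because $r_c$ cancels in the difference.

```python
def r_c(c):
    # Assumed reading of r_c; the coefficient 5/2 is a reconstruction, not verified.
    return -12.0 + 2.5 * (3.0 + c ** 2) - (15.0 + 5.0 * c ** 2 + c ** 4) / 9.0


def bracket(c, u, lam_rho_ell):
    """Bracketed integrand of Theorem 3, with u = b * lambda**0.5 and
    lam_rho_ell = lambda * (rho + ell_0)."""
    return (lam_rho_ell - 4.0 * u + 2.0 * c ** 2 - (c ** 2 - u) ** 2
            + 2.0 * (c ** 2 - u) * (c ** 2 - 9.0) / 3.0 + r_c(c))


c = 1.645  # approximate normal quantile for gamma = 0.9
# Difference between corrected (u = 2) and uncorrected (u = 0) brackets.
gain = bracket(c, 2.0, 0.0) - bracket(c, 0.0, 0.0)
# Smallest lambda*(rho + ell_0) that makes the corrected bracket nonnegative.
threshold = -bracket(c, 2.0, 0.0)
```

Dividing `gain` by $c$ gives the factor $(8/3)c$ in the displayed difference of average coverage probabilities, and `threshold` is the right-hand side of a condition of the form (3.15) under the assumed $r_c$.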
4. Simulation result

A simulation study is carried out to investigate the main results when the observations come from the exponential distribution with density $f(x;\theta)=\theta^{-1}\exp(-x/\theta)$, $x>0$ $(\theta>0)$. The confidence coefficient is chosen to be $\gamma=0.9$ and the initial sample size is $m=3$ throughout the simulation. The simulation results are based on 5000 replications. First we consider the stopping time (1.2) with $\ell_0=0$. Tables 1 and 2 give the estimates of $E(\bar X_t)$ and $E(\hat\theta_t)$ with $\hat\theta_t=\bar X_t+2(\bar X_t/t)$, and each coverage probability, for $\theta=0.8\ (0.1)\ 1.2$ and $h=0.2\ (0.05)\ 0.5$. Table 1 shows that $\bar X_t$ underestimates $\theta$ and that $\hat\theta_t$ is closer to $\theta$ than $\bar X_t$. From Table 2 we observe that the coverage probability of the confidence interval with bias correction improves on that of the confidence interval with no bias correction, but is well below the nominal value. Next we consider the stopping time (1.2) with $\ell_n=1+(8/n)$. The value $\ell_0=8$ is chosen to satisfy (3.15). Table 3 gives the estimate of the coverage probability of the confidence
Table 1
Average values of the estimates ($\gamma=0.9$, $\ell_0=0$)

h       θ = 0.8   θ = 0.9   θ = 1.0   θ = 1.1   θ = 1.2
0.50    0.650     0.727     0.813     0.890     0.975
        0.866     0.935     1.011     1.079     1.154
0.45    0.646     0.720     0.814     0.897     0.987
        0.834     0.898     0.982     1.055     1.136
0.40    0.646     0.727     0.813     0.906     1.000
        0.805     0.876     0.952     1.036     1.120
0.35    0.649     0.742     0.832     0.930     1.035
        0.778     0.860     0.942     1.030     1.127
0.30    0.665     0.751     0.854     0.958     1.064
        0.763     0.841     0.936     1.032     1.131
0.25    0.675     0.783     0.886     0.992     1.107
        0.746     0.845     0.942     1.042     1.152
0.20    0.711     0.818     0.937     1.039     1.147
        0.755     0.857     0.971     1.069     1.174

The upper figure is the simulated average value of $E(\bar X_t)$ and the lower figure is that of $E(\hat\theta_t)$.

Table 2
Coverage probabilities ($\gamma=0.9$, $\ell_0=0$)

h       θ = 0.8   θ = 0.9   θ = 1.0   θ = 1.1   θ = 1.2
0.50    0.8922    0.8454    0.7916    0.7396    0.7256
        0.9430    0.9228    0.9082    0.8934    0.8674
0.45    0.8494    0.7744    0.7438    0.7162    0.7170
        0.9200    0.9016    0.8940    0.8614    0.8324
0.40    0.7796    0.7324    0.7238    0.7116    0.7198
        0.9080    0.8894    0.8576    0.8048    0.7776
0.35    0.7310    0.7260    0.7212    0.7296    0.7516
        0.8868    0.8478    0.7896    0.7798    0.7882
0.30    0.7306    0.7144    0.7414    0.7502    0.7700
        0.8330    0.7684    0.7744    0.7778    0.7924
0.25    0.7250    0.7442    0.7698    0.7884    0.8090
        0.7716    0.7782    0.7962    0.8064    0.8220
0.20    0.7766    0.7884    0.8314    0.8382    0.8472
        0.8044    0.8142    0.8454    0.8554    0.8618

The upper figure is the simulated coverage probability of the confidence interval with no bias correction and the lower figure is that of the confidence interval with bias correction.
interval with bias correction. We observe that the coverage probability is closer to the nominal value than in the above case.

Table 3
Coverage probabilities of confidence interval with bias correction ($\gamma=0.9$, $\ell_n=1+8/n$)

h       θ = 0.8   θ = 0.9   θ = 1.0   θ = 1.1   θ = 1.2
0.50    0.9566    0.9332    0.8978    0.8858    0.8832
0.45    0.9362    0.9076    0.8848    0.8908    0.8786
0.40    0.9052    0.8904    0.8834    0.8746    0.8670
0.35    0.8908    0.8780    0.8770    0.8762    0.8802
0.30    0.8796    0.8800    0.8780    0.8834    0.8860
0.25    0.8904    0.8838    0.8792    0.8856    0.8938
0.20    0.8894    0.8904    0.8948    0.0.x944  0.8970
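A simulation of this kind can be sketched in a few lines. The code below is an illustrative reimplementation under stated assumptions, not the authors' program: it draws gamma observations with shape $\lambda$ and mean $\theta$, applies the stopping rule (1.2) with $\ell_n=1+\ell_0/n$, and estimates the coverage of the intervals with and without the correction $(2\bar X_t/\lambda)/t$. The function names, the seed and the default parameter values are hypothetical, and results will differ from the tables because the random numbers differ.

```python
import random
from statistics import NormalDist


def run_once(rng, theta, lam, h, c, m, ell0):
    """One run of stopping rule (1.2); returns the stopping time and sample mean."""
    total, n = 0.0, 0
    while True:
        # gammavariate(alpha, beta) has mean alpha * beta, so this has mean theta.
        total += rng.gammavariate(lam, theta / lam)
        n += 1
        if n < m:
            continue
        xbar = total / n
        ell_n = 1.0 + ell0 / n               # assumed form ell_n = 1 + ell_0 / n
        if n >= (c * c / (h * h)) * ell_n * xbar * xbar / lam:
            return n, xbar


def simulate(theta=1.0, lam=1.0, h=0.3, gamma=0.9, m=3, ell0=0.0,
             reps=5000, seed=0):
    """Estimate E(t) and the coverage of both intervals by Monte Carlo."""
    rng = random.Random(seed)
    c = NormalDist().inv_cdf((1.0 + gamma) / 2.0)
    cover_plain = cover_corrected = sum_t = 0
    for _ in range(reps):
        t, xbar = run_once(rng, theta, lam, h, c, m, ell0)
        corrected = xbar + (2.0 * xbar / lam) / t
        cover_plain += abs(xbar - theta) <= h
        cover_corrected += abs(corrected - theta) <= h
        sum_t += t
    return {"mean_t": sum_t / reps,
            "cover_plain": cover_plain / reps,
            "cover_corrected": cover_corrected / reps}


if __name__ == "__main__":
    print(simulate(ell0=0.0))
    print(simulate(ell0=8.0))
```

Running it with `ell0=0.0` and `ell0=8.0` mirrors the two settings studied in this section; other values of `theta` and `h` can be passed to explore the remaining table cells.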
Acknowledgments

The authors wish to thank the associate editor for pointing out Meslem's thesis, which is related to our problem.

References

Chow, Y.S. and K.F. Yu (1981). The performance of a sequential procedure for the estimation of the mean. Ann. Statist. 9, 184-189.
Meslem, A.-E.-H. (1987a). Asymptotic expansions for confidence intervals with fixed proportional accuracy. Ph.D. thesis, Univ. of Michigan.
Meslem, A.-E.-H. (1987b). Asymptotic expansions for fixed width confidence interval. J. Statist. Plann. Inference 17, 51-65.
Woodroofe, M. (1977). Second order approximation for sequential point and interval estimation. Ann. Statist. 5, 984-995.
Woodroofe, M. (1982). Nonlinear Renewal Theory in Sequential Analysis. Soc. Indust. Appl. Math., Philadelphia.
Woodroofe, M. (1986). Very weak expansions for sequential confidence levels. Ann. Statist. 14, 1049-1067.
Woodroofe, M. (1987). Confidence intervals with fixed proportional accuracy. J. Statist. Plann. Inference 15, 131-146.