Journal of Statistical Planning and Inference 44 (1995) 313-325

Estimation in some binary regression models with prescribed accuracy

Yuan-chin Ivan Chang

Institute of Statistical Science, Academia Sinica, Taipei, Taiwan, ROC

Received 28 April 1992; revised 16 March 1994
Abstract

Let $(X_i, Y_i)$ be independent, identically distributed observations that satisfy a binary regression model; i.e. for each $i=1,2,\dots$, $P(Y_i=1\mid X_i)=F(X_i^{\mathrm T}\beta_0)$, where $F$ is some continuous distribution function, $Y_i\in\{0,1\}$, $X_i\in\mathbb R^p$, and $\beta_0\in\mathbb R^p$ is the unknown parameter vector of the model. The marginal distribution of $X_i$ is assumed to be unknown. Sequential procedures for constructing fixed size confidence regions for $\beta_0$ and linear combinations of $\beta_0$ are proposed and shown to be asymptotically consistent and efficient as the size of the region becomes small. Moreover, a sequential confidence interval for the probability of response at a given factor will also be given.

AMS Subject Classification: Primary 62L12; Secondary 62F25, 62J12

Key words: Logistic regression; Probit analysis; Fixed size confidence set; Sequential estimation; Stopping rule; Last time; Uniform integrability; Asymptotic efficiency

1. Introduction and summary
Binary regression models, such as logistic regression models or probit models, are commonly used as statistical tools in medical applications and many other areas (Cox and Snell, 1989; Finney, 1980; McCullagh and Nelder, 1989). In this paper, we consider a general binary regression model,

$$P(Y_i=1\mid X_i)=F(X_i^{\mathrm T}\beta_0),\qquad i=1,2,\dots,\eqno(1)$$

where $F$ is some continuous distribution function, $\{Y_i\}$ are binary response variables, $\{X_i\}$ are $p$-dimensional covariates and $\beta_0\in\mathbb R^p$ is the unknown parameter vector of the model. For example, if $F$ is a logistic distribution or a normal distribution, then (1) will be called a logistic regression model or a probit model, respectively.
Under some regularity conditions on $F$, the conditional maximum likelihood estimate $\hat\beta_n$ of $\beta_0$ can be shown to be strongly consistent and asymptotically normal; i.e.

$$\hat\beta_n\to\beta_0\quad\text{a.s.},\eqno(2)$$

$$\sqrt n\,(\hat\beta_n-\beta_0)\to_d N(0,\Sigma^{-1})\quad\text{as }n\to\infty,\eqno(3)$$

where

$$\Sigma=E\left[\frac{f^2(X_1^{\mathrm T}\beta_0)}{F(X_1^{\mathrm T}\beta_0)[1-F(X_1^{\mathrm T}\beta_0)]}\,X_1X_1^{\mathrm T}\right]$$

is the Fisher information matrix for $\beta_0$.

Suppose $\Sigma$ is known. For any given $\alpha\in(0,1)$, let $a^2$ be a constant satisfying $P(\chi^2(p)>a^2)=\alpha$. Then, for large enough $n$,

$$\{\beta\in\mathbb R^p:\ n(\beta-\hat\beta_n)^{\mathrm T}\Sigma(\beta-\hat\beta_n)\le a^2\}$$

defines a confidence ellipsoid, centered at $\hat\beta_n$, with approximate $(1-\alpha)\times100\%$ coverage probability and maximum axis equal to $2\sqrt{a^2/(n\lambda)}$, where $\lambda$ is the smallest eigenvalue of $\Sigma$. If we require further that the length of the maximum axis of the confidence ellipsoid be no greater than $2d$, then the best fixed sample size is

$$n_0\approx a^2/(\lambda d^2).\eqno(4)$$
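As a quick numeric illustration of (4), with hypothetical values throughout and with $\lambda$ treated as known (it is unknown in practice, which is exactly what motivates the sequential procedure below):

```python
# Numeric illustration of the best fixed sample size (4); values hypothetical.
from scipy.stats import chi2

p, alpha, d = 3, 0.05, 0.1
a2 = chi2.ppf(1 - alpha, df=p)   # a^2 with P(chi^2(p) > a^2) = alpha
lam = 0.25                       # smallest eigenvalue of Sigma (assumed known)
n0 = a2 / (lam * d ** 2)         # Eq. (4)
print(f"a^2 = {a2:.3f}, n0 = {n0:.0f}")   # a^2 = 7.815, n0 = 3126
```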
Because $\Sigma$ contains the unknown parameter $\beta_0$, it is usually unknown, and so is $\lambda$. Therefore, there is no fixed sample size procedure that can be used to construct a confidence ellipsoid for $\beta_0$ with prescribed accuracy (i.e. the length of the maximum axis of the ellipsoid in this case) and coverage probability simultaneously; sequential procedures are then the only methods that offer the possibility of achieving both goals. The idea of 'fixed size' confidence interval estimation originated with Stein (1945) and was extended by Chow and Robbins (1965). Since then, many authors have applied their ideas to more general cases, such as linear regression models (Albert, 1966; Gleser, 1965; Mukhopadhyay, 1974; Finster, 1985; Martinsek, 1989) and many other statistical models. In this paper, a sequential procedure is proposed for constructing a confidence ellipsoid for the unknown parameter vector $\beta_0$. Under mild conditions on $F$ and by using a 'last time' random variable (Chow and Lai, 1975), the proposed sequential procedure will be shown to be 'asymptotically consistent' and 'asymptotically efficient' (Chow and Robbins, 1965). There is a crucial difference between binary regression models and the previous work on general linear regression models. In the binary regression models defined in (1), the asymptotic covariance matrix of the estimate is usually a complicated function of the unknown parameter of interest, $\beta_0$; under the assumptions of the previous work on general linear models, the asymptotic covariance matrix of the estimate depends only on the design and a nuisance parameter $\sigma^2$ (Finster, 1985). Chang and Martinsek (1992) considered similar problems for logistic regression models, but their arguments depend on the 'natural link function' properties of logistic regression models (McCullagh and Nelder, 1989) and cannot be applied to the current problem. Here, we extend their ideas to some general binary regression models.
This paper is organized in the following way. In Section 2, we state the main theorems and the assumptions. Sketches of the proofs of the theorems are given in Section 3. As applications, a fixed size confidence ellipsoid for linear combinations of $\beta_0$ is given in Section 4, and, for any given factor $X=x\in\mathbb R^p$, a fixed width confidence interval for $F(x^{\mathrm T}\beta_0)$, which is the probability of response at the given factor, is also given there.
2. Main results

Suppose $(X_i, Y_i)$, $i=1,2,\dots$, are i.i.d. observations satisfying (1). Hence

$$Y_i\mid X_i\sim\text{Bernoulli}(p_i),\eqno(5)$$

where $p_i=F(X_i^{\mathrm T}\beta_0)$ and $\beta_0$ is the unknown parameter vector. Assume further that $F$ satisfies the following regularity conditions:

(A1) $F$, $1-F$ and $f$ are log-concave, where $f$ is the density function of $F$ (i.e. $\log F$, $\log[1-F]$ and $\log f$ are concave functions).

(A2) $f$ is twice differentiable.

(A3) $E|\log F(X_1^{\mathrm T}\beta)|<\infty$ for all $\beta\in\mathbb R^p$.

(A4) $f$ is a unimodal density and symmetric about 0.

(A6) $E\|\{f(X_1^{\mathrm T}\beta_0)/F(X_1^{\mathrm T}\beta_0)[1-F(X_1^{\mathrm T}\beta_0)]\}X_1\|<\infty$.

It follows from (5) that the conditional log-likelihood function based on a sample of size $n$ is

$$l_n(\beta)=\frac1n\sum_{i=1}^n\{Y_i\log F(X_i^{\mathrm T}\beta)+(1-Y_i)\log[1-F(X_i^{\mathrm T}\beta)]\}.\eqno(6)$$
Its second derivative is

$$l_n''(\beta)=-\frac1n\sum_{i=1}^n\left\{Y_i\,\frac{f^2(X_i^{\mathrm T}\beta)-f'(X_i^{\mathrm T}\beta)F(X_i^{\mathrm T}\beta)}{F^2(X_i^{\mathrm T}\beta)}+(1-Y_i)\,\frac{f'(X_i^{\mathrm T}\beta)[1-F(X_i^{\mathrm T}\beta)]+f^2(X_i^{\mathrm T}\beta)}{[1-F(X_i^{\mathrm T}\beta)]^2}\right\}X_iX_i^{\mathrm T}.\eqno(7)$$
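The scalar factor in (7) is a routine but error-prone differentiation; a small symbolic check, with $F$ left as an abstract function, can confirm it:

```python
# Symbolic check of (7): differentiate y*log F(t) + (1-y)*log(1-F(t))
# twice in t and compare with the bracketed expression in (7).
import sympy as sp

t, y = sp.symbols('t y')
F = sp.Function('F')
f = F(t).diff(t)          # f = F'
fp = F(t).diff(t, 2)      # f' = F''

loglik = y * sp.log(F(t)) + (1 - y) * sp.log(1 - F(t))
second = sp.diff(loglik, t, 2)

claimed = -(y * (f**2 - fp * F(t)) / F(t)**2
            + (1 - y) * (fp * (1 - F(t)) + f**2) / (1 - F(t))**2)
print(sp.simplify(second - claimed))   # prints 0: the expressions agree
```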
Let $\hat\beta_n$ be the conditional maximum likelihood estimate of $\beta_0$; i.e. $\hat\beta_n$ satisfies the equation $l_n'(\hat\beta_n)=0$. Then, under (A1)-(A3), by Rockafellar (1970) and Chang (1991), it can be shown that $\hat\beta_n\to\beta_0$ almost surely, as $n\to\infty$, provided that the condition below is satisfied:

$$P\{X_1\in V\}<1\quad\text{for every vector subspace }V\subset\mathbb R^p\text{ with }\dim V<p.\eqno(8)$$

Eq. (8) states that the distribution of $X_1$ is nondegenerate on any proper vector subspace of $\mathbb R^p$. We assume that condition (8) holds throughout this paper. Moreover, it can be proved that

$$\sqrt n\,\hat\Sigma_n^{1/2}(\hat\beta_n-\beta_0)\to_d N(0,I)\quad\text{as }n\to\infty,$$

or equivalently,

$$n(\hat\beta_n-\beta_0)^{\mathrm T}\hat\Sigma_n(\hat\beta_n-\beta_0)\to_d\chi^2(p)\quad\text{as }n\to\infty,\eqno(9)$$
where

$$\hat\Sigma_n=\frac1n\sum_{i=1}^n[Y_i-F(X_i^{\mathrm T}\hat\beta_n)]^2\left\{\frac{f(X_i^{\mathrm T}\hat\beta_n)}{F(X_i^{\mathrm T}\hat\beta_n)[1-F(X_i^{\mathrm T}\hat\beta_n)]}\right\}^2X_iX_i^{\mathrm T},\eqno(10)$$

which can be shown to be a strongly consistent estimate of $\Sigma$. Let $\hat\lambda_n$ be the smallest eigenvalue of $\hat\Sigma_n$ and $\lambda$ the smallest eigenvalue of $\Sigma$ as before. Then it follows from the strong consistency of $\hat\beta_n$ that $\hat\lambda_n\to\lambda$ almost surely, as $n\to\infty$. Hence, Eq. (4) suggests a stopping rule,

$$T_d=\inf\{n\ge1:\ n\hat\lambda_n\ge a^2/d^2\},\eqno(11)$$

for constructing a fixed size ellipsoid for $\beta_0$. Then, under some additional moment conditions on $X_i$, the proposed sequential procedure will be shown to have the expected properties of 'asymptotic consistency and efficiency' (Chow and Robbins, 1965).
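To make the procedure concrete, the following is a minimal simulation sketch of the stopping rule (11) in the logistic case $F(t)=1/(1+e^{-t})$. The true parameter, the covariate distribution, the burn-in length, the ridge term and the Newton solver are illustrative assumptions of the sketch, not prescriptions of the paper.

```python
# A minimal sketch of the sequential procedure (11) for logistic F.
# The true beta0, the N(0, I) covariates, the burn-in of 20 observations
# and the ridged Newton solver are illustrative assumptions only.
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(0)
p, alpha, d = 2, 0.05, 0.2
a2 = chi2.ppf(1 - alpha, df=p)        # P(chi^2(p) > a^2) = alpha
beta0 = np.array([1.0, -0.5])         # hypothetical true parameter

def F(t):                             # logistic distribution function
    return 1.0 / (1.0 + np.exp(-np.clip(t, -30.0, 30.0)))

def mle(X, Y, iters=25):
    """Conditional MLE: solve l_n'(beta) = 0 by Newton's method."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        mu = F(X @ beta)
        grad = X.T @ (Y - mu)                      # logistic score
        W = mu * (1.0 - mu)                        # minus-Hessian weights
        H = (X * W[:, None]).T @ X + 1e-8 * np.eye(X.shape[1])  # small ridge
        beta += np.linalg.solve(H, grad)
    return beta

def sigma_hat(X, Y, beta):
    """Eq. (10); for logistic F the weight simplifies, since f = F(1 - F)."""
    mu = F(X @ beta)
    f = mu * (1.0 - mu)                            # density at X_i^T beta
    w = (Y - mu) ** 2 * (f / (mu * (1.0 - mu))) ** 2
    return (X * w[:, None]).T @ X / len(Y)

# Sample one observation at a time and stop at
# T_d = inf{ n : n * lambda_min(Sigma_hat_n) >= a^2 / d^2 }.
X, Y, n = np.empty((0, p)), np.empty(0), 0
while True:
    n += 1
    x = rng.standard_normal(p)
    y = rng.binomial(1, F(x @ beta0))              # model (1)
    X, Y = np.vstack([X, x]), np.append(Y, y)
    if n < 20:                                     # burn-in so the MLE is stable
        continue
    beta_n = mle(X, Y)
    lam_n = np.linalg.eigvalsh(sigma_hat(X, Y, beta_n)).min()
    if n * lam_n >= a2 / d ** 2:
        break

print(f"T_d = {n}, beta_hat = {beta_n}")
```

At stopping, the ellipsoid $R_{T_d}$ of Theorem 2.1 below is centered at the current estimate with shape matrix $\hat\Sigma_{T_d}$, and by construction its maximum axis is at most $2d$.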
Remark. Since the proposed estimate $\hat\beta_n$ of $\beta_0$ here is a conditional maximum likelihood estimate, the assumptions (A2)-(A4) are very similar to the regularity conditions for the usual maximum likelihood estimate (Serfling, 1980). It is clear that these assumptions are satisfied if $F$ is either a normal or a logistic distribution function. In particular, if $F$ is a logistic distribution function, then we can choose $M_1(X_1)=M_2(X_1)=\|X_1\|^3$.
Theorem 2.1. Suppose (A1)-(A4) are satisfied. Then
(i) $P\{T_d<\infty\}=1$ for every $d\in(0,1)$, and $T_d\lambda d^2/a^2\to1$ with probability one, as $d\to0$;
(ii) $T_d(\hat\beta_{T_d}-\beta_0)^{\mathrm T}\hat\Sigma_{T_d}(\hat\beta_{T_d}-\beta_0)\to_d\chi^2(p)$, as $d\to0$;
(iii) $\lim_{d\to0}P\{\beta_0\in R_{T_d}\}=1-\alpha$ ("asymptotic consistency"),
where

$$R_n=\{\beta\in\mathbb R^p:\ n(\beta-\hat\beta_n)^{\mathrm T}\hat\Sigma_n(\beta-\hat\beta_n)\le a^2\}$$

and $a^2$ satisfies $P(\chi^2(p)>a^2)=\alpha$.

Theorem 2.2. Suppose (A1)-(A6) are satisfied and $E\|X_1\|^4<\infty$. Then

$$\lim_{d\to0}E\left[\frac{T_d\lambda d^2}{a^2}\right]=1\quad(\text{"asymptotic efficiency"}).$$
The third part of Theorem 2.1 states that the coverage probability converges to the required $1-\alpha$ as $d$ goes to 0. Theorem 2.2 says that the ratio of the best (unknown) fixed sample size to the expected random sample size converges to 1 as $d$ approaches 0; in other words, the sequential procedure is asymptotically as "efficient" as the best fixed sample size procedure when the size of the region becomes small. The proof of Theorem 2.1 follows easily by applying Chow and Robbins (1965) and Gleser (1969). The proof of Theorem 2.2 is more complicated and involves the supremum of uncountably many "last times" (Chang and Martinsek, 1992) along with the "log-concavity" of $F$. To prove "asymptotic efficiency", it is natural to try to use the nonlinear renewal theorem first (see, e.g., Lai and Siegmund, 1977, 1979; Woodroofe, 1982). But it turns out that the necessary conditions for those results are very difficult to check for $\hat\beta_n$ in the current setup. Similar difficulties arise when we try to apply the lemmas of Chow and Robbins (1965).
3. Proofs of theorems

Chang and Martinsek (1992) have similar theorems for logistic regression models, but their arguments depend on the "natural link function" properties of logistic regression models, which are special cases of the general binary regression models defined in (1). In this paper, we want to extend their results to a more general setup; hence, we need to exploit the log-concavity of $F$ more deeply. Moreover, the stopping rules in this paper depend directly on the sample covariance matrix instead of the sample "conditional covariance matrix" as in Chang and Martinsek (1992). Although the proofs in this paper are more involved, the ideas behind them are similar to those of Chang and Martinsek (1992). Therefore, only highlights of the proofs will be given.

Recall the conditional likelihood function

$$l_n(\beta)=\frac1n\sum_{i=1}^n\{Y_i\log F(X_i^{\mathrm T}\beta)+(1-Y_i)\log[1-F(X_i^{\mathrm T}\beta)]\}.$$

If (A1)-(A4) are satisfied, then $-l_n''(\beta)$ is positive semidefinite for all $\beta\in\mathbb R^p$. Then, by the SLLN, with probability one,

$$l_n(\beta)\to E\{F(X_1^{\mathrm T}\beta_0)\log F(X_1^{\mathrm T}\beta)+[1-F(X_1^{\mathrm T}\beta_0)]\log[1-F(X_1^{\mathrm T}\beta)]\}=H(\beta)\quad(\text{say}),$$

for all $\beta\in\mathbb R^p$, as $n\to\infty$. Let $H'(\beta)=\partial H(\beta)/\partial\beta$ and $H''(\beta)=\partial^2H(\beta)/\partial\beta^2$ be the first- and second-order partial derivatives of $H$ with respect to $\beta$. By (8) and (A1)-(A4), $-H''(\beta)$ is positive definite for all $\beta\in\mathbb R^p$, so that $\beta_0$ is the unique solution of the equation $H'(\beta)=0$. This implies that $H(\beta)$ has its unique maximum at $\beta_0$, and the strong consistency of $\hat\beta_n$ follows from it (see Chang, 1991; or Rockafellar, 1970).

By the definition of $\hat\beta_n$,
$$l_n'(\hat\beta_n)=\frac1n\sum_{i=1}^n\left[\frac{Y_i}{F(X_i^{\mathrm T}\hat\beta_n)}-\frac{1-Y_i}{1-F(X_i^{\mathrm T}\hat\beta_n)}\right]f(X_i^{\mathrm T}\hat\beta_n)X_i=0.\eqno(12)$$

Hence, by applying a Taylor expansion to (12), we have

$$\sqrt n\,(\hat\beta_n-\beta_0)=[-l_n''(\beta_n^*)]^{-1}\sqrt n\,l_n'(\beta_0),\eqno(13)$$

where $\beta_n^*\in\mathbb R^p$ is between $\hat\beta_n$ and $\beta_0$. (Note that it follows from the definition of $H''$ that $\Sigma=E[\{f^2(X_1^{\mathrm T}\beta_0)/F(X_1^{\mathrm T}\beta_0)[1-F(X_1^{\mathrm T}\beta_0)]\}X_1X_1^{\mathrm T}]=-H''(\beta_0)$.) Then, by the strong consistency of $\hat\beta_n$ and a multivariate version of the Lindeberg-Feller theorem [cf. Rao (1973) and Serfling (1980)], as $n\to\infty$,

$$\sqrt n\,\Sigma^{1/2}(\hat\beta_n-\beta_0)\to_d N(0,I).\eqno(15)$$
Proof of Theorem 2.1. The proof of Part (i) follows from Lemma 1 of Chow and Robbins (1965). The proofs of Parts (ii) and (iii) follow from Gleser (1969) and by applying Kolmogorov's inequality (Chang and Martinsek, 1992). □
Because of Theorem 2.1(i), to prove the "asymptotic efficiency" it is sufficient to show that $\{d^2T_d:\ d\in(0,1)\}$ is uniformly integrable. But the stopping rule proposed here depends on the smallest eigenvalue of $\hat\Sigma_n$, which is also a function of $\hat\beta_n$. In general, since there are no explicit solutions for eigenvalues, the necessary conditions of the usual nonlinear renewal theory, which is commonly used in some sequential problems, will be very difficult to check. To avoid these difficulties, we follow the ideas of Chang and Martinsek (1992); i.e. we show that $\{d^2T_d:\ d\in(0,1)\}$ is uniformly integrable by using several last time random variables.

Remark. The last-time methods that we use here depend on the i.i.d. assumption. If one would like to apply such an idea to the case of nonstochastic covariates, then a "last time" theorem for non-i.i.d. random variables will be needed (Hjort and Fenstad, 1992).

Define a last time

$$L_\rho=\sup\{n\ge1:\ l_n(\beta)-l_n(\beta_0)\ge0,\ \exists\beta\in\partial B_\rho\},\eqno(16)$$

where $B_\rho=\{\beta\in\mathbb R^p:\ \|\beta-\beta_0\|\le\rho\}$, $\rho>0$ is a constant and $\partial B_\rho$ denotes the boundary of $B_\rho$. Then, by the definition of $L_\rho$,

$$\{n>L_\rho\}\subset\{\hat\beta_n\in B_\rho\}.$$

If, in addition, (A6) holds, then we have the following lemma, and the proof of Theorem 2.2 will follow from it.

Lemma 3.1. Assume (A1)-(A6) are satisfied and $E\|X_1\|^4<\infty$; then $EL_\rho<\infty$.

(The proof of Lemma 3.1 will be given at the end of this section.)
Proof of Theorem 2.2. As defined in (11), $T_d$ is the smallest $n\in\mathbb N$ such that $n\hat\lambda_n\ge a^2/d^2$. From the previous discussion, if $n>L_\rho$ then $\hat\beta_n\in B_\rho$. Therefore, for $n>L_\rho$,

$$|X_i^{\mathrm T}\hat\beta_n|\le|X_i^{\mathrm T}(\hat\beta_n-\beta_0)|+|X_i^{\mathrm T}\beta_0|\le\|X_i\|\rho+|X_i^{\mathrm T}\beta_0|\eqno(17)$$

for $i=1,\dots,n$. Assume (A4); then for all $n\in\mathbb N$ and $i=1,\dots,n$, $f(X_i^{\mathrm T}\hat\beta_n)=f(|X_i^{\mathrm T}\hat\beta_n|)$. Since $Y_i\in\{0,1\}$ for $i=1,2,\dots$ and $0<F<1$, the scalar weight in (10) is at least $f^2(X_i^{\mathrm T}\hat\beta_n)$, which by (17) and the unimodality in (A4) is at least $f^2(\|X_i\|\rho+|X_i^{\mathrm T}\beta_0|)$ for $n>L_\rho$. Then, by an inequality on eigenvalues (Bellman, 1960), this implies that for $n>L_\rho$,

$$\hat\lambda_n\ge\lambda_{\min}\left(\frac1n\sum_{i=1}^nM_i\right),$$

where $M_i=f^2(\|X_i\|\rho+|X_i^{\mathrm T}\beta_0|)X_iX_i^{\mathrm T}$. (The notation $\lambda_{\min}(A)$ denotes the smallest eigenvalue of the square matrix $A$.)

Now, define another last-time random variable,

$$L_M=\sup\left\{n\ge1:\ Z^{\mathrm T}\sum_{i=1}^n(M_i-M)Z<-\frac{n\lambda_M}2,\ \exists Z\in\mathbb R^p,\ \|Z\|=1\right\},\eqno(18)$$

where $M=E(M_1)$ and $\lambda_M=\lambda_{\min}(M)$. It follows from Wilkinson (1965) and the definition of $L_M$ that, for $n>L_M$,

$$Z^{\mathrm T}\left(\frac1n\sum_{i=1}^nM_i\right)Z\ge\frac{\lambda_M}2,\qquad\forall Z\in\mathbb R^p\ \text{with }\|Z\|=1.\eqno(19)$$

Hence, by (18) and (19), if $n>\max(L_\rho,L_M)$, then $\hat\lambda_n\ge\lambda_M/2$. Therefore, for $d\in(0,1)$,

$$d^2T_d=d^2T_d\,1\{T_d>\max(L_\rho,L_M)\}+d^2T_d\,1\{T_d\le\max(L_\rho,L_M)\}\le\frac{2a^2}{\lambda_M}+1+\max(L_\rho,L_M).\eqno(20)$$

By Lemma 3.2 below, $EL_M<\infty$. So $\{d^2T_d:\ d\in(0,1)\}$ is uniformly integrable. This completes the proof of Theorem 2.2. □

Lemma 3.2. If $E\|X_1\|^4<\infty$, then $EL_M<\infty$.

[The proof of Lemma 3.2 follows by arguments similar to those of Chang and Martinsek's (1992) Lemma 2.1 and will not be given here.] Note that for $p>1$, the last-time random variables in Lemmas 3.1 and 3.2 are suprema of uncountably many last times for random walks of the type considered by Chow and Lai (1975).
Proof of Lemma 3.1. First, note that by assumptions (A1) and (A2), for $y\in\{0,1\}$ and $t\in\mathbb R$,

$$y\,\frac{f'(t)F(t)-f^2(t)}{F^2(t)}-(1-y)\,\frac{f'(t)[1-F(t)]+f^2(t)}{[1-F(t)]^2}\eqno(21)$$

$$\le\tfrac12[(2y-1)f'(t)]+h(t),\eqno(22)$$

where $h(t)$ does not depend on $y$ and is symmetric about 0. Let $g(t)=\tfrac12[(2y-1)f'(t)]+h(t)$ and $k(t)=-\tfrac12f'(|t|)+h(t)$. Then, by assumption (A5), we have the inequalities

$$\tfrac12(2y-1)f'(t)\le-\tfrac12f'(|t|),\qquad\forall t\in\mathbb R,$$

and

$$g(t)\le k(t),\qquad\forall t\in\mathbb R.$$

Moreover, by assumptions (A1), (A5) and the symmetry of $h(t)$, it can be shown that $k(t)$ is increasing for $t\in[0,\infty)$ and symmetric about 0 (Chang, 1992).

Now, by a Taylor expansion, for any $\beta\in\partial B_\rho$,

$$n[l_n(\beta)-l_n(\beta_0)]=\sum_{i=1}^n\left[\frac{Y_i}{F(X_i^{\mathrm T}\beta_0)}-\frac{1-Y_i}{1-F(X_i^{\mathrm T}\beta_0)}\right]f(X_i^{\mathrm T}\beta_0)X_i^{\mathrm T}(\beta-\beta_0)\eqno(23)$$

$$+\frac12(\beta-\beta_0)^{\mathrm T}\left[\sum_{i=1}^nG_{n,i}\right](\beta-\beta_0),\eqno(24)$$

where $G_{n,i}=g(X_i^{\mathrm T}\beta_n^*)X_iX_i^{\mathrm T}$ and $\beta_n^*\in\mathbb R^p$ is between $\beta$ and $\beta_0$.

From the previous discussion, if $n>L_\rho$ then $\hat\beta_n\in B_\rho$, and so does $\beta_n^*$. Therefore, replacing $\hat\beta_n$ by $\beta_n^*$ in (17) and taking $t=X_i^{\mathrm T}\beta_n^*$ in $g(t)$ and $k(t)$, we have

$$g(X_i^{\mathrm T}\beta_n^*)\le k(X_i^{\mathrm T}\beta_n^*)=k(|X_i^{\mathrm T}\beta_n^*|)\le k(\|X_i\|\rho+|X_i^{\mathrm T}\beta_0|),\qquad\forall i=1,\dots,n.\eqno(25)$$

(Note that the equality above follows from the symmetry of $k(t)$.)

Let $K_i=K_i(\rho)=k(\|X_i\|\rho+|X_i^{\mathrm T}\beta_0|)X_iX_i^{\mathrm T}$, for $i=1,2,\dots$. Then $\{K_i\}$, $i=1,2,\dots$, are i.i.d. random matrices which depend only on $\rho$ and $\beta_0$. Therefore, by (21) and (25), for any $\beta\in\partial B_\rho$,

$$(\beta-\beta_0)^{\mathrm T}\left[\sum_{i=1}^nG_{n,i}\right](\beta-\beta_0)\le(\beta-\beta_0)^{\mathrm T}\left[\sum_{i=1}^nK_i\right](\beta-\beta_0).\eqno(26)$$

Hence,

$$n[l_n(\beta)-l_n(\beta_0)]=(23)+(24)\le(23)+\tfrac12(26).$$

Since $k(t)<0$, $\forall t>0$, $-EK_1(\rho)$ is positive definite for all $\rho>0$. This implies that for any fixed $\rho>0$,

$$\kappa=\inf_{\beta\in\partial B_\rho}\frac{-(\beta-\beta_0)^{\mathrm T}K(\beta-\beta_0)}{\rho^2}=\lambda_{\min}(-K)>0,\eqno(27)$$

where $K=EK_1(\rho)$. Moreover, since $-n(\beta-\beta_0)^{\mathrm T}K(\beta-\beta_0)\ge n\kappa\rho^2$,

$$(23)+\tfrac12(26)\ge0\ \Longrightarrow\ (23)\ge\tfrac14n\kappa\rho^2\quad\text{or}\quad\tfrac12[(26)-n(\beta-\beta_0)^{\mathrm T}K(\beta-\beta_0)]\ge\tfrac14n\kappa\rho^2.$$

Then the rest of the proof can be completed by modifying the arguments of Chang and Martinsek (1992). □
4. Confidence ellipsoids for linear combinations of β₀

Sometimes we are interested in linear combinations of the components of $\beta_0$ instead of $\beta_0$ itself. For example, we may like to estimate the difference of two particular components, such as $\beta_{0,j}-\beta_{0,k}$, where $\beta_{0,j}$, $j=1,\dots,p$, denotes the $j$th component of $\beta_0$. In this section, we present a sequential procedure for constructing a confidence ellipsoid for several linear combinations of $\beta_0$ at the same time, which can also be shown to be asymptotically consistent and efficient. Moreover, for any given factor $X=x$, we can also construct a fixed width confidence interval for $F(x^{\mathrm T}\beta_0)$, which is the probability of response at the given factor $X=x$.

Let $C$ be any $p\times k$ nonrandom full-rank matrix; then it follows from the previous results, as $n\to\infty$, that

$$\sqrt n\,C^{\mathrm T}(\hat\beta_n-\beta_0)\to_d N(0,C^{\mathrm T}\Sigma^{-1}C),\eqno(28)$$

or

$$n[C^{\mathrm T}(\hat\beta_n-\beta_0)]^{\mathrm T}[C^{\mathrm T}\Sigma^{-1}C]^{-1}[C^{\mathrm T}(\hat\beta_n-\beta_0)]\to_d\chi^2(k).\eqno(29)$$

Replacing $\Sigma$ by its estimate $\hat\Sigma_n$, we have

$$n[C^{\mathrm T}(\hat\beta_n-\beta_0)]^{\mathrm T}[C^{\mathrm T}\hat\Sigma_n^{-1}C]^{-1}[C^{\mathrm T}(\hat\beta_n-\beta_0)]\to_d\chi^2(k)\eqno(30)$$

as $n\to\infty$. For a given $\alpha\in(0,1)$, let

$$\tilde R_n=\{\beta\in\mathbb R^p:\ n[C^{\mathrm T}(\hat\beta_n-\beta)]^{\mathrm T}[C^{\mathrm T}\hat\Sigma_n^{-1}C]^{-1}[C^{\mathrm T}(\hat\beta_n-\beta)]\le z^2\},\eqno(31)$$
where $z^2$ is a constant satisfying $P(\chi^2(k)>z^2)=\alpha$. Then, for large enough $n$, $P(C^{\mathrm T}\beta_0\in\tilde R_n)\approx1-\alpha$ and the length of the maximum axis of $\tilde R_n$ is

$$2\left\{\frac{z^2}{n\,\lambda_{\min}[(C^{\mathrm T}\hat\Sigma_n^{-1}C)^{-1}]}\right\}^{1/2}.\eqno(32)$$

Based on this, if we require the length of the maximum axis of $\tilde R_n$ to be no greater than $2d$, then we can define a stopping rule,

$$T_d^C=\inf\{n\ge1:\ n\lambda_{\min}[(C^{\mathrm T}\hat\Sigma_n^{-1}C)^{-1}]\ge z^2/d^2\}.\eqno(33)$$
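In code, the criterion in (33) is a single eigenvalue check on the current $\hat\Sigma_n$; a hypothetical sketch, with the inputs standing in for quantities computed as in Section 2:

```python
# Sketch of the stopping criterion (33) for linear combinations C^T beta_0.
# sigma_hat_n would be computed from (10); all inputs here are placeholders.
import numpy as np

def stop_linear_comb(sigma_hat_n, C, n, z2, d):
    """True when n * lambda_min[(C^T Sigma_hat_n^{-1} C)^{-1}] >= z^2/d^2."""
    inner = C.T @ np.linalg.inv(sigma_hat_n) @ C        # C^T Sigma_hat^{-1} C
    lam_min = 1.0 / np.linalg.eigvalsh(inner).max()     # lambda_min of inverse
    return n * lam_min >= z2 / d ** 2
```

The identity $\lambda_{\min}[(C^{\mathrm T}\hat\Sigma_n^{-1}C)^{-1}]=1/\lambda_{\max}(C^{\mathrm T}\hat\Sigma_n^{-1}C)$ used in the sketch is the same equivalence exploited in (34) below.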
By matrix theory and arguments similar to those in the proof of Theorem 2.2, the following theorem can be proved.

Theorem 4.1. Suppose (A1)-(A4) are satisfied. Then
(i) $\lim_{d\to0}P\{C^{\mathrm T}\beta_0\in\tilde R_{T_d^C}\}=1-\alpha$ and $\lim_{d\to0}T_d^C\,d^2\lambda_{\min}[(C^{\mathrm T}\Sigma^{-1}C)^{-1}]/z^2=1$ almost surely.
(ii) If, in addition, (A5), (A6) are satisfied and $E\|X_1\|^4<\infty$, then $\lim_{d\to0}E\,T_d^C\,d^2\lambda_{\min}([C^{\mathrm T}\Sigma^{-1}C]^{-1})/z^2=1$.
The proof of (i) follows the same arguments as in the proof of Theorem 2.1 and will be skipped. The proof of Theorem 4.1(ii) is similar to the proof of Theorem 2.2, and only an outline is given below.

For $n$ large enough, $\hat\Sigma_n$ is positive definite with probability one. Then, by matrix theory (cf. Rao, 1973), for $n$ large enough, with probability one, we have that

$$n\lambda_{\min}[(C^{\mathrm T}\hat\Sigma_n^{-1}C)^{-1}]\ge z^2/d^2\iff n[\lambda_{\max}(C^{\mathrm T}\hat\Sigma_n^{-1}C)]^{-1}\ge z^2/d^2\iff\lambda_{\max}(C^{\mathrm T}\hat\Sigma_n^{-1}C)\le nd^2/z^2.\eqno(34)$$

(The notation $\lambda_{\max}(A)$ denotes the maximum eigenvalue of the square matrix $A$.) Moreover,

$$\lambda_{\max}(C^{\mathrm T}\hat\Sigma_n^{-1}C)\le\lambda_{\max}(\hat\Sigma_n^{-1})\,\lambda_{\max}(C^{\mathrm T}C)=[\lambda_{\min}(\hat\Sigma_n)]^{-1}\lambda_{\max}(C^{\mathrm T}C).\eqno(35)$$

By definition, $C$ is of full rank, so that $\lambda_{\max}(C^{\mathrm T}C)>0$. Therefore, by (34) and (35),

$$[\lambda_{\min}(\hat\Sigma_n)]^{-1}\lambda_{\max}(C^{\mathrm T}C)\le nd^2/z^2\ \Longrightarrow\ \lambda_{\max}(C^{\mathrm T}\hat\Sigma_n^{-1}C)\le nd^2/z^2.$$

Now, let

$$\tilde T_d=\inf\{n\ge1:\ n\lambda_{\min}(\hat\Sigma_n)\ge(z^2/d^2)\lambda_{\max}(C^{\mathrm T}C)\}.\eqno(36)$$

Then, with probability one, $T_d^C\le\tilde T_d$. By arguments similar to those in the proof of Theorem 2.2, it can be shown that $\{d^2\tilde T_d:\ d\in(0,1)\}$ is uniformly integrable. This implies that $\{d^2T_d^C:\ d\in(0,1)\}$ is uniformly integrable and completes the proof of Theorem 4.1(ii).

For any given factor $X=x$, by the $\delta$-method,

$$\sqrt n\,[F(x^{\mathrm T}\hat\beta_n)-F(x^{\mathrm T}\beta_0)]\to_d N(0,\sigma^2)\quad\text{as }n\to\infty,\eqno(37)$$

where $\sigma^2=(x^{\mathrm T}\Sigma^{-1}x)[f(x^{\mathrm T}\beta_0)]^2$. Thus,

$$I_n=\left[F(x^{\mathrm T}\hat\beta_n)-\frac{z_{1-\alpha/2}\,\hat\sigma_n}{\sqrt n},\ F(x^{\mathrm T}\hat\beta_n)+\frac{z_{1-\alpha/2}\,\hat\sigma_n}{\sqrt n}\right]$$

is a $(1-\alpha)\times100\%$ confidence interval for $F(x^{\mathrm T}\beta_0)$, where $z_{1-\alpha/2}$ is a constant satisfying $\Phi(z_{1-\alpha/2})-\Phi(-z_{1-\alpha/2})=1-\alpha$. If we require further that the length of the interval be no more than $2d$, then the best (unknown) fixed sample size is

$$n_F=\frac{\sigma^2}{d^2}z_{1-\alpha/2}^2.\eqno(38)$$

This suggests a stopping rule for constructing a fixed-width confidence interval for $F(x^{\mathrm T}\beta_0)$; i.e.

$$T_d^F=\inf\left\{n\ge1:\ n\ge\frac{\hat\sigma_n^2}{d^2}z_{1-\alpha/2}^2\right\},\eqno(39)$$

where $\hat\sigma_n^2=(x^{\mathrm T}\hat\Sigma_n^{-1}x)[f(x^{\mathrm T}\hat\beta_n)]^2$. Once we stop sampling,

$$I_d=\left[F(x^{\mathrm T}\hat\beta_{T_d^F})-d,\ F(x^{\mathrm T}\hat\beta_{T_d^F})+d\right]\eqno(40)$$

will be used as a confidence interval for $F(x^{\mathrm T}\beta_0)$. By the assumption on $f$, $f(x^{\mathrm T}\hat\beta_n)\le f(0)$, $\forall n\in\mathbb N$; note that $f(0)\ne0$ is a constant. Thus, $\hat\sigma_n^2\le(x^{\mathrm T}\hat\Sigma_n^{-1}x)f^2(0)$, $\forall n\in\mathbb N$. Now, let

$$\tilde T_d^F=\inf\left\{n\ge1:\ n\ge\frac{z_{1-\alpha/2}^2}{d^2}(x^{\mathrm T}\hat\Sigma_n^{-1}x)f^2(0)\right\}.$$

Then $T_d^F\le\tilde T_d^F$ with probability one. Hence, by applying Theorem 4.1 to $C=x$, we have the corollary below.
Corollary 4.1. Under the assumptions of Theorem 4.1,
(i) $\lim_{d\to0}P\{F(x^{\mathrm T}\beta_0)\in I_d\}=1-\alpha$ and $\lim_{d\to0}T_d^F/n_F=1$ with probability one;
(ii) $\lim_{d\to0}E[T_d^F/n_F]=1$.
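A corresponding sketch of the stopping rule (39) and interval (40) at a given factor $x$, with the logistic density as an illustrative choice of $f$ and with placeholder inputs:

```python
# Sketch of the fixed-width interval rule (39)-(40) at a given factor x.
# beta_n and sigma_hat_n are the current estimates; logistic f is assumed.
import numpy as np
from scipy.stats import norm

def stop_response_prob(sigma_hat_n, beta_n, x, n, alpha, d):
    """Return (stop?, I_d): stop when n >= z_{1-alpha/2}^2 sigma_n^2 / d^2."""
    z = norm.ppf(1.0 - alpha / 2.0)
    Fx = 1.0 / (1.0 + np.exp(-(x @ beta_n)))             # F(x^T beta_n), logistic
    fx = Fx * (1.0 - Fx)                                 # f(x^T beta_n), logistic
    s2 = (x @ np.linalg.inv(sigma_hat_n) @ x) * fx ** 2  # sigma_hat_n^2
    return n >= z ** 2 * s2 / d ** 2, (Fx - d, Fx + d)   # Eq. (39), Eq. (40)
```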
Acknowledgment

The author would like to thank the referees for their comments, which helped improve the presentation of the paper, and for a useful suggestion that led to the last corollary of the paper.
References

Albert, A. (1966). Fixed size confidence ellipsoids for linear regression parameters. Ann. Math. Statist. 37, 1602-1630.
Bellman, R.E. (1960). Introduction to Matrix Analysis. McGraw-Hill, New York.
Chang, Y.C. (1991). Some sequential estimation problems in logistic regression models. Ph.D. thesis, University of Illinois at Urbana-Champaign.
Chang, Y.C. (1992). Estimation in some binary regression models. Technical Report C-92-9, Institute of Statistical Science, Academia Sinica, Taipei, ROC.
Chang, Y.C. and A.T. Martinsek (1992). Fixed size confidence regions for parameters of a logistic regression model. Ann. Statist. 20, 1953-1969.
Chow, Y.S. and T.L. Lai (1975). Some one-sided theorems on the tail distribution of sample sums with applications to the last time and largest excess of boundary crossings. Trans. Amer. Math. Soc. 208, 51-72.
Chow, Y.S. and H. Robbins (1965). On the asymptotic theory of fixed-width sequential confidence intervals for the mean. Ann. Math. Statist. 36, 457-462.
Chow, Y.S. and H. Teicher (1978). Probability Theory: Independence, Interchangeability, Martingales. Springer, New York.
Cox, D.R. and E.J. Snell (1989). Analysis of Binary Data, 2nd ed. Chapman and Hall, New York.
Finney, D.J. (1980). Probit Analysis, 3rd ed. Cambridge University Press, Cambridge.
Finster, M. (1985). Estimation in the general linear model when the accuracy is specified before data collection. Ann. Statist. 13, 663-675.
Gleser, L.J. (1965). On the asymptotic theory of fixed-size sequential confidence bounds for linear regression parameters. Ann. Math. Statist. 36, 463-467.
Gleser, L.J. (1969). On limiting distributions for sums of a random number of independent random variables. Ann. Math. Statist. 40, 935-941.
Hjort, N.L. and G. Fenstad (1992). On the last time and the number of times an estimator is more than ε from its target value. Ann. Statist. 20, 469-489.
Lai, T.L. and D. Siegmund (1977). A nonlinear renewal theory with applications to sequential analysis I. Ann. Statist. 5, 946-954.
Lai, T.L. and D. Siegmund (1979). A nonlinear renewal theory with applications to sequential analysis II. Ann. Statist. 7, 60-76.
Martinsek, A.T. (1989). Sequential estimation in regression models using analogues of trimmed means. Ann. Inst. Statist. Math. 41, 521-540.
McCullagh, P. and J.A. Nelder (1989). Generalized Linear Models, 2nd ed. Chapman and Hall, New York.
Mukhopadhyay, N. (1974). Sequential estimation of regression parameters in Gauss-Markoff setup. J. Indian Statist. Assoc. 12, 39-43.
Rao, C.R. (1973). Linear Statistical Inference and Its Applications, 2nd ed. Wiley, New York.
Rockafellar, R.T. (1970). Convex Analysis. Princeton University Press, Princeton.
Serfling, R.J. (1980). Approximation Theorems of Mathematical Statistics. Wiley, New York.
Srivastava, M.S. (1967). On fixed-width confidence bounds for regression parameters and mean vector. J. Roy. Statist. Soc. Ser. B 29, 132-140.
Stein, C. (1945). A two-sample test for a linear hypothesis whose power is independent of the variance. Ann. Math. Statist. 16, 243-258.
Wilkinson, J.H. (1965). The Algebraic Eigenvalue Problem. Clarendon Press, Oxford.
Woodroofe, M. (1982). Nonlinear Renewal Theory in Sequential Analysis. Society for Industrial and Applied Mathematics, Philadelphia.