The efficiency of Buehler confidence limits

Statistics & Probability Letters 65 (2003) 21 – 28

Paul Kabaila (a,∗), Chris J. Lloyd (b)

(a) Department of Statistical Science, La Trobe University, Victoria 3086, Australia
(b) Melbourne Business School, 200 Leicester St., Carlton, Victoria 3026, Australia

Received January 2002; received in revised form June 2003

Abstract

The Buehler 1 − α upper confidence limit is as small as possible, subject to the constraints that (a) its coverage probability never falls below 1 − α and (b) it is a non-decreasing function of a designated statistic T. We provide two new results concerning the influence of T on the efficiency of this confidence limit. Firstly, we extend the result of Kabaila (Statist. Probab. Lett. 52 (2001) 145) to prove that, for a wide class of Ts, the T which maximizes the large-sample efficiency of this confidence limit is itself an approximate 1 − α upper confidence limit. Secondly, there may be ties among the possible values of T. We provide the result that breaking these ties by a sufficiently small modification cannot decrease the finite-sample efficiency of the Buehler confidence limit.
© 2003 Elsevier B.V. All rights reserved.

Keywords: Confidence upper limit; Reliability; Biostatistics; Discrete data; Nuisance parameter

1. Introduction

Suppose that the distribution of the data Y is indexed by θ ∈ Θ and that the scalar parameter of interest is ψ = ψ(θ). Also suppose that Y is a discrete random vector taking values in a known countable set 𝒴. This paper is concerned with the construction of a 1 − α (0 < α < 1/2) upper confidence limit for ψ, i.e. a statistic u(Y) satisfying P(ψ ≤ u(Y) | θ) ≥ 1 − α for all θ. Such limits are appropriate when ψ is a measure of loss. One method of finding an upper confidence limit is to invert a test of H₀: ψ(θ) = ψ against H₁: ψ(θ) < ψ using the test statistic t(ψ; Y), which tends to be smaller under H₁ than under H₀. Define the P-value

g(ψ; y) = sup_{θ ∈ Θ : ψ(θ) = ψ} P(t(ψ; Y) ≤ t(ψ; y) | θ)   for each y ∈ 𝒴.



∗ Corresponding author. Tel.: +61-3-9479-2594; fax: +61-3-9479-2466. E-mail address: [email protected] (P. Kabaila).

0167-7152/$ – see front matter © 2003 Elsevier B.V. All rights reserved.
doi:10.1016/S0167-7152(03)00215-3


A confidence set for ψ with coverage probability at least 1 − α is {ψ(θ) : g(ψ(θ); y) ≥ α}. Thus a 1 − α upper confidence limit for ψ is provided by the formula

sup{ψ(θ) : g(ψ(θ); y) ≥ α}    (1)

for observed data y. Suppose, for the moment, that t(ψ; Y) depends on ψ. For example, t(ψ; Y) = (ψ̂ − ψ)/(standard error of ψ̂), where ψ̂ is an estimator of ψ. Assuming (as do Bolshev and Loginov (1966)) that g(ψ; y) is a decreasing continuous function of ψ for each y ∈ 𝒴, the solution for ψ of g(ψ; y) = α provides a 1 − α upper confidence limit for ψ. However, this assumption is not tenable for the discrete data case which we are considering. This is because g(ψ; y), when considered as a function of ψ for fixed y, has "jump" and "drop" discontinuities (Kabaila, 2002b). Henceforth, we restrict attention to t(ψ; Y) a function of Y alone, which we denote by T = t(Y). Formula (1) now becomes the following:

sup{ψ(θ) : P(t(Y) ≤ t(y) | θ) ≥ α, θ ∈ Θ}.    (2)

Under mild regularity conditions, g(ψ; y) is a continuous function of ψ for every y ∈ 𝒴 (see e.g. Kabaila and Lloyd, 1997). Furthermore, for a wide class of models (including binomial and Poisson), parameters of interest ψ and designated statistics T, g(ψ; y) is a non-increasing function of ψ for each y ∈ 𝒴 (Harris and Soms, 1991; Kabaila, 2002a). The combination of g(ψ; y) being continuous and non-increasing in ψ greatly eases the computation of (1).

Formula (2) was presented by Buehler (1957). He also provided the insight that the 1 − α upper confidence limit given by this formula is as small as possible subject to the constraint that it is a non-decreasing function of the designated statistic T. For proofs see Bolshev (1965) (for the case of no nuisance parameters), Jobe and David (1992) and Lloyd and Kabaila (2003). This confidence limit is commonly called the Buehler confidence limit.

The efficiency of a Buehler 1 − α upper confidence limit is measured by how probabilistically small it is. Kabaila (2001) proves that, for T a modified Wald upper confidence limit for ψ with nominal coverage 1 − β (0 < β ≤ 1/2), maximum large-sample efficiency is achieved when β = α. It is reasonable to expect that a similar result will hold for the class A of statistics that are asymptotically equivalent to these designated statistics. For example, the set of signed root likelihood ratio upper confidence limits with nominal coverage 1 − β (0 < β ≤ 1/2) belongs to A.

We provide two new results on how T should be chosen so as to maximize the efficiency of the Buehler confidence limit. Firstly, not all of the designated statistics T which have been proposed in the literature belong to A. For example, neither the simple upper confidence limit for ψ which was suggested as a reasonable choice for T by Buehler (1957) nor its natural generalization, as described by Kabaila and Lloyd (2002), belongs to this class. In Section 2 we describe a class of statistics that includes both these designated statistics and A. This class is much wider than A. We then extend the result of Kabaila (2001): we prove that the T belonging to this much wider class that maximizes the large-sample efficiency of the Buehler 1 − α upper confidence limit is asymptotically equivalent to a modified Wald upper confidence limit with nominal coverage 1 − α. Secondly, there may be ties among the possible values of t(Y). In Section 3 we provide the result that breaking these ties by a sufficiently small modification cannot decrease the finite-sample efficiency of the Buehler confidence limit.
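As a concrete illustration of formula (2) — not an algorithm taken from the paper — consider the simplest possible setting: a single binomial proportion, so that ψ = θ, there is no nuisance parameter, and the designated statistic is T = Y itself. The function names and the grid-search resolution below are illustrative choices.

```python
import math

def binom_cdf(y, n, theta):
    # P(Y <= y) for Y ~ Binomial(n, theta), computed from the pmf.
    return sum(math.comb(n, k) * theta**k * (1 - theta)**(n - k)
               for k in range(y + 1))

def buehler_upper(y, n, alpha=0.05, grid=10_000):
    # Formula (2) with T = Y and no nuisance parameter:
    # sup{theta : P(Y <= y | theta) >= alpha}, found by a grid search.
    return max((i / grid for i in range(grid + 1)
                if binom_cdf(y, n, i / grid) >= alpha),
               default=0.0)
```

In this no-nuisance-parameter case the limit reduces to the one-sided Clopper–Pearson upper limit; for example, buehler_upper(0, 10) is approximately 1 − 0.05^{1/10} ≈ 0.259, and the limit is non-decreasing in y, as the Buehler construction requires.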


2. The large-sample efficiency of the Buehler upper confidence limit

Let T_n denote a sequence of designated statistics, where n is a positive integer measuring sample size. Define Ψ = {ψ(θ) : θ ∈ Θ}, the set of possible values of ψ. It is convenient to introduce the nuisance parameter vector λ, where (ψ, λ) is a one–one function of θ. Define Γ = {(ψ(θ), λ(θ)) : θ ∈ Θ}, the set of possible values of (ψ, λ). In what follows, all supremums with respect to λ are over λ ∈ {λ(θ) : ψ(θ) = ψ, θ ∈ Θ}. Let t denote the observed value of T_n and denote the upper confidence limit (2) by u*_n(t). We re-express (2) as the following:

u*_n(t) = sup{ψ : sup_λ P(T_n ≤ t | ψ, λ) ≥ α},    (3)

assuming that P(T_n ≤ t | ψ, λ) ≥ α for some (ψ, λ) ∈ Γ. We will find a large-sample approximation to a lower bound for the Buehler 1 − α upper confidence limit based on the designated statistic T_n. Suppose that (ψ̂_n, λ̂_n) is an estimator of (ψ, λ) with the same asymptotic distribution as the maximum likelihood estimator. Also suppose that n^{1/2}(ψ̂_n − ψ) converges in distribution to N(0, s²(ψ, λ)). We will restrict attention to the family of designated statistics that satisfy

T_n = ψ̂_n + n^{−1/2} [c_α s(ψ, λ) − r(ψ, λ)] + o_p(n^{−1/2}),    (4)

where c_α = Φ^{−1}(1 − α) and Φ denotes the N(0, 1) distribution function. Let T*_n = ψ̂_n + n^{−1/2} c_α s(ψ̂_n, λ̂_n), a modified Wald upper confidence limit for ψ with nominal coverage 1 − α. Under mild regularity conditions, T*_n = ψ̂_n + n^{−1/2} c_α s(ψ, λ) + o_p(n^{−1/2}), so that r(ψ, λ) ≡ 0 for T_n = T*_n. Thus r(ψ, λ) measures the first-order departure of T_n from T*_n.

The following is the main result of the paper; it provides a large-sample approximation to a lower bound V_n for u*_n(T_n).

Theorem 2.1. Let u*_n(T_n) denote the Buehler 1 − α upper confidence limit based on T_n. Suppose that T_n satisfies (4) and that Assumptions A–C, described in Appendix A, are satisfied. Then for each (ψ, λ) ∈ Γ such that ψ ∈ (inf Ψ, sup Ψ) there exists a sequence of random variables {V_n} such that u*_n(T_n) ≥ V_n for all n and such that

V_n = ψ̂_n + n^{−1/2} c_α s(ψ, λ) + n^{−1/2} [sup_λ r(ψ, λ) − r(ψ, λ)] + o_p(n^{−1/2}).    (5)

This result is proved in Appendix B. Kabaila (2001) proves that, under regularity conditions,

u*_n(T*_n) = ψ̂_n + n^{−1/2} c_α s(ψ, λ) + o_p(n^{−1/2}).    (6)

Combining (5) with (6) gives

u*_n(T_n) − u*_n(T*_n) ≥ V_n − u*_n(T*_n) = n^{−1/2} [sup_λ r(ψ, λ) − r(ψ, λ)] + o_p(n^{−1/2}).    (7)
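The modified Wald upper confidence limit T*_n = ψ̂_n + n^{−1/2} c_α s(ψ̂_n, λ̂_n) is the benchmark in this comparison. The sketch below (illustrative, not from the paper) instantiates it for a binomial proportion, where ψ̂_n = Y/n and s(ψ) = √(ψ(1 − ψ)); the boundary modification that makes the Wald limit "modified" is not specified in this section, so the unmodified form is shown.

```python
from statistics import NormalDist

def wald_upper(y, n, alpha=0.05):
    # Wald-type upper limit psi_hat + n^{-1/2} c_alpha s(psi_hat) for a
    # binomial proportion, with s(psi) = sqrt(psi * (1 - psi)).
    # (Kabaila (2001) uses a *modified* Wald limit; the modification,
    # which matters near psi_hat = 0 or 1, is not reproduced here.)
    psi_hat = y / n
    c_alpha = NormalDist().inv_cdf(1 - alpha)  # c_alpha = Phi^{-1}(1 - alpha)
    return psi_hat + c_alpha * (psi_hat * (1 - psi_hat) / n) ** 0.5
```

For example, wald_upper(5, 10) ≈ 0.5 + 1.645·√0.025 ≈ 0.76. At y = 0 the unmodified limit degenerates to 0, which is exactly the boundary behaviour a modification must repair.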


Since both u*_n(T_n) and u*_n(T*_n) possess the desired coverage properties, inequality (7) provides a large-sample measuring stick for comparing these two upper limits. Unless sup_λ r(ψ, λ) = r(ψ, λ) for all (ψ, λ) ∈ Γ such that ψ ∈ (inf Ψ, sup Ψ), u*_n(T*_n) is more efficient in large samples than u*_n(T_n).

3. The finite-sample efficiency of the Buehler upper confidence limit

There may be ties among the possible values of t(Y). We provide the result that breaking these ties by a sufficiently small modification cannot decrease the finite-sample efficiency of the Buehler confidence limit. Denote the Buehler 1 − α upper confidence limit (2) by u*(y; t). An immediate consequence of (2) is that if t(y) takes the same value for all y in 𝒴̃ ⊂ 𝒴, then u*(y; t) takes the same value for all y in 𝒴̃. This is a very undesirable feature of u*(y; t) in the case that, intuitively, the upper confidence limits should not be identical for all y in 𝒴̃. An illustration is provided by the following example.

Example (Product of binomial parameters). Suppose that Y = (Y₁, Y₂) where Y₁ ∼ Binomial(n, θ₁) and Y₂ ∼ Binomial(n, θ₂) are independent random variables. The parameter of interest is ψ = θ₁θ₂. Let t(Y) be the maximum likelihood estimator of ψ, i.e. t(Y) = (Y₁/n)(Y₂/n). Now t(y) = 0 if either y₁ = 0 or y₂ = 0. Thus u*(y; t) takes the same value for all y in 𝒴̃ = {(0, 0), (0, 1), (0, 2), …, (0, n), (1, 0), (2, 0), …, (n, 0)}. As observed by Harris and Soms (1983, p. 65), this is undesirable since u*((0, 0); t) should, intuitively, not be the same as, say, u*((0, n); t).

The desirability of breaking ties among the possible values of t(Y) follows, in part, from the following result.

Lemma 3.1. Suppose that the designated statistics r(Y) and t(Y) satisfy the following condition: for every y and ỹ in 𝒴, t(y) < t(ỹ) implies that r(y) < r(ỹ). Then u*(y; r) ≤ u*(y; t) for all y ∈ 𝒴.

The proof of this result is straightforward, but this important result has not previously been noted.
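The tie set in the product-of-binomials example is easy to enumerate directly; the following small sketch (illustrative, not code from the paper) confirms that the maximum likelihood statistic collapses 2n + 1 distinct outcomes onto the single value t(y) = 0.

```python
# Enumerate the tie set of the MLE statistic t(y) = (y1/n)(y2/n): every
# outcome with y1 = 0 or y2 = 0 gives t(y) = 0, so the Buehler limit
# based on t is constant across all 2n + 1 of these outcomes.
n = 5
t = lambda y: (y[0] / n) * (y[1] / n)
outcomes = [(y1, y2) for y1 in range(n + 1) for y2 in range(n + 1)]
tie_set = [y for y in outcomes if t(y) == 0.0]
```

Intuitively, (0, n) carries different evidence about θ₁θ₂ than (0, 0), yet both land in tie_set and therefore receive the same Buehler upper limit.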
If the designated statistics r(Y) and t(Y) satisfy the condition stated in this lemma and there exist y*, ȳ ∈ 𝒴 such that t(ȳ) = t(y*) and r(ȳ) < r(y*), then we say that r(Y) is a refinement of t(Y). Suppose t(y) takes the same value for two values y*, ȳ in 𝒴. Define r(ȳ) = t(ȳ) + ε and r(y) = t(y) for all y ≠ ȳ. When 𝒴 is a finite set, r(Y) is a refinement of t(Y) for sufficiently small |ε|. Refinement is a mathematical expression of the idea of breaking ties in a designated statistic t(Y) without violating any other aspects of the ordering induced on the data by this statistic. Kabaila and Lloyd (2003) provide numerical examples and theoretical results showing that, in many cases, if r(Y) is a refinement of t(Y) then u*(y; t) ≥ u*(y; r) for all y in 𝒴, with strict inequality for at least one y. In other words, u*(Y; r) is a better upper confidence limit than u*(Y; t).

Appendix A. List of assumptions required for Theorem 2.1

In this appendix we describe the three sets of assumptions required for Theorem 2.1.
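The tie-breaking construction just described can be checked mechanically against the ordering condition of Lemma 3.1. The helper names below are hypothetical, and the demo reuses the product-of-binomials example; this is a sketch, not the paper's procedure.

```python
def refine(t, y_bar, eps, outcomes):
    # Break the tie at y_bar: r agrees with t except r(y_bar) = t(y_bar) + eps.
    r = {y: t(y) for y in outcomes}
    r[y_bar] += eps
    return r

def preserves_order(t, r, outcomes):
    # The condition of Lemma 3.1: t(y) < t(y~) implies r(y) < r(y~).
    return all(r[y] < r[z] for y in outcomes for z in outcomes if t(y) < t(z))

# Demo: product-of-binomials example with n = 5. Break the tie at
# y_bar = (0, n), taking eps as half the smallest gap between distinct
# values of t, so the original ordering cannot be disturbed.
n = 5
t = lambda y: (y[0] / n) * (y[1] / n)
outcomes = [(a, b) for a in range(n + 1) for b in range(n + 1)]
levels = sorted(set(map(t, outcomes)))
eps = min(b - a for a, b in zip(levels, levels[1:])) / 2
r = refine(t, (0, n), eps, outcomes)
```

With this choice of eps the condition of Lemma 3.1 holds, while (0, n) is now separated from the other outcomes that t tied at zero; a large eps, by contrast, would reorder outcomes and violate the condition.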


Assumption A (Parameter space). Assume that Ψ is an interval with finite endpoints ψ_ℓ and ψ_u (ψ_ℓ < ψ_u).

Assumption B (Continuity and positivity). Define r(ψ, λ) and s(ψ, λ) as in Section 2. Assume that

(a) r(ψ, λ) is a continuous function of (ψ, λ) ∈ Γ,
(b) sup_λ r(ψ, λ) is a continuous function of ψ over every closed sub-interval of (ψ_ℓ, ψ_u),
(c) sup_{(ψ,λ)∈Γ} s(ψ, λ) < ∞,
(d) s(ψ, λ) > 0 for all (ψ, λ) ∈ Γ satisfying ψ ∈ (ψ_ℓ, ψ_u).

Our analysis of the large-sample properties of u*_n(T_n) involves a large-sample approximation to P(T_n ≤ t | ψ, λ) in (3). The following outline argument indicates the form this approximation will take. Under regularity conditions,

P(T_n ≤ t | ψ, λ) = P(ψ̂_n + n^{−1/2} [c_α s(ψ, λ) − r(ψ, λ)] ≤ t | ψ, λ) + o(1)
  = P( n^{1/2}(ψ̂_n − ψ)/s(ψ, λ) ≤ [n^{1/2}(t − ψ) + r(ψ, λ)]/s(ψ, λ) − c_α | ψ, λ ) + o(1)
  = Φ( [n^{1/2}(t − ψ) + r(ψ, λ)]/s(ψ, λ) − c_α ) + o(1).

The approximations used in the derivation of the large-sample distribution of u*_n(T_n) may break down for ψ close to the boundaries of Ψ, necessitating a very careful consideration of such values. Our final set of assumptions is the following.

Assumption C. For every a and b such that ψ_ℓ < a < b < ψ_u the following conditions are satisfied, where we define Γ_ab = {(ψ, λ) ∈ Γ : ψ ∈ [a, b]}:

(a) Γ_ab is a compact subset of ℝ^d,
(b) define the approximation error

ω_n = sup_{((ψ,λ), t) ∈ Γ_ab × ℝ} | P(T_n ≤ t | ψ, λ) − Φ( [n^{1/2}(t − ψ) + r(ψ, λ)]/s(ψ, λ) − c_α ) |.

Then ω_n → 0 as n → ∞.
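As an unofficial numerical illustration of this outline argument (assumptions: binomial data, r ≡ 0, no nuisance parameter, and the unmodified Wald form of T_n), the exact probability P(T_n ≤ t | ψ) can be computed by summing the binomial pmf and compared with the normal approximation Φ(n^{1/2}(t − ψ)/s(ψ) − c_α). All names here are illustrative.

```python
import math
from statistics import NormalDist

def exact_prob(t, psi, n, alpha=0.05):
    # Exact P(T_n <= t | psi): sum the Binomial(n, psi) pmf over outcomes
    # y whose Wald-type statistic T_n(y) = y/n + c_alpha * s(y/n)/sqrt(n)
    # is at most t.
    c = NormalDist().inv_cdf(1 - alpha)
    def T(y):
        p = y / n
        return p + c * (p * (1 - p) / n) ** 0.5
    return sum(math.comb(n, y) * psi**y * (1 - psi)**(n - y)
               for y in range(n + 1) if T(y) <= t)

def approx_prob(t, psi, n, alpha=0.05):
    # The large-sample approximation with r == 0:
    # Phi( n^{1/2} (t - psi) / s(psi) - c_alpha ).
    c = NormalDist().inv_cdf(1 - alpha)
    s = (psi * (1 - psi)) ** 0.5
    return NormalDist().cdf(n**0.5 * (t - psi) / s - c)
```

For n = 400, ψ = 0.3, t = 0.33 and α = 0.05 the two probabilities differ by only a few hundredths, consistent with the o(1) error term above.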

Appendix B. Proof of Theorem 2.1

In this appendix we prove Theorem 2.1 in three parts. For simplicity of exposition, we provide the proof for λ a scalar. As a first step, we extend the definition of u*_n(t) given by (2) to all t ∈ ℝ by defining u*_n(t) = ψ_ℓ when the assumption that P(T_n ≤ t | ψ, λ) ≥ α for some (ψ, λ) ∈ Γ is not satisfied.


Part 1: Fix a and b satisfying ψ_ℓ < a < b < ψ_u and define I = [a, b] and Γ_ab = {(ψ, λ) ∈ Γ : ψ ∈ I}. It follows from Assumption C(b) that

sup_λ Φ( [n^{1/2}(t − ψ) + r(ψ, λ)]/s(ψ, λ) − c_α ) − ω_n ≤ sup_λ P(T_n ≤ t | ψ, λ)   for all (ψ, t) ∈ I × ℝ.    (B.1)

Define g(ψ) = sup_λ r(ψ, λ). Further, define λ(ψ) to be the largest value of λ maximizing r(ψ, λ) with respect to λ. Also define

f_n(t, ψ) = Φ( [n^{1/2}(t − ψ) + r(ψ, λ(ψ))]/s(ψ, λ(ψ)) − c_α ) = Φ( [n^{1/2}(t − ψ) + g(ψ)]/s(ψ, λ(ψ)) − c_α ).

Note that

f_n(t, ψ) ≤ sup_λ Φ( [n^{1/2}(t − ψ) + r(ψ, λ)]/s(ψ, λ) − c_α )   for all (ψ, t) ∈ I × ℝ.    (B.2)

It follows from (B.1) and (B.2) that

f_n(t, ψ) − ω_n ≤ sup_λ P(T_n ≤ t | ψ, λ)   for all (ψ, t) ∈ I × ℝ.    (B.3)

Fix δ satisfying 0 < δ < (b − a)/6. It may be proved that there exists N < ∞ such that for all n ≥ N and all t ∈ [a + 2δ, b − 2δ]:

f_n(t, a + δ) − ω_n ≥ α   and   f_n(t, ψ) − ω_n < α   for all ψ ∈ [b − δ, b].    (B.4)

Part 2: Throughout this part we restrict attention to n satisfying n ≥ N and to t satisfying t ∈ [a + 2δ, b − 2δ]. Define v_n(t) = sup{ψ ∈ I : f_n(t, ψ) − ω_n ≥ α}. It follows from (B.4) that v_n(t) ∈ [a + δ, b − δ] for all t. Also, it follows from (B.3) that

v_n(t) ≤ sup{ψ ∈ I : sup_λ P(T_n ≤ t | ψ, λ) ≥ α} ≤ u*_n(t)

for all t. It follows from the definition of v_n(t) that for each t there exists ψ ∈ [v_n(t) − n^{−1}, v_n(t)] such that f_n(t, ψ) − ω_n ≥ α, i.e. such that

ψ ≤ t + n^{−1/2} g(ψ) − n^{−1/2} [Φ^{−1}(α + ω_n) + c_α] s(ψ, λ(ψ)).    (B.5)

Thus there exists a sequence of numbers {ζ_n}, satisfying −n^{−1} ≤ ζ_n ≤ 0, such that ψ = v_n(t) + ζ_n satisfies (B.5) for each n. Now any ψ ∈ I satisfying (B.5) also satisfies ψ ≤ t + n^{−1/2} M_n, where M_n = sup_{ψ∈I} |g(ψ)| + |Φ^{−1}(α + ω_n) + c_α| sup_{(ψ,λ)∈Γ} s(ψ, λ) < ∞. Thus v_n(t) + ζ_n ≤ t + n^{−1/2} M_n for all n and t, so that

v_n(t) ≤ t + n^{−1/2} M_n + n^{−1}   for all n and t.    (B.6)

It also follows from the definition of v_n(t) that f_n(t, ψ) − ω_n ≤ α for ψ = v_n(t) + n^{−1}, i.e.

ψ ≥ t + n^{−1/2} g(ψ) − n^{−1/2} [Φ^{−1}(α + ω_n) + c_α] s(ψ, λ(ψ)).    (B.7)


Now any ψ ∈ I satisfying (B.7) also satisfies ψ ≥ t − n^{−1/2} M_n. Thus v_n(t) + n^{−1} ≥ t − n^{−1/2} M_n for every t, so that

v_n(t) ≥ t − n^{−1/2} M_n − n^{−1}   for all n and t.    (B.8)

It follows from (B.6) and (B.8) that

t − n^{−1/2} M_n − n^{−1} ≤ v_n(t) ≤ t + n^{−1/2} M_n + n^{−1}   for all n and t.    (B.9)

Since ψ = v_n(t) + ζ_n satisfies (B.5),

v_n(t) ≤ t + n^{−1/2} g(v_n(t) + ζ_n) + n^{−1/2} |Φ^{−1}(α + ω_n) + c_α| sup_{(ψ,λ)∈Γ} s(ψ, λ) + n^{−1}    (B.10)

for all t. Also, since ψ = v_n(t) + n^{−1} satisfies (B.7),

v_n(t) ≥ t + n^{−1/2} g(v_n(t) + n^{−1}) − n^{−1/2} |Φ^{−1}(α + ω_n) + c_α| sup_{(ψ,λ)∈Γ} s(ψ, λ) − n^{−1}    (B.11)

for all t. Now g is uniformly continuous in ψ ∈ I. Thus, from (B.9), |g(v_n(t) + ζ_n) − g(t)| < ε_n and |g(v_n(t) + n^{−1}) − g(t)| < ε_n for all n and t, where ε_n → 0 as n → ∞. Note that Φ^{−1}(α + ω_n) + c_α → 0 as n → ∞. It then follows from (B.10) and (B.11) that for all n ≥ N,

sup_{t ∈ [a+2δ, b−2δ]} | v_n(t) − [t + n^{−1/2} g(t)] | = o(n^{−1/2}).

Now define V_n = v_n(T_n) for T_n ∈ [a + 2δ, b − 2δ] and V_n = ψ_ℓ otherwise, and observe that V_n ≤ u*_n(T_n). Now fix (ψ, λ) ∈ Γ, where ψ ∈ [a + 3δ, b − 3δ]. Observe that P(T_n ∈ [a + 2δ, b − 2δ]) → 1 as n → ∞. Thus for each given η > 0, P(n^{1/2} |V_n − [T_n + n^{−1/2} g(T_n)]| ≥ η) → 0 as n → ∞. In other words, V_n = T_n + n^{−1/2} g(T_n) + o_p(n^{−1/2}), so that V_n = T_n + n^{−1/2} g(ψ) + o_p(n^{−1/2}). Hence V_n satisfies (5).

Part 3: The arguments described in Parts 1 and 2 hold for each fixed (a, b, δ) satisfying ψ_ℓ < a < b < ψ_u and 0 < δ < (b − a)/6. Hence for each (ψ, λ) ∈ Γ satisfying ψ ∈ (ψ_ℓ, ψ_u), there exists a sequence of random variables {V_n} such that u*_n(T_n) ≥ V_n for all n, where V_n satisfies (5).

References

Bolshev, L.N., 1965. On the construction of confidence limits. Theory Probab. Appl. 10, 173–177.
Bolshev, L.N., Loginov, E.A., 1966. Interval estimates in the presence of nuisance parameters. Theory Probab. Appl. 11, 82–94.
Buehler, R.J., 1957. Confidence intervals for the product of two binomial parameters. J. Amer. Statist. Assoc. 52, 482–493.
Harris, B., Soms, A.P., 1983. Recent advances in statistical methods for system reliability using Bernoulli sampling of components. In: DePriest, D.J., Launer, R.L. (Eds.), Reliability in the Acquisitions Process. Marcel Dekker, New York, pp. 55–68.
Harris, B., Soms, A.P., 1991. Theory and counterexamples for confidence limits on system reliability. Statist. Probab. Lett. 11, 411–417.


Jobe, J.M., David, H.T., 1992. Buehler confidence bounds for a reliability-maintainability measure. Technometrics 34, 214–222.
Kabaila, P., 2001. Better Buehler confidence limits. Statist. Probab. Lett. 52, 145–154.
Kabaila, P., 2002a. Computation of exact confidence limits from discrete data. Technical Report No. 2002-6, School of Mathematical and Statistical Sciences, La Trobe University.
Kabaila, P., 2002b. Computation of exact confidence intervals from discrete data using studentized test statistics. Technical Report No. 2002-8, School of Mathematical and Statistical Sciences, La Trobe University.
Kabaila, P., Lloyd, C.J., 1997. Tight upper confidence limits from discrete data. Austral. J. Statist. 37, 193–204.
Kabaila, P., Lloyd, C.J., 2002. The importance of the designated statistic on Buehler upper limits on a system failure probability. Technometrics 44, 390–395.
Kabaila, P., Lloyd, C.J., 2003. Improved Buehler confidence limits based on refined designated statistics, submitted for publication.
Lloyd, C.J., Kabaila, P., 2003. On the optimality and limitations of Buehler bounds. Austral. New Zealand J. Statist. 45, 167–174.