Conditional coverage probability of confidence intervals in errors-in-variables and related models


Statistics & Probability Letters 40 (1998) 165-170

Chen-Hai Andy Tsao
Department of Applied Mathematics, Dong Hwa University, Hualien, Taiwan, ROC
Received May 1997; received in revised form December 1997

Abstract

Gleser and Hwang (1987) show that there exists no nontrivial confidence interval with finite expected length for many errors-in-variables and related models. Nonetheless, the validity of using these confidence intervals only when their realizations are of finite length remains uninvestigated. We scrutinize this practice from a conditional frequentist viewpoint and find it not justifiable. It is shown that under some conditions, the minimum coverage probability of the confidence interval is zero if conditioning on the event that the confidence interval has finite length. The result is then applied to confirm Neyman's conjecture for Fieller's confidence sets. © 1998 Elsevier Science B.V. All rights reserved.

AMS classification: primary 62A99; 62F25

Keywords: Conditional coverage probability; Errors-in-variables models; Fieller's confidence sets; Neyman's conjecture; Conditional inference

1. Introduction

Confidence intervals are among the most widely employed statistical tools. They are regarded either as supplements to point estimators of an unknown parameter or as set estimators in their own right. In most applications, only confidence intervals of finite length are actually reported. This routine practice is pervasive, yet seldom questioned. In this study, we scrutinize it from a conditional frequentist viewpoint. Two guiding questions are: (1) Does there always exist a (valid) confidence interval with finite expected length? (2) Is it justifiable to report a confidence interval only when its realization is of finite length?

The answer to the first question is "NO", as established in Gleser and Hwang (1987). They show that for some errors-in-variables models, there exists no $(1-\alpha)$ confidence set of finite expected length unless $\alpha$ equals 1. For the models they considered, every nontrivial valid confidence interval must have infinite length with positive probability. However, from the practitioner's viewpoint, using such confidence intervals seems impractical and non-informative. As a result, these valid confidence intervals are often employed only when their realized lengths are finite. This raises the concern of whether a $(1-\alpha)$ confidence interval maintains its minimum coverage probability under this conditional implementation. Also, in Fieller's problem, Neyman (1954) suspected that using Fieller's confidence set only when its realization has finite length may lead to a very low conditional coverage probability. The pervasiveness of the practice and Neyman's historic conjecture underline the importance of our study. Unfortunately, we show that the answer to the second question, like the answer to the first, is "NO".

To make the statement precise, we need some definitions. Let $Y$ be a random vector distributed according to a parametric model with unknown parameter (vector) $\theta$. In many cases, some function of this parameter, say $\gamma(\theta)$, is of interest while the rest is a nuisance part. When $\gamma(\theta)$ is a scalar, one can define a $(1-\alpha)$ confidence interval as an interval $C(Y) \equiv [L(Y), U(Y)]$, where $L(Y)$, $U(Y)$ are two measurable functions with $L(Y) \le U(Y)$ for all $Y$ such that

$$\inf_{\theta \in \Theta} P_\theta[\gamma(\theta) \in C(Y)] \ge 1 - \alpha. \tag{1}$$

Any valid $(1-\alpha)$ confidence set has to guarantee that its minimum (infimum) coverage probability is no less than $1-\alpha$. Because shorter confidence intervals are more informative, it is desirable to have an interval of shorter length that still maintains the minimum coverage probability. The length of the interval $C(Y)$ is $D(C) = D(C(Y))$, defined as $U(Y) - L(Y)$. Its dependence on $Y$ is notationally suppressed and should cause no confusion. The expected length of the confidence interval is defined as $E_\theta D(C)$. For higher dimensional cases, one might want to estimate an $m$-vector $\gamma(\theta) = (\gamma_1(\theta), \gamma_2(\theta), \ldots, \gamma_m(\theta))'$ simultaneously by a confidence set $C(Y)$. The diameter $D(C(Y))$ is defined to be the maximum (supremum) distance between any two points in $C(Y)$.

We show that the practice of using only finite-length confidence intervals is not justifiable in the errors-in-variables and related models that Gleser and Hwang (1987) considered. Under some conditions, it is proved that

$$\inf_{\theta} P_\theta[\gamma(\theta) \in C(Y) \mid D(C) < \infty] = 0, \tag{2}$$

where $\gamma(\theta)$ is the parameter of interest and $C(Y)$ is a set estimator (confidence interval or credible interval) for $\gamma(\theta)$. As an application and example, we confirm a general version of Neyman's conjecture regarding Fieller's confidence sets.

2. Results

Let $Y$ be a random variable on $(\mathcal{Y}, \mathcal{B})$, where $\mathcal{B}$ is a $\sigma$-field of $\mathcal{Y}$. Let $\zeta$ be a $\sigma$-finite measure on $(\mathcal{Y}, \mathcal{B})$ and let $Y$ have probabilities determined by one of a parametric family of densities $f(Y \mid \theta)$ relative to $\zeta$ with unknown common support and

$$P_\theta[Y \in A] = \int_A f(Y \mid \theta)\, d\zeta(Y). \tag{3}$$

Assume $\theta = (\theta_1, \theta_2) \in \Theta = \Theta_1 \times \Theta_2$, where $\Theta_1 \subset R^p$, $\Theta_2 \subset R^q$. For reasons of practicality, we also assume that all the confidence intervals considered hereafter satisfy

$$P_\theta[U(Y) = \infty,\ L(Y) = \infty] = P_\theta[U(Y) = -\infty,\ L(Y) = -\infty] = 0. \tag{4}$$

The reason for considering the minimum conditional probability (2) lies in the spirit of frequentist conservatism. We assert that an interval is a $(1-\alpha)$ confidence interval only if its minimum coverage probability is at least $1-\alpha$. Likewise, from a conditional frequentist viewpoint, if a set estimator is employed only when the sample is in a subset $S$ of the sample space, the set should be evaluated by its minimum conditional coverage probability given $S$. A low conditional coverage probability thus indicates poor frequentist performance of the set under such conditional implementation.

Theorem 2.1. Let $\gamma(\theta_1)$ be a scalar function for $\theta_1 \in \Theta_1$. Suppose that there is a subset $\Theta_1^* \subset \Theta_1$ and a $\theta_2^* \in \Theta_2$ such that
(1) $\gamma(\theta_1)$ has unbounded range over $\theta_1 \in \Theta_1^*$;
(2) for each fixed $\theta_1 \in \Theta_1^*$, $Y \in \mathcal{Y}$,

$$\lim_{\theta_2 \to \theta_2^*} f(Y \mid \theta_1, \theta_2) = f(Y \mid \theta_2^*) \tag{5}$$

exists, is a density for $Y$ relative to $\zeta$ and is independent of $\theta_1$; and

$$P_{\theta_2^*}[D(C) < \infty] > 0. \tag{6}$$

Then

$$\inf_{\theta} P_\theta[\gamma(\theta_1) \in C(Y) \mid D(C) < \infty] = 0. \tag{7}$$

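Proof.

$$\inf_{\theta} P_\theta[\gamma(\theta_1) \in C(Y) \mid D(C) < \infty] \tag{8}$$

$$\le \frac{\lim_{\theta_2 \to \theta_2^*} P_\theta[\gamma(\theta_1) \in C(Y),\ D(C) < \infty]}{\lim_{\theta_2 \to \theta_2^*} P_\theta[D(C) < \infty]} \quad \text{for all } \theta_1 \in \Theta_1^*. \tag{9}$$

The denominator of the above expression is positive by (6). Hence, the proof is complete if we can show that the numerator of (9) can be made arbitrarily small by the choice of $\theta_1 \in \Theta_1^*$. Let $M > 0$ and define

$$C^*(Y) = \begin{cases} C(Y) & \text{if } D(C(Y)) < \infty, \\ (-M, M) & \text{if } D(C(Y)) = \infty. \end{cases}$$

Note that the expected length of $C^*(Y)$, $E_\theta D(C^*)$, is finite; therefore

$$\inf_{\theta_1 \in \Theta_1^*} \lim_{\theta_2 \to \theta_2^*} P_\theta[\gamma(\theta_1) \in C^*(Y)] = 0 \tag{10}$$

by the proof of Gleser and Hwang (1987). Moreover, $P_\theta[\gamma(\theta_1) \in C(Y),\ D(C) < \infty] \le P_\theta[\gamma(\theta_1) \in C^*(Y)]$, which together with (10) implies that the infimum over $\theta_1 \in \Theta_1^*$ of the numerator of (9) equals 0. This completes the proof. □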

Theorem 2.1 can be extended to a vector-valued function $\gamma(\theta_1)$.

Theorem 2.2. Let $Y$ be a random vector whose distribution depends on an unknown vector parameter $\theta \in \Theta_1 \times \Theta_2$. Let $\gamma(\theta_1)$ be an $m$-dimensional vector-valued function of $\theta_1$. If there exists a $\theta_2^* \in \Theta_2$ such that for any fixed $\theta_1 \in \Theta_1^* \subset \Theta_1$, condition (5) holds and
(1)

$$\lim_{\theta_2 \to \theta_2^*} P_\theta[D(C) < \infty] > 0 \tag{11}$$

and is independent of $\theta_1$;
(2) there are some $a \in R^m$ and $\theta_1^* \in \Theta_1^*$ such that

$$\lim_{\theta_1 = \theta_1^*,\ \theta_2 \to \theta_2^*} P_\theta[a'\gamma(\theta_1) \in C_a(Y) \mid D(C_a) < \infty] = 0. \tag{12}$$

Then

$$\inf_{\theta} P_\theta[\gamma(\theta_1) \in C(Y) \mid D(C) < \infty] = 0, \tag{13}$$

where $C_a(Y)$ is Scheffé's projection of $C(Y)$ defined by

$$C_a(Y) = \{a'\vartheta \mid \vartheta \in C(Y)\} = a'C(Y). \tag{14}$$

The intuition behind Theorems 2.1 and 2.2 is similar to that in Gleser and Hwang (1987): the (transformed) parameter space of $\gamma(\theta_1)$ has singularities when $\theta_1 \in \Theta_1^*$, while the assumptions in the theorems keep the probability structure stable as $\gamma(\theta_1)$ approaches the singularities. Technically speaking, the assumptions allow the generalized Lebesgue dominated convergence theorem to operate, that is, they permit interchanging the order of the limit and the integration.

3. Applications

Here we apply our theorems to confirm a general version of Neyman's (1954) conjecture. Let $Y \sim N_p(\mu, \sigma^2 I_p)$ with $\mu = (\mu_1, \ldots, \mu_p)'$, where $\sigma^2$ is assumed to be known. Without loss of generality, assume that $\sigma^2 = 1$. The parameter of interest is

$$\rho = (\rho_1, \ldots, \rho_{p-1})', \tag{15}$$

where $\mu_p \ne 0$ and $\rho_i = \mu_i/\mu_p$ for all $i = 1, 2, \ldots, p-1$.

Fieller's confidence set is a confidence set for $\rho$. For $p$ equal to 2, it is given in Fieller (1954) as

$$C_F(Y) = \left\{ \rho : \frac{|Y_1 - Y_2\rho|}{\sqrt{1 + \rho^2}} < \sqrt{c} \right\}, \tag{16}$$

where $c$ is the upper $\alpha$ cutoff point of a chi-square distribution with 1 degree of freedom and $Y = (Y_1, Y_2)'$. In Neyman's discussion of Fieller's paper (1954), he stated:

"It appears that, in all cases, the conditional probability that the statement regarding $\eta/\xi$ will be correct depends on all the unknown parameters and is less than the prescribed confidence coefficient. In unfavorable conditions, this conditional probability may be close to zero."

The case Neyman considered is $p = 2$, and the ratio $\eta/\xi$ is our parameter $\rho$. Nonetheless, we will show that Neyman's conjecture holds for general $p \ge 2$. A general form for Fieller's confidence set, $C_F(Y)$, is derived in Tsao and Hwang (1997):

$$C_F(Y) = \left\{ \rho : Y'Y - \frac{(Y'\mu)^2}{\|\mu\|^2} < c \right\}. \tag{17}$$
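For $p = 2$, squaring the inequality in (16) gives a quadratic in $\rho$, so the shape of the set can be read off from the leading coefficient $Y_2^2 - c$. The following Python sketch illustrates this reading; it is not taken from the paper. The function name `fieller_set_p2` and its return labels are hypothetical, and the boundary cases of probability zero are deliberately not resolved.

```python
import math
from statistics import NormalDist


def fieller_set_p2(y1, y2, alpha=0.05):
    """Classify the p = 2 Fieller set {rho : (y1 - rho*y2)^2 < c*(1 + rho^2)}.

    Returns ("interval", lo, hi)   for the bounded open interval (lo, hi),
            ("complement", lo, hi) for (-inf, lo) U (hi, inf),
            ("real_line",)         when every rho satisfies the inequality.
    Illustrative helper; names and labels are assumptions, not the paper's.
    """
    # Upper-alpha cutoff of chi-square(1) equals the squared normal quantile.
    c = NormalDist().inv_cdf(1 - alpha / 2) ** 2
    # Rewrite the inequality as a*rho^2 + b*rho + d < 0.
    a = y2 * y2 - c
    b = -2.0 * y1 * y2
    d = y1 * y1 - c
    disc = b * b - 4.0 * a * d  # algebraically 4*c*(y1^2 + y2^2 - c)
    if a == 0:
        raise ValueError("boundary case y2^2 == c (probability zero) not handled")
    if a > 0:
        # y2^2 > c forces disc > 0: the parabola opens upward, set is bounded.
        r = math.sqrt(disc)
        return ("interval", (-b - r) / (2 * a), (-b + r) / (2 * a))
    if disc > 0:
        # Parabola opens downward with two real roots: set is unbounded.
        r = math.sqrt(disc)
        lo, hi = sorted(((-b - r) / (2 * a), (-b + r) / (2 * a)))
        return ("complement", lo, hi)
    # Downward parabola without two distinct real roots (the single-root case
    # disc == 0 has probability zero and is lumped in here).
    return ("real_line",)


if __name__ == "__main__":
    print(fieller_set_p2(2.0, 5.0))  # bounded interval around y1/y2 = 0.4
    print(fieller_set_p2(2.0, 1.0))  # unbounded: complement of an interval
    print(fieller_set_p2(0.5, 0.5))  # unbounded: the whole real line
```

Under this reading, the realized set has finite length exactly when $Y_2^2 > c$, which is the event on which the conditional coverage is evaluated.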


Note that (17) refers to a set of $\rho$ even though the condition is expressed in terms of $\mu$. Neyman's conjecture ($p = 2$) and its generalization for $p > 2$ are confirmed by the following theorem.

Theorem 3.1.

$$\inf_{\mu} P_\mu[\rho \in C_F(Y) \mid D(C_F(Y)) < \infty] = 0. \tag{18}$$

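Proof. Let

$$\theta_1 = \rho \equiv (\rho_1, \ldots, \rho_{p-1})' \in \Theta_1 = R^{p-1}, \qquad \theta_2 \equiv \mu_p \in \Theta_2 = R - \{0\}, \qquad \theta_2^* = 0 \in \bar{\Theta}_2,$$

$$\Theta_1^* = \{\theta_1 \mid \theta_1\theta_p = 1_{p-1},\ \rho_i > 0 \text{ for } i = 1, \ldots, p-1\},$$

where $1_{p-1} = (1, \ldots, 1)' \in R^{p-1}$. Let $a = (1, 0, \ldots, 0)'$, a $(p-1)$-vector. It is easy to see that

$$\lim_{\theta_2 \to \theta_2^*} P_\theta[a'\rho \in a'C_F(Y) \mid D(a'C_F(Y)) < \infty] = 0. \tag{19}$$

The proof is completed by Theorem 2.2 if we show that, for $\theta_1 \in \Theta_1^*$,

$$\lim_{\theta_2 \to \theta_2^*} P_\theta[D(C_F(Y)) < \infty] > 0. \tag{20}$$

To show (20), it suffices to show that the one-sided limit as $\theta_2 \to \theta_2^{*}+$ is positive. Under this setup, every component of $\rho$ is positive. Let

$$U = \{y \in \mathcal{Y} \mid y_i > 0 \text{ for } i = 1, \ldots, p;\ y_p^2 > c\}$$

and let $y_{(p)}$ denote $(y_1, \ldots, y_{p-1})'$. Recall the representation of Fieller's confidence set (17). Note that for our choice of subset of the parameter space and any $y \in U$,

$$\left\{ \rho \,\middle|\, \|y\|^2 - \frac{(y_{(p)}'\rho + y_p)^2}{1 + \|\rho\|^2} < c \right\} \tag{21}$$

$$\subset \left\{ \rho \,\middle|\, \|y\|^2(1 + \|\rho\|^2) - c(1 + \|\rho\|^2) - (\|y_{(p)}\|\,\|\rho\| + y_p)^2 < 0 \right\}, \tag{22}$$

and the set in (22) can be expressed as a finite interval of $\|\rho\|$. Hence for all $\rho \in \Theta_1^*$, $\theta_2^* = 0$,

$$\lim_{\theta_2 \to \theta_2^{*}+} P_\theta[D(C_F(Y)) < \infty] \ge \lim_{\theta_2 \to \theta_2^{*}+} P_\theta[U] > 0.$$

This completes the proof. □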
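The statement of Theorem 3.1 can be explored numerically for $p = 2$. The Monte Carlo sketch below is an illustration under stated assumptions, not part of the paper's argument: it takes $\sigma^2 = 1$, uses the fact (from the quadratic form of (16)) that the realized set is a bounded interval exactly when $Y_2^2 > c$, and fixes $\mu_2$ near zero while $\rho$ grows. The function name `fieller_conditional_coverage`, the value of `mu2`, the number of draws and the seed are all hypothetical choices.

```python
import random
from statistics import NormalDist


def fieller_conditional_coverage(rho, mu2, alpha=0.05, n_draws=200_000, seed=0):
    """Monte Carlo estimate of P[rho in C_F(Y) | D(C_F(Y)) < inf] for p = 2.

    Y = (Y1, Y2) has independent N(mu1, 1) and N(mu2, 1) components with
    mu1 = rho * mu2. Illustrative helper; not the paper's procedure.
    """
    # Upper-alpha cutoff of chi-square(1) equals the squared normal quantile.
    c = NormalDist().inv_cdf(1 - alpha / 2) ** 2
    rng = random.Random(seed)
    mu1 = rho * mu2
    finite = 0
    covered_and_finite = 0
    for _ in range(n_draws):
        y1 = rng.gauss(mu1, 1.0)
        y2 = rng.gauss(mu2, 1.0)
        if y2 * y2 > c:  # the set in (16) is a bounded interval
            finite += 1
            # rho is covered iff it satisfies the (squared) inequality in (16)
            if (y1 - rho * y2) ** 2 < c * (1 + rho * rho):
                covered_and_finite += 1
    return covered_and_finite / finite


if __name__ == "__main__":
    for rho in (0.0, 1.0, 10.0, 100.0):
        est = fieller_conditional_coverage(rho, mu2=1e-6)
        print(f"rho = {rho:6.1f}  estimated conditional coverage = {est:.3f}")
```

In this configuration the estimate stays near the nominal $1 - \alpha$ at $\rho = 0$ and shrinks toward zero as $\rho$ grows, which is consistent with the infimum in (18) being zero; the simulation only illustrates the theorem and does not replace its proof.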

In Gleser and Hwang (1987), many errors-in-variables and related models (estimation of principal component vectors, inverse regression, estimation of the location parameter of the von Mises distribution on the circle, etc.) are examined. These models are shown to have no nontrivial confidence intervals with finite expected length under some conditions. For these models, most conditions in Theorem 2.1 or Theorem 2.2 are met. The only crucial condition left to be checked is (11). A case-by-case examination of the conditions for the various confidence intervals for these models is beyond the scope of this paper. Readers are referred to Cheng and Van Ness (1994) for a survey of the confidence intervals commonly used for these models.


4. Discussion

Gleser and Hwang (1987) show that there exists no nontrivial confidence interval with finite expected length for many errors-in-variables and related models. However, practitioners typically use these confidence intervals only when their realizations are of finite length. We delve into the validity of this practice. This conditional implementation of confidence intervals is found unjustifiable and leads to zero minimum conditional coverage probability. Our result is particularly surprising since it requires very few assumptions on the distribution family.

What are the remedies, then, if one has to use confidence intervals of infinite length? We suggest the employment of confidence estimators. Interested readers are referred to, for example, Berger (1985, 1988). In particular, Tsao and Hwang (1997) give data-dependent confidence estimators for Fieller's confidence sets.

We interpret our result as a warning against the general practice of using statistical procedures only when their realizations are "desirable". It also indicates the importance of investigating the conditional performance of confidence intervals and other statistical procedures in the context of their implementation in practice.

References

Berger, J. O., 1985. The frequentist viewpoint and conditioning. In: Le Cam, L., Olshen, R. (Eds.), Proc. Berkeley Conf. in Honor of Jerzy Neyman and Jack Kiefer, vol. 1. Wadsworth, Monterey, CA, pp. 15-44.

Berger, J. O., 1988. An alternative: The estimated confidence approach. In: Gupta, S. S., Berger, J. O. (Eds.), Statistical Decision Theory and Related Topics IV, vol. 1. Springer, New York, pp. 85-90.

Cheng, C. L., Van Ness, J. W., 1994. On estimating linear relationship when both variables are subject to errors. J. Roy. Statist. Soc. Ser. B 56, 167-183.

Fieller, E. C., 1954. Some problems in interval estimation. J. Roy. Statist. Soc. Ser. B 16, 175-183.

Gleser, L., Hwang, J. T., 1987. The nonexistence of 100(1 - α)% confidence set of finite expected diameter in error in variable and related models. Ann. Statist. 15, 1351-1362.

Neyman, J., 1954. Discussions on Some problems in interval estimation (by Fieller, E. C.). J. Roy. Statist. Soc. Ser. B 16, 216-218.

Tsao, C. H. A., 1994. Confidence estimation and conditional inferences for Fieller's confidence sets. Ph.D. Thesis, Cornell University, Ithaca, New York.

Tsao, C. A., Hwang, J. T. G., 1997. Improved confidence estimators for the Fieller's confidence set. Technical Report, Dong Hwa University, Hualien, Taiwan, Can. J. Statist., to appear.