Journal of Statistical Planning and Inference 138 (2008) 1884 – 1893 www.elsevier.com/locate/jspi
New large-sample confidence intervals for a linear combination of binomial proportions

Joshua M. Tebbs a,∗ and Scott A. Roths b

a Department of Statistics, University of South Carolina, Columbia, SC 29208, USA
b Department of Statistics, Penn State University, University Park, PA 16802, USA
Received 24 October 2006; accepted 5 July 2007 Available online 15 August 2007
Abstract

In this paper, we consider the problem wherein one desires to estimate a linear combination of binomial probabilities from k > 2 independent populations. In particular, we create a new family of asymptotic confidence intervals, extending the approach taken by Beal [1987. Asymptotic confidence intervals for the difference between two binomial parameters for use with small samples. Biometrics 73, 941–950] in the two-sample case. One of our new intervals is shown to perform very well when compared to the best available intervals documented in Price and Bonett [2004. An improved confidence interval for a linear function of binomial proportions. Comput. Statist. Data Anal. 45, 449–456]. Furthermore, our interval estimation approach is quite general and could be extended to handle more complicated parametric functions and even to other discrete probability models in stratified settings. We illustrate our new intervals using two real data examples, one from an ecology study and one from a multicenter clinical trial.
© 2007 Elsevier B.V. All rights reserved.

Keywords: Bayesian estimation; Gram–Schmidt orthogonalization; Multicenter clinical trial; Nuisance parameters; Reparameterization
1. Introduction Binomial confidence intervals are frequently used in applied research, and their ubiquitous use has triggered a number of statisticians to develop new confidence interval procedures, both asymptotic and exact. In particular, the one- and two- (independent) sample binomial problems have garnered a substantial amount of attention in recent years; see Brown et al. (2001), Reiczigel (2003), Zhou et al. (2004), Brown and Li (2005) and the references therein for a review. In comparison, confidence interval construction for a linear combination of binomial proportions from k > 2 independent populations has received far less attention. This is surprising since many studies require one to estimate proportions from multiple treatment groups or strata (e.g., dose–response studies, public-health surveys, multicenter clinical trials, agricultural experiments, etc.), and properly chosen linear combinations (i.e., contrasts) can be used to compare the strata. Price and Bonett (2004) have recently reviewed three asymptotic approaches to construct intervals for a linear combination. Each interval possesses a Wald-type form; two of these use an “adjustment,” similar to the intervals made famous by Agresti and Coull (1998) and Agresti and Caffo (2000). In particular, Price and Bonett (2004) propose a variant of the Agresti–Caffo (2000) interval for a single linear combination and demonstrate that it has better ∗ Corresponding author.
E-mail addresses: [email protected] (J.M. Tebbs), [email protected] (S.A. Roths).
doi:10.1016/j.jspi.2007.07.008
coverage properties than competing methods. Furthermore, their interval is very simple to compute, which makes it attractive for applied researchers.

In this paper, we propose a new flexible asymptotic approach to construct confidence intervals for a single linear combination. Our approach, while more involved computationally than those previously mentioned, produces a family of intervals, one of which competes fairly well with the generalized Agresti–Caffo interval proposed by Price and Bonett (2004). Furthermore, the methodology we outline can be extended to other more complicated parametric functions (i.e., other than linear combinations) and even to other discrete probability distributions, such as the Poisson or geometric models.

Subsequent sections of this paper are organized as follows. In Section 2, we review the three intervals discussed in Price and Bonett (2004). In Section 3, we present a new family of confidence intervals arising from an approach similar to that taken by Beal (1987) and Roths and Tebbs (2006) in the two-population case. In Section 4, we examine our new intervals in terms of coverage probability and mean length for small samples. In Section 5, we apply our methods to two real data examples. In Section 6, we provide a summary discussion and describe the many possible extensions of this work.

2. Wald and adjusted Wald intervals

Suppose that X_1, X_2, ..., X_k are independent binomial random variables with parameters n_i and 0 < p_i < 1, for i = 1, 2, ..., k. Define X = (X_1, X_2, ..., X_k)' so that the joint probability mass function of X is

f(x|p) = \prod_{i=1}^{k} \binom{n_i}{x_i} p_i^{x_i} (1 - p_i)^{n_i - x_i},    (1)

for x_i = 0, 1, ..., n_i, where p = (p_1, p_2, ..., p_k)'. In this paper, we focus on writing confidence intervals for the linear combination

\theta_1 = c_1'p = \sum_{i=1}^{k} c_{1i} p_i,

where c_1 = (c_{11}, c_{12}, ..., c_{1k})' and the c_{1i}'s are fixed constants. The maximum likelihood estimator (MLE) of \theta_1 is \hat{\theta}_1 = c_1'\hat{p} = \sum_{i=1}^{k} c_{1i}\hat{p}_i, where \hat{p}_i = X_i/n_i, and standard large-sample arguments can be used to show that

\frac{\hat{\theta}_1 - \theta_1}{\sqrt{v(\hat{\theta}_1)}} \stackrel{d}{\longrightarrow} N(0, 1),    (2)

as n_i/\sum_i n_i \to a_i \in (0, 1) and \min_i n_i \to \infty, where v(\hat{\theta}_1) = \sum_{i=1}^{k} c_{1i}^2 p_i(1 - p_i)/n_i is the variance of \hat{\theta}_1. For our purposes, it is helpful to note that v(\hat{\theta}_1) = p'Dp + d'p, where D = diag(-c_{11}^2/n_1, -c_{12}^2/n_2, ..., -c_{1k}^2/n_k) and d = (c_{11}^2/n_1, c_{12}^2/n_2, ..., c_{1k}^2/n_k)'. The limiting result in (2), along with Slutsky's Theorem, allows one to conclude that, for \hat{v}(\hat{\theta}_1) being any consistent estimator of v(\hat{\theta}_1),

\hat{\theta}_1 \pm z_{\alpha/2}\sqrt{\hat{v}(\hat{\theta}_1)}    (3)

is an approximate 100(1 - \alpha) percent confidence interval for \theta_1, where z_{\alpha/2} denotes the upper \alpha/2 quantile of the standard normal distribution. The usual Wald (W) interval results from using \hat{v}(\hat{\theta}_1) = \hat{p}'D\hat{p} + d'\hat{p} in (3).

Price and Bonett (2004) demonstrate that the simple Wald approach, as expected, fails to hold nominal coverage, often providing intervals which are unsuitably anticonservative. These authors also investigate the Laplace–Wald (LW) interval and a new interval that we call the Price–Bonett (PB) interval. The LW interval, suggested by Greenland (2001), is identical to the Wald interval except that n_i^* = n_i + 2 and \hat{p}_i^* = (x_i + 1)/n_i^* replace n_i and \hat{p}_i, respectively, in (3). The PB interval is also identical to the Wald interval except that n_i^* = n_i + 4/k^* and \hat{p}_i^* = (x_i + 2/k^*)/n_i^* replace n_i and \hat{p}_i, respectively, in (3). The value k^* is the number of nonzero constants among c_{11}, c_{12}, ..., c_{1k}. When k = 2, c_{11} = 1, and c_{12} = -1, the LW and PB modified intervals both reduce to the Agresti–Caffo (2000) version. Among the
three intervals, Price and Bonett (2004) demonstrate that their interval performs best in terms of holding the correct coverage probability. It should be noted that these authors do not provide a general comparison of the three intervals (W, LW, PB) with respect to mean interval length.

3. A new family of asymptotic intervals

To create a new family of intervals for \theta_1, we propose a multivariate extension of the approach taken by Beal (1987) in the two-sample problem. In Beal's two-sample setting, the linear combination of interest is the difference \theta_1 = p_1 - p_2. To estimate \theta_1, Beal first creates the nuisance parameter \theta_2 = p_1 + p_2 and then performs a Bayesian analysis to remove the effects of \theta_2. See Beal (1987) and Roths and Tebbs (2006) for complete details. Note that in the two-sample case, \theta_1 = c_1'p and \theta_2 = c_2'p, where c_1 = (1, -1)' and c_2 = (1, 1)', and that the inner product c_1'c_2 = 0, i.e., c_1 and c_2 are orthogonal vectors in R^2.

Our focus here is on the k-sample problem, generalizing Beal's two-sample approach. In particular, we consider the reparameterization \theta \equiv (\theta_1, \theta_2, ..., \theta_k)' = C'p, where C = (c_1 | c_2 | ... | c_k) and c_j = (c_{j1}, c_{j2}, ..., c_{jk})', for j = 1, 2, ..., k, C is a k × k orthogonal matrix of constants, and | denotes horizontal concatenation. Using this reparameterization, one notes that \theta_1 = c_1'p, the parameter of central interest. Removing the effects of the nuisance parameters \theta_j = c_j'p, for j = 2, 3, ..., k, and choosing C are discussed in Sections 3.1 and 3.2, respectively. We let M = (C')^{-1} and generically write M = (m_1 // m_2 // ... // m_k), where // denotes vertical concatenation, so that p = M\theta, p_j = m_j'\theta, for j = 1, 2, ..., k, and the variance of \hat{\theta}_1 expressed as a function of \theta is

v(\hat{\theta}_1) = p'Dp + d'p = \theta'M'DM\theta + d'M\theta.

Using the asymptotic result in (2), an approximate 100(1 - \alpha) percent confidence interval for \theta_1 can be constructed by finding the values of \theta_1 that satisfy

|\hat{\theta}_1 - \theta_1| \le z_{\alpha/2}\sqrt{\tilde{\theta}'M'DM\tilde{\theta} + d'M\tilde{\theta}},    (4)

where \tilde{\theta} = (\tilde{\theta}_1, \tilde{\theta}_2, ..., \tilde{\theta}_k)' and \tilde{\theta}_j is an estimate of \theta_j. In (4), it is not difficult to argue that setting \tilde{\theta}_j = \hat{\theta}_j = c_j'\hat{p}, the MLE of \theta_j, for j = 1, 2, ..., k, produces the Wald interval from Section 2. However, with our reparameterization, it is unnecessary to estimate \theta_1 = c_1'p in v(\hat{\theta}_1). By setting \tilde{\theta}_1 = \theta_1, the confidence interval limits from (4) are the roots of the equation in \theta_1 given by

(\hat{\theta}_1 - \theta_1)^2 - z_{\alpha/2}^2[\tilde{\theta}_{(1)}'M'DM\tilde{\theta}_{(1)} + d'M\tilde{\theta}_{(1)}] = 0,    (5)

where \tilde{\theta}_{(1)} = (\theta_1, \tilde{\theta}_2, ..., \tilde{\theta}_k)'. Furthermore, because (5) is a second-order polynomial in \theta_1, the roots that delimit the intervals in this new family can be found using the quadratic formula (in rare cases, imaginary roots result; see Section 4). One should observe that the limits obtained from solving (5) depend on \tilde{\theta}_2, \tilde{\theta}_3, ..., \tilde{\theta}_k, i.e., the estimates of \theta_2, \theta_3, ..., \theta_k, respectively. Similarly to Beal (1987) and Roths and Tebbs (2006), we propose Bayesian and parametric empirical Bayesian approaches to remove the effects of these parameters.

3.1. Nuisance parameter estimation

Treating p as a realization of the random variable P, we assume, similar to Beal (1987), that the prior distribution for P is given by
f(p|\nu) = \prod_{i=1}^{k} \frac{\{p_i(1 - p_i)\}^{\nu - 1}}{B(\nu, \nu)},    (6)

for 0 < p_i < 1 and \nu \ge 0, where B(\cdot, \cdot) is the usual beta function. Note that (6) is the product of k symmetric beta densities if \nu is required to be strictly positive. Viewing (1) as the conditional distribution of X, given P = p, we multiply this function by (6) and integrate over the unit k-hypercube to obtain the marginal distribution of X, i.e.,

f(x|\nu) = \prod_{i=1}^{k} \binom{n_i}{x_i} \frac{B(x_i + \nu, n_i - x_i + \nu)}{B(\nu, \nu)},    (7)
for x_i = 0, 1, ..., n_i and i = 1, 2, ..., k. The posterior distribution of P, given X = x, is

f(p|x; \nu) = \prod_{i=1}^{k} \frac{p_i^{x_i + \nu - 1}(1 - p_i)^{n_i - x_i + \nu - 1}}{B(x_i + \nu, n_i - x_i + \nu)},

for 0 < p_i < 1. Following Beal's (1987) approach, we estimate the nuisance parameters by setting \tilde{\theta}_j equal to the posterior mean of \theta_j = c_j'P, i.e., set

\tilde{\theta}_j \equiv \tilde{\theta}_j(\nu) = E(c_j'P|X = x) = \sum_{i=1}^{k} c_{ji}\,\frac{x_i + \nu}{n_i + 2\nu},    (8)

for j = 2, 3, ..., k. Different choices of the hyperparameter \nu in (8) will lead to different values of \tilde{\theta}_j and, ultimately, to different confidence intervals. Beal (1987) examines many choices of \nu in the two-sample case and studies the small-sample behavior of the resulting intervals. His findings suggest that \nu = 0 and \nu = 1/2 provide reliable (albeit conservative) confidence intervals with small samples. It is worth noting that the choice \nu = 0 gives

\tilde{\theta}_j(0) = \sum_{i=1}^{k} c_{ji}\hat{p}_i = c_j'\hat{p} = \hat{\theta}_j,
the MLE of \theta_j. We refer to the interval resulting from this choice as the Haldane (H) interval because the prior distribution is the product of k Haldane priors (Haldane, 1945). Similarly, we refer to the interval arising from taking \nu = 1/2 as the Jeffreys–Perks (JP) interval since the beta(1/2, 1/2) distribution is the Jeffreys noninformative prior (Good, 1965).

Instead of specifying a choice for \nu in (6) and carrying through a Bayesian analysis to eliminate the effects of \theta_2, \theta_3, ..., \theta_k, there is no reason to prevent one from letting the observed data X suggest a value for \nu using an empirical-Bayesian approach (Morris, 1983), similar to that used in Roths and Tebbs (2006) in the two-sample problem. In order to estimate \nu from the observed data, we examine the marginal distribution of X in (7) and use the method of maximum likelihood. Given X = x, (7) can be viewed as the likelihood function of \nu, and maximizing it may be done by numerically solving

0 = \frac{\partial}{\partial\nu}\log f(x|\nu) = \sum_{i=1}^{k} [\psi(x_i + \nu) + \psi(n_i - x_i + \nu) + 2\psi(2\nu) - 2\psi(n_i + 2\nu) - 2\psi(\nu)],

for \nu, where \psi(\cdot) represents the digamma function. This solution, which we denote by \hat{\nu}, is then used for the choice of \nu in (8). We call the resulting interval the empirical Bayesian MLE (EBMLE) interval. It should be noted that on rare occasion the value of \hat{\nu} is infinite (see Roths and Tebbs, 2006); in these instances, we take

\tilde{\theta}_j = \lim_{\nu \to \infty} \sum_{i=1}^{k} c_{ji}\,\frac{x_i + \nu}{n_i + 2\nu} = \frac{1}{2}\sum_{i=1}^{k} c_{ji},
for j = 2, 3, ..., k. We also note that other marginal estimation techniques, such as the method of moments, could be used to elicit an estimate for \nu. However, we have found that the resulting intervals arising from method-of-moments estimation of \nu do not perform as well as the EBMLE interval, so we do not consider them further.

3.2. Specifying C

In reparameterizing the problem using \theta = C'p, one must decide how to choose the matrix C = (c_1 | c_2 | ... | c_k). In the k-sample problem, the minimum requirements are that c_1, c_2, ..., c_k be linearly independent (so C^{-1} exists) and that \theta_1 = c_1'p, the parameter of interest, does not depend on \theta_2, \theta_3, ..., \theta_k. One simple specification would be to take c_j = e_j, for j = 2, 3, ..., k, where e_j is the standard unit vector with 1 in the jth position and zeros elsewhere. This meets the minimum requirements necessary, provided that c_1 can not be written as a linear combination of e_2, e_3, ..., e_k (this can not occur if \theta_1 depends on p_1). A drawback with this simple approach, of course, is that \tilde{\theta}_j ignores all the data
observed except in the jth treatment group, j = 2, 3, ..., k. Not surprisingly, we have found that this choice of C does not provide good confidence intervals for \theta_1. Clearly, using a reparameterization where \tilde{\theta}_j depends on all the available data is desirable, and the benefits of working with an orthogonal reparameterization are well known (Beal, 1987; see also Cox and Reid, 1987). In this light, we propose the following approach:

(A1) Take c_1^* = c_1 and c_j^* = e_j, for j = 2, 3, ..., k, so that {c_1^*, c_2^*, ..., c_k^*} is a basis for R^k.

(A2) Orthonormalize c_1^*, c_2^*, ..., c_k^* using a Gram–Schmidt (GS) algorithm. This algorithm transforms (c_1^* | c_2^* | ... | c_k^*) into (c_{1O} | c_{2O} | ... | c_{kO}), where

c_{iO}'c_{jO} = 1 if i = j, and 0 if i \ne j,    (9)

so that c_{1O} is in the same direction as c_1. The GS algorithm outlined in Christensen (2002, p. 394) provides a unique orthonormalized set c_{1O}, c_{2O}, ..., c_{kO}, where c_{sO} is in the space spanned by {c_1^*, c_2^*, ..., c_s^*}, for all s \le k.

(A3) To preserve the value of c_1 in \theta_1 = c_1'p, choose

C = (\sqrt{c_1'c_1}\,c_{1O} | \sqrt{c_1'c_1}\,c_{2O} | ... | \sqrt{c_1'c_1}\,c_{kO}).

The resulting matrix C is nearly orthogonal, i.e., C'C = aI, where a = c_1'c_1 and I denotes the k × k identity matrix (the constant a will not affect orthogonality). From the GS procedure outlined in Christensen (2002), \sqrt{c_1'c_1}\,c_{1O} = c_1, as needed.

The approach we have outlined produces an orthogonal reparameterization \theta = C'p, as desired. However, two small technical details are worthy of remark. First, the choice of the initial basis in (A1) is arbitrary, and other bases could be used. Perhaps as expected, we have not found that one particular basis selection always performs best. However, our choice of initial basis does often result in a well-conditioned GS procedure, as measured through its condition number using the Frobenius norm. Second, we have chosen to implement Christensen's (2002) GS algorithm because it provides a unique set of orthonormalized vectors {c_{1O}, c_{2O}, ..., c_{kO}}. Other orthogonalization procedures, and even other versions of the GS algorithm that do not provide a unique set, are available.

4. Small-sample comparisons

In this section, we provide a comparison of the six intervals under investigation (W, LW, PB, H, JP, and EBMLE). Because the support of X is finite, coverage probability and mean length can be computed exactly. In particular, the coverage probability for a specific procedure (for fixed p) is given by

\sum_{x \in R} I(x) f(x|p),
where R is the support of X and the indicator function I(x) = 1 if an interval contains \theta_1, and I(x) = 0 otherwise. Similarly, the mean length is

\sum_{x \in R} l(x) f(x|p),

where l(x) is the interval length based on the realization x. We also compute the conditional mean length ratio CMLR = l_I/l_E to assess interval precision (Beal, 1987). Here, l_I (l_E) represents the mean length for only those cases where \theta_1 is (is not) covered. A small value of CMLR is desirable. To make our comparison, for fixed n = (n_1, n_2, ..., n_k)', c_1, \alpha, and k, we first generate 100 values of p and compute the exact coverage, the exact mean length, and CMLR for each value of p. To provide an overall sense of coverage and length, we then average the 100 values for each measure. Tables 1 and 2 display the results for \alpha = 0.05 and for different values of c_1 and n_i = n (balanced designs), when k = 3, 4, and 5. In addition, we compute the minimum coverage
Table 1
Coverage probability, mean length, conditional mean length ratio (CMLR), and distal and mesial noncoverage probability computed from 100 simulated values of p when \alpha = 0.05 and k = 3; c_1 = (1, -1/2, -1/2)'

n1  n2  n3   Interval   Coverage          Length           Noncoverage
                        Mean     Min.     Mean     CMLR    Distal   Mesial
10  10  10   W          0.910    0.846    0.629    1.267   0.046    0.044
             LW         0.958    0.933    0.616    1.035   0.020    0.022
             PB         0.951    0.930    0.622    1.077   0.024    0.025
             H          0.927    0.858    0.570    1.131   0.035    0.028
             JP         0.937    0.853    0.578    1.087   0.030    0.033
             EB         0.944    0.879    0.599    1.126   0.027    0.029
20  20  20   W          0.931    0.906    0.460    1.123   0.035    0.034
             LW         0.953    0.941    0.454    1.031   0.023    0.025
             PB         0.949    0.939    0.456    1.057   0.026    0.025
             H          0.933    0.853    0.428    1.067   0.033    0.034
             JP         0.937    0.865    0.432    1.054   0.031    0.032
             EB         0.943    0.888    0.442    1.064   0.028    0.029
30  30  30   W          0.938    0.928    0.377    1.084   0.031    0.031
             LW         0.953    0.941    0.373    1.024   0.024    0.023
             PB         0.949    0.944    0.375    1.041   0.025    0.026
             H          0.936    0.860    0.355    1.044   0.032    0.032
             JP         0.939    0.867    0.358    1.037   0.030    0.031
             EB         0.943    0.887    0.363    1.041   0.028    0.029

W, Wald; LW, Laplace–Wald; PB, Price–Bonett; H, Haldane; JP, Jeffreys–Perks; EB, EBMLE. The minimum coverage probability is also given.
from the 100 simulated values of p, as well as the average distal and mesial noncoverage probabilities. Distal (mesial) noncoverage measures the probability content for all x vectors in the support producing intervals that miss \theta_1 on the left (right) side. Some would argue that a desirable interval possesses symmetry in the noncoverage tails, i.e., that the distal and mesial noncoverage probabilities are approximately \alpha/2 (Newcombe, 1998).

When generating the values of p for Tables 1 and 2, we have restricted the parameter space so that 0.1 \le p_i \le 0.9, for each i = 1, 2, ..., k. We have done this because our prior for p is not appropriate if, for example when k = 3, p_1 is small (e.g., 0.05) and p_2 and p_3 are both large (e.g., 0.95). For these anomalous cases, the H, JP, and EBMLE intervals can perform badly because the nuisance parameters \theta_j, j = 2, 3, ..., k, are not estimated well when sample sizes are small. This is a potential drawback with our new intervals; if the proportions lie in different extremes of (0, 1), we do not recommend them (see Section 6); of course, this situation might be unlikely in practice. On the other hand, if each proportion falls toward one extreme direction (e.g., p_1, p_2, and p_3 are all small), our intervals are robust to the one-parameter prior and perform satisfactorily. We also remind the reader that the roots in (5) are sometimes imaginary (this only affects H, JP, and EBMLE). This happens only very rarely, but, when it does occur, we set the resulting interval equal to the most noninformative (lengthiest) interval possible, which, of course, depends on the value of c_1.

From perusing Tables 1 and 2, we note that the new intervals are usually below the nominal confidence coefficient overall, although the EBMLE interval is often just barely below. The LW and PB intervals are quite good when k = 3 and 4; when k = 5, PB can be somewhat anticonservative, although certainly not by much. With regard to mean interval length and CMLR, the new intervals are somewhat shorter on average than LW and PB (although their mean coverage probabilities are slightly smaller on average) and very similar among themselves. With regard to minimum coverage, the LW and PB intervals are amazingly consistent; the new intervals can provide noticeably low values for the reasons stated in the last paragraph. It appears that all six intervals do fairly well at maintaining equal distal and mesial noncoverage probabilities. Tables 1 and 2 display the results only for balanced designs, but we have found that unbalanced designs do not largely affect our findings. Among the three new intervals, the EBMLE appears to emerge as the best choice, maintaining an overall coverage closest to nominal, possessing the best minimum coverage, and providing similar mean length properties.
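The exact coverage and mean-length calculations used in this section can be sketched in Python (a minimal illustration, not the authors' C++ code; the function names and the numpy/scipy dependency are our own):

```python
import itertools

import numpy as np
from scipy.stats import binom, norm

def wald_type_ci(x, n, c1, alpha=0.05, adjust=None):
    """Interval (3) for theta1 = c1'p: plain Wald (adjust=None),
    Laplace-Wald (adjust='LW'), or Price-Bonett (adjust='PB')."""
    x, n, c1 = (np.asarray(a, float) for a in (x, n, c1))
    if adjust == 'LW':            # add one success and one failure per group
        n, p = n + 2.0, (x + 1.0) / (n + 2.0)
    elif adjust == 'PB':          # k* = number of nonzero c1i's
        ks = np.count_nonzero(c1)
        n, p = n + 4.0 / ks, (x + 2.0 / ks) / (n + 4.0 / ks)
    else:
        p = x / n
    t_hat = c1 @ p
    half = norm.ppf(1 - alpha / 2) * np.sqrt(np.sum(c1**2 * p * (1 - p) / n))
    return t_hat - half, t_hat + half

def exact_properties(n, p, c1, alpha=0.05, adjust=None):
    """Exact coverage probability and mean length for fixed p, summing
    I(x) f(x|p) and l(x) f(x|p) over the full (finite) support of X."""
    n_arr = np.asarray(n)
    p_arr, c1_arr = np.asarray(p, float), np.asarray(c1, float)
    theta1 = c1_arr @ p_arr
    cover = length = 0.0
    for x in itertools.product(*(range(ni + 1) for ni in n_arr)):
        fx = float(np.prod(binom.pmf(x, n_arr, p_arr)))   # f(x|p)
        lo, hi = wald_type_ci(np.array(x), n_arr, c1_arr, alpha, adjust)
        cover += fx * (lo <= theta1 <= hi)
        length += fx * (hi - lo)
    return cover, length
```

Averaging these two quantities (and CMLR, computed analogously) over simulated values of p, as described above, gives summaries like those in Tables 1 and 2.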
Table 2
Coverage probability, mean length, conditional mean length ratio (CMLR), and distal and mesial noncoverage probability computed from 100 simulated values of p when \alpha = 0.05 and k = 4, 5

c_1 = (1, -1/3, -1/3, -1/3)'

n1  n2  n3  n4   Interval   Coverage          Length           Noncoverage
                            Mean     Min.     Mean     CMLR    Distal   Mesial
10  10  10  10   W          0.903    0.858    0.597    1.334   0.048    0.049
                 LW         0.957    0.936    0.582    1.047   0.022    0.021
                 PB         0.941    0.921    0.592    1.140   0.030    0.029
                 H          0.916    0.817    0.520    1.152   0.043    0.041
                 JP         0.928    0.802    0.528    1.106   0.036    0.036
                 EB         0.938    0.842    0.551    1.139   0.031    0.031
20  20  20  20   W          0.929    0.912    0.434    1.161   0.034    0.037
                 LW         0.953    0.943    0.427    1.042   0.024    0.023
                 PB         0.944    0.937    0.431    1.095   0.028    0.028
                 H          0.928    0.848    0.391    1.062   0.037    0.035
                 JP         0.933    0.870    0.396    1.053   0.034    0.033
                 EB         0.940    0.882    0.407    1.063   0.030    0.030

c_1 = (1, -1/4, -1/4, -1/4, -1/4)'

n1  n2  n3  n4  n5   Interval   Coverage          Length           Noncoverage
                                Mean     Min.     Mean     CMLR    Distal   Mesial
10  10  10  10  10   W          0.897    0.857    0.575    1.413   0.052    0.051
                     LW         0.957    0.938    0.563    1.053   0.021    0.022
                     PB         0.933    0.922    0.573    1.211   0.034    0.033
                     H          0.913    0.818    0.492    1.150   0.044    0.043
                     JP         0.926    0.869    0.499    1.110   0.037    0.037
                     EB         0.940    0.873    0.524    1.113   0.031    0.029
20  20  20  20  20   W          0.925    0.907    0.419    1.201   0.038    0.037
                     LW         0.953    0.943    0.412    1.046   0.023    0.024
                     PB         0.940    0.936    0.417    1.132   0.031    0.029
                     H          0.921    0.830    0.371    1.080   0.038    0.041
                     JP         0.927    0.863    0.374    1.062   0.036    0.037
                     EB         0.935    0.883    0.385    1.071   0.032    0.033

W, Wald; LW, Laplace–Wald; PB, Price–Bonett; H, Haldane; JP, Jeffreys–Perks; EB, EBMLE. The minimum coverage probability is also given.
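The H, JP, and EBMLE entries above rest on the machinery of Sections 3.1 and 3.2. A minimal Python sketch (our own function names; numpy/scipy assumed; here \nu is fixed by the user, with \nu = 0 giving Haldane and \nu = 1/2 Jeffreys–Perks):

```python
import numpy as np
from scipy.stats import norm

def build_C(c1):
    """(A1)-(A3): Gram-Schmidt the basis {c1, e2, ..., ek}, then rescale
    so the first column of C equals c1 and C'C = (c1'c1) I."""
    c1 = np.asarray(c1, float)
    k = c1.size
    basis = np.eye(k)
    basis[:, 0] = c1                         # (A1)
    Q = np.zeros((k, k))
    for s in range(k):                       # (A2): classical Gram-Schmidt
        v = basis[:, s].copy()
        for t in range(s):
            v -= (Q[:, t] @ basis[:, s]) * Q[:, t]
        Q[:, s] = v / np.linalg.norm(v)
    return np.sqrt(c1 @ c1) * Q              # (A3)

def theta_tilde(cj, x, n, nu):
    """Posterior-mean nuisance estimate (8)."""
    return float(np.sum(np.asarray(cj) * (x + nu) / (n + 2.0 * nu)))

def beal_type_ci(x, n, c1, nu=0.5, alpha=0.05):
    """Roots of the quadratic (5) in theta1. Returns None in the rare
    imaginary-root case noted in Section 4."""
    x, n, c1 = (np.asarray(a, float) for a in (x, n, c1))
    C = build_C(c1)
    M = np.linalg.inv(C.T)                   # p = M theta
    D, d = np.diag(-c1**2 / n), c1**2 / n
    A, b = M.T @ D @ M, M.T @ d              # v = theta'A theta + b'theta
    rest = [theta_tilde(C[:, j], x, n, nu) for j in range(1, c1.size)]
    t_hat = c1 @ (x / n)
    z2 = norm.ppf(1 - alpha / 2) ** 2

    def lhs(t1):                             # left-hand side of (5)
        th = np.array([t1] + rest)
        return (t_hat - t1) ** 2 - z2 * (th @ A @ th + b @ th)

    # (5) is quadratic in theta1; recover its coefficients from 3 points
    coef = np.polyfit([-1.0, 0.0, 1.0], [lhs(-1.0), lhs(0.0), lhs(1.0)], 2)
    roots = np.roots(coef)
    if np.any(np.abs(roots.imag) > 1e-9):
        return None
    return float(np.min(roots.real)), float(np.max(roots.real))
```

For the k = 2 difference p_1 - p_2 with x = (5, 5)' and n = (10, 10)', the choice \nu = 0 gives roughly (-0.401, 0.401), slightly narrower than the Wald interval (±0.438), consistent with the length comparisons above.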
5. Applications

5.1. Ecology example

We first illustrate our intervals using data from an ecological study recently conducted at the Grand Bay Reserve, an estuarine ecosystem located near the Gulf of Mexico in Jackson County, Mississippi. Because the main results from this study have not yet been published by the owners of the data, by their request, we suppress many of the details and provide only salient aspects of the study. One of the goals was to compare the proportion of "grasshopper-munched" Black Needlerush (Juncus roemerianus) plants between plots with low and high amounts of deadweight (e.g., the total weight of nonliving plants per quadrat). Researchers took measurements in late October (in 2003 and 2005) from different quadrats; we use low and high deadweight quadrats from each year. The data appear in Table 3; the response is the number of plants that exhibit signs of feeding from grasshoppers.

Denote by p_i the true proportion of munched plants for stratum i, i = 1, 2, 3, 4. The factorial treatment structure in the data suggests three mutually orthogonal contrasts to investigate. Table 4 displays the results. The first contrast \theta_1 = c_1'p, where c_1 = (-1/2, 1/2, -1/2, 1/2)', compares the proportions between levels of deadweight, averaged over years. Analogously, the second contrast, with c_1 = (-1/2, -1/2, 1/2, 1/2)', compares the proportions between 2003 and 2005, averaged over the levels of deadweight. The contrast with c_1 = (-1, 1, 1, -1)' is used to assess the interaction between deadweight and year. One notes that the conclusion reached from each contrast is
Table 3
Black Needlerush data for two years and two levels of deadweight (DW)

Stratum   Year   DW level   # Plants sampled (ni)   # Munched (xi)
1         2003   Low        58                      35
2         2003   High       67                      41
3         2005   Low        16                      4
4         2005   High       25                      11

Number of plants and number exhibiting signs of grasshopper munching are also provided.
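As a check on the interval computations for these data, the Wald interval (3) for the first contrast can be computed directly; a minimal sketch (numpy/scipy assumed):

```python
import numpy as np
from scipy.stats import norm

# Table 3 data and the first contrast (deadweight, averaged over years)
n = np.array([58, 67, 16, 25])
x = np.array([35, 41, 4, 11])
c1 = np.array([-0.5, 0.5, -0.5, 0.5])

p_hat = x / n
theta_hat = c1 @ p_hat
v_hat = np.sum(c1**2 * p_hat * (1 - p_hat) / n)
half = norm.ppf(0.975) * np.sqrt(v_hat)
lower, upper = theta_hat - half, theta_hat + half
# rounds to (-0.068, 0.267), the W row for the first contrast in Table 4
```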
Table 4
Black Needlerush data

c1                             Interval   Lower limit   Upper limit   Length
c1 = (-1/2, 1/2, -1/2, 1/2)'   W          -0.068        0.267         0.335
                               LW         -0.075        0.251         0.326
                               PB         -0.072        0.259         0.331
                               H          -0.042        0.228         0.270
                               JP         -0.043        0.230         0.273
                               EB         -0.058        0.248         0.306
c1 = (-1/2, -1/2, 1/2, 1/2)'   W          -0.430        -0.095        0.335
                               LW         -0.406        -0.080        0.326
                               PB         -0.417        -0.087        0.330
                               H          -0.396        -0.109        0.287
                               JP         -0.397        -0.108        0.289
                               EB         -0.405        -0.104        0.301
c1 = (-1, 1, 1, -1)'           W          -0.517        0.153         0.620
                               LW         -0.484        0.168         0.652
                               PB         -0.500        0.162         0.662
                               H          -0.528        0.189         0.717
                               JP         -0.531        0.191         0.722
                               EB         -0.551        0.204         0.755

W, Wald; LW, Laplace–Wald; PB, Price–Bonett; H, Haldane; JP, Jeffreys–Perks; EB, EBMLE. Ninety-five percent confidence intervals for three contrasts using the strata definitions from Table 3. No adjustments have been made for multiplicity.
the same for all six intervals, and, within contrast, the intervals are very similar. First, there is not a significant interaction between deadweight and year. Second, there is a significant difference in the proportion of munching between 2003 and 2005 (perhaps because of the effects of Hurricane Katrina). The effect of deadweight is not significant. It should be noted that we are interpreting each contrast separately; no attempts have been made to correct for multiplicity (see Section 6).

5.2. Multicenter clinical trial example

The use of multicenter clinical trials has grown dramatically in recent years as a means of determining the efficacy of new drug treatments. In such experiments, one goal may be to estimate the pooled proportion of subjects who respond to treatment (or some other binary characteristic), where the pooling is done across the k investigational sites. Because the level of participation is likely different depending on the location, a natural estimate of the pooled proportion, say \theta_1, is the weighted average of the response probabilities from the k sites, i.e., \theta_1 = \sum_{i=1}^{k} c_{1i}p_i, where c_{1i} = n_i/\sum_i n_i, and n_i denotes the number of subjects who participate at the ith site.

To illustrate, we use data from a clinical trial recently conducted across five sites in Bangladesh, Brazil, India, Peru, and Vietnam; see Choice Study Group (2001) for complete details. The goal of the trial was to assess the efficacy of a reduced-salt regimen in treating male infants (in developing countries) for acute watery diarrhea. A total of 675 infants were recruited during the trial from June 1995 through February 1997. One of the characteristics measured
Table 5
Multicenter clinical trial data

Site    Location     # Subjects (ni)   # Fever cases (xi)
1       Bangladesh   158               73
2       Brazil       107               32
3       India        175               44
4       Peru         92                34
5       Vietnam      143               104
Total                675               287

95% CI:  W (0.390, 0.460);  LW (0.392, 0.461);  PB (0.391, 0.460);  H (0.388, 0.463);  JP (0.388, 0.463);  EB (0.388, 0.464)

W, Wald; LW, Laplace–Wald; PB, Price–Bonett; H, Haldane; JP, Jeffreys–Perks; EB, EBMLE. Number of subjects and number of fever cases at the k = 5 sites, and 95% confidence intervals for the pooled proportion \theta_1 = \sum_{i=1}^{k} c_{1i}p_i, where c_{1i} = n_i/675.
was the number of infants (at each site) experiencing fever at admission or during the trial. Our Table 5 reproduces the data and lists 95% confidence intervals for the pooled proportion \theta_1 for the six procedures under investigation. That the intervals are nearly identical is not unexpected since the number of subjects at each site is somewhat large. In carrying out more extensive comparisons involving the pooled proportion parameter, we find the same general pattern as in Tables 1 and 2, namely, that the LW, PB, and EBMLE intervals are all good choices, although the EBMLE can be mildly conservative.
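As a numerical check, the Wald entry of Table 5 can be reproduced directly from (3) with the pooling weights c_{1i} = n_i/675; a minimal sketch (numpy/scipy assumed):

```python
import numpy as np
from scipy.stats import norm

# Table 5 data: subjects and fever cases at the k = 5 sites
n = np.array([158, 107, 175, 92, 143])
x = np.array([73, 32, 44, 34, 104])
c1 = n / n.sum()                    # weights c1i = ni/675

p_hat = x / n
theta_hat = c1 @ p_hat              # pooled estimate, 287/675
v_hat = np.sum(c1**2 * p_hat * (1 - p_hat) / n)
half = norm.ppf(0.975) * np.sqrt(v_hat)
lower, upper = theta_hat - half, theta_hat + half
# rounds to (0.390, 0.460), the W row of Table 5
```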
6. Discussion

In this paper, we have generalized the approaches taken by Beal (1987) and Roths and Tebbs (2006) in the two-sample problem with independent binomial samples. In particular, we have presented three new large-sample confidence intervals for a single linear combination, compared them to the intervals discussed in Price and Bonett (2004), and illustrated their use with two real data examples. We have created a website at http://www.stat.sc.edu/∼tebbs/index.htm that contains an R program to compute all six intervals considered. Our C++ programs used to perform the extensive comparisons in Section 4 are available by request.

In removing the effects of the nuisance parameters \theta_j, j = 2, 3, ..., k, we have chosen for simplicity to consider a beta(\nu, \nu) prior, common for p_1, p_2, ..., p_k. Of course, this choice could be altered by (a) assuming a common beta(\alpha, \beta) prior, where \alpha and \beta are known, or even (b) assuming a beta(\alpha_i, \beta_i) prior for p_i, where \alpha_i and \beta_i are known, for i = 1, 2, ..., k. Using either extension would not greatly complicate the calculations involved here, and doing so could prove useful because the researcher might have good information as to where p_1, p_2, ..., p_k are likely located. It would be more difficult, however, although not impossible, to implement our parametric empirical Bayesian approach with the more flexible prior in (a). This would involve estimating \alpha and \beta from the marginal distribution of X, a calculation that is known to sometimes possess numerical instability. Implementing an empirical Bayesian approach in (b) would not be possible since there are 2k parameters and only k sufficient statistics.

What is perhaps most interesting with our proposed approach is that it can be used in or extended to a variety of other problems. First, the general methodology that we have presented could be easily adapted to handle other discrete probability distributions, such as the Poisson or geometric.
In our survey of the literature, we have found that little research has focused on estimating a Poisson linear combination (and none that has focused on estimating a geometric linear combination). All one would need to implement our approach with other models is a single-parameter prior distribution (e.g., gamma or beta, respectively) for each of the k contrast parameters. Second, we have restricted attention to linear combinations of the form \theta_1 = c_1'p in this paper; however, current work suggests little doubt that our approach could be extended to handle a more general parametric function, e.g., \theta_1 = h(p) = p_1p_2p_3, etc. Finally, we also believe that our approach could be extended to simultaneous estimation problems involving a set of g < k linear combinations. Of course, one could just compute the g intervals separately, using the methods outlined here, and implement a Bonferroni correction. However, the resulting intervals would likely be too conservative, similar to what Piegorsch (1991) found in the MCA (all-pairwise comparisons) problem with independent binomial samples.
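As a final illustration, the empirical-Bayes step of Section 3.1, which the prior extensions discussed above would generalize, amounts to numerically solving the marginal score equation for \nu. A hedged sketch (function names, the bracketing interval, and the rule for declaring \hat{\nu} infinite are our own choices):

```python
import numpy as np
from scipy.special import digamma
from scipy.optimize import brentq

def marginal_score(nu, x, n):
    """d/dnu log f(x|nu) for the marginal likelihood (7)."""
    x, n = np.asarray(x, float), np.asarray(n, float)
    return np.sum(digamma(x + nu) + digamma(n - x + nu)
                  + 2.0 * digamma(2.0 * nu)
                  - 2.0 * digamma(n + 2.0 * nu) - 2.0 * digamma(nu))

def ebmle_nu(x, n, lo=1e-8, hi=1e4):
    """EBMLE choice of nu: root of the score equation, taken as
    infinite (the limiting case in Section 3.1) when the score does
    not change sign on (lo, hi]."""
    if marginal_score(lo, x, n) * marginal_score(hi, x, n) > 0:
        return np.inf
    return brentq(marginal_score, lo, hi, args=(x, n))
```

For data pulled toward a common proportion (e.g., x = (5, 5)', n = (10, 10)') the score stays positive and \hat{\nu} is effectively infinite, triggering the limiting estimate (1/2)\sum_i c_{ji}; for dispersed data (e.g., x = (1, 9)') a finite root obtains.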
Acknowledgments

We appreciate the thoughtful comments from the anonymous referees; implementing their suggestions has greatly improved this paper. We also thank Samuel Walker and his colleagues for the grasshopper data.

References

Agresti, A., Coull, B., 1998. Approximate is better than "exact" for interval estimation of binomial proportions. Amer. Statist. 52, 119–126.
Agresti, A., Caffo, B., 2000. Simple and effective confidence intervals for proportions and differences of proportions result from adding two successes and two failures. Amer. Statist. 54, 280–288.
Beal, S., 1987. Asymptotic confidence intervals for the difference between two binomial parameters for use with small samples. Biometrics 73, 941–950.
Brown, L., Li, X., 2005. Confidence intervals for two sample binomial distribution. J. Statist. Plann. Inference 130, 359–375.
Brown, L., Cai, T., DasGupta, A., 2001. Interval estimation for a binomial proportion. Statist. Sci. 16, 101–117.
Choice Study Group, 2001. Multicenter, randomized, double-blind clinical trial to evaluate the efficacy and safety of a reduced osmolarity oral rehydration salts solution in children with acute watery diarrhea. Pediatrics 107, 613–618.
Christensen, R., 2002. Plane Answers to Complex Questions. Springer, New York.
Cox, D., Reid, N., 1987. Parameter orthogonality and approximate conditional inference. J. Roy. Statist. Soc. Ser. B 49, 1–17.
Good, I., 1965. The Estimation of Probabilities: An Essay on Modern Bayesian Methods. MIT Press, Cambridge.
Greenland, S., 2001. Simple and effective confidence intervals for proportions and differences of proportions result from adding two successes and two failures. Amer. Statist. 55, 172.
Haldane, J., 1945. On a method of estimating frequencies. Biometrika 33, 222–225.
Morris, C., 1983. Parametric empirical Bayes inference: theory and applications. J. Amer. Statist. Assoc. 78, 47–55.
Newcombe, R., 1998. Two-sided confidence intervals for the single proportion: comparison of seven methods. Statist. Med. 17, 857–872.
Piegorsch, W., 1991. Multiple comparisons for analyzing dichotomous response. Biometrics 47, 45–52.
Price, R., Bonett, D., 2004. An improved confidence interval for a linear function of binomial proportions. Comput. Statist. Data Anal. 45, 449–456.
Reiczigel, J., 2003. Confidence intervals for the binomial parameter: some new considerations. Statist. Med. 22, 611–621.
Roths, S., Tebbs, J., 2006. Revisiting Beal's confidence intervals for the difference of two binomial proportions. Comm. Statist. Theory Methods 35, 1593–1609.
Zhou, X., Tsao, M., Qin, Q., 2004. New intervals for the difference between two independent binomial proportions. J. Statist. Plann. Inference 123, 97–115.