Exact likelihood ratio test of independence of binary responses within clusters

Exact likelihood ratio test of independence of binary responses within clusters

Computational Statistics & Data Analysis 33 (2000) 15–23 www.elsevier.com/locate/csda Exact likelihood ratio test of independence of binary responses...

92KB Sizes 0 Downloads 32 Views

Computational Statistics & Data Analysis 33 (2000) 15–23 www.elsevier.com/locate/csda

Exact likelihood ratio test of independence of binary responses within clusters Seung-Ho Kang a;∗ , Soo Min Park b a

Division of Biometry and Risk Assessment, National Center for Toxicological Research, Food and Drug Administration, 5900 NCTR Road, Je erson, AR72079, USA b Department of Statistics, University of Wisconsin at Madison, USA

Abstract We consider the exact likelihood ratio test of independence of binary responses within clusters developed by George and Kodell (J. Amer. Statist. Assoc., 1996, 91, 1602–1611) conditional on the sucient statistics for the nuisance parameter under the null hypothesis. Only an exchangeability of responses within clusters is assumed for the distribution of the observed vector. Two formulas for the exact P-values are obtained. An algorithm is proposed to avoid unnecessary enumeration when the c 2000 Elsevier cluster size is two. We compare the exact and asymptotic P-values in several cases. Science B.V. All rights reserved. Keywords: Correlated binary data; Exchangeability; Exact test; Otolaryngology; Ophthalmology; Longitudinal study

1. Introduction Correlated binary data arise in many applications, including longitudinal studies, ophthalmologic and otolaryngologic clinical trials and developmental toxicity experiments. It is customary to use methods that allow for the correlation of the responses from the same cluster. The motivation for this is that the responses from the same clusters tend to behave more alike than the responses from di erent clusters. Correlated binary data are often analyzed using quasi-likelihood and generalized estimating equations methods and beta-binomial models and their generalizations. Recently, a ∗

Corresponding author.

c 2000 Elsevier Science B.V. All rights reserved. 0167-9473/00/$ - see front matter PII: S 0 1 6 7 - 9 4 7 3 ( 9 9 ) 0 0 0 4 4 - 4

16

S.-H. Kang, S.M. Park / Computational Statistics & Data Analysis 33 (2000) 15–23

method based on exchangeable binary random variables was proposed (Bowman and George, 1995; George and Bowman, 1995; George and Kodell, 1996). In comparison to the generalized estimating equations and beta-binomial procedures, the exchangeable model makes a minimal assumption about data. George and Kodell (1996) developed a nonparametric likelihood ratio test of independence of binary responses within clusters based on exchangeable binary random variables. The test relies on the asymptotic null distribution as the sample size goes to in nity. However, in many medical and biological applications the sample size may not be large enough to rely on asymptotic inference. The aim of this paper is to develop an exact likelihood ratio test of independence conditional on the sucient statistics for the nuisance parameter under the null hypothesis when cluster size is xed. A simple algorithm is proposed to speed up computing time when cluster size is two and complete enumeration is used when cluster size is greater than two. 2. Review A nite sequence of binary random variables X1 ; X2 ; : : : ; Xn is exchangeable if for any m(≤ n) and any vector (x1 ; x2 ; : : : ; xm ), with xi = 0; 1; i = 1; 2; : : : ; m, P(X(1) = x1 ; : : : ; X(m) = xm ) = P(X1 = x1 ; : : : ; Xm = xm ) for any permutation (1); : : : ; (m) of 1; : : : ; m: Let k = P(Xi1 = 1; Xi2 = 1; : : : ; Xik = 1) where {i1 ; : : : ; ik } is a subset of {1; : : : ; n}; k = 1; : : : ; n and 0 = 1. Using an inclusion and exclusion principle, George and Bowman (1995) obtained P(X1 = x1 ; : : : ; Xn = x n ) =

n−r X

(−1)

j

j=0

P



n−r j



r+j

where r = ni=1 xi . Let (Yi1 ; : : : ; Yin ); i = 1; : : : ; M , be independent vectors of binary random variables such P that each of the vectors have a common exchangeable distribution. Let Ri = nj=1 Yij ; i = 1; : : : ; M . Then (R1 ; : : : ; RM ) is a random sample from a population. Let Aj be the number of samples for which Ri equal j; j = 0; 1; : : : ; n and i = 1; : : : ; M: The null hypothesis of independence within clusters is given by H0 : r = 1r ;

r = 1; : : : ; n:

George and Kodell (1996) developed a likelihood ratio test of independence of binary responses within clusters when cluster sizes are random. In this paper we consider only the case that cluster size is xed. When cluster size (n) is xed, the likelihood ratio statistic obtained by George and Kodell (1996) is given by (1 ) =

" #Ar n Y ˆr1 (1 − ˆ1 )n−r Mn! r=0

Ar r!(n − r)!

;

S.-H. Kang, S.M. Park / Computational Statistics & Data Analysis 33 (2000) 15–23

where ˆ1 =

17

Pn

rAr : nM

r=0

The limiting null distribution of −2 log() is the chi-square. Based on the assumption that A0 6= M and An 6= M , the degree of freedom for the asymptotic chi-square distribution of −2 log() is 1.

3. Exact test It is a traditional approach to form exact tests by conditioning on sucient statistics for the nuisance parameters under the null hypothesis (Agresti, 1992). Under H0 1 is a nuisance parameter. Under H0 the log-likelihood is expressed by log(L(1 )) =

M X

log P(Yi1 = yi1 ; : : : ; Yin = yin )

i=1

=

M X



n−r Xi

log

i=1

=

M X i=1

=−

(−1) j

j=0



n − ri j





1ri +j 

log(1n [ − 1 + 1−1 ]n−ri )

M X i=1

ri log(−1 + 1−1 ) + Mn log(1 − 1 ): P

From the factorization theorem M i=1 Ri is a sucient statistics for 1 under H0 . Let Yi = (Yi1 ; Yi2 ; : : : ; Yin ); i = 1; : : : ; M . Since data from the same clusters are PM exchangeable, ( Y1 ; : : : ; YM ) is equally likely given i=1 Ri = R∗ under H0 . PH0

M ! X 1 ∗ Y1 = y 1 ; : : : ; Y M = y M R i = R =   nM i=1

R∗

for any possible ( y1 ; : : : ; yM ), because the number of all possible Y with is 

nM R∗

PM

i=1

Ri =R∗



:

(1)

P P ∗ ˆ Since nr=0 rAr is constant given M i=1 Ri =R , 1 is also constant. Therefore, given ∗ i=1 Ri = R , −2 log() is equivalent to

PM

T =2

n X r=0

Ar log[Ar r!(n − r)!]

(2)

18

S.-H. Kang, S.M. Park / Computational Statistics & Data Analysis 33 (2000) 15–23

with 0 log(0) = 0. The exact P-value of the likelihood ratio test conditional on PM ∗ i=1 Ri = R is 

1 nM R∗

where B=

X



1(T ( Y )≥T ( Y 0 )) ;

(3)

Y ∈B

 

Y|



n M X X

yij = R∗ ; yij is either 0 or 1

i=1 j=1

  

and Y 0 is the observed vector and 1 is the indicator function. Even if (3) is a right formula, we can obtain a simpler expression of the exact P-value using (A0 ; A1 ; : : : ; An ). The number of all possible cases of ( Y1 ; : : : ; YM ) with (A0 ; A1 ; : : : ; An ) is n Y M! × i=0 Ai ! i=0

Qn

 Ai

n i

:

(4)

From (1) and (4) the conditional probability of (A0 ; A1 ; : : : ; An ) given under H0 is n Y M! × i=0 Ai ! i=0

Qn

 Ai

n i



×

nM R∗

PM

i=1

Ri = R∗

−1

:

(5)

of (A0 ; A1 ; : : : ; An ). The exact condiLet (a0 ; a1 ; : : : ; an ) be a vector of the values P ∗ tional distribution of (A0 ; A1 ; : : : ; An ) given M i=1 Ri = R under H0 can be obtained by enumerating all possible vectors (a0 ; a1 ; : : : ; an ) with integer component satisfying Pn 0 ≤ ai ≤ M , i=1 ai i = R∗ and computing corresponding values of (2) along with probability (5). An P alternative expression of the exact P-value of the likelihood ratio ∗ test conditional on M i=1 Ri = R is n Y M! 1(T ( A)≥T ( A0 )) × Qn × i=0 Ai ! i=0 A∈D

X

 Ai

n i



×

nM R∗

−1

;

(6)

where A0 is the observed vector of A and (

A = (A0 ; A1 ; : : : ; An )

D = A|0 ≤ Ai ≤ M;

n X

)

Ai i = R



:

i=1

It is easier to compute the exact P-value with (6) than with (3), because the number of elements of D is smaller than that of B. 4. Algorithm when cluster size is two Formula (6) holds for any xed positive cluster size n. However, in this section we are particularly interested in the case of n = 2. Those cases often occur in ophthalmologic and otolaryngologic clinical trials. When cluster size is two, independence of binary responses within clusters may be tested by Fisher’s exact test with

S.-H. Kang, S.M. Park / Computational Statistics & Data Analysis 33 (2000) 15–23

19

multinomial sampling scheme. However, Fisher’s exact test does not incorporate the assumption of exchangeability while the exact likelihood ratio test does. P

∗ De nition. When we enumerate all possible (A0 ; A1 ; A2 ) with M i=1 Ri = R , we say we generate them in the decreasing order if (A0 ; A1 ; A2 ) with the larger value of Pn i i=0 Ai 2 is enumerated earlier. Similarly, we de ne the increasing order.

P

∗ Theorem. Suppose; given M i=1 Ri = R ; we enumerate (A0 ; A1 ; A2 ) in the decreasing order. Let w be a positive integer. (a) When R∗ = 2w; all possible vectors of (A0 ; A1 ; A2 ) and T (k) are expressed by

(M − w − k; 2k; w − k);

k = 0; 1; : : : ; min(w; M − w)

T (k) = 2(M −w −k)log[2(M −w −k)]+4k log(2k) + 2(w −k)log[2(w −k)]: Then T (k) decreases as k increases (k = 0; 1; : : : ; [w(M − w)=M ]) and then T (k) increases as k increases (k = [w(M − w)=M ] + 1; : : : ; w). (b) When R∗ = 2w + 1; all possible vectors of (A0 ; A1 ; A2 ) and T (k) are expressed by (M − w − k − 1; 2k + 1; w − k); k = 0; 1; : : : ; min(w; M − w − 1) T (k) = 2(M − w − k − 1)log[2(M − w − k − 1)] +2(2k + 1)log(2k + 1) + 2(w − k)log[2(w − k)]: Then T (k) decreases as k increases (k = 0; 1; : : : ; [(w(M − w − 1) − 1=4)=M ]) and then T (k) increases as k increases (k = [(w(M − w − 1) − 1=4)=M ] + 1; : : : ; w). The proof can be obtained directly from di erentiation of T (k). With the above theorem we do not need to enumerate all possible (A0 ; A1 ; A2 ) to compute the exact P-value of the likelihood ratio test of independence given Ri = R∗ and the following simple algorithm can be used. Step 1: We enumerate (A0 ; A1 ; A2 ) in the decreasing order and add its exact conditional probability to the P-value until the value of T is the same as that of the observed vector. Step 2: We do the same process as (A0 ; A1 ; A2 ) is enumerated in the increasing order. A FORTRAN program is made to compute the exact P-value of the test. The program uses the observed values of (A0 ; A1 ; A2 ) as input data and gives the exact and asymptotic P-values. The program is implemented on a 400 MHz Pentium IBM PC with 128M RAM and available from the author upon request. We compute exact and asymptotic P-values for several cases and results are presented in Table 1. The elapsed time is less than a few seconds even for the last case. It seems that the asymptotic P-value is more liberal than the exact P-value. We observe that the di erence between the two P-values decreases as the sample size increases.

20

S.-H. Kang, S.M. Park / Computational Statistics & Data Analysis 33 (2000) 15–23

Table 1 Comparison of exact and asymptotic P-values (A0 ; A1 ; A2 )

exact P-value

asymp. P-value

(6; 6; 10) (13; 15; 16) (21; 28; 25) (45; 62; 41) (112; 190; 119) (210; 374; 220) (450; 758; 390)

0.071 0.067 0.061 0.070 0.051 0.057 0.051

0.039 0.035 0.037 0.049 0.046 0.049 0.046

An alternative formulation of the paired binary case may be illuminative. We consider the model from Hirji (1994). P(X1 = x1 ; X2 = x2 ) =

 x1 +x2 x1 x2 ; 1 + 2 +  2 

 ¿ 0;  ¿ 0:

This model is equivalent to one in this paper. That is,  = P 1 =(1 − 21 − 2 ), P M  = (1 − 21 − 2 )=12 . Note that S1 = M i=1 (Xi1 + Xi2 ) and S2 = i=1 Xi1 Xi2 are the sucient statistics for  and , respectively. Then the exact conditional distribution of S2 given S1 is given by c(s1 ; s2 )s2 ; P(S2 = s2 |S1 = s1 ) = P u u c(s1 ; u)

(7)

where c(s1 ; s2 ) is the coecient of  s1 s2 in the expansion of the polynomial (1 + 2 +  2 )M . The exact conditional distribution under the null hypothesis, (7) with  = 0, could be adapted as a discrepancy measure between the null and alternative hypothesis. In such a case, the exact P-value can be obtained from the algorithm proposed by Hirji (1994). The algorithm also provides an easy method to compute the exact conditional distribution of the sucient statistic. 5. Examples 5.1. Otolaryngologic clinical trial We illustrate the algorithm by using the following example. The data set is from a double-blind randomized clinical trials comparing two antibiotics, cefaclor and amoxicillin, used for the treatment of acute otitis media (Mandel et al., 1982). Each child had acute otitis media infection in both ears at the beginning of treatment. We de ne Yi1 = 1 if the right ear is clear at the 14th day and 0 otherwise. Similarly, let Yi2 = 1 de ne the state of the left ear. Without additional information, such as the e ect of the right (or left) ear-speci c covariates on the severity of ear infection, it is reasonable to assume that Yi1 and Yi2 are exchangeable. For children with age 2–5, (A0 ; A1 ; A2 ) = (6; 6; 10) was observed for the cefaclor group at the 14th day. All

S.-H. Kang, S.M. Park / Computational Statistics & Data Analysis 33 (2000) 15–23

21

Table 2 P22 Enumeration for R = 26 and M = 22 i=1 i (A0 ; A1 ; A2 )

# of cases

Probability

T

Contribution to P-value

(9,0,13) (8,2,12) (7,4,11) (6,6,10) (5,8,9) (4,10,8) (3,12,7) (2,14,6) (1,16,5) (0,18,4)

497 420 × 20 29 099 070 × 22 232 792 560 × 24 597 500 904 × 26 640 179 540 × 28 320 089 770 × 210 77 597 520 × 212 8 953 560 × 214 447 678 × 216 7315 × 218

4:8 × 10−7 1:1 × 10−4 0.004 0.037 0.159 0.318 0.309 0.142 0.028 0.002

136.7 123.4 116.0 111.2 108.3 107.0 107.3 109.3 113.1 120.7

Yes Yes Yes Yes No No No No Yes Yes

P

possible con gurations of (A0 ; A1 ; A2 ) given 22 i=1 Ri = 0 × 6 + 1 × 6 + 2 × 10 = 26, are summarized in Table 2. The second column gives the number of Y with (A0 ; A1 ; A2 ) expressed by (4). Since the observed value of T is 111.2, the rejection region is {(9; 0; 13); (8; 2; 12); (7; 4; 11); (6; 6; 10); (1; 16; 5); (0; 18; 4)}: The exact P-value is 0.071. The asymptotic P-value based on the chi-square approximation with 1 degree of freedom is 0.039, which leads to a contrary conclusion concerning the hypothesis of independence. 5.2. Longitudinal study We consider the cohort study to evaluate falls of the elderly after hospital discharge (Hansen et al., 1999). Elderly patients were enrolled from April 1994 to May 1996 in hospitals. The purpose of this study is to determine the rate of falls and injuries occurring in high risk elderly over the 90 days (about 14 weeks) after hospitalization, and to determine the relationship of pre-hospital risk factors to the rate of falls after discharge. Accidental fall was the outcome of interest. We considered whether the patient fell in every two weeks (binary data, 1-fall, 0-not fall). In preliminary analysis, we might be interested in testing of independence within clusters. For male patients with age over 90 the following data were observed. (A0 ; A1 ; A2 ; A3 ; A4 ; A5 ; A6 ; A7 ) = (8; 1; 2; 1; 0; 0; 0; 0): When cluster size is greater than two, we resort to complete enumeration with (6). Complete enumeration is feasible when the sample size is relatively small. From the implementation of a developed FORTRAN program, the exact and asymptotic P-values of the likelihood ratio test are 0.1003 and 0.0138, respectively. The elapsed time is less than a few seconds.

22

S.-H. Kang, S.M. Park / Computational Statistics & Data Analysis 33 (2000) 15–23

6. Discussion The main advantage of exact tests is that it is guaranteed to bound the type I error probability to any desired level unconditionally. Therefore, the exact test is recommended as long as it is feasible and it is the most important concern to preserve the type I error. However, the conditioning often makes the distribution highly discrete. Hence tests of nominal size based on the exact conditional P-value may have actual size considerably less than , which results in the loss of power (e.g. Suissa and Shuster, 1985). Upton (1982) argues for simply reporting the P-value and not making comparisons to such arbitrary levels. Lancaster (1961) proposes to use the mid P-value, which is de ned to be half the probability of the observed result plus the probability of more extreme values. There are studies that show that the mid-P-value performs well (Hirji, 1991; Hirji et al., 1991). A disadvantage of the mid P-value is that tests based on it are not exact, the actual size possibly exceeding the nominal level. It might be an interesting future study to investigate the performance of the mid-P-value in this problem. Recently, a modi ed P-value has been used to decrease the conservativeness that occurs due to discreteness and guarantee to preserve the type I error unconditionally (Kim and Agresti, 1995). The proposed exact test is conditional on R∗ , which implies that the exact conditional power PH1 (reject H0 |R∗ ) is an appropriate tool to investigate the performance (i.e. power) of the exact test under the alternative hypothesis after an experiment. However, when we design an experiment, we often need to know the sample size to achieve a predetermined power. Since we do not observe R∗ , the unconditional exact power is an appropriate tool in such a case. The unconditional exact power is given by a weighted sum of the conditional exact power, i.e., PH1 (reject H0 ) =

X R∗

PH1 (reject H0 |R∗ )PH1 (R∗ ):

Exact unconditional power computations are orders of magnitude more dicult than exact tests and has begun to receive attention recently (Hilton and Mehta, 1993; Mehta et al., 1998). It is a future study to develop an ecient algorithm to compute exact unconditional power. In this paper we adapt the conditional exact test where the nuisance parameter is eliminated by conditioning on the sucient statistic. Another way to remove the nuisance parameter is to use the unconditional exact test (Agresti, 1992). An unconditional exact test is to compute the P-value by maximizing the null likelihood over the nuisance parameter space. Problems with the unconditional exact tests are that it is computationally intensive and could be very conservative if the maximum occurs at the point far away from the true value of the nuisance parameter (Berger and Boos, 1994). Acknowledgements This research was supported in part by an appointment to the Postgraduate Research Program at the National Center for Toxicological Research administered by

S.-H. Kang, S.M. Park / Computational Statistics & Data Analysis 33 (2000) 15–23

23

the Oak Ridge Institute for Science and Education through an interagency agreement between the U.S. Department of Energy and the U.S. Food and Drug Administration. References Agresti, A., 1992. A survey of exact inference for contingency tables. Statist. Sci. 7, 131–177. Berger, R.L., Boos, D.D., 1994. P-values maximized over a con dence set for the nuisance parameter. J. Amer. Statist. Assoc. 89, 1012–1016. Bowman, D., George, E.O., 1995. A saturated model for analyzing exchangeable binary data: applications to clinical and developmental toxicity studies. J. Amer. Statist. Assoc. 90, 871– 879. George, E.O., Kodell, R.L., 1996. Tests of independence, treatment heterogeneity, and dose-related trend with exchangeable binary data. J. Amer. Statist. Assoc. 91, 1602–1611. George, E.O., Bowman, D., 1995. A full likelihood procedure for analysing exchangeable binary data. Biometrics 51, 512–523. Hansen, K., Mahoney, J., Palta, M. 1999. Risk factors of lack of recovery of ADL independence after hospital discharge. J. Amer. Geriatrics Soc., to appear. Hilton, J., Mehta, C.R., 1993. Power and sample size calculations for exact conditional tests with ordered categorical data. Biometrics 49, 609–616. Hirji, K.F., 1991. A comparison of exact, mid-P and score tests for matched case-control studies. Biometrics 47, 487– 496. Hirji, K.F., 1994. Exact analysis for paired binary data. Biometrics 50, 964 –974. Hirji, K.F., Tan, S., Elasho , R.M., 1991. A quasi-exact test for comparing two binomial parameters. Statist. Med. 10, 1137–1153. Kim, D., Agresti, A., 1995. Improved exact inference about conditional association in three-way contingency tables. J. Amer. Statist. Assoc. 90, 632– 639. Lancaster, H.O., 1961. Signi cance tests in discrete distribution. J. Amer. Statist. Assoc. 56, 223–234. Mandel, E., Bluestone, C.D., Rockette, H.E., Blatter, M.M., Reisinger, K.S., Wucher, F.P., Harper, J., 1982. Duration of e usion after antibiotic treatment for acute Otitis Media: comparison of Cefaclor and Amoxicillin. Pediatric Infectious Diseases 1, 310 –316. Mehta, C.R., Patel, N.R., Senchaudhuri, P., 1998. Exact power and sample-size computations for the Cochran-Armitage trend test. Biometrics 54, 1615–1621. Suissa, S., Shuster, J.J., 1985. Exact unconditional sample sizes for the 2 × 2 binomial trial. J. Roy. Statist. Soc. Ser. A 148, 317–327. Upton, G.J.G., 1982. A comparison of alternative tests for the 2 × 2 comparative trial. J. Roy. Statist. Soc. Ser. A 145, 86 –105.