Controlled Clinical Trials 24 (2003) 411–421
Simple approaches to assess the possible impact of missing outcome information on estimates of risk ratios, odds ratios, and risk differences

Laurence S. Magder, Ph.D., M.P.H.*
Department of Epidemiology and Preventive Medicine, University of Maryland, Baltimore, Maryland, USA

Manuscript received August 5, 2002; manuscript accepted January 27, 2003
Abstract

Often in clinical trials, the primary outcome is binary and the impact of an intervention is summarized using risk ratios (RRs), odds ratios (ORs), or risk differences (RDs). It is typical that in such studies, the binary outcome variable is not observed for some study participants. When there are missing data, it is well known that analyses based on those participants with complete data can be biased unless it can be assumed that the probability of a missing outcome is unrelated to the value of the missing binary outcome (i.e., missing at random). Unfortunately, this assumption cannot be assessed with the data since the missing outcomes, by definition, are not observed. One approach to this problem is to perform a sensitivity analysis to see the degree to which conclusions based only on the complete data would be affected given various degrees of departure from the missing at random assumption. In this paper we provide researchers formulae for doing such a sensitivity analysis. We quantify the departure from the missing at random assumption with a parameter we call the “response probability ratio” (RPR). This is the ratio between the probability of a nonmissing outcome among those with one value of the binary outcome and the probability of a nonmissing outcome among those with the other value of the outcome. Then we provide simple formulae for the estimation of the RRs, ORs, and RDs given any specific values of the RPRs. In addition to being useful for sensitivity analyses, these formulae provide some insight into the conditions that are necessary for bias to occur. In particular, it can be seen that, under certain plausible assumptions, OR estimates based on participants with complete data will be asymptotically unbiased, even if the probability of missing outcome depends on both the treatment and the outcome. © 2003 Elsevier Inc. All rights reserved.
Keywords: Missing data; Sensitivity analysis; Risk ratios; Odds ratios; Risk differences; Nonresponse bias; Dropout; Binary data; Nonignorable missingness
* Department of Epidemiology and Preventive Medicine, University of Maryland, 660 W. Redwood Street, Baltimore, MD 21201-1596. Tel.: +1-410-707-3808; fax: +1-410-706-3253. E-mail address: [email protected]
0197-2456/03/$ (see front matter) © 2003 Elsevier Inc. All rights reserved. doi:10.1016/S0197-2456(03)00021-7
Introduction

Often in clinical trials, the primary outcome is binary (e.g., treatment success, yes/no), and the impact of an intervention is summarized from 2 × 2 tables using risk ratios (RRs), odds ratios (ORs), or risk differences (RDs). It is typical that in such studies, the binary outcome variable is not observed for some study participants. This could be due to measurement problems, the fact that patients drop out of the study, or many other reasons. It is well known that if the probability that an outcome variable is missing is related to the unobserved value of the outcome variable, then estimates of impact based on the observed data can be biased [1,2]. In this paper we present simple formulae that researchers can use to determine the potential impact of missing outcome information on their conclusions. In addition, inspection of these formulae reveals the conditions that must be present for bias to occur.

For example, a trial was conducted to compare optic nerve decompression surgery (ONDS) to careful follow-up for treatment of nonarteritic anterior ischemic optic neuropathy [3]. Treatment success was defined as an improvement of three or more lines of visual acuity. For each patient, at 12 months after randomization, there were three possible outcomes: success, failure, or missing 12-month data. Table 1 provides the 12-month results.

One approach to the analysis of these data is to base the analysis on only those for whom outcome data are available. In this example that would result in the following estimates:

Probability of success given ONDS = 37/107 (35%)
Probability of success given careful follow-up = 47/114 (41%)
Ratio of success probabilities = RR = 0.35/0.41 = 0.84
95% confidence interval for RR: (0.6–1.2)
p-value for null hypothesis of no difference between the treatments = 0.31

It seems likely that failure to complete the 12-month visit would not be a random event.
Such nonresponse might be related to treatment if the side effects of one treatment lead patients to drop out. It is also possible that failure to complete a 12-month visit might be related to the outcome if those with a poorer (or better) outcome are less likely to return for their 12-month visit. What is the degree of potential bias of our estimates of RRs, ORs, and RDs under these conditions if we ignore the missing patients? Under what conditions does this bias arise? These are the questions we will address in this paper.

Table 1. 12-month results from the Ischemic Optic Neuropathy Decompression Trial [3]: number of patients with each outcome

| Treatment | Success (≥ three lines better vision) | Failure (< three lines better vision) | Missing |
|---|---|---|---|
| ONDS | 37 | 70 | 20 |
| Careful follow-up | 47 | 67 | 17 |

ONDS = optic nerve decompression surgery.

Methods

Our general approach is to define a parameter that quantifies the degree of departure from the “missing-at-random” assumption. Then we derive formulae that give maximum likelihood
estimators for the RRs, ORs, and RDs given a fixed value for that parameter. Researchers can use these formulae to calculate estimates of association given various values of that parameter (i.e., for various degrees of departure from the missing-at-random assumption). This makes it possible to see the degree to which their estimates are sensitive to that assumption.

Notation for the parameters

Let $p_{S|A}$ and $p_{S|B}$ stand for the probabilities of treatment success given treatments A and B. Then, let

$$\mathrm{RR} = \frac{p_{S|A}}{p_{S|B}}, \qquad \mathrm{OR} = \frac{p_{S|A}/(1 - p_{S|A})}{p_{S|B}/(1 - p_{S|B})}, \qquad \mathrm{RD} = p_{S|A} - p_{S|B}.$$

We assume that these are the parameters of scientific interest. In addition, we must define parameters related to the probability of missing information regarding the outcome. Let RP stand for the response probability (i.e., the probability that a patient will not have a missing value for the outcome variable). We allow for the possibility that the response probabilities may vary depending on what treatment group the patient is in (A or B) and what the actual outcome for that patient is (success or failure). Thus we define $\mathrm{RP}_{AS}$ and $\mathrm{RP}_{A\bar{S}}$ to be the response probabilities for those with treatment A who had treatment successes and failures, respectively. Similarly, we define $\mathrm{RP}_{BS}$ and $\mathrm{RP}_{B\bar{S}}$ to be the response probabilities for those with treatment B who had treatment successes and failures, respectively. Then, let $\mathrm{RPR}_A$ be the response probability ratio for treatment A (i.e., $\mathrm{RPR}_A = \mathrm{RP}_{AS}/\mathrm{RP}_{A\bar{S}}$). Analogously, define $\mathrm{RPR}_B$ as the response probability ratio for treatment B. When both the RPRs equal one, the data are missing at random. Greater or lesser values of the RPRs quantify the degree of departure from the missing-at-random assumption.

Notation for the data

As in the example described above, we assume that patients are randomly allocated to one of two groups (A and B), and there are three possible outcomes for each patient (success, failure, and missing).
The numbers of patients in each category are given by the letters $a$, $b$, $c$, $d$, $m_A$, and $m_B$ as indicated in Table 2.
Results

Complete-case estimates

Using the above notation, the standard estimates for the RR, OR, and RD, ignoring the fact that there are missing data, are given in the first column of Table 3. Estimates such as these that apply standard methods to those with complete information are often referred to as “complete-case” estimates.

Table 2. Notation for data from a randomized trial with binary outcome and missing responses

| Treatment | Success | Failure | Missing |
|---|---|---|---|
| Treatment A | $a$ | $b$ | $m_A$ |
| Treatment B | $c$ | $d$ | $m_B$ |
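Table 3. Complete-case and adjusted estimates of measures of effect from 2 × 2 tables

| Effect measure | Standard estimates that do not take into account nonresponse | Estimates accounting for specified values of the response probability ratios (RPRs) |
|---|---|---|
| Risk ratio | $\widehat{\mathrm{RR}}_{\text{complete-case}} = \dfrac{a/(a + b)}{c/(c + d)}$ | $\widehat{\mathrm{RR}}_{\mathrm{adj}} = \dfrac{a/(a + \mathrm{RPR}_A b)}{c/(c + \mathrm{RPR}_B d)}$ |
| Odds ratio | $\widehat{\mathrm{OR}}_{\text{complete-case}} = \dfrac{ad}{bc}$ | $\widehat{\mathrm{OR}}_{\mathrm{adj}} = \dfrac{\mathrm{RPR}_B}{\mathrm{RPR}_A} \cdot \dfrac{ad}{bc}$ |
| Risk difference | $\widehat{\mathrm{RD}}_{\text{complete-case}} = \dfrac{a}{a + b} - \dfrac{c}{c + d}$ | $\widehat{\mathrm{RD}}_{\mathrm{adj}} = \dfrac{a}{a + \mathrm{RPR}_A b} - \dfrac{c}{c + \mathrm{RPR}_B d}$ |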
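As a minimal sketch of how the Table 3 formulas might be applied, the Python function below computes the adjusted RR, OR, and RD from the four observed counts and two assumed RPRs. The function name `adjusted_estimates` and its argument names are illustrative and do not come from the paper; the counts are those of Table 1 with ONDS taken as treatment A, and 0.73 is used only as an example of a common RPR value.

```python
def adjusted_estimates(a, b, c, d, rpr_a=1.0, rpr_b=1.0):
    """Return (RR, OR, RD) for assumed response probability ratios.

    a, b: observed successes and failures in treatment group A
    c, d: observed successes and failures in treatment group B
    With rpr_a == rpr_b == 1 the results are the complete-case estimates.
    """
    p_a = a / (a + rpr_a * b)  # adjusted success probability, group A
    p_b = c / (c + rpr_b * d)  # adjusted success probability, group B
    rr = p_a / p_b
    odds_ratio = (rpr_b / rpr_a) * (a * d) / (b * c)
    rd = p_a - p_b
    return rr, odds_ratio, rd


# Table 1 counts: A = ONDS (37 successes, 70 failures),
# B = careful follow-up (47 successes, 67 failures).
rr_cc, or_cc, rd_cc = adjusted_estimates(37, 70, 47, 67)
# Hypothetical scenario: the same RPR of 0.73 in both groups.
rr_eq, or_eq, rd_eq = adjusted_estimates(37, 70, 47, 67, rpr_a=0.73, rpr_b=0.73)

print(round(rr_cc, 2), round(or_cc, 2), round(rd_cc, 2))
print(round(rr_eq, 2), round(or_eq, 2), round(rd_eq, 2))
```

Because the OR formula involves the RPRs only through their ratio, the two calls return the same odds ratio, while the RR and RD shift slightly.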
Adjusted estimates

For any given values of $\mathrm{RPR}_A$ and $\mathrm{RPR}_B$ within the range of values supported by the data (as described below), the maximum likelihood estimates of the RR, OR, and RD, adjusting for the missing data probabilities, are given in the second column of Table 3. As maximum likelihood estimators, these estimators are asymptotically unbiased. The justifications for these formulae are provided in the appendix. Formulae for large sample confidence intervals are also provided in the appendix.

Conditions required for bias

Since the adjusted estimates are asymptotically unbiased, a comparison of the complete-case estimates to the adjusted estimates provides insight into the key determinants of bias. A number of useful conclusions can be arrived at by comparing these formulae:

Given the RPRs, bias does not depend on absolute probabilities of response. Note that the adjusted estimates do not depend on the actual response probabilities (RP) in each treatment group except through the RPRs. Response probabilities can be very low, but this in itself does not lead to bias. The bias depends on the values of the RPRs.

Bias does not depend directly on the association between treatment group and response probabilities. Response probabilities can be very different in the two treatment groups, but this, in itself, does not lead to bias. Again, the key parameters are the RPRs.

If there is no association between outcome and response probabilities, the complete-case estimates are asymptotically unbiased. Note that if $\mathrm{RPR}_A = \mathrm{RPR}_B = 1$, then the complete-case and adjusted estimates are equivalent. Note that this could occur even if the probability of missing data varies between treatment groups. This is an example of data that are “missing at random.”

The complete-case OR estimates are asymptotically unbiased under certain plausible conditions, even if the response probabilities are associated with both treatment and outcome. Note that if the ratio of the RPRs, $\mathrm{RPR}_B/\mathrm{RPR}_A$, equals 1, then the complete-case OR estimate is asymptotically unbiased. This could occur even if the probability of missing outcome data depends on both the treatment group and the outcome. In this situation, although the missing data are technically “nonignorable,” they can be safely ignored in an analysis. The only time the complete-case OR estimate is biased is if the response probabilities depend on both the treatment and the outcome, and the relationships between the response probabilities and the outcome differ in the two treatment groups.

What the data say about the RPRs

From the likelihood perspective, the data provide most support for parameter values that make the observed data most probable [4]. In our problem, the observed data are realized trinomial random variables in each treatment group (see appendix). It is well known that the probability of a trinomial outcome is highest for parameter values that imply that the probability of each outcome is equal to the proportions observed in each cell. Thus, by the likelihood criterion, the data provide most support for parameter sets that imply that the probability of each outcome is equal to the observed sample proportions. Unfortunately, given the number of parameters in the model (if the RPRs are not assumed known), multiple parameter sets can imply probabilities equal to the observed proportions. The data provide equal support for a large set of parameter sets, and the parameters are said to be unidentifiable. Thus, the data provide equal support for a range of RPRs that are in this set of parameter sets with highest support. However, RPRs outside of this range have less support from the likelihood perspective. RPRs are outside this range if it is impossible to choose other parameter values in such a way that the implied probability of each outcome is equal to the observed proportions. This range can be found algebraically, resulting in the following conclusions.

1. The data provide most support for $\mathrm{RPR}_A$ in the range $a/(a + m_A)$ to $(b + m_A)/b$.
2. The data provide most support for $\mathrm{RPR}_B$ in the range $c/(c + m_B)$ to $(d + m_B)/d$.
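The two ranges above are simple to compute from the observed counts. The following sketch, in which the helper name `rpr_range` is an illustrative assumption rather than anything defined in the paper, evaluates them for the Table 1 data.

```python
def rpr_range(successes, failures, missing):
    """Range of RPR values with most support from the data in one group.

    Lower limit: successes / (successes + missing)
    Upper limit: (failures + missing) / failures
    Assumes at least one observed success and one observed failure.
    """
    lower = successes / (successes + missing)
    upper = (failures + missing) / failures
    return lower, upper


# Table 1 counts (successes, failures, missing) for each treatment group.
surgery = rpr_range(37, 70, 20)
no_surgery = rpr_range(47, 67, 17)

print("Surgery:    %.3f to %.3f" % surgery)
print("No surgery: %.3f to %.3f" % no_surgery)
```

A sensitivity analysis would then restrict the assumed RPR for each group to values inside the corresponding range.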
The lower limits of these ranges correspond to the RPRs that would have been observed under the extreme scenario that all those with missing data were treatment successes. To see this, consider the first row of Table 2. If all patients with missing outcomes were treatment successes, then the observed RP among those with treatment success would be $a/(a + m_A)$. Also, in that case, the observed RP among those with treatment failure would be 1.0, so the observed $\mathrm{RPR}_A$ would be $a/(a + m_A)$, the lower limit given above. Similarly, the upper limits correspond to the scenario that all those with missing data were treatment failures. The estimates of treatment success probabilities that result from these extreme scenarios are equivalent to the sharp bounds on treatment success derived by Horowitz and Manski [5].

Example

Table 4 gives adjusted estimates based on the data from Table 1 under various assumptions about the missing data probabilities. The first row provides the results of the analysis that ignores the missing data. The second row provides the results assuming that the RPR is 1.0 for both treatments. As is evident from the formulae, this analysis is equivalent to the complete-case analysis.

Table 4. Estimation of risk ratios, odds ratios, and risk differences from the data in Table 1 under various assumptions about the missing data probabilities. Estimates (95% confidence intervals^a)

| Assumptions | Risk ratio | Odds ratio | Risk difference | p-value^b |
|---|---|---|---|---|
| 1. Complete-case analysis | 0.84 (0.60, 1.18) | 0.75 (0.44, 1.30) | −0.07 (−0.19, 0.06) | 0.31 |
| 2. $\mathrm{RPR}_{\text{Surgery}} = \mathrm{RPR}_{\text{No Surgery}} = 1$ | 0.84 (0.60, 1.18) | 0.75 (0.44, 1.30) | −0.07 (−0.19, 0.06) | 0.31 |
| 3. $\mathrm{RPR}_{\text{Surgery}} = \mathrm{RPR}_{\text{No Surgery}} = 0.73$ | 0.86 (0.64, 1.16) | 0.75 (0.44, 1.30) | −0.07 (−0.20, 0.06) | 0.31 |
| 4. $\mathrm{RPR}_{\text{Surgery}} = \mathrm{RPR}_{\text{No Surgery}} = 1.28$ | 0.83 (0.57, 1.19) | 0.75 (0.44, 1.30) | −0.06 (−0.18, 0.06) | 0.31 |
| 5. $\mathrm{RPR}_{\text{Surgery}} = 0.65$, $\mathrm{RPR}_{\text{No Surgery}} = 1.25$ | 1.25 (0.90, 1.73) | 1.45 (0.84, 2.50) | 0.09 (−0.04, 0.21) | 0.18 |
| 6. $\mathrm{RPR}_{\text{Surgery}} = 1.28$, $\mathrm{RPR}_{\text{No Surgery}} = 0.73$ | 0.60 (0.42, 0.84) | 0.43 (0.25, 0.74) | −0.20 (−0.32, −0.07) | 0.0024 |

a. Confidence intervals based on Taylor Series approximation as described in the appendix.
b. Based on the asymptotic distribution of the log of the adjusted odds ratio estimate as described in the appendix.

It might be reasonable to assume that the factor by which the outcome variable affects the probability of response is the same in both groups (i.e., $\mathrm{RPR}_{\text{Surgery}} = \mathrm{RPR}_{\text{No Surgery}}$). Now,
note that the data are most consistent with $\mathrm{RPR}_{\text{Surgery}}$ being in the range of 0.65 to 1.28 and $\mathrm{RPR}_{\text{No Surgery}}$ being in the range of 0.73 to 1.25. Thus to consider the extreme possibilities consistent with the data under the assumption that the RPRs in the two groups are equivalent, we estimated the effects under the assumptions that $\mathrm{RPR}_{\text{Surgery}} = \mathrm{RPR}_{\text{No Surgery}} = 0.73$ and $\mathrm{RPR}_{\text{Surgery}} = \mathrm{RPR}_{\text{No Surgery}} = 1.25$ (rows three and four of Table 4). If we are willing to make this assumption, then it is evident that the conclusions of the analysis do not change substantially.

If it is thought possible that among those with surgery, treatment successes have a lower probability of returning for their 12-month assessment, while among those without surgery, treatment successes have a higher probability of returning for their 12-month assessment, then we might postulate that $\mathrm{RPR}_{\text{Surgery}}$ is less than 1, while $\mathrm{RPR}_{\text{No Surgery}}$ is greater than 1. To consider the most extreme possibility consistent with the data under this assumption, we calculated estimates assuming $\mathrm{RPR}_{\text{Surgery}} = 0.65$ and $\mathrm{RPR}_{\text{No Surgery}} = 1.25$. Because 0.65 is the lower limit of the range for the surgery group and 1.25 is the upper limit for the no-surgery group, the point estimate from this model is the same as one would get if all those with an unknown outcome in the surgery group were assumed to be treatment successes, while all those with an unknown outcome in the no-surgery group were assumed to be treatment failures. The results are seen in the fifth line of Table 4. Under this assumption, the data still do not provide evidence in favor of one treatment or another. The last line of the table considers how the results would change if the opposite extremes were chosen.

Estimation of absolute probabilities

The same methods can be used to derive estimates of the absolute probabilities of treatment success in each group. The estimate for the probability of treatment success in group A, adjusting for the missing data probabilities, is:

$$\hat{p}_{S|A} = \frac{a}{a + \mathrm{RPR}_A b}.$$

A formula for a large sample confidence interval for $p_{S|A}$ is provided in the appendix. Analogous expressions can be used to estimate $p_{S|B}$.
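The link between the limits of the RPR range and the extreme imputation scenarios can be checked numerically. The sketch below is an illustration only: the function name `adjusted_success_probability` is hypothetical, and the counts are those of the ONDS group in Table 1. It evaluates the adjusted success probability at both limits and compares it with the proportion obtained by counting every missing outcome as a success or as a failure.

```python
def adjusted_success_probability(successes, failures, rpr):
    """Adjusted estimate of the success probability for one group."""
    return successes / (successes + rpr * failures)


# ONDS group from Table 1: a successes, b failures, m_a missing.
a, b, m_a = 37, 70, 20
n_a = a + b + m_a

lower_rpr = a / (a + m_a)    # limit reached if all missing were successes
upper_rpr = (b + m_a) / b    # limit reached if all missing were failures

p_at_lower = adjusted_success_probability(a, b, lower_rpr)
p_at_upper = adjusted_success_probability(a, b, upper_rpr)

# Direct imputation of the missing outcomes gives the same proportions.
p_all_missing_success = (a + m_a) / n_a
p_all_missing_failure = a / n_a

print(p_at_lower, p_all_missing_success)
print(p_at_upper, p_all_missing_failure)
```

Each printed pair agrees up to floating-point rounding, which is the sense in which the extreme RPRs reproduce the bounds obtained by imputing all missing outcomes one way or the other.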
Estimation of the response probabilities

For a given RPR, it is also possible to derive maximum likelihood estimates of the individual response probabilities. Specifically,

$$\widehat{\mathrm{RP}}_{AS} = \frac{a + \mathrm{RPR}_A b}{n_A} \quad \text{and} \quad \widehat{\mathrm{RP}}_{A\bar{S}} = \frac{a + \mathrm{RPR}_A b}{n_A \mathrm{RPR}_A},$$

where $n_A = a + b + m_A$, with analogous expressions for $\widehat{\mathrm{RP}}_{BS}$ and $\widehat{\mathrm{RP}}_{B\bar{S}}$.

Discussion

There have been many articles in the literature [6–13] related to the analysis of categorical data when some of the classifications are missing. Most papers consider models that allow for missing information regarding both the predictor variables and the outcome variables [8–10,12,13]. This greatly increases the complexity of the models. This complexity is not necessary in the context of clinical trials because information regarding the group assignment is rarely if ever missing. Several of the papers only considered models in which the missing data mechanism is ignorable [8,10]. Those that considered models that allowed for a nonignorable missing data mechanism generally did not take the perspective of performing a sensitivity analysis based on the fixed value of a key parameter, confining their analysis to identifiable models [6,7,9,11–13].

Various types of sensitivity analyses have been proposed previously [14–19]. Nordheim [14] takes an approach similar to ours but parameterizes the model differently, using the ratio of probabilities of missing data, rather than the ratio of response probabilities, to quantify departures from the missing at random assumption. This approach results in quadratic equations and more complicated formulae. Vach and Blettner [15] propose a sensitivity analysis similar to ours; however, they consider the case where categorical predictors are missing but the outcome is completely observed. Scharfstein et al. [16] propose a sensitivity analysis similar to ours for the situation when the missing outcome is due to dropout and the dropout time is known. They parameterize the departure from the missing at random assumption with a parameter that stands for the factor by which the hazard of dropout increases. Hollis [17] and Delucchi [18] provide graphical methods for showing the range of results that could have been observed based on various possible allocations of the missing outcomes to treatment successes and failures. The estimates of impact that result from their “extreme” cases (which occur when assigning all the missing outcomes to success in one group and failure in the other) would be equivalent to the estimates found using our methods when the extreme RPRs are taken at the endpoints of the intervals supported by the data. Finally, Matts et al. [19] present an interesting graphical approach to illustrate the impact of various assumptions about those who dropped out on the results of a survival analysis. None of the above papers derived the relatively simple formulae presented in this paper, nor are the formulae presented here derivable as special cases from any of the models in these papers.

Although this method was described with reference to clinical trial data, the method can be used for observational data when there is complete information about a predictor, but some missing information about outcomes.
It is encouraging to find that even if the data are not missing at random, the complete-case OR is asymptotically unbiased as long as $\mathrm{RPR}_A = \mathrm{RPR}_B$. In many cases it might be reasonable to make this assumption. For example, if it is thought that the probability that a study participant has a missing outcome is the product of the probabilities of being missing due to background factors, treatment factors, and outcome-related factors, then the two RPRs would be equivalent.

The question for the data analyst is how to choose the set of reasonable parameter values (RPRs) over which to conduct the sensitivity analysis. One approach, as illustrated in our example, is to choose values of the RPRs in the range supported by the data. However, some RPRs in this range may seem implausible, and, as Scharfstein et al. point out [16], a choice of the range of values to consider is best made by subject-matter experts and those most familiar with the trial. This choice can be better informed if researchers collect data, when possible, on the reason for the missing outcome. If the reasons for missing outcome data are deemed likely to be unrelated to the value of the outcome variables (e.g., missing because study participants moved to another city or missing due to laboratory errors), then values of RPR closer to 1 may be considered reasonable. The results of the sensitivity analysis will have greater credibility if the choice of the range of parameters is made prior to unblinding the data.

In presenting the results, graphical summaries (such as those presented in Refs. [15–19]) are useful to show the sensitivity analysis over a wide range of assumptions. If a single best estimate of the effect of treatment is needed for decision making, a Bayesian approach can be used (putting a prior on the values of the sensitivity parameter), as suggested by Scharfstein et al. [16].

Acknowledgments

The author would like to thank P. David Wilson, Mona Baumgarten, and two anonymous referees for helpful comments regarding earlier drafts of this paper. This work was supported by research grant R0-1 AR 43727 of the National Institutes of Health.

Appendix

Adjusted point estimates for RR and OR

The observed outcome variable for each participant in the study can take one of three values: success, failure, and missing. Let $Y$ stand for the observed outcome for one study participant. Let:
$$Y = \begin{cases} 1 & \text{if success} \\ 2 & \text{if failure} \\ 3 & \text{if missing} \end{cases}$$
As a general shorthand notation, let $p_{u|v}$ stand for the conditional probability of $u$ given $v$, for any events $u$ and $v$. Thus, for example, $p_{Y=1|A}$ stands for the probability that $Y$ equals 1 given treatment A.
The data provide direct information and obvious estimators for $p_{Y=1|A}$, $p_{Y=2|A}$, and $p_{Y=3|A}$ and the same probabilities given treatment B. Unfortunately, these are not the parameters of primary interest. The parameters of primary interest are $p_{S|A}$ and $p_{S|B}$. The general approach is to find expressions for the parameters of primary interest in terms of the parameters that have obvious estimators, and substitute in the obvious estimators. Note:

$$p_{Y=2|A} = (1 - p_{S|A})\,\mathrm{RP}_{A\bar{S}} \tag{1}$$

and

$$p_{Y=1|A} = p_{S|A}\,\mathrm{RPR}_A\,\mathrm{RP}_{A\bar{S}}. \tag{2}$$

Considering $\mathrm{RPR}_A$, $p_{Y=2|A}$, and $p_{Y=1|A}$ as fixed, these constitute two equations in two unknowns. Solving them for $p_{S|A}$ and $\mathrm{RP}_{A\bar{S}}$, we get:

$$p_{S|A} = \frac{p_{Y=1|A}}{p_{Y=1|A} + \mathrm{RPR}_A\, p_{Y=2|A}} \tag{3}$$

and

$$\mathrm{RP}_{A\bar{S}} = \frac{p_{Y=1|A} + \mathrm{RPR}_A\, p_{Y=2|A}}{\mathrm{RPR}_A}. \tag{4}$$

The maximum likelihood estimators for $p_{Y=1|A}$ and $p_{Y=2|A}$ are $a/n_A$ and $b/n_A$, respectively, where $n_A = a + b + m_A$. Substituting these into Eqs. (3) and (4) results in:

$$\hat{p}_{S|A} = \frac{a}{a + \mathrm{RPR}_A b} \tag{5}$$

and

$$\widehat{\mathrm{RP}}_{A\bar{S}} = \frac{a + \mathrm{RPR}_A b}{n_A \mathrm{RPR}_A}. \tag{6}$$

Using analogous reasoning:

$$\hat{p}_{S|B} = \frac{c}{c + \mathrm{RPR}_B d} \tag{7}$$

and

$$\widehat{\mathrm{RP}}_{B\bar{S}} = \frac{c + \mathrm{RPR}_B d}{n_B \mathrm{RPR}_B}, \tag{8}$$

where $n_B = c + d + m_B$. Taking the ratio of Eqs. (5) and (7), the ratio of their associated odds, or their difference yields the estimates $\widehat{\mathrm{RR}}_{\mathrm{adj}}$, $\widehat{\mathrm{OR}}_{\mathrm{adj}}$, and $\widehat{\mathrm{RD}}_{\mathrm{adj}}$ presented in Table 3. Since these estimates are based on maximum likelihood estimates of the probabilities of $Y$, they themselves must be maximum likelihood estimates due to the invariance of the likelihood to reparameterizations.
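The algebra in Eqs. (1) to (6) can be sanity-checked numerically. The following sketch is an illustration under stated assumptions: the function name `response_probabilities` is hypothetical, the counts are the ONDS group of Table 1, and the RPR values 0.8 and 0.5 are arbitrary choices made here to show one value inside and one value outside the range supported by the data.

```python
def response_probabilities(a, b, m, rpr):
    """Return (p_success, rp_success, rp_failure) for one treatment group.

    p_success  follows Eq. (5), rp_failure follows Eq. (6), and
    rp_success = rpr * rp_failure by the definition of the RPR.
    """
    n = a + b + m
    p_success = a / (a + rpr * b)
    rp_failure = (a + rpr * b) / (n * rpr)
    rp_success = rpr * rp_failure
    return p_success, rp_success, rp_failure


a, b, m = 37, 70, 20          # ONDS group, Table 1
n = a + b + m

# Hypothetical RPR inside the supported range (about 0.65 to 1.29).
p_s, rp_s, rp_f = response_probabilities(a, b, m, rpr=0.8)

# Eqs. (1) and (2) should reproduce the observed cell proportions.
implied_failure_cell = (1 - p_s) * rp_f   # should equal b / n
implied_success_cell = p_s * rp_s         # should equal a / n
print(implied_failure_cell, b / n)
print(implied_success_cell, a / n)

# Hypothetical RPR below the lower limit: the estimated response
# probability among failures is no longer a valid probability.
_, _, rp_f_outside = response_probabilities(a, b, m, rpr=0.5)
print(rp_f_outside)
```

The last value exceeds 1, which illustrates why the assumed RPRs need to stay within the range supported by the data.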
As mentioned above, these formulae are only valid if $\mathrm{RPR}_A$ and $\mathrm{RPR}_B$ are chosen in the range supported by the data. If they are outside these ranges, these expressions will result in estimates of response probabilities that are not between 0 and 1.

Confidence intervals for the RR, OR, and RD adjusting for unequal probabilities of missing responses

We used the “delta method” [20] to derive large sample variances of $\log_e(\widehat{\mathrm{RR}}_{\mathrm{adj}})$ and $\log_e(\widehat{\mathrm{OR}}_{\mathrm{adj}})$ based on the distributions of the estimates of the probability that $Y$ = 1, 2, and 3. This is also sometimes referred to as the “Taylor Series Approach” [21]. This results in the following formulae:

Approximate 95% confidence interval for the RR, adjusting for unequal probabilities of missing response:

$$\exp\left( \log(\widehat{\mathrm{RR}}_{\mathrm{adj}}) \pm 1.96 \sqrt{ \frac{\mathrm{RPR}_A^2\, b(a + b)}{(\mathrm{RPR}_A b + a)^2\, a} + \frac{\mathrm{RPR}_B^2\, d(c + d)}{(\mathrm{RPR}_B d + c)^2\, c} } \right)$$

Approximate 95% confidence interval for the OR, adjusting for unequal probabilities of missing response:

$$\exp\left( \log(\widehat{\mathrm{OR}}_{\mathrm{adj}}) \pm 1.96 \sqrt{ \frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d} } \right)$$

Approximate 95% confidence interval for the RD, adjusting for unequal probabilities of missing response:

$$\widehat{\mathrm{RD}}_{\mathrm{adj}} \pm 1.96 \sqrt{ \frac{\mathrm{RPR}_A^2\, ab(a + b)}{(a + b\,\mathrm{RPR}_A)^4} + \frac{\mathrm{RPR}_B^2\, cd(c + d)}{(c + d\,\mathrm{RPR}_B)^4} }$$

For levels of confidence other than 95%, replace 1.96 in the formulae with the appropriate quantile from the standard normal distribution.
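A minimal sketch of these interval formulae in Python follows. The function name `adjusted_intervals`, the dictionary keys, and the two-sided p-value computed from the log odds ratio with the standard library's `math.erfc` are choices made for this illustration; the p-value reflects one reading of the Table 4 footnote rather than code from the paper. The counts are those of Table 1.

```python
import math


def adjusted_intervals(a, b, c, d, rpr_a=1.0, rpr_b=1.0, z=1.96):
    """Point estimates, large-sample intervals, and a p-value from log(OR)."""
    p_a = a / (a + rpr_a * b)
    p_b = c / (c + rpr_b * d)
    rr = p_a / p_b
    odds_ratio = (rpr_b / rpr_a) * (a * d) / (b * c)
    rd = p_a - p_b

    # Large-sample variances taken from the appendix formulae.
    var_log_rr = (rpr_a ** 2 * b * (a + b) / ((rpr_a * b + a) ** 2 * a)
                  + rpr_b ** 2 * d * (c + d) / ((rpr_b * d + c) ** 2 * c))
    var_log_or = 1 / a + 1 / b + 1 / c + 1 / d
    var_rd = (rpr_a ** 2 * a * b * (a + b) / (a + b * rpr_a) ** 4
              + rpr_b ** 2 * c * d * (c + d) / (c + d * rpr_b) ** 4)

    def log_scale_interval(estimate, variance):
        half_width = z * math.sqrt(variance)
        return estimate * math.exp(-half_width), estimate * math.exp(half_width)

    half_rd = z * math.sqrt(var_rd)
    # Two-sided normal p-value for H0: adjusted OR = 1 (assumed form).
    z_stat = math.log(odds_ratio) / math.sqrt(var_log_or)
    p_value = math.erfc(abs(z_stat) / math.sqrt(2))

    return {
        "rr": (rr, log_scale_interval(rr, var_log_rr)),
        "or": (odds_ratio, log_scale_interval(odds_ratio, var_log_or)),
        "rd": (rd, (rd - half_rd, rd + half_rd)),
        "p_value": p_value,
    }


# Complete-case analysis of the Table 1 data (both RPRs equal to 1).
complete_case = adjusted_intervals(37, 70, 47, 67)
for key, value in complete_case.items():
    print(key, value)
```

Passing a different `z` (for example 1.645 for a 90% interval) changes the confidence level, and passing other `rpr_a` and `rpr_b` values gives the intervals for the corresponding sensitivity scenario.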
References

[1] Little RJA, Rubin DB. Statistical analysis with missing data. New York: John Wiley & Sons, 1987.
[2] Carpenter J, Pocock S, Lamm CJ. Coping with missing data in clinical trials: a model-based approach applied to asthma trials. Stat Med 2002;21:1043–1066.
[3] The Ischemic Optic Neuropathy Decompression Trial Research Group. Ischemic Optic Neuropathy Decompression Trial: twenty-four-month update. Arch Ophthalmol 2000;118:793–798.
[4] Royall R. Statistical evidence: a likelihood paradigm. New York: Chapman & Hall, 1997.
[5] Horowitz JL, Manski CF. Nonparametric analysis of randomized experiments with missing covariate and outcome data. J Am Stat Assoc 2000;95:77–84.
[6] Blumenthal S. Multinomial sampling with partially categorized data. J Am Stat Assoc 1968;63:542–551.
[7] Elashoff JD, Elashoff RM. Two-sample problems for a dichotomous variable with missing data. Applied Statistics 1974;23:26–34.
[8] Hocking RR, Oxspring HH. The analysis of partially categorized contingency data. Biometrics 1974;30:469–483.
[9] Chen T, Fienberg SE. Two-dimensional contingency tables with both completely and partially cross-classified data. Biometrics 1974;30:629–642.
[10] Fuchs C. Maximum likelihood estimation and model selection in contingency tables with missing data. J Am Stat Assoc 1982;77:270–278.
[11] Baker SG, Laird NM. Regression analysis for categorical variables with outcome subject to nonignorable nonresponse. J Am Stat Assoc 1988;83:62–69.
[12] Baker SG, Rosenberger WF, DerSimonian R. Closed-form estimates for missing counts in two-way contingency tables. Stat Med 1992;11:643–657.
[13] Molenberghs G, Goetghebeur EJT, Lipsitz SR, Kenward MG. Nonrandom missingness in categorical data: strengths and limitations. American Statistician 1999;53:110–118.
[14] Nordheim EV. Inference from nonrandomly missing categorical data: an example from a genetics study on Turner’s syndrome. J Am Stat Assoc 1984;79:772–780.
[15] Vach W, Blettner M. Logistic regression with incompletely observed categorical covariates - investigating the sensitivity against violation of the missing at random assumption. Stat Med 1995;14:1315–1329.
[16] Scharfstein DO, Rotnitzky A, Robins JM. Adjusting for nonignorable drop-out using semiparametric nonresponse models. J Am Stat Assoc 1999;94:1096–1120.
[17] Hollis S. A graphical sensitivity analysis for clinical trials with non-ignorable missing binary outcome. Stat Med 2002;21:3823–3834.
[18] Delucchi KL. Methods for the analysis of binary outcomes results in the presence of missing data. J Consult Clin Psychol 1994;62:569–575.
[19] Matts JP, Launer CA, Nelson ET, et al. A graphical assessment of the potential impact of losses to followup on the validity of study results. Stat Med 1997;16:1943–1954.
[20] Bishop YMM, Fienberg SE, Holland PW. Discrete multivariate analysis. Cambridge, Massachusetts: MIT Press, 1975.
[21] Kleinbaum DG, Kupper LL, Morgenstern H. Epidemiologic research: principles and quantitative methods. New York: Van Nostrand Reinhold, 1982.