EDITORIAL
Strategies to control confounding in causation studies

Management and policy decisions in health care often rely on evidence confirming a causal relationship between exposure and outcome. Whether one is addressing the question of adverse outcomes from treatment or factors that play a role in the etiology of disease, one is interested in determining whether there is a cause-and-effect relationship. A factor can be said to cause an event if its operation increases the frequency of that event. Hence a causal relationship exists if the frequency of a particular outcome is higher in the group exposed to the causal factor than in one not exposed. This condition also implies that individuals with the outcome of interest will more frequently have been exposed to the causal agent.

Experimental and observational studies in health care are designed to identify factors that cause disease. The initial focus of the study should be directed towards designing appropriate selection procedures, so that the internal validity of the study can be enhanced. Before attributing the relationship between the outcome event observed and the exposure under evaluation to causation, one needs to rule out the possibility that the relationship has occurred spuriously from bias, confounding, or chance variation.

Confounding is defined as a distortion of an exposure–outcome association brought about by the association of another factor with both outcome and exposure.¹ Thus, the apparent association between an exposure and disease may actually be due to another factor. To be a confounder, a factor has to exhibit two simultaneous and independent properties: it has to be associated with both the exposure and the outcome under evaluation.
To illustrate this point, consider the debate that has been ongoing for at least 10 years regarding the use of fertility drugs and the subsequent development of ovarian cancer. The results from poorly conducted and relatively dated case–control studies implicated fertility drugs as causal agents for ovarian cancer.² Applying the criteria for the diagnosis of causation to the evidence in the literature failed to support with confidence the purported role of fertility drugs in ovarian cancer development.³ In a more recent cohort study, it was observed that the risk of ovarian cancer was not increased with fertility drug use but was associated with a diagnosis of unexplained infertility.⁴ It is common practice to treat women with unexplained infertility with ovarian stimulation and intrauterine insemination. Thus, it is evident that the relationship between fertility drugs and ovarian cancer is confounded by unexplained infertility.

1361-259X/02/$ - see front matter © 2002 Published by Elsevier Science Ltd. doi:10.1054/ebog.40, available online at http://www.idealibrary.com

Confounding can be controlled in the design and analysis phases of the study by five methods: restriction, randomization, stratification, matching, and multivariate analysis.⁵

1. The simplest strategy, restriction, is to have inclusion criteria that specify a value of the potential confounding variable and exclude from the study everyone with a different value. In the example noted above, the study could be restricted to women with anovulation, thereby avoiding the possibility of confounding the relationship between fertility drug use and ovarian cancer. Any causal association then observed would not be due to unexplained infertility. The disadvantage of this method is that the study becomes specific to a restricted group and cannot be generalized beyond that target population. This problem can become serious if restriction is used to control too many confounders or control them in too narrow a fashion.

2. Randomized trials are not affected by confounding because only subjects who are eligible are entered into the study. The main advantage of random allocation of subjects to exposure and non-exposure groups, if this is feasible and the sample size is adequate, is that in the two groups so produced the confounding variables are likely to be equally distributed. It is the only method able to control factors that have not yet been recognized as confounders.

3. Stratification ensures that the cases and controls or the exposed and unexposed subjects with similar levels of a potential confounding factor can be compared. It involves segregating the subjects into strata according to the level of the potential confounder (e.g. different age groups or different diagnostic categories of infertility, etc.) and then examining the relationship between the two groups separately within each stratum.
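The stratified comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: every count is invented, the names `Table`, `odds_ratio`, and `mantel_haenszel_or` are hypothetical, and the Mantel-Haenszel summary odds ratio is a standard pooling method that the editorial itself does not name.

```python
from typing import NamedTuple


class Table(NamedTuple):
    """One 2x2 table of counts (all values below are invented)."""
    a: int  # exposed, outcome present
    b: int  # exposed, outcome absent
    c: int  # unexposed, outcome present
    d: int  # unexposed, outcome absent


def odds_ratio(t: Table) -> float:
    """Cross-product odds ratio for a single 2x2 table."""
    return (t.a * t.d) / (t.b * t.c)


def mantel_haenszel_or(strata: list[Table]) -> float:
    """Mantel-Haenszel summary odds ratio pooled across strata."""
    num = sum(t.a * t.d / sum(t) for t in strata)
    den = sum(t.b * t.c / sum(t) for t in strata)
    return num / den


# Hypothetical strata of the confounder (e.g. diagnostic category).
strata = [
    Table(a=16, b=784, c=4, d=196),  # stratum with more exposure and higher risk
    Table(a=2, b=198, c=8, d=792),   # stratum with less exposure and lower risk
]

# Unstratified (crude) table: add the strata together cell by cell.
crude = Table(*(sum(cells) for cells in zip(*strata)))

crude_or = odds_ratio(crude)
stratum_ors = [odds_ratio(t) for t in strata]
pooled_or = mantel_haenszel_or(strata)

print(f"crude OR    = {crude_or:.2f}")
print("stratum ORs =", [f"{x:.2f}" for x in stratum_ors])
print(f"pooled OR   = {pooled_or:.2f}")
```

With these invented counts the crude odds ratio is about 1.5, while the odds ratio within each stratum, and the pooled estimate, is 1.0. A gap of this kind between the unstratified and stratified results is what would mark the stratifying variable as a confounder.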
The advantage of stratification is that several analyses can be performed to decide which variables are confounders by determining whether the results of the stratified analyses are substantially different from those of the unstratified analyses. The main disadvantage is that only a limited number of variables can be controlled simultaneously. Further, the number of subjects remaining in each stratum could become very small. The resulting instability in the outcome measure would produce considerable variability in the estimates of association, thereby reducing the confidence one would have in the results of such analyses.

Evidence-based Obstetrics and Gynecology (2002) 4, 1–2

4. Matching is most commonly used in case–control studies, but can also be used with cohort studies. It involves selecting for each case or exposed subject a control with the same value of the confounding variable (e.g. duration of infertility, female age group, body mass index, geographical location, etc.). Matching, like restriction, is a sampling strategy to prevent confounding by allowing comparison only between groups that share the same level of the confounder. However, matching differs from restriction in preserving generalizability, because subjects at all levels of the confounder can be studied. It is an effective way of preventing confounding by constitutional factors, such as age and gender, that are often strong determinants of outcome. Matching can be used to control confounders, such as genetic and familial factors, that cannot be measured or controlled in any other way. A very useful strategy in this regard is to compare matched siblings or twins with one another. Another method is to match by clinic to account for any unspecified differences among the subjects seen at different centers. Matching may increase the precision of the results by allowing the numbers of cases (or exposed subjects) and controls at each level of the confounder to be balanced. However, one disadvantage of matching is that the factor which is matched cannot itself be assessed in terms of its relationship to the outcome. Consequently, to avoid inefficiency in the study, matching should only be used for risk factors that are known confounders. When matching is used, it is important to recognize that the correct analysis of the matched data requires special analytical techniques that consider each matched group of cases and controls as one unit. Thus, subjects are compared only with individuals with whom they have been matched and not with subjects who have differing levels of the confounder. The use of ordinary statistical analysis techniques for matched data can lead to incorrect results.
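To make the last point concrete, the sketch below contrasts a pair-based analysis with an ordinary unmatched one for 1:1 matched case–control data. It rests on assumptions: the pair counts are invented, the names `PairCounts`, `matched_odds_ratio`, and `unmatched_odds_ratio` are hypothetical, and the ratio of discordant pairs is one standard matched estimator, not a technique the editorial specifies.

```python
from typing import NamedTuple


class PairCounts(NamedTuple):
    """Counts of 1:1 matched case-control pairs by exposure status (invented)."""
    both_exposed: int
    case_only: int     # case exposed, matched control unexposed
    control_only: int  # control exposed, matched case unexposed
    neither: int


def matched_odds_ratio(p: PairCounts) -> float:
    """Odds ratio with each pair as the unit: ratio of the discordant pairs."""
    return p.case_only / p.control_only


def unmatched_odds_ratio(p: PairCounts) -> float:
    """Ordinary cross-product odds ratio after discarding the pairing."""
    cases_exposed = p.both_exposed + p.case_only
    cases_unexposed = p.control_only + p.neither
    controls_exposed = p.both_exposed + p.control_only
    controls_unexposed = p.case_only + p.neither
    return (cases_exposed * controls_unexposed) / (cases_unexposed * controls_exposed)


# Hypothetical study of 100 matched pairs.
pairs = PairCounts(both_exposed=40, case_only=20, control_only=10, neither=30)

matched_or = matched_odds_ratio(pairs)
unmatched_or = unmatched_odds_ratio(pairs)

print(f"matched OR   = {matched_or:.2f}")
print(f"unmatched OR = {unmatched_or:.2f}")
```

For these invented counts the pair-based estimate is 2.0, whereas ignoring the matching gives 1.5. Pairs in which case and control share the same exposure status carry no information in the matched analysis, which is why only the discordant pairs enter the estimate.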
5. Finally, the problem of confounding can be handled by analyzing the data using multivariate techniques. The outcome under consideration becomes the dependent variable for which a model can be constructed to include the putative causal factor and confounding factor(s) as independent variables. For binary outcomes, logistic regression analysis is a very useful approach to identify the model that best fits the data. To use such models requires identifying the confounding factors in advance, so that quantitative information on them can be collected. Confounding effects can then be assessed by changes in the coefficient of the exposure variable when the confounding factor is added to, or removed from, the model. One of the great advantages of multivariate techniques is the capacity to control the influence of many confounders simultaneously.

Methodological rigour in the design, conduct, and analysis phases of studies is essential to the generation of evidence for health care decision-making. In this context, the deployment of strategies to rule out confounding is vitally important if the inferences that are made from the results of a study evaluating a cause-and-effect relationship are going to be of any value to healthcare providers.

Salim Daya, MB, MSc
Editor

Literature cited
1. Elwood M. Critical Appraisal of Epidemiological Studies and Clinical Trials. Oxford: Oxford Medical Publications, 1998.
2. Whittemore AS, Harris R, Itnyre J. Characteristics relating to ovarian cancer risk: collaborative analysis of 12 US case–control studies. II. Invasive epithelial ovarian cancers in white women. Collaborative Ovarian Cancer Group. Am J Epidemiol 1992; 136: 1184–1203.
3. Daya S. Fertility drugs and ovarian cancer. In: Coutinho EM, Spinola P (eds) Reproductive Medicine: A Millennium Review. London: Parthenon Publishing, 1999: 45–48.
4. Venn A, Watson L, Lumley J et al. Breast and ovarian cancer incidence after infertility and in vitro fertilisation. Lancet 1995; 346: 995–1000.
5. Daya S. Characteristics of good causation studies. Seminars Reprod Med 2002 (in press).