Causal Diagrams and Multivariate Analysis II: Precision Work



The Journal of Foot & Ankle Surgery 53 (2014) 829–831


The Journal of Foot & Ankle Surgery journal homepage: www.jfas.org

Investigators’ Corner

Daniel C. Jupiter, PhD
Assistant Professor, Department of Preventive Medicine and Community Health, The University of Texas Medical Branch, Galveston, TX

Article info

Keywords: confounder, multivariate analysis, precision variable

Abstract

In this Investigators’ Corner, I continue my discussion of when and why we researchers should include variables in multivariate regression. My examination focuses on studies comparing treatment groups and situations for which we can either exclude variables from multivariate analyses or include them for reasons of precision.

© 2014 by the American College of Foot and Ankle Surgeons. All rights reserved.

Financial Disclosure: None reported. Conflict of Interest: None reported. Address correspondence to: Daniel C. Jupiter, PhD, Assistant Professor, Department of Preventive Medicine and Community Health, The University of Texas Medical Branch, 301 University Boulevard, 1.134G Ewing Hall, Galveston, TX 77555-1150. E-mail address: [email protected]

In my previous Investigators’ Corner (1), I introduced 2 types of study: outcome group and treatment group. I drew pictures describing the relationships, which I assumed to be causal, in both types of study (Fig. 1). In both cases, P represents a predictor, O an outcome, and C potential confounding variables. That column focused on multivariate analysis of outcome group studies (1); this one focuses on treatment group studies. In this type of study, P is a variable of particular a priori interest, and the goal is to assess the relationship between P and O, given the potential confounding variables.

To simplify thinking about this goal and to maintain the visual approach that I presented previously (1), I provide here an additional diagram detailing the possible relationships between the 3 classes of variables in treatment group studies (Fig. 2). Fig. 2 does not capture every possible subtlety, but it does give an initial gestalt view of the terrain. In this column, I discuss the simplified situations described in Fig. 2A–C to introduce my method of thinking about these issues, and I reserve the discussion of Fig. 2D (and the finer distinctions of Fig. 2A–C) for the next column.

First, I will say a few words about causal pathways. Our goal as researchers, of course, is to assess causal relationships, and as a proxy for causality, we use statistically (and clinically) significant association. We also allow intuition to play a role: if prior knowledge indicates a causal relationship, we will consider that association and the variables involved in our statistical thought processes. The arrows in my diagrams indicate causation along a causal pathway (with the exception mentioned later in this column). For example, if P is the presence of a campfire and O is the presence of a roasted marshmallow, I can draw an arrow from P to O. I also allow that there can be more or less proximate causes: if P′ is lighting a match, then P′ could cause the presence of a roasted marshmallow, but there is also the original cause, P, proximate to the roasted marshmallow. Further, there are causes of a roasted marshmallow completely independent of these (I will say these causes lie on other causal pathways), and as an

1067-2516/$ - see front matter © 2014 by the American College of Foot and Ankle Surgeons. All rights reserved. http://dx.doi.org/10.1053/j.jfas.2014.08.023


Fig. 1. Outcome and treatment group studies. The arrows indicate causation along a causal pathway. (A) Several predictors (P) may influence the outcome (O) in an outcome group study. (B) A predictor (P) may interact with a potential confounder (C) in affecting the outcome (O) in a treatment group study.

example I could consider P″, the use of a lit blowtorch. In these models, C is always considered to lie on a different causal pathway terminating in O than P does. There are 2 reasons for this consideration:

1. Scientific reason: We researchers are interested in the potential causal pathway from P to O, with all its proximate causes (enumerated or not, explicit or not), and in whether that presumptive pathway is affected by independent factors, C.
2. Statistical reason: If, in a multivariate analysis, we control for a factor on the causal pathway connecting P and O, that factor acts as a proxy for P. It is then as if we had controlled for P itself, but P is the variable of interest, and controlling for it ablates any relationship between it and the outcome. In short, controlling for proximate causes blinds us to the effects of the causes of interest.

One last comment before beginning our exploration: as in all studies, we researchers are interested in the relationships within the entire population, but we use the sample to derive estimates of these relationships. Thus, if we see a statistically significant relationship between C and O in the sample, we consider it causal. The only exception to this rule is in our view of the relationships between P and C, which need not be causal but may simply be artifacts of randomization in the study. (Indeed, we hope the relationship is not causal, or we have violated our admonition against controlling for proximate causes.)

With this background, I attack the situations in Fig. 2A–C. Consider first Fig. 2A, where we observe or posit no causal relationship between C and O and observe no association between P and C in our sample. In carrying out a multivariate analysis assessing the relationship between P and O, should we include C as a covariate? The answer is “no”: because C has no relationship to the outcome or to the predictor, including it in a multivariate regression provides no information but reduces the degrees of freedom and, with them, the power of the analysis.

In Fig. 2B, we observe or posit no causal relationship between C and O, but we observe an association between P and C in our sample. This situation is the topic of the usual table that we see in clinical studies comparing treatment groups: the table in which baseline characteristics of treatment groups are compared. Often, when an imbalance

Fig. 2. Intervariable relationships in treatment group studies. The arrows indicate causation along a causal pathway. (A–D) Several potential relationships are shown between the various variables in a treatment group study.


between groups is found, this is seen as an indication that the unbalanced variables should be included in a multivariate analysis. Consider this thought further. Imagine a situation where you know that gender does not affect surgical outcome (e.g., ankle fixation). Consider now the case of 2 different treatment groups, 1 for each of 2 distinct fixation methods. Finally, assume a gender difference between the 2 groups. In taking a patient to the operating room for 1 of the 2 procedures, gender would not enter into decisions regarding choice of procedure because it has no impact on outcome. If it has no impact on outcome, and thus would not affect clinical decision making, why would we include it in our multivariate analyses? (I mention a caveat here: such variables [Fig. 2C] may still be effect modifiers and meet all the criteria of the picture in Fig. 2B; I delay this consideration to my discussion of Fig. 2D.)

I conclude my discussion with Fig. 2C. Here, we notice and/or posit a relationship between C and O but observe none between P and C. In this setting, the outcome changes between levels of the potential confounder C (e.g., the outcome differs between men and women), but the levels of the confounder are evenly spread across predictor groups (e.g., the proportion of the 2 genders does not differ between treatment groups). Here we should include C in our regression: because the outcome varies between levels of C, we would like to examine the effect of the predictor within each level of the confounder, to gain precision, and then average across levels of C to get a population estimate of the effect of P on O.

As suggested in my introduction to the diagrammatic representation of the various situations, I glossed over some finer points. I also have not addressed the situation in Fig. 2D. These topics will be broached in my next Investigators’ Corner.

Reference

1. Jupiter D. Causal diagrams and multivariate analysis I: a quiver full of arrows [investigators’ corner]. J Foot Ankle Surg 53:672–673, 2014.
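
The situations in Fig. 2A and Fig. 2C, and the statistical reason given earlier for not controlling for a factor on the causal pathway, can be made concrete with a small simulation. What follows is a minimal sketch, not an analysis from this column: the sample size, the effect sizes (1.0, 3.0, 2.0, 1.5), the random seed, and the names `ols`, `junk`, and `m` are all assumptions chosen for illustration, and ordinary least squares via NumPy stands in for whatever regression model a real study would use.

```python
import numpy as np


def ols(y, *covariates):
    """Least-squares fit of y on an intercept plus the given covariates.

    Returns (coefficients, standard_errors); index 0 is the intercept.
    """
    X = np.column_stack([np.ones(len(y)), *covariates])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - X.shape[1]  # each added covariate costs one degree of freedom
    sigma2 = resid @ resid / dof  # residual variance estimate
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    return beta, se


rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
n = 400

# P: two treatment groups of equal size (hypothetical fixation methods).
p = np.repeat([0.0, 1.0], n // 2)
# C: a binary covariate split exactly evenly within each treatment group,
# so P and C show no association in the sample (the Fig. 2C setting).
c = np.tile([0.0, 1.0], n // 2)
# A covariate related to neither treatment nor outcome (the Fig. 2A setting).
junk = rng.normal(size=n)

# Assumed data-generating model: O depends on P (1.0) and on C (3.0).
o = 1.0 * p + 3.0 * c + rng.normal(size=n)

b_crude, se_crude = ols(o, p)        # P alone
b_prec, se_prec = ols(o, p, c)       # P plus the precision variable C
b_junk, se_junk = ols(o, p, junk)    # P plus the irrelevant covariate

print(f"P only      : effect={b_crude[1]:.3f}  SE={se_crude[1]:.3f}")
print(f"P + C       : effect={b_prec[1]:.3f}  SE={se_prec[1]:.3f}")
print(f"P + junk    : effect={b_junk[1]:.3f}  SE={se_junk[1]:.3f}")

# Factor on the causal pathway: P -> M -> O, with no other route from P to O.
m = 2.0 * p + rng.normal(size=n)
o_path = 1.5 * m + rng.normal(size=n)

b_total, _ = ols(o_path, p)       # total effect of P on O
b_blocked, _ = ols(o_path, p, m)  # effect of P after controlling for M

print(f"P only      : effect={b_total[1]:.3f}")
print(f"P + mediator: effect={b_blocked[1]:.3f}")
```

Because C is exactly balanced across the two groups in this sketch, adding it leaves the estimated effect of P unchanged and only shrinks its standard error, which is the precision gain described for Fig. 2C. The irrelevant covariate leaves the standard error essentially where it was while using up a degree of freedom, and controlling for the factor on the pathway drives the estimated effect of P toward zero.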