ARTICLE IN PRESS
Manual Therapy 12 (2007) 1–2
www.elsevier.com/locate/math
doi:10.1016/j.math.2006.11.001
Editorial
Dealing with heterogeneity in clinical trials

Clinical trials would be a simpler enterprise if patients presented with homogeneous clinical presentations to which we could assign simple diagnoses, if therapists uniformly applied standardised therapies using mechanistic decision rules based on objective and universally accepted criteria, and if patients responded in more or less uniform ways to intervention. Unfortunately, the clinical presentations associated with each diagnosis are varied, diagnosis can be difficult, therapists choose to intervene very differently for the same condition or presentation, and patients’ outcomes often appear hard to predict. For almost any clinical problem, clinical presentations, diagnoses, interventions and outcomes are heterogeneous. This makes clinical trials difficult. Clinical research is a messy business.

One way to deal with heterogeneity is to attempt to minimise it. For example, a trial could recruit from homogeneous populations by defining stringent inclusion and exclusion criteria. The process of clinical decision-making could be tightly constrained by requiring that the experimental intervention always be administered in a particular way, or by defining precise algorithms for decisions about intervention. Trials with narrowly defined populations and tightly constrained interventions are sometimes called “explanatory” clinical trials (Schwartz and Lellouch, 1967; McMahon, 2002; Herbert et al., 2005). Typically this approach maximises the effects of intervention and reduces the variability of outcomes, so the explanatory approach is often preferred by researchers who are intent on demonstrating the efficacy of an intervention.

Alternatively, a trial might recruit from the diverse populations for whom therapy is usually provided in the course of normal clinical practice. Such populations will not usually be those in whom the intervention is most effective, because therapists often offer intervention even when they are not optimistic of success.
Therapists could be given freedom in exactly how they provide the experimental intervention, and they might be allowed to customise the intervention to the particular needs of individual patients. These “pragmatic” trials reflect the way intervention is administered in the course of normal clinical practice. By giving in to heterogeneity, pragmatic trials may be less likely to find large effects of interventions, but they have the advantage of telling us about the real-world effects of intervention.

The primary purpose of clinical trials is to estimate “the effect” of an intervention. But interventions do not have one effect on all patients; typically they are very effective for some patients and ineffective, or even harmful, for others. In other words, the effects of intervention are heterogeneous. Unfortunately, clinical trials cannot provide unbiased estimates of the effects of intervention on each participant in a trial. Alternative methods, such as single case experiments (also called n-of-1 designs), may permit conclusions to be drawn about the effects of interventions on individuals (Barlow and Hersen, 1984), but these methods do not provide a basis for robust inference about effects of interventions on individuals other than those who participated in the study. Ultimately, neither clinical trials nor single case experiments can provide what we most want: the capacity to make specific predictions about what the effect of intervention will be on our next patient.

The best we can hope for from most clinical trials is an unbiased estimate of the average effect of the experimental intervention in a population. This has been a frequent source of criticism. Some have argued that by focusing on averages we ignore the heterogeneity of the effects of interventions. After all, it is argued, we treat individual patients, not average patients. Of course that is true, but it fails to recognise why we might be interested in the average effect of an intervention: in the absence of better information about how individuals will respond, the average effect of intervention provides us with a “best guess” of what the effect of intervention will be on any individual (Herbert, 2000).
Clinical trials cannot give us specific estimates of the effects of an intervention on our next patient, but they can help us make unbiased guesses about the effects of intervention, and these can form an appropriate basis for clinical decision making. That is not to say we should not try to identify those patients most likely, and least likely, to benefit from an intervention. One of the next big challenges for clinical researchers investigating manual therapies is to identify characteristics of people who respond to therapy. (Such characteristics are sometimes called “effect modifiers”, and in the context of clinical trials the identification of responders is sometimes referred to as “subgroup analysis”.) Recently there has been a flurry of interest in the development of new taxonomies of low back pain (O’Sullivan, 2006), driven partly by a desire to target interventions at the people who will benefit most from them. Manual therapy researchers have begun to think about the best methodologies for identifying effect modifiers (Beattie and Nelson, 2006), and the first well-designed studies have begun to identify which patients respond well to manual therapy interventions (Childs et al., 2004).

Unfortunately, identification of effect modifiers is a methodologically hazardous undertaking. A simple and common mistake is to confuse prognostic factors (predictors of outcomes) with effect modifiers (predictors of response to therapy). Prognostic factors can be identified in cohort studies, but effect modifiers can only be identified in controlled clinical trials. Rigorous identification of effect modifiers involves contrasting the effects of intervention across subgroups in randomised trials. The perils of naïve analyses have been widely discussed (Yusuf et al., 1991) and extensively analysed (Brookes et al., 2004). The message from this literature is that robust identification of effect modifiers can only be carried out within the context of a randomised trial. Identification of effect modifiers must involve prior specification of a small number of specific hypotheses rather than undisciplined dredging of numerous hypotheses. Analysis must involve examination of the magnitude of the interaction between patient characteristics and intervention (Brookes et al., 2004). A consequence of the need to examine interactions is that sample size requirements are quadrupled (Brookes et al., 2004).
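The logic of the interaction test, and the reason sample size requirements quadruple, can be illustrated with a rough numerical sketch. All numbers below are invented for illustration and are not taken from any trial cited here:

```python
import math

# Hypothetical subgroup analysis in a two-arm randomised trial.
# Mean improvement on some outcome scale, by arm and subgroup:
means = {
    ("treatment", "A"): 12.0, ("control", "A"): 4.0,
    ("treatment", "B"): 7.0,  ("control", "B"): 5.0,
}

effect_A = means[("treatment", "A")] - means[("control", "A")]  # 8.0
effect_B = means[("treatment", "B")] - means[("control", "B")]  # 2.0

# The quantity to examine is the interaction: the difference between
# the subgroup-specific treatment effects, not either effect on its own.
interaction = effect_A - effect_B  # 6.0

# Why sample size quadruples: with outcome SD sigma and total size N,
# the overall effect is estimated from two arms of N/2 patients each,
# but each subgroup-specific effect from two arms of only N/4 patients.
sigma, N = 10.0, 400
se_overall = sigma * math.sqrt(2.0 / (N / 2))
se_subgroup = sigma * math.sqrt(2.0 / (N / 4))
se_interaction = math.sqrt(2.0) * se_subgroup  # difference of two independent effects

# The interaction's standard error is twice that of the overall effect,
# and standard errors shrink with the square root of N, so detecting an
# interaction as large as the overall effect needs four times the sample.
print(round(se_interaction / se_overall, 6))  # 2.0
```

This is only a back-of-envelope sketch of the equal-variance, equal-sized-subgroup case; the formal power and sample size arguments are given by Brookes et al. (2004).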
Rothman and Greenland (1998) have pointed out that particular care must be taken in defining what is meant by an interaction, because the magnitude of any interaction will depend on how the effect of intervention is measured. (For example, an interaction observed when the effect of an intervention is measured as an absolute risk reduction may evaporate when the effect is re-expressed as a relative risk.) A consequence is that certain patient characteristics may appear to predict the effects of intervention when effects are measured with one metric, but not when they are measured with another.

Heterogeneity is a universal feature of clinical practice that presents a challenge for clinical trialists. Careful consideration of sampling, intervention and analysis should make it possible to design trials that can support real-world clinical decision-making.

References

Barlow DH, Hersen M. Single case experimental designs: strategies for studying behavior change. Boston: Allyn and Bacon; 1984.

Beattie P, Nelson R. Clinical prediction rules: what are they and what do they tell us? Australian Journal of Physiotherapy 2006;52(3):157–63.

Brookes ST, Whitely E, Egger M, Smith GD, Mulheran PA, Peters TJ. Subgroup analyses in randomized trials: risks of subgroup-specific analyses; power and sample size for the interaction test. Journal of Clinical Epidemiology 2004;57(3):229–36.

Childs JD, Fritz JM, Flynn TW, Irrgang JJ, Johnson KK, Majkowski GR, et al. A clinical prediction rule to identify patients with low back pain most likely to benefit from spinal manipulation: a validation study. Annals of Internal Medicine 2004;141(12):920–8.

Herbert RD. How to estimate treatment effects from reports of clinical trials. I: continuous outcomes. Australian Journal of Physiotherapy 2000;46(3):229–35.

Herbert RD, Jamtvedt G, Mead J, Hagen KB. Practical evidence-based physiotherapy. Oxford: Elsevier; 2005.

McMahon AD. Study control, violators, inclusion criteria and defining explanatory and pragmatic trials. Statistics in Medicine 2002;21(10):1365–76.

O’Sullivan P. Classification of lumbopelvic pain disorders—why is it essential for management? Manual Therapy 2006;11(3):169–70.

Rothman KJ, Greenland S. Modern epidemiology. Philadelphia, PA: Lippincott-Raven; 1998.

Schwartz D, Lellouch J. Explanatory and pragmatic attitudes in therapeutical trials. Journal of Chronic Diseases 1967;20:637–48.

Yusuf S, Wittes J, Probstfield J, Tyroler HA. Analysis and interpretation of treatment effects in subgroups of patients in randomized clinical trials. Journal of the American Medical Association 1991;266(1):93–8.
Rob Herbert
School of Physiotherapy, University of Sydney, Sydney, Australia
E-mail address: [email protected]