Evaluating preference effects in partially unblinded, randomized clinical trials



Journal of Clinical Epidemiology 56 (2003) 109–115

Scott D. Halpern*

Center for Clinical Epidemiology and Biostatistics, Center for Bioethics, and Center for Education and Research on Therapeutics, University of Pennsylvania School of Medicine, 108 Blockley Hall, 423 Guardian Drive, Philadelphia, PA 19104, USA

Received 6 November 2001; received in revised form 29 August 2002; accepted 25 October 2002

Abstract

The ability of randomized clinical trials (RCTs) to produce unbiased estimates of an intervention’s specific (i.e., innate pharmacologic or physiologic) effect rests on at least two assumptions. First, other prognostic effects (such as disease severity or the psychologic effects of treatment) are assumed to be evenly distributed across treatment groups; second, these other effects are assumed not to interact with the intervention’s specific effect. This essay shows how the effect of patients’ preferences for treatments may interact with an intervention’s specific effect to bias the trial’s results. Earlier mathematical descriptions of how preference effects may produce bias in nonblinded trials are extended to the case of (presumably) blinded RCTs. The extent to which preference effects may confer bias in partially unblinded placebo-controlled vs. active-controlled drug trials is considered. Finally, methods for estimating preference effects in partially unblinded RCTs are provided. It is concluded that investigators could use these methods to quantify potential preference effects in partially unblinded RCTs, and thereby more accurately estimate the specific effects of treatments. © 2003 Elsevier Science Inc. All rights reserved.

Keywords: Clinical trials; Placebo-controlled trials; Placebo effect; Randomization; Bias; Preferences

* Corresponding author. Tel.: 215-573-8910; fax: 215-573-5535. E-mail address: [email protected] (S.D. Halpern).

0895-4356/03/$ – see front matter © 2003 Elsevier Science Inc. All rights reserved. doi:10.1016/S0895-4356(02)00598-X

1. Introduction

The double-blind, randomized, clinical trial (RCT) has been adopted as the gold standard method for evaluating new and existing medical interventions due to its ability to minimize bias. Randomization enables the equal distribution of potentially confounding prognostic variables (e.g., patients’ severity of disease or motivations for improvement) between treatment groups. Blinding is favored because it may mitigate the influences of expectation or other human predilections [1–3]. With these potential biases diminished, investigators traditionally calculate the intervention’s specific (i.e., innate pharmacologic or physiologic) effect as the difference between the observed outcomes in the control and experimental groups.

However, at least three factors threaten the ability of this traditional, additive model of RCTs to disentangle a treatment’s specific effect from the exogenous, or nonphysiologic, effects of treatment, including contextual and psychologic effects. First, participant blinding is often difficult to maintain [4–13]. If trials are not blinded, or if they are imperfectly blinded, then between-group differences in treatment adherence, drop-out, cointervention use, symptom reporting, or psychosomatic responses may bias the results.

Second, the psychologic effects of treatments may not be evenly distributed among treatment groups [1,14–16]. McPherson and colleagues first described how the distributions of one psychologic effect, patients’ preferences for treatments, may vary between trial groups and thus bias the results of nonblinded RCTs [15]. These “preference effects” [17] refer to the fact that, among patients treated with an intervention, the variability of their outcomes may be attributable not only to random variation and error, but also to systematic biases stemming from the strength of the patients’ beliefs in, or preferences for, their received treatments [15,16].

The third reason the additive model of RCTs may produce biased results is that even when psychologic or other prognostic effects are evenly distributed across groups, they may differentially alter the outcome in each treatment group if they interact with the treatment’s specific effect [14,18–20]. Such interactions of physiologic and psychologic effects have been observed, for example, in studies showing that what subjects are told regarding their treatment allocation is independently associated with outcome in both active- and placebo-treated groups, but that the relative effect of such instructions differs between treatment groups [21,22]. Such observations undermine a central assumption of RCTs: that the magnitude of all effects other than the specific, physiologic actions of the treatment will be equivalent in the control and experimental arms [2,20,22].

The first two problems, participant unblinding and uneven distributions of prognostic variables, might be overcome through more rigorous blinding schemes and the randomization of larger samples, respectively. Further, methods proposed in the past decade may account for preference effects in nonblinded trials [14,15,17,18,23]. However, because preference effects might also operate in presumably blinded RCTs in which some portion of the patients become unblinded [15], a method to deal with this case is still needed.

In this essay, I first extend earlier mathematical descriptions of how preference effects may produce bias in nonblinded trials [15] to the case of partially unblinded RCTs. In doing so, I apply these observations to hypothetical RCTs of new antihypertensive drugs to show the extent to which preference effects may bias the results. I further consider how preference effects may produce different degrees of bias in placebo-controlled trials (PCTs) and active-controlled trials (ACTs). Finally, I outline methods that future investigators can use to empirically assess the existence of preference effects in partially unblinded RCTs, and thus more accurately estimate the physiologic effects of treatments.

2. Assessing preference effects in nonblinded trials

To determine how preference effects may influence the results of RCTs, investigators must first measure their magnitude among treatment groups. Several authors have suggested that preference effects may be quantified using partially randomized, “patient preference trials” [14,16,24].
In such trials, potential participants who are unwilling to be randomized between a preferred and less preferred intervention are assigned to their preferred interventions in open-label study arms, while those without such preferences are randomized in the usual fashion. When two interventions are being compared, this scheme produces a four-armed trial. By comparing the outcomes of patients according to randomization status, stratifying by treatment arm, such trials provide one way to evaluate preference effects.

Unfortunately, because two of the four groups must knowingly receive their preferred intervention, this approach is not applicable to the many trials in which participant blinding is important. Furthermore, because such trials are only partly randomized, uncontrolled confounding may bias the main results [16,23,25]. Although blinding could be maintained, and confounding avoided, by restricting the primary efficacy analyses to only randomized participants, doing so would limit power. Thus, though this approach might quantify preference effects, the corresponding estimates of the main effects would be invalid or imprecise.

An alternative approach to measuring preference effects that avoids these pitfalls is to conduct traditional, fully randomized trials in which participants’ treatment preferences are assessed prior to randomization (or at least prior to commencing treatment) [15,23,25]. By obtaining prospective data on patients’ treatment preferences, investigators may then stratify their primary analyses by stated preferences, and thus quantify these effects [15,23]. At least two nonblinded trials have used this approach. In a trial of a new exercise program for lower back pain, no preference effects were detected, so the main analyses did not require adjustment for preference [26]. By contrast, a reanalysis of a trial comparing two schedules of antenatal visits found significant preference effects that not only modified, but reversed, the main results across preference strata [27].

To summarize McPherson et al.’s method for quantifying preference effects in such nonblinded trials, consider a simple RCT comparing the effects of two antihypertensive drugs, A and B. Suppose the true pharmacologic effect of B was equal to a mean reduction of e mmHg (systolic), and that the true pharmacologic effect of the better agent, A, was e + x mmHg (where x equals the specific, pharmacologic effect difference). To assess treatment preferences, investigators could ask participants, prior to commencing treatment, which drug they would prefer to receive (allowing for neutral responses). Some proportion, α, would prefer drug A; some proportion, β, would prefer drug B; and the remainder, γ, would be those who were indifferent between the two drugs. Such preferences may derive from information patients obtained from the informed consent process or from their physicians. (Note that treatment preferences could also be quantified using an ordinal or continuous measure, such as a numeric rating scale in which the strongest preferences for A and B are at either end, with the central value connoting indifference. This might provide a more sensitive measure of preference effects. However, for simplicity, I present the analytic strategy using the trichotomous preference score: α, β, or γ.)
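As a minimal sketch of the trichotomous preference score described above, the proportions α, β, and γ could be tallied from pre-treatment questionnaire responses. The function name `preference_proportions`, the response codes "A", "B", and "neutral", and the example responses are illustrative assumptions, not part of McPherson et al.’s method.

```python
from collections import Counter


def preference_proportions(responses):
    """Return (alpha, beta, gamma): the shares of participants who prefer
    drug A, prefer drug B, or are indifferent.

    `responses` is a list of "A", "B", or "neutral" (hypothetical coding).
    """
    if not responses:
        raise ValueError("no responses supplied")
    unknown = set(responses) - {"A", "B", "neutral"}
    if unknown:
        raise ValueError(f"unexpected response codes: {sorted(unknown)}")
    counts = Counter(responses)
    n = len(responses)
    return counts["A"] / n, counts["B"] / n, counts["neutral"] / n


# Made-up responses, for illustration only.
example = ["A"] * 12 + ["B"] * 5 + ["neutral"] * 3
alpha, beta, gamma = preference_proportions(example)
print(alpha, beta, gamma)
```

An ordinal or continuous rating scale, as mentioned in the text, would need a different summary; this sketch covers only the three-category case.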
If preference effects were operative, then among patients preferring A and receiving A, the observed mean response would be increased by a value y, such that the total response would be e + x + y (Table 1). Among patients who prefer A but know they are receiving their nonpreferred treatment, drug B, the mean response would be decreased by y, to e − y. For algebraic simplicity, McPherson’s model assumes that the efficacy boost when receiving the preferred drug equals the efficacy decrement when receiving the nonpreferred drug. However, different effects could be incorporated into the model as well [15].

Table 1
Preference-adjusted responses in a nonblinded trial

| Group | Prefer | Actually get | Observed response |
|---|---|---|---|
| 1 | A | A | e + x + y |
| 2 | A | B | e − y |
| 3 | B | A | e + x − y |
| 4 | B | B | e + y |

Assumes, first, that the increased response when getting the preferred drug equals the decrement in response when getting the nonpreferred drug, and second, that the treatment boost or decrement is the same, in absolute terms, for treatments A and B (i.e., y is not an equivalent proportion of the actual physical effects of A and B, respectively).

The main treatment effect, x, will be modified by patients’ preferences via a two-way, treatment preference by treatment received interaction [15]. Thus, if the interaction is not accounted for through stratification, the main result may be biased as shown in Appendix 1.

The calculations in Appendix 1 yield two interesting findings regarding nonblinded trials in which randomization evenly distributes subgroups of patients with different treatment preferences. First, the percentage of the observed treatment difference attributable to bias is precisely equal to |α − β|. Thus, the greater the difference in the proportions of patients preferring one treatment to the other, the greater the distortion between the observed and true (i.e., pharmacologic) treatment effects. Second, as the ratio of the preference effect to the real treatment difference (y/x) increases, so too will the magnitude of the bias. Additional bias may result if randomization failed to evenly distribute patients with different treatment preferences.

3. Assessing preference effects in (presumably) blinded trials

Although the case for nonblinded trials is informative, investigators conduct double-blinded trials whenever feasible. If participant blinding were perfectly maintained, then preference effects could not manifest because patients would not know whether they were receiving their preferred or nonpreferred treatment. In this case, the psychosomatic responses that might account for preference effects would not be activated. Similarly, if preference effects were mediated through differential reporting of symptoms, or selective adherence to a preferred treatment, perfect blinding should mitigate these potential biases.
Unfortunately, because participant blinding is difficult to maintain [4–13], preference effects could bias the results of presumably blinded trials just as they might for nonblinded trials. The key difference is that, whereas preference effects in nonblinded trials arise via a two-way interaction (treatment preference by treatment received), preference effects in trials that become partially unblinded may manifest via a three-way interaction (treatment preference by treatment guessed by treatment received). When participants receive their preferred drug and know it, their outcomes may be augmented; when participants receive their nonpreferred treatment and know it, their outcomes may be suppressed. Table 2 describes the outcomes in all possible groups defined by their preference, randomization, and blinding status.

Techniques for measuring the proportion of patients who become unblinded during a trial should be able to distinguish the two ways in which unblinding may occur. First, overt failures of the blinding scheme may allow patients to directly discover their treatment assignment. Second, patients may gradually become able to predict their treatment assignments by noting side effects or clinical improvement. For example, if patients in an antihypertensive drug trial note an increase in their blood pressure, they may believe they are receiving placebo, whereas experiencing dizziness may lead them to believe they are receiving the active agent [6,7].

Both types of unblinding may engender bias due to preference effects. For example, if patients in the placebo group become overtly unblinded to their treatment assignment, ensuing psychosomatic responses may curtail their subsequent response, selectively limiting the outcome in the placebo group and thus making the active intervention look better. Conversely, if patients first note a lack of response, they may become suspicious that they are receiving placebo, and their ensuing frustration at having been denied therapy they believed to be effective may diminish their treatment response [13].

Confidence in one’s knowledge of the treatment received, rather than simply a dichotomous “best guess,” may be important. Such confidence may accrue as the trial proceeds. Additionally, the processes by which knowledge of treatment assignment alters patients’ outcomes may work gradually, such that bias may be more likely to result when unblinding occurs early. For these reasons, investigators should measure participants’ treatment guesses, and confidence in such guesses, at the trial’s outset, at its conclusion, and at several intervals in between [11,12].

Moscussi and colleagues described an effective method for determining the proportion of patients (p) who become unblinded during the study [9]. They asked patients to predict which treatment they thought they were receiving, to state their confidence in this prediction (on a scale from 0 to 100), and to indicate their reasons for this belief.
Soliciting such information at several intervals throughout the trial would allow optimal determinations of the extent, causes, and duration of unblinding. These data, in turn, would allow investigators to calculate both p and q, the proportions who correctly and incorrectly guess their assignments, respectively (where q = 1 − p in the simplest case in which patients are forced to guess). Investigators could then use the equations in Appendix 2 to calculate the bias attributable to preference effects in blinded RCTs.

Table 2
Preference-adjusted responses in a blinded trial

| Group | Prefer | Actually get | Believe they get | Observed response |
|---|---|---|---|---|
| 1 | A | A | A | e + x + y |
| 2 | A | A | B | e + x − y |
| 3 | A | B | A | e + y |
| 4 | A | B | B | e − y |
| 5 | B | A | A | e + x − y |
| 6 | B | A | B | e + x + y |
| 7 | B | B | A | e − y |
| 8 | B | B | B | e + y |

In addition to the assumptions described in Table 1, a third assumption required here is that the boost in response attributable to believing one is receiving the preferred drug, and the decrement in response attributable to believing one is getting the nonpreferred drug, are independent of the drug actually received.

These equations show that there are three determinants of the magnitude of the preference-induced bias in partially unblinded RCTs. As in nonblinded trials, greater differences in the proportions of patients preferring the two treatments (i.e., greater values of |α − β|), and greater values of the ratio of the preference effect to the physiologic effect (i.e., greater values of y/x), will each increase the magnitude of the bias. In addition, greater proportions of people successfully guessing their treatment assignment (i.e., greater values of p − q) will increase the bias in presumably blinded RCTs. These three features have important implications for how preference effects may produce different degrees of bias in presumably blinded PCTs and ACTs.

4. Placebo-controlled vs. active-controlled trials

There are at least three reasons why the bias introduced by preference effects may differ between PCTs and ACTs. First, it may be easier for participants in PCTs to become unblinded to their treatment group, increasing the difference between p and q. Patients receiving placebo may recognize their treatment assignment due to the absence of expected side effects, or the absence of anticipated benefits [9]. Because the efficacy and side-effect profiles of active agents may be more similar, ACTs may be less prone to this type of unblinding.

Second, participants in PCTs may be particularly unlikely to have balanced preferences for the two (or more) treatment groups. Patients who would prefer to receive placebo rather than an active treatment typically choose not to enroll at all; among those who would enroll, few wish to receive placebo (S.D. Halpern, J.H.T. Karlawish, D. Casarett, et al., unpublished data). Thus, the difference between α and β among participants in PCTs could approach 100%.
By contrast, participants in ACTs may have more balanced preferences because both interventions would be reputed to be active against the patients’ illness, and so patients’ preferences would likely vary according to which dosing regimens, side effect profiles, or other factors are most appealing.

Whereas these two features would tend to magnify the bias in PCTs, a third may increase the bias in ACTs. Because the magnitude of the specific treatment difference may be greater when one of the treatments is placebo, the relative influence of any given preference effect (the ratio y/x) may be smaller in PCTs than in trials assessing interventions with more comparable efficacy.

Formally determining whether ACTs or PCTs are more prone to preference effects will require future empirical work to (1) quantify the magnitude of the difference in unblinding rates between ACTs and PCTs, (2) confirm the speculation that patients’ preferences are more balanced in ACTs, and (3) determine whether these first two factors, or the size of the preference effect relative to the specific treatment difference (y/x), tend to dominate the quantification of preference effects. However, as shown in Table 3, inserting plausible values into hypothetical ACTs and PCTs of a new antihypertensive drug, and then applying the calculations from Appendix 2, suggests that bias will generally be greater in PCTs. Using the conservative values indicated, the bias in the PCT was substantially greater than in the ACT in both absolute terms and as a percentage of the true (pharmacologic) difference between treatments. Other sets of values consistent with the foregoing assumptions regarding the differences between ACTs and PCTs produce comparable results.

Table 3
Plausible biases in hypothetical trials of a new antihypertensive drug

| Variable | Variable interpretation | Value in ACT | Value in PCT |
|---|---|---|---|
| e | Mean effect size in control group (mmHg) | 8 | 4 |
| x | Mean true (pharmacologic) difference between treatments (mmHg) | 2 | 6 |
| α | Proportion of participants favoring Drug A | 0.60 | 0.95 |
| β | Proportion of participants favoring control treatment | 0.40 | 0.05 |
| p | Proportion of participants correctly guessing treatment | 0.55 | 0.65 |
| q | Proportion of participants incorrectly guessing treatment | 0.45 | 0.35 |
| x + (p − q)(α − β)y | Mean observed difference between treatments | 2 + 0.02y | 6 + 0.27y |
| Bias | Observed difference − true (pharmacologic) difference | 0.04 mmHg | 0.54 mmHg |
| Bias/x × 100 | Bias as a percentage of the true treatment difference | 2% | 9% |

For both trials, assume a modest preference effect, y, equal to 1/2 the observed change from baseline in the placebo group, or 2 mmHg. ACT, active-controlled trial; PCT, placebo-controlled trial.

5. Quantifying preference effects

To this point, I have simply defined the observed preference effect as y, and then calculated the resultant bias in terms of y. However, y may be thought of, and calculated, in at least three distinct ways. As described in Table 4, the “positive preference effect” represents the difference in the mean change in blood pressure (BP) between patients receiving their preferred therapy and those receiving the same therapy who were neutral in their preference. Similarly, the “negative preference effect” equals the difference in mean BP between patients receiving their nonpreferred therapy and those receiving the same therapy who were neutral in their preference. Finally, the “total preference effect” equals the sum of the absolute values of these positive and negative effects. In trials in which most patients will have a preference, the total preference effect may also be calculated by subtracting the mean BP in patients receiving their nonpreferred therapy from the mean BP in patients receiving the same therapy when it is preferred.

Table 4
Types of preference effects

| Preference effect | Description |
|---|---|
| Positive | Mean BP among those receiving preferred therapy − mean BP among those receiving the same therapy when they are neutral towards that therapy |
| Negative | Mean BP among those receiving nonpreferred therapy − mean BP among those receiving the same therapy when they are neutral towards that therapy |
| Total | \|Positive preference effect\| + \|Negative preference effect\| |

Empirical tests of these different formulations of preference effects are needed to determine which corresponds most closely with patients’ outcomes. Furthermore, although these additive models most simply represent the concept of preference effects, nonlinear models may be found to more closely fit actual trial data. It is also possible that different formulations will be needed to best model preference effects that arise via different mechanisms (e.g., psychosomatic responses, reporting bias, differential treatment adherence, or others) or in studies using qualitatively different outcome measures (e.g., symptom reports vs. laboratory values).

6. Discussion

If investigators fail to first measure patients’ preferences for treatments, and then stratify their analyses accordingly, the observed difference in outcomes produced by two treatments may be biased by preference effects. Indeed, this will occur in all RCTs in which the following three conditions exist.

First, the outcome must be sensitive to patients’ treatment preferences, so that there may be a detectable preference effect. Some outcomes, such as mortality rate in a trial of two thrombolytic agents for the treatment of acute myocardial infarction, may be less sensitive to patients’ preferences than blood pressure reduction. Furthermore, evidence that the related effects of placebo interventions may have more limited magnitude and scope than previously thought [28,29] suggests that research is needed to determine the types of trials in which preference effects are most operative.
Second, participants must be able to decipher their treatment assignments at a rate greater than chance (50% in a two-armed trial). This criterion is clearly met in nonblinded trials, and a wealth of evidence suggests that it is also met in presumably blinded trials [4–13].

Finally, participants’ preferences at the trial’s outset must be unbalanced, with more participants preferring one of the treatments being compared than the other; that is, they must not be in a state of equipoise regarding the perceived effectiveness of the competing treatments. This notion of participant equipoise extends the commonly accepted notion of clinical equipoise [30], which considers only the views of expert clinicians in determining whether one treatment should be preferred to another. Participant equipoise, by contrast, exists if either (1) all prospective participants are indifferent regarding their perceived effectiveness of the treatments being compared, or (2) the proportion of participants favoring one treatment is precisely balanced by the proportion favoring the other. In the notation used above, participant equipoise would exist when either (1) γ = 1.0 (where α = β = 0), or (2) in any case in which α = β (where γ = 1 − α − β).

The most prominent limitation of the proposed approach to measuring preference effects is the sample size required to reliably detect interactions. Detecting treatment preference by treatment received interactions requires larger sample sizes than simply detecting a main effect of treatment [15]; detecting the proposed three-way treatment preference by treatment guessed by treatment received interaction will require still larger samples. Limited evidence suggests that detecting two-way interactions via the method of McPherson et al. is more statistically efficient than comparable, partially randomized, patient preference trials [27]. Still, the costs of increasing the sample size in a fully randomized design to detect the proposed interactions will need to be weighed, on a case-by-case basis, against the probability that such interactions could substantially bias the outcomes. As Torgerson and Sibbald have suggested, standardizing methods of preference assessment across related trials could enable the use of meta-analysis to detect such interactions more reliably than could be done in a single trial [31].
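The stratified estimates on which such interaction analyses rest are the positive, negative, and total preference effects defined in Table 4. The following is a minimal sketch of how they could be computed within one treatment arm. The function name `preference_effects`, the record layout, the stance labels, and the blood pressure values are all hypothetical; the sketch ignores treatment guesses and gives point estimates only, with no measure of precision.

```python
from statistics import mean


def preference_effects(records, arm):
    """Estimate positive, negative, and total preference effects in one arm.

    `records` holds (arm_received, stance, bp_change) tuples, where stance is
    "preferred", "nonpreferred", or "neutral" relative to the arm received
    (hypothetical coding). Definitions follow Table 4.
    """
    def stratum_mean(stance):
        values = [bp for a, s, bp in records if a == arm and s == stance]
        if not values:
            raise ValueError(f"no {stance} participants in arm {arm}")
        return mean(values)

    neutral = stratum_mean("neutral")
    positive = stratum_mean("preferred") - neutral
    negative = stratum_mean("nonpreferred") - neutral
    return {
        "positive": positive,
        "negative": negative,
        "total": abs(positive) + abs(negative),
    }


# Made-up mean BP reductions (mmHg), for illustration only.
records = [
    ("A", "preferred", 12.0), ("A", "preferred", 10.0),
    ("A", "neutral", 9.0), ("A", "neutral", 9.0),
    ("A", "nonpreferred", 8.0), ("A", "nonpreferred", 6.0),
]
effects = preference_effects(records, "A")
print(effects)
```

With these made-up values the total effect coincides with the alternative calculation mentioned in Section 5 (preferred mean minus nonpreferred mean), because the positive and negative effects have opposite signs; that need not hold in real data.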
Further research in a variety of clinical settings will be required to determine (1) the extent to which participants can guess their treatment assignment (p, q); (2) the magnitude of the difference in baseline preferences for two treatments (α, β); and (3) the magnitude of preference effects (y) across a variety of interventions. In the meantime, the measurement of patients’ treatment preferences, and subsequent stratification of analyses to evaluate possible preference effects, should be viewed as methods to increase the value [32] of the information to be gained from RCTs. Documenting the presence or absence of preference effects, and thereby generating more specific estimates of a treatment’s efficacy, may aid regulatory authorities in considering whether a new drug merits approval. Perhaps even more importantly, because patients are treated knowingly in the clinical context, and with agents they wish to take, measuring outcomes among subgroups of patients with different preferences and motivations may help clinicians apply the results of RCTs to their practices.

Acknowledgments

Support for this article was provided by a predoctoral fellowship from the American Heart Association, and by a National Research Service Award in Cardiopulmonary Epidemiology from the National Heart, Lung, and Blood Institute. I thank Jason H. T. Karlawish, M.D., Stephen E. Kimmel, M.D., M.S.C.E., Brian L. Strom, M.D., M.P.H., and the two referees for their thoughtful comments on earlier drafts of this manuscript.

Appendix 1: Non-blinded trials

(1) Use the following variable definitions:
e = mean pharmacologic efficacy of drug B
x = mean magnitude of the extra pharmacologic efficacy of drug A compared to B
y = mean magnitude of the added (or subtracted) efficacy attributable to receiving one’s preferred (nonpreferred) treatment
α = proportion of participants preferring drug A at the outset
β = proportion of participants preferring drug B at the outset
γ = proportion of participants who are indifferent between A and B at the outset

(2) Assume α + β = 1.0; that is, everyone has a treatment preference, and γ = 0.

(3) Assume that the randomization process worked, such that participants with each treatment preference were equally distributed between groups.

(4) Let R1 = mean response in Group 1 from Table 1, R2 = mean response in Group 2, and so on.

(5) The observed difference in treatment effects will be:

\[
\begin{aligned}
&\alpha(R_1) + \beta(R_3) - [\alpha(R_2) + \beta(R_4)] \\
&\quad = \alpha((e + x) + y) + \beta((e + x) - y) - \alpha(e - y) - \beta(e + y) \\
&\quad = (\alpha + \beta)(x) + 2\alpha y - 2\beta y \\
&\quad = x + 2y(\alpha - \beta) \qquad (1)
\end{aligned}
\]

Equation (1) shows that the observed treatment difference may differ from the pharmacologic treatment difference, x, by 2y(α − β). Thus, whenever y ≠ 0 (i.e., a preference effect is operative) and α ≠ β (i.e., more participants prefer one treatment than prefer the other), the observed efficacy will be biased.
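Equation (1) can be checked numerically by weighting the four Table 1 cell responses directly. The sketch below does this under the appendix’s assumptions (α + β = 1, preferences evenly distributed between arms); the function names and the input values are illustrative choices, not taken from a real trial.

```python
def table1_responses(e, x, y):
    """Mean responses for the Table 1 groups, keyed by (preferred, received)."""
    return {
        ("A", "A"): e + x + y,  # Group 1: prefers A, gets A
        ("A", "B"): e - y,      # Group 2: prefers A, gets B
        ("B", "A"): e + x - y,  # Group 3: prefers B, gets A
        ("B", "B"): e + y,      # Group 4: prefers B, gets B
    }


def observed_difference_nonblinded(e, x, y, alpha, beta):
    """Arm A mean minus arm B mean, assuming alpha + beta = 1 and
    identical preference mixes in both arms."""
    r = table1_responses(e, x, y)
    mean_a = alpha * r[("A", "A")] + beta * r[("B", "A")]
    mean_b = alpha * r[("A", "B")] + beta * r[("B", "B")]
    return mean_a - mean_b


# Illustrative values (mmHg and proportions), chosen for this sketch.
e, x, y, alpha, beta = 8.0, 2.0, 2.0, 0.60, 0.40
observed = observed_difference_nonblinded(e, x, y, alpha, beta)
closed_form = x + 2 * y * (alpha - beta)  # Equation (1)
print(observed, closed_form)
```

Both quantities agree up to floating-point rounding, and setting α = β removes the bias term, as Equation (1) implies.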

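Appendix 2: Blinded trials

(1) Use the following variable definitions, in addition to those in Appendix 1:
p = proportion of participants in a blinded study who can successfully identify their received treatment
q = proportion of participants in a blinded study who fail to identify their received treatment (q = 1 − p)

(2) Assume randomization worked to evenly distribute preference groups among the two treatments.

(3) Assume no participant is indifferent (i.e., α + β = 1.0).

(4) If p is substantially different from 0.5 (the null value in a two-armed trial), then blinding is said to have failed.

(5) The observed treatment difference, for all values of p and q, will then be:

\[
\begin{aligned}
&p(\alpha/2)(R_1) + q(\alpha/2)(R_2) + p(\beta/2)(R_5) + q(\beta/2)(R_6) \\
&\qquad - [q(\alpha/2)(R_3) + p(\alpha/2)(R_4) + q(\beta/2)(R_7) + p(\beta/2)(R_8)] \\
&\quad = (p\alpha(e + x)/2 + p\alpha y/2) + (q\alpha(e + x)/2 - q\alpha y/2) + (p\beta(e + x)/2 - p\beta y/2) + (q\beta(e + x)/2 + q\beta y/2) \\
&\qquad - [(q\alpha e/2 + q\alpha y/2) + (p\alpha e/2 - p\alpha y/2) + (q\beta e/2 - q\beta y/2) + (p\beta e/2 + p\beta y/2)] \\
&\quad = p(\alpha + \beta)(e + x)/2 + q(\alpha + \beta)(e + x)/2 - q(\alpha + \beta)e/2 - p(\alpha + \beta)e/2 + \alpha p y - \alpha q y + \beta q y - \beta p y \\
&\quad = px + qx + p(\alpha - \beta)y - q(\alpha - \beta)y \\
&\quad = x + p(\alpha - \beta)y - q(\alpha - \beta)y \qquad \text{(because } p + q = 1.0 \text{, the first two terms simplify to } x \text{ for all } (p, q)\text{)} \\
&\quad = x + (p - q)(\alpha - \beta)y \qquad (2)
\end{aligned}
\]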
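The sketch below applies Equation (2) as printed to the Table 3 inputs, with y = 2 mmHg as in the Table 3 footnote. As a cross-check that is not part of the appendix, it also computes the difference in arm means directly from the eight Table 2 cells, assuming α + β = 1, p + q = 1, and that guessing accuracy does not depend on preference. All function and variable names are illustrative.

```python
def eq2_observed_difference(x, y, alpha, beta, p, q):
    """Observed treatment difference per Equation (2) as printed."""
    return x + (p - q) * (alpha - beta) * y


def cellwise_arm_difference(e, x, y, alpha, beta, p, q):
    """Arm A mean minus arm B mean built from the eight Table 2 cells.

    Assumes alpha + beta = 1, p + q = 1, and that guessing accuracy is
    independent of stated preference (assumptions of this sketch).
    """
    mean_a = (alpha * p * (e + x + y) + alpha * q * (e + x - y)    # groups 1, 2
              + beta * p * (e + x - y) + beta * q * (e + x + y))   # groups 5, 6
    mean_b = (alpha * q * (e + y) + alpha * p * (e - y)            # groups 3, 4
              + beta * q * (e - y) + beta * p * (e + y))           # groups 7, 8
    return mean_a - mean_b


# Inputs taken from Table 3; y = 2 mmHg per the table footnote.
trials = {
    "ACT": dict(e=8.0, x=2.0, alpha=0.60, beta=0.40, p=0.55, q=0.45),
    "PCT": dict(e=4.0, x=6.0, alpha=0.95, beta=0.05, p=0.65, q=0.35),
}
y = 2.0
results = {}
for name, v in trials.items():
    bias_eq2 = eq2_observed_difference(
        v["x"], y, v["alpha"], v["beta"], v["p"], v["q"]) - v["x"]
    bias_cells = cellwise_arm_difference(y=y, **v) - v["x"]
    results[name] = (bias_eq2, bias_cells)
    print(name, round(bias_eq2, 2), round(100 * bias_eq2 / v["x"]),
          round(bias_cells, 2))
```

Under these assumptions the Equation (2) biases come out at about 0.04 and 0.54 mmHg (2% and 9% of x), matching Table 3. The cell-by-cell difference in arm means carries a preference term of 2(p − q)(α − β)y, twice that of Equation (2) as printed, and it reduces to Equation (1) when p = 1 and q = 0. Readers applying either form may wish to confirm which weighting they intend.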

Equation (2) shows that the observed treatment difference may differ from the pharmacologic treatment difference, x, by (p  q)(  )y. Thus, whenever ≠ 0 (i.e., a preference effect is operative) and  ≠  (i.e., more participants prefer one treatment than prefer the other), and p ≠ q (i.e., the percentage of patients becoming unblinded differs from that expected by chance), the observed efficacy will be biased. References [1] Jüni P, Altman DG, Egger M. Assessing the quality of controlled clinical trials. BMJ 2001;323:42–6. [2] Kaptchuk TJ. The double-blind, randomized, placebo-controlled trial: gold standard or golden calf? J Clin Epidemiol 2001;54:541–9. [3] Kramer MS, Shapiro SH. Scientific challenges in the application of randomized trials. JAMA 1984;252(19):2739–45. [4] Karlowski TR, Chalmers TC, Frenkel LD, Kapikian AZ, Lewis TL, Lynch JM. Ascorbic acid for the common cold: a prophylactic and therapeutic trial. JAMA 1975;231:1038–42. [5] Howard J, Whittemore AS, Hoover J, Panos M. The Aspirin Myocardial Infarction Study Research Group: how blind was the patient blind in AMIS? Clin Pharmacol Ther 1982;32:543–53. [6] Brownell KD, Stunkard AJ. The double-blind in danger: untoward consequences of informed consent. Am J Psychiatry 1982;139:1487–89. [7] Byrington R, Curb DJ, Mattson ME. Assessment of blindness at the conclusion of the beta-blocker heart attack trial. JAMA 1985;253: 1733–36. [8] Rabkin JG, Markowitz JS, Stewart J, et al. How blind is blind? Assessment of patient and doctor medication guesses in a placebo-controlled trial of imipramine and phenelzine. Psychiatry Res 1986;19: 75–86. [9] Moscussi M, Byrne L, Weintraub M, Cox C. Blinding, unblinding and the placebo effect: an analysis of patients’ guesses of treatment assignment in a double-blind clinical trial. Clin Pharmacol Ther 1987;41:259–65. [10] Fisher S, Greenberg RP. How sound is the double-blind design for evaluating psychotropic drugs? J Nerv Ment Dis 1993;181:345–50. 
[11] Noseworthy JH, Ebers GC, Vandervoort MK, Farquhar RE, Yetisir E, Roberts R. The impact of blinding on the results of a randomized, placebo-controlled multiple sclerosis clinical trial. Neurology 1994;44:16–20.
[12] Morin CM, Colecchi C, Brink D, Astruc M, Mercer J, Remsberg S. How “blind” are double-blind placebo-controlled trials of benzodiazepine hypnotics? Sleep 1995;18:240–5.

[13] Basoglu M, Marks I, Livanou M, Swinson R. Double-blindness procedures, rater blindness, and ratings of outcome: observations from a controlled trial. Arch Gen Psychiatry 1997;54(8):744–8.
[14] Brewin CR, Bradley C. Patient preferences and randomized clinical trials. BMJ 1989;299:313–5.
[15] McPherson K, Britton AR, Wennberg JE. Are randomized controlled trials controlled? Patient preferences and unblind trials. J R Soc Med 1997;90:652–6.
[16] Silverman WA, Altman DG. Patients’ preferences and randomised trials. Lancet 1996;347:171–4.
[17] Rücker G. A two-stage trial design for testing treatment, self-selection and treatment preference effects. Stat Med 1989;8:477–85.
[18] Bradley C. Designing medical and educational intervention studies: a review of some alternatives to conventional randomized controlled trials. Diabetes Care 1993;16(2):509–18.
[19] Kleijnen J, de Craen AJM, Everdingen JV, Krol L. Placebo effect in double-blind clinical trials: a review of interactions with medications. Lancet 1994;344:1347–9.
[20] Kaptchuk TJ. Powerful placebo: the dark side of the randomised controlled trial. Lancet 1998;351:1722–5.
[21] Hughes JR, Gulliver SB, Amori G, Mireault GC, Fenwick J. Effect of instructions and nicotine on smoking cessation, withdrawal symptoms and self-administration of nicotine gum. Psychopharmacology (Berlin) 1989;99:486–91.
[22] Kirsch I, Rosadino J. Do double-blind studies with informed consent yield externally valid results? Psychopharmacology (Berlin) 1993;110:437–42.


[23] Torgerson DJ, Klaber-Moffett J, Russell IT. Patient preferences in randomised trials: threat or opportunity? J Health Serv Res Policy 1996;1(4):194–7.
[24] Torgerson DJ, Sibbald B. Understanding controlled trials: what is a patient preference trial? BMJ 1998;316:360.
[25] McPherson K, Chalmers I. Incorporating patient preferences into clinical trials: information about patients’ preference must be obtained first. BMJ 1998;317:78.
[26] Klaber-Moffett J, Torgerson DJ, Bell-Syer S, et al. Randomised controlled trial of exercise for low back pain: clinical outcomes, costs, and preferences. BMJ 1999;319:279–83.
[27] Clement S, Sikorski J, Wilson J, Candy B. Merits of alternative strategies for incorporating patient preferences into clinical trials must be considered carefully [letter]. BMJ 1998;317:78.
[28] McDonald CJ, Mazzuca SA, McCabe GP. How much of the placebo “effect” is really statistical regression? Stat Med 1983;2:417–27.
[29] Hrobjartsson A, Gotzsche PC. Is the placebo powerless? An analysis of clinical trials comparing placebo with no treatment. N Engl J Med 2001;344:1594–602.
[30] Freedman B. Equipoise and the ethics of clinical research. N Engl J Med 1987;317:141–5.
[31] Torgerson DJ, Sibbald B. Incorporating patient preferences into clinical trials [letter]. BMJ 1998;317:79.
[32] Freedman B. Scientific value and validity as ethical requirements for research: a proposed explication. IRB Rev Hum Subjects Res 1987;9(6):7–10.