Noninferiority is (too) common in noninferiority trials

Noninferiority is (too) common in noninferiority trials

118 Letters to the editor / Journal of Clinical Epidemiology 71 (2016) 113e123 Regional Health Care Agency of Abruzzo Via Attilio Monti 9 Pescara, I...

336KB Sizes 47 Downloads 123 Views

118

Letters to the editor / Journal of Clinical Epidemiology 71 (2016) 113e123

Regional Health Care Agency of Abruzzo Via Attilio Monti 9 Pescara, Italy

Noninferiority is almost certain with lenient noninferiority margins Dear Editor,

John P.A. Ioannidis*

We thank Dr. Soonawala et al. for their interesting comment [1] to our article on the sponsorship of head-tohead randomized trials [2]. Our proportion of noninferiority trials with ‘‘desirable favorable results’’ (97%) was higher than the one reported by Soonawala (84%, if we only consider the 165 trials with a sample larger than 100, that was one of our study selection criteria) [2,3]. The observed discrepancy may still be due to chance, or it may be caused by other differences between the two studies. The previous analysis [2] was based on trials published between 1999 and 2009, and we examined trials published in 2011. It is possible that the machinery of the industry producing favorable results from noninferiority trials may have become more efficient recently. Moreover, Soonawala et al. analyzed only trials published on core journals, whereas 45/57 of our trials were not published on core journals, which increased the likelihood of including some more biased trials. Of the 12 trials published in core journals, 11 (92%) had favorable results, not significantly different from the proportion in Soonawala et al. [2]. Overall, we fully agree with the concluding statements of Soonawala et al. [1]: the noninferiority design seems a very ‘‘safe’’ design, and the rationale for choosing it should be more consistently reported in publications and registries, as it was also more recently documented [4]. In addition, we share previous concerns on the choice of noninferiority margins [5,6]. Among the 33 comparisons in which the result was expressed as a relative risk in Soonawala et al. survey, 6 choose a margin equal or higher that 2.00 [3]. Although the selection of the noninferiority margin is based on clinical judgment in addition to statistical reasoning, these trials selected margins that were too lenient. To provide the highest standard of patient care, researchers must be very careful in designing noninferiority trials. Maria Elena Flacco Department of Medicine and Aging Sciences University of Chieti Via dei Vestini 5 66013 Chieti, Italy Regional Health Care Agency of Abruzzo Via Attilio Monti 9 Pescara, Italy

Lamberto Manzoli Department of Medicine and Aging Sciences University of Chieti Via dei Vestini 5 66013 Chieti, Italy

Stanford Prevention Research Center Department of Medicine Stanford University School of Medicine Department of Health Research and Policy Stanford University School of Medicine Department of Statistics Stanford University School of Humanities and Sciences Meta-Research Innovation Center at Stanford (METRICS) Stanford, CA, USA *Corresponding author. Tel.: 650-7045584; fax: 650-7256247. E-mail address: [email protected] (J.P.A. Ioannidis)

References [1] Soonawala D, Dekkers OM, Vandenbroucke JP, Egger M. Non-inferiority is (too) common in non-inferiority trials. J Clin Epidemiol 2015. [2] Flacco ME, Manzoli L, Boccia S, Capasso L, Aleksovska K, Rosso A, et al. Head-to-head randomized trials are mostly industry sponsored and almost always favor the industry sponsor. J Clin Epidemiol 2015;68:811e20. [3] Soonawala D, Middelburg RA, Egger M, Vandenbroucke JP, Dekkers OM. Efficacy of experimental treatments compared with standard treatments in non-inferiority trials: a meta-analysis of randomized controlled trials. Int J Epidemiol 2010;39:1567e81. [4] Gopal AD, Desai NR, Tse T, Ross JS. Reporting of noninferiority trials in ClinicalTrials.gov and corresponding publications. JAMA 2015;313:1163e5. [5] Burotto M, Prasad V, Fojo T. Non-inferiority trials: why oncologists must remain wary. Lancet Oncol 2015;16:364e6. [6] Tanaka S, Kinjo Y, Kataoka Y, Yoshimura K, Teramukai S. Statistical issues and recommendations for noninferiority trials in oncology: a systematic review. Clin Cancer Res 2012;18:1837e47. http://dx.doi.org/10.1016/j.jclinepi.2015.11.010

Noninferiority is (too) common in noninferiority trials In their meta-analysis of 317 randomized trials that examined efficacy and safety of drugs, biologics, and medical devices in head-to-head comparisons, Flacco et al. [1] report that 97% (55 of 57) of industry-funded noninferiority/equivalence trials provided favorable results (defined as superiority or noninferiority of the experimental treatment) compared to 71% (20 of 28) of nonindustry-funded noninferiority/equivalence trials. We reviewed the data from our previously published meta-analysis of 175 noninferiority comparisons [2]. Overall, 83% (145 of 175) of the comparisons produced desirable favorable results (superiority or noninferiority for the experimental treatment). In contrast to Flacco et al. [1], there was no important

Letters to the editor / Journal of Clinical Epidemiology 71 (2016) 113e123

119

% Favorable Favors experimental

Favors standard

Fig. 1. Results of 175 comparisons from 170 noninferiority trials and percentage of comparisons favoring the experimental treatment (defined as superiority or noninferiority of the experimental treatment). Reanalysis of the data from Soonawala et al. [2].

difference in this percentage by type of funding source. Fig. 1 shows the combined risk ratios from random-effects meta-analyses by funding source, and the percentage of comparisons with favorable results. There was little evidence for heterogeneity in combined risk ratios (P Z 0.20 from chi-squared test) and percentages (P Z 0.49 from chi-squared test). The trials examined by Flacco et al. [1] were published in 2011, whereas our analysis was based on trials published between 1993 and 2009. It is therefore possible that the difference between industry-sponsored and other trials emerged in recent years. However, in our data set, we found no clear trend over time in this difference. Taken together, the data from these two meta-analyses show that a large majority of published noninferiority trials support the experimental treatment. The probability of getting a verdict of noninferiority in a noninferiority trial is greater than 80%. From the viewpoint of industry and other stakeholders, the noninferiority design is therefore a very ‘‘safe’’ design. The situation is more variable for superiority trials. The study by Flacco et al. [1] and another metaepidemiological study of trials from a range of medical specialties found that experimental treatments were superior in 68% to 74% of trials [3]. In contrast, two studies in oncology found that only 25% to 45% of trials produced results compatible with superiority of the experimental treatment [4,5]. There are valid reasons for choosing a noninferiority design when performing a trial. However, industry and other stakeholders may be tempted to choose the design for the wrong reasons; that is, to increase the chance of

obtaining a favorable result even when a superiority design would have been more appropriate to answer the questions that really matter to patients. Therefore, the rationale for choosing the noninferiority design, the primary outcome, the noninferiority margin and the evidence for the supposed benefit of the experimental treatment should always be specified and consistently reported in study protocols, trial registries, and publications [6e8]. Darius Soonawala* Department of Nephrology Leiden University Medical Center Albinusdreef 2, 2300 RC Leiden, The Netherlands

Olaf M. Dekkers Department of Clinical Epidemiology Leiden University Medical Center Albinusdreef 2, 2300 RC Leiden, The Netherlands Department of Endocrinology and Metabolic Diseases Leiden University Medical Center Albinusdreef 2, 2300 RC Leiden, The Netherlands

Jan P. Vandenbroucke Department of Clinical Epidemiology Leiden University Medical Center Albinusdreef 2, 2300 RC Leiden, The Netherlands

120

Letters to the editor / Journal of Clinical Epidemiology 71 (2016) 113e123 1

Department of Social & Preventive Medicine Institute of Social and Preventive Medicine University of Bern Finkenhubelweg 11, 3012 Bern Switzerland *Corresponding author. Albinusdreef 2, 2300 RC, Leiden, The Netherlands. Tel.: 071-5268156; fax: 071-5248146. E-mail address: [email protected] (D. Soonawala)

References [1] Flacco ME, Manzoli L, Boccia S, Capasso L, Aleksovska K, Rosso A, et al. Head-to-head randomized trials are mostly industry sponsored and almost always favor the industry sponsor. J Clin Epidemiol 2015;68:811e20. [2] Soonawala D, Middelburg RA, Egger M, Vandenbroucke JP, Dekkers OM. Efficacy of experimental treatments compared with standard treatments in non-inferiority trials: a meta-analysis of randomized controlled trials. Int J Epidemiol 2010;39:1567e81. [3] Bourgeois FT, Murthy S, Mandl KD. Outcome reporting among drug trials registered in ClinicalTrials.gov. Ann Intern Med 2010;153: 158e66. [4] Djulbegovic B, Kumar A, Soares HP, Hozo I, Bepler G, Clarke M, et al. Treatment success in cancer: new cancer treatment successes identified in phase 3 randomized controlled trials conducted by the National Cancer Institute-sponsored cooperative oncology groups, 1955 to 2006. Arch Intern Med 2008;168:632e42. [5] Djulbegovic B, Kumar A, Miladinovic B, Reljic T, Galeb S, Mhaskar A, et al. Treatment success in cancer: industry compared to publicly sponsored randomized controlled trials. PLoS One 2013;8:e58711. [6] Dekkers OM, Cevallos M, B€uhrer J, Poncet A, Ackermann Rau S, Perneger TV, et al. Comparison of noninferiority margins reported in protocols and publications showed incomplete and inconsistent reporting. J Clin Epidemiol 2014;68:510e7. [7] Dekkers OM, Soonawala D, Vandenbroucke JP, Egger M. Reporting of noninferiority trials was incomplete in trial registries. J Clin Epidemiol 2011;64:1034e8. [8] Bernabe RDLC, Wangge G, Knol MJ, Klungel OH, van Delden JJM, de Boer A, et al. Phase IV non-inferiority trials and additional claims of benefit. BMC Med Res Methodol 2013;13:70. http://dx.doi.org/10.1016/j.jclinepi.2015.11.009

Effects of general health checks differ under two different analyses perspectivesdthe Inter99 randomized study Recently, a large randomized interventiondthe Inter99d has confirmed that health check followed by repeated lifestyle counseling has no effect on 10-year mortality at a population level [1]. This is in accordance with a Cochrane review that showed that general health checks do not Funding: This work was supported by the Health Foundation (Helsefonden) Fondenes Hus, Otto Mønsteds Gade 5, 1571 København V, Danmark (grant 2010 B 131). Conflict of interest: None.

Cummulative survival probability (%)

Matthias Egger

0.99 0.98 0.97 0.96 0.95 0.94 0.93 0.92

0

1

2

3

4 5 6 7 Years since baseline

8

Participants IG

Nonparticipants IG

Participants SCG

Nonparticipants SCG

9

10

ECG

Fig. 1. Ten-year survival by participation in the Inter99 intervention group (IG), sample of control group (SCG), and entire control group (ECG), Denmark, 1999e2010. Adjusted for age, age squared, and sex.

influence mortality at a population level [2]. Despite this, many researchers, health professionals, and politicians still believe that preventive health checks are effective in preventing disease and death in the general population [3]. General preventive health checks followed by repeated counseling of high-risk persons are currently implemented at a national level in several countries (e.g., England, Japan, Russia, and Australia). The evidence that favor an implementation of generalized health checks primarily comes from quasiexperimental studies which describe changes in, for example, lifestyle and self-reported health or blood pressure and not in hard end points such as death [4e7]. Furthermore, these conclusions regarding a beneficial effect have primarily been based on analyses restricted to participants [8,9]. However, excluding nonparticipants after randomization may introduce noncomparability of the control group (CG) and intervention group (IG) and lead to biased results. Recently, an article has been published showing that selection bias in large part influences the effect measures of the Inter99 study [10]. The Inter99 study, a population-based, randomized lifestyle intervention, took place in the years 1999e2006 (for a detailed description please see [11]). The study population (age 30e60 years) comprised 11,629 persons in the IG and 47,987 persons in the CG. At baseline, a random sample of the control group (SCG) (n Z 5,228) received questionnaires about health and lifestyle. The remaining CG was not aware of taking part in an intervention study. Persons defined as being at high risk of ischemic heart disease were repeatedly offered individual and group-based lifestyle counseling over a period of 5 years. All persons