Annals of Oncology 29: 1320–1324, 2018 doi:10.1093/annonc/mdy075 Published online 23 February 2018
ORIGINAL ARTICLE
F. Liang1, S. Zhang2*, Q. Wang3 & W. Li4 1 Clinical Statistical Center; 2Medical Oncology, Fudan University, Shanghai Cancer Center, Shanghai; 3Clinical Laboratory; 4Medical Oncology, The Affiliated Hospital of Qingdao University, Qingdao, China
*Correspondence to: Dr Sheng Zhang, Medical Oncology, Shanghai Cancer Center, Fudan University, 270 Dongan Road, 200032 Shanghai, China. Tel: +86-21-67822026; Fax: +86-21-67820385; E-mail:
[email protected]
Background: The hazard ratio (HR) is used routinely to quantify the treatment effect for time-to-event end points in oncology trials, but its use requires that the hazards in the treatment arms be proportional. Non-proportional hazards are observed frequently in cancer immunotherapy trials because of long-term survival and delayed clinical effects. Although values of HR are quoted in such trials, they are not valid measures of outcome.
Methods: Reports of parallel group randomized controlled trials (RCTs) evaluating immune checkpoint inhibitors with overall survival data were eligible. For each trial, the ratio of restricted mean survival time (RMST) between the arms was calculated from reconstructed individual patient data for overall survival.
Results: Twenty-five RCTs totaling 12 870 patients were included in this study. Overall survival was used as a primary or coprimary end point in 18 trials (72%). In all trials, there was agreement between the ratio of RMST or of restricted mean time lost (RMTL) and the reported HR about the direction of treatment effect. Estimates of HR provided larger estimates of treatment effect than the ratio of RMST or RMTL in all these trials. The estimated HR and RMST-based measures were in agreement regarding the statistical significance of the effect in all but two trials.
Conclusions: The ratio of RMST is a complementary technique that provides an alternative method of summarizing treatment effects. Proportional hazards of the treatment effect should not be assumed in RCTs evaluating immune checkpoint inhibitors, and RMST analysis should be reported in such trials.
Key words: treatment effects, immune checkpoint inhibitors, randomized controlled trials, hazard ratio
Introduction
Over the past several years, randomized controlled trials (RCTs) have shown that immune checkpoint inhibitors can improve survival of people with a wide range of hard-to-treat advanced cancers previously considered intractable [1, 2]. The hazard ratio (HR), which requires the assumption of proportional hazards, is used routinely to quantify treatment effects in RCTs evaluating new types of cancer therapy, including immunotherapy. However, the results of RCTs comparing treatment with immune checkpoint inhibitors to previous standard therapy often show delayed separation of survival curves, a proportion of long-term survivors, and sometimes survival curves that cross over, indicating violation of the assumption of proportional hazards. This concern has been discussed in the clinical and statistical literature [3–5]. An alternative and statistically valid method of analysis is to use the restricted mean survival time (RMST) or restricted mean time lost (RMTL) to quantify the treatment effect; these methods do not require assumptions such as proportional hazards. The restricted mean is a measure of average survival from time 0 to a specified time point, and may be estimated as the area under (RMST) or above (RMTL) the survival curve up to that point [5–8]. In this article, we estimated the treatment effects measured by the ratio of RMST or RMTL in RCTs evaluating effects of immune checkpoint inhibitors.
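The verbal definition of the restricted mean given above can be written compactly. In the sketch below, S(t) denotes the survival function of one arm and t* the chosen time horizon; the symbol S is introduced here only for illustration.

```latex
\mathrm{RMST}(t^*) = \int_0^{t^*} S(t)\,dt,
\qquad
\mathrm{RMTL}(t^*) = \int_0^{t^*} \bigl(1 - S(t)\bigr)\,dt = t^* - \mathrm{RMST}(t^*).
```

The two quantities therefore always sum to t*, which is why both must be reported relative to the same time horizon.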
© The Author(s) 2018. Published by Oxford University Press on behalf of the European Society for Medical Oncology.
All rights reserved. For permissions, please email:
[email protected].
Downloaded from https://academic.oup.com/annonc/article-abstract/29/5/1320/4904103 by Bukkyo University user on 02 October 2018
Treatment effects measured by restricted mean survival time in trials of immune checkpoint inhibitors for cancer
Annals of Oncology Methods
Results
A total of 25 trials met the inclusion criteria and were included in our analysis [1, 2, 12–34] (the flow of trial screening and the references are provided in supplementary Figure S1, available at Annals of Oncology online). Baseline characteristics of these trials are summarized in Table 1. A total of 12 870 patients (median, 425 patients; range, 168–2559 patients) were enrolled in the 25 trials. Overall survival was used as a primary or coprimary end point in 18 (72%) trials (Table 1). Evidence of nonproportional hazards was found in 7 (28%) trials, although all of the trials reported an HR. The reported HRs ranged from 0.42 to 1.11 with a median of 0.73, and the reconstructed data yielded estimates close to the HRs or median survival times reported in the original articles (supplementary Table S1, available at Annals of Oncology online). Ratios of RMST ranged from 0.7 to 1.07 with a median of 0.88; differences of RMST ranged from –1.89 to 6.40 months with a median of 1.7 months; the ratios of RMTL ranged from 0.52 to 1.09 with a median of 0.85 (supplementary Table S2, available at Annals of Oncology online).
Volume 29 | Issue 5 | 2018
Table 1. Characteristics of trials included in the analysis

Study characteristics                    No. of trials (%)
Overall                                  25 (100)
Journal
  New England Journal of Medicine        12 (48.0)
  Lancet Oncology                        3 (12.0)
  Lancet                                 3 (12.0)
  Annals of Oncology                     1 (4.0)
  Journal of Clinical Oncology           5 (20.0)
  JAMA                                   1 (4.0)
Trial phase
  Phase II                               6 (24.0)
  Phase III                              19 (76.0)
Sample size, median (range)              425 (123–1132)
Treatment setting
  Metastatic                             24 (96.0)
    First line                           11 (44.0)
    Second line or beyond                12 (48.0)
    Mixed                                1 (4.0)
  Adjuvant                               1 (4.0)
Control arm
  Placebo                                3 (12.0)
  Chemotherapy                           17 (68.0)
  Targeted therapy                       1 (4.0)
  Immune checkpoint inhibitor            3 (12.0)
  Others                                 1 (4.0)
Treatment arm
  Pembrolizumab                          5 (20.0)
  Nivolumab                              8 (32.0)
  Atezolizumab                           2 (8.0)
  Ipilimumab                             9 (36.0)
  Tremelimumab                           1 (4.0)
Primary end point
  OS                                     14 (56.0)
  OS and PFS                             3 (12.0)
  OS and ORR                             1 (4.0)
  PFS (a)                                4 (16.0)
  ORR                                    2 (8.0)
  RFS                                    1 (4.0)
Treatment mode
  Monotherapy                            17 (68.0)
  Combination therapy                    7 (28.0)
  Both                                   1 (4.0)
Trial met primary end point
  Yes (b)                                20 (80.0)
  No                                     5 (20.0)

(a) Including two trials using immune-related PFS as primary end point.
(b) For trials with two primary end points, at least one was met.
OS, overall survival; PFS, progression-free survival; ORR, overall response rate; RFS, recurrence-free survival.
In all trials, there was agreement between the ratio of RMST and the HR about the direction of treatment effect. In 23 trials, both HR and ratio of RMST favored the treatment arm, but HR provided larger estimates of treatment effect than the ratio of RMST in all of them. The HR and RMST-based measures were in
We searched Medline, Embase, and the Cochrane Central Register of Controlled Trials (CENTRAL) from inception to 15 July 2017. We combined both MeSH and free text words to identify relevant studies. The search strategy is based on two previously published systematic reviews [9, 10]. Eligibility required reports of parallel group randomized trials of the immune checkpoint inhibitors, which included a Kaplan–Meier curve for overall survival. Phase I, single-arm phase II, and dose-finding trials were excluded. News, editorials, letters or commentaries, retrospective studies, review articles, and secondary analyses of RCTs were also excluded. Multiple-arm trials were included, but only one comparison arm was selected. Two authors (SZ and FL) screened trials independently for eligibility, and extracted the following information from each included trial using standardized forms: first author’s name, journal name, phase of the trial, cancer type, treatment regimen in both arms, sample size, and primary end point. We used digital software (DigitizeIt) to determine the time-dependent probability of overall survival from the published Kaplan–Meier curves. We used information on numbers at risk, and total number of events, where available, to reconstruct the Kaplan–Meier data for each arm [6, 8]. We calculated the RMST and RMTL in the experimental and control groups at the time horizon t* which was defined as the maximum (rounded) time that was shorter than or equal to the lesser of the longest time of follow-up for each of the two groups. For example, if the longest follow-up times were 18.4 and 19.2 months for treatment and control arms, respectively, 18 was chosen as t*. We also conducted a sensitivity analysis by choosing t* as the last failure time in the study. We estimated the ratio of RMST, difference of RMST, as well as the ratio of RMTL. 
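The horizon rule and the area-under-the-curve calculation described above can be sketched in a few lines. The analyses reported here were run in R with the survRM2 package; the Python below is only an illustrative stand-in, and the function names (`choose_horizon`, `kaplan_meier`, `rmst`) and the toy data are hypothetical.

```python
import math


def choose_horizon(max_followup_experimental, max_followup_control):
    """Largest whole time unit not exceeding the shorter of the two
    longest follow-up times (the t* rule described in the text)."""
    return math.floor(min(max_followup_experimental, max_followup_control))


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (time, survival) pairs, one per
    distinct event time. events[i] is 1 for an event and 0 for censoring."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects sharing the same observed time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= leaving
    return curve


def rmst(curve, t_star):
    """Area under the Kaplan-Meier step function from 0 to t_star."""
    area = 0.0
    prev_t, prev_s = 0.0, 1.0
    for t, s in curve:
        if t >= t_star:
            break
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    # Last (possibly partial) step up to the horizon.
    area += prev_s * (t_star - prev_t)
    return area


# Toy data (hypothetical): four patients, one censored at month 6.
toy_curve = kaplan_meier([2, 4, 6, 8], [1, 1, 0, 1])
toy_rmst = rmst(toy_curve, 8)
toy_rmtl = 8 - toy_rmst
horizon = choose_horizon(18.4, 19.2)  # the worked example from the text
```

Because the Kaplan-Meier curve is a step function, the integral reduces to a sum of rectangles, and the RMTL follows as t* minus the RMST.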
The ratio of RMST was transformed so that a ratio of RMST or RMTL less than 1 indicated superiority of the experimental treatment, as would be the case for an estimate of HR in a trial where proportional hazards were satisfied. We examined the proportional-hazards assumption by testing the period-by-treatment interaction term in a time-dependent Cox model [6, 8, 11]. Analyses were carried out in R, using the survRM2 package to derive the RMST and RMTL and the survival package for the proportional-hazards assumption test. A two-tailed P < 0.05 was considered statistically significant for analyses of RMST and RMTL, and P < 0.10 for the proportional-hazards assumption test.
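The three summary measures can be illustrated with a short sketch. The text does not spell out the transformation applied to the ratio of RMST; the code below assumes it is the control-to-experimental ratio, so that values below 1 favor the experimental arm, and assumes the difference is experimental minus control. The function name `effect_measures` and the input values are hypothetical.

```python
def effect_measures(rmst_experimental, rmst_control, t_star):
    """Summary measures at horizon t_star, oriented so that a ratio below 1
    favors the experimental arm (an assumed orientation, see lead-in)."""
    rmtl_experimental = t_star - rmst_experimental
    rmtl_control = t_star - rmst_control
    return {
        # Assumed transformation: control over experimental.
        "rmst_ratio": rmst_control / rmst_experimental,
        # Assumed sign convention: positive means longer survival on experimental.
        "rmst_difference": rmst_experimental - rmst_control,
        # Less time lost on the experimental arm gives a ratio below 1.
        "rmtl_ratio": rmtl_experimental / rmtl_control,
    }


# Hypothetical RMSTs of 12 and 10 months at an 18-month horizon.
example = effect_measures(12.0, 10.0, 18.0)
```

With this orientation, all three ratio-type measures (HR, ratio of RMST, ratio of RMTL) can be read on the same scale.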
Figure 1. (A) Estimates of treatment effect according to hazard ratio (HR) and ratio of restricted mean survival times (RMSTs). (B) Estimates of treatment effect according to HR and ratio of restricted mean time lost (RMTL). [Scatter plots; x-axis: HR; y-axis: ratio measure; both axes run from 0.40 to 1.20. Legend: consistent conclusion of ratio of RMST and HR; ratio of RMST significant, HR not significant; ratio of RMST not significant, HR significant; proportional hazards; nonproportional hazards.]
Discussion
There is empirical evidence from previous studies that proportional hazards cannot always be assumed for survival curves in RCTs [3, 4]. Furthermore, failure to demonstrate a violation of proportional hazards may reflect a lack of power to detect such a violation rather than confirm that hazards are proportional. In trials where the ratio of the hazards in the two treatment groups changes during follow-up, a method of analysis such as RMST, which is appropriate for any time-to-event relationship, is valuable [7].
Results of our study revealed that estimates of HR and RMST-based measures were generally in agreement regarding the direction and statistical significance of the treatment effect. Evidence of nonproportional hazards was identified in over one quarter of included trials. However, the treatment effect estimated by HR was always larger than that estimated by the ratio of RMST or of RMTL, so that the latter may be more conservative measures for evaluating the benefit of immune checkpoint inhibitors.
When the impact of immune checkpoint blockade manifests later in the survival follow-up, little or no difference in survival curves is seen for an initial period and then a late separation occurs. In this scenario, the proportional hazards assumption is clearly violated and the HR is difficult to interpret. A ‘cumulative’ measure such as the area between the curves (the difference in RMST) may reflect late differences. For example, the survival curves could cross at the median or at some other t* but still show a substantial difference in RMST at the final t*. RMST-based measures do not depend on any distributional assumption. Furthermore, previous studies [4, 35] have shown that the power of analyses that assume proportional hazards is reduced under such a ‘late effect’. Because RMST-based measures use all survival information before the prespecified t*, they are potentially more powerful.
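A small numerical sketch may help make the late-separation scenario concrete. It assumes a purely hypothetical piecewise-exponential model in which both arms share a hazard of 0.10 per month for the first 6 months, after which the experimental hazard halves; none of these numbers comes from the included trials, and the function name is illustrative.

```python
import math


def rmst_piecewise_exponential(rate_early, rate_late, delay, t_star):
    """RMST up to t_star for a hazard equal to rate_early before `delay`
    and rate_late afterwards (closed-form area under the survival curve)."""
    early_end = min(delay, t_star)
    area = (1.0 - math.exp(-rate_early * early_end)) / rate_early
    if t_star > delay:
        surv_at_delay = math.exp(-rate_early * delay)
        area += surv_at_delay * (1.0 - math.exp(-rate_late * (t_star - delay))) / rate_late
    return area


t_star = 24.0  # hypothetical horizon in months
control = rmst_piecewise_exponential(0.10, 0.10, 6.0, t_star)
experimental = rmst_piecewise_exponential(0.10, 0.05, 6.0, t_star)

late_hazard_ratio = 0.05 / 0.10       # hazard ratio after the 6-month delay
rmst_ratio = control / experimental   # below 1 favors the experimental arm

print(f"RMST control {control:.2f}, experimental {experimental:.2f}")
print(f"late-period HR {late_hazard_ratio:.2f}, ratio of RMST {rmst_ratio:.2f}")
```

In this toy setting the ratio of RMST lies closer to 1 than the late-period hazard ratio, because the restricted mean averages over the early months in which the curves coincide. This is the same direction as the pattern reported above, though the example is not a reanalysis of any trial.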
RMST or RMTL must be interpreted relative to a time horizon t*, and the choice of t* is crucial. In our study, we used a rule that made full use of the available data to capture the long-term benefit of immune checkpoint inhibition, but other rules have been used in previous studies. The rule for choosing t* could be prespecified at the study design stage with respect to the clinical relevance and feasibility of conducting the study [7]. The ratio of RMTL may be close to the HR when the event rate is low. For example, previous studies [8, 36, 37] have shown that, for trials in the adjuvant setting with low event rates, the ratio of RMTL is closer to the HR than the ratio of RMST is.
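One way to see why the ratio of RMTL approaches the HR at low event rates is a short calculation under an assumed model. The sketch below supposes exponential survival with constant hazards λ_E and λ_C in the experimental and control arms; this distributional assumption is made here only for illustration and is not required by RMST-based measures.

```latex
\mathrm{RMST}(t^*) = \frac{1 - e^{-\lambda t^*}}{\lambda},
\qquad
\mathrm{RMTL}(t^*) = t^* - \frac{1 - e^{-\lambda t^*}}{\lambda}
  = \frac{\lambda t^{*2}}{2} - \frac{\lambda^{2} t^{*3}}{6} + \cdots

\frac{\mathrm{RMTL}_E(t^*)}{\mathrm{RMTL}_C(t^*)} \approx \frac{\lambda_E}{\lambda_C} = \mathrm{HR},
\qquad
\frac{\mathrm{RMST}_C(t^*)}{\mathrm{RMST}_E(t^*)} \approx 1
\quad \text{when } \lambda_E t^*,\ \lambda_C t^* \ll 1 .
```

When few events occur before t*, the leading term of the RMTL is proportional to the hazard, so the ratio of RMTL tracks the HR, whereas both RMSTs stay close to t* and their ratio stays close to 1.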
Our study has limitations. We calculated RMST and RMTL using reconstructed individual patient data, since it is impractical to obtain the original data from all trials. However, the methods we used have been validated, with excellent accuracy and reproducibility reported in previous studies [6–8]. A single measure is not sufficient to fully characterize the survival profile for immune checkpoint inhibitors, given the complexity of survival curves. Ratios of RMST or RMTL serve as important supplementary measures and should be considered in order to fully interpret the results. In summary, the ratios of RMST and RMTL are complementary techniques that provide valid methods of summarizing treatment effects when the proportional hazards assumption is violated, as may occur in RCTs evaluating immune checkpoint inhibitors.
Acknowledgement
We thank Dr Ian Tannock, Princess Margaret Cancer Centre and University of Toronto, for his assistance in reading and editing the manuscript.
Funding
None declared.
Disclosure
The authors have declared no conflicts of interest.
References
1. Fehrenbacher L, Spira A, Ballinger M et al. Atezolizumab versus docetaxel for patients with previously treated non-small-cell lung cancer (POPLAR): a multicentre, open-label, phase 2 randomised controlled trial. Lancet 2016; 387: 1837–1846.
2. Rittmeyer A, Barlesi F, Waterkamp D et al. Atezolizumab versus docetaxel in patients with previously treated non-small-cell lung cancer (OAK): a phase 3, open-label, multicentre randomised controlled trial. Lancet 2017; 389(10066): 255–265.
3. Seruga B, Amir E, Tannock I. Treatment of lung cancer. N Engl J Med 2009; 361: 2485; author reply 2486–2487.
4. Mick R, Chen TT. Statistical challenges in the design of late-stage cancer immunotherapy studies. Cancer Immunol Res 2015; 3(12): 1292–1298.
5. A’Hern RP. Restricted mean survival time: an obligatory end point for time-to-event analysis in cancer trials? J Clin Oncol 2016; 34: 3474–3476.
6. Royston P, Parmar MK. Restricted mean survival time: an alternative to the hazard ratio for the design and analysis of randomized trials with a time-to-event outcome. BMC Med Res Methodol 2013; 13(1): 152.
7. Trinquart L, Jacot J, Conner SC, Porcher R. Comparison of treatment effects measured by the hazard ratio and by the ratio of restricted mean survival times in oncology randomized controlled trials. J Clin Oncol 2016; 34(15): 1813–1819.
8. Uno H, Claggett B, Tian L et al. Moving beyond the hazard ratio in quantifying the between-group difference in survival analysis. J Clin Oncol 2014; 32(22): 2380–2385.
9. Zhang S, Liang F, Zhu J, Chen Q. Risk of pneumonitis associated with programmed cell death 1 inhibitors in cancer patients: a meta-analysis. Mol Cancer Ther 2017; 16(8): 1588–1595.
10. Zhang S, Liang F, Li W, Wang Q. Risk of treatment-related mortality in cancer patients treated with ipilimumab: a systematic review and meta-analysis. Eur J Cancer 2017; 83: 71–79.
agreement regarding the statistical significance of the effect in all but two trials. In one trial, the ratio of RMST indicated that the experimental treatment was significantly superior, whereas the HR did not [2]. In another trial, the HR significantly favored the experimental arm, whereas the ratio of RMST did not (Figure 1A) [34]. Comparison of the ratio of RMTL and the HR revealed the same results (Figure 1B). The results of the sensitivity analysis were consistent with those of the primary analysis regarding the direction and the statistical significance of the ratios of RMST/RMTL (supplementary Table S3, available at Annals of Oncology online).
24. Brahmer J, Reckamp KL, Baas P et al. Nivolumab versus docetaxel in advanced squamous-cell non-small-cell lung cancer. N Engl J Med 2015; 373(2): 123–135.
25. Motzer RJ, Escudier B, McDermott DF et al. Nivolumab versus everolimus in advanced renal-cell carcinoma. N Engl J Med 2015; 373(19): 1803–1813.
26. Larkin J, Minor D, D’Angelo S et al. Overall survival in patients with advanced melanoma who received nivolumab versus investigator’s choice chemotherapy in CheckMate 037: a randomized, controlled, open-label phase III trial. J Clin Oncol 2018; 36(4): 383–390.
27. Bellmunt J, de Wit R, Vaughn DJ et al. Pembrolizumab as second-line therapy for advanced urothelial carcinoma. N Engl J Med 2017; 376(11): 1015–1026.
28. Reck M, Rodriguez-Abreu D, Robinson AG et al. Pembrolizumab versus chemotherapy for PD-L1-positive non-small-cell lung cancer. N Engl J Med 2016; 375(19): 1823–1833.
29. Herbst RS, Baas P, Kim DW et al. Pembrolizumab versus docetaxel for previously treated, PD-L1-positive, advanced non-small-cell lung cancer (KEYNOTE-010): a randomised controlled trial. Lancet 2016; 387(10027): 1540–1550.
30. Robert C, Schachter J, Long GV et al. Pembrolizumab versus ipilimumab in advanced melanoma. N Engl J Med 2015; 372(26): 2521–2532.
31. Ribas A, Kefford R, Marshall MA et al. Phase III randomized clinical trial comparing tremelimumab with standard-of-care chemotherapy in patients with advanced melanoma. J Clin Oncol 2013; 31(5): 616–622.
32. Reck M, Luft A, Szczesna A et al. Phase III randomized trial of ipilimumab plus etoposide and platinum versus placebo plus etoposide and platinum in extensive-stage small-cell lung cancer. J Clin Oncol 2016; 34(31): 3740–3748.
33. Eggermont AM, Chiarion-Sileni V, Grob JJ et al. Prolonged survival in stage III melanoma with ipilimumab adjuvant therapy. N Engl J Med 2016; 375(19): 1845–1855.
34. Beer TM, Kwon ED, Drake CG et al. Randomized, double-blind, phase III trial of ipilimumab versus placebo in asymptomatic or minimally symptomatic patients with metastatic chemotherapy-naive castration-resistant prostate cancer. J Clin Oncol 2017; 35(1): 40–47.
35. Pak K, Uno H, Kim DH et al. Interpretability of cancer clinical trial results using restricted mean survival time as an alternative to the hazard ratio. JAMA Oncol 2017; 3(12): 1692–1696.
36. Chan A, Buyse M, Yao B. Neratinib after trastuzumab in patients with HER2-positive breast cancer – author’s reply. Lancet Oncol 2016; 17(5): e176–e177.
37. Hasegawa T, Uno H, Wei LJ. Neratinib after trastuzumab in patients with HER2-positive breast cancer. Lancet Oncol 2016; 17(5): e176.
11. Grambsch PM, Therneau TM. Proportional hazards tests and diagnostics based on weighted residuals. Biometrika 1994; 81(3): 515–526.
12. Langer CJ, Gadgeel SM, Borghaei H et al. Carboplatin and pemetrexed with or without pembrolizumab for advanced, non-squamous non-small-cell lung cancer: a randomised, phase 2 cohort of the open-label KEYNOTE-021 study. Lancet Oncol 2016; 17(11): 1497–1508.
13. Hodi FS, Chesney J, Pavlick AC et al. Combined nivolumab and ipilimumab versus ipilimumab alone in patients with advanced melanoma: 2-year overall survival outcomes in a multicentre, randomised, controlled, phase 2 trial. Lancet Oncol 2016; 17(11): 1558–1568.
14. Carbone DP, Reck M, Paz-Ares L et al. First-line nivolumab in stage IV or recurrent non-small-cell lung cancer. N Engl J Med 2017; 376(25): 2415–2426.
15. Hodi FS, O’Day SJ, McDermott DF et al. Improved survival with ipilimumab in patients with metastatic melanoma. N Engl J Med 2010; 363(8): 711–723.
16. Reck M, Bondarenko I, Luft A et al. Ipilimumab in combination with paclitaxel and carboplatin as first-line therapy in extensive-disease-small-cell lung cancer: results from a randomized, double-blind, multicenter phase 2 trial. Ann Oncol 2013; 24(1): 75–83.
17. Lynch TJ, Bondarenko I, Luft A et al. Ipilimumab in combination with paclitaxel and carboplatin as first-line treatment in stage IIIB/IV non-small-cell lung cancer: results from a randomized, double-blind, multicenter phase II study. J Clin Oncol 2012; 30(17): 2046–2054.
18. Robert C, Thomas L, Bondarenko I et al. Ipilimumab plus dacarbazine for previously untreated metastatic melanoma. N Engl J Med 2011; 364(26): 2517–2526.
19. Hodi FS, Lee S, McDermott DF et al. Ipilimumab plus sargramostim vs ipilimumab alone for treatment of metastatic melanoma: a randomized clinical trial. JAMA 2014; 312(17): 1744–1753.
20. Kwon ED, Drake CG, Scher HI et al. Ipilimumab versus placebo after radiotherapy in patients with metastatic castration-resistant prostate cancer that had progressed after docetaxel chemotherapy (CA184-043): a multicentre, randomised, double-blind, phase 3 trial. Lancet Oncol 2014; 15(7): 700–712.
21. Ferris RL, Blumenschein G Jr, Fayette J et al. Nivolumab for recurrent squamous-cell carcinoma of the head and neck. N Engl J Med 2016; 375(19): 1856–1867.
22. Robert C, Long GV, Brady B et al. Nivolumab in previously untreated melanoma without BRAF mutation. N Engl J Med 2015; 372(4): 320–330.
23. Borghaei H, Paz-Ares L, Horn L et al. Nivolumab versus docetaxel in advanced nonsquamous non-small-cell lung cancer. N Engl J Med 2015; 373(17): 1627–1639.