Power Calculation for the Log Rank Test Using Historical Data
Alan B. Cantor, PhD
Moffitt Cancer Center and Research Institute, Tampa, Florida
ABSTRACT: When planning a clinical trial that is to use the log rank test to compare survival in two groups, it is desirable to determine that the power of the test is adequate given the anticipated accrual rate and time, follow-up time, and survival functions $S_1(t)$ and $S_2(t)$. Often it is assumed that the ratio of the associated hazards is a constant, $\rho$, and we want adequate power for a given value of $\rho$. In this case $S_2(t) = [S_1(t)]^{\rho}$, so that an assumption concerning $S_1(t)$ is required. If a Kaplan-Meier estimate $\hat S_1(t)$ is available from a previous study, its use might be preferable to assuming a distribution of a particular form. In this note we show how such power calculations can be performed. Furthermore, since for any value of $t$, $\hat S_1(t)$ is a random variable, the variance of power estimates calculated using it can be estimated. Controlled Clin Trials 1996; 17:111-116
KEY WORDS: power, sample size, log rank test, Kaplan-Meier method
INTRODUCTION
Since its introduction in 1972, the log rank test [1] has become the most widely used method for comparing survival curves. Thus, it is important to have methods to calculate the power of the log rank test under various scenarios. The literature on this subject is plentiful but generally requires assumptions concerning the underlying survival model. Several methods of power calculation that assume proportional hazards are based on the asymptotic relationship given by Rubinstein et al. [2]:
$$\frac{(\log \rho)^2}{(z_\alpha + z_\beta)^2} = E^{-1}[D_1] + E^{-1}[D_2] \tag{1}$$
Here $\rho$ is the ratio of hazards, $\alpha$ and $\beta$ are the desired type I and II error rates, and $z_\alpha$ and $z_\beta$ are defined by $\Phi(z_\theta) = 1 - \theta$, $\Phi(\cdot)$ being the standard normal distribution function. $E[D_i]$, the expected number of deaths on treatment $i$, is a function of the survival distribution $S_i(\cdot)$, the censoring distribution, and the sample size in group $i$. The censoring distribution is determined by the accrual pattern, postaccrual follow-up time, and losses to follow-up.

Address reprint requests to: Alan B. Cantor, Ph.D., Moffitt Cancer Center and Research Institute, 12902 Magnolia Drive, Tampa, FL 33612. Received April 2, 1993; revised May 31, 1995; accepted June 28, 1995. Controlled Clinical Trials 17:111-116 (1996). © Elsevier Science Inc. 1996, 655 Avenue of the Americas, New York, NY 10010. 0197-2456/96/$15.00. SSDI 0197-2456(95)00152-7.

A number of formulas assume that accrual is uniform over some interval $[0, T]$ and that the analyses are done after $\tau$ units of additional follow-up time. Thus, a patient arriving at time $t$ has a censored survival time of $T + \tau - t$ if his or her survival time exceeds that value. If patients are enrolled at a uniform rate, $r$, in the time interval $[0, T]$ and followed for an additional $\tau > 0$ units of time before analyses are done, then for a patient arriving at time $t$ in $[0, T]$ and assigned treatment $i$, the probability of death prior to the analysis is $1 - S_i(w)$, where $w = T + \tau - t$ is the time from arrival to date of analysis. Integrating over the arrival time distribution, the probability of death for a patient on treatment $i$ is

$$p_i = \frac{1}{T}\int_{\tau}^{T+\tau} [1 - S_i(w)]\,dw = 1 - \frac{1}{T}\int_{\tau}^{T+\tau} S_i(w)\,dw \tag{2}$$
If $\pi_i$ is the proportion of the study group to be placed on treatment $i$ and $r$ is the accrual rate, $E[D_i] = rT\pi_i p_i$. Thus, the calculation of power requires an assumption concerning $S_1(t)$ and the calculation of

$$\int_{\tau}^{T+\tau} S_1(w)\,dw \quad\text{and}\quad \int_{\tau}^{T+\tau} S_1^{\rho}(w)\,dw.$$
Schoenfeld [3] discusses estimation of the right side of Eq. (2) using Simpson's rule with two subintervals. $p_2$ can be approximated conservatively by $1 - (1 - p_1)^{\rho}$. The method described below is in the spirit of Schoenfeld's work. The main departure is that the historical data are more fully utilized so that the resulting power estimate is likely to be more accurate. Rubinstein et al. [2] derive an exact expression for the exponential case. Sposto and Sather [4], Shuster [5], and Cantor [6] consider models for which $S_1(t) \to c > 0$ as $t \to \infty$, i.e., models with a nonzero "cure rate." The latter two provide computer programs. Brown et al. [7] present a Bayesian approach to power calculation. They use results of a previous study of two treatments to base power calculations on a posterior distribution of the parameters of interest. Lakatos [8] discusses sample size for clinical trials based on a Markov model in which Kaplan-Meier curves can be used to obtain transition probabilities.
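The Simpson's rule approximation attributed to Schoenfeld above can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and the three survival values are read from the Kaplan-Meier table used in the numerical example later in this note (5 years of accrual, 2 years of follow-up, so the integral in Eq. (2) runs from 2 to 7 years).

```python
from math import log


def simpson_death_prob(s_start, s_mid, s_end):
    """Eq. (2) with Simpson's rule on two subintervals.

    (1/T) * integral of S over an interval of length T is approximated by
    (S(start) + 4*S(midpoint) + S(end)) / 6, so p = 1 minus that average.
    """
    return 1 - (s_start + 4 * s_mid + s_end) / 6


def conservative_p2(p1, rho):
    """Conservative approximation p_2 = 1 - (1 - p_1)**rho."""
    return 1 - (1 - p1) ** rho


# Survival on the standard treatment at 2, 4.5, and 7 years (illustrative
# values taken from the historical Kaplan-Meier table in the example).
p1 = simpson_death_prob(0.96, 0.87, 0.60)
rho = log(0.85) / log(0.75)  # hazard ratio used in the example
p2 = conservative_p2(p1, rho)
print(f"p1 = {p1:.3f}, p2 = {p2:.3f}")
```

Only three points of the historical curve enter this approximation, which is the sense in which the method described below uses the historical data more fully.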
USE OF HISTORICAL DATA
When a clinical trial to compare a standard to an experimental therapy is being planned, the investigators often have data concerning the standard treatment and its survival curve, $S_1(t)$. Typically, the objective is to have a specified power for some given values of $S_1(t_0)$ and $S_2(t_0)$, where $t_0$ is specified. These values lead to the hazard ratio $\rho = \log S_2(t_0)/\log S_1(t_0)$. The choice of a function to be used as the integrand in Eq. (2) is frequently a source of uncertainty and potential error. Cantor [6] showed that the erroneous assumption of an exponential model can produce serious overestimates of power. If sufficient historical data on the standard treatment are available to produce its Kaplan-Meier survival curve $\hat S_1(t)$, a reasonable approach is to replace $S_1(t)$ in Eq. (2) by $\hat S_1(t)$ and $S_2(t)$ by $\hat S_2(t) = [\hat S_1(t)]^{\rho}$. In addition to avoiding the problem of model specification, this approach leads to a much simpler calculation. The required integrals become simply sums of products of the form $\Delta_j \hat S_i(t_{j-1})$.
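Written out, the reduction to sums looks as follows. The indexing is an assumption made here for illustration: $t_1 < \dots < t_m$ are the partition points of the integration interval, with $t_1 = \tau$, $t_m = T + \tau$, and the historical death times between them. Because the Kaplan-Meier estimate is constant between consecutive death times, the integral is exact for the step function:

```latex
\int_{\tau}^{T+\tau} \hat S_i(w)\,dw
  = \sum_{j=2}^{m} \Delta_j\, \hat S_i(t_{j-1}),
\qquad \Delta_j = t_j - t_{j-1},
\qquad
E[D_i] = r\pi_i \Bigl( T - \sum_{j=2}^{m} \Delta_j\, \hat S_i(t_{j-1}) \Bigr).
```

The second expression follows from $E[D_i] = rT\pi_i p_i$ and Eq. (2).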
Table 1. Historical Data on Standard Treatment

Time (years)    Survival
0.00            1.00
1.23            0.99
1.48            0.98
1.70            0.96
2.11            0.95
2.43            0.93
3.01            0.90
3.75            0.87
4.51            0.83
4.92            0.75
6.63            0.60
7.21            0.60
A NUMERICAL EXAMPLE
Suppose we are planning a study comparing a standard and an experimental treatment. We have historical data for survival on the standard treatment, which enable us to calculate the Kaplan-Meier estimate of the survival distribution in Table 1. Except for the final entry in the above table, we have deleted times associated with censored observations because they do not affect the calculations that follow. Noting that the estimated 5-year survival probability is 0.75, we would like to estimate the power of a one-sided test with $\alpha = 0.05$ if the new treatment increases the 5-year survival probability to 0.85. A proportional hazards assumption implies $\rho = \log(0.85)/\log(0.75) = 0.565$. Thus, $\hat S_2(t) = [\hat S_1(t)]^{0.565}$. Suppose we plan to accrue 100 patients/year for 5 years, with $\pi_1 = \pi_2 = 0.5$ and $\tau = 2$, so that $r = 100$, $T = 5$, and the integrals in Eq. (2) run over $[\tau, T + \tau] = [2, 7]$. Let $\Delta_j = t_j - t_{j-1}$. Table 1 can now be expanded as in Table 2. We then have

$$E[D_1] = r\pi_1\Bigl(T - \sum_j \Delta_j \hat S_1(t_{j-1})\Bigr) = 50(5 - 4.12) = 44.0$$

$$E[D_2] = r\pi_2\Bigl(T - \sum_j \Delta_j \hat S_2(t_{j-1})\Bigr) = 50(5 - 4.47) = 26.5$$

Therefore,

$$z_\beta = \frac{|\log 0.565|}{\sqrt{1/44.0 + 1/26.5}} - 1.645 = 0.677$$

and the projected power is 75%.
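The calculation above can be reproduced with a short script. This is a minimal sketch, not the author's program: the function names are hypothetical, the Kaplan-Meier estimate is treated as a right-continuous step function built from Table 1, and $\hat S_2$ is computed without rounding to two decimals, so $E[D_2]$ comes out slightly below the 26.5 obtained from the rounded table entries.

```python
from math import log, sqrt
from statistics import NormalDist

# Table 1: times and the Kaplan-Meier survival value holding from each time on.
KM_TIMES = [0.00, 1.23, 1.48, 1.70, 2.11, 2.43, 3.01, 3.75, 4.51, 4.92, 6.63, 7.21]
KM_SURV = [1.00, 0.99, 0.98, 0.96, 0.95, 0.93, 0.90, 0.87, 0.83, 0.75, 0.60, 0.60]


def km_value(t):
    """Value of the right-continuous Kaplan-Meier step function at time t."""
    value = KM_SURV[0]
    for tj, sj in zip(KM_TIMES, KM_SURV):
        if tj <= t:
            value = sj
    return value


def step_integral(lo, hi, rho=1.0):
    """Integral of S(w)**rho over [lo, hi] as the sum of Delta_j * S(t_{j-1})**rho."""
    cuts = [lo] + [t for t in KM_TIMES if lo < t < hi] + [hi]
    return sum((b - a) * km_value(a) ** rho for a, b in zip(cuts, cuts[1:]))


def expected_deaths(r, T, tau, pi_i, rho=1.0):
    """E[D_i] = r * pi_i * (T - integral of S_i over [tau, T + tau])."""
    return r * pi_i * (T - step_integral(tau, T + tau, rho))


def logrank_power(d1, d2, rho, alpha):
    """Solve Eq. (1) for z_beta and return (z_beta, power) for a one-sided test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = abs(log(rho)) / sqrt(1 / d1 + 1 / d2) - z_alpha
    return z_beta, NormalDist().cdf(z_beta)


rho = log(0.85) / log(0.75)
d1 = expected_deaths(r=100, T=5, tau=2, pi_i=0.5)
d2 = expected_deaths(r=100, T=5, tau=2, pi_i=0.5, rho=rho)
z_beta, power = logrank_power(d1, d2, rho, alpha=0.05)
print(f"rho={rho:.3f} E[D1]={d1:.1f} E[D2]={d2:.1f} z_beta={z_beta:.3f} power={power:.2f}")
```

Changing `r`, `T`, or `tau` in the last lines shows how accrual rate, accrual time, and follow-up time affect the projected power, provided `T + tau` stays within the range of the historical data.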
Table 2. Calculations for $E[D_i]$

$t_j$   $\hat S_1(t_j)$   $\hat S_2(t_j)$   $\Delta_j$   $\sum \Delta_j \hat S_1(t_{j-1})$   $\sum \Delta_j \hat S_2(t_{j-1})$
2.00    0.96              0.98
2.11    0.95              0.97              0.11         0.11                                0.11
2.43    0.93              0.96              0.32         0.41                                0.42
3.01    0.90              0.94              0.58         0.95                                0.98
3.75    0.87              0.92              0.74         1.62                                1.67
4.51    0.83              0.90              0.76         2.28                                2.37
4.92    0.75              0.85              0.41         2.62                                2.74
6.63    0.60              0.75              1.71         3.90                                4.19
7.00    0.60              0.75              0.37         4.12                                4.47

Using Simpson's rule as in Schoenfeld [3] yields 0.160 and 0.094 for the proportion dying in the control group and experimental group, respectively. As a result, $E[D_1] = 40.0$, $E[D_2] = 22.5$, $z_\beta = 0.5215$, and $1 - \beta = 0.70$. To increase power, we can consider increasing the accrual rate, accrual time, or follow-up time. Values of $T$ and $\tau$ for which $T + \tau > 7.21$ will require an assumption about $S_1(t)$ for $t > 7.21$. If $T + \tau$ does not greatly exceed 7.21, the result will not be too sensitive to that assumption.

[Figure 1. Kaplan-Meier curve from a leukemia study. Horizontal axis: years followed.]

CONFIDENCE INTERVALS FOR ASYMPTOTIC POWER
Although we will suppress the "hat" notation for simplicity, the quantities $E[D_i]$, $z_\beta$, and $\beta$ calculated as described above are random variables, functions of the random variables $\{\hat S_1(t_j)\}$. It is reasonable therefore to estimate $\mathrm{var}(\beta)$ and thus derive one-sided confidence intervals of the form $(P_L, 1)$ for $1 - \beta$. We can consider the $\Delta_j$ fixed and write the variance of $E[D_i]$ as $(r\pi_i)^2 \sum_j \sum_k \Delta_j \Delta_k\, \mathrm{cov}(S_i(t_j), S_i(t_k))$. The formula

$$E[S(t_j)\,S(t_k)] = [S(t_k)/S(t_j)]\,E[S^2(t_j)], \qquad t_j < t_k \tag{3}$$

given by Kaplan and Meier [9] can be used to estimate the covariances. These calculations lead to an estimate of $V = \mathrm{var}(E^{-1}[D_1] + E^{-1}[D_2])$, which in turn allows the calculation of confidence intervals for the estimated power. In general, we want $\Pr[1 - \beta > P_L] = 1 - \gamma$ for some preassigned $\gamma$. Since $1 - \beta$ is a decreasing function of $E^{-1}[D_1] + E^{-1}[D_2]$, an appropriate value of $P_L$ is

$$\Phi\bigl[\,|\log \rho|/\sqrt{u} - z_\alpha\bigr] \tag{4}$$

where $u = E^{-1}[D_1] + E^{-1}[D_2] + z_\gamma \sqrt{V}$. Alternatively, $\mathrm{var}(1 - \beta)$ can be calculated directly from

$$1 - \beta = \Phi\Bigl[\,|\log \rho|/\sqrt{E^{-1}[D_1] + E^{-1}[D_2]} - z_\alpha\Bigr] \tag{5}$$

by the delta method. Then a $(1 - \gamma)100\%$ lower bound is

$$1 - \beta - z_\gamma \sqrt{\mathrm{var}(1 - \beta)} \tag{6}$$
The details are provided in the Appendix.

As an example, Fig. 1 presents the Kaplan-Meier survival curve for one of the treatment arms in a leukemia study. We plan to accrue 50 patients per year, randomly assigned to each treatment with equal probability, for 5 years and to follow them for 3 additional years. We want to determine the power of this study using a log rank test with a one-sided significance level of 0.05. Suppose the ratio of the hazard of the new treatment to that of the old is 0.61. Note that this corresponds to increasing the 5-year survival rate from 52% to 67%. Applying the above methods yields $E[D_1] = 59.09$, $E[D_2] = 40.54$, and $1 - \beta = 0.782$. To obtain the 95% lower bound, we estimate the variance of $E^{-1}[D_1] + E^{-1}[D_2]$ to be $8.919 \times 10^{-7}$. A 95% upper bound for $E^{-1}[D_1] + E^{-1}[D_2]$ is 0.043 and the corresponding 95% lower bound for power is 0.769. As an alternative, we estimate the variance of $1 - \beta$ to be $6.573 \times 10^{-5}$, which also produces a 95% lower bound of 0.769.

DISCUSSION

Calculation of the power of a clinical trial being planned is complicated by the fact that the actual power depends on many unknown factors. Among these are the accrual rate and pattern, the distribution of losses, and the true survival distributions for the treatment groups. On the other hand, accrual time, follow-up time, and type I error rate, which are set by the planners of the trial, also affect power. In actual practice, knowledge of the first set of factors can only come from experience. Preferably this experience will be preserved as carefully recorded data under circumstances very similar to those of the trial being planned. Such data are most likely to be available for cooperative groups that are funded over a long period and for major research institutions that conduct a series of clinical trials for the same disease. In such cases, the method discussed above might be particularly appropriate. Even in other situations the use of the estimated Kaplan-Meier survival curve obtained from historical data might be a reasonable alternative to the imposition of an assumed model.

The author thanks the referees for suggestions that improved this manuscript. In particular, the use of the expression $\mathrm{var}\bigl(\sum_j \Delta_j \hat S(t_j)\bigr) = \sum_j \sum_k \Delta_j \Delta_k\, \mathrm{cov}(\hat S(t_j), \hat S(t_k))$ was due to one of the referees.
REFERENCES
1. Mantel N: Evaluation of survival data and two new rank order statistics arising in its consideration. Cancer Chemother Rep 50:163-170, 1966
2. Rubinstein RV, Gail MH, Santner JT: Planning the duration of a comparative clinical trial with loss to follow-up and a period of continued observation. J Chron Dis 34:469-479, 1981
3. Schoenfeld DA: Sample size formula for the proportional-hazards regression model. Biometrics 39:449, 1983
4. Sposto R, Sather HN: Determining the duration of comparative clinical trials while allowing for cure. J Chron Dis 38:683-690, 1985
5. Shuster JJ: Handbook of sample size guidelines for clinical trials. Boca Raton: CRC Press; 1990
6. Cantor AB: Sample size calculations for the logrank test: a Gompertz model approach. J Clin Epidemiol 45:1131-1136, 1992
7. Brown WB, Herson J, Atkinson EN, Rozell ME: Projection from previous studies: a Bayesian and frequentist compromise. Controlled Clin Trials 8:29-44, 1987
8. Lakatos E: Sample sizes based on the log-rank statistic in complex clinical trials. Biometrics 44:229-241, 1988
9. Kaplan EL, Meier P: Nonparametric estimation from incomplete observations. J Am Stat Assoc 53:457-481, 1958
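APPENDIX

From Eq. (3), replacing survival function values by their Kaplan-Meier estimates, we have $\mathrm{cov}(\hat S_1(t_j), \hat S_1(t_k)) = [\hat S_1(t_k)/\hat S_1(t_j)]\,\mathrm{var}[\hat S_1(t_j)]$ $(t_j < t_k)$. By the delta method, $\mathrm{cov}(\hat S_2(t_j), \hat S_2(t_k)) = \rho^2 [\hat S_1(t_j)\,\hat S_1(t_k)]^{\rho - 1}\, \mathrm{cov}(\hat S_1(t_j), \hat S_1(t_k))$. Then for $i = 1, 2$, $\mathrm{var}\,E[D_i] = (r\pi_i)^2 \sum_j \sum_k \Delta_j \Delta_k\, \mathrm{cov}(\hat S_i(t_j), \hat S_i(t_k))$ and $\mathrm{var}(E^{-1}[D_i]) = E^{-4}[D_i]\,\mathrm{var}\,E[D_i]$ by the delta method. Since the $E[D_i]$ are independent, $V = \mathrm{var}(E^{-1}[D_1] + E^{-1}[D_2]) = \mathrm{var}(E^{-1}[D_1]) + \mathrm{var}(E^{-1}[D_2])$. Then $\Phi[\,|\log \rho|/\sqrt{u} - z_\alpha]$, where $u = E^{-1}[D_1] + E^{-1}[D_2] + z_\gamma \sqrt{V}$, is a $(1 - \gamma)100\%$ lower bound for power.

As an alternative, applying the delta method to Eq. (5), we obtain

$$V(P) = (2\pi)^{-1} \exp\Bigl[-\bigl(|\log \rho|/\sqrt{X} - z_\alpha\bigr)^2\Bigr]\, \frac{(\log \rho)^2}{4X^3}\, \mathrm{var}(X),$$

where $X = E^{-1}[D_1] + E^{-1}[D_2]$, as the variance of the estimated power. Then $1 - \beta - z_\gamma \sqrt{V(P)}$ is a $(1 - \gamma)100\%$ one-sided confidence limit.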
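The two lower bounds can be checked numerically against the leukemia example. The sketch below is an illustration under stated assumptions: the function name is hypothetical, and the inputs ($E[D_1] = 59.09$, $E[D_2] = 40.54$, $\rho = 0.61$, and the variance estimate $8.919 \times 10^{-7}$ for $E^{-1}[D_1] + E^{-1}[D_2]$) are taken as given from the text rather than recomputed from the Kaplan-Meier covariances, because the underlying leukemia data are not tabulated here.

```python
from math import exp, log, pi, sqrt
from statistics import NormalDist

PHI = NormalDist()  # standard normal distribution


def power_lower_bounds(d1, d2, rho, var_x, alpha=0.05, gamma=0.05):
    """Return (power, bound via Eq. 4, delta-method variance, bound via Eq. 6)."""
    z_alpha = PHI.inv_cdf(1 - alpha)
    z_gamma = PHI.inv_cdf(1 - gamma)
    a = abs(log(rho))
    x = 1 / d1 + 1 / d2                      # X = E^-1[D1] + E^-1[D2]
    z_beta = a / sqrt(x) - z_alpha
    power = PHI.cdf(z_beta)                  # Eq. (5)

    # Eq. (4): replace X by its upper confidence bound u.
    u = x + z_gamma * sqrt(var_x)
    bound_u = PHI.cdf(a / sqrt(u) - z_alpha)

    # Delta method: squared derivative of Eq. (5) in X, times var(X).
    var_power = exp(-z_beta ** 2) / (2 * pi) * a ** 2 / (4 * x ** 3) * var_x
    bound_delta = power - z_gamma * sqrt(var_power)   # Eq. (6)
    return power, bound_u, var_power, bound_delta


power, bound_u, var_power, bound_delta = power_lower_bounds(
    d1=59.09, d2=40.54, rho=0.61, var_x=8.919e-7
)
print(f"power={power:.3f} bound_u={bound_u:.3f} "
      f"var_power={var_power:.3e} bound_delta={bound_delta:.3f}")
```

Both routes agree to about three decimal places here, which is consistent with the two 95% lower bounds reported in the example.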