A comparison of methods for Phase II cancer clinical trials: advantages of the triangular test, a group sequential method


Lung Cancer 10 Suppl. 1 (1994) S105-S115

Eric Bellissant,* Jacques Bénichou, Claude Chastang

Département de Biostatistique et Informatique Médicale, Hôpital Saint-Louis, 1, avenue Claude Vellefaux, 75475 Paris Cedex 10, France

Abstract

In cancer, the purpose of Phase II studies is to determine whether the response rate $p$ to a new treatment is greater than a prespecified value $p_0$, defined as the largest response rate for which Phase III studies are not worthwhile. For example, the question may be whether the response rate to a new drug is greater than 20%. The problem is one of decision making and amounts to comparing an observed percentage with a theoretical percentage. One way of resolving it is to perform a single statistical analysis after the inclusion of a predetermined number of patients $N$, but this is not always possible because $N$ is often large. Furthermore, this approach raises ethical problems when evidence of inefficacy or efficacy becomes available early in the trial. For these reasons, several authors have developed methods that allow repeated analyses and, possibly, an early conclusion of the study: two-stage, multistage, sequential and group sequential methods. This article considers the main decision making methods proposed in the literature for Phase II studies in oncology. The bibliographic study, which highlights the value of group sequential methods, and especially of the Triangular Test, is confirmed by a comparative study of the statistical properties of the different methods.

Key words: Non-randomized clinical trials; Phase II; Oncology; Multi-stage methods; Group sequential methods; Sequential probability ratio test; Triangular test

*Corresponding author, Département de Biostatistique et Informatique Médicale, Hôpital Saint-Louis, 1, avenue Claude Vellefaux, 75475 Paris Cedex 10, France.

0169-5002/94/$07.00 © 1994 Elsevier Science Ireland Ltd. All rights reserved. SSDI 0169-5002(93)00271-A


1. Introduction

In cancer, Phase II clinical trials are usually non-comparative and aim to determine whether the efficacy of a new treatment is sufficient to warrant further studies in Phase III [3]. The usual endpoint in these trials is the response rate $p$, estimated by the ratio of the number of observed responses $S$ to the number of included and evaluated patients $N$ [7]. The study should be able to determine whether $p$ is greater than a prespecified value $p_0$, defined as the largest response rate for which Phase III studies are considered not worthwhile. In lung cancer, for example, the problem could be to determine whether the response rate to a new chemotherapy is greater than 20%. In statistical terms, the problem amounts to comparing an observed percentage ($p$) with a theoretical percentage ($p_0$), and can be expressed as a test of the null hypothesis $H_0$: $p \le p_0$ (inefficacy) against the alternative hypothesis $H_a$: $p > p_0$ (efficacy).

The simplest method is to perform a single statistical analysis after the inclusion of a predetermined number of patients $N$. $N$ is calculated by choosing $p_0$ and the Type I error rate $\alpha$, and by specifying the alternative hypothesis, that is, by choosing the threshold response rate $p_a$ at and above which Phase III studies are considered justified ($p_a$ corresponds to the minimum clinically interesting benefit when compared with $p_0$), together with the Type II error rate $\beta$ under the particular hypothesis $p = p_a$. In practice, this single-stage design is difficult to implement because of recruitment problems ($N$ is usually too large) and ethical problems (the study cannot be stopped when the drug appears clearly ineffective or effective). Many other methods allowing repeated analyses have therefore been developed. Some foresee only two stages [5,8,10], whereas others place no limit on the number of analyses [4,6].
Finally, other methods plan a new analysis each time a patient is evaluated (strictly sequential methods [1,11]) or each time a group of $n$ patients is evaluated (group sequential methods [2]), without predetermining a maximum number of analyses.

The purposes of this article are, first, to describe the main methods proposed for Phase II trials in oncology (Section 2) and, second, to compare their statistical properties (Section 3).

2. Main methods proposed for Phase II studies in oncology

2.1. The single stage design

This is the reference method, which plans for only one analysis, performed when the predetermined sample size has been reached. As described in the introduction, for a given set of values of $p_0$, $p_a$, $\alpha$ and $\beta$ it is possible to calculate the required sample size $N$ and the number of responses $C$ corresponding to the decision threshold. Once $N$ patients have been included, the drug under study is classified as either ineffective or effective according to the observed number of responses $S$: if $S \le C$, the conclusion is that the null hypothesis cannot be rejected, signifying that the response rate is not shown to be superior to $p_0$, and therefore that Phase III studies are not worthwhile; if $S > C$, the conclusion is that
the null hypothesis is rejected, signifying that the response rate is superior to $p_0$, and therefore that Phase III studies should be considered.

This method is easy to implement and its statistical analysis is simple to perform. In addition, it respects the predetermined Type I and II error rates, since exact values for $N$ and $C$ can be calculated using the binomial distribution. However, it is difficult to apply in practice because $N$ is often too high: for example, if $p_0 = 0.40$, $p_a = 0.60$ and $\alpha = \beta = 0.05$, the required sample size is $N = 67$. It also poses ethical and economic problems when there is early evidence of inefficacy or efficacy, since it requires the inclusion of $N$ patients before a conclusion can be reached. For these two reasons, several authors have proposed multistage methods for reaching the conclusion earlier and, consequently, reducing the required sample size.

2.2. The two-stage method proposed by Gehan [5]

This method gives an estimate of the response rate in two stages, allowing early stopping when there is a high probability that the response rate is lower than a predefined value. It combines testing (first stage) and estimation (second stage). At the end of the first stage, in which $N_1$ subjects have been included, the decision to continue or to stop the study depends on the observed number of responses $S$: if $S = 0$, the study is stopped and the development of the drug is abandoned; if $S > 0$, the study continues with the inclusion of $N_2$ subjects, this number being determined so as to estimate the success rate $p$ with a predefined precision.

The calculation of $N_1$ relies on the choices of $p_a$ and of the Type II error rate $\beta$: $N_1$ is the minimum number of subjects required for the probability of observing no success to be lower than the prespecified value $\beta$ if the response rate $p$ to the new drug is equal to $p_a$. For example, if $p_a = 20\%$ and $\beta = 5\%$, then $N_1 = 14$.
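The first-stage rule just described can be checked numerically. The following is a minimal sketch, assuming only the definition given above (smallest $N_1$ such that $(1 - p_a)^{N_1} < \beta$); the function name `gehan_first_stage_size` is introduced here for illustration and does not come from Gehan's paper.

```python
def gehan_first_stage_size(p_a: float, beta: float) -> int:
    """Smallest N1 with P(no response among N1 patients | p = p_a) < beta.

    Hypothetical helper: it implements only the first-stage definition
    quoted in the text, not the second-stage (estimation) calculation.
    """
    if not (0.0 < p_a < 1.0 and 0.0 < beta < 1.0):
        raise ValueError("p_a and beta must lie strictly between 0 and 1")
    n1 = 1
    # (1 - p_a) ** n1 is the probability of observing zero responses.
    while (1.0 - p_a) ** n1 >= beta:
        n1 += 1
    return n1


if __name__ == "__main__":
    # Example from the text: p_a = 20%, beta = 5%.
    print(gehan_first_stage_size(p_a=0.20, beta=0.05))  # 14
```

With $p_a = 0.20$, the probability of no response is about 0.055 after 13 patients and about 0.044 after 14, which is why the first stage needs 14 patients.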
The calculation of $N_2$ depends on the choice of $\sigma$, the standard error corresponding to the desired precision in the estimation of $p$: $N_2$ is the number of subjects to add to $N_1$ in order to estimate the response rate $p$ with a 95% confidence interval of width $4\sigma$. In the previous example, if $S = 2$ after the first stage and $\sigma = 5\%$, then $N_2 = 63$.

This method is easy to implement and allows early stopping if the drug shows very poor efficacy. However, it lets a large number of ineffective drugs through to the second stage (since the Type I error rate $\alpha$ is not taken into account) and leads to large sample sizes if $p$ is to be estimated with good precision: in the previous example, if the true response rate of the drug is 5%, the probability of reaching the second stage is 0.51, and the inclusion of 77 patients is required to estimate $p$ with $\sigma = 5\%$, that is, with a 95% confidence interval of width 20%. Furthermore, except when there is no response in the first stage, it does not make it possible to decide whether the development of the drug should be abandoned or continued in Phase III.

2.3. The multistage method proposed by Herson [6]

This is a multistage procedure (2, 3, 4 or more stages) whose purpose is to allow early stopping when preliminary results show evident inefficacy. Given the
values of $p_0$, $p_a$, $\alpha$ and $\beta$, the required sample size $N$ is calculated (it is the same as for the standard single stage design), and the number of stages and the number of patients to be included at each stage are then chosen. At the end of each stage, a statistical analysis is performed. At stage $g$, the decision depends on the number of responses $S$ observed since the beginning of the trial: if $S \le C_g$ (the threshold value at the end of stage $g$), the study is stopped and the development of the drug is abandoned; if $S > C_g$, the study continues to the next stage. This procedure is repeated until the final stage $k$, in which $C_k = C$ (the threshold value for a single stage design including $N_k = N$ patients).

The determination of the threshold value $C_g$ is based on the probability of observing, at the end of the trial, at most $C$ responses among $N$ patients given that $S_g$ responses among $N_g$ patients have been observed at the end of stage $g$ ($S_g$ and $N_g$ being, respectively, the cumulative numbers of observed responses and included patients). This probability is called predictive: when it is greater than a chosen threshold value $P_0$ (0.85, for example), there is a high probability that the number of responses at the end of the study will be lower than or equal to the final threshold value $C$, and the study can be stopped at this stage.

Although this procedure allows early stopping if the drug under study shows clear inefficacy, planning is relatively difficult, and a computer is needed to calculate $C_g$. Furthermore, the method involves several interrelated parameters, so that for certain choices of $p_0$, $p_a$, $\alpha$ and $\beta$, and for some given values of $k$ and $N_g$, it is impossible to find values of $C_g$ that yield a predictive probability greater than or equal to the chosen threshold value $P_0$.
Finally, the alternative possibility of early stopping when the drug shows evident efficacy is not considered.

2.4. The multistage method proposed by Fleming [4]

This is a multistage procedure (2, 3, 4 or more stages) whose purpose is to allow early stopping when preliminary results show either evident inefficacy or evident efficacy. Given the values of $p_0$, $p_a$, $\alpha$ and $\beta$, the required sample size $N$ is calculated, and the number of stages and the number of patients to be included at each stage are then chosen, as in the method proposed by Herson [6]. At the end of each stage, a statistical analysis is performed. At stage $g$, the drug is classified as ineffective or effective by comparing the number of responses $S$ obtained since the beginning of the trial with two critical values $a_g$ and $r_g$ ($a_g < r_g$): if $S \le a_g$, the study is stopped and the development of the drug is abandoned; if $S \ge r_g$, the study is stopped and the drug is selected for Phase III studies; and if $a_g < S < r_g$, the study continues to the next stage. This procedure is repeated until the final stage $k$, where $a_k = r_k - 1$ and a decision is necessarily taken.

This procedure allows early stopping if the drug under study shows either evident inefficacy or evident efficacy. However, the use of the normal approximation to the binomial distribution leads to a slight underestimation of $N$, and Fleming, in order to obtain a test that respects the Type I error rate $\alpha$, has to increase the lower threshold values $a_g$. The consequence is a reduction of power.
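A stopping rule of the form described above (stop for inefficacy if $S \le a_g$, stop for efficacy if $S \ge r_g$, otherwise continue) can be evaluated exactly by recursion over the binomial distribution. The sketch below is an illustration under stated assumptions: it does not reproduce Fleming's or Herson's calculation of the thresholds, which the text does not detail, and the function names and the three-stage boundaries in the example are hypothetical.

```python
from math import comb


def binom_pmf(x: int, n: int, p: float) -> float:
    """Probability of x responses among n patients with response rate p."""
    return comb(n, x) * p**x * (1.0 - p) ** (n - x)


def operating_characteristics(stage_sizes, a, r, p):
    """Exact P(reject H0), P(accept H0) and average sample number (ASN).

    stage_sizes[g]: number of patients added at stage g.
    a[g], r[g]: stop for inefficacy if S <= a[g], for efficacy if S >= r[g],
    where S is the cumulative number of responses.
    """
    if a[-1] != r[-1] - 1:
        raise ValueError("final stage needs a_k = r_k - 1 to force a decision")
    continuing = {0: 1.0}  # P(still running with S cumulative responses)
    p_reject = p_accept = asn = 0.0
    n_cum = 0
    for n_g, a_g, r_g in zip(stage_sizes, a, r):
        n_cum += n_g
        # Distribution of cumulative responses after adding this stage.
        after_stage = {}
        for s_prev, prob in continuing.items():
            for x in range(n_g + 1):
                s = s_prev + x
                after_stage[s] = after_stage.get(s, 0.0) + prob * binom_pmf(x, n_g, p)
        # Apply the stopping rule and keep only the paths that continue.
        continuing = {}
        for s, prob in after_stage.items():
            if s <= a_g:
                p_accept += prob
                asn += n_cum * prob
            elif s >= r_g:
                p_reject += prob
                asn += n_cum * prob
            else:
                continuing[s] = prob
    return p_reject, p_accept, asn


if __name__ == "__main__":
    # Hypothetical three-stage boundaries, for illustration only.
    sizes, a, r = [20, 20, 20], [4, 13, 24], [13, 21, 25]
    for p in (0.20, 0.40):
        print(p, operating_characteristics(sizes, a, r, p))
```

A rule without early stopping for efficacy, as in Herson's method, can be expressed in the same function by setting `r[g]` above the cumulative sample size at every stage except the last; a single stage design is the special case of one stage with `a = [C]` and `r = [C + 1]`.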


2.5. The group sequential methods: the discrete sequential probability ratio test (SPRT) and the discrete triangular test (TT) [2]

We have extended to non-comparative Phase II studies in oncology [2] two group sequential methods initially proposed by Whitehead and Jones in 1979 [12-14] for Phase III studies. These methods, which use the principles of Wald's Sequential Probability Ratio Test [11] and Anderson's Triangular Test [1], respectively, allow analyses to be performed after the evaluation of groups of $n$ subjects. They use a sequential plan defined by two orthogonal axes, where the X-axis represents a statistic $V$ and the Y-axis another statistic $Z$. For the comparison of an observed percentage with a theoretical percentage, we have demonstrated [2] that these two statistics have simple expressions: $V = N p_0 (1 - p_0)$ and $Z = S - N p_0$, where $N$ and $S$ are the cumulative numbers of evaluated patients and observed responses. $Z$ is the difference between the number of observed responses $S$ and the number of expected responses $N p_0$, and $V$ is the variance of $Z$ for $p = p_0$.

In practice, at the end of the inclusion of each group of $n$ patients, the two statistics $Z$ and $V$ are calculated from all the data collected since the beginning of the study, and the point so defined is plotted on the sequential plan. As soon as the sample path formed by the successive points crosses one of the boundaries of the test, the conclusion is reached and the study is stopped. For both methods, the equations of the straight-line stopping boundaries depend on the chosen values of $p_0$, $p_a$, $\alpha$, $\beta$ and $n$ (which determines the frequency of the analyses). Fig. 1 displays, for example, the sequential plan obtained with the TT for

Fig. 1. Triangular Test: $p_0 = 0.40$, $p_a = 0.60$, $\alpha = \beta = 0.05$, $n = 3$, simulated data. (Region labelled in the plot: region of rejection of $H_0$.)


$p_0 = 0.40$, $p_a = 0.60$, $\alpha = 0.05$, $\beta = 0.05$ and $n = 3$, together with the path obtained from the analysis of simulated data, which leads to rejection of the null hypothesis at the sixth analysis (18th inclusion).

The group sequential procedures retain the ethical advantages of strictly sequential methods. They have appropriate statistical properties ($\alpha$ and $\beta$), and they allow substantial reductions in the required sample size [2]. However, as with Wald's SPRT, the discrete SPRT has the disadvantage of a potentially infinite number of analyses (open plan), which is not the case with the discrete TT (closed plan).

3. Comparison of the different methods

3.1. Methodology

In order to assess the main methods proposed for Phase II cancer clinical trials, it is necessary to compare their statistical properties in a wide range of situations. We excluded Gehan's method [5] because it does not lead to a decision. The single stage and multistage designs were studied analytically (exact calculations were made from Schultz's formulas [9]). Sequential methods were studied using simulations: for each case studied, 30 000 independent Phase II studies were generated, in which each patient could be classified as a responder (with probability $p$) or a non-responder (with probability $1 - p$). This study [2] made it possible to estimate the Type I and II error rates and the average sample number (ASN) of patients required to reach a conclusion, for different values of $p$, in different cases defined by their values of $p_0$, $p_a$, $\alpha$ and $\beta$.

We selected five cases for $p_0$/$p_a$: (I) 0.05/0.20, (II) 0.20/0.40, (III) 0.40/0.60, (IV) 0.60/0.80 and (V) 0.80/0.95. Concerning the Type I and II error rates, we chose two values for $\alpha$ (0.05 and 0.10) and only one for $\beta$ (0.05), because we consider that the most important error that can occur in Phase II is the failure of the study to detect an effective treatment [8,10].
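The simulation approach described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' program: the statistics $Z = S - N p_0$ and $V = N p_0 (1 - p_0)$ are taken from Section 2.5, but the equations of the stopping boundaries are not given in this article, so `upper_boundary` and `lower_boundary` below are hypothetical converging straight lines with placeholder coefficients, and all function names are introduced here for illustration.

```python
import random

# Placeholder coefficients for two converging straight lines (hypothetical;
# they are NOT the boundaries of the triangular test used in the article).
A, C = 4.0, 0.2


def upper_boundary(v: float) -> float:
    return A + C * v


def lower_boundary(v: float) -> float:
    return -A + 3.0 * C * v


def simulate_trial(p, p0, n, upper, lower, max_patients, rng):
    """Simulate one group sequential trial; return (reject_H0, patients)."""
    n_cum = 0  # cumulative number of evaluated patients (N in the text)
    s_cum = 0  # cumulative number of responses (S in the text)
    while n_cum < max_patients:
        # Evaluate one more group of n patients, each responding with prob. p.
        s_cum += sum(1 for _ in range(n) if rng.random() < p)
        n_cum += n
        v = n_cum * p0 * (1.0 - p0)
        z = s_cum - n_cum * p0
        if z >= upper(v):
            return True, n_cum   # upper boundary crossed: reject H0
        if z <= lower(v):
            return False, n_cum  # lower boundary crossed: do not reject H0
    # Safety guard for open plans: count an undecided trial as non-rejection.
    return False, n_cum


def estimate(p, p0, n, upper, lower, max_patients=200, n_trials=30_000, seed=0):
    """Monte Carlo estimate of P(reject H0) and of the ASN for response rate p."""
    rng = random.Random(seed)
    rejections = 0
    total_patients = 0
    for _ in range(n_trials):
        rejected, used = simulate_trial(p, p0, n, upper, lower, max_patients, rng)
        rejections += rejected
        total_patients += used
    return rejections / n_trials, total_patients / n_trials


if __name__ == "__main__":
    # Rejection rate at p = p0 estimates alpha; at p = pa it estimates power.
    for p in (0.40, 0.60):
        print(p, estimate(p, p0=0.40, n=3, upper=upper_boundary, lower=lower_boundary))
```

Passing the boundaries as functions keeps the simulation loop independent of any particular test: an open plan with parallel lines or a closed triangular plan can be evaluated with the same code once their actual equations are supplied.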
For each case, different values of the response rate $p$ were successively studied (21 values from 0.0 to 1.0 in steps of 0.05). Furthermore, the influence of the frequency of the analyses was studied by considering, for the multistage methods, different numbers of stages ($k$ = 2, 3 and 4) and, for the group sequential methods, all the odd values of $n$ between 1 and 15.

3.2. Results

Type I error rate $\alpha$ and power $1 - \beta$. Table 1 gives the estimated values of the Type I error rate $\alpha$ and of the power $1 - \beta$ obtained for the five methods compared. The values of $k$ and $n$ retained for the presentation of the results of the multistage and group sequential methods are 3 and 7, respectively. The results for the other values of $k$ and $n$ studied are similar. For the group sequential methods and the single-stage design, the Type I error rate $\alpha$ was always close to the target value (0.05 or 0.10) and the power $1 - \beta$ under the alternative hypothesis ($p = p_a$) was always close to 0.95 (for a theoretical value of $\beta = 0.05$). For Herson's multistage method, the Type I error rate $\alpha$ was sometimes greater than the target value - for example in Case IV, for a target value equal to 0.05, the


Table 1
Type I error rate $\alpha$ and power $1 - \beta$ for $p = p_a$ for the single stage design, the multi-stage designs of Herson and Fleming ($k = 3$) and the group sequential methods ($n = 7$). Each entry gives the observed $\alpha$ followed by the observed power.

| $p_0$/$p_a$ (case) | Target $\alpha$ | Single stage | Herson ($k = 3$) | Fleming ($k = 3$) | SPRT ($n = 7$) | TT ($n = 7$) |
|---|---|---|---|---|---|---|
| 0.05/0.20 (I) | 0.05 | 0.04-0.95 | 0.03-0.93 | 0.03-0.92 | 0.09-0.99 | 0.07-0.99 |
| 0.05/0.20 (I) | 0.10 | 0.07-0.96 | 0.06-0.94 | 0.07-0.92 | 0.13-0.98 | 0.12-0.99 |
| 0.20/0.40 (II) | 0.05 | 0.04-0.96 | 0.04-0.94 | 0.05-0.85 | 0.05-0.97 | 0.06-0.98 |
| 0.20/0.40 (II) | 0.10 | 0.10-0.96 | 0.09-0.93 | 0.08-0.84 | 0.13-0.97 | 0.11-0.97 |
| 0.40/0.60 (III) | 0.05 | 0.05-0.95 | 0.09-0.94 | 0.04-0.76 | 0.06-0.95 | 0.05-0.95 |
| 0.40/0.60 (III) | 0.10 | 0.08-0.95 | 0.15-0.95 | 0.09-0.81 | 0.10-0.94 | 0.10-0.95 |
| 0.60/0.80 (IV) | 0.05 | 0.04-0.96 | 0.18-0.96 | 0.03-0.79 | 0.03-0.95 | 0.05-0.93 |
| 0.60/0.80 (IV) | 0.10 | 0.10-0.96 | 0.26-0.95 | 0.08-0.82 | 0.10-0.92 | 0.09-0.93 |
| 0.80/0.95 (V) | 0.05 | 0.05-0.96 | 0.12-0.93 | 0.03-0.89 | 0.01-0.81 | 0.03-0.77 |
| 0.80/0.95 (V) | 0.10 | 0.10-0.96 | 0.19-0.95 | 0.08-0.93 | 0.06-0.87 | 0.10-0.85 |

observed value was 0.18 - whereas the power was always close to 0.95. For Fleming's multistage method, the Type I error rate $\alpha$ was always at or below the target value (because of the method of calculation used by Fleming to determine the thresholds), but the power was generally lower than 0.95: for example in Case III, with $\alpha$ chosen equal to 0.05, the observed power was 0.76 (that is, a Type II error rate $\beta$ of 0.24). For the group sequential methods, the Type I and II error rates with the TT and the SPRT were practically identical to the target values for Case III (0.40/0.60). The Type I error rate $\alpha$ and the power $1 - \beta$ tended to increase when $p_0$/$p_a$ decreased and to decrease when $p_0$/$p_a$ increased. Nevertheless, the differences between the observed and theoretical values of $\alpha$ and $\beta$ are acceptable in practice, since very low values of $p_0$ and $p_a$ (as in Case I) or very high values (as in Case V) are rarely used in Phase II cancer clinical trials.

Average sample number (ASN). For the single stage design, the sample size depends only on $p_0$, $p_a$, $\alpha$ and $\beta$, whereas for the multistage and sequential methods it also depends on the actual response rate $p$. Table 2 shows the average number of patients (ASN) for $p = p_0$ and $p = p_a$ for the five methods. The group sequential methods allow large decreases in the sample size. The reductions obtained are approximately 50%: for example, for $p_0 = 0.20$, $p_a = 0.40$, $\alpha = 0.05$ and $\beta = 0.05$, the single stage design requires 60 subjects, whereas an average of 38 is required for $p = p_0$ and 32 for $p = p_a$ with the TT when an analysis is performed every 7 subjects. For $p = p_0$, Herson's method gives reductions similar to those obtained with the group sequential methods. For $p = p_a$, on the other hand, it requires a sample size close to that of the single stage design. In Fleming's method, reductions in the ASN for both $p = p_0$ and $p = p_a$ are


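Table 2
Average sample number (ASN) for $p = p_0$ and $p = p_a$ for the single stage design, the multi-stage designs of Herson and Fleming ($k = 3$) and the group sequential methods ($n = 7$). For the multistage and group sequential methods, each entry gives the ASN for $p = p_0$ followed by the ASN for $p = p_a$.

| $p_0$/$p_a$ (case) | $\alpha$ | Single stage | Herson ($k = 3$) | Fleming ($k = 3$) | SPRT ($n = 7$) | TT ($n = 7$) |
|---|---|---|---|---|---|---|
| 0.05/0.20 (I) | 0.05 | 50 | 31-49 | 34-28 | 41-21 | 47-28 |
| 0.05/0.20 (I) | 0.10 | 44 | 29-43 | 30-25 | 35-18 | 38-20 |
| 0.20/0.40 (II) | 0.05 | 60 | 37-59 | 25-28 | 34-28 | 38-32 |
| 0.20/0.40 (II) | 0.10 | 45 | 29-44 | 20-21 | 31-20 | 31-25 |
| 0.40/0.60 (III) | 0.05 | 67 | 37-65 | 25-31 | 34-33 | 38-37 |
| 0.40/0.60 (III) | 0.10 | 56 | 36-55 | 20-25 | 29-26 | 32-29 |
| 0.60/0.80 (IV) | 0.05 | 60 | 32-59 | 22-30 | 29-34 | 28-31 |
| 0.60/0.80 (IV) | 0.10 | 47 | 28-46 | 18-24 | 21-22 | 24-25 |
| 0.80/0.95 (V) | 0.05 | 50 | 24-48 | 23-36 | 16-39 | 18-26 |
| 0.80/0.95 (V) | 0.10 | 38 | 22-37 | 22-30 | 16-24 | 14-18 |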

as large as those obtained with the group sequential methods. This is, however, partly a consequence of the underestimation in the calculation of $N$ (which decreases the number of subjects included at each stage) and is obtained at the price of a substantial loss of power.

Fig. 2 represents the ASN as a function of $p$ in Case III ($p_0 = 0.40$, $p_a = 0.60$) for $\alpha = \beta = 0.05$. For the group sequential methods, the curves are symmetrical about a maximum situated between $p_0$ and $p_a$. The ASN is minimal for the extreme values of $p$ (close to 0 or 1) and increases to reach a maximum for $p = 0.50$. Between $p_0$ and $p_a$, the ASN is greater with the SPRT (maximum 53) than with the TT (maximum 49). However, for values of $p$ much lower than $p_0$ or much higher than $p_a$, the SPRT allows larger reductions in the ASN. This is a classical feature of the TT [1,14]. In Herson's method, the decrease obtained is as large as with the sequential methods for values of $p$ between 0 and $(p_0 + p_a)/2$. For high values of $p$ ($p \ge p_a$), the ASN tends towards the sample size required by the corresponding single stage design. In Fleming's multistage method, the decreases in the ASN are equal to (for values of $p$ lower than $p_0$ or greater than $p_a$) or larger than (between $p_0$ and $p_a$) those obtained with the group sequential methods, but this is achieved at the cost of a decrease in power.

Influence of the frequency of the analyses. To study the influence of the frequency of the analyses on the properties of the multistage and group sequential methods, we studied the Type I and II error rates $\alpha$ and $\beta$ and the ASN under both $H_0$ and $H_a$ for different values of $k$ and $n$. Table 3 presents the results for the Type I error rate $\alpha$, the power $1 - \beta$, and the ASN for $p = p_0$ and $p = p_a$ for two of the three $k$ ($k$ = 2 and 4) and three of the eight $n$ ($n$ = 1, 5 and 15) studied


Fig. 2. Average sample number (ASN) as a function of the response rate $p$ (horizontal axis, 0.0 to 1.0) for the single stage design, the multi-stage designs of Herson and Fleming ($k = 3$) and the group sequential methods ($n = 7$). Case III: $p_0$/$p_a$ = 0.40/0.60, $\alpha = \beta = 0.05$.

values for Case III (0.40/0.60). For Herson's method, increasing the number of stages raises the Type I error rate $\alpha$, especially clearly when there are at least four stages, but without any simultaneous variation in the power. The ASN for $p = p_0$ is reduced when the number of stages increases, whereas it remains constant for $p = p_a$. For Fleming's method, increasing the number of stages does not change the Type I error rate $\alpha$, but the power, already low, decreases with the number of stages. The ASN for $p = p_0$ and for $p = p_a$ both decrease as the number of stages increases. For the group sequential methods, the Type I error rate $\alpha$ and the power $1 - \beta$ are not modified by the frequency of the analyses (at least for $n \le 15$). The ASN for $p = p_0$ and for $p = p_a$ increases when the frequency of the analyses decreases. In practice, the impact is moderate: for example, with the TT in Case III ($p_0 = 0.40$, $p_a = 0.60$) with $\alpha = \beta = 0.05$, the ASN for $p = p_0$ is equal to 36 for $n = 1$, and increases to 38 for $n = 5$ and 42 for $n = 15$. The results are similar for the other cases studied.

The advantage of the group sequential methods, namely the possibility of performing an analysis only every $n$ subjects, is not obtained at the expense of the statistical properties, and it is therefore unnecessary to perform very frequent analyses. Nevertheless, it must be emphasized that the ASN is an average and that, in practice, a conclusion can only be reached at an exact multiple of $n$. Therefore, in order to keep the benefit of the sequential approach, $n$ should not be chosen too large.

This comparative study makes clear the practical advantages of the sequential methods: these statistical methods, conceived with the idea of repeated analyses,

0.74 0.84

27 25

35 28

0.94 0.93

35 29

65 53

0.93 0.94

66

55

0.05 0.10

0.05 0.10

0.05

0.10

$1 - \beta$

ASN under $H_0$

ASN under $H_a$

0.05

41

44

0.03 0.08

0.20 0.23

0.04 0.08

0.10

$\alpha$

Fleming

(k=2)

Herson

(k = 4)

Herson

a

(k = 2)

31 24

24 20

0.74 0.80

0.03 0.08

(k=4)

Fleming

0.96 0.95 35 30

0.95 0.94 32 28 32 25

0.06 0.11

0.05 0.11

31 24

SPRT (n = 5)

(n = 1)

SPRT

35 27

34 29

37 28

38 31

36 30 38 33

41 32

42 34

0.96 0.95

0.05 0.10

0.05 0.10 0.96 0.95

0.05 0.10

TT (n = 15) TT (n = 5)

0.96 0.95

0.96 0.95

0.07 0.10

SPRT (n = 15)

Table 3
Type I error rate $\alpha$, power $1 - \beta$, and average sample number (ASN) for $p = p_0$ and $p = p_a$ as functions of the frequency of the analyses for the multi-stage designs of Herson and Fleming and the group sequential methods. Case III: $p_0$/$p_a$ = 0.40/0.60.


are among the only methods studied that allow large reductions in the ASN while keeping Type I and II error rates close to their target values.

4. Conclusion

Phase II studies in oncology, whose aim is to select drugs that are sufficiently effective to justify further studies in Phase III, lead, from a statistical point of view, to the comparison of an observed proportion with a theoretical proportion. In these studies, the sample size required by the single stage design is often too high. Furthermore, it is advisable, for ethical reasons, that studies can be stopped early when the drug appears clearly ineffective or effective. Among the methods proposed to obtain a more rapid conclusion, the Triangular Test constitutes an interesting approach: (i) it meets the need (derived from ethical considerations) for repeated analyses during the study; (ii) it has good statistical properties and reaches the conclusion earlier than the single-stage design; (iii) it is simple to implement, and the values of $V$ and $Z$ obtained at each analysis are easily calculated from the numbers of patients included and of responses observed; (iv) inclusion cannot go on indefinitely, owing to the closed nature of the continuation region; and (v) the possibility of not performing an analysis after each patient facilitates the implementation of the study (possibility of multicentre studies).

5. References

1 Anderson TW. A modification of the sequential probability ratio test to reduce the sample size. Ann Math Stat 1960; 31: 165-97.
2 Bellissant E, Bénichou J, Chastang C. Application of the Triangular Test to Phase II cancer clinical trials. Stat Med 1990; 9: 907-17.
3 Carter SK. Clinical trials in cancer chemotherapy. Cancer 1977; 40: 544-57.
4 Fleming TR. One-sample multiple testing procedure for Phase II clinical trials. Biometrics 1982; 38: 143-51.
5 Gehan EA. The determination of the number of patients required in a preliminary and a follow-up trial of a new chemotherapeutic agent. J Chron Dis 1961; 13: 346-53.
6 Herson J. Predictive probability early termination plans for Phase II clinical trials. Biometrics 1979; 35: 775-83.
7 Herson J. Statistical aspects in the design and analysis of Phase II clinical trials. In: Buyse ME, Staquet MJ, Sylvester RJ, editors. Cancer clinical trials: methods and practice. Oxford: Oxford University Press, 1984; 239-57.
8 Lee YJ, Staquet M, Simon R, Catane R, Muggia F. Two-stage plans for patient accrual in Phase II cancer clinical trials. Cancer Treat Rep 1979; 63: 1721-6.
9 Schultz JR, Nichol FR, Elfring GL, Weed SD. Multiple-stage procedures for drug screening. Biometrics 1973; 29: 293-300.
10 Staquet M, Sylvester R. A decision theory approach to Phase II clinical trials. Biomedicine 1977; 26: 262-6.
11 Wald A. Sequential analysis. New York: Wiley, 1947.
12 Whitehead J. The design and analysis of sequential clinical trials. Chichester: Ellis Horwood, 1983.
13 Whitehead J, Jones DR. The analysis of sequential clinical trials. Biometrika 1979; 66: 443-52.
14 Whitehead J, Stratton I. Group sequential clinical trials with triangular continuation regions. Biometrics 1983; 39: 227-36.