Special Article
Statistics

SEQUENTIAL CLINICAL TRIALS

IRWIN D. J. BROSS, Ph.D.
NEW YORK, N.Y.

From the Cornell University Medical College*

(Received for publication June 7, 1958.)

I. INTRODUCTION
My purpose here is to discuss a class of techniques called “sequential procedures” which are sometimes useful in clinical trials. In an effort to improve communication I will use an informal question-and-answer style:

Q: What is a sequential procedure?
A: In a broad sense, a sequential study is one where the investigator tries to profit from the experience he accumulates as the study goes along. Thus, a study is done in a sequence of steps (or “stages”). At the completion of each stage the researcher pauses and reviews what has happened up to this point. After considering what has already occurred, he makes decisions about the next stage.
Q: What kind of decisions are you talking about?
A: I will be mainly concerned here with a decision as to whether to stop the trial or to go ahead with it. However, there are other decisions which can be made in this step-by-step or sequential fashion. For example, the dosage of a drug can be increased or lowered, or other changes in protocol may be made.
Q: That hardly seems like a new idea.
A: Oh it isn’t; almost any good investigator intuitively uses a sequential approach. The statisticians have simply taken a familiar idea and put it to work in an explicit and systematic fashion. The rules of the game are written out so that the investigator has an objective guide for his decisions and does not have to rely entirely on his personal judgment. The formalization of this old idea is a comparatively new development which dates back to World War II and the work of Abraham Wald.1 The original application was to sampling inspection of incoming lots of such things as ball bearings or ammunition.

*Department of Public Health and Preventive Medicine, The New York Hospital-Cornell Medical Center. Supported in part by grant from Rockefeller Foundation.
J. Chron. Dis., Volume 8, Number 3, September, 1958
Q: That seems like a long way from the clinical trial of a new drug.
A: It is, and naturally the original procedures designed for industrial inspection have to be adapted to the much different situations encountered in the medical field. Sometimes the older devices cannot be fitted into the clinical trials so we have to develop new ones. One of the factors that has hampered the use of sequential methods in medical research has been a lack of appreciation of this very point: statistical methods have to be tailored to fit the actual clinical situation, and sequential procedures are no exception to this rule.
Q: I take it then, that sequential procedures have not been an unqualified success in the medical field.
A: So they have not. The sequential trials which have been published in the medical literature can be counted on your fingers. However, the current trend toward the controlled clinical trial as an instrument for evaluation of new drugs (and other therapeutic measures) is likely to lead to a wider use of sequential methods. So at least a nodding acquaintance with the notion of a sequential procedure should be of use to an experimenter or a critical reader.
Q: Why do you think that these methods will come into wider use?
A: Largely it’s a matter of costs, not just dollars and cents but costs in time and effort as well. Thus suppose we have a sound protocol for testing chemotherapeutic agents for cancer that calls for a fixed number of patients to be tested on each compound. In certain favorable situations sequentialization of the protocol may do two things. First it may give a trial which is just about as good as the original “fixed sample size” protocol insofar as detection of promising agents goes. Second it may cut the average sample size required per agent by as much as 50 per cent. Cutting the work load may save money, or it may permit more agents to be tested with available resources. This is especially important nowadays because of the developing bottleneck at the clinical trial stage.
Q: You say that “sequentialization” of a protocol can sometimes save money, time, and effort. What do you mean by “sequentialization”?
A: If we start with a good protocol based on a fixed number of patients for each test agent we may be able, without drastic changes in the original protocol, to add a stop-or-go sequential scheme.
Q: What’s that?
A: A stop-or-go scheme is simply a rule which, at each stage, tells us whether to go on with the test of the particular agent or to stop the trial. Such a scheme presupposes that the patients enter the study, either individually or in batches, over a period of time so that the total series can be regarded as a sequence of smaller sub-experiments or stages. For example, if the original protocol called for a series of 30 patients, the patients might come into the study at the rate of 6 per month over a 5-month period. If some assessment of the response to the test agent of the first 6 patients can be made within the first month, then this information may be used as the basis
for a stop-or-go rule. Depending on the initial results we might either continue the trial or stop it. Notice that with sequentialization there will be different numbers of patients tested on the different agents.
Q: But what’s the point of the stop-or-go scheme?
A: If the investigator makes a clinical trial of a number of agents some of them will be duds (no better than the agent in the control series) while others will be promising. Of course, there is usually a favorable report on an agent from laboratory animal studies before it would be considered for clinical testing, but even so most of the agents are likely to turn out to be duds in human beings. So what we would want to do is to drop the duds as soon as possible, to use the minimum number of patients on the duds. At the same time we would usually want to carry the promising agents through the full test.

II. A HYPOTHETICAL CLINICAL TRIAL: FIXED SAMPLE SIZE

Q: You have said that a stop-or-go scheme aims at using as few as possible patients on dud agents and at the same time doing a full test on promising agents. But I don’t see how a stop-or-go scheme can work since you don’t know in advance which are duds and which are not. If you did, you wouldn’t have to do a clinical trial at all.
A: I think the easiest way to see how a stop-or-go scheme works is to watch one in operation. To keep things simple, I will have to use a rather unrealistic example, a very small hypothetical study, if you do not mind.
Q: Well I do not care much for hypothetical studies.
A: Neither do I. However the little hypothetical study will serve to illustrate some principles of sequentialization that carry over to actual studies. Perhaps I can build a plausible setting for the example. Let’s suppose that the response to the agents under test is to be assessed by a complex and possibly erratic chemical determination on blood samples. This lab work is the limiting factor of the study since the doctor can run only 6 determinations a week. He plans to run 3 determinations on patients receiving the test agent (Drug A) and 3 on control patients (Drug B) where the allocation of patients to the drugs is randomized. He feels that 9 patients (i.e., 3 stages or weeks) will provide an adequate test of the agent. Before we consider sequentialization of this study plan, let us examine a possible method of analyzing the fixed-sample (9-patient) trial.
Q: Isn’t the 9-patient trial too small for any valid analysis?
A: The study is smaller than would ordinarily be desirable, but even with 9 patients we can set up a legitimate significance test which will correspond to the customary (two-tailed) 5 per cent level test. Moreover the significance test will be “distribution free”; we will not need to assume that the measurements will be normally distributed, for example. This is how it works. Let us suppose that an effective agent acts to lower the laboratory readings. Then if we compare the measurement for a patient given Drug A with the reading for a control patient we will say there is an “inversion” if the patient given Drug A has a higher reading. Now since each patient given Drug A can be compared with each of the 3 control patients, there will be 9 possible comparisons in a given week (stage). The total number of inversions in this stage may therefore be any number from 0 to 9. With an effective agent we would expect to find only a few “inversions,” perhaps one or two.
Q: What would we expect to find with a dud agent?
A: I thought I was asking the questions. Well, it is an easy question to answer if you know the trick. First we replace each original measurement by its rank (i.e., the smallest observation would have the rank “1”), since the rank order of the observations is all that matters here. Second we write down all the possible outcomes. These steps are shown in Table I. For any outcome we can determine the number of inversions (say “U”). This step is summarized in Table II.

TABLE I

Ranks for Drug A    Ranks for Drug B
123                 456
124                 356
125                 346
126                 345
134                 256
135                 246
136                 245
145                 236
146                 235
156                 234
234                 156
235                 146
236                 145
245                 136
246                 135
256                 134
345                 126
346                 125
356                 124
456                 123

TABLE II

NUMBER OF INVERSIONS (U)    NUMBER OF OUTCOMES WITH U INVERSIONS    PROBABILITY OF U INVERSIONS
0                           1                                       0.05
1                           1                                       0.05
2                           2                                       0.10
3                           3                                       0.15
4                           3                                       0.15
5                           3                                       0.15
6                           3                                       0.15
7                           2                                       0.10
8                           1                                       0.05
9                           1                                       0.05
Total                       20                                      1.00

Now if the test agent is a dud (and if we randomize), then each of the possible outcomes is equally likely to occur. Since in all there are 20 possible outcomes, there is a probability of 0.05 (i.e., one chance in twenty) that a given outcome will occur. So we can find the chance that a given number of inversions will occur by:

\[ P(U_A \text{ inversions}) = \frac{\text{Number of outcomes with } U_A \text{ inversions}}{\text{Total number of outcomes}} \]

From Table II we can see what will happen if the test agent is a dud. We see that only 5 per cent of the time will we find no inversions from a dud agent. Sixty per cent of the time (0.60 = 0.15 + 0.15 + 0.15 + 0.15) we would find between 3 and 6 inversions. So much for the first stage. I think you can see that if we let U_B be the number of inversions in the second stage then we would have the same probabilities for U_B as for U_A. Are you following so far?
Q: I suppose so. Just give me a while to study Tables I and II. Where is this all leading?
A: The next step is to see what happens (always assuming the agent is a dud) to the total number of inversions in the first two stages; call this U_AB (i.e., U_AB = U_A + U_B). If the stages are independent, that is, the results of one stage do not affect another stage, we can get the answer by a little addition and multiplication.

TABLE III

          U_A    0   1   2   3   4   5   6   7   8   9
          P'     1   1   2   3   3   3   3   2   1   1
U_B  P'
0    1           1   1   2   3   3   3   3   2   1   1
1    1           1   1   2   3   3   3   3   2   1   1
2    2           2   2   4   6   6   6   6   4   2   2
3    3           3   3   6   9   9   9   9   6   3   3
4    3           3   3   6   9   9   9   9   6   3   3
5    3           3   3   6   9   9   9   9   6   3   3
6    3           3   3   6   9   9   9   9   6   3   3
7    2           2   2   4   6   6   6   6   4   2   2
8    1           1   1   2   3   3   3   3   2   1   1
9    1           1   1   2   3   3   3   3   2   1   1
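The counts in Table II can be checked by brute force. What follows is a minimal sketch in Python, assuming only the setup described above (3 ranks assigned to Drug A out of 6, an inversion being a Drug A rank that exceeds a control rank); the function name `inversion_counts` is merely illustrative.

```python
from itertools import combinations
from collections import Counter

def inversion_counts(n_test=3, n_control=3):
    """Tally the equally likely outcomes of one stage by number of inversions.

    An outcome is the set of ranks (1 = smallest reading) held by the
    Drug A patients; an inversion is a (Drug A, control) pair in which
    the Drug A patient has the higher rank.
    """
    ranks = range(1, n_test + n_control + 1)
    counts = Counter()
    for test_ranks in combinations(ranks, n_test):
        control_ranks = [r for r in ranks if r not in test_ranks]
        u = sum(1 for a in test_ranks for b in control_ranks if a > b)
        counts[u] += 1
    return counts

counts = inversion_counts()
total = sum(counts.values())
for u in sorted(counts):
    # columns: inversions, number of outcomes, probability under the dud hypothesis
    print(u, counts[u], counts[u] / total)
```

Each printed row corresponds to one row of Table II: the number of inversions, the number of outcomes giving that many inversions, and that number divided by the 20 possible outcomes.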
To do this we set up Table III. In the horizontal stub we write the number of inversions in the first stage (U_A) and the probability, P', (in twentieths) of U_A. This will allow us to work with whole numbers instead of fractions. In the vertical stub we do the same for the number of inversions in the second stage (U_B). To get the numbers in the body of the table we simply multiply the probability (P') for the row by the probability for the column. For example, for U_B = 3 and U_A = 2, then P' for U_B = 3 and for U_A = 2 is 2 X 3 = 6 in the table.

From Table III we obtain Table IV by adding up the numbers along the diagonals (indicated by dotted lines in Table III). This gives us the probability (in four hundredths) of getting a total of U_AB inversions in the first two stages. Thus for U_A = 0 and U_B = 3, P' = 3. We add the numbers 3, 2, 2, 3 (along this diagonal) to find U_AB = 3 has a P' = 10.

TABLE IV

U_AB    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18
P'      1   2   5  10  16  24  33  40  45  48  45  40  33  24  16  10   5   2   1

This same device can be used for any number of stages. Thus in Table V we put U_AB in the horizontal stub and U_C, the number of inversions in the third stage, in the vertical stub. Proceeding as before we find the probabilities for the total number of inversions in all three stages, U_ABC. These are given in Table VI. Note that the total in Table VI, 8,000, represents the number of ways in which a U_ABC can be achieved. All right so far?

TABLE V

          U_AB   0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18
          P'     1   2   5  10  16  24  33  40  45  48  45  40  33  24  16  10   5   2   1
U_C  P'
0    1           1   2   5  10  16  24  33  40  45  48  45  40  33  24  16  10   5   2   1
1    1           1   2   5  10  16  24  33  40  45  48  45  40  33  24  16  10   5   2   1
2    2           2   4  10  20  32  48  66  80  90  96  90  80  66  48  32  20  10   4   2
3    3           3   6  15  30  48  72  99 120 135 144 135 120  99  72  48  30  15   6   3
4    3           3   6  15  30  48  72  99 120 135 144 135 120  99  72  48  30  15   6   3
5    3           3   6  15  30  48  72  99 120 135 144 135 120  99  72  48  30  15   6   3
6    3           3   6  15  30  48  72  99 120 135 144 135 120  99  72  48  30  15   6   3
7    2           2   4  10  20  32  48  66  80  90  96  90  80  66  48  32  20  10   4   2
8    1           1   2   5  10  16  24  33  40  45  48  45  40  33  24  16  10   5   2   1
9    1           1   2   5  10  16  24  33  40  45  48  45  40  33  24  16  10   5   2   1

TABLE VI

U_ABC    Prob (in 8,000ths)    Cum P
0        1                     .0001
1        3                     .0005
2        9                     .002
3        22                    .004
4        45                    .010
5        84                    .020
6        143                   .038
7        222                   .066
8        321                   .106
9        435                   .161
10       549                   .229
11       654                   .311
12       735                   .403
13       777                   .500
14       777                   .597
15       735                   .689
16       654                   .771
17       549                   .839
18       435                   .894
19       321                   .934
20       222                   .962
21       143                   .980
22       84                    .990
23       45                    .996
24       22                    .998
25       9                     .9995
26       3                     .9999
27       1                     1.000
TOTAL    8000

Q: I can see the arithmetic but I do not see why you do these multiplications and additions or what, if anything, you have accomplished.
A: These arithmetic steps follow from the basic rules for probability; this is explained in Appendix I. As for what we have accomplished, well, we are now all set to construct our test of significance. For this we use Table VI. In this table we have added a column labeled Cum P (“Cumulative Probability”) which gives us the chance that U_ABC or fewer inversions will occur. Notice that for U_ABC = 5 the value of Cum P is 0.020. From this we can set up a rule for rejection, a rule which will tell us when to reject the (Null) hypothesis that an agent is a dud. Thus suppose that we use the rule for rejection: reject the (Null) hypothesis that the test agent is a dud if U_ABC is 5 or less. In other words, if U_ABC is 5 or less we will say that the test agent is not a dud. Of course, we will sometimes be mistaken; now and then we will say an agent is not a dud when it happens to be a dud. How often will we make this kind of mistake (i.e., a false-positive conclusion) if we use the above rule? From Cum P we see that if an agent is a dud we will make this false claim about 2 per cent of the time. Notice that this considers only departures in a favorable direction (so it is called a one-tailed test). Most of the common significance tests are two-tailed tests (i.e., pick up departures in either direction). The significance test presented here can be easily converted to a two-tailed test by also rejecting if U_ABC is 22 or more, the mirror image of 5 or less in Table VI (in which event the risk of rejection under the null hypothesis would be 2 X 0.02 = 0.04). Hence, a two-tailed test based on inversions corresponds approximately to the usual 5 per cent level (two-tailed) statistical test. Notice also that we have really made just one major assumption in the whole process, namely that we have random samples.
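The “addition and multiplication” behind Tables III to VI amounts to multiplying out the stage distributions and summing along diagonals (a discrete convolution). A minimal sketch in Python, taking the per-stage counts from Table II as given; the helper names `convolve` and `cum_prob` are only illustrative:

```python
# Outcomes per stage with U = 0..9 inversions (Table II, in twentieths).
STAGE = [1, 1, 2, 3, 3, 3, 3, 2, 1, 1]

def convolve(p, q):
    """Multiply row by column (as in Table III) and add along the diagonals."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

two_stage = convolve(STAGE, STAGE)        # U_AB, in four hundredths (Table IV)
three_stage = convolve(two_stage, STAGE)  # U_ABC, in 8,000ths (Table VI)

def cum_prob(counts, u):
    """Chance of u or fewer inversions when the agent is a dud."""
    return sum(counts[: u + 1]) / sum(counts)

print(two_stage)
print(cum_prob(three_stage, 5))      # one-tailed risk for the rule "5 or less"
print(sum(three_stage[22:]) / 8000)  # matching upper tail, "22 or more"
```

The two tail probabilities printed at the end are equal because the distribution of U_ABC is symmetric about 13.5, which is why the two-tailed risk is simply twice the one-tailed risk.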
Q: But I still do not see what this has to do with sequentialization. Is this significance test based on a full trial of 9 patients?
A: Yes, this is a fixed-sample test. But we are now ready to see what happens if a stop-or-go scheme is added to the protocol.

III. SEQUENTIALIZING THE HYPOTHETICAL CLINICAL TRIAL

Q: It still bothers me that there are only 9 patients in the trial.
A: Remember this is just an example, a working model if you like. The question “how big a series (fixed sample) is needed for an adequate test of an agent?” is an important one, but to answer it would take us too far afield. However, whatever size sample is judged adequate we can apply the same principles of sequentialization which we will use in this example. So in that sense we will now assume that 9 patients would provide an adequate trial in our hypothetical example. We will also suppose that we would be interested in those compounds which we can demonstrate are not duds (using the rule of rejection derived above).
Q: Well all right. Now what do we want from a stop-or-go scheme?
A: First we would want to carry through a full trial those agents which will subsequently turn out to be “significant” by our rule for rejection. Second, we would like to drop those agents which will not make the grade at an early stage of the study, after the first stage for example.
Q: But you do not know which agents will be “significant” if you only know what happens in the first stage.
A: No we do not. But we can spot agents which are not very likely to turn out to be “significant.” Suppose that an agent has 6 inversions in the first stage, then it cannot turn out to be significant by our rule for rejection (which allows at most 5 inversions in all 3 stages). So we might as well drop this agent.
Q: Oh, I see. And if an agent accumulated 6 or more inversions in the first two stages it could be dropped at this point.
A: Yes, and there you have one possible stop-or-go rule: drop an agent if, at any stage, it has a total of 6 or more inversions. With this rule, any agent which could possibly pass the test would get a full trial so that, so far as “significant” agents go, the protocol plus the stop-or-go rule would have the same performance as the original scheme. At the same time we could cut the work load a bit.
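The “6 or more and out” rule is easy to state as a small routine. A minimal sketch in Python, assuming the inversions are reported stage by stage; the function name `stop_or_go` and its return format are hypothetical, chosen only for illustration:

```python
def stop_or_go(stage_inversions, stop_at=6):
    """Apply the rule 'drop the agent once the running total reaches stop_at'.

    stage_inversions: inversions observed in each completed stage, in order.
    Returns (decision, stage at which the decision was reached, running total).
    """
    total = 0
    for stage, u in enumerate(stage_inversions, start=1):
        total += u
        if total >= stop_at:
            return ("stop", stage, total)
    return ("go", len(stage_inversions), total)

print(stop_or_go([6]))        # dropped after the first stage
print(stop_or_go([3, 4]))     # dropped after the second stage
print(stop_or_go([1, 2, 2]))  # full trial completed with 5 inversions in all
```

An agent that is still “go” after all three stages has at most 5 inversions, so it is exactly the kind of agent the fixed-sample rule for rejection would call “significant.”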
Q: How much would you save?
A: To see this we can go back to the tables we have previously calculated. Consider Table II. With the “6-and-out” stop-or-go rule we would drop 35 per cent of the duds after the first stage. To see this we add up the probabilities for the U_A’s where we would stop the trial:

\[ P(\text{Drop after first stage if dud}) = 0.15 + 0.10 + 0.05 + 0.05 = 0.35 \]

When some of the agents are dropped after the first stage, Table IV is modified. We can draw a line to indicate the stop rule for the first stage (the vertical solid line in Table IV). In adding along the diagonals of Table III, we do not count numbers to the right of the solid vertical line. We can also draw a solid diagonal line to indicate the stop rule for the second stage. Adding up the numbers below this diagonal line gives us:

\[ P(\text{Drop after second stage if dud}) = \frac{30 + 35 + 36 + 33 + \text{etc.}}{400} = \frac{202}{400} = 0.505 \]

Finally note that:

\[ P(\text{goes into second stage if dud}) = 1 - P(\text{Drop after first stage if dud}) = 1 - 0.35 = 0.65 \]

and

\[ P(\text{goes into third stage if dud}) = P(\text{goes into second stage if dud}) - P(\text{Drop after second stage if dud}) = 0.650 - 0.505 = 0.145 \]

For a Dud

Stage    Probability of Continuing to Next Stage    Probability of Stopping at Stage
1        0.650                                      0.350
2        0.145                                      0.505
3                                                   1.000

So that the mean sample size for dud agents (MSS if dud):

\[ \text{MSS if dud} = \text{no. observations per stage} \times [1 + P(\text{goes into second stage if dud}) + P(\text{goes into third stage if dud})] = 3 \times [1 + 0.650 + 0.145] = 5.385 \]

Hence instead of using 9 patients on dud agents, as in the fixed sample plan, we would use, on the average, about 5.4 patients and thus save about 40 per cent of this work load.
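The mean sample size can also be computed mechanically by following the running total of inversions from stage to stage and discarding the paths that hit the stop rule. A minimal sketch in Python using exact fractions; the per-stage counts come from Table II, and the function names are only illustrative:

```python
from fractions import Fraction

# Outcomes per stage with U = 0..9 inversions (Table II, in twentieths).
STAGE = [1, 1, 2, 3, 3, 3, 3, 2, 1, 1]

def continue_probs(stop_at=6, n_stages=3):
    """P(a dud is still in the trial after each interim look)."""
    total = sum(STAGE)
    alive = {0: Fraction(1)}  # running total of inversions -> probability
    probs = []
    for _ in range(n_stages - 1):
        nxt = {}
        for t, p in alive.items():
            for u, c in enumerate(STAGE):
                if t + u < stop_at:  # the agent survives this look
                    nxt[t + u] = nxt.get(t + u, Fraction(0)) + p * Fraction(c, total)
        alive = nxt
        probs.append(sum(alive.values()))
    return probs

def mean_sample_size(stop_at=6, per_stage=3, n_stages=3):
    """Patients used on a dud, on the average, under the stop rule."""
    return per_stage * (1 + sum(continue_probs(stop_at, n_stages)))

print([float(p) for p in continue_probs()])  # chance of entering stages 2 and 3
print(float(mean_sample_size()))             # MSS if dud
```

Working in fractions keeps the arithmetic in the same whole-number spirit as Tables III and IV and avoids rounding questions when the result is compared with the figures in the text.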
Q: Can you save more than this with some other stop-or-go rule?
A: Yes, if we are willing to lose a small proportion of the agents which would be significant in a full trial. For example, we could set the stop rule in the first stage at 5 instead of 6. It would be rather rare for an agent to give 5 inversions on the first stage and no inversions at all in the last two stages. We can try various stop rules and calculate their performance from the tables. Thus the MSS for duds with the stop rule set at 5 in the first stage instead of 6 would be about 4.75 patients.
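Trying out various stop rules can be done by listing every combination of stage results. The sketch below, in Python, makes two assumptions that go beyond the text: the same stop value is applied after both the first and the second stage, and the “loss” is counted under the dud hypothesis only (outcomes that would have passed the full 9-patient test but were dropped early). The function name `evaluate_rule` is hypothetical.

```python
from itertools import product

# Outcomes per stage with U = 0..9 inversions (Table II, in twentieths).
STAGE = [1, 1, 2, 3, 3, 3, 3, 2, 1, 1]

def evaluate_rule(stop_at, per_stage=3, reject_at_most=5):
    """Return (MSS if dud, would-be-significant outcomes dropped early,
    outcomes significant in a full trial), the counts being in 8,000ths."""
    patients = 0
    lost = 0
    significant = 0
    for ua, ub, uc in product(range(10), repeat=3):
        weight = STAGE[ua] * STAGE[ub] * STAGE[uc]
        if ua >= stop_at:
            stages = 1          # dropped after the first stage
        elif ua + ub >= stop_at:
            stages = 2          # dropped after the second stage
        else:
            stages = 3          # full trial
        patients += weight * per_stage * stages
        if ua + ub + uc <= reject_at_most:
            significant += weight
            if stages < 3:
                lost += weight
    return patients / sum(STAGE) ** 3, lost, significant

for stop_at in (6, 5):
    print(stop_at, evaluate_rule(stop_at))
```

Under these assumptions the value 6 loses nothing, while the value 5 brings the mean sample size for duds down to roughly the 4.75 patients quoted above at the price of occasionally dropping an agent that would have passed the full test. How often a genuinely effective agent would be lost is a separate question that this tabulation does not answer.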
IV. PROS AND CONS

Q: I can see how a stop-or-go rule provides an automatic, mechanical way to stop the trial of an agent. But it seems like a blind and brainless way to do research . . . it doesn’t leave any room for the investigator’s personal judgment.
A: This depends on how the rule is used. If the rule is taken as a “commandment” to be rigidly followed, then your objection would apply. I would prefer to think of it as a guide. For example, suppose an agent performed rather poorly most of the time but achieved an occasional spectacular result. Then, even if the stop rule said to drop the agent, the investigator could continue it in the trial. A stop-or-go rule, or any other routine procedure, should not prevent a researcher from keeping his eyes open for the unusual.
Q: What about the final decision about the agent? The significance test is based only upon the main response variable, but there is likely to be much more information obtained in the clinical trial. There may be other measures of response, and there would be information on toxicity or side effects. Is this additional information to be ignored?
A: Not at all. In fact, this is why I have emphasized the stop-or-go feature rather than the use of the sequential procedures in testing hypotheses or making decisions. For the agents which run through the complete trial, the investigator would probably want to make a more extensive analysis and to utilize whatever other information is available in making his recommendations. In other words, the clinician can use the stop-or-go scheme to save work, but he can deal with his final data just as he would if he had done the fixed-sample-size study.
Q: But then it could happen that agents which would have showed up as significant in the extensive analysis would be dropped in an early stage.
A: Yes, this is the price that is paid for the reduction of the work load. However, ordinarily there would be a fairly close relationship between the results by the “inversion” test and results by other analyses. So there may be a substantial cut in work load for a comparatively rare loss of agents which would have turned out to be significant by the more extensive analysis. If the researcher is concerned about this he can reduce his risk by using, say, the equivalent of a 10 per cent probability level in his stop-or-go scheme, although he intends to work at the 5 per cent level in his final analysis.
Q: Are there “hidden costs” of sequentialization?
A: Yes there are. The sequential schemes require continuous processing of data since the information from an early stage must come through in time to use it in subsequent stages. So it takes additional effort to set the scheme up and to keep it working smoothly. It may lead to some administrative confusion, haste, and possibly to deterioration of the data quality. For this reason it should never be forgotten that sequentialization is really of secondary importance.
Q: Why do you say that?
A: I think the primary responsibility of the clinical investigator is to obtain reliable, high quality data. After the study plan is set up to yield good data the investigator can worry about cutting costs. But to save money by lowering data quality would be “penny wise, pound foolish.” I don’t think that a “gimmick” like sequentialization should divert the researcher’s attention from his primary task. For this reason I would not recommend sequentialization for the neophyte or for a study where the essential protocol is not yet worked out. I think it is a mistake to try to sequentialize any clinical trial without first considering whether the situation is favorable or not.
Q: What is a favorable situation for sequentialization?
A: That is hard to say. One good omen is if, after setting up a fixed sample scheme, it breaks up easily and naturally into a sequence of stages. Then, too, the information from one stage should be processed in time for it to be available for subsequent stages (though not necessarily the very next stage). When it takes a long time for the response to the agent to be assessed, such as often happens in the chronic diseases such as cancer or heart disease, this makes it very difficult to set up a sequential scheme. There are other signs and portents that indicate whether or not a situation is ripe for sequentialization but the acid test is a preliminary run with a sequential scheme. Without an actual trial of the scheme it is hard to say what bugs will turn up.
Q: That is pretty vague. Can you be more definite?
A: We can put it this way. I would see the greatest utility for sequentialization in more or less routine screening of new agents. For example, if a doctor is running a continuing clinical trial of analgesics or other agents and if a stop-or-go scheme fits neatly into his existing routines, then it is probably worth a trial. I think it is important that the study be well organized and have strong central direction. I would be chary about sequentializing a cooperative study if, for example, some participants might delay sending in reports.
OTHER USES OF SEQUENTIALIZATION

Q: The purpose of sequentialization, as you have presented it, seems to be to drop dud agents out of a clinical trial as soon as possible. Is this all that the method can do?
A: No, there are other uses for sequential schemes in clinical trials. However, I will not be able to do much more than mention these other possibilities. For one thing, there is a second way to reduce the work load; we could cut short the trial of agents with spectacularly good performance. In other words, we can set up a slightly more complicated stop-or-go rule which will stop the trial of agents which either do poorly or else do especially well. For example, in the example of section II we could use a rule such as: Stop if there are no inversions after the second stage. I have not emphasized this type of stop rule because I think that ordinarily we would not save much work with such a rule; such agents would be rather rare, and very often we would like to have additional information on the performance of such agents. Sometimes, though, the saving in time might be important if immediate action were planned if a good agent turned up.
Q: Are there other uses?
A: Yes, we can sometimes save work and time by a very different strategy: we may try to increase the amount of relevant information that we obtain from each patient in the series. For example, suppose we are trying to estimate the relative potency of an analgesic agent with respect to morphine and we use dosages of morphine around 10 mg. Now if we choose a dosage of the test agent which will give us about the same response as this dose of morphine, we will have a somewhat better estimate of relative potency than if the response to the test agent is much higher (or much lower) than the response to morphine. At the start of the study we have to guess what the equivalent dosage of the new agent will be, but as the study progresses we can adjust the original dosage up or down so as to get a response more nearly equal to that of morphine. In this way, for a given number of patients we can get a better estimate of relative potency.4 To distinguish this type of sequential rule we might call it a “move rule” (as opposed to a stop-or-go rule).
Q: If you succeed in matching responses you would, in fact, have a direct estimate of relative potency since you would know the dosage equivalent of 10 mg. of morphine.
A: Yes, the scheme could be used as a method of estimation, but in practice its main importance is to “centrally locate” the observations. In other trials we are interested in achieving maximum response rather than a matching response. Here, too, we can use a “staircase” schedule of dosages to try for maximum effect (if we assume that the maximum response occurs at the maximum tolerated dose).
Q: Could you give an example?
A: Well, I can give an instance where sequentialization might have helped but where it was not used. In a controlled clinical trial of 6-mercaptopurine in acute leukemia in children a fixed dosage schedule was employed. After the trial it was suggested that a drastically increased dosage would have produced better results, and so a second trial with this new schedule is now underway. If each patient had been given his maximum tolerated dose in the first trial (by raising the dosage, by steps, until side effects appear and then backing down until they disappear), this second trial would have been unnecessary. However, the assumption that the maximum response occurs at the maximum tolerated dose may be questioned. If there is good reason to believe that the maximum response occurs at the maximum tolerated dose, then the staircase scheme would save work.
Q: Are there other places where sequential “move rules” might be used?
A: Yes, I will give one more example. When two or more drugs are employed simultaneously (i.e., combination therapies) the problem is to find the optimum combination of dosages for the agents. Potentially, some sequentialized schemes suggested by Box7 might be used to explore the response surface. These schemes employ a strategy much like the one you would naturally use to find the top of a mountain on a dark, foggy night. You would simply keep headed in the direction in which the slope was greatest (“steepest ascent”).

SPECIAL PROBLEMS IN SEQUENTIALIZATION

Q: Can sequential schemes be used if there is a long delay between the time a patient enters the study and the time the response is determined?
A: Delayed response is a serious but not impassable obstacle to sequentialization. Yet it is in just this kind of situation where the potential time-saving feature of sequential schemes may be of most importance. I am thinking here of evaluating therapies for cancer, where it may take years to say what has really happened. It is helpful to distinguish between short-term and long-term response. For example, the agents now used to treat acute leukemia in children produce a “short-term” response, “remission,” within a month or two. The “remission” may continue for a year or more, and so the duration of remission or the survival time for the patient is a “long-term” response. If we are interested in “short-term” response, either in itself or as a possible indicator of “long-term” response, then we might employ this measure as the basis for a sequential plan. Another possibility is to use what information is available on the “long-term” response even if it is incomplete.
Q: How could you do this?
A: Take survival for example and suppose a paired trial is run with one patient on Drug A (test agent) and the other on Drug B (control agent). If the patient on Drug A dies first this is an “inversion” whereas if the patient on Drug B dies first this is a “noninversion.” There is, however, the difficulty that the outcomes for the pairs become known in a different order than that in which the pairs enter the study. Consequently, there may be a bias introduced if “tied” observations are disregarded. For example it might be that more patients would be alive at 3 months for Drug B but that by 12 months the situation would be reversed and hence Drug A might be substantially better in the long run. However, using a sequential scheme based on the order in which the pair outcomes are determined would lead us to drop Drug A.
Q: Let’s go back to an earlier point you made, the risk of dropping a promising agent with a stop-or-go scheme. How great is this risk?
A: First of all we must remember that with the fixed-sample scheme, there is also this risk. The stop-or-go rule is chosen so as to have relatively little effect on this risk. However, it is important to have some idea of the magnitude of this risk, and there are two ways to approach this question. The first is theoretical. You will recall that in section II we calculated the chances that a dud would get dropped. We might do essentially the same thing for an agent which wasn’t a dud. For example, we might postulate that the agent was fully effective on 50 per cent of the patients and was a dud on the rest of the patients. Alternatively, we could proceed more empirically. If the investigator has previously tested a number of agents and has turned up a promising compound he might use the data on this compound. The procedure, which is like that of section II, is illustrated in Appendix II.
Q: One last question. If I want to find out more about sequential schemes where can I go?
A: Preferably to an experienced statistician since it may be a poor idea to pick a scheme out of a book or a paper. However, I have appended a short list of references which includes some of the published papers where sequential methods were used.* The “inversion” stop-or-go schemes are easy enough to construct on a do-it-yourself basis (Appendix III).
REFERENCES

1. Wald, A.: Sequential Analysis, New York, 1947, John Wiley & Sons, Inc.
2. Bross, I.: Sequential Medical Plans, Biometrics, The Biometric Society 8:188, 1952.
3. Bross, I.: Design For Decision, New York, 1953, The Macmillan Co.
4. Wallenstein, S. L., and Houde, R. W.: Clinical Analgesic Assay of Dihydrohydroxymorphinone, Fed. Proc. 15:1611, 1956.
5. Armitage, P.: Sequential Tests in Prophylactic and Therapeutic Trials, Quart. J. Med. (New Series) 23:255, 1954.
6. Armitage, P.: Sequential Procedures for Medical Trials, Biometrics 14:132, 1958.
7. Box, G. E. P.: The Exploration and Exploitation of Response Surfaces: Some General Considerations and Examples, Biometrics 10:16, 1954.
8. Kilpatrick, G. S., and Oldham, P. D.: Calcium Chloride and Adrenaline as Bronchial Dilators Compared by Sequential Analysis, Brit. M. J. 2:1388, 1954.
9. Newton, D. R., and Tanner, J. M.: N-acetylpara-aminophenol as an Analgesic: a Controlled Clinical Trial Using the Method of Sequential Analysis, Brit. M. J. 2:1096, 1956.
10. Snell, E. S., and Armitage, P.: Clinical Comparison of Diamorphine and Pholcodine as Cough Suppressants by a New Method of Sequential Analysis, Lancet 1:860, 1957.
11. Lasagna, L., and Meier, P.: Clinical Evaluation of Drugs, Ann. Rev. Med. 9:347, 1958.

*I am indebted to Professor William Cochran for this list.
J. Chron. Dis. September, 1958
APPENDIX I

The operations in section II stem directly from the 3 basic rules for probabilities.3 Let A and B be “events” and let P(A) represent “the probability that event A occurs” or “the probability of A.” Then:

Rule Ia:

P(A and not-A) = 0

Example: for a comparison of a single treated with a single control observation in a situation where “ties” do not occur:

P(an inversion occurs and an inversion does not occur) = 0

Rule Ib:

P(A or not-A) = 1

In the situation of Ia:

P(an inversion occurs or an inversion does not occur) = 1

Rule Ic:

0 ≤ P(A) ≤ 1, i.e., P(A) is 0, 1, or any value between.

Rule II: If two events are independent the following multiplicative rule holds:

P(A and B) = P(A) x P(B)

Example: to obtain the entries in the body of Table III the probabilities in the stubs are multiplied.

Rule III: If two events are mutually exclusive (i.e., cannot both occur) then the following additive rule holds:

P(A or B) = P(A) + P(B)

Example: To obtain the probability that U_AB takes a given value the probabilities of the combinations of U_A and U_B which lead to the given value of U_AB are added together. Notice that in Table III the cases where U_A + U_B = 3 (say) appear in a diagonal of the Table. Hence, by adding along the diagonal P(U_AB = 3) will be obtained.
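The diagonal sum described under Rule III can be written out as a short worked equation. This is only a sketch: it assumes the two stage counts U_A and U_B are independent and that each can take the values 0 through 3 or more, so that exactly four cells fall on the diagonal.

```latex
P(U_{AB} = 3) \;=\; \sum_{i + j = 3} P(U_A = i)\, P(U_B = j)
\;=\; P(U_A{=}0)P(U_B{=}3) + P(U_A{=}1)P(U_B{=}2) + P(U_A{=}2)P(U_B{=}1) + P(U_A{=}3)P(U_B{=}0)
```

Each product is an application of Rule II to one cell of the table, and the four cells are mutually exclusive, so Rule III allows them to be added.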
APPENDIX II

The following example illustrates the use of empirical data to estimate the chance of missing an effective agent (the “Type II error”) or the chance of not missing such an agent (the “power”). I have no clinical data to use here so as the next best thing I will use some data from a cancer chemotherapy animal screen.*

*With the kind permission of Dr. Phillip Merker.

TABLE A-I. BASIC DATA

RESULTS (TUMOR WEIGHT, GM.)
EXPERIMENT   TREATED   CONTROL

0.4 1.0 1.6
0.9 1.4 1.7
2.4 2.0 1.7
2.6 3.0 3.6
0.6
1.5
2.1 1.4
2.5
2.5 2.4 1.3
1.8
2.3 2.8 2.9
1.5
1.3 3.2 2.4
0.8 1.1 1.6
2.5
1.6 0.6
1.5 2.2

These data can be used to set up a table analogous to Table II.

TABLE A-II

NUMBER OF            NUMBER OF OUTCOMES      PROBABILITY OF
INVERSIONS (U)       WITH U INVERSIONS       U INVERSIONS
      0                      1                    0.17
      1                      2                    0.33
      2                      2                    0.33
      3                      1                    0.17
    Total                    6                    1.00

From this, a table analogous to Table III may be constructed:

TABLE A-III

                 U_A:    0     1     2     3
                 P':     1     2     2     1
U_B    P'
 0      1                1     2     2     1
 1      2                2     4     4     2
 2      2                2     4     4     2
 3      1                1     2     2     1

and similarly

TABLE A-IV

U_AB    0    1    2    3    4    5    6    TOTAL
P'      1    4    8   10    8    4    1       36

and

TABLE A-V

                 U_AB:   0     1     2     3     4     5     6
                 P':     1     4     8    10     8     4     1
U_C    P'
 0      1                1     4     8    10     8     4     1
 1      2                2     8    16    20    16     8     2
 2      2                2     8    16    20    16     8     2
 3      1                1     4     8    10     8     4     1

giving
TABLE A-VI

U_ABC    0    1    2    3    4    5    6    7    8    9    TOTAL
P'       1    6   18   35   48   48   35   18    6    1      216

The performance is summarized in Table A-VII.

TABLE A-VII. PERFORMANCE OF TEST (STANDARD COMPOUND)

STAGE       PROBABILITY OF REJECTING    PROBABILITY OF CONTINUING
            AT STAGE                    TO NEXT STAGE
1           0.0000                      1.0000
2           0.0278                      0.9722
3           0.2778
Over-all    0.3056
Table A-VII can be compared with Table VII. It shows that there is very little chance of stopping the trial (i.e., only a 3 per cent chance) if the agent is as effective as the standard. However the over-all chance of missing an agent as good as the standard is 28 per cent which suggests that the fixed sample-size procedure should use more animals.
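The arithmetic behind Tables A-IV and A-VI can be sketched in a few lines of Python. The starting numerators 1, 2, 2, 1 are the P' stub values of Table A-III; the function names are illustrative only. The cutoff of 6 inversions used in the last lines is an assumption, chosen because it gives 0.0278 and 0.2778, the same figures that appear in Table A-VII; the cutoff itself is not stated in this appendix.

```python
from fractions import Fraction

def convolve(p, q):
    """Numerators for the sum of two independent inversion counts."""
    out = [0] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            # Rule II: multiply the stub entries for one cell.
            # Rule III: add the cells lying on the diagonal i + j.
            out[i + j] += p_i * q_j
    return out

def upper_tail(counts, k):
    """P(U >= k) as an exact fraction of the total of the numerators."""
    return Fraction(sum(counts[k:]), sum(counts))

# P' for a single experiment, U = 0..3 (stub of Table A-III, total 6).
one_stage = [1, 2, 2, 1]

two_stage = convolve(one_stage, one_stage)    # numerators out of 36
three_stage = convolve(two_stage, one_stage)  # numerators out of 216

print(two_stage)
print(three_stage)

# Assumed cutoff of 6 inversions (not stated in the appendix).
print(round(float(upper_tail(two_stage, 6)), 4))
print(round(float(upper_tail(three_stage, 6)), 4))
```

The same two helper functions could be reused with any other set of empirical stage probabilities, provided the stages can reasonably be treated as independent.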
APPENDIX III

A stop-or-go rule based on the number of inversions can easily be constructed on a do-it-yourself basis without using the auxiliary tables (e.g., Tables I-VII). However, to see the performance of a scheme the tables are needed. The following steps will lead to a stop-or-go scheme which will lose no power relative to the original fixed-sample plan but may save some time and effort.

Step I. Set up a suitable fixed sample scheme consisting of m stages (m = 3 in section II) with n0 control observations and n1 observations on the test agent at each stage (n1 = n0 = 3 in section II).

Step II. Enter Table III-1 with n1 and n0 and read the expected number of inversions in a stage, Ei, and the variance in a stage, Vi. Thus for n1 = n0 = 3, Ei = 4.5 and Vi = 5.25.

Step III. Calculate the expected number of inversions, E, and the variance, V, for m stages from:

$$E = mE_i = 3(4.5) = 13.5$$

$$V = mV_i = 3(5.25) = 15.75$$

Step IV. Calculate the critical number for the fixed sample study (U_m*). For a significance test corresponding to the customary two-tailed 5 per cent level test:

$$U_m^* = E - 2\sqrt{V} = 13.5 - 2\sqrt{15.75} = 13.5 - 7.94 = 5.56$$

The value of U_m* obtained from the above formula may not be an integer. To be conservative the decimal fraction would be dropped. Thus in the above example U_m* would be taken as U_m* = 5.

Step V. Employ the stop-or-go rule: Stop the study when the number of inversions at any stage exceeds U_m*.
Although this step-by-step procedure provides a rule it does not give the performance of the rule. The investigator is advised to use the table method of section II and Appendix II to study the performance of this rule.
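The steps of this appendix can be sketched in Python as follows. The function names are illustrative, not from the paper. Two assumptions are made and should be checked against section II before use: the stage moments are computed from the formulas given with Table III-1 rather than read from the table, and "the number of inversions at any stage" is taken to mean the running total accumulated up to that stage.

```python
import math

def stage_moments(n0, n1):
    """Expected inversions Ei and variance Vi for one stage (Table III-1 formulas)."""
    e_i = n0 * n1 / 2
    v_i = n0 * n1 * (n0 + n1 + 1) / 12
    return e_i, v_i

def critical_number(m, n0, n1):
    """Steps II-IV: critical number for m identical stages, fraction dropped."""
    e_i, v_i = stage_moments(n0, n1)
    e, v = m * e_i, m * v_i
    return math.floor(e - 2 * math.sqrt(v))

def first_stopping_stage(running_totals, u_star):
    """Step V: first stage whose running inversion total exceeds u_star, else None.

    Assumes running_totals[k] is the total count through stage k + 1.
    """
    for stage, total in enumerate(running_totals, start=1):
        if total > u_star:
            return stage
    return None

u_star = critical_number(m=3, n0=3, n1=3)
print(u_star)
print(first_stopping_stage([2, 6, 9], u_star))   # hypothetical running totals
print(first_stopping_stage([1, 3, 5], u_star))   # hypothetical running totals
```

With the values of section II (m = 3, n0 = n1 = 3) the sketch reproduces the critical number 5 obtained in Step IV. The running totals in the last two lines are made-up inputs used only to show the rule stopping early in one case and running to completion in the other.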
TABLE III-1. EXPECTATIONS AND VARIANCES FOR A STAGE

                    NUMBER OF OBSERVATIONS IN THE TEST SERIES (n1)
n0            1       2       3       4       5       6       7       8       9      10
  1  Ei     0.5     1.0     1.5     2.0     2.5     3.0     3.5     4.0     4.5     5.0
     Vi    0.25    0.67    1.25     2.0    2.92     4.0    5.25    6.67    8.25    10.0
  2  Ei     1.0     2.0     3.0     4.0     5.0     6.0     7.0     8.0     9.0    10.0
     Vi    0.67    1.67     3.0    4.67    6.67     9.0   11.67   14.67    18.0   21.67
  3  Ei     1.5     3.0     4.5     6.0     7.5     9.0    10.5    12.0    13.5    15.0
     Vi    1.25     3.0    5.25     8.0   11.25    15.0   19.25    24.0   29.25    35.0
  4  Ei     2.0     4.0     6.0     8.0    10.0    12.0    14.0    16.0    18.0    20.0
     Vi     2.0    4.67     8.0    12.0   16.67    22.0    28.0   34.67    42.0    50.0
  5  Ei     2.5     5.0     7.5    10.0    12.5    15.0    17.5    20.0    22.5    25.0
     Vi    2.92    6.67   11.25   16.67   22.92    30.0   37.92   46.67   56.25   66.67
  6  Ei     3.0     6.0     9.0    12.0    15.0    18.0    21.0    24.0    27.0    30.0
     Vi     4.0     9.0    15.0    22.0    30.0    39.0    49.0    60.0    72.0    85.0
  7  Ei     3.5     7.0    10.5    14.0    17.5    21.0    24.5    28.0    31.5    35.0
     Vi    5.25   11.67   19.25    28.0   37.92    49.0   61.25   74.67   89.25   105.0
  8  Ei     4.0     8.0    12.0    16.0    20.0    24.0    28.0    32.0    36.0    40.0
     Vi    6.67   14.67    24.0   34.67   46.67    60.0   74.67   90.67   108.0  126.67
  9  Ei     4.5     9.0    13.5    18.0    22.5    27.0    31.5    36.0    40.5    45.0
     Vi    8.25    18.0   29.25    42.0   56.25    72.0   89.25   108.0  128.25   150.0
 10  Ei     5.0    10.0    15.0    20.0    25.0    30.0    35.0    40.0    45.0    50.0
     Vi    10.0   21.67    35.0    50.0   66.67    85.0   105.0  126.67   150.0   175.0
The entries were calculated from:

$$E_i = \frac{n_0 n_1}{2}, \qquad V_i = \frac{n_0 n_1 (n_0 + n_1 + 1)}{12}$$

See Mann, H. B., and Whitney, D. R.: On a Test of Whether One of Two Random Variables Is Stochastically Larger Than the Other, Annals of Mathematical Statistics 18:50-60, 1947.

Table III-1 may also be used if the stages do not have the same n0 and n1. In this case E and V are obtained by summing the Ei’s and Vi’s directly. This method may also be employed in adjusting the final significance test if observations are lost due to patients dying, moving away, or refusing to abide by protocol.
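The summing of stage values for unequal stages can be sketched as below. The function names and the example stage sizes are hypothetical; the third stage is given only two control observations to imitate the loss of one observation.

```python
def stage_moments(n0, n1):
    """Ei and Vi for a single stage, from the formulas given with Table III-1."""
    e_i = n0 * n1 / 2
    v_i = n0 * n1 * (n0 + n1 + 1) / 12
    return e_i, v_i

def pooled_moments(stages):
    """Sum Ei and Vi over stages given as (n0, n1) pairs, which need not be equal."""
    moments = [stage_moments(n0, n1) for n0, n1 in stages]
    e = sum(e_i for e_i, _ in moments)
    v = sum(v_i for _, v_i in moments)
    return e, v

# Hypothetical trial: two full stages, then a stage with one control lost.
stages = [(3, 3), (3, 3), (2, 3)]
e, v = pooled_moments(stages)
print(e, v)
```

The resulting E and V would then go into Step IV of this appendix in place of mEi and mVi. Spot checks of `stage_moments` against the table (for example n0 = n1 = 10) are a cheap way to confirm the formulas have been typed correctly.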