Sequential clinical trials

Special Article

Statistics

IRWIN D. J. BROSS, Ph.D.
NEW YORK, N.Y.

From the Cornell University Medical College*
(Received for publication June 7, 1958.)

I. INTRODUCTION

My purpose here is to discuss a class of techniques called “sequential procedures” which are sometimes useful in clinical trials. In an effort to improve communication I will use an informal question-and-answer style:

Q: What is a sequential procedure?

A: In a broad sense, a sequential study is one where the investigator tries to profit from the experience he accumulates as the study goes along. Thus, a study is done in a sequence of steps (or “stages”). At the completion of each stage the researcher pauses and reviews what has happened up to this point. After considering what has already occurred, he makes decisions about the next stage.

Q: What kind of decisions are you talking about?

A: I will be mainly concerned here with a decision as to whether to stop the trial or to go ahead with it. However, there are other decisions which can be made in this step-by-step or sequential fashion. For example, the dosage of a drug can be increased or lowered, or other changes in protocol may be made.

Q: That hardly seems like a new idea.

A: Oh, it isn’t; almost any good investigator intuitively uses a sequential approach. The statisticians have simply taken a familiar idea and put it to work in an explicit and systematic fashion. The rules of the game are written out so that the investigator has an objective guide for his decisions and does not have to rely entirely on his personal judgment. The formalization of this old idea is a comparatively new development which dates back to World War II and the work of Abraham Wald.¹ The original application was to sampling inspection of incoming lots of such things as ball bearings or ammunition.

*Department of Public Health and Preventive Medicine, The New York Hospital-Cornell Medical Center. Supported in part by grant from Rockefeller Foundation.

J. Chron. Dis. September, 1958

Q: That seems like a long way from the clinical trial of a new drug.

A: It is, and naturally the original procedures designed for industrial inspection have to be adapted to the much different situations encountered in the medical field. Sometimes the older devices cannot be fitted into the clinical trials, so we have to develop new ones. One of the factors that has hampered the use of sequential methods in medical research has been a lack of appreciation of this very point: statistical methods have to be tailored to fit the actual clinical situation, and sequential procedures are no exception to this rule.

Q: I take it, then, that sequential procedures have not been an unqualified success in the medical field.

A: So they have not. The sequential trials which have been published in the medical literature can be counted on your fingers. However, the current trend toward the controlled clinical trial as an instrument for evaluation of new drugs (and other therapeutic measures) is likely to lead to a wider use of sequential methods. So at least a nodding acquaintance with the notion of a sequential procedure should be of use to an experimenter or a critical reader.

Q: Why do you think that these methods will come into wider use?

A: Largely it’s a matter of costs, not just dollars and cents but costs in time and effort as well. Thus suppose we have a sound protocol for testing chemotherapeutic agents for cancer that calls for a fixed number of patients to be tested on each compound. In certain favorable situations sequentialization of the protocol may do two things. First, it may give a trial which is just about as good as the original “fixed sample size” protocol insofar as detection of promising agents goes. Second, it may cut the average sample size required per agent by as much as 50 per cent. Cutting the work load may save money, or it may permit more agents to be tested with available resources. This is especially important nowadays because of the developing bottleneck at the clinical trial stage.

Q: You say that “sequentialization” of a protocol can sometimes save money, time, and effort. What do you mean by “sequentialization”?

A: If we start with a good protocol based on a fixed number of patients for each test agent we may be able, without drastic changes in the original protocol, to add a stop-or-go sequential scheme.

Q: What’s that?

A: A stop-or-go scheme is simply a rule which, at each stage, tells us whether to go on with the test of the particular agent or to stop the trial. Such a scheme presupposes that the patients enter the study, either individually or in batches, over a period of time so that the total series can be regarded as a sequence of smaller sub-experiments or stages. For example, if the original protocol called for a series of 30 patients, the patients might come into the study at the rate of 6 per month over a 5-month period. If some assessment of the response to the test agent of the first 6 patients can be made within the first month, then this information may be used as the basis for a stop-or-go rule. Depending on the initial results we might either continue the trial or stop it. Notice that with sequentialization there will be different numbers of patients tested on the different agents.

Q: But what’s the point of the stop-or-go scheme?

A: If the investigator makes a clinical trial of a number of agents some of them will be duds, no better than the agent in the control series, while others will be promising. Of course, there is usually a favorable report on an agent from laboratory animal studies before it would be considered for clinical testing, but even so most of the agents are likely to turn out to be duds in human beings. So what we would want to do is to drop the duds as soon as possible, to use the minimum number of patients on the duds. At the same time we would usually want to carry the promising agents through the full test.

II. A HYPOTHETICAL CLINICAL TRIAL: FIXED SAMPLE SIZE

Q: You have said that a stop-or-go scheme aims at using as few patients as possible on dud agents and at the same time doing a full test on promising agents. But I don’t see how a stop-or-go scheme can work since you don’t know in advance which are duds and which are not. If you did, you wouldn’t have to do a clinical trial at all.

A: I think the easiest way to see how a stop-or-go scheme works is to watch one in operation. To keep things simple, I will have to use a rather unrealistic example, a very small hypothetical study, if you do not mind.

Q: Well, I do not care much for hypothetical studies.

A: Neither do I. However, the little hypothetical study will serve to illustrate some principles of sequentialization that carry over to actual studies. Perhaps I can build a plausible setting for the example. Let’s suppose that the response to the agents under test is to be assessed by a complex and possibly erratic chemical determination on blood samples. This lab work is the limiting factor of the study since the doctor can run only 6 determinations a week. He plans to run 3 determinations on patients receiving the test agent (Drug A) and 3 on control patients (Drug B), where the allocation of patients to the drugs is randomized. He feels that 9 patients (i.e., 3 stages or weeks) will provide an adequate test of the agent. Before we consider sequentialization of this study plan, let us examine a possible method of analyzing the fixed-sample (9-patient) trial.

Q: Isn’t the 9-patient trial too small for any valid analysis?

A: The study is smaller than would ordinarily be desirable, but even with 9 patients we can set up a legitimate significance test which will correspond to the customary (two-tailed) 5 per cent level test. Moreover, the significance test will be “distribution free”; we will not need to assume that the measurements will be normally distributed, for example. This is how it works. Let us suppose that an effective agent acts to lower the laboratory readings.
Then if we compare the measurement for a patient given Drug A with the reading for a control patient we will say there is an “inversion” if the patient given Drug A has a higher reading. Since each patient given Drug A can be compared with each of the 3 control patients, there will be 9 possible comparisons in a given week (stage). The number of inversions in this stage (say “U”) may therefore be any number from 0 to 9. With an effective agent we would expect to find only a few “inversions,” perhaps one or two.

Q: What would we expect to find with a dud agent?

A: I thought I was asking the questions. Well, it is an easy question to answer if you know the trick. First, we replace each original measurement by its rank (i.e., the smallest observation would have rank “1”), since only the rank order of the observations matters here. Second, we write down all the possible outcomes. These outcomes are shown in Table I. For any outcome we can determine the number of inversions.

TABLE I

Ranks of Drug A   Ranks of Drug B   Number of
patients          patients          inversions
1 2 3             4 5 6             0
1 2 4             3 5 6             1
1 2 5             3 4 6             2
1 2 6             3 4 5             3
1 3 4             2 5 6             2
1 3 5             2 4 6             3
1 3 6             2 4 5             4
1 4 5             2 3 6             4
1 4 6             2 3 5             5
1 5 6             2 3 4             6
2 3 4             1 5 6             3
2 3 5             1 4 6             4
2 3 6             1 4 5             5
2 4 5             1 3 6             5
2 4 6             1 3 5             6
2 5 6             1 3 4             7
3 4 5             1 2 6             6
3 4 6             1 2 5             7
3 5 6             1 2 4             8
4 5 6             1 2 3             9

Now if the test agent is a dud (and if we randomize), then each of the outcomes is equally likely to occur. Since in all there are 20 possible outcomes, there is a probability of 0.05 (i.e., one chance in twenty) that a given outcome will occur. So we can find the chance that a given number of inversions (say U_A in the first stage) will occur by:

Probability (U_A inversions) = (Number of outcomes with U_A inversions) / (Total number of outcomes)

This step is summarized in Table II.

TABLE II

Number of        Number of outcomes    Probability of
inversions (U)   with U inversions     U inversions
0                1                     0.05
1                1                     0.05
2                2                     0.10
3                3                     0.15
4                3                     0.15
5                3                     0.15
6                3                     0.15
7                2                     0.10
8                1                     0.05
9                1                     0.05
Total            20                    1.00

From Table II we can see what will happen if the test agent is a dud. For instance, we see that only 5 per cent of the time will we find no inversions from a dud agent. Sixty per cent of the time (0.60 = 0.15 + 0.15 + 0.15 + 0.15) we would find between 3 and 6 inversions. So much for the first stage. I think you can see that if we let U_B be the number of inversions in the second stage then we would have the same probabilities for U_B as for U_A. Are you following so far?

Q: I suppose so. Just give me a while to study Tables I and II. Where is this all leading?

A: The next step is to see what happens (always assuming the agent is a dud) to the total number of inversions in the first two stages; call this U_AB (i.e., U_AB = U_A + U_B). If the stages are independent, that is, the results of one stage do not affect another stage, we can get the answer by a little addition and multiplication.

TABLE III

              U_A:   0   1   2   3   4   5   6   7   8   9
              P':    1   1   2   3   3   3   3   2   1   1
U_B   P'
 0    1              1   1   2   3   3   3   3   2   1   1
 1    1              1   1   2   3   3   3   3   2   1   1
 2    2              2   2   4   6   6   6   6   4   2   2
 3    3              3   3   6   9   9   9   9   6   3   3
 4    3              3   3   6   9   9   9   9   6   3   3
 5    3              3   3   6   9   9   9   9   6   3   3
 6    3              3   3   6   9   9   9   9   6   3   3
 7    2              2   2   4   6   6   6   6   4   2   2
 8    1              1   1   2   3   3   3   3   2   1   1
 9    1              1   1   2   3   3   3   3   2   1   1
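The counting behind Tables I and II can be checked mechanically. The following is a minimal sketch, not part of the original article: it enumerates every way of assigning ranks 1 to 6 to three Drug A and three Drug B patients, counts the inversions for each, and tabulates the null (dud) distribution. The function name `inversion_distribution` and its parameters are illustrative assumptions.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def inversion_distribution(n_test=3, n_control=3):
    """Exact distribution of the inversion count U for one stage of a dud agent.

    An inversion is a (test, control) pair in which the test patient has the
    higher rank. Under randomization every rank assignment is equally likely.
    """
    ranks = range(1, n_test + n_control + 1)
    counts = Counter()
    for test_ranks in combinations(ranks, n_test):
        control_ranks = [r for r in ranks if r not in test_ranks]
        u = sum(1 for a in test_ranks for b in control_ranks if a > b)
        counts[u] += 1
    total = sum(counts.values())  # 20 outcomes when n_test = n_control = 3
    return {u: Fraction(counts[u], total) for u in sorted(counts)}


stage = inversion_distribution()
for u, p in stage.items():
    print(u, float(p))
```

Working with `Fraction` keeps the probabilities exact, so they can be compared directly with the twentieths used in the text.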

To do this we set up Table III. In the horizontal stub we write the number of inversions in the first stage (U_A) and the probability, P', (in twentieths) of U_A. This will allow us to work with whole numbers instead of fractions. In the vertical stub we do the same for the number of inversions in the second stage (U_B). To get the numbers in the body of the table we simply multiply the probability (P') for the row by the probability for the column. For example, for U_B = 3 and U_A = 2, the P' for U_B = 3 is 3 and the P' for U_A = 2 is 2, so the entry is 2 × 3 = 6 in the table.

From Table III we obtain Table IV by adding up the numbers along the diagonals (indicated by dotted lines in Table III). This gives us the probability (in four hundredths) of getting a total of U_AB inversions in the first two stages. Thus for U_B = 0 and U_A = 3, P' = 3. We add the numbers 3, 2, 2, 3 (along this diagonal) to find that U_AB = 3 has a P' = 10.

TABLE IV

U_AB:   0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18
P':     1   2   5  10  16  24  33  40  45  48  45  40  33  24  16  10   5   2   1

TABLE V

            U_AB:   0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18
            P':     1   2   5  10  16  24  33  40  45  48  45  40  33  24  16  10   5   2   1
U_C   P'
 0    1             1   2   5  10  16  24  33  40  45  48  45  40  33  24  16  10   5   2   1
 1    1             1   2   5  10  16  24  33  40  45  48  45  40  33  24  16  10   5   2   1
 2    2             2   4  10  20  32  48  66  80  90  96  90  80  66  48  32  20  10   4   2
 3    3             3   6  15  30  48  72  99 120 135 144 135 120  99  72  48  30  15   6   3
 4    3             3   6  15  30  48  72  99 120 135 144 135 120  99  72  48  30  15   6   3
 5    3             3   6  15  30  48  72  99 120 135 144 135 120  99  72  48  30  15   6   3
 6    3             3   6  15  30  48  72  99 120 135 144 135 120  99  72  48  30  15   6   3
 7    2             2   4  10  20  32  48  66  80  90  96  90  80  66  48  32  20  10   4   2
 8    1             1   2   5  10  16  24  33  40  45  48  45  40  33  24  16  10   5   2   1
 9    1             1   2   5  10  16  24  33  40  45  48  45  40  33  24  16  10   5   2   1
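The multiply-then-add-along-diagonals device is what is now usually called a discrete convolution. As a rough sketch (the helper name `convolve` is an assumption, not something from the article), the two-stage P' values in four hundredths can be reproduced from the one-stage P' values in twentieths:

```python
def convolve(p, q):
    """Counts for the sum of two independent inversion totals.

    Each product p[i] * q[j] is one cell of the multiplication table;
    adding cells with the same i + j is the sum along one diagonal.
    """
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


# One-stage P' in twentieths for U = 0..9 (the values summarized in Table II).
STAGE = [1, 1, 2, 3, 3, 3, 3, 2, 1, 1]

two_stage = convolve(STAGE, STAGE)  # P' in four hundredths for U_AB = 0..18
print(two_stage)
print(sum(two_stage))  # 400 equally likely two-stage outcomes
```

Using whole-number counts rather than decimals mirrors the article's own choice of working in twentieths and four hundredths.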

This same device can be used for any number of stages. Thus in Table V we put U_AB in the horizontal stub and U_C, the number of inversions in the third stage, in the vertical stub. Proceeding as before we find the probabilities for the total number of inversions in all three stages, U_ABC. These are given in Table VI. Note that the total in Table VI, 8,000, represents the number of ways in which a U_ABC can be achieved. All right so far?

TABLE VI

U_ABC   Prob (in 8,000ths)   Cum P
 0         1                 .0001
 1         3                 .0005
 2         9                 .002
 3        22                 .004
 4        45                 .010
 5        84                 .020
 6       143                 .038
 7       222                 .066
 8       321                 .106
 9       435                 .161
10       549                 .229
11       654                 .311
12       735                 .403
13       777                 .500
14       777                 .597
15       735                 .689
16       654                 .771
17       549                 .839
18       435                 .894
19       321                 .934
20       222                 .962
21       143                 .980
22        84                 .990
23        45                 .996
24        22                 .998
25         9                 .9995
26         3                 .9999
27         1                 1.000
Total   8000

Q: I can see the arithmetic but I do not see why you do these multiplications and additions or what, if anything, you have accomplished.

A: These arithmetic steps follow from the basic rules for probability; this is explained in Appendix I. As for what we have accomplished, well, we are now all set to construct our test of significance. For this we use Table VI. In this table we have added a column labeled Cum P (“Cumulative Probability”) which gives us the chance that U_ABC or fewer inversions will occur. Notice that for U_ABC = 5 the value of Cum P is 0.020. From this we can set up a rule for rejection, a rule which will tell us when to reject the (Null) hypothesis that an agent is a dud. Thus suppose that we use the rule for rejection: reject the (Null) hypothesis that the test agent is a dud if U_ABC is 5 or less. In other words, if U_ABC is 5 or less we will say that the test agent is not a dud. Of course, we will sometimes be mistaken; now and then we will say an agent is not a dud when it happens to be a dud. How often will we make this kind of mistake (i.e., a false-positive conclusion) if we use the above rule? From Cum P we see that if an agent is a dud we will make this false claim about 2 per cent of the time. Notice that this considers only departures in a favorable direction (so it is called a one-tailed test). Most of the common significance tests are two-tailed tests (i.e., they pick up departures in either direction). The significance test presented here can be easily converted to a two-tailed test by also rejecting if U_ABC is 22 or more (in which event the risk of rejection under the null hypothesis would be 2 × 0.02 = 0.04). Hence, a two-tailed test based on inversions corresponds approximately to the usual 5 per cent level (two-tailed) statistical test. Notice also that we have really made just one major assumption in the whole process, namely that we have random samples.
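The three-stage counts and the cumulative probabilities used for the rejection rule can be recomputed in a few lines. This is a sketch under the same independence assumption as the text; `convolve`, `three_stage`, `lower_tail`, and `upper_tail` are illustrative names, and the upper cut-off of 22 is simply the mirror image of “5 or less” on the 0 to 27 scale.

```python
from fractions import Fraction
from itertools import accumulate


def convolve(p, q):
    """Counts for the sum of two independent inversion totals."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


STAGE = [1, 1, 2, 3, 3, 3, 3, 2, 1, 1]  # one-stage counts out of 20

three_stage = convolve(convolve(STAGE, STAGE), STAGE)  # U_ABC = 0..27
total = sum(three_stage)                               # 20 ** 3 outcomes
cum_p = [Fraction(c, total) for c in accumulate(three_stage)]

lower_tail = cum_p[5]       # P(U_ABC <= 5) for a dud
upper_tail = 1 - cum_p[21]  # P(U_ABC >= 22) for a dud

print(total, float(lower_tail), float(upper_tail))
print(float(lower_tail + upper_tail))  # two-tailed risk, roughly 0.04
```

Because the distribution is symmetric about 13.5, the two tails are equal, which is why doubling the one-tailed risk gives the two-tailed figure.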

Q: But I still do not see what this has to do with sequentialization. Is this significance test based on a full trial of 9 patients?

A: Yes, this is a fixed-sample test. But we are now ready to see what happens if a stop-or-go scheme is added to the protocol.

III. SEQUENTIALIZING THE HYPOTHETICAL CLINICAL TRIAL

Q: It still bothers me that there are only 9 patients in the trial.

A: Remember this is just an example, a working model if you like. The question “how big a series (fixed sample) is needed for an adequate test of an agent?” is an important one, but to answer it would take us too far afield. However, whatever size sample is judged adequate, we can apply the same principles of sequentialization which we will use in this example. So in that sense we will now assume that 9 patients would provide an adequate trial in our hypothetical example. We will also suppose that we would be interested in those compounds which we can demonstrate are not duds (using the rule of rejection derived above).

Q: Well, all right. Now what do we want from a stop-or-go scheme?

A: First, we would want to carry through a full trial those agents which will subsequently turn out to be “significant” by our rule for rejection. Second, we would like to drop those agents which will not make the grade at an early stage of the study, after the first stage for example.

Q: But you do not know which agents will be “significant” if you only know what happens in the first stage.

A: No, we do not. But we can spot agents which are not very likely to turn out to be “significant.” Suppose that an agent has 6 inversions in the first stage; then it cannot turn out to be significant by our rule for rejection (which allows at most 5 inversions in all 3 stages). So we might as well drop this agent.

Q: Oh, I see. And if an agent accumulated 6 or more inversions in the first two stages it could be dropped at this point.

A: Yes, and there you have one possible stop-or-go rule: drop an agent if, at any stage, it has a total of 6 or more inversions. With this rule, any agent which could possibly pass the test would get a full trial so that, so far as “significant” agents go, the protocol plus the stop-or-go rule would have the same performance as the original scheme. At the same time we could cut the work load a bit.

Q: How much would you save?

A: To see this we can go back to the tables we have previously calculated. Consider Table II. With the “6-and-out” stop-or-go rule we would drop 35 per cent of the duds after the first stage. To see this we add up the probabilities for the U_A’s where we would stop the trial:

P(Drop after first stage if dud) = 0.15 + 0.10 + 0.05 + 0.05 = 0.35

When some of the agents are dropped after the first stage, Table IV is modified. We can draw a line to indicate the stop rule for the first stage (the vertical solid line in Table III). In adding along the diagonals of Table III, we do not count numbers to the right of the solid vertical line. We can also draw a solid diagonal line to indicate the stop rule for the second stage. Adding up the numbers below this diagonal line gives us:

P(Drop after second stage if dud) = (30 + 35 + 36 + 33 + etc.)/400 = 202/400 = 0.505

Finally note that:

P(goes into second stage if dud) = 1 - P(Drop after first stage if dud) = 1 - 0.35 = 0.65

and

P(goes into third stage if dud) = P(goes into second stage if dud) - P(Drop after second stage if dud) = 0.650 - 0.505 = 0.145

For a Dud

Stage   Probability of Continuing   Probability of Stopping
        to Next Stage               at Stage
1       0.650                       0.350
2       0.145                       0.505
3                                   1.000

So that the mean sample size for dud agents (MSS if dud) is:

MSS if dud = no. observations per stage × [1 + P(goes into second stage if dud) + P(goes into third stage if dud)] = 3 × [1 + 0.650 + 0.145] = 5.385

Hence instead of using 9 patients on dud agents, as in the fixed sample plan, we would use, on the average, about 5.4 patients and thus save about 40 per cent of this work load.
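The continuation probabilities and the mean sample size for the “6-and-out” rule can be reproduced by tracking, stage by stage, the probability that a dud is still in the trial at each running total. This is only a sketch; the function name `continuation_probabilities` and the parameter `stop_at` are assumptions made for illustration.

```python
from fractions import Fraction

# One-stage probabilities for U = 0..9 under the dud hypothesis.
STAGE = [Fraction(c, 20) for c in (1, 1, 2, 3, 3, 3, 3, 2, 1, 1)]


def continuation_probabilities(stop_at, n_stages=3):
    """Probability that a dud agent enters stage 1, 2, ..., n_stages.

    The trial stops as soon as the running inversion total reaches stop_at.
    """
    alive = {0: Fraction(1)}  # running total -> probability still in the trial
    enter = []
    for _ in range(n_stages):
        enter.append(sum(alive.values()))
        survivors = {}
        for total, p in alive.items():
            for u, q in enumerate(STAGE):
                if total + u < stop_at:
                    survivors[total + u] = survivors.get(total + u, 0) + p * q
        alive = survivors
    return enter


enter = continuation_probabilities(stop_at=6)
mss = 3 * sum(enter)  # 3 test-agent patients per stage
print([float(p) for p in enter], float(mss))
```

The printed list corresponds to the factors 1, 0.650, and 0.145 in the bracket of the MSS formula.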

Q: Can you save more than this with some other stop-or-go rule?

A: Yes, if we are willing to lose a small proportion of the agents which would be significant in a full trial. For example, we could set the stop rule in the first stage at 5 instead of 6. It would be rather rare for an agent to give 5 inversions on the first stage and no inversions at all in the last two stages. We can try various stop rules and calculate their performance from the tables. Thus the MSS for duds with the stop rule set at 5 in the first stage instead of 6 would be about 4.75 patients.
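Trying “various stop rules” is easy to mechanize. The sketch below generalizes the same bookkeeping to a separate limit after each interim stage; `mean_sample_size` and `limits` are hypothetical names, and the rule is assumed to be “stop if the running total has reached the limit.”

```python
from fractions import Fraction

# One-stage probabilities for U = 0..9 under the dud hypothesis.
STAGE = [Fraction(c, 20) for c in (1, 1, 2, 3, 3, 3, 3, 2, 1, 1)]


def mean_sample_size(limits, per_stage=3):
    """Mean number of test-agent patients used on a dud.

    limits[k] is the stop limit checked after stage k + 1. The last stage
    has no limit because the trial ends there in any case.
    """
    alive = {0: Fraction(1)}  # running total -> probability still in the trial
    expected_stages = Fraction(0)
    for limit in list(limits) + [None]:
        expected_stages += sum(alive.values())  # this stage is actually run
        if limit is None:
            break
        survivors = {}
        for total, p in alive.items():
            for u, q in enumerate(STAGE):
                if total + u < limit:
                    survivors[total + u] = survivors.get(total + u, 0) + p * q
        alive = survivors
    return per_stage * expected_stages


for limits in [(6, 6), (5, 6), (5, 5)]:
    print(limits, float(mean_sample_size(limits)))
```

Under these assumptions, a limit of 5 at both interim looks gives a mean of 4.755 patients, close to the figure quoted above, while lowering only the first-stage limit to 5 and keeping 6 at the second gives 4.9125.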

IV. PROS AND CONS

Q: I can see how a stop-or-go rule provides an automatic, mechanical way to stop the trial of an agent. But it seems like a blind and brainless way to do research . . . it doesn’t leave any room for the investigator’s personal judgment.

A: This depends on how the rule is used. If the rule is taken as a “commandment” to be rigidly followed, then your objection would apply. I would prefer to think of it as a guide. For example, suppose an agent performed rather poorly most of the time but achieved an occasional spectacular result. Then, even if the stop rule said to drop the agent, the investigator could continue it in the trial. A stop-or-go rule, or any other routine procedure, should not prevent a researcher from keeping his eyes open for the unusual.

Q: What about the final decision about the agent? The significance test is based only upon the main response variable, but there is likely to be much more information obtained in the clinical trial. There may be other measures of response, and there would be information on toxicity or side effects. Is this additional information to be ignored?

A: Not at all. In fact, this is why I have emphasized the stop-or-go feature rather than the use of the sequential procedures in testing hypotheses or making decisions. For the agents which run through the complete trial, the investigator would probably want to make a more extensive analysis and to utilize whatever other information is available in making his recommendations. In other words, the clinician can use the stop-or-go scheme to save work, but he can deal with his final data just as he would if he had done the fixed-sample-size study.

Q: But then it could happen that agents which would have showed up as significant in the extensive analysis would be dropped in an early stage.

A: Yes, this is the price that is paid for the reduction of the work load. However, ordinarily there would be a fairly close relationship between the results by the “inversion” test and results by other analyses. So there may be a substantial cut in work load for a comparatively rare loss of agents which would have turned out to be significant by the more extensive analysis. If the researcher is concerned about this he can reduce his risk by using, say, the equivalent of a 10 per cent probability level in his stop-or-go scheme, although he intends to work at the 5 per cent level in his final analysis.

Q: Are there any other “hidden costs” of sequentialization?

A: Yes, there are. It takes additional effort to set the scheme up and to keep it working smoothly. The sequential schemes require continuous processing of data since the information must come through in time to use it in subsequent stages. It should never be forgotten that haste may lead to some administrative confusion, and possibly to deterioration of the data quality. For this reason the sequentialization is really of secondary importance.

Q: Why do you say that?

A: I think the primary responsibility of the clinical investigator is to obtain reliable, high quality data. After the study plan is set up to yield good data the investigator can worry about cutting costs. But to save money by lowering data quality would be “penny wise, pound foolish.” I don’t think that sequentialization should divert the researcher’s attention from his primary task. For this reason I would not recommend sequentialization for the neophyte or for a study where the essential protocol is not yet worked out. I think it is a mistake to try to sequentialize any clinical trial, like a “gimmick,” without first considering whether the situation is favorable or not.

Q: What is a favorable situation for sequentialization?

A: That is hard to say. One good omen is if, after setting up a fixed sample scheme, it breaks up easily and naturally into a sequence of stages. Then, too, the information from one stage should be processed in time for it to be available for subsequent stages (though not necessarily the very next stage). When it takes a long time for the response to the agent to be assessed, such as often happens in the chronic diseases such as cancer or heart disease, this makes it very difficult to set up a sequential scheme. There are other signs and portents that indicate whether or not a situation is ripe for sequentialization, but the acid test is a preliminary run with a sequential scheme. Without an actual trial of the scheme it is hard to say what bugs will turn up.

Q: That is pretty vague. Can you be more definite?

A: We can put it this way. I would see the greatest utility for sequentialization in more or less routine screening of new agents. For example, if a doctor is running a continuing clinical trial of analgesics or other agents and if a stop-or-go scheme fits neatly into his existing routines, then it is probably worth a trial. I think it is important that the study be well organized and have strong central direction. I would be chary about sequentializing a cooperative study if, for example, some participants might delay sending in reports.

V. OTHER USES OF SEQUENTIALIZATION

Q: The purpose of sequentialization, as you have presented it, seems to be to drop dud agents out of a clinical trial as soon as possible. Is this all that the method can do?

A: No, there are other uses for sequential schemes in clinical trials. However, I will not be able to do much more than mention these other possibilities. For one thing, there is a second way to reduce the work load; we could cut short the trial of agents with spectacularly good performance. In other words, we can set up a slightly more complicated stop-or-go rule which will stop the trial of agents which either do poorly or else do especially well. For example, in the example of section II we could use a rule such as: Stop if there are no inversions after the second stage. I have not emphasized this type of stop rule because I think that ordinarily we would not save much work with such a rule; such agents would be rather rare, and very often we would like to have additional information on the performance of such agents. Sometimes, though, the saving in time might be important if immediate action were planned when a good agent turned up.

Q: Are there other uses?

A: Yes, we can sometimes save work and time by a very different strategy: we may try to increase the amount of relevant information that we obtain from each patient in the series. For example, suppose we are trying to estimate the relative potency of an analgesic agent with respect to morphine and we use dosages of morphine around 10 mg. Now if we choose a dosage of the test agent which will give us about the same response as this dose of morphine, we will have a somewhat better estimate of relative potency than if the response to the test agent is much higher (or much lower) than the response to morphine. At the start of the study we have to guess what the equivalent dosage of the new agent will be, but as the study progresses we can adjust the original dosage up or down so as to get a response more nearly equal to that of morphine. In this way, for a given number of patients we can get a better estimate of relative potency.⁴ To distinguish this type of sequential rule we might call it a “move rule” (as opposed to a stop-or-go rule).

Q: If you succeed in matching responses you would, in fact, have a direct estimate of relative potency since you would know the dosage equivalent of 10 mg. of morphine.

J. Chron. Dis. September, 1958

360 A:

Yes,

could be used as a method

the scheme

its main

importance

is to “centrally

trials we are interested ing response. mum effect

Q: A:

Here,

in achieving

Well

that

1 can give an instance leukemia

the trial it was suggested way.

better this

patient

had been

given

raising

the dosage,

maximum

response

rather

of dosages

In other than a match-

to try for maxi-

at the maximum

tolerated

a drastically so a trial

second

dose).

trial

occurs

Are there other (i.e.,

sequentialized

schemes

suggested

These

\\rould naturally You

would

greatest

‘I,.II, (‘an

(“steepest

enters

thinking cancer

where

tion.

distinguish agents

rules”

might

might

be used?

therapies the

of agents.

are em-

problem

is to

Potentially

some

be used to explore

the re-

a strategy

in the

schemes

be used if there and the time

kind

of situation

it ma)’ take

years

is a serious

it is in just

of sequential

headed

dose, then this scheme

combination

combination by Box’

down

that the

may be questioned.

simultaneously)

employ

(by

much

like the one you

on a dark,

direction

in which

foggy

the

night.

slope

was

ascent”).

response

Yet

and then backing

tolerated

“move

the study

of the

Delayed

if each trial

IN SEQIiENTIAI,I%ATION

sequential

patient

appear

When

given

schemes

keep

is now under-

unnecessary

dose in the first

use to find the top of a mountain

simply

PKORLISMS

been

After would

schedule

If there is good reason to believe

drugs

for the

surface.

have

tolerated

places where sequential

dosages

sponse

dosage

the assumption

find optimum

in

increased

at the maximum

two or more

but

was employed.

would

Yes, I will give one more example. ployed

have

schedule

until side effects

However

helped

trial of 6 mercaptopurine

with this new schedule

his maximum

by steps,

response

clinical

a fixed dosage that

might

sequentialization

until they disappear).

would save work.

A:

where

results

In a sense

the staircase

Q:

this comes

In a controlled

in children

have produced

SPEC

maximum

but in practice

observations.

Could you give an example?

acute

A:

the

too, we can use al‘staircase”

(if we assume

where it was not used.

Q:

of estimation,

locate”

schemes between

that

to say what

acute

in evaluating really

the time a

therapies

I am for

has happened. obstacle

importance.

and long-term leukemia

between

is determined?

where the potential

be of most

short-term

now used to treat

occurs

but not impassable

this situation may

is a long delay the response

It is helpful

response.

in children

to sequentializa-

time-saving For

produce

feature here

example,

to the

a “short-term” response, “remission,” within a month or two. The “remission” may continue for a year or more, and so the duration of remission or the survival time for the patient is a “long-term” response. If we are interested in the “short-term” response, either in itself or as a possible indicator of “long-term” response, then we might employ this measure as the basis for a sequential plan. Another possibility is to use what information is available on the “long-term” response even if it is incomplete.

Q: How could you do this?

A: Take survival for example and suppose a paired trial is run with one patient on Drug A (test agent) and the other patient on Drug B (control agent). If the

Volume 8 Number 3

patient on Drug A dies first this is an “inversion,” whereas if the patient on Drug B dies first this is a “noninversion.” There is, however, the difficulty that the outcomes for the pairs become known in a different order than that in which the pairs enter the study. Consequently, there may be a bias introduced if “tied” observations are disregarded. For example, it might be that more patients would be alive at 3 months for Drug B but that by 12 months the situation would be reversed, and hence Drug A might be substantially better in the long run. However, using a sequential scheme based on the order in which the pair outcomes are determined would lead us to drop Drug A.

Q: Let’s go back to an earlier point you made, the risk of dropping a promising agent with a stop-or-go scheme. How great is this risk?

A: First of all we must remember that with the fixed-sample scheme there is also this risk. The stop-or-go rule is chosen so as to have relatively little effect on this risk. However, it is important to have some idea of the magnitude of this risk, and there are two ways to approach this question. The first is theoretical. You will recall that in section II we calculated the chances that a dud would get dropped. We might do essentially the same thing for an agent which wasn’t a dud. For example, we might postulate that the agent was fully effective on 50 per cent of the patients and was a dud on the rest of the patients. Alternatively, we could proceed more empirically. If the investigator has previously tested a number of agents and has turned up a promising compound, he might use the data on this compound. The procedure, which is like that of section II, is illustrated in Appendix II.

Q: One last question. If I want to find out more about sequential schemes, where can I go?

A: Preferably to an experienced statistician, since it may be a poor idea to pick a scheme out of a book or a paper. However, I have appended a short list of references which includes some of the published papers where sequential methods were used.* The “inversion” stop-or-go schemes are easy enough to construct on a do-it-yourself basis (Appendix III).
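The paired-survival bookkeeping described in the answer on delayed responses can be sketched in a few lines. This is only an illustrative sketch: the function name `classify_pair`, the use of `None` for a patient who is still alive, and the "undetermined" and "tie" labels are assumptions made here, not part of the original scheme.

```python
from typing import Optional


def classify_pair(death_a: Optional[float], death_b: Optional[float]) -> str:
    """Classify one Drug A / Drug B pair from the survival times known so far.

    Times are months from entry into the study; None means the patient is
    still alive at the time of the review (an assumption of this sketch).
    """
    if death_a is None and death_b is None:
        return "undetermined"   # neither patient has died yet
    if death_b is None or (death_a is not None and death_a < death_b):
        return "inversion"      # patient on Drug A (test agent) died first
    if death_a is None or death_b < death_a:
        return "noninversion"   # patient on Drug B (control agent) died first
    return "tie"                # both deaths recorded at the same time


# Hypothetical review of four pairs at one stage of the trial.
pairs = [(3.0, None), (None, 5.0), (None, None), (7.0, 4.0)]
outcomes = [classify_pair(a, b) for a, b in pairs]
inversions = outcomes.count("inversion")
```

Pairs that are still "undetermined" at a review are exactly the ones that can bias the count, since early deaths are determined before late ones.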

REFERENCES

1. Wald, A.: Sequential Analysis, New York, 1947, John Wiley & Sons, Inc.
2. Bross, I.: Sequential Medical Plans, Biometrics 8:188, 1952.
3. Bross, I.: Design For Decision, New York, 1953, The Macmillan Co.
4. Wallenstein, S. L., and Houde, R. W.: Clinical Analgesic Assay of Dihydrohydroxymorphinone, Fed. Proc. 15:1611, 1956.
5. Armitage, P.: Sequential Tests in Prophylactic and Therapeutic Trials, Quart. J. Med. (New Series) 23:255, 1954.
6. Armitage, P.: Sequential Procedures for Medical Trials, Biometrics 14:132, 1958.
7. Box, G. E. P.: The Exploration and Exploitation of Response Surfaces: Some General Considerations and Examples, Biometrics 10:16, 1954.
8. Kilpatrick, G. S., and Oldham, P. D.: Calcium Chloride and Adrenaline as Bronchial Dilators Compared by Sequential Analysis, Brit. M. J. 2:1388, 1954.
9. Newton, D. R., and Tanner, J. M.: N-acetylpara-aminophenol as an Analgesic: a Controlled Clinical Trial Using the Method of Sequential Analysis, Brit. M. J. 2:1096, 1956.
10. Snell, E. S., and Armitage, P.: Clinical Comparison of Diamorphine and Pholcodine as Cough Suppressants by a New Method of Sequential Analysis, Lancet 1:860, 1957.
11. Lasagna, L., and Meier, P.: Clinical Evaluation of Drugs, Ann. Rev. Med. 9:347, 1958.

*I am indebted to Professor William Cochran for this list.

J. Chron. Dis. September, 1958

APPENDIX I

The operations in section II stem directly from the 3 basic rules for probabilities.³ Let A and B be “events” and let P(A) represent “the probability that event A occurs” or “the probability of A.” Then:

Rule Ia: P(A and not-A) = 0

Example: for a comparison of a single treated with a single control observation in a situation where “ties” do not occur:

P(an inversion occurs and an inversion does not occur) = 0

Rule Ib: P(A or not-A) = 1

In the situation of Rule Ia:

P(an inversion occurs or an inversion does not occur) = 1

Rule Ic: 0 ≤ P(A) ≤ 1, i.e., P(A) is 0, 1, or any value between.

Rule II: If two events are independent the following multiplicative rule holds:

P(A and B) = P(A) × P(B)

Example: to obtain the entries in the body of Table III the probabilities in the stubs are multiplied.

Rule III: If two events are mutually exclusive (i.e., cannot both occur) then the following additive rule holds:

P(A or B) = P(A) + P(B)

Example: To obtain the probability that U_AB takes a given value, the probabilities of the combinations of U_A and U_B which lead to the given value of U_AB are added together. Notice that in Table III the cases where U_A + U_B = 3 (say) appear in a diagonal of the Table. Hence, by adding along the diagonal, P(U_AB = 3) will be obtained.

APPENDIX II

The following example illustrates the use of empirical data to estimate the chance of missing an effective agent (the “Type II error”) or the chance of not missing such an agent (the “power”). I have no clinical data to use here so as the next best thing I will use some data from a cancer chemotherapy animal screen.*

TABLE A-I. BASIC DATA. RESULTS (TUMOR WEIGHT, GM.)

EXPERIMENT    TREATED    CONTROL

0.4 1.0 1.6

0.9 1.4 1.7

2.4 2.0 1.7

2.6 3.0 3.6

0.6

1.5

2.1 1.4

2.5

2.5 2.4 1.3

1.8

2.3 2.8 2.9

1.5

1.3 3.2 2.4

0.8 1.1 1.6

2.5

1.6 0.6

1.5 2.2

This data can be used to set up a table analogous to Table II.

TABLE A-II

NUMBER OF INVERSIONS (U)    NUMBER OF OUTCOMES WITH U INVERSIONS    PROBABILITY OF U
Total                       6                                       1.00

From this, a table analogous to Table III may be constructed:

TABLE A-III

                U_A    0    1    2    3
U_B    P'       P'     1    2    2    1
0      1
1      2
2      2
3      1

and similarly

TABLE A-IV

U_AB    0    1    2    3    4    5    6    TOTAL
P'      1    4    8   10    8    4    1    36

and

TABLE A-V

                U_AB   0    1    2    3    4    5    6
U_C    P'       P'     1    4    8   10    8    4    1
0      1                                        4    1
1      2                                  16    8    2
2      2                             20   16    8    2
3      1                             10    8    4    1

giving

TABLE A-VI

U_ABC   0    1    2    3    4    5    6    7    8    9    TOTAL
P'      1    6   18   35   48   48   35   18    6    1    216

The performance is summarized in Table A-VII.

TABLE A-VII. PERFORMANCE OF TEST (STANDARD COMPOUND)

STAGE       PROBABILITY OF REJECTING AT STAGE    PROBABILITY OF CONTINUING TO NEXT STAGE
1           0.0000                               1.0000
2           0.0278                               0.9722
3           0.2778
Over-all    0.3056

Table A-VII can be compared with Table VII. It shows that there is very little chance of stopping the trial (i.e., only a 3 per cent chance) if the agent is as effective as the standard. However, the over-all chance of missing an agent as good as the standard is 28 per cent, which suggests that the fixed sample-size procedure should use more animals.

*With the kind permission of Dr. Phillip Merker.
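The P' rows of Tables A-IV and A-VI can be checked by applying Rules II and III of Appendix I mechanically: multiply the stub weights and add along the diagonals. The sketch below assumes that each stage has the weights 1, 2, 2, 1 for U = 0, 1, 2, 3 (the stub values of Table A-III) and that the stages are independent. The function name `convolve` is chosen here for illustration.

```python
def convolve(p, q):
    """Distribution of the sum of two independent inversion counts.

    p[i] and q[j] are the weights for i and j inversions. Each product
    p[i] * q[j] (Rule II) is added into the cell for i + j (Rule III).
    """
    out = [0] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            out[i + j] += p_i * q_j
    return out


stage = [1, 2, 2, 1]           # weights P' for U = 0, 1, 2, 3 in one stage
u_ab = convolve(stage, stage)  # weights for U_AB = 0..6, total 36
u_abc = convolve(u_ab, stage)  # weights for U_ABC = 0..9, total 216

print(u_ab, sum(u_ab))
print(u_abc, sum(u_abc))
```

Dividing each weight by the total gives the corresponding probability.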

APPENDIX III

A stop-or-go rule based on the number of inversions can easily be constructed on a do-it-yourself basis without using the auxiliary tables (e.g., Tables I-VII). However, to see the performance of a scheme the tables are needed. The following steps will lead to a stop-or-go scheme which will lose no power relative to the original fixed-sample plan but may save some time and effort.

Step I. Set up a suitable fixed sample scheme consisting of m stages (m = 3 in section II) with n_0 control observations and n_1 observations on the test agent at each stage (n_1 = n_0 = 3 in section II).

Step II. Enter Table III-1 with n_1 and n_0 and read the expected number of inversions in a stage, E_i, and the variance in a stage, V_i. Thus for n_1 = n_0 = 3, E_i = 4.5 and V_i = 5.25.

Step III. Calculate the expected number of inversions, E, and the variance, V, for m stages from:

$$E = mE_i = 3(4.5) = 13.5$$

$$V = mV_i = 3(5.25) = 15.75$$

Step IV. Calculate the critical number for the fixed sample study (U_m*). For a significance test corresponding to the customary two-tailed 5 per cent level test:

$$U_m^* = E - 2\sqrt{V} = 13.5 - 2\sqrt{15.75} = 13.5 - 7.94 = 5.56$$

The value of U_m* obtained from the above formula may not be an integer. To be conservative the decimal fraction would be dropped. Thus in the above example U_m* would be taken as U_m* = 5.

Step V. Employ the stop-or-go rule: Stop the study where the number of inversions at any stage exceeds U_m*.

Although this step-by-step procedure provides a rule, it does not give the performance of the rule. The investigator is advised to use the table method of section II and Appendix II to study the performance of this rule.
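Steps II to V can be sketched as a short calculation. This is a minimal sketch under stated assumptions: the per-stage values are computed from the formulas given with Table III-1 rather than read from the table, "dropping the decimal fraction" is taken to mean rounding down, and the function names (`stage_moments`, `critical_number`, `should_stop`) are hypothetical.

```python
import math


def stage_moments(n0, n1):
    """Expected inversions E_i and variance V_i for one stage (Step II)."""
    e_i = n0 * n1 / 2
    v_i = n0 * n1 * (n0 + n1 + 1) / 12
    return e_i, v_i


def critical_number(m, n0, n1):
    """Critical number U_m* for m equal stages (Steps III and IV)."""
    e_i, v_i = stage_moments(n0, n1)
    e, v = m * e_i, m * v_i              # Step III: totals over m stages
    u_star = e - 2 * math.sqrt(v)        # Step IV: E - 2*sqrt(V)
    return math.floor(u_star)            # assumed: drop the decimal fraction


def should_stop(inversions_so_far, u_star):
    """Step V: stop once the running number of inversions exceeds U_m*."""
    return inversions_so_far > u_star


u_star = critical_number(m=3, n0=3, n1=3)
print(u_star, should_stop(6, u_star), should_stop(5, u_star))
```

With m = 3 and n_0 = n_1 = 3 this reproduces the worked example above, where 5.56 is reduced to 5.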

TABLE III-1. EXPECTATIONS AND VARIANCES FOR A STAGE

              NUMBER OF OBSERVATIONS IN THE TEST SERIES (n_1)
n_0            1       2       3       4       5       6       7       8       9      10

1     E_i    0.5     1.0     1.5     2.0     2.5     3.0     3.5     4.0     4.5     5.0
      V_i    0.25    0.67    1.25    2.0     2.92    4.0     5.25    6.67    8.25   10.0

2     E_i    1.0     2.0     3.0     4.0     5.0     6.0     7.0     8.0     9.0    10.0
      V_i    0.67    1.67    3.0     4.67    6.67    9.0    11.67   14.67   18.0    21.67

3     E_i    1.5     3.0     4.5     6.0     7.5     9.0    10.5    12.0    13.5    15.0
      V_i    1.25    3.0     5.25    8.0    11.25   15.0    19.25   24.0    29.25   35.0

4     E_i    2.0     4.0     6.0     8.0    10.0    12.0    14.0    16.0    18.0    20.0
      V_i    2.0     4.67    8.0    12.0    16.67   22.0    28.0    34.67   42.0    50.0

5     E_i    2.5     5.0     7.5    10.0    12.5    15.0    17.5    20.0    22.5    25.0
      V_i    2.92    6.67   11.25   16.67   22.92   30.0    37.92   46.67   56.25   66.67

6     E_i    3.0     6.0     9.0    12.0    15.0    18.0    21.0    24.0    27.0    30.0
      V_i    4.0     9.0    15.0    22.0    30.0    39.0    49.0    60.0    72.0    85.0

7     E_i    3.5     7.0    10.5    14.0    17.5    21.0    24.5    28.0    31.5    35.0
      V_i    5.25   11.67   19.25   28.0    37.92   49.0    61.25   74.67   89.25  105.0

8     E_i    4.0     8.0    12.0    16.0    20.0    24.0    28.0    32.0    36.0    40.0
      V_i    6.67   14.67   24.0    34.67   46.67   60.0    74.67   90.67  108.0   126.67

9     E_i    4.5     9.0    13.5    18.0    22.5    27.0    31.5    36.0    40.5    45.0
      V_i    8.25   18.0    29.25   42.0    56.25   72.0    89.25  108.0   128.25  150.0

10    E_i    5.0    10.0    15.0    20.0    25.0    30.0    35.0    40.0    45.0    50.0
      V_i   10.0    21.67   35.0    50.0    66.67   85.0   105.0   126.67  150.0   175.0

The entries are calculated from:

$$E_i = \frac{n_0 n_1}{2}, \qquad V_i = \frac{n_0 n_1 (n_0 + n_1 + 1)}{12}$$

See Mann, H. B., and Whitney, D. R.: On a Test of Whether One of Two Random Variables Is Stochastically Larger Than the Other, Annals of Mathematical Statistics 18:50-60, 1947.

Table III-1 may also be used if the stages do not have the same n_0 and n_1. In this case E and V are obtained by summing the E_i’s and V_i’s directly. This method may also be employed in adjusting the final significance test if observations are lost due to patients dying, moving away, or refusing to abide by protocol.
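When the stages differ in size, the summation described above can be sketched as follows. The stage sizes in the second example are hypothetical, chosen only to show a stage that has lost one observation, and `total_moments` is a name introduced here for illustration.

```python
def total_moments(stages):
    """Sum E_i and V_i over stages given as (n0, n1) pairs.

    n0 is the number of control observations and n1 the number of
    test-agent observations actually available in that stage.
    """
    e = sum(n0 * n1 / 2 for n0, n1 in stages)
    v = sum(n0 * n1 * (n0 + n1 + 1) / 12 for n0, n1 in stages)
    return e, v


# Three equal stages, as in section II.
equal = total_moments([(3, 3), (3, 3), (3, 3)])

# Hypothetical case: one test observation lost in the second stage and
# one control observation lost in the third.
unequal = total_moments([(3, 3), (3, 2), (2, 3)])

print(equal, unequal)
```

The totals can then be used in Step IV of Appendix III in place of mE_i and mV_i.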