Jerome Cornfield and the Methodology of Clinical Trials

Paul Meier
From the Department of Statistics, the University of Chicago, Chicago, Illinois

ABSTRACT: Jerome Cornfield was a major contributor to the theory of clinical trials and a valued advisor on the conduct of many of the major trials. In a contribution to the Reed-Frost symposium in 1976, Cornfield expressed his views on the contribution of the Bayesian outlook to clinical trials methodology. Here we reexamine and comment on some of the same issues--in particular, on the uses and usefulness of likelihood ratios and prior opinions and on the role of randomization.

KEY WORDS: clinical trial, likelihood, prior opinion, randomization
INTRODUCTION

It is both an honor and a sadness to stand in for Jerome Cornfield. Jerry was one of my role models--from the time I first started as a professional statistician at The Johns Hopkins University. He was prevented by illness from speaking at the International Symposium on Long Term Clinical Trials. I agreed to speak in his place--not to give his talk (no one but Jerry could do that)--but to speak about important concepts that are provoking much needed thought among all of those whose professional interests encompass statistical inference. I speak of the Bayesian revolution (some would call it rebellion) and its implications specifically for the analysis of clinical trials.

Jerry would have based his talk on an address he gave at Johns Hopkins for the Reed-Frost symposium, and I shall do likewise. There is far too much to cover it all, but I shall try to hit the highlights. The Reed-Frost address was organized as follows:

    Decision-making
    Appraisal of uncertainty
    Likelihood ratios--the Bayesian view
    Prior opinion and likelihood ratios
    Patient subgroups
    Randomization
Address requests for reprints to Mr. Paul Meier, Department of Statistics, 5734 S. University Avenue, University of Chicago, Chicago, IL 60637. Controlled Clinical Trials 1, 339-345 (1981) © 1981 Elsevier North Holland, Inc., 52 Vanderbilt Avenue, New York, NY 10017
I will concentrate on likelihood ratios and prior opinions and will comment also on the issue of randomization. (Cornfield's Bayesian analysis of the subgroup problem involves too much technical detail, and it is not adequately developed for my presentation here.)

To begin with, let us be quite clear that the development of clinical trial methodology has been a remarkably successful undertaking. When we criticize some of its aspects, it should be understood that the major structure is not under attack. Rather, the rationale that supports it appears in some ways to be less tightly organized than it had seemed. To quote Cornfield [1]:

    The empirical process of inference and decision in ongoing clinical trials is perceived to be loosely defined and structured, not because appropriate mathematical tests are lacking, but because this is the nature of the enterprise. The paradox is that a solid structure of permanent value has, nevertheless, emerged, lacking only the firm logical foundation on which it was originally thought to have been built.

The now classical theory of experimental design, due largely to R. A. Fisher, developed out of a very different sort of experimentation--agricultural--and there are some important differences between agricultural experiments and clinical trials that make the latter far more demanding of the underlying theory. In particular:

1. The planning, execution, and analysis of an agricultural field experiment are all well separated in time. The intended design, if properly executed, will be the framework for the final analysis. Long-term clinical trials, by contrast, are still recruiting patients when the findings of analysis begin to emerge. These findings may quite properly cause the design to change in radical ways--even, on occasion, leading to early termination of the study.
For a time it was possible to consider such decision making as outside the domain of statistical analysis and to regard it rather as the intrusion of extrastatistical humane considerations that caused us on occasion to terminate or alter an ongoing study. More recently it has become clear that the possibility of changes in the study brought about by early findings is not a rare incursion by extrascientific elements but rather a necessary and typical feature of this type of clinical experimentation. There has of late been considerable study of sequential and adaptive designs and of termination rules and it has been found that they are well within the domain accessible by statistical theory. They are, however, quite far away from the concepts that went with the agricultural prototype.

2. A second major distinction between agricultural experiments and clinical trials lies in our intimate knowledge of and concern with each experimental unit, leading to an order of concern with interactions between treatment conditions and patient characteristics quite unfamiliar in agriculture. In the agricultural sphere we are largely prepared to let randomization disperse our concern with individual differences between the fertility of different plots or, if we think there is much to gain from it, to use a simple kind of covariance analysis to improve precision. There is no such easygoing attitude toward differences among patients in clinical trials and this too requires a change in the focus of statistical theory applied to this area.

One aspect of this change in focus is to make statistical decision theory appear to be more intimately tied to clinical trials than to its agricultural predecessor. We find decision points arising so frequently in the conduct of a trial that a theory dealing explicitly with decision--as contrasted with estimation--seems appealing. To be sure, decision theory and estimation theory are not entirely separate, and they may be seen merely as different representations of the same underlying model. Nonetheless, one viewpoint may prove to be more congenial than the other in a given context. Cornfield is emphatic that decision seems to be the right mode in this one.
THE BAYESIAN VIEW

To get some insight into the nature of the new contributions, we must digress a moment and say a little about the foundations of statistical inference from the Bayesian point of view. It is remarkable how much information about statistical inference can be derived from pure thought. Whether we find the Bayesian framework ultimately satisfying for application or not, it certainly gives important new viewpoints. Cornfield begins more or less as follows [1]:

    Let us start by considering two simple hypotheses about a treatment effect, such as that the (chances) of a favorable outcome are the same for both the treatment and the control (H1) and, as an alternative, that the difference in the (probability) of a favorable outcome is some non-zero value, say θ (H2). We consider a fixed number of observations and three possible decisions

        D1 = The acceptance of H1
        D2 = The acceptance of H2
        D3 = Suspended judgment

    The data are in and a decision needs to be made [Table 1]. How should one proceed?
Table 1

Alternatives
    H1: Probability of cure is the same for treated and control
    H2: Probability of cure for treated exceeds probability for control by a fixed amount, θ > 0
Possible decisions
    D1: Accept H1
    D2: Accept H2
    D3: Suspend judgment
Table 2

Prior probabilities
    g1 = Chance that H1 is true
    g2 = Chance that H2 is true
Utilities
    Uij = Utility of decision that Hi is true when, in fact, Hj is true
We find in Bayesian theory that the decision rests on the assumption (in turn derivable from very plausible arguments) that the investigator has, before seeing the evidence, prior probabilities, g1 and g2, for the chance that H1 is true or H2 is true and also utilities for the different possible decisions (see Table 2). It is at this point that a typical listener will be prepared to stop listening and say, "I have no prior probabilities that I am aware of or care about and, although my utility for right answers is greater than that for wrong answers, I am not at all prepared to quantify it."

The Bayesian claim on your further attention, however, rests upon a basic result, as follows. Suppose you, as investigator, have a number of possible decisions to make, based on any evidence or opinions, however derived. Let us suppose that you are a "rational" or coherent decision maker in the sense that you are careful never to make contradictory decisions. That is, if you have to decide about the relative values of three quantities A, B, and C, and you decide that A > B and B > C, you will not make the contradictory decision A < C. Your family of decisions is said to be coherent. It is a remarkable, but nonetheless true, fact that regardless of what you believe about the foundations of statistics, provided your decision rules are coherent, you are acting like a Bayesian statistician. That is, your family of decisions is the same as the family of decisions that would be made by a Bayesian with appropriately specified prior probabilities and utilities who makes decisions by maximizing the expected utility.
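As a rough illustration of what "maximizing the expected utility" means for the three decisions of Table 1, the following Python sketch ranks D1, D2, and D3 by a single number each. The utility values and the function names are hypothetical, chosen only for illustration; nothing here is taken from Cornfield's paper. Because every decision is scored on one numerical scale, the resulting preferences cannot be contradictory in the A > B, B > C, A < C sense.

```python
def expected_utilities(probs, utilities):
    """Expected utility of each decision.

    probs     -- (P(H1), P(H2)), the investigator's current probabilities
    utilities -- maps a decision name to (utility if H1 true, utility if H2 true)
    """
    return {
        decision: sum(u * p for u, p in zip(row, probs))
        for decision, row in utilities.items()
    }


def best_decision(probs, utilities):
    """Return the decision with the largest expected utility."""
    eu = expected_utilities(probs, utilities)
    return max(eu, key=eu.get)


# Hypothetical utilities (an assumption, not from the source): a correct
# acceptance is worth 1, a wrong one 0, and suspended judgment 0.7 either way.
UTILITIES = {
    "D1": (1.0, 0.0),  # accept H1
    "D2": (0.0, 1.0),  # accept H2
    "D3": (0.7, 0.7),  # suspend judgment
}

if __name__ == "__main__":
    for probs in [(0.9, 0.1), (0.5, 0.5), (0.2, 0.8)]:
        print(probs, best_decision(probs, UTILITIES))
```

With these illustrative numbers, suspended judgment wins whenever neither hypothesis is sufficiently probable, which is the qualitative behavior the three-decision formulation is meant to capture.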
This remarkable conclusion is a far cry from being prepared to specify just what those prior probabilities and utilities are, but it does suggest that if we have methods of statistical inference that are inconsistent with any Bayesian framework, they must somewhere lead us to incoherent decision making and that they presumably are not good general methods.

Granted only this much, we find that in this three-decision problem that we posed, if x is the data observed in a trial and f1(x) and f2(x) are the probabilities of observing x if H1 is true or if H2 is true, then the expected utility of decision i is

    k(x){Ui1[f1(x)](g1) + Ui2[f2(x)](g2)}

where k(x) = 1/[f1(x)g1 + f2(x)g2] is independent of i (see Table 3). It is only elementary algebra, then, to show that the decision rule has the form shown in Table 4.

What we have learned here is that any coherent system of decision making will take the data into account only through the likelihood ratio, f2(x)/f1(x). In particular, it says stopping rules, multiple looks at the data at different times in the study, and so on are all irrelevant to the inference to be made when the study is over. This is, of course, the celebrated "likelihood principle." It is flatly at variance with the more familiar rule that inference should depend on prespecified error probabilities. If we use a conventional significance test for inference, we find that our risk of error, say, in judging H2 is true (the treatment is better than the control) depends markedly on whether we allow for multiple testing or not. The likelihood ratio, however, remains unaffected.

Table 3

Likelihood
    f1(x) = P{x|H1} = probability of observing x when H1 is true
    f2(x) = P{x|H2} = probability of observing x when H2 is true
Expected utility of Di
    Ui1 × P{H1|x} + Ui2 × P{H2|x} = k(x)[Ui1 × f1(x) × g1 + Ui2 × f2(x) × g2]

DISCUSSION

The likelihood principle is one that many non-Bayesians accept--as a principle. Unfortunately, only true Bayesians know quite what to do with the likelihood ratio in decision making; they use it, of course, to convert prior probabilities, or odds, into posterior ones.

Cornfield goes on to point out that our simple example is too simple. We never have a simple alternative, such as difference = θ. He generalizes the solution to the case where our prior views about θ are encompassed by a continuous probability distribution, i.e., a prior density. This leads to an averaged likelihood ratio to be used for decision making, which, in another form, Cornfield has named the Relative Betting Odds (RBO) and which he has employed in several clinical trials. He reminds us that the theory itself does not tell us how to choose the priors. He says [1]:

    There is nothing in the theory indicating how to select a prior density, g(θ), and only moderate help is at best obtained by talking to knowledgeable investigators about it. This ambiguity of priors is often regarded as a weakness in the Bayesian view.
    More cogently, however, it should be considered a strength, since it provides an appropriate explication of what in fact everyone, no matter what his behavior, seems prepared to admit theoretically, the equivocality of statistical conclusions. In any event, the conclusion to which we are led is that even when all complications are stripped away, there is a residual equivocality in appraising a null hypothesis which arises from the poorly defined nature of the alternatives to it, and that the expression of this equivocality is the prior distribution. Whether this equivocality is a major problem in any particular instance depends on the data, but theoretically it is always present. It cannot be banished by assuming a particular prior and computing a particular likelihood ratio. Prior probabilities exist, not so much to have numerical values assigned to them, as to distinguish between coherent and incoherent ways of appraising a problem.

Table 4

Bayes Decision Rule
    If f2(x)/f1(x) < A, choose D1: H1 is true
    If f2(x)/f1(x) ≥ B, choose D2: H2 is true
    If A ≤ f2(x)/f1(x) < B, choose D3: suspend judgment
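The "elementary algebra" behind Table 4, and the claim that the stopping rule drops out of the likelihood ratio, can be checked numerically. The Python sketch below makes several simplifying assumptions that are not in the source: a single treated group with x cures among n patients, a known cure probability P1 under H1 and P2 = P1 + θ under H2, and hypothetical utilities. The cut-offs A and B are obtained by comparing the expected utility of D3 with that of D1 and of D2; the derivation assumes Ui2 for D3 exceeds that for D1, Ui2 for D2 exceeds that for D3, and A < B.

```python
from math import comb, isclose


def lr_fixed_n(x, n, p1, p2):
    """f2(x)/f1(x) when the number of patients n is fixed in advance."""
    f1 = comb(n, x) * p1**x * (1 - p1) ** (n - x)
    f2 = comb(n, x) * p2**x * (1 - p2) ** (n - x)
    return f2 / f1


def lr_stop_at_x_cures(x, n, p1, p2):
    """Same ratio when the trial instead stops at the x-th cure (x >= 1)."""
    f1 = comb(n - 1, x - 1) * p1**x * (1 - p1) ** (n - x)
    f2 = comb(n - 1, x - 1) * p2**x * (1 - p2) ** (n - x)
    return f2 / f1


def thresholds(g1, g2, U):
    """Cut-offs A and B; U[i][j] is the utility of D(i+1) when H(j+1) is true."""
    A = (U[0][0] - U[2][0]) * g1 / ((U[2][1] - U[0][1]) * g2)  # D1 versus D3
    B = (U[2][0] - U[1][0]) * g1 / ((U[1][1] - U[2][1]) * g2)  # D2 versus D3
    return A, B


def bayes_decision(lr, A, B):
    """Decision rule in the form of Table 4."""
    if lr < A:
        return "D1"
    if lr >= B:
        return "D2"
    return "D3"


def max_expected_utility(f1, f2, g1, g2, U):
    """Direct maximization of k(x)[Ui1 f1(x) g1 + Ui2 f2(x) g2] over i."""
    k = 1.0 / (f1 * g1 + f2 * g2)
    eu = [k * (U[i][0] * f1 * g1 + U[i][1] * f2 * g2) for i in range(3)]
    return "D" + str(eu.index(max(eu)) + 1)


# Hypothetical inputs, for illustration only.
U = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]  # rows D1, D2, D3; columns H1, H2
G1, G2 = 0.5, 0.5                         # prior probabilities g1, g2
P1, P2 = 0.5, 0.7                         # cure probability under H1 and H2

if __name__ == "__main__":
    A, B = thresholds(G1, G2, U)
    for x in (10, 12, 14):
        lr = lr_fixed_n(x, 20, P1, P2)
        print(x, round(lr, 3), bayes_decision(lr, A, B))
```

In this sketch the binomial coefficients cancel in both sampling schemes, so the two likelihood-ratio functions agree for the same x and n even though a significance test would treat the two designs differently.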
RANDOMIZATION

In the area of randomization, it seems to be the Bayesians' turn to be confused. Just as many non-Bayesians accept the likelihood principle, but do not know exactly what to do with it, here most Bayesians are convinced that randomization is an important and central feature of experimental design, but they can find no role for it in Bayesian theory. What is needed for Bayesian analysis is that the prior distribution for the outcome of the allocation of patients between treatment and control be exchangeable; i.e., that the prior probability would be unchanged if the allocations to treatment and control were reversed. If we have baseline information about the patients, it is unlikely that a given random allocation will lead to an exchangeable prior; only if the distribution of the baseline variables were the same in each group would that be the case. If we do not have, or choose to ignore, the baseline information, the results of a random allocation will, of course, be exchangeable. But in that case the results of any arbitrary allocation will also be exchangeable and that outcome seems to satisfy no one.

The classical statistician has no difficulty explaining the role of randomization. Its primary role is not so much to equalize baseline variables (we can ordinarily improve considerably on a random allocation in this regard) but rather to assure that the outcome random variable, x, will have a distribution that is unbiased and for which the sample can provide an appropriate standard error. Cornfield, a true Bayesian, cannot accept this justification. Here, at least, I must part company with him.
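The classical justification can be illustrated with a small simulation. The Python sketch below is only a toy model under assumptions that are not in the source: each patient has a normally distributed baseline prognosis, the outcome is baseline plus a fixed treatment effect plus noise, and the non-random alternative is a deliberately extreme hypothetical rule that treats the patients with the best baseline. Under random allocation the difference in means averages out near the true effect and the usual standard error tracks the actual spread of the estimates; under the arbitrary rule the estimate is biased.

```python
import math
import random
import statistics


def one_trial(rng, n=200, effect=1.0, randomized=True):
    """Simulate one two-arm trial; return (difference in means, its estimated SE)."""
    baseline = [rng.gauss(0.0, 1.0) for _ in range(n)]
    order = list(range(n))
    if randomized:
        rng.shuffle(order)  # random allocation
    else:
        # Hypothetical arbitrary rule: best-prognosis half receives treatment.
        order.sort(key=lambda i: baseline[i], reverse=True)
    treated = set(order[: n // 2])

    t, c = [], []
    for i in range(n):
        noise = rng.gauss(0.0, 1.0)
        if i in treated:
            t.append(baseline[i] + effect + noise)
        else:
            c.append(baseline[i] + noise)

    diff = statistics.mean(t) - statistics.mean(c)
    se = math.sqrt(statistics.variance(t) / len(t) + statistics.variance(c) / len(c))
    return diff, se


def summarize(randomized, reps=500, seed=0):
    """Mean estimate, actual SD of estimates, and mean estimated SE over many trials."""
    rng = random.Random(seed)
    results = [one_trial(rng, randomized=randomized) for _ in range(reps)]
    diffs = [d for d, _ in results]
    ses = [s for _, s in results]
    return statistics.mean(diffs), statistics.stdev(diffs), statistics.mean(ses)


if __name__ == "__main__":
    print("randomized:", summarize(True))
    print("arbitrary: ", summarize(False))
```

The simulation speaks only to the frequentist properties named in the text (unbiasedness and a valid standard error); it does not address the Bayesian exchangeability argument.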
CONCLUSION

Let me close by citing Cornfield's conclusion--a statement that I can endorse without reservation [1]:

    Despite the ambiguities in design, in decision making and in conclusion reaching, it is undeniable that the clinical trial has constituted an important contribution to medicine. From the 1954 field trial on polio vaccine to this month's report on photocoagulation in diabetic retinopathy, questions have been crisply posed and definitively answered. It would be presumptuous to consider here why this is so, since it surely can be explained as simply a further triumph of experimental method as applied to clinical medicine. As such, it would not have surprised even Claude Bernard, and only the statistical participation would have puzzled him.
    A clinical trial starts when interest shifts from deducing the consequences of therapy to observing them in a controlled setting. The steps from this initial conception to the completion of the first draft protocol, involving, as they do, defining the variables, the measuring instruments, the patient population, and considering many other detailed and substantively oriented matters, would astonish anyone accustomed to think of the outcome of a trial as simply the x of Section 3. It is the function of everyone engaged in the trial to assure that the x eventually observed bears a close correspondence to the x originally conceived. It is the special function of the statistician to see that this x can be and is reduced to intelligible and interpretable form. The strength of the clinical trial is a consequence of the willingness of all concerned to see that an appropriate x is conceived, observed, properly reduced and soberly interpreted. Many skills are involved in this, but for the statistician, a broad awareness of the limitations as well as the strengths of his methodology, is not the least of them.
REFERENCES

1. Cornfield J: Recent methodological contributions to clinical trials. Am J Epidem 104(no. 4):408-421, 1976.