Computer Programs in Biomedicine 17 (1983) 89-94
89
Elsevier
CPB 00615
Determination of the hyperparameters of a prior probability model in survival analysis R o b e r t Birch a n d A l f r e d A. Bartolucci Cancer Center and Department of Biostatistics and Biomathematics, University of Alabama in Birmingham, Birmingham, AL 35294, USA
This paper outlines the mathematical theory required for eliciting the hyperparameters of a subjective conjugate distribution for the exponential survival model with censoring. The technique involves the quantification of expert knowledge based on determination by the expert of expected fractiles of a survival distribution in a particular clinical trial setting. Once the prior predictive distribution is determined and the fractiles elicited one can proceed, using iterative techniques, to solve for the hyperparameters. The restrictions and constraints of the hyperparameters as well as the fractiles are studied. The theory is then applied in a clinical trial setting. Bayesian inference
Conjugate prior distribution
Exponential distribution
1. I N T R O D U C T I O N Here we describe a method and present a computer algorithm for elicitation of the hyperparameters of a g a m m a distribution conjugate to the exponential survival model. In a clinical trial setting the clinical investigator (referred to as the expert) often has opinions concerning the survival prognosis of patients on a particular treatment modality. These opinions can be characterized by the predictive distribution of the time to failure conditional on the hyperparameters, that is the distribution after averaging out the sampling parameters according to their prior distribution. One can simply elicit fractiles of the predictive distribution from the expert. Kadane [1] investigated the assessment of the parameters of a matrix t distribution by a similar elicitation procedure. As in their case we allow for opinion modification as the procedure progresses. Other work dealing with the elicitation problem has been done by (Dickey [21). 0010-468X/83/$03.00 © 1983 Elsevier Science Publishers B.V.
Survival analysis
In section 2 we discuss the prior distribution as well as the exponential sampling model. The relation between fractiles of the prior predictive distribution is shown and in section 3 an interactive method for choosing particular values of the fractiles is outlined. A numerical example of the method is presented in section 4.
2. T H E E X P O N E N T I A L S U R V I V A L A N D GAMMA PRIOR PROBABILITY MODELS Suppose the survival time of a population, t, has an exponential distribution and further suppose: t - exp(X)
X-F(a,b) where: exp(X)i
(1)
= the exponential distribution with parameter h; and
90
p.d.f, p(tlX) = Xe -at 0 < t; and F(a,b) = the g a m m a prior distribution for X, with parameters a and b; and
expression for t2:
t,
/'2
p.d.f.p(Xla,b)
=
b~ M_;e_hX F(a)
( k'l) '/" -( k~k~) '/~ The prior predictive distribution for t is p ( t ) given by the following expression:
ab a ( b + t ) ~+'
p(t)= fp(tlX)p(X)dX-
(2)
The fractiles of this distribution are used to determine the hyperparameters of the prior distribution. Suppose a medical expert first assesses the kith fractile of the distribution to be tl, then:
- b~ i
ab~ Lt' (b + t)~+, dt
(5)
"t 1
( k ' 2 ) ' / ~ - ( k~k'2) 1/" The behavior of this equation is illustrated in fig. 1 for the values of k~ = 0.25, k~ = 0.75, t I = 10. This behavior however is the same for all values. It can be seen that as a increases, t 2 decreases and approaches an asymptote (say L]). Defining u = 1 / a we have the following:
fl(u)= (k~) ~-(k~k2) "t' Then L 1 = l i m f l ( u ) can be determined using L '
(b +-t)"
hospitals rule: =1
and
k~ = 1 - k I -
b
a
(b+tl) ~ h
L 1 = limfl(u )
-k 1
U~o
( k',)" log ( k;)-( k k; ) u log
a
(3)
(b+tl) ~
v
!
(kil"log(k2l-(k,kz)
v
u
l o g ( k ; k ' 2 ) t ' ....
Rearranging eq. (3) we obtain:
b=
(k~)'/o 1 - ( k ; ) a/~ "tt
(4)
Suppose it is desired to assess a second fixed f r a c t i l e , k2, and that the corresponding value for the survival distribution will b e / 2 , then from eq. (3) and (4) to solve simultaneously for a and b the following must hold:
t2 SO- _ ~ .
25-
b ' "'tkzl'/~- b + t 2
- ( (k~)'/ot'
75-~!
+
;
I
28
'
I
• 48
;
I
68
;
I
80
;
I I00
0
Rearranging this equation leads to the following
Fig. 1. The lower limit L I (dotted line) of l 2 a s a function of the hyperparameter, a, for fixed values of k ~ , k ' 2 and t~ of eq. (6).
91
log = log
( k~/k;k'2) (k'2/k;k'2)
log log
3. T H E E L I C I T A T I O N P R O G R A M
t,
(k~) (k;)/'
(6)
Substituting the values for k~, k~ and t~ into (eq. 6) leads to the value L 1 = 48.1884. This value is shown as the dotted line in fig. 1. The implication of this limit is that if an exp o n e n t i a l - g a m m a - t y p e model is to hold once values for t I and k '1 have been specified the range of values for any other percentile is limited. If an expert wishes to specify a value which is not within these limits he must use a different model for his options. Once values for k 'I, k~, t I and t 2 have been determined the values of a and b may be determined using a sample iterative technique, which can be incorporated into the interactive elicitation program described below.
The flow diagram for the elicitation program is shown in fig. 2. Two answers are sought by the program: (1) Are the expert's options expressible in such a format? (2) If such is the case, which model provides the best approximation to these opinions? The expert is asked to supply two fractiles of the survival distribution. The program tests if any prior predictive model of the exponential-gamma-type could fit these values. If so the median for this distribution is calculated and the expert is asked to decide if this is coincident with his opinions. If he agrees then the elicitation is finished. Otherwise the expert specifies his own median M e. The models based on (i) M e and (k'l,q) and (ii) M e and (k'ztz) are calculated and the expert is asked if either of these models is coincident with his opinion. If he chooses either of these models the procedure is finished, otherwise the expert is given the
START
Yes
SPECIFY kl' and t l
SPECIFY
I
k 2' and t 2
J
DO YOU WISH ] TO DECREASE ~
[
t,
Yes
t 2 • IL
l
l
DOYOUWlSH [ I I No _IDO (k2')l'TO INCREASE i-No
t2
Yes
/
J
l
No
I
I DOYOUwANT I TO TRYA
t Yes [ I
[DIFFERENT MODEL I
I
CALCULATE VALUES OF a and b REPORT MEDIAN
V(300TO PROGRA~ LFOR NEW MODEL J IS MEDIAN } _ _ ACCEPTABLE
Yes
oYouwT
_ _ [ D O YOU WANT
No
TO TRY A DIFFERENT M O D E L
G O TO P R O G R A M FOR N E W M O D E L
l
TO TRY AGAIN
No
SPECIFY DESIRED I MEDIAN DISPLAY I M O D E L S FOR MEDIANI and ( k l ' , tl ) and MEDAN AND (k2',t 2
No
Fig. 2. Flow diagram for elicitation program.
IS M O D E L I--Y e s EITHER ACCEPTABLE
Yes
92
i/
I .O0 "~
p 0.75-
.
.
.
.
.
.
.
.
---
R
0 B A B 0.58"r
l I
i i
04-
T
Y 8.2502-
8.00
,,,,i,,;,i,~,,i,,,,i,,,,i t~
28 3;~ D I S E A S E FREE SURVIVAL
4~
58
8.0-
MONTHS
Fig. 3. Disease free survival for an adjuvant melanoma study.
choice of trying again with new values; i.e., trying a different model or exiting from the program. If no gamma-type prior fits the elicited points (k'l,t,) and (k'2,t2) the expert is given the choice of increasing t 2 or decreasing t 1. If he does not wish to take either of these options he may either try a different type of model or exit the program.
4. N U M E R I C A L EXAMPLE Figure 3 shows a plot of the disease free survival (DFS) for an adjuvant melanoma study. It can be seen from the plot that the 25th percentile -(the
FOR W H I C H F R A C T I L E DO YOU WISH TO SPECIFY S U R V I V A L THE S M A L L E S T TO BE S P E C I F I E D ?.25 WNIAT V A L U E DO YOU W I S H TO SPECIFY FOR S U R V I V A L ?5 k ~ I C R O T H E R F R A C T I L E DO YOU WISH TO SPECIFY ?.7 W H A T V A L U E DO YOU W I S H TO SPECIFY FOR S U R V I V A L
?3O THE M E D I A N OF THE D I S T R I B U T I O N IS 14.067 IS THIS V A L U E A C C E P T A B L E ? IF SO T Y P E i ELSE TYPE 0 ?i THIS L I N E IS THE CURVE BASED ON V A L U E S AT .250 AND THE V A L U E AT .700 W H E N G R A P H IS D I S P L A Y E D IT W I L L P A U S E - TYPE CR TO CONTINUE. TYPE i TO D I S P L A Y G R A P H ?i
DOES THE C U R V E S A T I S F A C T O R I L Y R E P R E S E N T Y O U R O P I N I O N IF SO TYPE 1 ELSE TYPE 0 ?i Y O U R P A R A M E T E R S ARE A = 1.385 B= 21.662 DO Y O U W A N T TO TRY SOME D I F F E R E N T V A L U E S IF SO T Y P E 1 ELSE TYPE 0 ?0 *STOP*
Fig. 4. Program generated questions for assessing the values to be used for the investigator's prior distribution.
0
50
100
150
200
Fig. 5. The predictive survival distribution based on the prior distribution with hyperparameters a = 1.385 and b = 21.662.
point at which 75% of the population is surviving) occurs at approximately t = 3 and the 70th percentile (the point at which 30% of the population is still surviving) occurs at approximately t = 24. Suppose now that on the basis of this study an investigator wishes to determine a gamma prior distribution for the parameter ~ of an exponential distribution for some new study. Let us suppose that using the new treatment he expects the DFS FOR W H I C H F R A C T I L E DO YOU W I S H TO SPECIFY S U R V I V A L THE S M A L L E S T TO BE S P E C I F I E D
?.25 WIiAT VALUE DO YOU WISH TO SPECIFY FOR SURVIVAL ?5 WHICH OTHER FRAETILE DO YOU WISH TO SPECIFY ?.7 WHAT VALUE DO YOU WISH TO SPECIFY FOR SURVIVAL ?30 THE M E D I A N OF THE D I S T R I B U T I O N IS 14.067 IS THIS V A L U E A C C E P T A B L E ? IF SO TYPE I ELSE TYPE 0 ?0 W H A T V A L U E DO YOU W A N T FOR THE M E D I A N 716 IN THIS GRAPH THE D O T T E D LINE IS THE CURVE B A S E D ON THE V A L U E AT .250 AND THE M E D I A N THE SOLID ON THE V A L U E AT .700 AND THE M E D I A N W H E N G ~ P H IS D I S P L A Y E D IT W I L L PAUSE - TYPE CR TO CONTINUE. "~'PE 1 TO DISPLAY GRAPH ?I
DOES E I T H E R C U R V E S A T I S F A C T O R I L Y R E P R E S E N T Y O U R OPINIONS. IF THE D O T T E D LINE TYPE 2, THE S O L I D LINE TYPE i, ELSE TYPE 0 ?2 YOUR P A R A M E T E R S ARE A = 3.491 R= 72.850 DO YOU W A N T TO TRY SOME D I F F E R E N T VALUES IF SO TYPE 1 ELSE TYPE 0 ?O *STOP* l
Fig. 6. Program generated questions to allow the investigator to construct and choose the predictive survival curves shown in fig. 7.
93 5. S U M M A R Y
\
0
0
i {3
i
t
i
i
l
Se
]
~
t
100
I
SO
i
L
i
200
Fig. 7. The dotted line is the predictive survival distribution based on the hyperparameters a = 3.491 and b = 72.85. The solid line is the predictive survival distribution based on the hyperparameters a = 1.385 and b = 21.662. to remain approximately exponential but that it should be improved by approximately 25% along the whole range of the curve. Thus at the 25th percentile he would expect the D F S to be approximately 4 and at the 70th percentile to be of the order of 30. Fig. 4 shows the computer program assessing the values to be used for the investigator's prior distribution. The program calculates a median D F S time of 14.067 months which the investigator finds satisfactory. The parameters of his distribution are given as a = 1.385 and b = 21.662. The predictive survival distribution based on this prior distribution is illustrated in fig. 5. Let us suppose now that on inspecting the median determined by the values given at the 25th and 70th percentile the investigator feels the value to be too low. As shown in fig. 6, the investigator is now given a choice between the curves based on the value at the 25th percentile and the median (i.e., the dotted line in fig. 7) and the curve based on the value at the median and the 70th percentile (i.e., the solid line of fig. 7). It can be seen that the effect of choosing the 25th percentile and the median is to cause the tail of the distribution to be higher and choosing the 70th percentile and the median causes the tail of the distribution to drop m u c h faster. In this case the investigator chose the dotted line. The parameters of his prior distribution will be a = 3.491 and b = 72.85.
This paper has demonstrated a tool which can be utilized for the construction of a prior probability model for the scale parameter of an exponential survival model. A clinical investigator, referred to as the expert, now has the ability to guide the statistician in his search of a realistic prior probability distribution. Bartolucci and Dickey [3] as well as other authors have presented a technique for a Bayesian analysis of exponentially modeled survival data. However, a methodology for establishing the creditibility of their choice of gammatype priors through elicitation was lacking. This manuscript provides both the theoretical motivation and c o m p u t a t i o n procedure for establishing such a prior distribution. This procedure is certainly not confined to a g a m m a prior distribution and sampling model. Except for variations in the predictive density depending on the underlying prior and sampling distributional assumptions, the methodology is unchanged. Thus investigators wishing to analyze their data 'Bayesianly" now have a procedure for doing just that. Future research will in fact be directed'to other types of survival models.
ACKNOWLEDGEMENT This research supported in part by N I H - N C I research grant CA-24456.
REFERENCES
[1] J.B. Kadane, J.M. Dickey, R.L. Winkler, W.S. Smith and S.C. Peters, Interactive elicitation of opinion for a normal linear model, J. Am. Star. Assoc. 75 (1980) 845-854. [2] J.M. Dickey, Beliefs About Beliefs; a theory for stochastic assessments of subjective probabilities, invited paper given at Int. Meet. Bayesian Stat. (Vatenica, 1979). [3] A.A. Bartolucci and J.M. Dickey, Comparative Bayesian and traditional inference for gamma-modeled survival data, Biometrics 33 (1977) 343-354.