Contemporary Clinical Trials 29 (2008) 751–755
Contemporary Clinical Trials, journal homepage: www.elsevier.com/locate/conclintrial
Bayesian interim analysis in clinical trials
Xiao Zhang ⁎, Gary Cutter
Department of Biostatistics, School of Public Health, University of Alabama at Birmingham, Birmingham, AL 35294, United States
Article info
Article history: Received 14 December 2007; Accepted 30 May 2008
Keywords: Bayesian; Enthusiastic prior; Monitor; Multivariate probit model; Non-informative prior; Skeptical prior
Abstract
We propose a Bayesian approach to monitoring clinical trials with clustered binary outcomes using multivariate probit models. Our monitoring is based on the posterior probability that the reduction in incidence rate under a new treatment, compared with the standard treatment, is greater than a target improvement, calculated under different prior scenarios for the treatment effect. We develop a Bayesian sampling algorithm for posterior inference that allows missing values in the outcomes. We illustrate our method using a published trial of early inhaled nitric oxide therapy in premature infants. © 2008 Elsevier Inc. All rights reserved.
1. Introduction
Interim analysis is an important issue in clinical trials from both the ethical and the cost-effectiveness points of view. Classical monitoring approaches are based on the Neyman–Pearson theorem and use the observed p-values obtained from each ‘look’ at the accumulating data as a basis for stopping. The total number of ‘looks’ and a stringent significance level for each ‘look’ are pre-specified so that the overall significance level for the trial is kept at a certain value, e.g. 5%. See Pocock [1], O’Brien and Fleming [2], Pocock [3], and Lan and DeMets [4]. However, classical monitoring in clinical trials raises several concerns. Cornfield [5] showed that conclusions drawn from classical monitoring depend on the stopping rule and gave examples of absurd situations produced by this dependence. Berry [6] pointed out that more ‘looks’ in classical approaches can produce larger average sample sizes, while the minimum average sample size can occur with one ‘look’, i.e. the fixed sample size case. In addition, the analysis cannot be significant with type I error spent by the
⁎ Corresponding author. Department of Biostatistics, University of Alabama at Birmingham, Ryals Public Health Bldg. 414A, 1665 University Blvd., Birmingham, AL 35294, United States. Tel.: +1 205 975 9239; fax: +1 205 975 2540. E-mail address: [email protected] (X. Zhang).
1551-7144/$ – see front matter © 2008 Elsevier Inc. All rights reserved. doi:10.1016/j.cct.2008.05.007
time of the final analysis when the stopping rule is not followed, for example when the trial is extended beyond the planned sample size. Freedman et al. [7] considered this issue and, furthermore, dwelt on the inconsistency between the estimate of the treatment effect and the p-values in a sequential design, such as ignoring the adjustment to the p-values and treating the trial as a fixed sample size design at the final analysis. In contrast with the classical approaches based on the Neyman–Pearson theorem, Bayesian monitoring approaches employ the likelihood principle and Bayes’ theorem, which eliminate the above drawbacks inherent in classical monitoring: inference based on the posterior distribution does not depend on the stopping rule; more ‘looks’ lead to smaller average sample sizes, with infinitely many ‘looks’ giving the minimum average sample size [6]; the stringent role of type I error does not arise when the trial is extended beyond the planned sample size, since the posterior distribution can be updated with the new data and inference drawn from the updated posterior distribution; and the estimate obtained from the posterior distribution is consistent with the trial design, e.g. a conservative prior leads to a conservative estimated effect and a conservative stopping rule. A detailed elaboration can be found in Freedman et al. [7]. While the final analysis of a trial using traditional approaches depends on the pre-specified significance levels and the number of interim analyses performed, the final analysis using Bayesian approaches depends on the prior
X. Zhang, G. Cutter / Contemporary Clinical Trials 29 (2008) 751–755
information, which is chosen based on the available resources or the perspective of the investigators. The use of prior information is another disputed issue. Classical approaches do not allow prior information to be used formally in the trial design and the final analysis, whereas Bayesian approaches can use prior information formally at all stages of the trial. Chaloner and Rhame [8] pointed out that ‘eliciting prior beliefs is also ethically important for documenting the nature of the uncertainty or equipoise’. Prior distributions elicited from clinicians or experts have been proposed and used by Freedman and Spiegelhalter [9], Spiegelhalter et al. [10], Parmar et al. [11], and Chaloner and Rhame [8]. Meanwhile, Freedman et al. [7], Spiegelhalter et al. [10], and Vail et al. [12] used skeptical and enthusiastic priors, which represent opposite extremes of prior opinion, when elicited prior information is hard to obtain. Over the last two decades, Bayesian approaches to monitoring clinical trials have been developed extensively. Kass and Greenhouse [13], Spiegelhalter et al. [14], Spiegelhalter et al. [10], Fayers et al. [15], Chen and Chaloner [16], and others based Bayesian monitoring on requiring that the posterior probability of the treatment effect exceeding a target improvement be greater than a pre-specified value. Establishing a range of equivalence for the treatment effect has been discussed in detail by Spiegelhalter et al. [10] and is widely used in Bayesian monitoring, e.g. Carlin et al. [17], Parmar et al. [11] and Fayers et al. [15]. Spiegelhalter et al. [14] and Dmitrienko and Wang [18] proposed predictive approaches to Bayesian monitoring. Robust Bayesian approaches to monitoring have also been considered by Berry [19], Berger [20], and Carlin and Sargent [21]. In this manuscript, we develop a Bayesian approach to monitoring clinical trials and illustrate it with an application to the early inhaled nitric oxide (NO) therapy study [22].
The primary outcome in this NO study is binary; we therefore use the multivariate probit model, which assumes that there is a latent normal vector underlying each binary vector. As randomization was within center, we consider a center effect and assume that the covariance matrix of the latent normal vector has a compound symmetry structure. Such a compound symmetry structure has also been used by Gajewski et al. [23] for clustered ordinal data in a hierarchical analysis of variance model on the latent normal variables. We develop a Markov chain Monte Carlo (MCMC) sampling algorithm for posterior inference that allows missing values in the outcomes. Since estimation of the treatment effect is our focus, we explore various priors for the treatment effect and let the priors of the other unknown parameters be non-informative. We monitor the trial based on the posterior probability that the reduction in incidence rate under the new treatment, compared with the standard treatment, is greater than the target improvement. The remainder of the paper is organized as follows. Section 2 describes the early inhaled nitric oxide therapy study. Section 3 presents our Bayesian model and sampling algorithm. The interim monitoring of the NO trial is presented in Section 4. We end with a summary and a discussion in Section 5.
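To make the latent-variable idea with a center effect concrete, the following is a minimal simulation sketch, not the authors' code. It assumes a non-negative within-center correlation, so that a compound symmetry structure can be generated from a shared center effect plus independent noise; the function name, the parameter values and the number of centers are all hypothetical.

```python
import numpy as np
from scipy.stats import norm


def simulate_clustered_probit(rng, n_centers, n_per_center, beta0, beta1, rho):
    """Simulate binary outcomes from a latent normal model with exchangeable errors.

    Assumes rho >= 0, so that corr(Z_ij, Z_ik) = rho can be produced by a
    shared center effect; each latent variable has unit variance.
    """
    # Within-center randomization to control (0) or treatment (1).
    treat = rng.integers(0, 2, size=(n_centers, n_per_center))
    center = rng.normal(size=(n_centers, 1))             # shared by a whole center
    noise = rng.normal(size=(n_centers, n_per_center))   # subject-level noise
    z = beta0 + beta1 * treat + np.sqrt(rho) * center + np.sqrt(1.0 - rho) * noise
    y = (z > 0).astype(int)                              # Y = 1 if Z > 0, else 0
    return treat, y


rng = np.random.default_rng(1)
# Hypothetical values chosen only for illustration.
treat, y = simulate_clustered_probit(rng, n_centers=400, n_per_center=50,
                                     beta0=0.25, beta1=-0.25, rho=0.2)
control_rate = y[treat == 0].mean()
treated_rate = y[treat == 1].mean()
print(f"control: simulated {control_rate:.3f}, probit value {norm.cdf(0.25):.3f}")
print(f"treated: simulated {treated_rate:.3f}, probit value {norm.cdf(0.0):.3f}")
```

Because each latent variable has unit variance in this construction, the marginal event probability is the standard normal distribution function evaluated at the linear predictor, whatever the value of the correlation; the simulated rates should therefore sit close to the probit values printed next to them.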
2. Early inhaled nitric oxide therapy study
Kinsella et al. [22] studied early inhaled nitric oxide (NO) therapy in premature newborns with respiratory failure in a multicenter randomized trial. This study involved 793 newborns and 16 centers. Newborns were randomly assigned to receive either inhaled nitric oxide (5 ppm) or placebo gas for 21 days or until extubation, with stratification according to birth weight (500 to 749 g, 750 to 999 g, or 1000 to 1250 g). The primary efficacy outcome was a composite of death or bronchopulmonary dysplasia at 36 weeks of postmenstrual age. Because the data from the complete study are available, we examine the monitoring of the data as if it had been done prospectively. We illustrate monitoring of this NO trial through three phases: Phase I, 20 months after the admission time, with 244 newborns; Phase II, 40 months after the admission time, with 590 newborns; and Phase III, the last monitoring, with all 793 newborns. Table 1 summarizes the data for these three phases. The following section presents the Bayesian model and sampling algorithm for clustered binary data.
3. Multivariate probit model and Bayesian sampling algorithm
In a data set containing binary outcomes, suppose we have $N$ centers, with center $i$ contributing $n_i$ measures, for $i = 1, \dots, N$. Let $Y_1, \dots, Y_N$ be multivariate binary outcome variables with $Y_i = (Y_{i1}, \dots, Y_{in_i})^T$ for $i = 1, \dots, N$, and let $X_{ij} = (X_{ij1}, \dots, X_{ijl})^T$ be an $l \times 1$ vector of observed covariates for each center $i$ and each measure $j = 1, \dots, n_i$. We assume the following model structure. Each $Y_{ij}$ is Bernoulli distributed with success probability $\pi_{ij}$, which is assumed to follow a probit model, i.e., $\pi_{ij} = \Phi(X_{ij}^T \beta)$, where $\Phi(\cdot)$ is the standard normal cumulative distribution function and $\beta$ is an $l \times 1$ vector of unknown regression parameters. Let $X_i = [X_{i1}, \dots, X_{in_i}]^T$ be the design matrix for the $i$th center.
The latent vector $Z_i = (Z_{i1}, \dots, Z_{in_i})^T$ is assumed to follow a multivariate normal distribution with mean $X_i\beta$ and covariance matrix $R_i$. The probit model further assumes that $Y_{ij} = 1$ if $Z_{ij} > 0$ and $Y_{ij} = 0$ if $Z_{ij} \le 0$. Following the discussion of model identification in Chib and Greenberg [24], the covariance matrix $R_i$ is taken to be a correlation matrix. For balanced data with repeated measures, $R_i$, for $i = 1, \dots, N$, can be assumed to be a common unstructured correlation matrix, e.g. Nandram and Chen [25], Liu [26], and Zhang et al. [27]. However, the unstructured assumption for the correlation matrix is not feasible for high
Table 1
Data information for three phases

                          Phase I       Phase II      Phase III
Group                     CO     NO     CO     NO     CO     NO
Events number             88     84     219    213    295    282
Non-event number          36     35     73     81     97     112
Missing values’ number    0      1      1      3      3      4

CO represents the control group and NO represents the treatment group.
dimensional unbalanced data. Gajewski et al. [23] assumed compound symmetry structures for the covariance matrices of the underlying latent variables for clustered ordinal measures. Herein, we assume that the correlation matrix $R_i$ has a compound symmetry structure with correlation $\rho$. Suppose the joint prior distribution for $\beta$ and $\rho$ is $P(\beta, \rho) = P(\beta)P(\rho)$, i.e. the prior of $\beta$ is independent of the prior of $\rho$. We assume $P(\beta) = N_l(\beta; b, C)$, a multivariate normal distribution with mean $b$ and covariance matrix $C$. The prior of $\rho$, $P(\rho)$, is discussed in Section 4. Letting $Y = (Y_1, \dots, Y_N)$ and $Z = (Z_1, \dots, Z_N)$, we then have
Table 2
Phase I

                                     Noninformative   Skeptical   Enthusiastic
Prior mean (β1)                      0.0              0.0         −0.497
Prior standard deviation (β1)        ∞                0.154       0.154
Posterior mean (β1)                  −0.002           −0.001      −0.254
Posterior standard deviation (β1)    0.154            0.107       0.112
Posterior mean (ρ)                   0.244            0.254       0.221
Posterior standard deviation (ρ)     0.114            0.118       0.102
Target improvement at 0.0            0.506            0.505       0.989
Target improvement at 0.05           0.173            0.092       0.833
Target improvement at 0.10           0.033            0.005       0.377
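$$
\begin{aligned}
P(\beta, \rho, Z \mid Y) &\propto P(\beta)\, P(\rho) \prod_{i=1}^{N} \left[ I_i\, \phi(Z_i; X_i\beta, R_i) \right] \\
&= P(\beta)\, P(\rho) \prod_{i=1}^{N} I_i\, (2\pi)^{-n_i/2}\, |R_i|^{-1/2} \exp\left\{ -\frac{1}{2} (Z_i - X_i\beta)^T R_i^{-1} (Z_i - X_i\beta) \right\},
\end{aligned}
\tag{1}
$$

where $\phi(\cdot\,; \mu, \Sigma)$ is the normal density function with mean $\mu$ and covariance matrix $\Sigma$, and $I_i$ indicates compatibility of the latent vector $Z_i$ with the binary vector $Y_i$ through the expression

$$
\begin{aligned}
I_i ={}& 1(Z_{i1} > 0, Z_{i2} > 0, \dots, Z_{in_i} > 0)\, 1(Y_{i1} = 1, Y_{i2} = 1, \dots, Y_{in_i} = 1) \\
&+ 1(Z_{i1} \le 0, Z_{i2} > 0, \dots, Z_{in_i} > 0)\, 1(Y_{i1} = 0, Y_{i2} = 1, \dots, Y_{in_i} = 1) \\
&+ \dots + 1(Z_{i1} \le 0, Z_{i2} \le 0, \dots, Z_{in_i} \le 0)\, 1(Y_{i1} = 0, Y_{i2} = 0, \dots, Y_{in_i} = 0),
\end{aligned}
$$

where $1(\cdot)$ is the indicator function. Based on Eq. (1), we can obtain the full conditional distributions of the regression parameters $\beta$, the correlation $\rho$ and each latent component $Z_{ij}$ as follows.

Since

$$
P(\beta \mid \rho, Z, Y) \propto P(\beta) \prod_{i=1}^{N} \exp\left\{ -\frac{1}{2} (Z_i - X_i\beta)^T R_i^{-1} (Z_i - X_i\beta) \right\},
$$

standard Bayesian linear model results show that $\beta \mid \rho, Z, Y$ has a multivariate normal distribution, namely

$$
\beta \mid \rho, Z, Y \sim N_l(\hat{\beta}, V_\beta),
$$

where

$$
V_\beta = \left( \sum_{i=1}^{N} X_i^T R_i^{-1} X_i + C^{-1} \right)^{-1}
\quad \text{and} \quad
\hat{\beta} = V_\beta \left( \sum_{i=1}^{N} X_i^T R_i^{-1} Z_i + C^{-1} b \right).
$$

The full conditional $P(\rho \mid \beta, Z, Y)$ is proportional to

$$
P(\rho) \prod_{i=1}^{N} |R_i|^{-1/2} \exp\left\{ -\frac{1}{2} (Z_i - X_i\beta)^T R_i^{-1} (Z_i - X_i\beta) \right\},
$$

which does not belong to any standard family of distributions. We use the Metropolis–Hastings algorithm to sample $\rho$, with a normal proposal distribution.

It can be seen that $Z_{ij} \mid \beta, R_i, Z_{ik}, k \ne j, Y_{ij}$ has a normal distribution truncated at the left or right by zero:

$$
P(Z_{ij} \mid \beta, R_i, Z_{ik}, k \ne j, Y_{ij}) \propto I_{ij}\, P(Z_{ij} \mid \beta, R_i, Z_{ik}, k \ne j) = I_{ij}\, \phi(Z_{ij}; \tilde{\mu}_{ij}, \tilde{R}_{ij}),
$$

where $I_{ij} = 1(Y_{ij} = 1, Z_{ij} > 0) + 1(Y_{ij} = 0, Z_{ij} \le 0)$ indicates compatibility of $Y_{ij}$ and $Z_{ij}$, and $\tilde{\mu}_{ij}$ and $\tilde{R}_{ij}$ are the conditional mean and variance of $Z_{ij}$ given $Z_{ik}$, $k \ne j$:

$$
\begin{aligned}
\tilde{\mu}_{ij} &= X_{ij}^T \beta + R_{j,-j} R_{-j,-j}^{-1} \left( Z_{i,-j} - X_{i,-j}\beta \right), \\
\tilde{R}_{ij} &= R_{j,j} - R_{j,-j} R_{-j,-j}^{-1} R_{-j,j}.
\end{aligned}
$$

In these expressions, $R_{j,-j}$ refers to the $j$th row of $R_i$ without its $j$th column element, $R_{-j,-j}$ is the matrix $R_i$ without its $j$th row and $j$th column, $X_{i,-j}$ is the matrix $X_i$ without its $j$th row, $Z_{i,-j}$ is the vector $Z_i$ without its $j$th element, and $R_{-j,j}$ is the transpose of $R_{j,-j}$.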
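One way the three updates above could be coded is sketched below. This is an illustrative sketch rather than the authors' implementation: the function names, the N(0, I) prior on the regression parameters, the uniform prior on the correlation over the range where the compound symmetry matrix is positive definite, the random-walk step size and the toy data are all assumptions. The log-determinant term in `log_cond_rho` comes from the $|R_i|^{-1/2}$ factor in Eq. (1). Missing outcomes are coded as NaN and their latent components are drawn without truncation.

```python
import numpy as np
from scipy.stats import truncnorm


def cs_corr(n, rho):
    """Compound symmetry correlation matrix: ones on the diagonal, rho elsewhere."""
    return (1.0 - rho) * np.eye(n) + rho * np.ones((n, n))


def draw_beta(rng, X_list, Z_list, rho, b, C_inv):
    """Gibbs step: beta | rho, Z is normal with covariance V and mean V @ rhs."""
    precision, rhs = C_inv.copy(), C_inv @ b
    for X, Z in zip(X_list, Z_list):
        R_inv = np.linalg.inv(cs_corr(len(Z), rho))
        precision += X.T @ R_inv @ X
        rhs += X.T @ R_inv @ Z
    V = np.linalg.inv(precision)
    return rng.multivariate_normal(V @ rhs, (V + V.T) / 2.0)  # symmetrize round-off


def draw_z(rng, X, Z, Y, beta, rho):
    """Gibbs step: update each Z_ij from its (possibly truncated) normal conditional."""
    n = len(Z)
    R, mu = cs_corr(n, rho), X @ beta
    for j in range(n):
        rest = np.arange(n) != j
        w = np.linalg.solve(R[np.ix_(rest, rest)], R[rest, j])
        m = mu[j] + w @ (Z[rest] - mu[rest])     # conditional mean
        s = np.sqrt(R[j, j] - w @ R[rest, j])    # conditional standard deviation
        if np.isnan(Y[j]):                       # missing outcome: no truncation
            Z[j] = rng.normal(m, s)
        elif Y[j] == 1:                          # constrain Z_ij > 0
            Z[j] = truncnorm.rvs(-m / s, np.inf, loc=m, scale=s, random_state=rng)
        else:                                    # constrain Z_ij <= 0
            Z[j] = truncnorm.rvs(-np.inf, -m / s, loc=m, scale=s, random_state=rng)
    return Z


def log_cond_rho(rho, X_list, Z_list, beta):
    """Log full conditional of rho, up to a constant, under a flat prior."""
    total = 0.0
    for X, Z in zip(X_list, Z_list):
        R, resid = cs_corr(len(Z), rho), Z - X @ beta
        _, logdet = np.linalg.slogdet(R)
        total += -0.5 * logdet - 0.5 * resid @ np.linalg.solve(R, resid)
    return total


def draw_rho(rng, rho, X_list, Z_list, beta, step=0.05):
    """Random-walk Metropolis-Hastings step with a normal proposal."""
    n_max = max(len(Z) for Z in Z_list)          # assumes some cluster has size >= 2
    prop = rho + step * rng.normal()
    if not (-1.0 / (n_max - 1) < prop < 1.0):    # outside the positive definite range
        return rho
    log_ratio = (log_cond_rho(prop, X_list, Z_list, beta)
                 - log_cond_rho(rho, X_list, Z_list, beta))
    return prop if np.log(rng.uniform()) < log_ratio else rho


rng = np.random.default_rng(0)
# Two hypothetical centers; columns of X are intercept and treatment indicator.
X_list = [np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 0.0], [1.0, 1.0]]),
          np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 1.0]])]
Y_list = [np.array([1.0, 0.0, 0.0, np.nan]),     # NaN marks a missing outcome
          np.array([1.0, 1.0, 0.0])]
Z_list = [np.where(Y == 1, 0.5, -0.5) for Y in Y_list]   # start compatible with Y
b, C_inv = np.zeros(2), np.eye(2)                # hypothetical N(0, I) prior on beta
beta, rho = np.zeros(2), 0.2
for _ in range(200):                             # one pass = one cycle of the sampler
    beta = draw_beta(rng, X_list, Z_list, rho, b, C_inv)
    Z_list = [draw_z(rng, X, Z, Y, beta, rho) for X, Z, Y in zip(X_list, Z_list, Y_list)]
    rho = draw_rho(rng, rho, X_list, Z_list, beta)
```

The sketch solves a small linear system for every latent component, which keeps the code close to the formulas; for centers with many subjects, a closed-form inverse of the compound symmetry matrix would be a natural optimization.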
Notice that if $Y_{ij}$ is missing, then the conditional distribution of $Z_{ij}$ is a normal distribution without truncation. In summary, one cycle of the algorithm consists of Gibbs steps to sample $\beta$ and each component of the latent vector $Z_i$, and a Metropolis–Hastings step for $\rho$. The following section uses this sampling algorithm to monitor the NO trial in three phases.
4. Results
We assume that the binary primary outcomes follow the multivariate probit model and that the design matrix comprises an intercept term ($\beta_0$) and a treatment effect term ($\beta_1$). We consider three priors for the treatment effect $\beta_1$, namely a non-informative prior, a skeptical prior and an enthusiastic prior, and assume non-informative priors for $\beta_0$ and the correlation $\rho$. The non-informative prior represents minimal prior information. It can be interpreted as giving equal weight, 50% each, to the null hypothesis and the alternative hypothesis. The skeptical prior represents skepticism about the treatment difference, i.e. the alternative hypothesis, by assuming that $\delta$, the treatment effect, has mean 0 and that $\Pr(\delta > \delta_A)$ is less than a small value $\gamma$, say 0.05, where $\delta_A$ is the treatment effect under the alternative hypothesis. Loosely speaking, under the skeptical prior there is a 95% chance of accepting the null hypothesis and a 5% chance of rejecting it. In contrast, the enthusiastic prior represents high confidence in the alternative hypothesis, with mean $\delta_A$ and $\Pr(\delta < 0)$ less than the pre-specified small value $\gamma$. A detailed discussion of skeptical and enthusiastic priors can be found in Spiegelhalter et al. [10]. We determine the skeptical and enthusiastic priors for the treatment effect as follows. Based on Kinsella et al. [22], the incidence rate is taken to be 50% for the treatment group and 60% for the control group, the values used to guarantee a statistical power of 80% with a two-sided alpha of 0.05.
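The design rates just quoted, together with the tail-probability definition of the skeptical prior, can be turned into numbers on the probit scale. The following is a minimal sketch of that arithmetic. It assumes a normal prior on the treatment coefficient and takes a reduction in incidence to correspond to a negative coefficient; the variable names are illustrative.

```python
from scipy.stats import norm

p_control, p_treatment = 0.60, 0.50   # design incidence rates quoted in the text
gamma = 0.05                          # small prior tail probability

beta0 = norm.ppf(p_control)              # solves Phi(beta0) = 0.60
beta_A = norm.ppf(p_treatment) - beta0   # solves Phi(beta0 + beta_A) = 0.50

# Zero-mean normal prior with Pr(beta1 < beta_A) = gamma.
skeptical_sd = beta_A / norm.ppf(gamma)

print(f"beta0 = {beta0:.3f}, beta_A = {beta_A:.3f}, skeptical sd = {skeptical_sd:.3f}")
# prints: beta0 = 0.253, beta_A = -0.253, skeptical sd = 0.154
```

The standard deviation follows from standardizing: with mean 0, the prior puts probability `gamma` below `beta_A` exactly when `beta_A / sd` equals the `gamma` quantile of the standard normal distribution.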
Under the multivariate probit model, the 50% incidence rate for the treatment group corresponds to $\beta_0 + \beta_1 = 0$, i.e. $\Phi(\beta_0 + \beta_1) = 0.5$, and the 60% incidence rate for the control group corresponds to $\beta_0 \approx 0.253$, i.e. $\Phi(\beta_0) = 0.6$. Therefore, given $\beta_0$, values of $\beta_1$ less than $-0.253$ produce incidence rates below 50% for the treatment group. We thus choose $\beta_A$, the value of $\beta_1$ under the alternative hypothesis, to be $-0.253$, versus 0 under the null hypothesis. From the skeptical perspective, setting $\Pr(\beta_1 < \beta_A) = 0.05$ and assuming that $\beta_1$ follows a normal distribution with mean 0, we obtain a standard deviation for $\beta_1$ of about 0.154. To ensure $\Pr(\beta_1 < \beta_A) = 0.95$ under the enthusiastic prior, we
have a mean for $\beta_1$ of about $-0.497$, with the standard deviation being 0.154. We monitor the NO trial using the Bayesian algorithm presented in Section 3 and report our results in Tables 2, 3 and 4 under three prior scenarios (non-informative, skeptical and enthusiastic priors) for $\beta_1$. These three tables contain the prior information and the posterior inference for $\beta_1$, the posterior information for the correlation $\rho$, and the probabilities that the target improvement, defined as the reduction in incidence rate under the new treatment compared with the standard treatment, is greater than 0, 5% and 10%. From Tables 2 to 4, we can see that the estimated values of $\rho$ are all above 0.16 with 95% credible intervals excluding 0, suggesting that a clustering effect exists in this trial; the posterior estimates of $\beta_1$ are all less than 0, which means that the estimated incidence rate in the treatment group is lower than that in the control group. Let us take a further look at the effect of the new treatment through the probability of the new treatment achieving a target improvement. Starting from Phase I in Table 2, we can see that there is evidence that the reduction in incidence under the new treatment is greater than 0, with over 50% probability even under the skeptical prior, and this is further confirmed by the increased probabilities as data accumulate in Phase II and Phase III under all three prior scenarios, as shown in Tables 3 and 4. Table 2 also shows that the evidence for a reduction in incidence rate greater than 5% in Phase I is slim under the non-informative prior, with a probability of about 0.173, and under the skeptical prior, with a probability of about 0.092, although it is still strong from an enthusiastic perspective

Table 3
Phase II

                                     Noninformative   Skeptical   Enthusiastic
Prior mean (β1)                      0.0              0.0         −0.497
Prior standard deviation (β1)        ∞                0.154       0.154
Posterior mean (β1)                  −0.066           −0.044      −0.202
Posterior standard deviation (β1)    0.105            0.086       0.087
Posterior mean (ρ)                   0.177            0.187       0.164
Posterior standard deviation (ρ)     0.095            0.116       0.083
Target improvement at 0.0            0.735            0.696       0.990
Target improvement at 0.05           0.193            0.103       0.700
Target improvement at 0.10           0.011            0.002       0.112

Table 4
Phase III

                                     Noninformative   Skeptical   Enthusiastic
Prior mean (β1)                      0.0              0.0         −0.497
Prior standard deviation (β1)        ∞                0.154       0.154
Posterior mean (β1)                  −0.102           −0.077      −0.203
Posterior standard deviation (β1)    0.091            0.077       0.079
Posterior mean (ρ)                   0.170            0.177       0.162
Posterior standard deviation (ρ)     0.078            0.086       0.075
Target improvement at 0.0            0.870            0.840       0.995
Target improvement at 0.05           0.292            0.163       0.735
Target improvement at 0.10           0.015            0.002       0.103

Table 5
Futility analysis for NO data under the three stages

Stages       Proportion of planned information   Conditional power
Phase I      0.31                                0.324
Phase II     0.74                                0.193
Phase III    0.99                                < 0.0001

The proportion of planned information for Stage III is not equal to 1 because of the missing values.
with a probability of about 0.833. Similar conclusions can be drawn for Phase II (Table 3) and Phase III (Table 4). Now let us look at the probability of a 10% reduction in the incidence rate. With the non-informative and skeptical priors, a 10% reduction in the incidence rate under the new treatment is unlikely, with probabilities of at most 0.033 in Phase I (Table 2) and at most 0.015 in Phase III (Table 4). With the enthusiastic prior, this probability falls from 0.377 in Phase I (Table 2) to 0.112 in Phase II (Table 3) and 0.103 in Phase III (Table 4). This suggests that even from an enthusiastic perspective, there is little evidence that the new treatment achieves a 10% reduction in incidence rate compared with the standard treatment. Our analysis suggests that even investigators with an enthusiastic perspective on the new treatment might stop the trial after Phase II, and others might stop the trial after Phase I from the cost-effectiveness point of view, recommending the standard treatment. Based on the discussion of Bayesian trial monitoring in Spiegelhalter et al. [10], we recommend that an enthusiastic prior be used when the data indicate little evidence in favor of the new treatment, and that a skeptical prior be used when the data indicate an effect of the new treatment. For example, in the NO trial, we know from Phase I that the evidence in favor of the new treatment is weak. Therefore, we use the enthusiastic prior in Phase II and find that the probability of the new treatment achieving a 10% improvement is about 0.112, indicating that the evidence is still weak. We then suggest stopping the trial and recommend the standard treatment. Spiegelhalter et al. [10] gave detailed guidelines for monitoring clinical trials, including ethical randomization, safety issues, and the cost-effectiveness perspective; related references can be found therein.
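The monitoring quantity discussed above, the posterior probability that the reduction in incidence rate exceeds a target, can be approximated directly from MCMC output. The sketch below shows the calculation under the assumption that posterior draws of the intercept and treatment coefficient are available as arrays; the draws used here are synthetic placeholders, not the trial's posterior, and the function name is hypothetical.

```python
import numpy as np
from scipy.stats import norm


def prob_improvement(beta0_draws, beta1_draws, target):
    """Share of draws in which control minus treatment incidence exceeds target."""
    reduction = norm.cdf(beta0_draws) - norm.cdf(beta0_draws + beta1_draws)
    return float(np.mean(reduction > target))


rng = np.random.default_rng(2)
# Synthetic stand-ins for posterior draws; a real analysis would use sampler output.
beta0_draws = rng.normal(0.3, 0.1, size=20000)
beta1_draws = rng.normal(-0.1, 0.1, size=20000)

probs = {t: prob_improvement(beta0_draws, beta1_draws, t) for t in (0.0, 0.05, 0.10)}
for target, p in probs.items():
    print(f"Pr(reduction > {target:.2f}) is about {p:.3f}")
```

Since the normal distribution function is increasing, the reduction is positive exactly when the treatment coefficient is negative, so the probability at target 0 equals the posterior probability of a negative treatment coefficient; the probabilities can only decrease as the target grows.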
For comparison, we also conduct a futility analysis of the NO trial by calculating the conditional power [28,29] and present the results in Table 5, where the proportion of planned information is the ratio of the observed sample size to the total sample size and the conditional power is the probability of the new treatment achieving a positive benefit over the standard treatment given the observed data. From Table 5, we can see that the probability of achieving a positive benefit from the new treatment given the observed data, i.e. the reduction in incidence rate under the new treatment compared with the standard treatment being greater than 10%, is around 0.324 for Phase I. This probability is a little lower than the probability under the enthusiastic prior and much higher than those under the non-informative prior and the skeptical prior shown in Table 2. It decreases to 0.193 for Phase II and becomes close to 0 in Phase III. Although both the Bayesian method and
the futility analysis reach the same conclusion that the new treatment does not achieve a positive benefit over the standard treatment at the end of the trial, the Bayesian method provides more convincing evidence than the futility analysis by considering different scenarios through prior distributions, and this helps the investigators make a timely decision. For example, without optimism about the new treatment, the NO trial may be stopped after Phase I from the cost-effectiveness point of view, recommending the standard treatment; or the trial may be stopped in Phase II, when the accumulated evidence indicates that the chance of the new treatment achieving a 10% reduction in the incidence rate is slim even under the enthusiastic prior.
5. Discussion
In this manuscript we develop a Bayesian approach to monitoring clinical trials with clustered binary outcomes using multivariate probit models, by calculating the probability that the reduction in incidence rate under the new treatment, compared with the standard treatment, is greater than a target improvement. We assume non-informative, skeptical and enthusiastic priors for the treatment effect for the purpose of monitoring. We illustrate our method by monitoring the NO trial through three phases. Our Bayesian model based on the multivariate probit model is flexible and allows covariates to be included in addition to the treatment effect. For the NO trial, we could include covariates indicating the birth weight strata and monitor both the treatment effect and the weight effect. Since the incidence rate is a function of the regression parameters, it is straightforward to calculate the incidence rates for different subgroups from the posterior samples. By considering different prior scenarios, we can draw conclusions from different perspectives.
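The remark that subgroup incidence rates follow directly from the posterior samples can be illustrated as follows. This is a sketch under stated assumptions: the model is taken to include an intercept, a treatment indicator and two birth weight stratum indicators (with 500 to 749 g as the baseline), and the draws are synthetic placeholders rather than output from the NO trial analysis.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
# Hypothetical posterior draws for (intercept, treatment, 750-999 g, 1000-1250 g);
# in practice these would be the retained MCMC draws of the regression parameters.
draws = rng.normal(loc=[0.6, -0.1, -0.3, -0.6], scale=0.1, size=(5000, 4))

# Design vectors for a few subgroups (baseline stratum: 500-749 g).
subgroups = {
    "control, 500-749 g": [1, 0, 0, 0],
    "treatment, 500-749 g": [1, 1, 0, 0],
    "control, 1000-1250 g": [1, 0, 0, 1],
    "treatment, 1000-1250 g": [1, 1, 0, 1],
}

summary = {}
for name, x in subgroups.items():
    rate = norm.cdf(draws @ np.asarray(x, dtype=float))   # incidence rate per draw
    lo, hi = np.percentile(rate, [2.5, 97.5])
    summary[name] = (rate.mean(), lo, hi)
    print(f"{name}: mean {rate.mean():.3f}, 95% interval ({lo:.3f}, {hi:.3f})")
```

Each draw of the regression parameters is mapped to an incidence rate, so the spread of the mapped values gives a credible interval for the subgroup without any further approximation.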
For example, with no preference between the two treatments, we monitor a trial with a non-informative prior; with strong doubts about the new treatment, we may monitor it using a skeptical prior; and with substantial positive evidence available, we choose an enthusiastic prior. Bayesian monitoring can be more convincing than traditional methods in two situations: when, using an enthusiastic prior on data with little evidence that the new treatment achieves a positive benefit, the probability that the new treatment is better than the standard treatment is still low, e.g. the NO trial in Section 4; or when, using a skeptical prior on data indicating an effect of the new treatment, the probability that the new treatment is better than the standard treatment is still high, e.g. Spiegelhalter et al. [10]. Monitoring clinical trials using Bayesian methods is promising and avoids a binary decision rule, which in any case is not generally acted upon in that manner by a DSMB. In most frequentist monitoring methods, reaching a boundary is usually interpreted as a mandate to consider stopping rather than as a decision. Investigators should be encouraged to develop more efficient Bayesian methods for monitoring and analyzing clinical trials. Inspired by one of our referees, we consider investigating our method in an adaptive trial design worth pursuing, and it will be part of our future research.
Acknowledgements
We want to thank Dr. John Petkau, Dr. Frederik Barkhof, Dr. Stephen Reingold, Dr. Paul O’Connor, Dr. Maria Pia Sormani
and Dr. Chris Polman for helpful suggestions. Partial support was provided by the National MS Society.
References
[1] Pocock SJ. Group sequential methods in the design and analysis of clinical trials. Biometrika 1977;64:191–9.
[2] O’Brien PC, Fleming TR. A multiple testing procedure for clinical trials. Biometrics 1979;35:549–56.
[3] Pocock SJ. Interim analysis for randomized clinical trials: the group sequential approach. Biometrics 1982;38:153–62.
[4] Lan KKG, DeMets DL. Discrete sequential boundaries for clinical trials. Biometrika 1983;70:659–63.
[5] Cornfield J. Sequential trials, sequential analysis and the likelihood principle. Am Stat 1966;20:18–23.
[6] Berry DA. Interim analysis in clinical trials: classical vs. Bayesian approaches. Stat Med 1985;4:521–6.
[7] Freedman LS, Spiegelhalter DJ, Parmar MB. The what, why and how of Bayesian clinical trials monitoring. Stat Med 1994;13:1371–83.
[8] Chaloner K, Rhame FS. Quantifying and documenting prior beliefs in clinical trials. Stat Med 2001;20:581–600.
[9] Freedman LS, Spiegelhalter DJ. The assessment of subjective opinion and its use in relation to stopping rules for clinical trials. The Statistician 1983:153–60.
[10] Spiegelhalter DJ, Freedman LS, Parmar MB. Bayesian approaches to randomized trials. J Royal Stat Soc Series A 1994;157:357–416.
[11] Parmar MK, Spiegelhalter DJ, Freedman LS. The chart trials: Bayesian design and monitoring in practice. Stat Med 1994;13:1297–312.
[12] Vail A, Hornbuckle J, Spiegelhalter DJ, Thornton JG. Prospective application of Bayesian monitoring and analysis in an ‘open’ randomized clinical trial. Stat Med 2001;20:3777–87.
[13] Kass RE, Greenhouse JB. Comment: a Bayesian perspective. Stat Sci 1989;4:310–7.
[14] Spiegelhalter DJ, Freedman LS, Parmar MKB. Applying Bayesian ideas in drug development and clinical trials. Stat Med 1993;12:1501–11.
[15] Fayers PM, Ashby D, Parmar MKB. Bayesian data monitoring in clinical trials. Stat Med 1997;16:1413–30.
[16] Chen C, Chaloner K. A Bayesian stopping rule for a single arm study: with a case study of stem cell transplantation. Stat Med 2006;25:2956–66.
[17] Carlin BP, Chaloner K, Church T, Louis TA, Matts JP. Bayesian approaches for monitoring clinical trials with an application to toxoplasmic encephalitis prophylaxis. The Statistician 1993;42:355–67.
[18] Dmitrienko A, Wang M. Bayesian predictive approach to interim monitoring in clinical trials. Stat Med 2006;25:2178–95.
[19] Berry DA. A case for Bayesianism in clinical trials (with discussion). Stat Med 1993;12:1377–404.
[20] Berger JO. An overview of robust Bayesian analysis (with discussion). Test 1994;3:5–124.
[21] Carlin BP, Sargent DJ. Robust Bayesian approaches for clinical trial monitoring. Stat Med 1996;15:1093–106.
[22] Kinsella JP, et al. Early inhaled nitric oxide therapy in premature newborns with respiratory failure. N Engl J Med 2006;355:12–22.
[23] Gajewski BJ, Hart S, Bergquist-Beringer S, Dunton N. Inter-rater reliability of pressure ulcer staging: ordinal probit Bayesian hierarchical model that allows for uncertain rater response. Stat Med 2007;26:4602–18.
[24] Chib S, Greenberg E. Analysis of multivariate probit models. Biometrika 1998;85:347–61.
[25] Nandram B, Chen M-H. Accelerating Gibbs sampler convergence in the generalized linear models via a reparameterization. Journal of Statistical Computation and Simulation 1994;81:27–40.
[26] Liu C. Bayesian analysis of multivariate probit model: discussion of “The art of data augmentation” by Van Dyk and Meng. J Comput Graph Stat 2001;10:75–81.
[27] Zhang X, Boscardin WJ, Belin TR. Sampling correlation matrices in Bayesian models with correlated latent variables. J Comput Graph Stat 2006;15:880–96.
[28] Lan KKG, Wittes J. The B-value: a tool for monitoring data. Biometrics 1988;44:579–85.
[29] DeMets D. Futility approaches to interim monitoring by data monitoring committees. Clin Trials 2006;3:522–9.