Reliability Engineering and System Safety 111 (2013) 273–286
A sequential approach for stochastic computer model calibration and prediction

Jun Yuan, Szu Hui Ng*

Department of Industrial and Systems Engineering, National University of Singapore, 117576 Singapore, Singapore

Article history: Received 31 May 2012; received in revised form 9 November 2012; accepted 10 November 2012; available online 22 November 2012.

Abstract
Computer models are widely used to simulate complex and costly real processes and systems. When a computer model is used to assess and certify the real system for decision making, it is often important to calibrate the model so as to improve its predictive accuracy. A sequential approach is proposed in this paper for stochastic computer model calibration and prediction. More precisely, we propose a surrogate-based Bayesian approach for stochastic computer model calibration that accounts for various uncertainties, including the calibration parameter uncertainty, in the follow-up prediction and computer model analysis. We derive the posterior distribution of the calibration parameter and the predictive distributions for both the real process and the computer model, which quantify the calibration and prediction uncertainty and provide analytical calibration and prediction results. We also derive the predictive distribution of the discrepancy term between the real process and the computer model, which can be used to validate the computer model. Furthermore, in order to use limited data resources efficiently to obtain better calibration and prediction performance, we propose a two-stage sequential approach that effectively allocates the limited resources. The accuracy and efficiency of the proposed approach are illustrated by numerical examples. © 2012 Elsevier Ltd. All rights reserved.
Keywords: Computer model calibration; Stochastic computer simulation; Gaussian process model; Bayesian analysis; Parameter uncertainty; Uncertainty quantification
1. Introduction

Computer models are generally designed to simulate real processes. In real applications, most processes are quite complex, making it impossible to develop realistic analytical models of the system; computer models thus become one of the best choices to represent these complex systems. When a computer model is used to predict the behavior of the real system so as to assess and certify the system's reliability and safety for better decision making, it is important to improve the model's predictive accuracy and capability. In computer model analysis, both validation and calibration are closely related to the model's predictive performance. Validation is a process to confirm whether the computer model precisely represents the real process, while calibration is a process to adjust the unknown input parameters by comparing the computer model output with the real observed data so as to ensure that the computer model fits the real process well. These unknown input parameters are usually unobservable or unmeasurable in the real process but need to be specified in the computer model. One possible relationship among calibration, validation and prediction is provided by [1], which shows that both calibration and validation are
* Corresponding author.
E-mail addresses: [email protected] (J. Yuan), [email protected] (S.H. Ng).
http://dx.doi.org/10.1016/j.ress.2012.11.004
important to improve the model's predictive performance. In this paper, however, we focus on providing a calibration procedure to update the model; a brief discussion of validation based on the proposed calibration procedure is also provided. More details about validation in scientific computing can be found in [1] and the references therein. More specifically, we propose a calibration procedure to adjust the unknown parameters and obtain the predictive distributions of the calibrated model. The calibrated model usually has better predictive performance, as it fits the real process better with more accurate calibration parameter values. The obtained predictive distributions can be further used for validation and eventual prediction, and the final calibrated and validated model can then be applied for its intended use. The importance of calibration has been recognized in many practical models that are used to evaluate the reliability and safety of complex physical and social systems, such as a nuclear radiation release model [2], a hydrologic model [3], and a biological model [4]. For instance, disease transmission models are usually built to model the spread of infectious diseases such as the influenza A (H1N1) virus in order to find the best strategy to control their spread. Within the computer model, the illness attack rate must be set before the model can be used. However, its value is usually unknown in the real transmission process, especially for a new outbreak, and it may change with the geographic area and population. This attack rate is known as the calibration parameter. Its value may significantly influence
the predictive performance of the computer model and the subsequent accuracy of the decision making. Therefore, it is necessary to estimate or adjust this value before the computer model can be used for further analysis. The procedure of adjusting this parameter to improve predictive accuracy is known as calibration. Various approaches have been proposed for computer model calibration [5,6]. One type of calibration approach is to find an effective and efficient algorithm that can be applied directly with limited data resources, such as the stochastic approximation methods discussed by [7]. However, this approach may not be efficient when the computer models are extremely time consuming and computationally expensive, as it usually requires a relatively large number of simulation runs before converging to the optimal calibration parameter value. Another popular and much more efficient approach is to use surrogates, also known as emulators or metamodels, where simpler and faster statistical approximations are used in place of the original complex computer models. More discussion on using and developing surrogates for computer models can be found in [8,9]. In this setting, the computer model approximates the real process, and the surrogate model approximates the computer model. Various uncertainties arise in calibrating the computer model and in using the surrogate model to predict the behavior of the real process, such as parameter uncertainty, model inadequacy, and observation error. When the model is used for decision making, not only the point estimator but also the uncertainty information about the estimator is required to make more informed decisions and to better assess risk. Uncertainty quantification is important for risk assessment and for conveying the credibility of and confidence in a system's reliability, in order to support better decisions.
Therefore, it is important to account for various uncertainties in predicting the behavior of the real process. Among these uncertainties, the calibration parameter uncertainty can have significant effects on the overall predictive uncertainty, which plays an important role in decision making. It is therefore important to consider this uncertainty and its effects on the subsequent prediction in many real applications. [2] proposed a Bayesian approach for computer model calibration using the Gaussian process (GP) as a surrogate model. Their approach takes into account all sources of uncertainty in the computer model analysis, including the calibration parameter uncertainty. However, their discussion is based on deterministic computer models, so the inherent stochastic error introduced in a stochastic computer model is not accounted for. Unlike a deterministic model, a stochastic model may produce different simulation outputs at the same input levels. In most applications, the real systems of interest are stochastic in nature. Stochastic models are usually required to assess the probability distribution of the outcome of interest, and the expected output is a typical measure of performance of such systems. With the increasing application of stochastic computer models, it is important to consider this stochastic error in the calibration and prediction so as to improve the calibration accuracy and the predictive performance. In this paper, we extend current work using surrogates for calibration to stochastic computer model calibration and prediction, accounting for all the uncertainties mentioned in [2] as well as the inherent stochastic error. [4] also discussed the calibration of a stochastic computer model using a surrogate-based Bayesian approach. In their study, the Markov chain Monte Carlo (MCMC) method was used to simulate values from the posterior distribution of the calibration parameter and to obtain the predictive values.
Their approach requires many millions of samples to guarantee convergence, so it is time consuming to implement, although they attempt to reduce the computational burden by simplifying the surrogate model. Similarly, [10] discussed a Bayesian calibration approach using the MCMC method. In this paper, different from [4], who use MCMC to approximate the calibration parameter distribution and the predictive distribution, we derive these distributions analytically by estimating some parameters that are empirically shown to have insignificant effects on the predictive performance. We derive the predictive distributions for both the computer model and the real process, which quantify the predictive uncertainties. We also derive the predictive distribution for the inadequacy term, which measures the discrepancy between the computer model and the real process; this can be used for computer model validation, such as in the validation procedure proposed by [11]. The derived posterior distributions provide analytical calibration and prediction results. Meanwhile, the proposed approach can significantly reduce the computational time compared to MCMC, as illustrated by the numerical examples. Furthermore, to improve the calibration and prediction process and to support better decisions, we extend this work to address the problem of allocating limited resources to reduce the predictive uncertainty of both the computer model and the real process. We propose a two-stage sequential approach for stochastic computer model calibration and prediction, so as to obtain a better set of real and computer data with which to build an accurate surrogate, and to sequentially learn and improve the model by effectively allocating resources to select more appropriate points to sample at. The purpose of this article is to develop an efficient sequential calibration and prediction procedure for stochastic computer models.
This includes the collection of computer model and real process data, the development of the surrogate model, the calibration and prediction of the stochastic computer model, the allocation of resources to improve the model for eventual use in prediction, and the validation of the calibrated computer model, so as to deliver improved computer model predictions to decision makers. This paper is organized as follows. In Section 2, we describe the model used for calibration and prediction. In Section 3, we explain the proposed Bayesian calibration approach and derive the predictive distributions of the computer model, the real process and the model inadequacy term. In Section 4, we propose a two-stage sequential approach for stochastic computer model calibration and prediction and provide the implementation steps. Finally, in Section 5, two numerical examples are provided to illustrate the proposed approach.
2. Model formulation

2.1. Stochastic model

Based on the model proposed by [2], the relationship between the real observation and the simulation output can be represented by

z_i = ζ(x_i) + e_i = S(x_i, θ) + δ(x_i) + e_i,   (1)

where z_i is the ith real observation at input level x_i, ζ(x_i) is the true output from the real process, and S(x_i, θ) represents the "true" simulation output at x_i and the optimum calibration parameter θ. Similar to previous works, we view the optimum θ as the value that best fits the computer model output to the real process. This does not necessarily equate to the true value of the calibration parameter, if such a value exists in the real process. As the purpose is to improve the predictive performance of the computer model, we focus on finding the parameter value that best matches the computer model to the real process. δ(x_i) is the model inadequacy or discrepancy term, which is considered to be independent of S(x_i, θ), and e_i is the observation error.
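As a toy illustration of the additive structure in (1), the following sketch composes a stand-in simulator mean, a stand-in discrepancy, and Gaussian observation noise. The functional forms, the value of θ, and the noise level are all hypothetical, chosen only to show the composition; they are not from the paper.

```python
# Toy illustration of Eq. (1): z_i = S(x_i, theta) + delta(x_i) + e_i.
# The simulator S, discrepancy delta, theta, and the noise level are
# made-up stand-ins for illustration only.
import math
import random

random.seed(1)
theta = 0.4                         # hypothetical "optimal" parameter value

def S(x, t):
    return math.sin(x) + t * x      # stand-in simulator mean

def delta(x):
    return 0.1 * x ** 2             # stand-in model inadequacy

xs = [0.0, 0.5, 1.0]
z = [S(x, theta) + delta(x) + random.gauss(0.0, 0.05) for x in xs]
```

Each z value is thus a simulator output shifted by a systematic discrepancy plus random observation error, matching the three-term decomposition in (1).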
Here the calibration parameter is assumed to be independent of the variable input. It is also assumed that the model output, model inadequacy and observation error are independent of each other. These assumptions are used in many analytical studies for simplification and should be carefully verified in practical applications. Relaxing some of these assumptions in the model form is an important area for future work. It should be noted that (1) is just one possible way to represent the relationship between computer model output and real observation. Other relationship forms may be more appropriate in specific applications where more detailed structure is known. This form is adopted in this paper as we are interested in more general-purpose applications that assume less structure; it has also been used in many computer model analysis studies [11,12]. However, S(x_i, θ) is treated as deterministic in most studies, where the computer model output is always identical for the same inputs x_i and θ. In many practical applications, the computer models built are stochastic, meaning that simulations at the same input levels give different outputs. One typical way to represent the relationship between the observed stochastic simulation output and its expectation is

y(x_i, θ) = S(x_i, θ) + ε_i,   (2)

where S(x_i, θ) denotes the expectation of the stochastic simulation output for a given x_i and θ, y(x_i, θ) is the observed stochastic simulation output (or the average of several stochastic simulation replications) under the specified x_i and θ, and ε_i is the sampling variability inherent in a stochastic simulation. When the GP is used to model the stochastic computer model, both the uncertainty in the response surface and the stochastic nature of the computer model (sampling variability) should be considered. More details about this modeling form can be found in [13] and [14]. Here the calibration and prediction of the stochastic computer model are discussed, where the stochastic error ε is considered and only noisy observations y are available. The relational form in (2) is also used by [4].

2.2. Gaussian process model

Gaussian process models have been widely used as surrogates of computer models because of their flexibility and convenience [2]; a comprehensive study is given by [15]. In this paper, we use a Gaussian process model as a surrogate for the computer model. It should be noted that it is important to validate the use of the surrogate model, and methods for the validation of GP models can be found in [16]. Based on (1) and (2), we assume the following:

1) The computer model output S(x, t) is a Gaussian process with mean μ_S(x, t) = h_S(x, t)^T β_S and covariance function σ_S² R_S, where t denotes the specified input calibration parameter value, which can be controlled in the computer model. Here t is used to distinguish from the optimum unknown calibration parameter value θ that we want to calibrate for the computer model. h_S(x, t) = (h_S,1(x, t), ..., h_S,k_S(x, t))^T is a vector of k_S functions of x and t, and β_S = (β_S,1, ..., β_S,k_S)^T is a vector of k_S unknown coefficients. The correlation function R_S has an exponential form

R_S((x_i, t_i), (x_j, t_j)) = ∏_{u=1}^{p} exp{−φ_Sx,u (x_{i,u} − x_{j,u})²} × ∏_{v=1}^{q} exp{−φ_Sθ,v (t_{i,v} − t_{j,v})²},   (3)

where p is the dimension of x and q is the dimension of θ. The parameters φ_Sx,u > 0 and φ_Sθ,v > 0, and we write φ_S = (φ_Sx,1, ..., φ_Sx,p, φ_Sθ,1, ..., φ_Sθ,q)^T. ε_i is treated as a nugget effect with ε_i ~ N(0, σ_ε²) for all i, independent of the input values. This assumption can be relaxed to account for dependency in future work. It should be noted that (3) is just one possible form assumed for illustrative purposes, and other forms can also be used. A more general form is to replace (t_{i,v} − t_{j,v})² with |t_{i,v} − t_{j,v}|^l, where l can be assigned a particular value or treated as an unknown parameter to be estimated. More details about various correlation forms can be found in [17].

2) The model inadequacy term δ(x) is a Gaussian process with mean μ_d(x) = h_d(x)^T β_d and covariance function σ_d² R_d, where h_d(x) = (h_d,1(x), ..., h_d,k_d(x))^T is a vector of k_d functions of x, β_d = (β_d,1, ..., β_d,k_d)^T is a vector of k_d unknown coefficients, and R_d has an exponential form

R_d(x_i, x_j) = ∏_{u=1}^{p} exp{−φ_d,u (x_{i,u} − x_{j,u})²},   (4)

where φ_d,u > 0. We write φ_d = (φ_d,1, ..., φ_d,p)^T.

3) The observation error e has a normal distribution with mean 0 and variance σ_e².

4) S(x, θ), δ(x), e and ε are mutually independent.
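The product-exponential correlations in (3) and (4) can be transcribed directly; in the following sketch the lengthscale parameters φ and the input values are arbitrary illustrations.

```python
# Sketch of the exponential (Gaussian) correlation forms of Eqs. (3)-(4);
# the phi values and inputs below are arbitrary illustrations.
import math

def corr_S(x_i, t_i, x_j, t_j, phi_x, phi_t):
    """R_S of Eq. (3): product of squared-exponential terms over the
    variable inputs x and the calibration inputs t."""
    s = sum(p * (a - b) ** 2 for p, a, b in zip(phi_x, x_i, x_j))
    s += sum(p * (a - b) ** 2 for p, a, b in zip(phi_t, t_i, t_j))
    return math.exp(-s)

def corr_d(x_i, x_j, phi_d):
    """R_d of Eq. (4): squared-exponential correlation over x only."""
    return math.exp(-sum(p * (a - b) ** 2 for p, a, b in zip(phi_d, x_i, x_j)))

# identical points have correlation 1; correlation decays with distance
r0 = corr_S([0.5], [1.0], [0.5], [1.0], [2.0], [1.0])
r1 = corr_S([0.5], [1.0], [0.9], [1.0], [2.0], [1.0])
```

Larger φ values shrink the correlation faster with distance, which is why they act as inverse lengthscales in the surrogate.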
2.3. Prior distributions for parameters

In the stochastic model defined in Section 2.1 and the GP model defined in Section 2.2, the parameters that need to be specified are {β_S, σ_S², φ_S, β_d, σ_d², φ_d, σ_ε², σ_e², θ}. One advantage of the Bayesian approach is that we can use prior knowledge about the computer model and the real process to quantify the uncertainties of these parameters by assigning them prior distributions. Therefore, we can quantify various uncertainties, including model inadequacy, observation error, inherent stochastic error, surrogate model parameter uncertainty, and calibration parameter uncertainty. Many different priors can be used in Bayesian analysis. Although the impact of the subjectivity in choosing the initial priors can be reduced with more information from the available resources, the priors should be selected appropriately, as improper priors may lead to improper posteriors. In this paper, we assume the following mathematically convenient priors according to [18]:

β_S | σ_S² ~ N(b_S, σ_S² V_S),  β_d | σ_d² ~ N(b_d, σ_d² V_d);   (5a)

σ_S² ~ IG(a_S, g_S),  σ_d² ~ IG(a_d, g_d);   (5b)

σ_ε² ~ IG(a_ε, g_ε),  σ_e² ~ IG(a_e, g_e);   (5c)

φ_Sx,i ~ G(a_x, c_x), for i = 1, ..., p;   (5d)

φ_Sθ,i ~ G(a_θ, c_θ), for i = 1, ..., q;   (5e)

φ_d,i ~ G(a_d, c_d), for i = 1, ..., p,   (5f)

where N(b, σ² V) denotes a multivariate normal distribution with mean vector b and covariance matrix σ² V. IG(a, g) denotes an inverse gamma distribution with density function

p(w; a, g) = g^a w^{−a−1} e^{−g/w} / Γ(a),  w > 0, a > 0, g > 0,   (6)

and G(a, c) denotes the gamma distribution with density

G(w; a, c) = c^a w^{a−1} e^{−cw} / Γ(a),  w > 0, a > 0, c > 0.   (7)

Here we do not assign a specific prior to the calibration parameter θ; this means any prior can be used in the following calibration procedure. As it is assumed that θ is an independent parameter, and S(x, θ), δ(x), e and ε are mutually independent, we have that {θ}, {β_S, σ_S², φ_S}, {β_d, σ_d², φ_d}, {σ_ε²} and {σ_e²} are mutually independent. We further assume that {β_S, σ_S²} and {φ_S} are independent, and that {β_d, σ_d²} and {φ_d} are independent. Then we can
write the prior distribution as

p(θ, β_S, σ_S², φ_S, β_d, σ_d², φ_d, σ_ε², σ_e²) = p(θ) p(β_S, σ_S²) p(φ_S) p(β_d, σ_d²) p(φ_d) p(σ_ε²) p(σ_e²).   (8)
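The inverse-gamma and gamma densities in (6) and (7) are simple to evaluate with the standard library; in this sketch the shape and scale values are arbitrary illustrations, and the inverse-gamma mode g/(a + 1) is checked on a coarse grid.

```python
# Sketch of the inverse-gamma density of Eq. (6) and the gamma density of
# Eq. (7); parameter values are illustrative only.
import math

def inv_gamma_pdf(w, a, g):
    """p(w; a, g) = g^a * w^(-a-1) * exp(-g/w) / Gamma(a), w > 0."""
    return g ** a * w ** (-a - 1) * math.exp(-g / w) / math.gamma(a)

def gamma_pdf(w, a, c):
    """G(w; a, c) = c^a * w^(a-1) * exp(-c*w) / Gamma(a), w > 0."""
    return c ** a * w ** (a - 1) * math.exp(-c * w) / math.gamma(a)

# the inverse-gamma mode is g / (a + 1); locate it numerically on a grid
a, g = 3.0, 2.0
mode = g / (a + 1)                      # = 0.5
grid = [0.1 * k for k in range(1, 50)]
best = max(grid, key=lambda w: inv_gamma_pdf(w, a, g))
```

Such closed-form densities are what make the conjugate-style integrations in Section 3 tractable.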
3. Calibration, prediction and validation

3.1. Observed data

For the large and complex stochastic simulation models we consider here, the true values of S(x_i, t) are not directly observable or available. Due to the computational requirements, it is also practically impossible to run a large number of simulation replications at the specified inputs x_i and t to determine them accurately. In practice, one would typically run several simulation replications at each input point and take the average of the replication outputs as an approximate value. Here y_ij denotes the jth simulation replication output for the ith input set. It is assumed that there are a total of m replications for each i, and that the simulation observations for different replications are independent (i.e., different random number streams are used for each replication). Then the average ȳ_i = ȳ(x_i, t) = (1/m) Σ_{j=1}^{m} y_ij(x_i, t) is taken as the observed stochastic simulation output for the ith input set. Therefore, the two sets of observable data in the calibration of these stochastic models are:

1) real observations z = (z_1, z_2, ..., z_n)^T, and
2) stochastic simulation outputs y = (ȳ_1, ȳ_2, ..., ȳ_N)^T,

where N is the number of inputs at which the simulation outputs are available and n is the number of inputs at which the real observations are available. The full set of available data for analysis is d^T = (y^T, z^T). When the available resources and the computing budget are limited, a sequential design approach to collecting the experimental data is especially useful, as it selects more appropriate points to sample at, so that the surrogate model is more accurate and the calibration and prediction results are improved.

3.2. Calibration

It is often necessary to calibrate the computer model before it can be used to accurately predict the real process. The calibration parameter uncertainty is also important to account for in the subsequent prediction, as it may significantly influence the predictive performance. In this section, we explain the posterior analysis of the calibration problem, which is important in predicting the behavior of the real process.

3.2.1. Posterior distribution of the calibration parameter

Before deriving the posterior distribution of the calibration parameter, we first specify some notation. Let D_S = {(x_1, t_1), ..., (x_N, t_N)} denote the set of input points at which the computer model outputs are available. Similarly, let D_z = {x*_1, ..., x*_n} denote the set of input points at which the real observations are available, and let D_z(θ) = {(x*_1, θ), ..., (x*_n, θ)} denote these input points paired with the optimal calibration parameter θ. We further let H_S(D_S) = (h_S(x_1, t_1), ..., h_S(x_N, t_N))^T, and similarly define H_d(D_z) and H_S(D_z(θ)). We define R_S(D_S, D_S) to be the correlation matrix of y with (i, j) element R_S((x_i, t_i), (x_j, t_j)); similarly we define R_S(D_S, D_z(θ)), R_S(D_z(θ), D_z(θ)) and R_d(D_z, D_z). We further define τ_1² = σ_ε²/σ_S², τ_2² = σ_d²/σ_S² and τ_3² = σ_e²/σ_S², which are reparameterizations introduced for the convenience of integrating out σ_S² in the posterior distributions. For simplicity, we let β = (β_S^T, β_d^T)^T, σ² = {σ_S², σ_ε², σ_d², σ_e²}, φ = {φ_S, φ_d} and τ² = {τ_1², τ_2², τ_3²}.

The full data set d is assumed to be normally distributed given {θ, β, σ², φ}. This yields the likelihood function, from which we can derive the posterior distribution of all the parameters. First, we give the mean vector and variance matrix of d:

E[d | θ, β, σ², φ] = m_d(θ) = H(θ)β,

where

H(θ) = [ H_S(D_S), 0 ; H_S(D_z(θ)), H_d(D_z) ],

and

Var[d | θ, β, σ², φ] = V_d(θ) = σ_S² V̄_d(θ) = σ_S² [ R_S(D_S, D_S) + τ_1² I_N , R_S(D_S, D_z(θ)) ; R_S(D_z(θ), D_S) , R_S(D_z(θ), D_z(θ)) + τ_2² R_d(D_z, D_z) + τ_3² I_n ],

where I_N and I_n are the N×N and n×n identity matrices, respectively. With the prior distributions given in Section 2.3, the full posterior distribution of all the parameters is

p(θ, β, σ², φ | d) ∝ p(θ, β_S, σ_S², φ_S, β_d, σ_d², φ_d, σ_ε², σ_e²) |V_d(θ)|^{−1/2} exp{ −(d − m_d(θ))^T V_d(θ)^{−1} (d − m_d(θ))/2 }.   (9)
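The stacked structure of d and of the mean m_d(θ) = H(θ)β entering (9) can be illustrated with constant basis functions h_S ≡ 1 and h_d ≡ 1 and toy numbers; everything below (sizes, replications, coefficients) is hypothetical.

```python
# Toy sketch of the stacked data and mean structure behind Eq. (9):
# averaged simulation outputs y_bar and real observations z are stacked
# into d, and H(theta)*beta stacks the two regression parts.
y_reps = [[1.9, 2.1, 2.0], [3.2, 2.8, 3.0]]   # replications (N = 2 inputs)
y_bar = [sum(r) / len(r) for r in y_reps]     # averaged simulator outputs
z = [2.05, 2.95]                               # real observations (n = 2)
d = y_bar + z                                  # d^T = (y^T, z^T)

# constant-mean bases: h_S = 1 for the simulator, h_d = 1 for delta, so
# simulator rows of H(theta)*beta give beta_S and real-process rows give
# beta_S + beta_d (toy coefficient values)
beta_S, beta_d = 2.5, 0.1
m_d = [beta_S, beta_S, beta_S + beta_d, beta_S + beta_d]
residual = [di - mi for di, mi in zip(d, m_d)]
```

The residual vector d − m_d(θ) is exactly the quantity whose quadratic form appears in the exponent of (9).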
Let b = (b_S^T, b_d^T)^T and

σ_S² V = [ σ_S² V_S , 0 ; 0 , σ_d² V_d ] = σ_S² [ V_S , 0 ; 0 , τ_2² V_d ],

so that β | (σ_S², σ_d²) ~ N(b, σ_S² V).

Proposition 3.1. Integrating out β and σ_S², the posterior distribution of the parameters θ, φ and τ² is

p(θ, φ, τ² | d) ∝ p(θ) p(φ) (τ_1²)^{−a_ε−1} (τ_2²)^{−a_d−1} (τ_3²)^{−a_e−1} |V̄_d(θ)|^{−1/2} |V|^{−1/2} |A|^{1/2} × [ g_S + g_ε/τ_1² + g_d/τ_2² + g_e/τ_3² + ( b^T V^{−1} b + d^T V̄_d(θ)^{−1} d − u^T A u )/2 ]^{−(a_S + a_ε + a_d + a_e + (N+n)/2)},   (10)

where V̄_d(θ) = V_d(θ)/σ_S², A = [ H(θ)^T V̄_d(θ)^{−1} H(θ) + V^{−1} ]^{−1} and u = H(θ)^T V̄_d(θ)^{−1} d + V^{−1} b.

Proof. See Appendix A.

To obtain the posterior distribution of the calibration parameter, p(θ | d), we could further integrate out the parameters φ and τ². However, p(θ | d) is a highly intractable function, and it is computationally intensive to integrate these terms out numerically. Therefore, we propose to estimate these parameters; the conditional posterior distribution given the estimated parameters is then used for inference about θ. By estimating these parameters and plugging them into the calibration procedure, we do not fully account for all sources of uncertainty: the computer model uncertainty is not fully accounted for when plugging in φ_S and τ_1², the model inadequacy uncertainty is not fully accounted for when plugging in φ_d and τ_2², and the observation error uncertainty is not fully accounted for when plugging in τ_3². Although results in [2] indicate that the uncertainties in these plug-in parameters are likely to have only a "second order" effect on the overall uncertainty, it is important to verify that the ignored uncertainties do not have a significant influence on the predictive performance before plugging in the estimated parameters. If their effects on the predictive performance cannot be ignored, the uncertainties of these parameters should be accounted for and a general MCMC approach applied (although at a larger computational expense).
However, in the examples given in Section 5, we compare the calibration and prediction results using estimated parameters to the results obtained when considering all uncertainties using MCMC. The numerical results show that the predictive performance from estimating these parameters is almost the same as when fully accounting for their effects. Similarly, in [4], some parameters are also estimated in order to reduce the computational burden, and their results indicate that the effect on prediction performance of plugging in these parameters is negligible. In the following, we provide a method to estimate these parameters.

3.2.2. Parameter estimation

The parameters are estimated in two steps. In the first step, we use the data y to estimate the parameters φ_S and τ_1². Since the data z depend on all the parameters and the number of observations in z is usually much smaller than the number in y, little information is lost when only y is used in the first step. In the second step, we use the full data d to estimate the remaining parameters φ_d, τ_2² and τ_3², given the estimates of φ_S and τ_1² from the first step.

Step 1: We have

y | β_S, σ_S², φ_S, τ_1² ~ N( H_S(D_S)β_S, σ_S² ( R_S(D_S, D_S) + τ_1² I_N ) ).   (11)
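Step 1 amounts to maximizing a posterior in (φ_S, τ_1²) built from the marginal model (11). As a simplified stand-in (zero mean, flat priors, two observations, σ_S² profiled out, grid search instead of a numerical optimizer), the idea can be sketched as follows; the data and grids are invented for illustration.

```python
# Simplified stand-in for Step 1: pick (phi_S, tau_1^2) by maximizing a
# Gaussian-process marginal likelihood on a grid. Zero mean and flat
# priors are assumed for brevity; data, grids, and the closed-form 2x2
# inverse are purely illustrative.
import math

x = [0.0, 1.0]
y = [0.3, 0.9]
N = len(x)

def neg_loglik(phi, tau2):
    r = math.exp(-phi * (x[0] - x[1]) ** 2)   # R_S off-diagonal, Eq. (3)
    c = 1.0 + tau2                            # diagonal of R_S + tau2 * I
    det = c * c - r * r
    # y^T C^{-1} y for the 2x2 matrix C = [[c, r], [r, c]]
    q = (c * y[0] ** 2 - 2 * r * y[0] * y[1] + c * y[1] ** 2) / det
    # with sigma_S^2 profiled out: -2 log L ~ log|C| + N log(y^T C^{-1} y)
    return math.log(det) + N * math.log(q)

grid_phi = [0.1 * k for k in range(1, 31)]
grid_tau2 = [0.01, 0.05, 0.1, 0.2, 0.5]
phi_hat, tau2_hat = min(
    ((p, t) for p in grid_phi for t in grid_tau2),
    key=lambda pt: neg_loglik(*pt),
)
```

In practice one would maximize the full posterior of Proposition 3.2 with a numerical optimizer, as the paper does in MATLAB; the grid search here only illustrates the objective being optimized.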
Proposition 3.2. With the prior distributions in Section 2.3, the posterior distribution of φ_S and τ_1² is

p(φ_S, τ_1² | y) ∝ p(φ_S) (τ_1²)^{−a_ε−1} |V_S(D_S)|^{−1/2} |V_S|^{−1/2} |A_S|^{1/2} × [ g_S + g_ε/τ_1² + ( b_S^T V_S^{−1} b_S + y^T V_S(D_S)^{−1} y − u_S^T A_S u_S )/2 ]^{−(a_S + a_ε + N/2)},   (12)

where V_S(D_S) = R_S(D_S, D_S) + τ_1² I_N, A_S = [ H_S(D_S)^T V_S(D_S)^{−1} H_S(D_S) + V_S^{−1} ]^{−1} and u_S = H_S(D_S)^T V_S(D_S)^{−1} y + V_S^{−1} b_S.

Proof. Similar to the proof of (10).

The estimators of φ_S and τ_1² are then obtained by maximizing p(φ_S, τ_1² | y). This can be done using readily available optimization functions in MATLAB.

Step 2: In this step, we use the EM algorithm to estimate the remaining parameters φ_d, τ_2² and τ_3². The EM algorithm is a powerful computational method for finding the maximizer of a posterior distribution; for more details, refer to [19]. Similar to Step 1, we can derive the posterior distribution of the parameters given θ, p(φ_d, τ_2², τ_3² | d, θ, φ̂_S, τ̂_1²), and the posterior distribution of θ given the parameters, p(θ | d, φ_d, τ_2², τ_3², φ̂_S, τ̂_1²).

Proposition 3.3. The conditional posterior distribution of φ_d, τ_2² and τ_3² is given as

p(φ_d, τ_2², τ_3² | d, θ, φ̂_S, τ̂_1²) ∝ p(φ_d) (τ_2²)^{−a_d−3/2} (τ_3²)^{−a_e−1} |V_S|^{−1/2} |V_{z|·}|^{−1/2} |V_d|^{−1/2} |A_{z(β_S)}|^{1/2} |A_{z(β_d)}|^{1/2} × [ g_S + g_d/τ_2² + g_e/τ_3² + ( b_S^T V_S^{−1} b_S + b_d^T (τ_2² V_d)^{−1} b_d − u_{z(β_d)}^T A_{z(β_d)} u_{z(β_d)} + O )/2 ]^{−(a_S + a_d + a_e + n/2)},   (13)

where φ̂_S and τ̂_1² are the estimates from Step 1, treated as known, and

O_{z|·} = R_S(D_z(θ), D_S) [ R_S(D_S, D_S) + τ̂_1² I_N ]^{−1},

V_{z|·} = R_S(D_z(θ), D_z(θ)) + τ_2² R_d(D_z, D_z) + τ_3² I_n − O_{z|·} R_S(D_S, D_z(θ)),

M = ( H_S(D_z(θ)) − O_{z|·} H_S(D_S) )^T V_{z|·}^{−1},

A_{z(β_S)} = [ ( H_S(D_z(θ)) − O_{z|·} H_S(D_S) )^T V_{z|·}^{−1} ( H_S(D_z(θ)) − O_{z|·} H_S(D_S) ) + V_S^{−1} ]^{−1},

A_{z(β_d)} = [ ( M H_d(D_z) )^T A_{z(β_S)} ( M H_d(D_z) ) + (τ_2² V_d)^{−1} + H_d(D_z)^T V_{z|·}^{−1} H_d(D_z) ]^{−1},

u_{z(β_d)} = ( M H_d(D_z) )^T A_{z(β_S)} ( M (z − O_{z|·} y) + V_S^{−1} b_S ) + H_d(D_z)^T V_{z|·}^{−1} ( z − O_{z|·} y ) + (τ_2² V_d)^{−1} b_d,

O = ( z − O_{z|·} y )^T V_{z|·}^{−1} ( z − O_{z|·} y ) − ( M (z − O_{z|·} y) + V_S^{−1} b_S )^T A_{z(β_S)} ( M (z − O_{z|·} y) + V_S^{−1} b_S ).

Proof. Similar to the proof of (10).

Proposition 3.4. The conditional posterior distribution of θ is given as

p(θ | d, φ_d, τ_2², τ_3², φ̂_S, τ̂_1²) ∝ p(θ) |V̄_d(θ)|^{−1/2} |V|^{−1/2} |A|^{1/2} [ g_S + ( b^T V^{−1} b + d^T V̄_d(θ)^{−1} d − u^T A u )/2 ]^{−(a_S + (N+n)/2)},   (14)

where V̄_d(θ) = V_d(θ)/σ_S², and A and u are as in (10).

Proof. Similar to the proof of (10).
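The alternating maximization used in the estimation procedure of Section 3.2.2 — update θ given (φ_d, τ_2², τ_3²), then update (φ_d, τ_2², τ_3²) given θ — can be sketched on a made-up two-parameter objective standing in for the conditional posteriors; the objective, grids, and tolerances below are all illustrative.

```python
# Sketch of the E/M-style alternation of Section 3.2.2 on a toy objective:
# fix one argument, maximize over the other on a grid, and iterate until
# the estimates stop changing. The objective is a made-up smooth function
# standing in for the true conditional posteriors.
def objective(theta, phi):
    # toy unimodal "posterior", maximized at theta = 1, phi = 2
    return -((theta - 1.0) ** 2) - 0.5 * (phi - 2.0) ** 2

grid = [0.1 * k for k in range(0, 41)]    # 0.0, 0.1, ..., 4.0

theta, phi = 0.0, 0.0                     # initial estimators
for _ in range(50):
    new_theta = max(grid, key=lambda t: objective(t, phi))       # E-like step
    new_phi = max(grid, key=lambda p: objective(new_theta, p))   # M-like step
    if abs(new_theta - theta) < 1e-12 and abs(new_phi - phi) < 1e-12:
        theta, phi = new_theta, new_phi
        break                             # converged
    theta, phi = new_theta, new_phi
```

Each pass improves the objective, so for a well-behaved posterior the iterates settle at a joint maximizer, mirroring the termination rule in the procedure.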
To obtain the maximizer of the posterior p(φ_d, τ_2², τ_3² | d, θ, φ̂_S, τ̂_1²), we use the following procedure:

1) Choose initial estimators φ_d^(0), τ_2^2(0), τ_3^2(0).
2) E step: calculate the posterior mode θ̂ with respect to p(θ | d, φ_d^(0), τ_2^2(0), τ_3^2(0), φ̂_S, τ̂_1²).
3) M step: maximize p(φ_d, τ_2², τ_3² | d, θ̂, φ̂_S, τ̂_1²) for the given θ̂ to obtain the updated φ_d^(i), τ_2^2(i), τ_3^2(i).
4) The procedure terminates if the maximum iteration number is reached or if the difference between two successive estimators is small enough (see [19]); otherwise, return to the E step.

The standard optimization functions in MATLAB can be used to find the maximizers of the conditional posterior distributions. The final estimators obtained are taken as the best estimators of these parameters, and the posterior distribution of θ is derived by plugging these estimates into (14).

With the estimated parameters φ̂ and τ̂², the conditional posterior distribution p(θ | φ̂, τ̂², d) can be used to make inference about θ, such as quantifying the calibration parameter uncertainty, and this uncertainty can be accounted for in the subsequent prediction by integrating out θ with respect to p(θ | φ̂, τ̂², d). In general, the posterior mean is often used as the best value for future simulation runs. Alternatively, the posterior mode of θ can be set as the calibration parameter value in future simulation runs. The posterior mode is preferred when some "outlying" values exist, as it is less sensitive to such values. It is important to choose the appropriate best value according to the shape of the posterior distribution, especially for bimodal and multimodal distributions. If the posterior distribution is complicated, the posterior mean and mode(s) should be compared in terms of their predictive performance in order to select the best one.

3.3. Prediction

With the calibrated model, predictions can then be made for the computer model and the real process.
In this section, we derive the predictive distributions for both the computer model and the real process. We also derive the predictive distribution for
the inadequacy term, which can be used in the model validation process.

3.3.1. Predictive distribution for the computer model

Let S(x_0) denote the true simulation output we want to predict at x_0 with θ. Given the data y, we obtain the following proposition.

Proposition 3.5. Assume that φ_S and τ_1² are known. The computer model prediction at x_0 with θ follows a non-central t distribution,

S(x_0) | y, φ_S, τ_1², θ ~ T_1( ν_{S|y}, μ_{S|y}, σ_{S|y}² ),   (15)

where ν_{S|y} = N + 2a_S,

μ_{S|y} = h_S(x_0, θ)^T β̂_{S|y} + R_S((x_0, θ), D_S) [ R_S(D_S, D_S) + τ_1² I_N ]^{−1} ( y − H_S(D_S) β̂_{S|y} ),  with β̂_{S|y} = A_{S|y}^{−1} B_{S|y} y,

and

Q_{S|y}² = 2g_S + y^T ( [ R_S(D_S, D_S) + τ_1² I_N ]^{−1} − B_{S|y}^T A_{S|y}^{−1} B_{S|y} ) y + ( b_S − A_{S|y}^{−1} B_{S|y} y )^T ( V_S + A_{S|y}^{−1} )^{−1} ( b_S − A_{S|y}^{−1} B_{S|y} y ).

Proof. See Appendix B.

With this distribution, it is easy to quantify the computer model predictive uncertainty and to obtain pointwise prediction intervals for the computer model output. The 100(1 − α)% prediction interval for the computer model output at any input x with a specified α is

μ_{S|y} ± t_{ν_{S|y}, α/2} σ_{S|y},   (16)

where t_{ν_{S|y}, α/2} is the upper α/2 critical point of a univariate t distribution with ν_{S|y} degrees of freedom.

3.3.2. Predictive distribution for the real process

Let z(x_0) be the real process output to be predicted at x_0. Given the full set of available data d, we obtain the following proposition.

Proposition 3.6. Assume that φ and τ² are known. The real process prediction at x_0 follows a non-central t distribution,

z(x_0) | d, φ, τ², θ ~ T_1( ν_{z|d}, μ_{z|d}, σ_{z|d}² ),   (17)

with

V_z(x_0, θ) = ( R_S((x_0, θ), D_S), R_S((x_0, θ), D_z(θ)) + τ_2² R_d(x_0, D_z) )^T,

β̂_{z|d} = ( A_{z|d} + V^{−1} )^{−1} ( B_{z|d} d + V^{−1} b ),  A_{z|d} = H(θ)^T V̄_d(θ)^{−1} H(θ),  B_{z|d} = H(θ)^T V̄_d(θ)^{−1},

and

σ_{z|d}² = ( Q_{z|d}² / ν_{z|d} ) { (1 + τ_2²) − ( h(x_0, θ)^T, V_z(x_0, θ)^T ) [ V^{−1} , H(θ)^T ; H(θ) , V̄_d(θ) ]^{−1} ( h(x_0, θ) ; V_z(x_0, θ) ) },

where V̄_d(θ) = V_d(θ)/σ_S².

With this distribution, it is easy to quantify the real process predictive uncertainty and to obtain pointwise prediction intervals for the real process output. The 100(1 − α)% prediction interval for the real process at any input x with a given α is

μ_{z|d} ± t_{ν_{z|d}, α/2} σ_{z|d},   (18)

where t_{ν_{z|d}, α/2} is the upper α/2 critical point of a univariate t distribution with ν_{z|d} degrees of freedom.
3.3.3. Predictive distribution for the inadequacy term The predictive distribution of the inadequacy d(x) can be used to validate the computer model so as to check whether the computer model matches the real process well. [11] provided a validation procedure using this predictive distribution. Therefore, it is useful to derive the predictive distribution of the model inadequacy term. Let d(x0) denotes the inadequacy term to be predicted at x0 and DSx ¼ {x1,y,xN} denote the set of variable input points where the computer model outputs are available. To derive the predictive distribution of the inadequacy term, two situations need to be considered. The first situation: when DzADSx, we obtain the following proposition.
with
mz9d ¼ hðx0 , yÞT bz9d þ V z ðx0 , yÞT V d ðyÞ1 dHðyÞbz9d ,
Proof. Similar to Proposition 3.5.
where t nz9d , a=2 is the upper a/2 critical point of a univariate t distribution with nz9d degrees of freedom.
1 yHS ðDS ÞbS9y with bS9y ¼ AS9y þ V S 1 BS9y yþ V S 1 bS , 1 AS9y ¼ HS ðDS ÞT RS ðDS ,DS Þ þ t21 IN HS ðDS Þ and 1 T BS9y ¼ HS ðDS Þ RS ðDS ,DS Þ þ t21 IN ; 8 2 < Q S9y T s2S9y ¼ 1ðhS ðx0 , yÞ ,RS ððx0 , yÞ,DS ÞÞ nS9y : !1 !9 = hS ðx0 , yÞ HS ðDS ÞT V 1 S , 2 ð D , ð x , y Þ Þ R ; HS ðDS Þ RS ðDS ,DS Þ þ t1 IN S S 0
where nz9d ¼n þN þ2aS,
T Q 2z9d ¼ 2 1 þ t22 gS þ d 1 þ t22 V d ðyÞ1 BTz9d A1 z9d Bz9d d 1 T þ bS A1 V S þ d þA1 bS A1 z9d Bz9d d z9d z9d Bz9d d
mz9d ðxÞ 7 sz9d ðxÞ tnz9d , a=2 ,
1
mS9y ðxÞ 7 sS9y ðxÞ tnS9y , a=2 ,
with
Proposition 3.7. Under the assumptions that D_z ⊆ D_{Sx} and that φ and τ are known, the model inadequacy prediction at x₀ follows a non-central t distribution

δ(x₀) | z − y(D_z), φ, τ, θ ~ T₁(ν_{δ|d}, m_{δ|d}, σ²_{δ|d}),   (19)

where ν_{δ|d} = n + 2a_S,

m_{δ|d} = h_δ(x₀)ᵀ β̂_{δ|d} + R_δ(x₀,D_z) [R_δ(D_z,D_z) + (τ₃²/τ₂²) I_n]^{-1} (z − y(D_z) − H_δ(D_z) β̂_{δ|d}),

with β̂_{δ|d} = (A_{δ|d} + V_δ^{-1})^{-1} (B_{δ|d}(z − y(D_z)) + V_δ^{-1} b_δ), A_{δ|d} = H_δ(D_z)ᵀ [R_δ(D_z,D_z) + (τ₃²/τ₂²) I_n]^{-1} H_δ(D_z), and B_{δ|d} = H_δ(D_z)ᵀ [R_δ(D_z,D_z) + (τ₃²/τ₂²) I_n]^{-1};

σ²_{δ|d} = (Q²_{δ|d}/ν_{δ|d}) { 1 − (h_δ(x₀)ᵀ, R_δ(x₀,D_z)) [ V_δ^{-1}, H_δ(D_z)ᵀ ; H_δ(D_z), R_δ(D_z,D_z) + (τ₃²/τ₂²) I_n ]^{-1} ( h_δ(x₀) ; R_δ(D_z,x₀) ) },

with

Q²_{δ|d} = 2τ₂² γ_S + (z − y(D_z))ᵀ( [R_δ(D_z,D_z) + (τ₃²/τ₂²) I_n]^{-1} − B_{δ|d}ᵀ A_{δ|d}^{-1} B_{δ|d} )(z − y(D_z)) + (b_δ − A_{δ|d}^{-1} B_{δ|d}(z − y(D_z)))ᵀ (V_δ + A_{δ|d}^{-1})^{-1} (b_δ − A_{δ|d}^{-1} B_{δ|d}(z − y(D_z))).

Proof. Similar to Proposition 3.5.

The second situation: when D_z ⊄ D_{Sx}, we can either run the computer simulations at the points in D_z but not in D_{Sx} to make D_z ⊆ D_{Sx}, if the model is not expensive to run, or we can use the predictive mean values (obtained from Proposition 3.5) of the computer model outputs at the points in D_z but not in D_{Sx}. Additional uncertainties arise when the predictive mean values are used. However, these may be small if the available computer
model data are sufficiently large, as the accuracy of the prediction usually improves with the number of observed data points. After obtaining the computer model outputs at all points in D_z, we have the same situation as the first, and hence Proposition 3.7 follows.

3.4. Validation

Various validation methods have been proposed to assess the validity of computer models [1]. A simple and straightforward method is to graphically compare the computer model outputs to the test data to assess the performance. The predictive intervals obtained from the predictive distributions in the previous section can be used for validation purposes. For instance, the validation approach proposed in [20] is to reject the computer model at an input point if the predictive interval for the real process (or the inadequacy term) does not contain the computer model output (or zero), and to fail to reject it otherwise. [11] proposed a validation approach based on the predictive interval for the inadequacy term, which can also be used with our current calibration procedure. First, it is easy to obtain the pointwise prediction intervals for the inadequacy term. From (19), the 100(1−α)% prediction interval for the inadequacy term at any input x is
m_{δ|d}(x) ± s_{δ|d}(x) t_{ν_{δ|d}, α/2},   (20)

where t_{ν_{δ|d}, α/2} is the upper α/2 critical point of a univariate t distribution with ν_{δ|d} degrees of freedom. Let l_δ(x) and u_δ(x) be the lower and upper bounds of the 100(1−α)% prediction interval. Define

Δ_min(x) = 0 if 0 ∈ (l_δ(x), u_δ(x)), and Δ_min(x) = min{|l_δ(x)|, |u_δ(x)|} otherwise,

and

Δ_max(x) = max{|l_δ(x)|, |u_δ(x)|}.

Let Δ₀ be the maximum allowable deviation between the computer model and the real process, specified based on expert knowledge. Then the following rules are used to validate the computer model. If Δ_min(x) > Δ₀, the computer model is rejected at x; if Δ_max(x) < Δ₀, the computer model is accepted at x; if Δ_min(x) < Δ₀ < Δ_max(x), no conclusion can be made and more experimental data are required.

It is important to validate the computer model after calibration to assess the predictive capability of the calibrated model. If the model is inadequate, it is usually necessary to collect more data to further update the model or to build a more accurate model. If the model is validated, the calibrated and validated model can be applied for its intended use.

4. Sequential approach

Most studies on computer model calibration and prediction use space filling designs such as the Latin hypercube design (LHD) to select all the experimental data at one time. In order to improve the model accuracy and predictive performance, better allocation of limited resources to the design points can be applied, especially when additional resources are available. Here, for stochastic computer model calibration and prediction, we address this design problem via a two-stage sequential design approach that improves the calibration accuracy and predictive performance by reducing the overall mean square prediction error (MSPE). Other criteria mentioned in [21], such as the expected improvement, entropy and distance-based criteria, can also be used in the proposed approach.

4.1. Two-stage sequential design approach

Stage 1: Use a space filling design (e.g., an LHD) to obtain the initial data. Implement the calibration procedure described in Section 3.2 and obtain the predictive distributions derived in Section 3.3.

Stage 2: Run a follow-up experiment based on a given design criterion. Update the calibration and prediction results using the entire data set.

The percentage of the total resources allocated to the initial stage should be decided according to the practical problem itself and the amount of available resources. If a large amount of resources is available, the percentage allocated to the initial stage can be less than 50%, so that more resources can be allocated later in a more directed manner in order to reduce the integrated mean square prediction error (IMSPE). If the total resources are limited, the percentage can be more than 50%, to first focus on obtaining a more accurate surrogate model. As the main purpose is to use the calibrated model for prediction, in the second stage we choose the sequential design criterion to reduce the overall prediction error. Since the computer model and the real process may both be available for the additional runs, we propose sequential designs for two scenarios.
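Before turning to the design scenarios, the interval-based validation rule of Section 3.4 can be sketched in code. This is an illustrative sketch: `l_delta`, `u_delta` and `delta0` are hypothetical inputs standing in for the interval bounds and the expert-specified tolerance Δ₀.

```python
def validate_point(l_delta, u_delta, delta0):
    """Interval-based validation rule of Section 3.4 (sketch).

    l_delta, u_delta: bounds of the 100(1-alpha)% prediction interval for
    the inadequacy term at an input x; delta0: maximum allowable deviation
    between computer model and real process, from expert knowledge.
    """
    if l_delta < 0.0 < u_delta:            # zero inside the interval
        d_min = 0.0
    else:
        d_min = min(abs(l_delta), abs(u_delta))
    d_max = max(abs(l_delta), abs(u_delta))
    if d_min > delta0:
        return "reject"                    # inadequacy clearly exceeds tolerance
    if d_max < delta0:
        return "accept"                    # inadequacy clearly within tolerance
    return "inconclusive"                  # more experimental data required

# usage: interval (-0.5, 0.4) contains 0 and lies well within delta0 = 1.0
print(validate_point(-0.5, 0.4, 1.0))
```

The three-way outcome mirrors the rule in the text: reject when Δ_min(x) > Δ₀, accept when Δ_max(x) < Δ₀, and defer otherwise.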
Scenario 1. Only computer experiments are available for the sequential design; the real process data are fixed.

In this scenario, adding more computer simulation data may significantly improve the fit of the Gaussian process model to the computer model. The improvement to the real process prediction may not be as obvious. However, since the final purpose is to improve the predictive performance for the real process, the follow-up design criterion can be based either on the computer model predictive performance or on the real process predictive performance. We use the integrated mean squared prediction error (IMSPE) as the criterion to select the follow-up designs so as to reduce the overall predictive uncertainty. Given the prediction variance σ²_{S|y} of the computer model in (15) and the prediction variance σ²_{z|d} of the real process in (17), assume there are n initial design points D₁ = {(x₁,t₁), …, (xₙ,tₙ)} for the computer model, and let yₙ denote the simulation outputs at these design points. We would like to select m new design points D₂ = {(x′₁,θ), …, (x′_m,θ)} that minimize the IMSPE either for the computer model or for the real process. When the criterion is based on the computer model, it first improves the computer model predictive performance, which in turn results in better real process predictive performance. When the criterion is based on the real process, it directly improves the real process predictive performance. Let x_c = {x′₁, x′₂, …, x′_m}, let Ŝ_m denote the predicted simulation outputs at the new design points, let Θ denote the domain of θ, and let χ denote the domain of the variable input x. The calibration parameter uncertainty is accounted for in the prediction error by integrating out θ. As this involves looking at the design problem before the real experiment is run, we take the prior expected value of the posterior variance.
Then the design criteria based on the computer model and the real process are as follows.

Minimize the EIMSPE based on the computer model prediction:

J_c(x_c, Ŝ_m) = min_{x_c} ∫_Θ ∫_χ E[σ²_{S|y, Ŝ_m, x_c}] p(θ | d, φ̂, σ̂², τ̂) dx dθ;   (21)

Minimize the EIMSPE based on the real process prediction:

J_r(x_c, Ŝ_m) = min_{x_c} ∫_Θ ∫_χ E[σ²_{z|d, Ŝ_m, x_c}] p(θ | d, φ̂, σ̂², τ̂) dx dθ.   (22)
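As an illustration of how such a criterion might be evaluated numerically, the double integral can be approximated by Monte Carlo: draw θ from its posterior, average the predictive variance over a grid of x values, and compare candidate designs. This is a toy sketch; `post_sample` and `pred_var` are hypothetical stand-ins for the posterior sampler and the predictive variance σ²_{S|y}.

```python
import numpy as np

rng = np.random.default_rng(0)

def post_sample(size):
    # hypothetical stand-in for draws from p(theta | d, phi, sigma^2, tau)
    return rng.normal(5.0, 1.0, size)

def pred_var(x, theta, design):
    # hypothetical stand-in for the predictive variance at inputs x given a
    # candidate design; variance shrinks near design points, as for a GP
    d = np.min(np.abs(np.asarray(design)[:, None] - np.asarray(x)[None, :]), axis=0)
    return (1.0 + 0.01 * (theta - 5.0) ** 2) * (1.0 - np.exp(-d))

def eimspe(design, x_grid, n_theta=200):
    """Monte Carlo approximation of the EIMSPE double integral."""
    thetas = post_sample(n_theta)
    # inner mean over x approximates the integral over chi; outer mean over
    # posterior draws of theta approximates the integral over Theta
    return float(np.mean([np.mean(pred_var(x_grid, th, design)) for th in thetas]))

x_grid = np.linspace(-3.0, 3.0, 61)
candidates = [np.linspace(-3, 3, 5), np.linspace(-3, 3, 9)]
scores = [eimspe(c, x_grid) for c in candidates]
best = candidates[int(np.argmin(scores))]
```

Here the denser candidate design covers χ better and yields the smaller EIMSPE; in practice the minimization over x_c would use an optimizer rather than a fixed candidate list.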
The integrations in these criteria, and in the following criteria in Scenario 2, can be carried out using the numerical integration functions in MATLAB and/or an MCMC approach. The minimization can also be done using the standard optimization functions in MATLAB.

Scenario 2. Both computer experiments and real process runs are available for the sequential design.

In this scenario, we need to decide whether to allocate the additional resources to simulation runs or to real process runs by comparing the improvement in predictive performance. The overall MSPE for the real process prediction is used for the comparison. Assume there are n_S initial computer design points D¹_S = {(x₁,t₁), …, (x_{n_S}, t_{n_S})}, with y_{n_S} denoting the simulation outputs at these design points, and n_z initial real process inputs D¹_z = {x′₁, …, x′_{n_z}}, with z_{n_z} denoting the initial real observations. Therefore, there are in total n = n_S + n_z initial input points, and the available data set is d = (yᵀ_{n_S}, zᵀ_{n_z})ᵀ. The proposed allocation procedure is as follows.

First, before the resource allocation is decided, a cost comparison between a computer model run and a real process run is required. Assume the cost of adding one real process follow-up design point equals the cost of adding ω computer model follow-up design points.

Second, find the ω new computer model design points and calculate the reduced optimum EIMSPE. Let x_S = {x_{S1}, x_{S2}, …, x_{Sω}} denote the ω new points to be found if the additional resources are allocated to the computer experiments, and let Ŝ_ω denote the predicted simulation outputs at the new design points. The design criterion to minimize the overall MSPE for the real process prediction is

J(x_S, Ŝ_ω) = min_{x_S} ∫_Θ ∫_χ E[σ²_{z|d, Ŝ_ω, x_S}] p(θ | d, φ̂, σ̂², τ̂) dx dθ.   (23)

After finding x_S, we can also obtain the reduced optimum EIMSPE when the new predicted computer simulation outputs are used for prediction:

EIMSPE(x_S) = ∫_Θ ∫_χ E[σ²_{z|d, Ŝ_ω, x_S}] p(θ | d, φ̂, σ̂², τ̂) dx dθ.   (24)

Third, find the new real process design point and calculate the reduced optimum EIMSPE. Let x_r denote the new point to be found if the additional resources are allocated to the real process runs, and let ẑ_r denote the predicted real process output at the new design point. The design criterion is

J(x_r, ẑ_r) = min_{x_r} ∫_Θ ∫_χ E[σ²_{z|d, ẑ_r, x_r}] p(θ | d, φ̂, σ̂², τ̂) dx dθ.   (25)

After finding x_r, we then calculate the reduced optimum EIMSPE when the new predicted real process output is used for prediction:

EIMSPE(x_r) = ∫_Θ ∫_χ E[σ²_{z|d, ẑ_r, x_r}] p(θ | d, φ̂, σ̂², τ̂) dx dθ.   (26)
Finally, we compare EIMSPE(x_S) and EIMSPE(x_r) to decide the allocation of the resources. If EIMSPE(x_S) < EIMSPE(x_r), we run ω more computer experiments; if EIMSPE(x_S) ≥ EIMSPE(x_r), we collect more real process data. Here we consider only one additional real process run compared to ω computer model runs. It is easy to extend this to more than one real process run by replacing x_r with more points. For this scenario, we assume that only one option is available at the next stage: either the computer model runs or the real process runs are chosen. The approach can be further extended to split the resources between the computer model and the real process, either by allocating the resources point by point until they are exhausted or by comparing the performances of all possible resource allocation combinations.

After the follow-up design points with the minimized EIMSPE are obtained, the computer model experiments or the real process experiments are run at these points to obtain the new experimental data. These new data are then added to the original data, and the calibration procedure is implemented again. By using this two-stage sequential design approach, more appropriate points (with minimized EIMSPE) are selected for sampling than with a one-stage approach. This improves the fit of the GP models, as more informative data (with less predictive error) are provided. More specifically, all the estimated parameters, including the calibration parameter, are more accurate under this approach, which makes the conditional posterior distribution of the calibration parameter more precise. Moreover, as more useful information is provided for calibration, the variance of the calibration parameter should be reduced. Therefore, by using this two-stage design approach, not only is the EIMSPE reduced, but the calibration parameter uncertainty is also reduced. These results are illustrated by the numerical examples in Section 5.
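The final allocation decision of Scenario 2 reduces to a simple comparison; a sketch, where `eimspe_sim` and `eimspe_real` are the predicted EIMSPE values from (24) and (26):

```python
def allocate(eimspe_sim, eimspe_real, omega):
    """Decide the Scenario 2 resource allocation (sketch).

    eimspe_sim: predicted EIMSPE if omega computer runs are added, eq. (24)
    eimspe_real: predicted EIMSPE if one real process run is added, eq. (26)
    Returns the chosen action and the number of runs to add.
    """
    if eimspe_sim < eimspe_real:
        return ("computer", omega)   # run omega more computer experiments
    return ("real", 1)               # collect one more real observation

# with the EIMSPE values reported for the quadratic example in Section 5.1
# (equal costs, so omega = 1), the real process run is chosen
choice = allocate(2.1964, 1.5731, omega=1)
```

Extending this to several real process runs, as noted in the text, only requires carrying the corresponding count in place of the fixed 1.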
4.2. Implementation steps for calibration, prediction and validation

The sequential calibration and prediction procedure can be summarized as follows:

1) Use a space filling design such as an LHD to obtain the initial data. [22] recommends selecting at least 10 experimental points per dimension.
2) Validate the GP model (see, e.g., [16]). If the GP model is not valid, refine the surrogate model.
3) Apply these data to obtain the posterior calibration parameter distribution and the predictive distributions using the proposed approach.
4) Use the proposed sequential design approach to search for the optimal follow-up design points, with the EIMSPE as the metric minimized by the optimal follow-up design points.
5) Run the experiments at these new design points to obtain the new experimental data.
6) Use all the available data to obtain the updated posterior calibration parameter distribution and the predictive distributions.
7) Validate the calibrated computer model (see, e.g., [1]). If the computer model is not valid, collect more data and/or revise the computer model until it is adequate for its intended use.
8) Apply the calibrated and validated model to its intended use.
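The eight steps above can be organized as a driver routine. This is only a skeleton: every stage function passed in (`initial_design`, `fit_and_calibrate`, `gp_valid`, `follow_up_design`, `run_experiments`, `model_valid`) is a hypothetical placeholder supplied by the user.

```python
def run_pipeline(initial_design, fit_and_calibrate, gp_valid,
                 follow_up_design, run_experiments, model_valid):
    """Skeleton of the implementation steps in Section 4.2 (sketch).

    All arguments are caller-supplied functions; the names are illustrative.
    """
    data = initial_design()                       # step 1: space filling design
    model = fit_and_calibrate(data)               # step 3: posterior + predictions
    if not gp_valid(model):                       # step 2: validate GP surrogate
        model = fit_and_calibrate(data)           #   refine the surrogate model
    new_points = follow_up_design(model)          # step 4: minimize EIMSPE
    data = data + run_experiments(new_points)     # step 5: new experimental data
    model = fit_and_calibrate(data)               # step 6: update posteriors
    while not model_valid(model):                 # step 7: validate calibrated model
        data = data + run_experiments(follow_up_design(model))
        model = fit_and_calibrate(data)
    return model                                  # step 8: ready for intended use

# toy usage: the "model" is just the number of data points collected
final_model = run_pipeline(lambda: [1], lambda d: len(d), lambda m: True,
                           lambda m: [2], lambda p: p, lambda m: True)
```

The step 7 loop also matches the multi-stage extension described next: steps 4–6 repeat until the validation (or another stopping criterion) is satisfied.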
In the second stage of this two-stage approach (steps 4–6), all the remaining design points are found at one time. The follow-up design can also be allocated one point at a time. Although only two stages are considered in our sequential calibration and prediction approach, it is easily extended to a multi-stage sequential approach by applying the second stage recursively (implementing steps 4–6 iteratively). If the multi-stage sequential approach is used, the EIMSPE can still be used as the metric to determine the follow-up design points. A stopping criterion is also required when there are multiple stages, for example stopping when the change in the EIMSPE value is small enough or when the available resources are exhausted. Therefore, for the multi-stage approach, such a stopping criterion should be added in step 6, and steps 4–6 are implemented recursively until the stopping criterion is met.
5. Examples

In this section, two examples are used to illustrate the calibration and prediction accuracy of the proposed approach. The first, a quadratic example, is used for illustrative purposes, to assess our approach on a model with known analytical solutions. The second is a practical mitochondrial DNA deletions example, where the calibrated computer model can be further used to evaluate intervention policies in public health and safety. We compare the proposed stochastic model calibration and prediction approach (stochastic approach) to the approach that considers only a deterministic computer model (deterministic approach), to evaluate the benefits of accounting for the stochastic error. We also compare the approach to the full MCMC approach that accounts for all the parameters' uncertainties, to check that the predictive performance is not significantly influenced by estimating some parameters. Furthermore, we compare the proposed sequential approach to a one-stage approach that uses an LHD to select all the experimental data, to illustrate the improvement obtained with the sequential approach. For the DNA deletions example, we use some test data to validate the calibrated model so as to verify the accuracy of the proposed approach. The Gaussian process models in both examples are assumed to have a constant mean, which is reasonable in many practical applications [15]. Following [18], we select the "location-flat" priors N(0, σ²_S) and N(0, σ²_δ) for β_S and β_δ. The "vague" prior IG(2, 1) is chosen for σ²_S, σ²_δ and the error variances, and G(2, 0.1) is chosen as the prior for each element of φ.
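These prior choices can be written down concretely. The sketch below uses scipy.stats; the shape/scale parameterizations of IG(2,1) and G(2,0.1) are assumptions (the paper does not state scipy's conventions), with G(2,0.1) read as shape 2 and rate 0.1.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# "vague" inverse gamma IG(2, 1) prior for the variance components
# (assumed parameterization: shape a = 2, scale = 1)
var_prior = stats.invgamma(a=2.0, scale=1.0)

# G(2, 0.1) prior for each correlation parameter in phi
# (assumed: shape 2, rate 0.1, i.e. scale 10 in scipy's convention)
phi_prior = stats.gamma(a=2.0, scale=1.0 / 0.1)

# "location-flat" normal prior N(0, sigma_S^2) for the mean coefficient
# beta_S, conditional on a draw of sigma_S^2
sigma_s2 = var_prior.rvs(random_state=rng)
beta_prior = stats.norm(loc=0.0, scale=np.sqrt(sigma_s2))

# prior means: IG(2,1) has mean 1/(a-1) = 1; G(shape 2, rate 0.1) has mean 20
print(var_prior.mean(), phi_prior.mean())
```

Under these assumptions the prior means are finite and mild, consistent with the "vague" label used in the text.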
5.1. The simple quadratic example We first use a simple quadratic function to illustrate the proposed method. The real process model is assumed to be
z = θ*x² + e with e ~ N(0, 4). The computer model is coded as y = θx² + ε with ε ~ N(0, σ²_ε). Here the discrepancy term is set to 0 so that the influence of the stochastic error on the predictive performance can be clearly identified. We further assume that the true value of θ* is 5.000 and that θ has a uniform prior distribution θ ~ U(0, 10). For the real process, seven experimental input points are collected evenly from x ∈ [−3, 3], with one real observation at each input point. For the computer model, the LHD method is used to collect 20 input sets with x ∈ [−3, 3] and θ ~ U(0, 10). Each input set has 100 replications, whose mean is taken as the simulation output.

We first compare the calibration and prediction results for low (σ²_ε = 10) and high (σ²_ε = 1000) variance values using the deterministic approach (A1), the proposed stochastic approach (A2), and the full MCMC approach (A3) that accounts for all the uncertainties. The results are shown in Fig. 1. They indicate that when the variance of the stochastic error is small, all three approaches perform well for both calibration and prediction. When the variance of the stochastic error is large, plot (c) in Fig. 1 shows that the calibration performance of both the MCMC approach and the proposed approach is better than that of the deterministic approach, as their calibration results are closer to the true calibration parameter value. More specifically, the posterior mean/mode of the calibration parameter for the MCMC approach (4.992/5.013) and the proposed approach (4.986/4.997) are closer to the true value 5.000 than for the deterministic approach (4.574/4.628). The results also indicate that the posterior variance of the calibration parameter for both the MCMC approach (0.973) and the proposed approach (0.971) is slightly larger than for the deterministic approach (0.956). This is reasonable, as the deterministic approach does not account for the stochastic simulation uncertainty in calibration.
Fig. 1. Posterior distributions of θ, predictive means and 95% predictive intervals of the real process in the low variance scenario ((a) and (b)) and the high variance scenario ((c) and (d)) for the different approaches.

From plot (d) in Fig. 1, it can be seen that both the MCMC approach and the
proposed approach predict the true objective value better than the deterministic approach. For a more accurate comparison, we implement the calibration and prediction procedure and calculate the root mean square prediction error (RMSPE) for 10 sets of randomly observed real process and computer model data. The average RMSPEs for the three approaches are obtained, and a two-sample t-test is used to compare the predictive performance between each pair of approaches (at α = 0.05). The results show that the average RMSPEs of the MCMC approach (2.319) and the proposed approach (2.385) are not significantly different, while both are significantly smaller than that of the deterministic approach (6.306). From these results, we conclude that it is important to consider the stochastic error in calibration and prediction when this error cannot be ignored. The results also indicate that the calibration and prediction performance of the MCMC approach and the proposed approach are almost the same. This shows that estimating some parameters in the proposed approach does not significantly influence the calibration results or the predictive mean of the real process. Moreover, comparing the computational time required to obtain the calibration and prediction results, the proposed approach is about 23 times faster than the MCMC approach. Therefore, the proposed approach obtains calibration and prediction performance equivalent to MCMC while being generally much faster. To further check the effect of estimating some parameters on the overall prediction error, we compare the difference in the real process predictive variance between the proposed approach and the MCMC approach under the high variance scenario. For a better comparison, 10 sets of randomly observed real process and computer model data are used.
For each set of data, we calculate the predictive variance of the real process at 100 input points evenly located in the domain of x for both approaches. We then obtain the difference in the predictive variance at each point between the two approaches. We use the ratio of this underestimated variance to the overall variance obtained from MCMC to assess the impact of the uncertainties ignored by estimating parameters. The mean, minimum and maximum ratios among the 100 points for the 10 sets of data are shown in Table 1. The results indicate that the underestimated variance of the proposed approach is mostly less than 1% of the total predictive variance, which means the effect of estimating these parameters on predictive performance is not significant. Therefore, it is practical to estimate these parameters, which have little impact on predictive performance but can significantly reduce the computational burden.

We now consider the sequential design issue for this example. The scenario with a variance of 1000 is taken to illustrate the sequential approach, for the situation in which both the computer experiments and the real process runs are available for sequential design. The proposed two-stage sequential approach is used to select the best follow-up design points. Its calibration and prediction performance is then compared to the one-stage approach, which uses an LHD to select all the design points at one time.
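The RMSPE used throughout these comparisons is the root mean squared difference between the predictions and the true process values over a set of test inputs; a minimal sketch:

```python
import numpy as np

def rmspe(y_true, y_pred):
    """Root mean square prediction error over a set of test points."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# illustrative numbers (not the paper's): the true quadratic process
# z = 5 x^2 evaluated against a slightly miscalibrated prediction 4.9 x^2
x = np.linspace(-3.0, 3.0, 7)
err = rmspe(5.0 * x**2, 4.9 * x**2)
```

Averaging this quantity over macroreplications of the design gives the average RMSPE values reported in the text.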
Table 1
Mean, minimum and maximum ratios of the underestimated predictive variance using the proposed approach to the overall predictive variance using the MCMC approach, among 100 evenly selected input points, for 10 sets of random observed data.

Ratio   Set 1    Set 2    Set 3    Set 4    Set 5    Set 6    Set 7    Set 8    Set 9    Set 10
Mean    0.0082   0.0095   0.0085   0.0076   0.0088   0.0079   0.0091   0.0084   0.0077   0.0089
Min     0.0056   0.0061   0.0054   0.0049   0.0052   0.0047   0.0058   0.0053   0.0051   0.0055
Max     0.0131   0.0137   0.0098   0.0102   0.0125   0.0092   0.0117   0.0106   0.0096   0.0119

It is assumed that there are initially 6 real process design points and 15 computer model LHD points. First, we assume the cost of adding one real process design point equals the cost of adding one computer model design point, and that five additional new design points can be selected, either from the real process or from the computer model. Second, we find the new design points when computer model runs are selected, based on (23); the calculated new EIMSPE is 2.1964. Third, we find the new design points when real process data are selected, based on (25); the calculated new EIMSPE is 1.5731. Since the EIMSPE with more real process design points is smaller than with more computer model design points, we collect more real process data at the new design points. This is expected, as more real process data are favored when the costs of adding real process and computer model design points are the same while the initially obtained real process data are fewer than the initially obtained simulation data. In order to check the EIMSPE values, we calculate the observed average EIMSPE over 10 macroreplications of the design after obtaining the new computer model data and the real process data. Table 2 summarizes the predicted EIMSPE values calculated for allocation purposes and the observed average EIMSPE after the experimental data are obtained. The predicted EIMSPE is close to the observed average EIMSPE, and both sequential design scenarios reduce the EIMSPE significantly.

Table 2
Predicted EIMSPE and observed EIMSPE for both sequential design scenarios.

                                               Predicted EIMSPE   Observed average EIMSPE
Origin EIMSPE                                  4.2854             –
After six more computer model design points    2.1964             2.1687
After six more real process design points      1.5731             1.5672

We then compare the calibration and prediction performance of this two-stage approach to the one-stage approach. As the remaining resources are allocated to the real process, there are in total 11 real process design points and 15 computer model design points. Hence, in the one-stage approach, we select 11 real process design points evenly from the input space, while the 15 computer model design points are the same as in the initial stage. Fig. 2 shows the posterior distribution of the calibration parameter and the predictive mean and 95% predictive interval of the real process for the proposed two-stage approach with only initial stage data (M1), with five more real process design points (M2), and for the one-stage approach (M3). The results show that the posterior calibration parameter distributions from all the approaches have almost the same posterior mean and mode, which are close to the true value. To compare the posterior variance of the calibration parameter for the different design approaches, a sample of posterior variances is obtained from 10 macroreplications of the design for each approach. A two-sample t-test is then used to compare the difference in posterior variance between the two-stage approach and the one-stage approach. The results show that the posterior variance of the calibration parameter from the two-stage approach with more real data (mean variance 0.952) is significantly smaller than that of the one-stage approach (mean variance 1.108). A smaller variance means the calibration results are more precise, as the uncertainty about the calibration parameter is smaller. Therefore, the two-stage approach with more real data performs better than the one-stage approach. For the predictive performance, the right plot in Fig. 2 shows that the two-stage approach with more real process points predicts the real process better than the one-stage approach. For a better comparison, the average root mean square prediction error (RMSPE) over 10 macroreplications of the design and a two-sample t-test are used to compare the predictive performance of the different approaches. The results show that the average RMSPE of the two-stage approach with more real data is 1.4267, which is significantly smaller than the average RMSPE of the one-stage approach, 1.9683.

Fig. 2. Posterior distributions of θ (left), predictive means and 95% predictive intervals of the real process (right) for the different approaches.
5.2. Mitochondrial DNA deletions example

Ref. [4] described a biological model of mtDNA population dynamics which simulates the accumulation of mtDNA deletions in substantia nigra neurons and the death of neurons. This model is helpful for accurately assessing interventions designed to halt or reverse the decline in neurons associated with Parkinson's disease. It can also be used to predict changes in the incidence of Parkinson's disease in aging populations [4]. Due to the complex nature of the real process and the limited data resources, it is important to first calibrate the model well. We use this model to assess the accuracy of the proposed calibration and prediction approach.

The population of mtDNA within a cell includes two species: the number of copies of healthy mtDNA, denoted by Y₁, and the number of copies of mtDNA with deletions, denoted by Y₂. There are three unknown parameters in this biological model that need to be calibrated: the mutation rate c₁, the degradation rate c₃, and the lethal threshold t. The lethal threshold determines the death of the cell: a cell dies if its proportion of mtDNA deletions, p = Y₂/(Y₁ + Y₂), is equal to or larger than the lethal threshold t. In our discussion, the threshold t is treated as known for simplicity, and its value is taken as the posterior mean t = 0.962 obtained from [4]. Therefore, the unknown parameters c₁ and c₃ are considered, and their log-transformed parameters θ₁ = log(c₁) and θ₂ = log(c₃) are treated as the calibration parameters.
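The quantities used in this example can be written down directly from the definitions above: the deletion proportion p, the cell death rule at the lethal threshold, and the output summary log₂(1 − p) used for calibration. A minimal sketch:

```python
import math

LETHAL_THRESHOLD = 0.962  # posterior mean of t taken from [4]

def deletion_proportion(y1, y2):
    """Proportion of mtDNA copies carrying deletions, p = Y2 / (Y1 + Y2)."""
    return y2 / (y1 + y2)

def cell_dies(y1, y2, t=LETHAL_THRESHOLD):
    """A cell dies once its deletion proportion reaches the lethal threshold t."""
    return deletion_proportion(y1, y2) >= t

def simulation_output(y1, y2):
    """Output summary used for calibration: log2(1 - p)."""
    return math.log2(1.0 - deletion_proportion(y1, y2))

# e.g., 900 deleted copies out of 1000 gives p = 0.9, below the threshold,
# so this cell is still alive
alive = not cell_dies(100, 900)
out = simulation_output(100, 900)
```

In the study, this output is averaged over the 10 surviving cells of each input set to give the simulation output used for calibration.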
Ref. [4] provides experimental data on the accumulation with age of mtDNA deletions in substantia nigra neurons. The observed data from technique 2 are used in this discussion. The observations for individuals aged 20, 32, 44, 56, 72, 75, 81 and 91 are used for calibration, and the remaining observations for individuals aged 19, 42, 51, 77 and 89 are used for validation. The mean of the real-time polymerase chain reaction (RT-PCR) measurement of deletion accumulation zᵢ for each age i is taken as the observed data from the real process. For the simulation outputs, we consider both the one-stage approach and the proposed two-stage sequential approach. Only computer experiments are available for the sequential design in this example, and the real observed data are fixed. For the one-stage approach, we use an LHD to select a total of 150 input sets with the given normal priors of the calibration parameters and the uniform age x ~ U(1, 110). For the two-stage approach, we first use an LHD to select 140 input sets and then use the proposed two-stage approach to select the remaining 10 input sets. Only 10 input sets are allocated to the second stage, for computational convenience and illustrative purposes. For each input set, we run the stochastic biological computer model with multiple replications until 10 cells survive. The simulation output for each input set is quantified by log₂(1 − pᵢ), where pᵢ is the sampled proportion of deletions Y₂ᵢ/(Y₁ᵢ + Y₂ᵢ). The mean of the 10 observed outputs for each input set is then taken as the simulation output. The calibration performance of the proposed two-stage approach (T1) is compared to the MCMC approach (T2) and the one-stage approach (T3). The data used for the MCMC approach are the same as for the two-stage approach. Fig. 3 shows the marginal posterior distributions of the calibration parameters for the different approaches, together with their prior distributions.
It can be seen that the uncertainty of the calibration parameters from the proposed two-stage approach (with tighter posterior distributions) is smaller than that from the one-stage approach (with more spread-out posterior distributions). For a better comparison, a sample of posterior variances is obtained from five macroreplications of each design. A two-sample t-test is then used to compare the posterior variances of the two-stage and one-stage approaches. The results show that the two-stage approach has significantly smaller posterior calibration parameter variances (with mean variance 0.512 for θ1 and 0.336 for θ2) than the one-stage approach (with mean variance 0.603 for θ1 and 0.361 for θ2). This means the proposed two-stage approach provides more precise calibration results than the one-stage
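The variance comparison described here can be reproduced in outline with a two-sample t-test. The five macroreplication variances below are illustrative numbers chosen near the reported means, not the paper's actual results.

```python
import numpy as np
from scipy import stats

# Posterior variances of theta1 over five macroreplications of each design
# (illustrative values centred near the reported means of 0.512 and 0.603).
var_two_stage = np.array([0.50, 0.52, 0.51, 0.53, 0.50])
var_one_stage = np.array([0.60, 0.61, 0.59, 0.62, 0.60])

# Two-sample t-test; a one-sided p-value tests whether the two-stage
# design yields a smaller posterior variance.
t_stat, p_two_sided = stats.ttest_ind(var_two_stage, var_one_stage)
p_one_sided = p_two_sided / 2 if t_stat < 0 else 1 - p_two_sided / 2
```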
Fig. 3. Prior together with posterior distributions of θ1 (left) and θ2 (right) for different approaches.
Table 3
Means and 95% equal-tailed CIs of calibration parameters.

Parameter   Mean      95% CI
θ1          −10.26    (−11.23, −9.27)
θ2          −4.23     (−4.88, −3.58)
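Summaries of the kind reported in Table 3 are obtained from posterior draws. A minimal sketch, with synthetic normal draws standing in for the actual posterior samples of θ1:

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic draws whose mean and spread roughly match the reported
# summary for theta1; not the paper's actual posterior samples.
draws = rng.normal(loc=-10.26, scale=0.5, size=10_000)

post_mean = draws.mean()
ci_lo, ci_hi = np.percentile(draws, [2.5, 97.5])  # equal-tailed 95% CI
```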
Fig. 4. Posterior predictive means and 95% predictive intervals for the RT-PCR measurement of deletion accumulation; the circles denote the real observations used for validation, the solid lines denote the predictive mean and interval of the proposed approach, the dashed lines denote the predictive mean and interval of the MCMC approach.
approach. The results also show that the proposed approach has almost the same performance as the MCMC approach, which indicates that the calibration results are not significantly influenced by the uncertainty of the estimated parameters ignored in the proposed approach. However, the proposed approach runs approximately 21 times faster than the MCMC approach. Compared to the prior distributions, Fig. 3 also shows that the uncertainties about θ1 and θ2 have been reduced and that the posterior mean and mode of θ2 have moved to the left. These results are similar to those given by [4]. The posterior means and 95% equal-tailed posterior confidence intervals (CIs) of the calibration parameters obtained from the proposed approach are given in Table 3. We then use the five remaining individuals' observations to assess the validity of the calibrated model and compare the proposed approach with the MCMC approach to assess the effects
of plugging in the estimated parameters. Fig. 4 shows the observed data together with the posterior predictive means and the 95% predictive intervals of the proposed approach and the MCMC approach. It can be seen that most observations lie within or fall just outside the 95% posterior predictive interval, except for the single extreme observation for the age-51 individual. This shows that the model calibrated using the proposed approach predicts the real observations well and is valid for further analysis. This result is similar to that of [4]. The calibrated and validated biological model has an improved predictive performance; hence it can provide a more reliable assessment of the real biological process for decision makers to devise better intervention strategies. Fig. 4 also shows that the differences in the predictive mean and 95% predictive interval between the proposed approach and the MCMC approach are not significant. This indicates that the effect of plugging in the estimated parameters on the predictive performance is negligible. Therefore, it is practical to estimate parameters that have insignificant effects on the predictive performance, as this can significantly reduce the computational burden.
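The validation check described above amounts to counting how many held-out observations fall inside their 95% posterior predictive intervals. A sketch with made-up numbers (the observations and interval endpoints below are illustrative, arranged so that only the age-51 point falls outside):

```python
import numpy as np

# Hypothetical validation data: held-out ages, observed RT-PCR summaries,
# and 95% posterior predictive interval endpoints (all values illustrative).
ages     = np.array([19, 42, 51, 77, 89])
observed = np.array([0.02, 0.15, 0.60, 0.35, 0.45])
pred_lo  = np.array([0.00, 0.05, 0.10, 0.20, 0.25])
pred_hi  = np.array([0.10, 0.30, 0.40, 0.55, 0.65])

# Fraction of validation observations covered by the predictive intervals.
inside = (observed >= pred_lo) & (observed <= pred_hi)
coverage = inside.mean()  # the age-51 extreme point falls outside -> 0.8
```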
6. Conclusion

When a computer model is used to assess and certify the reliability and safety of a real process or system so as to support better decision making, it is important to improve the calibration accuracy and predictive performance of the computer model. In this paper, we propose an accurate and efficient sequential approach to stochastic computer model calibration and prediction. The proposed approach accounts for various uncertainties, including the calibration parameter uncertainty, and provides analytical results for calibration and prediction. We derive the posterior distribution of the calibration parameter and the predictive distributions for both the computer model outputs and the real process outputs by combining all the available computer model data and real observations. We also derive the predictive distribution of the inadequacy term, which measures the discrepancy between the computer model and the real process. The proposed two-stage sequential approach can improve the calibration accuracy and the predictive performance by effectively allocating limited resources. Two numerical examples are provided to illustrate the accuracy of the proposed calibration approach and the improvement from using the proposed two-stage sequential approach.
Both numerical results indicate that the proposed approach performs well in calibration and prediction and that the two-stage sequential approach can efficiently use the available resources. In this paper, we adopt (1) as the relationship between the computer model and the real process. This is just one possible formulation, and users may choose other appropriate relationships for specific practical applications. In the relationship (1), it is assumed (as in many other studies) that the computer model output and the model inadequacy term are independent. However, the model inadequacy term may depend on the computer model output in some applications, so it is important to verify this assumption in practical use. We use a computer model to represent or predict the real process, and use the GP as a surrogate model for the computer model because of its generality and flexibility. However, in many real applications there may be several possible computer models, and the GP model may not be the best surrogate in all situations. Therefore, computer model form uncertainty and surrogate model uncertainty are important issues to be addressed in every real application, and it is important to ensure that the computer model and the surrogate model are valid before the calibrated model is applied for its intended use. The current discussion assumes that the stochastic error has a constant variance and is independent of the design variable and the calibration parameter. This may not be the case in some situations. Therefore, one important piece of future work is to account for the dependence of the stochastic error on the input parameters. In the proposed approach, we assume mathematically convenient conjugate priors for the unknown parameters. A more in-depth study of the effects of different priors is also an area for further study.
In this paper, we also assume that the calibration parameter is independent of the variable input and the other unknown parameters. Although this assumption is adopted in much of the calibration literature, it may not always hold, and the dependence can be explored further. In the current procedure, the EIMSPE is applied as the criterion to select design points. Other sequential design criteria, such as entropy-based criteria aimed at improving the calibration parameter estimates, can also be applied.
Acknowledgments

This research was supported by a research project grant (R-266000-055-112) funded by the Ministry of Education Academic Research Fund in Singapore.
Appendix A. List of notations

z is the real process observation;
ζ is the true output from the real process;
S is the expected stochastic computer model output;
x is the variable input;
θ is the optimum calibration parameter value;
t is the specified input calibration parameter value in the computer model;
δ is the model inadequacy term;
e is the observation error;
ε is the sampling variability inherent in a stochastic simulation;
ȳi is the average of several stochastic computer model outputs for the ith input set;
dᵀ = (yᵀ, zᵀ) denotes the full available data set from both the computer model and the real process;
N is the number of computer model design points;
n is the number of real process design points;
I denotes the identity matrix;
D_S denotes the set of input points at which the computer model outputs are available;
D_z denotes the set of input points where the real observations are available, with only x;
D_z(θ) denotes the set of input points where the real observations are available, with x and θ;
H(θ)β, H_S(·)β_S, and H_δ(·)β_δ are the Gaussian process means of d, S, and δ, respectively, where H(θ), H_S(·), and H_δ(·) are matrices of known functions and β = (β_Sᵀ, β_δᵀ)ᵀ is a vector of unknown coefficients;
V_d(θ) = σ_S² V̄_d(θ), σ_S² R_S, and σ_δ² R_δ are the covariance functions of d, S, and δ, respectively, where σ_S² and σ_δ² are the unknown variances and R_S and R_δ are the correlation matrices;
φ = {φ_S, φ_δ}, where φ_S and φ_δ are the unknown parameters in the correlation matrices R_S and R_δ, respectively;
σ_e² and σ_ε² are the unknown variances of e and ε, respectively;
b_S and V_S, b_δ and V_δ, b and V are the known parameters in the assumed normal priors of β_S, β_δ, and β, respectively;
α_S and γ_S, α_δ and γ_δ, α_e and γ_e, α_ε and γ_ε are the known parameters in the assumed inverse gamma priors of σ_S², σ_δ², σ_e², σ_ε², respectively;
τ² = {τ₁², τ₂², τ₃²} with τ₁² = σ_ε²/σ_S², τ₂² = σ_δ²/σ_S², τ₃² = σ_e²/σ_S², which are the reparameterized parameters obtained when σ_S² is integrated out of the posterior calibration parameter distribution and the predictive distributions.
Appendix B. Proof of Proposition 3.1

First, integrate out β:

$$p(\theta,\phi,\sigma^2\mid d)=\int p(\theta,\beta,\sigma^2,\phi\mid d)\,d\beta \propto p(\theta)p(\phi_S)p(\phi_\delta)p(\sigma_e^2)p(\sigma_\varepsilon^2)p(\sigma_S^2)p(\sigma_\delta^2)\,|V_d(\theta)|^{-1/2}\int p(\beta\mid\sigma_S^2,\sigma_\delta^2)\exp\!\big[-(d-H(\theta)\beta)^{T}V_d(\theta)^{-1}(d-H(\theta)\beta)/2\big]\,d\beta,$$

where

$$\int p(\beta\mid\sigma_S^2,\sigma_\delta^2)\exp\!\big[-(d-H(\theta)\beta)^{T}V_d(\theta)^{-1}(d-H(\theta)\beta)/2\big]\,d\beta \propto |\sigma_S^2 V|^{-1/2}\int\exp\!\big[-(\beta-b)^{T}(\sigma_S^2 V)^{-1}(\beta-b)/2-(d-H(\theta)\beta)^{T}V_d(\theta)^{-1}(d-H(\theta)\beta)/2\big]\,d\beta$$

(collect the terms that depend on β and group the remaining constant terms)

$$= |\sigma_S^2 V|^{-1/2}\exp\!\big[-b^{T}(\sigma_S^2 V)^{-1}b/2-d^{T}V_d(\theta)^{-1}d/2\big]\int\exp\!\big[-\beta^{T}A^{-1}\beta/2+u^{T}\beta\big]\,d\beta \propto |\sigma_S^2 V|^{-1/2}\exp\!\big[-b^{T}(\sigma_S^2 V)^{-1}b/2-d^{T}V_d(\theta)^{-1}d/2\big]\,|A|^{1/2}\exp\!\big[u^{T}Au/2\big],$$

with $A^{-1}=H(\theta)^{T}V_d(\theta)^{-1}H(\theta)+(\sigma_S^2 V)^{-1}$ and $u=H(\theta)^{T}V_d(\theta)^{-1}d+(\sigma_S^2 V)^{-1}b$. To prove $\int\exp[-\beta^{T}A^{-1}\beta/2+u^{T}\beta]\,d\beta\propto|A|^{1/2}\exp[u^{T}Au/2]$, consider a multivariate normal distribution $N_{N+n}(Au,A)$. As

$$(2\pi)^{(N+n)/2}|A|^{1/2}=\int\exp\!\big[-(\beta-Au)^{T}A^{-1}(\beta-Au)/2\big]\,d\beta=\int\exp\!\big[-\beta^{T}A^{-1}\beta/2+u^{T}\beta-u^{T}Au/2\big]\,d\beta,$$

rearranging the terms gives the result.

Second, integrate out σ_S². Collecting all the terms that depend on σ_S² and integrating it out, we obtain

$$p(\theta,\phi,\tau_1^2,\tau_2^2,\tau_3^2\mid d)=\int p(\theta,\phi,\sigma_e^2,\sigma_\delta^2,\sigma_\varepsilon^2,\sigma_S^2\mid d)\,d\sigma_S^2 \propto \int p(\theta)p(\phi_S)p(\phi_\delta)p(\sigma_S^2)p(\sigma_S^2\tau_1^2)p(\sigma_S^2\tau_2^2)p(\sigma_S^2\tau_3^2)\,|V_d(\theta)|^{-1/2}|\sigma_S^2 V|^{-1/2}|A|^{1/2}\exp\!\big[-b^{T}(\sigma_S^2 V)^{-1}b/2-d^{T}V_d(\theta)^{-1}d/2\big]\exp\!\big[u^{T}Au/2\big]\,d\sigma_S^2$$
$$\propto \int p(\theta)p(\phi_S)p(\phi_\delta)\,(\sigma_S^2)^{-\alpha_S-1}\exp(-\gamma_S/\sigma_S^2)\,(\sigma_S^2\tau_1^2)^{-\alpha_\varepsilon-1}\exp\!\big(-\gamma_\varepsilon/(\sigma_S^2\tau_1^2)\big)\,(\sigma_S^2\tau_2^2)^{-\alpha_\delta-1}\exp\!\big(-\gamma_\delta/(\sigma_S^2\tau_2^2)\big)\,(\sigma_S^2\tau_3^2)^{-\alpha_e-1}\exp\!\big(-\gamma_e/(\sigma_S^2\tau_3^2)\big)\,(\sigma_S^2)^{-(N+n)/2}|\bar V_d(\theta)|^{-1/2}(\sigma_S^2)^{-(k_S+k_\delta)/2}|V|^{-1/2}(\sigma_S^2)^{(k_S+k_\delta)/2}|\bar A|^{1/2}\exp\!\Big[-\big(b^{T}V^{-1}b+d^{T}\bar V_d(\theta)^{-1}d-\bar u^{T}\bar A\bar u\big)\big/(2\sigma_S^2)\Big]\,d\sigma_S^2$$

$$= p(\theta)p(\phi_S)p(\phi_\delta)\,(\tau_1^2)^{-\alpha_\varepsilon-1}(\tau_2^2)^{-\alpha_\delta-1}(\tau_3^2)^{-\alpha_e-1}|\bar V_d(\theta)|^{-1/2}|V|^{-1/2}|\bar A|^{1/2}\int(\sigma_S^2)^{-\alpha_S-\alpha_\varepsilon-\alpha_\delta-\alpha_e-4-(N+n)/2}\exp\!\Big[-\Big(\gamma_S+\gamma_\varepsilon/\tau_1^2+\gamma_\delta/\tau_2^2+\gamma_e/\tau_3^2+\big(b^{T}V^{-1}b+d^{T}\bar V_d(\theta)^{-1}d-\bar u^{T}\bar A\bar u\big)/2\Big)\Big/\sigma_S^2\Big]\,d\sigma_S^2$$

$$\propto p(\theta)p(\phi_S)p(\phi_\delta)\,(\tau_1^2)^{-\alpha_\varepsilon-1}(\tau_2^2)^{-\alpha_\delta-1}(\tau_3^2)^{-\alpha_e-1}|\bar V_d(\theta)|^{-1/2}|V|^{-1/2}|\bar A|^{1/2}\Big[\gamma_S+\gamma_\varepsilon/\tau_1^2+\gamma_\delta/\tau_2^2+\gamma_e/\tau_3^2+\big(b^{T}V^{-1}b+d^{T}\bar V_d(\theta)^{-1}d-\bar u^{T}\bar A\bar u\big)/2\Big]^{-\alpha_S-\alpha_\varepsilon-\alpha_\delta-\alpha_e-3-(N+n)/2},$$

where $\bar A=\big[H(\theta)^{T}\bar V_d(\theta)^{-1}H(\theta)+V^{-1}\big]^{-1}$ and $\bar u=H(\theta)^{T}\bar V_d(\theta)^{-1}d+V^{-1}b$.

Appendix C. Proof of Proposition 3.5

Given that $(S(x_0),y^{T})^{T}$ is multivariate normal given the parameters $\beta_S$, $\sigma_S^2$, $\phi_S$, $\tau_1^2$ and $\theta$, we can obtain the conditional distribution of $S(x_0)$ given $y$ and these parameters:

$$S(x_0)\mid y,\beta_S,\sigma_S^2,\phi_S,\tau_1^2,\theta \sim N\Big(h_S(x_0,\theta)^{T}\beta_S+R_S\big((x_0,\theta),D_S\big)\big[R_S(D_S,D_S)+\tau_1^2 I_N\big]^{-1}\big(y-H_S(D_S)\beta_S\big),\ \sigma_S^2\Big\{1-R_S\big((x_0,\theta),D_S\big)\big[R_S(D_S,D_S)+\tau_1^2 I_N\big]^{-1}R_S\big(D_S,(x_0,\theta)\big)\Big\}\Big).$$

This gives us $p(S(x_0)\mid y,\beta_S,\sigma_S^2,\phi_S,\tau_1^2,\theta)$. Assuming that $\phi_S$ and $\tau_1^2$ are known, we can obtain the predictive distribution of $S(x_0)$ conditional on $y$, $\phi_S$, $\tau_1^2$ and $\theta$ by integrating out $\beta_S$ and $\sigma_S^2$:

$$p\big(S(x_0)\mid y,\phi_S,\tau_1^2,\theta\big)=\int p\big(S(x_0)\mid y,\beta_S,\sigma_S^2,\phi_S,\tau_1^2,\theta\big)\,p\big(\beta_S,\sigma_S^2\mid y,\phi_S,\tau_1^2,\theta\big)\,d\beta_S\,d\sigma_S^2. \qquad \text{(C.1)}$$

As $\beta_S$ and $\sigma_S^2$ are assumed to be independent of $\theta$, $p(\beta_S,\sigma_S^2\mid y,\phi_S,\tau_1^2,\theta)$ on the right-hand side of (C.1) can be obtained from $p(\beta_S,\sigma_S^2\mid y,\phi_S,\tau_1^2)\propto p(\beta_S,\sigma_S^2)\,p(y\mid\beta_S,\sigma_S^2,\phi_S,\tau_1^2)$. Since $p(\beta_S,\sigma_S^2)$ and $p(y\mid\beta_S,\sigma_S^2,\phi_S,\tau_1^2)$ are known, the form of (C.1) can be obtained. Integrating out $\beta_S$ and $\sigma_S^2$ is similar in spirit to Appendix B. After integrating out $\beta_S$ and $\sigma_S^2$, we have Proposition 3.5.
References

[1] Oberkampf WL, Roy CJ. Verification and validation in scientific computing. Cambridge University Press; 2010.
[2] Kennedy MC, O'Hagan A. Bayesian calibration of computer models. Journal of the Royal Statistical Society Series B: Statistical Methodology 2001;63:425–50.
[3] Kanso A, Chebbo G, Tassin B. Application of MCMC–GSA model calibration method to urban runoff quality modeling. Reliability Engineering & System Safety 2006;91:1398–405.
[4] Henderson DA, Boys RJ, Krishnan KJ, Lawless C, Wilkinson DJ. Bayesian emulation and calibration of a stochastic computer model of mitochondrial DNA deletions in substantia nigra neurons. Journal of the American Statistical Association 2009;104:76–87.
[5] Trucano TG, Swiler LP, Igusa T, Oberkampf WL, Pilch M. Calibration, validation, and sensitivity analysis: what's what. Reliability Engineering & System Safety 2006;91:1331–57.
[6] Campbell K. Statistical calibration of computer simulations. Reliability Engineering & System Safety 2006;91:1358–63.
[7] Yuan J, Ng SH, Tsui KL. Calibration of stochastic computer models using stochastic approximation methods. IEEE Transactions on Automation Science and Engineering, http://dx.doi.org/10.1109/TASE.2012.2199486, in press.
[8] O'Hagan A. Bayesian analysis of computer code outputs: a tutorial. Reliability Engineering & System Safety 2006;91:1290–300.
[9] Drignei D. A general statistical model for computer experiments with time series output. Reliability Engineering & System Safety 2011;96:460–7.
[10] Higdon D, Kennedy M, Cavendish JC, Cafeo JA, Ryne RD. Combining field data and computer simulations for calibration and prediction. SIAM Journal on Scientific Computing 2004;26:448–66.
[11] Wang SC, Chen W, Tsui KL. Bayesian validation of computer models. Technometrics 2009;51:439–51.
[12] Loeppky J, Bingham D, Welch W. Computer model calibration or tuning in practice. 2006. Available from: http://www.stat.ubc.ca/Research/TechReports/techreports/221.pdf.
[13] Ankenman B, Nelson BL, Staum J. Stochastic kriging for simulation metamodeling. Operations Research 2010;58:371–82.
[14] Yin J, Ng SH, Ng KM. Kriging metamodel with modified nugget-effect: the heteroscedastic variance case. Computers & Industrial Engineering 2011;61:760–77.
[15] Santner TJ, Williams BJ, Notz W. The design and analysis of computer experiments. New York: Springer; 2003.
[16] Bastos LS, O'Hagan A. Diagnostics for Gaussian process emulators. Technometrics 2009;51:425–38.
[17] Cressie NAC. Statistics for spatial data. New York: John Wiley; 1993.
[18] Qian PZG, Wu CFJ. Bayesian hierarchical modeling for integrating low-accuracy and high-accuracy experiments. Technometrics 2008;50:192–204.
[19] Wei GCG, Tanner MA. A Monte Carlo implementation of the EM algorithm and the poor man's data augmentation algorithms. Journal of the American Statistical Association 1990;85:699–704.
[20] Hills RG, Trucano TG. Statistical validation of engineering and scientific models: a maximum likelihood based metric. SAND2001-1783. Albuquerque, NM: Sandia National Laboratories; 2002.
[21] Williams BJ, Loeppky JL, Moore LM, Macklem MS. Batch sequential design to achieve predictive maturity with calibrated computer models. Reliability Engineering & System Safety 2011;96:1208–19.
[22] Jones DR, Schonlau M, Welch WJ. Efficient global optimization of expensive black-box functions. Journal of Global Optimization 1998;13:455–92.