Bayesian structural equation modeling method for hierarchical model validation


Xiaomo Jiang, Sankaran Mahadevan

Department of Civil and Environmental Engineering, Vanderbilt University, Box 1831-B, Nashville, TN 37235, USA

Reliability Engineering and System Safety 94 (2009) 796–809; doi:10.1016/j.ress.2008.08.008

Article history: received 15 August 2007; received in revised form 30 July 2008; accepted 15 August 2008; available online 2 September 2008.

Abstract

A building block approach to model validation may proceed through various levels, such as material to component to subsystem to system, comparing model predictions with experimental observations at each level. Usually, experimental data become scarce as one proceeds from lower to higher levels. This paper presents a structural equation modeling approach to make use of the lower-level data for higher-level model validation under uncertainty, integrating several components: lower-level data, higher-level data, the computational model, and latent variables. The proposed method uses latent variables to model two sets of relationships, namely, the computational model to system-level data, and lower-level data to system-level data. A Bayesian network with Markov chain Monte Carlo simulation is applied to represent the two relationships and to estimate the influencing factors between them. Bayesian hypothesis testing is employed to quantify the confidence in the predictive model at the system level, and the role of lower-level data in the model validation assessment at the system level. The proposed methodology is implemented for hierarchical assessment of three validation problems, using discrete observations and time-series data. © 2008 Elsevier Ltd. All rights reserved.

Keywords: Bayesian statistics; Structural equation modeling; Model validation; Bayes network; Hypothesis testing

Nomenclature

A_high    influencing coefficient vector of η on θ at the higher level
A_low    influencing coefficient vector of η on θ at the lower level
B01    Bayes factor, the likelihood ratio of the two hypotheses
b01    Bayes factor on a logarithmic scale
b_high_high    higher-level validation metric using the higher-level data
b_high_low    higher-level validation metric using the lower-level data
b_low_low    lower-level validation metric using the lower-level data
B_η    influencing coefficient vector of η_low on η_high
B_θ    influencing coefficient vector of θ_low on θ_high
d_high    decision variable about model quality at the higher level
d_i,low    difference between the available lower-level data Z_low and Y_low
D_high    difference matrix at the higher level
D̄_high    mean values of D_high
D̃_high    difference matrix at the higher level computed from the available lower-level data Z_low and Y_low
D̄̃_high    mean values of D̃_high
D_low    difference matrix at the lower level
D̄_low    mean values of D_low
f(Data|Hi)    likelihood of the data under hypothesis Hi
f(d_high|Z, Y)    posterior distribution of d_high given predictions Z and observations Y
f(d_high|Z, Ξ, Y)    posterior distribution of d_high given Z, Ξ, and Y
f(d_high, Ξ|Z, Y)    joint distribution of d_high and Ξ given Z and Y
f(y|Ξ)    probability density function (PDF) of y given Ξ
f(Z, Ξ|Y)    joint distribution of the model prediction and SEM parameters given the experimental observations Y = {Y_low, Y_high}
f(Z|Ξ, Φ)    distribution of the model prediction Z given Ξ and Φ
f(Ξ|Z, Y)    posterior distribution of Ξ given Z and Y
f(Ξ|Y)    posterior distribution of Ξ given Y
f(Ξ|Z)    posterior distribution of Ξ given Z
f(Ξ, Φ)    joint distribution of Ξ and Φ
f(λ_high|H1)    distribution of λ_high under H1
f(λ_low|H1)    distribution of λ_low under H1
g(X, Φ)    computational model
h(Y_low, Z_low, Ξ)    function relating the decision variable at the higher level to the experimental observations and model predictions at the lower level
H0    null hypothesis
H1    alternative hypothesis
H_η    matrix relating η_low to η_high
H_θ    matrix relating θ_low to θ_high
I_q×q    identity matrix of size q × q
K    number of sample data points
L(Y|Z, Ξ)    likelihood of the observations
M    number of observed data points at the higher level
N    number of observed data points at the lower level
N_SEM    total number of SEM parameters to be estimated
p    number of repeated observations at the lower level
Pr(H0)    prior probability of H0
Pr(H1)    prior probability of H1
Pr(H0|Data)    posterior probability of H0 given the data
q    number of repeated observations at the higher level
r    model reliability
X    input variables of the computational model
Y = {Y_low, Y_high}    matrix of all observations at both lower and higher levels
Y_low    matrix of all lower-level observations
Y_high    matrix of all higher-level observations
y_i,low    observed data vector at the lower level
y_i,high    observed data vector at the higher level
y_ij    observed data point
Z = {Z_low, Z_high}    matrix of all model predictions at both lower and higher levels
Z_low    matrix of all lower-level predictions
Z_high    matrix of all higher-level predictions
z_i,low    prediction data vector at the lower level
z_i,high    prediction data vector at the higher level
z_ij    predicted data point
θ_high    unobservable latent response vector at the higher level
θ_low    unobservable latent response vector at the lower level
ε₀    small allowable interval
ε_y,high    measurement error at the higher level
ε_y,low    measurement error at the lower level
ε_z,high    prediction error at the higher level
ε_z,low    prediction error at the lower level
ε_θ,low, ε_η,high, ε_θ,high    structural modeling errors
Σ_D̃high    covariance matrix of D̃_high
Σ_high    covariance matrix of the distribution of the difference of higher-level data
Σ̃_high    covariance matrix of the distribution of the difference of lower-level data at the higher level
Σ_low    covariance matrix of the distribution of the difference of lower-level data
Σ_y,high    variance of ε_y,high
Σ_y,low    variance of ε_y,low
Σ_z,high    variance of ε_z,high
Σ_z,low    variance of ε_z,low
σ_d    standard deviation of the difference data d
σ²_iy,high    ith component of Σ_y,high
σ²_iy,low    ith component of Σ_y,low
σ²_iz,high    ith component of Σ_z,high
σ²_iz,low    ith component of Σ_z,low
η_high    unobservable latent prediction vector at the higher level
η_low    unobservable latent prediction vector at the lower level
Φ    input random physical parameters
Ξ    all SEM parameters to be estimated
μ_d    mean of the difference data d
λ_high    mean of the distribution of the difference of higher-level data
λ_low    mean of the distribution of the difference of lower-level data
ν    mean of the distribution f(λ_low|H1)
Λ    variance of the distribution f(λ_low|H1)
κ    measure of confidence

1. Introduction

Model validation involves comparing model predictions with experimental observations, both of which contain uncertainty. To quantitatively assess the validity or predictive capability of computational models, experimental observations need to be collected from targeted tests conducted in a well-controlled environment. The confidence in the model validation assessment is closely related to the amount and quality of the experimental observations: the more validation data are available, the more accurate the uncertainty quantification in the experiments, and the greater the confidence in the model validation result. A building block or hierarchical approach [1] to model validation may proceed through various levels, such as material to component to subsystem to system. In many cases, however, the amount of experimental data decreases as one proceeds from lower to higher levels, due to the prohibitive cost of conducting targeted validation experiments for large complex systems. In such situations, it is worth exploring whether lower-level or historical data contain information that can be used to assess the computational model at the system level. This paper develops a systematic approach to use information from subsystem- or component-level data for model assessment at the system level.

Assessing model accuracy at the system level in the presence of limited test data and many uncertainties is a difficult problem. Two crucial issues need to be addressed in developing an effective methodology that makes use of lower-level data for model validation at the system level. First, a proper validation metric is needed to quantitatively assess the agreement between model predictions and experimental observations. Developing quantitative methods for model validation under uncertainty has attracted considerable interest in recent years. For example, the fundamental concepts and methodologies for validation of large-scale computational models are being investigated by several organizations, such as the United States Department of Defense [2], the American Institute of Aeronautics and Astronautics [3], the Accelerated Strategic Computing Initiative (ASCI) program of the United States Department of Energy [4], and the American Society of Mechanical Engineers [5]. Detailed discussions of model verification and validation concepts can be found in the literature (see, e.g., [6–19]). Statistical hypothesis testing is one approach to model validation assessment under uncertainty, and both classical and Bayesian statistics have been explored (see Oberkampf and Barone [19] for a comprehensive state-of-the-art review). Classical hypothesis testing is a well-developed statistical method for accepting or rejecting a model based on an error statistic (see, e.g., [8,9,11,12,15,18,20]). The second author and co-workers have studied Bayesian hypothesis testing-based methods [21–26]. These metrics have been investigated on various model validation problems using limited amounts of experimental data. Refer to Mahadevan and Rebba [22] for details about the validation metrics and their applications. This paper uses Bayesian hypothesis testing to quantify the confidence in the predictive model at the system level.



Another important issue is how to quantify the relationship between the lower-level data and the higher-level data. In the context of a building block approach to model validation, component-level models for idealized situations are often validated using small-scale or lower-level (e.g., component- or subsystem-level) tests conducted in a well-controlled environment. The validated model needs to be assessed further for practical application at the system level, but system-level data may be scarce. In some cases, the response quantity of interest in the target application at the system level may also differ from the validated response quantity. When the number of system-level tests is small, it may be desirable to use lower-level data and validation experience to make partial inference about the validity of the system-level prediction. In such a situation, the role of the lower-level data in the model assessment at the system level needs to be effectively quantified. In this paper, a structural equation modeling (SEM) approach with latent variables is explored to achieve this objective by modeling two sets of relationships, namely, the computational model vs. system-level data, and lower-level data vs. system-level data.

The SEM approach is a powerful multivariate statistical method that incorporates ideas from regression, path, and factor analyses [27]. In this approach, two data sets are related through a set of equations, using one or more latent response variables and one or more latent explanatory variables. The SEM approach has been used in economics [28], sociology [29], and psychology [30] to model relationships between intrinsically latent variables (i.e., unobserved variables such as happiness or quality of life) and manifestly observed variables, and to quantify observation or measurement errors. This paper explores the use of the SEM approach for hierarchical assessment of a computational model, using different amounts of data at various levels.

The SEM approach has several attractive features. First, it provides a general framework for modeling mean and covariance relationships in multivariate data [31]. Second, it permits both input variables and response variables to be summarized separately by their underlying latent variables, while also accounting for the anticipated causal relationships between the two sets of latent variables. Third, it allows the use of different types of data: survey, experimental, and longitudinal. These features make SEM an effective tool for the model validation problem, which involves comparing two data sets under uncertainty. In this context, by explicitly modeling measurement and prediction errors, one seeks to derive unbiased estimates of the relationships between latent variables. To this end, SEM allows multiple observations to be associated with a single latent variable, which is particularly desirable for identifying the relationship between the lower-level data and the higher-level data in hierarchical model validation. Path analysis in the SEM provides: (1) a pictorial representation of a system of simultaneous equations, showing the relationships among all variables including errors; (2) a set of equations relating correlations or covariances of variables to model parameters; and (3) a means to distinguish direct, indirect, and total effects of one variable on another. Factor analysis explores the relationship of latent variables to observed variables. Comprehensive discussions of the SEM method are available in several studies (e.g., [27–29]).

A key issue in successfully applying this approach to practical problems is how to effectively estimate the SEM parameters under uncertainty. Commonly used parameter estimation methods in earlier studies are full-information maximum likelihood and generalized least-squares procedures [27]. However, these methods are often problematic or infeasible for complicated situations with heavily parameterized models and limited or missing data, while incorporating various sources of uncertainty [32]. Because of this, Bayesian approaches have been pursued to accommodate data uncertainty and prior knowledge in parameter estimation for factor models, a special case of SEM (e.g., [33–37]). Owing to recent advances in Bayesian hierarchical modeling, modern high-performance computing, and Markov chain Monte Carlo (MCMC) simulation techniques [22,23], it is now feasible to estimate the posterior distributions of SEM parameters. This greatly broadens the application of the SEM approach to large complex problems.

In the following sections, model validation is treated as a hierarchical problem consisting of several parts: lower-level data, higher-level data, a computational model at both levels, and latent variables. The concepts of the SEM approach and a simple two-level univariate problem with discrete data are first introduced; this example is used to illustrate the hierarchical model validation methodology described subsequently. The SEM approach with latent variables is then exploited to model the relationships between the computational model and higher-level data, and between lower-level data and higher-level data. A Bayesian network is used to represent the structural equation models and to estimate the SEM parameters by Bayesian updating with MCMC simulation, considering data uncertainty. Next, a Bayesian hypothesis testing-based metric is employed to assess the confidence in accepting the computational model. Factor analysis in conjunction with the Bayesian method is applied to investigate the role of component-level data in the model validation assessment at the system level. Finally, the proposed methodology is further investigated with two model validation examples: a three-level univariate dynamic problem and a two-level multivariate dynamic problem with multiple time-series data.

2. Structural equation modeling

Let y_i and z_i (i = 1, 2, …, N) represent the ith experimental observation and model prediction, respectively, where N is the number of data points. A classical linear regression equation can be used to model the relationship between the two data sets:

\[ y_i = \alpha z_i + \varepsilon_r \tag{1} \]

where α is the regression coefficient and ε_r is the modeling error, usually assumed to be independently and identically distributed with zero mean and unknown variance, i.e., ε_r ~ N(0, σ_r²). Note that the experimental observation usually contains measurement error (denoted ε_y), and the model output may contain prediction error (denoted ε_z). Thus, y_i and z_i can be expressed as

\[ y_i = \theta_i + \varepsilon_y \tag{2} \]

\[ z_i = \eta_i + \varepsilon_z \tag{3} \]

where θ_i and η_i are two variables representing the true experimental observation and model prediction, which are usually unknown and referred to as latent variables. They are often assumed to be independently and identically distributed with zero mean and unit variance, i.e., N(0, 1). Similar to the modeling error ε_r, the measurement error ε_y and the prediction error ε_z are usually assumed to be independently and identically distributed with zero mean and unknown variance, i.e., ε_y ~ N(0, σ_y²) and ε_z ~ N(0, σ_z²). In the SEM approach, Eqs. (2) and (3) are referred to as the measurement and computational models, respectively. Substituting Eqs. (2) and (3) into Eq. (1) yields

\[ \theta_i = \alpha \eta_i + \varepsilon_\eta \tag{4} \]

in which ε_η = αε_z + ε_r − ε_y represents the new modeling error, with the distributional assumption ε_η ~ N(0, σ_η²). Eq. (4) is referred to as the structural model in the SEM approach. The parameters Ξ = {α, σ_y, σ_z, σ_η} can be estimated using the maximum likelihood method or generalized least-squares procedures [27]; in this study, the SEM parameters are estimated using the Bayesian method discussed later.

Fig. 1 shows a path diagram for the simple SEM represented by Eqs. (2)–(4). This diagram is a pictorial representation of the system of simultaneous equations; it shows the relation between the experimental observation and the model prediction via the latent variables. In the subsequent sections, the SEM and latent-variable concepts are extended to hierarchical model validation by modeling two sets of relationships, namely, the computational model vs. system-level data, and lower-level data vs. system-level data. A simple univariate example is used throughout the formulation to explain the proposed methodology. In this illustrative example, two sets of experimental observations are assumed, each containing 10 data points collected from lower- and higher-level tests (i.e., N = 10), denoted by Y_low and Y_high, respectively. The corresponding two sets of model predictions at the lower and higher levels are obtained from a computational model and denoted by Z_low and Z_high, respectively. The numerical data are shown in Fig. 2, where the solid and dashed lines represent the model prediction (Z) and the experimental observation (Y), respectively.
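To make the roles of the latent variables concrete, the short Python sketch below (ours, not the paper's) simulates the simple SEM of Eqs. (2)–(4) with hypothetical parameter values and recovers the structural coefficient α by the method of moments: since Var(η) = 1 and the errors are independent, Cov(y, z) = α. The moment estimate stands in for the full Bayesian treatment of Section 3.2.

```python
# Minimal sketch: simulate the simple SEM of Eqs. (2)-(4) and recover alpha
# from the covariance of the observed pair (y, z). All values hypothetical.
import numpy as np

rng = np.random.default_rng(0)
N = 10_000                                   # large N so the moment estimate is stable
alpha, sig_y, sig_z, sig_eta = 2.0, 0.3, 0.4, 0.5

eta = rng.normal(0.0, 1.0, N)                          # latent prediction, N(0, 1)
theta = alpha * eta + rng.normal(0.0, sig_eta, N)      # structural model, Eq. (4)
z = eta + rng.normal(0.0, sig_z, N)                    # computational model, Eq. (3)
y = theta + rng.normal(0.0, sig_y, N)                  # measurement model, Eq. (2)

alpha_hat = np.cov(y, z)[0, 1]                         # Cov(y, z) = alpha * Var(eta)
print(f"alpha true = {alpha}, estimated = {alpha_hat:.3f}")
```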

Fig. 1. Path diagram example.

Fig. 2. Illustrative example: model prediction, experimental observations, and latent variables at both lower and higher levels.

3. Hierarchical validation methodology

3.1. Hierarchical model validation

The proposed SEM approach consists of three models: (i) a measurement model that specifies the relationships of the latent variables to the observed variables; (ii) a computational model that both maps the response to input variables and specifies the relationships of the latent variables to the predicted variables; and (iii) a structural model that identifies the relationships among the latent variables, thereby relating the computational model to higher-level data, and lower-level data to higher-level data. The relationships among the variables in the three models are represented visually through a path graph. In this paper, the relationships among the latent variables are modeled linearly in the SEM approach; a nonlinear SEM approach to explore nonlinearity among latent variables will be presented in another paper. All symbols and variables used in this work are summarized in the nomenclature.

3.1.1. Measurement model

Let Y_low = {y_1, …, y_p}ᵀ_low and Y_high = {y_1, …, y_q}ᵀ_high represent the p sets of independent lower-level observations and the q sets of independent higher-level observations, respectively, in which each lower-level data set contains N data points, i.e., y_i,low = [y_i1, …, y_iN]_low, and each higher-level data set contains M data points, i.e., y_i,high = [y_i1, …, y_iM]_high. The experimental observations can then be modeled by a set of latent variables in the SEM as follows:

\[ Y_{low} = \theta_{low} + \varepsilon_{y,low} \tag{5} \]

\[ Y_{high} = \theta_{high} + \varepsilon_{y,high} \tag{6} \]

where θ is an unobservable latent response vector with the same dimension as Y. The vector ε_y represents the measurement error, usually assumed to consist of independent identically distributed Gaussian variables with zero mean and covariance Σ_y, i.e., ε_y ~ N(0, Σ_y), in which Σ_y,low = diag(σ²_1y,low, …, σ²_py,low) and Σ_y,high = diag(σ²_1y,high, …, σ²_qy,high). Thus, the data model can be written in the form of a probability distribution as Y | (θ, Σ_y) ~ N(θ, Σ_y).

3.1.2. Computational model

Let Z_low = {z_1, …, z_p}ᵀ_low and Z_high = {z_1, …, z_q}ᵀ_high represent the p sets of lower-level and q sets of higher-level model prediction data, respectively, in which each data set contains N and M data points, respectively. The model prediction can also be decomposed into a set of latent variables and a prediction error:

\[ Z_{low} = \eta_{low} + \varepsilon_{z,low} \tag{7} \]

\[ Z_{high} = \eta_{high} + \varepsilon_{z,high} \tag{8} \]

where η is the unobservable latent vector with the same dimension as Z, and the vector ε_z represents the model prediction error, also assumed to consist of independent identically distributed Gaussian variables with zero mean and covariance Σ_z, i.e., ε_z ~ N(0, Σ_z), in which Σ_z,low = diag(σ²_1z,low, …, σ²_pz,low) and Σ_z,high = diag(σ²_1z,high, …, σ²_qz,high). Within the context of model validation, the model prediction is usually generated by a computational model or function with input random parameters Φ and input variables X, i.e., Z = g(X, Φ).
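As a concrete illustration of the data model in Eqs. (5) and (7), the following sketch generates p lower-level series of N points with diagonal error covariances. All numerical values are hypothetical, and the simple scaling linking θ_low to η_low is a toy placeholder for the structural model of Section 3.1.3.

```python
# Hedged sketch of the measurement and computational data models,
# Eqs. (5) and (7), for p lower-level variables with N points each.
import numpy as np

rng = np.random.default_rng(5)
p, N = 3, 60
eta_low = rng.normal(0.0, 1.0, (p, N))             # latent prediction vectors
theta_low = 1.2 * eta_low                          # toy stand-in for Eq. (9)
sig_z = np.array([0.10, 0.20, 0.15])               # sqrt of diag(Sigma_z,low)
sig_y = np.array([0.30, 0.25, 0.20])               # sqrt of diag(Sigma_y,low)

Z_low = eta_low + sig_z[:, None] * rng.normal(size=(p, N))    # Eq. (7)
Y_low = theta_low + sig_y[:, None] * rng.normal(size=(p, N))  # Eq. (5)
print(Y_low.shape, Z_low.shape)
```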

3.1.3. Structural model

The structural model maps the relationships between the lower-level data and the higher-level data, and between the experimental observations and the model predictions, expressed as follows:

\[ \theta_{low} = A_{low}\,\eta_{low} + \varepsilon_{\theta,low} \tag{9} \]

\[ \eta_{high} = B_\eta [\eta_{low} H_\eta] + \varepsilon_{\eta,high} \tag{10} \]

\[ \theta_{high} = A_{high}\,\eta_{high} + B_\theta [\theta_{low} H_\theta] + \varepsilon_{\theta,high} \tag{11} \]

where the p × p matrix A_low and the q × q matrix A_high quantify the influence of η on θ at the lower and higher levels, respectively; the q × p matrices B_η and B_θ, together with the N × M matrices H_η and H_θ, describe the relationships between the latent variables η and θ at the lower and higher levels, respectively; and the vectors ε_θ,low, ε_η,high, and ε_θ,high represent structural modeling errors. Eq. (10) models the relationship between the lower-level data and the higher-level data for the model prediction data in terms of latent variables, while Eq. (11) models both the relationship between the lower-level and higher-level data for the experimental observations and the relationship between the experimental observations and the model predictions in terms of latent variables.

Two issues regarding the SEM representation of the validation data [Eqs. (5)–(11)] need to be clarified here. First, the approach is flexible, and various modeling equations (e.g., nonlinear regression equations) may be used in practical implementation. Second, the total number of SEM parameters to be estimated (N_SEM) is calculated from Eqs. (5)–(11) as follows:

\[ N_{SEM} = p \times p + q \times q + 2 \times q \times p + 2 \times N \times M + 2(p + q) \tag{12} \]

which includes the p × p matrix A_low, the q × q matrix A_high, the q × p matrices B_η and B_θ, the N × M matrices H_η and H_θ, the p diagonal components of the variance matrices Σ_z,low and Σ_y,low, and the q diagonal components of the variance matrices Σ_z,high and Σ_y,high. Obviously, estimating all these SEM parameters requires intensive computational effort when a large amount of data is available (i.e., when N or M is large). Note that both relationship matrices H_η and H_θ can be taken to be identity matrices if N = M; the term 2 × N × M then drops out of Eq. (12), and the computational effort of estimating the SEM parameters is greatly reduced.
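A small helper (ours, not the authors') counts the SEM parameters according to Eq. (12), including the option of fixing H_η and H_θ to identity when N = M; for the illustrative example (p = q = 1, N = M = 10, identity H) it returns N_SEM = 8.

```python
# Count the SEM parameters per Eq. (12). When N == M, H_eta and H_theta
# can be fixed to identity matrices and the 2*N*M term drops.
def n_sem(p: int, q: int, N: int, M: int, identity_H: bool = False) -> int:
    n = p * p + q * q + 2 * q * p + 2 * (p + q)
    if not identity_H:
        n += 2 * N * M
    return n

print(n_sem(1, 1, 10, 10, identity_H=True))   # illustrative example -> 8
```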

3.1.4. Path analysis

The relationships among the variables in the three types of models are shown graphically in Fig. 3, which includes the experimental observations (Y_low and Y_high), the model prediction data (Z_low and Z_high), the latent variables (η_low, η_high, θ_low, θ_high), and the errors or uncertainties (e.g., ε_y,low). A line arrow denotes an influencing relationship between two variables. The computational model, Z = g(X, Φ), is also incorporated in the parameter updating, as shown in Fig. 3, where Φ represents the vector of input random parameters and X represents the vector of input variables. In the following section, we estimate the SEM parameters, i.e., Ξ = {A, B, H, Σ}, using the Bayesian method and the given experimental observations (Y) and model predictions (Z) at the lower and higher levels.

For the illustrative example described in Section 2 (N = M = 10), both relationship matrices H_η and H_θ in Eq. (12) are assumed to be identity matrices. Thus, the number of parameters to be estimated, obtained from Eq. (12) by setting q = p = 1 and dropping the term 2 × N × M, is N_SEM = 8: Ξ = {a_low, a_high, b_η, b_θ, σ_y,low, σ_y,high, σ_z,low, σ_z,high}. The SEM can be represented graphically by a path diagram similar to Fig. 3, with each parameter vector (e.g., A_low) replaced by the corresponding scalar (e.g., a_low).

Fig. 3. Path diagram for structural equation modeling in hierarchical model validation.

3.2. Bayesian estimation

Consider a lower-level computational model z_low = g(X_low, Φ) in the validated region, where z is a quantity of interest, Φ is a set of input random variables, and X_low may represent spatial or time coordinates if applicable. Inferences need to be made for a decision variable d_high = h(Y_low, Z_low, Ξ) = h′(X, Φ, Ξ) at the system level, where d_high is a function of the experimental observations and model predictions; in this paper, for example, the decision variable d_high is the Bayes validation metric discussed later. The parameter vector Ξ is an additional set of (1) random variables at the higher level or in the application domain, (2) random SEM parameters linking the lower-level data to the higher-level data, or (3) both (1) and (2). The first case has been explored by Rebba and Mahadevan [23] using the Bayes network concept. This paper focuses on the second case; the method presented can easily be extended to the third case.

Based on the hierarchical model described previously, the posterior distributions of the parameters given the data are obtained by Bayes' theorem as follows:

\[ f(\Xi \mid Z, Y) \propto L(Y \mid Z, \Xi) \, f(Z \mid \Xi, \Phi) \, f(\Xi, \Phi) \tag{13} \]

where Z = {Z_low, Z_high} represents the model prediction with components z = g(X, Φ); Ξ and Φ represent the SEM parameters and the input random variables of the computational model, respectively; f(Z, Ξ|Y) is the joint distribution of the model prediction and SEM parameters given the experimental observations Y = {Y_low, Y_high}; L(Y|Z, Ξ) is the likelihood of the observations, modeled by Eq. (1) or (2); f(Z|Ξ, Φ) is the distribution of the model prediction given Φ, modeled by Eq. (3) or (4); and f(Ξ, Φ) is the joint distribution of all SEM parameters and input random variables. Eq. (13) implicitly relates the decision variable d_high at the higher level to the quantities y and z at the lower level.

In the lower-level model validation, the quantity Y_low is observed and related to a subset of the parameters Ξ and the unobserved latent variables through a probability density function (PDF) f(y|Ξ), as described in Section 3.1. The decision variable d_high in the target application region (or at the higher level) is related to Y_low, Z_low, and Ξ through a PDF f(d_high|Z, Ξ, Y) with Z = g(X_low, Φ). Having validated the quantity Y_low, the computational model is next used for predictions in the target application region. For simplicity, assuming the model predictions Z at the various levels are given, the input variables Φ can be ignored in the following formulation. Using Bayes' theorem, the inference about d_high may be made from the posterior distribution of d_high given model predictions Z and observations Y:

\[ f(d_{high} \mid Z, Y) = \int f(d_{high}, \Xi \mid Z, Y)\, d\Xi = \int f(d_{high} \mid Z, Y, \Xi)\, f(\Xi \mid Z, Y)\, d\Xi = \int f(d_{high} \mid \Xi)\, f(\Xi \mid Z)\, f(\Xi \mid Y)\, d\Xi \tag{14} \]

where f(Ξ|Y) ∝ f(Ξ) ∏ᵢ₌₁ⁿ f(yᵢ|Ξ) and f(Ξ|Z) ∝ f(Ξ) ∏ᵢ₌₁ⁿ f(zᵢ|Ξ) need to be updated as described below.

We now need to update the density functions f(d_high|Ξ), f(Ξ|Z), and f(Ξ|Y) in Eq. (14). Generally, when the SEM parameters Ξ linking the validation data at the various levels are random, the decision variable d_high becomes a random variable. If the computational model is validated using experimental observations at the lower level, the density function of the decision variable d_high can be updated to reflect a change of confidence in the prediction for the target application at the higher level. The Bayesian network concept [38] with MCMC techniques such as Gibbs sampling [39] can conveniently be used for this purpose. Gibbs sampling is implemented here using the software WinBUGS [40], which is used in the numerical examples in Section 4 below; refer to Mahadevan and Rebba [22] for details of this procedure.

Fig. 4 shows a typical Bayes network for the data-driven model extrapolation represented by Eq. (14). An ellipse (for example, the SEM parameters Ξ) represents a random variable, and a rectangle (for example, the experimental value Y) represents observed data. A solid arrow represents a conditional probability link, and a dashed arrow represents the link of a variable to its observed data, if available. The densities of the variables Ξ, z, and y are updated using the validation data Y; the updated statistics of Ξ, z, and y are then used to estimate the updated statistics of the decision variable d_high.

Fig. 4. Bayes network.

For the illustrative example described in Section 2, the eight SEM parameters are readily obtained using the Bayes network with MCMC simulation and the four sets of data (i.e., Y_low, Y_high, Z_low, Z_high). Table 1 summarizes the statistics of the eight SEM parameters, including their mean values and standard deviations. For illustration, the latent variables θ and η are also plotted in Fig. 2; the original data Y and Z are approximated effectively by the corresponding latent variables.

Table 1. Statistical results of SEM parameters and model validation

Parameter    Mean     Stand. dev.
a_low        3.405    0.238
a_high       0.338    0.097
b_θ          1.212    0.168
b_η          5.896    0.405
σ_y,low      7.484    1.360
σ_y,high     1.148    0.953
σ_z,low      127.0    21.37
σ_z,high     1.698    1.699
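The posterior updating of Eqs. (13)–(14) is performed in the paper with WinBUGS. As a self-contained stand-in, the sketch below runs a random-walk Metropolis sampler on the single-level SEM of Section 2, with synthetic data, hypothetical true values, and flat priors assumed. Without repeated observations, σ_y and σ_η are only jointly identified, but the posterior of α is well determined.

```python
# Hedged sketch: random-walk Metropolis for the single-level SEM, standing in
# for the WinBUGS/Gibbs updating used in the paper. Data are synthetic.
import numpy as np

rng = np.random.default_rng(1)

N = 200
alpha, sy, sz, se = 1.5, 0.3, 0.4, 0.5                 # hypothetical true values
eta = rng.normal(size=N)
z = eta + rng.normal(0.0, sz, N)                       # Eq. (3)
y = alpha * eta + rng.normal(0.0, se, N) + rng.normal(0.0, sy, N)  # Eqs. (2), (4)

def loglike(p):
    """Marginal Gaussian log-likelihood of the (y_i, z_i) pairs;
    p = [alpha, log sy, log sz, log se]; eta ~ N(0, 1) is integrated out."""
    a = p[0]
    vy, vz, ve = np.exp(2 * p[1]), np.exp(2 * p[2]), np.exp(2 * p[3])
    Vy, Vz, C = a**2 + ve + vy, 1.0 + vz, a            # moments of (y, z)
    det = Vy * Vz - C**2
    quad = (Vz * y**2 - 2 * C * y * z + Vy * z**2) / det
    return -0.5 * np.sum(quad) - 0.5 * N * np.log(4 * np.pi**2 * det)

# 11,000 iterations with a 1000-iteration burn-in, mirroring Section 4
p, lp, samples = np.array([1.0, -0.7, -0.7, -0.7]), -np.inf, []
for it in range(11_000):
    prop = p + rng.normal(0.0, 0.05, 4)
    lp_prop = loglike(prop)
    if np.log(rng.uniform()) < lp_prop - lp:           # Metropolis acceptance
        p, lp = prop, lp_prop
    if it >= 1_000:
        samples.append(p[0])

print("posterior mean of alpha:", np.mean(samples))
```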

3.3. Point hypothesis-based Bayesian model validation

3.3.1. Lower-level model validation

Consider first model validation using the experimental data and model predictions at the lower level, Y_low and Z_low, each containing p × N data points, where p = 1 implies a univariate problem and p > 1 a multivariate one. Within the context of binary hypothesis testing for model validation, we test two hypotheses: the null hypothesis H0: Y_low = Z_low, to accept the model, and the alternative hypothesis H1: Y_low ≠ Z_low, to reject it. Let D_low = Y_low − Z_low represent the difference between the experimental data and the model prediction. Model validation then reduces to testing H0: D_low = 0 versus H1: D_low ≠ 0. Given Y_low and Z_low, the likelihood ratio of the two hypotheses, referred to as the Bayes factor, is calculated using Bayes' theorem as [41]

\[ B_{01} = \frac{f(\mathrm{Data} \mid H_0)}{f(\mathrm{Data} \mid H_1)} \tag{15} \]

where f(Data|Hi) (i = 0, 1) is the likelihood of the data under hypothesis Hi. Since B01 is non-negative, its value is often converted to a logarithmic scale for convenient comparison across a large range of values, i.e., b01 = ln(B01). In the Bayesian risk-based decision approach developed by the authors [24], an acceptable threshold for the Bayes factor is derived from the prior densities of the two hypotheses and the costs of deciding Hi when Hj is true (i = 0, 1; j = 0, 1); minimizing the risk in model validation, a value of B01 greater than the threshold accepts the model, and otherwise rejects it. In the Bayes factor-based approach [21], a value of B01 larger than one indicates that the data support H0. Kass and Raftery [41] suggest interpreting b01 between 0 and 1 as weak evidence in favor of H0, between 3 and 5 as strong evidence, and b01 > 5 as very strong evidence; negative b01 of the same magnitude favors H1 by the same amount.

Let D_low = {d_1, d_2, …, d_K} be a sample of size K from the error distribution N(λ_low, Σ_low), with λ_low = θ_low − η_low representing the corresponding K mean values and Σ_low being the p × p covariance matrix of all variables in D_low. The multivariate model validation problem thus reduces to comparing the hypotheses H0: λ_low = 0 versus H1: λ_low ≠ 0, with λ_low|H1 ~ N(ν, Λ) [denoted f(λ_low|H1)]. The expression for B01 for multivariate model validation derived in Jiang and Mahadevan [25], converted to the logarithmic scale, is

\[ b_{low\_low} = \frac{1}{2} \ln\!\left( \frac{|K\Lambda + \Sigma_{low}|}{|\Sigma_{low}|} \right) + \frac{K}{2} \left[ (\bar{D}_{low} - \nu)' (K\Lambda + \Sigma_{low})^{-1} (\bar{D}_{low} - \nu) - \bar{D}'_{low} \Sigma_{low}^{-1} \bar{D}_{low} \right] \tag{16} \]

where b_low_low = ln(B01) represents the lower-level validation metric using the lower-level data, and D̄_low is calculated from the available lower-level data Z_low and Y_low, i.e., D̄_low = (1/K) Σᵢ₌₁ᴷ d_i. Eq. (16) calculates the likelihood ratio on a logarithmic scale for multivariate model validation, incorporating the correlation of multiple quantities through the covariance matrix Σ_low. Note that the Bayes factor depends on the data only through its mean and covariance values. If no information on f(λ_low|H1) is available, the parameters ν = 0 and Λ = Σ_low are suggested by Migon and Gamerman [42]. This selection assumes that the amount of information in the prior is equal to that in the observations, which is consistent with the Fisher information-based method [41]. The prior probabilities of the two hypotheses are assumed to be Pr(H0) = 0.5 and Pr(H1) = 0.5 in the absence of prior knowledge of either hypothesis before testing. The confidence in the model based on the validation data can then be quantified as [22]

\[ \kappa = \Pr(H_0 \mid \mathrm{Data}) = \frac{B_{01}}{B_{01} + 1} \tag{17} \]

where κ is a measure of confidence in the range [0, 100]%. From Eq. (17), B01 → 0 indicates 0% confidence in accepting the model, and B01 → ∞ indicates 100% confidence.
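Eqs. (16) and (17) can be evaluated directly from the difference samples. The following sketch (ours) implements the multivariate log Bayes factor with the Migon–Gamerman defaults ν = 0 and Λ = Σ, together with the confidence measure κ; the toy data are illustrative only.

```python
# Hedged sketch of Eqs. (15)-(17): multivariate log Bayes factor and kappa.
import numpy as np

def log_bayes_factor(D, Sigma, nu=None, Lam=None):
    """D: (K, p) array of differences y - z; Sigma: (p, p) covariance. Eq. (16)."""
    K, p = D.shape
    nu = np.zeros(p) if nu is None else nu
    Lam = Sigma if Lam is None else Lam        # Migon-Gamerman default
    Dbar = D.mean(axis=0)
    A = K * Lam + Sigma
    _, logdet_A = np.linalg.slogdet(A)
    _, logdet_S = np.linalg.slogdet(Sigma)
    d = Dbar - nu
    quad = d @ np.linalg.solve(A, d) - Dbar @ np.linalg.solve(Sigma, Dbar)
    return 0.5 * (logdet_A - logdet_S) + 0.5 * K * quad

def confidence(b01):
    """kappa = B01 / (B01 + 1), Eq. (17)."""
    B01 = np.exp(b01)
    return B01 / (B01 + 1.0)

rng = np.random.default_rng(2)
D = rng.normal(0.0, 1.0, size=(10, 2))         # small differences -> support for H0
b = log_bayes_factor(D, Sigma=np.eye(2))
print(f"b01 = {b:.3f}, kappa = {100 * confidence(b):.1f}%")
```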

3.3.2. Higher-level model validation

We now formulate the role of the lower-level data in model validation at the higher level in terms of the Bayes factor and the SEM parameters described previously. Within the context of binary hypothesis testing at the higher level, we compare the null hypothesis H0: Y_high = Z_high, to accept the model, and the alternative H1: Y_high ≠ Z_high, to reject it. Similarly, let D_high = Y_high − Z_high follow a Gaussian distribution N(λ_high, Σ_high), representing the difference between the experimental data and the model prediction at the higher level. The two hypotheses are then H0: λ_high = 0 versus H1: λ_high ≠ 0, and the validation metric in Eq. (16) becomes

\[ b_{high\_high} = \frac{1}{2} \ln\!\left( \frac{|K\Lambda + \Sigma_{high}|}{|\Sigma_{high}|} \right) + \frac{K}{2} \left[ (\bar{D}_{high} - \nu)' (K\Lambda + \Sigma_{high})^{-1} (\bar{D}_{high} - \nu) - \bar{D}'_{high} \Sigma_{high}^{-1} \bar{D}_{high} \right] \tag{18} \]

where b_high_high = ln(B01) represents the higher-level validation metric using the higher-level data, and the mean value D̄_high is calculated from the available higher-level data Z_high and Y_high. Using Eqs. (6) and (8) yields

\[ D_{high} = Y_{high} - Z_{high} = \theta_{high} - \eta_{high} + \varepsilon_{y,high} - \varepsilon_{z,high} \tag{19} \]

Substituting Eqs. (10) and (11) into Eq. (19), with ε_θ,high = ε_y,high and ε_η,high = ε_z,high, yields

\[ D_{high} = (A_{high} - I_{q \times q}) B_\eta (\eta_{low} H_\eta) + B_\theta (\theta_{low} H_\theta) + A_{high}\,\varepsilon_{z,high} + 2\varepsilon_{y,high} - 2\varepsilon_{z,high} \tag{20} \]

where I_q×q is a q × q identity matrix. Note that in Eq. (20), η_low and θ_low are the unobserved latent variables of the model prediction Z_low and the experimental observations Y_low, respectively. Eq. (20) relates the lower-level data to the higher level through the SEM parameters and latent variables. Thus, the validation metric at the higher level can be calculated as

\[ b_{high\_low} = \frac{1}{2} \ln\!\left( \frac{|K\Lambda + \tilde{\Sigma}_{high}|}{|\tilde{\Sigma}_{high}|} \right) + \frac{K}{2} \left[ (\bar{\tilde{D}}_{high} - \nu)' (K\Lambda + \tilde{\Sigma}_{high})^{-1} (\bar{\tilde{D}}_{high} - \nu) - \bar{\tilde{D}}'_{high} \tilde{\Sigma}_{high}^{-1} \bar{\tilde{D}}_{high} \right] \tag{21} \]

where b_high_low = ln(B01) represents the higher-level validation metric using the lower-level data, and D̄̃_high is calculated from the available lower-level data Z_low and Y_low using

\[ \tilde{D}_{high} = (A_{high} - I_{q \times q}) B_\eta (\eta_{low} H_\eta) + B_\theta (\theta_{low} H_\theta) \tag{22} \]

and

\[ \tilde{\Sigma}_{high} = \Sigma_{\tilde{D}_{high}} + (A_{high} - 2I_{q \times q})^T (A_{high} - 2I_{q \times q})\, \Sigma_{z,high} + 4\Sigma_{y,high} \tag{23} \]

in which Σ_D̃high is the covariance matrix of D̃_high.

It should be pointed out that Eq. (18) calculates the higher-level validation metric using the given higher-level data, which results in a deterministic validation metric. Eq. (21), however, relates the higher-level validation metric to the lower-level data via the SEM parameters and latent variables. Since the SEM parameters are nondeterministic, the resulting b_high_low is a random variable, and its probability distribution can be simulated using the statistics of the SEM parameters and the Monte Carlo technique. Note that the validation metric b_high_low is the decision variable d_high mentioned in Section 3.2. Thus, the role of the lower-level data in model validation at the higher level is effectively identified.

For the illustrative example described in Section 2, the mean values are μ_low = 0.577 and μ_high = 15.99, and the variances are σ²_low = 4.872 and σ²_high = 13.20. Fig. 5 shows the difference between the experimental observations and the model predictions at the two levels. Substituting μ_low and σ²_low, and μ_high and σ²_high, into Eqs. (16) and (18), respectively, and assuming ν = 0 with variance Λ = σ²_low = 4.872, we obtain b_low_low = 0.89 [B_low_low = exp(b_low_low) = 2.44] and b_high_high = −75.5 [B_high_high = 1.6 × 10⁻³³ ≈ 0] for the lower- and higher-level model validations using the corresponding level data. From Eq. (17), the confidence in data support for the model is κ_low_low = 70.9% for the lower-level validation and κ_high_high = 0% for the higher level. Using the MCMC simulation technique and the statistics of the SEM parameters in Table 1, the probability distribution of b_high_low is estimated from Eq. (21); its statistics, along with the corresponding mean μ_high_low and standard deviation σ_high_low obtained from Eqs. (22) and (23), respectively, are summarized in Table 2. Fig. 6 shows the probability distribution of the Bayes factor metric b_high_low. The probability of b_high_low < 0 (i.e., B_high_low < 1, rejecting the model) is 99.3%, indicating that the model is rejected with high probability at the higher-level validation using the lower-level data.
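The Monte Carlo step just described can be sketched as follows for the scalar case (p = q = 1, H_η = H_θ = 1). SEM parameters are drawn from independent normal approximations of their posteriors (an assumption; the Table 1 means and standard deviations are plugged in), the sample variance of D̃_high stands in for Σ_D̃high in Eq. (23), and Eq. (21) is evaluated with ν = 0 and Λ = Σ̃_high. The latent inputs below are synthetic.

```python
# Hedged sketch of the Monte Carlo evaluation of b_high_low, Eq. (21),
# for the scalar case; not the authors' implementation.
import numpy as np

rng = np.random.default_rng(3)

def b_high_low_samples(eta_low, theta_low, post, sz_high, sy_high, n_mc=10_000):
    """eta_low, theta_low: latent lower-level vectors (length K).
    post: {name: (mean, sd)} normal approximations for a_high, b_eta, b_theta."""
    K = eta_low.size
    out = np.empty(n_mc)
    for m in range(n_mc):
        a = rng.normal(*post["a_high"])
        b_eta = rng.normal(*post["b_eta"])
        b_theta = rng.normal(*post["b_theta"])
        D = (a - 1.0) * b_eta * eta_low + b_theta * theta_low      # Eq. (22)
        # Eq. (23); sample variance of D stands in for Sigma_D~high
        S = D.var(ddof=1) + (a - 2.0) ** 2 * sz_high**2 + 4.0 * sy_high**2
        Dbar = D.mean()
        A = (K + 1.0) * S                  # K*Lam + Sigma with Lam = Sigma, nu = 0
        out[m] = 0.5 * np.log(A / S) + 0.5 * K * (Dbar**2 / A - Dbar**2 / S)
    return out

# synthetic latent inputs; coefficients loosely based on the Table 1 means
K = 10
eta_low = rng.normal(0.0, 1.0, K)
theta_low = 3.405 * eta_low + rng.normal(0.0, 0.5, K)
post = {"a_high": (0.338, 0.097), "b_eta": (5.896, 0.405), "b_theta": (1.212, 0.168)}
b = b_high_low_samples(eta_low, theta_low, post, sz_high=1.698, sy_high=1.148)
print("P(b_high_low < 0) =", (b < 0).mean())
```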

3.4. Model reliability metric (MRM)

For comparison with the Bayes factor, a single-level model validation metric is also computed at each level in this study. The MRM [43] is defined as the probability that the difference d between the model prediction and the experimental observation lies within an allowable interval ε₀, i.e., r = Pr(−ε₀ < d < ε₀). Suppose that d = {d_1, d_2, …, d_n} follows a normal distribution N(μ_d, σ_d/√n), where n is the number of sample data points (n = 10 in this example), and μ_d and σ_d represent the mean and standard deviation of the difference data d, respectively. The model reliability is then calculated as

\[ r = \Phi\!\left( \frac{\sqrt{n}\,(\epsilon_0 - |\mu_d|)}{\sigma_d} \right) - \Phi\!\left( \frac{\sqrt{n}\,(-\epsilon_0 - |\mu_d|)}{\sigma_d} \right) \tag{24} \]

where Φ(·) is the normal cumulative distribution function. For the multivariate case, the threshold parameter and the mean values of the difference data in Eq. (24) become vectors and the variance becomes a matrix; in this situation, Φ(·) is a multivariate normal cumulative distribution function.

Let ε₀ = 0.5σ_d for this illustrative example. Substituting the values of μ_d and σ_d at the lower and higher levels into Eq. (24) yields r_low = 76.7% and r_high = 2.9 × 10⁻³⁵, respectively. These results are close to the Bayesian confidence levels previously obtained from Eq. (17), implying that the Bayes factor metric is comparable to the MRM with a properly chosen allowable interval. In addition, substituting the mean values of μ_high_low and σ_high_low into Eq. (24) yields the model reliability r_high_low = 2.4 × 10⁻⁶¹, implying, like the Bayes factor metric, that the model is not reliable at the higher level when assessed using the lower-level data.

Note that hypothesis testing and the MRM are simple methods for assessing model validity, but they are based purely on single-level comparisons; they cannot be applied directly for hierarchical validation across levels, only at each level separately. The Bayesian SEM approach presented previously is therefore needed to model the relationship across the levels, after which both metrics can be used to assess model validity at the higher level using data from the lower level.

Fig. 5. Difference between the experimental observations and model predictions shown in Fig. 2 at (a) the lower level (μ_low = 0.577) and (b) the higher level (μ_high = 15.99).
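Eq. (24) is straightforward to evaluate; the sketch below (ours) reproduces the lower-level result r_low ≈ 76.7% from the illustrative example's values μ_low = 0.577, σ²_low = 4.872, n = 10, and ε₀ = 0.5σ_d.

```python
# Hedged sketch of the model reliability metric, Eq. (24).
import math

def model_reliability(mu_d: float, sigma_d: float, n: int, eps0: float) -> float:
    Phi = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    s = math.sqrt(n) / sigma_d
    return Phi(s * (eps0 - abs(mu_d))) - Phi(s * (-eps0 - abs(mu_d)))

sigma_low = math.sqrt(4.872)
r_low = model_reliability(0.577, sigma_low, n=10, eps0=0.5 * sigma_low)
print(f"r_low = {100 * r_low:.1f}%")   # -> 76.7%
```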

Table 2. Model validation results of the illustrative example

Parameter         Mean     Stand. dev.
μ_high_low        598.6    145.5
σ_high_low        168.0    40.9
b_high_low        −0.126   0.051
κ_high_low (%)    46.86    1.27

Fig. 6. Higher-level model validation using lower-level data with 10,000 iterations of simulation: probability density of b_high_low = ln(B_high_low), with P(b_high_low < 0) = 99.3%.

4. Numerical examples

The proposed methodology is further investigated with two model validation examples. The first example is a three-level dynamic problem with one set of time-series data at each level (i.e., p = q = 1). The second example is a two-level multivariate validation problem (i.e., p = 3 for the lower level and q = 12 for the higher level) with time-series data from a dynamic system; in this example, N > M is selected to illustrate the practical effectiveness of the proposed methodology. Each example is investigated in three steps: (1) the SEM approach is applied to model the relationships between the data at the various levels through latent variables; (2) the SEM parameters are estimated using a Bayes network with MCMC simulation (Gibbs sampling); and (3) Bayesian hypothesis testing is conducted to quantify the role of the lower-level data in model validation at the higher level through the SEM parameters. The Bayesian approach with Gibbs sampling for the estimation of the SEM parameters is implemented using the software WinBUGS [40] for all examples. All results are based on a run of 11,000 iterations of the Gibbs sampler, where the first 1000 iterations are discarded as the simulation burn-in period to ensure stationarity; 1000 iterations were found adequate for the burn-in period by trial and error.

4.1. Example 1: multilevel model validation (single variable)

This example is created to illustrate the proposed methodology for multilevel model assessment using time-series data collected from a dynamic system. Assume that three sets of time-series measurements with 2500 data points each are collected from component, subsystem, and system tests. The corresponding three sets of model predictions at the component, subsystem, and system levels are generated from a computational model with the same number of data points. Fig. 7a shows the component-, subsystem-, and system-level data as time-series plots, where the solid and dashed lines represent the model predictions and the experimental measurements, respectively. Fig. 7b compares the power spectral densities (PSD) of the predicted and measured acceleration response data at the multiple levels. The PSD of each time series is calculated through the fast Fourier transform with a length of 1024; refer to Stoica and Moses [44] for details of spectral analysis. The difference between the predicted and measured acceleration response data cannot be identified graphically at any level in either the time or the frequency domain, implying that the model may be accepted with high confidence at all levels.

The frequency data in the range between 1 and 60 Hz are used in the hierarchical model assessment (i.e., q = 1 and N = 60 for all three levels). The mean values of the differences between the predicted and measured frequency data for the three cases are μ_com = 1.8 × 10⁻³, μ_sub = 6.6 × 10⁻⁴, and μ_sys = 9.6 × 10⁻⁴, and the variances are σ²_com = 1.49 × 10⁻⁴, σ²_sub = 6.78 × 10⁻⁴, and σ²_sys = 1.48 × 10⁻³. Substituting the obtained mean and variance values at the three levels into Eqs. (16) and (18), and assuming ν = 0 with variance Λ = σ²_com, the Bayes factor metrics b_com = 2.348, b_sub = 2.228, and b_sys = 1.844 are obtained for the component-, subsystem-, and system-level model validations using the corresponding level data. From Eq. (17), the confidence values in accepting the model are κ_com = 91.3%, κ_sub = 90.3%, and κ_sys = 86.3% for the three cases. The resulting three MRM values are r_com = 99.7%, r_sub = 99.9%, and r_sys = 99.99%. Clearly, the model is accepted with high confidence in all three single-level validations using the frequency data.
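The PSD processing can be reproduced with a Welch-type estimate using FFT segments of length 1024, as in [44]; the sketch below uses a synthetic signal and an assumed sampling rate, then restricts attention to the 1–60 Hz band used in the assessment.

```python
# Hedged sketch of the PSD comparison: Welch PSD with FFT length 1024.
# The signal and sampling rate below are assumptions, not the paper's data.
import numpy as np
from scipy.signal import welch

fs = 200.0                                   # assumed sampling rate (Hz)
rng = np.random.default_rng(4)
t = np.arange(0, 25.0, 1.0 / fs)
x = np.sin(2 * np.pi * 5.0 * t) + 0.1 * rng.normal(size=t.size)  # synthetic response

f, Pxx = welch(x, fs=fs, nperseg=1024)       # FFT length 1024
band = (f >= 1.0) & (f <= 60.0)              # 1-60 Hz band used in Example 1
print(f[band].size, "frequency points in band")
```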


Fig. 7. Single prediction and response data of three-level model validation in both time and frequency domains (Example 1): (a) time series and (b) frequency.

The hierarchical SEM for the three-level model validation is derived in Appendix A, and Fig. 8 shows its path diagram. Thirteen parameters need to be estimated in this example (N = M = 2500). Similar to the illustrative example, given the three sets of experimental frequency data (Y_com, Y_sub, and Y_sys) and predicted data (Z_com, Z_sub, and Z_sys), the 13 SEM parameters are estimated using a Bayes network with MCMC Gibbs sampling.

Fig. 8. Bayesian network for structural equation modeling (Example 1).

Table 3. Statistical results of SEM parameters (Example 1)

Parameter    Mean     Stand. dev.
a_com        1.407    0.147
a_sub        0.806    0.134
a_sys        0.582    0.147
b_sub        0.504    0.299
b_sys        0.579    0.195
c_com        3.242    0.272
c_sub        4.422    0.367
σ_y,com      0.0635   0.0060
σ_y,sub      0.0667   0.0063
σ_y,sys      0.0689   0.0067
σ_z,com      0.1014   0.0076
σ_z,sub      0.0668   0.0061
σ_z,sys      0.0805   0.0082

Table 3 summarizes the statistical results of the 13 SEM parameters.

Now we assess the model validity at the higher level using the lower-level data. There are three scenarios: subsystem-level validation using the component-level data (Scenario 1), system-level validation using the subsystem-level data (Scenario 2), and system-level validation using the component-level data (Scenario 3). By sampling η_low and θ_low, the mean D̃_high and variance Σ̃_high are calculated using Eqs. (22) and (23), respectively, for the three scenarios. Then, using the MCMC simulation technique and the statistics of the SEM parameters in Table 3, the probability distributions of the Bayes factor metric for the three scenarios are calculated using Eq. (21), as shown in Fig. 9 with 10,000 iterations of simulation. The mean values for the three scenarios are μ_sub_com = 6.3 × 10⁻³, μ_sys_com = 0.234, and μ_sys_sub = 0.179, and the variances are σ²_sub_com = 0.037, σ²_sys_com = 0.553, and σ²_sys_sub = 0.282. The mean values of the Bayes factor metric, the confidence in the model, and the probabilities of accepting the model, λ = Prob(b > 0), for the three scenarios are also shown in Fig. 9. The resulting three MRM values are r_sub_com = 99.98%, r_sys_com = 92.4%, and r_sys_sub = 89.7%.

Two observations follow from the numerical results shown in Fig. 9. First, the probability of b > 0 (i.e., B01 > 1, accepting the model) is 100% for Scenarios 1 and 2, indicating that the model is always accepted at the higher-level validation using data one level lower; in contrast, the probability of accepting the model for Scenario 3, where data two levels lower are used for the higher-level validation, is only 65.4%. Second, the confidence in the model at the higher level decreases when lower-level data are used. For example, the confidence in the model at the subsystem level using the component-level data is only 74.1% (Scenario 1) for the Bayesian method, while it is 89.7% for the MRM method. This is because data from a level far below the target level affect the confidence in the model at the higher-level validation.

Fig. 9. Multilevel validation results using various levels of data with 10,000 iterations of simulation (Example 1). Scenario 1: b = 1.049, κ = 74.1%, λ = 100%; Scenario 2: b = 0.974, κ = 72.5%, λ = 100%; Scenario 3: b = 0.211, κ = 55.3%, λ = 65.4%; λ = Prob(b > 0).

4.2. Example 2: multilevel model validation (multiple variables)

4.2.1. Problem description

A structural dynamics problem provided by Sandia National Laboratories [45] as a challenge problem is used to investigate the effectiveness of the proposed Bayesian SEM method for hierarchical multivariate model validation using time-series data. This validation challenge problem consists of a three-mass subsystem (shown in Fig. 10) and a practical application system (shown in Fig. 11), and was addressed using various approaches by many researchers from the US and Europe in the 2006 model validation workshop organized by Sandia (http://www.esc.sandia.gov/VCWwebsite/vcwhome.html). The subsystem is mounted on a simply supported beam by an additional weakly nonlinear connection, and only vertical motions are permitted. Computational models are available to make simulation-based predictions of the structural responses. All validation data used in this example were virtually generated for illustrative purposes [45]. For simplicity, two assumptions are made in this problem [45]: (1) the various configurations of the structural systems are completely known, and (2) all experimental measurements are perfect and therefore do not contain any noise.

Fig. 10. Subsystem for model calibration and validation: components with connection.

Fig. 11. System for model accreditation.

In the subsystem validation test, three levels of shock loading with free decay, namely low, medium, and high, are applied to m2 (Fig. 10), one at a time. The acceleration responses at m1, m2, and m3 in the vertical direction are obtained for each level of shock excitation, resulting in three sets of time-series data (i.e., three variables, p = 3), each consisting of three acceleration responses with 6144 data points (i.e., N = 6144) for the three masses separately. The virtual tests are repeated on 20 nominally identical subsystems, resulting in 60 sets of response acceleration data. The 20 sets of response data resulting from the high-level shock excitation are used in this study for illustrative purposes (i.e., the number of repeated experiments is Nrep = 20). As an example, Fig. 12 shows the experimental data (dashed line) and the model prediction (solid line) for the acceleration response of mass m1 for sample 20 due to the high-level shock excitation at the subsystem level. Figs. 12a and b show the time-series data and PSD plots, respectively; the PSD of each time series is calculated through the fast Fourier transform with a length of 1024 [44]. An enlarged plot of the time series is also inserted in Fig. 12 for visual comparison. The two data sets at the subsystem level are visually indistinguishable in both the time and frequency domains.

Fig. 12. Time-series and frequency plots of the acceleration response data of mass m1 from 0 to 0.3 s for sample 20 due to high-level shock excitation at the subsystem level (Example 2): (a) time series and (b) frequency.

The suitability of the model for practical application at the system level is assessed through a more fully relevant scenario, shown in Fig. 11. The configuration consists of the validated subsystem (Fig. 10) and a uniform beam that is simply supported at one end, supported by a linear spring at the other, and subjected to a single blast-type shock load. As in the subsystem validation case, only vertical motions are permitted in the combined structure. The vertical acceleration responses are measured at nine points along the beam as well as at the three masses of the subsystem, resulting in 12 sets of measured response acceleration time series (i.e., 12 variables, q = 12), each having 12,288 data points (i.e., M = 12,288). Due to the high cost of building test specimens for full systems in practice, only three specimens are tested (i.e., the number of repeated experiments is Mrep = 3).

Fig. 13 shows the experimental data (dashed line) and the model prediction (solid line) for the acceleration response of mass m3 for the first sample due to the blast-type shock excitation at the system level. Figs. 13a and b show the time-series and frequency plots, respectively. In contrast to the subsystem-level case, the two data sets at the system level are visually different in both the time and frequency domains, implying that the model may be acceptable at the subsystem level but rejected at the target application level.

Fig. 13. Time-series and frequency plots of the acceleration response data of mass m3 for the first sample due to the stochastic blast-type shock excitation at the system level (Example 2): (a) time series and (b) frequency.

Clearly, this is a complicated multivariate validation problem in which multiple validation data sets are available at the subsystem level (Nrep = 20, p = 3, N = 6144) and the system level (Mrep = 3, q = 12, M = 12,288). Since the validation data at the subsystem and system levels are created from different shock excitations, the components of the N × M matrices H_η in Eq. (10) and H_θ in Eq. (11) need to be estimated to quantify the relationship between the two levels of validation data. In this example, the model validity is investigated in terms of the PSD values using the Bayesian SEM method, with N′ = 200 and M′ = 90 frequency points at the subsystem- and system-level validations, respectively. Two cases with different numbers of samples are used to investigate the effect of the amount of data on the model validation confidence: Case 1 consists of only one set of sample test data for both the subsystem- and system-level validations, with p = 3 and q = 12, respectively, while Case 2 consists of 20 sets of validation data at the subsystem level with p = 3 and three sets of validation data at the system level with q = 3 (only the three-mass subsystem).

4.2.2. Model validation

For comparison purposes, the Bayes factor and MRM metrics for the two cases are calculated using the given frequency data (deterministic). The results are summarized in Table 4.

Table 4
Model validation results at different levels (Example 2)

Metrics      Case 1    Case 2
bsub          2.44      3.56
ksub (%)     92.0      97.2
bsys         −3.69     −4.48
ksys (%)      2.4       1.1
rsub (%)     86.7      91.0
rsys (%)     46.2      38.0
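The confidence values in Table 4 are related to the log Bayes factor b = ln(B01) through k = B/(1 + B) under equal prior odds; as a quick check, the short Python sketch below reproduces the tabulated entries (the function name is illustrative).

```python
import math

def acceptance_confidence(b):
    """Confidence k = B/(1+B) in accepting the model, with b = ln(B01)."""
    return math.exp(b) / (1.0 + math.exp(b))

print(acceptance_confidence(2.44))   # ~0.920 -> ksub = 92.0% (Case 1)
print(acceptance_confidence(3.56))   # ~0.972 -> ksub = 97.2% (Case 2)
print(acceptance_confidence(-3.69))  # ~0.024 -> ksys = 2.4%  (Case 1)
print(acceptance_confidence(-4.48))  # ~0.011 -> ksys = 1.1%  (Case 2)
```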

Two observations follow from these results. First, the model is accepted at the subsystem level with high confidence but rejected at the system level in both cases. For example, the values bsub = 2.44 in Case 1 and bsub = 3.56 in Case 2 indicate that the model is accepted with confidence k = 92.0% and 97.2%, respectively. Second, the more validation data are available, the more confident the validation result: the value bsys = −4.48 in Case 2 is smaller than bsys = −3.69 in Case 1, implying rejection of the model at the system level with higher confidence. Using the resulting statistics of the SEM parameters, the probability distributions of the Bayes factor metric (bsys_sub) are obtained for the system-level model validation using the subsystem-level data in the two cases. Fig. 14 shows the assessment results for Cases 1 and 2. The probability of the event bsys_sub < 0 (rejecting the model) is 100% for both cases, indicating that the model is rejected with high confidence at the system level when the subsystem-level data are used. In addition, the confidence in rejecting the model in Case 2, i.e., P(ksys_sub < 30%) = 95.4%, is higher than that in Case 1, i.e., P(ksys_sub < 40%) = 80.7%. These conclusions are consistent with those obtained in the deterministic analysis using the given data at the various levels, as discussed previously.

Fig. 14. Multivariate system-level validation results using subsystem-level data with 10,000 iterations of simulation (Example 2): (a) Case 1: Nrep = 1, p = 3, Mrep = 1, q = 12 and (b) Case 2: Nrep = 20, p = 3, Mrep = 3, q = 3.
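The probabilities reported in Fig. 14 can be estimated directly from posterior draws of the Bayes factor metric. The sketch below is illustrative only: it uses synthetic stand-in samples rather than the actual MCMC output of the Bayesian SEM, and the distribution parameters are placeholders.

```python
import numpy as np

# Hypothetical stand-in for 10,000 MCMC draws of b_sys_sub = ln(B01);
# the real draws would come from the Bayesian SEM posterior.
rng = np.random.default_rng(1)
b_samples = rng.normal(loc=-9.0, scale=3.0, size=10_000)

# P(b_sys_sub < 0): fraction of draws implying rejection of the model
p_reject = np.mean(b_samples < 0.0)

# P(k_sys_sub < 30%): note k = e^b/(1+e^b) < 0.3  <=>  b < ln(0.3/0.7)
p_low_conf = np.mean(b_samples < np.log(0.3 / 0.7))
```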

5. Concluding remarks

This paper presents a novel methodology to make use of multiple levels of data for hierarchical model validation, integrating Bayesian analysis and the SEM technique. An SEM approach with latent variables is employed to model the relationships between the computational model and the system-level data, and between the lower-level data and the system-level data. A Bayesian network with MCMC simulation and Gibbs sampling is applied to the path graph of the SEM to estimate the SEM influencing factors. Both Bayesian hypothesis testing and the MRM are employed to quantify the confidence in the predictive model at the system level and the contribution of the lower-level data to the higher-level model validation. The paper also addresses the effect of different amounts of data at different stages of the validation hierarchy on the model validation confidence at the system level. The proposed methodology is illustrated with univariate and multivariate multilevel model validation problems, using discrete observations and time-series data, respectively. Numerical results demonstrate that the Bayesian SEM approach with latent variables is an effective tool for quantifying the relationships among different levels of data. In many validation situations, such as the evaluation of large systems, system-level tests are limited or even infeasible; the proposed method can then be used to draw inferences about the validity of the system-level prediction using the lower-level data. In future work, the proposed Bayesian SEM methodology may be extended to more complicated situations in hierarchical model validation, such as missing data in experimental observations, nonlinear relationships, and multiple system tests with different boundary conditions.

Acknowledgments

The research was supported by funds from Sandia National Laboratories, Albuquerque, New Mexico (Contract no. BG-7732; project monitors: Dr. Thomas L. Paez, Dr. Laura P. Swiler, and Dr. Martin Pilch). The support is gratefully acknowledged. The authors also thank the anonymous referees for their helpful comments, which improved the manuscript.

Appendix A. Structural equation modeling for Example 1

For this three-level model validation, assume that only one set of experimental measurements in the form of a time series is available at each level, namely, the component, subsystem, and system levels. Using Eqs. (5)–(9), the measurement models and prediction models are obtained as follows:

y_com = η_com + ε_y,com,  y_sub = η_sub + ε_y,sub,  y_sys = η_sys + ε_y,sys   (25)

z_com = γ_com + ε_z,com,  z_sub = γ_sub + ε_z,sub,  z_sys = γ_sys + ε_z,sys   (26)

where y and z represent the experimental measurement and the model prediction, respectively; η and γ are latent variables; and each ε is a Gaussian random variable with distribution N(0, σ²). In this study, the variance of ε_y is used for the corresponding latent variable η. The structural models can be expressed as

η_com = a_com γ_com + ε_y,com   (27)

η_sub = a_sub γ_sub + b_sub η_com + ε_y,sub   (28)

η_sys = a_sys γ_sys + b_sys η_sub + ε_y,sys   (29)

γ_sub = c_com γ_com + ε_z,sub   (30)

γ_sys = c_sub γ_sub + ε_z,sys   (31)
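To make the generative structure of Eqs. (25)–(31) concrete, the following sketch simulates the latent variables and two system-level observables for assumed, purely illustrative parameter values; the paper instead infers these parameters from data by MCMC.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Assumed illustrative SEM parameters (placeholders, not estimates)
a_com, a_sub, a_sys = 1.0, 1.0, 1.0
b_sub, b_sys = 0.9, 0.9
c_com, c_sub = 1.0, 1.0
s_y, s_z = 0.05, 0.05  # noise standard deviations

def eps(s):
    return rng.normal(0.0, s, n)

g_com = rng.normal(0.0, 1.0, n)                    # gamma_com (exogenous latent)
h_com = a_com * g_com + eps(s_y)                   # Eq. (27)
g_sub = c_com * g_com + eps(s_z)                   # Eq. (30)
h_sub = a_sub * g_sub + b_sub * h_com + eps(s_y)   # Eq. (28)
g_sys = c_sub * g_sub + eps(s_z)                   # Eq. (31)
h_sys = a_sys * g_sys + b_sys * h_sub + eps(s_y)   # Eq. (29)

# Measurement models, Eqs. (25)-(26), at the system level
y_sys = h_sys + eps(s_y)   # "experimental measurement"
z_sys = g_sys + eps(s_z)   # "model prediction"
```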

Fig. 8 shows the path diagram for the SEM represented by Eqs. (25)–(31). In total, 13 SEM parameters need to be estimated, i.e., X = {a_com, a_sub, a_sys, b_sub, b_sys, c_com, c_sub, σ_y,com, σ_y,sub, σ_y,sys, σ_z,com, σ_z,sub, σ_z,sys}. Let D_sub_com, D_sys_sub, and D_sys_com represent the difference of the two data sets for model validation at the subsystem level using component-level data, at the system level using subsystem-level data, and at the system level using component-level data, respectively. Using Eqs. (25)–(31), the differences in the three cases can be expressed as

D_sub_com = y_sub − z_sub = η_sub − γ_sub + ε_y,sub − ε_z,sub = (a_sub − 1) c_com γ_com + b_sub η_com + 2ε_y,sub − 2ε_z,sub   (32)

D_sys_sub = (a_sys − 1) c_sub γ_sub + b_sys η_sub + 2ε_y,sys − 2ε_z,sys   (33)

D_sys_com = (a_sys c_sub − c_sub + b_sys a_sub) c_com γ_com + b_sub b_sys η_com + (a_sys − 1) c_sub ε_z,sub + b_sys ε_y,sub + 2ε_y,sys − 2ε_z,sys   (34)

Thus, the mean of the approximated difference data D̃_sub_com is calculated from the available component-level data Z_com and Y_com using

D̃_sub_com = (a_sub − 1) c_com Z_com + b_sub Y_com   (35)

and s²_sub_com = s̃²_sub_com + 4σ²_z,sub + 4σ²_y,sub, in which s̃²_sub_com is the variance of D̃_sub_com. The mean of D̃_sys_sub is calculated from the available subsystem-level data Z_sub and Y_sub using

D̃_sys_sub = (a_sys − 1) c_sub Z_sub + b_sys Y_sub   (36)

and s²_sys_sub = s̃²_sys_sub + 4σ²_z,sys + 4σ²_y,sys, in which s̃²_sys_sub is the variance of D̃_sys_sub. The mean of D̃_sys_com is calculated from the available component-level data Z_com and Y_com using

D̃_sys_com = (a_sys c_sub − c_sub + b_sys a_sub) c_com Z_com + b_sub b_sys Y_com   (37)

and s²_sys_com = s̃²_sys_com + (a_sys − 1)² c²_sub σ²_z,sub + b²_sys σ²_y,sub + 4σ²_z,sys + 4σ²_y,sys, in which s̃²_sys_com is the covariance matrix of D̃_sys_com.
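As a minimal numerical illustration of Eqs. (35)–(37), the sketch below forms the approximated difference data and its variance for the subsystem-versus-component case; the parameter values and data arrays are placeholder assumptions, not estimates from the paper.

```python
import numpy as np

def d_tilde_sub_com(a_sub, c_com, b_sub, Z_com, Y_com):
    """Approximated difference data, Eq. (35)."""
    return (a_sub - 1.0) * c_com * Z_com + b_sub * Y_com

def s2_sub_com(d_tilde, sigma_z_sub, sigma_y_sub):
    """Variance s^2 = var(D~) + 4*sigma_z,sub^2 + 4*sigma_y,sub^2."""
    return np.var(d_tilde) + 4.0 * sigma_z_sub**2 + 4.0 * sigma_y_sub**2

# Placeholder component-level data (illustrative only)
rng = np.random.default_rng(2)
Z_com = rng.normal(size=100)                  # model predictions
Y_com = Z_com + 0.05 * rng.normal(size=100)   # measurements

d = d_tilde_sub_com(1.02, 0.98, 0.95, Z_com, Y_com)
s2 = s2_sub_com(d, sigma_z_sub=0.05, sigma_y_sub=0.05)
mean_d = d.mean()   # mean value entering the validation metric
```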

References

[1] Hasselman TK, Wathugala GW, Crawford J. A hierarchical approach for model validation and uncertainty quantification. In: Mang HA, Rammerstorfer FG, Eberhardsteiner J, editors. Proceedings of the fifth world congress on computational mechanics, Vienna, Austria, July 7–12, 2002.
[2] DOD. Verification, validation, and accreditation (VV&A) recommended practices guide. Alexandria, VA: Department of Defense; 1996. http://vva.dmso.mil/.
[3] AIAA. Guide for the verification and validation of computational fluid dynamics simulations. Reston, VA: American Institute of Aeronautics and Astronautics, AIAA-G-077-1998; 1998.
[4] DOE. Accelerated Strategic Computing Initiative (ASCI) program plan. Washington, DC: Department of Energy, DOE/DP-99-000010592; 2000.
[5] ASME. Guide for verification and validation in computational solid mechanics. ASME V&V 10-2006, American Society of Mechanical Engineers.
[6] Balci O. Verification, validation, and accreditation of simulation models. In: Proceedings of the 29th conference on winter simulation, Atlanta, GA, 1997.
[7] Roache PJ. Verification and validation in computational science and engineering. Albuquerque, NM: Hermosa Publishers; 1998.
[8] Trucano TG, Easterling RG, Dowding KJ, Paez TL, Urbina A, Romero VJ, et al. Description of the Sandia validation metrics project. Technical report of Sandia no. 2001-1339. Albuquerque, NM: Sandia National Laboratories; 2001.
[9] Hills RG, Trucano TG. Statistical validation of engineering and scientific models: a maximum likelihood based metric. Technical report of Sandia no. 2001-1783. Albuquerque, NM: Sandia National Laboratories; 2002.
[10] Oberkampf WL, Trucano TG. Verification and validation in computational fluid dynamics. Prog Aerospace Sci 2002;38(3):209–72.
[11] Paez TL, Urbina A. Validation of mathematical models of complex structural dynamic systems. In: Proceedings of the ninth international congress on sound and vibration, Orlando, FL, 2002.
[12] Hills RG, Leslie I. Statistical validation of engineering and scientific models: validation experiments to application. Technical report of Sandia no. 2003-0706. Albuquerque, NM: Sandia National Laboratories; 2003.
[13] Oberkampf WL, Trucano TG, Hirsch Ch. Verification, validation, and predictive capabilities in computational engineering and physics. Technical report of Sandia no. 2003-3769. Albuquerque, NM: Sandia National Laboratories; 2003.
[14] Babuska I, Oden JT. Verification and validation in computational engineering and science: basic concepts. Comput Methods Appl Mech Eng 2004;193(36–38):4057–66.
[15] Dowding KJ, Hills RG, Leslie I, Pilch M, Rutherford BM, Hobbs ML. Case study for model validation: assessing a model for thermal decomposition of polyurethane foam. Technical report of Sandia no. 2004-3632. Albuquerque, NM: Sandia National Laboratories; 2004.
[16] Campbell K. A brief survey of statistical model calibration ideas. In: International conference on sensitivity analysis of model output, Santa Fe, NM, 2004.
[17] Sargent P. Validation and verification of simulation models. In: Proceedings of the 2004 winter simulation conference, Washington, DC, 2004.
[18] Oberkampf WL, Trucano TG. Design of and comparison with verification and validation benchmarks. Technical report of Sandia no. 2006-5376C. Albuquerque, NM: Sandia National Laboratories; 2006.
[19] Oberkampf WL, Barone MF. Measures of agreement between computation and experiment: validation metrics. J Comput Phys 2006;217(1):5–36.
[20] Chen W, Baghdasaryan L, Buranathiti T, Cao J. Model validation via uncertainty propagation and data transformation. AIAA J 2004;42(7):1406–15.
[21] Zhang R, Mahadevan S. Bayesian methodology for reliability model acceptance. Reliab Eng Syst Saf 2003;80(1):95–103.
[22] Mahadevan S, Rebba R. Validation of reliability computational models using Bayes networks. Reliab Eng Syst Saf 2005;87(2):223–32.
[23] Rebba R, Mahadevan S. Model predictive capability assessment under uncertainty. AIAA J 2006;44(10):2376–84.
[24] Jiang X, Mahadevan S. Bayesian risk-based decision method for model validation under uncertainty. Reliab Eng Syst Saf 2007;92(6):707–18.
[25] Jiang X, Mahadevan S. Bayesian validation assessment of multivariate computational models. J Appl Stat 2008;35(1):49–65.
[26] Jiang X, Mahadevan S. Bayesian wavelet method for multivariate model assessment of dynamical systems. J Sound Vib 2008;312(4–5):694–712.
[27] Bollen KA. Structural equations with latent variables. New York: Wiley; 1989.
[28] Goldberger AS. Structural equation methods in the social sciences. Econometrica 1972;40(6):979–1001.
[29] Bielby WT, Hauser RM. Structural equation models. Annu Rev Sociol 1977;3:137–61.
[30] Bentler PM. Multivariate analysis with latent variables: causal modeling. Annu Rev Psychol 1980;31(1):419–56.
[31] Congdon P. Applied Bayesian modelling. Chichester: Wiley; 2003.
[32] Hilborn R, Mangel M. The ecological detective: confronting models with data. Princeton, NJ: Princeton University Press; 1997.
[33] Martin JK, McDonald RP. Bayesian estimation in unrestricted factor analysis: a treatment for Heywood cases. Psychometrika 1975;40(4):505–17.
[34] Lee SY. A Bayesian approach to confirmatory factor analysis. Psychometrika 1981;46(2):153–60.
[35] Arminger G, Muthén BO. A Bayesian approach to nonlinear latent variable models using the Gibbs sampler and the Metropolis–Hastings algorithm. Psychometrika 1998;63(3):271–300.
[36] Ansari A, Jedidi K, Jagpal S. A hierarchical Bayesian methodology for treating heterogeneity in structural equation models. Market Sci 2000;19:328–47.
[37] Lopes HF, West M. Bayesian model assessment in factor analysis. Stat Sin 2004;14(1):41–67.
[38] Jensen FV, Jensen FB. Bayesian networks and decision graphs. New York: Springer; 2001.
[39] Gilks GR, Richardson S, Spiegelhalter DJ. Markov chain Monte Carlo in practice. Interdisciplinary statistics. London: Chapman & Hall/CRC; 1996.
[40] Spiegelhalter DJ, Thomas A, Best NG. WinBUGS version 1.4 user manual. Cambridge, UK: MRC Biostatistics Unit; 2002.
[41] Kass R, Raftery A. Bayes factors. J Am Stat Assoc 1995;90(430):773–95.
[42] Migon HS, Gamerman D. Statistical inference: an integrated approach. London: Arnold, a member of the Hodder Headline Group; 1999.
[43] Rebba R, Mahadevan S. Computer methods for model reliability assessment. Reliab Eng Syst Saf 2008;93(8):1197–207.
[44] Stoica P, Moses RL. Introduction to spectral analysis. Englewood Cliffs, NJ: Prentice-Hall; 1997.
[45] Red-Horse JR, Paez TL. Sandia National Laboratories validation workshop: structural dynamics application. Comput Methods Appl Mech Eng 2008;197(29–32):2578–84.