Probabilistic Engineering Mechanics 28 (2012) 216–224
Contents lists available at SciVerse ScienceDirect
Probabilistic Engineering Mechanics journal homepage: www.elsevier.com/locate/probengmech
Stochastic sampling using moving least squares response surface approximations Alexandros A. Taflanidis a,∗,1 , Sai-Hung Cheung b a
University of Notre Dame, Department of Civil Engineering and Geological Sciences, 156 Fitzpatrick Hall, Notre Dame, IN 46556, United States
b
Nanyang Technological University, Division of Structures and Mechanics, Singapore
article
info
Article history: Received 3 December 2010 Accepted 6 July 2011 Available online 23 July 2011 Keywords: Stochastic simulation Stochastic sampling Response surfaces Moving least squares Relative information entropy
abstract This work discusses the simulation of samples from a target probability distribution which is related to the response of a system model that is computationally expensive to evaluate. Implementation of surrogate modeling, in particular moving least squares (MLS) response surface methodologies, is suggested for efficient approximation of the model response for reduction of the computational burden associated with the stochastic sampling. For efficient selection of the MLS weights and improvement of the response surface approximation accuracy, a novel methodology is introduced, based on information about the sensitivity of the sampling process with respect to each of the model parameters. An approach based on the relative information entropy is suggested for this purpose, and direct evaluation from the samples available from the stochastic sampling is discussed. A novel measure is also introduced for evaluating the accuracy of the response surface approximation in terms relevant to the stochastic sampling task. © 2011 Elsevier Ltd. All rights reserved.
1. Introduction In stochastic analysis of engineering systems the task of generating samples according to a target probability distribution involving some performance function of the system response often arises. Such problems are encountered, for example, in Bayesian system identification [1,2] in recent innovative stochastic optimization approaches [3,4] or in some advanced methodologies for estimation of stochastic integrals [5,6]. Various algorithms have been proposed for this task. Refs. [7,8] provide a more detailed discussion on this topic. These algorithms rely, in general, on generation of candidate samples based on some proposal density and require afterward repeated evaluation of the system model for each of the candidate samples to compute the performance function involved in the target probability distribution function (PDF). For applications involving complex system models, these latter evaluations represent typically the most computationally intensive task; the computational burden involved in the model response evaluation is orders of magnitude larger than the effort associated with all other aspects of the stochastic sampling algorithm. In this paper we focus on implementation of surrogate modeling, in particular moving least squares (MLS) response surface methodologies [9], for efficient approximation of the model response and thus reduction of the computational burden for the
stochastic sampling process. Since efficiency of any MLS approximation is largely dependent on appropriate selection of the weights involved in the MLS problem formulation, approaches are discussed for appropriate selection of these weights. This is established here by exploiting the information available from a sensitivity analysis within the stochastic sampling, an idea initially introduced in [10]. The relative information entropy between the target and proposal densities is suggested for quantification of the sensitivity of the sampling process with respect to each of the model parameters. For efficiently estimating the probability distribution functions needed for the entropy evaluation, which corresponds to a scalar integral in this case, an approach based on kernel density estimation calculated through the available samples is presented. A novel measure is also introduced for evaluating the accuracy of the response surface methodologies for the stochastic sampling. The relative entropy between the exact (obtained by direct evaluation of the system model) and approximate (obtained by approximation of the model response) probability density functions is adopted as such a measure. In this case the entropy corresponds to a multidimensional integral and estimation by stochastic simulation is proposed. The efficiency of the methodology for stochastic sampling using MLS approximations is illustrated in an example related to the reliability of a base-isolated structure against near-fault type of excitation. 2. Problem formulation and stochastic sampling
∗
Corresponding author. Tel.: +1 5746315696; fax: +1 5746319236. E-mail address:
[email protected] (A.A. Taflanidis).
1 Rooney Family Assistant Professor. 0266-8920/$ – see front matter © 2011 Elsevier Ltd. All rights reserved. doi:10.1016/j.probengmech.2011.07.003
Consider a system model that includes model parameters
θ ∈ Θ ⊂ Rnθ , where Θ denotes the space of possible values for
A.A. Taflanidis, S.-H. Cheung / Probabilistic Engineering Mechanics 28 (2012) 216–224
these parameters. In a probabilistic setting, our incomplete knowledge about the model parameters is treated by assigning a probability distribution function (PDF) to them p(θ) [11]. Let z ∈ Rnz denote the deterministic model output and h(θ) : Rnθ → R+ be a performance function related to the model output though a transformation T : Rnz → R+ h(θ) = T {z(θ)} .
(1)
We are interested here in generating samples according to the probability distribution function
π(θ) =
Θ
h(θ)p(θ) h(θ)p(θ)dθ
(2)
3. Moving least squares response surface Response surface approaches [9] belong to the general class of surrogate modeling methodologies and ultimately aim at approximating a complex process, requiring large computational cost for its evaluation, by a simpler mathematical model. This is established by expressing the function corresponding to the initial process f (θ) : Rnθ → R, where θ = [θ1 . . . θnθ ] ∈ Rnθ denotes the vector of free variables, through a number of NB prescribed basis functions (interpolating functions) bi (θ) : Rnθ → R; i = 1, . . . , NB. The approximation is expressed as a linear combination of the bi (θ) by introduction of coefficients ai {θ} ∈ R; i = 1, . . . , NB. The latter can be constant or depend on the location θ for which the interpolation is established. This dependence is denoted by {θ} here. The approximation is then NB
bi (θ)ai {θ} = b(θ)T a{θ}
(3)
i =1
where b(θ) is the vector of basis functions and a{θ} is the vector containing the coefficients for the basis functions. Different classes of basis functions have been suggested and used in the literature [9]. A common choice is a second order approximation, which would give: fˆ (θ) = ao {θ} +
nθ
aj {θ}θj +
b(θ) = 1 NB =
θ1 . . . θnθ
nθ (nθ + 3) + 2
ajk {θ}θj θk ;
θ12
θ1 θ2
θ1 θ3 . . . θn2θ
T
,
(4)
2
T
a{θ} = ao {θ} a1 {θ} . . . anθ {θ} a11 {θ} a12 {θ} a13 {θ} . . . anθ nθ {θ}
JR {θ} =
NS
2 w{θ} fˆ (θI ) − f (θI )
I =1
=
NS
2 w{θ} b(θI )T a{θ} − f (θI )
= [Ba{θ} − F]T W{θ} [Ba{θ} − F]
.
The coefficients a{θ} are calculated by initially evaluating f (θ) in NS > NB support points {θI ; I = 1, . . . , NS } and then by minimizing the mean squared error over these points between f (θ)
(5)
where the following quantities have been introduced B = [b(θ1 ) . . . b(θNS )]T
F = [f (θ1 ) . . . f (θNS )]T
W{θ} = diag [w (d(θ; θ1 )) . . . w (d(θ; θNS ))]
(6)
and w {d(θ; θI )} is a variable weight function with a compact support that depends on some measure of the distance between the interpolation point θ and each of the supporting points. A typical selection for this distance measure is a weighted quadratic vector norm
nθ 2 d(θ; θI ) = ∥θ − θI ∥v = θi − θi,I vi2 i=1
=
[v (θ − θI )]T v (θ − θI );
v = diag(v1 . . . vnθ )
(7)
with vi representing the relative weight for each component of yi . Since v is diagonal, the transformation v · θ corresponds to scaling of each component θi by vi . The introduction of the dependence on the distance weights w{d} aims at reducing the approximation error at each point by performing a weighted local averaging of the information obtained by the support points that are closer to it. Without these weights, the coefficient vector, a, would be constant over the whole domain for θ which means that a global approximation would be established (global least squares). The efficiency – i.e., fit to f (θ) – of global approximations depends significantly on the selection of the basis functions, which should be chosen to resemble f (θ) as closely as possible. Such a selection is not always straightforward [13]. The MLS circumvents such problems by establishing a local approximation for a{θ} around each point in the interpolation domain. This leads to a smaller dependence of the fit on the type of basis functions used [14–16]. On the other hand the efficiency of the MLS interpolation depends on the weighing function chosen. This function should prioritize support points that are close to the interpolation point, and should vanish after an influence radius D. An appropriate support size D should be selected at any point θ so that a sufficient number of neighboring supporting points are included to avoid singularity in the solution for a{θ}. This means that D should include at least NB points. Many types of weighting functions have been suggested in the literature. One of the most common is the exponential type of function
w(d) =
j=1 k≥j
j=1
nθ nθ
and the approximation fˆ (θ) established through (3). In the Moving Least Squares (MLS) approach the coefficients are dependent on θ, and are selected by minimizing a weighted sum of squared error that is a function of θ
I =1
∝ h(θ)p(θ)
where the denominator in (2) simply corresponds to a normalization constant. A wide class of algorithms have been developed for generation of samples from distributions such as π (θ) in (2), ranging from direct Monte Carlo methods to advanced Markov Chain implementations [1,12]. Appendix A presents a brief review of two popular algorithms. Though there is a significant degree of variability on the implementation details of each of these algorithms, they share some common characteristics; they rely on generation of candidate samples based on some proposal density q(θ) and then proceed to evaluate the model response for each of them. Thus the stochastic sampling process requires simulation of the model performance to calculate h(θ) for each candidate sample. For applications with complex system models this is the computationally most intensive aspect of the algorithm and for reduction of the associated computational burden a response surface approximation will be suggested. A review of moving least squares response surface methodologies is initially presented in next section, and then the proposed framework for stochastic sampling is discussed in Section 4.
fˆ (θ) =
217
e
−
d cD
2k
−e
1−e
−
−
2k
2k 1 c
1 c
if d < D; 0 else
(8)
where c and k are free parameters to be selected for better efficiency. Common values suggested in the literature for these parameters are c = 0.4, k = 1 (Gaussian). Finally the minimization of (5) is a standard quadratic optimization problem and yields solution a{θ} = M−1 {θ}L{θ}F
(9)
218
A.A. Taflanidis, S.-H. Cheung / Probabilistic Engineering Mechanics 28 (2012) 216–224
(a) Exact function.
(b) Support points and corresponding function value.
(c) Global least squares approximation.
(d) Moving least squares approximation.
Fig. 1. Illustrative example for response surface approximation implementation.
NE
where M = B W{θ}B, T
T
L{θ} = B W{θ}.
(10)
Ultimately D in (8) should be selected so that M is invertible. Finally from (3) we have the approximation fˆ (θ) = bT (θ)M−1 {θ}L{θ}F.
SST =
SSE SST
NE
;
SSE =
(11)
f (θp ) − f¯ (θp )
p=1
NE
2
f (θp ) − fˆ (θp )
,
p=1
2
,
f¯ (θp ) =
NE
NE
100%.
(13)
|f (θp )|
p=1
This MLS approximation requires for each point θ evaluation not only of the basis vector b(θ) at that point but of the coefficients a{θ} as well – or equivalently matrices M−1 {θ} and L{θ} – through (9). Fig. 1 presents an illustrative example for a two dimensional problem; it includes a plot of the exact function f (θ)—part (a)-, the location of the support points and the value of f (θ) at these points—part (b)- and finally the global and MLS approximations using quadratic basis functions—parts (c) and (d), respectively. In this case a quadratic approximation provides a poor fit to f (θ) and so the global response surface cannot efficiently describe f (θ). On the other hand the MLS approach provides an accurate response surface approximation to f (θ). The fit of the response surface approximation may be judged by the coefficient of determination R2 , given by R2 = 1 −
AE =
|f (θp ) − fˆ (θp )|
p=1
(12) f (θp )
p=1
where θp ; p = 1, . . . , NE are the points for which the fit is evaluated. SSE is the sum of the squared error between the predictions and the actual value of the function and SST is the sum of the variability of the function with respect to its mean value. Ultimately R2 denotes the proportion of variation in the data that is explained by the response surface model. A larger value of R2 (i.e., a value closer to 1) indicates a good fit. Another measure for goodness of fit is the average percent error AE:
Smaller values (i.e., close to zero) for AE correspond to better fit in this case Another important aspect for the MLS response surface approximation is the selection of the exact location of the support points {θI ; I = 1, . . . , NS }, a task sometimes referred to as design of experiments in the relevant literature [9]. Common approaches for this selection are orthogonal arrays or small composite designs that aim to efficiently cover the entire space for vector θ. These approaches have been proven to be efficient when the space for θ is compact. More detailed discussions may be found in [9]. 4. Stochastic sampling with response surface methodologies 4.1. Outline of approach Moving least squares response surface methodologies may be applied for efficient approximation of the system model response and reduction of the computational burden associated with the stochastic sampling from distribution (2). The exact function f (θ) to be approximated may be the overall performance measure h(θ) or the system model output itself z(θ). For applications for which the transformation T {.} in (1) relating the performance function to the model output is simple, typical characteristic of many engineering problems, the approximated function should be each of the components of the model output vector z(θ). Otherwise h(θ) should be directly approximated. The final performance function h(θ) obtained by using a response surface methodology, no matter what stage of the evaluation of h(θ) is the approximation established, is denoted by hˆ (θ) herein.
A.A. Taflanidis, S.-H. Cheung / Probabilistic Engineering Mechanics 28 (2012) 216–224
The MLS response surface approximation discussed in the previous section is then directly applicable to the stochastic sampling problem. An important question that needs to be addressed for such an implementation is the selection of the support points. The common approaches for this task, such as orthogonal arrays, are not appropriate in this case. The space for the model parameters Θ may be non-compact, which imposes a great difficulty. Also, since ultimate goal is simulation of samples from π (θ) it is beneficial to exploit the design experiments for stochastic sampling as well, but more importantly it is critical that h(θ) is accurately approximated over the regions of importance for π (θ), which indicates that enough support points over these regions are required. Typical methodologies for selecting support points will probably correspond to inefficient proposal densities, thus will be unable to produce samples near such regions. To meet this requirement the MLS approximation will be integrated here within the stochastic sampling with a novel approach for selecting the support points. Initially Nd samples are obtained for θ from direct evaluation of the system model. These samples are denoted by θd and are ultimately simulated according to distribution π (θ). In this process, evaluation of f (θ) at Nt total support points {θt } becomes available. These points include the set {θd } but also any other points for which f (θ) has been evaluated depending on the sampling algorithm used, for example candidate samples that were rejected. Thus the selection of support points is dictated by the stochastic sampling algorithm selected. These points are ultimately distributed according to the proposal density adopted in the context of this algorithm. A MLS response surface is then generated for hˆ (θ) and Na additional samples are simulated, denoted by {θa }. For each of these samples f (θ) is approximated by (11). Thus the latter samples are simulated according to
π(θ) ˆ =
Θ
hˆ (θ)p(θ) hˆ (θ)p(θ)dθ
.
(14)
The selection of weights for the response surface approximation is discussed next. 4.2. Weight selection An important aspect for the efficiency of the fit provided by the MLS approximation is the selection of the weights vi for the distance measure in (7). These weights define the relative importance of each component of θ for selecting W{θ} and establishing the moving character of the interpolation over the support points. The weights should (i) establish a normalization for the different components of θ but more importantly (ii) provide higher weights for components that have larger influence on the stochastic performance. The first task may be easily established by a weight inversely proportional to the variance σi2 of each component according to the variance of the support points {θi ; i = 1, . . . , NS }; the scaled components have then unity variance. This is equivalent to a normalization of vector θ. For the second task components that have greater importance on the stochastic sampling process should be prioritized. This may be established by comparing the marginal target distributions π(θi ) to the marginal proposal distribution q(θi ), an idea initially introduced in [10] and motivated by the concept for sensitivity analysis first proposed in [5]. Bigger discrepancies between distributions π (θi ) and q(θi ) indicate greater sensitivity of the performance function to the specific model parameters and ultimately greater importance in affecting the stochastic sampling process. In this setting, a quantitative measure [10] that can describe the relative importance on the stochastic sampling process of the various model parameters is the relative information
219
entropy, which for any variable θi ∈ Θi expresses the difference between probability distributions π (θi ) and q(θi ) [11] and is given by:
π (θi ) π (θi ) log Dre (π (θi )∥q(θi )) = dθi . q(θi ) Θi
(15)
Note that the marginal distribution π (θi ) is given by
π (θi ) =
˜i Θ
π (θ)dθ˜ i =
1
Θ
h(θ)p(θ)dθ
˜i Θ
h(θ)p(θ)dθ˜ i
(16)
˜ i is the component of θ excluding θi , with a similar where θ˜ i ∈ Θ expression holding for q(θi ). An analytical expression is not readily available for this marginal distribution π (θi ) which makes estimation of the integral (15) challenging [17]. For scalar components though, as in this case, an estimate of the relative information entropy can be efficiently obtained based on samples from this marginal distribution π (θi ), which are readily available through the stochastic sampling process; they correspond to the samples from π (θ) projected to the space of component θi . The marginal PDF (16) may then be approximated using the information in the available samples by kernel density estimators [17,18]. This estimate will not necessarily have high accuracy, but it can still provide an adequate approximation for estimating the information entropy integral. A Gaussian kernel density estimator may be established for this purpose [19]. If n samples for θi are available with θi,j denoting the j-th sample and σsi is the standard deviation for these samples then the approximation for π (θi ) would be π˜ (θi ) =
1 nσki
√
n
2 π j =1
−
e
(θi −θi,j )2 2σ 2 ki
;
σki = 1.06 · n−1/5 σsi .
(17)
For establishing better consistency in the relative information entropy calculation q(θi ) may also be approximated (with π replaced by q and π˜ replaced by q˜ in (17)) using the samples {θt } even when an analytical expression is available for it. This way, any type of error introduced by the kernel density approximation is common for both of the densities compared. The approximation for the relative information entropy is then [17]
π (θi ) dθi q(θi ) −∞ bi,u π˜ (θi ) ≈ π˜ (θi ) log dθi q˜ (θi ) bi,l
Dre (π (θi )∥q(θi )) =
∞
π (θi ) log
(18)
where the last scalar integral can be numerically evaluated and [bi,l , bi,u ] is the region for which samples for π (θi ) and q(θi ) are available. For example, using the trapezoidal rule and ne points for the integration, (18) is estimated Dre (π (θi )∥q(θi )) =
e −1 1θi n
π˜ (θi,k ) 2 k=1 q˜ (θi,k ) π˜ (θi,k+1 ) + π˜ (θi,k+1 ) log q˜ (θi,k+1 ) π˜ (θi,k ) log
(19)
where θi,k = bl + k1θi and 1θi = (bi,u − bi,l )/ne . This kerneldensity based approximation is efficient for quantifying the importance for each scalar component θi to the stochastic sampling process. Finally the following weight selection for the MLS approach is suggested by establishing both a normalization of each component and a prioritization according to their relative importance in the stochastic sampling process
vi =
Dre (π (θi )∥q(θi ))
σi
.
(20)
220
A.A. Taflanidis, S.-H. Cheung / Probabilistic Engineering Mechanics 28 (2012) 216–224
5. Illustrative example We now illustrate the concepts discussed in this paper through an example, which considers the response of a 3 base-isolated story-building, shown in Fig. 3, during dynamic earthquake excitation. 5.1. System and excitation models
Fig. 2. Schematic of the stochastic sampling process using MLS response surface approximations.
4.3. Stochastic sampling with moving least squares response surface approximation The algorithm for stochastic sampling with MLS response surface approximation is finally: Initialization: Select number of total samples N, the number Nd of samples simulated from π (θ), the characteristics of the sampling algorithm used and the proposal density q(θ). Additionally choose for the response surfaces the shape of the basis functions and the support size D. Step I: Obtain in total Nd samples {θd } from direct simulation of the system model and by directly evaluating the model response at Nt points {θt } corresponding to the candidate samples. Step II: Based on the samples {θd } calculate the weights vi by (20), with the kernel density approximation (17) for π (θi ) and q(θi ) corresponding to the distribution for {θt }. Step III: Then use the available support points {θt } for MLS response surface approximation to the system performance hˆ (θ). ˆ Step IV: Simulate additional N − Nd samples {θa } from π (θ) in (14). The combination of the samples generated in Steps I and IV ultimately correspond to the requested samples from the target density. Fig. 2 presents a schematic of this stochastic sampling process. Note that the response surface approximation can be built either in the initial space for the model parameters θ or to any other transformed space. This could be established through a transformation of the probability space, for example transformation to the standard Gaussian space, or by implementation of dimensional analysis [20,21]. If the variability of the response is reduced in the transformed space then the efficiency of the response surface approximation is expected to significantly improve [20]. 4.4. Evaluation of efficiency The accuracy of this MLS approximation may be judged by the traditional measures, such as the coefficient of determination (12) or the average error (13) for f (θ). Since the aim of the stochastic sampling process is the generation of samples from π (θ), a measure that directly compares π (θ) and πˆ (θ) is more relevant. The relative information entropy may be used for this purpose D(π(θ)∥π ˆ (θ)), quantifying the difference between probability distributions π (θ) and πˆ (θ):
πˆ (θ) dθ. D(π(θ)∥π ˆ (θ)) = πˆ (θ) log π (θ) Θ
(21)
Evaluation of this multidimensional entropy integral is not straightforward [6]. Appendix B discusses an approach based on stochastic simulation.
The superstructure is modeled as a planar shear building with uncertain inter-story stiffness and uncertain classical modal damping. The lumped mass of the top story is 636 ton while it is 817 ton for the bottom two. The inter-story stiffnesses ki of all the stories are parameterized by ki = θk,i km,i , i = 1, 2, 3, where [km,i ] = [633.9, 526.1, 319.5] kN/mm are the most probable values for the inter-story stiffness, and θk,i are non-dimensional uncertain parameters, assumed to be correlated Gaussian variables with mean value one and covariance matrix 6ij = (0.15)2 exp[−(i − j)2 /22 ], that roughly implies significant correlation between inter-story stiffnesses within two stories apart and a coefficient of variation (c.o.v) of 15%. The damping ratios ζi for each of the three vibration modes are assumed to be independent log-normal variables with median value 5% and coefficient of variation 20%. The base-isolation system consists of lead–rubber bearings. The hysteretic behavior of the isolators and the isolator forces are characterized by a Bouc–Wen type of model, U y η˙ = αis x˙ b − η2 (γis sgn(˙xb η) + βis )˙xb fis = αp ke xb + ke (1 − αp )U y η
(22)
where xb is the displacement of the base of the structure; fis is the isolator force; η is a dimensionless hysteretic variable; U y is the yield isolator displacement; sgn the sign function; ke is the pre-yield stiffness of the isolators; ap is the post-yield to pre-yield stiffness ratio; and αis , βis , and γis are dimensionless quantities that characterize the properties of the hysteretic behavior. Typical values for these parameters are chosen, αis = 1, βis = 0.1, and γis = 0.9 [22]. The mass of the base is 999 ton and the yield displacement of the isolators characteristics U y = 2.4 cm. The preyield stiffness is modeled as a Gaussian variable with mean ke = 138.26 MN/m and coefficient of variation 15% and the post-yield to pre-yield stiffness ratio as a lognormal variable with median 0.15 and coefficient of variation 20%. The choices correspond to a natural period of 2.5 s based on the nominal value for the post-yield stiffness. The passive viscous dampers forces, fd , are modeled by fd = cd sgn(˙xb ) |˙xb |ad
(23)
where cd is the damping coefficient and ad the damper exponent parameter. Two cases are considered for the damper configuration, the first one has cd = 0 (i.e., no dampers) and the second uses cd = 6 kN/mm s and ad = 0.8. These configurations are denoted by C1 and C2 , respectively. They ultimately represent different degrees of variability for the model output, and thus will enable investigation of the efficiency of the MLS fit under different conditions. A sinusoidal input that resembles the pulse component [23] of near-fault earthquakes is selected as dynamic excitation for the system. The velocity time history for the pulse is: V (t ) =
Ap 2
1 + cos
t ∈ 0, γp /fp
2π fp
γp
(t − to ) cos 2π fp (t − to ) + νp
= 0 otherwise;
to = γp / 2fp
(24)
where Ap , fp , νp , γp and to describe the signal amplitude, prevailing frequency, phase angle, oscillatory character (i.e., number of half
A.A. Taflanidis, S.-H. Cheung / Probabilistic Engineering Mechanics 28 (2012) 216–224
221
Fig. 3. Base-isolated structure model.
cycles) and time shift, respectively. The amplitude is selected as 100 cm/s. The rest of the pulse characteristics are assumed to be uncertain parameters with probability models lognormal with median 0.7 Hz and coefficient of variation 0.4 for fp , Gaussian with mean 3 and standard deviation 0.3 for γp , and uniform in [0, π] for vp . The vector of model parameters is θ = [γp fp vp ke ap ζi θk,i ; i = 1, 2, 3]T . A transformed model parameter space is also considered by converting all Gaussian and log-normal distributions to the standard Gaussian space. The resultant vector of model parameters θn will be denoted herein as transformed model parameter vector. Finally, a simplified problem is additionally formulated where uncertainty is only included in the excitation model parameters and the parameters of the base isolation system θ = [γp fp vp ke ap ] while the nominal values, i.e., most probable, are assumed for the model parameters of the superstructure. The full problem with nθ = 11 model parameters is denoted FP and the simplified with nθ = 5 model parameters SP. The performance function for the structure is the summation of the fragilities for (i) the superstructure drift, the (ii) base displacement and (iii) the absolute base acceleration. These fragilities are assumed to have a lognormal distribution with median µ and coefficient of variation β , so the transformation in (1) is h(θ) =
1 3
Φg
+ Φg
ln(zd ) − ln(µd )
βd ln(za ) − ln(µa )
βa
+ Φg
ln(zb ) − ln(µb )
βb
(25)
where Φg is the standard Gaussian cumulative distribution function, zd the maximum absolute inter-story drift, zb the maximum absolute base displacement and za the maximum absolute base acceleration. The characteristics for the median of the fragility curves are µd = 2 cm, µb = 40 cm and µα = 0.6g, whereas the coefficient of variation is set for all of them equal to βd = βb = βα = 0.4. Since the transformation function between model response and performance function (25) is simple, the MLS approximation is established only for the system response, in particular for ln(zd ), ln(zb ) and ln(za ) that are directly involved in the evaluation of h(θ). 5.2. Results and discussion The stochastic sampling framework discussed in Section 4.3 is used to generate in total N = 3000 samples from the target
distribution. Initially Nd = 80 samples are simulated from direct evaluation of the actual system model using the Accept–Reject algorithm discussed in Appendix A with selection for proposal density the prior probability model p(θ). A response surface is then built with full quadratic basis functions, as in (4), and choice of weights (8) with k = 1 and c = 0.4 and D adaptively chosen so that to include 2 · NB points. The weights for the distance (7) are chosen by the procedure described in Section 4.2. Additional samples are then generated using the response surface approximation to the model response by either the Accept–Reject algorithm or the Metropolis–Hastings algorithm, also discussed in Appendix A. For simplicity of the comparisons and discussion the proposal density in both instances is selected as the prior probability model p(θ). The sampling process is repeated 10 times for each algorithm and the efficiency of the proposed approach is evaluated by examining the accuracy of the approximation. Both the coefficient of determination R2 in Eq. (12) and the average error AE in Eq. (13) are used as measures to evaluate the goodness of fit. The comparison is performed with respect to (i) all candidate samples as well as with respect only to (ii) samples from the target distribution. For the former case the points θp in (12) and (13) for the fit evaluation are samples from the proposal density p(θ) and for the latter case samples from π (θ). Since for the simulation of samples the performance measure h(θ) is used, this measure is selected as the basis for the fit comparison. Thus f (θ) in Eqs. (12) and (13) corresponds to h(θ) and not to the model response (for which the response surface approximation is actually established). Also the relative entropy D(πˆ (θ)∥π (θ)) in Eq. (21), calculated by the second approach discussed in Appendix B [i.e., Eq. (35)], is used for the overall evaluation of the efficiency of the response surface approximation with respect to the stochastic sampling. This is the most important comparison since it is directly related to the accuracy of the stochastic sampling process and it measures this accuracy over the entire space Θ . Results are presented for both the full and the simplified problem and for both damper configurations in Tables 1 and 2 for the fit based on R2 and AE and in Table 3 for the fit based on D(πˆ (θ)∥π (θ)). Table 1 presents the results when the response surfaces are formulated in the original model parameter formulation, and Table 2 in the transformed (standard Gaussian) model parameter space. Table 3 presents all results for the comparison based on D(πˆ (θ)∥π (θ)). Three different cases are considered (a) a global response surface approximation (no weight penalization in Eq. (5)), (b) MLS without considering the entropy term for selecting the normalization of the weights in Eq. (20), and (c) the full MLS approach suggested.
222
A.A. Taflanidis, S.-H. Cheung / Probabilistic Engineering Mechanics 28 (2012) 216–224
Table 1 Fit for the response surface approximation for original model parameter space based on R2 and AE. Problem and damper case
Global least squares
θp ∼ p(θ)
MLS without entropy
θp ∼ π(θ)
θp ∼ p(θ)
MLS with entropy
θp ∼ π(θ)
θp ∼ p(θ)
R2 (%)
AE (%)
R2 (%)
AE (%)
R2 (%)
AE (%)
R2 (%)
AE (%)
R2 (%)
SP
C1 C2
68.9 46.4
28.8 41.3
51.5 29.8
21.7 37.8
76.5 72.4
21.9 15.4
49.4 63.4
18.4 15.9
97.9 95.6
FP
C1 C2
57.4 31.9
35.1 47.2
28.4 23.8
34.8 41.2
52.3 27.1
38.8 45.1
22.4 17.1
32.2 38.8
95.1 93.1
θp ∼ π(θ) AE (%) 6.72 5.61 10.9 12.5
R2 (%)
AE (%)
96.6 91.9
4.66 6.01
90.2 88.4
8.61 11.9
Table 2 Fit for the response surface approximation for transformed model parameter space based on R2 and AE. Problem and damper case
Global least squares
θp ∼ p(θ)
MLS without entropy
θp ∼ π(θ)
θp ∼ p(θ)
MLS with entropy
θp ∼ π(θ)
θp ∼ p(θ)
θp ∼ π(θ)
R2 (%)
AE (%)
R2 (%)
AE (%)
R2 (%)
AE (%)
R2 (%)
AE (%)
R2 (%)
SP
C1 C2
79.8 75.3
25.6 25.8
64.2 59.7
19.9 26.2
91.2 7.5
12.9 9.8
80.5 86.6
10.6 9.20
98.1 96.8
6.42 5.51
97.1 94.7
4.50 5.44
FP
C1 C2
73.2 71.9
29.7 30.6
48.7 49.4
22.9 28.8
72.9 67.3
33.1 34.3
44.3 47.2
24.8 30.2
96.1 94.5
9.91 11.9
93.6 90.5
7.62 10.4
AE (%)
R2 (%)
AE (%)
Table 3 Relative entropy between approximate and actual target probability distributions. Problem and damper case
Global least squares D(π(θ)∥π(θ)) ˆ
MLS without entropy D(π(θ)∥π(θ)) ˆ
MLS with entropy D(π(θ)∥π(θ)) ˆ
Original
Transformed
Original
Transformed
Original
Transformed
SP
C1 C2
0.2291 0.2043
0.1397 0.0551
0.1051 0.0441
0.0402 0.0243
0.0145 0.0071
0.0133 0.0062
FP
C1 C2
0.2811 0.3092
0.1523 0.1072
0.2986 0.221
0.1464 0.1147
0.0257 0.0181
0.0202 0.0125
Comparison first between the global least squares and the MLS approximations verifies the pattern discussed earlier; the MLS approach provides a better fit when appropriately tuned (using the entropy information for our case). This is attributed to the poor approximation provided for this application by a global quadratic function over the whole domain for θ. Also the fit is much better, as anticipated, for the simplified problem SP that has fewer model parameters nθ = 5, contrary to the full problem FP for which nθ = 11. These characteristics hold for both the approximation to h(θ) (judged based on R2 and the AE) as well as for the approximation to the density π (θ) (judged based on D(πˆ (θ)∥π (θ))), which corresponds of course to the more important measure for the efficiency of the response surface approach. The characteristics for the fit to h(θ) do change depending on the basis of the evaluation: candidate samples, θp ∼ p(θ), or samples from the target distribution, θp ∼ π (θ). In the latter case the AE is generally improved but the coefficient of determination becomes smaller. Of course the accuracy for the response is ultimately more appropriately judged by D(πˆ (θ)∥π (θ)). Comparing the two damper configurations, the efficiency of the MLS approximation significantly improves for the C2 case, which indicates that the approximation can efficiently take advantage of the reduction of the variability of the response. The same comment does not extend to the global response surface approximation. The most important comparison, though, is between the MLS approaches with or without the entropy term for the weights in the distance measure (20). When the weights are selected considering the relative contribution of each component of θ to the stochastic performance (MLS with entropy case), the goodness of fit of MLS greatly improves with respect to both the average error and the coefficient of determination. The difference between π(θ) ˆ and π (θ), measured by D(πˆ (θ)∥π (θ)), is also significantly
reduced. The improvement is much larger for the FP problem since it is more important to prioritize between the different components of θ when the number of those components is larger. The fit provided by the MLS approach without the entropy component is in this case quite poor and even worse than the global response surface approximation. It is important to stress that the MLS approach with the entropy weights provides a very good approximation for all cases considered and under all different measures for evaluating the accuracy, which indicates strong robustness properties. All these characteristics indicate the potential efficiency of the proposed scheme for the MLS approximation to the model response within the stochastic sampling process. It should be pointed out that for all approaches there is an improvement when the response surface is built on the transformed model parameter space rather than the original one (compare Table 2 to Table 1, or adjacent columns in Table 3). The normalization of the model parameter space facilitates, as one should expect, a better fit. The associated improvement is significant for the global least squares approach or the MLS without the entropy weight. For the MLS approach with the entropy weight the improvement is smaller which further testifies to the robustness of the suggested methodology; it performs well regardless of whether transformed model parameters are considered. 6. Conclusions The stochastic sampling from target probability distributions which are related to the response of some engineering system model was discussed. This sampling process requires repeated evaluation of the model response, for each candidate sample generated from the chosen proposal density, task which for
A.A. Taflanidis, S.-H. Cheung / Probabilistic Engineering Mechanics 28 (2012) 216–224
complex models represents the most computationally expensive component for any sampling algorithm selected. Implementation of surrogate modeling, in particular moving least squares (MLS) response surface methodologies, was suggested for efficient approximation of the system model response and reduction of the computational burden associated with the stochastic sampling. The selection of appropriate weights for the MLS approximation was discussed using information provided through the available samples for performing a sensitivity analysis with respect to the system model parameters. The relative information entropy between the target and proposal marginal densities for each model parameter was suggested for this purpose and an approach based on kernel density estimation was discussed for approximating these scalar densities. A novel measure was also introduced for evaluating the accuracy of the response surface methodologies for the stochastic sampling. The relative entropy between the exact (obtained by direct evaluation of the system model) and approximate (obtained by approximation of the model response) probability density functions was selected as such a measure. Stochastic simulation was considered for evaluation of the multidimensional integral corresponding to the entropy for this latter case. An example was presented that shows the efficiency of the proposed methodology that investigated the performance of a base-isolated structure against pulse type of dynamic excitation with uncertain characteristics. The fragility of the structure was chosen as the performance function involved in the target density formulation. In this example, the use of information from the sampling process to fine tune characteristics of MLS was shown to be highly efficient and greatly improve the accuracy of the response surface approximation. Appendix A. Stochastic sampling algorithms Two algorithms that can be used for simulating samples from π (θ) are discussed here. The first one is the Accept–Reject method which may be considered to belong to direct Monte Carlo approaches. First, choose an appropriate proposal PDF q(θ) and then generate independent samples as follows:
the target distribution are already available. If the initial samples are distributed according to π (θ), the Markov Chain generated in this way is always in its stationary state and all simulated samples follow the target distribution. The sampling efficiency of both these algorithms depends on ˜ k ). These PDFs should be chosen, the proposal PDFs q(θ) and q(θ|θ if possible, to closely resemble h(θ)p(θ) and still be easy to sample from. If the first feature is established then the efficiency of the algorithms is expected to be high [7,8]. In particular for the MCMC approach the proposal densities may establish a local random ˜ k ), i.e. depend on the previous sample, or be global walk q(θ|θ ˜ , i.e. independent of the previous sample. In any case densities q(θ) when samples are available from the target density π (θ), then the information in these samples may be used to formulate better proposal densities. For example a Gaussian or a Gaussian mixture may be selected with characteristics (mean and covariance) the ones estimated from the available samples. Appendix B. Multidimensional relative entropy evaluation Substituting for the densities πˆ (θ) and π (θ), the relative entropy D(πˆ (θ)∥π (θ)) in (21) simplifies to D(πˆ (θ)∥π (θ)) =
Θ
M · q(θc )
> u;
where M > max θ∈Θ
h(θ)p(θ) q(θ)
.
+ log
The second one is the Metropolis–Hastings algorithm, which is a Markov Chain Monte Carlo method and is expressed through the following iterative form: (1) Simulate candidate sample θ˜ k+1 from a proposal PDF q(θ˜ k+1 |θk ). (2) Compute acceptance ratio rk+1 =
h(θ˜ k+1 )p(θ˜ k+1 )q(θk |θ˜ k+1 ) h(θk )p(θk )q(θ˜ k+1 |θk )
.
(27)
(3) Simulate u from a uniform distribution with support [0,1] and set
˜ θk+1 = θk+1 θk
if rk+1 ≥ u otherwise.
(28)
In this case the samples are correlated (the next sample depends on the previous one) but follow the target distribution after a burn-in period, i.e. after the Markov Chain reaches stationarity. The algorithm is particularly efficient when samples that follow
hˆ (θ)
dθ
h(θ)
Θ
h(θ)p(θ)dθ
Θ
hˆ (θ)p(θ)dθ
.
(29)
Using Nk samples available from the proposal density θ ∼ q(θ), with θk denoting the kth sample, an estimate for the second component of the summation based on stochastic simulation (independently for the numerator and the denominator) is
log
Θ Θ
h(θ)p(θ)dθ
hˆ (θ)p(θ)dθ
≈ log
Nk
1 Nk
k =1 Nk
1 Nk
k =1
Nk
p(θ )
h(θk ) q(θk ) k
p(θk )
hˆ (θk ) q(θ ) k p(θk ) k q(θ ) k
h(θ ) k=1 = log N . k ˆh(θk ) p(θk ) q(θk )
(26)
(3) Return to (1) otherwise.
πˆ (θ) log
(1) Simulate candidate sample θc from q(θ) and u from a uniform distribution with support [0, 1]. (2) Accept θ = θc if h(θc )p(θc )
223
(30)
k=1
If N samples are available for θ ∼ πˆ (θ), with θi denoting the ith sample, then an estimate for the integral in expression (29) may be also obtained using stochastic simulation as
Θ
πˆ (θ) log
hˆ (θ) h(θ)
dθ ≈
N 1
N i=1
log
hˆ (θi ) h(θi )
.
(31)
Combining these two results, leads to the following approximation for D(πˆ (θ)∥π (θ)) D(πˆ (θ)∥π (θ)) ≈
N 1
N i=1
log
Nk
hˆ (θi )
h(θi ) p(θk ) k q(θ ) k
h(θ ) k=1 + log N k p(θ ) hˆ (θk ) q(θk ) k
(32)
k=1
with θi ∼ πˆ (θ), and θk ∼ q(θ). This approach requires evaluation of the performance measure h(.) for both set of samples, {θi } and {θk }. To reduce this
224
A.A. Taflanidis, S.-H. Cheung / Probabilistic Engineering Mechanics 28 (2012) 216–224
computational cost, the set {θk } ∼ q(θ) could be used for the estimation of the first integral in (29) as well. This is established by rewriting this integral as
hˆ (θ) πˆ (θ) π(θ) ˆ log log dθ = q(θ)dθ h(θ) h(θ) Θ Θ q(θ) hˆ (θ)p(θ) hˆ (θ) log h(θ) q(θ)dθ Θ q(θ) = hˆ (θ)p(θ) q(θ)dθ Θ q(θ) hˆ (θ)
(33)
where the expression for πˆ (θ) was plugged in to derive the last equation. This may be estimated then using stochastic simulation as
Θ
π(θ) ˆ log
hˆ (θ)
Nk
h(θ)
k=1
dθ ≈
p(θ )
hˆ (θk ) q(θk ) log
ˆ
h(θk ) h(θk )
k
Nk k=1
.
(34)
p(θ )
hˆ (θk ) q(θk ) k
In this case we end up with the following alternative approximation for D(πˆ (θ)∥π (θ)) Nk
D(π(θ)∥π ˆ (θ)) ≈
k=1
p(θ )
hˆ (θk ) q(θk ) log
ˆ
h(θk ) h(θk )
k
Nk
p(θ )
k=1
hˆ (θk ) q(θk )
Nk
k=1 + log N k k=1
k
p(θ )
h(θk ) q(θk ) k
p(θk )
(35)
hˆ (θk ) q(θ ) k
with θk ∼ q(θ). References [1] Beck JL, Au SK. Bayesian updating of structural models and reliability using Markov chain Monte Carlo simulation. Journal of Engineering Mechanics 2002; 128:380–91. [2] Cheung SH, Beck JL. Bayesian model updating using hybrid Monte Carlo simulation with application to structural dynamic models with many uncertain parameters. Journal of Engineering Mechanics 2009;135:243–55. [3] Taflanidis AA, Beck JL. Stochastic subset optimization for reliability optimization and sensitivity analysis in system design. Computers & Structures 2009; 87:318–31.
[4] Taflanidis AA, Beck JL. An efficient framework for optimal robust stochastic system design using stochastic simulation. Computer Methods in Applied Mechanics and Engineering 2008;198:88–101. [5] Au SK, Beck JL. Subset simulation and its applications to seismic risk based on dynamic analysis. Journal of Engineering Mechanics, ASCE 2003;129: 901–917. [6] Cheung SH, Beck JL. Calculating the posterior probability for Bayesian model class assessment and averaging by using posterior samples based on dynamic system data. Computer-Aided Civil and Infrastructure Engineering 2010;25: 304–21. [7] Robert CP, Casella G. Monte Carlo statistical methods. 2nd ed. New York, NY: Springer; 2004. [8] Berg BA. Markov chain Monte Carlo simulations and their statistical analysis. Singapore: World Scientific Publishing; 2004. [9] Myers RH, Montgomery DC. Response surface methodology, 2nd ed., New York (NY), 2002. [10] Taflanidis AA. Stochastic subset optimization with response surfacemethodologies for stochastic design. In: 1st International conference on soft computing technology in civil. Structural and environmental engineering. 2009. [11] Jaynes ET. Probability theory: the logic of science. Cambridge, UK: Cambridge University Press; 2003. [12] Cheung SH, Beck JL. New Bayesian updating methodology for model validation and robust predictions of a target system based on hierarchical subsystem tests. Earthquake Engineering Research Laboratory Report EERL 2008-04. California Institute of Technology, Pasadena, CA. [13] Gavin HP, Yau SC. High-order limit state functions in the response surface method for structural reliability analysis. Structural Safety 2007;30: 162–179. [14] Breitkopf P, Naceur H, Rassineux A, Villon P. Moving least squares response surface approximation: formulation and metal forming applications. Computers & Structures 2005;83:1411–28. [15] Youn BD, Choi KK. A new response surface methodology for reliability-based design optimization. Computers & Structures 2004;82:241–56. [16] Bucher C, Most T. A comparison of approximate response functions in structural reliability analysis. Probabilistic Engineering Mechanics 2008;23: 154–63. [17] Beirlant J, Dudewicz EJ, Gyorfi L, Van der Meulen EC. Nonparametric entropy estimation: an overview. The International Journal of Mathematical and Statistical Sciences 1997;6:17–40. [18] Mokkadem A. Estimation of the entropy and information for absolutely continuous random variables. IEEE Transactions on Information Theory 1989; 35:193–6. [19] Martinez WL, Martinez AR. Computational statistics handbook with MATLAB. Boca Raton (FL): Chapman & Hall, CRC; 2002. [20] Irish J, Resio D, Cialone M. A surge response function approach to coastal hazard assessment. Part 2: quantification of spatial attributes of response functions. Natural Hazards 2009;51:183–205. [21] Dimitrakopoulos E, Makris N, Kappos AJ. Dimensional analysis of the earthquake-induced pounding between adjacent structures. Earthquake Engineering & Structural Dynamics 2009;38:867–86. [22] Taflanidis AA. Robust stochastic design of viscous dampers for base isolation applications. In: M. Papadrakakis, N.D. Lagaros, M. Fragiadakis (Eds.), Computational methods in structural dynamics and earthquake engineering. Rhodes, Greece. 2009. [23] Mavroeidis GP, Papageorgiou AP. A mathematical representation of near-fault ground motions. Bulletin of the Seismological Society of America 2003;93: 1099–131.