Accepted Manuscript
System reliability under prescribed marginals and correlations: Are we correct about the effect of correlations? Fan Wang , Heng Li PII: DOI: Reference:
S0951-8320(17)30679-8 10.1016/j.ress.2017.12.018 RESS 6038
To appear in:
Reliability Engineering and System Safety
Received date: Revised date: Accepted date:
8 June 2017 17 November 2017 28 December 2017
Please cite this article as: Fan Wang , Heng Li , System reliability under prescribed marginals and correlations: Are we correct about the effect of correlations?, Reliability Engineering and System Safety (2017), doi: 10.1016/j.ress.2017.12.018
This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
ACCEPTED MANUSCRIPT
Highlights
AC
CE
PT
ED
M
AN US
CR IP T
Probability distribution model is built on prescribed marginals and correlations A sequential search strategy (S3) is proposed to retrieve pair-copula parameters Both Gaussian and Non-Gaussian dependency can be incorporated into system reliability Reliability can be evaluated using qualitative dependency and quantitative statistics
ACCEPTED MANUSCRIPT
Title page
System reliability under prescribed marginals and correlations:
Fan Wanga,b,c, Heng Lia
a
Department of Building and Real Estate, The Hong Kong Polytechnic University,
AN US
Kowloon, Hong Kong
b
CR IP T
Are we correct about the effect of correlations?
Department of Civil Engineering and Mechanics, Huazhong University of Science and Technology, Wuhan, 430074, PR China
Department of Resource and Civil Engineering, Wuhan Institute of Technology,
M
c
ED
Wuhan, 430073, PR China
[email protected] (H. Li)
AC
CE
PT
[email protected] (F. Wang)
1
ACCEPTED MANUSCRIPT
System reliability under prescribed marginals and correlations: Are we correct about the effect of correlations?
Abstract
CR IP T
Many reliability problems involve correlated random variables. However, the probabilistic specification of random variables is commonly given in terms of marginals and correlations, which is actually incomplete because the data dependency needed for distribution modeling is not characterized. The implicitly assumed
AN US
Gaussian dependence structure is not necessarily true and may bias the reliability result. To investigate the effect of correlations on system reliability under non-Gaussian dependence structures, a general approach to the probability distribution model construction based on the pair-copula decomposition is proposed. Numerical examples have highlighted the importance of dependence modeling in
M
system reliability since large deviation in failure probabilities under different
ED
dependencies is observed. The method for identifying the best fit data dependency from data is later provided and illustrated with a retaining wall. It is demonstrated that
PT
the reliability result can be accurately estimated if the qualitative dependence
CE
structure is complemented to the available quantitative statistical information.
Keywords:
AC
System reliability; Marginals and correlations; Transformation method; Dependence structure; Probability distribution model; Pair-copula decomposition
-1-
ACCEPTED MANUSCRIPT
1 Introduction Engineering systems are associated with various uncertainties including variable material properties and random environmental conditions, which necessitate the use of reliability concept for engineering design within the constraint of economy [1]. A critical step to this target is to capture the joint probability distribution of the uncertain
CR IP T
parameters. Unfortunately, due to the high cost or the limited conditions for in situ experiments, the data needed to characterize the joint probability distribution is frequently insufficient [2]. In contrast, the univariate data analysis can be readily performed to obtain the marginal distribution of each parameter. If the random variables representing the parameters are independent, the joint probability
AN US
distribution will be a simple multiplication of its marginal distributions. However, correlations may exist between different stochastic parameters or in the variance of a stochastic parameter over time or space [36]. Many articles and books have illustrated how to consider the correlations in reliability analysis (e.g., [7–8]). The basic idea is
M
to transform the correlated random variables into uncorrelated standard ones to simplify the subsequent reliability evaluation. When the correlated random variables
ED
are normal, the orthogonal transformation can be used for this goal. If the correlated non-normals are involved, the Nataf transformation [9] is commonly adopted.
PT
It is known that the marginals and correlations can be readily derived from the joint probability distribution; however, the reverse of the process is difficult because
CE
the probabilistic description in terms of the first two statistical moments (including the correlation or covariance matrix) is actually incomplete [9]. The missing information
AC
is the dependence structure that is usually characterized by a copula function [10–11]. From the copula perspective, the orthogonal and the Nataf transformation arbitrarily assumes a Gaussian dependence structure characterized by a Gaussian copula [12]. Although this arbitrary selection of distribution model is done quite often, recent investigation has shown that the data dependency is not always Gaussian [13–14], which implies that the reliability results based on the Gaussian dependence assumption could be biased. -2-
ACCEPTED MANUSCRIPT
In this paper, the system reliability problems involving correlated random variables are revisited. The aim is to investigate the effect of correlations between random variables on the system reliability under both Gaussian and non-Gaussian dependence structures, and tries to provide insights into uncertainty modeling in reliability analysis. To this end, this paper proposes a general method for constructing the multivariate distribution that is consistent with the given marginal distributions
CR IP T
and linear correlations. The pair-copula decomposed model is adopted to represent the multivariate distribution while the model parameters are retrieved from given probabilistic information via a sequential search strategy (S3). The constructed distribution model is flexible as the correlated random variables are not necessarily
AN US
Gaussian dependent. Then the reliability results under the same probabilistic description of input random variables but different dependence structures are compared, and the cause of large deviation in results is discussed. Finally, with a given dataset, we show how to choose an appropriate distribution model in reliability
ED
reliability-based design.
M
analysis and demonstrate the impact of inaccurate dependence modeling on
2 Reliability analysis considering correlations
PT
2.1 A brief review on the formulation of reliability
CE
Let x x1 ,, xn be a random vector representing uncertain parameters in a system and f x be its joint probability density function (PDF). In practice, an
AC
engineering system often has multiple components or subsystems. The failure probability of a system can be generally defined as:
P Esys
Esys
f x dx
(1)
where Esys denotes the failure event of the system. For a cut-set system, Esys can be expressed as: N
N
Esys Ck Ei k 1
k 1 iCk
-3-
(2)
ACCEPTED MANUSCRIPT
where Ck , k 1,, N is the kth cut-set event that the system fails if any cut-set event occurs; Ei Gi x 0 is the ith component failure event and Gi x defines its performance function. Note Eq. (2) is also valid for a link-set system, which is defined as the intersection of the unions of component failure events, if the complementary event of Esys (i.e., survival event of the link-set system) is of
CR IP T
interest.
Various methods can be used to approximate the system reliability (e.g., [15–17]). Generally, these methods are accurate and efficient; yet, it is usually assumed that the random variables are statistically independent. If the random variables are correlated,
AN US
they are commonly mapped into the independent standard normal variables u through a transformation of the joint probability distribution model of x , namely
u T x , and the reliability evaluation can be equivalently implemented in the standard normal space as:
M
P Esys
Esys
f u du
(3)
ED
where the component failure event in Eq. (2) is re-written as Ei Gi T 1 u 0
and T 1 is the inverse of the transformation T ; f u is the PDF of an
PT
independent standard multivariate normal distribution.
CE
Theoretically, f x is needed to perform the above nonlinear transformation. However, this requirement is not fulfilled at all times. Instead, the marginal
AC
distributions of x and the correlation matrix is commonly given, which motivates the use of copula theory [10] to bridge the gap between the incomplete probability information and a probability distribution model.
2.2 A copula perspective on dealing with correlations Copula theory allows the analysis of a joint probability distribution by investigating its dependence structure and marginal distributions respectively [10]. According to Sklar’s theorem, the joint bivariate cumulative distribution function -4-
ACCEPTED MANUSCRIPT
(CDF) can be formulated as:
F ( x1 , x2 ) C (u1 , u2 ; )
(4)
where u1 F1 x1 and u2 F2 x2 are the marginal CDFs of x1 and x2 respectively, C is a bivariate copula function with a parameter indicating the strength of dependence. A number of copula functions can be found in the literature
CR IP T
[10], and some widely used copulas are listed in Table 1. The corresponding joint bivariate PDF can be written as:
f ( x1 , x2 )
(5)
2C (u1 , u2 ; ) is a copula density function, f1 ( x1 ) and f 2 ( x2 ) u1u2
AN US
where c(u1 , u2 ; )
2C (u1 , u2 ; ) c(u1 , u2 ; ) f1 ( x1 ) f 2 ( x2 ) x1x2
are the marginal PDFs of x1 and x2 , respectively. Table 1. Examples of bivariate copulas
1
( (u1 ), (u2 ))
M
1
*
Frank
(e u1 1)(e u2 1) ln 1 e 1
Clayton
(u1 u2 1)1/
1 2 21 2 2 2 exp 2 2 2 1 1 e
PT
CE
1 u1u2
u1 u2 1 (1 u1 ) (1 u2 ) 1 *
u1 u2
1
2
(, ) \{0}
(u1 u2 1)21/
1 (1 u1 )(1 u2 )
1
1/
[1, 1]
(e 1)
(e 1) (e u1 1)(e u2 1)
[(1 u1 ) (1 u2 ) 1]21/
AC
CClayton
1
Range of
1
ED
Gaussian
c(u1 , u2 ; )
C (u1 , u2 ; )
Copula
is the bivariate standard normal distribution function with a correlation coefficient ,
1 1 u1 , 2 1 u2 , where 1 is the inverse function of the CDF of a univariate
standard normal distribution.
-5-
(0, ) (0, )
ACCEPTED MANUSCRIPT
(b) 3
2
2
1
1
0
0
-1
-1
-2
-2
-3 -3
-2
-1
0
1
2
-3 -3
3
-2
x1
(c)
2
2
1
1
x2
3
1
2
3
0
-1
1
2
3
(d)
0
AN US
x2
0
x1
3
-1
-2
-3 -3
-1
CR IP T
x2
x2
(a) 3
-2
-2
-1
0
1
2
x1
-3 -3
3
-2
-1
0
x1
M
Fig. 1 joint PDF isoline with the same standard normal marginal distributions, a correlation
ED
coefficient of 0.5, and the different copulas: (a) Gaussian copula; (b) Frank copula; (c) Clayton copula; and (d) CClayton copula
PT
Given a certain copula, the distribution model can be uniquely established by linking the copula parameter to the correlation coefficient. Fig. 1 shows the
CE
iso-density contours of f x1 , x2 when different copulas are adopted. It is obvious that the joint PDFs differ significantly despite that they share the same standard
AC
normal marginal distributions and correlation coefficient. Several studies have reported that the reliability can be differed significantly due to a different copula selection [11, 13, 14]. However, most probability distribution models considering non-Gaussian dependency are illustrated in two-dimensions [11, 13, 14]. If more than two random variables are mutually correlated, the above copula-based approach to the multivariate distribution modeling usually fails except for the elliptical copula (e.g., Gaussian copula). For the widely used n-variate Archimedean copulas (e.g., Frank, Clayton, and CClayton copulas), the maximum number of copula parameters is n-1 -6-
ACCEPTED MANUSCRIPT
while the n-variate probability distribution has n(n-1)/2 pair-wise correlation coefficients. Consequently, the number of copula parameters is less than the number of
correlation
coefficients
unless
in
two-dimensions.
This
inequality
in
multi-dimensions prohibits a one-to-one mapping between the copula parameters and the correlation matrix that is required to rebuild a probability distribution model from the prescribed marginals and correlations [18]. In other words, the simple extension of
CR IP T
bivariate copulas to its multivariate form generally has limited flexibility that cannot be used to build a distribution in consistency with arbitrary correlations. Unfortunately, there are many engineering systems involving incomplete probabilistic description of correlated multivariates. To provide insights into the effect of copulas on the system
AN US
reliability, it is necessary to develop a general approach to probability distribution model construction under prescribed marginals and correlations.
3 Construction of multivariate distribution model
M
3.1 Pair-copula decomposed model for multivariate distributions The joint PDF of x x1 ,, xn can be decomposed as:
ED
f x f x1 f x2 | x1 f xn | x1 ,, xn1
(6)
PT
wherein the marginal conditional PDFs in the general form of f ( x | v) can be
CE
expressed as [19]:
f x | v c F x | v j , F v j | v j ; f x | v j
(7)
AC
where v j is an arbitrarily excluded element from vector v and v j denotes the vector v after excluding v j ; c is a bivariate copula density function defined in Eq. (5); the subscript of indicates the conditioning variables to be drawn. Using Eq. (7) recursively, f (x) can be written as a product of bivariate copula density functions with marginal conditional CDFs in the form of F ( x | v) , where F ( x | v) can also be represented recursively [20]: -7-
ACCEPTED MANUSCRIPT
F x | v
C F x | v j , F v j | v j ; F v j | v j
(8)
If v is a univariate, Eq. (8) simplifies to:
F x2 | x1
C F1 x1 , F2 x2 ; F1 x1
h u2 , u1 ;
(9)
CR IP T
where h(u2 , u1; ) is the derivative of C (u1 , u2 ; ) with respect to u1 . Note the second variable in the h-function is always the conditioning variable (i.e.,
F x2 | x1 h(u2 , u1; ) and F x1 | x2 h(u1 , u2 ; ) ). For reference, the h-functions corresponding to the bivariate copulas in Table 1 are listed in Table 2.
AN US
Table 2 The h-functions of the example copulas
h(u2 , u1; )
Copula
Frank
e u1 (e u2 1) (e 1) (e u1 1)(e u2 1)
M
Gaussian
1 (u2 ) 1 (u1 ) 1 2
u1 1 (u1 u2 1)11/
CClayton
1 (1 u1 ) 1[(1 u1 ) (1 u2 ) 1]11/
PT
ED
Clayton
The n-dimensional joint PDF f (x) can thus be expressed in terms of a series of
CE
bivariate copulas (i.e., pair-copula decomposition). However, there are many different
AC
choices of v j during the decomposition of the joint PDF, leading to many different ways of representing f (x) . Bedford and Cooke [21–22] proposed a graphical tool known as vines to help categorize different pair-copula decompositions. The canonical vine (C-vine) and the drawable vine (D-vine) are the two major types of regular vines. The joint PDF f (x) corresponding to a C-vine is [19]: n
n 1 n j
k 1
j 1 i 1
f (x) f k ( xk ) c F ( x j | x1 , , x j 1 ), F ( x j i | x1 , , x j 1 ); j , j i|1,, j 1
-8-
(10)
ACCEPTED MANUSCRIPT
while f (x) corresponding to a D-vine is given as [19]: n
n 1 n j
k 1
j 1 i 1
f ( X) f k ( xk ) c F ( xi | xi 1 , , xi j 1 ), F ( xi j | xi 1 , , xi j 1 ); i ,i j|i 1,,i j 1 (11) where the subscript indices indicate the conditional random variables to be drawn. For an n-dimensional distribution, there are n!/2 different C-vines and n!/2
CR IP T
different D-vines. Each vine structure contains a specific set of pair-copula parameters that need to be estimated. The critical step to a general PDF construction thus lies in the retrieval of these parameters, which will be presented below.
3.2 A sequential search strategy for retrieving the pair-copula parameters from marginals and correlations
i , j
AN US
The pair-wise correlation coefficient, by definition, has the following expression: x i i i
xj j j
f ( xi , x j )dxi dx j
(12)
where i ( j ) and i ( j ) are respectively the mean and the standard deviation of
M
x1 ( x2 ); f x1 , x2 is the bi-dimensional marginal PDF which can be written as:
ED
f ( xi , x j )
f ( x) f (x i , j | xi , x j )
(13)
PT
where x i , j denotes the random vector excluding the i-th and j-th variables. By
CE
analogy with Eq. (6), the conditional PDF f (xi , j | xi , x j ) can also be expressed as a product of marginal conditional PDFs such as: (14)
AC
f ( xi , j | xi , x j ) f ( x1 | xi , x j ) f ( x2 | x1 , xi , x j ) f ( xn | xi , j ,n , xi , x j )
Therefore, both f (x) and f (xi , j | xi , x j ) in Eq. (13) can be expressed by a series of pair-copulas. However, in multi-dimensions, both f (x) and f (xi , j | xi , x j ) have more than one pair-copula-based expressions, which leads to various formulations of f ( xi , x j ) . Unfortunately, the pair-copula parameters cannot be directly determined from a single correlation coefficient using Eq. (12) except those associated to the unconditional pair-copulas. To retrieve the pair-copula parameters in -9-
ACCEPTED MANUSCRIPT
higher hierarchical levels, additional parameters irrelevant to the desired construction of f (x) have to be estimated, which can result in unacceptable computational cost [18]. Bedford and Cooke [22] use vines to demonstrate a one-to-one mapping relation between the pair-copula parameters and correlation coefficients. This mapping
CR IP T
relation is further visualized in Fig. 2, in which the pair-copula parameters associated to the edges in a C-vine structure are separately related to the pair-wise correlation coefficients in a correlation matrix. Moreover, the nodes at a particular tree level are only necessary for determining the edges in the next level. In other words, the conditional pair-copula parameters at a specific level are dependent on the lower-level
AN US
pair-copula parameters. Bearing this in mind, a sequential search strategy shown in Algorithm 1 is proposed to approximate the pair-copula parameters, where each pair-copula parameter is retrieved from its corresponding correlation coefficient by a simulation-based bisection search method shown in Algorithm 2. Appendix A
M
illustrates that the pair-copula parameter at higher levels can be uniquely determined
...
from the corresponding coefficient if the solution exists.
ED
i-1,i|1,…,i-2
PT
Tree i
i+j |1, …, i-1
θi,i+
j+1|1
,…,i-1
...
... i-1,i+j|1,…,i-2
Correlation matrix
CE
i ,i j i 1,i j
i-1,i+j+1|1,…,i-2
i ,i j 1 i 1,i j 1
i,i+1|1,…,i-1
Tree i+1
θi+
...
AC
θi,
1,i +
j|1 ,… ,i
θi+1
,i+j+ 1|1,…
,i
...
... i,i+j|1,…,i-1
i,i+j+1|1,…,i-1
Fig. 2 Illustration of the one-to-one relation in a C-vine structure.
- 10 -
ACCEPTED MANUSCRIPT
Algorithm 1 Sequential search strategy for retrieving pair-copula parameters Input: Vine structure: V correlation matrix: ρ Output: pair-copula parameters: Θ
CR IP T
Initialize Θ (e.g., Θ ← zero matrix) Case V=C-vine For i=1 to n
θi,j|1,2,…,i-1←ρi,j by Algorithm 2 End End Case V=D-vine
M
For i=1 to n-1
AN US
For j=i+1 to n
For j=1 to n-i
ED
θj,j+i|j+1,j+2,…,j+i-1←ρj,j+i by Algorithm 2 End
PT
End
Algorithm 2 Approximating pair-copula parameter from marginals and correlations
CE
Input:
A set of pair-copulas: C
AC
Marginal CDFs: Fd(xd) d=1,…,n Targeted correlation coefficient: ρij Search interval: ij|... ,ij|... Current pair-copula parameters: Θold Tolerance of approximation error: ε Sample population: m Output: Updated pair-copula parameters: Θnew - 11 -
ACCEPTED MANUSCRIPT
%Fix seed for generating random numbers While ij|... ij|... Θtrial ← replace ij|... in Θold by ijtrial |... (ij|... ij|... ) / 2 um×n←{V, C, Θtrial} % Sampling m copula random numbers from the vine xm×n←F-1(um×n) % obtain the random variables by the inverse of marginal CDFs
trial ←[x( : , i), x( : , j)] % obtain the i-th and j-th column of xm×n ij
CR IP T
If ijtrial ij
ij|... ijtrial |... Else
ij|... ijtrial |... End
AN US
End Θnew=Θtrial
4 MCS approach to system reliability
M
Due to the conceptual simplicity and algorithm robustness, the MCS is adopted
ED
in this paper for system reliability analysis. In general, the system failure probability
PT
can be approximated by:
Pf
nt
nf nt
I x i 1
F
i
(15)
nt
CE
where nf is the number of failure samples and nt is the total number of generated
AC
random variables; I F xi is the indicator function, i.e., I F xi 1 if the system failure event Esys shown in Eq. (2) occurs and I F xi 0 otherwise. As mentioned above, the generation of correlated normal or non-normal random
variables based on the orthogonal transformation or the Nataf transformation inherently adopts a Gaussian dependence structure. Therefore, strictly speaking, the Rosenblatt transformation U k 1 F ( xk | x1 , xk 1 ) , k 1,, n [23] should be used for transforming the uncorrelated random variables into the correlated ones. - 12 -
ACCEPTED MANUSCRIPT
The use of Rosenblatt transformation requires the joint probability distribution to be available. By rebuilding the distribution, the Rosenblatt transformation can be re-written in terms of pair-copulas:
w1 U1 F1 ( x1 ) w2 U 2 h F2 ( x2 ), F1 ( x1 );1,2
w3 U 3 h h F3 ( x3 ), F1 ( x1 );1,3 , h F2 ( x2 ), F1 ( x1 );12 ; 2,3|1
CR IP T
(16)
where w1 ,, wn are independent random variables uniformly distributed on [0,1] , and U1 ,,U n are independent standard Gaussian random variables. Therefore, the
AN US
random variables x1 ,, xn can be generated from U1 ,,U n or w1 ,, wn by applying the inverse of the Rosenblatt transformation. For example: x1 F1-1 ( w1 )
x2 F2-1 h 1 w2 , F1 ( x1 );1,2
(17)
M
x3 F3-1 h 1 h 1 w3 , h F2 ( x2 ), F1 ( x1 );12 ; 2,3|1 , F1 ( x1 );1,3
ED
where F 1 and h 1 are the inverse functions of marginal CDFs and h-functions, respectively. Note Eqs. (16) and (17) are true for a C-vine structured model, in which
PT
the conditioning variable is always excluded as:
(18)
CE
F x j | x1 ,, x j 1 h F x j | x1 ,, x j 2 , F x j 1 | x1 ,, x j 2 ; j 1, j|1,2 j 2
The Rosenblatt transformation corresponds to a D-vine structured model can be
AC
derived in a similar manner. Note the Nataf transformation is actually a special case of the Rosenblatt transformation if all pair-copulas are Gaussian [18].
5 Numerical examples Two examples are used to illustrate the proposed method for reliability under prescribed marginals and correlations. Gaussian and Non-Gaussian data dependencies are respectively assumed to examine the possible deviation in failure probability. For simplicity, we use the C-vine structured decomposed model with canonical - 13 -
ACCEPTED MANUSCRIPT
conditioning order to represent the multivariate probability distribution, as shown in Eq. (10). The MCS population is set to 106. 5.1 A five-variate example Consider the following series system which has five performance functions.
2 x3 x5 2 2 g 2 x x1 x3 x5 x4 2 g3 x 2 x1 x5 g 4 x 2 x3 x5 g5 x x1 x2 x5 x4
CR IP T
g1 x x2
(19)
parameters:
x1 ~ N 137.2, 20.58 ,
AN US
All the random variables conform to normal distribution with the following
x2 ~ N 98.0,14.70 ,
x3 ~ N 196.0, 29.40 ,
x4 ~ N 29.4, 4.41 , x5 ~ N 127.4,38.22 . When all the random variables are independent, the system failure probability is 1.5e-2 regardless of the dependence
M
structure. If all the pair-wise correlation coefficients are 0.5, the failure probability
ED
from the Nataf transformation is 6.6e-4; however, the failure probability can increase up to 2.6e-3 if all pair-copulas are Frank. Fig. 3 shows the PDFs and CDFs of the
PT
minimum response values of the series system corresponding to different copulas. Fig. 4 illustrates the conditional bivariate distribution of
x1 and x2 provided that the
CE
remaining variables are set to their mean values. The assumed dependence structure underlying the random variables obviously impacts the distribution of the uncertain
AC
parameters and thus the system reliability, even if the marginals and correlations remain the same.
- 14 -
ACCEPTED MANUSCRIPT
(a) 0.014
10 normal Frank
0.012
10
0.01 0.008
CDF
PDF
10
0.006
10
10
(b)
0
normal Frank
-1
-2
-3
-4
0.004
0 -50
0
50
100
150
200
10
250
-5
-6
-50
0
50
100
150
200
250
CR IP T
10
0.002
min(g1,g2,g3,g4,g5)
min(g1,g2,g3,g4,g5)
Fig. 3 (a) PDFs and (b) CDFs of the minimum response values of the series system under different copulas. 120 0.1
0. 1
110
0.3
0.1
0.1 0.2 0.2
0.1
x2
0. 2
0.3
100 95
0.1
0.2 0.2 0.3
0.2 0.3
105
0.1
0.1
0.1
AN US
normal Frank
115
3 0.
2 0.
M
90 85
0.2
0.3 .2 0
0.1 0. 1
80
ED
75 100
110
120
130
0.11 0.
140
150
160
170
x1
PT
Fig. 4 Contour of f x1 , x2 | x3 196.0, x4 29.4, x5 127.4 .
5.2 A sewer system example
CE
Consider a sewer system consisting of three pipes (denoted as ac, bc, and cd,
respectively) as shown in Fig. 5. The system performance is considered unsatisfactory
AC
if overflow occurs in any pipe. According to Ang and Tang [24], the performance function for each pipe is given as:
Gac 8.3Y 4.3542C1W1
(20a)
Gbc 8.3Y 4.3542C2W2
(20b)
Gcd 25.835Y 4.3542W3 C3 KC1 KC2
(20c)
where Y is proportional to the pipe flow capacity; C1 , C2 , C3 are respectively the - 15 -
ACCEPTED MANUSCRIPT
runoff coefficients at basin areas a, b, and c; W1 ,W2 ,W3 is proportional to the precipitation rate at each basin area; and K is an attenuation factor. The statistics of the random variables are summarized in Table 3 and they are all assumed to follow a normal distribution. Since basin areas a, b, and c are adjacent, it is reasonable to assume that the precipitation rates at the three different areas are positively correlated.
CR IP T
Generally, the value of correlation coefficient is inversely related to the relative distance between the areas (i.e., precipitation rates at two locations with shorter relative distance are more likely to have similar values). For simplicity, we assume the three correlation coefficients between W1 and W2 , W1 and W3 , W2 and W3 have
AN US
the same value and the other random variables are statistically independent.
b
c
d
ED
M
a
Fig. 5 The sewer system example.
AC
CE
PT
Table 3. Statistics of the eight random variables in the sewer system example. Variable
Distribution
COV
C1 C2 C3 W1 W2 W3 K Y
normal normal normal normal normal normal normal normal
0.825 0.825 0.9 8.59 8.59 8.59 0.9 7.01
0.07 0.07 0.05 0.233 0.233 0.233 0.05 0.123
The Gaussian copula, Clayton copula, and CClayton copula are respectively examined. Fig. 6 shows the effect of dependence assumption on the sewer system CClayton Gaussian Clayton Psys Psys failure probability. It can be observed that Psys and the failure
- 16 -
ACCEPTED MANUSCRIPT
probabilities are exactly the same only if W1 ,W2 ,W3 are fully correlated or uncorrelated. This consistency is expected because W1 ,W2 ,W3 are equivalent to a single random variable if they are fully correlated and the joint PDF of W1 ,W2 ,W3 is a simple multiplication of its marginals if they are independent, which leads to a unique joint PDF. However, for a medium strength of correlation between random
12
x 10
CR IP T
variables, the effect of dependence assumptions is not negligible. -3
Gaussian Clayton CClayton
AN US
P(Esys )
10
8
0
0.2
0.4
0.6
12=13=23
0.8
1
M
6
ED
Fig. 6 Effect of dependence structure on the sewer system reliability as the correlation coefficients increase from 0 to 1.
PT
The effect of dependence assumption is illustrated in a series system. It is of interest if the effect remains similar for other systems. Suppose the sewer system is
CE
defined satisfactory if water can be drained from basin area a to d or b to d without occurrence of overflow. Thus the sewer system becomes to a series-parallel combined
AC
system Fig. 7 shows the effect of dependence assumption on the system reliability and the CIMs of components, respectively. The system failure probability now increases as
the
correlation
coefficients
between
W1 ,W2 ,W3
increase
and
CClayton Gaussian Clayton Psys Psys Psys . However, the pattern of influence of dependence
assumptions on system reliability in the series-parallel combined system is basically the same to that in the series system.
- 17 -
ACCEPTED MANUSCRIPT
3
x 10
-3
Gaussian Clayton CClayton
P(Esys )
2.5
1.5
0
0.2
0.4
0.6
0.8
12=13=23
CR IP T
2
1
AN US
Fig. 7 Effect of dependence assumption on reliability by taking the sewer pipes as a series-parallel combined system.
6 A retaining wall case
The above examples have demonstrated the effect of pair-copulas on system reliability under prescribed marginal distributions and correlations. Therefore, how to
M
choose the appropriate copula function is of great value for an unbiased estimate of reliability. A retaining wall subjected to backfill soil pressure (Fig. 8) is investigated
ED
here for this purpose. The retaining wall system has three failure modes, which are
H
soil , csoil , soil
W2
Soil
L2 W1
AC
CE
PT
defined with respect to the factor of safety (FS) [24].
L1
Pa H0
O
b
a Foundation
Fig. 8 schematic diagram of the retaining wall.
(1) Sliding of the wall along its base - 18 -
ACCEPTED MANUSCRIPT
The FS against sliding failure mode is defined as:
FS1
a b W1 W2 tan 2 / 3base 1/ 3cbase Pa
(21)
and base respectively the cohesion and friction angle of foundation soil; where cbase
Pa is the active earth thrust calculated by:
CR IP T
W1 1/ 2 wall bH and W2 wall aH ; wall is the unit weight of the retaining wall;
2 1 2coh 2 Pa H K a 2cohH tan 2 4 2
(22)
where coh, , are respectively the cohesion, friction angle, and unit weight of the
AN US
backfill soil. a, b, and H are geometry parameters shown in Fig. 8. (2) Overturing of the wall with respect to its toe
The FS against overturning failure mode is defined as: M r W1L1 W2 L2 Mo Pa H 0
(23)
M
FS2
ED
1 1 2 a where L1 b , L2 b , and H 0 H 2coh tan . 3 3 2 4 2
(3) Ultimate bearing capacity of the foundation soil exceeded
PT
The FS against bearing capacity failure mode is defined as:
CE
FS3
qu qu 6e qmax W1 W2 1 ab ab
AC
where qu is the bearing capacity of the foundation and e
(24)
a b Mr Mo . 2 W1 W2
The retaining wall system fails if any failure mode occurs. The corresponding
performance functions are as follows:
G1 FS1 1
(25a)
G2 FS2 1
(25b)
G3 FS3 1
(25c)
- 19 -
ACCEPTED MANUSCRIPT
Here coh , , and are random variables. The other parameters are taken as
=100kPa, base =30°, qu =400kPa, a=0.5m, deterministic: wall =24kN/m3, cbase b=1.45m, H=6.0m. 42 samples of the soil properties, namely coh, , , are given as shown in
CR IP T
Appendix B. The correlation coefficients are coh, 0.458 , coh, 0.270 , and
, 0.033 . The trivariate distribution best fitted to the above 42 data samples can be identified by investigating its marginal distributions and the copula separately. The univariate data analysis results are given in Appendix B. The set of best fit
AN US
pair-copulas can be selected based on the minimum value of Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC), which are calculated as follows:
AIC 2L 2k
(26)
BIC 2L k ln M
M
(27)
ED
where k is the number of pair-copula parameters and M is the sample size; L is the pseudo-likelihood given by:
(m) (m) L coh, , coh, , , |coh log c scoh , s( m ) ; coh, log c scoh , s( m ) ; coh,
PT
M
m 1
M
m 1
log c h s , s ; coh, , h s , s ; coh, ; , |coh M
(m) coh
(m)
( m) coh
(28)
CE
m 1
(m)
where s( m) r( m) / M 1 , coh, , are the pseudo-observations and r( m )
AC
are the respective ranks associated to the m-th sample. Table 4. AIC and BIC values of different pair-copula decomposed models. Pairs of
Gaussian
Frank
Clayton
CClayton
variables
(AIC, BIC)
(AIC, BIC)
(AIC, BIC)
(AIC, BIC)
coh
-6.81, -5.07
-7.13, -5.39
2.00, 3.74
2.00, 3.74
coh
-0.98, 0.76
-2.75, -1.02
-3.77, -2.03
0.44, 2.17
1.77, 3.51
1.81, 3.54
1.71, 3.45
1.98, 3.72
Note: the underlined AIC and BIC values indicate the preferred pair-copula. The best fit - 20 -
ACCEPTED MANUSCRIPT
pair-copula in tree 2 ( data) is selected by adopting the preferred pair-copulas in tree 1 (i.e., Frank for coh data and Clayton for coh ).
Four bivariate copulas, namely the Gaussian, Frank, Clayton, and CClayton copula, are taken as the candidate pair-copulas. Table 4 summarizes the result of model selection. Note this identification of best fit pair-copulas is independent of the
CR IP T
univariate data analysis. The CDF of the trivariate distribution can thus be calculated by:
1 F coh, , C u1 , u2 , u3 C h u2 , u;coh, , h u3 , u;coh, ; , |coh du 0
u
(29)
where u1 Fcoh coh , u2 F , u3 F are the fitted marginal CDFs of the
AN US
soil properties. Fig. 9 compares the empirical bivariate CDF with the fitted bivariate CDF, which are generally in good agreement. Note the fitted bivariate CDF can be conveniently obtained from the trivariate distribution (e.g., F coh, C u1 , u2 ,1 ). (a)
M (kN/m3)
(°)
ED
PT 24
6 0.
19
0.3
16
0.2
34
14 12
39
4
9
14
CE
19
24
(c) 0.3
0.8
0.3
0.6
0.1
0.4
22
0.2
20
0.6 0.5 0.4
0.3 0. 2
0.1
18
0.7
0.5 0.4
(kN/m3)
9 0.
0.7
0.2
0.5
28
24
0.8
0.6
30
0.3 0.2
0.1
16
0.1
14 12
empirical best fit 15
20
(°)
25
30
Fig. 9 Comparison of the empirical and the best fit bivariate CDFs. - 21 -
0.1
0.1
empirical best fit
c (kPa)
26
0.3
0.2
0.1
29
0.4
0.3
0.2
c (kPa)
AC
7 0.
14
0.8
9
0.6
4
0.6 0.5
0.4
18
0.1
empirical best fit
0.8
0.7
0.5
0.4
12
20
0.3
0.1
14
0.5
0.2
16
0.4
22 0.1
0.1
0. 4
0.6
0.3
0.2
18
0.7
0.5
0.3
0.3
0.1
20
0.6
0.9
24 0.2
0.4
0.2
22
26
0.8
0.7
0.4
0.5
24
0.1
0.8
0.3
0.1
26
28
0.9
(b)
0.2
9 0.
30
0.6
0.4
28
0.7
0.2
0.5
30
29
34
39
ACCEPTED MANUSCRIPT
The data dependencies show no Gaussian characteristics. Table 5 presents the failure probabilities by different methods. The result from the best fit trivariate distribution can be taken as the actual failure probability. If marginals and correlations are given, the transformation method based on Gaussian dependence assumption apparently gives a biased estimate. In contrast, by adopting the qualitative best fit
Table 5. Failure probabilities approximated by different methods. Available information
Best fit from data
Pf Relative error (%)
0.1473 /
Marginal distributions and correlations 0.1598a 8.49
0.1487b 0.95
Assuming the Gaussian dependence structure.
b
Assuming the best fit non-Gaussian dependence structure.
AN US
a
10
CR IP T
copula functions, the failure probability from the transformation method is corrected.
0
best fit non-Gaussian Gaussian (Nataf)
-1
ED
p(Esys )
M
10
-2
=2.5 pf =0.0062
PT
10
CE
10
-3
1
2
3
4
=b/a
5
6
7
8
Fig. 10 System failure probability as a function of λ.
The dependence assumption can further impact the reliability-based system
AC
design. Suppose the normalized width =b / a of the retaining wall is a design parameter ( a 0.5m ). Fig. 10 shows the system failure probability with respect to . If the targeted system reliability is 2.5 ( p f 6.2e-3 ), the value that optimized towards this targeted reliability should be approximately 6.98 based on the best fit distribution. The value from the transformation method with the best fit non-Gaussian dependence structure is 7.12 (the actual failure probability is 6.0e-3), which is slightly biased towards the safety side. However, if the Nataf transformation - 22 -
ACCEPTED MANUSCRIPT is used, the optimized value is only 6.29 (the actual failure probability is 8.3e-3) and the targeted reliability is actually not reached.
7 Concluding remarks System reliability has gained extensive attentions in the past few decades. A
CR IP T
major challenge in system reliability analysis is how to deal with correlated random variables. The orthogonal transformation and the Nataf transformation are commonly used to deal with correlations in the reliability analysis; however, these transformations actually assume a Gaussian dependence structure for random variables. This study investigates the effect of the dependence assumption, an
AN US
unawarely ignored factor in the transformation method, on the system reliability, and tries to shed some light on reliability analysis considering correlations. To investigate the reliability under the assumption of non-Gaussian dependence structure, a general method for probability distribution reconstruction from prescribed marginals and
M
correlations is needed. The pair-copula decomposition approach with a sequential search strategy for retrieving the model parameters is proposed to construct various
ED
joint multivariate PDFs that can share the same marginals and correlations. The Rosenblatt transformation in terms of pair-copulas is presented for transforming
PT
non-Gaussian dependent random variables. The effect of correlations under different dependence structures are first shown
CE
in two numerical examples. Given the same marginal distributions and correlations, different multivariate distributions are demonstrated and the deviation in failure
AC
probabilities is non-trivial except that the random variables tend to be uncorrelated or fully correlated. The result from the Gaussian dependence structure can be either smaller or larger than the results from other dependence structures. This result runs counter to the intuition that the reliability is unique under specific marginal distributions and correlations. Note the result also has implications for time-dependent reliability, in which the time-variance is commonly realized by a set of correlated random variables. - 23 -
ACCEPTED MANUSCRIPT
Nevertheless, the actual system failure probability remains unique. It is just the incomplete probabilistic specification in terms of marginals and correlations that causes the deviation. The identification of the appropriate probability distribution model is thus critical to an unbiased estimate of system reliability. The retaining wall example has shown that the best fit pair-copulas can be determined independently, and the probabilistic result can be sufficiently accurate as long as the qualitative
CR IP T
pair-copulas are complemented to the quantitative statistical information regarding marginals and correlations. The widely used Nataf transformation, however, failed to provide an accurate estimate and the subsequent reliability-based design drifted towards the unconservative side because it adopts a Gaussian dependence structure
AN US
without proper validation.
Acknowledgments
This work is supported by the National Natural Science Foundation of China
M
(NSFC) under Grant No. 51608399 and the Research Grants Council of Hong Kong
ED
under Grant No. PolyU 152093/14E.
PT
Appendix A. Monotonicity of Gaussian pair-copula parameters at higher levels
CE
Fig. A1 illustrates the monotonicity of Gaussian pair-copula parameters with a trivariate example. As the correlation coefficients corresponding to the first level tree
AC
(i.e., 1,2 and 1,3 ) increases, the relation between 2,3 and 2,3|1 changes such that the interval of viable values of 2,3 becomes narrow. This variation is explained below. Suppose 1,2 =1,3 =0.5 and 2,3 0.5 , the corresponding unconditional Gaussian
pair-copula
2,3 0.5237
.
parameters
Thus
the
are
respectively
conditional
- 24 -
1,2 =1,3 =0.5076 ,
pair-copula
parameter
and is
ACCEPTED MANUSCRIPT
2,3|1
2,3 1,21,3 2 2 1 1,2 1 1,3
1.0525 , which obviously cannot be reached since the range
of Gaussian pair-copula parameters are bounded in 1, 1 . The reason is rooted in 1,2 1,3 in the equivalent trivariate Gaussian copula 1 2,3 2,3 1
1
that the correlation matrix
1,2 1,3
CR IP T
is not positive definite. 1 0.8 0.6 0.4
1,2=y if 1,2=x
0
or
1,3=y if 1,3=x
-0.2 -0.4
2,3|1=y if 2,3=x
-0.6
2,3|1=y if 2,3=x and 1,2= 1,3=0.2
-0.8
-0.8
-0.6
-0.4
2,3|1=y if 2,3=x and 1,2=1,3=0.5
-0.2
0 x
0.2
and
1,2=1,3=0.8
0.4
0.6
0.8
1
M
-1 -1
AN US
y
0.2
Fig. A1 C-vine structured Gaussian pair-copula parameters as a function of the correlation
ED
coefficient in a trivariate distribution. Lognormal marginals with mean of 40 and standard deviation of 10 are assumed for all three random variables. Black line indicates the variation of
PT
pair-copula parameters in tree 1. Blue dashed lines indicate the variation of pair-copula parameter in tree 2 if the pair-copula parameters in tree 1 are determined by the corresponding correlation
AC
CE
coefficients ρ1,2=ρ1,3=0.2, 0.5, or 0.8.
Appendix B. Best fit marginal distributions for the 42 data samples The 42 data samples of soil mass properties are listed in Table B1. The values of soil properties have a wide range because the samples are measured at different depths. Some samples at deep depth are weathered rock.
- 25 -
ACCEPTED MANUSCRIPT
Table B1. Statistics of the random variables used in the retaining wall example.
coh (kPa)
(°)
(kN/m3)
No.
coh (kPa)
(°)
(kN/m3)
1
11.3
24.37
16.9
22
14.4
19.33
27.0
2
9.8
24.98
23.3
23
16.2
20.51
20.0
3
9.0
23.34
17.3
24
5.1
22.65
17.9
4
20.3
30.48
24.5
25
15.0
19.33
19.8
5
9.0
27.98
17.3
26
19.2
18.07
27.5
6
12.8
27.98
18.9
27
11.1
17.50
16.4
7
6.8
28.59
25.9
28
11.9
23.34
19.2
8
25.5
21.62
16.5
29
10.5
19.33
18.5
9
26.6
17.11
17.2
30
1.9
19.16
20.0
10
27.0
18.98
23.0
31
11.4
16.91
14.7
11
14.4
27.61
25.2
32
10.8
23.87
21.5
12
2.0
26.22
13.0
13
5.7
18.62
14.3
14
18.5
22.50
22.7
15
15.6
20.68
20.7
16
22.2
15.06
22.5
17
21.6
20.51
25.3
18
11.6
24.12
19
19.8
15.27
20
31.5
21
29.4
AN US
CR IP T
No.
22.5
15.48
21.3
34
6.3
22.79
12.0
35
12.8
16.51
24.2
36
9.0
13.74
15.9
37
1.5
20.18
30.7
38
26.3
10.41
21.2
30.2
39
24.0
19.51
18.5
26.0
40
24.8
10.41
21.5
ED
M
33
13.51
29.8
41
9.8
15.48
18.4
16.11
17.6
42
39.0
16.11
24.7
PT
The best fit marginal distribution is selected based on the Kolmogorov–Smirnov
CE
statistic. Table B2 summarizes the test results and Fig. B1 shows the comparison of empirical CDFs of the soil properties with their respective best fitted CDFs.
AC
Table B2. Results of the Kolmogorov–Smirnov test for the marginal distributions (α=0.05). The best fit distribution is underlined. Soil property
coh
Distribution
Parameters
p-value
Test statistic
Normal
15.55, 8.58
0.46
0.1279
Lognormal
2.54, 0.73
0.26
0.1524
Gamma
2.60, 5.98
0.65
0.1103
GEV*
-0.09, 7.42, 11.85
0.88
0.0874
Weibull
17.47, 1.86
0.89
0.0865
- 26 -
Normal
20.15, 4.78
0.95
0.0772
Lognormal
2.97, 0.25
0.97
0.0708
Gamma
16.98, 1.19
0.99
0.0640
-0.27, 4.68, 18.43
0.98
0.0698
Weibull
22.03, 4.65
0.82
0.0937
Normal
20.93, 4.57
0.91
0.0833
Lognormal
3.02, 0.22
0.99
0.0642
Gamma
20.77, 1.01
CR IP T
ACCEPTED MANUSCRIPT
GEV
GEV
-0.20, 4.28, 19.16
Weibull
22.78, 4.95
* GEV: generalized extreme value (a) 0.9
0.0684
0.84
0.0915
empirical best fit
0.8
0.7
0.7 0.6
CDF
0.6
CDF
0.98
AN US
1
empirical best fit
0.8
0.5 0.4
0.5 0.4 0.3
0.2 0.1 0
5
10
15
20
25
30
M
0.3
0
0.0630
(b)
1 0.9
0.99
35
40
0.2 0.1
0 10
15
25
30
ED
c (kPa)
20
(°)
25
(c)
1
PT
0.9
empirical best fit
0.8 0.7
AC
CDF
CE
0.6 0.5 0.4 0.3 0.2 0.1 0
15
20
(kN/m3)
Fig. B1 Comparison of the empirical and the best fit CDFs for each soil property.
- 27 -
30
ACCEPTED MANUSCRIPT
References [1] Zio E. Reliability engineering: Old problems and new challenges. Reliab Eng Syst Saf 2009; 94(2): 125–41. [2] Beer M, Zhang Y, Quek ST, Phoon KK. Reliability analysis with scarce information: Comparing alternative approaches in a geotechnical engineering context. Struct Saf 2013; 41: 1– 10.
CR IP T
[3] Zhao W, Fan F, Wang W. Non-linear partial least squares response surface method for structural reliability analysis. Reliab Eng Syst Saf 2017; 161: 69–77. [4] Wang C, Zhang H. Roles of load temporal correlation and deterioration-load dependency in structural time-dependent reliability. Comput Struct 2018; 194: 48–59.
AN US
[5] Wang C, Zhang H, Li Q. Time-dependent reliability assessment of aging series systems subjected to non-stationary loads. Struct Infrastr Eng 2017; 13(12): 1513–1522. [6] Huang XC, Zhou XP, Ma W, Niu YW, Wang YT. Two-dimensional stability assessment of rock slopes based on random field. Int J Geomech 2016; 17(7): 04016155.
M
[7] Ditlevsen O, Madsen HO. Structural reliability methods. Chichester: Wiley; 2005.
ED
[8] Ang AH-S, Tang WH. Probability concepts in engineering: emphasis on applications to civil and environmental engineering. 2nd ed. New York: John Wiley and Sons; 2007.
PT
[9] Der Kiureghian A, Liu PL. Structural reliability under incomplete probability information. J Eng Mech 1986; 112(1): 85–104.
CE
[10] Nelsen RB. An introduction to copulas. 2nd ed. New York: Springer; 2006
AC
[11] Ferson S, Hajagos JG. Varying correlation coefficients can underestimate uncertainty in probabilistic models. Reliab Eng Syst Saf 2006; 91(10–11): 1461–1467. [12] Lebrun R, Dutfoy A. An innovating analysis of the Nataf transformation from the copula viewpoint. Probab Eng Mech 2009; 24(3): 312–20. [13] Tang XS, Li DQ, Rong G, Phoon KK, Zhou CB. Impact of copula selection on geotechnical reliability under incomplete probability information. Comput Geotech 2013; 49: 264–78. [14] Diermanse FLM, Geerse CPM. Correlation models in flood risk analysis. Reliab Eng Syst Saf 2012; 105: 64–72. [15] Zhang Z, Jiang C, Wang GG, Han X. First and second order approximate reliability analysis - 28 -
ACCEPTED MANUSCRIPT
methods using evidence theory. Reliab Eng Syst Saf 2015; 137: 40–49. [16] Zhang WG, Goh ATC. Reliability assessment on ultimate and serviceability limit states and determination of critical factor of safety for underground rock caverns. Tunnel Undergr Space Tech 2012; 32: 221-230. [17] Zio E. The Monte Carlo simulation method for system reliability and risk analysis. London: Springer; 2013.
CR IP T
[18] Wang F, Li H. Towards reliability evaluation involving correlated multivariates under incomplete probability information: A reconstructed joint probability distribution for isoprobabilistic transformation. Struct Saf 2017; 69: 1–10. [19] Aas K, Czado C, Frigessi A, Bakken H. Pair-copula constructions of multiple dependence. Insur Math Econ 2009; 44(2): 182–98.
AN US
[20] Joe H. Families of m-variate distributions with given margins and m(m-1)/2 bivariate dependence parameters. In: Rüschendorf L, Schweizer B, Taylor MD (Eds.). Distributions with Fixed Marginals and Related Topics, 28. Institute of Mathematical Statistics, Lecture Notes-Monograph Series, 1996, pp. 120–41.
M
[21] Bedford T, Cooke RM. Probability density decomposition for conditionally dependent random variables modeled by vines. Ann Math Artif Intel 2001; 32: 245–68.
ED
[22] Bedford T, Cooke RM. Vines–a new graphical model for dependent random variables. Ann Stat 2002; 30: 1031–68.
PT
[23] Rosenblatt M. Remarks on a multivariate transformation. Ann Math Stat 1952; 23(3): 470– 72.
AC
CE
[24] Ang AH-S, Tang WH. Probability concepts in engineering planning and design. Vol II Decision, risk and reliability. New York: John Wiley; 1984.
- 29 -