SORM methods

SORM methods

Reliability Engineering and System Safety 65 (1999) 29–40 Uncertainty analyses in fault trees and Bayesian networks using FORM/SORM methods Enrique C...

332KB Sizes 2 Downloads 23 Views

Reliability Engineering and System Safety 65 (1999) 29–40

Uncertainty analyses in fault trees and Bayesian networks using FORM/SORM methods Enrique Castillo a,*, Jose´ Marı´a Sarabia b, Cristina Solares a, Patricia Go´mez a a

Department of Applied Mathematics and Computational Sciences, University of Cantabria, Cantabria, Spain b Department of Economics, University of Cantabria, Cantabria, Spain

Abstract Efficient computational methods based on first and second order approximations (FORM/SORM) of the tails of a random variable, which is defined as a function of a set of basic variables, are given. Interesting examples of these types of functions are the probabilities of occurrence of the root events of fault trees or nodes in a Bayesian network. The method allows estimation of extreme percentiles and high confidence one-sided probability intervals of the target variables. Several examples of applications to real cases are used to illustrate the whole process. 䉷 1999 Elsevier Science Ltd. All rights reserved. Keywords: Bayesian networks; Common cause events; Extreme percentiles; Fault trees; FORM/SORM; Tail simulation

1. Introduction Risk analysis involves two basic types of uncertainty. The first is due to inherent randomness in the phenomenon and the variables chosen to model it. The second is due to inaccurate modeling, insufficient data, etc. This paper is concerned mainly with this second, or epistemic, source of uncertainty, and its propagation through a risk analysis involving rare events. In standard analysis, model parameters are assumed to be constant values. However, on many occasions, these parameters are difficult to assess or are estimated. Thus, their initial deterministic character is considered to be inadequate and parameters are assumed to be random variables. When this occurs and the aim of the analysis is to monitor the effect of this randomness on a given target variable, we say that we are dealing with an uncertainty analysis. In the nuclear field, an uncertainty analysis is performed either to estimate the uncertainty in the final risk results of a probabilistic safety assessment (PSA), or to estimate the uncertainty associated with intermediate quantities such as core melt frequency. In the case of fault tree or Bayesian network models, the input uncertainties associated with the basic fault event or conditional probabilities (the parameters) are propagated through the model to obtain the corresponding uncertainty associated with the probability of the top event. Since fault * Corresponding author. Tel.: ⫹34-942-201720; fax: ⫹34-942-201703. E-mail address: [email protected] (E. Castillo)

tree models are an integral part of PSAs, the propagation of input parameter uncertainties through such models to arrive at the corresponding uncertainty in the probability of the top event, that is, the system unavailability, is of fundamental importance. Numerous methods have been proposed for propagating uncertainties in various nuclear reactor PSAs. The most common approximate methods used to estimate the percentiles of the distribution of the top event are: • • • • • •

Monte Carlo, Method of moments, Method of moments with Tchebyshev inequality, Bootstrap, Discrete probability distributions, Fast probability integration (FPI) or FORM/SORM methods, • Direct tail simulation. Several surveys and comparative evaluations of such methods have also appeared in the PSA literature. Ahmed et al. [1], Cox and Baybutt [14] and Jackson et al. [26] draw numerous conclusions regarding the range of applicability, advantages and disadvantages of the methods. In these comparisons, Monte Carlo simulation methods for propagating uncertainties appeared as generally superior to other methods, particularly for large fault trees. Monte Carlo simulation allows us to deal with a random variable which is related to other basic random variables by a complex but well known relationship. The common technique consists of simulating a large sample of basic

0951-8320/99/$ - see front matter 䉷 1999 Elsevier Science Ltd. All rights reserved. PII: S0951-832 0(98)00083-0

30

E. Castillo et al. / Reliability Engineering and System Safety 65 (1999) 29–40

variables, calculating the associated target variable values with this relationship, and then using their empirical cumulative distribution function as an approximation to the exact one. This method performs well in the central part of the distribution, but gives very poor approximations in the tails. Thus, it is inadequate for estimating small or large percentiles. As a consequence, the usual technique consists of simulating the target random variable and rejecting the values not in the tail, an extremely inefficient method. However, in engineering, one has to deal with extreme percentiles, very large or very low values of magnitudes (temperatures, winds, waves, earthquakes, etc.), because they produce structural, supply or environmental problems. This has motivated the appearance of the extreme value theory in general (see, for example, Galambos [23] or Castillo [7]) and several papers dealing with the estimation of large percentiles in particular (see, for example, Weissman [41] or Pickands [35]). In the past, as reported by Staple and Haskin [39], it has been difficult to evaluate the extreme percentiles (such as the 99.9th or the 99.99th) of the top frequency. Since nuclear accidents can be disastrous and very high security coefficients are required, those high percentiles are needed in nuclear safety analyses. Several alternatives to the Monte Carlo simulation methods have been proposed in the literature, such as: stochastic simulation (Pearl [34]), likelihood weighing (Shachter and Peot [38]; Fung and Chang [22]), hybrid methods of logic sampling and stochastic simulation (Chavez and Cooper [15]), stratified sampling (Bouckaert [3], Bouckaert et al. [4], Castillo et al. [9]), etc. However, these methods are not designed for the tail estimation problem. Castillo et al [13] give a method for directly simulating the tails and show that important improvements can be done with respect to the standard Monte Carlo method. One alternative to simulation is the fast probability integration method, also known as a first order reliability method (FORM), which has been demonstrated to be more efficient than standard Monte Carlo methods for estimating extreme output percentiles and to perform sensitivity analyses (see Staple and Haskin [39] and Haskin et al. [24]). In the standard practice for uncertainty analysis we detect the following problems: 1. Unlimited distributions (normal, lognormal, gamma, etc.) are used for representing unavailabilities, which are bounded values. Thus, simulated values of the unavailabilities which are outside the interval [0, 1] are obtained. In these cases, two normal recommendations are: (a) reject the sample when this occurs (see, for example, Martz et al. [32]), and (b) truncate distributions to avoid the problem. 2. However, both cases can lead to a serious error in extreme percentile estimation (on the unsafe side),

since the tail values are badly approximated or neglected. 3. Models based on the rare event approximation (in which the events in the Boolean expression are simply replaced by their probabilities) can break down at high percentiles, unless the Boolean form of the top event equation is expressed as the union of disjoint cutsets. 4. The high reliability of systems or components required in this field implies: (a) a very high probability of failure, that is, only extreme percentiles are involved, and/or (b) only one side of the probability distribution function, that is, percentiles in the tail associated with failure, are of interest. 5. Available methods for simulating extreme percentiles are not adequate, except the FPI, FORM/SORM and direct simulation methods, since they are inefficient and lack precision. In some cases, the standard FPI method or FORM do not give sufficient quality estimates and second order reliability methods (SORM) (see Madsen [28, 29] or Ang et al. [2]) or response surfaces based on quadratic forms methods (see Faravelli [18], Breitung and Faravelli [6] or Ditlevsen and Madsen [17]) are required. In this paper, we apply these methods to fault tree and Bayesian network propagation analyses. The paper is structured as follows. In Section 2 we give a summary of the first order methods. In Section 3 we remind the reader about the bases of the second order methods. In Sections 4–6 we present several applications, including real examples of risk assessment. Finally, Section 7 gives some conclusions and recommendations. 2. First order methods The first order reliability methods (FORM) appeared in the field of structural reliability with Freudenthal [21] in 1956, and have been expanded by a number of authors, including Hasofer and Lind [25], Rackwitz and Flessler [36], Breitung [5], Wirsching and Wu [42] and Wu et al. [43]. These methods have been shown to give precise results and demonstrated to be much more efficient than Monte Carlo simulation techniques for estimating extreme percentiles (see, for example, Wirsching [42] or Haskin et al. [24]). FORM methods transform the initial set of variables in a multi-normal set and use a linear approximation to eliminate the need for complex numerical integrations. Assume that we have a univariate random variable Z, which is defined in terms of other random variables X ˆ (X1,…,Xn) by the function Z ˆ g(X1,…,Xn). Assume also that we know the probability density function (pdf), fX(x), of X, and that we are interested in extreme percentiles of Z, or the tails of the cumulative distribution function (cdf), FZ(z).

E. Castillo et al. / Reliability Engineering and System Safety 65 (1999) 29–40

Thus, we have: FZ …z† ˆ Prob‰Z ⱕ zŠ ˆ

Z gX …x†ⱕz

fX …x†dx

…1†

The main problem is that the integral [Eq. (1)] is usually difficult to calculate, due to the complicated forms of fX(x) and gX(x) ⱕ z. Thus, using first order methods, the set X is transformed into the set U, using the transformation U ˆ T(X), such that U becomes a standard multi-normal random variable N(0,In) and Z ˆ gX(X) ˆ gU(U), then gU(U) is replaced by a linear approximation h(U). Eq. (1) is transformed into: FZ …z† ˆ Prob‰Z ⱕ zŠ ˆ Prob‰gX …X† ⱕ zŠ ˆ Prob‰gU …U† ⱕ zŠ ⬇ Prob‰h…U† ⱕ zŠ

…2†

31

where b is the minimum distance from the origin to the surface gU(u) ˆ z and   2gU …u†=2ui ai …u† ˆ …7† 兩7gU …u†兩 is the unit vector in the gradient direction. In other words, u* is the boundary point such that its unit tangent vector passes through the origin. Thus, the first order reliability methods reduce the target variable to a normal random variable. More precisely, they replace the integration region by a hyperplane and use the cdf of a normal random variable as an approximation to the integral to be calculated. Unfortunately, in many cases, the first order approximation is not sufficient and we have to use a second order approximation.

where h…U† ˆ

n X

…3†

ai Ui

iˆ1

is a normal random variable, because it is a linear combination of normal random variables. More precisely, h(U) is normal ! n X T ai mi ; AA ; N iˆ1

where A ˆ (a1,…,an) and the final approximation is: 1 0 n X z ⫺ a m i iC B C B iˆ1 C ˆ F…b† …4† r  F^ Z …z† ˆ Prob‰h…U† ⱕ zŠ ˆ FB X C B @ a2i A

3. Second order methods The main aim of the SORM methods is to improve the previously mentioned linear approximation by using quadratic expressions to approximate the integration region. Then, the well known results on quadratic forms in normal random variables are used to calculate the corresponding probabilities (integrals). In this way, an important improvement can be obtained. To this end, the exact integration region gU(U) ⱕ z is approximated by g^U …U† ⱕ z, a second order approximation, i.e.: FZ …z† ˆ Prob‰Z ⱕ zŠ ˆ Prob‰gU …U† ⱕ zŠ ⬇ Prob‰g^U …U† ⱕ zŠ

…8†

i

where z⫺



n X

ai mi

iˆ1

r X 2 ai

…5†

g^U …u† ˆ gU …u*† ⫹

i

is the distance from z to the mean, measured in standard deviations, and F is the cdf of the standard N(0,1) random variable. Thus, FZ(z) is approximated by the cdf of a normal variable h(U) and the problem is easily solved. Without loss of generality, we can assume that the set U is N(0,I), because we can always transform N(m,S) to N(0,I). We make this assumption in the following. To determine h(u), first order methods may be used to replace the boundary gU(u) ˆ z with a hyperplane h(u) ˆ z, which is tangential to it at the point u*, with maximum likelihood. Taking into account that U is a standard normal, the contours become hyperspheres, centered at the origin, and then u* can easily be obtained, because it satisfies the equation: ui * ˆ ai *…u†b

where the last term is the cdf of a quadratic function in normal variables, which can be easily calculated by well known formulas. To obtain g^ U …U†, the second order method considers the partial Taylor series expansion:

…6†



n X 2gU …u*† …ui ⫺ ui *† 2ui iˆ1

n 1 X 22 gU …u*† …ui ⫺ ui *†…uj ⫺ uj *† 2 i;jˆ1 2ui 2uj

…9†

which can be written as: g^U …u† ˆ a ⫺

n 1 X b …u ⫺ ui0 †…uj ⫺ uj0 † 2 i;jˆ1 ij i

…10†

or 2…a ⫺ g^ U …u†† ˆ

n X

bij …ui ⫺ ui0 †…uj ⫺ uj0 †

…11†

i;jˆ1

To determine the coefficients a, bij and ui0, we identify

32

E. Castillo et al. / Reliability Engineering and System Safety 65 (1999) 29–40

hand side of Eq. (11), can be written as: n X

bij …ui ⫺ ui0 †…uj ⫺ uj0 † ˆ

i;jˆ1

coefficients in Eq. (9) and Eq. (11), and obtain:

X j

22 gU …u*† ; ᭙i; j 2ui 2uj n X

2g …u*† bij uj0 ˆ bij uj * ⫹ U ; i ˆ 1; …; n 2ui jˆ1

…12†



…13†

n n X 2gU …u*† 1 X ui * ⫹ b u *u * 2ui 2 i;jˆ1 ij i j iˆ1

n 1 X b u u 2 i;jˆ1 ij i0 j0

…15†

i;jˆ1

1. R. Davies [16] gives a method for calculating the cdf F(z) of linear combinations of x 2 random variables, which is based on the numerical inversion of the characteristic function. More precisely, he analyzes the random variable: Vˆ

and a ˆ gU …u*† ⫺

lj Vj2

that is, as a linear combination of n independent x 2 variables Vj2 ; j ˆ 1,…,n, where the coefficients l j; j ˆ 1,…,n, are the eigenvalues of the matrix b. Several methods that have been used in the past for approximating the random variable in Eq. (15) are:

Fig. 1. An example of a five-node Bayesian network.

bij ˆ ⫺

n X

…14†

r X

lj X j ⫹ s X 0

…16†

jˆ1

where Xj; j ˆ 1,…,r are independent random variables having a noncentral x 2 distribution with nj degrees of freedom and noncentrality parameter d2j ; j ˆ 1,…,r, and X0 is a standard normal N(0,1). 2. C. Field [19] gives a method for evaluating the tail probabilities of linear combinations of x 2 and noncentral x 2. In particular, he considers the random variable: Vˆ

r X

lj …Xj ⫹ dj †2

…17†

jˆ1

However, any random variable, such as that on the right Table 1 Numeric and symbolic conditional probabilities Node Xi Pi X1 X2

None X1

X3

X1

X4

X2 , X3

X5

X3

Node Xi Pi X1 None X 2 X1 X3

X1

X4

X2 , X3

X5

X3

Parameters Xi ˆ 0

u 10 ˆ p(X1 ˆ 0) u 200 ˆ p(X2 ˆ 0兩X1 ˆ 0) u 201 ˆ p(X2 ˆ 0兩X1 ˆ 0) ˆ 0.7 u 300 ˆ p(X3 ˆ 0兩X1 ˆ 0) ˆ 0.4 u 301 ˆ p(X3 ˆ 0兩X1 ˆ 1) u 4000 ˆ p(X4 ˆ 0兩X2 ˆ 0, X3 ˆ 0) ˆ 0.2 u 4001 ˆ p(X4 ˆ 0兩X2 ˆ 0, X3 ˆ 1) ˆ 0.4 u 4010 ˆ p(X4 ˆ 0兩X2 ˆ 1, X3 ˆ 0) ˆ 0.7 u 4011 ˆ p(X4 ˆ 0兩X2 ˆ 1, X3 ˆ 1) ˆ 0.8 u 500 ˆ p(X5 ˆ 0兩X3 ˆ 0) u 501 ˆ p(X5 ˆ 0兩X3 ˆ 1) Parameters Xi ˆ 1 u 11 ˆ p(X1 ˆ 1) u 210 ˆ (X2 ˆ 1兩X1 ˆ 0) u 211 ˆ p(X2 ˆ 1兩X1 ˆ 1) ˆ 0.3 u 310 ˆ p(X3 ˆ 1兩X1 ˆ 0) ˆ 0.6 u 311 ˆ p(X3 ˆ 1兩X1 ˆ 1) u 4100 ˆ p(X4 ˆ 1兩X2 ˆ 0, X3 ˆ 0) ˆ 0.8 u 4101 ˆ p(X4 ˆ 1兩0, X3 ˆ 1) ˆ 0.6 u 4110 ˆ p(X4 ˆ 1兩X2 ˆ 1, X3 ˆ 0) ˆ 0.3 u 4111 ˆ p(X4 ˆ 1兩X2 ˆ 1, X3 ˆ 1) ˆ 0.2 u 510 ˆ p(X5 ˆ 1兩X3 ˆ 0) u 511 ˆ p(X5 ˆ 1兩X3 ˆ 1)

where Xi are independent standard normals. 3. Breitung [5] has derived an asymptotic result. 4. Rice [37] gives an efficient complex numerical integration method. 5. Tvedt [40] gives a three-term approximation by a power series expansion, ignoring terms of order higher than two. This method gives good asymptotic results. 6. Katsuki and Frangopol [27] derive a method for computing the integrals based on a radial-space division technique in conjunction with automatic generation of unit centerline vectors. 7. Papadimitriou [33] develops an asymptotic expansion and uses important sampling techniques that converge much more rapidly than straightforward Monte Carlo methods. For a complete description of some of these methods and some illustrative examples, see Ditlevsen and Madsen [17] and Madsen et al. [30]. 4. An example of uncertainty analysis after symbolic propagation In this section, we present a very simple example in which a Bayesian network is used. Bayesian networks are models with clear advantages over fault trees because they allow the representation of networks instead of trees. This is

E. Castillo et al. / Reliability Engineering and System Safety 65 (1999) 29–40

33

Fig. 2. Right tail of the cdf of p(X3 ˆ 0兩X2 ˆ 1), obtained from using a first order approximation, when it is assumed that u 10, u 210 and u 301 are uniform random variables U(0.5, 0.7), U(0.3, 0.5) and U(0.7, 0.9), respectively.

particularly useful for common cause analyses. It is well known that in fault tree analyses, one is forced to duplicate nodes in order to keep the tree structure. Bayesian networks are couples of elements: a DAG and a table of conditional probabilities. When all the probabilities in this table are numeric, we say that the Bayesian network is numeric; otherwise, we say that it is symbolic. As an example, consider the binary variables {X1,…,X5}, which take values in the set {0,1}, and consider the discrete Bayesian network (D, P), where D is the directed acyclic graph (DAG) in Fig. 1, and P is the set of conditional probability tables in Table 1 (see Castillo et al. [12]). Nodes X2 and X3 are said to be parents of Node X4 (see the arrow directions in Fig. 1). Similarly, Node X1 is a parent of Nodes X2 and X3, and Node X3 is a parent of Node X5. The conditional probabilities in Table 1 are the probabilities of nodes to take given values, given their parents’ values.

Since all the conditional probabilities associated with Node X4 are numeric, we say that this node is numeric. Since the conditional probabilities associated with the remaining four nodes are not all numeric (some parameters are specified only symbolically), we say that they are symbolic. The main advantage of using symbols is that we can easily calculate the target probability (rare event) as a function of these parameters (symbols) and study their influence to perform a sensitivity analysis. Castillo et al. [9] have shown that this can be done using only numeric calculations, that is, without the burden of symbolic calculus. In Bayesian networks the joint probability density p(X1,…,Xn) can be written as: n

p…X1 ; …; Xn † ˆ P p…Xi 兩pi † iˆ1

…18†

where p i is the set of parents of node Xi. In our example, we

Fig. 3. Simplified system diagram.

34

E. Castillo et al. / Reliability Engineering and System Safety 65 (1999) 29–40

Table 2 Variables and associated physical meanings Variable

Physical meaning

ACA B1A B1B G1 G2 G3 G4 G5 G6 G7 G8 G9 G10 G11 G12 G13 G15 M1 O1 O2 SISA T1A T1B V1A V2A V2B V3A V4A V5A V5B V6A V6B

Electric power failure Pump B1 fails to start Pump B1 fails after starting Collector does not receive water flow Valve V6 fails to open Valve V6 does not receive water flow Valve V5 fails to open Valve V5 does not receive water flow Valve V4 is closed Valve V4 does not receive water flow Pump B1 failure Pump B1 does not receive water flow Valve V3 is closed Valve V3 does not receive water flow Valve V2 fails to open Valve V2 does not receive water flow Valve V1 does not receive water flow Element out of service (maintenance) Valve V4 opened after maintenance Valve V3 opened after maintenance Logic signal failure Tank failure Ventilation tank failure Valve V1 is blocked Valve V2 mechanical failure Valve V2 is blocked Valve V3 is blocked Valve V4 is blocked Valve V5 mechanical failure Valve V5 is blocked Valve V5 mechanical failure Valve V6 is blocked

Table 3 Table of the variables and their physical meanings, where M, P, V and T are used for motor, pump, valve and turbine, respectively and S, D, T and G are, respectively, used for single, double, triple and general common cause events Variable Name

have: P…X1 ; …; X5 † ˆ p…X1 †p…X2 兩X1 †p…X3 兩X1 †p…X4 兩X2 ; X3 †P…X5 兩X3 † …19† Suppose that the target node is X3, the evidence X2 ˆ 1 is available, and the aim of the analysis is to calculate the conditional probabilities p(X3 ˆ j兩X2 ˆ 1); j ˆ 0,1 (this can be the probability of failure of X3 after the event X2 ˆ 1 has occurred). Castillo et al. [8,12], analyzed the algebraic structure of marginal and conditional probabilities and concluded that they are polynomials and rational functions of the parameters, respectively. In particular, for this example they have shown that: p…X3 ˆ 0兩X2 ˆ 1† ˆ

0:4u10 u210 ⫹ 0:3u301 ⫺ 0:3u10 u301 0:3 ⫺ 0:3u10 ⫹ u10 u210 …20†

where the denominator in Eq. (20) is a normalizing constant. In this example, the parameters u 10,u 210 and u 301 are assumed to be random variables and the uncertainties in p(X3 ˆ 0兩X2 ˆ 1) associated with the random character of these parameters are analyzed. In particular, the right tail of p(X3 ˆ 0兩X2 ˆ 1) (high values) is to be estimated.

Physical meaning

X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 Triple X21 Triple X22 Triple X23 Triple X24

MS1 MS2 MD12 PS1 PS2 PS3 PD12 PD13 PD23 PG VS1 VS2 VS3 VS4 VD12 VD13 VD14 VD23 VD24 VD34

X25 X26

T C

VG

Independent failure of drive M1 Independent failure of drive M2 Double common cause failure of drives M1 and M2 Independent failure of pump P1 Independent failure of pump P2 Independent failure of pump P3 Double common cause failure of pumps P1 and P2 Double common cause failure of pumps P1 and P3 Double common cause failure of pumps P2 and P3 Global common cause failure of pumps P1, P2 and P3 Independent failure of valve V1 Independent failure of valve V2 Independent failure of valve V3 Independent failure of valve V4 Double common cause failure of valves V1 and V2 Double common cause failure of valves V1 and V3 Double common cause failure of valves V1 and V4 Double common cause failure of valves V2 and V3 Double common cause failure of valves V2 and V4 Double common cause failure of valves V3 and V4 VT123 common cause failure of valves V1, V2 and V3 VT124 common cause failure of valves V1, V2 and V4 VT134 common cause failure of valves V1, V3 and V4 VT234 common cause failure of valves V2, V3 and V4 Global common cause failure of valves V1, V2, V3 and V4 Failure of drive T Failure of supply C

To be more precise, u 10,u 210 and u 301 are assumed to be uniform random variables U(0.5,0.7), U(0.3,0.5) and U(0.7,0.9), respectively. Castillo et al. [9, 12] show that the maximum and minimum values of the marginal and conditional probabilities are attained at the vertices of the cube: {…u10 ; u210 ; u301 †兩0:5 ⱕ u10 ⱕ 0:7; 0:3 ⱕ u210 ⱕ 0:5; 0:7 ⱕ u301 ⱕ 0:9} This is an interesting and convenient convex behaviour of these functions. In this case, the maximum value is 0.65, which corresponds to (u 10, u 210, u 301) ˆ (0.5, 0.3, 0.9), and the minimum is 0.461364, which corresponds to (u 10, u 210, u 301) ˆ (0.7, 0.5, 0.7). Fig. 2 shows the right tail of the cumulative distribution function of p(X3 ˆ 0|X2 ˆ 1), corresponding to a first order approximation (FORM). From it, any probability interval

E. Castillo et al. / Reliability Engineering and System Safety 65 (1999) 29–40

35

Fig. 4. DAG corresponding to the standby system.

can be determined. In particular, the 0.98 probability interval for p(X3 ˆ 0|X2 ˆ 1) becomes (0, 0.6181). If we use a second order approximation (SORM), we get (0, 0.6026), which is practically the same result.

5. An application to uncertainty analysis in risk assessment In this section, an example of application to probability risk assessment is presented. Fig. 3 shows the simplified flow diagram of a typical standby system and Table 2 gives the notation and the

physical meaning of all involved variables (see Castillo et al. [10, 11]). In Fig. 4, we show the graph associated with the Bayesian network reproducing the system in which we have used a network to avoid the node replication that occurs with fault tree diagrams, and showing the corresponding dependence structure. The aim of the analysis consists of obtaining a probability interval for the probability of failure of the system G1. In particular, the influence of the failure probabilities (unavailabilities) of the common causes: logic signal failure (SISA), the electric power system (ACA) and the maintenance policy (M1) on the probability of failure of the system G1,

Fig. 5. Exact and simulated tail of h(x1, x2, x3) ˆ 1 ⫺ 0.999x1x2x3, when it is assumed that Xi (i ˆ 1, 2, 3) are uniform U(0.9999, 1) (reduced rejection method, as given by Castillo et al. [10]).

36

E. Castillo et al. / Reliability Engineering and System Safety 65 (1999) 29–40

Fig. 6. The right tail of the reliability q of the system for the common cause analysis using a second order approximation method (SORM).

are to be analyzed. Note that after removing nodes, SISA, ACA and M1, the Bayesian network degenerates to a tree. It can be shown that the probability of failure of the system can be written as: Z ˆ P…G1 ˆ 1† ˆ h…x1 ; x2 ; x3 † ˆ 1 ⫺ ax1 x2 x3

…21†

where x1,x2 and x3 are the probabilities of no failure associated with variables SISA, ACA and M1, respectively, and a is a constant which is close to one, and depends on the failure probabilities of the remaining elements in the system. In this paper, with Castillo et al. [10], we assume that a ˆ 0.999 and that the X1,X2 and X3 are independent and identically distributed U(0.9999, 1) random variables. A FORM method allows us to obtain one-sided probability intervals for the probability of system failure, which in this case becomes (0, 0.001264) for a probability of 0.98. The corresponding interval given by Castillo et al [10] was

(0, 0.00125). Fig. 5 shows the exact and simulated tails given by these authors. However, both results are very similar. Fig. 6 shows the same tail using a second order approximation method (SORM). We get the interval (0, 0.001249), which is very close to the previous interval.

6. A second application: the auxiliary feedwater system In this section, we show a second example of application of the proposed method to probability risk assessment. We use an example taken from Fleming et al. [20] and, for the sake of illustration, analyze only the normal alignment (no test or maintenance being performed). The system is represented in Fig. 7. It is typical of pressurized water reactors (PWR) in the US, and consists of three pump trains, which take suction from a common

Fig. 7. Simplified scheme of major components in an auxiliary feedwater system.

E. Castillo et al. / Reliability Engineering and System Safety 65 (1999) 29–40

37

Fig. 8. Unavailability of the top event as a function of the unavailabilities of the basic components (algebraic equivalent of the 129 minimal cutsets of the fault tree).

condensate storage tank and supply header and provide auxiliary feedwater flow to four steam generators. This system has two identical electric motor-driven pumps and a steam turbine-driven pump. There are four motor-operated valves at the pump discharge that are normally closed. The success criterion for this example system is that at least one of the three pumps must supply flow to at least two steam generators. Each motor-driven pump can supply flow

through successful valve openings to two dedicated steam generators, and the steam turbine driven pump can supply flow to as many as all four steam generators, depending on the degree of success in the motor-operated valve (MOV) opening. Each pump is periodically tested by opening a miniflow valve (not shown) and pumping water back to the condensate storage tank in a recirculation loop (not shown). If a system actuation signal is received during a

Table 4 Point estimates from Fleming et al. [20] and posterior Beta(a, b) distributions used in the reliability analysis Algebraic term

Basic parameter model

Point estimate

V1 V2 V3 V4 P1 P2 P3 M1 M2 T C

l V1 l V2 l V3 l V4 l PS1 ⫹ l PR1t l PS2 ⫹ l PR2t l PS3 ⫹ l PR3t l MS1 ⫹ l MR1t l MS2 ⫹ l MR2t l TS ⫹ l TRt l C(t ⫹ Tc/2)

0.00379 0.000045 0.00001 0.000380 0.001143 0.000222 0.000540 0.001893 0.000325 0.001006 0.0000023

Posterior Beta(a, b) a b 3.98 4.0 4.0 4.0 4.0 4.0 4.0 4.0 4.0 3.72 10 10

1046 89077 392147 10517 3489 18042 7395 2104 12289 61.77 4.348 × 10 15

38

E. Castillo et al. / Reliability Engineering and System Safety 65 (1999) 29–40

Fig. 9. Cumulative distribution function of the probability of failure of the system in all its range.

pump test, the miniflow lines normally isolate automatically, but need not be isolated to meet the system success criteria. Scheduled and unscheduled maintenance is performed on each pump, and all other components are subject to unscheduled maintenance only. Fig. 7 shows the simplified scheme of major components in an auxiliary feedwater system and Table 3 gives the notation and physical meanings of all involved variables, where M, P, V and T are used for motor, pump, valve and turbine, respectively, and S, D, T and G are used for single, double, triple and general common cause events. The following groups of components are identified as candidates for experiencing common cause events: the three mechanical pumps, the four motor-operated valves

and the two motor drives. A fault tree is built and common cause events incorporated into it. For details of the fault tree and how the common cause events are incorporated, see Fleming et al. [20]. The resulting 129 minimal cutsets are shown in Fig. 8. To quantify the probabilities of the top event, we select the basic parameter model which assumes a model similar to the Marshall and Olkin model [31], except that there are both time-based and demand-based failure rates. For a group of k components in standby, which must start and run for t hours, the model has 2k ⫹ 1 parameters. These are: • l sj; j ˆ 1,…,k: failure to start on demand rate for each specific group of j components.

Fig. 10. Cumulative distribution function of the probability of failure of the system in the right tail.

E. Castillo et al. / Reliability Engineering and System Safety 65 (1999) 29–40

• l rj; j ˆ 1,…,k: failure to operate rate (h ⫺1) for each specific group of j components. • t: mission time. This parameter is assumed known and is not subject to uncertainty.

We have used the parameter estimates given in Fleming et al. [20], whenever possible. However, due to the lack of some information, we could not determine the parameter estimates associated with the unavailability of element C. Thus, we have assumed that it is deterministic and equal to the mean value in Ref. [20]. Fleming et al. [20] make a symmetry assumption stating that the probability of any independent or common cause event associated with a particular common cause group is only dependent on the number of components failed as a result of that specific cause. The first two columns of Table 4 show these parameters, as developed by Fleming et al. [20], where the subindices refer to the number of affected elements. The point estimates of the parameters of the basic parameter model, as given in Fleming et al. [20] are shown in the third column. For illustration purposes, we have assumed Beta(a, b) distributions for the parameters in the second column of Table 4, since they are the natural conjugate of the associated Bernoulli experiments. The corresponding parameter values are shown in the fourth and fifth columns of this table. Fig. 9 shows the cumulative distribution function of the probability of failure of the system in all its range and Fig. 10 shows the cumulative distribution function of the probability of failure of the system in the right tail, as given by Castillo et al. [13], who approximated the right tail by its direct simulation. From it, we can conclude that a one-sided probability interval at 0.996 confidence level for the unavailability of the top event is (0, 0.0021), that is, we can say that the probability of failure of the system is at most 0.0021 with a probability of 0.996. Using a first order approximation (FORM) we get the interval (0, 0.002065), and with a second order approximation (SORM), the interval (0, 0.002113). 7. Conclusions, extensions and recommendations Efficient computational methods (FORM/SORM), based on first and second order approximations, for approximating the tails and therefore estimating extreme percentiles and high confidence one-sided probability intervals of a random variable, which is defined as a function of a set of basic variables, have been given. Interesting particular cases of these types of functions are the probabilities of occurrence of the root events of fault trees or nodes in a Bayesian network. Standard methods and computer programs can be used for an uncertainty analysis. As it has been shown, by its application to several theoretical and real case examples, the method has immediate applications in many fields of reliability theory, as probability risk or security assessments of

39

complex systems. Some comparisons of the real cdf in the adequate tail with the approximated cdf show that the method has a good performance.

Acknowledgements We thank Iberdrola, the Leonardo Torres Quevedo Foundation of the University of Cantabria and the Direccio´n General de Investigacio´n Cientifica y Te´cnica (DGICYT) (projects PB94-1056 and TIC96-0580), for partial support of this work.

References [1] Ahmed S, Metcalf DR, Pegram JW. Uncertainty propagation in probabilistic risk assessment: a comparative study. Transaction of the American Nuclear Society 1981;33:483–484. [2] Ang AHS, Shinozuka M, Schueller GI, editors. Proceedings of ICOSSAR’89, the Fifth International Conference of Structure Safety and Reliability, San Francisco, USA. New York: ASCE, 1989. [3] Bouckaert R. Properties of Bayesian belief networks learning algorithms. In: Proceedings of the Tenth Conference on Uncertainty in Artificial Intelligence, Seattle, USA. Lopez R, Poole D, editors. San Francisco, CA: Morgan Kaufmann, 1994:102–109. [4] Bouckaert R, Castillo E, Gutie´rrez JM. A modified simulation scheme for inference in Bayesian networks. International Journal on Approximate Reasoning 1995;14:55–80. [5] Breitung K. Asymptotic approximations for multinormal integrals. Journal of the Engineering Mechanics Division ASCE 1984;100(3): 357–366. [6] Breitung K, Faravelli L. Log-likelihood maximization and response surfaces in reliability assessment. Non-Linear Dynamics 1994;5(3):273–285. [7] Castillo E. Extreme value theory in engineering. New York: Academic Press, 1988. [8] Castillo E, Gutie´rrez JM, Hadi AS. Parametric structure of probabilities in Bayesian networks. In: Lecture Notes in Artificial Intelligence 946, Proceedings of ECSQARU’95, Fribourg, 1995:89–98. [9] Castillo E, Gutie´rrez JM, Hadi AS. Expert systems and probabilistic network models, New York: Springer, 1997. [10] Castillo E, Solares C, Go´mez P. Tail sensitivity analysis in bayesian networks. In: Proceedings of the Twelfth Conference on Uncertainty in Artificial Intelligence, (UAI’96), Portland, OR. San Francisco, CA: Morgan Kaufmann, 1996:133–140. [11] Castillo E, Solares C, Go´mez P. Estimating extreme probabilities using tail-simulated data. International Journal of Approximate Reasoning, 1996;17:163–190. [12] Castillo E, Gutie´rrez JM, Hadi AS. Goal-oriented symbolic propagation in Bayesian networks. In: Proceedings of the Thirteenth National Conference on Artificial Intelligence, (AAAI’96), Portland, OR. Menlo Park, CA: AAI Press – The MIT Press, 1996:1263–1268. [13] Castillo E, Solares C, Go´mez P. High probability one-sided confidence intervals in reliability models. Nuclear Science and Engineering 1997;126:158–167. [14] Cox DC, Baybutt P. Methods for uncertainty analysis: a comparative study. Risk Analysis 1981;1:251–258. [15] Chavez RM, Cooper GF. An empirical evaluation of a randomized algorithm for probabilistic inference. In: Lemmer J, Kanal LN, Levitt TS, editors. Uncertainty in artificial intelligence 2. Amsterdam: North-Holland, 1990. [16] Davies RB. Numerical inversion of a characteristic function. Biometrika 1973;60:415–417.

40

E. Castillo et al. / Reliability Engineering and System Safety 65 (1999) 29–40

[17] Ditlevsen O, Madsen HO. Structural reliability methods. Chichester: Wiley, 1996. [18] Faravelli L. Response surface approach for reliability analysis. Journal of Engineering Mechanics 1989;115(2):2763–2781. [19] Field C. Tail areas of linear combinations of chi-squares and noncentral chi-squares. Journal of Statistical Computation and Simulation 1993;15(45):243–248. [20] Fleming KN, Mosleh A, Deremer K. A systematic procedure for the incorporation of common cause events into risk and reliability models. Nuclear Engineering and Design 93. Amsterdam: 1986;93:245–273. [21] Freudenthal AN. Safety and the probability of structural failure. Transactions ASCE 1956;121:1337–1397. [22] Fung R, Chang KC. Weighing and integrating evidence for stochastic simulation in Bayesian networks. In: Kanal LN, Henrion M, Shachter RD, Lemmer JF, editors. Uncertainty in artificial intelligence 5. Amsterdam: North-Holland, 1990. [23] Galambos J. The asymptotic theory of extreme order statistics. Malabar, FL: Krieger, 1987. [24] Haskin FE, Staple BD, Ding C. Efficient uncertainty analyses using fast probability integration. Nuclear Engineering and Design 1996:225–48. [25] Hasofer AM, Lind NC. Exact and invariant second moment code format. Journal of the Engineering Mechanics Division ASCE 1974;100(EM1):111–121. [26] Jackson PS, Hockenbury RW, Yeater ML. Uncertainty analysis of system reliability and availability assessment. Nuclear Engineering and Design 1981;68:5–29. [27] Katsuki S, Frangopol DM. Hyperspace division method for structural reliability. Journal of the Engineering Mechanics ASCE 1994;120(11):2405–2427. [28] Madsen HO. First order versus second order reliability analysis of series structures. Structural Safety 1985;2(3):207–214. [29] Madsen HO. Reliability and risk analysis in civil engineering. In: Proceedings of ICASP5, the Fifth International Conference on Applications of Statistics and Probability in Soil and Structural Engineering, Vancouver, British Columbia, Canada, vol. 1. University of Waterloo, Canada: Institute for Risk Research, 1987:564–577. [30] Madsen HO, Krenk S, Lind NC. Methods of structural safety. Englewood Cliffs, NJ: Prentice Hall, 1986.

[31] Marshall AW, Olkin I. A multivariate exponential distribution. Journal of the American Statistical Association 1967;62:30–44. [32] Martz HF, Beckman RJ, Campbell K, Whiteman DE, Booker JM. A comparison of methods for uncertainty analysis of nuclear power plant safety system fault tree models, NUREG/CR-3263, LA-9729MS, NRC FIN no. A7725. Los Alamos, NM: Los Alamos National Laboratory, 1983. [33] Papadimitriou C, Beck JL, Katafygiotis LS. Asymptotic expansions for reliability and moments of uncertain systems. Journal of the Engineering Mechanics ASCE 1997;123(12):1219–1229. [34] Pearl J. Evidential reasoning using stochastic simulation of causal models. Artificial Intelligence 1987;32:245–257. [35] Pickands J. Statistical inference using extreme order statistics. The Annals of Statistics 1975;75:119–131. [36] Rackwitz R, Fiessler B. Structural reliability under combined load sequences. Journal of Computer Structures 1978;9:489–484. [37] Rice SO. Distribution of quadratic forms in normal random variables; evaluation by numerical integration. SIAM Journal of Scientific and Statistical Computing 1980;1:428–448. [38] Shachter RD, Peot MA. Simulation approaches to general probabilistic inference on belief networks. In: Kanal LN, Henrion M, Shachter RD, Lemmer JF. Uncertainty in artificial intelligence 5. Amsterdam: North-Holland, 1990. [39] Staple B, Haskin FE. Analysis of extreme top event frequency percentiles based on fast probability integration. In: Proceedings of PSAMII, San Diego, CA. 1994:4619–4624. [40] Tvedt L. Two second order approximations to the failure probability, Veritas Report, RDIV/20-004-83, Oslo, Norway: Det norske Veritas, 1983. [41] Weissman I. Estimation of parameters and large quartiles based on the k largest observations. Journal of the American Statistical Association 1978;73(364):812–815. [42] Wirsching PH, Wu YT. Advanced reliability methods for structural evaluation. Journal of the Engineering Mechanics Division ASCE 1987;109(2):19–23. [43] Wu YT, Burnside OH, Cruse TA. Probabilistic methods for structural response analysis. In: Lam WK, Belytschko T, editors. Computational mechanics of probabilistic and reliability analysis. Annandale, Vancouver: Elme Press, 1989:181–196.