Reliability Engineering and System Safety 41 (1993) 209-215
Failure frequencies of non-coherent structures
G. Becker
Technische Universität Berlin, Institut für Prozeß- und Anlagentechnik, Marchstr. 18, Berlin 10587, Germany
&
L. Camarinopoulos
Aristotelian University of Thessaloniki, Department of Mathematics, Physics and Computer Science, Thessaloniki, 540 06 Greece
(Received 23 September 1992; accepted 14 January 1993)
Algorithms to determine the failure frequency of a non-coherent structure function are few, and they depend on the knowledge of the prime implicants. For coherent structures, there exists a well-known approach based on criticality and the Birnbaum importance measure, which is more flexible concerning the form in which the structure function is given. This concept is extended to cover non-coherent structure functions by considering two types of criticality functions and a corresponding extension of the Birnbaum importance measure. This theory is a generalization of presently existing concepts, as it includes systems described by prime implicants as well as coherent systems. For the components, renewal frequency densities are required in addition to failure frequency densities and unavailabilities. For common component models, these are given to show that everything that can be done for coherent systems can also be achieved for non-coherent ones. This includes also the method of modularization, which is addressed in a final section. The topic of multi-state components is not addressed in this paper.
Reliability Engineering and System Safety 0951-8320/93/$06.00 © 1993 Elsevier Science Publishers Ltd, England.
1 INTRODUCTION
Reliability analysis by the fault tree method or by reliability block diagrams is sometimes referred to as Boolean reliability modelling, as either method is based on a different graphical representation of a Boolean function, which relates component behaviour (intact or failed) to system behaviour. Boolean logic is a two-state logic using the integer numbers 0 and 1. If the meaning of 'failed' is assigned to the number 0, and 'intact' to the number 1, the Boolean function is called a structure function. If the assignments are chosen the other way round, i.e. 0 is mapped to 'intact' and 1 to 'failed', which is sometimes called 'negative logic', the resulting Boolean function is called a failure function. As the fault tree method, which is presently most widely used for reliability analysis in practice, nearly exclusively makes use of failure functions, and as it is confusing to intermix both conventions in the same paper, this paper is restricted to failure functions. This implies no loss of generality, as every failure function can be transformed into an equivalent structure function and vice versa. This isomorphism has been described in detail by Alesso¹ and is covered there in Theorem 2.
Reliability modelling presently is performed in practice only for a subclass of the Boolean functions, which are called coherent or monotonic functions. A Boolean function $\varphi$ of a vector of components $\mathbf{x}$ (with $n$ elements) is coherent if the following conditions hold:
$$\varphi(\mathbf{0}) = 0 \tag{1}$$
$$\varphi(\mathbf{1}) = 1 \tag{2}$$
$$\varphi(x_1, x_2, \ldots, x_{i-1}, 0, x_{i+1}, \ldots, x_n) \le \varphi(x_1, x_2, \ldots, x_{i-1}, 1, x_{i+1}, \ldots, x_n) \tag{3}$$
For eqn (3), a shorthand notation is used in this paper:
$$\varphi(0_i, \mathbf{x}) \le \varphi(1_i, \mathbf{x}) \tag{4}$$
Put into plain words, this means that a coherent system is intact if all its components are intact (eqn (1)), it is failed if all its components are failed (eqn (2)), and, should it be failed, failure of an additional component will not cause it to function again (or, if the system is intact, repair of one of the failed components will not fail it) (eqn (3) or (4)). If the failure function can be written in a form that contains only conjunctions and disjunctions, but no negations, the system is coherent in the strict sense, as defined in eqns (1)-(3).
Commercial reliability analysis codes offer few facilities to handle non-coherent structures. If negated events are included at all, quantification will occur with constant figures only, which corresponds to an unavailability calculation. Unavailability calculations for non-coherent systems can be performed by taking the expectation of the failure function in much the same way as is possible for coherent systems² (though some approximative techniques do not work³). In contrast, little has been published on the determination of failure frequencies, and existing approaches depend on the failure function being given in a special form, in terms of its prime implicants.⁴⁻⁷ This appears to be unsatisfactory as, for a coherent system, the system failure frequency may be determined using the concepts of criticality and Birnbaum importance⁸ for a much more general class of Boolean function. An extension of these concepts has been developed, which allows for computation of frequencies for non-coherent systems.
In the following, a simple example will be introduced, which will serve for demonstration purposes. Then, after a brief review of frequency computation for coherent systems using criticality functions and Birnbaum importance, this concept is generalized to allow for frequency calculations for any non-coherent system.
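Conditions (1)-(3) can be checked mechanically for small systems by enumerating all component states. The following Python sketch is only an illustration and not part of the paper: the function names `is_coherent`, `two_out_of_three` and `radiators` are hypothetical, and the second test function is the radiator example that the paper introduces in Section 2.

```python
from itertools import product

def is_coherent(phi, n):
    """Check conditions (1)-(3) for a failure function phi of n components (1 = failed)."""
    if phi((0,) * n) != 0:          # eqn (1): all components intact -> system intact
        return False
    if phi((1,) * n) != 1:          # eqn (2): all components failed -> system failed
        return False
    for x in product((0, 1), repeat=n):
        for i in range(n):
            lo = x[:i] + (0,) + x[i + 1:]
            hi = x[:i] + (1,) + x[i + 1:]
            if phi(lo) > phi(hi):   # eqn (3)/(4) violated for component i
                return False
    return True

def two_out_of_three(x):
    """Hypothetical coherent example: system fails if at least two components fail."""
    return int(sum(x) >= 2)

def radiators(x):
    """Failure function of the two-radiator example of Section 2, x = (a, b, s)."""
    a, b, s = x
    return a * b + s * (1 - a) * (1 - b)

print(is_coherent(two_out_of_three, 3))
print(is_coherent(radiators, 3))
```

The radiator function satisfies eqns (1) and (2) but violates eqn (3): with the switch failed and radiator b intact, the failure of radiator a brings the system back to the intact state.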
2 A SIMPLE EXAMPLE FOR A NONCOHERENT SYSTEM
Practitioners frequently ask to what extent non-coherent systems occur in realistic situations, and may even doubt whether they exist at all. The example given in Ref. 4 is not very convincing: though the prime implicants given certainly are non-coherent, it can be doubted whether they correctly reflect the behaviour of the given technical system. Non-coherent structures do occur, however, as will be shown by a more obvious example. Typically, a system may be non-coherent if it contains some redundant components which share a resource which, due to some limitation, cannot supply all redundancies at the same time. Such a system will contain some switching device, which has the function to keep the required number of components active and the spare ones inactive. If the switching device fails with respect to this function, the system will fail if the redundant components are all intact, but it will work if some redundancies are not operational.
A very simple system with the property outlined above is given in Fig. 1. It consists of two electric radiators a and b, which might form an emergency heating system for the apartment of one of the authors. As the mains can supply just one of these radiators, there is a switch s, which allows one to switch between the radiators and has the function to connect at most one of the radiators to the electric supply. Now, the switch has a failure mode which causes all of its poles to be short-circuited with one another. If this happens, the system will be defective if both radiators are intact, but it will be intact if one of the radiators is failed. As the system is also failed if both redundant radiators are defective, it can be seen easily that the failure function is given by
$$\varphi(a, b, s) = ab + s(1 - a)(1 - b) \tag{5}$$
Fig. 1. A simple non-coherent structure. (Diagram labels: Supply, switch S, Radiator a, Radiator b.)
If the object is the determination of the unavailability, it can be found by taking the expectation of eqn (5), which is in a form to which the substitution rule applies. This rule states that if a failure function is in the form of a sum of disjoint (mutually exclusive) product terms, and if the events associated with the Boolean variables are independent, the component unavailabilities may be substituted for the corresponding Boolean variables to determine system unavailability. Hence
$$U_{sys} = U_a U_b + U_s (1 - U_a)(1 - U_b) \tag{6}$$
In this case, for realistically small component unavailability values, the negated terms in eqn (5) could be neglected. This yields a failure function with no negated events and a conservative unavailability result. If, however, the failure frequency is to be found, such a simple approximation can no longer be given.
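The substitution rule of eqn (6) can be cross-checked against a brute-force expectation over all component states. The sketch below is illustrative only; the helper `expectation` is a hypothetical name, and the unavailabilities are the ones used in the Mathematica session of Fig. 2.

```python
from itertools import product

def phi(a, b, s):
    """Failure function of eqn (5); 1 means failed."""
    return a * b + s * (1 - a) * (1 - b)

def expectation(f, U):
    """E[f(x)] for independent components with unavailabilities U, by state enumeration."""
    total = 0.0
    for x in product((0, 1), repeat=len(U)):
        weight = 1.0
        for xi, ui in zip(x, U):
            weight *= ui if xi else 1.0 - ui
        total += weight * f(*x)
    return total

U_a, U_b, U_s = 1 / 250, 1 / 250, 1 / 2000   # values used in Fig. 2

by_enumeration = expectation(phi, (U_a, U_b, U_s))
by_substitution = U_a * U_b + U_s * (1 - U_a) * (1 - U_b)   # eqn (6)

# Neglecting the negated factors leaves the coherent function "ab OR s".
conservative = expectation(lambda a, b, s: max(a * b, s), (U_a, U_b, U_s))

print(by_enumeration, by_substitution, conservative)
```

Both routes give about 0.000512, the value reported in Fig. 2, while the coherent approximation comes out slightly higher, which is what is meant by a conservative result.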
3 CRITICALITY, MARGINAL IMPORTANCE AND SYSTEM FAILURE FREQUENCY
In the following section, the concepts and terminology of criticality, marginal (Birnbaum) importance and system failure frequency will be reviewed for coherent systems. This approach will then be extended to non-coherent systems.
3.1 Concepts and terminology for coherent systems
A component i is called critical for the system under consideration if the system is in such a state that it is intact when that component is intact and defective when that component is defective. Using the shorthand notation of eqn (4), these two conditions can be written as
$$\varphi(1_i, \mathbf{x}) = 1 \quad\text{and}\quad \varphi(0_i, \mathbf{x}) = 0 \tag{7}$$
A Boolean function which yields the value 1 exactly when the conditions (7) hold is called the criticality function of component i, denoted by $\varphi_i^f$. Such a function may be expressed as
$$\varphi_i^f = \varphi(1_i, \mathbf{x})\,(1 - \varphi(0_i, \mathbf{x})) \tag{8}$$
or
$$\varphi_i^f = \varphi(1_i, \mathbf{x}) - \varphi(1_i, \mathbf{x})\,\varphi(0_i, \mathbf{x}) \tag{9}$$
The form (9) of the criticality function may be simplified further if the Boolean function under consideration is coherent. In this case, the criticality function assumes the well-known form⁸
$$\varphi_i^f = \varphi(1_i, \mathbf{x}) - \varphi(0_i, \mathbf{x}) \tag{10}$$
This follows from eqn (4). For some realizations of $\mathbf{x}$, $\varphi(1_i, \mathbf{x})$ will be equal to $\varphi(0_i, \mathbf{x})$. For these, eqn (10) obviously holds due to the law of idempotency (stating that a Boolean object taken to any power greater than one is identical with the object itself). For all other realizations of $\mathbf{x}$, $\varphi(1_i, \mathbf{x})$ must be greater than $\varphi(0_i, \mathbf{x})$, which means that $\varphi(1_i, \mathbf{x}) = 1$ and $\varphi(0_i, \mathbf{x}) = 0$. Thus eqn (10) also holds in these cases.
Note that $\varphi_i^f$ is a Boolean function, just like $\varphi$ itself. Thus, the expectation of $\varphi_i^f$ is the probability $p_i^f$ that component i is critical for the system, and it can be found in the usual way by bringing it into a form of disjoint products and substituting the component unavailability vector $\mathbf{U}$ for $\mathbf{x}$, provided that the events associated with the Boolean variables are independent. By simple algebraic transforms, it can be shown for the right-hand side of eqn (10) that, if the failure function is in such a form, without powers of any variable,
$$\varphi(1_i, \mathbf{x}) - \varphi(0_i, \mathbf{x}) = \frac{\partial \varphi(\mathbf{x})}{\partial x_i} \tag{11}$$
From this, it can be concluded that, for a coherent system, the probability of criticality $p_i^f$ is equal to the partial derivative of the system unavailability with respect to the unavailability of component i. Both are identical and are known as the marginal importance of this component, according to Birnbaum.
With this concept, it is easy to determine the system failure frequency density $h_s^f$, where $h_s^f(t)\,dt$ is the probability of system failure in some time interval $(t, t + dt)$. In this time interval, the system will fail if one of its components fails and the component is critical for the system at time t. This results in
$$h_s^f(t) = \sum_{\forall i} h_i^f(t)\, p_i^f(t) \tag{12}$$
where $h_i^f$ is the failure frequency density of component i. For a parallel system, or a minimal cut set (this is a set of components which leads to system failure if all members of the set fail, and which is minimal with respect to this property), this results in the well-known relationship
$$h_s^f(t) = \sum_{\forall i} h_i^f(t) \prod_{\forall j \ne i} U_j(t) \tag{13}$$
which is implemented in most modern computer codes to quantify minimal cut sets.
3.2 An extension to cover non-coherent systems
Whereas system failure frequency density can be found via the criticality function for coherent structure functions and failure functions in a general way, up to now the only published algorithms for non-coherent systems depend on a special form of the failure function, its prime implicants.⁴⁻⁶ In this section the method for coherent systems, as outlined above, is extended to find the failure frequency density for non-coherent systems in a similar way.
First, it can be noted that the formulation of the criticality function given in eqns (8) and (9) does not depend on coherence or non-coherence. Also, the system will fail if a component fails which is critical for the system. This, however, is not the only case when the system can fail. A non-coherent system may fail if one of its components changes to an intact state. Therefore a second type of criticality function has to be introduced to represent such states where the system is intact when component i is failed, and the system is failed when component i is intact. For the purpose of distinction, the criticality function $\varphi_i^f$ as defined by eqn (8) will be referred to as the f-criticality function (failure criticality function), whereas the new one will be called the r-criticality function $\varphi_i^r$ (renewal criticality function). As in the previous section, $\varphi_i^r$ can be expressed as
$$\varphi_i^r = \varphi(0_i, \mathbf{x})\,(1 - \varphi(1_i, \mathbf{x})) \tag{14}$$
Note that with a similar argument as was used for the derivation of eqn (10) above, it can be shown that $\varphi_i^r = 0$ for a coherent system. The properties of $\varphi_i^r$ are equivalent to those of $\varphi_i^f$. It also is a Boolean function not depending on the state of component i. Thus, by taking the expectation, the probability $p_i^r$ that a component is r-critical can easily be determined.
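The two criticality functions can be tabulated for the radiator example of eqn (5) by plain enumeration. The Python sketch below is an illustration under that assumption; the helper names `with_state`, `f_critical`, `r_critical` and `critical_states` are hypothetical.

```python
from itertools import product

def phi(x):
    """Failure function of eqn (5); x = (a, b, s), 1 means failed."""
    a, b, s = x
    return a * b + s * (1 - a) * (1 - b)

def with_state(x, i, value):
    """Copy of the state vector x with component i forced to value."""
    return x[:i] + (value,) + x[i + 1:]

def f_critical(phi, x, i):
    """Eqn (8): in state x, failure of component i fails the system."""
    return phi(with_state(x, i, 1)) * (1 - phi(with_state(x, i, 0)))

def r_critical(phi, x, i):
    """Eqn (14): in state x, renewal of component i fails the system."""
    return phi(with_state(x, i, 0)) * (1 - phi(with_state(x, i, 1)))

def critical_states(phi, n, i, criticality):
    """States of the other components (listed with x_i = 0) where component i is critical."""
    return [x for x in product((0, 1), repeat=n)
            if x[i] == 0 and criticality(phi, x, i)]

for i, name in enumerate("abs"):
    print(name,
          "f-critical:", critical_states(phi, 3, i, f_critical),
          "r-critical:", critical_states(phi, 3, i, r_critical))
```

For this example, radiator a is f-critical whenever radiator b is failed and r-critical when b is intact and the switch is failed, while the switch is never r-critical. No state appears in both lists for the same component.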
Now the probability that a component is critical for a system, and the partial derivative of system unavailability, are no longer the same. The latter has to be evaluated by taking the expectation of eqn (11). For a non-coherent system, this is not a probability; it may assume values between -1 and 1. Subsequently, the expectation of eqn (11) will be referred to as Birnbaum's importance measure, bearing in mind that it has lost one of its interpretations if the system is non-coherent.
With this concept of criticality, it is easy to find the failure frequency of a non-coherent system. A system will fail in some time interval $(t, t + dt)$ if in this interval either a component fails which is f-critical for the system, or a component is repaired which is r-critical for the system. This results in
$$h_s^f(t) = \sum_{\forall i} h_i^f(t)\, p_i^f(t) + \sum_{\forall i} h_i^r(t)\, p_i^r(t) \tag{15}$$
This result is valid under the assumption that at most one component will fail or be repaired in an infinitesimally small interval of time. This is true for such events between different components, as the components are assumed to behave independently of each other. For the failure-repair event of one component, this is not necessarily true. For example, if the instantaneous renewal model is applied, failure and repair will always occur at the same time. However, from the definition of criticality, a component cannot be f-critical and r-critical at the same time. Thus, system failure due to failure of component i and due to repair of component i are exclusive events and eqn (15) holds even in this situation.
The case of instantaneous renewal is interesting for another reason, as in this case failure frequency density and renewal frequency density are identical. For a system composed of components where $h_i^f = h_i^r = h_i$, eqn (15) can be rewritten as
$$h_s^f(t) = \sum_{\forall i} h_i(t)\,\bigl(p_i^f(t) + p_i^r(t)\bigr) \tag{16}$$
where
$$p_i^f(t) + p_i^r(t) = E\{\varphi_i^f + \varphi_i^r\} = E\{\varphi_i\} \tag{17}$$
$\varphi_i = 1$ holds if the component is critical for the system in either way. With trivial algebraic transforms, it can be shown that
$$\varphi_i = \lvert \varphi(1_i, \mathbf{x}) - \varphi(0_i, \mathbf{x}) \rvert \tag{18}$$
Jackson⁷ proposed using the expectation of eqn (18) as the definition of a 'Birnbaum importance for non-coherent systems'. With this definition, Birnbaum's measure loses its property to yield the effect of changes of component unavailability on system unavailability. Only if failure frequency density and renewal frequency density are identical can $E\{\varphi_i\}$ be used for calculation of failure frequency density. This is clearly the case if there is instantaneous renewal. It is approximately the case if the component state is observable and repair starts immediately after failure. It is definitely not the case if the state of the component cannot be monitored except by inspection, or if the component is non-repairable, as will be shown in Section 5. As eqn (18) is thus useful only in special cases for frequency determination, it appears to be better to leave the definition of Birnbaum's importance measure as it is, and to accept that for a non-coherent system it is generally not related to the probability that a component is critical.
As failure frequency density and renewal frequency density may be different for components, so may they be for systems. With a reasoning similar to the method used for the derivation of eqn (15), the system renewal frequency density $h_s^r$ can be found to be given by
$$h_s^r(t) = \sum_{\forall i} h_i^r(t)\, p_i^f(t) + \sum_{\forall i} h_i^f(t)\, p_i^r(t) \tag{19}$$
The renewal frequency density is especially useful if the structure is to be transformed into modules. As a module of a failure function is treated like an independent component, its renewal frequency density must be known in the same way as is required for components.
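Eqns (15) and (19) translate directly into a small enumeration routine. The sketch below is a hedged illustration: `system_frequencies` and `expectation` are hypothetical names, the failure rates and the time point are made-up values, and the components are taken to be non-repairable (the model of Section 5.3), so that all component renewal frequency densities are zero.

```python
from itertools import product
from math import exp

def phi(x):
    """Failure function of eqn (5); x = (a, b, s), 1 means failed."""
    a, b, s = x
    return a * b + s * (1 - a) * (1 - b)

def expectation(f, U):
    """E[f(x)] for independent components with unavailabilities U."""
    total = 0.0
    for x in product((0, 1), repeat=len(U)):
        weight = 1.0
        for xi, ui in zip(x, U):
            weight *= ui if xi else 1.0 - ui
        total += weight * f(x)
    return total

def system_frequencies(phi, U, hf, hr):
    """System failure and renewal frequency densities, eqns (15) and (19)."""
    def forced(x, i, v):
        return x[:i] + (v,) + x[i + 1:]
    h_fail = h_renew = 0.0
    for i in range(len(U)):
        p_f = expectation(lambda x: phi(forced(x, i, 1)) * (1 - phi(forced(x, i, 0))), U)
        p_r = expectation(lambda x: phi(forced(x, i, 0)) * (1 - phi(forced(x, i, 1))), U)
        h_fail += hf[i] * p_f + hr[i] * p_r     # eqn (15)
        h_renew += hr[i] * p_f + hf[i] * p_r    # eqn (19)
    return h_fail, h_renew

# Hypothetical failure rates (per hour) for a, b, s and an arbitrary time point.
rates = (2e-4, 2e-4, 5e-5)
t = 1000.0
U = tuple(1.0 - exp(-lam * t) for lam in rates)     # eqn (25)
hf = tuple(lam * exp(-lam * t) for lam in rates)    # eqn (26)
hr = (0.0, 0.0, 0.0)                                # eqn (27)

h_fail, h_renew = system_frequencies(phi, U, hf, hr)
print(h_fail, h_renew)
```

Even though no component is ever repaired in this setting, the system renewal frequency density is not zero: the system is restored when a radiator fails while the switch is short-circuited and the other radiator is intact.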
4 SOLUTION FOR THE EXAMPLE SYSTEM
As the example system is very simple, it could be evaluated manually. With the aid of modern symbolic equation processors, neither manual calculation nor the development of a new piece of software is necessary. The Mathematica program⁹ requires only 10 statements for the solution of the example problem; a protocol is given in Fig. 2. In the session protocol, user input is bold-faced and system responses are printed in normal style.
Very briefly, the input may be described as follows. Statement (a) gives a rule to determine the linear form of a Boolean function. The function has to be expanded completely, and all powers of anything have to be omitted. Statements (b) and (c) give rules to determine the failure criticality function and the renewal criticality function according to eqns (8) and (14) respectively. These first three statements may be considered as a 'program'. Statement (d) defines the failure function under consideration. Statements (e) and (f) provide rules to associate numerical values for unavailabilities and frequency densities with the symbols for future use. Statements (g) and (h) determine the linear form of the failure function in symbolic form and numerically after substituting component unavailabilities to yield the system unavailability. Statements (i) and (j) finally determine the system failure frequency density in symbolic and numeric form respectively.
(a) LiFo[p_] := (p // ExpandAll) /. x_^n_ -> x
(b) FKritFa[p_, x_] := LiFo[(p /. x -> 1) (1 - (p /. x -> 0))]
(c) FKritRe[p_, x_] := LiFo[(1 - (p /. x -> 1)) (p /. x -> 0)]
(d) FF = a b + (1 - a) (1 - b) s
        a b + (1 - a) (1 - b) s
(e) Unavailabilities = {a -> 20/5000, b -> 20/5000, s -> 10/20000}
        {a -> 1/250, b -> 1/250, s -> 1/2000}
(f) Frequencies = {ha -> 1/5000, hb -> 1/5000, hs -> 1/20000}
        {ha -> 1/5000, hb -> 1/5000, hs -> 1/20000}
(g) LiFo[FF]
        a b + s - a s - b s + a b s
(h) SystemUnavailability = LiFo[FF] /. Unavailabilities // N
        0.000512008
(i) SystemFailureFrequencyDensity = ha (FKritFa[FF, a] + FKritRe[FF, a]) + hb (FKritFa[FF, b] + FKritRe[FF, b]) + hs (FKritFa[FF, s] + FKritRe[FF, s])
        (1 - a - b + a b) hs + ha (b + s - b s) + hb (a + s - a s)
(j) SystemFailureFrequencyDensity /. Unavailabilities /. Frequencies // N
        0.0000514
Fig. 2. Protocol of a Mathematica session to solve the simple example. (Statements are labelled (a)-(j); system responses are indented.)
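For readers without access to Mathematica, the same calculation can be sketched in plain Python. The code below is not the authors' program: it is a minimal stand-in that assumes a polynomial is stored as a dictionary from sets of variable names to coefficients, so that multiplying a variable by itself leaves it unchanged (the idempotency that statement (a) enforces by dropping powers). All helper names are hypothetical.

```python
from fractions import Fraction

# A polynomial is a dict {frozenset of variable names: coefficient}.

def const(c):
    return {frozenset(): Fraction(c)} if c else {}

def var(name):
    return {frozenset([name]): Fraction(1)}

def add(p, q, sign=1):
    """p + sign * q, dropping zero coefficients."""
    out = dict(p)
    for mono, c in q.items():
        out[mono] = out.get(mono, 0) + sign * c
    return {m: c for m, c in out.items() if c != 0}

def mul(p, q):
    """Product in linear form: the union of variable sets makes x*x collapse to x."""
    out = {}
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            out[m1 | m2] = out.get(m1 | m2, 0) + c1 * c2
    return {m: c for m, c in out.items() if c != 0}

def substitute(p, name, value):
    """Set the variable `name` to the constant 0 or 1."""
    out = {}
    for mono, c in p.items():
        if name in mono:
            if value == 0:
                continue
            mono = mono - {name}
        out[mono] = out.get(mono, 0) + c
    return {m: c for m, c in out.items() if c != 0}

def evaluate(p, values):
    """Insert numbers for the variables (the substitution rule)."""
    total = Fraction(0)
    for mono, c in p.items():
        term = c
        for name in mono:
            term *= values[name]
        total += term
    return total

one = const(1)
a, b, s = var("a"), var("b"), var("s")
FF = add(mul(a, b), mul(s, mul(add(one, a, -1), add(one, b, -1))))   # eqn (5)

def f_crit(p, name):   # eqn (8)
    return mul(substitute(p, name, 1), add(one, substitute(p, name, 0), -1))

def r_crit(p, name):   # eqn (14)
    return mul(add(one, substitute(p, name, 1), -1), substitute(p, name, 0))

U = {"a": Fraction(1, 250), "b": Fraction(1, 250), "s": Fraction(1, 2000)}
h = {"a": Fraction(1, 5000), "b": Fraction(1, 5000), "s": Fraction(1, 20000)}

unavailability = evaluate(FF, U)
frequency = sum(h[n] * evaluate(add(f_crit(FF, n), r_crit(FF, n)), U) for n in "abs")
print(float(unavailability), float(frequency))
```

As in statement (i) of the session, one frequency density per component is used for both failure and renewal, which corresponds to eqn (16). Exact rational arithmetic is used so that the results can be compared digit by digit with the values 0.000512008 and 0.0000514 of Fig. 2.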
5 COMPONENT MODELS FOR NONCOHERENT SYSTEMS
Some simple component models are in widespread use for the time-dependent evaluation of system reliability parameters such as unavailability and failure frequency. For these models, relations describing unavailability and failure frequency density have been published in several papers. These relations are repeated here together with the renewal frequency density, which is required to evaluate non-coherent systems. For the sake of practical use, all relations are given for the case of exponentially distributed lifetimes with a failure rate $\lambda$ and exponentially distributed repair times with a repair rate $\rho$. The unavailability is denoted by $U(t)$, the failure frequency density by $h^f(t)$, and the renewal frequency density by $h^r(t)$.
5.1 The observable component model
For a component which is observable (or self-annunciating), it is assumed that a failure is noticed immediately and repair action is started at once. With some simplifying assumptions, this leads to an alternating renewal process. Solving it for durations with constant rates leads to
$$U(t) = \frac{\lambda}{\lambda + \rho}\,\bigl(1 - \exp(-(\lambda + \rho)t)\bigr) \tag{20}$$
$$h^f(t) = \frac{\lambda\rho}{\lambda + \rho}\,\Bigl(1 + \frac{\lambda}{\rho}\exp(-(\lambda + \rho)t)\Bigr) \tag{21}$$
$$h^r(t) = \frac{\lambda\rho}{\lambda + \rho}\,\bigl(1 - \exp(-(\lambda + \rho)t)\bigr) \tag{22}$$
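Eqns (20)-(22) are easy to evaluate numerically. The following Python sketch is only an illustration; the function name `observable_component` and the rates are hypothetical. It also prints two consistency relations that follow from the formulas: the failure frequency density equals the failure rate times the availability, and the renewal frequency density equals the repair rate times the unavailability.

```python
from math import exp

def observable_component(lam, rho, t):
    """Return (U, h_f, h_r) at time t according to eqns (20)-(22)."""
    total = lam + rho
    decay = exp(-total * t)
    U = lam / total * (1.0 - decay)                        # eqn (20)
    h_f = lam * rho / total * (1.0 + lam / rho * decay)    # eqn (21)
    h_r = lam * rho / total * (1.0 - decay)                # eqn (22)
    return U, h_f, h_r

# Hypothetical rates per hour: one failure per 10 000 h, mean repair time 10 h.
lam, rho = 1e-4, 0.1
for t in (0.0, 10.0, 100.0):
    U, h_f, h_r = observable_component(lam, rho, t)
    print(t, U, h_f, h_r, lam * (1.0 - U), rho * U)
```

At t = 0 the component is intact, so the sketch returns U = 0, a failure frequency density equal to the failure rate, and a renewal frequency density of zero. For large t both densities approach the same stationary value.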
5.2 The instantaneous renewal model
A simplified version of the observable component model is the instantaneous renewal model. Renewal occurs at the same instant as failure, which corresponds to an infinite repair rate. This results in
$$U(t) = 0 \tag{23}$$
$$h^f(t) = h^r(t) = \lambda \tag{24}$$
5.3 The non-repairable component model
A component which is not repaired during the observation period is described by
$$U(t) = 1 - \exp(-\lambda t) \tag{25}$$
$$h^f(t) = \lambda \exp(-\lambda t) \tag{26}$$
$$h^r(t) = 0 \tag{27}$$
5.4 The basic inspection model
If a state change of a component is not annunciated immediately, components are frequently inspected at given time points $T_k$ and repaired, if necessary. The frequently used basic inspection model is valid under the following assumptions. After an inspection, the component is as good as new (which is always true if the failure rate is constant), the inspection is performed without disturbing the intended function of the component, and repair time is negligible. This results in
$$U(t) = 1 - \exp(-\lambda (t - T_k)), \qquad T_k \le t < T_{k+1} \tag{28}$$
$$h^f(t) = \lambda \exp(-\lambda (t - T_k)), \qquad T_k \le t < T_{k+1} \tag{29}$$
$$h^r(t) = \sum_{\forall T_k} \bigl(1 - \exp(-\lambda (T_k - T_{k-1}))\bigr)\, \delta(t - T_k) \tag{30}$$
where $\delta$ is the Dirac pulse function and $T_0 = 0$.
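The inspection model can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reference implementation: the function names are hypothetical, the inspection times are assumed to be sorted and to start with T_0 = 0, t is assumed to be non-negative, and the Dirac pulses of eqn (30) are represented only by their weights, one per inspection.

```python
from bisect import bisect_right
from math import exp

def inspection_model(lam, inspection_times, t):
    """U(t) and h_f(t) from eqns (28) and (29) for t >= 0."""
    k = bisect_right(inspection_times, t) - 1   # last inspection at or before t
    age = t - inspection_times[k]
    return 1.0 - exp(-lam * age), lam * exp(-lam * age)

def renewal_pulse_weights(lam, inspection_times):
    """Weights of the Dirac pulses in eqn (30), one for each T_k with k >= 1."""
    return [1.0 - exp(-lam * (tk - prev))
            for prev, tk in zip(inspection_times, inspection_times[1:])]

# Hypothetical data: failure rate per hour and inspection times in hours.
lam = 1e-3
times = [0.0, 100.0, 200.0, 350.0]

print(inspection_model(lam, times, 150.0))
print(renewal_pulse_weights(lam, times))
```

Each pulse weight equals the unavailability reached just before the corresponding inspection, which is the probability that the inspection finds the component failed and renews it.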
6 MODULES IN NON-COHERENT STRUCTURES
Informally, a module of a failure function is a group of components which acts like a single component and thus may be evaluated separately. If modules can be identified, the evaluation of a structure, e.g. a fault tree, becomes considerably simpler. Formally, a set of components $\mathbf{x}_m$ is a module if the failure function can be written as
$$\varphi(\mathbf{x}) = \psi(\chi(\mathbf{x}_m), \mathbf{x}_d) \tag{31}$$
where $\mathbf{x}_m$ and $\mathbf{x}_d$ are disjoint, and $\psi$ and $\chi$ are Boolean functions. A module is a proper module unless it is a trivial module, i.e. $\mathbf{x}_m$ or $\mathbf{x}_d$ consists of one component or less. All this is completely unaffected by the question whether the structure is coherent or not.
There are two main differences, however, which should be noted. Using results from game theory, Billera¹⁰ and Chatterjee¹¹ have shown that a unique finest modular decomposition exists for coherent structures. These proofs heavily depend on the representation of the failure function in terms of cut sets and path sets. Thus, they are not valid for non-coherent structures, and it is indeed an open question whether a finest modular decomposition exists for these. Secondly, it must be pointed out that, if a module behaves like a component, the renewal frequency density of the module is required in the same way as it is for components. It can be determined using eqn (19). For practical purposes, e.g. in fault tree analysis, the finest modular decomposition is not used, but rather, modules are identified from the structure of the fault tree (being sub-trees, which have a single entry point from the rest of the tree). Thus, modularization techniques used in practice also apply to non-coherent systems.
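The role of the module renewal frequency density can be illustrated with a small numerical experiment. The Python sketch below uses a hypothetical three-component system that is not taken from the paper: a non-coherent module (exactly one of two components failed) combined by OR with a third component. All component data are made-up values, and `evaluate` is a hypothetical helper that applies eqns (15) and (19).

```python
from itertools import product

def expectation(f, U):
    """E[f(x)] for independent components with unavailabilities U."""
    total = 0.0
    for x in product((0, 1), repeat=len(U)):
        weight = 1.0
        for xi, ui in zip(x, U):
            weight *= ui if xi else 1.0 - ui
        total += weight * f(x)
    return total

def evaluate(phi, U, hf, hr):
    """Return (unavailability, failure frequency, renewal frequency), eqns (15), (19)."""
    def forced(x, i, v):
        return x[:i] + (v,) + x[i + 1:]
    h_fail = h_renew = 0.0
    for i in range(len(U)):
        p_f = expectation(lambda x: phi(forced(x, i, 1)) * (1 - phi(forced(x, i, 0))), U)
        p_r = expectation(lambda x: phi(forced(x, i, 0)) * (1 - phi(forced(x, i, 1))), U)
        h_fail += hf[i] * p_f + hr[i] * p_r
        h_renew += hr[i] * p_f + hf[i] * p_r
    return expectation(phi, U), h_fail, h_renew

def chi(x):
    """Hypothetical non-coherent module: exactly one of x1, x2 is failed."""
    x1, x2 = x
    return x1 * (1 - x2) + (1 - x1) * x2

def psi(x):
    """Hypothetical organizing function: module OR component x3."""
    m, x3 = x
    return m + x3 - m * x3

def phi(x):
    """Complete failure function in the form of eqn (31)."""
    return psi((chi(x[:2]), x[2]))

# Made-up component data at some time t.
U = (0.01, 0.02, 0.005)
hf = (1e-4, 2e-4, 5e-5)
hr = (3e-3, 4e-3, 1e-3)

direct = evaluate(phi, U, hf, hr)
module = evaluate(chi, U[:2], hf[:2], hr[:2])
modular = evaluate(psi, (module[0], U[2]), (module[1], hf[2]), (module[2], hr[2]))
print(direct)
print(modular)
```

In this sketch the two-step evaluation reproduces the direct results up to rounding. The top-level step needs the module's renewal frequency density from eqn (19) as input, in the same way as it needs the renewal frequency density of an ordinary component.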
7 CONCLUSION
An extension to the concept of criticality has been used for failure frequency determination of non-coherent systems. It can be shown that a time-dependent frequency density calculation can be performed with the usual component models in the same way as is possible for coherent systems. For this purpose, it is necessary to distinguish between Birnbaum's importance measure and the probability of criticality. Non-coherent systems do exist; examples from the literature show this, and a very simple example has been provided in this paper. The mathematical framework is available to handle them in the same way as coherent structures. It is only a matter of programming to create fault tree codes with this capability.
REFERENCES
1. Alesso, H. P., Some algebraic aspects of decomposed non-coherent structure functions, Reliability Engineering, 5 (1983) 129-38.
2. Alesso, H. P. & Benson, H. J., Fault tree and reliability relationships for analyzing non-coherent two-state systems, Nuclear Engineering & Design, 56 (1980) 309-14.
3. Chu, T. L. & Apostolakis, G., Methods for probabilistic analysis of noncoherent fault trees, IEEE Trans. Reliability, R-29 (1980) 354-60.
4. Inagaki, T. & Henley, E. J., Probabilistic evaluation of prime implicants and top-events for non-coherent systems, IEEE Trans. Reliability, R-29 (1980) 361-77.
5. Bossche, A., The top-event's failure frequency for non coherent multi-state fault trees, Microelectronics & Reliability, 24 (1984) 707-15.
6. Bossche, A., Top-frequency calculation of multi-state fault trees including interstate frequencies, Microelectronics & Reliability, 26 (1986) 481-2.
7. Jackson, P. S., On the s-importance of elements and prime implicants of non-coherent systems, IEEE Trans. Reliability, R-32 (1983) 21-5.
8. Birnbaum, Z. W., On the importance of different components in a multicomponent system, in Multivariate Analysis II, ed. P. R. Krishnaiah, Academic Press, New York, 1969, pp. 581-92.
9. Wolfram, S., Mathematica: A System for Doing Mathematics by Computer, Addison-Wesley, California, 1991.
10. Billera, L. J., On the composition and decomposition of clutters, J. Combinatorial Theory, 11 (1971) 234-45.
11. Chatterjee, P., Modularization of fault trees: a method to reduce the cost of the analysis. In: Reliability and Fault Tree Analysis, Society for Industrial and Applied Mathematics, Philadelphia, USA, 1975, pp. 101-26.