Reliability Engineering and System Safety 191 (2019) 106554
Procedures to model and solve probabilistic dynamic system problems

Rafael Raoni, Argimiro R. Secchi ⁎
Chemical Engineering Program-COPPE, Universidade Federal do Rio de Janeiro, Cidade Universitária, Centro de Tecnologia, 21941-914 Rio de Janeiro-RJ, Brazil
⁎ Corresponding author. E-mail addresses: [email protected] (R. Raoni), [email protected] (A.R. Secchi).

Keywords: Probabilistic Safety Assessment (PSA); Probabilistic Dynamic System Problem (PDSP); Monte Carlo simulation; Deterministic and stochastic modelling; Risk assessment

ABSTRACT
Probabilistic Safety Assessment (PSA), characterized by process-behaviour modelling and event-likelihood calculation, has great importance for quantitative risk evaluation. PSA presents some difficulties for implementation, mainly when the analysis of a dynamic process is required. In this work, a set of procedures to formulate and solve Probabilistic Dynamic System Problems (PDSPs) is presented. These procedures explain how events should be modelled and connected with each other to build a process model that makes it possible to answer two main questions: (i) What is the discrete probability of occurrence of a specific process event? And, given its occurrence, (ii) what is the distribution of the time to its occurrence? After answering these questions, the event-occurrence probability within a specific length of time, which is the main goal of PSA, is easily calculated. To explain the proposal, two PDSPs are solved: the pressure change in a vessel caused by the failure of two valves, and the change in the level of a holdup tank caused by the failure of two pumps and one valve.
1. Introduction

Risk assessment can be characterized as identifying (i) the possible accident scenarios, (ii) their consequences, and (iii) the likelihood of these scenarios [25]. After the identification of scenarios, a quantitative likelihood analysis normally requires the analysis of a dynamic system with stochastic behaviour, which can be modelled and solved by several different techniques [12]. Probabilistic Safety Assessment (PSA) has gained importance in several technological areas, strongly influencing the design and operation of complex systems [21]. Amendola (1981) [3], studying nuclear reactors, was the first to propose accident-sequence likelihood analysis using a dynamic system model [2]. Roughly described, PSA aims to estimate the probability of occurrence of undesired process events. For such estimation, two main evaluation classes may be identified: (i) static models and (ii) dynamic state-space models [10], which are mostly employed for probabilistic risk estimation and for reliability/profitability estimation, respectively. The Event Tree (ET) and the Fault Tree (FT) are techniques classified as static models that represent the scenarios or systems under study using, respectively, basic probability theory and Boolean algebra [19]. Despite not being suitable for dynamic system behaviour [25], FT is the most frequently used quantitative technique for accident-scenario likelihood assessment in industry [10, 17, 25]. To deal with dynamic systems, the Dynamic Fault Tree (DFT) has introduced dynamic gates to account for the dependence of occurrence times between events and has gained attention for solving safety-critical systems [14]. Currently, the procedure
is employed when deterministic and stochastic system behaviours are mixed in the problem [7, 16, 24], requiring great modelling effort [20]. Unlike static-model procedures, methodologies applied to dynamic reliability do not model systems or scenarios by a generic representation scheme [19]. Such methodologies use state-space evaluation, which allows the modelling of system transitions whose time-dependent behaviours are important [25]. Among several representations, state-transition diagrams, based on Markov models [10], have been popular for many decades. This representation employs diagrams of the evolution of the system's state transitions, with nodes denoting states and arrows denoting transitions, to formulate the state equations [19]. In this class of problem, time is an explicit variable, making it possible to cover time-dependent parameters, and the problem is solved analytically, making it possible to calculate the probabilities of rare event sequences [25]. Despite that, the impractical identification of states, transitions, and probabilities for large problems, the difficulty of dealing with complex systems, and the exponential explosion of states lead to unmanageable problems [12, 19, 25]. Furthermore, all of the techniques introduced generate models based on a description of the qualitative behaviour of the system, making them inadequate for some system behaviours [8]. As described by Devooght and Smidts (1996) [12], the Chapman–Kolmogorov equation can also be used to mix probabilistic and deterministic dynamic analyses to solve PSA problems. The state equations, which represent the temporal evolution of physical variables
and their links with reliability parameters [5], are solved analytically, and human and control errors can be included [12]. Despite the useful results, difficulties are faced when trying to solve high-dimensional problems based on realistic circumstances, making the application hard to use for probabilistic dynamic system problems (PDSPs) [18]. Furthermore, the Event Sequence Diagram (ESD) provides a framework that allows the combination of dynamic methodologies, such as Markov chains, with static procedures, such as FT [31], and it is supported by different mathematical formulations of its components [32]. The ESD provides an interesting framework because it allows problems to be modelled in different layers, which is useful when handling situations that range from static to dynamic considerations or that involve sparse knowledge [31]. To overcome some of the difficulties of the analytical resolution of PDSPs, probabilistic methods, such as the Monte Carlo (MC) procedure [20, 34], may be employed to solve the analytical problems [5, 18]. The application of MC to solve PDSPs gives the greatest freedom for problem modelling and resolution and does not require any specific representation of the system. Furthermore, MC is insensitive to the system complexity and dimension, allowing the use of any probability density function, non-constant failure rates, interactions between components, and so on. These characteristics make MC the best approach for solving realistic systems [9], limited only by the analyst's ability to model the possible behaviours of the system [25]. Normally, applying MC requires, besides the system model and state specifications, the system mission time (maximum history time), the definition of the system failure model, and the number of histories to be simulated in order to obtain a representative statistical result [20, 21]. Difficulties arise in MC when the system under study is very reliable, since a large number of system histories, and a consequently large computational effort, are needed to obtain meaningful results [2, 26]. However, biasing techniques may be applied to deal with the difficulty of solving very reliable systems [18, 21, 34]. Other probabilistic techniques have been developed and are in widespread use for PSA, such as Dynamic Reliability Block Diagrams (DRBDs), Bayesian networks, stochastic Petri nets, GO-FLOW, and DYLAM, among others [1, 19, 20, 22]. This large number of techniques highlights the wide range of possibilities for modelling and solving scenario-likelihood problems. As reported in NUREG-0492 (1981) [23], in the scenario-likelihood problem the greatest emphasis should be placed on assuring that the system model provides the most accurate representation of reality. In this sense, one difficulty in obtaining the most representative model is the set of restrictions imposed by the resolution procedures of the existing methods, such as the static pictures of FT and ET, the state-space explosion of DFT, the difficulty of solving the formulated high-dimensional problem for some realistic applications of the Chapman–Kolmogorov equation, the large number of MC histories needed for the resolution of very reliable systems, and so on. Thus, the relation between model and resolution is one of the most important features that must be analysed when using these existing techniques.

In this sense, procedures to solve PDSPs should be improved in order to consider, from a unique perspective, (i) the building of the model of the system's behaviour, (ii) the problem resolution, and (iii) the representativeness of the obtained results, which must meet the expectations of the solution. In the present work, a procedure to understand and model dynamic system behaviour is proposed with the aim of achieving better model building, problem resolution, and interpretation of results. To introduce the proposals, two main questions to be answered are highlighted:

(i) What is the probability of occurrence of a specific process event (disregarding any time dependence)?
(ii) Given that the event has occurred, what is the time distribution of its occurrence?

To answer these questions, the following procedures are introduced:

• Modelling a continuous dynamic process based on discrete events (state spaces) that embrace a set of deterministic behaviours limited by stochastic connections;
• Representing each event connection by a discrete random variable, which defines whether the connection will or will not occur, combined with a continuous random variable for its time dependence;
• Applying the deepness concept to identify the set of process behaviours grouped in the events, combined with logics, represented by gates, to describe the relationships between the identified events;
• Applying MC to allow the greatest modelling freedom and the probabilistic problem resolution, setting the absorbing events as the unique process feature that ends a simulated MC history.

In Section 2, procedures for event identification, system modelling, and problem resolution are introduced to obtain the answers to both of the abovementioned questions about the discrete probability and the time distribution of the events' occurrence. In Section 3, two case studies are modelled and solved to illustrate how the obtained results can be used to answer common PSA questions. In Section 4, the conclusions of the work are presented.

2. Procedures

2.1. Event characterization and its interactions

In this work, an event is defined as a state space in which some process behaviour is identified, making it possible to separate the system's deterministic behaviours from its stochastic behaviours. Because, in many domains, system dynamics may be described by deterministic behaviours punctuated by stochastic transitions [6, 8, 13, 19, 35], a dynamic deterministic behaviour does not need to be separated into different events. Therefore, all sequenced deterministic behaviours should be grouped into a unique state space limited by stochastic transitions. For these event limits, which are the points of connection between events, both discrete and continuous random variables must be considered. Continuous random variables, characterized by continuous probability distribution functions, should represent the stochastic time to the event occurrence; this time must be computed already considering that the event will occur, representing only the randomness of the time to occurrence. On the other hand, the discrete random variable is employed to forecast process paths, identifying whether some downstream event will or will not occur in the simulated history. In other words, the discrete random variables determine the process behaviours identified as "instantaneous decisions", determining, for example, whether a safeguard valve will or will not be activated when requested, while the continuous random variables determine the stochastic time dependency between the events. To link the identified events and build the target process model, logics may be employed to describe how these events interact with each other. Logics, such as the gates "and" and "or" employed in the FT or in the Event Sequence Diagram (ESD) [31, 32], should be understood as a feature that allows the modelling of the process behaviour based on interactions between events through their connections, including the possibility of managing a self-loop connection. Fig. 1 shows features for the identification of the events and their connections.

Fig. 1. Process event and connection features.
The discretization of a continuous probabilistic process, based on events that embrace deterministic behaviour limited by discrete stochastic connections, and the use of logics to build the interactions between the events, are highlighted. It is important to note that, to improve the quality of the model, the event identification must follow the process perspective instead of the component-status perspective. This means that a unique event may group different device conditions as long as the process remains within the same previously defined boundary, and the model of the process behaviour may then be built using a reduced number of states.

2.2. Process model

To build a process model that characterizes a sequence of events occurring in continuous time, definitions of (i) the initial and final boundaries of the process (initial and absorbing events), (ii) the intermediary events, and (iii) how such events are connected with each other are required. Since this modelling process is subjective, attention should be paid to its representativeness and accuracy.

2.2.1. Event connections and logics

During a process history, time computation starts at the initial event, in which the connections with its downstream events are analysed. Such connections must take into account both the instantaneous cause-consequence probability assumptions (how probable is it that the previous event will cause the consequent downstream event?), through discrete random variables, and the time to the event transition, which can be a continuous random variable. After that verification, the process may go to further events or remain in the initial event after a self-loop connection, and this decision can be made by the logic of an "or" gate, for example. With this proposal, it is possible to build any dynamic and stochastic system behaviour, which can also be supported by a graphical representation. Fig. 2 shows examples of graphical connections that use the "or" gate to decide between two downstream event possibilities and the "and" gate to verify whether a further event will occur based on two primary events.

2.2.2. The deepness concept

To provide more flexibility for modelling, the deepness concept is introduced. Deepness can be understood as the difference between using a unique state space and carrying out a deeper analysis that uses different events to represent some process behaviour. In this sense, the deepness concept highlights the level of detail of the process-behaviour description in the events, influencing the model size and complexity. For example, after understanding the process, the modeller might aim to build a detailed model, which requires representing it by numerous events and connections. On the other hand, depending on the problem scope, the modeller might aim to build a simplified model of the same process, needing a smaller number of events and connections. Despite that difference in model size, since the models refer to the same process, their events and connections should somehow be related. In this sense, the level of detail in the events and the continuous and discrete random variables in the connections between events may be employed in different manners aiming at the same goal, and allowing this possibility is the idea behind the deepness concept. Thus, no matter how detailed or superficial a model is, all models of the same process should somehow be related to each other. Given that a more detailed model leads to a more accurate result, it is expected that the available data and the scope of the problem should guide the modeller to decompose the process as much as possible in order to build the most detailed model. However, a problem arises when this detailed model starts to be unmanageable to build or solve, needing simplifications that sacrifice the accuracy of the result. In this sense, a procedure that allows the level of detail of the model to be simplified, for example by linking a non-detailed event to the rest of the model using discrete and/or continuous random variables in its connection, is a suitable approach to handle those situations. Finally, the introduced deepness concept can be understood as the capacity of grouping or ungrouping the deterministic events and connections present in the PDSP model, allowing changes in the level of detail of the events without losing accuracy in the problem result. In order to manage that change in the model deepness, the modeller must identify the state space from a process perspective, knowing beforehand the process behaviour to be bounded by the event, so that such grouping or ungrouping of process behaviours is possible, and knowing that, for this grouping, both continuous and discrete random variables must be considered in the connections. Fig. 3 shows a graphical example of how to manage discrete and continuous random variables to group or ungroup events, following the idea of deepness for PSA model building. With this possibility of managing the model, a process behaviour may be modelled in as much detail as wanted and then solved and simplified in order to compose a wider problem that, without that simplification, could be unmanageable to solve. In this sense, the deepness concept may be related to the level of detail obtained by the bottom-up approach for model building, where the system is first modelled superficially and then detailed until it reaches the desired deepness, or by the top-down approach, where system modelling starts with highly detailed models of subsystems, or high deepness, which are then joined to build the model of the entire system [5]. Finally, it is understood that there is no single right procedure to be followed during the modelling process, as long as the modeller uses the process perspective to identify the events and their connections.
Fig. 2. Examples of graphical representation of event connections.
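To make the event-and-connection representation of Figs. 1 and 2 concrete, a minimal Python sketch of one possible data structure is given below (the paper's own implementation is in MATLAB; the class and function names here are illustrative assumptions, not part of the proposal). Each connection carries the discrete occurrence probability and a sampler for the continuous time to occurrence, and the "or"/"and" gates are resolved by the two helper functions:

import random
from dataclasses import dataclass, field
from typing import Callable, List, Optional

@dataclass
class Connection:
    # Stochastic link between two events (illustrative field names).
    target: str                                          # downstream event reached if the connection occurs
    p_discrete: float = 1.0                              # discrete random variable: probability of occurrence
    time_sampler: Optional[Callable[[], float]] = None   # continuous random variable: time to occurrence

    def sample(self) -> Optional[float]:
        # Returns the sampled transition time, or None if the connection does not occur.
        if random.random() > self.p_discrete:
            return None
        return 0.0 if self.time_sampler is None else self.time_sampler()

@dataclass
class Event:
    # State space grouping a deterministic behaviour, limited by its stochastic connections.
    name: str
    absorbing: bool = False
    outgoing: List[Connection] = field(default_factory=list)

def or_gate(sampled):
    # "or" logic: among the outgoing connections that occurred, follow the earliest one.
    occurred = [(t, c) for c, t in sampled if t is not None]
    return min(occurred, key=lambda pair: pair[0]) if occurred else None

def and_gate(inlet_times):
    # "and" logic: the downstream event activates only after all inlet events have occurred.
    return max(inlet_times) if all(t is not None for t in inlet_times) else None

An "or" gate thus selects the earliest occurring outgoing connection, while an "and" gate waits for the latest of its inlet events, mirroring the two examples of Fig. 2.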
Fig. 3. Example of how to simplify a model by using the deepness concept.
2.2.3. Proposed steps for model building

The system model can be understood as a space in which deterministic behaviours are connected by stochastic connections. The proposed model resembles the state space of a stochastic hybrid system (SHS), which comprises stochastic transitions between dynamic behaviours [13]. However, differently from a system model comprising a unique deterministic model connected with another stochastic model [11], the proposal makes it possible to build different deterministic models, with behaviours dependent on their stochastic inputs, and then to evaluate the variability of their behaviours not only due to the variability of the initial system condition [16, 26] but also due to the system's intrinsic stochasticity. For example, when a process has a modelled deterministic behaviour that forms the initial event of the problem, and a second modelled deterministic behaviour that represents the process behaviour after some device failure, the PDSP may be formed by these two events connected by a stochastic connection given by the probability of the device failure. In this case, each of the events must be represented by one deterministic model. Furthermore, if these are dynamic models with time-dependent process variables (e.g., pressure(time) and temperature(time)), the failure time of the device in the first event will determine the process-variable value that feeds the second event of the PDSP, making it possible to evaluate the system's intrinsic stochasticity. Thus, what is proposed is that the modeller should know the deterministic behaviours of interest in the PDSP and how they are linked with each other in order to build the entire PDSP model. To manage such PDSP modelling, Fig. 4 presents the steps that should be followed. It is important to highlight that, differently from most applications of state-space representation, the proposal emphasizes the importance of building the state space from the perspective of process conditions rather than device conditions. The use of the deepness concept, by grouping events given the discrete and continuous random variables at their connections, makes it possible to avoid a state space based on device conditions, and consequently the problem of state-space explosion. Thus, the importance of building a unique event to represent a process condition (e.g., maintaining the tank level in an acceptable range) that may occur even with different conditions (normal or failure) of the process devices is highlighted.

2.3. Problem solving

The main goal of the problem resolution is to obtain the probability of occurrence of all events present in the model, considering that these probabilities can be functions of time. Developing an analytical method to solve the formulated models is difficult, and therefore a probabilistic resolution approach, such as the MC procedure, is the most suitable option to obtain the desired result. Thus, the problem resolution must be implemented in software to automate the calculation of the deterministic models in the events, to calculate the parameters of the event connections in order to decide the path of the simulated process histories, and to carry out the MC iterations that simulate all the required process histories to obtain the probabilistic result of the PDSP. Since deterministic models may be solved in different manners, the modeller should define the way they are solved. According to Raoni et al. (2018) [33], the deviations of the physical variables have dependencies among each other and are caused by an initial device failure. With this reasoning, the deterministic behaviour must be solved given the initial condition of the problem, which may be considered the normal operating condition, and, as it evolves in time, devices may randomly fail, according to discrete or continuous-time random variables, changing the history of the process, which may then reach a downstream event. In this sense, the problem resolution starts with the resolution of the deterministic process behaviour and, as it reaches the gates, the probabilistic random variables are verified, identifying the process paths. In the event connections, to obtain the time as a continuous random variable, a non-tendentious random number between 0 and 1 is drawn and the event occurrence time is calculated from its continuous probability distribution. To obtain the occurrence of the connection following its discrete random variable, another non-tendentious random number between 0 and 1 is drawn to identify whether or not the downstream event will occur. Furthermore, the discrete random connections, or the parameters of the continuous probability distributions, may sometimes vary through time, being dependent on the elapsed time of the process history. As an example, the probability of a pump failing to start may change through time following the bathtub curve and, in order to calculate the discrete probability of this occurrence at some specific time, the elapsed time is required as an input, together with a continuous probability distribution function that represents this time dependence of the discrete probability of the pump start failure.
Fig. 4. Steps for model building.
Fig. 5. Application of the continuous probability distribution to solve continuous and discrete random events.
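Ahead of the discussion below, the two uses of a continuous distribution contrasted in Fig. 5 can be sketched as follows: sampling the random time to an occurrence by inverting the exponential CDF, and letting the same CDF describe how a discrete occurrence probability grows with the elapsed history time. This is an illustrative Python sketch; the exponential form and the function names are assumptions of the example:

import math
import random

def sample_time_to_occurrence(rate):
    # Continuous random variable: the time to the event occurrence, obtained by drawing a
    # non-tendentious number in (0, 1) and inverting the exponential CDF F(t) = 1 - exp(-rate*t).
    u = random.random()
    return -math.log(1.0 - u) / rate

def discrete_probability_at(elapsed_time, rate):
    # Time-dependent discrete random variable: the probability that the connection occurs,
    # described here by the value of the exponential cumulative distribution at the elapsed time.
    return 1.0 - math.exp(-rate * elapsed_time)

def connection_occurs(elapsed_time, rate):
    # "Instantaneous decision": compare another random number with the discrete probability.
    return random.random() < discrete_probability_at(elapsed_time, rate)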
The difference between employing a continuous probability distribution to obtain the time as a continuous random variable and employing it to describe the influence of time on a problem parameter, such as a discrete random connection, is shown in Fig. 5, in which the exponential distribution was used. It is important to note that, just as exemplified by the use of a continuous probabilistic function to describe the variation of the discrete random variable through time, any other process condition can be considered an influencer of the problem parameters, by using a continuous probabilistic function or by managing a discretization of the influenced parameter [34]. These process conditions might include process physical variables treated as continuous random variables of the problem, able to influence the occurrence of the problem events and their resolution. With that consideration, the stochastic part of the problem would have more parameters, but the problem would still be manageable. Furthermore, unlike the usual procedure for solving PDSPs by MC, the proposed procedure does not end the simulated history based on a pre-defined mission time. It is proposed that the unique way to end a simulated history is the occurrence of an absorbing event. For example, the simulation starts with the system at its initial event, obtaining its entire deterministic behaviour and identifying which of its outlets will occur given the discrete probabilities. Knowing the connections that will occur, the problem resolution is continued by solving the downstream events. If these downstream events depend on more than one previously occurred event, their inlet logics should be analysed to evaluate whether and when the event activation will occur, so that the history simulation can continue. The same procedure already described for the first event should be repeated for the activated ones, computing the total elapsed time and identifying the events that have occurred. Continuing this procedure until no more events can be solved, which means that absorbing events have occurred, finishes one process simulation, or history. To obtain the probabilistic number of occurrences and the times to occurrence of all modelled events, the procedure should be repeated as many times as necessary to obtain an accurate result. Fig. 6 shows the flow chart that explains how to solve a model built by the proposed procedure; the inner loop represents the simulation of the modelled process behaviour, while the outer loop represents the MC iterations. After solving the problem, the most important information to be obtained is the number of activations and the corresponding times of all activations of all events. As probabilistic resolutions demand high computational effort, not solving the same computationally expensive deterministic problem repeatedly is important to save computational work. In this sense, after the first resolution of an event's deterministic behaviour, the results may be further obtained by using a table of data, a simplified mathematical model, or process-control transfer functions [8].
However, if the deterministic process behaviour in an event is too sensitive to the previously elapsed time of the process and to its initial conditions, the study should be done with repetitive resolution of the original deterministic model, together with the stochastic behaviour, to ensure the highest accuracy of the results.
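A compact, hedged sketch of the resolution scheme of Fig. 6 is given below: the inner loop walks one history through the events until an absorbing event is reached, and the outer loop repeats the MC iterations, collecting the occurrence times of every event. The dictionary-based model structure is an assumption of this illustration, not the authors' MATLAB implementation:

import math
import random
from collections import defaultdict

def simulate_history(model, initial_event):
    # Inner loop of Fig. 6: walk one history through the events until an absorbing event occurs.
    occurrences, elapsed, current = {}, 0.0, initial_event
    occurrences[current] = elapsed
    while not model[current]["absorbing"]:
        candidates = []
        for target, p_discrete, time_sampler in model[current]["connections"]:
            if random.random() < p_discrete:                   # discrete part: does the connection occur?
                candidates.append((time_sampler(), target))    # continuous part: time to occurrence
        if not candidates:                                     # no outgoing connection occurred
            break
        dt, current = min(candidates)                          # "or" gate: earliest transition wins
        elapsed += dt
        occurrences[current] = elapsed
    return occurrences

def run_monte_carlo(model, initial_event, n_histories):
    # Outer loop of Fig. 6: repeat the histories, collecting the occurrence times of every event.
    times = defaultdict(list)
    for _ in range(n_histories):
        for event, t in simulate_history(model, initial_event).items():
            times[event].append(t)
    return times

# Illustrative two-event model: E1 goes to the absorbing event E2 after an exponential time.
model = {"E1": {"absorbing": False,
                "connections": [("E2", 1.0, lambda: -math.log(1.0 - random.random()) / 0.1)]},
         "E2": {"absorbing": True, "connections": []}}
occurrence_times = run_monte_carlo(model, "E1", 1000)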
2.4. Interpretation of results

The proposal mixes different existing probabilistic system-analysis procedures, such as the logics and discrete probability data employed in FT and ET, and the state space and continuous random time employed in reliability problems. The main objective is to make it possible to answer the two posed questions about (i) the probability and (ii) the time to event occurrence. Analysing the i-th event, its probability of occurrence can be estimated by Eq. (1):

P_{e_i} = N_{e_i} / N_{MC}    (1)

where P_{e_i} is the probability of occurrence of event i, N_{MC} is the number of process simulations (histories) in the MC procedure, and N_{e_i} is the number of times that event i has occurred, considering all N_{MC} histories. Given that an analysed event has occurred, the elapsed times obtained for its occurrences can be used to identify a cumulative probability distribution function of its time to occurrence. As shown in Eq. (2), by cumulatively counting the number of event occurrences in time intervals, a cumulative distribution equation that best fits these data can be identified:

CDF(t_j)_i = N_{e_i}|_{t=0}^{t_j} / N_{e_i}    (2)

where CDF(t_j)_i is the cumulative distribution value of event i, which defines the probability of occurrence of event i between time 0 and time t_j given that the event occurs, and N_{e_i}|_{t=0}^{t_j} is the number of occurrences of event i between times 0 and t_j. It is important to note that the continuous probability function for the event time to occurrence, obtained using Eq. (2), does not depend on the probability of occurrence of the event, obtained using Eq. (1). Using these two independent results, the probability of occurrence of the event within a specified time is obtained using Eq. (3):

P_e(t_j)_i = N_{e_i}|_{t=0}^{t_j} / N_{MC} = P_{e_i} · CDF(t_j)_i    (3)

where P_e(t_j)_i is the probability of occurrence of event i in the interval t = [0, t_j].
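The estimators of Eqs. (1)-(3) can be computed directly from the occurrence times collected over the MC histories, as in the minimal sketch below (illustrative Python; the variable names are assumptions):

import numpy as np

def event_statistics(occurrence_times, n_mc, t_grid):
    # Eq. (1): P_ei = N_ei / N_MC;  Eq. (2): empirical CDF of the time to occurrence;
    # Eq. (3): P_e(t_j)_i = P_ei * CDF(t_j)_i.
    times = np.sort(np.asarray(occurrence_times, dtype=float))
    n_ei = times.size
    p_ei = n_ei / n_mc                                         # Eq. (1)
    cdf = np.searchsorted(times, t_grid, side="right") / n_ei  # Eq. (2)
    p_e_t = p_ei * cdf                                         # Eq. (3)
    return p_ei, cdf, p_e_t

# Example: an event that occurred in 3 of 10 histories, at 2, 5 and 9 time units.
p_ei, cdf, p_e_t = event_statistics([2.0, 5.0, 9.0], n_mc=10, t_grid=np.array([1.0, 6.0, 10.0]))
# p_ei = 0.3; cdf = [0, 2/3, 1]; p_e_t = [0, 0.2, 0.3]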
2.5. Further comments

The constant development of new techniques to solve PDSPs shows the importance of the problem and the desire to improve its resolution. When dealing with dynamic behaviours, methods such as FT and ET are incipient; when dealing with arbitrary kinds of data, the Markov method presents some restrictions; and when dealing simultaneously with deterministic and stochastic behaviours, the analytical resolution of the Chapman–Kolmogorov equation is limited and can be unmanageable for some systems. The ESD framework is a good approach for mixing different procedures. In terms of the deepness concept, marrying ET and Markov chains within the ESD provides a way to change the layer of the model, that is, a way to change its level of detail. What is different here is that the proposed methodology does not mix procedures to manage this change of layer, because different procedures have different drawbacks for the problem resolution. These drawbacks of the existing methods, and the need for better interaction between model building and resolution, pose the main difficulties in solving PDSPs to which the current work proposes a solution. To overcome them, the joint use of discrete and continuous random variables in the connections, which enables the layer transitions, and the use of the proposed method to solve the formulated problem are proposed. Furthermore, since existing methods remain suitable for some kinds of problems, the proposal also allows only some of the procedures to be employed if a simplified approach is desired. As an example, employing the proposed procedures to solve a PDSP with only discrete random variables as the stochastic contribution of the problem, and without deterministic process behaviour in the events, already makes it possible to obtain the probability of occurrence of the modelled events, which is an adequate result for some specific scopes. In this sense, the proposals aim to present procedures that help with PSA problem modelling and resolution and that are flexible, in the sense that they may or may not all be used together.
Fig. 6. Procedure for solving the PDSP.

Fig. 7. Pressurized vessel process.

Table 1. Parameters of the pressurized vessel problem.

Parameter | Value
P0 (Pa) | 10
Ph (Pa) | 20
Pc (Pa) | 30
Number of MC histories | 10,000
λ (V2 failure rate, failures per hour) | 4 × 10⁻³
ρ (probability of V1 failing to open) | 0.05
Parameter of normal vessel pressurization (α) | 0.05
Parameter of V1 pressurization (α) | −0.10
Parameter of V2 pressurization (α) | 0.06
It is also important to highlight why the occurrence of an absorbing event should be the unique way to finish an MC history. This is important for the proposed procedure because the aim is to obtain the time at which the absorbing event occurs, no matter how long, in the simulated history time, it takes. This means that every MC history yields at least one time sample of the occurrence of an absorbing event. If the problem has one absorbing event and 100 MC histories are simulated, the result will contain 100 time values of the absorbing-event occurrence. Obtaining these time samples makes it possible to plot and fit a probabilistic function (just as exemplified in Figs. 10 and 15 of the present paper) that allows the probability of the absorbing-event occurrence in any desired length of time to be calculated, which is a practical approach to obtain the probability of occurrence of rare events. However, a PDSP may also have more than one absorbing event, and in that case the discrete probabilities in the transitions between the modelled events define which absorbing event terminates each MC history. In this sense, in a problem with several absorbing events, the number of time samples of one absorbing event follows its discrete probability of occurrence relative to the total number of MC histories, which also groups the occurrences of the other absorbing events of the problem. Thus, the discrete probability of occurrence of the events is the unique feature that contributes to the non-occurrence of absorbing events and, because of that, it is the modelled process behaviour that may change the number of MC histories required to obtain a representative probabilistic result that allows a time probabilistic function to be fitted to the time samples of occurrence of the absorbing events.

Furthermore, with the discrete probability of occurrence and the probabilistic time to the event occurrence available as separate results, Eq. (3) can be used to obtain the probability of occurrence of the event in any time interval, reducing the number of MC histories required for the calculation of the probability of rare events. As an example, consider that, after the PDSP resolution, the obtained discrete probability of occurrence of one absorbing event is P_{e_i} = 0.5 and an exponential probability distribution with λ = 10⁻⁵ failures per year represents its probabilistic time to occurrence; according to Eq. (3), the probability of occurrence of the event in one year is P_e(1 year)_i = 0.5 × (1 − e^(−10⁻⁵ × 1)) ≈ 5 × 10⁻⁶. Considering that the proposed procedure was used with 10⁵ MC histories, since only P_{e_i} = 0.5 influenced the occurrence of this specific event, half of the MC histories (5 × 10⁴) ended in the discussed event, which gave enough data to obtain the parameter (λ = 10⁻⁵ failures per year) that fits the exponential probability distribution function of the probabilistic time to the event occurrence. However, when employing an MC resolution with a mission time equal to one year, which is the protocol usually applied to this kind of problem, no occurrence of this absorbing event would be expected in all 10⁵ MC histories. In the end, the major proposed steps are: (i) identifying events characterized by deterministic behaviours and limited by stochastic connections; (ii) connecting the events, employing both discrete and continuous random variables, logics, and the deepness concept; and (iii) solving the model by the MC procedure to obtain the probability of occurrence and the probabilistic distribution curve of the time to occurrence of all events.

3. Case studies

In this section, two benchmark reliability problems are presented. The first problem is related to a pressurized vessel, in which the failure of two valves that contribute to a change in the vessel pressure is investigated. The second problem investigates the dynamic behaviour of the level of a holdup tank perturbed by the failure of two pumps and one valve.

3.1. Pressurized vessel

3.1.1. Problem description

The pressurized vessel problem presented in Labeau et al. (2000) [19] was modelled and solved. The vessel is continuously pressurized from its initial pressure condition (P0) until it reaches the high-pressure condition (Ph), when valve V1 must open to relieve the vessel pressure. Valve V1 has a probability of failure on demand equal to ρ, and its failure leads to a continuous elevation of the vessel pressure until it reaches the critical pressure condition (Pc). The system also contains a second valve (V2), which can eventually open inadvertently, contributing to the elevation of the vessel pressure. Failure of V2 occurs only when the vessel pressure is between the initial and the high-pressure conditions (P0 < P < Ph). When valve V2 is open and the vessel pressure reaches Ph, even with V1 open, the pressure rises until the critical condition (Pc). With V1 open (and V2 closed), the vessel pressure is reduced until it reaches its initial condition (P0), when V1 is closed without failure. Then, the unique absorbing event of the problem is the vessel reaching its critical pressure, which can occur if V1 fails to open; if V2 opens inadvertently; or if V2 opens inadvertently and V1 subsequently fails to open. Fig. 7 presents the described process.

Fig. 8. Graphic representation of the pressurized vessel problem.
Table 2. Descriptions of the discrete and continuous random variables of the pressurized vessel problem model.

Connection | Continuous variable dependence | Discrete variable dependence
C1 | Time (λ), random | 1**
C2 | Time, deterministic* | ρ
C3 | Time, deterministic* | 1
C4 | Time, deterministic* | (1 − ρ)
C5 | Time, deterministic* | ρ
C6 | Time, deterministic* | 1

* Time and pressure as deterministic values.
** Probability equal to one, i.e., no random discrete occurrence.
3.1.2. Problem modelling

Following Fig. 4, the identification of the desired result and of the problem data is required first. From the description in Section 3.1.1, the aim is to analyse the probability of reaching the critical pressure in the vessel (the unique absorbing event), which requires knowing the deterministic behaviours that identify the different pressurization conditions in the vessel and the probabilities of the valve failures that make the process change between those conditions. The deterministic model for the pressurization of the vessel is shown in Eq. (4), and the parameters for the normal pressurization and for the pressurization-condition changes due to the failures of valves V1 and V2, as well as the other parameters of the problem, are shown in Table 1.
P(t) = P_i · e^{α t}    (4)
where P(t) is the time-dependent pressure in the vessel, P_i is the pressure in the vessel at the moment of the change in the pressurization behaviour, α is the pressurization parameter of the corresponding condition (normal, V1 open, or V2 open), and t is the time. To continue the problem modelling, the identification of the events is required. The identified events are: (E1) process in the normal condition; (E2) V1 fails to open; (E3) V2 opens inadvertently; (E4) V2 opens inadvertently and V1 subsequently fails to open; and (E5) the critical pressure is reached in the vessel. The system's start point was considered to be the vessel being pressurized from P0, and the initial event E1 was taken as the state space in which no valve failure has occurred, including the deterministic behaviours that describe the oscillation of the vessel pressure in its normal condition (P0 < P < Ph) and also the two self-loops that occur when V1 and V2 do not fail. Combining the stochastic behaviour of the valve failures with the logic of an "or" gate, the system may remain in the initial event or reach some failure event (E2 or E3) that leads to the absorbing event E5. With failure of V2 (E3) followed by failure of V1 to open when the vessel pressure reaches Ph, the system reaches E4. Such a system model embraces one probabilistic discrete event (failure on demand of V1) that may occur in two events (E1 and E3), one continuous random variable (time to failure of V2), and deterministic behaviours that identify the different pressurization conditions given the valve statuses. Furthermore, it is important to note that E1 groups two deterministic behaviours considered as normal process conditions: (i) when the pressure is rising and (ii) when the pressure is decreasing. These two deterministic behaviours could be separated into two events but, to simplify the modelling, both were located in a unique event. A graphic representation of the described model is shown in Fig. 8, and the descriptions of the discrete and continuous random variables of the model connections are listed in Table 2.

3.1.3. Problem resolution

The system was modelled and solved in MATLAB®, assuming that particles walk through the state spaces according to the MC procedure that follows the steps described in Fig. 6. Fig. 9 shows examples of the process behaviours, passing through E2, E3, and E4, which lead to the critical pressure (E5); the marked points in Fig. 9(a)-(c) represent the moment at which V2 fails (square points in Fig. 9(b) and (c)), the moment at which V1 fails (circle points in Fig. 9(a) and (c)), and the moment at which V1 does not fail after the failure of V2 (triangle point in Fig. 9(b)). Fig. 10 shows the cumulative probability distribution curves of the time to occurrence of all four events, and Table 3 shows the discrete probabilistic results, which should be used as the new connection parameters that assemble all the problem information according to the modelled process behaviours.
Fig. 9. Possible process behaviours of the pressurized vessel: (a) failure of V1; (b) failure of V2; (c) failures of V2 and V1.
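A hedged sketch of how one pressurization cycle of the model in Fig. 8 can be resolved is given below, using the Table 1 parameters: inside E1, the deterministic time for the pressure to rise from P0 to Ph (Eq. (4)) races against the sampled opening time of V2, and the demand failure of V1 is drawn as a discrete random variable. It is an illustration only (the authors used MATLAB); in particular, ignoring V2 openings during the relief phase and keeping the normal pressurization slope after a V1 demand failure are simplifying assumptions of this sketch:

import math
import random

# Table 1 parameters
P0, PH, PC = 10.0, 20.0, 30.0
ALPHA_NORMAL, ALPHA_V1, ALPHA_V2 = 0.05, -0.10, 0.06
LAMBDA_V2 = 4e-3      # V2 failure rate (per hour)
RHO_V1 = 0.05         # probability that V1 fails to open on demand

def time_to_reach(p_start, p_target, alpha):
    # Eq. (4) inverted: time for the pressure to move from p_start to p_target at parameter alpha.
    return math.log(p_target / p_start) / alpha

def one_pressurization_cycle(p_start=P0):
    # Race inside E1: does V2 open before the pressure reaches Ph?
    t_to_ph = time_to_reach(p_start, PH, ALPHA_NORMAL)
    t_v2 = -math.log(1.0 - random.random()) / LAMBDA_V2      # sampled time to an inadvertent V2 opening
    if t_v2 < t_to_ph:
        # E3 (or E4): V2 is open and the pressure rises to Pc; V1 is demanded when Ph is crossed.
        p_at_v2 = p_start * math.exp(ALPHA_NORMAL * t_v2)
        t_to_pc = time_to_reach(p_at_v2, PC, ALPHA_V2)
        event = "E4" if random.random() < RHO_V1 else "E3"    # discrete demand failure of V1
        return event, t_v2 + t_to_pc                          # both paths end in E5 (critical pressure)
    if random.random() < RHO_V1:
        # E2: V1 fails to open at Ph; pressure keeps rising to Pc (normal slope assumed here).
        return "E2", t_to_ph + time_to_reach(PH, PC, ALPHA_NORMAL)
    # V1 opens correctly: the pressure is relieved back to P0 and the cycle restarts (self-loop of E1).
    return "E1", t_to_ph + time_to_reach(PH, P0, ALPHA_V1)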
3.1.4. Problem observations and discussion

The graphic representation (Fig. 8) makes it possible to visualize the possible paths of the MC histories, helping to understand how the probabilistic information of the events E2, E3, E4, and E5 is calculated.
Fig. 10. Cumulative distribution functions of the pressurized vessel: (a) E2: V1 failure event; (b) E3: V2 failure event; (c) E4: V2 and V1 failure event; (d) E5: critical vessel pressure.

Table 3. Discrete probabilistic results (MC histories = 10,000).

Event | Number of occurrences | Probability (%)
(E2) Failure of V1* | 3001 | 30.01
(E3) Failure of V2 | 6999 | 69.99
(E4) Failure of V2 and V1 | 177 | 2.53**
(E5) Critical pressure | 10,000 | 100.00

* Number of failures of V1 given the previous non-occurrence of failure of V2.
** Probability of failure of V1 given the occurrence of failure of V2.
Exemplifying deepness, all the simulated behaviours of the process may be summarized by the cumulative distribution function obtained from Fig. 10(d). In this sense, if the problem goal were a further PSA aiming to obtain the consequent probability of explosion of the vessel, which might be followed by the death of an operator for example, the cumulative distribution function obtained from Fig. 10(d) should be employed in the state space presented in Fig. 11. As the critical pressure does not depend on a discrete probability of occurrence, the connection C1 is represented only by the cumulative distribution function obtained from Fig. 10(d), which provides suitable accuracy to represent all the previous process behaviours in the analysed problem. Furthermore, the other connections of the state space presented in Fig. 11 may be estimated or even obtained by other, deeper analyses, just as the one used to obtain the specification of C1. This example shows how a wider problem can be divided into small and manageable deeper pieces that lead to suitable results to be used in the further resolution of the wider and less deep problem. In Labeau et al. (2000) [19], the pressurized vessel problem was modelled by different representation schemes (state graph, Petri net, event sequence diagram, DFM, and GO-FLOW). On comparing the results, a more intuitive method for model building and a better representation of the real system behaviour may be highlighted. Such features are mainly due to the modelling freedom, which is enabled by the proposed combination of problem modelling and resolution.
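The simplification just described, in which the whole detailed vessel model is replaced by a single connection whose continuous random variable is the distribution of Fig. 10(d), can be sketched as follows (illustrative Python; the listed times are placeholders, not results of the paper): the recorded E5 occurrence times give an empirical CDF, and inverse-transform sampling of that CDF plays the role of connection C1 in the model of Fig. 11.

import numpy as np

def empirical_cdf(times_e5):
    # Empirical CDF of the recorded times to reach the critical pressure (event E5).
    t = np.sort(np.asarray(times_e5, dtype=float))
    F = np.arange(1, t.size + 1) / t.size
    return t, F

def sample_c1_time(t, F):
    # Inverse-transform sampling of the empirical CDF: this plays the role of the
    # continuous random variable of connection C1 in the simplified model of Fig. 11.
    u = np.random.random()
    return float(np.interp(u, F, t))

times_e5 = [100.0, 150.0, 180.0, 240.0, 300.0]   # placeholder values; in practice, the MC results
t, F = empirical_cdf(times_e5)
t_critical = sample_c1_time(t, F)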
Fig. 11. Further PSA problem with critical pressure as one of its events.
Fig. 12. Holdup tank process [27].

Table 4. Operational logics of the holdup tank process.

Unit 1 (pump 1) | Unit 2 (pump 2) | Unit 3 (valve) | Action
Level > lsp → Level < lsp | Level > lsp → Level < lsp | Level < hsp → Level > hsp | On/open
Level < hsp → Level > hsp | Level < hsp → Level > hsp | Level > lsp → Level < lsp | Off/close
Table 5. Parameters of the holdup tank problem.

Parameter | Value
Initial level (m) | 5.5 (filling)
H (m) | 10
hsp (m) | 8
lsp (m) | 3
L (m) | 1
Number of MC histories | 50,000
Change of the tank level due to unit 1 operation, Pump 1 (m/h) | 0.40
Change of the tank level due to unit 2 operation, Pump 2 (m/h) | 0.20
Change of the tank level due to unit 3 operation, Valve (m/h) | −0.40
λ (Pump 1 failure rate, failures per hour) | 1.67 × 10⁻³
ρ (Pump 1 discrete failure, %) | 0.03
λ (Pump 2 failure rate, failures per hour) | 1.67 × 10⁻³
ρ (Pump 2 discrete failure, %) | 0.03
λ (Valve failure rate, failures per hour) | 3.33 × 10⁻³
ρ (Valve discrete failure, %) | 0.1
Fig. 13. Graphic representation of the holdup tank problem.
The problem resolutions presented in Labeau et al. (2000) [19] start with the mathematical modelling of the problem by the Chapman–Kolmogorov equation in order to describe the analytical and probabilistic resolution procedures. In the present example, the deterministic behaviours are expressed by phenomenological equations, the stochastic behaviours are located in the event connections, and the problem resolution follows an MC procedure that simulates a particle walking through the events. Given that, no further mathematical effort was needed to obtain the probability of occurrence and the probabilistic distribution curve of the time to occurrence of all events.

3.2. Holdup tank
3.2.1. Problem description

The holdup tank is a widely studied benchmark problem in the literature and is illustrated in Fig. 12 [16, 25, 27-30]. The process consists of a tank whose level changes due to the operation of two inlet pumps (units 1 and 2) and one outlet valve (unit 3), and the possibility of the tank overflowing or drying out due to failures of the units is investigated.
Fig. 14. Holdup tank problem behaviours: (a) dried out; (b) stable level; (c) overflow.
The operational logics of the units are shown in Table 4; for example, when the level crosses from level > lsp to level < lsp, pumps 1 and 2 turn on and the valve closes. Without failure of the units, the tank level varies between hsp and lsp. When the tank level reaches L, the tank is in the dried-out condition, and when it reaches H, the tank is in the overflow condition.
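The operational logics of Table 4 amount to a dead-band controller on the tank level, as in the short sketch below (threshold values from Table 5; the function and dictionary keys are illustrative assumptions):

def unit_commands(level, lsp=3.0, hsp=8.0):
    # Operational logics of Table 4: commands demanded from the units when the level
    # crosses a set point; between lsp and hsp the previous commands are kept by the caller.
    if level < lsp:          # level has just crossed below lsp
        return {"pump1": "on", "pump2": "on", "valve": "closed"}
    if level > hsp:          # level has just crossed above hsp
        return {"pump1": "off", "pump2": "off", "valve": "open"}
    return None              # inside the dead band: no new command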
3.2.2. Problem modelling

The aim of the problem is to obtain the probabilistic data of the final conditions of the tank level, and the data needed are the model of the dynamic deterministic behaviours of the process, the initial tank condition, the parameters of the logics, and the probabilistic failure data of the units. The failures of the units can be discrete random failures, which occur when their operational logics are not followed, or can follow a failure rate described by an exponential probability distribution function, which makes the valve remain in its last position (open or closed) and turns off the pumps. All these data are shown in Table 5. Instead of building the detailed state-space model that identifies all possible unit statuses and the consequent tank-level conditions, a simplified model was built. For that, the normal-operation event (E1) groups the oscillation of the level between hsp and lsp, which is the consequence of correct operation of the devices, and also the conditions in which some failure has occurred but the tank has not yet reached one of the absorbing events. The identified absorbing events are the drying out (E2), the overflow (E4), and the stable level (E3), which occurs when all devices may be in a failed position with a stable tank level between lsp and hsp. To make the process go through the possible deterministic behaviours determined by the status of the devices, the discrete and continuous random variables were employed for the failures of the devices, together with the logic gates. In this case study, all the described deterministic and stochastic process behaviours were grouped in E1, and the probabilistic parameters of the connections (C1, C2, and C3) between the identified events (E2, E3, and E4) were obtained as the problem resolution, since the parameters of these connections are equal to the probabilistic results of occurrence of the absorbing events, which are the main result of the problem. A graphical representation of the three absorbing states is shown in Fig. 13.
3.2.3. Problem solving

Assuming that particles walk through the described process behaviours, in an MC procedure that follows the steps described in Fig. 6, the system was modelled and solved in MATLAB® software. Fig. 14 shows the process behaviours that lead to the absorbing events (E2, E3, and E4), Fig. 15 shows the cumulative probability distribution curves of the time to occurrence, and Table 6 shows the discrete probabilistic results obtained.
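For reference, the deterministic and stochastic ingredients of the holdup tank model solved above can be sketched as follows (rates and failure data from Table 5; the function names are assumptions of this illustration): the level is piecewise linear in time with a slope given by the current unit states, while the stochastic part enters through the exponentially sampled failure times, a failed pump being treated as off and a failed valve as stuck in its last position, as in the problem description.

import math
import random

RATES = {"pump1": 0.40, "pump2": 0.20, "valve": -0.40}            # level change per unit (m/h, Table 5)
LAMBDAS = {"pump1": 1.67e-3, "pump2": 1.67e-3, "valve": 3.33e-3}  # failure rates (per hour, Table 5)

def sample_failure_times():
    # Continuous random variables: the time at which each unit gets stuck
    # (a failed pump turns off, a failed valve keeps its last position).
    return {unit: -math.log(1.0 - random.random()) / lam for unit, lam in LAMBDAS.items()}

def level_rate(states):
    # Deterministic behaviour inside E1: net level change for the current unit states.
    rate = 0.0
    if states["pump1"] == "on":
        rate += RATES["pump1"]
    if states["pump2"] == "on":
        rate += RATES["pump2"]
    if states["valve"] == "open":
        rate += RATES["valve"]
    return rate

def time_to_next_threshold(level, rate, lsp=3.0, hsp=8.0):
    # Time until the level next crosses a control threshold (the level is piecewise linear in time).
    if rate > 0.0:
        return (hsp - level) / rate
    if rate < 0.0:
        return (level - lsp) / (-rate)
    return float("inf")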
Fig. 15. Cumulative distribution functions of holdup tank: (a) E2: dried out; (b) E3: stable level; (c) E4: overflow.
Table 6. Discrete probabilistic results (MC histories = 50,000).

Event | Number of occurrences | Probability (%)
(E2) Dried out | 8849 | 17.70
(E3) Stable level | 40,180 | 80.36
(E4) Overflow | 971 | 1.94

3.2.4. Problem observations and discussion

The graphical representation of the problem (Fig. 13) does not represent all of the process details but focuses on the stochastic system behaviours that lead to the absorbing events. Furthermore, because of the behaviours grouped in E1, the connections C1, C2, and C3 were characterized by both a discrete probability and a time dependence, and their probabilistic results are shown in Table 6 and Fig. 15 since, given the formulated problem, the probabilities of the events and of their connections with E1 are the same. This case study has been used to exemplify the application of different methodologies for dynamic-system PSA [25], to improve ET/FT when considering dynamic systems [27], and to exemplify the application of the Chapman–Kolmogorov equation [16, 25] and of the MC method [16, 21]. The capability to handle complex problems is exemplified in this case study and, as highlighted in the previous case study, the procedure was employed to facilitate the model building, to simplify the problem resolution, and to obtain better results that may feed further studies with accurate and simplified data, following the idea presented by the deepness concept.

4. Conclusion

A procedure that deals with the drawbacks of methods for solving probabilistic dynamic system problems that consider dynamic deterministic behaviours was introduced. The procedure is capable of simplifying the building of the system model, combining it with an adequate probabilistic resolution. Due to the complex interactions among events and the level of detail required by some system models, the proposed way of identifying deterministic behaviours separated by stochastic connections, combined with the event deepness to characterize the model representativeness, introduces a new way to solve PDSPs. Furthermore, use of the procedure allows the following questions to be answered:

(i) What is the probability of occurrence of a specific process event (disregarding any time dependence)?
(ii) Given that the event has occurred, what is the time distribution of its occurrence?
Using these answers in Eq. (3), the probability of occurrence of the event in a specific length of time may be obtained, which is the main objective of most PSA problems. The drawback of employing a probabilistic problem resolution, which requires a large number of histories to obtain a representative result for a rare event, was partially solved by using absorbing events as the unique way to end a simulated history. Such a procedure makes the discrete random behaviours the unique feature that prevents the occurrence of any modelled event. Thus, the number of MC histories required to obtain a meaningful result is naturally reduced without the need for mathematical procedures or approximations such as biasing techniques. Both case studies were modelled and solved respecting their characteristics, which was allowed by the freedom of the MC procedure, which imposes no restriction on the modelling of the problem. Although not exemplified, this freedom makes it possible to solve problems with specific system behaviours, such as the dependence of the stochastic parameters on previous events and/or past history time and the employment of any continuous probability distribution function. The proposed ideas also allow simplified models to be built to obtain specific aimed results by using only some of the proposed procedures. Furthermore, after understanding the deepness concept, the simplification of the model representation by grouping process behaviours in events, just as done in E1 of both case studies, was possible. Such features make it possible to build models that may be gradually updated given newly acquired information, such as new connection parameters that can be placed in the events, without the need to increase the model size or the complexity of using the problem results in further studies. Thus, this application of the procedures helps to mitigate uncertainties that are identified through dynamic knowledge development [4, 15]. Finally, given that the PDSP is a multidisciplinary problem, the proposed procedures to build the model, solve the problem, and analyse the results may be employed in different areas. For that, it is important to highlight that the events must embrace deterministic behaviours to allow model building from a system perspective instead of from the perspective of the status of devices or subsystem failures. This importance is exemplified in the case study of the holdup tank, where the model presents a unique event for the normal operating condition (level in the normal range), instead of different events for the different statuses of each unit.
Acknowledgements

The present work was accomplished with the support of CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico, Brasil) and CAPES (Coordenação de Aperfeiçoamento de Pessoal de Nível Superior, Brasil).

Supplementary materials

Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.ress.2019.106554.

References

[1] Aldemir T. A survey of dynamic methodologies for probabilistic safety assessment of nuclear power plants. Ann Nucl Energy 2013;52:113–24.
[2] Domínguez-García AD, Kassakian JG, Schindall JE, Zinchuk JJ. An integrated methodology for the dynamic performance and reliability evaluation of fault-tolerant systems. Reliab Eng Syst Safety 2008;93(11):1628–49.
[3] Amendola A. Event sequence and consequence spectrum: a methodology for probabilistic transient analysis. Nucl Sci Eng 1981;77(3):297–315.
[4] Aven T, Krohn BS. A new perspective on how to understand, assess and manage risk and the unforeseen. Reliab Eng Syst Safety 2014;121:1–10.
[5] Babykina G, Brînzei N, Aubry JF, Deleuze G. Modeling and simulation of a controlled steam generator in the context of dynamic reliability using a stochastic hybrid automaton. Reliab Eng Syst Safety 2016;152:115–36.
[6] Berner C, Flage R. Strengthening quantitative risk assessments by systematic treatment of uncertain assumptions. Reliab Eng Syst Safety 2016;151:46–59.
[7] Di Maio F, Rai A, Zio E. A dynamic probabilistic safety margin characterization approach in support of integrated deterministic and probabilistic safety analysis. Reliab Eng Syst Safety 2016;145:9–18.
[8] Domínguez-García AD, Kassakian JG, Schindall JE, Zinchuk JJ. An integrated methodology for the dynamic performance and reliability evaluation of fault-tolerant systems. Reliab Eng Syst Safety 2008;93:1628–49.
[9] Dubi A. Analytic approach and Monte Carlo methods for realistic system analysis. Math Comput Simul 1998;47:243–69.
[10] Chiacchio F, Compagno L, D'Urso D, Manno G, Trapani N. Dynamic fault tree resolution: a conscious trade-off between analytical and simulative approach. Reliab Eng Syst Safety 2011;96:1515–26.
[11] Chiacchio F, D'Urso D, Manno G, Compagno L. Stochastic hybrid automaton model of a multi-state system with aging: reliability assessment and design consequences. Reliab Eng Syst Safety 2016;149:1–13.
[12] Devooght J, Smidts C. Probabilistic dynamics as a tool for dynamic PSA. Reliab Eng Syst Safety 1996;52(3):185–96.
[13] Dhople SV, DeVille L, Domínguez-García AD. A stochastic hybrid systems framework for analysis of Markov reward models. Reliab Eng Syst Safety 2014;123:158–70.
[14] Durga Rao K, Gopika V, Sanyasi Rao VVS, Kushwaha HS, Verma AK, Srividya A. Dynamic fault tree analysis using Monte Carlo simulation in probabilistic safety assessment. Reliab Eng Syst Safety 2009;94(4):872–83.
[15] Flage R, Aven T. Emerging risk – Conceptual definition and a relation to black swan type of events. Reliab Eng Syst Safety 2015;144:61–7.
[16] Karanki DR, Rahman S, Dang VN, Zerkak O. Epistemic and aleatory uncertainties in integrated deterministic and probabilistic safety assessment: tradeoff between accuracy and accident simulations. Reliab Eng Syst Safety 2017;162:91–102.
[17] Khakzad N, Khan F, Amyotte P. Risk-based design of process systems using discrete-time Bayesian networks. Reliab Eng Syst Safety 2013;109:5–17.
[18] Labeau PE. A Monte Carlo estimation of the marginal distributions in a problem of probabilistic dynamics. Reliab Eng Syst Safety 1996;53(1):65–75.
[19] Labeau PE, Smidts C, Swaminathan S. Dynamic reliability: towards an integrated platform for probabilistic risk assessment. Reliab Eng Syst Safety 2000;68(3):219–54.
[20] Manno G, Chiacchio F, Compagno L, D'Urso D, Trapani N. MatCarloRe: an integrated FT and Monte Carlo Simulink tool for the reliability assessment of dynamic fault tree. Expert Syst Appl 2012;39(12):10334–42.
[21] Marseguerra M, Zio E. Monte Carlo approach to PSA for dynamics process systems. Reliab Eng Syst Safety 1996;52(3):227–41.
[22] Nývlt O, Haugen S, Ferkl L. Complex accident scenarios modelled and analysed by stochastic Petri nets. Reliab Eng Syst Safety 2015;142:539–55.
[23] NUREG-0492. Fault tree handbook. Washington, DC: U.S. Nuclear Regulatory Commission; January 1981.
[24] Roy A, Srivastava P, Sinha S. Dynamic failure assessment of an ammonia storage unit: a case study. Process Safety Environ Protect 2015;9(4):385–401.
[25] Siu N. Risk assessment for dynamic systems: an overview. Reliab Eng Syst Safety 1994;43:43–73.
[26] Zio E. Integrated deterministic and probabilistic safety assessment: concepts, challenges, research directions. Nuclear Eng Des 2014;280:413–9.
[27] Bucci P, Kirschenbaum J, Mangan A, Aldemir T, Smith C, Wood T. Construction of event-tree/fault-tree models from a Markov approach to dynamic system reliability. Reliab Eng Syst Safety 2008;93(11):1616–27.
[28] Li J, Mosleh A, Kang R. Likelihood ratio gradient estimation for dynamic reliability applications. Reliab Eng Syst Safety 2011;96(12):1667–79.
[29] Mandelli D, Yilmaz A, Aldemir T, Metzroth K, Denning R. Scenario clustering and dynamic probabilistic risk assessment. Reliab Eng Syst Safety 2013;115:146–60.
[30] Marseguerra M, Zio E. Monte Carlo simulation for model-based fault diagnosis in dynamic systems. Reliab Eng Syst Safety 2009;94(2):180–6.
[31] Swaminathan S, Smidts C. The event sequence diagram framework for dynamic probabilistic risk assessment. Reliab Eng Syst Safety 1999;63(1):73–90.
[32] Swaminathan S, Smidts C. The mathematical formulation for the event sequence diagram framework. Reliab Eng Syst Safety 1999;65(2):103–18.
[33] Raoni R, Secchi A, Demichela M. Employing process simulation for hazardous process deviation identification and analysis. Saf Sci 2018;101:209–19.
[34] Tombuyses B, DeLuca PR, Smidts C. Backward Monte Carlo for probabilistic dynamics. Math Comput Simul 1998;47(2–5):493–505.
[35] Chiacchio F, D'Urso D, Compagno L, Pennisi M, Pappalardo F, Manno G. SHyFTA, a stochastic hybrid fault tree automaton for the modelling and simulation of dynamic reliability problems. Expert Syst Appl 2016;47:42–57.