Performance Evaluation 56 (2004) 277–306
Construction and stepwise refinement of dependability models Cláudia Betous-Almeida, Karama Kanoun∗ LAAS-CNRS, 7, Avenue du Colonel Roche, 31077 Toulouse Cedex 4, France
Abstract

This paper presents a stepwise approach for dependability modeling, based on generalized stochastic Petri nets (GSPNs). The first-step model, called the functional-level model, is built from the system’s functional specifications and then completed by the structural model as soon as the system’s architecture is known. It can then be refined according to three complementary aspects: component decomposition, state and event fine-tuning, and distribution adjustment to take into account increasing event rates. We define specific rules to make the successive transformations as easy and systematic as possible. This approach allows the various dependencies to be taken into account at the right level of abstraction: functional dependency, structural dependency and those induced by non-exponential distributions. A part of the approach is applied to an instrumentation and control (I&C) system in power plants.

© 2003 Elsevier B.V. All rights reserved.

Keywords: Dependability modeling; Generalized stochastic Petri net; Functional-level model; Model refinement
∗ Corresponding author. E-mail addresses: [email protected] (C. Betous-Almeida), [email protected] (K. Kanoun).
0166-5316/$ – see front matter © 2003 Elsevier B.V. All rights reserved. doi:10.1016/j.peva.2003.07.012

1. Introduction

Dependability evaluation plays an important role in the definition, design and development of critical systems. Modeling can start as early as the system functional specifications, from which a functional-level model can be derived to help in analyzing dependencies between the various functions. This model can then be completed and refined by incorporating more information about the system’s structure, including dependencies between the system’s components.

The starting point of our work was to help, based on dependability evaluation, a stakeholder of an instrumentation and control (I&C) system in selecting and refining systems proposed by various contractors in response to a call for tenders. To this end, we have defined a stepwise modeling approach that can be easily used to select an appropriate system and model it thoroughly. This modeling approach is general and can be applied to any system, to model its dependability in a progressive way. Thus, it can be used by any system developer.

The process of defining and implementing an I&C system can be viewed as a multi-phase process starting from the issue of a call for tenders by the stakeholder. The call for tenders gives the functional and non-functional (e.g., dependability) requirements of the system and asks candidate contractors to make offers for possible systems/architectures satisfying the specified requirements. A preliminary analysis of the numerous responses by the stakeholder, according to specific criteria, allows the pre-selection of two or three candidate systems. At this stage, the candidate systems are defined at a high level and the application software is not entirely written. The comparative analysis of the pre-selected candidate systems, in a second step, allows the selection of the most appropriate one. Finally, the retained system is refined and thoroughly analyzed to go through qualification. This process is illustrated in Fig. 1. Even though this process is specific to a given company, the various phases are similar to those of a large category of critical systems.

Fig. 1. Various steps of I&C definition and implementation, and modeling. (Figure labels: call for tenders, suppliers’ proposals, functional specifications, pre-selection, candidate systems, selection, retained system, refinement of the system, system’s operation time; functional-level model, model construction, high-level dependability model, candidate system comparison, detailed dependability model, model refinement, dependability measures and sensitivity analysis.)

Dependability modeling and evaluation constitute an efficient support for the selection and refinement processes, thorough analysis and preparation for the system’s qualification. Our modeling approach follows the same steps as the development process. It is performed in three steps as described in Fig. 1:

Step 1. Construction of a functional-level model based on the system’s specifications.
Step 2. Transformation of the functional-level model into a high-level dependability model, based on the knowledge of the system’s structure. A model is generated for each pre-selected candidate system, the aim being the comparison of the pre-selected systems.
Step 3. For the selected system, refinement of the high-level model into a detailed dependability model.

Modeling is based on generalized stochastic Petri nets (GSPNs) [2] due to their ability to cope with modularity and model refinement. The GSPN model is processed to obtain the associated dependability measures (e.g., availability, reliability, safety) using an evaluation tool such as SURF-2 [5].
The relevance of our approach lies in supplying a set of coherent techniques for mastering, step by step, the construction of dependability models based on GSPNs. It allows the progressive incorporation of newly available information into the existing model, changing its initial organization according to a well-identified set of rules. Model refinement can take into account component decomposition, state/event fine-tuning and distribution adjustment. In particular, the same set of rules is used for generating the high-level model from the functional-level model and for component refinement. We have adapted the method of stages (used for simulating increasing failure rates) to take into account dependencies between interacting components without changing their initial models.

This modeling approach has been applied to three different I&C systems, to help select the most appropriate one [9]. In this paper we illustrate our approach on a small part of one of them. This paper is an elaboration of our previous work [6,9], and an extension of [8], to which we have added the formalization of the construction rules and detailed the application in Section 5. Ref. [6] was devoted only to the construction of the high-level dependability model from the functional-level model and did not address the refinement of the structural model at all. Ref. [9] mainly refers to the comparison of the three different I&C systems at a high level.

The remainder of the paper is organized as follows. Section 2 describes the functional-level model. The high-level dependability model construction is presented in Section 3. Section 4 deals with the structural model’s refinement and Section 5 presents an example of application of the proposed approach to an I&C system. Finally, Section 6 concludes the paper.
2. Functional-level model

The derivation of the system’s functional-level model is the first step of our method. This model is independent of the underlying system’s structure. Hence, it can be built even before the call for tenders, by the stakeholder. It is formed by places representing possible states of functions. For each function, the minimal number of places is two (Fig. 2): one represents the function’s nominal state (F) and the other its failure state (F̄). In the following, we assume only one failure mode, but it is applicable in the same manner when there are several failure modes per function. Between states F and F̄, there are events that manage changes from F to F̄ and vice-versa. These events are inherent to the system’s structure that is not specified at this step, as it is not known yet. The model containing these events and the corresponding places is called the link model (ML). Note that the set {F, ML, F̄}, that constitutes the system’s GSPN model, will be completed once the system’s structure is known.

Fig. 2. Functional-level model related to a single function.

However, systems generally perform more than one function. In this case we have to look for dependencies between these functions due to the communication between them. We distinguish two degrees of dependency: total dependency and partial dependency:
Fig. 3. Partial functional dependency (F2 ← F1). (Legend: arc, double arc, inhibitor arc, immediate transition.)
• Case (a). Total dependency: F2 depends totally on F1 (noted F2 ↫ F1). If F1 fails, F2 also fails. This means that the probability that F2 fails equals the probability that F1 fails, times the probability that F2 fails due to the failure of its components.
• Case (b). Partial dependency: F2 depends partially on F1 (noted F2 ← F1). F1’s failure does not induce F2’s failure, but it puts F2 in a degraded state. In Fig. 3, the degraded state is represented by place F2d that is marked whenever F1 is in its failure state and F2 in its nominal one. The token is removed from F2d as soon as F1 returns to its nominal state. Different scenarios might be considered.
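The marking rule stated for the degraded place under partial dependency can be written as a small predicate. The following is only a sketch: the class and function names are hypothetical, and it captures the stated condition on F2d rather than the full GSPN of Fig. 3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionalState:
    """Hypothetical snapshot of two functions linked by F2 <- F1."""
    f1_up: bool  # True when F1 is in its nominal state
    f2_up: bool  # True when F2 is in its nominal state


def f2_degraded(state: FunctionalState) -> bool:
    """F2d is marked whenever F1 has failed while F2 is still nominal."""
    return (not state.f1_up) and state.f2_up


def f2_status(state: FunctionalState) -> str:
    """Classify F2 as 'nominal', 'degraded' or 'failed'."""
    if not state.f2_up:
        return "failed"
    return "degraded" if f2_degraded(state) else "nominal"
```

When F1 returns to its nominal state, `f2_degraded` becomes false again, which mirrors the removal of the token from F2d.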
3. High-level dependability model

The high-level dependability model is formed by the function’s states and the link model that gathers the set of states and events related to the system’s structural behavior. This behavior is modeled by the so-called structural model (MS) and then it is connected to the F and F̄ places through an interface model (MI). The link model is thus made up of the structural model and of the interface model.

The structural model represents the behavior of the hardware and software components, taking into account fault-tolerance mechanisms, maintenance policies as well as dependencies due to the interactions between components. The interface model connects the structural model with its functional state places by a set of immediate transitions.

In this section, we mainly concentrate on the interface model. In particular, we assume that the structural model can be built by applying one of the many existing modular modeling approaches (see, e.g., [12,18–20]), and we focus on its refinement in Section 4. Note that the structural models presented in this section are not complete; some examples of complete structural models are given in Section 4. We present simple examples to help understand the notion of interface model before presenting the general interfacing rules.

3.1. Examples of interface models

For the sake of simplicity, we first consider the case of a single function, then the case of multiple functions.
Fig. 4. Two series components. (ML = link model, MI = interface model, MS = structural model. Places: Hok = hardware’s up state, Sok = software’s up state, Hko = hardware’s down state, Sko = software’s down state.)
3.1.1. Single function

Several situations may be taken into account. Since the two most important cases are the series and the series–parallel combinations of components, we limit the illustrations to these two basic cases, which allow modeling of any system. More details are given in [6,7]:

• Series case. Suppose function F is carried out by a software component S and a hardware component H. Then, the markings of places F and F̄ depend upon the markings of the hardware and software component models (Fig. 4). The behavior of H and S is modeled by the structural model and then it is connected to places F and F̄ through an interface model. Note that there is only one interface model: we split it into two parts, an upstream part and a downstream part, so that it is constructed in a systematic way. This allows our approach to be re-usable, facilitating the construction of several models related to various architectures. Also, the case of simultaneous failures is not treated at this level.
• Series–parallel case. Consider function F implemented by two redundant software components S1 and S2, running on the same hardware component H. F’s up state is the combined result of H’s up state and S1 or S2’s up states, and F’s failure state is the result of H’s failure or S1 and S2’s failure, as indicated in Fig. 5.
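The two basic cases can be summarized by the Boolean condition under which the function's up-state place is marked. The sketch below uses hypothetical function names and only states the logic that the interface models of Figs. 4 and 5 encode; it is not a GSPN implementation.

```python
def series_up(h_ok: bool, s_ok: bool) -> bool:
    """Series case: F is up only if both H and S are up."""
    return h_ok and s_ok


def series_parallel_up(h_ok: bool, s1_ok: bool, s2_ok: bool) -> bool:
    """Series-parallel case: F is up if H is up and S1 or S2 is up."""
    return h_ok and (s1_ok or s2_ok)


def series_parallel_failed(h_ok: bool, s1_ok: bool, s2_ok: bool) -> bool:
    """F fails if H fails, or if both S1 and S2 fail."""
    return (not h_ok) or ((not s1_ok) and (not s2_ok))
```

By De Morgan's laws the failure condition is the exact complement of the up condition, which is why the upstream and downstream parts of the interface model can be built symmetrically.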
Fig. 5. Two redundant software components on a single hardware component.
The use of rather complex GSPNs such as those presented in Figs. 4 and 5 to model series/parallel combinations of items is necessary to increase re-usability. These models are part of a library of basic models we built to model I&C systems. This library is presented in [9].

3.1.2. Multiple functions

Consider two functions (the generalization is straightforward) and let {C1i} (resp. {C2j}) be the set of components associated with F1 (resp. F2). We distinguish the case where functions do not share resources (such as components or repairmen) from the case where they share some. Examples of these two cases are presented hereafter:

• F1 and F2 have no common components: {C1i} ∩ {C2j} = ∅. The interface models related to F1 and F2 are built separately in the same way as explained for a single function. There are no structural dependencies, only functional ones.
• F1 and F2 have some common components: {C1i} ∩ {C2j} ≠ ∅. This corresponds to the existence of structural dependencies, in addition to functional dependencies. This case is illustrated on a simple example:
  ◦ F1 is performed by three components: a hardware component H and two redundant software components S11 and S12. F1’s model corresponds to Fig. 5.
  ◦ F2 is performed by two components: the same hardware component H as for F1 and a software component S21. F2’s model corresponds to Fig. 4.

The global model of F1 and F2 is given in Fig. 6. It can be seen that (i) both interface models (MI1 and MI2) are built separately, and (ii) in the global model, the common hardware component H is represented only once by a common component model. Sharing of H thus creates a structural dependency. The functional dependencies are not represented in this figure.
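Whether two functions are structurally dependent comes down to whether their component sets intersect. A minimal sketch of that check follows; the function name and the dictionary layout are assumptions made for illustration, while the component names are those of the example above.

```python
from itertools import combinations


def structural_dependencies(components_by_function):
    """Map each pair of functions to the components they share.

    Pairs with an empty intersection are omitted: they have no
    structural dependency, only (possibly) functional ones.
    """
    shared = {}
    ordered = sorted(components_by_function.items(), key=lambda item: item[0])
    for (f_a, comps_a), (f_b, comps_b) in combinations(ordered, 2):
        common = set(comps_a) & set(comps_b)
        if common:
            shared[(f_a, f_b)] = common
    return shared


example = {"F1": {"H", "S11", "S12"}, "F2": {"H", "S21"}}
dependencies = structural_dependencies(example)
```

For this example the only shared component is H, so it would be represented once in the global model.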
Fig. 6. Two functions with a structural dependency.
3.2. Interfacing rules

The interface model MI connects the system’s components with their functions by a set of transitions. This model is a key element in our approach. Particular examples of interface models have been given in Figs. 4–6. In this section, the general organization of the interface model is presented. Interfacing rules are defined in formal terms in Appendix A. Here, the main rules are stated in an informal manner.

Upstream and downstream MI have the same number of immediate transitions and the arcs that are connected to these transitions are built in a systematic way:

• Upstream MI. It contains one function transition tiF for each series (set of) component(s), to mark the function’s up state place, and one component transition tCx for each series, distinct component that has a direct impact on the functional model, to unmark the function’s up state place:
  ◦ Each tiF is linked by an inhibitor arc to the function’s up state place, by an arc to the function’s up state place and by a double arc to each initial (ok) component’s place.
  ◦ Each tCx is linked by an arc from the function’s up state place and by a double arc to each failure component’s place.
• Downstream MI. It contains one function transition t′iF for each series (set of) component(s), to unmark the function’s failure state place, and one component transition t′Cx for each series, distinct component that has a direct impact on the functional model, to mark the function’s failure state place:
  ◦ Each t′iF is linked by an arc from the function’s failure state place and by a double arc to each initial (ok) component’s place.
  ◦ Each t′Cx is linked by an inhibitor arc to the function’s failure state place, by an arc to the function’s failure state place and by a double arc to each component’s failure place.
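As an illustration of how systematic these rules are, the sketch below generates the upstream and downstream transitions for the simplest case, a function carried out by components in series (as in Fig. 4). The data layout, the function name and the place and transition labels are hypothetical; arc directions are read off the stated purpose of each transition (an arc to a place marks it, an arc from a place unmarks it). The formal rules are those of Appendix A, which this sketch does not reproduce.

```python
def build_series_interface(function, components):
    """Return (upstream, downstream) interface transitions for series components.

    Each transition is described by the places it is connected to:
      'inhibitor' - fires only if the place is empty
      'input'     - arc from the place (removes a token)
      'output'    - arc to the place (adds a token)
      'double'    - tested and left unchanged (double arc)
    """
    up, down = function, function + "_bar"
    ok_places = [c + "ok" for c in components]

    # One function transition per series set of components.
    upstream = {"tF": {"inhibitor": [up], "output": [up], "double": ok_places}}
    downstream = {"t'F": {"input": [down], "double": ok_places}}

    # One component transition per distinct series component.
    for index, component in enumerate(components, start=1):
        ko_place = component + "ko"
        upstream[f"tC{index}"] = {"input": [up], "double": [ko_place]}
        downstream[f"t'C{index}"] = {
            "inhibitor": [down],
            "output": [down],
            "double": [ko_place],
        }
    return upstream, downstream


upstream, downstream = build_series_interface("F", ["H", "S"])
```

Both parts contain the same number of immediate transitions, as the rules require; handling redundant (parallel) components would need one function transition per alternative set and is left out of this sketch.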
4. Refinement of the structural model

We assume that the structural model is organized in a modular manner, i.e., it is composed of sub-models representing the behavior of the system’s components and their interactions. For several reasons, the first model that is built, starting from the functional-level model, may not be very detailed. One of these reasons could be the lack of information in the early system selection and development phases. Another reason could be the complexity of the system to be modeled. To master this complexity, a high-level model is built and then refined progressively.

As soon as more detailed information is available concerning the system’s composition and the events governing component evolution, the structural model can be refined. Another refinement may be done regarding event distributions. Indeed, an assumption is made that all events governing the system’s behavior are exponentially distributed, which, in some cases, is not a good assumption. In particular, failure rates of some components may increase over time.

Model refinement allows detailed behavior to be taken into account and leads to more detailed results compared to those obtained from a high-level model. In turn, these detailed results may help in selecting alternative solutions for a given structure. For our purpose, we consider three types of refinement: component, state/event and distribution. Given the fact that the system’s model is modular, refinement of a component’s behavior is undertaken within the component’s sub-model and special attention should be paid to its interactions with the other sub-models. However, we will mainly address the new dependencies created by the refinement, without discussing those already existing. The latter are either unchanged or should be refined according to the type of refinement achieved.

Component refinement consists in replacing a component by two or more components. From a modeling point of view, such a refinement leads to the transformation of the component’s sub-model into another sub-model. Our approach is to use the same transformation rules as those used for the interface model presented in Section 3.

State/event fine-tuning consists in replacing, by a subnet, the place/transition corresponding to this state/event. We define basic refinement cases, whose combination covers most usual possibilities of state/event refinement.

For distribution adjustment, we use the method of stages. Considering an event whose distribution is to be transformed into a non-exponential one, this method consists in replacing the transition associated with this event by a subnet simulating an increasing event rate. We have adapted already published work to take into account dependencies between the component under consideration and components with which it interacts. This is done without changing the sub-models of the latter.

A section is devoted to each refinement type.

4.1. Component decomposition

Consider a single function achieved by a single software component on a single hardware component. Suppose that the software is itself composed of N components. Three basic possibilities are taken into account (combinations of these three cases allow modeling of any kind of system):

• The N components are in series.
• The N components are redundant, which means that they are structurally in parallel.
• There are Q components in parallel and R + 1 components in series (with Q + R = N).
Fig. 7. Series decomposition.
These decompositions are, respectively, called series, parallel and mixed. Our goal is to use refinement rules identical, as far as possible, to the ones used in Section 3. In the following, we explain how a single component is replaced by its N components.

4.1.1. Series decomposition

Consider the decomposition of software S into two series components S1 and S2. This case is presented in Fig. 7.

4.1.2. Parallel decomposition

Consider software S’s decomposition into two redundant components S1 and S2. Thus, S’s up state is the result of S1 or S2’s up states, and S’s failure state is the combined result of S1 and S2’s failure states. Fig. 8 gives a GSPN model of this case. The generalization to N components is straightforward.

4.1.3. Mixed decomposition

Suppose S is composed of three components: S1, S2 and S3, where S3 is in series with S1 and S2, which are redundant. This case is identical to the example presented in Fig. 5 when replacing F by S and H by S3.
Fig. 8. Parallel decomposition.
4.1.4. Conclusion

It is worth mentioning that the interface model between the system and its components is built exactly in the same manner as the interface model between a function and its associated components.

In all the cases illustrated above, we have considered only one token in each initial place. Indeed, K identical components can be modeled by a single model with K tokens in the initial place. When refining the behavior of such components, an asymmetry may appear. This is due to the fact that some components that have the same behavior at a given abstraction level may exhibit a slightly different behavior when more details are taken into account. For example, when modeling two redundant computers at a high level, we may consider them to be identical (they perform the same tasks at the same time, and the failure of one of them puts the system in a degraded state). When refining their behavior, the model can, for example, show that the failure of the primary unit is followed by a switch, while the failure of the secondary does not require a system switch. This means that their refined behavior is different. If this is the case, one has to modify the model of the current abstraction level before refinement.

4.2. State/event fine-tuning

In GSPNs, places correspond to system states and timed transitions to events that guide state changes. The fine-tuning of places/transitions allows more detailed behavior to be modeled. In fact, state/event fine-tuning may result from a refinement of the assumptions or lead to one.

Refinement has been studied in Petri nets [23,24] and more recently in time Petri nets [17]. One of the objectives of these refinement theories was to preserve the results of model processing. Our goal is to detail the system’s behavior by refining the underlying GSPN. Our sole constraint is to ensure that the net’s dynamic properties (liveness, boundedness and safeness) are preserved at each refinement step. Our main motivation for model refinement is to have more detailed results about the system’s behavior, that better reflect reality.

We define three basic refinement cases. Combinations of these three cases cover most usual situations for the refinement of dependability models. They are given in Table 1. TR1 allows the replacement of one event by two competing events. It allows the event’s separation into two other events with different rates. TR2 allows a sequential refinement of events, while TR3 allows the refinement of a state into two or more states.

These transformations are illustrated in the following simple example. Consider the hardware model given in Fig. 9(a). Several successive refinement steps are depicted in Fig. 9(b)–(d). After a fault activation (T1), two types of faults are distinguished: temporary and permanent, with probability a and 1 − a, respectively. Using TR3, we obtain the model depicted in Fig. 9(b). This corresponds to an assumption refinement.

Table 1. State/event refinement (GSPN diagrams of the initial and refined models omitted)
TR1, separation into two events: the initial event is replaced by two competing events.
TR2, sequence of events: refinement of the action represented by transition T.
TR3, state refinement: t1 → p1 = prob. of firing t1, t2 → p2 = prob. of firing t2, p1 + p2 = 1.
Fig. 9. State/event refinement.
(a) Compact model: T1 = λ (failure rate), T2 = ν (restoration rate).
(b) Two types of faults: t1 = 1 − a, t2 = a, T21 = µ (repair rate), T22 = ε (disappearance rate).
(c) Error detection latency: T211 = δ (error detection rate), T212 = µ.
(d) Error detection efficiency: t3 = 1 − d, t4 = d, T2121 = µ, T2122 = π (error perception rate).
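To give a feel for what the first refinement step changes numerically, the following sketch compares the steady-state availability of the compact model of Fig. 9(a) with that of the model of Fig. 9(b). It is an illustration under stated assumptions: temporary faults (probability a) are assumed to disappear at rate ε and permanent faults (probability 1 − a) to be repaired at rate µ, the function names are hypothetical, and the numerical values are made up. The formula used is the standard ratio of mean up time to mean cycle time.

```python
def availability_compact(lam: float, nu: float) -> float:
    """Two-state model: failure rate lam, restoration rate nu."""
    return nu / (lam + nu)


def availability_two_fault_types(lam: float, a: float, mu: float, eps: float) -> float:
    """Model with temporary (prob. a, rate eps) and permanent (prob. 1 - a, rate mu) faults."""
    mean_up_time = 1.0 / lam
    mean_down_time = a / eps + (1.0 - a) / mu
    return mean_up_time / (mean_up_time + mean_down_time)


# Hypothetical rates (per hour), chosen only for illustration.
lam, mu, eps = 1e-3, 0.1, 10.0
compact = availability_compact(lam, nu=mu)
refined = availability_two_fault_types(lam, a=0.9, mu=mu, eps=eps)
```

With a = 0 the refined model reduces to the compact one with ν = µ; distinguishing quickly disappearing temporary faults raises the computed availability for the same failure rate.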
To take into account error detection latency (1/δ) for hardware components, we apply TR2 to transition T21 of Fig. 9(b). The resulting model is presented in Fig. 9(c), which is a state refinement.

Finally, we model the error detection efficiency by applying TR3. Detected errors allow immediate repair of the system. We then add a perception latency (transition T2122, Fig. 9(d)). It is important to model this latency because, as long as the effects of the non-detected error are not perceived, the system is in a non-safe state. Repair can be performed only after perception of the effects of such errors.

This is a small example of a state/event refinement application. Other details can be added to the model using the three rules presented in this section.

4.3. Distribution adjustment

It is well known that the exponential distribution assumption is not appropriate for all event rates. For example, due to error conditions accumulating with time and use, the failure rate of a software component might increase. The possibility of including timed transitions with non-exponential firing time is provided by the method of stages [14,15]. This method transforms a non-Markovian process into a Markovian one, by decomposing a state (with a non-exponential firing time distribution) into a series of k successive states. Each of these k states will then have an exponential firing time distribution, to simulate an increasing rate. In GSPNs, a transition, referred to as extended transition, is replaced by a subnet to model the k stages.

The transformation of an exponential distribution into a non-exponential one might create new timing dependencies. Indeed, the occurrence of some events in other components might affect the extended transition. For example, the restart of another software component might lead to the restart of the component under consideration (that has an increasing failure rate) and thus stop the accumulation of error conditions, bringing back the software under consideration to its initial state. In previously published work [1,2], the dependency between events is modeled only by concurrent transitions enabled by the same place. This is not very convenient when several components interact with the component under consideration, as it could lead to changing their models.
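The reason a chain of exponential stages simulates an increasing rate can be checked numerically. The sketch below assumes k identical stages, i.e., an Erlang-k firing time, which is one common instance of the method of stages and not necessarily the exact subnet used here; the function names are hypothetical.

```python
import math


def erlang_survival(t: float, k: int, rate: float) -> float:
    """Probability that fewer than k stages have completed by time t."""
    x = rate * t
    return math.exp(-x) * sum(x ** n / math.factorial(n) for n in range(k))


def erlang_hazard(t: float, k: int, rate: float) -> float:
    """Instantaneous firing rate of the k-stage chain at time t."""
    x = rate * t
    density = rate * math.exp(-x) * x ** (k - 1) / math.factorial(k - 1)
    return density / erlang_survival(t, k, rate)


# Hazard of a 3-stage chain sampled at increasing times (arbitrary units).
hazards = [erlang_hazard(t, k=3, rate=2.0) for t in (0.5, 1.0, 2.0, 4.0)]
```

For k = 1 the hazard is constant and equal to the stage rate (the exponential case); for k ≥ 2 it starts at zero and grows toward the stage rate, which is the increasing-rate behavior the extended transition is meant to reproduce.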
We have adapted this extension method to allow more flexibility and take into account this type of dependency. The salient idea behind our approach is to refine the event’s distribution without changing the sub-models of the other components, whose behavior may affect the component under consideration (when assuming a non-exponential distribution). In the rest of this section, we first present the extension method presented in [2] and then present our adapted extension method.

4.3.1. Previous work

Concerning the transitions’ timers, three memory policies have been identified and studied in the literature, namely, resampling, age memory and enabling memory. Since the latter is well adapted to model the kind of dependency that is created when modeling a system’s dependability as mentioned above, we focus on it in this paragraph. It is defined as follows: at each transition firing, the timers of all the timed transitions that are disabled by this transition are restarted, whereas the timers of all the timed transitions that are not disabled hold their present values.

In [1,2] an application of the enabling memory policy in structural conflict situations has been given. It concerns the initial model of Fig. 10, in which transition T1 to be extended is in structural conflict with transition Tres. When applying the enabling memory policy as given in [2] to transition T1 of Fig. 10, the resulting model is presented in Fig. 11. In this figure, the k series stages are modeled by transitions tc1, tc2, T11 and T12 and places P1, P2 and P3. The movement of tokens in these places is controlled by the control places Pc1, Pc2 and P4. S̄ is marked once the k stages have been crossed, through the firing of T12.
Fig. 10. Initial model.
The enabling memory policy consists in stopping process T1’s evolution after firing of Tres (i.e., clearing of places P1, P2 and P3). This is accomplished in two steps. As soon as S becomes empty, immediate transitions t1, t2 and t3 are fired as many times as needed to remove the k tokens from places P1, P2 and P3. At the end of this step, places Pc2 and P4 are marked with one token each. Once places P1, P2 and P3 become empty, the return to the initial state is performed by immediate transition t4 that puts one token in place Pc1.

Fig. 11. Enabling memory with structural conflict.

4.3.2. Enabling memory with external dependencies

Our approach replaces the transition to be extended by two subnets: one internal to the component’s model, to model its internal evolution, and a dependency subnet, that models its interaction with other components. The initial model is given in Fig. 12(a). In this model, we assume that T1, Tdis1 and Tdis2 are exponentially distributed. Suppose that in refining T1’s distribution, its timer becomes dependent on Tdis1 and Tdis2. The transformed model is given in Fig. 12(b). A token is put in Pdep each time the timer of transition T1 has to be restarted, due to the occurrence of an event that disables the event modeled by T1 (firing of Tdis1 or Tdis2 in other component models). As in the previous case, this is done in two steps.
Fig. 12. Enabling memory with external dependencies: (a) initial model; (b) transformed model.
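The effect of the dependency place on the extended transition can be pictured with a small event replay. This is a behavioral sketch only, with hypothetical names: it abstracts the subnet of Fig. 12(b) into a counter of completed stages that is cleared whenever an external disabling transition fires, and it ignores timing altogether.

```python
def run_stages(events, k):
    """Replay events against a k-stage extended transition.

    events: sequence of "stage" (one exponential stage completes) or
            "disable" (an external transition such as Tdis1 fires,
            which marks the dependency place Pdep).
    Returns the index of the event at which the extended transition
    finally fires, or None if it never does.
    """
    completed = 0
    for index, event in enumerate(events):
        if event == "disable":
            # Pdep marked: the stage places are emptied and the timer restarts.
            completed = 0
        elif event == "stage":
            completed += 1
            if completed == k:
                return index
        else:
            raise ValueError(f"unknown event: {event!r}")
    return None
```

For instance, with k = 3, two completed stages followed by a disabling event are lost, and three further stages are needed before the transition fires. The other components only need to signal the disabling event; their own sub-models are untouched.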
As soon as place Pdep is marked, t 1 , t 2 and t 3 are fired as many times as needed to remove all tokens from places P1 , P2 and P3 . Once places P1 , P2 and P3 are empty, the return to the initial state is performed by transition t 4 that removes a token from place Pdep and puts one token in place Pc1 . Note that transitions t 1 , t 2 , t 3 and t 4 replace, respectively, t1 , t2 , t3 and t4 . Also, we simplified Fig. 12(b), by replacing place P4 by an inhibitor arc between t 4 and Pc1 . Thus, the two major differences between Figs. 11 and 12(b) are: (1) place P1 of Fig. 12(b) is replaced by an inhibitor arc going from place Pc1 to immediate transition t; (2) place Pdep , that manages dependencies between this net and the rest of the model, is added. 4.4. Concluding comments In this section, we have presented several ways of refining dependability models. Refinement may increase considerably the size of the initial model depending on
292
C. Betous-Almeida, K. Kanoun / Performance Evaluation 56 (2004) 277–306
• the number of refined components;
• the refinement depth;
• the number of stages (k value) for distribution adjustment.

Concerning the last adjustment, a sensitivity study with respect to k showed that, for the considered I&C systems, two or three stages are enough to simulate an increasing failure rate. On the other hand, the distribution refinement can also be achieved at the state space (Markovian) level, as explained in [16]. In that case, the complete transition rate matrix can be represented by resorting to Kronecker algebra [22], thus alleviating the state explosion problem. However, this requires the analysis of the resulting Markov chain, which may be tedious for some systems. Implementing distribution refinement at the Petri net level avoids the analysis of the underlying Markov chain.

More generally, the systematic refinement approach given in this paper particularly increases the number of immediate transitions. This is the price to pay for facilitating model construction by first building small, high-level, generic and re-usable models that can be systematically refined following well-specified rules. Fortunately, several techniques are available for model reduction by suppressing immediate transitions (see, e.g., [2,3,13]). In any case, immediate transitions have no impact on the size of the associated Markov chain.

Also, refinement may lead to stiff Markov chains (resulting from the introduction of fast transitions with high failure rates). This problem is not specific to our approach and is rather general. State aggregation techniques (see, e.g., [10]) or place aggregation techniques (see, e.g., [3]) yield a non-stiff Markov chain with a smaller state space. However, such techniques provide only approximations. The accuracy of the result is conditioned by the ratio of the slow to the fast transition rates: the lower the ratio, the more accurate the result.
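To illustrate why a few stages can already represent an increasing failure rate, the following minimal Python sketch computes the hazard rate of a k-stage distribution. It assumes, for illustration only, k identical exponential stages in series (an Erlang-k law, as is assumed for T5 in Appendix B); the function name `erlang_hazard` and the numerical values are hypothetical and are not taken from the study reported here.

```python
import math


def erlang_hazard(k: int, lam: float, t: float) -> float:
    """Hazard rate h(t) = f(t) / R(t) of an Erlang-k law with stage rate lam.

    Assumption: k identical exponential stages in series.
    The common factor exp(-lam * t) cancels between f(t) and R(t).
    """
    x = lam * t
    density_term = lam * x ** (k - 1) / math.factorial(k - 1)
    survival_term = sum(x ** n / math.factorial(n) for n in range(k))
    return density_term / survival_term


if __name__ == "__main__":
    lam = 1.0  # hypothetical stage rate
    for k in (1, 2, 3):
        rates = [round(erlang_hazard(k, lam, t), 3) for t in (0.5, 1.0, 2.0, 5.0)]
        print(f"k = {k}: {rates}")
```

With k = 1 the hazard rate is constant (exponential case), while for k = 2 or k = 3 it starts at zero and grows towards the stage rate, which is the qualitative behavior the stages are meant to capture.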
Finally, in this section, we concentrated on “flat” models related to components with dependencies. For a given system, we can divide the components into subsets such that the components within a subset are dependent, while the subsets are mutually independent. Each subset can be processed alone and the results combined to provide the overall dependability measures, using a hierarchical modeling approach such as in the SHARPE tool [21], the Dynamic Fault Tree [4] or the chain of heterogeneous models presented in [11].

5. Application to I&C systems

In this section we illustrate the application of our modeling approach to I&C systems. We first present the main functions together with the functional-level model for a general I&C system. Then, we describe how the high-level dependability model is built for one of the I&C systems. Finally, we show some results concerning a small part of its detailed dependability model.

5.1. System presentation

An I&C system performs five main functions: human–machine interface (HMI), processing (PR), archiving (AR), management of configuration data (MD), and interface with other parts of the I&C system (IP). The functions are linked by the following partial dependencies:

HMI ← {AR, MD},  PR ← MD,  AR ← MD,  IP ← MD.
These relations are modeled by the functional-level model depicted in Fig. 13.
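The dependency relations above can also be written down as a small data structure. The Python sketch below is only an illustration: it reads A ← B as "A depends (partially) on B", which is an assumed reading of the notation, and the names `DEPENDS_ON` and `affected_by` are hypothetical. It lists the functions that are affected, directly or transitively, when one function fails; since the dependencies are partial, "affected" does not necessarily mean "lost".

```python
from collections import deque

# Assumed reading: "A <- B" means that function A depends (partially) on B.
DEPENDS_ON = {
    "HMI": {"AR", "MD"},
    "PR": {"MD"},
    "AR": {"MD"},
    "IP": {"MD"},
    "MD": set(),
}


def affected_by(failed: str, depends_on=DEPENDS_ON) -> set:
    """Return the functions depending, directly or transitively, on `failed`."""
    affected = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for func, deps in depends_on.items():
            if current in deps and func not in affected:
                affected.add(func)
                queue.append(func)
    return affected


if __name__ == "__main__":
    for f in sorted(DEPENDS_ON):
        print(f, "->", sorted(affected_by(f)))
```

Under this reading, a failure of MD affects all four other functions, whereas a failure of AR affects only HMI.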
Fig. 13. Functional-level model for I&C systems.
To illustrate the second step of our modeling approach, we consider the example of an I&C system composed of five nodes connected by a local area network (LAN). The mapping between the various nodes and their functions is given in Fig. 14. Note that while HMI is executed on four nodes, Node 5 runs three functions. Nodes 1–4 are composed of one computer each. Node 5 is fault-tolerant: it is composed of two redundant computers. The initial structural model of this I&C system is built as follows:

• Nodes 1–3. In each node, a single function is achieved by one software component on a hardware component. Its model is similar to the one presented in Fig. 4.
Fig. 14. I&C structure.
Fig. 15. High-level dependability modules for function AR (PH: primary hardware; SH: secondary hardware; PS: primary software; SS: secondary software).
• Node 4. It has two functions that are partially dependent. Its functional-level model is similar to the functional-level model of F1 and F2 given in Fig. 3. Its structural model is similar to the one depicted in Fig. 6.
• Node 5. It is composed of two hardware components with three independent functions each. Its structural model is more complex. A part of it is given in Section 5.2.
• LAN. Two different assumptions can be made: (i) it is non-fault-tolerant, in which case its model is similar to the one presented in Fig. 4; (ii) it is fault-tolerant, in which case its model is a duplex one. Moreover, since the LAN is a single point of failure, it is in series with all of the other components. This means that the LAN's failure induces the loss of the system's functions. For this reason, its model is not presented in Fig. 13.

The complete high-level dependability model for this system is composed of 41 places and 19 tokens. After refinement, the model is much larger, as illustrated in Section 5.3. In the rest of the section, we concentrate on a single function: function AR of Node 5. We first present its high-level dependability model, and then we show how it can be refined progressively. We conclude the section with some dependability evaluation results.

5.2. AR high-level dependability model

Let us consider the case of Node 5 of Fig. 14. Function AR is performed by two units: a primary unit and a secondary one. A unit is composed of a hardware computer and its corresponding software component (at this level of detail, we suppose that each function is executed by a single software component). When the primary fails, a switch between the two units is attempted. The global view for the high-level dependability model of function AR is presented in Fig. 15. In this figure, SH, PH, SS and PS correspond, respectively, to the models of the secondary hardware component, primary hardware component, secondary software component and primary software component. Arrows from SH to SS and from PH to PS represent, in each case, the software component's stop when the respective hardware component fails. This will be detailed later.

The high-level GSPN dependability model for the AR function is given in Fig. 16. In this figure, each set {SH, SS} and {PH, PS} corresponds to the complete high-level model of Fig. 4. It is worth noting that we consider two software unavailability states: S1st and S1fail. The first one represents the software's stop after the failure of the hardware on which it is implemented. S1fail corresponds to the software failure state. The switch module is composed of place Psw and timed transition Tsw. If the switch succeeds (immediate transitions tsws1 and tsws2), the function continues to be executed. If it does not succeed (immediate transition tswf), the function fails. It is worth noting that the switch may be activated following a software or a hardware failure. In either case, it switches the hardware and software components simultaneously.

Fig. 16. High-level dependability model for function AR.

5.3. AR model refinement

Once the model is refined, the modules obtained are given in Fig. 17. In this figure, SHrf, PHrf, SSrf and PSrf correspond, respectively, to the refined models of the secondary hardware component, primary hardware component, secondary software component and primary software component. It is worth noting that refinement introduces new dependencies between components: in this case, modules EPshs, EPphs and EPss, corresponding to the error propagation between components. This is due to the refinement of the modeling assumptions. Indeed, Fig. 17 distinguishes permanent and temporary errors (which is not the case for Fig. 15). As a consequence, Fig. 17 assumes that temporary hardware errors might propagate to the software (EPphs and EPshs) and that temporary software errors of the primary might propagate to the secondary software (EPss).

Fig. 17. Detailed dependability modules for function AR (EPshs: secondary hardware–software error propagation; EPphs: primary hardware–software error propagation; EPss: software–software error propagation).

In Fig. 17, the switch model is the same as in Fig. 15. This is true when assuming the same switch rate for hardware and software components, i.e., 1/βh = 1/βs. However, if we consider that these values are different, refining the models yields two switch models: one corresponding to the switch between hardware components and the other to the switch between software components.

The detailed GSPNs presented are obtained using the rules described in Section 4.2. The following assumptions and notations are used (Fig. 18):

• The activation rate of a software fault is λsp (Tafsp) on the primary and λss (Tafss) on the secondary.
• A software fault is detected by the fault-tolerance mechanisms with probability ds. The detection rate is δs (Tdesp/Tdess).
• The effects of a non-detected error are perceived with rate πs (Tepsp).
• Errors in the software may necessitate only a reset. The reset rate is ρ (Trsp/Trss) and the probability that an error induced by the activation of a software fault disappears with a reset is r (trsfp/trsrs).
• If the error does not disappear with the software reset, the software is re-installed. The software's re-installation rate is σ (Trip/Tris).

Note that a software fault may propagate from the primary to the secondary with probability pss.

The following assumptions and notations are used (Fig. 19):
Fig. 18. Software redundant module with error propagation.
Fig. 19. Hardware and software module with error propagation.
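Table 2
AR annual unavailability: 1/λh = 1000 h, 1/µh = 10 h

| c    | 1/β = 30 s     | 1/β = 1 min    | 1/β = 5 min    | 1/β = 10 min   |
|------|----------------|----------------|----------------|----------------|
| 0.99 | 1 h and 34 min | 1 h and 38 min | 2 h and 06 min | 2 h and 45 min |
| 0.98 | 2 h and 00 min | 2 h and 04 min | 2 h and 33 min | 3 h and 11 min |
| 0.95 | 3 h and 20 min | 3 h and 23 min | 3 h and 52 min | 4 h and 31 min |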
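Table 3
Node 5 annual unavailability: 1/λh = 1000 h, 1/µh = 10 h

| c    | 1/β = 30 s     | 1/β = 1 min    | 1/β = 5 min    | 1/β = 10 min   |
|------|----------------|----------------|----------------|----------------|
| 0.99 | 1 h and 56 min | 2 h and 00 min | 2 h and 36 min | 3 h and 26 min |
| 0.98 | 2 h and 29 min | 2 h and 33 min | 3 h and 10 min | 3 h and 59 min |
| 0.95 | 4 h and 09 min | 4 h and 13 min | 4 h and 50 min | 5 h and 39 min |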
• The activation rate of a hardware fault is λh (Tafhp).
• The probability that a hardware fault is temporary is t (tftp). Such faults disappear with rate ε (Tdftp).
• A permanent hardware fault is detected by the fault-tolerance mechanisms with probability dh. The detection rate is δh (Tdehp).
• The effects of a non-detected error are perceived with rate πh (Tephp).
• Errors detected in the hardware component require its repair: the repair rate is µ (Trhp).

Note that a temporary fault in the hardware may propagate to the software (tpss) with probability phs. Also, when the hardware is in the repair state, the software is on hold. The software is reset or re-installed as soon as the hardware repair is finished.

5.4. AR availability evaluation

The model has been processed using the SURF-2 tool [5]. The unavailability, evaluated in hours per year according to the switch parameters (switching time 1/β and coverage factor c, i.e., the probability that the switch succeeds), is given in Table 2. It can be seen that both parameters have a significant impact. As expected, the smallest unavailability is obtained for the shortest switch time and the best coverage factor. Table 2 shows that an annual unavailability of approximately 2 h may be obtained for (0.99; 5 min) and (0.98; 1 min or 30 s).

A second example of results (Table 3) shows how the annual unavailability is affected by the variation of the hardware failure rate value. The only values of 1/β and c leading to an annual unavailability of 2 h are 1/β ≤ 1 min and c = 0.99. Also, compared with Table 2, there is a difference of at least about 22 min, and even more than an hour in the case of (0.95; 10 min).
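For readers who want to relate the annual figures quoted in the tables to a steady-state unavailability (a probability), the conversion can be sketched as follows. This is a minimal illustration and not part of the SURF-2 processing; the helper names are hypothetical, and a year of 8760 h (365 days) is an assumption.

```python
HOURS_PER_YEAR = 8760  # assumption: a 365-day year


def annual_downtime(unavailability: float) -> tuple:
    """Convert a steady-state unavailability into (hours, minutes) per year."""
    total_minutes = round(unavailability * HOURS_PER_YEAR * 60)
    return divmod(total_minutes, 60)


def unavailability_from_downtime(hours: int, minutes: int) -> float:
    """Convert an annual downtime into a steady-state unavailability."""
    return (hours * 60 + minutes) / (HOURS_PER_YEAR * 60)


if __name__ == "__main__":
    # Entry (c = 0.99; 1/beta = 30 s) of Table 2: 1 h and 34 min per year.
    u = unavailability_from_downtime(1, 34)
    print(f"unavailability = {u:.3e}")
    print("downtime (h, min) =", annual_downtime(u))
```

Under this assumption, an annual unavailability of about 2 h corresponds to a steady-state unavailability of roughly 2/8760.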
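Table 4
AR annual unavailability: 1/λh = 1000 h, 1/µh = 10 h

| c    | 1/β = 30 s     | 1/β = 1 min    | 1/β = 5 min    | 1/β = 10 min   |
|------|----------------|----------------|----------------|----------------|
| 0.99 | 1 h and 33 min | 1 h and 37 min | 2 h and 05 min | 2 h and 44 min |
| 0.98 | 2 h and 00 min | 2 h and 03 min | 2 h and 32 min | 3 h and 10 min |
| 0.95 | 3 h and 19 min | 3 h and 23 min | 3 h and 51 min | 4 h and 30 min |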
The last example of results presented here concerns a change in the repair time. Table 4 shows that reducing the hardware repair duration from 10 to 2 h does not significantly improve the system's annual availability (compared to Table 2).
6. Conclusions

Our modeling approach follows in the footsteps of most of the existing work on dependability modeling. What makes it unique is the inclusion of the system's functional specifications into the dependability model, by means of a functional-level model. Also, it allows a system to be modeled from its functional specification up to its implementation. The existing refinement techniques are conceived to preserve the result values. By contrast, ours provides more accurate models and associated results.

Thus, the modeling approach presented in this paper gives a generally applicable process for system analysis, based on GSPNs. This process involves a stepwise refinement in which dependencies are introduced at the appropriate level of refinement. A careful and precise definition of the constructs and of the refinement process is given. Indeed, we have shown how, starting from functional specifications, a functional-level model can be transformed progressively into a dependability model taking into account the system's structure. We have also shown how the structural model can be refined to incorporate more detailed information about the system's behavior.

Refinement is a very powerful tool for progressively mastering model construction. It will allow experienced, but not necessarily specially trained, modelers to analyze the dependability of one or several systems and to compare their dependability at the same level of modeling abstraction, if required.

The approach was illustrated here on simple examples related to a specific structure of an instrumentation and control system in power plants. However, we have applied this approach to three different I&C systems to identify their strong and weak points, in order to select the most appropriate one [9].
Acknowledgements

This work has been partially financed by the LIS (Laboratoire d'Ingénierie de Sûreté de fonctionnement). The authors wish to thank Mohamed Kaâniche for his helpful comments on an earlier version of this paper.
Appendix A. Formal definition of interfacing rules

The interface model $M_I$ has been defined in formal terms. Its formal definition is presented hereafter:

• Transitions of $M_I$: Consider $T_{up}$ (respectively, $T_{down}$) the set of immediate transitions $t$ (respectively, $t'$) of the upstream (respectively, downstream) part of the interface model, i.e.:

$$T_{up} = \{t : t \in M_I\ \text{up}\} \quad \text{and} \quad T_{down} = \{t' : t' \in M_I\ \text{down}\}$$

and $P$ (respectively, $\bar{P}$) the set of ok places (respectively, ko places) of the structural model, i.e.:

$$P = \{p_{ok} : p_{ok} \in M_S\} \quad \text{and} \quad \bar{P} = \{p_{ko} : p_{ko} \in M_S\}.$$

Also,
  • if $N$ is the total number of components (corresponding equally to the number of tokens of $M_S$'s initial marking), and
  • if $Q$ is the number of places having a non-null initial marking ($Q \le N$ and $Q = \mathrm{Card}(P)$),
then

$$\mathrm{Card}(T_{up}) = \mathrm{Card}(\bar{P}) + 1 \ \wedge\ \mathrm{Card}(T_{down}) = \mathrm{Card}(P) + 1.$$

Moreover, since by principle $\mathrm{Card}(P) = \mathrm{Card}(\bar{P})$:

$$\mathrm{Card}(T_{up}) = \mathrm{Card}(T_{down}).$$

• Arcs of $M_I$: Consider:
  • ${}^{\bullet}t$ corresponding to the set of input arcs of $t$: represented by the set of input places of $t$.
  • $t^{\bullet}$ corresponding to the set of output arcs of $t$: represented by the set of output places of $t$.
  • ${}^{\circ}t$ corresponding to the set of inhibitor arcs of $t$: represented by the set of input places of $t$.

(i) Arcs of $M_I$ up:

$${}^{\bullet}t_F = \{p_{x\,ok}, p_{y\,ok} : x \text{ and } y \text{ are in series}\}, \qquad {}^{\bullet}t_{Cx} = \{F\} \cup \{p_{x\,ko}, p_{y\,ko} : x \text{ and } y \text{ are in parallel}\},$$
$$t_F^{\bullet} = \{F\} \cup \{p_{x\,ok}, p_{y\,ok} : x \text{ and } y \text{ are in series}\}, \qquad t_{Cx}^{\bullet} = \{p_{x\,ko}, p_{y\,ko} : x \text{ and } y \text{ are in parallel}\},$$
$${}^{\circ}t_F = \{F\}, \qquad {}^{\circ}t_{Cx} = \emptyset.$$

(ii) Arcs of $M_I$ down:

$${}^{\bullet}t'_F = \{\bar{F}\} \cup \{p_{x\,ok}, p_{y\,ok} : x \text{ and } y \text{ are in series}\}, \qquad {}^{\bullet}t'_{Cx} = \{p_{x\,ko}, p_{y\,ko} : x \text{ and } y \text{ are in parallel}\},$$
$$t'^{\bullet}_F = \{p_{x\,ok}, p_{y\,ok} : x \text{ and } y \text{ are in series}\}, \qquad t'^{\bullet}_{Cx} = \{\bar{F}\} \cup \{p_{x\,ko}, p_{y\,ko} : x \text{ and } y \text{ are in parallel}\},$$
$${}^{\circ}t'_F = \emptyset, \qquad {}^{\circ}t'_{Cx} = \{\bar{F}\}.$$
Rules have been presented for two components; nevertheless, they can easily be generalized to several components:

• Weight of the arcs: it is given by the minimum marking of each place needed for the activation of the respective transition.

(i) If components are all different:

(a) They are all in series:

$t_F$: $M(F) = 0 \wedge \forall x \le N,\ M(p_{x\,ok}) = 1$,
$t_{Cx}$: $M(F) = 1 \wedge M(p_{x\,ko}) = 1$,
$t'_F$: $M(\bar{F}) = 1 \wedge \forall x \le N,\ M(p_{x\,ok}) = 1$,
$t'_{Cx}$: $M(\bar{F}) = 0 \wedge M(p_{x\,ko}) = 1$,

where $N$ is the total number of components.

(b) They are all in parallel:

$t^i_F$: $M(F) = 0 \wedge M(p_{i\,ok}) = 1$,
$t_{Cx}$: $M(F) = 1 \wedge \forall x,y \le N,\ M(p_{x\,ko}) = 1 \wedge M(p_{y\,ko}) = 1$,
$t'^i_F$: $M(\bar{F}) = 1 \wedge M(p_{i\,ok}) = 1$,
$t'_{Cx}$: $M(\bar{F}) = 0 \wedge (\forall x,y \le N,\ M(p_{x\,ko}) = 1 \vee M(p_{y\,ko}) = 1)$.

(c) There are $Q$ in parallel and $R$ in series ($R + Q = N$):

$t^i_F$: $M(F) = 0 \wedge \forall x,y,\ M(p_{x\,ok}) = 1 \wedge M(p_{y\,ok}) = 1$, where $x$ and $y$ are in series and $i = 1, \ldots, Q$,
$t_{Cx}$: $M(F) = 1 \wedge \forall x,y,\ M(p_{x\,ko}) = 1 \wedge M(p_{y\,ko}) = 1$, where $x$ and $y$ are in parallel and $x = 1, \ldots, R$,
$t'^i_F$: $M(\bar{F}) = 1 \wedge \forall x,y,\ M(p_{x\,ok}) = 1 \wedge M(p_{y\,ok}) = 1$, where $x$ and $y$ are in parallel and $i = 1, \ldots, R$,
$t'_{Cx}$: $M(\bar{F}) = 0 \wedge \forall x,y,\ M(p_{x\,ko}) = 1 \wedge M(p_{y\,ko}) = 1$, where $x$ and $y$ are in series and $x = 1, \ldots, Q$.

(ii) If there are some identical components, and considering $r$ the maximum number of tokens in place $p_x$ and $Q$ the number of ok places of $M_S$ ($Q < N$):

(a) They are all in series:

$t_F$: $M(F) = 0 \wedge \forall x \le Q,\ M(p_{x\,ok}) = r$,
$t_{Cx}$: $M(F) = 1 \wedge \forall x \le Q,\ M(p_{x\,ko}) = 1$,
$t'_F$: $M(\bar{F}) = 1 \wedge \forall x \le Q,\ M(p_{x\,ok}) = r$,
$t'_{Cx}$: $M(\bar{F}) = 0 \wedge \forall x \le Q,\ M(p_{x\,ko}) = 1$.

(b) They are all in parallel:

$t_F$: $M(F) = 0 \wedge \forall x \le Q,\ M(p_{x\,ok}) = 1$,
$t_{Cx}$: $M(F) = 1 \wedge \forall x \le Q,\ M(p_{x\,ko}) = r$,
$t'_F$: $M(\bar{F}) = 1 \wedge \forall x \le Q,\ M(p_{x\,ok}) = 1$,
$t'_{Cx}$: $M(\bar{F}) = 0 \wedge \forall x \le Q,\ M(p_{x\,ko}) = r$.

(c) There are $Q$ in parallel and $R$ in series: in this case, we have to separate them, obtaining the case presented in (i)(c).
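As an illustration of how the enabling conditions of case (i)(a) (distinct components, all in series) can be read, the following Python sketch fires the upstream interface transitions on a marking stored in a dictionary. It is a simplified reading under stated assumptions, not the formal definition: only place F is modeled, ok and ko places are treated as test places (they are both inputs and outputs of the interface transitions, so their marking is unchanged), and the names `stabilize`, `t_F_enabled`, `t_C_enabled` and the key convention `x_ok`/`x_ko` are hypothetical.

```python
def t_F_enabled(m: dict, components: list) -> bool:
    """t_F: F is empty (inhibitor arc) and every ok place is marked."""
    return m["F"] == 0 and all(m[c + "_ok"] == 1 for c in components)


def t_C_enabled(m: dict, x: str) -> bool:
    """t_Cx: F is marked and the ko place of component x is marked."""
    return m["F"] == 1 and m[x + "_ko"] == 1


def stabilize(marking: dict, components: list) -> dict:
    """Fire enabled upstream interface transitions until none is enabled."""
    m = dict(marking)
    while True:
        if t_F_enabled(m, components):
            m["F"] = 1  # t_F puts a token in F
        elif any(t_C_enabled(m, x) for x in components):
            m["F"] = 0  # t_Cx removes the token from F
        else:
            return m


if __name__ == "__main__":
    comps = ["x", "y"]
    m = stabilize({"F": 0, "x_ok": 1, "x_ko": 0, "y_ok": 1, "y_ko": 0}, comps)
    print("all components ok:", m["F"])
    m.update(y_ok=0, y_ko=1)  # component y fails in the structural model
    print("after failure of y:", stabilize(m, comps)["F"])
```

In this reading, F is marked as long as all series components are in their ok places and is emptied as soon as one of them moves to its ko place.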
Appendix B. Example of the enabling memory with external dependencies

We present here an example of application of the enabling memory with external dependencies presented in Section 4.3. Consider a fault-tolerant system (two redundant computers, C1 and C2, and two redundant disks, D1 and D2) with two tasks of different priority (Fig. 20). The system has the following characteristics: C1 executes high priority task HT, C2 executes low priority task LT, HT uses C1 and D1 or D2, and LT uses C2 and D1 or D2. Under these conditions, if:

• C1 fails (and C2 is in its up state): LT is stopped and HT is executed in C2.
• C2 fails (and C1 is in its up state): LT is stopped and HT will continue to be executed in C1.
• D1 (or D2) fails: both tasks (HT and LT) will continue to be executed.
• D1 and D2 fail: both tasks are stopped.

Some further details on the system's functioning in case of failure of one or more of its components:

• The transfer of HT to C2 is done by automatic coverage by means of D1 or D2.
• The failure rate of each Ci is λ.
• There are two types of disk failure: electronic (with rate λe) and electromechanical (with rate λe).
• There is only one repairman.
• The repair rate of each Ci is µ.
• There are two types of disk repair: electronic (with rate µe) and electromechanical (with rate µe).
We assume that, since there is only one repairman, both Ci have repair priority over each Di. Also, if both Ci or both Di are in a failure state, there is a global system stop and repair. Fig. 21 gives the GSPN model of the system's behavior considering all the above assumptions. The tables in the figure give the correspondence between the places/timed transitions and the system's states/transition rates.
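The task reconfiguration rules listed in this appendix can be summarized by a small decision function. The Python sketch below is only an illustration of those rules, not of the GSPN itself; the function name `task_allocation` and its return format are hypothetical, and the case where both computers are down is treated as a global stop, as stated in the assumptions above.

```python
def task_allocation(c1_up: bool, c2_up: bool, d1_up: bool, d2_up: bool) -> dict:
    """Return the computer running each task (None means the task is stopped)."""
    disks_ok = d1_up or d2_up
    if not disks_ok or not (c1_up or c2_up):
        # Both disks or both computers failed: global system stop.
        return {"HT": None, "LT": None}
    if c1_up and c2_up:
        return {"HT": "C1", "LT": "C2"}
    if c1_up:
        # C2 failed: LT is stopped, HT keeps running on C1.
        return {"HT": "C1", "LT": None}
    # C1 failed: LT is stopped and HT is transferred to C2.
    return {"HT": "C2", "LT": None}


if __name__ == "__main__":
    print(task_allocation(True, True, True, True))
    print(task_allocation(False, True, True, False))
    print(task_allocation(True, True, False, False))
```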
Fig. 20. Duplex system with two tasks of different priority.
Fig. 21. GSPN model of the duplex system.
Immediate transitions t3, t4, t5, t6 and t7 empty places C, C̄, D̄e, D and D̄m, respectively, when a general system failure arises. Assume that the law of timed transition T5 is an Erlang-k. In this case, when a global system failure arises, the system's developer must choose between the replacement of the failed component(s) and a global system repair. This last case corresponds to the enabling memory policy. Fig. 22 gives the corresponding GSPN model. It is worth noting that place DG, which empties the subnet, is not the same as the one that activates/stops the timer (place D) of timed transition T5. Thus, place DG corresponds to place Pdep, and immediate transitions t10, t11 and t12 correspond to immediate transitions t′1, t′2 and t′3 of Fig. 12. It is therefore important, when making a distribution modification, to take into account the existing structural conflicts and the possible time conflicts.
Fig. 22. GSPN model of the enabling memory case.
References

[1] M.A. Ajmone Marsan, G. Chiola, On Petri nets with deterministic and exponentially distributed firing time, Lecture Notes in Computer Science, vol. 266, Springer, Berlin, 1987, pp. 132–145.
[2] M.A. Ajmone Marsan, G. Balbo, G. Conte, S. Donatelli, G. Franceschinis, Modelling with Generalized Stochastic Petri Nets, Series in Parallel Computing, Wiley, New York, 1995.
[3] H.H. Ammar, Y.F. Huang, R.-W. Liu, Hierarchical models for systems reliability, maintainability, and availability, IEEE Trans. Circ. Syst. 34 (6) (1987) 629–638.
[4] J. Bechta Dugan, K.J. Sullivan, D. Coppit, Developing a low-cost high-quality software tool for dynamic fault-tree analysis, IEEE Trans. Reliab. 49 (1) (2000) 49–59.
[5] C. Béounes, et al., SURF-2: a program for dependability evaluation of complex hardware and software systems, in: Proceedings of the 23rd International Symposium on Fault-tolerant Computing (FTCS-23), 1993, pp. 668–673.
[6] C. Betous-Almeida, K. Kanoun, Dependability evaluation: from functional to structural modelling, in: Proceedings of the 20th International Conference on Computer Safety, Reliability and Security (SAFECOMP 2001), Lecture Notes in Computer Science, vol. 2187, Springer, Berlin, 2001, pp. 227–237.
[7] C. Betous-Almeida, Construction and refinement of dependability models: application to instrumentation and control systems, Ph.D. Dissertation, LAAS-CNRS Report 02275, 2002 (in French).
[8] C. Betous-Almeida, K. Kanoun, Stepwise construction and refinement of dependability models, in: Proceedings of the International Conference on Dependable Systems and Networks (IPDS), 2002, pp. 515–524.
[9] C. Betous-Almeida, K. Kanoun, Dependability modelling of instrumentation and control systems: a comparison of competing architectures, LAAS-CNRS Report 02204, 2002.
[10] A. Bobbio, K.S. Trivedi, An aggregation technique for the transient analysis of stiff Markov chains, IEEE Trans. Comput. 35 (9) (1986) 803–814.
[11] A. Bobbio, E. Ciancamerla, G. Franceschinis, R. Gaeta, M. Minichino, L. Portinale, Methods of increasing modelling power for safety analysis applied to a turbine digital control system, in: Proceedings of the 21st International Conference on Computer Safety, Reliability and Security (SAFECOMP 2002), Lecture Notes in Computer Science, vol. 2434, Springer, Berlin, 2002, pp. 212–223.
[12] A. Bondavalli, I. Mura, K.S. Trivedi, Dependability modelling and sensitivity analysis of scheduled maintenance systems, in: Proceedings of the Third European Dependable Computing Conference (EDCC-3), Lecture Notes in Computer Science, vol. 1667, Springer, Berlin, 1999, pp. 7–23.
[13] G. Chiola, S. Donatelli, GSPN versus SPNs: what is the actual role of immediate transitions? in: Proceedings of the Fourth International Workshop on Petri Nets and Performance Models (PNPM'91), IEEE Computer Society Press, 1991, pp. 20–30.
[14] P. Chen, S.C. Bruell, G. Balbo, Alternative methods for incorporating non-exponential distributions into stochastic timed Petri net, in: Proceedings of the Third International Workshop on Petri Nets and Performance Models (PNPM'89), IEEE Computer Society Press, December 1989, pp. 187–197.
[15] D.R. Cox, H.D. Miller, The Theory of Stochastic Processes, Chapman & Hall, London, 1965.
[16] A. Cumani, ESP: a package for the evaluation of stochastic Petri nets with phase-type distributed transition times, in: Proceedings of the International Workshop on Petri Nets and Performance Models (PNPM'85), vol. 684, IEEE Computer Society Press, 1985, pp. 144–151.
[17] M. Felder, A. Gargantini, A. Morzenti, A theory of implementation and refinement in timed Petri net, Theor. Comput. Sci. 202 (1-2) (1998) 127–161.
[18] N. Fota, M. Kaâniche, K. Kanoun, Incremental approach for building stochastic Petri nets for dependability modeling, in: D.C. Ionescu, N. Limnios (Eds.), Statistical and Probabilistic Models in Reliability, Birkhäuser, Basel, 1999, pp. 321–335.
[19] K. Kanoun, M. Borrel, T. Morteveille, A. Peytavin, Availability of CAUTRA, a subset of the French air traffic control system, IEEE Trans. Comput. 48 (5) (1999) 528–535.
[20] M. Rabah, K. Kanoun, Dependability evaluation of a distributed shared memory multiprocessor system, in: Proceedings of the Third European Dependable Computing Conference (EDCC-3), Lecture Notes in Computer Science, vol. 1667, Springer, Berlin, 1999, pp. 42–59.
[21] R. Sahner, K.S. Trivedi, A. Puliafito, Performance and Reliability Analysis of Computer Systems: An Example-based Approach Using the SHARPE Software Package, Kluwer Academic Publishers, Dordrecht, 1996.
[22] M. Scarpa, A. Bobbio, Kronecker representation of stochastic Petri nets with discrete PH distributions, in: Proceedings of the International Computer Performance and Dependability Symposium (IPDS98), IEEE Computer Society Press, 1998, pp. 52–61.
[23] I. Suzuki, T. Murata, A method for stepwise refinement and abstraction of Petri net, J. Comput. Syst. Sci. 27 (1983) 51–76.
[24] R. Valette, Analysis of Petri nets by stepwise refinement, J. Comput. Syst. Sci. 18 (1) (1979) 35–46.