Bayesian belief networks for human reliability analysis: A review of applications and gaps


Reliability Engineering and System Safety 139 (2015) 1–16


Review

L. Mkrtchyan, L. Podofillini*, V.N. Dang
Paul Scherrer Institute, Switzerland

Article info

Article history: Received 17 December 2014; Received in revised form 11 February 2015; Accepted 15 February 2015; Available online 21 February 2015

Abstract

The use of Bayesian Belief Networks (BBNs) in risk analysis (and in particular in Human Reliability Analysis, HRA) is fostered by a number of features, attractive in fields with a shortage of data and a consequent reliance on subjective judgments: the intuitive graphical representation, the possibility of combining diverse sources of information, and the use of the probabilistic framework to characterize uncertainties. In HRA, BBN applications are steadily increasing, each emphasizing a different BBN feature or a different HRA aspect to improve. This paper aims at a critical review of these features, as well as at suggesting research needs. Five groups of BBN applications are analysed: modelling of organizational factors, analysis of the relationships among failure influencing factors, BBN-based extensions of existing HRA methods, dependency assessment among human failure events, and assessment of situation awareness. Further, the paper analyses the process for building BBNs and in particular how expert judgment is used in the assessment of the BBN conditional probability distributions. The gaps identified in the review suggest the need for establishing more systematic frameworks to integrate the different sources of information relevant for HRA (cognitive models, empirical data, and expert judgment) and for investigating algorithms to avoid the elicitation of many relationships via expert judgment.

© 2015 Elsevier Ltd. All rights reserved.

Keywords: Human reliability analysis; Bayesian belief networks; Expert judgment; Human factors; Organizational factors; Performance shaping factors; Human error probabilities

Contents

1. Introduction
2. Bayesian belief networks
3. Bayesian belief networks for human reliability analysis
   3.1. BBNs to model the impact of organizational factors on human reliability
   3.2. BBNs to model the relationships among PSFs
   3.3. BBN-based extensions of HRA methods
   3.4. BBNs to model HFE dependence in HRA
   3.5. BBNs for situation assessment
4. Building BBNs: information sources and knowledge acquisition
   4.1. Nodes, states, structure and model verification/validation
   4.2. CPT assessment
5. Discussion
   5.1. Modelling of complexity
   5.2. Combination of different sources of information
   5.3. Use of expert judgment in the quantification of the CPTs
6. Conclusions
Acknowledgement
References

* Corresponding author. Tel.: +41 56 310 53 56. E-mail address: luca.podofi[email protected] (L. Podofillini).

http://dx.doi.org/10.1016/j.ress.2015.02.006
0951-8320/© 2015 Elsevier Ltd. All rights reserved.


1. Introduction

Human reliability analysis (HRA) aims at systematically identifying and analysing the causes, consequences and contributions of human failures in socio-technical systems (e.g., nuclear power plants, aerospace systems, air traffic control operations, chemical and oil and gas facilities). For many industrial sectors, with differences in the level of sophistication and detail of the applications and methods, HRA is an established practice within Probabilistic Safety Assessment (PSA) (or Probabilistic Risk Assessment, Quantitative Risk Assessment, Formal Risk Assessment, as it may be referred to depending on the industrial sector). HRA is an essential element for using PSA for regulatory and operational decisions.

A number of HRA methods are used in current practice [1–7]. They differ in their scope, the types and levels of decomposition of the tasks addressed, and the factors considered to influence the error probability. The methods guide analysts in the identification of potential human errors, in the analysis of the performance contexts and in the quantification of error probabilities. Despite the established and successful practice, some areas of HRA need development, among these: extensions of the method scope (to different types of errors and to other industrial sectors); a stronger basis in cognitive models and empirical data; applicability to advanced human-machine interfaces; more structured and detailed qualitative analyses; and a more empirically-based representation of the failure influencing factors. For comprehensive and recent analyses of the strengths and limitations of HRA methods, see [8–10].

Recently, applications of Bayesian Belief Networks (BBNs) to HRA have been receiving increasing attention. Generally speaking, BBNs appear promising for their ability to represent complex influencing factor relationships. Also, their ability to combine different sources of information potentially allows developing HRA models with a stronger basis in cognitive theory and empirical data.

BBNs are models that represent and quantify probabilistic relationships among factors. Their primary use is the representation of knowledge and decision support under uncertainty; their application is established in areas as diverse as medical diagnosis and prognosis, engineering, finance, information technology and the natural sciences [11–15]. Their use ranges from data-mining applications to the representation of expert knowledge in rare-event applications, the latter situation being typical of risk analysis. The general use of BBNs in data-rich applications is to identify the important factors, their relationships (correlations and causal relationships) and their quantitative influence on the variables of interest, as these emerge from the data. In most of the applications dealing with rare events, BBNs are used to represent the expert knowledge about factors and their influences.

HRA is a field in which data is scarce, but precious. Bayesian frameworks have been naturally recognized as appropriate for handling scarce, multi-source data, potentially allowing improvement of both the estimation of human error probabilities and the underlying assumptions in the quantitative algorithms employed by the different HRA methods [16]. Correspondingly, BBN applications in HRA have steadily increased within the last decade. The studies using BBNs within the HRA domain can be grouped as follows [17].
A number of studies use the BBN ability to model multi-level influences of Management and Organizational Factors (MOFs) on human error [18–28]. In some cases [19,20], these studies have extended previous safety analyses by mapping/integrating traditional reliability models such as fault trees into BBNs, in an effort to integrate the BBN ability to model soft influences (e.g. from human or, more

generally, organizational factors) within existing safety models. BBNs have also been used to understand and capture the relationships among PSFs and the quantitative impact of PSF configurations on the error probability [29–34]. Some contributions propose BBN versions of existing HRA models, such as SPAR-H and CREAM [35–37], introducing additional modelling features such as interdependent Performance Shaping Factors (PSFs)1 [35] and extending the deterministic approach of the control mode assessment in CREAM [36]. BBN applications to improve dependency assessment in HRA are presented in [38–41]. Potential misdiagnosis errors by nuclear power plant operators are analysed by exploiting BBN backward reasoning, i.e. reasoning from effects to possible causes [42–44].

The use of BBNs for HRA has found applications within different industries: nuclear [18,29,30,35,36,38,39,42–44], oil industry [19–24,28,34,37], and aviation [26,27,30]. Each of the mentioned studies emphasizes a different BBN feature relevant for HRA, e.g. the ability to deal with scarce data, to incorporate diverse information, or to model complex multi-layer relationships. The present paper systematically surveys these applications, critically reviewing these features as well as identifying research needs. Five groups of HRA applications are identified [17]: modelling of organizational factors, analysis of the relationships among PSFs, BBN-based extensions of existing HRA methods, dependence assessment among Human Failure Events (HFEs), and modelling of situation assessment. The present paper analyses in further detail the contributions from each group.

The review gives special emphasis to how the BBNs are built and, in particular, to how expert judgment is incorporated into the models. Indeed, given the limited availability of empirical data for comprehensive model validation, the phase of model development acquires special importance for the acceptance of the models. Also, for HRA applications the primary source of information when developing BBN models is generally expert judgment, though important exceptions are [29,30]. The present paper analyses the approaches used to elicit the expert knowledge, to include it in the BBN model and to combine it with empirical data, when available. Note that a very important area of on-going development for HRA is the collection of data from simulated environments: fundamental issues are being researched and tools and guidelines are being developed, e.g. [45–48]. On the one hand, it can be expected that these efforts will enhance the empirical basis of HRA models and, eventually, decrease the need for judgment in some HRA applications. On the other hand, the elicitation of expert knowledge will remain of key importance for HRA, especially for applications for which data will be very difficult to obtain; examples, for nuclear PSA applications, are HRA for accident mitigation conditions and for external initiating events. Also, expert judgment will remain an important source of information for industrial sectors in which the collection of HRA data is less advanced than in the nuclear industry.

The paper is organized as follows. Section 2 briefly presents BBNs. In Section 3, the HRA aspects modelled by BBNs are presented and each of the five HRA application groups is discussed in detail. Section 4 addresses the BBN development process in the reviewed

1 In the present paper these factors will be generically referred to as Performance Shaping Factors (PSFs), as they are often referred to in HRA – although some HRA methods refer to these factors differently to highlight their different features.


studies: node selection, structure definition, parameter quantification, and verification and validation are discussed. Section 5 discusses the BBN capabilities that drive their application in HRA, in particular: the ability to model complex factor relationships, the possibility of treating multiple sources of information, and the formal representation of expert knowledge in the absence of empirical data. These BBN features are critically reviewed and research gaps are identified. Finally, Section 6 summarizes the survey insights and the identified research needs.

2. Bayesian belief networks

BBNs were first introduced by Pearl [49] in 1986, with the definition: "Belief networks are directed acyclic graphs in which the nodes represent propositions (or variables), the arcs signify direct dependencies between the linked propositions, and the strengths of these dependencies are quantified by conditional probabilities". BBNs are also called belief networks, Bayesian networks, and influence networks. If the nature of the links is causal, BBNs can also be called causal networks.

Each node in a BBN (e.g. nodes A–E in Fig. 1) represents a random variable and each link (also referred to as an arc) between two variables represents a dependence between them. Nodes with outgoing links into a specific node are called parents of that node; the node into which the parents point is called the child node. With reference to Fig. 1, nodes A and D are parents of the child node E; node D is the child of the parents B and C. In their most typical representation, the BBN nodes are associated with discrete states (with reference to Fig. 1: a1, a2 are the states of node A; b1, b2 are the states of node B; …; e1, e2 are the states of node E). The general preference for discrete nodes arises from the simpler calculation scheme compared to continuous nodes, although the adoption of continuous nodes can be advisable (or even necessary) for some applications [13]. Fig. 1 shows an example of a BBN with binary states, but multi-state nodes are also frequently used. For BBNs with discrete-state nodes, the BBN relationships are in the form of Conditional Probability Distributions (CPDs) of each child state for each possible combination of the parents' states. For example, the relationship between node E and its parents A and D is represented by the conditional probabilities:

P(E = e1 | A = a1, D = d1); P(E = e2 | A = a1, D = d1) = 1 − P(E = e1 | A = a1, D = d1)
P(E = e1 | A = a1, D = d2); P(E = e2 | A = a1, D = d2) = 1 − P(E = e1 | A = a1, D = d2)
P(E = e1 | A = a2, D = d1); P(E = e2 | A = a2, D = d1) = 1 − P(E = e1 | A = a2, D = d1)
P(E = e1 | A = a2, D = d2); P(E = e2 | A = a2, D = d2) = 1 − P(E = e1 | A = a2, D = d2)

The CPDs for each node are arranged in Conditional Probability Tables (CPTs), as shown in Tables 1 and 2.

Table 1
CPT for node E of the BBN in Fig. 1.

          A = a1              A = a2
          D = d1    D = d2    D = d1    D = d2
E = e1    0.9       0.6       0.3       0.2
E = e2    0.1       0.4       0.7       0.8

Table 2
CPT for node D of the BBN in Fig. 1.

          B = b1              B = b2
          C = c1    C = c2    C = c1    C = c2
D = d1    0.05      0.6       0.9       0.78
D = d2    0.95      0.4       0.1       0.22

Fig. 1. A simple example of a BBN (nodes A and D are the parents of node E; nodes B and C are the parents of node D; each node has two states, e.g. a1, a2 for node A).

Fig. 2. Main steps to build a BBN (iterations among steps are generally required).

The BBN formalism allows visualizing and quantifying properties of the joint probability distribution of A–E. In particular, for the most typical HRA applications, nodes A–C would model observable PSFs; node D would model an error context or error mechanism (possibly not observable); node E would model the occurrence of the human error. HRA applications are mostly interested in calculating the probability of human error given that the factor states are in a particular configuration, or in inferring knowledge on the factor states (i.e. on what has influenced the error occurrence) given that the error has been observed. The former use of BBNs is often referred to as predictive, the latter as diagnostic.

The general process for building BBNs is detailed in several references, e.g. [11–13]. Summarizing, after the definition of the model scope, the first step is the definition of the nodes and their states (Fig. 2). The states can represent values or intervals of physical quantities, linguistic terms, or 'soft' concepts not directly quantifiable. The next two steps address the determination of the graphical structure of the network and the assessment of the CPT values. Data-rich applications (e.g. some medical diagnosis and financial applications) typically resort to algorithms to learn both the BBN structure and the CPT values. The structure-learning algorithms generally search for the BBN structure that optimizes some predefined criteria, under some constraints; for the CPT values, depending on the amount and quality of the available data, the algorithms are based on the maximum likelihood estimator, the Bayesian estimator and more sophisticated procedures to cope with realistic data sets with missing, incorrect or sparse entries (refer to [11,50] for a comprehensive treatment). For rare-event applications, BBNs are typically constructed based on input from domain experts. In this phase, the typical tools are questionnaires, interviews and panel discussions. The issues to deal with are typical of applications in which a large number of probabilities is elicited from experts, e.g. avoiding different types of biases and ensuring consistency in the assessments (detailed analyses of these issues can be found in [12,15,49,51,52]). Indeed, many applications fall in between these two categories. If small data sets are available, the typical approach is to construct the BBN structure with expert judgment (structure-learning algorithms typically require very large data sets) and use the available data for quantification of the relationships. If data is available only for some relationships, the rest of them are completed with expert


judgement. Correlation and factor analyses can also be used to help analysts identify the presence of relationships among the factors (and therefore the need for a corresponding link in the BBN), as demonstrated in [29].

The last step involves verification and validation. Verification is intended as the process of making sure that the model behaves according to its specifications. It generally involves sensitivity analyses, internal consistency checks, and model runs on well-known scenarios [13]. Validation refers to the process of making sure that the model reflects reality. The common approach for model validation entails splitting the available data sets into training and test sets; indeed, this approach can only be adopted for data-rich applications. For rare-event applications, validation is an inherently problematic issue. In particular, for HRA, the quality of personnel performance on safety-relevant tasks in industrial systems is generally high, so that few failures are observed. Also, human performance is strongly influenced by specific contextual factors, so that data collected in one situation is difficult to generalize. Kirwan [53,54] provides a detailed discussion of the definitions and main criteria for validation and verification of HRA methods; some considerations are applicable beyond HRA. Kirwan classifies validation according to data quality: absolute validation based on real data, approximate validation based on other data (simulator data, experimental literature, expert judgement, etc.) and convergent validation, where the results are compared with those of other modelling techniques. Depending on the problem at hand, the appropriate validation approach should be pursued.
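To make the predictive and diagnostic uses concrete, the following sketch (plain Python, no BBN library) computes P(E = e1) and P(A = a1 | E = e1) for the example BBN of Fig. 1 by brute-force enumeration of the joint distribution, using the CPT values of Tables 1 and 2. Since the example does not specify priors for the root nodes A, B and C, uniform priors are assumed here.

```python
from itertools import product

# Priors for the root nodes A, B, C (assumed uniform; not given in the example).
p_a = {"a1": 0.5, "a2": 0.5}
p_b = {"b1": 0.5, "b2": 0.5}
p_c = {"c1": 0.5, "c2": 0.5}

# Table 2: P(D = d1 | B, C); state d2 takes the complement.
p_d1 = {("b1", "c1"): 0.05, ("b1", "c2"): 0.60,
        ("b2", "c1"): 0.90, ("b2", "c2"): 0.78}

# Table 1: P(E = e1 | A, D); state e2 takes the complement.
p_e1 = {("a1", "d1"): 0.9, ("a1", "d2"): 0.6,
        ("a2", "d1"): 0.3, ("a2", "d2"): 0.2}

def joint(a, b, c, d, e):
    """P(A=a, B=b, C=c, D=d, E=e), factorized along the BBN of Fig. 1."""
    pd = p_d1[(b, c)] if d == "d1" else 1.0 - p_d1[(b, c)]
    pe = p_e1[(a, d)] if e == "e1" else 1.0 - p_e1[(a, d)]
    return p_a[a] * p_b[b] * p_c[c] * pd * pe

states = list(product(p_a, p_b, p_c, ("d1", "d2"), ("e1", "e2")))

# Predictive use: marginal probability of the 'failure' state e1.
p_e1_marg = sum(joint(*s) for s in states if s[4] == "e1")

# Diagnostic use: P(A = a1 | E = e1), i.e. reasoning from effect to cause.
p_a1_e1 = sum(joint(*s) for s in states if s[0] == "a1" and s[4] == "e1")
print(f"P(E=e1) = {p_e1_marg:.4f}")
print(f"P(A=a1 | E=e1) = {p_a1_e1 / p_e1_marg:.4f}")
```

Exact inference by enumeration is exponential in the number of nodes; dedicated propagation algorithms (e.g. junction-tree methods) are used in practice for larger networks.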

3. Bayesian belief networks for human reliability analysis

The reviewed studies fall into five groups [17], according to the HRA aspect they address (Table 3). These are studies using BBNs: (1) to represent the possible impact of MOFs on human reliability, (2) to analyse the relationships between PSFs, (3) to extend existing HRA methods to capture additional features allowed by the BBN modelling framework, (4) to assess dependence among Human Failure Events (HFEs), and (5) to model situation assessment.

Table 3
HRA aspects addressed with BBN-based approaches.

HRA aspect                       Contributions
MOF impact assessment            [18–28]
PSF relationship assessment      [29–34]
HRA method extensions            [35–37]
HFE dependence assessment        [38–41]
Situation assessment             [42–44]

Indeed, these aspects are strongly inter-related. For example, representing the influence of MOFs on human error would naturally also address the relationships among PSFs, as well as the possible dependencies across multiple failure events introduced by the MOFs. Whenever a study fits into multiple groups, it is classified according to the most representative HRA aspect treated. Fig. 3 shows examples of the types of BBNs used within each group. The features of each type will be presented in detail in the following sections.

3.1. BBNs to model the impact of organizational factors on human reliability

Investigations of major industrial accidents point to management and organizational factors (MOFs) as important contributing causes of the accidents [55–57]. The key role of MOFs in ensuring safety is recognized by the industries dealing with complex socio-technical systems, and increasing attention is given to measures to

maintain a high level of safety culture. The incorporation of MOFs in quantitative risk assessments (PSA) is also receiving increasing interest, the main open issues being: which factors should (or can) be incorporated; how to quantify MOFs and their influence on risk; and which modelling techniques should be adopted. The aim is to explicitly model the influence on risk of the deeper, more fundamental accident causes related to the organization and the environment in which the organization operates (e.g. safety culture, financial pressure, oversight by regulatory bodies, to name a few).

The incorporation of MOFs in quantitative risk assessment poses significant challenges. MOFs are numerous and strongly interconnected. Their states and their influence on risk are difficult to define, operationalize, and measure. The impact of MOFs on risk can be subtle but far-reaching, affecting normal operation practices, maintenance, the effectiveness of training plans, and emergency operation. Also, the influences are generally long-term and therefore very difficult to monitor. A thorough discussion of these challenges is beyond the scope of the present paper; the interested reader can refer to [55–57].

The use of BBNs to incorporate MOFs in risk assessment has been strongly fostered by the work in [55,56]. In particular, [55] introduces a set of modelling principles aimed at improving the theoretical foundations of the field of organizational safety risk analysis; based on these principles, the Socio-Technical Risk Analysis (SoTeRiA) framework is developed [56]. Ref. [56] proposes the use of a hybrid technique to represent and quantify the SoTeRiA framework: System Dynamics, BBNs, and PSA logic modelling techniques (Event Sequence Diagrams, Event Trees and Fault Trees). In particular, BBNs are advocated to capture the "uncertain nature of the relation between human performance and its organizational context" [56]. Indeed, the BBN is a natural framework for "explicit probabilistic relations among elements of the model, where objective data are lacking and use of expert opinion and soft evidence is inevitable" [56].

Generally, the approach to incorporating the influences of MOFs within HRA is to use BBNs to explicitly model their multi-level, hierarchical influences on the HEP (Fig. 3, element (1)). Some factors (typically those considered by HRA methods as PSFs) directly influence the HEP, e.g. the quality of the human-machine interface and the time available for the personnel to carry out their tasks. Other factors, typically the organizational factors, influence the HEP indirectly, e.g. the management's commitment to safety influences the quality of personnel training, which, in turn, directly influences the HEP. BBNs are a natural tool to capture these influences, because of their graphical formalism, which explicitly shows the direct, indirect and generally multi-level influences, as well as their modelling approach based on decomposing complex, interrelated relationships into conditionally independent influences.

In what follows, we briefly outline the main focus of each of the surveyed works, with the goal of identifying the specific issues addressed with the BBN framework and the problematic areas. Note that the focus of the present paper is on BBN uses: the review of methods and application cases for the integration of organizational factors in HRA is beyond the scope of this paper. The interested reader can refer to [55–57].
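As a minimal numerical illustration of such an indirect influence, the following sketch (hypothetical structure and numbers, not taken from any of the reviewed models) propagates a 'management commitment to safety' state to the HEP through an intermediate 'quality of training' node:

```python
# Hypothetical two-level chain: MgmtCommitment -> TrainingQuality -> Error.
# All numbers are illustrative placeholders, not values from the reviewed studies.
p_training_good = {"high": 0.9, "low": 0.4}   # P(training good | mgmt commitment)
p_error = {"good": 1e-3, "poor": 1e-2}        # P(human error | training quality)

def hep_given_mgmt(commitment: str) -> float:
    """P(error | mgmt commitment), marginalizing the training-quality node."""
    pg = p_training_good[commitment]
    return pg * p_error["good"] + (1.0 - pg) * p_error["poor"]

for c in ("high", "low"):
    print(f"mgmt commitment {c}: HEP = {hep_given_mgmt(c):.2e}")
# high: 0.9*1e-3 + 0.1*1e-2 = 1.9e-3;  low: 0.4*1e-3 + 0.6*1e-2 = 6.4e-3
```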
Fig. 3. Examples of BBN types for each HRA aspect group (adapted from [17]): (1) MOF multi-level influences; (2) PSF hierarchical structure (the line thickness indicates the importance and strength of the link); (3) a BBN version of the SPAR-H model, with the eight SPAR-H PSFs (available time, stressors, complexity, training, procedures, ergonomics/HMI, fitness for duty, work processes) as parents of the human error node; (4) HFE dependence modelling with BBNs; (5) a BBN-based situation assessment model [43].

The study in [18] uses BBNs to hierarchically and explicitly model the multi-level influences on the human failure: from the organization (e.g. "organizational goals and strategies", "organizational structure", "procedures", "organizational culture"); from the specific task to be performed and the context in which it is performed (e.g. "human–computer interface", "task", "environment"); and from the individual performing the task (e.g. "psychological state", "physiological state", "quality and capability"). The study presents a conceptual framework showing how these influences interact to produce a human error, and a detailed classification of specific items that would contribute to these influences. Unfortunately, the case study in [18] addresses only factors typically considered by HRA methods (e.g. procedures, training, work environment) and does not include the organizational factors that pose the most significant modelling challenges (those with indirect influences, e.g. "organizational goals and strategies", "organizational culture"); the result is that the case study does not really shed light on how organizational factors can be quantitatively incorporated in probabilistic safety analyses.


Ref. [19] also focuses on capturing the multi-level influences: the authors model human errors as the result of the failure of three barriers: the organizational factors barrier, the group factors barrier (the group intended as both the operating crew and the company management) and the individual factors barrier. Organizational factors (such as "safety plan", "commercial pressure", "training standard") are modelled as nodes of a BBN representing the occurrence of the human error. The use of dynamic BBNs in [19] is intended to model the influence that repair actions have on the failure of the human factor barrier (repair actions are directed to recover failures of displays, alarms, and communication systems). A BBN is developed for application to offshore blowouts: the quantitative model allows prioritizing attention to the most important contributors to a blowout event.

In Ref. [20], the BBN models both the hierarchical influences of factors on the error probability and the overall influences of the organizational factors on the hazard (a collision accident in the marine industry). The multi-level model of HEP influences includes management and organizational factors, internal factors, and the skills required to perform the action; the latter are required because each factor may have a different influence on the skills (examples of skills: "concentration", "perception") required to perform the actions. A BBN model is then built, integrating failure events (including hardware and human), required skills and influencing factors: in this way, the global influences of organizational factors on the hazard probability are captured.

Ref. [21] uses BBNs to model the dependence among multiple failure events introduced by organizational factors. Specifically, the basic events appearing in fault trees (in [21], modelling the collision hazard in the maritime industry) model human errors and technical failures; the probability of occurrence of each event is expressed as a function of a specific organizational variable, which is in turn modelled by a BBN. The effect of the organizational factors is to modify the probability of each basic event through the corresponding organizational variable; in this way, the same organizational factor may impact multiple basic events. As conceptualized in [56], the combined fault tree / BBN framework allows representing the relationships among multiple organizational factors and between organizational factors and the basic events, the latter being explicitly included within the risk model. The importance of the different human and organizational factors for the hazard probability (collision hazard) is investigated via sensitivity analysis.

A very important contribution to the quantitative integration of organizational factors in probabilistic risk assessment can be found in the group of papers [22–25]. Refs. [22,25] introduce and discuss the modelling framework; [23,24] present application case studies. The modelling framework combines fault trees (modelling the contribution of the failure events) and BBNs (modelling the influence of the different factors on the event probabilities). The combination is also exploited in [26,27] for air transport and in [28] for oil applications. In particular, [28] includes a first attempt to model the dynamic feedback that managerial decisions have on safety, emphasizing the delayed impact of some decisions; e.g. the decision to allocate more resources to safety would not have an immediate effect on safety.
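The fault tree / BBN coupling described above can be sketched as follows (hypothetical structure and numbers): a single organizational variable shifts the probabilities of two basic events feeding an OR gate, so that one organizational factor influences the hazard probability through multiple basic events simultaneously.

```python
# Hypothetical FT-BBN coupling: one organizational factor modifies the
# probabilities of two basic events of an OR gate (events assumed independent
# given the factor state). Numbers are illustrative only.
p_factor_degraded = 0.2                      # P(organizational factor degraded)
p_be1 = {"nominal": 1e-3, "degraded": 5e-3}  # basic event 1: human error
p_be2 = {"nominal": 2e-4, "degraded": 1e-3}  # basic event 2: technical failure

def p_top(state: str) -> float:
    """OR-gate top-event probability, conditional on the factor state."""
    p1, p2 = p_be1[state], p_be2[state]
    return 1.0 - (1.0 - p1) * (1.0 - p2)

# Unconditional top-event probability, marginalized over the factor state.
p_hazard = (p_factor_degraded * p_top("degraded")
            + (1.0 - p_factor_degraded) * p_top("nominal"))
print(f"P(top event) = {p_hazard:.2e}")
```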
The most notable contribution of the work in [22,23] is the emphasis on how to measure the influencing factors (Risk Influencing Factors, RIFs). The score of each factor (on a six-point scale, from A to F) is assessed via a questionnaire survey and an interview scheme (example questions are provided), following the approach developed in [58]. The use of surveys and interviews is typical in the quantitative assessment of organizational indicators (e.g. safety culture surveys [59]), but their use as input to probabilistic models points to an important issue for the incorporation of organizational factors in HRA and, more generally, PSA: factors need clear definitions, operationalization and measuring techniques; at least the latter two dimensions can be addressed by appropriately designed surveys and interviews [55,56].
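As an illustration of how survey scores of this kind might enter a BBN (a hypothetical sketch, not the actual scheme of [22,23]), the A–F ratings collected for one RIF can be converted into a relative-frequency distribution over the corresponding node states:

```python
from collections import Counter

# Hypothetical questionnaire returns for one Risk Influencing Factor,
# on the six-point A (best) to F (worst) scale; not data from [22,23].
answers = ["B", "C", "B", "A", "C", "B", "D", "B", "C", "B"]

counts = Counter(answers)
n = len(answers)
# Relative frequencies used as a (soft) probability distribution over the
# six node states of the RIF in the BBN.
rif_distribution = {grade: counts.get(grade, 0) / n for grade in "ABCDEF"}
print(rif_distribution)  # e.g. {'A': 0.1, 'B': 0.5, 'C': 0.3, 'D': 0.1, ...}
```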

3.2. BBNs to model the relationships among PSFs

Most HRA methods use PSFs to support the analysis of the contextual conditions under which personnel tasks are performed and the quantification of their influence on the HEP [60]. There is general agreement on the key influencing factors that need to be covered by HRA methods (e.g. support from procedural guidance, indications, and training; adequacy of time). However, the full set of factors, their scope and definition, and their quantitative influence on the HEP differ from method to method, depending on the method's scope, level of detail, and underlying representation of the error-causing mechanisms. The lack of a standard set of PSFs across HRA methods is seen as a major problem associated with the use of PSFs in HRA [60]. Also, Ref. [60] highlights the fact that the PSF definitions are not specific enough to ensure consistent interpretation of similar PSFs across methods and that few rules exist for the creation, definition and usage of PSFs.

A frequent assumption of HRA methods is that the effects of the PSFs on the HEP are mutually independent. Typically, when multiple PSFs are assumed to influence the HEP, HRA methods model the joint PSF influence by cumulating the effect of each factor, as if each of them acted independently on the HEP. Indeed, this assumption largely simplifies the development of HEP quantification models. However, it becomes problematic to capture joint PSF effects. For example, good performance conditions, e.g. good support from the procedural guidance and high crew experience, can compensate the challenges of performing a complex task, so that a relatively low error probability can be expected (even for a complex task). On the other hand, if a complex task is performed with limited support from procedural guidance and/or without experience, then the failure probability may become very high (much higher than if each of the three factors were present singly). BBNs are naturally suited to model these effects. Of course, the key practical problem is to identify the interacting factors and quantify their joint effect.

These are the main issues addressed by the papers discussed in the present section [29–34]. All papers address the identification and quantification of the interactions as these emerge from data (empirical data in [29,30]; artificial data in [31–33]; data from simulated accident conditions in [34]). The data used in [29] is obtained by merging two data sources: the Human Events Repository Analysis (HERA) database [62] and four events analysed in [63] with the Information-Decision-Action (IDA) cognitive model [64]. Ref. [30] presents two applications: one within the nuclear power industry using the same data set as in [29], and one for the aviation industry using the Aviation Accident Report Database available at http://www.ntsb.gov/investigations/reports_aviation. Refs. [31–33] take a different approach: the BBNs are constructed based on artificial data, i.e. data generated with known properties (concerning the PSF relationships and their influence on the HEP), in order to test the BBN modelling approach and evaluate its performance [31]. The data used in [29–33] is in the form of a coded string indicating which factors are present during the accidental event. For example, in [29] the following coding is used: 1, the PSF is in the "less than adequate" state during the event; 0, the PSF is "nominal"/"indeterminate" during the event.
An important difference between the data used in [29,30] and that in [31–33] concerns the presence of success events: Refs. [29,30] use failure databases and therefore allow extracting information on the PSF relationships conditional on the fact that the human failure event has occurred. Instead, [31–33] simulate databases including the success events, therefore allowing quantification of failure probabilities from data, as the ratio of failure events to the total number of occurrences. The inclusion of success events in human performance databases is possible if the data is collected from simulated environments: recent initiatives are promising in this sense, see for example [45–48]. An example of the use of data from simulated conditions is given

L. Mkrtchyan et al. / Reliability Engineering and System Safety 139 (2015) 1–16

in [34]: a virtual environment simulating offshore emergency situations. Ref. [34] considers a small BBN, with three binary influencing factors and one output (failure) node. Data collected in six out of the eight possible factor state combinations were available to directly calculate the conditional failure probabilities, in a similar way as in [31–33].

The main aim of the study in [29] is to present a methodology to build HRA causal models (based on the BBN modelling framework) by combining different data sources, namely empirical data, theoretical models and expert judgment. The approach involves expert judgment when there is not enough data to complete some aspects of BBN building: the choice of the relevant PSF set, the presence of links among the PSFs, or the assessment of the conditional probabilities. A unique feature of the work in [29] is the use of empirical data to inform the BBN development, both for the graphical and the quantitative parts: data analysis (correlation and factor analysis) allows identifying the causal relationships among the PSFs, their interaction to produce errors and some of the BBN conditional probabilities (if the number of data points allows). In particular, correlation and factor analysis are used to determine the links between the BBN nodes. The marginal probabilities for the PSF nodes (the input nodes of the BBN model) are determined based on the relative frequency of the (binary) node states. The conditional probabilities for each child node are assessed using an approach from the literature [61], based on the marginal probabilities of the child node and of its parents. The uniqueness relates to the fact that, as mentioned in Section 2, the typical approach for BBN applications in fields with poor data availability is to use expert judgment for building the structure of the network and then the available data for quantifying the relationships.

A primary aspect of the work in [29], which makes it a relevant contribution to the understanding of PSF relationships, is the exploration of Error Contexts (ECs), defined as combinations of PSFs which are more likely to produce a human error [29] (Fig. 3, element (2)). ECs are identified from the data patterns via factor analysis, a family of multivariate techniques that aims to derive a set of classification categories to simplify the relations among observed variables of interest [65]. For example, one of the ECs emerging from the data is the combination of inadequate teamwork and training to respond to a complex, dynamic situation (involving three PSFs, "Team", "Training" and "Complexity", as defined in [29]). The probability of an error occurring when this combination occurs is much larger than when these factors act alone. This suggests the importance of teamwork and training to cope with complex tasks (good teamwork and training can compensate the challenges of complex situations, so that the probability of error can significantly decrease). Quantification of the conditional probability distributions is made from the data (when the frequency of occurrence allows it) and expert judgement.

The main aim of Ref. [30] is to develop an HRA model suitable for the dynamic PSA simulation environment. Dynamic PSA integrates plant physical models (typically thermo-hydraulic codes in the nuclear power industry) and operating crew models into stochastic simulation engines.
The basic idea is to directly generate the accident scenarios by simulating the interactions of the plant systems, components, and operating crew over time from the initiating events. The operating crew model simulates the operator response to the plant conditions, guided by the procedural guidance and informed by the operator knowledge, training and experience. Ref. [30] introduces two HRA modeling levels: coarse-grain and fine-grain. Causal graphs are introduced for the fine-grain modeling, built from the IDAC operator model [66] (IDAC: Information, Decision, Action in Crew Context). Causal graphs are used to represent the causal relationships among PSFs, supporting factors and decision nodes. Supporting factors are derived from [67] and they contain reference values that the operating crew records, such as for


example, the actual number of alarms activated. The decision factors link the influence of one PSF to another depending on criteria such as the 'current strategy selected' or the 'diagnosis made' [30]. A simplified version of the causal graph is then converted into a BBN: the BBN includes only the PSF nodes that are observable (an unobservable variable is, for example, the variable Environment: the environment in a control room rarely changes, so there are limited chances to observe control room errors related to the room environment [29]); those for which data is available for quantification; and the most important ones ("major PSFs" as referred to in the paper, i.e. those with a larger number of incoming links in the causal map). Two separate BBNs are quantified, with data from the nuclear and aviation industries, respectively. The quantification of the CPTs from data allows investigating the importance of the PSFs in causing the human error (Fig. 3, element (2)). The BBNs show that the general pattern of PSF state probabilities and the strength-of-influence pattern are in many cases similar in the two domains, indicating similar interactions of PSFs. Some dissimilarities of the PSF interactions between the nuclear and aviation domains can be explained by the difference in the time span in which an accident unfolds (generally hours for a nuclear event and minutes for an aviation event).

The main aim of the work presented in [31–33] is to examine BBNs as a modelling tool to capture the relationships among the PSFs as these emerge from data. The use of artificial data (generated with known properties) allows evaluating whether the BBN models are able to reproduce the known properties. The data was generated under the assumption that certain PSFs interact: their joint effect on the HEP is much larger than their individual effects. The references analyse the BBN performance under different modelling conditions: reduced data sets [31], missing and/or incorrect data values [32], different BBN structures [31,32], and uncertainty in the BBN parameters [33]. In Refs. [31,32], the BBN conditional probability distributions are obtained from the artificial data by use of the Maximum Likelihood Estimator (MLE: the ratio of the number of outcomes, operator errors in the present case, to the number of possible trials, i.e. error opportunities). In Ref. [33], a Bayesian approach is used: for each CPD, a uniform prior distribution is updated based on the number of occurrences of the relevant parent and child states. The results from the studies show that the BBN formalism supports the modelling of PSF interactions. As expected, the BBN performance degrades as the size and quality of the data set decrease. Combination with expert judgment can be sought to improve the BBN performance in these cases; an approach in this direction is presented in [68].

Finally, Ref. [34] addresses the use of data from a virtual environment simulating an offshore emergency evacuation. A BBN is developed to model the failure of safe evacuation. Three PSFs (BBN input nodes) are considered: "training", "visibility", and "complexity". The forty-three participants are divided into groups given different levels of training for the simulated scenarios. Scenarios characterized by different visibility conditions (day/night) and different evacuation complexity (low/high) are simulated. The available data allows estimating six out of the eight CPDs (a small numerical sketch of the MLE and Bayesian estimators mentioned above is given below).
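A minimal sketch of the two estimators, on hypothetical coded records of the form described above (each record is a PSF state combination plus a failure/success flag; the Bayesian variant uses a Beta(1,1), i.e. uniform, prior):

```python
import numpy as np

# Hypothetical coded records: (PSF1, PSF2, failure), with 1 = "less than
# adequate" PSF state / failure occurred, 0 otherwise. Not real HRA data.
rng = np.random.default_rng(0)
psfs = rng.integers(0, 2, size=(500, 2))
true_p = {(0, 0): 0.01, (0, 1): 0.05, (1, 0): 0.05, (1, 1): 0.4}
fails = np.array([rng.random() < true_p[tuple(r)] for r in psfs])

for combo in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    mask = (psfs[:, 0] == combo[0]) & (psfs[:, 1] == combo[1])
    n, k = mask.sum(), fails[mask].sum()
    mle = k / n if n else float("nan")   # errors / error opportunities
    bayes = (k + 1) / (n + 2)            # posterior mean, Beta(1,1) prior
    print(f"PSFs={combo}: n={n}, MLE={mle:.3f}, Bayes={bayes:.3f}")
```

With sparse parent-state combinations (small n), the MLE becomes unstable or undefined, while the Bayesian estimate remains defined but is pulled towards the prior; this is the degradation with data-set size and quality observed in [31–33].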
The obtained CPDs are the basis for assessing the relative importance of the three factors for safe evacuation. In the study, two elements allowed direct estimation of the CPDs: the small size of the BBN and the high probability of failure. Indeed, for many HRA models, the systematic exploration of all possible factor combinations (or of most of the combinations) with dedicated simulator settings, and the collection of statistically significant data, could be very challenging targets to achieve.

3.3. BBN-based extensions of HRA methods

Three contributions [35–37] propose BBN versions of existing HRA methods, namely SPAR-H and CREAM, to overcome some


limitations of the original modelling frameworks (Table 3). In [35], the authors state that the reason for developing a BBN version of SPAR-H is to demonstrate the benefits of using BBNs on a method with which practitioners are familiar. This is believed to help "bridging the gap" between the development of HRA methods by researchers and their adoption by practitioners. Ref. [35] transfers the SPAR-H [4] method into a BBN (Fig. 3, element (3)). The network structure consists of the eight PSFs of SPAR-H and the human error node (all PSF nodes are directed into the human error node). Since the network is a BBN replica of SPAR-H, the CPDs are given by the SPAR-H relationships between PSFs and error probabilities. In particular, for each parent state combination, the corresponding CPD for the human error node is directly given by the SPAR-H probability of error for that combination of PSF levels. Building on the BBN version of a familiar method, Ref. [35] shows how the BBN framework can accommodate analyses with perfect, partial or no information on the PSF states, as well as evidential reasoning (i.e. about the PSF states given error occurrence). Finally, possible modifications to the model are discussed: first, the possible expansion of the PSF nodes into additional parent nodes further specifying the influencing factors (e.g. the PSF node "training and experience" can be further specified into "Relevance of training to scenario", "Frequency of training", "Years of experience"); second, the capturing of PSF interdependency. In particular, the latter is intended as the influence that the state of one PSF has on the state of another; e.g. limited time influences stress (produces high levels of stress). The BBN can easily capture this interdependency by connecting the nodes (the stress PSF node becomes the child of the time PSF node) and assigning the conditional probabilities.

Ref. [36] is based on the CREAM HRA method [6]. BBNs are used for the probabilistic assessment of the control mode, differently from the original CREAM formulation, in which the control mode is deterministically assessed. According to CREAM, the control mode represents the degree of control that the personnel has over a situational context. Four categories are used to describe the control mode, namely scrambled, opportunistic, tactical and strategic control. The situational context is captured by the state of nine Common Performance Conditions (CPCs), each having three or four possible ratings. CREAM includes a diagram to determine the control modes, depending on the CPC ratings. The BBN developed in this study extends the described deterministic approach of CREAM: it calculates the probability distribution of the control modes given the probability distributions of the nine CPCs. Note that the relationships between the CPC ratings and the control mode are still deterministic: one control mode deterministically corresponds to each set of CPC ratings (the CPDs modelling these relationships actually model a deterministic correspondence to one control mode). The probabilistic distribution among the modes results from partial evidence about the CPC ratings (modelling uncertainty about their real states).

Ref. [37] proposes a modified CREAM version incorporating three theoretical frameworks, namely BBNs, evidence theory and fuzzy set theory (FST). In this study, BBNs are used for two different purposes.
The first one is to model interactive rules to adjust the CPCs: the BBN identifies the CPC dependencies and determines the “adjusted” CPC effect, which takes into consideration possible double-counting effects of different CPCs. The structure of the BBN corresponds to the influential structure of the CREAM CPCs. Additional nodes are created to adjust the CPCs according to the CPC dependence guideline provided within the CREAM method [6]. For example, according to the CREAM guidance, the CPC ‘Crew collaboration quality’ changes its influential direction (e.g., from being neutral to having a positive influence) depending on the states of the ‘Adequacy of organisation’ and ‘Adequacy of training and experience’. Thus the BBN will contain an additional node ‘Adjusted crew collaboration quality’ which has the original node

‘Crew collaboration quality’ and the ‘Adequacy of organisation’ and ‘Adequacy of training and experience’ influencing nodes as its parent nodes. By introducing the adjusted CPCs as BBN nodes, the link between the dependent CPCs and the control modes becomes indirect via the adjusted CPCs nodes. The second type of BBN use in [37] is to derive the HEP from the CPC assessments (in a similar way as done in [35,36]). 3.4. BBNs to model HFE dependence in HRA In HRA, dependence analysis refers to assessing the influence of the failure of the operators to perform one task on the failure probabilities of subsequent tasks [1]. A typical approach to model this in the HRA practice is to determine successive conditional probabilities associated to each task along the modelled operator response. The THERP handbook suggests general rules to determine the appropriate dependence, qualified in terms of levels: zero, low, moderate, high, complete. The level of zero relates to the situation in which the failure in the preceding task(s) has no effect on the probability of failure of the subsequent task (therefore the HEPs of the two tasks are independent). The level of complete relates to the situation in which failure in the preceding task(s) is believed to lead to guaranteed failure in the performance of the subsequent task (the conditional probability of failure on the subsequent task is then equal to 1). The THERP handbook provides a formula for computing the dependent, conditional probability for each dependence level. Decision trees are often used in the HRA practice to support dependence assessment: the tree headings guide analysts through the assessment of the different factors influencing dependence (e.g. closeness in time between two successive tasks, similarity of the cues related to the two tasks). Decision trees are implemented within a variety of HRA methods, e.g. [4]. Two lines of research have addressed the use of BBNs for dependence assessment in HRA. The first one investigates the use of BBNs [38,39], and more generally expert models [69], to increase the repeatability and the transparency of the dependence assessments. Generally, dependence assessment following the THERP handbook entails large degree of judgment, because of the limited support from the handbook guidelines. When supported by decision trees, the repeatability of the assessment is expected to improve (the same dependence level is produced by the tree in correspondence of the same set of factor evaluations, as determined by the tree); however, the trees are typically not build based on a systematic process of expert elicitation so that it can be very difficult to trace the hypotheses underlying their construction. The consequence is that the fundamental assumptions behind the influence of the factors (e.g. which factors are important and how much each influences the model output) are not directly linkable to the developed model [69]. In [39], a BBN is introduced modelling the relationships between the influencing factors (BBN input nodes) and the level of dependence (the output node), as shown in the example in Fig. 3, element (4). The key point of [38,39] is to illustrate the use of a computable model to reduce the subjectivity required to the analyst (although the main aim of [38] is to compare specific aspects of uncertainty treatment in BBNs and fuzzy expert systems). On the one hand, repeatability is expected to improve thanks to the use an explicit, computable model (as it would be the case with decision trees). 
On the other hand, the use of a systematic expert elicitation approach should allow, once the dependence model is built, verifying the expert statements from which the model originated and following how these are incorporated into the model.

The second line of research aims at using BBNs to more explicitly capture the causal dependences between different human failure events [40,41]. The causal dependences arise from the fact that the same PSFs influence multiple HFEs: BBNs are used to


explicitly model these influences. In terms of structure, the BBNs developed for these applications have multiple HFE nodes, with different factors influencing the different HFE nodes as necessary. The BBN then connects each HFE node to the set of factors believed to have a common influence (e.g. a hypothetical factor "training" may have a common influence on multiple HFEs of the same accident scenario, for the tasks that are subject to the same training sessions). Refs. [40,41] propose the use of BBNs as a rigorous, elegant and effective way of capturing this type of dependence. Note that this line of research has a strong connection with the use of BBNs for incorporating organizational factors in the PSA framework: organizational factors constitute fundamental, common causes of human failures. As discussed in Section 3.1, the approach for modelling MOF influences is to explicitly model the factors (including organizational ones) that affect multiple HFEs simultaneously.

3.5. BBNs for situation assessment

As defined in [71]: "Situation awareness is the perception of the elements in the environment within a volume of time and space, the comprehension of their meaning, and the projection of their status in the near future". Further: "Situation assessment is the process of achieving, acquiring and maintaining situation awareness" [71]. A BBN-based approach to capture situation assessment for HRA applications (relevant for the group of papers addressed in this section [42–44]) is adopted in dynamic PSA methodologies. In the dynamic framework, situation assessment is directly modelled by the operating crew model which, as mentioned in Section 3.2, simulates the operator response and therefore the plant state assessment.

The work in Refs. [42–44] builds on [72], where a BBN situation assessment model is developed as a diagnosis/decision aid tool for the operating crew. In [72], BBNs are used to support the diagnosis of emergency situations (e.g. Steam Generator Tube Rupture, SGTR; Loss of Coolant Accident, LOCA; Loss of Secondary Cooling, LOSC), based on the pattern of indicators and alarms. As the pattern of indications and alarms progresses along the accident scenario, the BBNs are shown to eventually diagnose the correct accident scenario (in terms of the scenario types SGTR, LOCA and LOSC, for example). The work in [42–44] uses similar BBN concepts for HRA applications. Based on the indicator and alarm patterns, the BBN assigns probabilities to the different accident scenarios: this provides the probabilities of correct diagnosis and of misdiagnosis as one of the alternative accident scenarios (e.g. the probability of misdiagnosing an SGTR as a LOCA in the specific scenario characterized by a specific indicator and alarm pattern). Fig. 3, element (5) shows an illustrative example of a BBN for situation assessment described in [43]. In the dynamic simulation framework, the sensor states are determined by the plant evolution simulator, and the BBN is then used to assign probabilities to the different possible branches of the operator response. Note that the use of BBNs to support the diagnosis of emergency situations has also been recently addressed in [73], where information from extensive dynamic PSA runs is used to inform the BBN knowledge base.
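The situation-assessment idea can be sketched as follows (hypothetical likelihoods, not the model of [42–44]): given an observed pattern of binary indications, and assuming the indications independent within each scenario, Bayes' rule yields a posterior over the candidate scenarios, i.e. the probabilities of correct diagnosis and of each misdiagnosis.

```python
# Hypothetical likelihoods P(indication observed | scenario) for three binary
# indications; the numbers are illustrative, not from the reviewed models.
likelihood = {
    "SGTR": [0.95, 0.70, 0.10],
    "LOCA": [0.10, 0.80, 0.90],
    "LOSC": [0.05, 0.20, 0.60],
}
prior = {s: 1.0 / len(likelihood) for s in likelihood}  # uniform scenario prior
observed = [1, 1, 0]  # observed indication pattern (1 = present, 0 = absent)

# Bayes' rule: posterior(s) is proportional to prior(s) * prod_i P(ind_i | s).
posterior = {}
for s, probs in likelihood.items():
    lik = 1.0
    for p, o in zip(probs, observed):
        lik *= p if o else 1.0 - p
    posterior[s] = prior[s] * lik
z = sum(posterior.values())
posterior = {s: v / z for s, v in posterior.items()}
print(posterior)  # probabilities of diagnosing the event as each scenario type
```

As more indications accumulate along the scenario, the posterior concentrates on one scenario type, mirroring the behaviour reported for the BBNs in [72].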

4. Building BBNs: information sources and knowledge acquisition

4.1. Nodes, states, structure and model verification/validation

As presented in Section 2, the main steps for developing a BBN are: defining the BBN nodes and states, developing the BBN structure, quantifying the CPTs, and verifying/validating the model. This section discusses how HRA applications have addressed the definition of the nodes, states, and structure, as well as verification and validation issues. CPT assessment is addressed in the following Section 4.2.
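As a minimal illustration of the artefacts these steps produce, consider the toy two-node BBN below; node names and values are illustrative only, not taken from any reviewed study.

```python
# A toy two-node BBN illustrating the artefacts of the development steps:
# node/state definitions, structure, CPT quantification, and a basic
# verification check. Names and values are illustrative only.

nodes = {
    "Training": ["adequate", "inadequate"],  # parent node (a PSF)
    "HFE": ["success", "failure"],           # child node (human failure event)
}
structure = {"HFE": ["Training"]}            # parents of each node

# CPT of the HFE node: one distribution per parent configuration.
cpt_hfe = {
    ("adequate",):   {"success": 0.99, "failure": 0.01},
    ("inadequate",): {"success": 0.90, "failure": 0.10},
}

# Verification (minimal): every conditional distribution must sum to one.
assert all(abs(sum(d.values()) - 1.0) < 1e-9 for d in cpt_hfe.values())
```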


In the majority of the reviewed studies, the BBN nodes directly represent PSFs and/or MOFs. The studies presenting BBN extensions of CREAM and SPAR-H directly use the PSF labels and rating scales of the methods as nodes and states, respectively. Some studies build on the PSFs of one HRA method, then add and remove factors as necessary. For example, the study in [30] starts from the PSF set of the IDAC cognitive model [67]. Additional supporting factors (e.g., the actual number of alarms activated or flow sensor values) and decision factors are then added as BBN nodes. Finally, as presented in Section 3.2, the BBN is refined by maintaining only the nodes which are observable, important, or quantifiable with the data available. The final BBN consists of 17 binary nodes. This approach of starting from a relatively large set of factors as candidates for nodes and later refining the set with expert judgment or quantitative analyses is typical when defining BBN nodes.

Some studies present BBNs with nodes that are conceptually different from PSFs, such as 'Activity' nodes [20], personnel qualifications [74] (e.g., the specialist in charge, the support specialist, etc.), hardware failures [21], types of failure events ("Captain verification failure" [20]), and situation events ("Ship leaves the planned route" [20]). Generic nodes for demonstration are used in several studies [31,32,40,41]; in the more application-oriented studies (i.e. studies that aim at solving a case study with BBNs) [20,21,74], nodes are specific to the domain or to the problem at hand. The output node is generally the failure event, which can be defined as generic ("Human Failure") or specific ("Ship collision" [20]), depending on the application. In almost all applications, the failure node is binary, so that the probability of human failure is the probability of the node being in the failed state. Exceptions are [38,39], in which the output node is defined on the five-level scale of dependence introduced in the THERP handbook.

The size of the BBNs in terms of the number of nodes varies significantly from one study to another. Demonstrative examples present BBNs with five to ten nodes [31,32,38–41]; the BBN hierarchical structure in these applications is limited to one or two levels. Application-oriented contributions normally present BBNs with a relatively large number of nodes [19–21,23,30]. The largest BBN is developed in [20] and consists of 263 binary nodes: the vast majority represent skills, internal PSFs and environmental factors, followed by different activity nodes, nodes representing hazardous events, and MOF nodes. Two other relatively large BBNs are introduced in [19,21], with about 65 nodes; a relatively detailed PSF-based BBN may be expected to have above 20 nodes [23,30]. Some studies include relatively small BBNs for demonstration, with enough information to extend the BBN to a more comprehensive set of nodes: for example, a hierarchical BBN of 12 nodes is presented for application in [18], but the reference provides a comprehensive list of factors (around 250 specific factors) that can be added as root nodes.

A large variety can be seen in the definition of the states as well. For example, in [19,30] all node states are defined with the same labels: 'Absent' or 'Present' in [30], 'Yes' or 'No' in [19].
As an example for a specific factor, the level of team training: in [30] the node is described as 'Inadequate team training', with states 'Absent' and 'Present'; other studies introduce the node 'Training', with state labels such as 'Inadequate', 'Adequate with limited experience', and 'Adequate with high experience' [18,20]. It is important to note that the majority of the surveyed studies places limited emphasis on factor operationalization: this can be a limitation for using the models consistently across different analysts and applications. As mentioned in Section 3.1, an important contribution in this direction comes from the study in [22], which addresses the issue of how the influencing factors should be measured.



Concerning BBN structures, again a large variety exists in the development approaches as well as in the resulting structures. The CREAM and SPAR-H BBNs are derived from the structure of the respective HRA model: all factor nodes are directed into the output node modelling the occurrence of the human failure. The other studies use combinations of expert judgement and data [29,30], solely expert judgement (typically informed by existing HRA methods) [18,21,74], or previously built frameworks, such as fault trees [19–21] and causal maps [30], converted into a BBN. The data in Refs. [29,30] is from human error databases (refer to Section 3.2). Ref. [29] observes that the causal relationships cannot be directly derived from available HRA data, but quantitative analyses (e.g. correlation and factor analyses) can help to determine the causal links. The precondition is that the definitions of the PSFs do not overlap: any covariation observed in the data then suggests a possible causal link between the PSFs. The procedure for determining the structure is not automatic: the decision of which links to include, and their direction, relies on the analyst's judgment. When the structure is derived with expert judgment, the elicitation process is carried out via expert panel discussions, interviews with experts [42], or the Delphi method [18,21].

In general, the reviewed papers do not discuss the phase of BBN building thoroughly; in many cases the BBNs are presented as final results, without discussing how the model was developed. It would generally be important to have more details on how the information was presented to the experts, how the level of detail and scope of the nodes was determined, and which elements of the model were more challenging to develop. On the one hand, this would allow better understanding of the modelling choices made in these works and enhance the credibility of the developed models. On the other hand, it would provide useful base information for developing new models.

In terms of resulting structure, both hierarchical and non-hierarchical structures were used; within hierarchical structures, the classification of the nodes within each layer can also be very different. For example, the BBN in [22] has three layers, with both outer layers representing MOFs. In other studies, factors in different layers represent conceptually different influences: in [19], for instance, each layer represents factors influencing the individual, group or organizational barriers, respectively. Non-hierarchical BBNs, e.g. [21,29], can have many interlinked nodes: this generally complicates both the understanding of the structure and the CPT quantification.

An important issue for the credibility of a model for decision makers and end users is its verification and validation. Before referring to the validation or verification of BBN-based HRA applications, a clear distinction between the two terms should be established. This is especially important for a field with poor data availability, where the boundary between the two concepts may not be obvious. A detailed discussion of the definitions and main criteria for validation and verification of HRA methods is provided by Kirwan in [53,54]. Verification is the proof that the model works as it is specified to work, while validation means that the model does what it is supposed to do in the real world.
Consequently, verification is related to the internal validity of the model; validation is concerned with its external validity, related to its accuracy, precision, meaningfulness and utility. Kirwan classifies the validation of HRA quantification methods (according to data quality) into three groups: absolute validation based on real data, approximate validation based on other data (simulator data, experimental literature, expert judgement, etc.), and convergent validation, where the results are compared with those of other modelling techniques.

With respect to the reviewed studies, as typical for BBNs [13], verification is performed through sensitivity analysis [20,21,37]. For example, ref. [21] models the collision hazard in the maritime domain; to conduct the sensitivity analysis, the authors take two extreme operational conditions for reference: the first is high traffic density with good weather conditions, the second low traffic density with bad weather conditions. Accordingly, two sets of evidence are provided as input to the BBN. The aim of the sensitivity analysis is to measure the sensitivity of a basic event (e.g., crew confused by another ship's movement) to a variation of the MOFs, given the operational conditions in each scenario. The results help verify that the important factors in the two scenarios are in line with the domain experts' experience and understanding of the factor influences. Note that the described internal consistency check through sensitivity analysis may also be referred to as face validation.

Concerning validation, convergent validation is the dominant approach (when performed). Generally, the references for comparison are other HRA methods, such as SPAR-H [35], CREAM [37], SLIM [75], or other studies in the literature [20,42,44]. The aim of convergent validation is either to make sure that the BBN provides results similar to those of the other HRA methods, when the modelling features are comparable, or to explain the possible differences when additional modelling features of the BBNs are exploited [53,54]. Of course, given the limited validation of HRA methods themselves, convergent validation does not have an absolute value; however, it can provide important insights on whether and how new techniques or models compare with the accepted state of knowledge and whether and how they bring additional value.

4.2. CPT assessment

As discussed in Section 2, depending on the amount and quality of the available data, the assessment of the CPTs may be done from data, expert judgment, or a combination of the two. Refs. [29,30] are the only ones evaluating the CPTs from empirical data (see Section 3.2). Indeed, this could be done only for those BBN parent-child state configurations with available failure data. While these references constitute a landmark for BBN research and applications for HRA, the current status of HRA data does not allow an HRA model to be developed solely from data (note that the case of Ref. [34], treated in Section 3.2, addresses a very small BBN and a situation with relatively high failure probability values). In addition, the data available for use in [29,30] is extracted from databases of human failure events; therefore, the values in the CPTs are conditional on the fact that failures have occurred. In other words, from failure databases one can extract information on the frequency of PSF combinations given that failures have occurred, but not on the number of successful performances, which would be required to assess the failure probability empirically.
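A minimal numerical illustration of this point, with purely hypothetical values:

```python
# Hypothetical numbers illustrating why failure databases alone cannot give
# the HEP: they estimate P(PSF configuration | failure), while the HEP is
# P(failure | PSF configuration), which also requires base rates that
# failure-event databases do not record.

p_failure = 0.01               # overall failure probability (unknown in practice)
p_config_given_failure = 0.30  # share of recorded failures with, e.g., high time pressure
p_config = 0.05                # how often that PSF configuration occurs at all

# Bayes' rule: P(failure | config) = P(config | failure) * P(failure) / P(config)
hep_given_config = p_config_given_failure * p_failure / p_config
print(round(hep_given_config, 4))  # 0.06 -- not computable without p_failure and p_config
```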
Therefore, despite the continuous efforts to collect human performance data for HRA applications [9], expert judgement still remains an essential data source. Given the importance that a rigorous treatment of expert judgment has for the transparency and acceptability of models in safety analyses, the remaining part of this section discusses in detail how the reviewed studies address this aspect, in particular referring to the seven works presented in Table 4. The rest of the references are not relevant for the discussion of CPT assessment via expert judgment: references [35–37], related to BBN-based extensions of SPAR-H and CREAM, derive the CPTs from the underlying method relationships; references [31–33] use artificial data; references [42–44] use deterministic BBN relationships (therefore the elicitation of probabilities is not an issue); Refs. [39–41] use demonstrative values for the CPDs (therefore not elicited from experts). Refs. [74,75] have been included in the discussion in the present section, but not in Section 3, because the emphasis in these papers is more on CPD elicitation via expert judgement than on HRA.

Table 4 shows that the reviewed studies address the elicitation of the judgment in different ways (first three columns), most often combined: direct assessment of the probabilities by one or multiple experts; elicitation of probability rankings on qualitative scales (to avoid the shortcomings, e.g. biases, of directly eliciting probabilities

Table 4
Approaches for CPT assessment in the reviewed studies.

Study | Probability elicitation(a) | Filling-up algorithm | Multiple expert aggregation | Notes
Ale et al. [26,27] | Direct | – | √ | Cooke's method [52]
Musharraf et al. [75] | Direct | – | √ | Combination rule of evidence theory
Li et al. [18] | Direct | – | √ | Delphi method (group consensus process)
Trucco et al. [21] | Direct | – | √ | Delphi method (group consensus process)
Firmino et al. [74] | Indirect | – | – | Linear programming (elicitation of intervals)
Vinnem et al. [22,23] | Indirect | √ | – | Elicit strength of influence of each parent ('Low', 'Medium', 'High'); pre-defined distributions for each strength category
Martins and Maturana [20] | Direct | √ | – | Linear interpolation between maximum and minimum effect HEPs (anchors for interpolation from other literature studies and expert judgment)
Cai et al. [19] | – | √ | – | Noisy-OR filling-up algorithm (independent factor assumption)
Røed et al. [24] | Indirect | √ | – | Weighted distance of child state from parent states; CPD as exponential function of the distance
Baraldi et al. [38] | Indirect | √ | – | Child CPDs from weighted functions of parent state values [70]

(a) Probability elicitation: "Direct" refers to elicitation of probability values. "Indirect" refers to elicitation on qualitative scales, questionnaires, or relative judgements, then converted into probability values.

from experts); elicitation of selected model relationships (or, more generally, of partial model information), with the remaining relationships filled in by algorithms.

The studies which use direct elicitation of probability values mainly involve multiple experts, and either an aggregation algorithm ([26,27,75]) is used for a group assessment or a recursive technique such as the Delphi method [18,21] is used to reach consensus opinions. Refs. [26,27] use the classical aggregation approach by Cooke [52]. In [75], each of the CPDs is elicited from multiple experts, who are directly asked to provide the probability values. The elicited probability values are then aggregated using the Dempster–Shafer evidence theory. The mathematical aggregation tends towards the reduction of the uncertainty in the probability estimates. The following example from [75] relates to the assessment of the state probability for a PSF, "Physical condition", with states "good", "bad", "incomplete knowledge". One expert provides the following state probability values: 80% for state "good", 10% for state "bad", 10% for "incomplete knowledge"; the other provides: 85% for state "good", 5% for state "bad", 10% for "incomplete knowledge". The combined probabilities via the evidence theory are: 97% for state "good", 2% for state "bad", 1% for "incomplete knowledge". As mentioned, the aggregation favours state "good", to which both experts assign the highest probability. A problem with the use of the Dempster–Shafer evidence theory as applied in [75] is indeed this tendency towards uncertainty reduction: for HRA applications (and risk analysis in general), the models for expert judgment aggregation should tend towards fully representing the uncertainty and variability in the assessments, as opposed to favouring agreement (note that this relates to the mathematical aggregation of disagreeing judgments, not to the search for expert consensus during the elicitation process).
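The combination can be made concrete with a minimal sketch of Dempster's rule. Reading "incomplete knowledge" as mass assigned to the whole frame is an assumption of this sketch; under that reading, the sketch reproduces the 97%/2%/1% figures quoted above.

```python
# A minimal sketch of Dempster's rule for the two-expert example above,
# reading "incomplete knowledge" as mass on the whole frame (an assumption
# of this sketch); with this reading it reproduces the quoted 97%/2%/1%.

from itertools import product

GOOD, BAD = frozenset({"good"}), frozenset({"bad"})
THETA = frozenset({"good", "bad"})  # total ignorance

m1 = {GOOD: 0.80, BAD: 0.10, THETA: 0.10}  # expert 1
m2 = {GOOD: 0.85, BAD: 0.05, THETA: 0.10}  # expert 2

def dempster(ma, mb):
    """Combine two basic probability assignments by Dempster's rule."""
    combined, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(ma.items(), mb.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass on empty intersections is renormalized away
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

for focal, mass in dempster(m1, m2).items():
    print(sorted(focal), round(mass, 3))
# ['good'] 0.966, ['bad'] 0.023, ['bad', 'good'] 0.011:
# the combination sharpens the shared preference for "good" and shrinks
# the residual ignorance -- the uncertainty-reduction tendency noted above.
```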

Ref. [74] is one of the few studies discussing in detail the knowledge elicitation process. The elicitation burden is reduced by iteratively refining and reducing the number of questions to ask the experts, and by allowing the expert a gradual evolution of his assessment. The judgment is elicited by comparing the likelihood of the true probability value being in different probability intervals: for example, the expert would be asked whether it is more likely that the true probability value is within (0–0.3) or within (0.5–1). The number of comparisons required depends on the desired precision of the estimate; for example, [74] shows that 26 comparisons are needed to assign the probability to one of 10 equally-sized intervals within (0–1). The expert statements determine the constraints of a linear programming algorithm that bounds the probability estimate. Indeed, the approach allows overcoming the issues connected with direct probability elicitation; yet, its application remains resource intensive, and its applicability when eliciting low probability values (down to 1e-3 and lower, as is the case for HRA) remains to be demonstrated.

The elements of the CPT in [22] are determined by eliciting from domain experts the strength of influence of each parent node on the child node. The strength can be 'low', 'medium', or 'high'. Pre-defined functions, based on triangular distributions, are associated with each strength category; the functions differ by their variance: the stronger the influence, the smaller the variance. The functions for each parent are then weighted (according to their strength) to derive the child CPT.

In order to cope with the limitations in the available data and in the number of probability values that can realistically be elicited from experts, Ref. [20] introduces an algorithm to quantify the BBN conditional probabilities: the probabilities are determined by linear interpolation, according to the number of parent nodes in their positive-condition states. In other words, the interpolation is performed between the two anchor CPD values associated with the configurations of all parents in positive-condition states and of all parents in negative-condition states, respectively. The anchor values are taken from reference studies and direct expert judgement. The interpolation rule depends only on the number of parents in their positive states, and therefore does not differentiate among configurations with different parent nodes in the same number of positive states. Yet, as discussed in Section 3.2, the PSFs in an HRA model can be characterized by strong interactions, so that their effect can strongly depend on which factors are in their positive and negative states. Nevertheless, the algorithm proposed in [20] addresses an important aspect: the impracticality (often, impossibility) of eliciting all CPTs one by one and the consequent need for approaches to determine them while limiting the information required from the experts.
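A minimal sketch of the interpolation idea follows; linearity in the HEP and the anchor values are illustrative assumptions of this sketch, not the exact procedure of [20].

```python
# A minimal sketch of the interpolation idea: the HEP for a parent
# configuration is interpolated between two anchors according to the
# fraction of parents in their positive state. Linearity in the HEP and
# the anchor values are illustrative assumptions, not the exact procedure.

def interpolated_hep(n_positive, n_parents, hep_best=1e-3, hep_worst=1e-1):
    """HEP when n_positive of n_parents parents are in positive states."""
    frac = n_positive / n_parents          # all positive -> hep_best
    return hep_worst + frac * (hep_best - hep_worst)

# Any two configurations with the same count of positive parents receive
# the same HEP -- the limitation discussed above: factor identity is ignored.
print(round(interpolated_hep(2, 4), 4))  # 0.0505
print(round(interpolated_hep(4, 4), 4))  # 0.001
```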



The issue of avoiding the elicitation of the complete CPT is also addressed by the algorithm in [24], where CPDs are determined based on a weighted distance of the child state from the parent states. The weights reflect the importance of the parents in affecting the child state. Each CPD is calculated as an exponential function of the weighted distance, with parameters determined by expert judgment. Compared with [20], the algorithm in [24] allows better differentiation among parent state configurations, by introducing the parent weights. However, the weights are calculated based on evaluating the factor effects taken one at a time, thus possibly missing the multiple-factor interactions typical of HRA applications.

A CPT building algorithm is also used in Ref. [38], which adopts the approach from [70], based on pre-defined weighted functions determined on the basis of the general tendency of the parent nodes' influence. The choice of the appropriate function and of the factor weights depends on the effect of the parent node values on the child node. This can be inferred from statements elicited from experts on selected parent-child relationships and possibly from other qualitative considerations. Note, however, that this choice requires a number of subjective assumptions to be made, i.e. no hard rules connecting the elicited information and these functions exist. The need for CPT filling algorithms suited to HRA applications will be returned to in Section 5.3.

Some studies use hybrid approaches for CPT building, with different approaches for different nodes in the network [19,20]. For example, in [19], for certain nodes the CPTs follow directly from transforming 'OR' or 'AND' gates in the fault tree, while for the nodes for which this transformation is not feasible, the noisy-OR algorithm [76] is used. The noisy-OR method is one of the most practical and widely used CPT building algorithms, and many commercial software packages provide an option to generate CPTs with it. To apply the noisy-OR algorithm, two major assumptions must be made: the nodes must be binary, and the parent nodes must be independent. The number of input parameters needed to generate the full CPT is proportional to the number of parent nodes: as input, the algorithm needs, for each parent, the activation probability with which that parent alone produces the effect on the child node in the absence of all other parents (a sketch is given at the end of this section).

Anticipating the more detailed discussion of Section 5, three shortcomings of the CPT building approaches can be highlighted here. First, the approaches for partial model elicitation and fill-up via algorithms only accommodate expert judgment: the algorithms do not treat the combination of judgment and empirical data, the latter being the preferred source of information (when available). Indeed, the typical need when developing an HRA model would be to combine partial information from data and judgement and to fill in the missing relationships. Second, the treatment of the uncertainties in the model parameters is generally missing: BBN parameters are treated as point values, without information on the confidence in their estimates. Without this information, it becomes impossible to formally combine empirical data and judgment and to incorporate new information as it becomes available at a later stage. Third, some algorithms require model assumptions which may not be suitable for HRA, typically independence/linearity of the factor influences on the HEP, or limitations on the number of states characterizing the model factors (e.g. two-state factors).
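The following minimal sketch shows the noisy-OR construction just described, for binary nodes with independent parents (the optional leak term is omitted); the activation probabilities are hypothetical.

```python
# A minimal sketch of the noisy-OR construction: binary nodes, independent
# parents, no leak term. p_i is the probability that parent i alone (all
# others absent) produces the failure effect; values are hypothetical.

from itertools import product

def noisy_or_cpt(activation_probs):
    """P(child = present | parent configuration), for all 2^n configurations."""
    cpt = {}
    for config in product([0, 1], repeat=len(activation_probs)):
        p_absent = 1.0
        for active, p in zip(config, activation_probs):
            if active:
                p_absent *= (1.0 - p)  # each active parent fails to inhibit
        cpt[config] = 1.0 - p_absent
    return cpt

# Three PSF parents, e.g. inadequate training, time pressure, poor HMI:
cpt = noisy_or_cpt([0.3, 0.2, 0.1])    # n parameters define 2^n rows
print(round(cpt[(1, 0, 0)], 3))  # 0.3
print(round(cpt[(1, 1, 1)], 3))  # 1 - 0.7*0.8*0.9 = 0.496: joint effect, no interaction
```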

5. Discussion

A number of BBN features attractive for HRA have emerged from the review. These are as follows:

• Graphical formalism.
• Probabilistic representation of uncertainty.
• Decomposition of factor relationships into state connections among nodes.
• Possibility to accommodate diverse information sources into the model development.
• Representation of expert judgment (more generally, of subjective beliefs) on factor influences.

The first two features are discussed next; the last three will be addressed in separate sections (Sections 5.1, 5.2, and 5.3, respectively), given their implications for the identified research needs.

The generally intuitive graphical formalism facilitates the communication of some properties of the model, such as which factors are included and their influence hierarchy. Clearly, this can be exploited when building the model structure via expert judgment, in support of panel discussions and interviews. None of the reviewed references explicitly reports on experience with this kind of support; however, the use of the representation formalism in this context is an obvious approach, as suggested in [11–15]. The visualization property of BBNs supports the development of the model structure also when this is derived from data. Indeed, in Ref. [29] the causal relationships are not directly derived from the data, but through an iterative process of correlation and factor analyses. Here, the visualization of the structure is part of the process of understanding which relationships are actually relevant for the model (some relationships may be spuriously identified by the data analysis, because of the limited sample size or because of overlapping factor definitions). As a final remark on the graphical formalism, it can in principle support the review of the models by third parties, in the same way as it does in the model development phase; this is a key feature for risk analysis applications, where the traceability of all modelling and analysis assumptions is necessary for the correct interpretation and use of the results.

The probabilistic nature of BBNs makes them compatible with the other mathematical models typically adopted in risk analysis, such as fault trees and event trees as well as Probabilistic Safety Assessment (PSA) models; this allows fully exploiting the synergies between the modelling frameworks, as proposed in [21,41,56]. On the one hand, this is used to extend the scope of existing risk analyses, e.g. modelling hardware elements, with influences that can be better represented with BBN-based sub-models, typically addressing the human influences. On the other hand, this allows capturing the dependences introduced by the human influences, by linking the BBN sub-model to different parts of the fault and event tree models, as discussed in Section 3.1. In a different perspective, the probabilistic and Bayesian frameworks are the established practice to represent and quantify uncertainty in risk analysis and, in particular, in PSA. This should help the assimilation of BBNs within industrial practice.

5.1. Modelling of complexity

As mentioned in Section 2, a CPD over the child states is associated with each possible combination of the parents' states. In principle, each CPD could be defined independently of all the others; this allows large modelling flexibility, potentially allowing for the comprehensive modelling of the relationships among PSFs and between these and the HEP. Indeed, one feature of BBNs relevant for HRA is their ability to capture possibly complex relationships. The present review has highlighted different types of interactions, which are discussed in the following.

The first element of complexity relates to the influences among the factors. In general, in HRA, factor influences can be strongly interrelated. In the BBN modelling framework, this may result in the development of highly interconnected networks. Although the graphical representation of BBNs conveniently allows visualization and understanding of the connections, the development of highly interconnected networks should be approached with care.
Highly interconnected networks imply a large number of CPDs to be determined, as well as potential computational problems for the application of CPT building algorithms and for the use of BBNs for evidential reasoning. In addition, in highly interconnected BBNs it becomes difficult to track the influences of each variable through the model. Indeed, the development of highly interconnected BBNs actually contrasts with the main idea behind BBN development: to build models by combining conditionally independent elements [11], or by combining smaller fragments called 'building blocks' into larger BBNs by certain combination rules [51]. On the other hand, these practicality issues should not justify oversimplified models. In HRA, factors are indeed numerous and interconnected, and the development of complex models is often necessary. This is becoming evident from the recent efforts to strengthen the cognitive basis of HRA [60,77].

Note, however, that highly interconnected networks may also result from poor definitions of the HRA factors (which, of course, should be avoided). For example, in developing the BBN structure, one analyst may feel that the factor "Personnel Experience" should be linked to "Task Complexity": after all, some tasks can be perceived as complex by inexperienced personnel and as easier by more experienced ones. This connection would be the result of an inaccurate definition of the complexity factor, mixing objective difficulty features related to the task (e.g. the number of alarms to attend to) with the subjective perception of the task difficulty by the personnel. For an HRA specialist this example is probably a trivial error; however, more subtle examples of overlapping factor definitions are frequent in HRA methods. Ref. [60] discusses the issue thoroughly and presents a PSF hierarchy for which the orthogonality (i.e. no overlap) of the factor definitions is one of the main principles. Note, however, that orthogonality does not mean independence: factors may still influence each other, but their definitions should not overlap, so that one can properly understand how they interact [60]. Note also that, as mentioned in Section 3.1, the issue of proper factor definitions is very important for incorporating organizational factors in the PSA framework, for which clear definitions, operationalization and measuring techniques are needed [22]. Care should be given to the factor definitions, to avoid that dependencies reflect definition overlap rather than genuine interaction and causal relationships.

The second element of complexity relates to how the factors influence the HEP. As mentioned in Section 3.2, the use of BBNs allows overcoming the assumption of many HRA methods that the PSFs independently affect the HEPs (e.g. SPAR-H [4]), provided that the CPTs can be assessed. Indeed, in the BBN modelling framework, CPDs are defined for each combination of the parent states, so that there is large flexibility for modelling joint factor effects (both compensatory and amplifying, as discussed in Section 3.2). Nevertheless, again, the modelling flexibility needs to be balanced with practicality. Even for relatively small BBNs, the number of CPDs can be very large and the assessment of all distributions can be problematic (a quantitative illustration is given at the end of this section). When the BBN quantification is based on empirical data, the data availability determines the parent state combinations for which CPDs can be assessed. When expert judgement is used, then, as mentioned in Section 4.2, the impracticality (often, impossibility) of eliciting all CPTs one by one arises, together with the consequent need for approaches to determine them while limiting the information required from the experts. As discussed in Section 4.2, two of the surveyed papers [19,20] coped with the problem by using algorithms to determine the CPDs. Yet, in both cases, the application of the algorithms required the simplifying assumption that the factors independently influence the error probability.
Other refs. [24,38] use algorithms allowing some level of factor dependence; however, a systematic evaluation of the suitability and applicability of these algorithms to HRA (and of the available alternatives from the literature, as mentioned in Section 5.3) has not yet been done.
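A back-of-the-envelope sketch of how quickly the elicitation burden grows with the number of parents (node sizes are hypothetical):

```python
# A back-of-the-envelope sketch of the elicitation burden: the number of
# independent conditional probabilities in a child CPT, for hypothetical
# node sizes.

def cpt_entries(child_states, parent_states):
    """(child_states - 1) independent entries per parent configuration."""
    configs = 1
    for s in parent_states:
        configs *= s
    return (child_states - 1) * configs

print(cpt_entries(2, [3] * 6))   # 729: a binary HFE with six 3-state PSFs
print(cpt_entries(2, [3] * 10))  # 59049: clearly beyond one-by-one elicitation
```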

5.2. Combination of different sources of information

The ability of BBNs to be built by aggregating diverse sources of information (different types of data and judgements) is key for fields in which comprehensive and homogeneous data sets are not available, such as HRA and risk analysis in general. In HRA, the relevant sources of information for developing models include:

• Empirical sources: e.g. databases of human failure events, typically collected from operational experience or simulator studies, and retrospective analyses of reported events (operational or simulated).
• Theoretical sources: e.g. human factors studies and theoretical models of human cognition.
• Expert judgment sources: e.g. qualitative judgments on the factors that are expected to be important in some situations, as well as quantitative judgments on the error probability ranges expected for some tasks.

Of course, the three categories of sources above are closely connected; for example, databases are structured according to theoretical models and/or judgment, and theoretical sources and expert judgment are based on empirical studies. As mentioned in Section 3.2, an approach to develop a BBN as a causal HRA model integrating the sources above is presented in [29]. The fundamental contribution of [29] has been to show the feasibility of integrating empirical sources in the model construction – it has to be recalled that the majority of BBN applications for HRA are solely based on expert judgment. This has been a key development, considering that the increasing use of PSA to support regulatory and operational decisions requires that the tools and methods (HRA methods in particular) be developed, to the extent possible, on an empirically sound basis. Empirical data, theoretical models and judgment are used to develop the BBN structure; for the calculation of the CPDs, empirical data is used for the relationships for which it is available, and expert judgment comes in otherwise.

As is generally the case (not only for HRA), in the quantification of the BBN relationships, expert judgment is generally viewed as the alternative to missing data [11,12]. Whenever data is available, it is used; expert judgment comes in otherwise. However, the two sources are rarely combined for the quantification of the same relationships (i.e., typically some relationships are quantified from data, while others from judgment). This approach is problematic for fields in which data is scarce: the uncertainty in the estimates of the BBN CPDs can be very large – of course depending on the data applicable to a specific CPD. For the quantification of the relationships for which little data is available, the two sources need to be combined, to strengthen statistically poor estimates.

An important modelling challenge for the combination of the different sources is the rigorous characterization of the information they bring in, and of how each piece of evidence (from a database, a simulator run, an occurred event, an expert statement) modifies the belief in the BBN parameters. For human failure events, the sources of uncertainty in the HEP include stochastic crew behaviour (the same crew could fail or succeed on the same task under the same conditions), crew-to-crew variability (different crews have different performances), and task and scenario variability (depending on the granularity of the task definitions, these can envelop different task and scenario variants). The different information sources considered for HRA (operational and accidental events, simulator runs, and expert judgement) provide different information on the above aspects of uncertainty and variability. For example, a reported human failure event relates to the behaviour of a single, specific crew on a single, specific task, while simulator runs may address multiple crews on the same task; an expert judgment would typically consider a generic crew. In addition, expert judgment may come in different forms: qualitative or quantitative, absolute or relative. The characterization of these aspects is an important prerequisite for the development of an approach to combine data and judgment. Indeed, the combination of different sources of information in causal models is a very delicate issue: as discussed in [78], if not properly done, it can lead to errors of several orders of magnitude. However, for the HRA field (and risk analysis in general), where data is scarce but precious, efforts towards the rigorous combination of empirical data and expert judgment for the quantification of the models appear to be the way ahead, to maintain the empirical basis provided by the data while decreasing, via expert judgment, the (typically) large uncertainty.

A necessary step in this direction is the representation of the uncertainty in the BBN parameters (i.e. the parameters defining the BBN CPDs). As discussed in [33], BBNs typically treat parameters as point values, with no information on the level of confidence in their estimates. This approach can be criticized in any risk analysis application, because the representation of uncertainty is a fundamental input to the decision making process. Specifically for HRA, the uncertainty representation is necessary to represent the different levels of knowledge that each information source brings to the model, as well as to update this knowledge as new evidence becomes available.
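As an illustration of what such a representation could look like, a conjugate Beta prior on a single CPT entry can encode an expert estimate and be updated with observed outcomes; this particular model and all values are choices of the sketch, not of the reviewed studies.

```python
# A sketch of one way to represent CPD parameter uncertainty and combine
# judgment with data: a Beta prior encodes an expert estimate of a single
# CPT entry and is updated with observed outcomes. Values are hypothetical,
# and this conjugate model is a choice of this sketch, not of the studies.

alpha, beta = 0.5, 49.5        # Beta(0.5, 49.5): mean 0.01, ~50 pseudo-counts

failures, successes = 3, 120   # empirical evidence for the same context
alpha_post = alpha + failures
beta_post = beta + successes

mean_post = alpha_post / (alpha_post + beta_post)
print(round(mean_post, 4))     # 0.0202: data-driven, stabilized by the prior
# The full posterior (not just its mean) documents the residual confidence
# in the CPT entry and can absorb further evidence as it becomes available.
```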
5.3. Use of expert judgment in the quantification of the CPTs

As discussed in Section 4.2, for the large majority of HRA applications, expert judgments are the primary inputs to the development of the BBNs. Conceptually, the high-level process for developing a BBN with expert input is established [12,13]. The building of a BBN consists of a number of phases: deciding what to model, defining and choosing the variables to represent the nodes, defining the ranges of continuous or discrete variables describing the states of each node, building the structure in terms of links between the predefined nodes, quantifying the model in terms of CPT building, and verifying the model through sensitivity analysis or by testing the model behaviour in well-defined and known scenarios [12]. The most delicate part of the process is indeed the assessment of the CPDs. As discussed in Section 4.2, for HRA a variety of approaches has been developed and applied, aiming at avoiding different types of biases in the probability elicitation as well as at making the elicitation process more practical. Approaches to avoid biases are relatively established [52,79]; proposals to inhibit or prevent biases include, for example, anticipating the biases likely to occur in the planned elicitation, redesigning the planned elicitation, making the experts aware of the potential biases that they are likely to exhibit, and using mathematical models to analyse the biases that may have occurred [79]. A good review of expert elicitation of probabilities and related issues, such as biases, consistency and coherence, is provided in Ref. [80].

As discussed in Section 4.2, due to the generally large number of relationships, an important aspect of the elicitation process is to limit the amount of information required from the expert. On the one hand, this is important to avoid the impracticality of eliciting all CPDs. On the other hand, eliciting CPDs one by one may cause losing track of general model properties, e.g. the functional relationships of the factors over their entire variability range, the overall importance of some factors, or the influences of groups of factors. Both issues can be addressed by resorting to algorithms to populate the CPDs. When applying these algorithms, expert judgement is limited to the determination of selected relationships (selected CPDs) and/or to the definition of general tendencies in the factor influences; the algorithm then populates the CPDs on the basis of the expert input. As presented in Section 4.2, three of the surveyed studies apply this type of algorithm [19,20,22]. In [19], the noisy-OR filling-up algorithm is applied; in [20], the authors developed a procedure in which the probabilities are determined by linear interpolation according to the number of parent nodes in positive-condition states; in [22], CPDs are determined by eliciting from domain experts the strength of influence (characterized as low, medium, or high) of each parent node on the child node. In all cases, the important assumption of independence of the factor influences on the HEP had to be made (see also Section 5.1 on the complexity aspects modelled by the BBN framework). The algorithms in [24] and [70] allow some level of interaction among the factors: for example, in [24], by calculating CPDs as an exponential function of a weighted distance of the child state from the parent states. However, the weights are calculated based on evaluating the effect of factors taken one at a time, thus possibly missing the multiple-factor interactions typical of HRA applications.

Indeed, alternative approaches to the assessment of CPDs have been proposed in the general literature. The methods modelling dependent causality can be classified into two sub-groups: methods which are generalizations of the noisy-OR model (one of the most widely used CPT filling-up algorithms, which assumes independent parent nodes) [81,82], and methods following different methodological approaches, such as interpolation over parent node states [83–85]. The suitability and applicability of those methods for HRA applications (whether the method handles multi-state, dependent variables, addresses multi-expert elicitation and aggregation, deals with uncertainty treatment, etc.) are yet to be systematically investigated.
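To illustrate the flavour of this family of algorithms, the following minimal sketch follows the spirit of the weighted-distance/ranked-node approaches [24,70], without reproducing either exactly: the parent weights, the distance function and the decay rate are illustrative assumptions.

```python
# A minimal sketch in the spirit of the weighted-distance / ranked-node
# family [24,70]: parent states are mapped to [0,1], a weighted mean gives
# the child's "ideal" value, and each child state receives mass decaying
# exponentially with its distance from it. Weights, the distance function
# and the decay rate are illustrative assumptions, not either reference.

import math

def ranked_cpd(parent_values, weights, child_values, decay=8.0):
    """CPD over ordered child states given normalized parent state values."""
    ideal = sum(w * v for w, v in zip(weights, parent_values)) / sum(weights)
    masses = [math.exp(-decay * abs(c - ideal)) for c in child_values]
    z = sum(masses)
    return [round(m / z, 3) for m in masses]

# Two parents (training quality 0.2, time pressure 0.9), training weighted
# twice as heavily; three ordered child states mapped to 0, 0.5 and 1.
print(ranked_cpd([0.2, 0.9], [2.0, 1.0], [0.0, 0.5, 1.0]))
# -> [0.05, 0.933, 0.017]: a full CPD from a handful of elicited weights
```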

6. Conclusions

BBNs appear to be a very good candidate for developing HRA models. Yet, while BBN-based HRA applications are steadily increasing in the literature, critical analyses of the BBN potential in this field, as well as the identification of research challenges, are lacking. This has been the aim of the present paper, informed by a survey of HRA application studies. The survey has addressed multiple types of HRA applications in various industrial fields (mostly within the nuclear, oil and gas, and aviation domains), covering the relevant literature from the appearance of BBNs in HRA applications (early 2000s) until present.

In terms of developed BBNs, the survey underscores a large variety in the definition of BBN nodes, states and structures, even within each group of the applications examined in this review. Depending on the study, nodes are used to directly represent the classical HRA influencing factors (the PSFs), MOFs, as well as conceptually different elements, such as personnel qualifications, hardware failures, and hazardous situations possibly initiating an accident. Binary nodes are mostly used; however, the state definitions can be very different across the studies, even when associated with the same influencing factor. Variety is also observed in the BBN structures (hierarchical and non-hierarchical networks have been developed) and sizes. On the one hand, this variety underscores the flexibility of the BBN modelling tool, which can capture very different modelling approaches, levels of detail, and levels of integration with the hardware failure models. On the other hand, it reflects the fact that BBNs within HRA have not yet reached a strong level of maturity: the appropriate level of detail, the layers of influences, and the integration within the overall safety analyses that can be captured by this modelling tool are still subject to exploration.

The survey looked closely at the process adopted for BBN development in each study. With respect to the development of the BBN structure, a landmark contribution has been provided by ref. [29], which demonstrates the possibility of informing the BBN structure from empirical data. This allows understanding how the failure influencing factors interact in producing error-forcing conditions, as these have emerged from real cases. The applicability of the approaches in [29] is currently constrained by the limited availability of usable data; however, this situation is expected to improve considerably, in view of the new momentum [45,46] recently seen in data collection efforts for HRA applications. Refs. [45–48] stress the key importance for HRA of continuing data collection and of investigating the open issues related to the operationalization of influencing factors and performance measures, and of how to eventually analyse the collected data and use it to inform HRA models and support actual analyses.

For most of the surveyed studies, the BBN structure is developed from expert judgment, typically elicited through panel discussions, questionnaires and interviews. It has to be mentioned that the surveyed literature gives limited emphasis to the aspect of BBN building. It appears that this aspect is generally perceived as less problematic, in light of the very intuitive graphical representation – which largely facilitates the elicitation of the judgment. In contrast, CPD assessment has received more attention. In terms of model traceability and credibility, details on how the information was presented to the experts, how the level of detail and scope of the factors was defined, and which elements of the model were more challenging or controversial would be important.

Concerning the assessment of the CPDs, in only a few studies did the amount of data support their empirical estimation, and then only for selected model relationships. As in the development of the model structure, the use of expert judgement dominates the CPD assessment. An area that has not been investigated satisfactorily for HRA applications relates to the need to avoid eliciting all of the CPDs one by one – which can lead to inconsistencies and is problematic due to the resources needed. This can be overcome by eliciting limited model information and then completing the remaining CPDs with the help of algorithms. Most algorithms used in the reviewed studies require the assumption of factor independence, which does not suit many HRA applications. Indeed, while simplifying the algorithm development and application, this assumption contrasts with one of the purposes of using BBNs in HRA: the possibility to model the factor interactions. The present paper has highlighted the need to systematically investigate alternative algorithms, better suited to the HRA needs.

A fundamental element for the use of any model in risk analysis is its verification and validation. In the surveyed studies, verification is approached by sensitivity analysis, when addressed at all. Comprehensive model validation is hindered by the shortage of data. On the one hand, it is hoped that the mentioned increased interest in the collection of HRA data will allow more empirically-based attempts at validation. On the other hand, it may be worth investigating whether more systematic validation approaches can be pursued, even under the current lack of comprehensive data. These include partial model validation (e.g., the response in correspondence of selected, known model relationships); model sensitivity to parameter changes (direction of changes, amount of changes, response to known factor interactions); and qualitative ranking validation. In the last, a set of tasks is first ranked by difficulty, which is generally feasible; this ranking should then be in line with the HEP ranking produced by the BBN.

BBNs in principle allow the combination of the different sources of information relevant for HRA, namely empirical data, cognitive theory, and expert judgment. The present paper has argued that formal ways to combine expert judgment and empirical data need to be sought. On the one hand, expert judgment would allow refining the large uncertainties affecting the empirical information; on the other hand, the formalism allows distinguishing, reviewing, and possibly further refining the more subjective elements of the model development. The modelling challenges ahead relate to the mathematical characterization of the information content that each source brings, of the different forms of expert judgment (e.g. quantitative/qualitative, absolute/relative/ranking), and of the uncertainty in the BBN parameters.
As a closing remark, HRA models (typically, as part of the overall PSA model) are used to make decisions with risk-relevant implications: all assumptions in the model development and use need to be subject to review and acceptance, which, depending on the industrial field, may need to go through regulatory bodies. Therefore, the improvements advocated by the present paper in the formalism of model development, validation and verification could support the acceptance of BBN applications for HRA. Note that in this regard HRA is not unique; similar observations have been made outside the HRA field, quoting from [51]: 'In the literature much more attention is given to the algorithmic properties of BBNs than to the method of actually building them in practice'. A similar statement can be found in [86]: 'In many papers and books of BBNs they are usually presented as completed pieces of work, with no insights into the reasoning and working that must have gone into determining why the particular set of nodes and links between them were chosen rather than others'.

Acknowledgement

This work was funded by the Swiss Federal Nuclear Safety Inspectorate (ENSI), under DIS-Vertrag Nr. 82610. The views expressed in this article are solely those of the authors.

References

[1] Swain A, Guttmann HE. Handbook of human reliability analysis with emphasis on nuclear power plant applications. NUREG/CR-1278: US Nuclear Regulatory Commission; 1983.
[2] Kirwan B. A guide to practical human reliability assessment. CRC Press; 1994.
[3] Spurgin A. Human reliability assessment theory and practice. CRC Press; 2010.
[4] Gertman D, Blackman H, Byers J, Haney L, Smith C, Marble J. The SPAR-H method. NUREG/CR-6883: US Nuclear Regulatory Commission; 2005.
[5] Forester J, Kolaczkowski A, Cooper S, Bley D, Lois E. ATHEANA user's guide. NUREG-1880: US Nuclear Regulatory Commission; 2007.
[6] Hollnagel E. Cognitive reliability and error analysis method: CREAM. New York: Elsevier; 1998.
[7] Embrey DE, Humphreys PC, Rosa EA, Kirwan B, Rea K. SLIM-MAUD: an approach to assessing human error probabilities using structured expert judgment. NUREG/CR-3518: Department of Nuclear Energy, Brookhaven National Laboratory, US Nuclear Regulatory Commission; 1984.
[8] Oxstrand J. Human reliability guidance – how to increase the synergies between human reliability, human factors, and system design & engineering. Phase 2: the American point of view. Nordic Nuclear Safety Research Council (NSK) Technical Report NSK-229; 2010.
[9] Forester J, Dang VN, Bye A, Lois E, Massaiu S, Broberg H, et al. The international HRA empirical study – final report – lessons learned from comparing HRA methods predictions to HAMMLAB simulator data. HPR-373, OECD Halden Reactor Project, Norway; 2013.
[10] Mosleh A, Chang YH. Model-based human reliability analysis: prospects and requirements. Reliab Eng Syst Saf 2004;83(2):241–53.
[11] Jensen FV, Nielsen TD. Bayesian network and decision graphs. New York, NY, USA: Springer Science; 2007.
[12] Fenton NE, Neil MD. Risk assessment and decision analysis with Bayesian networks. Boca Raton, FL, USA: CRC Press; 2013.
[13] Langseth H, Portinale L. Bayesian networks in reliability. Reliab Eng Syst Saf 2007;92(1):92–108.
[14] Brooker P. Experts, Bayesian belief networks, rare events and aviation risk estimates. Saf Sci 2011;49(8):1142–55.
[15] Chen SH, Pollino CA. Good practice in Bayesian network modelling. Environ Model Softw 2012;37:134–45.
[16] Hallbert B, Kolaczkowski A, Lois E. The employment of empirical data and Bayesian methods in human reliability analysis: a feasibility study. NUREG/CR-6949: US Nuclear Regulatory Commission; 2007.
[17] Mkrtchyan L, Podofillini L, Dang VN. A survey of Bayesian belief network applications in human reliability analysis. In: Proceedings of the European safety and reliability conference (ESREL 2014), September 14–18, Wroclaw; 2014.
[18] Li PC, Chen GH, Dai LC, Zhang L. A fuzzy Bayesian network approach to improve the quantification of organizational influences in HRA frameworks. Saf Sci 2012;50(7):1569–83.
[19] Cai B, Liu Y, Zhang Y, Fan Q, Liu Z, Tian X. A dynamic Bayesian networks modelling of human factors on offshore blowouts. J Loss Prevent Process Ind 2013:639–49.
[20] Martins MR, Maturana MC. Application of Bayesian belief networks to the human reliability analysis of an oil tanker operation focusing on collision accidents. Reliab Eng Syst Saf 2013;110:89–109.
[21] Trucco P, Cagno E, Ruggeri F, Grande O. A Bayesian belief network modelling of organisational factors in risk analysis: a case study in maritime transportation. Reliab Eng Syst Saf 2008;93(6):845–56.
[22] Vinnem JE, Bye R, Gran BA, Kongsvik T, Nyheim OM, Okstad EH, et al. Risk modelling of maintenance work on major process equipment on offshore petroleum installations. J Loss Prevent Process Ind 2012;25(2):274–92.
[23] Gran BA, Bye R, Nyheim OM, Okstad EH, Seljelid J, Sklet S, et al. Evaluation of the Risk OMT model for maintenance work on major offshore process equipment. J Loss Prevent Process Ind 2012;25(3):582–93.
[24] Røed W, Mosleh A, Vinnem JE, Aven T. On the use of the hybrid causal logic method in offshore risk analysis. Reliab Eng Syst Saf 2009;94(2):445–55.
[25] Groth K, Wang C, Mosleh A. Hybrid causal methodology and software platform for probabilistic risk assessment and safety monitoring of socio-technical systems. Reliab Eng Syst Saf 2010;95(12):1276–85.
[26] Ale BJ, Bellamy LJ, Cooke RM, Goossens LHJ, Hale AR, Roelen ALC, et al. Towards a causal model for air transport safety—an ongoing research project. Saf Sci 2006;44:657–73.



[27] Ale BJ, Bellamy LJ, Van der Boom R, Cooper J, Cooke RM, Goossens LHJ, et al. Further development of a causal model for air transport safety (CATS): building the mathematical heart. Reliab Eng Syst Saf 2009;94(9):1433–41.
[28] Hanea D, Hanea A, Ale B, Sillem S, Lin PH, Van Gulijk C, et al. Using dynamic Bayesian networks to implement feedback in a management risk model for the oil industry. In: Proceedings of the international conference on probabilistic safety assessment and management and the European safety and reliability conference PSAM11 & ESREL 2012, June 25–29, Helsinki, Finland; 2012.
[29] Groth KM, Mosleh A. Deriving causal Bayesian networks from human reliability analysis data: a methodology and example model. Proc Inst Mech Eng, Pt O: J Risk Reliab 2012;226(4):361–79.
[30] Sundarmurthi R, Smidts C. Human reliability modelling for next generation system code. Ann Nucl Energy 2013:137–56.
[31] Stempfel Y, Dang VN. Developing and evaluating the Bayesian belief network as a human reliability model using artificial data. In: Proceedings of the European safety and reliability conference (ESREL 2011), September 18–22, Troyes, France; 2011.
[32] Dang VN, Stempfel Y. Evaluating the Bayesian belief network as a human reliability model – the effect of unreliable data. In: Proceedings of the international conference on probabilistic safety assessment and management and the European safety and reliability conference PSAM11 & ESREL 2012, June 25–29, Helsinki, Finland; 2012.
[33] Podofillini L, Pandya D, Dang VN. Representation of parameter uncertainty in Bayesian belief networks for human reliability analysis. In: Proceedings of the European safety and reliability conference (ESREL 2013), September 29–October 2, Amsterdam, The Netherlands; 2013.
[34] Musharraf M, Bradbury-Squires D, Khan F, Veitch B, MacKinnon S, Imtiaz S. A virtual experimental technique for data collection for a Bayesian network approach to human reliability analysis. Reliab Eng Syst Saf 2014;132:1–8.
[35] Groth KM, Swiler LP. Bridging the gap between HRA research and HRA practice: a Bayesian network version of SPAR-H. Reliab Eng Syst Saf 2013;115:33–42.
[36] Kim MC, Seong PH, Hollnagel E. A probabilistic approach for determining the control mode in CREAM. Reliab Eng Syst Saf 2006;91(2):191–9.
[37] Yang ZL, Bonsall S, Wall A, Wang J, Usman M. A modified CREAM to human reliability quantification in marine engineering. Ocean Eng 2013;58:293–303.
[38] Baraldi P, Podofillini L, Mkrtchyan L, Zio E, Dang VN. Comparing the treatment of uncertainty in Bayesian networks and fuzzy expert systems used for a human reliability analysis application. Reliab Eng Syst Saf 2015;138:176–93.
[39] Baraldi P, Conti M, Librizzi M, Zio E, Podofillini L, Dang VN. A Bayesian network model for dependence assessment in human reliability analysis. In: Proceedings of the European safety and reliability conference (ESREL 2009), September 7–10, Prague; 2009.
[40] Ekanem NJ, Mosleh A. Human failure event dependency modelling and quantification: a Bayesian network approach. In: Proceedings of the European safety and reliability conference (ESREL 2013), September 29–October 2, Amsterdam, The Netherlands; 2013.
[41] Mosleh A, Shen SH, Kelly DL, Oxstrand J, Groth KM. A model-based human reliability analysis methodology. In: Proceedings of the international conference on probabilistic safety assessment and management (PSAM 11), 25–29 June, Helsinki, Finland; 2012.
[42] Kim MC, Seong PH. An analytic model for situation assessment of nuclear power plant operators based on Bayesian inference. Reliab Eng Syst Saf 2006;91:270–82.
[43] Kim MC, Seong PH. A computational method for probabilistic safety assessment of I&C systems and human operators in nuclear power plants. Reliab Eng Syst Saf 2006;91:580–93.
[44] Lee HC, Seong PH. A computational model for evaluating the effects of attention, memory, and mental models on situation assessment of nuclear power plant operators. Reliab Eng Syst Saf 2009;94(11):1796–805.
[45] Chang JY, et al. The SACADA database for human reliability and human performance. Reliab Eng Syst Saf 2014;125:117–33.
[46] Prvakova S, Dang VN. A review of the current status of HRA data. In: Proceedings of ESREL, September 29–October 2, Amsterdam, The Netherlands; 2013.
[47] Hallbert B, Morgan T, Hugo J, Oxstrand J, Persensky JJ. A formalized approach for the collection of HRA data from nuclear power plant simulators. NUREG/CR-7163, INL/EXT-12-26327, US NRC, Washington, DC; 2014.
[48] Park J, Jung W, Kim S, Choi SY, Kim Y, Dang VN. A guideline to collect HRA data in the simulator of nuclear power plants. KAERI/TR-5206/2013, Daejeon, South Korea; 2013.
[49] Pearl J. Fusion, propagation, and structuring in belief networks. Artif Intell 1986;29(3):241–88.
[50] Druzdzel MJ, van der Gaag L. Building probabilistic networks: where do the numbers come from? IEEE Trans Knowl Data Eng 2000;12(4):481–6.
[51] Neil M, Fenton N, Nielson L. Building large-scale Bayesian networks. Knowl Eng Rev 2000;15(03):257–84.
[52] Cooke RM. Experts in uncertainty. New York: Oxford University Press; 1991.
[53] Kirwan B. Validation of human reliability assessment techniques: Part 1 – validation issues. Saf Sci 1997;27(1):25–41.
[54] Kirwan B. Validation of human reliability assessment techniques: Part 2 – validation results. Saf Sci 1997;27(1):43–75.
[55] Mohaghegh Z, Mosleh A. Incorporating organizational factors into probabilistic risk assessment of complex socio-technical systems: principles and theoretical foundations. Saf Sci 2009;47(8):1139–58.
[56] Mohaghegh Z, Kazemi R, Mosleh A. Incorporating organizational factors into probabilistic risk assessment (PRA) of complex socio-technical systems: a hybrid technique formalization. Reliab Eng Syst Saf 2009;94(5):1000–18.

[57] Øien K. Risk indicators as a tool for risk control. Reliab Eng Syst Saf 2001;74(2):129–45.
[58] Sklet S, Ringstad A, Steen S, Tronstad L, Haugen S, Seljelid J, et al. Monitoring of human and organizational factors influencing the risk of major accidents. In: Proceedings of the SPE international conference on health, safety and environment in oil and gas exploration and production, April 12–14, Rio de Janeiro, Brazil; 2010.
[59] Guldenmund FW. The nature of safety culture: a review of theory and research. Saf Sci 2000;34(1):215–57.
[60] Groth KM, Mosleh A. A data-informed PIF hierarchy for model-based human reliability analysis. Reliab Eng Syst Saf 2012;108:154–74.
[61] Bonafede CE, Giudici P. Bayesian networks for enterprise risk assessment. Phys A: Stat Mech Appl 2007;382(1):22–8.
[62] Hallbert B, Boring R, Gertman D, Dudenhoeffer D, Whaley A, Marble J, et al. Human event repository and analysis (HERA) system – overview. NUREG/CR-6903, Vol. 1, INL/EXT-06-11528, US Nuclear Regulatory Commission; 2006.
[63] Mosleh A, Smidts C, Shen S. Application of a cognitive model to the analysis of reactor operating events involving operator interaction (detailed analysis). University of Maryland technical report, UMNE-94-004, College Park, MD; 1994.
[64] Shen SH, Smidts C, Mosleh A. A methodology for collection and analysis of human error data based on a cognitive model: IDA. Nucl Eng Des 1997;172(1):157–86.
[65] Cattell RB. A biometrics invited paper. Factor analysis: an introduction to essentials I. The purpose and underlying models. Biometrics 1965;21(1):190–215.
[66] Chang YHJ, Mosleh A. Cognitive modeling and dynamic probabilistic simulation of operating crew response to complex system accidents – Part 1: overview of the IDAC model. Reliab Eng Syst Saf 2007;92:997–1013.
[67] Chang YHJ, Mosleh A. Cognitive modeling and dynamic probabilistic simulation of operating crew response to complex system accidents (ADS-IDA Crew). Center for Technology Risk Studies, Maryland; 1999.
[68] Podofillini L, Dang VN. A Bayesian approach to treat expert-elicited probabilities in human reliability analysis model construction. Reliab Eng Syst Saf 2013;117:52–64.
[69] Podofillini L, Dang V, Zio E, Baraldi P, Librizzi M. Using expert models in human reliability analysis – a dependence assessment method based on fuzzy logic. Risk Anal 2010;30(8):1277–97.
[70] Fenton NE, Neil M, Caballero JG. Using ranked nodes to model qualitative judgments in Bayesian networks. IEEE Trans Knowl Data Eng 2007;19(10):1420–32.
[71] Endsley MR. Toward a theory of situation awareness in dynamic systems. Hum Factors: J Hum Factors Ergon Soc 1995;37(1):32–64.
[72] Miao AX, Zacharias GL, Kao SP. A computational situation assessment model for nuclear power plant operations. IEEE Trans Syst, Man, Cybern, Pt A: Syst Humans 1997;27(6):728–42.
[73] Groth KM, Denman MR, Cardoni JN, Wheeler TA. "Smart procedures": using dynamic PRA to develop dynamic, context-specific severe accident management guidelines (SAMGs). In: Proceedings of the international conference on probabilistic safety assessment and management (PSAM 12), June 22–27, Honolulu, Hawaii; 2014.
[74] Firmino PRA, Menezes RDCS, Droguett EL, Lemos Duarte DC. Eliciting engineering judgments in human reliability assessment. In: Proceedings of the reliability and maintainability symposium 2006, January 23–26, Newport Beach, California; 2006.
[75] Musharraf M, Hassan J, Khan F, Veitch B, MacKinnon S, Imtiaz S. Human reliability assessment during offshore emergency conditions. Saf Sci 2013;59:19–27.
[76] Pearl J. Probabilistic reasoning in intelligent systems: networks of plausible inference. San Mateo, CA: Morgan Kaufmann Publishers; 1988.
[77] Whaley AM, Xing J, Boring RL, Hendrickson ML, Joe JC, Le Blanc KL. Building a psychological foundation for human reliability analysis. NUREG-2114, INL/EXT-11-23898, US Nuclear Regulatory Commission; August 2012.
[78] Druzdzel MJ, Diez FJ. Combining knowledge from different sources in causal probabilistic models. J Mach Learn Res 2003;4:295–316.
[79] Meyer MA, Booker JM. Eliciting and analyzing expert judgment: a practical guide. New York: Academic Press; 1991.
[80] Kynn M. The 'heuristics and biases' bias in expert elicitation. J R Stat Soc: Ser A (Stat Soc) 2008;171(1):239–64.
[81] Xiang Y, Jia N. Modeling causal reinforcement and undermining for efficient CPT elicitation. IEEE Trans Knowl Data Eng 2007;19(12):1708–18.
[82] Lemmer JF, Gossink DE. Recursive noisy OR – a rule for estimating complex probabilistic interactions. IEEE Trans Syst, Man, Cybern, Pt B: Cybern 2004;34(6):2252–61.
[83] Cain J. Planning improvements in natural resource management: guidelines for using Bayesian networks to support the planning and management of development programmes in the water sector and beyond. Wallingford, UK: Centre for Ecology and Hydrology, Crowmarsh Gifford; 2001.
[84] Wisse BW, van Gosliga SP, van Elst NP, Barros AI. Relieving the elicitation burden of Bayesian belief networks. In: Proceedings of the sixth Bayesian modelling applications workshop on UAI; 2008.
[85] Tang Z, McCabe B. Developing complete conditional probability tables from fractional data for Bayesian belief networks. J Comput Civil Eng 2007;21(4):265–76.
[86] Fenton N, Neil M, Lagnado DA. A general structure for legal arguments about evidence using Bayesian networks. Cogn Sci 2013;37(1):61–102.