A framework to estimate task opportunities from the operational experience of domestic nuclear power plants

Jinkyun Park, Yochan Kim, Wondea Jung
Korea Atomic Energy Research Institute (KAERI), Republic of Korea


Article history: Received 18 December 2015; Received in revised form 21 April 2016; Accepted 4 May 2016

Keywords: Nuclear power plant; Probabilistic safety assessment; Human reliability analysis; Operational experience; Estimation of task opportunity; Human error probability

Abstract

Since one of the most important issues in operating socio-technical systems is to enhance their safety by reducing the likelihood of human errors, it is a prerequisite to secure reliable human performance data clarifying when and why human operators make an error. In this regard, many researchers have tried to calculate an HEP (Human Error Probability) from operational experience data based on its traditional definition (i.e., HEP = number of errors observed / number of task opportunities for error). Accordingly, most existing HEPs are mainly based on the number of task opportunities estimated from routine or periodic tasks that are usually performed in a full power condition with fixed time intervals. In contrast, HEPs are seldom calculated for tasks conducted in off-normal conditions because such tasks do not occur at fixed time intervals. For this reason, in this study, a novel framework is proposed that can be used to estimate the number of task opportunities for off-normal tasks from the operational experience of domestic NPPs. Although the proposed framework still has a couple of limitations, it could be a good starting point not only for enriching the ability to calculate HEPs from operational experience data but also for providing reference information for HEPs obtained from other sources of information (e.g., full-scope simulators).

1. Introduction

One of the most substantial issues in operating socio-technical systems, such as NPPs (Nuclear Power Plants), chemical/petrochemical plants, and commercial airplanes, is to secure their operational safety because any incidents and/or accidents could be catastrophic for public health and the living environment (List25, 2015). Therefore, it is natural that most organizations running such socio-technical systems want to continuously confirm whether or not their safety (or risk) level is acceptable or tolerable. In this regard, many kinds of risk quantification techniques have been developed over several decades, and a PSA (Probabilistic Safety Assessment) or PRA (Probabilistic Risk Assessment) is widely used, especially in the nuclear sector (Mosleh, 2014; H. Kim et al., 2015; Lee et al., 2015; Duy et al., 2016). The basic idea of the PSA technique is to quantify all the risk contributions of plausible initiating events that can lead the status of an NPP toward hazardous consequences (e.g., a core damage or a large release of radioactive material). In general, such initiating events are classified into internal events (e.g., the failure of safety

critical systems) and external events (e.g., earthquake, typhoon, flood, and high wind). However, since the diverse spectrum of human actions also contributes to the safety of socio-technical systems (Akyuz, 2015; Evans, 2011; Hughes et al., 2015; Kim and Kim, 2015; Pasquale et al., 2015), it is indispensable to incorporate the likelihood of human errors (i.e., HEPs; Human Error Probabilities) into the framework of the PSA in a systematic manner (Vaurio, 2009; Farcasiu and Nitoi, 2015). To this end, not only HEPs but also other information, including the effect of error-forcing contexts (e.g., PSFs; Performance Shaping Factors) on the associated HEPs, should be available to HRA practitioners (for convenience, the term HRA data is used hereafter to represent all kinds of data necessary for conducting an HRA).

For this reason, many researchers have spent a huge amount of effort providing HRA data to HRA practitioners, the contents of which are collected from several sources of information such as (1) operational experience data based on event reports (e.g., maintenance reports, periodic test reports, near miss reports, and incident reports), (2) full-scope and/or partial-scope simulators, (3) laboratory experiments, (4) expert judgments, and (5) interviews with subject matter experts (Hirschberg and Dang, 1996; IAEA, 1998; Isaac et al., 2002; NEA, 2008). Of them, the use of simulators (especially full-scope simulators) is the mainstream approach to collecting HRA data because they allow HRA practitioners to


directly observe the variability of human performance with respect to diverse error-forcing contexts (IAEA, 1995a). In addition, if we recall the fact that most initiating events considered in the PSA have extremely low frequencies, it is unrealistic to obtain sufficient HRA data from information sources other than full-scope simulators (Chang and Lois, 2012; Lederman, 1988; Stanton, 2005). Subsequently, for several decades, many HRA databases have been developed on the basis of human performance data collected from full-scope simulators (Chang et al., 2014; Moieni et al., 1994; Park and Jung, 2007; Reece et al., 1994).

However, it should be emphasized that HRA data obtained from the operational experience of NPPs are needed in parallel with those from full-scope simulators for a couple of reasons. The first one is that the use of full-scope simulators is an alternative solution adopted because operational experience data are not sufficient for securing the necessary HRA data. In other words, if a sufficient amount of reliable operational experience data (e.g., near miss or incident reports) were available, it would be possible to secure more realistic HRA data reflecting actual working environments.

The second issue is the realism of HRA data collected from full-scope simulators. According to existing studies, it seems that the overall tendencies of human behaviors observed in simulated situations properly reflect those in real situations (Gibson et al., 2006; Hirschberg and Dang, 1996; Park et al., 2004; Park and Jung, 2007; Takano and Reason, 1999). However, it is also true that the level of stress and/or realism felt by human operators in simulated situations is different from that of real situations. This means that caution is still needed when directly using HRA data gathered from full-scope simulators because of a certain bias with respect to real situations (Criscione et al., 2012; IAEA, 1995a; NEA, 1988). In this case, if we are able to use HRA data observed in real situations (i.e., operational experience data), they can be used as reference information to clarify the differences and/or similarities of human performance data collected from full-scope simulators.

The last issue is that, from the point of view of conducting an HRA, HRA data pertaining to the performance of routine tasks under a full power condition (e.g., periodic tests and maintenances) are also necessary. For example, in terms of the PSA, the IAEA (1995b) has clearly stated that an HRA should quantify the likelihood of human errors with respect to the following three task types: (1) Type A tasks include human actions associated with maintenance and testing that can degrade the availability of a certain component or system, (2) Type B tasks contain human actions directly resulting in the occurrence of initiating events (e.g., an unexpected reactor shutdown due to a human error in carrying out a periodic test procedure), and (3) Type C tasks involve various kinds of required actions institutionalized in AOPs (Abnormal Operating Procedures) and EOPs (Emergency Operating Procedures), which are crucial for responding to and/or mitigating the progression of initiating events. In this regard, although full-scope simulators are very useful for collecting HRA data related to Type C tasks, it is still necessary for HRA practitioners to access data on Type A and Type B tasks, which cannot be sufficiently gathered from full-scope simulators.
In this paper, in order to resolve the abovementioned issues, a novel framework is proposed that allows us to systematically estimate the opportunity of Type C tasks from the operational experience of domestic NPPs. To this end, it is necessary to distinguish the categories of off-normal tasks whose opportunities can be soundly estimated. In this regard, a total of 193 incident reports accumulated from January 2002 to December 2013 are reviewed in detail. As a result, it is revealed that the opportunity of Type C tasks can be reasonably estimated if an error has occurred during the performance of either abnormal tasks described in AOPs or emergency tasks prescribed in EOPs.

The structure of this paper is organized as follows. First, in order to clarify the background of this study, the reason for estimating the opportunity of Type C tasks is described based on the characteristics of off-normal tasks in Section 2. After that, in Section 3, an underlying concept that should be considered in calculating the opportunity of Type C tasks is outlined. Next, a brief explanation of a novel framework is given in Section 4, which can be used to determine the opportunity of Type C tasks from the operational experience data of domestic NPPs. Finally, the contributions and limitations of this study are discussed with a concluding remark in Section 5.

2. Basic idea for HEP quantification

As briefly outlined in the previous section, it is important to collect HRA data from the operational experience data of NPPs. Driven by this concern, a couple of HRA databases have been developed through an extensive review of operational experience data across diverse industrial sectors. Typical HRA databases include CAHR (Connectionism Assessment of Human Reliability), HERA (Human Event Repository and Analysis), and CORE (Computerized Operator Reliability and Error) (Hallbert et al., 2006; Kirwan et al., 1997; Sträter, 2000). More recently, Preischl and Hellmich (2013) calculated the HEPs of 37 tasks based on the operational experience data of German NPPs. In this regard, although several quantification techniques are available for quantifying HEPs based on operational experience data (Reer, 2004; Reer and Sträter, 2014), most of the existing HRA databases have quantified an HEP by using a very straightforward formula as given in Eq. (1):

HEP of the ith task:

\[ \mathrm{HEP}_i = \frac{m_i}{n_i} \tag{1} \]

Here, m_i and n_i denote the number of human errors observed during the performance of the ith task and the number of opportunities for the performance of the ith task, respectively. Actually, this formula is a direct reflection of the traditional assumption that human operators will show similar HEPs if they have to accomplish identical tasks under a specific task environment (Fleishman and Buffardi, 1999; Li and Wieringa, 2000; Reason, 2000; Stassen et al., 1990). In this light, Preischl and Hellmich (2013) stated that "[...] if an individual is randomly selected from a population, the probability for making an error in performing a certain task under given conditions at a given point of time depends on the individual's error probability, and thus becomes uncertain. This uncertainty [...] can be modeled by considering HEP_i as a random variable with a distribution concentrated on the interval [0, 1]" (p. 151). Therefore, if we are able to properly estimate the opportunity of a certain task (i.e., n_i) from operational experience data, it is strongly expected that the corresponding HEP can be calculated in a reliable manner.

From this perspective, Table 1 exemplifies how to estimate a task opportunity from the operational experience data of a nuclear reprocessing plant (Taylor-Adams and Kirwan, 1997). As can be seen from Table 1, an error occurred because a human operator put radioactive material into a wrong waste flask. Here, since there had been no such human error over four years of operation, the corresponding task opportunity can be estimated as: 20 (loading tasks/week) × 26 (weeks/year) × 4 (years) = 2080 loading tasks. This means that the HEP of the loading task can be calculated as 4.81E-4. Accordingly, the key step of HEP quantification is to reasonably estimate a task opportunity (i.e., n_i). This means that the very first step is to distinguish the catalog of tasks whose opportunities can be properly estimated. To this end, the operational experience data of domestic NPPs, which are stored in the NEED (Nuclear Event Evaluation Database), are reviewed in detail.
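To make Eq. (1) and the estimate above concrete, the following minimal Python sketch reproduces the loading-task calculation; the function names are illustrative and not part of any published tool.

```python
def task_opportunities(tasks_per_week: float, weeks_per_year: float, years: float) -> float:
    """Estimate n_i, the number of task opportunities, from a known task frequency."""
    return tasks_per_week * weeks_per_year * years

def hep(errors_observed: int, opportunities: float) -> float:
    """Eq. (1): HEP_i = m_i / n_i."""
    return errors_observed / opportunities

# Loading task at the nuclear reprocessing plant (Table 1): one error in four years.
n_i = task_opportunities(tasks_per_week=20, weeks_per_year=26, years=4)  # 2080 opportunities
print(n_i, hep(errors_observed=1, opportunities=n_i))                    # 2080, ~4.81E-4
```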


Table 1. Estimating a task opportunity from operational experience data; reproduced from Taylor-Adams and Kirwan (1997), p. 332.

Item              | Contents
Task description  | Procedural errors
Error description | During an in-cave operation to load active material into waste flasks, a piece of highly active waste was placed in the wrong flask
Operating history | 4 years
Task frequency    | Twenty loading operations per week, for 26 weeks a year
HEP               | 4.81E-4 (= 1/2080)
Industry          | Nuclear reprocessing plant
Data origin       | Real data

The NEED is managed by the nuclear regulatory body of the Republic of Korea (KINS; Korea Institute of Nuclear Safety), and its primary purpose is to provide valuable insights that are helpful for preventing the recurrence of similar incidents in NPPs (NEED, 2015). In this regard, when an incident significant to the safety of an NPP has occurred, the KINS dispatches an inspection team comprised of three or more investigators who are able to identify its root cause. The significant incidents typically include: (1) an unexpected automatic and/or manual reactor shutdown, (2) any incident that leads to the actuation of engineered safety features, and (3) any incident resulting in a power reduction (i.e., a loss of MWe) due to the violation of LCOs (Limiting Conditions for Operations). Since 2002, once an investigation is finished, the KINS has uploaded all of the relevant information (such as the initiation and progression of the incident, the investigation process, and remedial actions to avoid the recurrence of similar incidents) to the Internet.

Here, it is very interesting to point out that the NEED can be used as a source for estimating the opportunity of Type C tasks because one of the significant incidents investigated by the KINS is an unexpected automatic/manual reactor shutdown. In other words, since a reactor shutdown corresponds to one of the representative initiating events analyzed from the PSA perspective, it is promising that the HEPs of Type C tasks can be calculated through analyzing the incident reports of the NEED. In this vein, a total of 193 incidents that occurred in the period from January 2002 to December 2013 are reviewed in detail in order to clarify the types of tasks whose opportunities can be reasonably estimated. As a result, it is revealed that the opportunity of Type C tasks can be determined if an error has occurred during the performance of either abnormal tasks or emergency tasks.

For example, one of the popular solutions for securing the reliability of critical components and/or equipment is to periodically test their functions at predefined intervals (Cho and Jiang, 2008; Ghosh and Roy, 2009; Khalaquzzaman et al., 2011; Zio and Bazzo, 2009). Accordingly, if a human error has occurred during the performance of a task prescribed in a specific maintenance procedure (i.e., a Type A task), its opportunity can be estimated by considering the number of previous maintenances successfully carried out without any human errors. In addition, if a human error has occurred in the course of conducting a certain test procedure (i.e., a Type B task), which should be carried out in a specific operation mode (such as Start-up, Cold shutdown, Hot shutdown, Hot standby, and Refueling mode) (USNRC, 2011), its opportunity can be properly estimated based on how many times the corresponding operation modes have been experienced in a given NPP. Unfortunately, in the case of Type C tasks, the direct calculation of task opportunities is not possible because they do not appear with a fixed interval or regular frequency.

In order to understand the characteristics of Type C tasks, it is helpful to consider Fig. 1, which illustrates how human operators working in NPPs are able to cope with various kinds of off-normal conditions. Without loss of generality, the off-normal conditions of NPPs can be caused by diverse events belonging to one of three

categories: an abnormal event, an initiating event that can be soundly diagnosed, and an initiating event that is difficult to diagnose (CEOG, 1996; Park and Jung, 2004, 2015; WOG, 1987). The first category represents all kinds of deviations from the normal condition of an NPP (i.e., a full power condition), which are usually triggered by potential faults such as degraded instrumentation or component and/or equipment failures (IAEA, 2000, 2004). In order to effectively deal with these potential faults, most existing NPPs prepare a huge volume of AOPs that describe detailed countermeasures to be conducted by human operators (Park and Jung, 2015). In addition, since most of these potential faults are apt to result in trivial incidents that do not have any significant consequences for the safety of NPPs, it is possible to return to the full power condition without shutting down the NPP after removing and/or isolating them (refer to the region represented by Abnormal operation in Fig. 1).

In the case of an initiating event that requires either the manual or automatic shutdown of an NPP, however, a more sophisticated strategy is unavoidable (refer to the region denoted as Emergency operation in Fig. 1) because the chance of significant consequences is not negligible. In this light, it is possible to postulate two groups of initiating events. The first group includes initiating events whose nature can be properly distinguished (or diagnosed) from the list of symptoms identifiable by comparing and/or observing several process parameters such as pressure, temperature, and water level. Typical initiating events belonging to this group are a LOCA (Loss Of Coolant Accident) and an SGTR (Steam Generator Tube Rupture). Once the nature of an initiating event is accurately diagnosed, it is strongly expected that dedicated procedures can be developed, which are able to guide human operators by providing effective countermeasures in optimized sequences (e.g., the LOCA and SGTR procedures as depicted in Fig. 1). In contrast, in the case of an initiating event belonging to the second group, it is almost impossible to properly diagnose its nature. For example, when one or more instrumentation failures that could severely distort a correct symptom picture have concurrently occurred, it would be very difficult for human operators to properly recognize the status of the situation at hand. Similarly, it is not reasonable to expect that human operators are able to correctly diagnose multiple events (e.g., a LOCA followed by an SGTR) because of the avalanche of entangled and distorted symptoms. For these reasons, EOPs used in existing NPPs consist of ORPs (Optimal Recovery Procedures) for the initiating events belonging to the first group and FRPs (Functional Recovery Procedures) for those belonging to the second group.

When an off-normal event has occurred, therefore, human operators have to cope with it by strictly conducting the required tasks institutionalized in either AOPs or EOPs. Here, it is to be noted that there are times when human operators have to manually shut down a reactor for several reasons, such as internal flooding due to torrential rain or a forest fire near the site boundary of an NPP. In addition, although an automatic reactor shutdown is the very first response of the NPP in dealing with initiating events, it is true that most automatic reactor shutdowns are caused by a combination of the abovementioned


Fig. 1. Simplified strategy to cope with off-normal conditions in NPPs; reproduced from Park et al. (2005).

potential faults that do not result in any significant consequences. In this circumstance, human operators have to follow an RT (Reactor Trip) procedure that provides a series of tasks specifying how to lead the status of the NPP to a safe shutdown condition (e.g., Cold shutdown mode).

3. Concept in determining the opportunity of off-normal tasks

With the response strategy for off-normal conditions depicted in Fig. 1, it is expected that task opportunities can be soundly estimated from the operational experience data of domestic NPPs. To this end, it is necessary to distinguish two kinds of opportunities, from the perspective of a procedure and of a task (i.e., a procedure opportunity and a task opportunity). For example, let us consider a significant incident that occurred in April 2008 in Hanul Unit 4 (NEED, 2015). According to an investigation report issued by the KINS, this significant incident was caused by a human operator working in a local area (i.e., a field operator). At that time, the human operator was conducting a test procedure for EDG (Emergency Diesel Generator) fast start-up, which is supposed to be regularly carried out with a period of 15 days under a full power condition. Unfortunately, since the human operator pushed a wrong button in the course of performing one of the tasks prescribed in the test procedure, a loss of voltage (LOV) was initiated. As a result, a couple of engineered safety features were automatically actuated due to the stoppage of the components and equipment affected by the LOV. Interestingly, the operational experience data of Hanul Unit 4, whose date of commercial operation (DCO) is December 31, 1999, indicated that there had been no human error related to the performance of the corresponding test procedure.

Therefore, the opportunity of the test procedure can be estimated by dividing the total days of operation at full power (i.e., TFP; time for full power operation) by the test period (i.e., 15 days). It is to be noted that there are three principal rules in calculating a TFP. Fig. 2 shows hypothetical examples illustrating the application of these rules. The first rule is very straightforward: the TFP should reflect the actual days of operation under a full power condition (refer to Case A in Fig. 2). This implies that all time losses, including overhauls and unexpected shutdowns of a specific NPP, should be subtracted from the total operation time (T_Total) counted from its DCO (Date of Commercial Operation). The second rule is that, when the contents of the procedure under consideration have changed, the initial time of the T_Total should be recalibrated. That is, since the modification of a procedure could result in a variation of the task environment faced by human operators (i.e., different PSFs), the time before the modification is not comparable to the time after it. Therefore, if a human error has occurred during the performance of the modified procedure, it is reasonable to calculate the T_Total from the time of the procedure modification (refer to Case B in Fig. 2). The last rule is that, similar to a procedure modification, the initial time of the T_Total should reflect the change of component configurations or the replacement of components (refer to Case C in Fig. 2). That is, since it is natural to expect that the contents of several procedures will be modified to reflect these amendments, the periods before and after the configuration change and/or replacement should be distinguished from each other. Consequently, it is natural to calculate the T_Total from the time point of these amendments.
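As a rough illustration of these rules, the following Python sketch computes a TFP and the resulting opportunity of a periodic test procedure; the dates, the number of lost days, and the function names are assumptions made only for this sketch, not values taken from the incident report.

```python
from datetime import date

def time_for_full_power(start: date, end: date, lost_days: float) -> float:
    """TFP in days: total calendar days minus overhauls, shutdowns, and other time losses.
    Per the second and third rules, `start` should be reset to the date of the latest
    procedure modification or component configuration change, not necessarily the DCO."""
    return (end - start).days - lost_days

def procedure_opportunity(tfp_days: float, test_period_days: float) -> int:
    """Number of times a periodic test procedure could have been performed (N_S)."""
    return int(tfp_days // test_period_days)

# Illustrative numbers only: DCO to the incident date, with 600 days of assumed time loss.
tfp = time_for_full_power(date(1999, 12, 31), date(2008, 4, 1), lost_days=600)
print(procedure_opportunity(tfp, test_period_days=15))
```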


Fig. 2. Rules for calculating the time for full power operation (TFP).

Once a procedure opportunity is determined, a task opportunity can be estimated by analyzing the contents of the procedure (refer to Fig. 3). For example, let us assume that a human operator has to carry out an AOP (AOP1) that consists of diverse tasks, such as 'Verifying alarm occurrence,' 'Verifying state of indicator,' and 'Reading simple value.' In this situation, if an investigation report reveals that the human operator made a mistake in conducting the fourth task of AOP1 (i.e., the task of Verifying the initiation of Alarm A), then the opportunity with respect to the task of Verifying alarm occurrence can be estimated by multiplying the number of previous successes in conducting AOP1 without any human errors (i.e., N_S − 1) by the number of similar tasks included in AOP1 (i.e., N_T1). In other words, since it is strongly expected that AOP1 has been conducted under the same task environment (e.g., same location, temperature, illumination, tools, and outfits), it is reasonable to say that the opportunity for the task of Verifying alarm occurrence should be the sum of all the previous similar tasks carried out without any human errors.
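A minimal sketch of this (N_S − 1) × N_T calculation, using hypothetical numbers, might look as follows in Python.

```python
def task_opportunity(procedure_successes_before_error: int, similar_tasks_in_procedure: int) -> int:
    """Task opportunity n_i = (N_S - 1) x N_T: previous error-free performances of the
    procedure multiplied by the number of tasks of the same subtask type it contains."""
    return procedure_successes_before_error * similar_tasks_in_procedure

# Suppose AOP1 had been entered N_S = 12 times (11 of them error-free before the incident)
# and contains N_T = 4 'Verifying alarm occurrence' tasks (illustrative numbers only):
n_s, n_t = 12, 4
n_i = task_opportunity(n_s - 1, n_t)
print(n_i)        # 44 task opportunities
print(1.0 / n_i)  # corresponding HEP for one observed error, per Eq. (1)
```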

4. A framework to calculate the number of task opportunities

Based on the concept of task opportunity estimation illustrated in Fig. 3, it is possible to suggest a systematic framework comprised of five steps (Fig. 4). As can be seen from Fig. 4, the first step is to identify a human error from an incident report. After that,

Fig. 3. Calculating the number of task opportunities – an example.


Fig. 4. Framework for calculating the number of task opportunities.

it is necessary to clarify the nature of a human error identified from the first step. To this end, task types and the associated human error modes suggested by Y. Kim et al. (2015) are adopted in this study (refer to Table 2).

Table 2. Human error modes and the associated task types; reproduced from Y. Kim et al. (2015).

Task type | Subtask type | Error mode(a)
Information gathering and reporting – checking discrete state | Verifying alarm occurrence | EOO, EOC
 | Verifying state of indicator | EOO, EOC
 | Synthetically verifying information | EOO, EOC
Information gathering and reporting – measuring parameter | Reading simple value | EOO, EOC
 | Comparing parameter | EOO, EOC
 | Comparing in graph constraint | EOO, EOC
 | Comparing for abnormality | EOO, EOC
 | Evaluating trend | EOO, EOC
Response planning and instruction | Entering step in procedure | EOO
 | Transferring procedure | EOO, EOC
 | Transferring step in procedure | EOO, EOC
 | Directing information gathering | EOO, EOC
 | Directing manipulation | EOO, EOC
 | Directing notification | EOO, EOC
Situation interpreting without explicit guide of document | Diagnosing | EOO, EOC
 | Identifying overall status | EOO, EOC
 | Predicting | EOO, EOC
Manipulation | Manipulating simple (push button) control | EOO, EOC (WDEV, WDIR)
 | Manipulating simple (rotary) control | EOO, EOC (WDEV, WDIR, WQNT)
 | Manipulating dynamically | EOO, EOC (WDEV, WDIR, WQNT)
Notifying to external agent | – | EOO, EOC
Unauthorized control | – | EOC

(a) EOO: Error of Omission; EOC: Error of Commission; WDEV: Wrong Device; WDIR: Wrong Direction; WQNT: Wrong Quantity.


As shown in Table 2, human errors basically consist of an EOO (Error of Omission) and an EOC (Error of Commission). In the case of the EOC, however, it can be subdivided into a WDEV (wrong device selection), WDIR (wrong direction), and WQNT (wrong quantity) if necessary. Here, the EOO denotes a human error related to a failure to conduct a required task, while the EOC is caused by actively doing a wrong task that did not need to be carried out. In addition, the WDEV and the WDIR denote human errors initiated by selecting a wrong device (e.g., selecting Switch A instead of Switch B) and an inappropriate control direction (e.g., setting a control device to increase a temperature instead of decreasing it), respectively. Even when human operators properly select a device and a control direction, they may still put a wrong control input into the control device (e.g., adjusting the opening of a valve to 20% instead of to the full close position); the WQNT applies to this situation.

In addition, the abovementioned human error modes can be discriminated with respect to the types of tasks to be conducted by human operators. As summarized in Table 2, the EOO and EOC are the most common human error modes across task types. The task types are basically categorized into: (1) Information gathering and reporting, (2) Response planning and instruction, (3) Situation interpreting, (4) Manipulation, (5) Notifying to external agent (including field operators, operation support teams, and an electric power control center), and (6) Unauthorized control carried out without a specific or firm rationale. It is to be noted that, in the case of information gathering and reporting, two kinds of task types (i.e., Checking discrete state and Measuring parameter) are considered separately because of their differences in task characteristics. For example, the nature of an EOO observed in the task of Verifying alarm occurrence is quite different from that of Comparing parameter, because the former denotes a human error pertaining to recognizing dichotomous information (e.g., activation and deactivation) while the latter implies a human error related to integrating the readings of two or more continuously changing process variables. In addition, each task that extensively requires the manipulation of components and/or equipment can be refined further according to its characteristics, such as (1) manipulating simple (push button) control, (2) manipulating simple (rotary) control, and (3) manipulating dynamically (e.g., concurrently adjusting two or more process parameters in order to satisfy a predefined target status).

It is also noted that, from the point of view of extracting HEPs from incident reports, determining a task taxonomy (i.e., the catalog of task types) is very important because HRA practitioners require diverse HEPs for different tasks depending on the HRA method they want to employ. The best solution would be to provide a unique set of HEPs for the specific task types involved in each HRA method. Unfortunately, it is problematic to prepare such a unique list of HEPs for every dedicated HRA method. In contrast, if there is a common task taxonomy in which the task types considered in diverse HRA methods are soundly covered, it is more realistic to extract HEPs along with it.
This implies that the task types and the associated human error modes, which will be used to analyze the operational experience of domestic NPPs, should be compatible with those of existing HRA methods. For this reason, the task taxonomy summarized in Table 2 is adopted in this study because it was developed through an intensive review of the human error modes and the associated task types involved in popular HRA methods, such as THERP (Technique for Human Error Rate Prediction), ASEP (Accident Sequence Evaluation Program), K-HRA (Korea standard Human Reliability Analysis), SPAR-H (Standardized Plant Analysis Risk – Human reliability analysis), HEART (Human Error Assessment and Reduction Technique), HCR (Human Cognitive Reliability), and CBDT (Cause Based Decision Tree) (Y. Kim et al., 2015).
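For illustration only, a fragment of Table 2 could be encoded as a simple mapping that supports the counting of similar task types in the fourth step of the framework described below; the subtask names follow Table 2, while the data structure and function are assumptions of this sketch rather than part of the published taxonomy.

```python
# Subtask type -> applicable error modes (partial, reconstructed from Table 2).
TAXONOMY = {
    "Verifying alarm occurrence": {"EOO", "EOC"},
    "Reading simple value": {"EOO", "EOC"},
    "Entering step in procedure": {"EOO"},
    "Manipulating dynamically": {"EOO", "EOC", "WDEV", "WDIR", "WQNT"},
    "Unauthorized control": {"EOC"},
}

def count_similar_tasks(procedure_tasks: list[str], subtask_type: str) -> int:
    """N_T: number of tasks in a procedure that belong to the same subtask type."""
    return sum(1 for task in procedure_tasks if task == subtask_type)

# Hypothetical AOP content classified with the taxonomy:
aop1 = ["Verifying alarm occurrence", "Reading simple value",
        "Verifying alarm occurrence", "Manipulating dynamically"]
print(count_similar_tasks(aop1, "Verifying alarm occurrence"))  # 2
```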


The third step is to calculate a procedure opportunity (N_S). In the case of a procedure performed with a fixed time interval, it is easy to calculate its opportunity along with the three principal rules illustrated in Fig. 2. Similarly, with the response strategy for off-normal conditions outlined in Fig. 1, a procedure opportunity pertaining to the Emergency operation can be soundly calculated. Let us consider human operators who were asked to manually shut down a reactor. In this case, it is mandatory for the human operators to follow the RT procedure. Accordingly, if an error has occurred in the course of conducting one of the tasks prescribed in the RT procedure, it is possible to reasonably determine its opportunity by scrutinizing operational experience data, such as the number of manual shutdowns successfully accomplished in the past without any human errors.

Unfortunately, when a human error has occurred during the performance of a task involved in a specific AOP, it is not easy to directly calculate a procedure opportunity because we need to know how many times the corresponding abnormal event was experienced in the history of a certain NPP. In this regard, Fig. 5 shows the basic idea of a procedure opportunity calculation with respect to a hypothetical system. As can be seen from Fig. 5, the main function of the hypothetical system, comprised of four components (Oil reservoir, Oil pump, Electrical power source, and Valve), is to deliver lubrication oil to a cooling fan, whose failure results in a major loss of MWe. Accordingly, there is a dedicated AOP for coping with the abnormal event of losing the lubrication oil. Here, since all four components are linked in a serial fashion, the loss of lubrication oil will happen whenever any one of these components fails. This means that the number of lubrication oil losses can be reasonably surmised by investigating how many times the four components have failed. If so, it is possible to calculate the representative frequency of the abnormal event (i.e., the loss of the lubrication oil) over a given period of time.

Once the representative frequency of a specific abnormal event is identified, it is possible to determine how many times a specific AOP has been carried out in the past (i.e., a procedure opportunity). In this light, it is very important to point out that the representative frequencies of abnormal events, which are determined by scrutinizing various kinds of component failure data with respect to the associated abnormal events, are available for domestic NPPs (KHNP, 2011). Table 3 shows a part of these representative frequencies for selected abnormal events, which are rated on a 5-point scale. For example, it is expected that human operators will use the AOP related to the abnormal event of Turbine generator trip once a year, while that of Spurious generation of safety injection actuation signal will probably be used once in ten years. Based on these representative frequencies, in this study, it is assumed that the annual frequency of each abnormal event can be translated as epitomized in Table 4.
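The following Python sketch illustrates the Fig. 5 idea with hypothetical per-component failure frequencies; under a rare-event approximation for a series configuration, the abnormal event frequency is approximated by the sum of the component failure frequencies, and multiplying it by the TFP gives an estimate of how often the dedicated AOP has been entered.

```python
# Hypothetical failure frequencies (events/year) for the four series components of Fig. 5.
component_failures_per_year = {
    "oil_reservoir": 0.02,
    "oil_pump": 0.10,
    "electrical_power_source": 0.05,
    "valve": 0.08,
}

# Series system: the lubrication oil is lost whenever any one component fails,
# so the event frequency is approximately the sum of the component frequencies.
oil_loss_frequency = sum(component_failures_per_year.values())  # events/year
tfp_years = 10.0                                                 # assumed time at full power
procedure_opportunity = oil_loss_frequency * tfp_years           # expected AOP entries (N_S)
print(oil_loss_frequency, procedure_opportunity)
```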

Fig. 5. A hypothetical system to deliver lubrication oil.

Table 3. A part of the representative frequencies; adopted from KHNP (2011).

ID | Abnormal event | Frequency(a)
1 | Turbine generator trip | 5
2 | Main feedwater pump trip | 5
3 | Closure of main steam isolation valve | 4
4 | Turbine high vibration | 3
5 | Spurious generation of safety injection actuation signal | 2
6 | Steam generator tube leak | 1

(a) The meaning of each representative frequency is as follows. 5: the occurrence interval is less than 1 year; 4: between 1 and 2 years; 3: between 2 and 5 years; 2: between 5 and 10 years; 1: greater than 10 years.

Table 4. Annual frequency translated based on the corresponding representative frequency.

Frequency(a) | Annual frequency
5 | 2.00/year (= 1/0.5 year)
4 | 0.67/year (= 1/1.5 year)
3 | 0.29/year (= 1/3.5 year)
2 | 0.13/year (= 1/7.5 year)
1 | 0.08/year (= 1/12.5 year)

(a) Representative frequency (refer to Table 3).

That is, in the case of representative frequency 5, the mean time of an event occurrence is assumed to be 0.5 years because an abnormal event belonging to this category should happen at least once a year. Similarly, since it is likely that an abnormal event whose representative frequency is 4 will happen at least once between 1 and 2 years, its mean occurrence time is assumed to be 1.5 years.

The fourth step is to count the number of similar task types (i.e., N_T) by analyzing the contents of the procedure related to the occurrence of the human error. That is, as exemplified in Fig. 3, the number of similar tasks included in the given procedure should be identified in order to determine a task opportunity. In this regard, the catalog of subtask types epitomized in Table 2 can be used. In other words, since it is possible to assign an appropriate subtask type to each task prescribed in the given procedure (e.g., how many tasks belonging to the category of Verifying alarm occurrence are involved in the RT procedure?), the profile of similar tasks can be soundly developed.

Once the procedure opportunity (N_S) and the number of similar task types (N_T) are calculated, the last step is to estimate a task opportunity by multiplying N_T by (N_S − 1). That is, the opportunity of a specific task is determined by considering the number of similar tasks previously conducted without any human errors. For example, let us consider a significant incident that happened in Hanul Unit 4 on October 10, 2003 (NEED, 2015). According to the investigation report of the KINS, this significant incident was initiated by the abnormal event of Turbine generator trip (NEED, 2015). Unfortunately, human operators working in the main control room (MCR) failed to properly control the feedwater flow of the steam generators (SGs), which is one of the required tasks institutionalized in the corresponding AOP of Turbine generator trip. In addition, after the automatic shutdown of the reactor triggered by the instability of the SGs' water level, the human operators failed to conduct a task included in an EOP (i.e., the RT procedure depicted in Fig. 1), which is a prerequisite for stabilizing the pressurizer pressure. As a result, two EOCs were identified in the course of conducting two tasks belonging to the category of Manipulating dynamically, one from the Abnormal operation and the other from the Emergency operation. Here, if we recall that the annual frequency of Turbine generator trip is 2.0 (refer to Tables 3 and 4), it is expected that the human operators


[Fig. 6 layout: two panels, 'Existing practice' and 'Contribution of this paper'; each panel maps the sources 'Full-scope simulator' and 'Operation experience' to the task types Type A, Type B, and Type C.]

Fig. 6. Estimating the HEPs of the Type C tasks from operational experience – its meaning.

were exposed to the corresponding Abnormal operation about eight times since the commercial operation of Hanul Unit 4, because the review of operational experience data revealed that its TFP calculated from December 1999 to October 2003 is about 3.7 years (KHNP, 2014). This implies that the human operators had successfully restored the off-normal condition induced by the abnormal event of Turbine generator trip seven times since its commercial operation. In addition, as of December 2009, it has been reported that the number of reactor shutdowns experienced in Hanul Unit 4 (both automatic and manual shutdowns) that were not caused by any initiating events resulting in significant consequences (such as a LOCA or SGTR) is nine (KHNP, 2014). This strongly suggests that the number of opportunities pertaining to the performance of the RT procedure is nine. Accordingly, if there are 30 tasks in the RT procedure that belong to the category of Manipulating dynamically, the corresponding HEP under an Emergency operation becomes 4.167E-3 (i.e., 1/(8 × 30)). Similarly, if 20 tasks of Manipulating dynamically are involved in the AOP of Turbine generator trip, the corresponding HEP under an Abnormal operation can be quantified as 7.143E-3 (i.e., 1/(7 × 20)).
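A minimal Python sketch of this worked example, using the numbers quoted above; the counts of Manipulating dynamically tasks (20 and 30) are the hypothetical values assumed in the text, and the rounding of 2.0/year × 3.7 years to eight exposures follows the text rather than a formal rule.

```python
import math

ANNUAL_FREQUENCY = {5: 2.00, 4: 0.67, 3: 0.29, 2: 0.13, 1: 0.08}  # Table 4

def hep(previous_error_free_performances: int, similar_tasks: int) -> float:
    """One observed error over (N_S - 1) x N_T task opportunities (Eq. (1))."""
    return 1.0 / (previous_error_free_performances * similar_tasks)

# Abnormal operation: Turbine generator trip AOP, representative frequency 5,
# TFP of about 3.7 years -> roughly eight exposures (2.0/year x 3.7 years = 7.4,
# taken as 8 in the text), seven of them error-free before the incident.
exposures = ANNUAL_FREQUENCY[5] * 3.7
print(math.ceil(exposures))                     # 8
print(hep(8 - 1, similar_tasks=20))             # ~7.1E-3 under the Abnormal operation

# Emergency operation: RT procedure entered 9 times (reactor shutdowns), of which
# 8 were error-free, with an assumed 30 'Manipulating dynamically' tasks.
print(hep(9 - 1, similar_tasks=30))             # ~4.2E-3 under the Emergency operation
```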

5. Discussion and conclusion

It is evident that the contribution of human errors to the safety of socio-technical systems is very critical. For this reason, it is important to provide HRA practitioners with reliable HRA data, including HEPs. Although a full-scope simulator can be used to collect valuable HRA data, it is still necessary to extract HRA data from the review of operational experience data. If so, several benefits can be expected, such as using HRA data gathered from the operational experience of domestic NPPs as reference information to clarify the appropriateness of those collected from full-scope simulators.

In this light, it is very interesting to point out that the coverage of existing HRA data collections is clearly divided (refer to Fig. 6). As already mentioned in Section 2, most of the existing practices in calculating HEPs from operational experience data have mainly focused on those related to Types A and B, because it is not easy to determine the opportunity of a task belonging to Type C. For example, all kinds of maintenance tasks that are able to affect the safety of NPPs (i.e., Type A tasks) should be carried out along with a series of strict internal processes in order to optimize their strategy, plan, and required tasks (Perla et al., 1984; IAEA, 2003; CNSC, 2012). This implies that diverse documents are available for estimating a task opportunity. Similarly, in the case of periodic tests (i.e., Type B tasks), estimating a task opportunity can be done by simple arithmetic after determining a procedure opportunity by dividing a TFP by the test interval. Accordingly, it can be said that the determination of n_i in Eq. (1) is quite straightforward for tasks belonging to Types A and B. Unfortunately, since off-normal events generally happen in a random manner, counting how many times human operators have successfully carried out the associated tasks (i.e., Type C tasks) is not easy compared to Types A and B. Consequently, as depicted in Fig. 6, there is no intersection between HEPs gathered from operational experience data and those from full-scope simulators. In this regard, in this study, a novel framework is suggested in order to soundly estimate the opportunity of Type C tasks.

It is true that the proposed framework has at least two limitations. The first one is that, since the annual frequency of each abnormal event is assumed based on the associated representative frequency, it could differ from the actual frequency. That is, due to the uncertainty of the annual frequency, the estimation of a task opportunity probably has a wide range of variation. The second limitation is the assumption of a homogeneous task environment. As briefly explained in Section 2, the underlying idea of Eq. (1) is that human operators will show similar HEPs if they have to accomplish identical tasks under a similar task environment. Conversely, if human operators are exposed to different task environments, caution is needed in estimating HEPs by using Eq. (1) because of the inappropriate aggregation of task opportunities.

However, it is strongly anticipated that the uncertainty in determining the annual frequency of an abnormal event could become tolerable if we are able to use a reliable component failure database. In other words, as depicted in Fig. 5, the uncertainty in calculating the frequency of a lubrication oil loss will be drastically reduced if we are able to use the failure frequencies of the four components extracted from a more reliable data source. In addition, it is promising to expect that the variation of HEPs pertaining to the homogeneity of task environments can be largely covered if we consider the upper and lower bounds of HEPs represented by various kinds of statistical approaches (e.g., the Bayesian update) (Preischl and Hellmich, 2013). If so, the comparison of the two kinds of HEPs, one from operational experience data and the other from full-scope simulators, could become a good source of information that allows us to clarify how to use simulator-based HRA data for conducting a practical HRA.

Another promising application is to distinguish the effect of stress (or a stressful task environment) on HEPs. For example, it is reasonable to assume that the level of stress felt by human operators under an Emergency operation is higher than that under a full power condition. Similarly, it is believed that human operators who are exposed to a real situation probably feel a higher degree of stress than those exposed to a simulated situation. This means that the catalog of HEPs obtained from operational experience data could be useful for scrutinizing the differences and/or similarities of HEPs collected from full-scope simulators. From this perspective, the results of this study seem to be meaningful because we are able to take the first step in securing a set of HEPs from operational experience data.

Acknowledgments

This work was supported by the Nuclear Research & Development Program of the National Research Foundation of Korea (NRF) grant, funded by the Korean Government, Ministry of Science, ICT & Future Planning (Grant Code: 2012M2A8A4025991).


References

Akyuz, E., 2015. Quantification of human error probability towards the gas inerting process on-board crude oil tankers. Saf. Sci. 80, 77–86.
CE Owner's Group, 1996. Combustion Engineering Emergency Response Guidance. CEN-152, Rev. 4.
Chang, J., Lois, E., 2012. Overview of the NRC's HRA Data Program and Current Activities, PSAM 11, Helsinki, Finland.
Chang, J., Bley, D., Criscione, L., Kirwan, B., Mosleh, A., Madary, T., Nowell, R., Richards, R., Roth, E.M., Sieben, S., Zoulis, A., 2014. The SACADA database for human reliability and human performance. Reliab. Eng. Syst. Saf. 125, 117–133.
Cho, S., Jiang, J., 2008. Analysis of surveillance test interval by Markov process for SDS1 in CANDU nuclear power plants. Reliab. Eng. Syst. Saf. 93, 1–13.
Canadian Nuclear Safety Commission, 2012. Maintenance programs for nuclear power plants. Regulatory Document RD/GD-210, Ottawa, Canada.
Criscione, L., Shen, S., Nowell, R., Egli, R., Chang, Y., Koonc, A., 2012. Overview of Licensed Operator Simulator Training Data and Use for HRA, PSAM 11, Helsinki, Finland.
Duy, T.D.L., Vasseur, D., Serdet, E., 2016. Probabilistic safety assessment of twin-unit nuclear sites: methodological elements. Reliab. Eng. Syst. Saf. 145, 250–261.
Evans, A.W., 2011. Fatal train accidents on Europe's railways: 1980–2009. Accid. Anal. Prev. 43, 391–401.
Farcasiu, M., Nitoi, M., 2015. The organizational factor in PSA framework. Nucl. Eng. Des. 293, 205–211.
Fleishman, E.A., Buffardi, L.C., 1999. Predicting human error probabilities from the ability requirements of jobs in nuclear power plants. In: Misumi, J., Wilpert, B., Miller, R. (Eds.), Nuclear Safety: A Human Factors Perspective. Taylor & Francis Ltd.
Ghosh, D., Roy, S., 2009. Maintenance optimization using probabilistic cost-benefit analysis. J. Loss Prev. Process Ind. 22, 403–407.
Gibson, W.H., Hickling, B., Kirwan, B., 2006. Feasibility Study into the Collection of Human Error Probability Data. Eurocontrol Experimental Center, EEC Note No. 02/06.
Hallbert, B., Boring, R., Gertman, D., Dudenhoeffer, D., Whaley, A., Marble, J., Joel, J., Lois, E., 2006. Human Error Repository and Analysis (HERA) System – Overview, NUREG/CR-6903, vol. 1. US Nuclear Regulatory Commission, Washington, D.C.
Hirschberg, S., Dang, V.N., 1996. Critical Operator Actions: Human Reliability Modeling and Data Issues. Final Task Report for Principal Working Group 5, Task 94-1, OECD/NEA.
Hughes, B.P., Newstead, S., Anund, A., Shu, C.C., Falkmer, T., 2015. A review of models relevant to road safety. Accid. Anal. Prev. 74, 250–270.
International Atomic Energy Agency, 1995. Models and Data Requirements for Human Reliability Analysis, IAEA-TECDOC-499, Vienna, Austria.
International Atomic Energy Agency, 1995. Human Reliability Analysis in Probabilistic Safety Assessment for Nuclear Power Plants, IAEA-SS-50-P-10, Vienna, Austria.
International Atomic Energy Agency, 1998. Collection and Classification of Human Reliability Data for Use in Probabilistic Safety Assessment, IAEA-TECDOC-1048, Vienna, Austria.
International Atomic Energy Agency, 2000. Management of Aging of I&C Equipment in Nuclear Power Plant, IAEA-TECDOC-1147, Vienna, Austria.
International Atomic Energy Agency, 2003. Guidance for Optimizing Nuclear Power Plant Maintenance Programmes, IAEA-TECDOC-1383, Vienna, Austria.
International Atomic Energy Agency, 2004. Management of Life Cycle and Aging at Nuclear Power Plants: Improved I&C Maintenance, IAEA-TECDOC-1402, Vienna, Austria.
Isaac, A., Shorrock, S.T., Kirwan, B., 2002. Human error in European air traffic management: the HERA project. Reliab. Eng. Syst. Saf. 75 (2), 257–272.
Khalaquzzaman, M., Kang, H.G., Kim, M.C., Seong, P.H., 2011. Optimization of periodic testing frequency of a reactor protection system based on a risk-cost model and public risk perception. Nucl. Eng. Des. 241, 1538–1547.
Korea Hydro and Nuclear Power Company, 2011. The Catalog of Education and Training Contents with Respect to Plant Operations: YGN Unit 2 and 3, 201150011863-0782 (Written in Korean).
Korea Hydro and Nuclear Power Company, 2014. Operating Status of Korean Nuclear Power Plants (Written in Korean).
Kim, Y., Kim, J., 2015. Identification of human-induced initiating events in the low power shutdown operation using the commission error search and assessment method. Nucl. Eng. Technol. 47 (2), 187–195.
Kim, H., Lee, S.H., Park, J.S., Kim, H., Chang, Y.S., Heo, G., 2015. Reliability data update using condition monitoring and prognostics in probabilistic safety assessment. Nucl. Eng. Technol. 47 (2), 204–211.
Kim, Y., Park, J., Kim, S., Choi, S.Y., Jung, W., Jang, I., 2015. Task analysis of emergency operating procedures for generating quantitative HRA data. In: Transactions of the Korean Nuclear Society Autumn Meeting, October 29–30, Gyeongju, Korea.
Kirwan, B., Basra, G., Taylor-Adams, S.E., 1997. CORE-data: a computerized human error database for human reliability support. In: Proceedings of the IEEE Sixth Annual Human Factors Meeting, Orlando, Florida, pp. 7–12.
Lederman, L., 1988. Accident sequences sensitive to human errors. Reliab. Eng. Syst. Saf. 22, 269–276.
Lee, S.J., Jung, W., Yang, J.E., 2015. PSA model with consideration of the effect of fault-tolerant techniques in digital I&C systems. Ann. Nucl. Energy 87 (Part 2), 375–384.
Li, K., Wieringa, P.A., 2000. Understanding perceived complexity in human supervisory control. Cogn. Technol. Work 2, 75–88.
List25, 2015. 25 Biggest Man Made Environmental Disasters in History.
Moieni, P., Spurgin, A.J., Singh, A., 1994. Advances in human reliability analysis methodology. Part 1: Framework, models and data. Reliab. Eng. Syst. Saf. 44, 27–55.
Mosleh, A., 2014. PRA: a perspective on strengths, current limitations, and possible improvements. Nucl. Eng. Technol. 46 (1), 1–10.
Nuclear Energy Agency, 1988. The human factor in nuclear power plant operation. NEA issue brief – an analysis of principal nuclear energy issues, no. 2, Vienna, Austria.
Nuclear Energy Agency, 2008. HRA data and recommended actions to support the collection and exchange of HRA data. Report of Working Group on Risk Assessment, NEA/CSNI/R(2008)9, Vienna, Austria.
Nuclear Event Evaluation Database, 2015.
Park, J., Jung, W., 2004. A study on the systematic framework to develop effective diagnosis procedures of nuclear power plants. Reliab. Eng. Syst. Saf. 84, 319–335.
Park, J., Jung, W., 2007. OPERA – a human performance database under simulated emergencies of nuclear power plants. Reliab. Eng. Syst. Saf. 92, 503–519.
Park, J., Jung, W., 2015. A systematic framework to investigate the coverage of abnormal operating procedures in nuclear power plants. Reliab. Eng. Syst. Saf. 138, 21–30.
Park, J., Kim, J., Jung, W., 2004. Comparing the complexity of procedural steps with the operators' performance observed under stressful conditions. Reliab. Eng. Syst. Saf. 83, 79–91.
Park, J., Jung, W., Kim, J., Ha, J., 2005. Analysis of Human Performance Observed under Simulated Emergencies of Nuclear Power Plants, KAERI/TR-2895. Korea Atomic Energy Research Institute, Daejeon, Republic of Korea.
Pasquale, V.D., Miranda, S., Iannone, R., Riemma, S., 2015. A simulator for human error probability analysis (SHERPA). Reliab. Eng. Syst. Saf. 139, 17–32.
Perla, H.F., Sattison, M.B., Stampelos, J.G., Gekler, W.C., 1984. A Guide for Developing Preventive Maintenance Programs in Electric Power Plants, EPRI-NP-3416. Electric Power Research Institute, Palo Alto, California.
Preischl, W., Hellmich, M., 2013. Human error probabilities from operational experience of German nuclear power plants. Reliab. Eng. Syst. Saf. 109, 150–159.
Reason, J., 2000. Human error: models and management. Br. Med. J. 30, 768–771.
Reece, W.J., Gilbert, B.G., Richards, R.E., 1994. Nuclear Computerized Library for Assessing Reactor Reliability (NUCLARR), NUREG/CR-4639. US Nuclear Regulatory Commission, Washington, D.C.
Reer, B., 2004. Sample size bounding and context ranking as approaches to the HRA data problem. Reliab. Eng. Syst. Saf. 83, 265–274.
Reer, B., Sträter, O., 2014. A case study on addressing the error forcing context in human reliability analysis. Int. J. Perform. Eng. 10 (7), 717–727.
Stanton, N., 2005. Simulator: a review of research and practice. In: Stanton, N. (Ed.), Human Factors in Nuclear Safety. Taylor & Francis, London.
Stassen, H.G., Johannsen, G., Moray, N., 1990. Internal representation, internal model, human performance model and mental workload. Automatica 26 (4), 811–820.
Sträter, O., 2000. Evaluation of Human Reliability on the Basis of Operational Experience (Ph.D. Dissertation). Economics and Social Science, Munich Technical University, Munich, Germany.
Takano, K., Reason, J., 1999. Psychological biases affecting human cognitive performance in dynamic operating environments. J. Nucl. Sci. Technol. 36 (11), 1041–1051.
Taylor-Adams, S., Kirwan, B., 1997. Human reliability data requirements. Disaster Prev. Manage. 6 (5), 318–335.
US Nuclear Regulatory Commission, 2011. Risk Assessment of Operational Events Handbook – Shutdown Event (vol. 4).
Vaurio, J.K., 2009. Human factors, human reliability and risk assessment in license renewal of a nuclear power plant. Reliab. Eng. Syst. Saf. 94, 1818–1826.
Westinghouse Owner's Group, 1987. Emergency Response Guidance – High Pressure Volume, Rev. 1A.
Zio, E., Bazzo, R., 2009. Optimization of the test intervals of a nuclear safety system by genetic algorithms, solution clustering and fuzzy preference assignment. Nucl. Eng. Technol. 42 (4), 414–425.