Reliability Engineering and System Safety 148 (2016) 44–56
Human error probabilities from operational experience of German nuclear power plants, Part II

Wolfgang Preischl (a), Mario Hellmich (b,*)

(a) Gesellschaft für Anlagen- und Reaktorsicherheit (GRS) mbH, Forschungszentrum, Boltzmannstraße 14, 85748 Garching, Germany
(b) Bundesamt für Strahlenschutz (Federal Office for Radiation Protection), Willy-Brandt-Straße 5, 38226 Salzgitter, Germany
(*) Corresponding author. E-mail addresses: [email protected] (W. Preischl), [email protected] (M. Hellmich).
Article history: Received 13 August 2015; received in revised form 3 November 2015; accepted 7 November 2015; available online 27 November 2015.

Keywords: Human reliability assessment; Human reliability data; Probabilistic risk assessment; Operational experience

Abstract

This paper is a continuation of an earlier publication (Preischl et al., Reliab Eng Syst Saf 2013;109:150–9) and presents the second part of a project aimed to collect human reliability data from the operational experience of German nuclear power plants. We employ a method which utilizes the German licensee event reporting system to gather the data. In this way, in addition to the data already presented in the previous paper, another 30 estimates for human error probabilities (HEP) are obtained. Moreover, a new method to access parts of the operational experience below the notification threshold of the German event reporting system is described. This method is demonstrated in cooperation with a reference nuclear power plant, resulting in 18 additional HEP estimates. As a result of both projects altogether 74 usable HEP estimates for a wide variety of tasks were derived. Notably, a number of them concern memory related or cognitive errors. A comparison with the THERP database shows that for 48 of these HEP estimates THERP provides no data, whereas in the 26 cases where THERP proposes a HEP it agrees with our data in all but eight cases.

© 2015 Elsevier Ltd. All rights reserved.
1. Introduction

This paper is a continuation of an earlier publication [25]. It presents the results of the second part of a project aimed at inferring human reliability data from the operational experience of German nuclear power plants. In this part of the project [24], as well as in its predecessor [23], the German licensee event reporting system is utilized to gather human reliability data, in order to both validate and extend existing databases. Both projects were funded by the German federal nuclear regulator, the Federal Ministry for the Environment, Nature Conservation, Building and Nuclear Safety, and conducted by Gesellschaft für Anlagen- und Reaktorsicherheit (GRS) under the technical supervision of Bundesamt für Strahlenschutz (BfS).

As in [25], we apply a specifically designed method to infer human error probabilities from licensee event reports corresponding to events which have certain properties. In the present paper, besides supplying another 24 human error estimates for a wide variety of tasks, we describe a new method to access parts of the operational experience below the notification threshold, i.e. for tasks in which an error would lead to a reportable event, but which have not yet produced an event. This
method is demonstrated in cooperation with a reference nuclear power plant, and 18 human error estimates obtained in this way by zero failure estimation are presented. The scarcity of validated and traceable human reliability data is frequently quoted as a major problem in human reliability analysis (HRA), for both the development of new HRA methodologies and for applications in the context of probabilistic risk assessment (PRA). See [22] for a thorough recent literature review concerning the HRA data problem, and [31,2] for a historical perspective. Due to the shortage of relevant human reliability data, several data collection efforts have taken place in the past, where in some cases “HRA data” is understood as any relevant operational experience and is not confined to just human error probabilities for specific tasks; moreover, some databases collect data from various sources and not only from operational experience of nuclear power plants. Note that in most cases the data is not publicly available. General references concerning the problem of HRA data collection and database requirements are [31,29,16,17,14]. Some recent examples of data collection efforts in the nuclear sector are the SACADA database by the US NRC [4], which is intended as a long term data collection program, the Computerized Operator Reliability and Error Database (CORE-DATA) [11,12], supported by the UK Health and Safety Executive, which is used in conjunction with the NARA HRA methodology [19] in the UK, the older Nuclear Computerized Library for Assessing Reactor Reliability (NUCLARR) by the US NRC [10], which is reported to contain more than 2000 task based
human error estimates [3], and the Operator Performance and Reliability Analysis Database (OPERA) developed by the Korean Atomic Energy Research Institute (KAERI) [20], which includes data from real operational experience as well as simulator data. See [22] for further examples and discussion. Moreover, many HRA methods propose their own data, such as THERP [30], with its database consisting of over 100 human error probabilities (HEP). In spite of these data collection efforts and the data already available the importance of inferring data from actual plant experience remains, as has been repeatedly pointed out in the literature [8,29,18,31].

Since the German guideline for PRA in the context of periodic safety reviews of nuclear power plants recommends THERP as the primary method to be employed in the HRA part [5], the main goal of the data collection projects reported in the present paper and in [25] is to check whether the THERP database is in accordance with the operational practice of German nuclear power plants, and moreover, to extend it by HEP estimates for tasks for which THERP provides no data. Nevertheless, the data we obtain is not specific to any HRA method.

Summarizing the results of the two research projects [23,24], we contribute 67 HEP estimates from samples that were generated using actual operational experience (however, in ten of them the sample size is considered to be too small for precise data generation, and in one case the sample is probabilistically trivial). Moreover, another 18 samples from operational experience below the notification threshold are reported, resulting in a total of 74 usable HEP estimates. The samples cover a wide variety of tasks; notably, a number of them are memory related or involve cognitive errors, for which THERP does not provide data. For altogether 48 samples THERP provides no data, and in those 26 cases in which THERP proposes a HEP estimate, it agrees with our data within the uncertainty bounds in all but eight cases (in most of them the disagreement is slight, and THERP deviates in the conservative direction, see also the remarks on the comparison in Section 4). Hence we conclude that the THERP database is generally in good agreement with the German operational practice, at least for those tasks for which samples are available.

This paper is organized as follows. Section 2 briefly recalls the statistical method that is used to estimate human error probabilities from the sample data (the number of errors and the number of opportunities for errors). Moreover, uncertainty analysis of the data is addressed, which is not only of theoretical interest but also relevant for application of the data in PRA studies. In Section 3 our data source, the German licensee event report system, is introduced: Section 3.1 briefly recalls the method that is employed to gather HRA data from reportable events and, in particular, how it is possible to infer the number of opportunities for errors (the so-called "denominator problem" [22]) for certain tasks, whereas Section 3.2 details the method that was developed to access parts of the operational experience below the notification threshold of the event report system. Section 4 presents the altogether 48 samples obtained in the present project in the form of data tables. A detailed discussion and interpretation of the results of the present project, taking into account also our earlier results [25], is provided in Section 5. The paper closes with concluding remarks in Section 6.
2. Statistical inference of human error probabilities and uncertainty analysis

In this section we briefly describe the method of statistical inference used to analyze the samples and to infer human error probability (HEP) estimates from the sample data. A more detailed description was given in [25].
Recall that the HEP is the probability that "when a given task is performed, an error will occur" [30]. Thus, according to the relative frequency interpretation of probability, for a particular task labeled by i the corresponding HEP can be estimated by

HEP_i ≈ m_i / n_i,    (1)

where n_i is the number of times the task i was performed, and m_i is the number of errors that occurred. Clearly, the parameter HEP_i always lies in the interval [0, 1]. A more powerful inference method than simply taking (1) as an estimator for HEP_i is Bayesian analysis [7], which, for the purpose of the present application, is now briefly recalled.

As with all input data for PRA, uncertainty analysis should also be done for human reliability data, and the corresponding uncertainty should be appropriately propagated through the PRA model. To perform uncertainty analysis for the HEP estimates, two conceptually different uncertainty contributions can be distinguished. First, according to the relative frequency interpretation it is an underlying assumption that every individual has a certain error probability at a particular time, given a particular task to be performed under given circumstances [6]. Thus, there is a variability of HEP_i both with the individual selected from the population of shift personnel and due to the temporal variability of the individual's fitness for duty (e.g. during night shift), or due to the variability of the boundary conditions under which the task is performed. This variability can be modeled by considering HEP_i as a random variable with a distribution concentrated on the interval [0, 1]. In the context of PRA, this variability becomes manifest as an uncertainty about HEP_i since it cannot be known in advance which individual of the population is in charge of the task to be performed at the (random) time the PRA initiating event occurs.

It was argued in [25] that a beta distribution (with suitably chosen parameters) should be used to describe this variability, since this distribution is unimodal (with only a single maximum, reflecting the fact that among the population of operators one usually cannot find two or more subgroups with grossly different performance levels) and in general unsymmetrical, as well as being concentrated on the interval [0, 1]. Recall that the beta distribution depends on two parameters α, β > 0; it is absolutely continuous and is given by the density

f_{α,β}(x) = x^{α−1} (1−x)^{β−1} / B(α, β)  for x ∈ (0, 1),  and  f_{α,β}(x) = 0  otherwise,    (2)

where B(α, β) is the beta function. The graph of (2) has a bell-shaped appearance for parameter ranges of interest in the present application.

There is another contribution to the uncertainty of HEP_i, attributable to our limited knowledge about the "system" under consideration (the operator in the socio-technical context of the power plant), due to the limited amount of data on which the estimation of HEP_i is based: the so-called epistemic uncertainty. According to the Bayesian interpretation of probability as a "degree of belief" this uncertainty can also be modeled by a probability distribution, which we choose from the beta family as well. Consequently, samples of a smaller size n_i with a larger epistemic uncertainty tend to have a beta distribution with a greater variance (i.e. a wider bell-shaped density curve), corresponding to larger uncertainty intervals for the single point HEP estimates.

We remark that, in the present paper and in [25], due to the way the samples are taken they reflect the performance of a group of operators in charge of a specific activity and summarize the performance differences of the individuals in the group. However, the HEP variability due to individual differences or randomly
selected boundary conditions is not known and not described by the samples, and the uncertainty analysis performed in this paper solely refers to the epistemic uncertainty. For PRA applications of the HRA data presented here or in [25], wider uncertainty bounds should be assigned to account for the HEP variability due to individual performance differences. For this purpose generic task specific error factors provided in the literature [30,8], or expert judgment, can be used.

We now describe the inference method based on Bayesian analysis. Consider some task i and assume that the corresponding HEP is a fixed (i.e. nonrandom) value θ_i ∈ [0, 1]. If the task is performed independently n_i times, the number of observed errors m_i ∈ {0, …, n_i} follows a binomial distribution with parameters n_i and θ_i, with probability mass function p(m_i | θ_i). Now according to the Bayesian paradigm we assume that θ_i is a random variable. In consideration of the above discussion we assume that θ_i has a beta distribution with parameters α_0 and β_0, the so-called prior distribution. Then it can be shown that the so-called posterior distribution p(θ_i | m_i), i.e. the distribution of θ_i conditional on the observed data m_i and n_i, is beta as well, but with the parameters replaced as follows:

α_0 → α_0 + m_i,    β_0 → β_0 + n_i − m_i.    (3)
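The update (3) follows from the standard beta–binomial conjugacy argument; a one-line sketch (dropping the task index i for brevity) reads

p(θ | m) ∝ p(m | θ) p(θ) ∝ [θ^m (1−θ)^{n−m}] · [θ^{α_0−1} (1−θ)^{β_0−1}] = θ^{(α_0+m)−1} (1−θ)^{(β_0+n−m)−1},

which is the kernel of a beta density with parameters α_0 + m and β_0 + n − m.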
This process is called Bayesian updating. From the posterior distribution we calculate our single point HEP estimate, which is chosen as the 50% quantile q50 in the present paper. Uncertainty intervals, reflecting the epistemic uncertainty about HEP_i as described above, are obtained from the 5% and 95% quantiles q5 and q95. All quantiles are calculated by numerical integration from the density (2) of the beta distribution. The two parameters α_0 and β_0 in the prior distribution are chosen as α_0 = β_0 = 1/2. This corresponds, according to Jeffreys' rule [7], to a noninformative prior, which is equivalent to the assumption that we know nothing about the HEP value in advance.

We remark that with Bayesian analysis it is possible to do estimation from zero failure data, which is the case when the number m_i of observed errors in n_i opportunities is zero, i.e. m_i = 0. The significance of such a zero failure sample is the same as for any other number of observed errors m_i between 1 and n_i. Especially in this case Bayesian inference is much more meaningful than the empirical relative frequency estimator (1), which simply gives a HEP value equal to 0; moreover, the Bayesian method provides epistemic uncertainty intervals depending nontrivially on the sample size n_i.
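As an illustration, the following minimal sketch computes the posterior quantiles q5, q50 and q95 from a sample (m_i, n_i) with the Jeffreys prior. It assumes Python with SciPy; the helper function and its name are illustrative only and not part of the original study. For a zero failure sample with n_i = 2010 it reproduces values of the order reported in the tables of Section 4.

```python
from scipy.stats import beta

def hep_posterior(m, n, alpha0=0.5, beta0=0.5):
    """Posterior quantiles of the HEP for m observed errors in n opportunities,
    using the Jeffreys Beta(1/2, 1/2) prior and the update rule (3)."""
    a = alpha0 + m          # posterior alpha
    b = beta0 + n - m       # posterior beta
    q5, q50, q95 = beta.ppf([0.05, 0.50, 0.95], a, b)
    return q5, q50, q95     # [q5, q95] is the epistemic uncertainty interval

# Zero failure sample, e.g. m_i = 0, n_i = 2010 (cf. Tables 2, 4 and 6):
print(hep_posterior(0, 2010))   # q50 ~ 1.1e-4, interval roughly [1e-6, 9.6e-4]
# Sample from a reportable event, e.g. 1 error in 23 opportunities (sample 38):
print(hep_posterior(1, 23))     # q50 ~ 5e-2
```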
3. The data source

The HEP estimates presented in this paper originate from two different data sources: on the one hand, we employ the German licensee event reporting system, and on the other hand, we use operational experience below the notification threshold of the event reporting system together with zero failure inference.

3.1. Human error probabilities from reportable events

The method used to obtain human reliability data from the event reporting system has been described in detail in [25]; here we only give a brief summary. Significant events occurring in German nuclear installations have to be reported to the competent authorities if the notification criteria laid down in the Atomrechtliche Sicherheitsbeauftragten- und Meldeverordnung (AtSMV) [1] are fulfilled. After an event has been reported it is analyzed and documented, and the event documentation is stored in the database BEVOR. Since the event reporting system is not designed to support data collection, some
problems have to be addressed which prevent the direct use of the database for this purpose. These problems are listed below; the approach to address them was developed by GRS [13] and was applied both in the present project and its predecessor [25]:
Problem 1: Not every human error has to be reported. If the consequences of an error remain below the notification threshold it will not show up in the database. Hence, in general, the number of errors m_i which occurred while performing a certain task i is unknown.
Solution: Only those tasks are taken into account which, in case an error occurs, lead to an unambiguously detectable consequence which necessarily produces a reportable event. Thus m_i can be inferred from the database. In general, to determine the reportability of a certain task detailed on-site investigations are needed.

Problem 2: The database contains no information on how often a certain task i was performed successfully in the past. Thus, the number of times n_i the task i was performed is unknown.
Solution: Only those events from the database are considered which involve human errors in tasks for which it can be determined how often they were performed in the past. These are mainly tasks occurring in the context of periodic testing and maintenance, or which are specific to certain plant operational modes (e.g. start-up and shutdown). To determine n_i, plant regulations and records of operational history have to be analyzed.

Problem 3: Relevant performance shaping factors are not always reported in the database.
Solution: If performance shaping factors (PSF) cannot be determined with sufficient accuracy from the database, they are identified by retrospective investigation. This involves plant visits and interviews with on-site experts and possibly with the personnel in charge when the error occurred. In general, it can be very laborious and time consuming to assess the relevant PSFs.

Notice that the quantities m_i and n_i, as well as the HEP estimate inferred from them, are conditional on the relevant PSFs for the task under consideration. Of course, constant notification criteria over time are an important prerequisite for a statistical analysis. The last major change of the notification criteria was in 1985; since then the criteria have been sufficiently homogeneous. However, in certain cases the examination time window can be enlarged if the error would have led to a reportable event under the old notification criteria as well. Moreover, when determining the number of opportunities n_i for an error it has to be taken into account that the licensee may have made organizational or technical changes which would inhibit a recurrence of the error; in that case, counting the number of opportunities has to be stopped after the change was implemented.

Due to the large number of events stored in the BEVOR database (currently more than 6000), as a first step a screening process was carried out to identify those events which involve a human error in some well defined task that presumably was the direct cause of reportability, and for which the number n_i presumably can be determined. In this way 126 events were identified. In the previous project, which closed in 2010, altogether 77 of these events were treated, and 45 were subjected to a detailed analysis and finalized. This led to 37 HEP estimates which are reported in [25]. During detailed analysis, only eight of the 45 events turned out to be unusable for data generation. Since the retrospective analysis was more time consuming and resource intensive than anticipated (a large number of the events occurred many years ago), the remaining 81 events had to be relegated to a successor project which closed out in early 2013 and which is reported in the
present paper. During detailed analysis it turned out that 30 of these 81 events are usable for data generation; the results are reported in the present paper. In this way altogether 67 HEP estimates for a wide variety of tasks performed in various locations (main control room, local control stations, switchgear rooms, machine shop, emergency diesel generator room, etc.), and occurring during various plant operational modes (start-up, power operation, shutdown, and standby) were generated. Events from almost all German nuclear power plants, older and newer ones as well as pressurized water reactor plants and boiling water reactor plants, some of which are now under decommissioning, are represented in the samples.

Summarizing, we found that roughly 47% of the 126 events identified in the screening process turned out to be unusable for quantification: in 15% of the 126 events the unusability was due to the plant in question being under decommissioning, so that the relevant information and plant personnel were no longer available, and in about 6% it was not possible to establish a cooperation with the plant until the project close-out, due to organizational reasons. In the remaining 26% of the events statistical data could not be generated because it was not possible to address all of the above three problems in an unambiguous way.

3.2. Human error probabilities from operational experience below the notification threshold

Of the many tasks performed in a nuclear power plant, only a small fraction is safety relevant to a degree that a human error in them will have consequences that lead to a reportable event. If errors have occurred in performing these tasks in the past, the corresponding reportable events can be analyzed by the method of Section 3.1 to generate human reliability data, provided the tasks have the attributes explained in Section 3.1. However, if no errors have occurred yet, no trace of the task will be present in the reportable event database, so this "operational experience below the notification threshold" is not accessible by the method of Section 3.1. Nevertheless, as was explained in Section 2, by way of zero failure estimation, Bayesian analysis even then allows us to derive human reliability data if there has been a sufficiently large number of opportunities for errors. Therefore a method has been developed which allows exploiting parts of this operational experience.

The goal is to identify countable tasks which demonstrably have been performed error free in the past, but for which an error would lead to a reportable event, and for which relevant information about performance shaping factors as described in Section 3.1 can be obtained. More precisely, we are trying to identify tasks which have the following properties:
– It can be determined how often the task has been performed in the examination time window (whereas it is immaterial whether the task is performed in regular or irregular time intervals).
– Sufficient information about relevant PSFs can be gathered, and it can be ascertained that the conditions under which the task is performed (as determined by procedures, regulations, human machine interfaces, tools, etc.) are reasonably constant during the examination time window.
– An error in performing the task has consequences which necessarily lead to a reportable event.
– No reportable events due to the task in question have occurred during the examination time window.
In order to identify tasks with the above properties a detailed guidance has been set up and documented in [24]; here we will only briefly mention some key points. Due to the large number of
tasks potentially eligible for analysis it is necessary to proceed in steps: In the first step, general work areas are identified where procedures are performed repeatedly that presumably contain one or more of the tasks with the above properties. Some general prospective work areas to look for such procedures have been identified in [24]. Possible screening criteria for a search are the repeated execution of the procedure, the possibility to anticipate which notification criteria may be fulfilled in the case of errors, and the accessibility of information on past changes in procedures, test instructions, etc., or the manner in which the work is performed. Moreover, the aspect of countability should also be considered already at this stage. In the next step the nature and sequence of the individual tasks performed by the operators when carrying out the procedure under consideration are determined, together with information about relevant PSFs. Besides the study of plant documentation and written procedures, on-site observations and interviews are necessary to complete this step. Key questions considered in this investigation are:
– Which activities are actually performed by the operators, and which are not performed?
– Which activities are performed instead of activities required by written procedures?
– Which activities are performed beyond the explicit requirements of written procedures (e.g. when do operators have to apply their technical expertise and training in order to fill in gaps which are not part of a written procedure)?

Moreover, a possible variability in the PSFs is to be documented as well. This process includes a conventional task analysis as ordinarily performed in human reliability analysis.

The following step determines those tasks for which an error would necessarily lead to a reportable event. For this purpose plausible errors (e.g. errors of omission, timing errors, sequence errors, "too much", and "too little") are assumed for each of the tasks determined in the task analysis of the previous step, and their consequences are evaluated. During this process, the influence of the errors on other tasks in the procedure is analyzed as well, and possible recoveries are identified, along with an investigation of dependencies between operators. This approach corresponds to the conventional construction of HRA event trees as described in [30]. In the resulting event tree, all event sequences must be identified which contain a single identifiable human error in a particular task and which lead to reportable consequences. Now if such an event sequence has not yet led to a reportable event, it can be concluded that the task in question has been performed error free in the past. Finally, the plant operational history is reviewed to count the number of times the procedure (and hence the task in question) was performed during the examination time window.

In the final step all information is summarized. The result is a list of tasks which have been performed under similar conditions and circumstances without errors, for which the relevant PSFs are documented, and for which it is known how often they have been performed under similar circumstances and PSFs. Now the tasks can be submitted to a zero failure analysis to generate a HEP estimate.

This method to access a part of the operational experience below the notification threshold has been demonstrated in cooperation with a reference nuclear power plant. Tasks to be considered for a statistical analysis were searched in procedures belonging to the following areas:
– plant start-up,
– periodic inspection of the emergency power supply diesel generators (full-load test of the diesel generator, simulated failure of the emergency bus bar, test of trip criteria of the diesel generator unit protection system),
– periodic inspection of the containment lining gap air extraction system.
During the preparatory work for this investigation the reference plant provided sufficient written material (procedures, test instructions, schematic diagrams, etc.). On this basis and in cooperation with plant staff, a detailed task analysis of the procedures was performed, and the procedures were analyzed with regard to plausible errors, their consequences and possible reportability, and the presence of error forcing factors, along the four steps explained above. All information collected in this way was verified on-site by a thorough walk-through. Moreover, interviews with the plant staff in charge of the tasks were conducted and the potential for errors was discussed. Where possible, tests and inspections were attended and observed; alternatively, the procedure was talked through on location with the plant staff in charge. In this way altogether 18 HEP estimates were generated by zero failure analysis; they are reported in the next section.

During the detailed on-site investigations it turned out that the tasks performed in the context of the periodic inspection of the containment lining gap air extraction system could not be used for data generation: even though initially several candidate tasks were identified, the detailed on-site verification indicated that a postulated error either would not lead to a reportable event, or would lead to a reportable event only in conjunction with other errors. Hence it could not be ascertained that the corresponding tasks have been performed error free in the past, and no zero failure samples could be established. Nevertheless, this experience underlines the importance of on-site investigations and the assistance of competent plant staff.
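To make the zero failure quantification concrete, the following short sketch (again assuming Python with SciPy; an illustration, not the tooling used in the project) evaluates the Jeffreys posterior for some of the opportunity counts occurring in the zero failure samples of Section 4, and also shows the pooling of two samples observed under identical conditions, as done for samples N8 and N11 in Section 5.2.1.

```python
from scipy.stats import beta

def zero_failure_hep(n):
    """Posterior quantiles (q5, q50, q95) of the HEP for 0 errors in n
    opportunities, with the noninformative Beta(1/2, 1/2) prior."""
    return beta.ppf([0.05, 0.50, 0.95], 0.5, 0.5 + n)

for n in (350, 1460, 2010):          # opportunity counts appearing in Tables 2, 4 and 6
    print(n, zero_failure_hep(n))    # e.g. n = 2010 gives q50 ~ 1.1e-4

# Pooling two zero failure samples with identical conditions
# (cf. the combination of samples N8 and N11 in Section 5.2.1):
print(zero_failure_hep(2010 + 2010)) # q50 ~ 5.7e-5
```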
4. Results

This section reports the remaining 30 samples derived from reportable events which could not be treated in the predecessor project, together with the 18 samples derived from operational experience below the notification threshold. For easy reference, the samples generated from reportable events are numbered consecutively: samples 1–37 can be found in [25] and samples 38–67 are disclosed below. The 18 samples derived from operational experience below the notification threshold are numbered N1 to N18. All results are reported in Tables 1–7; their format is the same as in [25], which will now be briefly recalled.

The first column of the tables displays the sample number, and the second Task column briefly describes the task under consideration. Generally, this task is a part of a more complex procedure; here only the task that was (or, in the case of zero failure samples, is assumed to be) performed erroneously or was omitted is described in a generic way, taking into account human factors specific categories (e.g. "operating a pushbutton", "checking indicator lights"). Details about the context are provided in the project reports [23,24]. The error (or the assumed error in zero failure samples) in performing the task is briefly characterized in the Error column, again in a generic way. It is the consequences of this error that lead (or would lead) to the reportable event. Performance shaping factors considered relevant for the task, as determined from the event reports or on-site investigations, are briefly described in the Relevant PSFs column. Again, more detailed information is contained in the project reports [23,24]. The fifth column provides the sample parameters, i.e. the number of opportunities for an error n_i and the number of errors m_i that were observed.

From the sample data the probabilistic data is calculated according to the statistical method explained in Section 2, and the results are indicated in the following column. We use the 50% quantile q50 (the median) of the posterior beta distribution (assuming a noninformative prior beta distribution with parameters α_0 = β_0 = 1/2) as a single point HEP estimate. Uncertainty intervals, corresponding to the epistemic uncertainty due to the finite sample size, are given as the 5% and 95% quantiles q5 and q95; the intervals are displayed as [q5; q95] in the tables. If the THERP handbook [30] provides a HEP for the task in question it is included in the final THERP column for comparison, multiplied by the appropriate factor to take situation specific PSFs into account. A dash (–) indicates that the THERP handbook either provides no HEP for the task, or that the HEP given there is inappropriate for the specific situational conditions under which the task was performed. Uncertainty bounds for the single point HEP estimates as given in the THERP handbook (in the form of error factors) are not quoted, since they should not be compared to our uncertainty bounds: the uncertainty bounds in the THERP handbook, in contrast to ours, include the uncertainty due to the individual performance variability between operators and other situational circumstances, and thus constitute not only the epistemic uncertainty. In order to compare the THERP data with ours, we check whether the HEP estimate given in the THERP handbook lies in the epistemic uncertainty interval given in our data tables. We remark that this is a conservative approach since we have neglected an uncertainty contribution.

In order to achieve a coarse classification of the samples, we use the same scheme as in [25]: on the one hand, we look at the externally visible error consequences (commission errors and omission errors), and on the other hand at the different phases of the underlying cognitive process (identifying and defining a task,
and executing it). Regarding the cognitive cause of the error we distinguish between "errors in action execution control" and "errors in task generation". Following this approach we group the data into four categories, reported in the following tables:

– Commission errors caused by errors in action execution control, while the task was correctly set and identified by the operator (e.g. selection of the wrong control element among a group of similar control elements, timing errors). The corresponding data is given in Table 1 (samples from reportable events) and Table 2 (zero failure samples).
– Commission errors caused by a cognitive error in identifying or defining the task (e.g. memory problems: professional knowledge remembered incorrectly, false interpretation of oral instructions). The corresponding data is given in Table 3 (samples from reportable events) and Table 4 (zero failure samples).
– Omission errors caused by memory problems (information not remembered, written information not read or remembered incorrectly). The corresponding data is given in Table 5 (samples from reportable events) and Table 6 (zero failure samples).
– Samples considered too small to be of sufficient statistical significance. The corresponding data is given in Table 7.

The samples in Table 7 are of too small a size to be statistically relevant, which is reflected in the large uncertainty intervals. Hence this data should not be used for PRA purposes.

Table 1. Execution errors: errors in action execution control, samples generated from reportable events.
No. | Task | Error | Relevant PSFs | m_i/n_i | q50, [q5; q95] | THERP
38 | Changing the position of a locally operated three-way valve | Valve not fully set to required position | Imprecise visual indication of required position, check of required position only by increase of manipulation force | 1/23 | 5.07×10⁻², [0.7; 15.8]×10⁻² | –
39 | Operating a continuously adjustable rotary handle | Handle rotated too far | No markings and no end stop present | 1/612 | 1.93×10⁻³, [0.2; 6.3]×10⁻³ | –
40 | Pulling an isolating terminal in a control cabinet | Wrong terminal pulled | Similar terminals nearby, terminals arranged in regular patterns, similar terminal identification codes | 1/162 | 7.29×10⁻³, [1.0; 23.9]×10⁻³ | –
42 | Cleaning of panel surfaces in a local diesel control station | Unintended operation of a pushbutton | Insufficient protective measures | 1/1000 | 1.18×10⁻³, [0.1; 3.9]×10⁻³ | 1.3×10⁻³
43 | Inserting a power switch into a switchgear cabinet by a crank handle | Not fully inserted | Final position indicated by a front panel indicator light, high precision necessary | 1/28 | 4.17×10⁻², [0.6; 13.1]×10⁻² | –

Table 2. Execution errors: errors in action execution control, zero failure samples.
No. | Task | Error | Relevant PSFs | m_i/n_i | q50, [q5; q95] | THERP
N3 | Discontinuous reading of quantitative information on a display | Read too late | Small time window, high task load | 0/350 | 6.49×10⁻⁴, [0.06; 55]×10⁻⁴ | –
N5 | Operating a pushbutton control on a MCR panel | Wrong button selected | Similar buttons nearby, similar identification codes, mimic layout with color coding | 0/2010 | 1.13×10⁻⁴, [0.01; 9.6]×10⁻⁴ | –
N8 | Operating a pushbutton control when electric power reaches a threshold | Operated too late | Short time window, two displays are to be monitored, both displays in field of view | 0/2010 | 1.13×10⁻⁴, [0.01; 9.6]×10⁻⁴ | –
N11 | Operating a pushbutton control when power factor reaches a threshold | Operated too late | Short time window, two displays are to be monitored, both displays in field of view | 0/2010 | 1.13×10⁻⁴, [0.01; 9.6]×10⁻⁴ | –
N14 | Opening an isolating link on a relay | Wrong relay selected | Similar relays nearby, with similar identification codes | 0/78 | 2.90×10⁻³, [0.03; 25]×10⁻³ | –
N15 | Opening an isolating link on a relay | Link not moved far enough | Poor visibility, uncomfortable work posture necessary, movement of link to an end stop necessary | 0/234 | 9.71×10⁻⁴, [0.08; 82]×10⁻⁴ | –
N18 | Operating a pushbutton control on a MCR panel | Wrong button selected | Similar buttons nearby, mimic layout with color coding | 0/1460 | 0.56×10⁻⁴, [0.01; 14]×10⁻⁴ | 5×10⁻⁴

5. Interpretation and discussion of the results

This section provides some discussion and interpretation of the data disclosed in the tables in Section 4. The discussion also takes into account the results of the predecessor project published in [25].

5.1. Samples from reportable events

5.1.1. Commission errors due to errors in action execution control
Commission errors caused by errors in action execution control had for the most part been treated in [25]. With the five additional samples in Table 1 now altogether 20 samples generated from reportable events are available. The new samples 38–40 and 43 concern tasks performed locally in the plant (changing the position of valves and power switches, opening isolation terminals). For these tasks no HEP estimates are provided in the THERP handbook. Notably, these tasks are also part of emergency operating procedures; if this data is to be applied for quantification purposes the elevated level of stress in emergency situations still needs to be taken into account (e.g. by using the modification factors from Table 20-16 of the THERP handbook [30]).

An important result of [25] is that among the factors influencing the probability of operating the wrong control element on ergonomically well designed MCR panels due to an error in action execution control, the presence of other control elements of similar design (and similar to operate) within reaching distance is dominant over other performance shaping factors such as the presence of a mimic layout, provided no serious ergonomic deficiencies are present. For this situation a generic HEP estimate of 1.56×10⁻³ was found. No additional data to corroborate or weaken this conclusion or to improve the HEP estimate was found from reportable events in the present project, but see Section 5.2.1.
Table 3. Execution errors: cognitive errors in identifying or defining the task, samples generated from reportable events.
No. | Task | Error | Relevant PSFs | m_i/n_i | q50, [q5; q95] | THERP
44 | Refilling nitrogen to SCRAM accumulator | Inadmissible control action performed | Time consuming procedure, operator intended to save time by departing from procedure | 1/324 | 5.04×10⁻³, [0.7; 17]×10⁻³ | –
45 | Manually opening a locally operated valve | Opened too early, false interpretation of oral instruction | Ambiguous oral instruction | 1/490 | 2.41×10⁻³, [0.3; 8.0]×10⁻³ | –
46 | Keeping the necessary waiting time (~30 s) between two operations of manual controls at a MCR panel | Necessity to wait not remembered | Rarely performed task, necessity to wait is part of professional knowledge | 1/66 | 1.78×10⁻², [0.2; 5.8]×10⁻² | 1×10⁻²
47 | Operating a circuit breaker in a switchgear cabinet | Wrong breaker operated, oral instruction remembered incorrectly | Short time span between instruction and operation, similar component identification codes | 1/108 | 1.09×10⁻², [0.1; 3.6]×10⁻² | –
49 | Monitoring the main steam pressure over time during stretch out operation | Performing an additional inadmissible control action, false interpretation of procedure | Rarely performed procedure with ergonomic deficiencies, moderately high level of stress | 1/7 | 1.6×10⁻¹, [0.3; 4.4]×10⁻¹ | –
50 | Reinstallation of control rod drive motors | Drive motor mounted to wrong control rod, false identification of position | No position labels on control rods, position inferred indirectly from secondary information | 1/120 | 9.83×10⁻³, [1.4; 33]×10⁻³ | –
51 | Testing the emergency feedwater supply system during power operation | Changing the order of two manual control actions, wrong task sequence generation | Control actions appear in wrong order in written procedure, proper ordering was to be inferred from professional knowledge, frequently performed task | 1/1200 | 9.86×10⁻⁴, [1.4; 33]×10⁻⁴ | –
52 | Connecting transducers to pressure sensing lines | Connections swapped, professional knowledge remembered incorrectly | Frequently performed task, no labeling | 1/156 | 7.57×10⁻³, [1.1; 25]×10⁻³ | –
53 | Manual control operation at local diesel control panel | Wrong control operation performed, false interpretation of oral instruction | Frequently performed task, ambiguous oral instruction | 1/1248 | 9.84×10⁻⁴, [1.4; 32]×10⁻⁴ | –
55 | Plugging connectors to jacks in control cabinets | Connected to wrong jack, incorrect task generation | Very error prone written instructions, recall of rarely used professional knowledge necessary | 1/112 | 1.05×10⁻², [0.1; 3.5]×10⁻² | –
56 | Adjusting valve body overpressure protection fittings | Tripping value set too high, beginning movement of valve head interpreted incorrectly | Visual information designed ergonomically unfavorable, occupational safety constraints | 37/198 | 1.87×10⁻¹, [0.1; 0.3]×10⁻¹ | –
57 | Returning a power switch to operational condition at a local switchgear cabinet | Performing an inadmissible switching operation, false interpretation of written procedure | Imprecise written procedure, rarely performed, professional knowledge necessary for proper interpretation | 1/148 | 7.97×10⁻³, [1.1; 27]×10⁻³ | –
63 | Setting overload protection trip points for electrical drives | Trip point set too low, incorrect task generation | For the given adjustment procedure the correct trip point value was neither documented nor part of training | 261/261 | 1.00, [0.99; 1.0] | –

5.1.2. Commission errors due to cognitive errors in identifying or defining the task
The eight samples in [25] involving commission errors due to cognitive errors in task generation are complemented by the
samples in Tables 3 and 4. In the majority of these samples, the error was committed because knowledge was incorrectly recalled from memory, so they provide clues on both long and short term memory reliability. Such data is relevant to most second generation HRA methods which often include a detailed cognitive model, of which the human memory is an essential element. Moreover, since the THERP database does not provide data for memory related commission errors, it is especially relevant to complement it in this regard. A characteristic feature of many of these samples is that the erroneous activity that was performed has a similarity to the correct one (e.g. in the location in which the task is performed, in visual appearance of objects, or operating modes of controls). The HRA analyst should keep this in mind when searching for opportunities of memory related errors in PRA applications. Samples 49, 44 and 57 are special in that the underlying reportable event was caused by extraneous operator activities not present in procedures and in violation of applicable rules and procedures. The detrimental effect of his activities was not transparent to the operator when he executed them; a contributing factor was incomplete or vague system understanding. In sample 44, the operator deliberately departed from procedures in order to achieve a benefit (saving time). According to the German PRA guideline [5], such extraneous detrimental activities are to be considered when their detrimental consequences are not transparent to the operator, and when a considerable (personal) benefit results from the violation. Samples 49 and 57 are different in that the extraneous detrimental activity was stimulated by ergonomic deficits in written procedures; the fact that the situation specific professional knowledge was rarely used was a contributing factor since it tended to conceal the detrimental nature of his activities from the operator. These samples may serve to improve and validate newer HRA methods which aim to include extraneous detrimental operator actions, the so-called errors of commission (EOC) [26–28,21]. Samples 45, 47 and 53 concern activities performed locally in the plant which are prompted by oral instructions. When receiving an oral instruction, the operator has to interpret the instruction, store it in his memory, and recall it correctly while performing the corresponding activities. In the reported samples the instructions were ambiguous or imprecise; moreover, in all cases the operator had to memorize them for only a short time. Likewise, in all cases there were similarities between the correct and the erroneous action (e.g. in the plant identification codes the operator had to remember). The THERP database contains no data for such errors, and in PRA studies oral instructions are rarely included (perhaps partly due to the lack of relevant data). It is to be emphasized that such instructions also play an important role during abnormal operation. The present data may thus be vital in quantifying the
reliability of comparable communication processes during such operating modes. With the new data reported in Section 4, now altogether 14 samples are on hand which involve situations in which professional knowledge has to be correctly recalled from long term memory. Failure to correctly recall professional knowledge occurred in samples 16, 17, 18, 19, 20, 22, 23, 46, 48, 49, 51, 55, 56, and 57. A characteristic feature of these samples is that professional knowledge must be used in conjunction with knowledge from other sources (e.g. written procedures) to correctly perform the task: For example, if a procedure instructs the operator to manually operate a power switch in a switchgear room, the operator has to recall his professional knowledge and previous experience about the proper operation of the relevant type of power switch. Looking at the corresponding HEP estimates and the listed relevant PSFs of these samples, it is noticeable that the reliability to correctly recall professional knowledge is correlated with the frequency with which the task is performed. Hence we group the samples according to this attribute:
– Samples 19 and 51 concern tasks which are frequently performed,
– samples 16, 17, 18, 23, 46 and 57 concern rarely performed tasks (e.g. once per year), all with no additional error promoting factors,
– samples 20, 22 and 55 concern tasks which are rarely performed, with additional error promoting factors,
– samples 48 and 49 concern tasks which are rarely performed, under very error prone circumstances.
Combining the samples accordingly, the resulting HEP estimates are disclosed in Table 8. Samples 51 and 52 are exceptional: in them an error in task generation occurred in a situation where the operator's professional knowledge was the only available information source. The task had to be planned and executed in a situation specific way, hence it can be concluded that the operator was acting in a knowledge based fashion. In such situations the cognitive load is much higher than in situations where the operator acts in a rule based fashion. Currently, very few HRA methods are available which aim to take such knowledge based tasks into account. The HEP estimates found in samples 51 and 52 are influenced by the task frequency and may be used as estimates in PRA application and as a basis to quantify new HRA methods. Sample 63 is exceptional as well. The underlying task is the adjusting of trip points of a certain model of overload protection devices for electric motors. In the plant in which the corresponding reportable event occurred, altogether 58 of these devices are installed. The trip point was set incorrectly in all protection devices since in the checking method employed (single phase
measurement only) a correction factor was to be applied to infer the correct trip point for three phase operation. Omitting to apply this correction is a commission error in correctly setting the task. The test instructions did not mention the necessity to apply the correction, nor is it part of the professional knowledge of the tester in charge. Consequently, the error occurred in 261 adjusting tasks, and was only discovered during a special test. Even though this sample is trivial from the probabilistic point of view, it illustrates the far reaching consequences that errors in task setting can have.

Table 4. Execution errors: cognitive errors in identifying or defining the task, zero failure samples.
No. | Task | Error | Relevant PSFs | m_i/n_i | q50, [q5; q95] | THERP
N4 | Reading an analog meter | Wrong value read | High precision necessary, high task load | 0/350 | 6.49×10⁻⁴, [0.06; 55]×10⁻⁴ | 6×10⁻³
N6 | Reading an analog meter | Wrong value read | Easy to read, favorable ergonomic design | 0/2010 | 1.13×10⁻⁴, [0.01; 9.6]×10⁻⁴ | 3×10⁻³
N7 | Adjusting a process parameter by pushbutton controls | Operated too long | Frequently performed task, part of professional knowledge | 0/2010 | 1.13×10⁻⁴, [0.01; 9.6]×10⁻⁴ | –
N10 | Reading an analog meter | Wrong value read | Easy to read, favorable ergonomic design | 0/2010 | 1.13×10⁻⁴, [0.01; 9.6]×10⁻⁴ | 3×10⁻³
N16 | Reading a digital readout | Wrong value read | Easy to read, favorable ergonomic design | 0/180 | 1.26×10⁻³, [0.01; 11]×10⁻³ | 1×10⁻³
N17 | Reading analog voltage and current meters | Wrong values read | Easy to read, favorable ergonomic design | 0/438 | 5.19×10⁻⁴, [0.04; 44]×10⁻⁴ | 3×10⁻³

Table 5. Errors of omission, samples generated from reportable events.
No. | Task | Error | Relevant PSFs | m_i/n_i | q50, [q5; q95] | THERP
58 | Pulling and replugging a simulation pin on an electronic module front cover in a control cabinet | Replugging omitted | Highly trained task, not part of a written procedure but part of professional knowledge, favorable ergonomic design | 1/15,200 | 7.78×10⁻⁵, [1.1; 26]×10⁻⁵ | –
59 | Operation of a manual control at a MCR control panel | Task not remembered | Frequently performed task, part of professional knowledge, position of indicator lamps ergonomically unfavorably designed | 1/1347 | 8.78×10⁻⁴, [1.3; 29]×10⁻⁴ | –
60 | Operating a circuit breaker at a local switchgear cabinet | Task not remembered | Rarely performed test procedure consisting of many sub-steps, task is part of professional knowledge and has to be remembered, high task load | 1/13 | 8.86×10⁻², [1.3; 27]×10⁻² | 1×10⁻¹
64 | Disassembly of a circuit breaker at a local switchgear cabinet | Cable not disconnected, sub-step in a task consisting of four sub-steps not remembered | Frequently performed, part of professional knowledge | 1/250 | 4.73×10⁻³, [0.7; 16]×10⁻³ | 1.6×10⁻²
65 | Discontinuous check of a process parameter on a MCR control panel | Check not performed, task not remembered | Memory based activity, very slow variation of process parameter, high task load | 1/115 | 1.03×10⁻², [0.1; 3.4]×10⁻² | –
67 | Performing a manual control action at a MCR panel | Task omitted | Long procedure, no checkoff provisions | 1/350 | 3.38×10⁻³, [0.5; 12]×10⁻³ | 1×10⁻²

5.1.3. Omission errors: task not remembered
In addition to the six samples reported in [25], another six samples concerning the omission of steps in procedures are reported in the present paper. Together with the memory related samples in Section 5.1.2 (correctly recalling professional knowledge, remembering instructions), they illuminate the reliability of the human memory under various situational conditions. We group the now altogether 12 samples concerning omission errors as in [25], with two additional categories:

– Highly trained task, no further error promoting factors (sample 58),
– frequently performed task, no further error promoting factors
(samples 30, 59 and 64),
– rarely performed task, no further error promoting factors (sample 35),
– rarely performed task, moderately high level of stress (samples 27, 34 and 65),
– rarely performed task, moderately high level of stress, additional error promoting factors (samples 31 and 60),
– rarely performed task, moderately high level of stress, additional error promoting factors, dynamic work environment (sample 28),
– extremely rarely performed task, no further error promoting factors (sample 66).
Notice that sample 66 has a small size (three opportunities for error) and may thus be considered too small for generation of precise data (indicated by the large uncertainty interval). Nevertheless, the fact that in three opportunities already one error occurred carries some information and indicates the error-proneness of the PSFs in this case. Therefore it is included here. The resulting HEP estimates are reported in Table 9. Notice that the HEP varies over more than three orders of magnitude with the PSFs. The THERP handbook provides HEP estimates for the case that "written procedures are available and should be used, but are not used" in Table 20-7, items (5) and (5)‡, in the range of 1×10⁻² to 5×10⁻²; this situation compares best to sample 35.

Since Table 9 provides HEP estimates for similar tasks performed under different stress levels, it can be used to gain insight into the validity of the THERP stress model. This has already been discussed in [25], Section 5.4, using samples 27, 28, 31, 34 and 35. We reassess the results obtained there with the new data (samples 60 and 65). Recall that THERP categorizes the amount of stress into four levels ("very low", "optimum", "moderately high" and "very high"), with corresponding HEP modification factors in Table 20-16 of [30]. Using the corresponding HEPs (i.e. the single point estimates, the beta median q50) in Table 9 we obtain the modification factors from operational experience disclosed in Table 10; the modification factors proposed by THERP are included for comparison. It is seen that for a moderately high level of stress we find a modification factor lower than the THERP proposal (and in fact smaller than one), whereas for the effect of a dynamic task performed under a moderately high level of stress we find a larger factor. This is the same finding as in [25]; with the additional data of the present paper the discrepancy with the THERP modification factors became larger in the first case, and smaller in the second.
W. Preischl, M. Hellmich / Reliability Engineering and System Safety 148 (2016) 44–56
53
Table 6 Errors of omission, zero failure samples. Relevant PSFs
mi/ni
q50, ½q5 ; q95
THERP
Reading instructions in a written Omitting to read one procedure instruction
Long procedure, checkoff provisions, high task load
0/350
6.49 10 4,
6 10 3
Remembering an instruction read shortly before
Short time span between reading and action (short term memory), high task load
0/350
½0:06; 55 10 4 6.49 10 4,
No.
Task
N1 N2
Error
Instruction not remembered
½0:06; 55 10 4 Long procedure, checkoff provisions, task also part of 0/2010 1.13 10 4, professional knowledge ½0:01; 9:6 10 4
N12 Reading instructions in a written Omitting to read one procedure instruction
Long procedure without checkoff provisions
1 10 2
N13 Opening an isolating link on a relay
Poor visibility, procedure based on short term memory, uncomfortable work posture necessary
N9
Reading instructions in a written Omitting to read one procedure instruction
Link not opened
Nevertheless, as already remarked in [25], this finding does not cast doubt on the THERP stress model in view of the still rather small sample sizes: looking at the corresponding uncertainty intervals in Table 9, the THERP stress model can still be considered consistent with the present data. Moreover, it is to be noted that according to newer findings [9,15] stress influences various aspects of human cognition to a different degree, whereas this aspect is not addressed in THERP, whereas the modification factors in Table 10 only refer to the proneness for omission errors. 5.1.4. Omission errors: written procedure read or interpreted incorrectly Besides memory problems, errors while reading and interpreting written procedures can lead to omission errors. This error mode occurred in samples 29, 32, 33, 37 and 67. In sample 37 a step in a procedure was omitted due to an ambiguous task description the procedure, whereas in samples 29, 32, 33 and in the new sample 67 a step was omitted because the corresponding item was not read. The procedures in samples 29, 32 and 33 have checkoff provisions, whereas in sample 67 no checkoff provisions are used. A detailed discussion of samples 29, 32, 33 and 37, as well as a sensible way of combining them according to similar performance shaping factors, and an application to test the THERP stress model, was provided in Section 5.4 of [25], hence it will not be repeated here. The THERP handbook postulates an influence of the presence of checkoff provisions in procedures on the probability of omission errors: in case checkoff provisions are not provided a HEP increase by a factor of 3.3 is proposed (THERP handbook, Table 20-7, items (2) and (4)). In order to test this postulate the new sample 67 can be compared with the combination of samples 29 and 33 (single point HEP estimate 3:34 10 3 , uncertainty interval ½0:88; 8:48 10 3 , see Table 9 in [25]). The comparison of the single point HEP estimates leads to a modification factor of only 1.01 for lacking checkoff provisions, at variance with the THERP postulate. Again, as in Section 5.1.3, this result casts no doubt on the THERP proposal since the samples involved are rather small, so in view of the uncertainty intervals the THERP proposal cannot be rejected. 5.2. Zero failure samples The information “no failure observed in a certain amount of opportunities” can be used to estimate the probability of a postulated failure mode. Since the zero failure samples, corresponding to operational experience below the notification threshold, have been generated by a different method than the samples from reportable events, we discuss them separately in the present section. Nevertheless, in view of the method described in Section 3.2 to establish them, which applies established ergonomic practices (e.g. task analysis, human error analysis, and HRA event trees), it can be
5.2. Zero failure samples

The information "no failure observed in a certain number of opportunities" can be used to estimate the probability of a postulated failure mode. Since the zero failure samples, which correspond to operational experience below the notification threshold, have been generated by a different method than the samples from reportable events, we discuss them separately in the present section. Nevertheless, in view of the method described in Section 3.2 to establish them, which applies established ergonomic practices (e.g. task analysis, human error analysis, and HRA event trees), it can be contended that the samples are of a quality comparable to the samples generated from reportable events. In particular, through the extensive on-site investigations that were conducted, a comparable amount of information about performance shaping factors could be collected. Typically, more than one zero failure sample can be generated from a procedure satisfying the selection criteria of Section 3.2, since often more than one task in which a human error would have consequences leading to a reportable event can be identified. In such a case the samples corresponding to the different tasks generated from one procedure have the same sample parameters mi and ni, and consequently also the same HEP estimates and epistemic uncertainties; however, the performance shaping factors are not necessarily the same. We next discuss the 18 zero failure samples according to the error classification scheme introduced in Section 4, which was also used to group the samples generated from reportable events.

5.2.1. Commission errors due to errors in action execution control

Samples N3, N8 and N11 concern timing errors ("too slow" or "too fast", respectively "too late" or "too early"); the THERP handbook does not propose HEP estimates for this kind of error. Sample N3 differs from samples N8 and N11 in that parallel tasks have to be completed in sample N3, leading to an elevated level of stress, whereas in samples N8 and N11 the ergonomic conditions can be considered optimal. If we combine samples N8 and N11 we obtain a single point HEP estimate of 5.66·10⁻⁵, with an epistemic uncertainty interval of [0.05; 47.7]·10⁻⁵. This may be compared with sample 3, in which the necessary response time is much shorter, in combination with a high task load; here we found a HEP of 3.58·10⁻³ and an uncertainty interval of [0.9; 9.1]·10⁻³. However, a verification of the modification factors of the THERP stress model (a factor of 5 is proposed in Table 20-16 of [30], item (5a), assuming a moderately high stress impact on a skilled operator due to heavy task load in a dynamic work situation) from a comparison between sample N3 and the combination of N8 and N11 would be very awkward using only this data, since the lower and upper bounds of the uncertainty interval of the combination of N8 and N11 already differ by a factor of about 10³. In general, it is questionable whether the concept of modification factors to account for stress and cognitive load is still valid for very small error probabilities: for example, multiplying a very small HEP of, say, 10⁻⁴ or 10⁻⁵ by a factor of 10, the factor proposed by THERP for threat stress, still yields a small error probability, which does not match the qualitative empirical evidence about human performance under threat stress. This point is to some extent taken into account by THERP in Table 20-16 of [30], which proposes for extremely high stress levels in a dynamic environment a blanket HEP of 0.25 for skilled operators and of 0.5 for novice operators.

Next we turn to samples N5 and N18, which concern errors in action execution control while manipulating controls on ergonomically well designed panels; hence they are comparable to samples 1, 5, 6, 8, 9, 11 and 13. For these samples the combined HEP was given in Table 6 of [25] (as already quoted above, the single point HEP estimate is 1.56·10⁻³ and the uncertainty interval [0.8; 2.7]·10⁻³). With samples N5 and N18 included, the combined sample parameters and the resulting HEP estimate are given in Table 11. Notice that the new zero failure samples lead to a slightly lower single point HEP estimate than given in Table 6 of [25] and to tighter uncertainty bounds; moreover, the new estimate still lies within the uncertainty interval of the old one.

Samples N14 and N15 concern errors in handling isolating links of relays installed in control cabinets. The ergonomic boundary conditions are unfavorable due to poor visibility inside the cabinet and the necessity for the operator to assume an uncomfortable work posture. These samples are potentially useful for the analysis of emergency operating procedures, and the HEP estimates may be used as guidance to quantify them; again, in such cases the elevated level of stress in emergency situations still has to be properly taken into account.

5.2.2. Commission errors due to cognitive errors in identifying or defining the task

Samples N4, N6, N10, N16 and N17 concern the task of reading values from analog and digital meters. In each case the operator has to read the indicated value correctly, interpret it and generate an appropriate task (e.g. adjust process parameters). The HEP estimates for reading analog meters incorrectly (N4, N6, N10, N17) range from 1.13·10⁻⁴ to 6.49·10⁻⁴, whereas THERP proposes more pessimistic HEPs between 3·10⁻³ and 5·10⁻² (Table 20-10 of [30], items (1) and (4)). However, THERP does not distinguish between ergonomic boundary conditions (e.g. easy or difficult to read meters and the required precision) and thus also covers unfavorable conditions, as for example in sample N4. Therefore the present estimates complement the available data for errors in reading analog meters and contribute to a further differentiation concerning the ergonomic design of the meter. The HEP estimate obtained for reading digital meters in sample N16 matches the THERP proposal (Table 20-10 in [30], item (2)) quite well.

The error mode in sample N7 concerns the failure of the operator to correctly recall frequently used professional knowledge. In the situation of sample N7 the level of training can be assumed to be even higher than in samples 19 and 51; hence this sample complements the data on commission errors due to incorrectly remembered professional knowledge provided in Table 8.

5.2.3. Omission errors

Concerning omission errors, the zero failure samples N1, N9 and N12 (step in a procedure not read) and N2 and N13 (failure to recall information from short term memory) are available. Since there are no other samples available which indicate the reliability of short term memory, they complement the data generated so far. The HEP estimate of samples N2 and N13 is slightly smaller than 1·10⁻³; hence this value may be used as a basis for quantifying short term memory reliability in second generation HRA methods. The result of sample N12 (omitting one item of instruction, no checkoff provisions), with a HEP of 2.90·10⁻³, is lower than the corresponding THERP proposal of 1·10⁻²; together with the results of Section 5.1.4, this finding leads to the hypothesis that THERP may overestimate the beneficial effect of checkoff provisions.

Sample N9 is special in that the corresponding step in the procedure is also part of the professional knowledge of the operator, i.e. the underlying error consists in both not reading a step in a procedure and failing to correctly recall professional knowledge. Sample N1 fits well in the HEP range found in Section 5.1.4 (see also Table 9 in [25]) for omitting a step in a procedure. Combining samples 29, 33 and N1 yields a HEP estimate of 2.17·10⁻³ with an uncertainty interval of [2.1; 5.5]·10⁻³ for omitting an item in a long list of instructions with checkoff provisions; THERP proposes a HEP estimate of 6·10⁻³, which is slightly more conservative.

Table 7
Samples considered too small to be of statistical significance, generated from reportable events.

No. | Task | Error | Relevant PSFs | mi/ni | q50, [q5; q95]
41 | Manual control of a process parameter from a MCR panel | Process parameter adjusted too far | Unfavorable relationship between process parameter and control valve position, unfavorable ergonomic display design, lack of experience | 1/1 | 8.37·10⁻¹, [2.3; 10]·10⁻¹
48 | Testing electronic modules in the reactor protection system | Signal plugs erroneously removed in all redundancies, professional knowledge not remembered | Rarely performed task, lack of detailed instructions in procedures, unfavorable ergonomic design of alarm indication | 1/2 | 5·10⁻¹, [0.1; 9]·10⁻¹
54 | Closing pegging steam control valves | Not fully closed, error in task generation after SCRAM | Special operating mode, no written procedure available, complex thermohydraulic context | 1/1 | 8.37·10⁻¹, [2.3; 10]·10⁻¹
61 | Testing the 24 V DC power supply system | Failed to check the presence of essential test prerequisites | No indication in written procedure, rarely used professional knowledge | 1/1 | 8.37·10⁻¹, [2.3; 10]·10⁻¹
62 | Start-up of reactor | Further increase of thermal power in spite of a lacking prerequisite | Special operating mode, no written procedures available | 1/1 | 8.37·10⁻¹, [2.3; 10]·10⁻¹
66 | Disabling automatic operation in order to perform a test | Automatic operation not disabled, step not remembered | High demand on memory performance, rarely performed task | 1/3 | 3.25·10⁻¹, [0.6; 7.6]·10⁻¹
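The sample combinations referred to in Sections 5.2.1–5.2.3 (e.g. N8 with N11, or samples 29, 33 and N1) can be illustrated by a small sketch. It assumes that combining samples with similar PSFs amounts to pooling the error counts mi and opportunity counts ni and re-evaluating the percentiles of a Beta posterior under a Jeffreys prior; this is an assumption about the estimation procedure of [24,25], and the sample sizes used below are hypothetical.

# Sketch (assumption): combining samples by pooling (m, n) and re-evaluating
# a Jeffreys Beta(m + 0.5, n - m + 0.5) posterior.
from scipy.stats import beta

def hep_percentiles(m, n, probs=(0.05, 0.50, 0.95)):
    """HEP percentiles for m observed errors in n opportunities (Jeffreys prior)."""
    return [beta.ppf(p, m + 0.5, n - m + 0.5) for p in probs]

def combine(samples):
    """Pool (m, n) pairs of samples judged to share the relevant PSFs."""
    m = sum(mi for mi, _ in samples)
    n = sum(ni for _, ni in samples)
    return hep_percentiles(m, n)

# Two hypothetical zero failure samples, in the spirit of N8 and N11:
q5, q50, q95 = combine([(0, 1800), (0, 2200)])
print(f"combined: q50 = {q50:.2e}, [q5, q95] = [{q5:.2e}, {q95:.2e}]")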
Table 8
Commission errors due to incorrectly remembered professional knowledge. HEP estimate resulting from combining samples 19 and 51, samples 16, 17, 18, 23, 46 and 57, samples 20, 22 and 55, and samples 48 and 49.

Task | Error | Relevant PSFs | mi/ni | q50, [q5; q95]
Remembering professional knowledge | Remembered incorrectly | Part of frequently performed procedure | 2/2088 | 1.04·10⁻³, [0.2; 2.7]·10⁻³
 | | Part of rarely performed procedure | 6/423 | 1.46·10⁻², [0.7; 2.6]·10⁻²
 | | Part of rarely performed procedure, further error promoting factors | 4/134 | 3.11·10⁻², [1.2; 6.2]·10⁻²
 | | Part of rarely performed procedure, very error prone circumstances | 2/9 | 2.33·10⁻¹, [0.6; 5.0]·10⁻¹
Table 9
Omission errors: task not remembered. HEP estimates resulting from sample 58, samples 30, 59 and 64, sample 35, samples 27, 34 and 65, samples 31 and 60, sample 28, and sample 66.

Task | Error | Relevant PSFs | mi/ni | q50, [q5; q95]
Carrying out a sequence of tasks | Memorized task step not remembered | Highly trained, no error promoting factors | 1/15,200 | 7.78·10⁻⁵, [1.1; 26]·10⁻⁵
 | | Frequently performed, no error promoting factors | 3/3067 | 1.03·10⁻³, [0.3; 2.3]·10⁻³
 | | Rarely performed, no error promoting factors | 1/48 | 2.45·10⁻², [0.3; 7.9]·10⁻²
 | | Rarely performed, moderately high level of stress | 3/185 | 1.71·10⁻², [0.5; 3.8]·10⁻²
 | | Rarely performed, moderately high level of stress, ergonomically deficient work environment | 2/41 | 5.62·10⁻², [1.4; 13]·10⁻²
 | | Rarely performed, moderately high level of stress, error prone PSFs and dynamic work environment | 1/7 | 1.61·10⁻¹, [0.2; 4.4]·10⁻¹
 | | Extremely rarely performed, no error promoting factors | 1/3 | 3.52·10⁻¹, [0.6; 7.7]·10⁻¹
Table 10
Modification factors for the HEP in order to include the effect of stress, as derived from the data in Table 9. The modification factors proposed by THERP [30] are included for comparison.

Level of stress | Samples | Factor | THERP
Moderately high | 35 vs. 27, 34, 65 | 0.7 | 2
Moderately high, ergonomic deficits | 27, 34, 65 vs. 31, 60 | 3.29 | 2.5
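The factors in Table 10 are consistent with taking the ratios of the single point estimates (q50) in Table 9; a short check (the ratio interpretation is inferred from the caption of Table 10, but it reproduces both tabulated factors):

# Reproducing the modification factors of Table 10 from the q50 values of Table 9.
q50_rare_no_stress = 2.45e-2    # sample 35
q50_rare_mod_stress = 1.71e-2   # samples 27, 34, 65
q50_mod_stress_ergo = 5.62e-2   # samples 31, 60

print(round(q50_rare_mod_stress / q50_rare_no_stress, 2))   # 0.7  (Table 10: 0.7)
print(round(q50_mod_stress_ergo / q50_rare_mod_stress, 2))  # 3.29 (Table 10: 3.29)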
Table 11
Commission errors due to errors in action execution control: operating a manual control element on a panel. HEP estimate resulting from samples 1, 5, 6, 8, 9, 11, 13, N5 and N18.

Task | Error | Relevant PSFs | mi/ni | q50, [q5; q95]
Operating a control element on a panel | Wrong element selected | Wrong control element within reach and similar in design | 7/8058 | 8.90·10⁻⁴, [4.5; 15.5]·10⁻⁴
6. Conclusion

In this paper we report on the continuation of a project [24,25] to infer human reliability data from the operational experience of
German nuclear power plants. To this end, the technique already described in [25], which uses the German licensee event reporting system to gather the data, was applied, and 30 new HEP estimates for a wide variety of tasks are reported, complementing the 37 HEP estimates from the predecessor project disclosed in [25]. Moreover, a new method to gather human reliability data below the notification threshold by means of zero failure estimation is introduced. This method allows us to access parts of the operational experience which are not accessible by analyzing licensee event reports, and it opens up a new source of human reliability data with a large empirical potential. The method was tested in cooperation with a reference nuclear power plant. Tasks in which errors would lead to reportable events, but for which no events have yet occurred, were searched for in three different working areas, leading to another 18 HEP estimates generated by zero failure estimation from the resulting data. This test clearly established the feasibility of the method, but it also indicated that the method is rather resource-intensive. This is mainly due to the extensive on-site investigations necessary to establish information about performance shaping factors and the situational conditions under which the tasks are performed, as well as to the effort required to ensure that an error, if committed, would indeed lead to a reportable event.

In the research projects described here, all reportable events which were identified in an initial screening process as potentially usable for generating human reliability data, and which could not be treated in the predecessor project [25], have now been analyzed. Hence, the German licensee event reporting system as a data source is exhausted for now. Nevertheless, it is possible to integrate the operational experience accumulating after the end of the projects into the data already established, in order to both enlarge (where appropriate) the size of certain samples and generate new samples from new reportable events.
Both the present project and its predecessor [25] demonstrate the feasibility of using event reporting systems to generate HEP estimates from operational experience both above and below the notification threshold. From the positive experience gained it can be concluded that the employed methodology can also be used in industrial sectors other than nuclear energy. Moreover, it is conceivable that a licensee uses the methodology to analyze its internal records of operational experience in order to generate human reliability data. This would allow, for the purpose of data generation, the definition of internal notification criteria with a lower threshold than that of the licensee event reporting system, and would lead to a better exploitation of the available operational experience.

Looking at the HEP estimates reported in the present paper, derived from operational experience both above and below the notification threshold, we can summarize our findings as follows. Altogether 48 samples with corresponding HEP estimates are reported. Of these, six are of a size considered too small for data generation, and another one (sample 63) is probabilistically trivial, so 41 usable HEP estimates are contributed, together with uncertainty bounds representing the epistemic uncertainty. For 13 of them the THERP handbook [30] proposes a HEP estimate. We find that in nine cases the THERP estimate lies within the epistemic uncertainty interval, and in another two cases it lies barely outside. However, in all cases in which the THERP estimate does not agree with our data within the uncertainty interval, THERP deviates in the conservative direction. Together with our findings from [25], this adds further confidence to the THERP database, and it can be concluded that the THERP data is in good agreement with the operational practice of German nuclear power plants, at least concerning those tasks for which data from operational experience is available.

Besides a validation of the THERP database, both research projects report 48 HEP estimates (and another eleven estimates from samples which are trivial or too small for precise data generation) for which THERP does not provide data. They serve to complement and extend the THERP database, and in some cases they allow a further differentiation, e.g. in terms of the effects of certain PSFs. Moreover, a number of samples concern cognitive errors (e.g. "remembered incorrectly", "acoustically or visually perceived information interpreted incorrectly", and "false interpretation of oral instruction"), the recollection of professional knowledge, and extraneous detrimental activities which are not part of procedures. The corresponding HEP estimates may thus contribute to the quantification of second generation HRA methods.
Acknowledgments

This research would have been impossible without the many committed experts from the German nuclear power plants who helped in collecting the data. Specifically, we would like to thank Dr. Bernd Schubert (Vattenfall Europe Nuclear Energy) and Dr. Peter Röß (RWE Power AG) for their important support while investigating operational experience below the German notification threshold.
References

[1] Verordnung über den kerntechnischen Sicherheitsbeauftragten und über die Meldung von Störfällen und sonstigen Ereignissen (Atomrechtliche Sicherheitsbeauftragten- und Meldeverordnung – AtSMV). 14. Oktober 1992 (BGBl. I 1992, Nr. 48, S. 1766), zuletzt geändert durch Verordnung vom 8. Juni 2010 (BGBl. I 2010, Nr. 31, S. 755) [in German].
[2] Boring RL. Fifty years of THERP and human reliability analysis. In: Proceedings of PSAM/ESREL; 2012.
[3] Chang Y, Lois E. Overview of the NRC's data program and current activities. In: Proceedings of PSAM 11/ESREL; 2012.
[4] Chang YJ, Bley D, Criscione L, Kirwan B, Mosleh A, Madary T, et al. The SACADA database for human reliability and human performance. Reliab Eng Syst Saf 2014;125:117–33.
[5] Facharbeitskreis Probabilistische Sicherheitsanalyse: Methoden zur probabilistischen Sicherheitsanalyse für Kernkraftwerke. Bundesamt für Strahlenschutz (BfS). Report no. BfS-SCHR-37/05; 2005 [in German].
[6] Bedford T, Cooke R. Probabilistic risk assessment. Cambridge: Cambridge University Press; 2001.
[7] Box GEP, Tiao GC. Bayesian inference in statistical analysis. Reading: Addison-Wesley; 1973.
[8] Dougherty EM, Fragola JR. Human reliability analysis: a systems engineering approach. New York: John Wiley; 1988.
[9] Driskell JE, Salas E, editors. Stress and human performance. Mahwah, NJ: Erlbaum; 1996.
[10] Gertman DI, Gilmore WE, Galyean WJ, Groh MR, Gentillon CD, Gilbert BG, et al. Nuclear computerized library for assessing reactor reliability (NUCLARR), vol. 1: Summary description. U.S. Nuclear Regulatory Commission. Report no. NUREG/CR-4639; 1990.
[11] Gibson H, Basra G, Kirwan B. Development of the CORE-DATA database. Saf Reliab J 1999;19:6–20.
[12] Gibson WH, Megaw TD. The implementation of CORE-DATA, a computerised human error probability database. Health and Safety Executive. Report no. 245/1999; 1999.
[13] Menschliche Zuverlässigkeit in der Probabilistischen Sicherheitsanalyse (PSA), Teil 2: Methoden zur Verifikation von Swain-Daten und zur Datenverbreiterung. Gesellschaft für Anlagen- und Reaktorsicherheit (GRS). Technical report no. GRS-A-2951; 2001 [in German].
[14] Hallbert B, Gertman D, Lois E, Marble J, Blackman H, Byers J. The use of empirical data sources in HRA. Reliab Eng Syst Saf 2004;83:139–43.
[15] Hockey GRJ. Environmental stress: effects on human performance. In: Encyclopedia of stress. Amsterdam: Elsevier; 2007. p. 940–5.
[16] IAEA. Human error classification and data collection. Vienna: International Atomic Energy Agency. Report IAEA-TECDOC-538; 1990.
[17] IAEA. Collection and classification of human reliability data for use in probabilistic safety assessments. Vienna: International Atomic Energy Agency. Report IAEA-TECDOC-1048; 1998.
[18] Kirwan B, Martin B, Rycraft H, Smith A. Human error data collection and data generation. Int J Qual Reliab Manag 1990;7:34–46.
[19] Kirwan B, Gibson H, Kennedy R, Edmunds J, Cooksley G, Umbers I. Nuclear action reliability assessment (NARA): a data-based HRA tool. In: Spitzer C, Schmocker U, Dang VN, editors. Proceedings of ESREL 2004/PSAM 7. London: Springer; 2004. p. 1206–11.
[20] Park J, Jung W. OPERA – a human performance database under simulated emergencies of nuclear power plants. Reliab Eng Syst Saf 2007;92:503–19.
[21] Podofillini L, Dang VN, Nussbaumer O, Dres D. A pilot study for errors of commission for a boiling water reactor using the CESA method. Reliab Eng Syst Saf 2013;109:86–98.
[22] Prvakova S, Dang VN. A review of the current status of HRA data. In: Steenbergen RDJM, van Gelder PHAJM, Miraglia S, Vrouwenvelder ACWM, editors. Proceedings of ESREL 2013. London: Taylor & Francis; 2014. p. 595–603.
[23] Preischl W. Verifikation von Zuverlässigkeitsdaten für Personalhandlungen im Rahmen der PSA. Gesellschaft für Anlagen- und Reaktorsicherheit (GRS). Technical report no. GRS-A-3515; 2010 [in German].
[24] Preischl W, Fassmann W. Quantifizierung der Zuverlässigkeit von Personalhandlungen durch Auswertung der aktuellen deutschen Betriebserfahrung. Gesellschaft für Anlagen- und Reaktorsicherheit (GRS). Technical report no. GRS-A-3716; 2013 [in German].
[25] Preischl W, Hellmich M. Human error probabilities from operational experience of German nuclear power plants. Reliab Eng Syst Saf 2013;109:150–9.
[26] Reer B, Dang VN, Hirschberg S. The CESA method and its application in a plant-specific pilot study on errors of commission. Reliab Eng Syst Saf 2004;83:187–205.
[27] Reer B. Review of advances in human reliability analysis of errors of commission. Part 1: EOC identification. Reliab Eng Syst Saf 2008;93:1091–104.
[28] Reer B. Review of advances in human reliability analysis of errors of commission. Part 2: EOC quantification. Reliab Eng Syst Saf 2008;93:1105–22.
[29] Sträter O, Bubb H. Assessment of human reliability based on evaluation of plant experience: requirements and implementation. Reliab Eng Syst Saf 1999;63:199–219.
[30] Swain AD, Guttmann HE. Handbook of human reliability analysis with emphasis on nuclear power plant applications. Final report. U.S. Nuclear Regulatory Commission. Report no. NUREG/CR-1278; 1983.
[31] Taylor-Adams S, Kirwan B. Human reliability data requirements. Int J Qual Reliab Manag 1995;12:24–46.