International Journal of Industrial Ergonomics, 2 (1988) 111-130
Elsevier Science Publishers B.V., Amsterdam - Printed in The Netherlands

REVIEW

A CRITICAL REVIEW OF APPROACHES TO HUMAN RELIABILITY ANALYSIS

Joseph Sharit

Department of Industrial Engineering, State University of New York at Buffalo, Amherst, NY 14260 (U.S.A.)

(Received December 15, 1986; accepted in revised form February 9, 1987)

ABSTRACT

This paper provides a critical analysis of four major approaches to human reliability assessment: THERP, the use of qualitative models of human performance, simulation methods, and methods based on classical reliability theory. Methodologically and conceptually, these approaches are more representative of a multidimensional than a unified perspective on the problem of human reliability analysis and are considered to encompass many of the other methods that have been proposed. Recent developments in each of these areas and the growing concern for the consequences of human error in highly complex systems require that these approaches be sufficiently understood in order to identify both their strengths and their shortcomings. Although no single approach is advocated, this review intends to provide insights that could suggest improvements to these methods as well as aid the analyst in selecting the approach best suited to the situation in question. In addition, critical research needs are identified and summarized.

INTRODUCTION

Due to the need for the human to interact with equipment and complex systems, it has become necessary to extend or modify classical reliability methods in order to assess the various system-related risks. The potential impact of the integrated reliability assessment is more far-reaching than that which is restricted to mechanical components. The various techniques and approaches that have been offered for dealing with one or more aspects of this problem have gradually formed (and are still in the process of contributing to) the area of human reliability analysis (HRA). Their specifics have tended to reflect the backgrounds of the individual researchers, the source or establishment motivating the analysis, and the particular objective.

In this paper, four approaches will be examined that are considered to encompass, both conceptually and methodologically, most existing approaches to HRA. In a broad sense, they consist of: (1) a well-documented technique for quantifying the effects of human error that combines psychological and mathematical frameworks, (2) a loose collection of ideas and conceptual psychological frameworks for analysis of human error, (3) simulation methods, and (4) methods that borrow heavily from classical mathematical reliability techniques. The intention is to critically review these approaches from the standpoints of both their utility and validity. Hopefully, a clearer understanding of the limitations of these approaches will emerge that will form the basis for modifications, extensions, or perhaps combinations of these approaches to HRA, while also providing practitioners with an appreciation for the tradeoffs that will need to be considered in possibly adopting these approaches to industrial situations.


THE THERP TECHNIQUE

The THERP (Technique for Human Error Rate Prediction) technique is generally associated with Alan Swain and his coworkers. This technique reflects the belief that only through quantification can reduced system reliability be attributed to equipment and/or procedural design, and increased system reliability be achieved through the application of ergonomic principles. At a relatively early stage in its development, Swain (1964) noted various problems associated with quantifying human performance, such as: (1) indicating how the individual tasks the human performs are related, (2) reducing the complexity arising from the fact that the individual's behavioral properties implicit to these tasks have distributions of outputs (and response times), (3) the need to statistically characterize situational factors such as stress and to acknowledge the dependencies between individual tasks, and (4) estimating human error probabilities (HEPs) and the accompanying general resistance to quantification, primarily from research-oriented psychologists who find many of the constructs underlying human performance too elusive to justify quantification. Methods for dealing with these concerns have been gradually developed and form the basis to THERP (Swain and Guttmann, 1983), a technique intended for providing probabilistic risk assessments of nuclear power plants and which Henley and Kumamoto (1985) note is "regarded as the most powerful and systematic methodology for the quantification of human reliability" (p. 353). This HRA technique is summarized in Fig. 1. Referring to this figure, note that it is in the first phase that the analyst identifies performance shaping factors (PSFs) associated with information unique to the plant. PSFs can affect the probabilities of human errors and include factors such as task and equipment requirements, job and task instructions, stress, level and type of training, and ergonomic design issues. Qualitative models that relate performance to these factors form the

Fig. 1. Four phases of human reliability analysis (after Swain and Guttmann, 1983):
(1) Familiarization - information gathering; plant visit; review of procedures; information from system analysts.
(2) Qualitative assessment - determine performance requirements; evaluate performance situation; specify performance objectives; identify potential human errors; model human performance.
(3) Quantitative assessment - determine probabilities of human errors; identify factors and interactions affecting human performance; quantify effects of factors and interactions; account for probabilities of recovery from errors; calculate human error contribution to probability of system failure.
(4) Incorporation - perform sensitivity analysis; input results to system analysis.

basis to the quantitative modelling methodology utilized in the third phase. The qualitative assessment in the second phase primarily involves task analysis (Drury et al., 1987), where the operator's actions are identified and broken into tasks and subtasks. The development of a model at this stage allows PSFs to be more accurately represented and both recovery tasks (which involve the use of recovery factors to detect abnormal conditions, e.g., checking another operator's work) and error-likely situations (i.e., the potential for error which implies the categorization of human outputs) to be identified. At the third level, quantitative assessment and probabilistic methods are applied. The basic index of human performance is represented by HEPs which include incorrect performance of an action when required as well as the probability that the task will not be completed correctly within some specified time interval. The HEPs used in Swain and Guttman (1983) were extrapolated from similar tasks derived from various sources. This information is typically used in combination with expert judgment, where the similarities and differences between the tasks are judged in order to determine how error probabilities should be adjusted, al-

113 though informal expert opinions are also sometimes used. The HEPs themselves are classified as basic, conditional, joint, and nominal. The former considers human performance on a task in isolation, i.e., unaffected by any other task, although the estimated probability could be reflecting the effects of PSFs other than dependence. A more general term is the nominal HEP, which gives the probability of a human error when PSFs are not taken into account and represents the estimated HEPs derived as noted above. The joint HEP, the probability of error on all tasks that must be performed to achieve some end result, ultimately determines the success and failure probabilities for the event that has been decomposed into the various tasks. The conditional HEP is defined as the probability of error on a task given success or failure on some other task. The concept of dependence is treated as a PSF and, as is done with other PSFs that are continuous variables, is discretized to a finite number of points which, in this case, represent zero (ZD), low (LD), moderate (MD), high (HD), and complete (CD) levels of dependence. Although these levels of dependence are chosen based on judgment, a set of general guidelines for assessing the level of dependence between tasks (or individuals) is provided to aid in this process. For example, the level of dependence associated with actions that are spatially or temporally in proximity would intuitively be higher. This dependence model only considers positive dependence where the probability of success and failure is increased on the second task if success and failure, respectively, occurred on the first task. Swain and Guttman (1983) argue that by assuming no dependence where negative dependence is in effect, the HEP estimates would, in most cases, be conservative. Based on evidence on the response time performance of skilled persons in typical industrial settings and the rational that performance of skilled persons will tend to cluster towards the lower end of the distribution of either time or error performance, the assumption of a lognormal distribution underlying human performance was adopted. The single point HEP estimates are then regarded as medians of this distribution. Guidelines for establishing uncertainty bounds (UCBs) around these point estimates are presented which form the basis

for the propagation of uncertainty associated with the HEP estimates. Methods are also presented for how these UCBs propagate when the series of tasks whose joint probability of failure is of interest is taken into account. Swain and Guttmann (1983) offer a search scheme to assist the analyst in assigning HEPs through use of various tables provided in their handbook. However, prior to utilizing any such scheme, the relevant human task elements should have already been identified through task analysis and mapped into a type of event tree referred to as a probability tree, where each limb represents either correct or incorrect performance of the task. Its use is illustrated in Fig. 2 for identifying the major tasks and errors of three licensed nuclear reactor operators engaged in actuating the appropriate steam generator feed-and-bleed procedure following a loss of both normal and emergency feed to the steam generators in a pressurized water reactor. Although the JHEP (for one person) usually reflects the joint probability of error in performing two or more tasks, here it represents the joint error probability of the three persons performing any of the individual tasks. Also, in this example, only single point estimates of HEPs are used. However, UCBs may be applied to each estimate and propagated to obtain an estimate for the (total) probability of failure that is itself bounded. The dependence model could also be incorporated, modifying the probability of failure or success on each limb on the basis of whether success or failure occurred on the previous limb. These conditional probabilities would, in this example, actually represent conditional JHEPs and would be implied; i.e., conditional notation would not be indicated on the tree. Propagation of UCBs could also be applied to these conditional probabilities; however, due to the much higher probabilities of error resulting from consideration of dependency, a different procedure is employed for assigning UCBs. Note that recovery factors can be directly represented in event trees (the dotted lines in Fig. 2), although the types of recovery considered in Swain and Guttmann (1983) are mostly limited to cases involving checking another person's work or being alerted to an error by an annunciator. The overall failure probability then involves summing the probabilities of all failure paths in the tree.

Fig. 2. HRA event tree for loss of steam generator feed (after Swain and Guttmann, 1983). The events represented in the tree and their joint HEPs for the three operators are approximately: A, fail to initiate action to annunciator (.00008); B, misdiagnosis (.01); C, fail to initiate action to annunciator (.00016); D, omit step 2.4 (.0016); E, omit step 2.8 (.0016); G, fail to initiate action to annunciator (.00001); H, omit step 2.6 (.0016); K, fail to initiate high-pressure injection (.0001).
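To make the arithmetic concrete, the sketch below (a minimal illustration, not the handbook procedure) sums the probabilities of the failure paths for a short series of tasks represented by single-point HEPs; the task labels and values are hypothetical, and recovery limbs, dependence, and uncertainty bounds are deliberately ignored.

```python
# Minimal sketch of the probability-tree arithmetic described above.
# Task labels and HEP values are hypothetical; recovery limbs,
# dependence, and uncertainty bounds are deliberately ignored.

heps = {"respond to annunciator": 0.00008,   # single-point HEPs for a
        "diagnose the event":     0.01,      # series of tasks, any one of
        "perform step 2.4":       0.0016}    # which can fail the sequence

def total_failure_probability(heps):
    """Sum the probabilities of all failure paths in the tree.

    Each failure path is 'succeed on tasks 1..i-1, then fail on task i';
    summing these paths equals 1 minus the product of the task successes.
    """
    p_fail = 0.0
    p_success_so_far = 1.0
    for hep in heps.values():
        p_fail += p_success_so_far * hep    # the path that fails at this limb
        p_success_so_far *= (1.0 - hep)     # continue down the success limbs
    return p_fail

print(total_failure_probability(heps))      # approximately 0.0117
```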

The scheme provided for assisting the analyst in the use of various tables which provide HEPs for the HRA is actually an iterative process, presented in flow chart form, whose essential elements are represented by decision points. The initial points consider whether screening (assigning very high failure probabilities to each human task and evaluating their effects on system analysis) is required,

whether diagnosis and/or rule-based actions need to be evaluated, and the type of error in the rule-based task(s). PSFs are then considered, including dependencies, which serve to adjust the nominal HEPs. The next major set of adjustments involves the assignment and propagation of UCBs in order to determine the UCBs for the (total) probability of failure and the consideration of

recovery factors, which can be modified by relevant PSFs. Lastly, sensitivity analysis is considered (although it can be performed at any point in the process); this formally comprises the fourth phase of the HRA (Fig. 1). There is no formal approach to this analysis; any approach that allows the various assumptions related to HEP estimation to be evaluated serves this purpose. For example, sensitivity analysis can be used to assess the impact on system analysis of assuming different UCBs for the error terms. Similarly, it can be used to determine whether differences between, e.g., the assumptions of no dependence and high dependence are of any significance with respect to the total system failure probability, or whether changes in the basic HEP (BHEP) significantly affect system failure probability for an assumed level of dependence. In the final component of the fourth phase (Fig. 1), which consists of the input of results to system analysis (referred to as the probabilistic risk assessment), the results of the HRA are combined with other components of the system through either fault tree (Roland and Moriarty, 1983) or event tree methods. According to Swain and Guttmann (1983), it is still not clear at which level in the system model this process of incorporation should occur.
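As a concrete illustration of the kind of comparison such a sensitivity analysis involves (for example, assessing the impact of assuming different UCBs for the error terms), the sketch below recomputes a total failure probability after one HEP estimate is replaced by a pessimistic upper-bound value. The numbers and the simplified independent-series failure model are hypothetical assumptions for illustration, not part of THERP itself.

```python
# Illustrative sensitivity check: how much does the total failure
# probability move when a single HEP estimate is replaced by a
# pessimistic (upper-bound) value?  All values are hypothetical.

def series_failure(heps):
    """Probability that at least one task in an independent series fails."""
    p_success = 1.0
    for hep in heps:
        p_success *= (1.0 - hep)
    return 1.0 - p_success

nominal     = [0.001, 0.003, 0.0016]   # nominal single-point HEPs
pessimistic = [0.001, 0.030, 0.0016]   # second HEP raised to an upper bound

base = series_failure(nominal)
alt  = series_failure(pessimistic)
print(base, alt, alt / base)           # the ratio indicates the sensitivity
```

If the change barely affects the system-level failure probability once the HRA results are combined with the rest of the fault tree, the assumption in question is, for practical purposes, not critical.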

Previous criticism and some counterarguments

In a critique aimed primarily at the work of Swain (1964; Swain and Guttmann, 1983) but also at other workers in the area of HRA espousing THERP (Meister, 1964; Embrey, 1976), Adams (1982) has argued that this approach is fundamentally unsound. His position is based on four issues, summarized as follows. (1) The unit of behavior for which the reliability is to be established is typically elusive and reflects the difficulty in task taxonomy. Four approaches suggested by Fleishman (1975) towards defining units of behavior are noted and dismissed as methods for potentially defining units whose reliability can be determined. One of these is the behavior description approach, which corresponds to the empirically-oriented task analytic procedure that underlies the methods advocated by Swain and Guttmann (1983). According to Adams (1982),

this approach is inadequate because it "has no rules or rationale for defining behavioral categories, and so there is disagreement" (p. 5). If accepted, this view essentially undermines the entire foundation of THERP. (2) Human error requires precise definition if its contribution to system failure is to be meaningful. However, unlike component failure, human failure cannot be operationally defined and therefore utilized as effectively as component failure. (3) In addressing the sequence of behavior or subtasks that occurs in performing some task, the lack of knowledge concerning correlations between the elements of this response sequence makes it difficult to justify the reliability derived as a function of these components. Although this argument was presented prior to the more comprehensive updated treatment of dependency given in Swain and Guttmann (1983), it tends towards being suspect of any attempt to formalize the concept of dependencies in human operations. (4) Since the problem of combining human reliability with equipment reliability must ultimately be confronted, measures of human reliability that are consistent with those describing equipment reliability must be obtained. However, failure rate data cannot usually be obtained for humans, and those measures obtained in laboratory or related environments are considered unusable for reliability evaluation because they were collected for different purposes. Human error data banks are likewise considered unfunctional, and the use of expert judgment does not solve the problem of utilizing the mathematically-based system reliability methods for both equipment reliability, whose evaluation is consistent with these methods, and human reliability, whose evaluation is not. It is doubtful whether issues concerning task taxonomy should deter the utilization of task-analytic approaches such as THERP for HRA. Implicit to the problem of reliability assessment is an understanding of the fundamental elements requiting evaluation. A careful, well-rationalized task analysis is therefore necessary and should not be criticized in and of itself on the ground that conceptual issues related to identifying behavioral units have not been resolved. The issue Adams raises concerning human error has two components: the problem of error-re-

116 covery and the multidimensionality of error, both of which concern defining the state of error. With respect to the latter issue, Adams (1982) objects to Swain's approach "that an error is a failure only when it affects system performance" on the grounds that "component reliabilities determine system reliability, not vice versa" (p. 6). However, this argument fails to acknowledge the degrees of freedom in human interaction with system components and instead expects human and machine failure to be identically defined, which is unrealistic. In contrast to the more general relationships between system reliability and component failure, the complex nature of human response requires an operational definition of human error that is task-specific. The problem of defining human failure therefore becomes a major part of the process of system modelling, noted by Hunns (1982) as the most formidable of three primary areas of difficulty in HRA due to the difficulty in predicting the set of potentially significant event chains associated with human interaction with the system. Implicit to this problem is determining whether a particular event sequence or action is significant, i.e., can impact system reliability. This determination would, in part, be based on the task-analytic procedure utilized. Recognizing and assessing error-recovery is crucial to HRA, and is an area in which further research is needed. The approach taken in Swain and Guttmann (1983) appears too limited and largely ignores its dynamic properties. The potential for recovery will depend on the interactions between the error, the demands associated with the events following the error, and the potential for recovery given the nature of the error and the time-related constraints. Models of the error-recovery process are needed that can focus on these aspects of recovery. From these models, the potential for recovery or attempted recovery, and the impacts of these behaviors, can hopefully be better predicted for various task-specific operations. Interestingly, the issue raised by Adams (1982) concerning the limited existing understanding of dependencies among response elements and its consequences for HRA is the area of HRA considered least problematic by Hunns (1982). His contention is that this problem can potentially be virtually eliminated through careful representation of dependencies between events at the system

modelling stage, the view obviously taken by Swain and Guttmann (1983). These differences in opinions probably reflect contrasting orientations to human performance that emphasize either the molecular or more global aspects of human response sequences. The former is the position apparently taken by Adams and makes it difficult to accept any scheme for determining conditional dependencies between events. The general guidelines provided by Swain and Guttmann (1983) for assessing levels of dependence are intuitively appealing and serve to promote greater emphasis being given to the critical system modelling stage of HRA. Some of the more specific concerns associated with their dependency model will be addressed in the section that follows. The last issue raised by Adams, the synthesis of human reliability and equipment reliability, presumes that for both measures to be combined and evaluated according to conventional reliability methods, human reliability must be defined in a manner consistent with equipment reliability. This view opposes extrapolation of human performance data and the entire premise of human error data banks. To Swain (1984), a useful human performance data bank, especially one that contains relevant data on the cognitive aspects of human behavior, is the solution to the major limitation in the application of HRA. In contrast. Hunns (1982) expresses a lack of confidence in any solution based on the availability of human performance data banks. However, he believes their impracticality stems not so much from the inability of human failure data to be obtained that is consistent with equipment relability data, but from other difficulties characteristic to the human component, namely: (1) the frequent inability to carry out "transparent" performance monitoring which is likely to invalidate performance measures, (2) the inherent lack of "sensitivity" in the data collection process which leads to an inability to detect error recovery, and (3) the absence of a "comprehensive" account of the determining factors associated with the event which, given the dynamic interrelationships that exist between events during task performance, often precludes extraction of this knowledge from the human following task completion. Furthermore, Hunns believes that the data collection process is confounded for reasons that are even more fundamental, the current

117 lack of knowledge on mechanisms of human failure, resulting, in turn, in the lack of any logical basis for obtaining relevant human performance data (the problem which forms the basis to Rasmussen's (1979, 1982b) approach to HRA). Hunns (1982) suggests the use of the method of paired comparisons, a subjective estimation approach aimed at optimally extrapolating human error estimates from a minimally existent data structure. Theoretically, this approach could lead to metrics consistent with those of equipment reliability, yet Adams (1982) also discounts judgmental procedures since they do not offer a solution to the problem of "uniting equipment and human reliability under the umbrella of reliability mathematics so that they can be synthesized into system reliability" (p. 8). It would seem that the only way a statement of this type could be proven is if one can show that the process of obtaining metrics for human reliability (in this case one based on psychological scaling techniques) that provides data consistent in form with equipment reliability results in a significant degree of error in computing system reliability.

A critical examination of THERP

Clearly, the thoroughness implicit to THERP as reflected in the task analysis and other procedures provides, at least in principle, a tool for evaluating overall system design. When increased overall system reliability is required, a strategy is then available for appropriate allocation of developmental and redesign efforts between the system components to meet these new requirements (Peters and Hussman, 1959). From the outset, Swain (1964) appeared strongly committed to the concepts underlying THERP for pragmatic reasons: "the human factors reliability analyst can better command the attention of system planners and designers when he provides quantitative estimates of the effects of human factors related to various design concepts" (p. 698). The methods that ensued became systematically more detailed and corrective in nature. The irony is that while these methods devised by Swain and his coworkers are useful for perhaps establishing the bounds in the assessment of system safety or for enabling scrutiny of system components to be conducted along more logical lines, they are probably

least useful for achieving their intended goal: deriving a valid measure of human reliability which can ultimately be synthesized with other system components. In the discussion below, some of the potential problems associated with these methods are examined.

Human performance models

A general weakness of THERP is its dependency on very fundamental descriptive models relating human performance to PSFs. Approximate estimation of these PSFs is then accomplished through discretizing the continuous variables representing these PSFs in order to derive several point estimates. Often, the models are so elementary and general that their usefulness is questionable. An example is the relationship provided between the probability of correct diagnosis and the amount of time following recognition of the abnormal situation. These, and other models utilized, simply lack the necessary details that can help us understand as well as predict human performance in various critical situations of interest.

Human performance data banks

Swain (1984) contends that only the lack of good data has limited the usefulness of applying HRA methods (such as THERP), and with the current commitments to remedy deficiencies in sources of human performance data "less judgment by analysts will be required, and there should develop over time a much better appreciation of the problems that cause human errors in both normal and abnormal operations, i.e., nuclear power plants, so that remedial and preventive action can be taken" (p. 294). However, it would seem that any appreciation of causality can be more feasibly obtained through model development and empirical validation than through data collection. Perhaps of more interest than the program currently being dedicated towards development of a data bank are the simulator research and psychological scaling research programs (Swain, 1984). Although the latter, intended for developing formal procedures for incorporating expert judgment, is viewed as potentially advantageous due to its being less time consuming and more cost effective, it is the simulator research program that has the capability for overcoming the problems in data

collection noted earlier by Hunns (1982). In addition, simulator research has the potential for providing insights into the human error-recovery process that could lead to more accurate estimates of human error probability as well as uncovering the complex behavioral sequences that could lead to hazardous situations or accidents. Through post-hoc analysis (Rouse and Rouse, 1983), this information could lead to the development of more useful models of human performance.

Dependency

The need for a dependency model is primarily due to the unavailability of actual data that contains conditional probabilities and to the degree of expertise required for direct subjective estimation of these probabilities. Conditional probabilities in THERP are only two-stage; i.e., if A, B, and C represent the errors in a series of three tasks, the JHEP would be P(A) * P(B|A) * P(C|B) instead of P(A) * P(B|A) * P(C|B,A), where the influence of A on C is accounted for. The rationale given for this simplification (Swain and Guttmann, 1983) is that the predominant influence of success or failure, assuming a dependency relationship between tasks, is the immediately preceding task. The event sequence is thus viewed as a first-order Markovian process which, according to Swain and Guttmann, results in a conservative estimate, although the magnitude of the dependencies existing between the tasks would need to be known to be certain. In any case, it results in a distortion of the estimates for success and failure and consequently in an approach that essentially ignores the explosion in uncertainty in error estimation that occurs when one considers that uncertainty is already a factor when PSFs other than dependency (e.g., stress) are incorporated. This explosion in uncertainty becomes more problematic as the number of tasks or events increases, resulting in the corresponding JHEPs becoming increasingly insensitive to the influences of predecessor tasks. Overall, a lack of confidence in the basic error (or success) probabilities upon which the entire approach is ultimately based is indicated. The methodological issues associated with discretization of the continuum of (positive) dependence primarily reduce to development of a dependence model that can accommodate both the

mathematical and behavioral requirements. A nonlinear model was selected where ZD and CD represent the lower (BHEP) and upper (probability = 1) bounds, respectively, and where the conditional probabilities of failure given a preceding failure lie at approximately 0.05, 0.15, and 0.50 of the distance between ZD and CD for LD, MD, and HD, respectively. The model was developed with the intention of making the difference between the conditional probability (due to some level of dependence) for an event and its corresponding BHEP dependent on the magnitude of the BHEP, although the conditional HEPs are approximately constant for BHEPs ≤ 0.01. What is lacking, however, is a sound basis for determining the relationship between the magnitude of the BHEP and the presumed impact of dependence. Also, the accuracy of the BHEP associated with the first task becomes exaggerated in importance since it serves as the anchor in the THERP computation. One must seriously consider whether the guidelines for determining general levels of dependence provided by Swain and Guttmann (1983) represent the logical limit to which representation of dependencies can be achieved in HRA.
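The interpolation just described is easy to write down. The sketch below follows the fractions quoted above (0.05, 0.15, and 0.50 of the distance from ZD to CD); it is an approximation for illustration, not a reproduction of the handbook's tabled equations, and the example BHEPs are hypothetical.

```python
# Conditional HEP under the positive-dependence model described above: the
# conditional probability of failure, given failure on the preceding task,
# lies a fixed fraction of the way from ZD (the BHEP itself) to CD
# (probability = 1).  Fractions follow the text; the handbook's exact
# tabled equations may differ slightly.
DEPENDENCE_FRACTION = {"ZD": 0.0, "LD": 0.05, "MD": 0.15, "HD": 0.50, "CD": 1.0}

def conditional_hep(bhep, level):
    """Failure probability on a task given failure on the preceding task."""
    return bhep + DEPENDENCE_FRACTION[level] * (1.0 - bhep)

# Two-stage (first-order Markovian) treatment of a three-task failure
# sequence, P(A) * P(B|A) * P(C|B), with the influence of A on C ignored.
bhep_a = bhep_b = bhep_c = 0.003                    # hypothetical BHEPs
joint = (bhep_a
         * conditional_hep(bhep_b, "MD")
         * conditional_hep(bhep_c, "MD"))
print(conditional_hep(0.003, "MD"))                 # approximately 0.153
print(joint)
```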

Uncertainty bound (UCB) propagation

The propagation of the uncertainty associated with the single point estimates of human error that represent the limbs on the event tree essentially requires that the distributions specified for each HEP be accurate and representative. In practice, these nominal HEPs are estimated and are assumed to be lognormally distributed (based on the rationalization presented earlier). The estimation of UCBs then reflects the variance estimates from the distributions of these HEPs. Allowances are made for many unknown factors, including the error associated with estimating the nominal HEP, but most especially for the possible PSFs associated with a task (Swain and Guttmann, 1983). The guidelines provided for estimating UCBs for the HEP estimates are typically generous in allowing for the contribution of the various unknown factors to the variability. Consequently, they appear to offer a conservative approach towards obtaining UCBs associated with the final failure probability for the task that is useful when performing a sensitivity analysis. However, in the process they contradict the requirement for accurate

representation of the underlying distributions for the HEPs. This problem is further complicated by the incorporation of the dependency model, so that a modified approach (based on still further assumptions) becomes necessary for obtaining UCBs for conditional HEPs. If, however, all these assumptions are accepted, the method for UCB propagation is essentially sound. Utilizing the mean and variance of the natural logarithm of HEP, ln HEP (easily obtained since ln HEP for a lognormal distribution is normally distributed), the mean and variance of ln Prob(total failure) can ultimately be derived, from which the median and UCBs on the Prob(total failure) can be obtained.
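The mechanics of that propagation can be illustrated for the simplified case in which the quantity of interest is a product of independent HEPs (a joint HEP), so that its natural logarithm is a sum of normal variables. The sketch below makes that simplifying assumption; the medians and error factors are hypothetical, and the 5th and 95th percentile bounds follow the usual lognormal convention.

```python
import math

# Sketch of lognormal uncertainty-bound propagation for the simplified case
# in which the quantity of interest is a product of independent HEPs (a
# joint HEP).  Each HEP is described by its median and an error factor
# EF = (95th percentile) / median; all values are hypothetical.

Z95 = 1.645  # standard-normal deviate for the 5th/95th percentiles

def propagate(medians_and_efs):
    """Return (median, 5th percentile, 95th percentile) of the product."""
    mu_total, var_total = 0.0, 0.0
    for median, ef in medians_and_efs:
        mu_total += math.log(median)       # ln HEP is normally distributed
        sigma = math.log(ef) / Z95         # EF fixes the spread of ln HEP
        var_total += sigma ** 2            # variances of independent terms add
    sigma_total = math.sqrt(var_total)
    median_total = math.exp(mu_total)
    return (median_total,
            median_total * math.exp(-Z95 * sigma_total),
            median_total * math.exp(+Z95 * sigma_total))

print(propagate([(0.003, 3.0), (0.001, 5.0), (0.0016, 10.0)]))
```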

Sensitivity analysis and incorporation

In performing sensitivity analysis, Swain and Guttmann (1983) suggest that various models and PSFs be utilized and then evaluated in order to determine the accuracy of their underlying assumptions. The problem is that there are too many assumptions in operation simultaneously (involving recovery factors, UCBs, PSFs, HEPs, dependency, etc.), and the sensitivity analysis is not likely to be sensitive to this interplay. This layering of assumptions can lead to a result which is nonadditive with respect to the individual uncertainties and without any apparent pattern as to the nature of the nonadditivity. To the extent that there is no logical framework to infer sensitivity, sensitivity analysis is invalid. The effects of interplay in assumptions become even more exaggerated when incorporation (Fig. 1) is considered. At this stage in HRA, the HEP is likely to represent a node in a system fault tree and would be described as the median of a lognormal distribution with upper and lower UCBs representing the 95th and 5th percentiles, respectively. Sensitivity analysis now spans more levels of uncertainty in the sense that the sensitivity of assumptions at the task level is now evaluated in terms of their ability to affect output at the system level. Given the nature of fault trees, a symbology which can reduce all relationships to AND and OR operations, the particular structure of the fault tree could now also be affecting inferences regarding these assumptions.

QUALITATIVE APPROACHES TO HRA

Qualitative approaches to HRA do not necessarily serve as alternatives to quantitative approaches but rather tend to represent a set of loosely organized ideas that advocate understanding the types of errors humans perform and the mechanisms underlying these errors (Norman, 1981; Carnino and Griffin, 1982; Rouse and Rouse, 1983; Reason, 1985). It is primarily the work of Rasmussen (1979, 1982a, 1982b, 1983, 1985, 1986, 1987), however, that bridges the area of human error to systems reliability, especially as it applies to industrial safety systems. The contributions of his work that are of interest to this paper are categorized according to (1) a set of interrelated ideas that address current problems in evaluating human reliability and its consequences for safety analysis, and (2) models of human performance intended for determining mechanisms of failure.

Human reliability and safety analysis

Rasmussen (1979, 1982a) appears to accept the THERP technique but only under limited conditions, and it is from the specifications of these conditions that some of the major needs in the area of HRA can be identified. Although Rasmussen (1979) acknowledges the lack of useful models of human performance and error mechanisms, he, unlike Swain and Guttmann (1983), finds their utilization acceptable only "as long as such human error models are used only for sensitivity analysis, to determine the range of uncertainty due to human influences. If quantitative risk figures are derived, these should be qualified by the assumptions underlying the human error models used, and by a verification of the correspondence of the assumptions to the system which is analyzed" (p. 361). Human error is viewed as a loosely defined construct that can be defined on the basis of specified limits of acceptability which, in direct contrast to Adams' (1982) view, must be a function of the system. Rasmussen's (1979) treatment of human error reflects an appreciation of the difficult trade-offs humans must exercise in response to conflicting demands and to the intrusion of familiar but inappropriate associations in analyzing complex and novel situations. These

situations, along with sporadic errors defined as "extraneous acts with peculiar effects", represent the unanticipated low-probability high-consequence events that result in the complex chain of events that can potentially bypass automatic safety systems and lead to accidents or incidents of the type for which a HRA appears justified. The use of the data bank concept is therefore not likely to be helpful, and only through "systematic functional analysis of realistic scenarios modelling the relevant situations" (p. 364) can the potential for error be identified (Rasmussen, 1979). In his discussion on the potential effects of human error, Rasmussen (1979) clearly distinguishes the search strategies presumably developed through the systematic functional or structural analyses that identify the chain of events stemming from human errors on the specified task from those used to predict the effects of these errors and the human's reaction in a sequence of accidental events. To meet these objectives, Rasmussen suggests the use of heuristic searches guided by "topographical" and "psychological proximity." However, he notes that even these methods may be insufficient when accidents resulting from "minor mishaps or malfunctions in simultaneous human activities which only become risky in case of very specific combinations and timing" (p. 374) are considered. Human actions affecting only one individual component often deserve less emphasis in HRA since their effects may already be included in the component failure data. HRA is considered much more essential for analyzing human activities which cause coupling between otherwise independent events, e.g., acts which initiate a transient while, at the same time, also disturbing the system's protective functions (Rasmussen and Pedersen, 1984; Rasmussen, 1987). The analysis of operator activities causing couplings between events is especially critical in evaluating human performance during emergencies and is likely to require more complex search strategies resembling design algorithms. Rasmussen and Pedersen (1984) suggest that the practical feasibility of potential search strategies for identifying such couplings should be judged on the basis of developments in databases for computer-aided design. In general, Rasmussen's (1979, 1982a) approach can be characterized as top-down, where

the usefulness of the HRA often depends on the structural aspects of the system as well as the functional requirements of the human. This approach emphasizes the need to first define safe system states and then determine the monitoring and protective functions that are either performed by the human or by automatic safety systems. An upper bound on the probability of the set of event sequences leading to the effect that is monitored or protected can then be obtained through a (lower level) reliability analysis of the protective function itself and of the probability of human error. At this point, determining the probability of the human errors that can contribute to the event sequences which lead to incidents or accidents becomes critical. Determining the probability of these errors utilizing HRA techniques such as THERP is, according to Rasmussen, usually feasible for, e.g., maintenance, test, and calibration tasks but not for many other types of tasks, especially those with significant cognitive components or those that are infrequent, as in emergency situations. Various conditions considered prerequisite for obtaining human error probabilities in reliability analysis are specified by Rasmussen (1979), including the existence of the potential for error detection and correction. Error detection typically depends on factors associated with the task interface, whereas error correction mostly depends on the dynamics and linearity of the system properties (Rasmussen, 1985). Stereotyped tasks are differentiated from the more flexible situations in terms of the potential for identifying error recovery. The detection and correction of errors occupies a sufficiently critical place in Rasmussen's (1979, 1982a, 1982b, 1985) approach to HRA to warrant the capability for their employment in task performance through design as a means for circumventing problems of assessing the reliability of complex human performance. The focus consequently becomes shifted more towards evaluating the reliability of the monitoring or error-detection mechanism(s). Overall, Rasmussen's views towards HRA provide insights into the problem that not only more clearly define the scenarios for which quantitative assessment would prove adequate but, for those situations not satisfying these conditions, also indicate the approaches to the problem that are necessary. These views appear to reflect Rasmussen's (1982a) position

Fig. 3. Multi-facet taxonomy for description and analysis of events involving human malfunction (after Rasmussen, 1982b). The facets include the mechanisms of human malfunction; the causes of human malfunction (external events, excessive task demands, operator incapacitation, intrinsic human variability); situation factors (task characteristics, physical environment); performance shaping factors (subjective goals and intentions, mental load and resources, affective factors); the personnel task (e.g., equipment design, procedure design, fabrication, installation, inspection, operation, test and calibration, maintenance and repair, logistics, administration, management); and the internal human malfunction (detection, identification, decision: select goal, select target, select task).

that "the main benefit to draw from an analytical risk assessment will probably not be the quantitative risk figure derived, but the possibility of using the structure and assumptions of the analysis as tools for risk management to secure the proper level of risk during the entire plant life" (p. 149).

Models of human performance

Rasmussen's (1982a, 1982b, 1983, 1986) work on a classification system for events involving human error (Fig. 3) addresses the need for performance models that can potentially predict human error, especially in unfamiliar situations. Flow diagrams serving as guides for utilizing event analysis to identify the internal human malfunction and the mechanisms and causes of human malfunction, respectively, are provided that illustrate the potential usefulness of the proposed taxonomy for (1) relating probabilities to the effects of the malfunction, (2) predicting couplings between multiple errors, and (3) ultimately aiding in identification of relevant event sequences that include human error (Rasmussen, 1982b).

More directly related to the issue of predicting human error in unfamiliar situations are the models of human performance that represent different levels of behavior and which underlie the mechanisms of human malfunction (Fig. 4). These levels consist of (1) automated, skill-based behavior; (2) goal-oriented, rule-based behavior; and (3) goal-controlled, knowledge-based behavior. A detailed analysis of these levels, including the implications (1) of shifting between levels for problem solving and (2) for selecting between quantitative and qualitative modelling, can be found in Rasmussen (1982a). Being task and system independent, this conceptual model can provide a useful framework for both identifying the information needed for HRA and predicting various types of human error. For example, skill-based behavior represents performance based on feedforward control characterized by smooth, integrated, skilled acts. Error data will consequently be a function of the particular human-machine interface configurations and will therefore require provision of information on the topographical layout of this equipment. Manual variability (Fig. 4) resulting in an inappropriate set-up adjustment represents the type of behavior (and potential for error) expected at this level of performance.

Fig. 4. Schematic illustration of the different levels of internal control of human activities. The typical mechanisms of human malfunction are indicated from a study of 200 U.S. Licensee Event Reports (after Rasmussen, 1982a).

Fig. 5. The feasibility of quantification and prediction of human performance depends on the internal control of the performance in question. The methodological characteristics indicated in the figure have been concluded from analysis of legally required event reports from U.S. nuclear power plants, Licensee Event Reports (after Rasmussen, 1982a).

For less familiar situations such as emergencies, performance is knowledge-based, and prediction of human performance, as well as of its consequences, is extremely difficult. In these situations, error mechanisms related to the process of discrimination between appropriate levels of performance are potentially critical in many hazardous situations (Fig. 5). The level for which inappropriate discrimination can prove most dangerous is the knowledge-based level, which in the case of "familiar association short-cut", one of the three categories of inappropriate discrimination identified by Rasmussen (1982a), can occur when an abnormal situation is recognized. However, the information which should be perceived as symbols for the purpose of forming the internal conceptual representations that are the basis for reasoning and planning is instead perceived as signs that form the basis of the rules used for action. At the same time, it is at this level where human flexibility and variability can be most exploited, not only for being able to deal with those unforeseen events that are transparent at the design stage but also for optimizing performance of the overall system. The example above suggests that operators be

trained to first exercise appropriate discriminations between levels of performance and then to focus on developing the appropriate conceptual representation that can hopefully lead to the effective planning necessary to correct the abnormal situation. The issue of training, however, cannot be approached until the problem of allocation of functions between human and machine/software (Price, 1985) is resolved. This factor could, for instance, dictate a specific (rule-based) as opposed to generalized training strategy which could result in the manifestation of a completely different type of inappropriate discrimination (e.g., "stereotype fixation"). Rasmussen's work on levels of performance should therefore also serve to alert the analyst to the subtle yet potentially critical impact training can have on HRA.

SIMULATION APPROACHES TO HRA

The application of digital simulation techniques to HRA has been primarily associated with Siegel and his coworkers (Siegel and Wall, 1969; Siegel et al., 1974, 1975). Although acknowledging that problems exist in the approach of Siegel et al. (1974) with respect to defining the "unit for analysis," Adams (1982) believes that it could accommodate

the problem concerning the nature of human error: "There is no reason why error correction and the multidimensionality of error could not be incorporated into a simulation" (p. 8). Adams supports this method primarily because it does not attempt to combine measures of human and equipment reliability in a way that forces the former into an analytical scheme that is more consistent with the latter, but instead provides a synthesis resulting in an arbitrary index that is not a probability metric. In Siegel et al. (1974), the combining of equipment and human reliability measures to obtain a system reliability level was performed in order that the latter be between 0.7 and 1.0. By being derived on the basis of satisfying a probability criterion, the issue of how human and equipment reliability are combined becomes almost meaningless; it is simply a means towards the end of defining system reliability. Adams (1982) refers to their synthesis as "One of the most interesting features of their work on reliability" (p. 8), an assessment which seems to contradict the arbitrariness that forms its basis. Similarly, no basis exists for assuming that this technique has any special ability to solve problems concerning the nature of human error that other techniques do not have. A closer analysis of the methods underlying the work of Siegel et al. will serve to illustrate both the potential usefulness and limitations of computer simulation techniques for HRA.

The modelling issue

The computer simulation technique utilized by Siegel et al. (1974) attempted to model crews on surface ships consisting of teams of 4-20 members for the purpose of generating systems reliability and systems availability information based on integrated human and equipment performance. Task analysis, along with information on equipment, personnel, emergencies, etc., provided the input data according to the computer model's logic. The selection and utilization of variables for use in the model proceeds as follows: "1. From the principal features of the model and its known goals, select one or more theories/approaches of greatest importance, e.g., small group theory, environmental considerations, extent of

importance of equipment performance. 2. With these guidelines, select specific variables on the basis of literature studies, prior model results, a n d / o r best judgment. 3. Identify those factors on which selected variables should depend, i.e., the relationship among variables. 4. Extract from the literature the qualitative analytical expressions which link the variables one to another fitting trend lines to known or estimated relationships. 5. Scale the variables and expressions to achieve consistency throughout the model." (p. 10) Based on the above, the model simulates the attributes (characteristics) of individuals and the equipment they operate. Attributes are not only altered by the events that occur during the simulated mission but influence these events as well. An overview of the flow logic underlying the model is illustrated in Figs. 6 and 7. Variables representing the physical capability and short-term peak (physical) workload requirements of an individual are a function of factors such as normalized body weight and energy consumption and are relatively easily rationalized. Values representing the levels of parameters such as aspiration, fatigue, stress, and motion sickness of each individual are, however, more difficult to derive. Siegel et al. (1974), however, were apparently of the opinion that sufficient empirical data existed to generate the precise distributions from which individual values could be sampled. Stress was operationally defined as the "ratio of the amount of time needed for completion of the current event to the amount of time available for completing the event" (p. 44). In terms of simulating this factor, this definition is obviously computationally convenient. However, by viewing stress strictly as a function of time (i.e., if the event has no time limit, stress does not exist), the context of the situation as well as many other considerations are largely ignored. Furthermore, the inverted-U relationship relating performance to arousal is taken literally when, in reality, the peak of the inverted-U occurs at different levels of arousal for different tasks, greatly limiting practical use of this theory. Therefore, while the logic underlying many simulation models can often be treated as independent of its elements or variables since these tend to be well-defined, this is not the case with

many of the variables used to assess human reliability. The logic can actually become confounded by the assumptions made concerning the human, since the logical flow of events will be a function of the variables affected by these assumptions. The sequence of actions required to successfully repair equipment when repair events are to be performed is another illustration of this problem. These actions are based on first identifying the major class of repair (electrical, mechanical, etc.) expected to be necessary in the categorization of the equipment involved. Factor types based on previously reported factor-analytic methods in electronic repair (and extended to the other major classes of repair) are then utilized. This approach ignores much of the complexity of human performance, e.g., the dependencies and feedback mechanisms, and treats the integration of performance elements in a manner usually reserved for mechanical components. It should be emphasized, however, that this problem is not implicit to the simulation method but to the modelling process underlying it. More recently, Siegel and his coworkers have developed a computer simulation model called MAPPS (Maintenance Personnel Performance Simulation) for providing maintenance performance reliability information in nuclear power plants (Siegel et al., 1984a, 1984b, 1985). This model has the capability for considering preventive as well as corrective maintenance tasks; personnel characteristics reflecting various levels of decision-making and perceptual-motor skill, aspiration, and stress; and task and subtask characteristics such as noise, temperature, and radiation levels, shift change, and decision-making and perceptual-motor requirements. It has, however, been developed along lines similar to its predecessor models and is therefore susceptible to some of the same criticisms. Nevertheless, MAPPS still appears capable of providing an insightful approach towards analysis of various maintenance activities. Consequently, it is potentially useful for serving as a basis for evaluating trade-offs in design and for augmenting analyses involving probabilistic risk assessments.
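To illustrate in miniature how such simulation models generate reliability estimates from interacting variables, the sketch below runs repeated trials in which an operator's task time is sampled, the time-based stress ratio quoted earlier (time needed divided by time available) is computed, and the probability of success is degraded under high time pressure. This is a toy model under stated assumptions, not Siegel's logic or MAPPS; the distributions, threshold, and degradation rule are purely illustrative.

```python
import random

# Toy Monte Carlo sketch of the simulation approach discussed above:
# sample task time, compute the operational stress ratio (time needed /
# time available), degrade success probability under high stress, and
# estimate reliability as the fraction of successful trials.  The
# distributions, threshold, and degradation rule are illustrative
# assumptions, not Siegel's model or MAPPS.

def simulate_reliability(n_trials=100_000,
                         mean_time_needed=80.0, sd_time_needed=15.0,
                         time_available=100.0,
                         base_success=0.995, stress_penalty=0.05,
                         seed=1):
    rng = random.Random(seed)
    successes = 0
    for _ in range(n_trials):
        time_needed = max(1.0, rng.gauss(mean_time_needed, sd_time_needed))
        stress = time_needed / time_available      # operational stress ratio
        p_success = base_success
        if stress > 1.0:                           # no time left: the event fails
            p_success = 0.0
        elif stress > 0.8:                         # high time pressure
            p_success -= stress_penalty * (stress - 0.8) / 0.2
        if rng.random() < p_success:
            successes += 1
    return successes / n_trials

print(simulate_reliability())
```

Repeating such trials is what allows a simulation to produce an empirical distribution of performance rather than a single assumed one, which is the property Meister emphasizes below.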

Benefits of simulation methods

Meister (1984) differentiates simulation methods from other methods of HRA on the basis of

their ability to avoid the necessity of equating human error with failure in task performance. This immunity apparently diminishes the extent to which simulation methods need to account for dependencies: "the dependency relationships occur as a result of the exercise of the model parameters" (p. 326). Compared to THERP, simulation methods are considered by Meister to be potentially more powerful. One reason is that the human's performance can be reproduced over a number of trials, theoretically allowing the distributions surrounding the performance estimates to be quantified, in contrast to THERP, which a priori assumes (lognormal) distributional properties. More compelling is a property fundamental to the simulation technique: its ability to consider the complex interaction of a large number of variables. This characteristic provides the simulation method with the capability of addressing problems such as those involving the potential for human error in takeover from automatic systems. In this situation, Sheridan (1981) notes, "It is far from clear when to let automatic systems have their way and when the operator(s) should take control themselves.... Interference with the automatic functions of the safety system was one contributive factor in the Three Mile Island accident. Machines are programmed with explicit criteria for when to take over and when not. Should that be done with operators?" (p. 23). Clearly, a simulation method could better evaluate the different criteria that might be in effect for takeover, since the decision to intervene could conceivably be a function of variables related to the configuration of redundant safety systems and of time-dependent factors associated with the task.

The interactive simulation model discussed by Siegel et al. (1975) is especially relevant to the issue of construct validity that arises in simulation models employed for HRA from the manner in which human variables are incorporated. In this approach, activities for which an adequate database is available (e.g., perceptual-motor activities) are allocated to the simulation, while decision-related activities whose variability would necessarily undermine the accuracy of the computer model are allocated to the human.

Sharit (1985) employed a similar approach for evaluating human capabilities and limitations in the control of a computerized manufacturing system. Notice should also be given to the recent developments in SAINT (Systems Analysis of Integrated Networks of Tasks), a simulation language designed specifically for modelling and evaluating the human operator of a system. In recent years, a microcomputer version of SAINT has been developed (Laughery, 1984) that incorporates more sophisticated mathematical relationships linking human performance (based on a set of 16 skill categories) to various variables known to affect the performance of skills. These relationships exist in modular form and are therefore easily replaced or modified to accommodate new findings. This approach certainly deserves consideration when factors such as cost and time prohibit the development of specialized simulation tools.

Fig. 6. Flow logic representing the initiation of the simulation model (after Siegel et al., 1974). [The original flow chart depicts reading in mission, personnel, and equipment data; assigning each crew member a primary and secondary personnel type; and determining initial values for each crew member on variables such as physical capability, competence in primary and secondary specialties, pace, aspiration level, hours since last sleep, physical incapacity, psychological stress threshold, average daily and short-term physical workload, and fatigue.]
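The two properties emphasized above, the generation of empirical performance distributions from repeated trials and the use of modular relationships linking operator attributes to performance, can be illustrated with a deliberately simplified Monte Carlo sketch. Everything in it (the moderator functions, parameter values, and error criterion) is an assumption introduced here for illustration; it is not drawn from THERP, the Siegel models, or SAINT.

```python
# Illustrative Monte Carlo sketch: repeated trials of a task whose
# completion time is moderated by sampled operator attributes, yielding
# an empirical performance distribution rather than an assumed one.
import random
import statistics

def moderated_time(base_time: float, stress: float, fatigue: float) -> float:
    """Task completion time inflated by simple, modular stress and fatigue
    moderators; each moderator could be swapped out as new findings emerge."""
    return base_time * (1.0 + 0.4 * stress) * (1.0 + 0.2 * fatigue)

def run_trials(n: int, seed: int = 1) -> list:
    """Repeated trials with operator attributes sampled anew each time."""
    rng = random.Random(seed)
    times = []
    for _ in range(n):
        base = rng.gauss(45.0, 8.0)       # nominal task time in seconds
        stress = rng.uniform(0.0, 1.0)    # sampled operator attributes
        fatigue = rng.uniform(0.0, 1.0)
        times.append(moderated_time(base, stress, fatigue))
    return times

if __name__ == "__main__":
    times = run_trials(5000)
    time_limit = 60.0                     # "error" = exceeding the time limit
    error_prob = sum(t > time_limit for t in times) / len(times)
    print(f"mean completion time: {statistics.mean(times):.1f} s")
    print(f"empirical error probability: {error_prob:.3f}")
```

Because each moderator is an isolated function, it can be replaced without disturbing the remaining logic, which is essentially the modularity argument made for the microcomputer version of SAINT.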

METHODS BASED ON CLASSICAL RELIABILITY THEORY

Since probabilistic risk assessment tools such as the fault tree method are more easily adaptable to

static as opposed to dynamic reliability models, it is often more efficient to discretize tasks in the continuous time domain prior to their incorporation into a systems analysis. Alternatively, and especially in the case of tasks in the continuous time domain such as vigilance, monitoring, and tracking, it would seem reasonable to approach HRA in accordance with classical reliability theory (Kapur and Lamberson, 1977). Operating under this premise, Askren and Regulinski (1969, 1971) derived a general mathematical model of human performance reliability that closely conforms to classical reliability methods and therefore could easily be combined with equipment reliability. The resulting human reliability function is

$$R_h(t) = \exp\!\left[-\int_0^{t} e(x)\,dx\right]$$

where e(t) is the instantaneous error rate, analogous to the hazard function in classical reliability theory (and may be either a constant or a time-varying function), and Rh(t) represents the human reliability function. The prediction of human reliability is obtained directly from this probabilistic model.
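As a brief worked illustration (the symbols here are introduced for this example and are not drawn from the original studies): a constant error rate reduces the model to the familiar exponential form, with mean time to error 1/\lambda, whereas a Weibull error rate yields the form that, as discussed below, Askren and Regulinski ultimately found to fit their data best:

$$e(t) = \lambda \;\Rightarrow\; R_h(t) = e^{-\lambda t}; \qquad e(t) = \frac{\beta}{\theta}\left(\frac{t}{\theta}\right)^{\beta-1} \;\Rightarrow\; R_h(t) = \exp\!\left[-\left(\frac{t}{\theta}\right)^{\beta}\right].$$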

Fig. 7. Flow logic representing an arbitrary point in the simulation model (after Siegel et al., 1974). [The original flow chart depicts calculation of the group stress threshold; psychological stress for the group as a function of time available until the desired end of the event, average performance time, and mental load; working pace as a function of individual members' inherent pace, physical capabilities, and aspiration versus performance; performance time of the group on the event; equipment uptime, downtime, and performance level; event completion time; hours since sleep, time worked, and work done for each man in the group; fatigue due to increased time since last sleep; and group performance adequacy and efficiency as a function of competence, psychological stress, aspiration, and physical capability.]

Analogies can also be drawn between human reliability and the equipment concepts of mean-time-to-failure (MTTF), mean-time-to-first-failure (MTTFF), and mean-time-between-failures (MTBF), each of which reflects the mean of a random variable characterized by some probability density function. For example, MTTF could describe errors that would lead to failure of some system function. MTTFF and MTBF could describe errors whose effects could be corrected, the former characterizing errors promoting hazardous conditions and the latter characterizing errors that are less critical. Utilizing a vigilance task, the MTTFF was modelled by Askren and Regulinski (1969) through the use of graphical methods that isolated the underlying density function; the derivation of Rh(t) then follows. Results indicated that the error rate of the data was not constant, implying that the often assumed exponential distribution could not be appropriate. The normal distribution was also rejected, and the Weibull distribution was

found to best fit the data. In a second study (Askren and Regulinski, 1971), human reliability on a two-axis tracking task was modelled with the intention of exploring not only the mean-time-to-first-human-error (MTTFHE) but also the mean-time-between-human-errors (MTBHE). An attempt was also made to similarly model human error correction performance, where CR(t), the instantaneous task correction rate, is analogous to e(t), and Ch(t), the correctability function representing the probability that a self-generated task error will be corrected in time t, is analogous to Rh(t) (Dhillon and Singh, 1981). The solution to Ch(t) requires determining the underlying probability density function for the relevant error correction data. For the ten distributions examined, results indicated that the Weibull distribution best fit the MTTFHE data while the lognormal distribution best fit the MTBHE and error correction data. Overall,

the normal distribution provided the worst fit. From these studies, several important considerations for HRA emerge. First, it does not appear that a generic distribution such as the lognormal proposed by Swain and Guttmann (1983) can be assumed to underlie all human performance. Distinctions may need to be made on the basis of the MTTFHE, MTBHE, etc., error types, with potentially different implications for the probability of system failure. Second, the explicit recognition of error recovery in the modelling of human reliability raises the issue of whether task domains exist for which models of the error-recovery process can be applied. Tasks in the continuous time domain appear to be the best candidates since, for many of these tasks, errors are operationally defined and the elusive qualities of error-correcting behavior alluded to by Adams (1982) are of less concern. Finally, the work of Askren and Regulinski bears on the issue of synthesizing human reliability and equipment reliability raised by Adams (1982), who acknowledged that their work represented one of the exceptions where human subjects are "run in a system-relevant behavior sequence for the purpose of obtaining failure rates" (p. 7).

The same hypothetical and empirical foundations that underlie the behavioral literature should also be capable of being applied to certain tasks in the continuous time domain for the purpose of performing an HRA. An example is the monitoring of dynamic functions from a complex multidimensional display, where the correlation among variables could influence the speed-accuracy trade-off the operator must consider when monitoring (Moray, 1981). Many of the critical elements of the problem are quantifiable, such as the frequency characteristics and time-variant entropy (uncertainty) functions associated with each variable and the correlation between variables. Errors can be operationally defined as the development of an abnormal state in one or more variables (representing the development of hazardous conditions), and the reliability function for the human can then be derived, at least in theory. The methodological procedure upon which derivation of the human reliability function is based would, however, obviously be expected to differ significantly from the approach used for equipment reliability.

In the Markov modelling approach to HRA by

Dhillon (1986), human and equipment reliabilities are more directly synthesized. State-space models are developed that allow the separate human and equipment reliability components to be identified in terms of their respective contributions to system reliability. Although constant (random) failure rates are generally assumed for the human, such assumptions affect only the computational aspects of the problem and can be altered. More critical, however, is the need to integrate more realistic descriptions of human behavior into these models; currently, they do not adequately distinguish the human and hardware components.

The complex nature of human performance requires caution when drawing analogies between human and equipment reliability in continuous time task domains. For example, although a bathtub error rate curve for the human, analogous to the bathtub hazard rate curve (Kapur and Lamberson, 1977) in classical reliability theory, is intuitively appealing, its usefulness even as a construct is questionable if only because it implies that the human exhibits behavioral patterns analogous to those of hardware components. Classical reliability theory approaches to HRA should therefore be used not as explanatory devices but as a means of achieving better representation of the dynamic element in human performance (including error-recovery properties), and subsequently of incorporating human reliability more compatibly into a systems analysis.

There do appear to be classical dynamic reliability paradigms with frameworks sufficiently sound to support formulation of human reliability performance models in the continuous time domain. One example is the standby redundant system (Kapur and Lamberson, 1977), a form of parallel system in which the standby component is not activated unless the on-line component fails. The switch can be an automatic sensing device or a human who replaces the on-line component with one that is on standby. Viewing the human switch as a supervisory controller of a complex system, the components could represent functions that need to be activated when others have been exhausted, in the sense of sustaining effective system operations. Note that imperfect switching implies either human unavailability or human error in performing the replacement activity. Time-dependent functions characterizing human availability, integrated with information-processing and component or subtask reliability models, could then conceivably form the basis for a model of system or subsystem reliability.
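As a deliberately simplified worked case of the standby paradigm (the symbols are illustrative and not taken from the sources cited): with two identical components of constant failure rate \lambda and a human "switch" who completes the replacement successfully with probability p_s, the classical result (see, e.g., Kapur and Lamberson, 1977) is

$$R_{\mathrm{sys}}(t) = e^{-\lambda t} + p_s\,\lambda t\,e^{-\lambda t}, \qquad \mathrm{MTTF}_{\mathrm{sys}} = \int_0^{\infty} R_{\mathrm{sys}}(t)\,dt = \frac{1+p_s}{\lambda}.$$

The factor p_s is exactly where a human reliability estimate would enter the system model; replacing it with a time-dependent human availability or correctability function yields the more general formulation suggested above.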

CONCLUSIONS

A summary of some of the research needs associated with the approaches reviewed in this paper is presented below.

(1) A much more thorough approach to human performance modelling is needed, especially one that recognizes human problem-solving and decision-making strategies, the effects of time estimation, and the information-processing characteristics unique to dangerous or extremely critical situations. Not only would such models enable more useful quantitative metrics to be obtained, but they would also allow more optimal use of event trees as compared to fault trees (Paté-Cornell, 1984) in HRA.

(2) Despite being considered costly and lacking in timeliness, simulators appear to offer the best means for exploring the dynamic aspects of human performance, including the error-recovery process. They could also be used to evaluate human and system reliability under different strategies for allocating decision-making responsibility between human and computer. The potential for development and validation of human performance models that emphasize dynamic factors can be even further realized when the use of simulators is combined with conventional simulation languages that exercise control over the scheduling, execution, and statistical analysis of events.

(3) The development of mapping functions which predict the impact of the human component when incorporated at specified points into the system model (e.g., a fault tree) could increase our understanding of the effects of combining human and equipment reliabilities. Towards this end, fuzzy set techniques (Wang et al., 1986) could prove useful by allowing the effects of various human performance attributes to be evaluated in combination with various system configurations.

(4) Guidelines are needed for establishing whether quantitative human reliability assessment, including that which might be based on classical reliability theory, is feasible. If not, the balance between quantitative and qualitative assessment that is most logical for the situation needs to be specified.

(5) An emphasis is needed on unifying HRA with risk analysis, where the latter considers the probabilities of the various consequences resulting from the accident or hazardous conditions. Implicit in this approach is the development of search strategies aimed at identifying accident sequences whose complex spatial and temporal structures make them intractable to conventional analytic techniques such as fault trees. Along these lines, the use of search techniques associated with the area of artificial intelligence (Waterman and Hayes-Roth, 1978) is suggested.

(6) The issue of generalized versus specific (rule-based) training needs to be evaluated for various types of complex systems in terms of its impact on human and system reliability.

(7) Paradigms in the area of classical reliability theory that are potentially useful in HRA need to be identified, along with the necessary modifications.

Ultimately, research is encouraged not only for strengthening the various approaches to HRA, but also for developing a taxonomy of system-related scenarios that would serve to identify one or more of these approaches as most appropriate for HRA.

REFERENCES

Adams, J.A., 1982. Issues in human reliability. Hum. Factors, 24: 1-10.
Askren, B.W. and Regulinski, T.L., 1969. Mathematical modeling of human performance errors for reliability analysis of systems. Technical Report AMRL-TR-68-93, Aerospace Medical Research Laboratory, Wright-Patterson Air Force Base, OH.
Askren, B.W. and Regulinski, T.L., 1971. Quantifying human performance reliability. Technical Report AFHRL-TR-71-22, Air Force Systems Command, Brooks Air Force Base, TX.
Carnino, A. and Griffon, M., 1982. Causes of human error. In: A.E. Green (Ed.), High Risk Safety Technology. John Wiley & Sons, London, pp. 171-179.
Dhillon, B.S. and Singh, C., 1981. Engineering Reliability: New Techniques and Applications. John Wiley & Sons, New York.
Dhillon, B.S., 1986. Human Reliability with Human Factors. Pergamon Press, New York.
Drury, C.G., Paramore, B., Van Cott, H.P., Grey, S.M. and Corlett, E.N., 1987. Task analysis. In: G. Salvendy (Ed.), Handbook of Human Factors. John Wiley & Sons, New York, pp. 370-401.
Embrey, D.E., 1976. Human reliability in complex systems: an overview. NCSR Report R10, National Centre of Systems Reliability, Warrington, Great Britain.
Fleishman, E.A., 1975. Toward a taxonomy of human performance. Amer. Psychol., 30: 1127-1149.
Henley, E.J. and Kumamoto, H., 1985. Designing for Reliability and Safety Control. Prentice-Hall, Englewood Cliffs, NJ.
Hunns, D.M., 1982. Discussion around a human factors data base. An interim solution: the method of paired comparisons. In: A.E. Green (Ed.), High Risk Safety Technology. John Wiley & Sons, New York, pp. 181-215.
Kapur, K.C. and Lamberson, L.R., 1977. Reliability in Engineering Design. John Wiley & Sons, New York.
Laughery, K.R., Jr., 1984. Computer modeling of human operators in systems. In: Proc. 1984 International Conference on Occupational Ergonomics, Toronto. Human Factors Conference, Inc., Rexdale, Ontario, Canada, pp. 26-34.
Meister, D., 1964. Methods of predicting human reliability in man-machine systems. Hum. Factors, 6: 621-646.
Meister, D., 1984. Alternate approaches to human reliability analysis. In: R.A. Waller and V.T. Covello (Eds.), Low-Probability/High-Consequence Risk Analysis. Plenum Press, New York, pp. 319-333.
Moray, N., 1981. The role of attention in the detection of errors and the diagnosis of failures in man-machine systems. In: J. Rasmussen and W.B. Rouse (Eds.), Human Detection and Diagnosis of System Failures. Plenum Press, London, pp. 185-198.
Norman, D.A., 1981. Categorization of action slips. Psychol. Rev., 88: 1-15.
Paté-Cornell, M.E., 1984. Fault-trees vs. event trees in reliability analysis. Risk Anal., 4: 177-186.
Peters, G.A. and Hussman, T.A., 1959. Human factors in systems reliability. Hum. Factors, 1: 38-42.
Price, H.E., 1985. The allocation of functions in systems. Hum. Factors, 27: 33-45.
Rasmussen, J., 1979. Notes on human error analysis and prediction. In: G. Apostolakis and G. Volta (Eds.), Synthesis and Analysis Methods for Safety and Reliability Studies. Plenum Press, London, pp. 357-389.
Rasmussen, J., 1982a. Human reliability in risk analysis. In: A.E. Green (Ed.), High Risk Safety Technology. John Wiley & Sons, London, pp. 143-170.
Rasmussen, J., 1982b. Human errors. A taxonomy for describing human malfunction in industrial installations. J. Occup. Accid., 4: 311-333.
Rasmussen, J., 1983. Skills, rules, and knowledge: signals, signs, and symbols, and other distinctions in human performance models. IEEE Trans. Syst. Man Cybern., SMC-13: 257-266.
Rasmussen, J. and Pedersen, O.M., 1984. Human factors in probabilistic risk analysis and in risk management. In: Operational Safety of Nuclear Power Plants, Vol. 1. IAEA, Vienna, pp. 181-194.
Rasmussen, J., 1985. Trends in human reliability analysis. Ergonomics, 28: 1185-1195.
Rasmussen, J., 1986. Information Processing and Human-Machine Interaction. North-Holland, Amsterdam.
Rasmussen, J., 1987. Approaches to the control of the effects of human error on chemical plant safety. In: Proc. 1987 International Symposium on Preventing Major Chemical Accidents.
Reason, J., 1985. Slips and mistakes: two distinct classes of human error? In: D.J. Oborne (Ed.), Contemporary Ergonomics 1985: Proc. Ergonomics Society's 1985 Conference. Taylor and Francis, London, pp. 103-110.
Roland, H.E. and Moriarty, B., 1983. System Safety Engineering and Management. John Wiley & Sons, New York.
Rouse, W.B. and Rouse, S.H., 1983. Analysis and classification of human error. IEEE Trans. Syst. Man Cybern., SMC-13: 539-549.
Sharit, J., 1985. Supervisory control of a flexible manufacturing system. Hum. Factors, 27: 47-59.
Sheridan, T.B., 1981. Understanding human error and aiding human diagnostic behavior in nuclear power plants. In: J. Rasmussen and W.B. Rouse (Eds.), Human Detection and Diagnosis of System Failures. Plenum Press, London, pp. 19-35.
Siegel, A.I. and Wolf, J.J., 1969. Man-Machine Simulation Models. John Wiley & Sons, New York.
Siegel, A.I., Wolf, J.J. and Lautman, M.R., 1974. A model for predicting integrated man-machine system reliability: model logic and description. Technical Report AD-A009 814, Applied Psychological Services, Wayne, PA.
Siegel, A.I., Wolf, J.J. and Lautman, M.R., 1975. A family of models for measuring human reliability. In: Proc. 1975 Annual Reliability and Maintainability Symposium. Institute of Electrical and Electronics Engineers, New York, pp. 110-115.
Siegel, A.I., Bartter, W.D., Wolf, J.J., Knee, H.E. and Haas, P.M., 1984a. Maintenance personnel performance simulation (MAPPS) model: summary description. NUREG/CR-3626, Vol. 1, ORNL/TM-9041/V1.
Siegel, A.I., Bartter, W.D., Wolf, J.J. and Knee, H.E., 1984b. Maintenance personnel performance simulation (MAPPS) model: description of model content, structure, and sensitivity testing. NUREG/CR-3626, Vol. 2, ORNL/TM-9041/V2.
Siegel, A.I., Wolf, J.J., Bartter, W.D., Madden, E.G. and Kopstein, F.F., 1985. Maintenance personnel performance simulation (MAPPS) model: field evaluation/validation. NUREG/CR-4104, ORNL/TM-9503.
Swain, A.D., 1964. Some problems in the measurement of human performance in man-machine systems. Hum. Factors, 6: 687-700.
Swain, A.D. and Guttmann, H.E., 1983. Handbook of Human Reliability Analysis with Emphasis on Nuclear Power Plant Applications. NUREG/CR-1278, US Nuclear Regulatory Commission.
Swain, A.D., 1984. Qualifications of human error in LP/HC risk analysis. In: R.A. Waller and V.T. Covello (Eds.), Low-Probability/High-Consequence Risk Analysis. Plenum Press, New York, pp. 293-295.
Wang, J-M., Sharit, J. and Drury, C.G., 1986. An application of fuzzy set theory for evaluation of human performance on an inspection task. In: W. Karwowski and A. Mital (Eds.), Applications of Fuzzy Set Theory in Human Factors. Elsevier, Amsterdam, pp. 257-268.
Waterman, D.A. and Hayes-Roth, F., 1978. Pattern-Directed Inference Systems. Academic Press, New York.