Collection of offshore human error probability data

Collection of offshore human error probability data

Reliability Engineering and @stem Safety 61 (1998) 77-93 © 1998 Elsevier Science Limited All rights reserved. Printed in Northern Ireland PII:S0951-83...

1MB Sizes 64 Downloads 135 Views

Reliability Engineering and @stem Safety 61 (1998) 77-93 © 1998 Elsevier Science Limited All rights reserved. Printed in Northern Ireland PII:S0951-8320(97)00064-1

ELSEVIER

0951-8320/98/$19.00

Collection of offshore human error probability data Gurpreet Basra & Barry Kirwan* Industrial Ergonomics Group, University of Birmingham, School of Manufacturing and Mechanical Engineering, Edgbaston, Birmingham, UK (Received 1 December 1996; revised 1 May 1997; accepted 7 May 1997)

Accidents such as Piper Alpha have increased concern about the effects of human errors in complex systems. Such accidents can in theory be predicted and prevented by risk assessment, and in particular human reliability assessment (HRA), but HRA ideally requires qualitative and quantitative human error data. A research initiative at the University of Birmingham led to the development of CORE-DATA, a Computerised Human Error Data Base. This system currently contains a reasonably large number of human error data points, collected from a variety of mainly nuclearpower related sources. This article outlines a recent offshore data collection study, concerned with collecting lifeboat evacuation data. Data collection methods are outlined and a selection of human error probabilities generated as a result of the study are provided. These data give insights into the type of errors and human failure rates that could be utilised to support offshore risk analyses. © 1998 Elsevier Science Limited.

of an error, as well as its consequences, determines whether an error is seen as important and actions are taken to prevent it or mitigate its effects, or whether it is ignored as having negligible risk. If the quantification is wrong, the resultant risk assessment will be wrong, and the offshore system itself may therefore be more unsafe than predicted. Human error probabilities (HEPs) are usually determined either by analysts' judgement, or by usage of one or more H R A quantification techniques, such as Human Error Assessment and Reduction Technique (HEART; Williams9), Success Likelihood Index Methodology (SLIM; E m b r e y et al. 1°) or the Technique for Human Error Rate Prediction (THERP; Swain and GuttmannS). Whilst some o f these techniques were shown to have reasonable accuracy (Comer et al.ll; Kirwanl2; Kirwan et al. 13), there is still concern over their precision. In the offshore world this uncertainty is compounded because the techniques were developed largely for nuclear power applications, and much of the data which have either helped develop the techniques or validate them, is nuclear power-related in origin. The degree to which such techniques, and such human error data as do exist, translate into the world of offshore systems, is unclear. Ideally therefore, the offshore risk assessment world would have at least some of its own data on human error probabilities, to support risk assessment practice.

1 INTRODUCTION

1.1 Background Offshore systems such as drilling and production platforms place operators in a hazardous and complex system, in a fairly unforgiving environment. Such systems are prone to human error, and the results of human error can be catastrophic (e.g. the Piper A l p h a disaster in 1988; Cullenl). Risk assessment (Henley and Kumamot02; Green3; Cox and Tait 4) of such offshore systems must quantitatively take account of all the potential risks for such systems, and therefore must address the problem of human error. The most appropriate way to do this is via the quantitative approach of Human Reliability Assessment (HRA; see Swain and GuttmannS; Dhillon6; Dougherty and Fragola7; Kirwan8). H R A identifies what errors could occur in a system, how likely they would be, and how to reduce or avoid entirely their likelihood of occurrence. This paper is concerned primarily with the middle part of the H R A process, the quantification of the likelihoods or probabilities of human error. This is a crucial phase of HRA, since the probability *Now at National Air Traffic Services, Bournemouth Airport, Christchurch, Bournemouth, BH23 6DF, UK. 77

G. Basra and B. Kirwan

78

Table 1. Key CORE-DATA areas

I. 2. 3. 4. 5.

Task description: a general description of the task being completed by the operator and the operating conditions, e.g. launch lifeboat. External error modes: the observable manifestation of the task, e.g. fail to disengage hooks. Psychological error mechanisms: the operators internal failure mode, e.g. freezing. Performance shaping factors: concern the human factors in the operating environment, e.g. stress, training, etc. Error opportunities: how many times the task was completed and the number of times the operator failed to achieve the correct task outcome, e.g. 1 failure in 50 launches. Nominal HEP: the mean human error probability for error in a given task, e.g. 0.02. Upper bound: the 95th percentile (if derivable). Lower bound: the 5th percentile (if derivable). Data Pedigree: categorising the data (simulator, real expert judgement data etc.). Industry: categorises where the data originated from (nuclear, petro-chemical, offshore, etc.). Task/equipment: categorises the task or the equipment in use when the incident/accident occurred. Human action: categorises what cognitive process the operator was engaged in prior to the incident/accident.

6. 7. 8. 9. 10. 11. 12.

For some thirty years, and in order to address HRA issues in a number of industries, there have been attempts to develop databases of HEPs, most of which have failed (Topmiller et al. H). Fundamentally, this failure was generally caused by a lack of theory of human error to properly structure such a database (Kirwan et al. Ls; Taylor_Adams,6). However, because of recent advances made to the theory of qualitative HRA (Rasmussen et al. 17; ReasonS8), it was believed that a structured data collection program, which could be successfully encoded into a human error databank, was feasible. In practical terms, these theoretical developments have led to three inter-related data encoding classifications. These classifications or taxonomies define what the error was; why the error occurred; and how the operator failed internally. In technical terms, the External Error Mode (EEM) defines what the external manifestation of the error was, e.g. operator omits check. Performance Shaping Factors (PSFs)determine why the error occurred, e.g. inadequate training or production pressures. Finally, Psychological Error Mechanisms (PEMs) aim to give an insight into how the operator failed internally, e.g. memory Jailure or overconfidence. If these three aspects of an error are known, then data can be properly and reliably encoded, and generalisation to new similar scenarios becomes feasible. As a consequence of these developments, and because of a national body review of H R A (ACSNI19), a three-year project was instigated at the University of Birmingham, to provide the human reliability community with a usable, accurate and valid human error databank• This research has led to the development of the Computerised Operators Reliability and Error Database ( C O R E - D A T A ) (TaylorAdams and Kirwan2°; Taylor-Adamsl~'; Taylor-Adams2~; Taylor-Adams22; Taylor-Adams and K l r w a n - ; TaylorA d a m s and Kwwan" ). C O R E - D A T A currently collects and collates data in the key areas shown in Table 1. At present C O R E - D A T A contain approximately 100 HEPs and a further 900 or so are available in hard-copy format. Potential uses for this data are as follows. • •

Direct usage in assessments Usage as 'calibration data' for certain HRA techniques (SLIM and Paired Comparisons; see Kirwan 8)

• •

Usage as validation data when comparatively testing the accuracy of HRA techniques As guidance data for assessors and regulators

The error data were collected from a variety of sources, such as incident reports, psychological and ergonomics literature, simulator studies and specific data collection schemes. It is one of these data collection schemes that is the main focus o f this article. Most of the data in C O R E - D A T A have originated, in some form, from the nuclear sector. Data from other industries, in particular the offshore industry, is sparse. This project, funded by the Offshore Safety Division, part of the UK Health and Safety Executive, was therefore concerned with collecting a number of HEPs in the offshore sector. 1.2 The offshore oil and gas sector-areas of v u l n e r a b i l i t y to h u m a n error

As a result of the remote location of offshore oil rigs, and the inherent dangers associated with such installations, one would expect robust safety and evacuation systems to cope with the requirements of a major accident. However, there are many opportunities for error in offshore installations, caused in part by the very nature of the work and its location. Offshore drilling is one such area: drilling tasks such as tripping in an out of the hole are fairly labour intensive. Another area prone to human error is the Permit-to work (PTW) system. One of the most significant causes of the Piper Alpha tragedy was the breakdown in co-ordination of hazardous activities, which should have been achieved through the P T W procedures. This was later confirmed by Lord Cullen in his report of the accident (Cullenl). Another area which is highly reliant on adequate human performance, is that of offshore lifeboat evacuation. It was decided that lifeboat evacuation would be studied, as useful data could be collected from practice lifeboat launches at training centres. Table 2 gives an overview, from a lifeboat launching perspective, of a number of offshore evacuation incidents in which the crew were required to abandon the rig, Not all these abandonments were successful. All these incidents have illustrated the need to

Collection o f offshore h u m a n probability data

79

Table 2. Overview of UK offshore accidents - a lifeboat perspective

Installation

Number of crew

Number abandoned via TEMPSC

Total number of survivors

General comments

Alexander L. Kielland, March 1980 (Landolt et al. 25

212

59

89

Ocean Ranger, February 1982 (Landolt et al. ~5

84

Approx. 30

0

Vinland, February 1984 (Landolt et al. 25)

76

76

75

Piper Alpha, July 1988 (Cullen J) Ocean Odyssey, September 1988 (Robertson and Wright26)

226

0

65

67

58

66

There were initial problems in launching the lifeboats. When they finally reached the water, they subsequently smashed against the platform because of problems in releasing the boat under load. A standby vessel was in position to rescue the men on board the TEMPSC. Some of the crew however, emerged from the TEMPSC. This, coupled with the damage the boat had already sustained, resulted in the craft losing its self-righting properties and the boat capsized. During transit, one of the men suffered an apparent heart attack and later died. Other than mild shock and seasickness, there were no injuries to the survivors. Evacuation by lifeboat was impossible due to flames and dense smoke. The lifeboats were never used during this incident. Accounts from survivors indicate that the TEMPSC launch was rife with problems, in particular releasing the TEMPSC from the fall wires. Hatches had to be opened in the presence of fire and gas in order to disconnect the falls manually.

maximise human reliability in the evacuation process, and in particular to answer the following questions: what can go wrong?; why will errors occur?; how can errors be prevented? These questions are those asked and answered via the approach of Human Reliability Assessment (HRA). Davit-launched lifeboats were the primary focus of this project. The system of interest was the Totally Enclosed Motor Propelled Survival Craft (TEMPSC) lifeboat, which is lowered via 'fall wires' to the sea and, having released two hooks, drives away from the platform. The study was aimed at generating a number of offshore HEPs associated with lifeboat evacuation. The study was a feasibility study, and was not intended to analyse or derive all HEPs for all errors that could be modelled in a risk assessment, i.e. it was not intended to serve as a full HRA of lifeboat evacuation. A number of questions were posed at the outset of the study. These relate to the viability of data collection in the offshore sector, and its utility for quantitative risk assessment. • • •

• • •

Can human error data for offshore lifeboat evacuation be collected? What are the best methods for data collection? Can realistic data on evacuation be generated, which take account of the stress that would be evident in a real situation? How high are the HEPs likely to be? Could data collection preclude the need for HRA techniques? Does data collection lead to any qualitative insights into how to improve lifeboat evacuation?

These issues will be returned to in the discussion. 1.3 O b j e c t i v e s o f the study

The objectives of the work were to collect qualitative

information and quantitative human error probabilities (HEPs) from an offshore survival training simulator. • • •

To conduct a brief task and human error analysis to determine what errors should be recorded. To derive between 10 and 30 HEPs and their associated qualitative information. To assess the impact of factors such as stress on the HEPs, to render the data applicable to real-life offshore evacuation situations.

Data were collected from two Robert Gordon Institute of Technology (RGIT) Offshore Survival Training Centres in Dundee and Aberdeen. The two courses of particular interest were the four day coxswain training course and the two day coxswain refresher course. Both courses involved intense classroom-based and practical elements. The Dundee base houses two freefall systems and a number of davit-launch boats. The Aberdeen base is concerned only with davit-launch training.

2 METHOD

The ultimate objective was to generate a number of HEPs. This was achieved through a number of stages as shown in Fig. 1. The overall process requires firstly a description of how the task should be carried out, and this is achieved by task analysis. Secondly, errors are identified using tools for human error analysis, and these result in a table of errors and associated consequences which is similar in format to a failure modes and effects analysis. Data can then be collected via a range of methods, and HEPs generated. These stages are described in detail later.

80

G. Basra and B. Kirwan

I Hierarchical Task Analysis (HTA) construction

Specifies the consequences of the error, e.g. inoperable system may lead to ingress of smoke, a lack of air and subsequent asphyxiation; etc.;

I

l

Psychological error mechanisms

] Subsequent H. . . . Error Analysis (HEA) [

Concern the operators internal failure mode, e.g. memory failure, etc.;

l DataCollection

I

~dgem°.,

Performance shaping factors

]

The human factors present in the operating environment influencing performance, e.g. stress, etc.; HEP Human error probability, e.g. 0-01, or once in a hundred opportunities for error, etc.; Uncertainty bounds

Analysis and collation of data

I.EPg .... ,ion ] Fig. 1. Methodology; hierarchical task analysis (HTA) construction.

2.1 Hierarchical task analysis (HTA) H T A is the most popular and flexible of the task analysis techniques (Kirwan and AinsworthZ7). A detailed HTA, illustrating the steps and sequences involved in preparing a boat for launch and the actual launch, was constructed following an initial familiarisation visit to RGIT Dundee/ Aberdeen (see later Fig. 2), Informal discussions were held with trainers at both centres and coxswain training booklets were consulted. The resultant H T A was subsequently modified after further discussions with trainers.

2.2 H u m a n error analysis (HEA) The first stage in the development of the H E A tables (Kirwan s) involved incorporating the steps from the H T A (see later Table 5). The data from the tables would eventually be seeded into C O R E - D A T A , so it was necessary to ensure all relevant information for C O R E - D A T A was collected. The following headings were developed to make up the table. Task step and equipment The task step is taken directly from the HTA. The equipment used to achieve the task step is also specified, e.g. check air support system with a 6 s purge, etc.; Error modes Concerns the observable manifestation of the error, e.g. action omitted, etc.; Consequences

Upper and lower uncertainty bounds, e.g. 0.03-0-003. Source Concerns the source of qualitative and quantitative data, e.g. observed, generated via expert judgement, derived from accident reports, etc.; Existing offshore data were then added from two main reference sources, Brabazon and Bellamy 2s. and Kennedy 29. The first of these studies looked at ways of comparing conventional davit launched T E M P S C ' s and Freefall lifeboats, The second study developed a concise summary of the possible human errors/failures at each stage of evacuation, escape and rescue.

2.3 Review of incidents/accidents A review of offshore incident reports failed to yield quantitative data, for the reason that the vital information pertaining to the denominator (i.e. the number of opportunities for an error-essential for generation of an HEP) was not derivable and therefore a HEP could not be generated.

2.4 Data collection via direct observation Observational data during this study were collected by direct visual observation. The data collector was present at every launch during the training courses, and error data were recorded as and when a mistake was made. Two weeks were spent collecting davit launch data, and a total of fifty four launches were observed in this time period. Although this was a good source of data, one may ask the question, how realistic is this approach to data collection, since training simulations are not the same as real evacuations? This type of data may therefore be 'unrepresentative" for two reasons. Firstly, a level of intrusion will undoubtedly alter the way in which the observed personnel work, adversely or otherwise. Secondly, the stressors present in a real offshore evacuation would not all be present in the simulator. Assuming that trainees would act in a similar way offshore as they do in

Collection of offshore human probability data DAVIT-LAUNCH

I

I 1. Check boat

I1.1

I

II.

~ p l a n

I

1.1.2

HTA

Dm in order I

I

I

/

1

~fil~warn ~ d ~ i ~ a ~ ~J

A

I

I

I

I

I

[

L 1

Plm: l)o in my o[xk'r

I

I

I.L7 Ch,,ck~,had~ d , ~oai~ mclm Itltc

1,1.3 C b ~ k p t ~ mdmd~

B I

I

I

I 1.1.1 ~ k ~ d ~ md~e

LIFEBOAT

81

[

/

1.2.4 Cheekfidtnp sli~atlyflad~

1.2.5 ~'cvatrl~ay sy~rmo~rnlioual

] [

[

I

I

1.2.7 Chcdcmr vmts

1.2.8 ~'~¢dpofifimed o c carinstallation

1.2.9 C~¢d~compassto cl~iu~latiou

r

1.2.6 3¢~e all doorsmdbada~

t ]

Fig. 2. Sample of the HTA. the onshore simulators may therefore be taking an oversimplified view. Observations were made by the data collector for a period o f approximately 3 weeks in total. All trainees were informed of the observer presence, and the general nature of the data collection task. They were assured that any observations made would have no effect on their actual assessment, likewise the trainers were assured that it was not their training abilities that were being assessed. They were asked to proceed with all exercises as normal. A checklist with all the discrete steps taken from the H T A was developed, and as the trainee worked through the launch procedure any errors were recorded on this checklist. Both coxswain course types were observed, and all training coxswains were of varying ability. Where possible a brief discussion was held afterwards with the coxswain who made the error, to try and elicit information on why the error occurred. 2.5 D a t a c o l l e c t i o n

through

a search of trainee records

A search of records can sometimes provide supplementary information. Details of errors not observed in the allowed timescales can be elicited. However, this information may not be recorded in enough detail required for data collection exercises such as this one. Quantifying such data can also be problematic if the denominator is unknown. The Offshore Petroleum Industry Training Organisation (OPITO) have laid down a number of subject areas in which a coxswain must be competent before a certificate can be issued. There are eight basic subject areas in total, all of

which must be passed. A fail in any one area will result in an overall fail. Details of trainee performance are recorded on 'Coxswain Assessment Summary Sheets', which are kept for a number of years. A search was made of these records, dating back to 1991, enabling additional error data to be analysed. 2.6 D a t a g e n e r a t i o n

via e x p e r t j u d g e m e n t

sessions

Harnessing expert judgement can be very powerful. Two c o m m o n expert judgement techniques are the methods of Paired Comparisons (PC; see Hunns3°; Kirwan s) and Absolute Probability Judgement (APJ; see Seaver and Stillwell31; Kirwan8). Data can be generated for scenarios that cannot be observed at onshore simulators, e.g. the effects that explosions and severe weather conditions have on performance. Shortfalls in observed data can therefore be rectified. Additionally, expert judgement techniques can be used to ' m o d i f y ' data to make it more representative of the real situation. Expert judgement sessions in the form of Paired Comparisons (PC) and Absolute Probability Judgement (APJ) sessions were conducted as part of the study.

2.6.1 Introduction to PC and APJ For PC, experts are required to compare a set of pairs of tasks for which HEPs are required. For each pair of tasks the expert must decide which has the highest likelihood of error. The technique elicits these comparative judgements from a number of experts and develops a likelihood scaling, i.e. a relative scale on which errors can be located, which can then

G. Basra and B. Kirwan

82

Table 3. Errors used in expert j u d g e m e n t sessions

Davit

Freefall

1. Fail to check whether air support system operational 2. Fail to disengage boat from hooks 3. Fail to use brake cable correctly 4. Fail to start engine before lowering 5. Fail to monitor compass 6. Fail to ensure personnel strapped in 7. Fail to start engine at all 8. Fail to position wheel to clear installation 9. Fail to check drop zone clear 10. Fail to ensure hatches and doors closed

1. Fail to check helicopter hatch closed and secure 2. Fail to check engine 3. Fail to check exterior tk)r damage 4. Fail to check life support system operational 5. Fail to check everything in interior secure 6. Fail to release boat 7. Fail to operate correct launching pump/valve 8. Fail to start engine for 30 secs prior to launch 9. Fail to check hook disconnected 10. Fail to check back door vent closed was understood what was required the sessions began. The subject was able to ask questions at any point during the exercise.

be translated into HEPs if calibrated with at least two known data points, based on a logarithmic transformation. APJ is the most direct approach to the quantification of HEPs. It relies on experts to estimate HEPs, based on their knowledge and experience. The aggregated individual method was employed in this study. This involves a number of experts making their estimates individually. These data are then collated and statistically aggregated, based on the geometric mean (i.e. the 'nth' root of the product of 'n" items) of the individual estimates.

3 RESULTS 3.1 H i e r a r c h i c a l

task analyses

Fig. 2 illustrates a small portion of the entire davit HTA. The full version runs to 3 pages.

2.6.2 Data collection Ten tasks, for each lifeboat system, were selected from the H T A and H E A (see Table 3). These errors were selected to range in likelihood of occurrence. O f the ten davit lifeboat tasks, three errors had been recorded during the earlier observations. These errors were seeded into the exercise to serve as calibration points during the later PC calculation process. In order to obtain more realistic offshore data, two platform scenarios were hypothesised, a controlled and a severe evacuation (see Table 4). The controlled scenario, it was hoped, would give data very similar to that obtained through observation. The severe scenario data would give an insight into the magnitude of stress effects on performance. All subjects were davit launch lifeboat trainers from RGIT Aberdeen/Dundee. Training experience of the subjects ranged generally from 3 to 7 years, and one subject had 15 years training experience. Although none of the trainers interviewed had any working offshore platform experience, all but three of the subjects had extensive Royal Navy experience ranging from 5 - 1 4 years, and one subject was involved in a major offshore evacuation rescue mission. Each subject was seen individually, and both techniques were described with simple everyday examples. The subject was able to ask questions at this stage, and when it

3.2 H u m a n

error analysis tables

An extract of the final H E A tables can be lkmnd in Table 5. This H EA ran to 13 pages. 3.3 R e v i e w o f i n c i d e n t / a c c i d e n t s

The search of incident/accident reports did not result in any useful supplementary data generation, though it provided useful infl)rmation on the PSFs that exist in real events. 3.4 D a t a c o l l e c t e d v i a d i r e c t o b s e r v a t i o n

From direct observation, twelve HEPs were collected for the davit-launch lifeboat (see Table 10 for a summary of HEPs generated). Uncertainty bounds for observed errors were also evaluated by the following simple procedure. N = Total number of observations R = N(E) the number of observations comprising a particular event E Define Q = R / N Define A = 2OQ(I - Q)/N

Table 4. Platform conditions used in expert j u d g e m e n t sessions

Controlled scenario - Force 4 wind and sea state - Daylight - Unignited gas leak - cause of the evacuation

Severe scenario -

Force 6 wind and sea state Night Explosions have occurred Fire on platform Fatalities have occurred on platform

Collection o f offshore human probability data The 95% confidence interval for the probability P(E) is then given by the following expression. [Q-A;Q+A]

3.5 Data collected via a search of trainee assessment sheets

Hence; c = ~-~(102 - 1) - 8 0 . 5 0 2

3.6 Expert judgement sessions with lifeboat trainers 3.6.1 Paired comparisons A total of nine subjects completed the PC for the two davitlaunched lifeboat conditions, severe and controlled. One of the advantages of PC is that it offers, via consistency analysis, a new means of detailing a lack of expertise or coherence in an experts' assessment. Such inconsistency may be caused by a lack of expertise or a particular bias, or more often caused by a lack of understanding of some of the tasks. Whether such a lack of consistency is truly a lack of expertise or an artefact of the particular PC generation process, it must be rooted out in the analysis. Some experts may therefore exhibit internal inconsistencies that manifest themselves as logical inconsistencies known as 'circular triads', i.e. event A is judged to be more likely than event B, event B more likely than event C, and event C more likely than event A. The results of an expert who is exhibiting such inconsistencies should be rejected to avoid contamination of 'useful' data. It is necessary, therefore, to determine the number of circular triads (c) put forward by the expert. A coefficient (K) can then be calculated. This coefficient varies from zero for a completely random (maximum number of circular triads) set of judgements, to one for a completely consistent (no circular triads) set. As an example Subject 1 'controlled' evacuation is shown in the matrix below. A ' 1', in the following matrix (Table 7), indicates that the row event was judged more likely than the column event, 'ai' indicates the row sums, and ' a ' = (n - 1)/2, where n is the number of events. The number of circular triads are first determined using the following equation (Seaver and Stillwell3J): n ~ T c = ~(n ~- 1)2 Where, T= ~

( a i - h)2

i=1

Therefore; T = ( -- 1.5) 2 + ( 1 . 5 ) 2 -k- ( 3 . 5 ) 2 q- ( -- 3 . 5 ) 2 q- ( 4 . 5 ) 2

+ ( - 1.5) 2 -]- ( - 1.5) 2 + ( - 4 . 5 ) 2 + ( 2 . 5 ) 2 q- ( 0 . 5 ) 2 T = 80.50

1

The coefficient of consistency can now be found by the following; K=I

Table 6 illustrates all the errors on the davit launch lifeboat which were recorded during a search of trainee assessment sheets. However, there was insufficient information to quantify these as HEPs.

83

24c n(n 2 - 4)

if 'n' is even, K=I

24c n(n 2 - 1)

if 'n' is odd In this case, since n is even: K= 1

24(1) - - -- 0.975 960

The value of 0.975 indicates a very consistent subject. These calculations were performed for all subjects at both evacuation conditions. All subjects showed a consistency coefficient of 0-825 or more, indicating high levels of consistency. This means the results are coherent, and the expertise consistent. Following the consistency analysis of the subjects, the actually HEPs were derived. As no subjects were rejected at this stage the following calculations are based on all nine subjects. The scale values that follow, were derived for the controlled scenario (see Table 3 for a description of the errors). This scale is subjective, but the scale values are related to absolute probabilities. Error 8 4 2 6 9 l 3 7 10 5 Scale . . . . 0.003 0.17 0.61 0.69 0.87 0.95 value 1.54 1.37 0.28 0.09 In order to convert this subjective scale into error probabilities, a pair of calibration points or 'anchors' is required, and the best results are achieved if the HEPs are from known observational data and are at the two extreme ends of the scale. In this case errors 8 and 5 were used. A calibration equation was developed as probabilities are assumed to be logarithmically related to the derived scale values (Hunns3°); log(HEP) = As + B where, s = mean scale rank value; A and B = constants However, there are two ways to calibrate the PC scale. The first is to use robust APJ estimates, and the second is to use observation-based data. Since observation-based data do not occur whilst watching real evacuations, it is possible that APJ-based calibration points may serve better as calibration points since they may contain experts assessments of 'stressors' that will not be present in observing training launches. In order to decide which type of calibration points to use, each PC scaling (controlled and severe) was calibrated twice, once purely by APJ, and once via APJ and observation-based data. Tables 8 and 9 summarise the

A blocked or unsafe access will impair a quick and efficient evacuation. Injury to personnel may result. Boat will not launch. Damage to boat. Delay in evacuation. Time increasing threat to personnel. Boat may become stuck halfway. An obstruction, rope for example, may get caught up in the propeller and result in the propeller becoming inoperable. It would not be possible to steer the lifeboat if the rudder was obstructed in any way. A delay in evacuation and increased risk to all personnel will result.

Consequences

Water may enter boat depending on the condition of the valve. Possibility of ingress of smoke. Lack of air - asphyxiation

Slip of memory Short cut invoked

Ref. 26 Ref. 26/32

Visibility Frequency of training and platform drills Quality of training and platform drillsNumber of trained crew present Evacuation procedure

Motivation Low skill level

0.130

0.0235 0.0572 Controlled Severe

APJ

HEP

0.0523 0.116

PSFs

Controlled Severe

Memory failure

Forget isolated act

PEMs

PC

A/S

Ref. 26

A/S

Source

Severe conditions may be underestimated O and the occupants of the boat put in unnecessary danger. Boat may not be capable of handling conditions. Safer option may be missed or ignored. Wind conditions may not be used to their full advantage - coxs. may be doing more work than necessary and taking longer than necessary. Information not A/S obtained

Action omitted

Check omitted

Check omitted

Check omitted

Action omitted

Error modes

1.1.8 Secure and enter boat 1.2 Internal checks Check omitted 1.2.1 Check drain plug fitted and secure 1.2.2 Ensure air support Check omitted system operational

1.1.6 Ensure escape route clear 1.1.7 Check wind speed, direction and sea state

1.1.4 Check condition of craft 1.1.5 Check drop zone clear

l.l.2 Clear guard rails, gripe wires and maintenance pennants 1.1.3 Check propeller and rudder

1. Check boat 1.1 External checks 1.1.1 Check access is clear and safe

Task step and equipment

Table 5. H u m a n error analysis table for the Davit lifeboat

0.222; 0.0385

0.135; 0.004 0.398; 0.008

UCB

4:~

Controlled Severe Controlled Severe

PC APJ

Information not obtained

Misinterpretation

Ref. 26

Wrong check

0.0284 0.158

0.0103 0.0695 0.124; 0.0065 0.202; 0.124

Key: Ref. 26, Kennedy, 1992; Ref. 28, Brabazon and Bellamy, 1993; O, Observed data; A/S, Data extracted from trainee assessment sheets; PC, Data generated from paired comparisons; APJ, Data generated from absolute probability judgements; PEM, Psychological Error Mechanism; PSF, Performance Shaping Factor; HEP, Human Error Probability; UCB, Uncertainty Bounds; N/C, Not calculable; Notes (1) If there are no data for a particular task step, this means no error was observed or assessed by expert judgement within this study. (2) When there are several error modes, the consequences described relate to all error modes listed, except where consequences are clearly aligned with the different error modes.

1.2.2 Ensure air support system operational (cont.)

t.lt

~2

G. Basra and B. Kirwan

86

T a b l e 6. E r r o r d a t a c o l l e c t e d via a s e a r c h of t r a i n e e assessment sheets

Error modes Action omitted Check omitted Information not obtained Action omitted Check omitted Information not transmitted Action omitted Action omitted Action ()milled Action omitted

Task step and equipment 1.1.1 Check access is clear and safe 1.1.3 Check propeller and rudder 1.1.7 Check wind speed, direction and sea state 1.2.6 Secure all doors and hatches not used for access 1.2.7 Check air vents closed 3.1 Supervise embarkation of personnel 3.3 Ensure personnel strap in securely 5.2.3 Monitor compass 5.3.3 Release hooks 5.5.1 Monitor compass results obtained for the controlled and severe conditions respectively, using the two calibration methods. Table 9 illustrates, clearly, that the upper values (errors 1, 7, 10 and 5) are given roughly twice the H E P values using the two APJ calibration points. This possibly represents the experts 'loading in' a stress factor not evident in the observational data. The APJ calibrated data are more conservative, and are the ones that were chosen for use in preference to the other set (with one APJ and one observation calibration point) for the 'controlled' and the 'severe" scenarios. This is because the subjects had little offshore platform experience, and were themselves trainers, it seems likely that they might tend to underestimate failure likelihoods (e.g. basing them on their projections of how they might perform, rather than average offshore personnel). It is therefore safer to utilise the more conservative (pessimistic) HEPs, as these may therefore be closer to real performance. As a result of the small n u m b e r of subjects, uncertainty b o u n d s for the PC HEPs were not calculated.

A method was devised to screen out such inconsistencies, and 2 subjects were consequently rejected. Subsequent calculations using APJ data were therefore based on seven, as opposed to nine subjects (see Table 10). HEPs were calculated by statistically aggregating the data. The geometric mean of HEPs obtained was calculated, i.e. all the (seven) experts' probabilities for a particular error are multiplied together and the 'nth' root of the product is taken. This value can be cross-checked with that obtained as a by-product when calculating APJ uncertainty bounds. These are calculated using the following formulae adapted from Seaver and Stillwel131. Upper and lower uncertainty b o u n d s are equivalent to; log HEP +_ 2s.e. Where, / V(log HEPi) s.c. = V ;n V(log HEPi) m × E ( l o g JEPij) 2 - ( E l o g HEPii) 2 m(m - 1)

3.6.2 Absolute probability judgement As with PC, it is necessary to screen out j u d g e s whose APJ estimates are inconsistent, but in this case inconsistencies a m o n g s t j u d g e s are investigated. If for example, eight subjects believed the HEP to be 0-01 for a particular error, and two subjects believed it to be 0-00001, then the latter two subjects might be considered incorrect for this error (the converse could, of course, also be true). If certain subjects disagree with the general consensus on a n u m b e r of occasions, and their disagreements are themselves inconsistent, their data m a y then be rejected as being n o n - h o m o g e n e o u s .

T a b l e 7. S u b j e c t

and,m = no. of subjects Uncertainty bounds were calculated for all ten errors. The calculations for Error 1 are given here by way of example. Error 1 - - Controllednl = 7 E ( l o g HEP0) 2 = ( - 1.70) 2 + ( - 1.00) 2 + ( - 1.30) 2 +(-

0.82) 2 + ( -

1.00) 2 + ( -

3.30) 2 + ( -

E ( l o g HEP0) 2 = 21.04

1. C o n t r o l l e d

evaucation

condition

1 1



2 0

3 0

4 1

5 0

6 1

7 0

9 0

10 0

ai 3

2 3 4 5 6 7 8 9 10

I 1 0 1 0 1 0 l

• 1 0 1 0 0 0 l

0 • 0 I 0 0 0 0

1 1 • 1

1 1 0 I

I 1 0 1

0 I 0 1

1 1 0 1

6 8 I 9

1

0 l

0 0 0 • 0 0 0 0

0 0 0

0 0 0

3 3 0

1

0

0

1

0

• 0

1 •

7 5

1 1

1.70) 2



0 0 1 1



0 1 1

1 1

ai-a -1.5 1.5 3.5 3.5 4.5 -1.5 1.5 -4.5 2.5 0.5

Collection of offshore human probability data Table 8. 'Controlled' Evacuation Condition HEP data

Error

PC scale calibrated by two APJ HEPs

PC scale calibrated by one APJ and l observation HEP

0.000204 0.000302 0.00372 0.00575 0.00713 0.0105 0.0288 0.0347 0.0525 0.0631

0.000201 0.000303 0.00423 0.00669 0.00837 0.0125 0.0363 0.0441 0.0682 0.0827

8 4 2 6 9 1 3 7 10 5

(~-".log HEPij) 2 =- (( - 1.70) + ( - 1.00) + ( - 1.30) + ( - 0.82) + ( - 1.00) + ( - 3.30) + ( - 1.70)) 2 ( Z l o g HEPij) 2 _- (10.82) 2

m(m- 1)=7(7-

s.e. = ~

Table 9. 'Severe' Evacuation Condition HEP data

Error

8 4 2 3 6 9 1 7 10 5

PC scale calibrated by two APJ HEPs 0.00143 0.00242 0.0139 0.0217 0.0295 0.0361 0.0695 0.1092 0.1160 0.1563

PC scale calibrated by one APJ and one observation HEP 0.00140 0.00219 0.00959 0.0139 0.0181 0.0214 0.0372 0.0545 0.0573 0.0738

4 DISCUSSION 4.1 H E P data collected

4.1.1 Data collection via observation

1)=42

These values are substituted into the formulae;

V(log HEPi) =

87

7 × 21.04-(-

10.82) 2

42

=-0.717

= 0.320

The average log equivalent HEP value is then used for the final steps of the calculation; Average log equivalent HEP = - 1.546 2 s.e. = 0.640 Therefore for Error l; HEP = log( - 1.546) = 0.0284 Upper bound = log( - 1.546 -4-0.640) = 0.124 Lower bound = log( - 1.546 - 0.640) = 0.00652 These calculations were performed for all ten errors at both evacuation conditions (see Table 10 for a summary of the results).

3.6.3 Summa~ of main heps derived from the study A total of nineteen HEPs were generated for the davitlaunch system. Of this total, twelve were collected via observation, and an additional seven were generated through the expert judgement sessions. Table 10 summarises the HEPs obtained as a result of this study. For each task step, in each condition, the more conservative HEP (APJ or PC) will be incorporated into the C O R E - D A T A system.

A total of nineteen HEPs were generated during the study. These were encoded recently into the CORE-DATA system, thus giving C O R E - D A T A at least an initial segment of human error data in the offshore sector. Collecting data via direct observation produced twelve HEPs. Although the observation was obtrusive in nature (i.e. an observer was present in the boat, watching the coxswains), it appeared that this had no significant effect, adverse or otherwise, on the trainees' performance. However, it must be noted that observation will only detect errors that are frequent enough to occur during the observation period, and those errors that are observable and are in fact seen by the observer. In lifeboat evacuation, where the task steps are highly predefined and observable, this approach will work. In other tasks, which involve more decisionmaking or less observable behaviour, this approach will be less successful. The major problem faced when collecting the observational data was the constant prompting the trainees received from their trainer as part of the training process. Although this hindered the data collection process it must be noted that coxswains attending courses at RGIT centres are there to learn. Trainees were therefore given continual guidance on what checks they needed to make, how to actually launch the boat and how to manoeuvre the boat. It was felt that if the trainees were left to their own devices, they perhaps would have made many more errors. Potentially fatal errors may have therefore been intercepted before they could actually occur (this makes good training sense). The lifeboats were also always left in a fully operational state: all the trainee was required to do was make the necessary checks and start the boat. It would have been interesting to see if the trainees could actually identify a fault whilst carrying out their checks. This could have been easily demonstrated by the introduction of a number of 'traps'. It would then be clear if the coxswain actually knew what he was looking for, or

0.0103

0.0286 0.00361

0.000286 0.0630 0.00561 0.0344 0.000193 0.00697 0.0523

0.0213 (0.081 ;0.006) 0.0226 (0.038;0.014)

0.000566 (0.00664;0.0000483) 0.0631 (0.166;0.0239) 0.00869 (0.035;0.002) 0.0205 (0.118;0.0036) 0.000193 (0.0015;0.00003) 0.00794 (0.105;0.0006) 0.0235 (0.135;0.004)

PC

0.0284 (0.124;0.0065)

APJ

Controlled

0.0695

0.0217 0.0139

0.00242 0.156 0.0295 0.109 0.00140 0.0361 0.116

0.0425 (0.116;0.0156) 0.0692 (0.112;0.043)

0.00437 (0.0132;0.0014) 0.158 (0.417;0.0608) 0.0411 (0.098;0.017) 0.106 (0.452;0.0249) 0.00145 (0.0087;0.0002) 0.00794 (0.159;0.0004) 0.0572 (0.398;0.008)

PC

0.158 (0.202;0.124)

APJ

Severe

Notes (1) Numbers in brackets are uncertainty bounds. (2) N/C (Not Calculable), denotes those uncertainty bounds for which values could not be calculated.

1. Fail to check wind speed, direction and sea state. 2. Fail to check air support system. 3. Fail to check fuel tap slightly slack. 4. Fail to correctly initiate air support system. 5. Incorrectly operate brake cable. 6. Fail to disengage boat. 7. Fail to disengage boat quickly. 8. Retaining pin removed too early. 9. Fail to release hooks properly. 10. Fail to remove hooks correctly. 11. Hooks reset too early. 12. Forward gear and full throttle in the incorrect sequence. 13. Fail to start engine before lowering. 14. Fail to monitor compass. 15. Fail to ensure hatches and doors closed. 16. Fail to ensure personnel strapped. 17. Fail to start engine at all. 18. Fail to position wheel to clear installation. 19. Fail to check drop zone clear.

Task step and error

Table 10. Summary of the davit-launch HEP's obtained

O. 130 (0.0385;0.222) 0.0370 (0.0884;N/C) 0.0370 (0.0884;N/C) 0.0370 (0.0884;N/C) 0.0370 (0.0884;N/C) 0.0185 (0.0552;N/C) 0.0370 (0.0884;N/C) 0.0185 (0.0552;N/C) 0.0185 (0.0552;N/C) 0.0370 (0.0884;N/C) 0.0185 (0.0552;N/C) 0.0185 (0.0552;N/C)

Observed

Collection of offshore human probability data if he was merely following a number of procedural checks in a rote fashion. These factors illustrate the difficulties in collecting data from a training simulator, where the primary objective is to produce trained personnel, rather than to collect realistic data. The frequency of the observed errors is necessarily a function of the total number of observed trials, in this case approximately sixty trials. In this study, observed HEPs were in the approximate range from 0.01 to 0.1, giving a fairly narrow segment of the total usual HEP range 1.00.0001. There may be less frequent errors which have more significant consequences. Such errors, which could have a HEP of 0.001 for example, would be very resource-intensive to detect through observation. Observation is therefore one approach to data collection, but is unlikely to provide comprehensive HEPs for the entire task, including the low probability high consequence tasks that are often of most interest to the PSA/HRA assessors. There may also be higher HEPs in an actual launch, due to the stress of the scenario (e.g. HEPs in the 0.1-1.0 range). These stressors might prove impractical (and perhaps unethical) to induce in a simulation, so that such highstress-induced HEPs are not seen in observation of simulator trials. The solution to generating HEPs that will not be seen in simulations is either via accident reports, expert judgement, or via training assessment records. Each of these is discussed later.

4.1.2 Data collection from accident reports Real accidents tend to be very idiosyncratic, whereas PSAs are more generic in nature. This means that an individual accident represents a particular and highly specific failure path that would be 'bounded' by a failure sequence modelled in a PSA for the system. However, it is usually very difficult to equate such accidents with PSA fault sequences and determine what the individual hardware failure probabilities and HEPs in the fault and event trees should be, based on the accident occurrence. This is partly because the 'level of resolution' of an accident is deeper than most PSAs, and also because accidents only usually occur once, each time representing a unique accident pathway. The database based on accidents therefore often suggests that, given a particular set of conditions, the probability of an accident is unity (1-0), because those conditions have not been seen before, and therefore the denominator and numerator are both unity. This situation would be partly assuaged if data on near miss events or accident precursors could be collected, to determine a more useful and indeed realistic denominator. However, until such data are collected accident data from the offshore sector on evacuations, which are thankfully rare themselves, are unlikely to yield a comprehensive source of HEPs. This does not mean however, that accident data should be ignored. The accident data do yield useful qualitative and semi-quantitative data in three ways.







89

The error forms in real accidents suggest those that should be modelled in HRAs/PSAs - - this is important when errors are thought to be incredible, or have simply not been identified, prior to the accident. Accidents are therefore a check on the error identification stage of HRA. The PSF in accidents can be used in the HRA quantification stage of HRA, whether using expert judgement techniques or not. The frequency of accidents does give a very real benchmark of performance in the industry. If an accident occurs where it was predicted to have a probability of occurrence of less than once in a thousand years, then this is cause for concern for the PSA and H R A analysts.

The last point often tends to be played down following accidents, as companies and politicians often espouse the theory that the accident occurred because of freak conditions, and hence such an accident could not occur again with other installations. This happened after Three Mile Island, Bhopal, Chernobyl, and the Herald of Free Enterprise. In the latter case of course, the Estonia disaster, albeit a modification of the Herald of Free Enterprise tragedy, proved them wrong. Interestingly, after the Challenger Space Shuttle accident in 1986, analysts were pointing out after the accident that the chance of another one was one in a hundred thousand. The astronauts noted, however, that there was one failure in twenty launches, and therefore the current or working probability of failure was one in twenty. (See also Feynman, in Leighton 33 for a detailed discussion of the Challenger disaster). In summary therefore, accident statistics, are less likely to yield HEPs, but will support data generation processes qualitatively, and yield pertinent feedback for PSA/HRA in general on overall safety performance in a particular industrial sector. The best way to maximise utility of accident data is to collect data on near misses (events which were unsafe but with no accidental consequences: see also van der Schaaf e t ai.34). Such data may help to generate a more informative denominator.

4.1.3 Data collection via expert judgement The two techniques of APJ and PC proved useful in generating HEPs, and the statistical tests of consistency suggest that the HEPs derived robustly represent the coherent expertise of the judges. Data collected through these sessions were also consistent with data collected via observation. Further, since the HEPs generated in the severe evacuation condition are higher than for the controlled condition, this suggests that the expert judgement approach was able to 'factor in' the effects of stress etc. In most cases this amounts to an increase in HEP by a factor of 2. This extra loading may appear to be low given the stressors that will be evident in a real evacuation. Techniques such as H E A R T would probably suggest a factor of ten, rather than a factor of two, and so it must be questioned whether

90

G. Basra and B. Kirwan

the data are accurate enough (or conservative enough) for HRA purposes. In order to decide on this matter, it is first necessary to examine in more detail what experts are really judging in an expert judgement session. Expert judges are usually from the field of interest, typically being operators or trainers, without training in H R A or PSA. When assessing HEPs the experts are also not usually attempting to derive conservative values, but rather they are assessing best estimates (in PSA the former are utilised). The experts will base their judgements primarily on three thought experiments: 'what would I do in such a scenario'; 'what would I expect others to do on average, given how I have seen them perform'; and 'what is the worst I would expect, given the worst performance I have seen or heard of'. However, in assessing disasters, particularly if the experts are themselves involved in maintaining safety, there is a possible tendency to assume that when life was in imminent danger, then someone would pull the crew through the scenario. There is an appealing and intuitive logic to this: if for example, the coxswain goes to pieces and does nothing, are the other forty people going to merely sit by and watch the platform fall down around them? The answer is probably not, which means that recovery will tend to be high, as long as there is time for recovery, and other personnel are competent and have enough training to achieve recovery. It is during considerations like this one that one would wish to review accident histories to see whether such faith in human recovery is in fact justified, Essentially, there appear to be examples of heroic recovery and also of catastrophic failures, but because the accident database is rather small, the strength of recovery cannot be judged. Since PSA and H R A are necessarily conservative, the expert judgement derived HEPs should be considered as best estimates, rather than the typical conservative HEPs used in PSAs. This means that H R A s dealing with evacuation perhaps could start from such HEPs, but should then probably add in effects of stressors using a technique like H E A R T or SLIM. This means that the current estimates should be treated as 'ballpark' estimates, and the particular PSA/ HRA should then adapt these to their own particular assessment. One way to render the data more directly of use for PSA would be to carry out a benchmark HRA, actually generating data for all the relevant human errors that could impact on the safety of the launch, and also using human factors and reliability experts as part of the group. This suggested approach went beyond the scope of this study, but was recommended as a potential objective for further work.

4.1.4 Data collection via training records The search through the training assessment sheets was not as helpful as was originally hoped, since assessor comments were on the whole too general in nature and errors were not recorded in the amount of detail required for this study. For example, if a trainee initially omitted a weather check

during the 'craft preparation' but correctly carried out every other check, he would be prompted by the assessor on his mistake, he would then carry out a weather check and would attain a pass for that c a t e g o r y - - t h e error of 'omit weather check' would not, however, be recorded. Slightly more detailed information was available for non-attaining coxswains, but here again assessor comments were very general. Once again, this is because these simulators are primarily concerned with training rather than data collection. One useful aspect of training assessment sheets, however, is their support in identifying critical and unrecovered rare errors. This is because such errors will be identified as a result of their severity, and because analysis of training records can cover thousands of trials, as observed by the trainers. This was probably the most useful aspect of the training assessment sheets in this study.

4.1.5 Importance and comprehensiveness o['observed e r r o rs"

Of the observed errors, nine were judged to be of secondary importance, in that the resultant consequences were not directly potentially fatal. The remaining six errors, however, were potentially more severe. Were these errors committed offshore, complications in the evacuation would have undoubtedly resulted. Although the number of observed davit launches totalled 60, only 12 of these were full abandonments, incorporating all checks and preparatory actions. The remaining 48 were actual launches, without preparatory checks etc. Potentially fatal preparation errors were therefore maybe not identified or were maybe underestimated. As was noted already, rarer errors, i.e. errors with an expected frequency of less than I in 60 opportunities, were not found via observation except by chance. Therefore the data generation process cannot be guaranteed to have been comprehensive, though hopefully the error identification process and the subsequent APJ/PC sessions dealt with some of the more critical but rarer errors. It is therefore believed that most of the errors that could occur during a davit-launched lifeboat evacuation have been identified, either through direct observation or from the training assessment sheets, which span thousands of launches. As this study was a feasibility study, not all of the identified errors have as yet been quantified, and therefk~re, while a data set now exists for offshore evacuation, it is currently incomplete. In terms of the actual HEPs themselves, these ranged from 0.0002 (failure to start engine at all during the evacuation process) to 0.2 (fail to monitor compass, and fail to check air support system). The data collected therefore spans the credible range of human performance, and this increases the utility of the data derived for supporting HRA in lifeboat evacuation, whether the HRA techniques are used to modify these HEPs, or whether the data derived are used to calibrate techniques such as SLIM, which can then simultaneously load in extra PSF.

Collection of offshore human probability data 4.1.6 Summary of feasibilio' and utility of data collection exercise In summary of the four approaches to data collection, observation and expert judgement have proven the most fruitful in practice, and together generated data that will support risk assessment of lifeboat evacuation. With respect to three errors (see Table 10) it was possible to contrast estimates from APJ, PC and observational data sources. In these three cases the error probabilities agreed within a factor of 2, 4 and 6 respectively. This convergence, albeit with only three data points, lends support to the validity of the data collected and data collection processes. Usage of training records and investigation of accident statistics should still be attempted for future data collection exercises (the latter was very successful e l s e w h e r e - - s e e Kirwan et al.~5), but in this study they were of limited utility. Based on the previous discussion points, a number of the questions raised in the earlier Section 1.2 can now be answered. • • • • •

Data collection for lifeboat evacuation is feasible Direct observation and expert judgement have proven the best methods in this study The expert judgement derived data at least partly accounts for real event stressors The HEPs themselves in this study have ranged from 0.0002 to 0.2 Data collection supports, but at present does not preclude the need for the use of H R A techniques

The remainder of the discussion focuses on qualitative insights from the data collection process.

4.2 Qualitative insights associated with the data collection exercise: training and selection of coxswains Informal discussions were held with personnel involved in an error where possible. It was evident, however, that most of the errors were occurring because of procedural and task unfamiliarity. Almost half of the trainees observed had never before handled a lifeboat of any sort, so problems were to be expected. Discussions with trainers highlighted another major factor in the ability of the training coxswains. Offshore companies and contractors are required to have a certain complement of trained coxswains. Selection of these coxswains is not always optimised, and inappropriate people may be sent on courses. Some of these people might be clearly unable to cope with the stresses of a reallife evacuation offshore or might not inspire confidence and be unable to maintain control of a panic-stricken crew. Others, who have no interest in the course and have simply attended out of obligation, m a y not retain anything they learn for a sufficiently long period of time. The safety of personnel offshore may therefore be compromised. Several trainers with these views suggested that the selection process needs to be more formalised. Asking for coxswain volunteers and selecting personnel with marine experience would overcome these problems to a certain extent.

91

Trainers also expressed a need to revise the frequency with which refresher training is required. At present coxswains are required to attend a refresher course at two year intervals. Such a long period of time between courses and very limited offshore practice results in returning coxswains being at nearly the same level as new coxswains, undergoing training for the first time. It was suggested that the frequency of refresher training be increased to yearly or even six monthly intervals. It is recognised that training centres have a natural interest in frequent refresher training, but the authors impartial observation to a large extent supported their concerns. The above points are the main ones that arose incidentally during the study. The point of their inclusion in this article is to demonstrate that data collection on human errors will have other 'spin-offs', in terms of useful insights into the root causes of the errors, and potentially fruitful avenues for error reduction.

4.3 Further research One aspect of further work which could be carried out in this area would entail determining the relative importance of the individual errors, via a risk or human reliability assessment of offshore lifeboat launching, using fault and event tree methods. This would more properly prioritise the errors, and also enable the consideration of error recovery factors. The human error contribution to risk would then be determinable. Certainly some of the observed human error rates from this study are uncomfortably high. Although real evacuations are themselves rare events, the data collected do not suggest that human reliability in offshore lifeboat evacuation has been maximised. If a more formal and complete risk or human reliability assessment was carried out, specific and comprehensive measures to improve the reliability and safety of offshore evacuation would be derived.

5 CONCLUSIONS 1. Qualitative and quantitative data on evacuation were collected successfully.

lifeboat

2. Task and error analyses were carried out; 3. A total of 19 HEPs were generated; 4. The impact of stress factors were discussed during APJ/PC sessions, to render the data more applicable to offshore evacuation situations. 5. The data derived can be utilised to support H R A / P S A in this area. It is likely that the best approach would be to employ a H R A technique to modify the data derived (and now in C O R E - D A T A ) to render it appropriate for individual risk assessments. 6. The results raise some cause for concern over human reliability in offshore lifeboat evacuation, since they suggest that certain important errors have undesirably

92

G. Basra and B. Kirwan high HEPs. Further research should attempt to carry out a more formal risk and/or human reliability assessment for lifeboat evacuation, determining both the relative contributions of human errors to risk, and how to improve the reliability and safety of lifeboat evacuation. It is likely, on the basis of insights gained in this study, that primary factors for risk reduction in this area would be training and selection of coxswains.

14.

15. 16.

ACKNOWLEDGEMENTS The authors would like to thank Rachel Harris (RGIT) and all the trainers/instructors from RGIT who took part in the study, as well as Fiona Davies (MaTSU) and Bob Miles (OSD) for project management and support throughout the project.

17.

18. 19.

REFERENCES 1. Cullen, W. D., The Public Inquiry into the Piper Alpha Disaster, HMSO, London, 1990. 2. Henley, E. J. and Kumamoto, H., Reliability Engineering and Risk Assessment, Prentice-Hall, New Jersey, 1981. 3. Green, A. E., Safety Systems Reliability, John Wiley, Chichester, 1983. 4. Cox, S. J. and Tait, N. R.S., Reliability, Safety and Risk Management, Butterworth-Heinemann, Oxford, 1991. 5. Swain, A. D. and Guttmann, H. E., Human Reliability Analysis with Emphasis on Nuclear Power Plant Applications, NUREG/CR - 1278, USNRC, Washington, DC 20555, 1983. 6. Dhillon, B. S., Human reliabilim with human Jactors, Pergamon, Oxford, 1986. 7. Dougherty, E. M. and Fragola, J. R., Human reliability analysis: a systems engineering approach with nuclear power plant applications, John Wiley and Sons, New York, 1988. 8. Kirwan, B., A Guide to Practical Human ReliabiliO, Assessment, Taylor and Francis, London, 1984. 9. Williams, J, C., H E A R T - - A Proposed Method for Assessing and Reducing Human Error. In 9th Advances in Reliability Technology Symposium, University of Bradford, 1986. 10. Embrey, D. E., Humphreys, P. C., Rosa, E. A., Kirwan, B. and Rea, K., SLIM-MAUD: An Approach to Assessing Human Error Probabilities Using Structured Expert Judgement. NUREG/CR-3518, (BNL-NUREG-51716). Department of Nuclear Energy, Brookhaven National Laboratory, Upton, New York 1173, for Office of Nuclear Regulatory Research, US Nuclear Regulatory Commission, Washington, DC 20555, 1984. 11. Comer, M. K., Seaver D. A., Stillwell, W. G. and Gaddy, C. D., Generating human reliability estimates using expert judgement, vol. 1. NUREG/CR-3688 (SAND 84-7115), USNRC, Washington, DC, 1984. 12. Kirwan, B., A Comparative Evaluation of Five Human Reliability Assessment Techniques. In Human Factors and Decision Making. (ed. B. A. Sayers) Elsevier, London, 1988, pp. 87-109. 13. Kirwan, B., Kennedy, R. and Taylor-Adams, S., A Validation Study of Three Human Reliability Quantification

20. 21. 22. 23.

24.

25.

26. 27. 28. 29. 30. 31.

Techniques. In European Safety and Reliability Conference, ESREL '95, Bournemouth, June 26-28 (eds I. Watson and M. Cottam) Institute of Quality Assurance, London, 1995, pp. 641-661. Topmiller, D. A., Eckel, J. S. and Kozinsky, E. J., Human Reliability Databank for Nuclear Power Plant Operations. USNRC Report NUREG/CR-2744, Washington, DC-20555, 1984. Kirwan, B., Martin, B., Rycraft, H. & Smith, A. Human error data collection and data generation. Journal of Quality and Reliabili~, Management, 1990, 7(4), 34-66. Taylor-Adams, S., The use of the Computerised Operator Reliability and Error Database (CORE-DATA) in the Nuclear Power and Electrical Industries. In 1BC Conference on Human Factors in the Electrical Supply Industries, Copthorne Tara Hotel, London, 17-18 October, 1995. Rasmussen, J., Pedersen, O. M., Carnino, A., Griffon, M., Mancini, C. and Gagnolet, P., Classification System for Reporting Events Involving Human Malfunctions. RISO-M2240, DK-4000, Riso National Laboratories, Roskilde, Denmark, 1981. Reason, J. T., Human Error, Cambridge University Press, Cambridge, 1990. ACSNI, Advisory Committee on the Safety of Nuclear Installations Study Group on Human Factors, Second Report: Human Reliability Assessment--A Critical Overview. Health and Safety Commission, HMSO, London, 1991. Taylor-Adams, S. & Kirwan, B. Human reliability requirements. International Journal of Quality and Reliability Management, 1995, 12(1), 24-46. Taylor-Adams, S., Development of a Taxonomy for use with a Human Error database. In Contemporary Ergonomics (ed. E. J. Lovesey) Taylor and Francis, London, 1994. Taylor-Adams, S., CORE-DATA - A Human Error Probability Database, Safety & Reliability Society Quarterly Journal, June, 1994. Taylor-Adams, S., and Kirwan, B., Development of a Human Error Databank. In International Conference on Advancement of System-based Methods for the Design and Operation of Technological Systems and Processes (PSAM II), San Diego, March 20-25, 1994, pp. 87-13-87-19. Taylor-Adams, S. and Kirwan, B., Development of a Human Error Database. In Second International Conference on Reliability, Maintainability and Safety, Beijing, China, 7 10 June, 1994. Landolt, J. P., Light I. M., Greenen, M. G. and Monaco, C., Seasickness in TEMPSC: Five offshore oil rig disasters, Aviation, Space and Environmental Medicine. 1992, pp. 138-144. Robertson, D. H. and Wright, M. J., Ocean Odyssey Emergency Evacuation: Analysis of Survivors Experiences, MaTSU OTX92 407, 1995. [CONFIDENTIAL]. Kirwan, B. and Ainsworth, L. K., A Guide to Task Analysis. Taylor and Francis, London, 1992. Brabazon, P. G. and Bellamy, L. J., Freefall versus DavitLaunched Lifeboats: Human Factors Study. Four Elements Report, C2334, London, 1993. [CONFIDENTIAL]. Kennedy, B., A Human Factors Analysis of Evacuation, Escape and Rescue from Offshore Installations. MSc Report, (unpublished) University of Birmingham, 1992. Hunns, D. M., The method of paired comparisons. In High Risk Safety Technology, (ed. A. E. Green) Wiley, Chichester, 1982. Seaver, D. A. and Stillwell, W. G., Procedures for using expert judgement to estimate human error probabilities in nuclear power plant operations. NUREG/CR-2743, USNRC, Washington, DC, 1983.

Collection o f offshore human probability data 32. Please give details. 33. Leighton, R., What do you care what other people think? Harper-Collins, London, 1993. 34. van der Schaaf, T. W., Lucas, D. A. and Hale, A. R., Near Miss Reporting as a Safety Tool. Butterworth-Heinemann, Oxford, 1991.

93

35. Shepherd, A. Issues in the training of process operators. International Journal of Industrial Ergonomics, 1986, 1, 49-64. 36. Shepherd, A., Analysis and training of information technology tasks. In Task analysis for human-computer interaction, (ed. D. Diaper) 1989, Ellis Harwood, Chichester, pp. 15-54.