Performance-based usability evaluation of a safety information and alarm system



Int. J. Human-Computer Studies 63 (2005) 328–361 www.elsevier.com/locate/ijhcs

Leena Norros, Maaria Nuutinen

Technical Research Centre of Finland, P.O.Box 1301, Fin 02044 VTT, Finland

Received 29 March 2004; received in revised form 16 March 2005; accepted 30 March 2005
Available online 31 May 2005
Communicated by S. Eklundh

Abstract

Evaluation of the appropriateness of information technical systems for complex professional usage in safety-critical contexts poses significant methodical and practical challenges. In this study, the usability of a Safety Information and Alarm Panel (SIAP) in a nuclear power plant control room was tested. An integrated validation concept was used that included a new approach to measuring system and operator performance in complex work environments. The tested system was designed to aid the operators in severe disturbance and emergency situations. It had already been implemented at a nuclear power plant. The study was conducted in a full-scope training simulator. The results verified that an acceptable level of performance could be achieved when using the SIAP. When the operators’ practices were analysed by a habit-centred analysis, it was discovered that the effects of the SIAP differed between crews and between test scenarios. Thus, the SIAP tended to promote coherence of practices but reduce situatively attentive action. In diffuse task contexts the tool failed to support the shift supervisor’s control of the overall process situation, his awareness of the crew’s work load and his ability to update the crew’s awareness of the process. The operators reported that the system supported their process control activity and reduced stress in the situation, but the shift supervisors and operators also noticed some possible negative effects of the tool. These subjective evaluations corresponded to the effects observed in practice. The results revealed the complexity of the implementation of new tools into professional practice. It was proposed that a validation project should focus on the trajectory of development of the entire distributed cognitive system instead of comprehending validation studies as tests of the effects of information systems on a pre-defined process output. Formative evaluation criteria are needed in projecting distributed cognitive systems.

© 2005 Elsevier Ltd. All rights reserved.

Keywords: Validation method; Information presentation; Control room; Operators; Practice; Nuclear power plant

Corresponding author. Tel.: +358 20 722 6551; fax: +358 20 722 6752.
E-mail addresses: Leena.Norros@vtt.fi (L. Norros), Maaria.Nuutinen@vtt.fi (M. Nuutinen).
1071-5819/$ - see front matter © 2005 Elsevier Ltd. All rights reserved. doi:10.1016/j.ijhcs.2005.03.004

1. Introduction

In the operation of complex socio-technical systems (Vicente, 1999), the reliability demands are typically strict. They are usually addressed by implementing defence-in-depth principles in design, preplanning and training (Rasmussen and Svedung, 2000). Potential problems are anticipated and defences created to ensure safe and efficient operation of the process.

Industrial process plants require heavy economic investment and are therefore designed to be operational over a long time. For example, the nuclear power plants that are of special interest in this article have a 40–60-year life span. During this time, the plants undergo many modifications and modernization processes. The nuclear community has become aware of the challenges of upgrading the present nuclear power plants in the near future (Quentin and Niger, 2003). Leading expert organizations have anticipated the needs, and methods for managing the change process have been developed (O’Hara et al., 2002; Jeanton, 2003; Naser et al., 2003).

1.1. Managing change in complex systems

One of the central systems of complex industrial processes, such as a nuclear power plant, is the main control room. The control room constitutes a multi-layered interface between the processes and the personnel. It consists of the immediate human–machine interface and dialogue, which is linked with the human–automation function allocation that is specified by the instrumentation and control technology of the process automation solution (Papin, 2002). According to standards, the functional objective of the main control room of a nuclear power plant (NPP) is to provide the operator with accurate, complete and timely information regarding the functional status of the plant equipment and systems (IEC-9064, 1989).
The diversity of functions in the control room sets high demands on the design, where the requirements on process monitoring in normal operational situations must be tackled systematically (Mumaw et al., 2000; Vicente et al., 2001). Due to the operational significance of the control room, all changes that take place have potential safety consequences.

The control room information system whose validation we consider in this paper deals with an improvement in a generation II plant and control room. The majority of the existing NPPs represent generation I and II designs (O’Hara, 2003; Pirus, 2003b). These plants were taken into operation between 1950 and 1995. The process automation is based on analogue control and instrumentation technology, and the monitoring and operations take place with the aid of control panels and consoles.

In the studied case, a safety information and alarm panel (SIAP) was designed for an NPP control room to aid the operators in severe disturbance and emergency situations. It was designed to provide safety-relevant information and guide operational decision-making. The change induced in the control room via the SIAP was relatively restricted. The SIAP had been implemented in both control rooms of the two-unit nuclear power plant, and the operators had been trained in its use in full-scope training simulators. The operators did not use the system actively in their normal daily operations, but it supplemented the process information from the regular panels. However, high functionality and usability were required due to the intended usage of the SIAP in emergency situations, and the utility was therefore supposed to demonstrate that no deterioration in safety had taken place.

The study was launched to gain systematic information on the usage characteristics of the SIAP. It also aimed at gaining methodical experience of control room modifications, since more comprehensive control room modernization projects were emerging. We were involved as an independent evaluator. We confronted the methodical problem: How do we know that a complex system can be safely operated? The aims of this paper are to present an emerging validation concept and to describe the validation results. Finally, we will discuss the validation challenges of complex systems.

1.2. Integrated validation concepts are required

Design validation is the method used to test and verify the safe functioning of the control room system after modification. The role of validation is to establish whether the design can meet operational and safety requirements once all subsystems (hardware, software and personnel) are integrated together (O’Hara, 1999, p. 37).
Summing up the results of isolated evaluations of the performance of subsystems does not provide the required holistic evaluation. Therefore, integrated system demonstrations are called for. While the demonstrations may bring valuable input to design, their sufficiency for validating the system for use is restricted because they do not properly consider the high variation of operational situations and individual behaviour and, hence, do not provide representative results (O’Hara, 1999; Norros and Savioja, 2004). Consequently, a systematic and generic approach to validation is needed.

O’Hara thoroughly discussed the methodical requirements for valid and reliable results in system evaluation tasks (O’Hara, 1999). An important methodical requirement is to indicate that the new design provides adequate support to the operators. This means that good performance can be specified and identified empirically. In this article we concentrate on both of these issues.

Miberg-Skjerve and Skraaning approached the question of performance criteria by distinguishing levels of evaluation in the validation of complex systems (Miberg-Skjerve and Skraaning, 2003; Norros and Savioja, 2004). The first level consists of the users’ acceptance of the system, and validation is based on the operators’ evaluations of the system. The second level refers to the appropriateness of the design result with regard to task performance of individual operators and teams, and the third level to the performance of the system with regard to the outcome, e.g. the achievement of the overall production goals of safety and effectiveness.

The concept under development at the OECD Halden Reactor Project emphasises a human-centred approach, and this orientation is reflected in the central role given to the criteria of user acceptability. These criteria have been stressed by some earlier authors (Rouse, 1991; Lee and Moray, 1994) in the context of complex work domains, but their significance was first identified with regard to smaller, usually consumer-oriented equipment (Nielsen, 1993). The Halden validation concept also draws on the criteria developed at the Brookhaven National Laboratory. The Brookhaven concept uses the following performance measures: system performance, personnel task performance, cognitive and mental load factors and anthropometric factors (O’Hara, 1999). Similar criteria are also listed by other authors (Kontogiannis, 1996).

When different criteria are used in the evaluation, the question arises of how the criteria are connected (Miberg-Skjerve and Skraaning, 2003). It would, for example, be important to consider whether it is adequate in a validation process to establish a connection between the user evaluations and the performance criteria, and the overall system output. Could it be reasonable to expect that good results in operator performance become manifest as good overall system performance, i.e. in the outcome? This question has great relevance for the interpretation of the results of validation tests. Furthermore, it is a methodically challenging problem and relates to the underlying conceptions of human conduct and the role of technical tools in the formation of practices.

2. Validation concept

2.1. Theoretical background for an integrated validation approach

There appears to be a well-founded need for an integrated validation concept for complex information systems. However, many questions arise concerning the methods of evaluation. Because information and control systems function as operator tools, the evaluation of their appropriateness is linked with theoretical conceptions concerning human conduct. Scientists working in human–technology interaction research have recently expressed concerns regarding the cognitivistic information-processing methodology that characterizes traditional human–computer interaction studies. The linear conception of information processing and action is seen as inadequate for describing the adaptive processes of human practice in complex environments (Carrol, 1997; Bannon and Kaptelinin, 2000; Hollan et al., 2000; Hollnagel, 2003). Alternative theoretical approaches to human action have been proposed and applied in system development processes (Bannon, 2000; Decortis et al., 2000).

These divergent approaches include, e.g. the cultural–historical theory of activity (Vygotsky, 1978; Engeström, 1999), ethnography (Suchman, 1987), the distributed cognition approach (Hutchins, 1995; Hollan et al., 2000), or the Francophone course of action analysis (Theureau and Filippi, 2000). These approaches share the idea that human action is situatively constructed (Suchman, 1987). They also maintain that cognition is embedded in the social system of activities and that it should be comprehended as distributed cognition shared between the human and artificial elements of the system (Halverson, 2002). The cultural–historical theory of activity emphasises the temporal aspect of distribution of cognition and the need to analyse the history of activity systems (Engeström, 1987). Changes in technology reorganize the whole system and, consequently, the implications of changes must be analysed with regard to the functioning of the distributed cognitive system and in relation to the global goals of the particular work activity. All these theories thus emphasize that the circumstances or context of cognition must be taken into account in the analyses. When analysing the differences between the theoretical orientations of cognitively oriented human–computer interaction studies and the ethnographically oriented computer-supported collaborative work tradition, several authors encourage interaction between the approaches (Luff and Heath, 2000).

Major obstacles still remain in the application of the contextual cognitive theories of human action to system development. As Bannon (2000) noted, the information-processing approach and its empirical metrics appear to combine more easily with the demands of the system developers, and cognitive studies are thought to be more useful for design (Hoc, 2001). For example, system validation draws on traditional methods, even though investigators are aware of the theoretical limits of these methods (Kirwan, 2003).
Practical experiences of usability problems in industry and in the consumer sector create an important driver for a new orientation in human–technology interaction research (Bannon, 2000). Furthermore, the high stakes in safety-critical work domains force designers to seek better validation methods and more synthetic or predictive human factors design and evaluation criteria (Papin, 2002).

We chose to approach the validation task from the cultural–historical perspective, which maintains that work performance is embedded in the societal activity system (Leont’ev, 1978; Vygotsky, 1978; Engeström, 1999). The cultural–historical theory of activity emphasises the object-orientedness of activity and stresses the role of the targeted outcome in the structuring of the activity and actual actions. The theory also emphasizes the role of artefacts, norms and rules and the division of labour as mediators between humans and the environment. In a validation context where the performance of a joint cognitive human–technology system is tested, we are particularly interested in the situated actions, and aim at analysing concrete tool-using interactions between the actors and the environment. Hence, the organization of cognitive and operative functions within the operator team, and between the team and the information system, in maintaining control of the process is studied in detail. In this respect our approach closely resembles Hutchins’s notion of distributed cognition (Hutchins, 1995), which is specially tailored for understanding interactions among people and technologies (Hollan et al., 2000). The distributed cognition approach may be considered to be theoretically compatible with the activity theoretical approach, and it provides added value by operating at a more detailed level of analysis (Halverson, 2002).

2.2. Research questions and hypothesis

The main research question to be answered in the study was whether the new SIAP system supports the control of the process in a disturbance situation. This was analysed through three questions.

First, does the availability of the SIAP affect process performance, which may be interpreted as adequacy of task performance? Our generic assumption was that the changes induced by implementation of the SIAP system in the control room might manifest themselves as a better or constant system outcome. The arguments that support a positive effect are: the SIAP was designed for supporting disturbance control, it consisted of a summary of the relevant information on the plant state, and the users had been involved in the design. A negative effect, or no change in the outcome, could be assumed because of the complexity of the whole system. However, because of the high level of expertise among the operators and the complexity of the operating tasks, the performance outcome measures would probably not be sensitive enough to indicate potential safety-relevant effects induced by the SIAP in the overall system performance. Therefore, we expected that the identification and explanation of possible changes would require analysis of the qualitative features of operator practices.

The second question was: Does the availability of the SIAP change the working practices of the crew and, if it does, in what way? The changes in the possibilities and constraints of action that the particular information system provides reorganize the practices of the operators, and the effect may be different depending on the nature of the current practices or situation.

Finally, the third question was: What are the users’ experiences of the usefulness of the SIAP?

3. The method of evaluating the appropriateness of the SIAP system performance

The major methodical problem in the study was to find a way to determine whether the SIAP is an appropriate tool for NPP operations in emergency situations. In the main hypothesis we already assumed that the process outcome would not be a sufficient indicator of a safe enough operator tool. We also had doubts about the sufficiency of validating the instrument through inspecting its interface features against existing standards. We share these concerns with authors who have argued for an integrated system validation (O’Hara, 1999; Miberg-Skjerve and Skraaning, 2003; Norros and Savioja, 2004). Hence, we set about creating a performance-based method to develop a more comprehensive evaluation basis for the usability of a complex system.

A new type of task analysis framework has emerged during the empirical research that we have accomplished in complex work domains (Norros and Sammatti, 1986; Hukki and Norros, 1993; Norros and Hukki, 1995; Norros and Hukki, 1997; Hukki and Norros, 1998; Holmberg et al., 1999). This approach was labelled the Core-Task Analysis (Norros and Nuutinen, 2002; Norros, 2004). The present study was one of the contexts in which the Core-Task Analysis method was developed and tested.

In the Core-Task Analysis, we aim at finding generic principles that regulate action in and of an organization. According to the cultural–historical theory of activity, the object and outcomes of the activity constitute the societal motivation and meaning of work. These have an effect on the organization of actions within the activity system and they provide a regulatory background for actions. We developed the analysis of an activity system further by introducing a functional modelling approach to define the functional possibilities that the domain provides for the human in the particular activity. As a result of the activity system analysis and functional modelling we define the societally meaningful content of particular work, the core task. We see that the core-task demands should be fulfilled in people’s situated actions in order for this activity to be sustainable and contain potential for development.

The core task is used as a reference when two types of indicators are derived for the evaluation of the overall system performance. The first set of indicators deals with process performance and the second set with operator practices.

3.1. Process performance indicators

Process performance denotes the functioning of the process to be controlled. It is considered from two points of view. The process is first seen as a target in the environment that has become an object of human activity. Furthermore, the process is an outcome of the activity. The distinction between object and outcome is borrowed from the cultural–historical theory of activity (Leont’ev, 1978). The object aspect of the NPP process is dealt with by modelling the NPP process as a domain that fulfils the societal objectives of activity.
The objectives of nuclear power production were considered in a broad sense and defined as safety, productivity and well-being. These main objectives may be concretised by analysing the constraints and possibilities of action that characterise sustainable activity in the domain. Drawing on the work of Jens Rasmussen, Vicente recently introduced a formative modelling approach to conceptualization of the intrinsic constraints of complex domains (Rasmussen, 1986; Vicente, 1999). We exploited the above ideas and defined the constraints on a generic level with the help of the critical function concept. This concept denotes a decomposition of a complex system into its main outcome-critical operating functions, which may be fulfilled by different means and physical and social systems and components (Corcoran et al., 1981; Rasmussen, 1986; Rasmussen and Svedung, 2000; Pirus, 2003a, b). Pirus maintained that functional analysis denotes dismantling the system into operating functions as “a set of additional or redundant means needed to fulfil a precise operation mission in response to the constraints imposed by the environment” (Pirus, 2002, 2003a).

In our analyses we have considered the safety-critical functions defined by Corcoran for pressurized water reactors. These functions are reactivity, core heat removal, coolant inventory, pressure control, heat transfer from primary to secondary circuit, containment temperature/pressure control, containment integrity and power for emergency systems (Kautto, 1984; Norros, 2004). In the future it would be important to extend the definition of the outcome-critical functions to include the productivity and well-being objectives as well.

Process performance may be defined by considering how well the result-critical functions are maintained within the critical safety (productivity and well-being) boundaries in a particular situation. Hence, the outcome-critical functions should be portrayed in situation-specific scenarios. We call the results of this situation-specific conceptualization Functional Situation Models (FSM). FSMs include analyses of the available resources (information, operative methods, procedures, etc.) from the point of view of their role in the maintenance of the critical functions of the work in a given situation (Norros, 2004). The models define the situation in functional terms. They provide insight into the operators’ potential actions and decisions in a disturbance situation (normal daily operations were not considered in detail in this study).

Based on the literature on process control and on our own work, three tasks were considered central in a disturbance situation: identification of loss of process stability, use of stabilization measures and identification of the cause of disturbance (Rasmussen, 1986; Hoc et al., 1995; Hukki and Norros, 1998). It was further assumed that, in response to the possibly contradictory constraints of the activity system, human operators must take into account and balance between critical functions when accomplishing these tasks.
Balancing between the functions in accomplishing the three main tasks constitutes the core-task demands of the particular work (Klemola and Norros, 1997; Nuutinen and Norros, 2001; Reiman and Norros, 2002; Oedewald and Reiman, 2003; Norros and Klemola, 2005). These core-task demands were further used in developing indicators of process performance.

We utilized the core-task modelling first to create scenario-specific functional criteria for the process performance. This was accomplished together with plant personnel who had substantial operating experience. Then, particular features were defined as indications of successful process control. These are the process performance indicators. The indicators included understanding of the main signs of disturbance, the water level of the reactor tank when a particular decision was made, or the comprehensiveness of the disturbance diagnosis. Ratings by experts were used when the acquired indicators were applied in the evaluation of the process performance.

Because these indicators refer to the outcome of actions, they are definable independently of the actual accomplishment of the actions. They may therefore be considered as an external good of practice (MacIntyre, 1984). The process performance indicators are important because they denote the successfulness of the actions. Good process performance follows from the actors’ ability and willingness to put their personal skills and knowledge into effect. In the present study, we studied performance in four different situations. Therefore, we prepared four different FSMs and four sets of process performance criteria.
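The idea of checking whether critical functions stay within their boundaries in a given scenario can be illustrated with a minimal Python sketch. It is not the authors' method: the study relied on expert ratings against scenario-specific criteria, whereas this sketch assumes, purely for illustration, that each function can be summarised by one number and a range. The names `Boundary` and `threatened_functions` and all numeric values are hypothetical; only the list of eight functions comes from the text.

```python
from dataclasses import dataclass

# The eight safety-critical functions named in the text (after Corcoran et al., 1981).
CRITICAL_FUNCTIONS = (
    "reactivity",
    "core heat removal",
    "coolant inventory",
    "pressure control",
    "heat transfer from primary to secondary circuit",
    "containment temperature/pressure control",
    "containment integrity",
    "power for emergency systems",
)


@dataclass(frozen=True)
class Boundary:
    """Hypothetical acceptable range for one monitored quantity."""
    low: float
    high: float

    def contains(self, value: float) -> bool:
        return self.low <= value <= self.high


def threatened_functions(boundaries, observations):
    """Return the critical functions whose observed value lies outside its boundary.

    Functions that the scenario model covers but that have no observation are
    also reported, because their status is unknown.
    """
    threatened = []
    for name in CRITICAL_FUNCTIONS:
        if name not in boundaries:
            continue  # this scenario model does not cover the function
        value = observations.get(name)
        if value is None or not boundaries[name].contains(value):
            threatened.append(name)
    return threatened


# Illustrative numbers only; they carry no physical meaning.
example_boundaries = {
    "coolant inventory": Boundary(low=2.0, high=5.0),
    "pressure control": Boundary(low=60.0, high=75.0),
}
example_observations = {"coolant inventory": 1.4, "pressure control": 70.0}
print(threatened_functions(example_boundaries, example_observations))
```

A scenario-specific model in the sense of the text would contain far more than numeric ranges (available information, methods and procedures); the sketch only shows the boundary-checking step.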


3.2. Indicators of operator practice

There are, however, different logics for people to balance between different constraints when accomplishing their tasks, i.e. in identifying the loss of process stability, in using stabilization measures, and in identifying the cause of the disturbance. By the logic of acting we mean the mode of performance that expresses the sense the core-task demands make in a particular action. The mode of acting is the actor’s learned relationship to the work environment, a generic disposition that provides continuity in his actions in a contingent environment. A disposition to act expresses the potential to act. The distinction between the potential to act and its actualization in an act is a central point in the theory of habit by Charles Sanders Peirce (Peirce, 1998a). The theoretical background is treated extensively elsewhere (Norros, 2004). This distinction is also utilized in the Francophone “course of action analysis” (Theureau, 1996), which also draws on Peirce. Distinguishing between the actual and specific aspect, on the one hand, and the generic and potential aspect, on the other, provides a way to understand the continuous adaptive interaction between the actor and the environment. Hence, a theoretically based conception of practice emerges.

In developing the indicators of operator practice in the control of NPP disturbance situations we first distinguished three basic interactions: interactions with the process, interactions within the crew and interactions with oneself. These three types of interactions constitute the three categories of habits of action: the “way of decision making”, the “way of collaborating” and the “way of coping with problem situations” (see Table 1). We developed indicators of practice in our extensive empirical analyses of the operators’ actual performance in several earlier studies (Norros and Hukki, 1995; Hukki and Norros, 1998; Norros, 2004).
The material of the SIAP evaluation completed the deﬁnition of the indicators. The ﬁnal set of indicators used in the SIAP evaluation is indicated in Table 1 (see Table 1 numbered items). These indicators express the core-task demands that would be relevant to each of the three types of habits of action (way of decision making, way of collaborating and way of coping with problem situations). The subtitles of Table 1 indicate the subcategories of habits of action. The division of habits of action and of the subcategories is based on the evaluation scheme presented earlier by Hukki and Norros (1998). We identiﬁed 51 items to be used as indicators of practice. The indicators (and their abbreviations used in the illustrations in the results section) are presented in Table 1. The fulﬁlment of the core task-based indicators in the actual work performance may vary because features of the core-task may make different sense to these actors. To distinguish the sense of considering the core-task demands expressed by the indicators we created a deeper, generic evaluation aspect. It refers to the nature of adaptive behaviour in an uncertain environment (Hukki and Norros, 1998; Norros, 2004). Drawing ﬁrst upon Ewald Ilyenkov’s ideas (Ilyenkov, 1977, 1984) and later on pragmatist conceptions of human conduct (Peirce, 1998b), we see that situationally appropriate behaviour in an uncertain world is characterized by an interplay of doubt and belief regarding the state of the environment. Continuity in the behaviour

ARTICLE IN PRESS L. Norros, M. Nuutinen / Int. J. Human-Computer Studies 63 (2005) 328–361

337

Table 1 Evaluation of the crew’s habits of action: indicators for the situational appropriateness of the crews’ working practice Item number

Abbreviation

Indicators for the situational appropriateness of the crews’ working practice

The underlying evaluation dimensions

Table 1 (continued)

Item number | Abbreviation | Indicators for the situational appropriateness of the crews’ working practice | The underlying evaluation dimensions

Way of decision making

To what extent has the crew taken the global state of the process into account in its decision-making
1 | Parameters | Reaction to the important process parameters | C
2 | Check ASF | Checking automatic safety functions | C
3 | Task priority | Prioritizing of tasks | C
4 | Methods | Use of adequate methods | C
5 | Operations | Synchronization of operations | C
6 | Reasons | Taking into account the reasons for the disturbance in process stabilization | C

To what extent has the crew tried to comprehend the particular nature of the disturbed state of the process
7 | Search | Active search for process information | S
8 | ASF conditions | Clarifying the fulfilment of conditions of automatic safety functions | S
9 | Causes | Finding out the causes of the disturbance | S
10 | Critical info | Using critical process information in identification of the causes | S
11 | Event order | Using the order of the process events in identifying causes | S

To what extent has the crew comprehended the situation-specific possibilities to act
12 | Redundant info | Use of redundant sources of information | S
13 | Confirm | Confirm/double-check the interpretation of the process state with additional information | S
14 | Method limits | Taking into account the situational restrictions of the operation method | S

Way of collaborating

Way of communicating: To what extent have the shift supervisor (SS) and each crew member (CM) promoted a shared interpretation of the situation
Shift supervisor:
15 | SS: Comm. | Clearness of communication | C
16 | SS: Task | Clearness of task allocation | C
17 | SS: Shift’s SA | Updating shift members’ situation awareness | S
18 | SS: Conclude | Concluding the situation with the whole crew | C
19 | SS: Diagnose | Bringing out the diagnostic meaning of the events | S
20 | SS: Methods | Bringing out the differences between stabilization methods | S
21 | SS: Inform | Informing beyond the control room | C
Crew members (three operators, CM1-3):
22; 25; 28 | CM: Comm. | Clearness of communication | C
23; 26; 29 | CM: Inform obs. | Informing of own observations | S
24; 27; 30 | CM: Inform checks | Informing of own checking and operations | S

Way of co-operating: To what extent have the shift supervisor and each crew member promoted unity of collaboration
Shift supervisor’s management practice:
31 | SS: Leading role | Taking the leading role | C
32 | SS: Monitor | Making sure all tasks are performed | C
33 | SS: Crew’s opinion | Taking into account the conceptions of the crew in situation interpretation | S
34 | SS: Crew’s work load | Taking into account the work load of the crew | S
Crew members (three operators, CM1-3):
35; 38; 41 | CM: Interpret | Participating in common interpretation of the situation | C
36; 39; 42 | CM: Plan | Participating in actualisation of the plan; acting according to expectations | C
37; 40; 43 | CM: Effort | Making an extra effort | C

Way of coping with problem situations

To what extent have shift supervisor and each crew member been able to re-orientate when confronting the problem
Shift supervisor:
44 | SS: Control | Maintaining overall control and coping with obstacles in operations | C
45 | SS: Question | Questioning of the prevalent interpretation of the situation | S
Crew members (three operators, CM1-3):
46; 48; 50 | CM: Question | Questioning of the prevalent interpretation of the situation | S
47; 49; 51 | CM: Coping | Coping with obstacles in operations | C

To what extent have the shift supervisor and each crew member been able to evaluate their own resources critically
Shift supervisor:
- | - | Using outside help when needed | S
Crew members:
- | - | Expressing the need for help

CM = crew member, SS = shift supervisor

is accomplished by developing habits that are repeated. The stronger the personal ability and interest to attend to a situation as a specific instance and object of knowledge, the stronger the tendency to act in an interpretative way. The weaker this tendency is, the more the action tends towards reactiveness and mechanical routines. The habit of action analysis thus expresses to what extent the operators, with regard to certain core-task demands, have tended towards a global interpretation (coherence) of the particular disturbance (situativeness) in their diagnostic and operative decision-making, towards a shared interpretation and coherent cooperation in their teamwork, and towards a situationally adequate mobilization of resources in coping with problems (Norros, 1997; Hukki and Norros, 1998). As a result we acquired a grading of habits of action on three levels. These levels were then transformed into scenario-based behavioural markers. The indicators and grading of the operators’ practice define what the particular professional community of actors considers good practice and what it holds to be ethically sound. According to the philosopher MacIntyre, such qualifications of practice constitute the internal good of practice and are conceptually inseparable from the definition of the particular professional function itself (MacIntyre, 1984).

4. Conducting the validation study

4.1. Description of the system to be validated

The purpose of the SIAP was to provide an overview of the state of the plant in a severe disturbance or accident situation (on the left in Fig. 1) and to aid the use of the emergency operating procedures (on the right in Fig. 1). The SIAP system consisted of two large conventional panels located on one side of the control room next to the normal operating panels. The safety information part of the panel showed the central plant process parameters in an illustrative manner. The reactor and containment were depicted schematically on the panel, including the main components and the safety systems.


Fig. 1. The safety information and alarm panel.

The values of parameters relevant to the functioning of these systems were shown next to the system schemes on the regular instruments. The information panels were clearly and constantly visible from the normal control positions of the crew. In a disturbance situation, the shift supervisor was supposed to proceed to the panels to exploit the information they provided. The alarm part of the SIAP provided information on the state of the critical safety functions of the process. It also indicated the access criteria to emergency operating procedures in case any of the critical functions were threatened. The system was tailored for the plant and was retrofitted into the existing control room.

4.2. Participants in the validation tests

Six operator crews, randomly selected from the thirteen crews of the power plant, took part in the study. All crews consisted of four persons, a shift supervisor and three operators. All but one had many years of experience of operating the plant. The crews had recently participated in a training course on the full-scale simulator, during which use of the SIAP had been practised. The SIAP had been available in the real control room for about 1 year.

4.3. Test design and test procedure

The study was carried out in a full-scale training simulator by an interdisciplinary research team. Engineering, reliability engineering and psychology were represented in the group, which was completed by one simulator trainer and one expert from the operational unit of the power plant. The SIAP system was tested in an experimental study in which six crews each performed four accident scenarios. Each crew managed two test situations with, and two without, the aid of the SIAP. The order of the experimental conditions was permuted as necessary. Availability of the SIAP (with/without) was the major experimental variable, and the process performance, i.e. the process outcome, was the dependent variable. Different measures of operator practices were defined and criteria created for evaluation. Each experimental session lasted one and a half days and consisted of the following phases: the purpose of the study was first explained and discussed, then two scenarios per day were run (1–2 h each). After each experimental run, a debriefing was conducted with the crew concerning their performance. A group interview concerning the crew’s overall impression of the SIAP was conducted at the end of the whole session.

4.4. Scenarios of four severe disturbance situations

The scenarios used in the study were severe disturbance situations: a leak in the feed-water system, a leak in the steam line, a pressure transition combined with a leak, and a power failure (see Table 2). The situations were planned carefully with the aim of creating realistic operational conditions and gaining valid results on the usability of the SIAP. The operators should have the possibility to affect the course of events in a positive way, but the disturbances should be complex and difficult enough from the point of view of diagnosis and operations. The disturbances should require cooperation within the crew, and contacts outside the control room should become necessary. Several planning sessions and considerable effort were devoted to selecting and planning the scenarios. The expert from the plant and the research team participated in this task, for which the simulator trainer was responsible. As indicated in the previous chapter, we prepared functional situation models of the test scenarios. We also classified the scenarios with regard to several aspects: how complicated the scenario was, how fast the disturbance developed, the availability of the information and operating methods, and the relevance of the SIAP to the scenario. The overall task complexity of each scenario was estimated according to these aspects. The classification of the scenarios is presented in Table 2.

4.5. Data collection

During each run, major parameters of the processes were registered on logouts. Video recordings were made, and on-line expert observations of the behaviour of the operators and events in the control room were carried out according to a structured protocol. In the debriefing interviews after each scenario, the operators’ interpretation of the situation was first queried. Then the development of the disturbance was treated step by step in chronological order. After each run the research group went through the recordings to make the necessary clarifications of the course of actions.
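The with/without design described in Section 4.3 can be illustrated with a small sketch. The paper reports only that each of the six crews ran two scenarios with and two without the SIAP and that the ordering was permuted; the concrete assignment below is an assumption made for illustration, not the assignment used in the study. It relies on the fact that there are exactly six ways to choose two scenarios out of four, so giving each crew a different pair balances the conditions across scenarios. The function name `balanced_assignment` is hypothetical.

```python
from itertools import combinations

CREWS = ["A", "B", "C", "D", "E", "F"]
SCENARIOS = [1, 2, 3, 4]


def balanced_assignment(crews, scenarios, n_with=2):
    """Give each crew a distinct subset of scenarios to run with the SIAP.

    Hypothetical scheme: the study's real assignment is not reported.
    Returns {crew: {scenario: True if run with the SIAP, else False}}.
    """
    subsets = list(combinations(scenarios, n_with))
    if len(subsets) < len(crews):
        raise ValueError("not enough distinct subsets for the crews")
    plan = {}
    for crew, with_siap in zip(crews, subsets):
        plan[crew] = {s: (s in with_siap) for s in scenarios}
    return plan


plan = balanced_assignment(CREWS, SCENARIOS)
for crew, conditions in plan.items():
    row = ", ".join(
        f"S{s}:{'with' if used else 'without'}" for s, used in conditions.items()
    )
    print(crew, row)
```

Under this assumed scheme every crew has two runs in each condition and every scenario is run with the SIAP by three crews and without it by three crews, which is the kind of balance the permutations in the study were presumably meant to approach.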


Table 2
Complexity of the different scenarios used in the validations

Scenario 1: Leak in the feed water system without isolation
Scenario 2: Leak in the steam line and sticking of control rods in shutdown
Scenario 3: Pressure transition. Pressure control problem and leak
Scenario 4: Power failure and problem with feed water system

Aspect | Scenario 1 | Scenario 2 | Scenario 3 | Scenario 4
Complexity of scenario | Clear (main event) | Diffuse (several events) | Diffuse (many problems at the same time) | Diffuse (electrical problems)
How fast the disturbance developed | Fast | Slow | Slow start, then fast | Slow
Availability of information | Good, no important information missing | Medium, no important information missing, but also wrong information | Medium, no important information missing, but also wrong information | Poor, both missing and wrong information
Availability of operating methods | Good, many different | Poor, methods safety effects not clear | Poor, many methods needed | Poor, availability of method unknown
Relevance of SIAP | High, both parts | High, both parts | Medium, safety panel had little relevance and the procedure part had relevance only in the second part of scenario | Low, both parts
Estimated overall task complexity | Lowest | 3rd highest | 2nd highest | Highest

The process experts prepared an evaluation of the adequacy of the process control. The operators’ conceptions regarding the SIAP were discussed in the group interview after the completed session. The interviews were tape-recorded and transcribed into protocols.

4.6. Analysis of the data

The analysis was guided by our research questions and hypotheses. The main research question to be answered was whether the new SIAP system supported the control of the process in a disturbance situation. As indicated earlier, our hypothesis was that the positive changes induced by implementation of the SIAP system in the control room might manifest themselves as effects on the system outcome. However, we expected that identification of and explanations for the possible changes would


require analysis of the operator practices. The following questions were used to guide the analysis and to test the hypothesis:

Does the availability of the SIAP improve process performance or keep it constant, which may be interpreted as the adequacy of task performance?
Does the availability of the SIAP change the working practices of the crew and if it does, how?
What are the users’ experiences of the usefulness of the SIAP?

We prepared charts in which the pieces of raw data from the simulator were combined into timelines. These included trends of the process parameters, logouts of the process events and alarms and the operators’ actions. Table 3 demonstrates how the data was reﬁned further to present the crews’ courses of action in the different scenarios. The crew’s interactions with the process and within the crew or outside the control room were classiﬁed with regard to the three task categories, i.e. identiﬁcation of the loss of process stability, use of stabilization measures and identiﬁcation of the cause of the disturbance. The crews’ habits of action were determined on the basis of the courses of action and the debrieﬁng interviews, in which the operators described and gave reasons for their performance. We used the 51 habit-of-action indicators deﬁned in Table 1. Ratings (0–2) with regard to each indicator were performed by two researchers. If the ratings differed, the investigators attempted to achieve a consensus through discussions. If needed the scenario-speciﬁc criterion was corrected and all crews were re-evaluated with the new criterion. Such discussions were needed in less than 5% of the items. Sum variables were formed of the system performance items (3 items) and of the habit of action items (16 different sum variables) for the quantitative analysis. The sum variables were formulated following the structure of the assessment tool (Table 1). The different levels of headers in the rows of Table 1 describe the division of items used for grouping them into the sum variables. These are: all items, the three interactions, their subcategories and also the division between the items considering the shift supervisors and the crew members in the subcategory. In the analysis, graphical representations were ﬁrst used with the intention to acquire an overview of the data and to sketch differences in the data. 
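The rating step described above (two researchers grading each indicator on a 0–2 scale, discussing any disagreements, and then forming sum variables along the structure of Table 1) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names `disagreements` and `sum_variable`, the grades, and the outcome of the consensus discussion are all hypothetical, and the paper does not describe how the ratings were stored or processed.

```python
def disagreements(rater_a, rater_b):
    """Return the item numbers on which the two raters gave different grades."""
    return sorted(item for item in rater_a if rater_a[item] != rater_b[item])


def sum_variable(ratings, items):
    """Sum the 0-2 grades over a group of items, skipping unrated items."""
    return sum(ratings[i] for i in items if ratings.get(i) is not None)


# Hypothetical grades for items 1-6 of Table 1 (global state of the process).
rater_a = {1: 2, 2: 1, 3: 2, 4: 0, 5: 1, 6: 2}
rater_b = {1: 2, 2: 1, 3: 1, 4: 0, 5: 1, 6: 2}

# Items that would go to a consensus discussion.
to_discuss = disagreements(rater_a, rater_b)

# Assumed outcome of the discussion: the raters settle on grade 2 for item 3.
consensus = dict(rater_a)
consensus[3] = 2

# One sum variable, formed over the first subcategory of Table 1.
global_state_sum = sum_variable(consensus, range(1, 7))
print(to_discuss, global_state_sum)
```

Keeping the disagreement list separate from the consensus grades makes it easy to report how often discussion was needed, which the study gives as less than 5% of the items.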
The differences were then subjected to statistical tests with Survo (Mustonen, 1992) at significance levels of 0.1%, 1% and 5%. A correlation matrix was created and its meaningfulness checked. The normality (Shapiro–Wilk test and Anderson–Darling test), variances and deviations of the variables were checked in order to choose adequate statistical tests. Some items did not have resolution power in the data, and some items were not relevant to every scenario. The analyses of variance were carried out with ANOVA (Brown–Forsythe or F-test). Different kinds of interactions were checked. The effects of the operator crew and the scenario were also checked. The comparisons between pairs were carried out with Student’s t-test (for normally distributed variables) and the Mann–Whitney test (for variables that were not normally distributed). Graphical representations were constructed for further analysis of the differences


Table 3
Course of action analysis in a particular scenario

Time: 0…14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26

Identification of loss of process stability:
CM1 notices high reactivity R4.60. SS and CM2 go to see situation (place: KA). CM1 informs that they have a leak into the containment building and pressure increases. (Looks PB). CM1 supposes leak in 211K404 and doubts if it can be isolated.

Use of stabilisation measures:
(meeting) SS proposes power reducing. CM2 says that they must prepare themselves for I-isolation. He takes I-procedure out. SS takes ÖSI-instruction out (high reactivity). CM1 and CM3 follow up the development of the situation (place: KA). CM2 notices that there are differences in pressure measurements and doubts if there is error in the measuring. SS says that they rely on the other one: the pressure is increasing. CM3 goes to printer (KD) in order to check pressure. SS decides that they will stop. CM1 proposes partial reactor trip. SS accepts - Reactor trip goes off. Everybody thinking why it came. CM3 says that he takes first control procedures for turbine. CM1 proposes I-isolation, but SS commands to take the first control procedures first. Pressure increases. CM2 and CM3 report the first control procedures: no abnormality. Pressure increases. CM1 reports the first control procedure. SS asks if I-isolation has come and CM1 says no. SS checks steam line 4. CM1 says that they have had I-isolation and takes the first control procedure.
CM reports the first control procedure - Fire alarm. CM3 asks CM2 for help. CM2 asks CM1 to have his eye on the level.

Identification of the cause of disturbance:
CM1 informs that they have a leak into the containment building and that the pressure increases. CM1 supposes a leak in 211K404 and doubts if it can be isolated.
- Reactor trip goes off. Everybody thinking why it came. CM3 says that they have two canals that went off.
CM1 and SS notice that the steam line shows open but v80 is almost closed.
SS notices steam line 3 flow.
CM1 suggests leak in the steam line. SS and CM1 check on the steam conduct.


between the with-panel and without-panel situations. The interviews were used in further analysis of the research questions, especially to get a picture of what the shifts thought about the SIAP.
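For the pairwise comparisons of variables that were not normally distributed, the study used the Mann–Whitney test as implemented in Survo. The following sketch is not the Survo procedure; it is a plain-Python illustration of how the Mann–Whitney U statistic itself is obtained from midranks, with hypothetical function names and hypothetical sum-variable values. It computes only the statistic and does not derive a p-value.

```python
def midranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U_x, U_y) for two independent samples x and y."""
    ranks = midranks(list(x) + list(y))
    rank_sum_x = sum(ranks[: len(x)])
    u_x = rank_sum_x - len(x) * (len(x) + 1) / 2
    u_y = len(x) * len(y) - u_x
    return u_x, u_y


# Hypothetical sum-variable values for runs with and without the SIAP.
with_siap = [4, 5, 3, 4]
without_siap = [5, 6, 4, 6]
u_with, u_without = mann_whitney_u(with_siap, without_siap)
print(u_with, u_without)
```

The smaller of the two U values is the one usually compared against a critical value or converted to a p-value; with samples as small as those in this study, an exact rather than a normal-approximation p-value would be the safer choice.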

5. Results

The results for the above research questions are presented with the help of a simple model that guided the analysis (Fig. 2). The arrows indicate the effects that we were testing. The numbers in the figure refer to the order in which the results are presented below.

5.1. Effect of the SIAP on process performance

The main research question to be answered in the study was whether the new SIAP system supported the control of the process in a disturbance situation. This question was studied in several phases. First, we analysed the effect of using the SIAP on process performance (number 1a in Fig. 2). No statistically significant general effect was found (Compare, Mann–Whitney test). The SIAP and the crew, or the SIAP and the scenario, had no combined effect on the process performance. The variances of the process performance were low. Next, we analysed the differences between the crews with regard to process performance (adequacy of process control, number 1b in Fig. 2). Our results indicated that there were no significant crew-dependent differences in the process performance (ANOVA, one-way, F(5, 18) = 0.178; Brown–Forsythe statistic = 1.735, df = 5 and 13.47, p = 0.196).

Fig. 2. The model used to structure the analysis of the results. The arrows indicate the tested effects and numbers indicate the different steps. (Diagram elements: Information system, which affords constraints and possibilities for action; Operator practices, with situative and habitual features; Process performance, the object and outcome of operator performance. The arrows are numbered 1a, 1b, 2a, 2b, 2c, 2d, 3 and 4.)


5.2. Does the availability of the SIAP change the working practices of the crew and if it does, how?

In order to tackle this question we tested the effect of use of the SIAP on the operators’ habits of action (number 2a in Fig. 2). It was found that the SIAP had no direct statistically significant effect on the habits of action (tested separately on 16 different sum variables; Compare, Mann–Whitney test). Moreover, the SIAP and the crew, or the SIAP and the scenario, had no combined effect on the operators’ habits of action (ANOVA, two-way, F-test/Brown–Forsythe statistic). We also tested whether there were differences among the crews with regard to the working practices (number 2b in Fig. 2). Our results show that the crews do differ on a global level with regard to their habits of action. The result was statistically significant with regard to the way of decision-making, p < 0.05 (Brown–Forsythe statistic = 4.827, df = 5 and 10.32, p = 0.0167), and the way of cooperating, p < 0.05 (Brown–Forsythe statistic = 3.350, df = 5 and 14.07, p = 0.034), but not with regard to the way of coping with a demanding situation, p > 0.05. Crew E’s working practice was the best and crew D’s the worst. The differences found between the crews with regard to the habits of action could possibly overshadow the effect of the SIAP system. To check this possibility and reveal the potential effect of the information system, we standardized the habit of action. The analysis was accomplished by graphical representations. Fig. 3 illustrates the difference between ratings of the crews’ habits of action in the ‘‘with SIAP’’ and the ‘‘without SIAP’’ situation, thus standardizing the level of the crews’ habits of action and highlighting the differences. Fig. 3 also includes the results of testing the effect of the habits of action on the process performance (number 3 in Fig. 2).
The correlation between an evaluation item of the crews’ habits of action and the process performance is marked with an asterisk when statistically significant. As can be seen, a correlation was observed for 23 items. As Fig. 3 indicates, the effect of the SIAP appeared to differ between crews (2c in Fig. 2). Crew A’s and crew B’s habits of action deteriorated with the SIAP. Only crew E performed clearly better with the SIAP than without it. The effect of the panel on crew F’s habits of action was mixed. Crew C’s and crew D’s practices remained quite similar in the ‘with’ and ‘without’ situations. The SIAP appeared to promote the crews’ (except crew B’s) way of decision-making by supporting the checking of automatic safety functions (item 2). The SIAP appeared to have the opposite effect on determining the causes of the disturbance (item 9), especially for crew A. If only the shift supervisors’ habits of action were analysed (Fig. 4), the SIAP appeared to support Supervisor D’s practice to some degree, but the effect on Supervisor E’s habits of action was mixed and on C’s almost non-existent. The effect of the SIAP on shift supervisor A’s, B’s and C’s practices was similar to the effect on the entire crews. There were six items in the supervisors’ habits of action (17, 19, 20, 34, 44, and 45) on which the sum effect of the SIAP was mostly negative, and only two on which the effect was slightly positive (31 and 32) (Fig. 4). The SIAP appeared to support the shift supervisors’ (especially D’s and F’s) management

Fig. 3. Difference in the crews’ (A–F) habits of action between ‘‘with’’ and ‘‘without’’ situations. The bars to the left of the y-axis indicate that the performance (adequacy of the working practice) was worse when working with the SIAP than without, and the bars to the right of the y-axis that it was better. The length of the bar indicates the scale of the difference. The correlation between the evaluation item of the crews’ habits of action and process performance is marked as: * p < 0.05; ** p < 0.01 and *** p < 0.001. (Bar chart with one panel per crew A–F; the rows are items 1–51 of Table 1, and the horizontal axis runs from W, worse with the SIAP, to B, better with the SIAP.)

practice, with the exception of taking into account the workload of the crew (B and F). The availability of the panel complicated the supervisors’ way of communicating (A and B) and reorientation (A and E).
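The standardization behind Figs. 3 and 4, in which each crew serves as its own baseline and only the difference between the ‘‘with’’ and ‘‘without’’ ratings is plotted, can be sketched as below. The paper does not state whether the plotted differences were computed from sums or means of the ratings; this sketch assumes means over the two runs in each condition. The function name `siap_difference` and all grades are hypothetical.

```python
from statistics import mean


def siap_difference(ratings):
    """Map (crew, item) to mean grade with the SIAP minus mean grade without it.

    A negative value corresponds to a bar to the left of the axis (worse with
    the SIAP), a positive value to a bar to the right (better with the SIAP).
    """
    diffs = {}
    for (crew, item), by_condition in ratings.items():
        diffs[(crew, item)] = mean(by_condition["with"]) - mean(
            by_condition["without"]
        )
    return diffs


# Hypothetical 0-2 grades: two runs with and two runs without the SIAP.
ratings = {
    ("A", 9): {"with": [0, 1], "without": [2, 1]},  # item 9: Causes
    ("E", 2): {"with": [2, 2], "without": [1, 1]},  # item 2: Check ASF
}
diffs = siap_difference(ratings)
for key, value in sorted(diffs.items()):
    print(key, value)
```

Because the difference is taken within each crew, a generally strong crew and a generally weak crew can show the same bar length, which is what allows the effect of the panel to be compared across crews whose baseline practices differ.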

Fig. 4. Difference of shift supervisors’ (A–F) habits of action between ‘‘with’’ and ‘‘without’’ situations. The bars to the left of the y-axis indicate that the performance (adequacy of the working practice) was worse when working with the SIAP than without and the bars to the right of the y-axis that it was better. The length of the bar indicates the scale of the difference. The correlation between the evaluation item of the crews’ habits of action and process performance is marked as: * p < 0.05; ** p < 0.01 and *** p < 0.001. (Bar chart with one panel per shift supervisor A–F; the rows are the shift supervisor items 15–21, 31–34, 44 and 45 of Table 1, and the horizontal axis runs from W, worse with the SIAP, to B, better with the SIAP.)

In summary, we found that the availability of the panel most improved the habits of action of Crew E, which was also rated as having the most appropriate habits of action. Moreover, the new system also improved the habits of action of the supervisor of Crew D, the crew whose habits of action as a whole were rated as the poorest. Deterioration in the habits of action was identified in the case of two crews (A and B) whose habits of action were rated as moderate. The above differences could not be explained by the crews’ overall performance level, nor by their conceptions of the panel. But it can be noted that the negative effects of the SIAP on the shift supervisors’ action appeared to become overt with regard to habit of action items that manifest the underlying evaluation dimension ‘‘situativeness’’ (items 17, 19, 20, 34) clearly more often than that of ‘‘coherence’’ (44). The positive effects concentrate on items that express coherence (items 31, 32).

To understand the effect of the SIAP on the habits of action we considered its role in the different scenarios in more detail (number 2d in Fig. 2). Even though we did not find any statistically significant scenario-dependent differences in the process performance or in the crews’ habits of action, nor a combined effect with the SIAP, some interesting observations could be made under closer scrutiny. As the graphic representation (Fig. 5) shows, the SIAP appeared to have a different effect on the crews’ habits of action in the different scenarios. The SIAP had a mostly positive effect on the habits of action in scenario 1, but a mostly negative effect in scenarios 2 and 4. In scenario 3 the effect of the SIAP was the most inconsistent. From our analysis of the scenarios (Table 2) we know that the estimated overall task complexity of scenario 1 was the lowest and the relevance of the SIAP was assumed to be high. The task complexity of scenario 4 was the highest. Scenario 3 had two parts, and the SIAP supported the second part better than the first. Thus, the SIAP appears to give better support in a less complex and more straightforward disturbance.


Analysis of the effect on the various items further explains the differences. The evaluation dimension underlying each item is indicated in parentheses. The clearest negative sum effects were found for the following items:

Updating shift members’ situation awareness (situativeness).
Informing about one’s own observations (situativeness).
Taking into account the crew’s work load (situativeness).
Maintaining overall control and coping with obstacles in operations (coherence).

Positive sum effects were found for the following items:

Informing about one’s own checking and operations (situativeness).
Checking automatic safety functions (coherence).
Taking into account the situational restrictions of operating methods (situativeness).
Participating in common interpretation of the situation (coherence).
Participating in common interpretation of the situation (coherence).

We may also observe that situativeness-laden items are more frequent (3/1) within the negative effects. The dominance of coherence-laden items within the positive effects is not clear (3/2).

5.3. What are the users’ experiences of the usefulness of the SIAP?

Our approach suggested that we should also consider the information system characteristics by subjective criteria. These were related to the task load and teamwork effects. Number 4 in Fig. 2 indicates the treatment of this aspect. According to the interview results, all but one (E) of the shift supervisors felt that the availability of the SIAP reduced stress. The one who had a different opinion emphasized that they did not feel stress during the simulation runs because the focus was on how to act with the new tool. (As already described, Crew E was the one with the best working practice. Subsequently, they could benefit from the SIAP.) Shift Supervisor E also thought that in a real disturbance the panel would reduce stress. This effect was due to the system’s ability to bring out the most important information available. The positive effects of the panel were typically associated with its stress-reducing effect. The SIAP was experienced as providing help in the operational structuring of the supervisor’s own and the crew’s actions, in the use of emergency procedures and in maintaining overall situation awareness. The system

Fig. 5. Differences in the habits of action between ‘‘with’’ and ‘‘without’’ situations in different scenarios. The bars to the left of the y-axis indicate that the appropriateness of the working practice was worse when working with the SIAP than without and the bars to the right of the y-axis that it was better. The length of the bar indicates the scale of the difference. The correlation between the evaluation item of the crews’ habits of action and process performance is marked as: * p < 0.05; ** p < 0.01 and *** p < 0.001.


was considered as an extra source of information and it had the ability to highlight novel events. In particular, the alarm panel was felt to support the shift supervisors in deciding when to start following procedures, how to prioritize tasks and when to communicate outside the control room. Many of these effects were also seen as a positive effect of the SIAP on the habits of action in our analysis of the operators’ practices (see above). It should, however, be noted that we did not find a systematic correlation between the time pressure the shift supervisors or reactor operators felt in the different scenarios and the expert-rated features of complexity or speed of the disturbance development, or the availability of the SIAP. The crews also recognized negative aspects. It was stated that the SIAP might draw the shift supervisor’s attention away from the ongoing singular process events and from the crew’s performance, particularly when using the alarm panel with access criteria for emergency operating procedures. This takes time, and if several criteria are alarmed it is difficult to respond to all of them. Furthermore, the placement of the panel drew the supervisor away from the crew and he had to ask more questions, which might disturb the crew. It was also brought up that the use of the SIAP might require a more structured working practice, which includes, e.g., regular meetings with the crew. Some supervisors (B and E) stated that they were not yet used to using the SIAP system because they had long experience of working without it. Shift Supervisor E noted that he used more procedures when working with the SIAP. For Supervisor F, however, the situation was the opposite. He had no experience of working without the panel and experienced difficulties without it. These negative effects were also visible in the previous performance analysis.

6. Discussion

The main research question, concerning the ability of the safety information and alarm panel to support process control in a disturbance situation, was studied by analysing the process performance, the operators’ practices and their conceptions of the tool. The results demonstrate the complexity of adopting new tools in professional practice and also the difficulties in identifying the usability of the new tool. In the following we first discuss the results concerning the usability of the SIAP. Then the quality of the results is evaluated. We conclude the discussion by reflecting on our validation approach in relation to some significant general issues in the validation of complex systems.

6.1. Summary and conclusions concerning the usability of the safety information and alarm panel

The main aim in an integrated validation of new artefacts is to provide evidence that the designed system adequately supports personnel and that the human–machine integrated system remains within an acceptable performance envelope (O’Hara, 1999). Our results were positive with respect to this demand.

We determined that process performance was within an acceptable level in all conditions and that the performance level remained unchanged when the operators changed over to the new system. There were no differences between the crews when the process performance, i.e. outcome was used as a criterion. Drawing on MacIntyre’s distinction between external and internal good of practice cited earlier, the process outcome provides an external criterion for good operator practice (MacIntyre, 1984). Our results indicated, however, that when evaluating the dispositional differences emerging through the habits of action analysis, we could observe differences among the control room crews in their ways of making decisions and ways of cooperating. We also found that at least some of the items used in the evaluation of habits of action correlated with process performance, i.e. outcome. We maintain that by using the habit of action concept we created a possibility to evaluate the internal good of operator practice. Such features relate to ways of acting and to the underlying values that people who belong to the professional community regard as signs of good practice (MacIntyre, 1984). Furthermore, we demonstrated that even expert-level operators differed with regard to their working practices. The variations in the habits of action manifest the personal differences in the sense attributed to the possibilities and constraints available in the situation. This is a signiﬁcant result that conveys the message that expertise is not a one-dimensional matter that increases with experience, but that there are qualitative differences in the content of expertise. We acquired corresponding results from our analysis of expert anaesthetists’ habits of action (Klemola and Norros, 1997, 2001). 
With regard to the validation concept, this result opens up a possibility to make ﬁne-grade distinctions between usages, even when the process outcome criteria (external good) are not sensitive enough, or the performance of the operators is close to its apex. These problems are frequently addressed in validation studies (O’Hara, 1999; Miberg-Skjerve and Skraaning, 2003). Distinctions between the habits of action make practical sense only if we can assume that they have signiﬁcance with regard to the outcome. How are we able to demonstrate this link if the process performance outcome does not indicate it? We argue that this connection is a logical one, because a favourable habit of action is one that is core-task oriented and observes appropriately the intrinsic constraints of the domain in speciﬁc situations. Due to the differences in actual situations and the complexity of the tasks, the connection may not necessarily be evident with regard to every single course of action. However, because behaviour also expresses habits, i.e. particular more-or-less appropriate learned propensities to act, it may potentially lead to certain results (Peirce, 1958, 1998b). It is to be expected that in a complex process the safety of which is ensured by defence-in-depth, and in which, therefore, several coincidences are needed to effect a deviation from safety boundaries, an actual signiﬁcant negative result seldom occurs (Schulman, 1993; Norros, 2004). The usefulness of distinguishing between habitual features of action and the process outcome in a validation context became obvious in our further analysis of the results. Hence, when we scrutinized the effect of the use of the SIAP system by ﬁrst standardizing the crews’ appropriateness of habits of action, we observed that the effect of the SIAP was different in different crews. It was evident that the crew

that was rated the highest also beneﬁted most from the new system, and that at least the shift supervisor of the crew rated the worst also gained from the SIAP. In the case of most of the crews, however, the effect of using the new system was negative. This raises an important question about the roles of ‘‘personalizing’’ tools according to one’s habit of action and the crews’ practices, based on focused training when implementing new tools. If we focus on the content of the habits of action, not on a particular crew’s habits of action, we may ask in what way the SIAP appeared to affect the content of the habits of action. According to our method, the items used to analyse the habits of action manifest underlying adaptation-relevant features of coherence and situativeness of action. Both these features are necessary qualiﬁcations in an appropriate habit of action. Our results suggest that a negative change in the habits of action is connected with features that express situativeness of action. A favourable effect in the habits of action correlates to coherence-related items. In the ecological approach to action which is the characterization that we use of the Core-Task Analysis methodology behind our validation concept (Norros, 2004), we must acknowledge the signiﬁcance of the variability of the environment for the overall performance of the distributed cognitive system. Therefore, we carefully analysed and modelled the characteristics of the disturbance situations. Our results suggest that the effects of the SIAP are different in different situations. Furthermore, the tendency is that in less complex and more transparent situations the availability of the SIAP is beneﬁcial with regard to the operator practice, but in more complex and less transparent situations the effect is not obvious. 
Moreover, there were indications that situation-speciﬁc negative transformations in the habits of action were linked to situativeness-related items, whereas coherence-related items indicated positive transformations. Our results were not very clear, but they may be supported by theoretical inference. An information and control system provides a model of the process performance. It is made to be as adequate a representation of the process as possible. If the phenomena described in the scenarios are anticipated in the design basis of the system, these phenomena become better highlighted and perceivable for the control room crew. The alarms direct attention appropriately, and there is a clear connection to the procedures to be followed. In such a situation, the coherence-related features of action are supported by the SIAP. The subjective reports of the crews also veriﬁed this interpretation. However, in a more complex disturbance the situational features were anticipated less precisely by the SIAP. In such a case, the operator’s epistemic attitude that acknowledges the need for perceiving situatively speciﬁc and novel features of the environment gains in relevance (von Wright, 1971; Norros, 1995). Such an approach would lead to constructing a particular interpretation of the situation (Klemola and Norros, 1997). The operator should be able to realise the uniqueness of the present situation and be able to determine if the new tool ‘‘ﬁts’’ in the situation or not. When necessary, he should be able to give up the tool and use alternate sources of information and other control strategies. It may be assumed that this problem becomes more important when the development of tools is rapid and people do not

have time to create the interpretative actions that bridge this gap. The informing power of ICT-technology tools may also amplify the problem, because interpretative demands may not appear evident even though they are in fact even greater than in the case of more traditional tools (Zuboff, 1988). The discrepancy between the tool as a generic model of the world and the demands to focus on particular situations was emphasised as a generic problem that needs further attention both in the design of artefacts and in training (Beguin et al., 1999). Reﬂecting on the nature of the tools would promote a more realistic attitude towards them and make the role of expertise and judgement evident. The SIAP was experienced to reduce stress in a disturbance situation. We may thus formulate the hypothesis that the particular advantage of the information aid was its ability to promote the shift supervisors’ management actions and sense of control. This is expressed as reduction of stress in the situation, which indirectly promotes handling of the disturbance. This is the ability that could have huge importance in real disturbance situations. According to the expert identity model (Norros and Nuutinen, 2002; Nuutinen et al., 2003; Nuutinen, 2003), the new tool thus supports the situational sense-of-control component of the expert identity and, therefore, also energises activity and promotes the use of one’s competence. The various impacts of the stress could be among the reasons for these results, but this requires further examination and especially developing the criteria of the third interaction in ‘‘way of coping with problem situations’’ further. It should be possible to promote appropriate operator practices by improving information presentation by the design. Therefore, it would be necessary to develop a theory to understand better the affordances of the information systems and to create ways to improve them in concrete contexts. 
In the present study, we were not able to analyse the affordances of the SIAP sufficiently; for example, we did not analyse in detail the information content of the two parts of the SIAP system, nor their representational features. In our ongoing studies we focus more on comprehending the connection between specific features of the information tools and the practices of the operators (Savioja and Norros, 2004). In the present study, our results made visible the main strength (ability to reduce experienced stress) and possible disadvantages of the SIAP and their reasons. We presented these results to the plants’ crews and thus promoted the operators’ reflection on the new tool. In addition, the results could be taken into account when planning further training sessions.

6.2. Validity and reliability of the results

Our study is an example of a “quasi-experiment” that was conducted in (simulated) field conditions. Because comparisons between groups that are not randomly assigned are made in such experiments, the investigator must carefully examine the results for alternative explanations and address them either statistically or by logical argument (Cook and Campbell, 1979). In the following, we treat the validity of our results with respect to four general forms of validity that have been proposed as prerequisites for appropriate causal inference (O’Hara, 1999).

With respect to system representation validity, i.e. to the correspondence between the entire test situation and the conditions in real-life use of the system, our validation experiments may be rated as high. Hence, the tests were conducted in a full-scale high-ﬁdelity training simulator and the system was tested in its complete form. We used four severe process disturbance scenarios, the operational characteristics of which were deﬁned before selecting the situations, and carefully modelled to acquire a description of the constraints and possibilities that these situations provided. We used complete operator crews who were well acquainted with the system to be validated. The reliability of the evaluation method was demonstrated by the fact that the ratings of the two evaluators differed from each other by less than 5% among the items on the ﬁrst run, and after redeﬁning the criterion, agreement was successfully reached in every case. The used procedure aimed to carefully deﬁne the situational criteria that would reduce the effects of personal preference preventing ‘‘group thinking’’. The test design validity refers to the actual conducting of the validation test. The validation team was multi-disciplinary and consisted of different types of expertise, i.e. in research, instruction and NPP operations. The tests were planned carefully and the crews participating in the experiments were randomly assigned to the test conditions, and permutations were used to avoid sequence effects. The validation group worked according to a deﬁned procedure. This procedure included a thorough brieﬁng before the experimental session, and all the crews had had training in the use of the tested system. Also, the test design validity may be considered high. In validation tests it is also necessary to fulﬁl the requirements of statistical conclusion validity. The central question in validation is to prove that the integrated system performance is acceptable. 
We used relevant standard statistical methods and tested the requirements for the use of the selected statistical tests. In our case, as very often in validating integrated systems that require a wide range of conditions, it was difﬁcult to acquire sufﬁcient data for the statistics. In this situation, a combination of quantitative and qualitative methods for reasoning is recommended (O’Hara, 1999). Our validation method included a variety of measures that provided the possibility to analyse the convergence of results as a further means for inference. We see, furthermore, that the qualitative data and reasoning not only complemented our statistical analyses, but constituted a signiﬁcant part of the validation of the complex integrated system. As we argued, it is not sufﬁcient to consider system performance only from the point of view of the performance outcome. Instead, analysis of the personnel practices is needed. We found that the habitual features of behaviour were especially sensitive to changes in the tools. The evaluation of these changes required requesting the persons’ own accounts of their behaviour, and the data had to be evaluated from a perspective of understanding the explanations (von Wright, 1998). The ﬁnal aspect of validation that should be considered in evaluating the quality of the results is the performance representation validity. This answers the question of how well the performance measures represent the performance characteristics relevant to safety and efﬁciency. In the reported study we not only tested a particular system but also developed a new kind of performance measure. Therefore, the issue of performance representation validity is especially crucial and is addressed in the following section.
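The inter-rater check reported earlier in this section (the ratings of the two evaluators differing by less than 5% among the items on the first run) can be illustrated with a minimal sketch. It assumes one reading of that figure, namely the share of items on which the two evaluators gave different ratings. The function names, the 3-point scale and the ratings are hypothetical and are not taken from the study. Cohen's kappa is added only as a commonly used chance-corrected complement; the study does not report it.

```python
from collections import Counter


def disagreement_rate(ratings_a, ratings_b):
    """Share of items on which the two evaluators gave different ratings."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both evaluators must rate the same non-empty item list")
    differing = sum(1 for a, b in zip(ratings_a, ratings_b) if a != b)
    return differing / len(ratings_a)


def cohens_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement (Cohen's kappa); not reported in the study."""
    n = len(ratings_a)
    observed = 1.0 - disagreement_rate(ratings_a, ratings_b)
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    # Agreement expected if both evaluators rated independently
    # at their own base rates for each category.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both evaluators used one identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical first-run ratings of 40 items on a 3-point scale
# (illustrative data only, not the study's ratings).
evaluator_1 = [1, 2, 3, 2] * 10
evaluator_2 = list(evaluator_1)
evaluator_2[5] = 3  # a single item rated differently

rate = disagreement_rate(evaluator_1, evaluator_2)
print(f"disagreement: {rate:.1%}, below the 5% level: {rate < 0.05}")
print(f"kappa: {cohens_kappa(evaluator_1, evaluator_2):.3f}")
```

The raw disagreement rate is simple to interpret but does not correct for agreement that would occur by chance, which is why a chance-corrected statistic is often reported alongside it when the rating scale has few categories.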

6.3. Conclusions concerning the performance-based validation method

Taking the empirical results of the study as the starting point for discussing performance representation validity, we may first state that the strong correlation between the way of decision making and the process-performance items supports the validity of the method. Moreover, the fact that we found differences between the groups, and that some items of operator practice explained the differences in the process performance, demonstrates that the method also has explanatory power. Some items of the habit of action measure did not have differentiating power, and some items could not be used at all. This does not necessarily mean that these items are without value; rather, they might illustrate phenomena that were not important in the studied situations. However, it was quite clear that an additional approach is needed to improve the power of the third interaction, i.e. interaction with oneself, so as to better capture the stress-related issues in the action. Moreover, the data set gathered in the study was too small to allow very comprehensive inferences about a method still under development. Further studies are needed to analyse the power of the evaluation method in validation studies.

The reported study allows further theoretical and methodological conclusions regarding the core issues of integrated system validation (O’Hara, 1999; Norros and Savioja, 2004). We maintain that our study contributed especially to the development of the requirement-referenced and the normative-referenced criteria. Criteria of the former type exploit engineering knowledge and are typically used in the analysis of system performance. In this study we developed further our approach to modelling environmental constraints and possibilities for human action.
Our way of modelling shares the ecological Gibsonian perspective with several current approaches in the area of human factors in complex systems (Hammond, 1993; Flach et al., 1995; Zambok, 1997). It is also closely related to the idea of formative modelling proposed by Rasmussen and Vicente (Vicente, 1999; Rasmussen and Svedung, 2000). We analysed the result-critical boundaries of the domain and portrayed them in the situational constraints and possibilities for action. Corresponding modelling approaches have been used by Naikar and colleagues for designing team tasks (Naikar et al., 2002). The functional domain and situation models do not only provide a basis for determining the process performance outcome criteria, but they also serve as a basis for understanding the complexity, dynamics and uncertainty factors of the test scenarios. In this connection, it should be necessary to cover not only safety, but also productivity and well-being objectives. Hence, we may have more information of the characteristics of the situations and a better basis for explaining situation-based variance in the performance of integrated systems. Without such tools, the requirements for using a representative sample of situations in testing the system (O’Hara, 1999) remain empty. The differences in the test situations seem to be analysed less frequently, which means that the variance of the results remains unexplained (Bove and Andersen, 1999). In the validation of integrated systems, normative referenced criteria are mainly used to evaluate the human operators’ practices as indications of the validity of the

entire system. A normative evaluation requires scientific analysis of human conduct and both empirical and theoretical grounds for arguing what is good performance. The evaluation method used here was based on several studies that aimed at identifying the critical psychological demands of NPP operator work. A wide range of different disturbance situations had been analysed and theoretical work accomplished to discover the most important factors of the operators’ process control work (Hukki and Norros, 1993, 1998; Norros, 1995, 1997; Norros and Hukki, 1995; Norros and Nuutinen, 2002). A comprehensive account of the theoretical underpinnings of the methodology known as Core-Task Analysis has recently been made available (Norros, 2004).

As noted in the introduction, validation studies often restrict themselves to measurement of the effect of the tool on some external system performance criterion. We took a step further and attempted to analyse changes in the dynamic structure of action when adopting new tools. We proposed the use of a combination of a cultural–historical concept of activity (Engeström, 1987) that would provide a societal basis for explaining the transformations in actions, and the distributed cognition paradigm (Hutchins, 1995). The latter approach gains support in a detailed analysis of the actors’ situated tool-using interactions with the environment.

At the beginning of our study we conceived our task as testing the effects of the system on the process outcome or even on the operators’ practices. However, during the study, we learned that the task is to understand the trajectory of the change in the entire system. The new and more extensive target of validation is to understand the change of the comprehensive human–technology system (Savioja and Norros, 2004). The criteria used in projecting new forms of distributed cognitive systems should not only describe the present but also be capable of predicting the future forms of such systems.
It is necessary to extend the analysis of the operators’ practices. As before, the evaluation of the process outcome is necessary, because we are all interested in maintaining the system within the result-critical boundaries of quality, efﬁciency or safety. However, we also need criteria that reveal the internal good of practices. These are qualiﬁcations and values of practice that are deﬁnable only by those acting within the system. These are the features that contribute to the adaptiveness and development of the system. The features reveal the core-task orientedness and situated adaptiveness of the practices. The new perspective also emphasizes the need for considering the distributed cognitive system in a societal perspective, for which the cultural–historical theory of activity provides the necessary means. The societal context enables analysis of changes in the core tasks.

7. Concluding remarks

In this paper, we studied the Safety Information and Alarm Panel that represents the modernisation of generation II control rooms. In the near future, generation III control room designs will introduce digital technology into the instrumentation and automation, including the control room. The new designs provide attractive

improvements that are expected to enhance the operators’ process management and facilitate integration between operations and other functions, especially maintenance (Östlund, 2003). The most important consequences of digitalisation for human–technology interaction are the implementation of compact workstations and overview displays, including the possibility to control the process from the displays, i.e. “soft control”. Integration of large amounts of data and improvements in the visual representation of the process information, improved alarm management, computerized procedures and operator support systems, etc. are further new possibilities (O’Hara, 2003; Pirus, 2003b; Norros and Savioja, 2004). In the longer perspective, generation III+ and IV power plants will apply new solutions to process technology, the consequences of which for the operation of the plants must be thoroughly understood. All these future possibilities in technology challenge human-factors experts to find more proactive and integrated means to ensure that the innovative systems can be safely operated.

Acknowledgements

We would like to thank Jan-Erik Holmberg and Kristiina Hukki for participating in the planning and conducting of the empirical study. The writing of this article was supported by the Finnish National Nuclear Safety Research Programme, SAFIR.

References

Bannon, L., 2000. Situating workplace studies within the human–computer interaction field. In: Luff, P., Hindmarsh, J., Heath, Ch. (Eds.), Workplace Studies. Recovering Work Practice and Informing System Design. Cambridge University Press, Cambridge, UK, pp. 230–241.
Bannon, L., Kaptelinin, V., 2000. From human–computer interaction to computer-mediated activity. In: Stephanidis, C. (Ed.), User Interfaces for All: Concepts, Methods, and Tools. Lawrence Erlbaum, Mahwah, NJ, pp. 183–202.
Beguin, P., Kazmierczak, M., Cottura, R., Leininger, J., Vicot, P., 1999. Design of an alarm and risk management system in chemistry. Some lessons learned from interdisciplinary research. Presented at Human Error, Safety, and Systems Development, Liege, Belgium.
Bove, T., Andersen, H.B., 1999. The effect of advisory system on pilots go/no-go decision during take off. Presented at Human Error, Safety, and Systems Development, Liege, Belgium.
Carroll, J.M., 1997. Human–computer interaction: psychology as a science of design. International Journal of Human–Computer Studies 46, 501–522.
Cook, T., Campbell, D., 1979. Quasi-experimentation: Design and Analysis Issues for Field Settings. Houghton Mifflin, Boston, MA.
Corcoran, W.R., Porter, N.J., Church, J.F., Cross, M.T., Guinn, W.M., 1981. The critical safety functions and plant operation. Nuclear Technology 55, 690–712.
Decortis, F., Noirfalise, S., Saudelli, B., 2000. Activity theory, cognitive ergonomics and distributed cognition: three views of a transport company. International Journal of Human–Computer Studies 53, 5–33.
Engeström, Y., 1987. Expansive Learning. Orienta, Jyväskylä.
Engeström, Y., 1999. Activity theory and individual and social transformation. In: Engeström, Y., Miettinen, R., Punamäki, R.-L. (Eds.), Perspectives in Activity Theory. Cambridge University Press, Cambridge, pp. 19–38.

Flach, J., Hancock, P., Caird, J., Vicente, K.J., 1995. Global Perspectives on the Ecology of Human–Machine Systems. Lawrence Erlbaum, Hillsdale, NJ.
Halverson, C.A., 2002. Activity theory and distributed cognition: or what does CSCW need to do with theories. Computer Supported Cooperative Work 11, 243–267.
Hammond, K.R., 1993. Naturalistic decision making from a Brunswikian viewpoint: its past, present, future. In: Klein, G.A., Orasanu, J., Calderwood, R., Zsambok, C. (Eds.), Decision Making in Action: Models and Methods. Ablex, Norwood, NJ.
Hoc, J.-M., 2001. Towards a cognitive approach to human–machine cooperation in dynamic situations. International Journal of Human–Computer Studies 54, 509–540.
Hoc, J.-M., Amalberti, R., Boreham, N., 1995. Human operator expertise in diagnosis, decision making and time management. In: Hoc, J.-M., Cacciabue, P.C., Hollnagel, E. (Eds.), Expertise and Technology. Cognition & Human–Computer Cooperation. Lawrence Erlbaum, Hillsdale, NJ, pp. 19–42.
Hollan, J., Hutchins, E., Kirsh, D., 2000. Distributed cognition: toward a new foundation for human–computer interaction research. ACM Transactions on Computer–Human Interaction 7 (2), 174–196.
Hollnagel, E. (Ed.), 2003. Handbook of Cognitive Task Design. Lawrence Erlbaum, Mahwah, NJ.
Holmberg, J., Hukki, K., Norros, L., Pulkkinen, U., Pyy, P., 1999. An integrated approach to human reliability analysis. Decision analytic dynamic reliability model. Reliability Engineering and System Safety 65, 239–250.
Hukki, K., Norros, L., 1993. Diagnostic judgement in the control of disturbance situations in Nuclear Power Plant Operation. Ergonomics 36 (11), 1317–1328.
Hukki, K., Norros, L., 1998. Subject-centred and systemic conceptualisation as a tool of simulator training. Le Travail Humain, 313–331.
Hutchins, E., 1995. Cognition in the Wild. MIT Press, Cambridge.
IEC-964, 1989. Design for control rooms of nuclear power plants. IEC.
Ilyenkov, E., 1977. Dialectical Logic: Essays on its History and Theory. Progress, Moscow.
Ilyenkov, E., 1984. Learning to Think. Tutkijaliitto, Helsinki (in Finnish).
Jeanton, G., 2003. Electricite de France organisation and process concerning modifications of nuclear plants. Presented at Modifications at Nuclear Power Plants, Paris, pp. 1–7.
Kautto, A., 1984. Information presentation in power plant control rooms. Report RR 320, Technical Research Centre of Finland, Espoo.
Kirwan, B., 2003. Design process and human–system interfaces. Presented at International Summer School on Design and Evaluation of Human–System Interfaces, Halden, Norway, pp. 1–26.
Klemola, U.-M., Norros, L., 1997. Analysis of the clinical behaviour of anaesthetists: recognition of uncertainty as basis for practice. Medical Education 31, 449–456.
Klemola, U.-M., Norros, L., 2001. Practice-based criteria for assessing anaesthetists’ habits of action. Outline for a reflexive turn in practice. Medical Education 35, 455–464.
Kontogiannis, T., 1996. Stress and operator decision making in coping with emergencies. International Journal of Human–Computer Studies 45, 75–104.
Lee, J., Moray, N., 1994. Trust, self confidence, and operators’ adaptation to automation. International Journal of Human–Computer Studies 40 (1), 153–184.
Leont’ev, A.N., 1978. Activity, Consciousness, and Personality. Prentice-Hall, Englewood Cliffs, NJ.
Luff, P., Heath, C., 2000. The collaborative production of computer commands in command control. International Journal of Human–Computer Studies 52, 669–699.
MacIntyre, A., 1984. After Virtue: A Study in Moral Theory. University of Notre Dame Press, Notre Dame, IN.
Miberg-Skjerve, A.B., Skraaning, G., 2003. A classification of validation criteria for new operational design concepts in nuclear process control. Presented at NEA/CSNI Workshop on Modifications at Nuclear Power Plants, Paris, France.
Mumaw, R.J., Roth, E.M., Vicente, K.J., Burns, C.M., 2000. There is more to monitoring a nuclear power plant than meets the eye. Human Factors 42 (1), 36–55.
Mustonen, S., 1992. SURVO: An Integrated Environment for Statistical Computing and Related Areas. Survo Systems, Helsinki.

Naikar, N., Pearce, B., Drumm, D., Sanderson, P.M., 2002. Designing teams for first-of-a-kind, complex systems using the initial phases of cognitive work analysis: case study. Human Factors 45 (2), 202–217.
Naser, J., Hanes, L., O’Hara, J., Fink, R., Hill, D., Morris, G., 2003. Guidelines for control room modernisation as part of instrument and control modernisation programs. Presented at Modifications at Nuclear Power Plants, Paris, pp. 1–10.
Nielsen, J., 1993. Usability Engineering. Academic Press, Boston, MA.
Norros, L., 1995. An orientation-based approach to expertise. In: Hoc, J.-M., Cacciabue, P.C., Hollnagel, E. (Eds.), Cognition and Human–Computer Co-operation. Lawrence Erlbaum, Hillsdale, NJ, pp. 137–160.
Norros, L., 1997. Human factors in NPP operations. RETU The Finnish research programme on reactor safety. Interim Report 1995 May 1997, Report VTT Research Notes 1856, Espoo.
Norros, L., 2004. Acting under Uncertainty. The Core-Task Analysis in Ecological Study of Work. VTT, Espoo.
Norros, L., Hukki, K., 1995. Contextual analysis of the operators’ on-line interpretations of process dynamics. Presented at Proceedings of Fifth European Conference on Cognitive Science Approaches to Process Control, Espoo, pp. 182–195.
Norros, L., Hukki, L., 1997. Analysis of control room operators’ ways of acting in complex process control situations. Presented at the 13th Triennial Congress of the International Ergonomic Association, Tampere, Finland, pp. 61–63.
Norros, L., Klemola, U.-M., 2005. Naturalistic analysis of anaesthetists’ clinical practice. In: Montgomery, H., Lipshitz, R., Brehmer, B. (Eds.), How Professionals Make Decisions. Lawrence Erlbaum, Mahwah, NJ.
Norros, L., Nuutinen, M., 2002. The concept of the core task and the analysis of working practices. In: Boreham, N., Samurcay, R., Fischer, M. (Eds.), Work Process Knowledge. Routledge, London, pp. 25–39.
Norros, L., Sammatti, P., 1986. Nuclear power plant operator errors during simulator training. Research Report 446, Technical Research Centre of Finland, Espoo.
Norros, L., Savioja, P., 2004. Usability evaluation of complex systems. A literature review. Report STUK-YTO-TR 204, Radiation and Nuclear Safety Authority, Helsinki.
Nuutinen, M., 2003. The change of personnel generation as a challenge of safety critical work: Operator trainee as an apprentice or an inquiring learner? (In Finnish, English abstract). Työ ja ihminen 17 (2), 173–189.
Nuutinen, M., Norros, L., 2001. Co-operation on bridge in piloting situations. Analysis of 13 accidents on Finnish Fairways. In: Onken, R. (Ed.), The Cognitive Work Process: Automation and Interaction. Proceedings of the Eighth Conference on Cognitive Science Approaches to Process Control. European Association of Cognitive Ergonomics, Munich, pp. 3–14.
Nuutinen, M., Reiman, T., Oedewald, P., 2003. Osaamisen hallinta ydinvoimalaitoksessa operaattoreiden sukupolvenvaihdostilanteessa (Management of operators’ competence and change of generation at NPP). VTT Publications No. 496. VTT, Espoo.
Oedewald, P., Reiman, T., 2003. Core task modeling in cultural assessment: a case study in nuclear power plant maintenance. Cognition, Technology & Work 5, 283–293.
O’Hara, J., 2003. Overview of different types of control rooms and their human system interface solutions. Presented at International Summer School on Design and Evaluation of Human System Interfaces, Halden, Norway, 3/1–28.
O’Hara, J., Higgins, J., Persensky, J., Lewis, P., Bongarra, J., 2002. Human factors engineering program review model. Report NUREG-0711. United States Nuclear Regulatory Commission, Washington, DC.
O’Hara, J.M., 1999. A quasi-experimental model of complex human–machine system validation. Cognition, Technology & Work 1 (1), 37–46.
Östlund, A.M., 2003. Säkerheten i fokus i det nya kontrollrummet. In Nucleus, ed. pp. 26–29.
Papin, B., 2002. Integration of human factors requirements in the design of future plants. Presented at Enlarged Halden Programme Group Meeting, Storefjell, C3/1/–10.


Peirce, C.S., 1958. Letters to Lady Welby. In: Wiener, P. (Ed.), Selected Writings of C.S. Peirce. Dover Publications, New York, pp. 380–432.
Peirce, C.S., 1998a. The Harvard lectures on pragmatism. In: The Peirce Edition Project (Ed.), The Essential Peirce. Selected Philosophical Writings. Indiana University Press, Bloomington and Indianapolis, pp. 133–241.
Peirce, C.S., 1998b. The Peirce Edition Project. Introduction. In: The Peirce Edition Project (Ed.), The Essential Peirce. Selected Philosophical Writings. Indiana University Press, Bloomington and Indianapolis, pp. XVII–XXXVIII.
Pirus, D., 2002. Computerized operation using formal plant functional breakdown. Presented at Enlarged Halden Programme Group Meeting, Storefjell, C3/8/1–8.
Pirus, D., 2003a. Functional analysis: Operating functions. Report, Electricité de France, Septen, France.
Pirus, D., 2003b. Human–system interfaces. Types and principles. Presented at International Summer School on Design and Evaluation of Human–System Interfaces, Halden, 9/1–69.
Quentin, L., Niger, D., 2003. Taking into account of socio-organisational and human aspects into upgrade packages. Presented at Modifications at Nuclear Power Plants, Paris, pp. 1–10.
Rasmussen, J., 1986. Information Processing and Human–Machine Interaction. North-Holland, Amsterdam.
Rasmussen, J., Svedung, I., 2000. Proactive risk management in a dynamic society. Swedish Rescue Services, Karlstad.
Reiman, T., Norros, L., 2002. Regulatory culture: balancing the different demands on the regulatory practice in nuclear industry. In: Hale, A.R., Hopkins, A., Kirwan, B. (Eds.), Changing Regulation: Controlling Hazards in Society. Elsevier, Amsterdam.
Rouse, W.B., 1991. Design for Success: A human-centered approach to designing successful products and systems. Wiley, New York.
Savioja, P., Norros, L., 2004. Artefact performance evaluation. A preliminary framework. Report BTUO62-041223, Technical Research Centre of Finland, Espoo.
Schulman, P.R., 1993. The analysis of high reliability organisations: a comparative framework. In: Roberts, K.H. (Ed.), New Challenges to Understanding Organisations. Macmillan, New York, pp. 33–53.
Suchman, L.A., 1987. Plans and Situated Actions: The Problem of Human–Machine Communication. Cambridge University Press, New York.
Theureau, J., 1996. Course of action analysis and ergonomic design. Presented at Work and Learning in Transition, San Diego, pp. 1–29.
Theureau, J., Filippi, G., 2000. Analysing cooperative work in an urban traffic control room for the design of a coordination support system. In: Luff, P., Hindmarsh, J., Heath, Ch. (Eds.), Work Place Studies. Recovering Work Practice and Informing System Design. Cambridge University Press, Cambridge, UK, pp. 68–91.
Vicente, K.J., 1999. Cognitive Work Analysis. Toward a Safe, Productive, and Healthy Computer-Based Work. Lawrence Erlbaum, Mahwah, NJ.
Vicente, K.J., Roth, E., Mumaw, R.J., 2001. How do operators monitor a complex, dynamic work domain? The impact of control room technology. International Journal of Human–Computer Studies 54, 831–856.
Von Wright, G.H., 1971. Explanation and Understanding. Routledge and Kegan Paul, London.
Von Wright, G.H., 1998. In the Shadow of Descartes. Essays in the Philosophy of Mind. Kluwer, Dordrecht.
Vygotsky, L.S., 1978. Mind in Society. The Development of Higher Psychological Processes. Harvard University Press, Cambridge, MA.
Zsambok, C., 1997. Naturalistic decision making: where are we now? In: Zsambok, C., Klein, G.A. (Eds.), Naturalistic Decision Making. Lawrence Erlbaum, Hillsdale, NJ.
Zuboff, S., 1988. In the Age of the Smart Machine. The Future of Work and Power. Basic Books, New York.
