Alarm mistrust in automobiles: how collision alarm reliability affects driving

ARTICLE IN PRESS

Applied Ergonomics 34 (2003) 499–509

Alarm mistrust in automobiles: how collision alarm reliability affects driving

James P. Bliss a,*, Sarah A. Acton b

a Psychology Department (MGB 244B), Old Dominion University, Norfolk, VA 23529, USA
b Foster-Miller, Inc., Boston, MA, USA

Received 19 November 2002; received in revised form 2 June 2003; accepted 11 July 2003

Abstract

As roadways become more congested, there is greater potential for automobile accidents and incidents. To improve roadway safety, automobile manufacturers are now designing and incorporating collision avoidance warning systems; yet, there has been little investigation of how the reliability of alarm signals might impact driver performance. We measured driving and alarm reaction performance following alarms of various reliability levels. In Experiment One, 70 participants operated a driving simulator while being presented with console-emitted collision alarms that were 50%, 75%, or 100% reliable. In Experiment Two, the same participants were presented with spatially generated collision alarms of the same reliability levels. The results were similar in both experiments: alarm and automobile swerving reactions were significantly better when alarms were more reliable; however, drivers still failed to avoid collisions following reliable alarms. These results emphasize that alarm designers should maximize alarm reliability while minimizing alarm invasiveness.

© 2003 Elsevier Ltd. All rights reserved.

Keywords: Alarm; Driving; Collision

*Corresponding author. Fax: +1-757-683-5087. E-mail address: [email protected] (J.P. Bliss).
0003-6870/$ - see front matter © 2003 Elsevier Ltd. All rights reserved. doi:10.1016/j.apergo.2003.07.003

1. Introduction

Recently, the federal government has designed and tested Intelligent Transportation System (ITS) technologies in an effort to decrease collision rates by creating “intelligent” roadways and enabling safer automobile interactions (National Highway Traffic Safety Administration, 1997). As part of that effort, many automobile manufacturers have studied and included collision avoidance warning systems (CAWS) in new car models (Tilin, 2002; NHTSA, 2002). Properly designed and implemented, CAWS would notify drivers about potential dangers from roadway departures and other automobiles, particularly in rear-end collisions (Hirst and Graham, 1997; Chen et al., 1997; Araki et al., 1996). As noted by Parasuraman et al. (1997), rear-end collisions constitute a sizeable proportion of automobile accidents.

Aviation was the first transportation sector to implement CAWS. Because of increasing numbers of near-collisions, the Traffic Collision Avoidance System (TCAS) was implemented to notify pilots of impending collisions with other aircraft. Although TCAS improved overall safety, pilots voiced concerns about false alarms and improper directives (Shapiro, 1994). Since the system’s initial development, the algorithms controlling aviation TCAS alarms have improved. However, the false alarm issue has not been resolved completely (Bliss et al., 1999).

Researchers have studied the technological challenges of implementing automotive CAWS. Currently, several different versions of automotive CAWS are available (see Mazzae et al., 1995; Jansson et al., 2002). The technology incorporated in CAWS ranges from head-up displays that simply notify the operator about impending collisions to intelligent computers that could take over vehicle operations. One type of CAWS works on the basis of a turn signal onset rule, where a warning signal occurs if the driver activates the turn signal when there is another vehicle blocking the intended path. Such a system may be an effective deterrent for collisions, but only if drivers reliably use their turn signals (Tijerina and Hetrick, 1997). Other systems employ radar technology to initiate warnings when the driver’s automobile is close to
another vehicle, regardless of turn signal use (Tijerina and Hetrick, 1997). Crash avoidance potential is high in these situations, but the probability of nuisance alarms (those signals that are invalid for the current task operation context) is also high (Xiao and Seagull, 1999). Alarm signals may frequently occur in “normal” contexts such as parking lots, or when cars are following a close, parallel course.

In addition to technical challenges, lack of warning compliance is a concern. Researchers have proposed many ways to improve operator detection of alarm signals, such as using auditory stimuli (Lilliboe, 1963; Bronkhorst et al., 1996) and manipulating signal parameters to heighten perceived urgency. Haas and Edworthy (1996) successfully increased perceived signal urgency by raising fundamental frequency, raising sound pressure level, and reducing interpulse intervals. Although such changes may improve signal detection, they do not guarantee signal compliance. Even when signals are clearly detectable, unreliable automation may cause mistrust (Parasuraman and Riley, 1997). Furthermore, Bliss (1993) demonstrated that alarm mistrust could cause operators to ignore alarms often.

As noted within Xiao and Seagull’s (1999) useful taxonomy, a variety of alarm problems are possible, including false alarms (alarm stimuli without corresponding system problems), nuisance alarms (alarm stimuli indicating a potential problem in an unrelated context), and inopportune alarms (cascading alarm stimuli indicating minor problems). False and nuisance alarms are a particular concern for automotive CAWS. Research suggests that drivers may ignore or disable CAWS if its alarms do not reliably signal danger (Tijerina and Garrott, 1997). Such disregard, a direct consequence of mistrust, may be compounded under high levels of workload (Bliss and Dunn, 2000).

Researchers have studied operator trust in many contexts.
Muir (1989) presented a broad theory of machine trust based on social trust theories from Barber (1993) and Rempel et al. (1985). Barber (1993) claimed that social trust relies on one’s belief that a partner follows natural and moral laws, that the partner has technical competence, and that the partner acts to benefit others. Rempel et al. (1985) stated that humans show trust because a partner behaves predictably, has a dependable disposition, and follows moral laws consistently. Muir (1989) aggregated these theories and applied them to the human–machine relationship. Recently, researchers have applied Muir’s theory of alarm mistrust to the human–automation relationship. They have validated Muir’s theory, and have documented alarm related performance decrements in a variety of complex task areas such as process control (Lee and Moray, 1992), medicine (Bitan et al., 2000), aviation (Pritchett, 2001) and driving (MacKinnon et al., 1993).

In the laboratory, researchers have improved various aspects of alarm responsiveness by increasing alarm urgency (Haas and Casali, 1995; Hellier et al., 2002), advertising high reliability rates to operators (Bliss et al., 1995), and incorporating redundant information displays (Selcon et al., 1995). Each of these manipulations heightened operator situation awareness, resulting in faster reaction times and increased compliance rates. It is important that these manipulations be validated using an automotive task scenario, because of the potential for false alarms to be generated by automotive CAWS (Tijerina and Garrott, 1997). Farber and Paley (1993) judged the impact of false alarms on driver behavior to be a critical human factors research issue.

1.1. Goals and hypotheses

The purpose of this research was to document the effects of alarm unreliability in an automotive context. Alarm systems may be unreliable in many ways, including failing to emit signals when such emissions are warranted (missed alarms), and emitting signals when emissions are not warranted (false alarms). Because of the implications of false alarms in the automotive cab, we chose to study false alarms in this research. As demonstrated by Baber (1994), a significant issue with false alarms is their potential for driver distraction, causing operators to redirect their attention away from the primary driving task. False alarms are also ubiquitous, because alarm manufacturers have a legal predisposition to warn.

Using laboratory-based research paradigms, researchers have demonstrated performance deficits resulting from frequent false alarms in active collision avoidance systems, where collisions might result from actions taken by the driver. For example, Dingus et al. (1997) studied the impact of false collision avoidance alarms on young and old drivers, finding that young drivers exhibited signs of alarm mistrust in a rear-end collision scenario.
Such research usually involves incorporating false alarms in an attempt to manipulate aspects of Muir’s trust framework by degrading the predictability and dependability of the alarm system. This approach is also followed in the current research; however, the collision avoidance system was a passive one, where threats approached the driver. The driver had to take evasive action to avoid the collision. Such situations commonly occur on multi-lane highways, when a car begins to change lanes or approaches from the rear, not realizing another car is in the way. We presented drivers with alarms of varying reliability levels signaling the threat of a collision from the rear. In the first experiment, the alarm signals originated from the center console. In the second, signals originated from various spatial locations within the automotive cabin, corresponding to the location of the approaching vehicle. In
each case, the driver and alarm system operated in parallel, as described by Tijerina and Garrott (1997). In past research, alarm mistrust has consistently been associated with reduced alarm response frequency (Bliss, 1993; Sorkin, 1988). Furthermore, in their final report for the Automotive Collision Avoidance Systems (ACAS) Program, NHTSA suggests that drivers will likely ignore collision alarms that are not reliable (NHTSA, 2000). Therefore, participants were expected to respond less frequently to alarm systems of low reliability. We also measured alarm reaction time, alarm reaction appropriateness, and appropriateness of driving actions following the alarms. However, due to inconsistent results regarding these variables in the alarm mistrust literature, we made no hypotheses regarding them.

2. Experiment One

2.1. Method

2.1.1. Experimental design

For this experiment, we manipulated auditory alarm reliability among three groups (50%, 75%, or 100% true alarms). Dependent measures reflected driving performance and reactions to alarms. Upon hearing an alarm, drivers were to determine whether it actually signaled an approaching car (true alarm) or not (false alarm). If the alarm was true, drivers were told to swerve to avoid being struck from behind by the approaching car. In addition to transcribing videotapes to capture steering movements, experimenters relied on vocalizations by drivers to determine whether and in what direction they swerved following the alarm signals. Magnitude of swerving, while not measured directly, was inferred from collision rate. If drivers did not swerve enough, they would collide with the approaching car.

2.1.2. Measurements

Driving performance measures included driving reaction appropriateness (the proportion of trials, out of 12, where drivers swerved in the proper direction after true alarms and did not swerve after false alarms), and collision frequency (the proportion of trials during the experiment where reactions to alarms resulted in collisions with approaching vehicles). Collision frequency was calculated by dividing collision occurrences by the number of threats (6, 9, or 12 for the 50%, 75%, and 100% reliability groups, respectively). Measured reactions to auditory alarms included the frequency with which drivers swerved to avoid approaching cars (after true and false alarms); the time taken by drivers to swerve, in seconds from the onset of the auditory alarm stimulus to the beginning of the swerve; and the proportion of trials when participants
appropriately reacted to true alarms and ignored false alarms. Because of the importance of swerving, we coded it visually from videotapes and confirmed it from the drivers’ descriptions of their actions.

Participants were randomly assigned to a 50%, 75%, or 100% reliability group. Following a procedure used in previous alarm mistrust research (see Bliss et al., 1995), participants were told the alarm system’s reliability before performing the experimental task. Because alarm mistrust accumulates as exposure to false alarms continues (see Bliss and Kilpatrick, 2000), this ensured that participants adopted stable and appropriate trust levels within each experimental session. It also reflected real-world situations where drivers may have preconceived notions (from the media or other drivers) about CAWS reliability.

2.1.3. Participants

To ensure statistical power greater than 0.80 at the p = 0.05 level (Cohen, 1988), data were collected from 70 undergraduate volunteers who received course credits toward their general psychology classes at The University of Alabama in Huntsville. All were licensed drivers living in the Huntsville, AL area. Forty participants were male, and 30 were female (this ratio was similar within all groups). The average age of participants was 22.1 years. The driving experience of participants ranged from 1 to 28 years, with an average of 6.6 years.

2.1.4. Materials

A driving simulator owned by the Distributed Simulation Group at the US Army Aviation and Missile Command Center (AMCOM, Redstone Arsenal, AL) was used as a vehicular platform for the primary driving task (see Fig. 1). The simulator was a military HMMWV™ (multipurpose vehicle) modified to serve as an automotive simulator. The front window of the HMMWV faced three large (238.76 cm high × 177.8 cm wide) screens. The total horizontal field of view was 135°, each individual screen’s horizontal field of view was 45°, and the vertical field of view was 33.75°. Although there was no motion base, physical and functional fidelity were high (see Hays and Singer, 1989). The steering wheel, brakes, accelerator, and gearshift controls were functional, and visual and auditory stimuli faithfully represented environmental changes according to driver actions. In addition to the front view screens, participants also viewed a rear-view mirror display inside the vehicle cabin. This display was a 14-inch flat-panel display that showed a rearward view of terrain that had already been passed.

Fig. 1. The HMMWV™ driving simulator.

The simulated driving environment was created using Multi-Gen (Creator)™ on a Silicon Graphics computer. The environment faithfully represented a 1609.34 km² area from Decatur, AL (West) to Huntsville, AL (East); and from the Tennessee State Line (North) to Arab, AL (South). The corridor contained the length of Interstate 565, was approximately 32.19 km from East to West, and 80.47 km from North to South. The environment was initially developed at Boeing’s Advanced Computing Laboratory in Huntsville, AL and translated at the Army-NASA Virtual Innovations Laboratory (ANVIL), Marshall Space Flight Center, AL. The environment was refined and rendered by personnel at AMCOM’s Distributed Simulation Center. After the initial environment was completed, AMCOM personnel incorporated it into existing distributed interactive simulation software to allow representation of other vehicles and terrain. To create a simulated environment of I-565, the experimenter selected and placed polygons and lines to create the road, background, overpasses, exit ramps, road signs, and landmarks. Land topography was determined by referring to topographical maps and Global Positioning System elevation estimates.
There were approximately 255 stationary polygons in the environment plus moving objects such as vehicles. Objects such as road surfaces, signs, overpasses and trees were given a realistic appearance by placing texture maps on polygon surfaces. This technique facilitated real-time model rendering (see Fig. 2). During the simulation, the view of the computer-generated imagery was continuously updated at a rate of approximately 20 Hz to match the simulated vehicle’s movement. The update rate for the rear-view mirror display was 60 Hz.

While driving in the simulation at 112.65 kph, participants were required to react to 12 intermittent auditory alarms spaced randomly throughout the session. The alarms originated from a center front console located behind the automobile cabin firewall. The alarm stimulus, audible for 4 s, consisted of regular 1000 Hz sine wave pulses at approximately 90 dB(A).

Fig. 2. A sample scene from the driving environment.

The signal was obtained from the Huntsville Chrysler Electronics Division, was digitized, and then incorporated into the driving simulation to coincide with the passage of particular landmarks. Designed initially for use in Dodge Neon™ automobiles, the signal was in general accordance with Haas et al.’s (1996) guidelines for high urgency alarms, and was the prototype signal for collision avoidance systems under development by Chrysler. The 4-s audibility duration was chosen to ensure that participants detected the signal, and was validated during a pilot study.

True alarms signaled the appearance and approach of a car from behind the participant’s automobile. The approaching cars appeared in the driver’s rear view mirror display at the same time as the alarm signal sounded. Approaching cars appeared 91.4 m (300 feet) behind the participant’s car, traveled at 144.84 kph (90 mph), and could approach from the right rear, rear, or left rear. The 12 alarms were scheduled to appear randomly according to simulated distance traveled (interstimulus distance ranged from 0.40 to 4.43 km (0.25–2.75 miles), with an average of 2.41 km (1.50 miles)). Drivers were dependent upon the rear view mirror display to determine whether the alarm was true or false, and to determine the proper action to take. If a car approached from the left or right rear areas, it remained halfway in the participant’s lane, so that the participant needed to swerve left or right to avoid a collision. Participants were told to swerve left to avoid cars coming from the right rear, and right to avoid cars coming from the left rear or rear. They were told to verbally indicate their action while taking it, to facilitate data recording. If an alarm was false, the signal would sound but there would be no approaching car. There were no missed alarms (an approaching car without an alarm signal) included in the scenario.
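As a rough check on the timing implied by these parameters, the following sketch computes how long an approaching car would take to close the initial gap. It assumes, beyond what the text states, that both vehicles hold constant speeds (the participant at the instructed 112.65 kph) and that vehicle length is ignored; the function name `closing_time_s` is illustrative only.

```python
KPH_TO_MPS = 1000.0 / 3600.0  # kilometres per hour to metres per second


def closing_time_s(gap_m: float, approach_kph: float, driver_kph: float) -> float:
    """Seconds for the approaching car to close the gap at constant speeds."""
    closing_speed_mps = (approach_kph - driver_kph) * KPH_TO_MPS
    if closing_speed_mps <= 0:
        raise ValueError("approaching car must be faster than the driver")
    return gap_m / closing_speed_mps


# Values from the text: 91.4 m gap, 144.84 kph approach, 112.65 kph driver.
t = closing_time_s(91.4, 144.84, 112.65)
print(f"closing time: {t:.1f} s")
```

Under these assumptions the gap closes in roughly 10 s, which is longer than the 4-s alarm duration.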


In addition to the simulated alarms and driving environment, the experimenters also used an informed consent form and two data collection forms. The informed consent form described the purpose of research, risks, benefits, and withdrawal procedures. The background questionnaire included information about age, driving experience, sex, vision, hearing, and computer game familiarity.

2.1.5. Procedure

After completing the informed consent form, participants completed the demographics questionnaire and reviewed task instructions with the experimenter. The experimenter then led each participant into the simulator room. At this time, participants were shown the driving simulator. The participant started the vehicle, engaged the transmission, and drove down the road for approximately 3.22 km (2 miles). After reaching the 3.22 km point, participants made a left-hand turn, a right-hand turn, accelerated, and braked. Participants were encouraged to check their rear view mirror frequently. Participants were presented with the alarm signal during familiarization. Upon hearing an alarm, participants were told to scan the rear view display to locate the approaching vehicle, and then to respond appropriately (swerve left or right to avoid the approaching vehicle).

After familiarization, the experimenter repeated the experimental task instructions. Each participant was told the percentage of true alarms to expect during the experiment (50%, 75% or 100%). After the participant indicated that he or she understood the instructions, the experimental task began.

The experimental task consisted of a 20-min session of driving and reacting to collision warning alarms. The route began at the on-ramp to I-565 from Washington Street in downtown Huntsville, AL. The driver traveled west from that point to the intersection of I-565 and I-65 near Decatur, AL. Drivers were encouraged to maintain a maximum speed of 112.65 kph (70 mph). While driving, the participants were presented with alarms according to a predetermined random schedule, so that alarm onset could not be predicted from time passage or geographic location. True alarms signaled the approach of a car from the right rear, rear, or left rear of the participant’s automobile. False alarms sounded without an approaching car being present.
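A predetermined random schedule of this kind could be generated along the following lines. This is only a sketch under stated assumptions: the uniform distribution of inter-alarm distances, the equal probability of each approach direction, and the names `Alarm` and `make_schedule` are all hypothetical; the text specifies only the number of alarms, the reliability levels, the three approach directions, and the 0.40–4.43 km range of inter-alarm distances.

```python
import random
from typing import List, NamedTuple, Optional

DIRECTIONS = ("left rear", "rear", "right rear")


class Alarm(NamedTuple):
    position_km: float        # simulated distance travelled at alarm onset
    is_true: bool             # True if a car actually approaches
    direction: Optional[str]  # approach direction; None for false alarms


def make_schedule(reliability: float, n_alarms: int = 12,
                  min_gap_km: float = 0.40, max_gap_km: float = 4.43,
                  seed: Optional[int] = None) -> List[Alarm]:
    """Build one session's alarm schedule (assumed uniform spacing)."""
    rng = random.Random(seed)
    n_true = round(reliability * n_alarms)  # 6, 9, or 12 threats
    validity = [True] * n_true + [False] * (n_alarms - n_true)
    rng.shuffle(validity)  # order of true and false alarms is random
    schedule, position = [], 0.0
    for is_true in validity:
        position += rng.uniform(min_gap_km, max_gap_km)
        direction = rng.choice(DIRECTIONS) if is_true else None
        schedule.append(Alarm(position, is_true, direction))
    return schedule


for alarm in make_schedule(0.75, seed=1):
    print(alarm)
```

Fixing the seed makes the schedule reproducible across participants, which is one way to read "predetermined"; whether the original schedule was shared across participants is not stated.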
Personnel at AMCOM programmed a semi-automated computer interface to assist the experimenter with data collection. The computer interface was responsible for generating alarm signals, starting and ending the simulation, and storing alarm response frequency and appropriateness data recorded by the experimenter after each alarm signal. As a supplementary data collection tool, an 8-mm video camera was placed just outside of
the simulator’s driver’s-side door to record the alarms and each participant’s vocalizations and movements. After participants had completed the 20-min driving task, they rested for 10 min before starting the second experiment.
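The dependent measures defined in Section 2.1.2 reduce to simple proportions over a participant's 12 trials. The sketch below shows one way they might be computed; the `Trial` record and the function names are hypothetical, and the swerve rule (left for a car from the right rear, right otherwise) follows the instructions given to participants in Section 2.1.4.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Trial:
    threat: Optional[str]   # "left rear", "rear", "right rear"; None = false alarm
    swerve: Optional[str]   # "left" or "right"; None = no swerve
    collided: bool = False  # only meaningful when a threat was present


def correct_swerve(threat: Optional[str]) -> Optional[str]:
    """Instructed response: no swerve for false alarms, else swerve away."""
    if threat is None:
        return None
    return "left" if threat == "right rear" else "right"


def response_frequency(trials: List[Trial]) -> float:
    """Proportion of alarms (true or false) followed by a swerve."""
    return sum(t.swerve is not None for t in trials) / len(trials)


def driving_appropriateness(trials: List[Trial]) -> float:
    """Proportion of trials with the instructed swerve (or non-swerve)."""
    return sum(t.swerve == correct_swerve(t.threat) for t in trials) / len(trials)


def collision_frequency(trials: List[Trial]) -> float:
    """Collisions divided by the number of threats (6, 9, or 12)."""
    threats = [t for t in trials if t.threat is not None]
    return sum(t.collided for t in threats) / len(threats)
```

Note that collision frequency uses the number of threats as its denominator, so the three reliability groups are compared on a per-threat basis rather than a per-alarm basis.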

2.2. Results

After collating and entering the task performance data, we analyzed the videotaped performances to determine swerving reaction times to alarms. To determine if console alarms affected driver behavior, we calculated one-way analyses of variance (ANOVAs) for alarm reaction frequency, alarm reaction time, appropriateness of alarm-instigated driving reactions, appropriateness of alarm reactions, and collision frequency. The results of those ANOVAs are reported below.

As shown in Fig. 3, participants responded more frequently when alarms reliably indicated the presence of an approaching car in the rear view mirror display, F(2,67) = 20.603, p < 0.001. Tukey post-hoc comparisons indicated that average alarm response frequencies were different across all three reliability levels (p < 0.05).

There was no effect of alarm reliability on swerving reaction time (p > 0.05). The average time taken by participants to swerve following 50% reliable console alarms was 5.82 s (SD = 0.54), the average time taken to swerve following 75% reliable console alarms was 5.91 s (SD = 0.83), and the average time taken to swerve following 100% reliable alarms was 6.01 s (SD = 0.67).

Alarm reliability did have an effect on whether participants swerved correctly to avoid a collision following the alarms, F(2,67) = 12.216, p < 0.001 (see Fig. 4). Tukey post-hoc tests indicated that both the 75% and the 50% groups swerved correctly less often than the 100% group; however, there was no significant difference between the reactions of the 50% and 75% groups.

Fig. 3. Frequency of console alarm reactions as a function of signal reliability.


Fig. 4. Appropriateness of driving reactions as a function of console alarm reliability.

Fig. 5. Console alarm reaction collision rates as a function of signal reliability.

The ANOVA for alarm reaction appropriateness (whether drivers properly reacted to true alarms and ignored false alarms) was not significant, p > 0.05. Participants in different reliability groups differed with regard to collision rate, F(2,67) = 5.68, p = 0.005 (see Fig. 5). Tukey post-hoc tests indicated that the 50% group collided with fewer cars than the 100% group (p < 0.05).
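For readers unfamiliar with the F statistics reported here, the sketch below computes a one-way ANOVA F ratio and its degrees of freedom from scratch. The scores are hypothetical toy values, not data from the study, and the function name is illustrative. With three groups and 70 participants the degrees of freedom are (3 − 1, 70 − 3) = (2, 67), matching the reported F(2,67); the individual group sizes are not stated in the text. The p-value and Tukey comparisons are omitted because they require distribution tables or a statistics library.

```python
from typing import Sequence, Tuple


def one_way_anova(groups: Sequence[Sequence[float]]) -> Tuple[float, int, int]:
    """Return (F, df_between, df_within) for a one-way between-groups design."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variability of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variability of scores around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical toy scores for three groups (not data from the study).
f_stat, df_b, df_w = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f"F({df_b},{df_w}) = {f_stat:.2f}")
```

A large F indicates that the differences among group means are large relative to the variability within groups.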

2.3. Discussion

We were not surprised to see lower alarm reaction frequency and driving reaction appropriateness for lower reliability alarms. This supports our expectation that the cry-wolf phenomenon would occur, even in a comparatively realistic task setting. It is encouraging that dependent measures such as response frequency are sensitive indicators of alarm mistrust regardless of the veridicality of task structure. This shows the applicability of Muir’s machine trust theory to automotive task situations.

Previously, researchers have shown that alarm response rate and reaction appropriateness are sensitive to fluctuations in alarm system reliability (Bliss, 1993). However, primary task performance has rarely suffered, as participants would typically ignore unreliable alarms and continue to perform the primary task efficiently. In the current research, by contrast, the primary task was intricately tied to the alarms, and so it is reasonable that appropriateness of the driving task (swerving response) suffered. One explanation for the lack of effect for alarm reaction appropriateness is the fact that drivers relied heavily on the rear view mirror display when reacting. Therefore, participants in all groups were able to confirm the appropriateness of their reactions prior to making them.

The frequency of alarm reactions in this experiment resembled the frequency matching pattern demonstrated in previous research (Bliss, 1993); participants mirrored their reaction rates to the reliability of the alarm signals. However, in the past, participants generally overmatched their response rates (Bliss, 1993), whereas in the current experiment reaction rates were lower than the reliability of the signals. This may suggest that the participants experienced significant cognitive load while driving the simulator, and so had less spare cognitive capacity to react to alarms. Additionally, participants may have devoted more attention to the primary task, because it was realistic and somewhat novel.

One interesting finding concerns the observed collision rate for the groups. Although we made no hypothesis concerning collision rate, logic might suggest that a higher reliability alarm system would result in fewer collisions. However, the 50% reliability group collided with significantly fewer approaching vehicles than the 100% group. Although all participants checked the rear view mirror display before swerving, members of the 50% group likely did so more deliberately, and so were more successful at avoiding collisions. In contrast, members of the 100% group appeared to fall into a sort of routine, swerving more carelessly as time progressed.

In the next experiment, we presented participants with spatially generated signals, to determine whether additional signal information would mitigate the alarm mistrust demonstrated in the first experiment.

3. Experiment Two

In recent years, researchers have demonstrated many ways to improve alarm compliance, including increasing alarm urgency (Haas et al., 1995; Hellier et al., 2002), advertising high reliability rates to operators (Bliss et al., 1995), and requiring operators to respond verbally (Bliss, 1997). One particularly successful method studied
was to relay additional information about the alarm system, so that operators could refer to supplemental evidence of alarm validity or alarm system reliability before reacting (Bliss et al., 1996; Selcon et al., 1995). By increasing operator situation awareness in this fashion, reaction appropriateness increased and response frequency was optimized. In automotive environments, one way to relay additional information might be to design alarms that spatially mirror the dangers they signal. Previous research in the aviation domain suggests that perceived reliability might be greater for spatial alarms than for alarms generated from a central console because spatial alarms provide additional information about potential collisions (Lee and Patterson, 1993). However, research is needed to determine the value of spatial alarms for drivers (Parasuraman et al., 1997). In the aviation domain, Lee and Patterson (1993) found that fighter pilots preferred spatial signals for wingman communication and threat information. Rudmann and Strybel (1999) showed that using centered and spatial auditory cues for a search task in a virtual environment led to quicker response times over auditory cues generated in front of participants. Some automobile designers have considered presenting auditory alarm signals spatially to increase localization ability. The expectation is that driver reactions may improve because the threshold at which signals are discriminated is lowered (Bronkhorst et al., 1996). Spatially generated auditory signals may indeed be promising for automotive environments, because drivers would not have to look at a head-up display, console, or rear view mirror to know from which direction another car is approaching. Instead, information about the car’s location could be embedded into the spatial auditory signals. 
Research suggests that drivers may also perceive spatial signals as more reliable than centralized console signals, because of their closer correspondence to the event signaled (Bliss et al., 1996; Bronkhorst et al., 1996). We intended this experiment to partially replicate the first experiment, while providing additional data about spatial alarm reactions. A direct comparison of spatial and console alarm reactions is not possible because all participants experienced central alarms prior to spatial alarms. However, we were interested to know how participants would react to the introduction of spatial alarms, after experiencing console alarms. Few researchers have considered whether alarm respondents adapt well to changes in technology. However, such changes are commonplace in many environments, such as the automobile industry: With the numerous technologies in place controlling CAWS function, people who drive a variety of cars may be required to adjust to a variety of signal types and origins.


Because of the potential interaction between spatial signal source and perceived reliability, it is important to examine alarm mistrust in conditions where signals are generated spatially. In Experiment Two, we replicated the first experiment, using spatially representative signals. We expected that alarm reliability would impact alarm reactions and driving performance as it did in the first experiment. However, we expected better overall performance because of our use of spatial signals, and because the experimental participants had already received exposure to console-based signals in Experiment One.

3.1. Method

3.1.1. Experimental design

The approach was the same as in Experiment One. Alarm reliability was manipulated between groups (50%, 75% or 100% true alarms), and we measured alarm reaction frequency (to both true and false alarms), time, and appropriateness, driving appropriateness, and collision rate. Participants experienced the same level of false alarms in both experiments.

3.1.2. Participants

The same participants were used for this experiment as were used in Experiment One. After participants completed the first experiment, they rested for 10 min, and then began the second experiment.

3.1.3. Materials

All materials used were the same as in the first experiment, except for the origin of alarm signals and the direction of travel (participants drove east instead of west). In this experiment, the 12 alarms were spatially representative of the direction of the approaching car. Alarm signals sounded from the left rear, rear, or right rear of the automotive cabin. A preliminary study indicated that participants could effectively localize alarm sounds within the cabin. All other aspects of the alarms remained the same as in Experiment One.

3.1.4. Procedure

After completing the simulated driving task in Experiment One, participants rested for 10 min. They then began to drive east from the intersection of Interstate 65 and Interstate 565 to Huntsville.
This route brought them back to their origination point at the beginning of Experiment One. Along the way, 12 alarms were generated in a manner similar to the alarms during the first experiment: reliability of the alarm system was the same as it was in the first experiment, and the validity of individual alarms (true or false) was randomly determined. As before, participants were told to react to the alarms by ignoring them if they were false and swerving in the
proper direction (away from the approaching car) if they were true. As before, participants were instructed to maintain a speed of 112.65 kph (70 mph). After they completed the simulated scenario, they stopped the vehicle, exited, and completed the opinion questionnaire. After ensuring that participants were not suffering from simulator sickness, we debriefed and dismissed them.

3.2. Results

To determine if the spatial alarms affected driver behavior, we calculated one-way analyses of variance (ANOVAs) using the same variables as in Experiment One. The results of those ANOVAs are reported below. As shown in Fig. 6, there was a significant effect of spatial alarm reliability on alarm response frequency, F(2,67)=21.448, p < 0.001. Tukey post-hoc comparisons indicated that each increase in reliability was associated with a significant increase in alarm responses (p < 0.05). As in the first experiment, there was no statistically significant effect of alarm reliability on swerving reaction times (p > 0.05). Participants averaged 5.73 s (SD=0.79) to swerve following 50% reliable spatial alarms; 5.83 s (SD=0.69) to swerve following 75% reliable spatial alarms; and 5.95 s (SD=0.53) to swerve following 100% reliable spatial alarms. Alarm reliability did influence the way that participants drove following alarms, F(2,67)=16.089, p < 0.001 (see Fig. 7). Tukey post-hoc tests indicated that the 75% group and the 50% group each swerved in a significantly less appropriate manner than the 100% group; however, there was no significant difference between the 50% and 75% groups. Alarm reliability did not significantly affect alarm reaction appropriateness, p > 0.05. There was, however, a statistically significant effect of reliability on collisions, F(2,67)=6.694, p = 0.002 (see Fig. 8). Tukey post-hoc tests indicated that the 50% group collided with fewer cars than the 75% and 100% groups (p < 0.05); however, there was no difference between the 75% and 100% groups.

Fig. 6. Frequency of spatial alarm reactions as a function of signal reliability.

Fig. 7. Appropriateness of driving reactions as a function of spatial alarm reliability.

Fig. 8. Spatial alarm reaction collision rates as a function of signal reliability.

3.3. Discussion

As in the first experiment, we observed performance deficits in the 50% and 75% reliability groups. Participants in those groups responded to alarms less frequently, and performed the driving task less appropriately following alarms. Yet, as before, participants in the 50% group collided with the fewest approaching cars, compared to the 75% or 100% groups. A rigorous comparison of console and spatial alarms across the two experiments is not advisable because alarm origin and task experience were confounded. However, an informal consideration of the data shows that spatial alarm reliability affected performance measures the same way that console alarm reliability had. An examination of performance means suggests that participants generally performed better when the alarm signals were spatially generated than when they were generated from the center console. The lone exception to this is driving appropriateness. Participants made less appropriate driving reactions following spatial alarms than they had following console
alarms. While this seems contradictory to the results for the other variables, an examination of the videotaped performances revealed some indecision on the part of participants. In some cases, participants’ recorded initial reactions were inappropriate, but they ultimately avoided the oncoming car after correcting the initial reaction. In both the console and spatial experiments, we expected that participants would rely upon the rear-view mirror to detect the position of the approaching car. Indeed, an examination of mirror checking behavior confirmed this. Similar to Experiment One, as Experiment Two progressed some members of the 75% and 100% groups seemed to fall into a comfortable routine of swerving slightly or making a ‘‘token’’ reaction. In contrast, many members of the 50% group questioned every alarm and paid close attention to the rear view mirror, perhaps because they were not biased to believe that the alarms would be true. Therefore, they were more deliberate about swerving, and more effective at avoiding collisions. An additional factor that influenced collision rates was that true alarms appeared to create a startle effect. Upon hearing the alarms, some drivers would swerve correctly, but then swerve back again. As a result, even though their initial swerving response was technically appropriate, they ultimately collided with the approaching car. The possibility of such reactions has been discussed before by Baber (1994). An interesting aspect of performance in Experiment Two concerns the driving appropriateness data. As is evident from Fig. 7, drivers generally made poor driving reactions following spatial alarms. There are two likely explanations for this. The first is that scanning the rear view mirror display made it difficult for participants to localize the spatial alarm signals. The second explanation is that during the second experiment, drivers were more fatigued and frustrated with the driving and alarm reaction tasks. 
As a result, they may have reacted less appropriately in general.

4. General discussion

Past research has determined that response frequency is a stable indicator of alarm mistrust (Bliss, 1997). For the current research, we increased the realism of the experimental paradigm by requiring drivers to avoid other traffic, so that the alarms were realistic and task-relevant. It is encouraging that dependent measures such as response frequency are sensitive indicators of alarm mistrust regardless of the veridicality of task structure. As mentioned previously, this lends support to existing theories of machine trust (Muir, 1989), because aspects of the alarm system such as predictability and dependability clearly impact performance, regardless of task realism.

Perhaps one of the most important measures in this research was collision rate. Because the alarm systems were designed to prevent collisions, it is of concern that a 50% reliable alarm system should result in fewer collisions. It is likely that such low reliability levels led participants to follow a wise course of action, confirming the existence and nature of any imminent threats before reacting. Yet, the urgency of the alarms seemed to lead members of the 100% group to be distracted, so that while their driving reactions began correctly, they often overcompensated or reversed their action. Dingus et al. (1997) have discussed the possibility that improperly designed collision avoidance alarms may actually lead to more collisions. Our findings may lend support to that notion.

Considering the results of the present experiment, there are several recommendations that may be made. First, it is important to replicate these findings in a variety of other traffic conditions, as the driving task in this experiment was admittedly fabricated and simplistic. If future replications reveal similar results, alarm designers would be well advised to consider interventions to counter the potential for false alarms. In previous alarm mistrust research, researchers have found that making participants aware of high alarm system reliability rates improved responsiveness (Bliss et al., 1995). Such a strategy may be applicable for automotive collision avoidance alarm systems as well. Conversely, if drivers are made aware of low reliability rates, their responsiveness may decline. Another intervention shown by Bliss (1993) to decrease response times was to heighten alarm signal urgency. However, in the current experiment participants reacted poorly to alarm urgency, becoming distracted and confused. Urgent, reliable alarms evoked responses that, while appropriate, led to a greater number of collisions.
For this reason, advocating quick, reflexive reactions to automated alarm systems may not be a wise course of action. Furthermore, the negative impact of such reflexive behavior may well be compounded in situations where task workload is heightened, or where there are a number of collateral alarm systems.

As noted in Section 1, the introduction of spatial alarm systems has been considered as a way to improve alarm responsiveness. However, the effectiveness of such alarms may be limited. Even when spatial alarms were used, participants still showed clear signs of alarm mistrust. Furthermore, as indicated previously, a limitation of this research is that there is no way to separate the effects of alarm origin and task training. Because of this, future researchers are encouraged to manipulate alarm reliability and origin together, to determine the robustness of our findings.
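The recommendation to manipulate alarm reliability and origin together can be pictured as a fully crossed between-groups design. The following is a minimal sketch under stated assumptions, not the procedure used in these experiments: the three reliability levels, the two alarm origins, and the count of 12 alarms per drive come from the text, whereas the function names, the even random assignment of participants to cells, and the choice to fix the exact number of true alarms in each schedule are hypothetical.

```python
import itertools
import random

RELIABILITIES = (0.50, 0.75, 1.00)  # proportion of true alarms (from the text)
ORIGINS = ("console", "spatial")    # the two alarm origins studied
ALARMS_PER_DRIVE = 12               # alarms per drive (from the text)


def crossed_conditions():
    """Return every reliability x origin cell of a fully crossed design."""
    return list(itertools.product(RELIABILITIES, ORIGINS))


def alarm_schedule(reliability, n_alarms=ALARMS_PER_DRIVE, rng=None):
    """Return a shuffled list of booleans, where True marks a true alarm.

    Assumption: the number of true alarms is fixed at reliability * n_alarms
    and only their order is randomized.
    """
    rng = rng or random.Random()
    n_true = round(reliability * n_alarms)
    schedule = [True] * n_true + [False] * (n_alarms - n_true)
    rng.shuffle(schedule)
    return schedule


def assign(participant_ids, rng=None):
    """Hypothetical helper: spread participants across cells as evenly as possible."""
    rng = rng or random.Random()
    ids = list(participant_ids)
    rng.shuffle(ids)
    cells = crossed_conditions()
    return {pid: cells[i % len(cells)] for i, pid in enumerate(ids)}
```

Because each participant in such a design would experience only one origin, any difference between console and spatial alarms could no longer be attributed to prior task experience.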

The current research provides an interesting view of CAWS. Past research has shown the importance of redundant information sources in situations where alarm reliability is suspect (Bliss et al., 1996). In the current research, it was clear that participants relied on the rear view mirror display, even when spatial location of the threats was coded within the alarm signal. Designers of alarm systems may do well to heed this information. It seems that any degradation of alarm reliability will likely cause drivers to verify alarm validity through other information sources. For this reason, the implementation of CAWS (particularly those that feature spatially generated signals) should be considered with caution. As this project was the first step toward a program of research concerning CAWS, there are many avenues to pursue. The driving environment created for this research was intentionally simplistic, to allow rigorous manipulation of reliability and signal source location and to facilitate the measurement of driving performance. Following these results, it is important that research be conducted that includes the presence of other traffic (besides cars overtaking the experimental vehicle). It is also important that investigators examine alarm responsiveness using traffic scenarios that may yield high numbers of false alarms, such as parking lots, two lane roads, and interstate interchanges. In the current experiments, we took care to ensure that participants could definitely tell whether alarms were true or false, by presenting no cars when false alarms sounded. Other researchers have chosen to present less threatening cars when false alarms sound. This may make the difference between true and false alarm situations less pronounced (see Gupta et al., 2002). Both approaches have advantages (see NHTSA, 2000). Situations where true and false alarm situations are less distinguishable may be more externally valid. 
However, we presented clearly distinguishable true and false alarm situations to avoid any ambiguity concerning alarm validity perception. The current collision avoidance warning system was a passive one, generating alarms when other cars encroached upon the experimental vehicle. Such situations are important to consider, because of their prevalence. However, many prototype CAWS operate according to an active algorithm, where alarms are generated after the driver activates the turn signal or begins to change lanes. Such alarm systems must also be evaluated for feasibility and reliability. Drivers may be more alert in such situations because they instigated the collision situation. Such situations may also generalize more readily to collision situations involving objects other than automobiles (i.e., car–pedestrian accidents and roadway departures).
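The distinction between passive and active alarm generation can be made concrete with a small sketch. It is illustrative only: the `VehicleState` fields and both function names are hypothetical, and the rule that an active system also requires an encroaching car before sounding is an assumption rather than a description of any particular prototype.

```python
from dataclasses import dataclass


@dataclass
class VehicleState:
    other_car_encroaching: bool  # another car is closing on the driver's vehicle
    turn_signal_on: bool         # the driver has activated the turn signal
    lane_change_started: bool    # the driver has begun to change lanes


def passive_alarm(state: VehicleState) -> bool:
    """Passive rule: sound whenever another car encroaches."""
    return state.other_car_encroaching


def active_alarm(state: VehicleState) -> bool:
    """Active rule (assumed): sound only after a driver-initiated maneuver
    coincides with an encroaching car."""
    driver_initiated = state.turn_signal_on or state.lane_change_started
    return driver_initiated and state.other_car_encroaching
```

Under these assumed rules, a car approaching from behind while the driver holds the lane would trigger the passive alarm but not the active one.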

Acknowledgements

This project, funded by the University Transportation Center of Alabama (UTCA Grant No. 03UTCA-003), could not have been completed without the assistance of many people and entities. We wish to thank Frank Craig and Stephen Tanner from The Boeing Company, Mark Blasingame and Chris Daniel from the Army-NASA Virtual Innovations Laboratory, Robert Lock of Chrysler (Huntsville Electronics Division), Laurie Fraser, Greg Tackett, and Tim McKelvy from the US Army Aviation Missile Command (Distributed Simulation Center), Greg Lee and Skip Clay from Nichols Research, and Doug Barclay, Christy Bates, and Jimmy Moore from Computer Sciences Corporation.

References

Araki, H., Yamada, K., Hiroshima, Y., Toshio, I., 1996. Development of rear-end collision avoidance system. Proceedings of the 1996 IEEE Intelligent Vehicle Symposium, Tokyo, Japan, pp. 652–657.
Baber, C., 1994. Psychological aspects of in-car warning devices. In: Stanton, N. (Ed.), Human Factors in Alarm Design. Taylor & Francis, London.
Barber, B., 1993. Logic and the Limits of Trust. Rutgers University Press, New Brunswick, NJ.
Bitan, Y., Meyer, J., Shinar, D., Zmora, E., 2000. Staff actions and alarms in a neonatal intensive care unit. Proceedings of the IEA 2000/HFES 2000 Congress, San Diego, CA, July 29–August 4.
Bliss, J.P., 1993. The cry-wolf phenomenon and its effect on alarm response. Unpublished Doctoral Dissertation, University of Central Florida, Orlando.
Bliss, J.P., 1997. Alarm reaction patterns by pilots as a function of reaction modality. Int. J. Aviat. Psychol. 7 (1), 1–14.
Bliss, J.P., Dunn, M.C., 2000. The behavioral implications of alarm mistrust as a function of task workload. Ergonomics 43 (9), 1283–1300.
Bliss, J.P., Dunn, M., Fuller, B.S., 1995. Reversal of the cry-wolf effect: an investigation of two methods to increase alarm response rates. Perceptual Motor Skills 80, 1231–1242.
Bliss, J.P., Freeland, M., Millard, J., 1999. Alarm related incidents in aviation: a survey of the aviation safety reporting system database. Proceedings of the 43rd Annual Meeting of the Human Factors and Ergonomics Society, Houston, TX, September 27–October 1.
Bliss, J.P., Jeans, S.M., Prioux, H.J., 1996. Dual-task performance as a function of individual alarm validity and alarm system reliability information. Proceedings of the Human Factors and Ergonomics Society 40th Annual Meeting, Santa Monica, CA, October 2–8. Human Factors and Ergonomics Society, Philadelphia, PA, pp. 1237–1241.
Bliss, J.P., Kilpatrick, F., 2000. The influence of verbal content on alarm mistrust. Proceedings of the 2000 Human Factors and Ergonomics Society Annual Meeting, San Diego, CA, July 30–August 4.
Bronkhorst, A.W., Veltman, J.A., van Breda, L., 1996. Application of a three-dimensional auditory display in a flight task. Human Factors 38 (1), 23–33.
Chen, C., Quinn, R.D., Ritzmann, R.E., 1997. A crash avoidance system based on the cockroach escape response circuit. Proceedings of the 1997 IEEE International Conference on Robotics and Automation, Albuquerque, USA, pp. 2007–2012.
Cohen, J., 1988. Statistical Power Analysis for the Behavioral Sciences, 2nd Edition. Erlbaum, Hillsdale, NJ.
Dingus, T.A., McGehee, D.V., Manakkal, N., Jahns, S.K., Carney, C., Hankey, J.M., 1997. Human factors field evaluation of automotive headway maintenance/collision avoidance devices. Human Factors 39 (2), 216–229.
Farber, E., Paley, M., 1993. Using freeway traffic data to estimate the effectiveness of rear end collision countermeasures. Proceedings of the Third Annual IVHS America Meeting, Washington, DC, April.
Gupta, N., Bisantz, A.M., Singh, T., 2002. The effect of adverse condition warning system characteristics on driver performance: an investigation of alarm signal type and threshold level. Behav. Inf. Technol. 21 (4), 235–248.
Haas, E.C., Casali, J.G., 1995. Perceived urgency and response time to multi-tone and frequency-modulated warning signals in broadband noise. Ergonomics 38 (11), 2281–2299.
Haas, E.C., Edworthy, J., 1996. Designing urgency into auditory warnings using pitch, speed, and loudness. Comput. Control Eng. J. 7 (4), 193–198.
Hays, R.T., Singer, M.J., 1989. Simulator Fidelity in Training System Design. Springer, New York.
Hellier, E., Edworthy, J., Weedon, B., Walters, K., Adams, A., 2002. The perceived urgency of speech warnings: semantics vs. acoustics. Human Factors 44 (1), 1–17.
Hirst, S., Graham, R., 1997. The format and presentation of collision warnings. In: Ian, N. (Ed.), Ergonomics and Safety of Intelligent Driver Interfaces. Lawrence Erlbaum Associates, Mahwah, NJ.
Jansson, J., Johansson, J., Gustafsson, F., 2002. Decision making for collision avoidance systems. SAE Paper No. 2002-01-0403, Society of Automotive Engineers, Washington, DC.
Lee, J., Moray, N., 1992. Trust, control strategies, and allocation of function in human–machine systems. Ergonomics 35 (10), 1243–1270.
Lee, M.D., Patterson, R.W., 1993. The application of three-dimensional audio displays to aircraft cockpits: user requirements, technology assessment, and operational recommendations. Proceedings of the Beyond Speech/Virtual Reality/Teleoperation 1993. SIG-Advanced Applications, New York, NY, November.
Lilliboe, M.L., 1963. Final Report: evaluation of Astropower, Inc. auditory information display installed in the VA-3B airplane. Technical Report ST 31-22R-63. US Naval Air Station, Naval Test Center, Patuxent River, MD.
MacKinnon, D.P., Bryan, A.D., Barr, A., 1993. Four studies on the effects of multiple warnings: The false alarm effect and overwarning. Technical Report, Contract # AA8547. Project ABLE, Arizona State University.
Mazzae, E.N., Garrott, W.R., Flick, M., 1995. Human factors evaluation of existing side collision avoidance system driver interfaces. SAE Paper No. 952659. National Highway Traffic Safety Administration, Washington, DC.
Muir, B.M., 1989. Operators’ trust in and percentage of time spent using the automatic controllers in a supervisory process control task. Doctoral Thesis, University of Toronto.
National Highway Traffic Safety Administration, 1997. Report to congress on the national highway traffic safety administration ITS program. ITS Electronic Document Library Paper No. 2683. US Department of Transportation, Washington, DC.
National Highway Traffic Safety Administration, 2000. Automotive Collision Avoidance Systems (ACAS) Program. Final Report No. DOT HS 809 080. US Department of Transportation, Washington, DC.
National Highway Traffic Safety Administration, 2002. Automotive collision avoidance system field operational test. Final Report No. DOT HS 809 462. US Department of Transportation, Washington, DC.
Parasuraman, R., Hancock, P.A., Olofinboba, O., 1997. Alarm effectiveness in driver-centred collision-warning systems. Ergonomics 40 (3), 390–399.
Parasuraman, R., Riley, V., 1997. Humans and automation: use, misuse, disuse, abuse. Human Factors 39, 230–253.
Pritchett, A.R., 2001. Reviewing the role of cockpit alerting systems. Human Factors Aerospace Safety 1 (1), 5–38.
Rempel, J.K., Holmes, J.G., Zanna, M.P., 1985. Trust in close relationships. J. Personality Social Psychol. 49, 95–112.
Rudmann, D.S., Strybel, T.Z., 1999. Auditory spatial facilitation of visual search performance: effects of cue precision and distractor density. Human Factors 41 (1), 146–160.
Selcon, S.J., Taylor, R.M., McKenna, F.P., 1995. Integrating multiple information sources: using redundancy in the design of warnings. Ergonomics 38 (11), 2362–2370.
Shapiro, N., Executive Producer, 1994. NBC Dateline. National Broadcasting Corporation, New York.
Sorkin, R.D., 1988. Why are people turning off our alarms? J. Acoust. Soc. Am. 84 (3), 1107–1108.
Tijerina, L., Garrott, W.R., 1997. A reliability theory approach to estimate the potential effectiveness of a crash avoidance system to support lane change decisions. Proceedings of the SAE International Congress and Exposition, Detroit, MI.
Tijerina, L., Hetrick, S., 1997. Analytical evaluation of warning onset rules for lane change crash avoidance systems. Proceedings of the Human Factors and Ergonomics Society 40th Annual Meeting, Philadelphia, USA, pp. 949–953.
Tilin, A., 2002. You are about to crash. Wired Magazine, Issue 10.04. Online: http://www.wired.com/wired/archive/10.04/.
Xiao, Y., Seagull, F.J., 1999. An analysis of problems with auditory alarms: defining the roles of alarms in process monitoring tasks. Proceedings of the 43rd Annual Meeting of the Human Factors and Ergonomics Society, Houston, USA, pp. 256–260.