The effects of distractor sounds presented through bone conduction headphones on the localization of critical environmental sounds




Applied Ergonomics 61 (2017) 144–158

Contents lists available at ScienceDirect

Applied Ergonomics journal homepage: www.elsevier.com/locate/apergo

Keenan R. May*, Bruce N. Walker

Georgia Institute of Technology, 654 Cherry Street, Atlanta GA 30332, United States

Article info

Article history: Received 20 January 2016; received in revised form 4 January 2017; accepted 15 January 2017

Abstract

Bone conduction headphones are devices that transmit sound through the bones of a listener's head rather than through the air in their outer ear. They have been marketed as a safer way to enjoy audio content while walking, jogging, or cycling. However, listening to distracting sounds over bone conduction may still disrupt a listener's awareness of their auditory environment. The present study investigated the nature of this interference with the faculty of sound source localization, a key prerequisite for generating situation awareness through audio. Participants sat in the middle of a circle of loudspeakers and listened for target sounds played from different directions. Each time they heard a sound, they responded by indicating what direction they judged the sound to have come from. Meanwhile, participants listened to distractor sounds played through bone conduction headphones. Participants heard (1) no distractor sounds, (2) a spoken story that they were instructed to ignore, and (3) the same spoken story that they were instructed to attend to. For conditions (2) and (3), some participants heard a version of the story with background music, while others heard the spoken story without the music. Participants had greater localization error in the distractor-present conditions. Additionally, participants who heard the spoken story with music exhibited greater localization error. However, there was no effect of whether participants ignored or attended to distractors. This pattern was attributed to masking effects, and was more pronounced for narrow-band targets compared to broadband targets. Post-hoc analyses found evidence of a ‘pulling’ effect, in which localization judgments were systematically biased toward the apparent direction of the bone conducted distractors. These results indicate that using bone conduction headphones can be expected to cause a decline in a person's awareness of their environment, in a subtle way that a jogger or cyclist might not be actively aware of, even if their attention is directed to the environment and environmental sounds are readily detectable.

© 2017 Elsevier Ltd. All rights reserved.

Keywords: Bone conduction; Cycling; Distraction; Localization; Audio

* Corresponding author. E-mail addresses: [email protected] (K.R. May), [email protected] (B.N. Walker). http://dx.doi.org/10.1016/j.apergo.2017.01.009 0003-6870/© 2017 Elsevier Ltd. All rights reserved.

1. Introduction

Scenarios in which users interact with technology via headphones while simultaneously moving about the world have become common. There are a variety of reasons to engage in these activities, such as the desire for entertainment, information, or communication in a variety of situations in which vision is occupied elsewhere. For example, listening to music while jogging has been shown to lower mental fatigue and increase run length (Srinivasan et al., 2009). While it is an understandably widespread practice, listening to music while jogging, walking or cycling also presents a safety risk. Lichenstein et al. (2012) reviewed police reports describing cases in which pedestrians were injured while wearing headphones. Of these cases, 70% resulted in a fatality, and 29% included reports of the victim having failed to heed an auditory warning sounded just prior to the incident. Basch et al. (2015) analyzed over 20,000 cases of pedestrians crossing intersections. They found that pedestrians wearing headphones were more likely to cross during “don't walk” periods. Listening to music through in-ear headphones has also been found to be hazardous for cyclists by impeding their ability to detect auditory warning signals (De Waard et al., 2011).

In an effort to provide safer alternatives to traditional headphones, technologies such as active environmental audio pass-through earbuds and bone conduction (BC) headphones are currently being refined into viable consumer products. While technical hurdles remain before audio pass-through devices become a reality for consumers, mature BC devices are readily

K.R. May, B.N. Walker / Applied Ergonomics 61 (2017) 144e158

accessible to consumers. BC headphones work by conducting sound directly into the middle ear by vibrating the bones of the skull. This allows environmental sounds to pass through the outer ear normally, via air conduction (AC). Device manufacturers (Chilli Air II, 2012) and journalists (Rhodes, 2015) have made the claim that BC headphones allow the wearer to maintain awareness of their auditory environment. However, to date little research has been conducted to assess the true efficacy of BC devices at preserving accurate perception of critical environmental sounds.

1.1. Components of auditory distraction

There are several perceptual and cognitive activities that must take place in order for listeners to create accurate mental representations of environmental sounds. Sounds must be delineated from amongst the undifferentiated input stream, the sound's location in space must be computed, and the resulting auditory object must be brought into a person's working memory so that situation awareness (SA) can be generated. While BC headphones do not physically obstruct incoming sounds to the same extent as AC headphones, bone conducted audio still enters the listener's cochlea and is subsequently processed in the same way as any other sound. Thus, BC distractor sounds would be expected to interfere with these perceptual computations in a manner similar to any other sound source.

A prior study attempting to address this assertion focused on the effects of BC distractors on environmental target detection, which is a necessary but not sufficient component of one's SA (Chang-Geun et al., 2011). The authors asked participants to walk on a treadmill while listening to music through BC headphones or traditional headphones, or not listening to any music. Participants were tasked with raising their hand to indicate that they had detected vehicle sounds presented directly behind them. Reaction times and detection rates were used as measures of objective SA.
Subjective SA was measured through self-report. The investigators found that objective and subjective SA were highest in the condition without distractors, followed by the condition with BC distractors, and lowest in the condition with AC distractors. The authors took this to indicate that BC devices have a meaningful advantage over AC devices in terms of facilitating detection of critical sounds. The reader may note that the present study does not attempt to evaluate this claim as such, and is instead focused on comparing BC distractors to unimpeded listening.

Over and above the need to detect alerts and other sounds, the comprehension of safety-critical sounds is another matter, and requires additional information processing. Comprehension of environmental sounds is required to support higher levels of SA that relate to understanding the location and meaning of objects in an environment, rather than whether or not they are present (Endsley, 1995). In order for a person to make an informed decision, they need to be able to accurately generate a spatial model of their environment, containing both this higher level information and references to the environment that support its rapid acquisition on demand (Durso et al., 1995). One key ability that supports this process is that of determining the point of origin of a sound in space, or localizing the sound. Localization occurs through computations that incorporate a variety of properties of a sound source, including interaural time differences, interaural level differences, spectral cues, and room acoustics. If any of this information does not come through or is otherwise disrupted or biased, localization accuracy may be inhibited. Localization of sound sources is a prerequisite for SA derived from auditory stimuli (Scharine and Letowski, 2005), and thus is crucial for cyclists, joggers and pedestrians, whose vision may be otherwise occupied.
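As a rough illustration of one of the cues listed above, the interaural time difference (ITD) for a distant source is often approximated with the spherical-head formula usually attributed to Woodworth, ITD = (a/c)(θ + sin θ), where a is the head radius, c the speed of sound, and θ the source azimuth. The sketch below is not part of the study: the head radius, the speed of sound, and the function name are assumed values chosen only for illustration.

```python
import math

# Assumed constants for illustration; neither value comes from the paper.
HEAD_RADIUS_M = 0.0875        # a commonly used average head radius
SPEED_OF_SOUND_M_S = 343.0    # speed of sound in air at roughly 20 degrees C


def woodworth_itd(azimuth_deg: float) -> float:
    """Approximate ITD in seconds for a distant source.

    azimuth_deg is measured from straight ahead and must lie in [0, 90];
    the spherical-head approximation is only stated for that quadrant.
    """
    if not 0.0 <= azimuth_deg <= 90.0:
        raise ValueError("azimuth_deg must be between 0 and 90 degrees")
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS_M / SPEED_OF_SOUND_M_S) * (theta + math.sin(theta))


if __name__ == "__main__":
    for azimuth in (0, 45, 90):
        print(f"{azimuth:>3} deg: {woodworth_itd(azimuth) * 1e6:.0f} microseconds")
```

Under these assumptions the ITD grows from zero straight ahead to a little over half a millisecond at the side, which gives a sense of how small the timing differences are that a distractor could disturb.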
When examining the challenge faced by cyclists, joggers or pedestrians who endeavor to listen to distracting sounds and maintain auditory SA at the comprehension level, there are several factors to consider. One factor is how cognitively engaged they are with distracting sounds, relative to target sounds; that is, how much are they paying attention? Another factor is how much simultaneous masking can be expected between distractor sounds and environmental target sounds, either rendering target sounds inaudible or causing degraded or biased perception of features of those sounds. The present study focuses on how the two factors of attention allocation and the level of stimulus-driven masking may affect a listener's ability to localize environmental sounds that they can easily detect.

1.2. The effects of masking on localization accuracy

By blocking the ear canal, traditional in-ear or over-ear AC headphones introduce a large level difference between distractor sounds and environmental targets, which can lead to environmental sounds being entirely drowned out. Using BC headphones makes this obstruction effect much less severe, but masking effects can still be expected to occur for sounds that have overlapping spectral density distributions. Since BC and AC sounds both ultimately reach the cochlea, masking would be expected to occur in a manner concordant with prior research on AC distractors played through loudspeakers, with allowance for minor differences due to the different frequencies that can be effectively conducted via bone (Stanley, 2006).

A number of studies have demonstrated the effect that masking distractors tend to have on localization of targets, when both the target and the distractor are presented through loudspeakers. Grohn (2002) placed participants in a virtual sound field and had them point at moving sound sources while attempting to ignore various auditory distractors, and found that median localization error increased from 13° to 17° when distractors were introduced. Suzuki et al. (1993) used a similar method and found that the presence of loudspeaker-presented pink-noise distractors led to an increase in localization error. Langendijk et al. (2001) had participants listen for noise bursts emanating from one of 85 virtual locations while ignoring distractors presented soon after target onset. They found that localization performance degraded as more of these distractors were added. Butler and Naunton (1962) studied the effect of mask sounds presented through one AC headphone on the ability of participants to localize external sound sources presented through a loudspeaker. While this was a different paradigm from the aforementioned studies, it was still one in which environmental sounds were clearly audible in spite of distracting sounds. The authors again found that mask stimuli presented simultaneously with targets led to increased localization error.

The magnitude of the detrimental effect of distractors on localization of targets has been found to depend on several factors. Suzuki et al. (1993) found that the increase in localization error caused by distractor pink noise grew as the intensity of the distractor relative to the target increased. Langendijk et al. (2001) found that the largest errors occurred when distractors and targets emanated from similar locations in space. In Butler and Naunton's 1962 study, localization error tended to be larger when the mask sound was equal in frequency to the target tone. Thus, moderators relating to sound intensity, spectral density distribution, and spatial location all play a role in the extent to which localization abilities may be disrupted.

Additionally, this disruption has been found to occur in a systematic way, with judgments tending to become biased in a consistent direction. Suzuki et al. (1993) found that as frequency overlap between distractors and targets increased, localization judgments increasingly became biased away from the direction of


the distractor. They surmised that when distractor sounds mask target sound input more so in one ear (the ear closer to the distractor) than the other ear (the ear farther from the distractor), components of the target sound entering in the closer ear have a lower perceived intensity than they normally would. This changes the ratio of the two intensities used to compute the interaural intensity difference. Because of this, the target sound's perceived location shifts in the direction of the farther ear. Butler and Naunton (1962) found evidence of this effect as well. Suzuki et al. (1993) noted that distractor-induced localization error was only systematic when the target sound was under 1 kHz. In this range (less than 1.5 kHz), correctly perceiving interaural intensity differences is less critical, due to interaural time/phase difference also being informative (Woodworth, 1938). Conversely, Suzuki et al. observed that for 2 kHz targets, distractors had a larger and less orderly influence on localization judgments.

BC headphones present content in a manner akin to having two loudspeakers presenting distractors, one on each side of the listener's head, and targets may incorporate a variety of frequencies. As such, it is not clear in what direction, if any, localization judgments of a headphone-distracted cyclist, jogger, or pedestrian might be systematically biased. For this study it was expected that masking effects would occur in a manner akin to those seen in studies using a distractor presented from a single loudspeaker, with broadband distractors tending to more heavily mask target sounds and lead to increased localization error. To conclude, in the scenario of joggers or cyclists using BC headphones, there are many cases in which distractor sounds should mask target sounds and lead to degradation in sound localization performance.

1.3. The effects of the listener's attention allocation on localization accuracy

For a sound to be accurately localized, two perceptual computations need to occur: (1) the sound must be segregated from other sounds (put differently, an auditory object must be formed for that sound); and (2) the sound must be localized in space. There is evidence that these processes occur in two separate processing streams (Rauschecker, 1998), although there may be significant cross-talk between these streams (Cloutman, 2013). A person's efficacy in carrying out both of these subtasks is moderated not only by stimulus effects such as simultaneous masking, as previously described, but also by how they choose to direct their attention.

The first computational challenge is the need to delineate crucial target sounds from background noise, other environmental sounds, and artificial distractors. In naturalistic auditory environments, auditory streams are segregated through a perceptual organization process broadly analogous to the application of Gestalt laws in visual perception. This process takes place in the ‘ventral stream’ of auditory processing, which is dedicated to performing the task of auditory stream segregation (Rauschecker, 1998). Evidence suggests that this process is influenced by the active allocation of attention. Carlyon et al. (2001) found increased stream buildup (the segregation of input into more and more auditory streams) amongst an array of tones when participants were asked to attend to those tones, versus when they attended to a distractor sound. Even when participants were asked to attend to a distractor sound that had different frequencies from the target tone cluster, lessened stream buildup amongst the target tone cluster was observed; that is, fewer streams could be identified. This indicates that attention allocation has an effect on stream segregation, even when distractors do not mask target sounds. Cusack et al. (2004) further isolated the effect of attention on stream buildup. They found that presenting targets and distractors in the same or opposite ears had little impact on stream buildup, and that it did not matter whether target and distractor were close or disparate in frequency. However, a robust effect was found for whether a stream cluster was attended or not.

In light of these findings, Cusack et al. (2004) proposed the ‘hierarchical decomposition’ model of stream segregation, which gives both preattentive and attentive processes a role. In this model, basic segregation of an auditory scene (for example, delineating traffic from music) is achieved automatically. Attention can then be directed at one of these high-level groupings, which subjects it to more granular stream segregation. In the traffic example, attending to traffic and not music would allow distinct streams to be formed for each nearby vehicle, at the cost of losing track of the singer and the different instruments. This scenario is common in dynamic environments; thus, allocation of attention may be important for supporting the parsing of complex environments into a number of streams large enough for each stream to be associated with a single meaningful real-world object. This is necessary but not sufficient for the accurate localization of environmental objects (Neuhoff, 2004). Schuett and Walker (2013) surveyed research on the comprehension of multiple auditory data streams, and concluded that in most cases listeners can monitor up to around three streams concurrently. However, while auditory displays can be designed so that streams are distinct, a BC headphone wearer is tasked with sifting through a complex and unpredictable auditory scene, which includes the environment as well as BC-delivered media. As such, ensuring stream segregation may be key to maintaining safe levels of SA in such situations.

The second computational challenge involved in maintaining SA through audio is that of locating an object in space.
This is carried out by a parallel chain of processes that occur in a dorsal auditory processing stream (Rauschecker, 1998). This stream carries out computations relating to determining the spatial location and direction of movement for objects in the auditory scene. This faculty also appears to be moderated by where a listener chooses to direct their attention (Hang et al., 2009).

Thus, sound source localization abilities should be improved when listeners are instructed to allocate more attention to the task of listening for target stimuli. In particular, both (1) the process of auditory stream segregation may be improved, and (2) computations relating to determining the spatial location of sound sources could be improved. However, the presence or absence of meaningful effects of attention allocation on sound source localization depends on the acoustic complexity and spatial busyness of the auditory scene. For example, task component (1) becomes more difficult as background noise increases, and task component (2) becomes more difficult as the number of moving sound sources increases. Finally, as the number of auditory objects that need to be tracked to complete a task increases, maintaining situation awareness using these objects becomes limited by working memory capacity (Fracker, 1987). If the active allocation of attention is a significant factor in mitigating any localization performance decrement, users could be instructed or trained in how to responsibly divide their attention. However, if it is not important under the pattern of listening conditions typical to BC-distracted tasks, modifications made to the BC distractors themselves could be more effective in improving public safety.

1.4. Current study

The current study was aimed at investigating the effects of distractor type as well as attention allocation on sound source localization ability when all distractors were presented through BC headphones. Participants sat inside a circular array of loudspeakers,


inside a hemi-anechoic chamber, while wearing BC headphones. Target sounds were presented through one loudspeaker at a time. Participants were asked to respond to sounds played from these loudspeakers by dragging a circle on a tablet to indicate the direction the sound came from. A within/between-subjects experimental design was used. Each participant experienced three conditions. In the no distractor (baseline) condition, no distractor sounds were played. In the distractor/ignore condition, a spoken story was played, which participants were instructed to ignore as best as they could. In the distractor/attend condition, the same story was played, but participants were instructed to attempt to remember as many details as possible from the story. Participants were not quizzed on their memory of the story, but they were not told whether or not they would be quizzed. The order of these three conditions was counterbalanced. Additionally, there was a between-subjects factor: ‘music presence’. Some participants heard the spoken story alone (similar to an audiobook) whereas others heard the story with musical accompaniment (similar to music with lyrics). Music and speech content are two common types of auditory stimuli that cyclists or joggers might experience that have different component frequencies, and, accordingly, different potential for masking. Specifically, the spoken story with musical accompaniment condition (‘speech+music’) included a wider array of frequencies than the speech condition, and thus was expected to exhibit a stronger masking effect, which would in turn disrupt interaural intensity differences in target sounds and lead to increased localization error compared to the spoken word stimulus.

2. Materials and methods

2.1. Participants

Participants were undergraduate students from a technical university in the Southeastern United States, aged 18–28. There were a total of 24 participants, 14 male and 10 female. Only two had prior experience with BC headphones. Participants were compensated with 1 h of course credit.

2.2. Test environment

The study took place in a hemi-anechoic chamber, in order to minimize reverberation associated with being in an enclosed space. Because only true speaker locations were used as sound sources, participants needed to be unable to see the speakers (Letowski and Letowski, 2012). To enable this, participants were seated inside an acoustically transparent but visually opaque cylindrical curtain made of speaker grill cloth (Fig. 1). The curtain hung from a suspended hula-hoop 8 feet in circumference. Participants were led directly inside the curtain and were not given time to view speaker locations. No participants elected to leave the curtain when taking a break between conditions. Thus, participants were unaware of the exact locations of speakers and unable to visually confirm the possible source of target sounds.

Fig. 1. Testing environment.

2.3. Sound sources

Letowski and Letowski (2012) provided an overview of methods commonly used in auditory localization research. For the present study, a paradigm was used in which the participant is surrounded by stationary speakers and remains motionless during stimulus presentation (Bienvenue and Siegenthaler, 1974). An array of eight speakers was used (Fig. 1), in the manner of Abel et al. (2007). The use of virtual sound fields can itself have a detrimental effect on the accuracy of localization judgments and as such introduces an additional source of performance variation. Thus, each sound was emitted from only one speaker; no between-loudspeaker source angles were used. The speakers were placed on stands two meters away from the participant and adjusted to be close to head height for a person sitting down. Speakers were placed at 0° (straight ahead for the participant), 45°, 90°, 135°, 180° (straight behind the participant), 225°, 270°, and 315°. All sounds were outputted from a laptop and sent through a 16-channel USB mixer into the speakers. A set of wired Aftershokz Sportz M2 BC headphones was used to present distractor stimuli. These made contact with the temporal bones of each participant, just in front of their ears. The signal sent to the BC headphones was passed through two headphone amplifiers in order to achieve the required sound intensity. While state of the art at the time, these BC headphones had approximately 12 dBA (total) of ‘leakage,’ which refers to sound emitted into the air which propagated through the participant's ear via AC.

Fig. 2. iPad with response interface.

Fig. 3. Spectrogram representing the spoken story distractor, which consistently was characterized by moderate intensity (green) in the 0–4 kHz range and low intensity (blue) up to ~14 kHz. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Fig. 4. Spectrogram representing the spoken story with musical accompaniment distractor, which consistently was characterized by moderate intensity (green) in the 0–6 kHz range and low intensity (blue) up to ~19 kHz. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
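The geometry of the eight-speaker array described in Section 2.3 can be written down in a few lines. The sketch below is only an illustration: the paper gives the azimuths and the two-meter distance, but the coordinate convention (x to the listener's right, y straight ahead, azimuth increasing clockwise) and the function name are assumptions made here.

```python
import math

# Azimuths and radius as reported in Section 2.3.
SPEAKER_AZIMUTHS_DEG = [0, 45, 90, 135, 180, 225, 270, 315]
ARRAY_RADIUS_M = 2.0


def speaker_position(azimuth_deg: float, radius_m: float = ARRAY_RADIUS_M):
    """Return (x, y) in meters with the listener at the origin.

    Assumed convention (not stated in the paper): x points to the
    listener's right, y points straight ahead, and azimuth increases
    clockwise when viewed from above.
    """
    theta = math.radians(azimuth_deg)
    return (radius_m * math.sin(theta), radius_m * math.cos(theta))


positions = {az: speaker_position(az) for az in SPEAKER_AZIMUTHS_DEG}

if __name__ == "__main__":
    for az, (x, y) in positions.items():
        print(f"{az:>3} deg -> x = {x:+.2f} m, y = {y:+.2f} m")
```

With this convention the 0° speaker sits two meters directly in front of the listener and the 180° speaker two meters directly behind.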

Fig. 5. Corrected localization error by distractor task type.

Fig. 6. Corrected localization error by music presence.

2.4. Listening task

Letowski and Letowski (2012) also included extended discussion of different response methods in auditory research. Possible methods include verbally indicating a speaker number, verbally responding with an angle, or deictic pointing with the hand or head. Methods that are deictic and/or egocentric tend to lead to smaller and more ecologically valid localization error measurements, but these can be difficult to instrument. The present study used a method in which the participant indicates a direction on a screen relative to a representation of their head viewed from above. While not as valid as deictic methods, this method is still somewhat egocentric, allows for continuous data (preferable to nominal data derived from naming a speaker or an angle), and has been applied previously with success (Stanley, 2006). While Stanley (2006) utilized a mouse and cursor, in this study an iPad tablet and movable touch object were used. Participants were asked to drag the touch object (a fingertip-sized blue dot) from the center of a top-down drawing of their head, outside the edge of a shaded circle (Fig. 2), in the direction of the sound source they most recently heard. The next stimulus was not presented until the participant responded.

Another issue in localization research is whether to allow head movement during the presentation of sounds (ARL, 2012). While the presence of the occluding curtain prevented head movements from adding any visual information, movement of the head during signal presentation would still have provided the participant with additional data with which localization could be improved. In the real world most people would be able to use visual or movement-related cues to augment sound-based localization, and as such


Table 1. Corrected localization error by angle and distractor task type (smaller values reflect better performance): descriptive statistics.

Target angle (°)   No distractor M (SD)   Speech M (SD)     Speech + Music M (SD)
0                  8.57 (9.902)           6.29 (3.709)      12.86 (9.209)
45                 14.52 (10.605)         15.07 (5.526)     22.29 (11.658)
90                 12.43 (7.852)          12.07 (8.766)     17.71 (7.410)
135                21.19 (10.386)         25.36 (7.919)     29.71 (5.376)
180                12.86 (11.646)         12.36 (10.263)    17.14 (11.880)
225                23.62 (11.025)         27.36 (7.313)     32.43 (12.012)
270                18.86 (8.901)          15.64 (8.149)     25.14 (6.568)
315                15.52 (12.675)         17.00 (7.190)     22.71 (15.787)

studies that restrict head movements may be losing validity (Neuhoff, 2004). However, in the scenarios of cycling and jogging to which this research is intended to generalize, the listener may be unable to safely turn their head. Thus, participants were instructed to look straight ahead during the testing period.

2.5. Target stimuli

Six types of target stimuli were used (see Appendix). Two were nature sounds (a cicada buzz and bird chirp), two were roadway alert sounds (a bicycle bell and car horn) and two were artificial broadband sounds (pink noise and white noise). All stimuli were presented at approximately 50 dBA as measured by a decibel meter application. These particular target sounds were included in order to provide a varied sampling of artificial and realistic sounds with different frequency profiles. Each of the six target stimuli was presented once at each of the eight speaker locations, making for a total of 48 trials per condition. While additional presentations would have been useful, a single presentation was used in order to keep the study time manageable.

2.6. Distractor stimuli

The distractor stimulus was an excerpt from the story “Rikki-Tikki-Tavi” read by Emily Topping (Kipling, 2011; Fig. 3). For the spoken story with musical accompaniment condition, a music component was added that consisted of two layers of light acoustic guitar and a drum beat (Fig. 4). This music was recorded by the researchers and overlaid digitally. The spoken story and music did not interact or sync in any way. The music was created to be generally appealing, but highly uniform and lacking in attention-grabbing features or dynamic shifts, and to be unlikely to provoke listener engagement or emotional response. Distractor stimuli were presented in mirrored stereo at approximately 35 dBA for each ear. This is likely on the low end of settings a headphone user might use. The relatively low intensity used for distractors was chosen in order to minimize sound leakage, which became unacceptably loud if distractor intensity was increased. Thus, due to the complete silence in the room and the low distractor sound intensities used, target sounds were, by design, clearly audible.

2.7. Dependent variables

2.7.1. Localization error

Per trial, the participant's response angle was recorded. This was subtracted from the correct angle, and the absolute value of each deviation was taken. These absolute deviations were averaged in order to generate an error score per participant per condition. Letowski and Letowski (2012) described the phenomenon of “reversal errors.” A reversal error occurs when a participant responds with an angle approximately 180° away from the true sound source location. Due to the relative strength of binaural cues compared to spectral cues, front-back or back-front reversal errors tend to be more common than left-right or right-left errors. It is common for reversals to be removed from data, as they constitute outliers and may not occur often in real-world scenarios where other cues are present with which a sound's general direction can be cross-referenced. Thus, in the present study, results were analyzed with the reversals removed, producing a ‘corrected localization error’ measure.
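The per-trial deviation described above can be sketched as follows. This is an illustration rather than the authors' analysis code: the function names are hypothetical, and wrapping the difference around the circle (so that a response of 350° to a 0° target counts as a 10° error) is an assumption that the text does not spell out.

```python
def angular_error(response_deg: float, target_deg: float) -> float:
    """Absolute angular deviation in degrees, in the range [0, 180].

    Wrapping around 360 degrees is an assumption made for this sketch.
    """
    diff = abs(response_deg - target_deg) % 360.0
    return min(diff, 360.0 - diff)


def mean_localization_error(trials) -> float:
    """Average absolute deviation over (response_deg, target_deg) pairs,
    e.g. all trials of one participant in one condition."""
    errors = [angular_error(response, target) for response, target in trials]
    return sum(errors) / len(errors)


if __name__ == "__main__":
    # Hypothetical trials: (response angle, true speaker angle).
    example_trials = [(350.0, 0.0), (45.0, 90.0), (182.0, 180.0)]
    print(mean_localization_error(example_trials))
```

Taking the shorter way around the circle keeps errors near the 0°/360° boundary from being inflated, which matters here because one of the eight speakers sits exactly at 0°.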

2.7.2. Reversals

Accordingly, cases in which the participant's raw response error was greater than 90° were treated as reversal errors (Carlile et al., 1997). These were tallied and divided by the total number of trials to produce a reversal rate. The reversal rate was further divided into front-back and left-right reversals.
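The scoring described in Sections 2.7.1 and 2.7.2 can be sketched in a few lines of Python. This is an illustrative sketch and not the authors' analysis code: the function names are hypothetical, the example trials are made up, and folding the raw difference into the 0° to 180° range is an assumption made here so that a target at 350° and a response at 10° count as 20° apart.

```python
def angular_error(target_deg, response_deg):
    """Unsigned difference between two bearings, folded into [0, 180] degrees.

    The wrap-around step is an assumption; the paper only states that the
    response was subtracted from the correct angle and the absolute value taken.
    """
    diff = abs(target_deg - response_deg) % 360
    return min(diff, 360 - diff)


def score_condition(trials, reversal_threshold=90):
    """Return (corrected_error, reversal_rate) for one participant and condition.

    trials: non-empty list of (target_deg, response_deg) pairs.
    Trials whose raw error exceeds the threshold are counted as reversals and
    left out of the corrected mean.
    """
    if not trials:
        raise ValueError("at least one trial is required")
    errors = [angular_error(t, r) for t, r in trials]
    kept = [e for e in errors if e <= reversal_threshold]
    reversal_rate = (len(errors) - len(kept)) / len(errors)
    corrected_error = sum(kept) / len(kept) if kept else float("nan")
    return corrected_error, reversal_rate


# Made-up responses for illustration only, not data from the study.
example_trials = [(0, 10), (45, 30), (0, 170), (270, 300)]
corrected_error, reversal_rate = score_condition(example_trials)
```

In this example the third trial has a raw error of 170° and is counted as a reversal, so it contributes to the reversal rate but not to the corrected mean.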

Table 2
Corrected localization error by angle and distractor task type (smaller is reflective of better performance), statistical tests and effect size measures.

| Target Angle | No distractor compared to speech: t, p | d | No distractor compared to speech + music: t, p | d | Speech compared to speech + music: t, p | d |
|---|---|---|---|---|---|---|
| 0 | t(13) = -0.760, p = 0.461 | 0.316 | t(13) = 1.46, p = 0.196 | -0.466 | t(19) = 2.360, p = 0.112 | 1.132 |
| 45 | t(13) = 2.267, p = .041 | -0.067 | t(13) = 0.862, p = 0.422 | -0.723 | t(19) = 1.951, p = 0.162 | -0.939 |
| 90 | t(13) = 0.567, p = 0.580 | 0.045 | t(13) = 1.29, p = 0.251 | -0.718 | t(19) = 1.458, p = 0.161 | -0.713 |
| 135 | t(13) = 2.573, p = 0.023 | -0.469 | t(13) = 1.02, p = 0.342 | 1.069 | t(19) = 1.305, p = 0.208 | -0.639 |
| 180 | t(13) = 0.109, p = 0.915 | 0.047 | t(13) = 1.02, p = 0.338 | -0.378 | t(19) = 0.957, p = 0.350 | -0.465 |
| 225 | t(13) = 2.258, p = 0.042 | -0.041 | t(13) = 0.515, p = 0.625 | -0.793 | t(19) = 1.209, p = 0.242 | -0.584 |
| 270 | t(13) = -0.154, p = 0.880 | 0.392 | t(13) = 0.082, p = 0.937 | -0.833 | t(19) = 2.671, p = 0.015 | 1.308 |
| 315 | t(13) = 1.077, p = 0.301 | -0.149 | t(13) = 1.56, p = 0.178 | -0.521 | t(19) = 1.156, p = 0.262 | -0.555 |


2.8. Hypotheses

The first two hypotheses were aimed at evaluating the fundamental assumption that BC headphone-presented stimuli can cause a detriment to localization without affecting target detection. Participants were expected to have significantly greater mean corrected deviation scores in the distractor/ignore and distractor/attend conditions compared to the no distractor condition. Similarly, it was hypothesized that there would be a significant increase in reversal errors of all types in the distractor/ignore and distractor/attend conditions compared with the no distractor condition.

The third and fourth hypotheses centered on the role of attention in moderating the effect of distracting sounds delivered over BC. It was expected that, despite the relative simplicity of the soundscape, participants would have impaired localization when tasked with focusing on features of the BC distractors compared with being tasked to simply ignore the distractors. Thus, it was hypothesized that participants would have a significantly larger mean corrected localization error, and reversal count, in the distractor/attend condition compared to the distractor/ignore condition.

The fifth and sixth hypotheses were designed to answer the question of whether the differing spectral range of real-world distractor sounds could have differing impacts on localization performance. It was expected that simultaneous masking from continuous BC distractors would affect localization performance in a manner similar to studies in which distractor sounds were played over loudspeakers, in which broadband distractors led to greater decrements. Thus, it was hypothesized that there would be a significant increase in corrected localization error and reversal counts for participants who heard the spoken story with musical accompaniment (which incorporated a wider range of frequencies) compared with those who heard only the story.

3. Results

3.1. Analyses

3.1.1. Corrected localization error

A mixed repeated-measures ANOVA showed that the effect of distractor task type on corrected mean localization error (Fig. 5) was significant, F(2,36) = 10.012, p < 0.001. Post hoc paired t-tests using Bonferroni corrections indicated that corrected mean localization error was significantly lower in the no distractor condition (M = 16.887, SD = 1.100) compared to the distractor/ignore condition (M = 19.364, SD = 1.173), t(19) = 3.528, p = 0.006, d = 2.23, as well as compared to the distractor/attend condition (M = 19.949, SD = 1.132), t(19) = 4.065, p = 0.002, d = 2.819. However, the distractor/ignore and distractor/attend conditions were not significantly different from each other (t(19) = 0.372, p = 1.000). The main effect of music presence (Fig. 6) on corrected localization error was significant, F(1,18) = 9.150, p = 0.007, indicating that localization performance was degraded for those who heard the spoken story with musical accompaniment (M = 22.355, SD = 1.507) compared with those who heard the story alone (M = 16.288, SD = 1.066), d = 0.471. The interaction between distractor task type and music presence was not significant,

Fig. 7. Kernel density contour plot of corrected responses for the no distractor condition (top), the spoken word distractor conditions (center), and the speech + music conditions (bottom), for each angle (different colors). Red/shorter lines indicate mean response angles for the closest target angle; nearby labels indicate the average signed deviation of responses for a given angle. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)


Table 3
Signed mean deviations by angle and distractor task type (smaller is reflective of better performance), descriptive statistics.

| Target Angle | No distractor (M, SD) | Speech (M, SD) | Speech + Music (M, SD) |
|---|---|---|---|
| 0 | M = 2.61, SD = 13.048 | M = 1.31, SD = 3.159 | M = 2.29, SD = 13.263 |
| 45 | M = 2.30, SD = 15.10 | M = 5.00, SD = 9.025 | M = 18.86, SD = 14.058 |
| 90 | M = 5.78, SD = 9.68 | M = 8.00, SD = 9.288 | M = 7.57, SD = 10.064 |
| 135 | M = -17.91, SD = 11.723 | M = 24.08, SD = 8.931 | M = 24.29, SD = 13.720 |
| 180 | M = 11.65, SD = 12.78 | M = 7.31, SD = 9.551 | M = 16.71, SD = 12.776 |
| 225 | M = 22.08, SD = 14.08 | M = 27.38, SD = 8.373 | M = 31.43, SD = 13.195 |
| 270 | M = 12.22, SD = 11.67 | M = 10.15, SD = 9.135 | M = 19.71, SD = 9.725 |
| 315 | M = 5.65, SD = 15.79 | M = 9.38, SD = 9.967 | M = 12.00, SD = 19.088 |

F(2,36) = 2.155, p = .131.

3.1.2. Overall reversal rate

A mixed repeated-measures ANOVA showed that the main effect of distractor task type on the overall reversal rate was significant, F(2,36) = 3.548, p = 0.039. After applying Bonferroni corrections, all pairwise group differences were non-significant. However, an unplanned post-hoc Helmert contrast comparing the no distractor condition (M = 0.054, SD = 0.027) to the two distractor conditions together (M = 0.068, SD = 0.037) indicated significant differences between these groups, F(1,18) = 7.259, p = 0.015, d = 0.444. This indicates that reversals were more common in the two distractor conditions compared with the baseline condition. The main effect of music presence on overall reversal rate was non-significant, F(1,18) = 1.911, p = 0.184. The interaction between music presence and distractor task type was also non-significant, F(2,36) = 2.183, p = 0.127. It is worth noting that in all cases the number of reversals was very small.

3.1.3. Left-right reversal rate

A mixed repeated-measures ANOVA showed that the main effect of distractor task type on the left-right reversal rate was non-significant, F(2,36) = 0.252, p = 0.779. The main effect of music presence was non-significant, F(1,18) = 1.145, p = 0.229. The interaction between distractor task type and music presence was non-significant, F(2,36) = 0.252, p = 0.779.

3.1.4. Front-back reversal rate

A mixed repeated-measures ANOVA showed that the main effect of distractor task type on the front-back reversal rate was non-significant, F(2,36) = 2.877, p = 0.069. The main effect of music presence was non-significant, F(1,18) = 0.771, p = 0.391. The interaction between distractor task type and music presence was non-significant, F(2,36) = 1.344, p = 0.274.

3.2. Post-hoc analyses by angle and target type

Following the finding of significant effects of music presence but not of distractor task type on localization error, a set of post-hoc analyses was performed to better characterize the effects of music presence on corrected localization error for each target angle as well as for each target type. For these post-hoc analyses, ‘ignore’ and ‘attend’ data were merged. These analyses were exploratory in nature, and sample sizes were relatively small due to repeated subdivision of the data set. Future research could investigate the trends indicated herein.

3.2.1. Analysis by target angle

3.2.1.1. Corrected localization error. A set of paired and independent samples t-tests was conducted comparing corrected localization error in the no distractor, speech distractor (with merged ignore/attend instructions) and speech + music distractor (with merged ignore/attend instructions) conditions, broken down by each of the 8 target angles (Tables 1 and 2). Results indicated that target sounds coming from some angles were more strongly affected by BC-presented distractors than others (Fig. 7). Targets at 0°, 90°, 180° or 270° were more resilient against distractor effects, while 135° and 225° targets, and to a lesser extent 45° and 315° targets, showed larger increases in error when speech + music was used as a distractor.

3.2.1.2. Signed error. In order to further characterize localization error and determine if systematic directional biasing occurred, signed mean localization error was computed and compared for the no distractor, speech distractor (with merged attend/ignore data),

Table 4
Signed mean deviations by angle and distractor task type (smaller is reflective of better performance), statistical tests and effect size measures.

| Target Angle | No distractor compared to speech: t, p | d | No distractor compared to speech + music: t, p | d | Speech compared to speech + music: t, p | d |
|---|---|---|---|---|---|---|
| 0 | t(13) = 0.525, p = 0.609 | 0.136 | t(13) = -0.418, p = 0.690 | 0.012 | t(19) = -0.313, p = 0.758 | 0.119 |
| 45 | t(13) = -2.467, p = .028 | 0.212 | t(13) = -1.701, p = 0.140 | 1.136 | t(19) = -2.722, p = 0.014 | 1.179 |
| 90 | t(13) = -0.221, p = 0.829 | 0.155 | t(13) = 1.560, p = 0.170 | 0.149 | t(19) = 0.000, p = 1.000 | 0 |
| 135 | t(13) = 2.466, p = 0.028 | 0.614 | t(13) = 1.532, p = 0.176 | 0.519 | t(19) = 0.043, p = 0.966 | 0.019 |
| 180 | t(13) = 0.926, p = 0.371 | 0.278 | t(13) = -1.373, p = 0.219 | 0.358 | t(19) = -1.561, p = 0.135 | 0.684 |
| 225 | t(13) = -1.901, p = 0.080 | 0.382 | t(13) = -0.531, p = 0.614 | 0.691 | t(19) = -1.034, p = 0.314 | 0.452 |
| 270 | t(13) = 0.133, p = 0.896 | 0.248 | t(13) = -0.609, p = 0.565 | 0.656 | t(19) = -2.234, p = 0.038 | 1.022 |
| 315 | t(13) = 1.325, p = 0.208 | 0.256 | t(13) = 1.386, p = 0.215 | 0.973 | t(19) = 0.570, p = 0.575 | 0.234 |

Table 5
Corrected localization error, in degrees, by target type and distractor type (smaller is reflective of better performance), descriptive statistics.

| Target Type | No distractor (M, SD) | Speech (M, SD) | Speech + Music (M, SD) |
|---|---|---|---|
| White noise | M = 12.52, SD = 6.07 | M = 10.79, SD = 4.775 | M = 16.57, SD = 5.855 |
| Bird chirp | M = 17.71, SD = 7.45 | M = 23.86, SD = 5.908 | M = 27.57, SD = 4.962 |
| Pink noise | M = 11.76, SD = 4.77 | M = 10.21, SD = 4.117 | M = 16.57, SD = 7.743 |
| Car horn | M = 16.90, SD = 7.60 | M = 16.29, SD = 5.136 | M = 25.14, SD = 4.776 |
| Bike bell | M = 21.67, SD = 8.200 | M = 24.14, SD = 4.655 | M = 28.86, SD = 1.244 |
| Cicada | M = 16.52, SD = 7.833 | M = 16.29, SD = 4.858 | M = 24.71, SD = 4.536 |

and speech + music distractor conditions (also with merged attend/ignore data, see Tables 3 and 4). Comparisons were done using a set of paired or independent samples t-tests, depending on the type of comparison being made. This procedure revealed that responses for ‘diagonally’ located targets at 45°, 135°, 225° and 315° appear to have been ‘pulled’ toward the BC distractor locations at 90° and 270° (Fig. 7).
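One way to make the notion of a directional ‘pull’ concrete is sketched below. This is an assumption-laden illustration rather than the procedure used in the study: the helper names, the sign convention (response minus target) and the `pull_toward` measure are all hypothetical, and the example responses are invented.

```python
def signed_error(target_deg, response_deg):
    """Response minus target, wrapped into [-180, 180) degrees.

    The sign convention is an assumption; the paper does not spell it out.
    """
    return (response_deg - target_deg + 180) % 360 - 180


def angular_distance(a_deg, b_deg):
    """Unsigned separation between two bearings, in [0, 180] degrees."""
    return abs(signed_error(a_deg, b_deg))


def pull_toward(target_deg, response_deg, attractor_deg):
    """Degrees by which the response lies closer to the attractor than the target does.

    Positive values mean the response moved toward the attractor, for example a
    lateral bearing of 90 or 270 degrees where a BC distractor appears to sit.
    """
    return (angular_distance(target_deg, attractor_deg)
            - angular_distance(response_deg, attractor_deg))


# Invented responses for two diagonal targets, both shifted toward 90 degrees.
front_right = pull_toward(target_deg=45, response_deg=60, attractor_deg=90)
back_right = pull_toward(target_deg=135, response_deg=120, attractor_deg=90)
```

Under this convention the two responses have signed errors of opposite sign (+15° and -15°) but the same positive pull of 15°, which is why a measure relative to the attractor can be easier to read than a raw signed deviation.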

3.2.2. Analysis by target type

To assess the impact of the different distractors on each target, a similar set of paired and independent samples t-tests was conducted. These compared corrected localization error in the no distractor, speech distractor (with merged ignore/attend instructions) and speech + music distractor (with merged ignore/attend instructions) conditions, broken down by each of the 6 target sound types (Tables 5 and 6). Even with the small sample size, localization error differed significantly between the two distractor types for all targets aside from the bird chirp and the bike bell (the latter appeared to be trending toward significance) (Fig. 8). This indicates that a variety of target sounds can be affected by different common distractors presented through BC. Further, even for readily detectable and highly salient alert sounds such as car horns, a person's choice of distractor can be expected to influence their localization performance.
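For readers who want to relate the reported effect sizes to the descriptive statistics, the sketch below computes a standardized mean difference from summary values. It uses one common definition of Cohen's d, with the pooled standard deviation taken as the root mean square of the two standard deviations. The paper does not state which formula was used, so this is an assumption, the function name is hypothetical, and the result should not be expected to reproduce the tabled values exactly. The inputs are the bird chirp means and standard deviations for the no distractor and speech + music conditions as listed in Table 5.

```python
import math


def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Standardized difference (b minus a) using an equal-weight pooled SD.

    Assumes the two standard deviations are weighted equally, which ignores
    any difference in group sizes.
    """
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_b - mean_a) / pooled_sd


# Bird chirp: no distractor (M = 17.71, SD = 7.45) versus
# speech + music (M = 27.57, SD = 4.962), values taken from Table 5.
d_bird_chirp = cohens_d(17.71, 7.45, 27.57, 4.962)
```

With these inputs the function returns roughly 1.56, a large effect by conventional benchmarks and in the same range as the corresponding tabled value.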


4. Discussion

4.1. Relation of results to hypotheses

As expected, corrected localization error was greater in the distractor/ignore and distractor/attend conditions, compared with the no distractor condition. These distractor-present conditions also had an increased number of reversal errors. This indicates that listening to common BC stimuli at moderate intensity, in general, can be expected to have a negative impact on sound source localization accuracy. Other analyses provided insight into the causes of this effect. There was a significant main effect of music presence on corrected deviation scores. This indicates that listening to music with lyrics over BC headphones is likely to be more detrimental to auditory localization than listening to speech-only distractors such as audiobooks, all things being equal. The presence of these effects was most likely due to the greater range of frequencies present in the speech + music condition, and is evidence for the importance of bottom-up stimulus interference effects in this sort of dual-task scenario.

Further evidence for the importance of masking effects was found through subsequent analyses of the different target sounds. The deleterious effects of masking were found to be stronger for the speech + music condition compared with the speech-only condition across almost all targets, including the broadband white and pink noise targets. This indicates that the distractor sounds a person is hearing may be highly important, compared to the type of environmental sound they are listening for, and that even highly salient warning sounds such as car horns are susceptible to localization interference by moderate intensity BC distractors. Further, these masking effects appeared to systematically bias localization judgments toward the apparent direction of distractors at 90° or 270°, instead of away from those directions. This is contrary to what one might have predicted based on prior research on single-loudspeaker distractors.

Overall, these results point to a robust and systematic deleterious effect of continuous BC distractors on localization judgments, most likely through simultaneous masking. The strength of these effects varied depending on the type of target, the type of distractor, and the position of the target. However, while masking effects were robust, the impact of the allocation of the listener's attention was negligible. This can to some extent be attributed to the specifics of the present study and its methodology. This research was conducted in a controlled laboratory setting with a sparse environmental soundscape that did

Table 6
Signed mean deviations by angle and distractor type (smaller is reflective of better performance), statistical tests and effect size measures.

| Target Type | No distractor compared to speech: t, p | d | No distractor compared to speech + music: t, p | d | Speech compared to speech + music: t, p | d |
|---|---|---|---|---|---|---|
| White noise | t(13) = 2.431, p = 0.025 | 0.329 | t(13) = 0.730, p = 0.493 | -0.705 | t(19) = 2.431, p = 0.025 | 1.181 |
| Bird chirp | t(13) = 5.733, p < 0.001 | -0.949 | t(13) = 1.334, p = 0.231 | 1.617 | t(19) = 1.426, p = 0.170 | -0.697 |
| Pink noise | t(13) = -0.248, p = 0.808 | 0.361 | t(13) = 0.952, p = 0.378 | -0.776 | t(19) = 2.485, p = 0.022 | 1.199 |
| Car horn | t(13) = 0.214, p = 0.834 | 0.010 | t(13) = 2.446, p = 0.050 | 1.347 | t(19) = 3.808, p = 0.002 | 1.859 |
| Bike bell | t(13) = 1.914, p = 0.078 | -0.384 | t(13) = 1.351, p = 0.225 | 1.272 | t(19) = 2.042, p = 0.055 | 1.286 |
| Cicada | t(13) = 2.440, p = 0.030 | 0.037 | t(13) = -0.136, p = 0.897 | 1.328 | t(19) = 3.826, p = 0.002 | 1.867 |


Fig. 8. Corrected localization error by target type and music presence.

not reflect the acoustic complexity of a noisy street or other complex auditory environment. Thus, it is possible that monitoring for target sounds did not place enough strain on stream segregation, spatial computations, and/or working memory processes for the active allocation of attention to affect task performance. Future research could incorporate an ecologically accurate target environment that is more challenging to a listener both in terms of target stream overlap and spatial complexity.

It is important to note that, despite the observed performance decline, there were no outright detection failures. The response method did not allow for such errors, since presentation of subsequent stimuli required a response by the participant (i.e., there was no ‘time out’ that would have indicated a missed stimulus presentation). Thus, aside from the occasional trial where the participant may have responded randomly, participants were always able to hear the target sounds clearly enough to respond.

5. Conclusion

In a consumer space where marketing materials and journalists are telling potential users that they can cycle or jog safely while listening to music over BC, this study presents a cautionary tale. A listener may be able to perform the low-level perceptual task of detecting sounds with high accuracy while using BC devices to listen to common entertainment sounds. In low complexity auditory and task environments such as the one studied here, where a listener consciously directs their attention appears to be of minimal consequence. However, regardless of task and environment complexity, sound source localization performance can be expected to degrade when moderate intensity distractors are introduced due to masking effects, with the effect growing as more frequencies or components are introduced. This effect may occur in a systematic way, with localization judgments for nearby target sounds being biased toward the apparent direction of BC distractor sounds.

These findings indicate that pedestrians, joggers and cyclists need to be aware that their perceptions and subsequent judgments may be impaired in subtle ways that are not obvious, due to a graceful degradation of perceptual/cognitive faculties supporting SA. A person may even be diligently applying attention to their environment and not their audio entertainment, and such degradation can still be expected to occur. One reviewer of a set of BC headphones wrote “As long as you're not rocking full volume, you can easily have conversations and hear cars approaching from behind, making it a flexible and safer option if music is a must while riding a bike,” (Turi, 2015). While BC headphones may make it somewhat easier to detect the general presence of nearby hazards, listeners can still be expected to have an impaired ability to know precisely and accurately where those hazards actually are. Thus, while BC devices are likely to be preferable to AC devices that block the majority of incoming sound (such as over-ear AC headphones), they should not be used under the pretense that they are as safe as unimpeded listening. At least one faculty that supports SA will be degraded in a subtle way that may not even be apparent to the listener. Thus, claims about the safety of BC headphones ought to be tempered by guidelines that cyclists and joggers keep the volume down and modify their consumption habits toward spoken word or narrow-band music entertainment. Additionally, novel approaches need to be explored in order to ascertain how BC or AC audio signals can be modified to reduce their interference with SA-supporting processes such as sound source localization.

Acknowledgements

Special thanks to Dr. Frank Clark and Mike Winters for providing equipment and advice, as well as Dr. Kenneth Cunefare for providing access to a hemi-anechoic chamber.


Appendix

Fig. 7. Spectrogram of bike bell target.

Fig. 8. Spectrogram of bird chirp target.


Fig. 9. Spectrogram of cicada target.

Fig. 10. Spectrogram of car horn target.


Fig. 11. Spectrogram of pink noise target.

Fig. 12. Spectrogram of white noise target.


References

Abel, S.M., Tsang, S., Boyne, S., 2007. Sound localization with communication headsets: comparison of passive and active devices. Noise Health 2007 (9), 101-107.
Basch, C.H., Ethan, D., Zybert, P., Basch, C.E., 2015. Pedestrian behavior at five dangerous and busy Manhattan intersections. J. Community Health 40 (4), 789-792.
Bienvenue, G.R., Siegenthaler, B.M., 1974. A clinical procedure for evaluating auditory localization. J. Speech Hear. Disord. 39, 469-477.
Butler, R.A., Naunton, R.F., 1962. Some effects of unilateral auditory masking upon the localization of sound in space. J. Acoust. Soc. Am. 34, 1100-1107.
Carlile, S., Leong, P., Hyams, S., 1997. The nature and distribution of errors in sound localization by human listeners. Hear. Res. 1997 (114), 179-196.
Carlyon, R.P., Cusack, R., Foxton, J.M., Robertson, I.H., 2001. Effects of attention and unilateral neglect on auditory stream segregation. J. Exp. Psychol. Hum. Percept. Perform. 27 (1), 115-127.
Chang-Geun, O., Lee, K., Spencer, P., 2011. Effectiveness of advanced BC earphones for people who enjoy outdoor activities. Proc. Hum. Factors Ergon. Soc. Annu. Meet. 55, 1788-1792.
Chilli Air II, 2012. Bone Conduction Headphones. Retrieved July 06, 2016, from http://www.chilli-tech.com/products/land/bone-conduction-headphones.
Cloutman, L.L., 2013. Interaction between dorsal and ventral processing streams: where, when and how? Brain Lang. 127 (2), 251-263.
Cusack, R., Deeks, J., Aikman, G., Carlyon, R.P., 2004. Effects of location, frequency region, and time course of selective attention on auditory scene analysis. J. Exp. Psychol. Hum. Percept. Perform. 30 (4), 643-656.
De Waard, D., Edlinger, K.M., Brookhuis, K.A., 2011. Effects of listening to music, and of using a handheld and handsfree telephone on cycling behavior. Transp. Res. Part F Traffic Psychol. Behav. 14 (6), 626-637.
Durso, F.T., Truitt, T.R., Hackworth, C.A., Crutchfield, J.M., Nikolic, D., Moertl, P.M., Manning, C.A., 1995. Expertise and chess: a pilot study comparing situation awareness methodologies. Exp. Anal. Meas. Situat. Aware. 295-303.
Endsley, M.R., 1995. Toward a theory of situation awareness in dynamic systems. Hum. Factors J. Hum. Factors Ergon. Soc. 37 (1), 32-64.
Fracker, M.L., 1987. Situation Awareness: a Decision Model. Dayton, OH.
Grohn, M., 2002. Localization of a Moving Virtual Sound Source in a Virtual Room, the Effect of a Distracting Auditory Stimulus. In: Proceedings of the 2002 International Conference on Auditory Display, Kyoto, Japan, July 2-5, 2002.
Hang, B., Hu, R., Yang, Y., Ma, Y., Chang, J., 2009. Surveillance Audio Attention Model Based on Spatial Audio Cues. In: Pacific-Rim Conference on Multimedia. Springer, Berlin Heidelberg, pp. 908-916.
Kipling, R., 2011. Rikki-Tikki-Tavi [Recorded by Topping, E.] on Stories from the Jungle Book [CD]. Saland Publishing.
Langendijk, E.H., Kistler, D.J., Wightman, F.L., 2001. Sound localization in the presence of one or two distracters. J. Acoust. Soc. Am. 109 (5 Pt 1), 2123-2134.
Letowski, T.R., Letowski, S.T., 2012. Auditory spatial perception: Auditory localization (No. ARL-TR-6016). Army Research Lab Aberdeen Proving Ground, MD.
Lichenstein, R., Smith, D., Ambrose, J., et al., 2012. Headphone use and pedestrian injury and death in the United States: 2004-2011. Inj. Prev. 18, 287-290.
Neuhoff, J.G., 2004. Ecological Psychoacoustics. Elsevier Academic Press.
Rauschecker, J.P., 1998. Parallel processing in the auditory cortex of primates. Audiol. Neurotol. 3 (2-3), 86-103.
Rhodes, M., 2015. Concept Headphones That Won't Get You Killed while Biking. Wired. Retrieved July 06, 2016, from http://www.wired.com/2015/07/conceptheadphones-wont-get-killed-biking/.
Scharine, A.A., Letowski, T.R., 2005. Factors Affecting Auditory Localization and Situational Awareness in the Urban Battlefield (No. ARL-TR-3474). Army Research Lab Aberdeen Proving Ground MD Human Research and Engineering Directorate.
Schuett, J.H., Walker, B.N., 2013. Measuring comprehension in sonification tasks that have multiple data streams. In: Proceedings of the 8th Audio Mostly Conference. ACM, p. 11.
Srinivasan, J., Ashwin Kumar, K.M., Balasubramanian, V., 2009. Cognitive effect of music for joggers using EEG. IFMBE Proc. 23, 1120-1123.
Stanley, R.M., 2006. Measurement and Validation of Bone-conduction Adjustment Functions in Virtual 3D Audio Displays. Doctoral dissertation. Georgia Institute of Technology, Atlanta.
Suzuki, Y., Yokoyama, T., Sone, T., 1993. Influence of interfering noise on the sound localization of a pure tone. J. Acoust. Soc. Jpn. 14 (5), 327-339.
Turi, J., 2015. Bone Conduction Headphones Let Me Ditch the Boombox but Still Cycle Safely. Engadget.
Woodworth, R.S., 1938. Experimental Psychology. Holt, Rinehart, Winston, New York.