Sound increases the saliency of visual events


Brain Research 1220 (2008) 157–163

available at www.sciencedirect.com

www.elsevier.com/locate/brainres

Research Report

Toemme Noesselt a,b,⁎, Daniel Bergmann a, Maria Hake a, Hans-Jochen Heinze a, Robert Fendrich a

a Department of Neurology II and Center for Advanced Imaging, Otto-von-Guericke-University, Leipziger Street 44, 39120 Magdeburg, Germany
b Center of Behavioral Brain Sciences, Otto-von-Guericke-University, Magdeburg, Germany

ARTICLE INFO

Article history:
Accepted 17 December 2007
Available online 3 January 2008

Keywords:
Crossmodal
Audiovisual
Visual saliency
Perceptual sensitivity

ABSTRACT

We show that concurrent auditory stimuli can enhance the visual system's ability to detect brief visual events. Participants indicated which of two visual stimuli was briefly blinked off. A spatially non-aligned auditory cue – simultaneous with the blink – significantly enhanced subjects' detection ability, while a visual cue decreased detection ability relative to a no-cue condition. Control experiments indicate that the auditory-driven enhancement was not attributable to a warning effect. Also, the enhancement did not depend on an exact temporal alignment of cue-target onsets or offsets. In combination, our results provide evidence that the sound-induced enhancement is not due to a sharpening of visual temporal responses or apparent prolongation of the visual event. Rather, this enhancement seems to reflect an increase in phenomenal visual saliency.

© 2008 Elsevier B.V. All rights reserved.

1. Introduction

Real world events often stimulate more than one sense, and the integration of information from different sensory modalities can improve behavioural performance. Accordingly, Frassinetti et al. (2002a) reported that concurrent auditory stimuli can enhance the detection of masked visual flashes (see also Bolognini et al., 2005; Frassinetti et al., 2002b; see Lovelace et al., 2003; Odgaard et al., 2003 for visual enhancement of sound). This enhancement was dependent, however, on the exact spatial alignment of visual and auditory signals. In contrast, other studies have shown that auditory stimuli can alter the perception of visual events when the visual and auditory stimuli are not spatially aligned (Noesselt et al., 2005; Recanzone, 2003; Regan and Spekreijse, 1977; Stein et al., 1996; Welch et al., 1986). Stein et al. (1996) have specifically reported an enhancement of the apparent brightness of flashes by co-occurring sounds that does not depend on spatial alignment. However, this study did not demonstrate an effect on detection thresholds, and more recent studies suggest that the phenomenon may have a decisional rather than sensory basis (Marks et al., 2003; Odgaard et al., 2003).

In addition to spatial audiovisual correspondence, temporal relations can play a role in audiovisual integration (Fendrich and Corballis, 2001; Shams et al., 2000; Sheth and Shimojo, 2004). In particular, it has been reported that co-occurring sounds may alter the perceived flash-lag illusion in a manner that suggests a sharpening of the visual events (Vroomen and De Gelder, 2004b; though see Arrighi et al., 2005).

However, all previous studies on sound-induced enhancements of simple visual stimuli have been specific to luminance increments (flashes). It remains to be established whether the observed enhancement would generalize to other types of visual events. Here, we address this issue by testing whether an auditory enhancement of perceptual sensitivity to visual blinks can be observed using a 2-alternative-forced-choice paradigm. Our results demonstrate that a co-occurring auditory stimulus can significantly enhance the detection of such brief visual blinks.

⁎ Corresponding author. Center for Advanced Imaging, Haus 1, Leipziger Street 44, 39104 Magdeburg, Germany. Fax: +49 391 6715233. E-mail address: [email protected] (T. Noesselt).
Abbreviations: AFC, Alternative forced choice; ANOVA, Analysis of Variance; RT, Reaction time; S.E.M., Standard Error of Mean
0006-8993/$ – see front matter © 2008 Elsevier B.V. All rights reserved. doi:10.1016/j.brainres.2007.12.060

2. Results

2.1. Experiment 1

In the first experiment, subjects (n = 16, 8 women, mean age = 23.4) indicated which of two concurrent visual stimuli was very briefly blinked off. A visual, auditory, or combined audiovisual cue could accompany the visual blink event, or the blink could occur uncued. The cue duration matched the target blink duration (see Fig. 1b).

Fig. 1 – a) Task design: Schematic illustration of the visual display (not drawn to scale). The event sequence in the no-cue condition with a blink of the upper circle is illustrated. The visual blink duration was adjusted for each subject (see Methods). When a visual cue was presented, the center fixation X changed to a small circle for the duration of the blink (not shown). b) Timing of cue presentation relative to target presentation: Cue-target intervals for Experiment 1 are depicted in the upper row (2 cue types: no cue; auditory/visual/combined audiovisual cue relative to visual blink), for Experiment 2 in the middle row (3 conditions: no cue; simultaneous auditory cue; early auditory cue) and for Experiment 3 in the lower row (3 conditions: no cue; simultaneous auditory cue; prolonged auditory cue).

Results are shown in Fig. 2a. Observers' detection accuracy was significantly improved by the auditory cue but reduced by the visual cue, relative to the no-cue condition (Fig. 2a, left column). A repeated-measures ANOVA (Analysis of Variance) with the factor cue-type (visual, auditory, visual-auditory, no-cue) showed a significant effect [F(3, 45) = 11.45, p < .001, η² = .433]. In this and subsequent experiments statistical comparisons of the specific conditions were made using post-hoc two-tailed tests. Accuracy averaged 4.2% higher when the sound was presented with the visual blink than in the no-cue condition [t(15) = 4.14, p < .001, d = .50], with 14 of 16 subjects showing the effect (Fig. 2a, middle column), but 4.8% lower with the central visual cue [t(15) = −2.96, p < .01, d = .66, 12 out of 16 subjects]. Accuracy with the combined audiovisual cue was not significantly different from the no-cue condition [t(15) = −.799, p = .44, d = .20], suggesting the opposing auditory and visual effects were additive. The η² values indicate that the factor cue-type can explain more than 40% of the variance in the data (i.e., a medium-to-large-sized effect was obtained). Similarly, the standardised mean differences (d-values) of the significant post-hoc tests all exceeded 0.5, indicating medium-sized effects (Cohen, 1988) commensurate with those observed in other studies on audiovisual enhancement (e.g. McDonald et al., 2000).

The sound position was just below the display screen, raising the possibility that it might have drawn subjects' attention down towards the lower circle. If such a biasing of attention contributed to our results, one would expect that improved accuracy rates would occur primarily when the lower circle blinked. We therefore tested the hit rates for lower vs. upper targets using a repeated-measures ANOVA with the factors cue-type and target position. We found no indication of a difference between the upper and lower hit rates, and no indication of an interaction between target position and cue type [F(3, 45) < 1.21; p > .1]. Finally, we note that both visual and auditory cues eliminated temporal uncertainty of blink occurrence. However, the observed pattern of accuracy results suggests that visual temporal resolution was improved only by the concurrent sound burst.

Reaction time (RT) data (no-cue condition: 722 ms) suggest that this observed enhancement was not due to a speed-accuracy trade-off, since in both the auditory and audiovisual conditions observers were significantly faster than in the visual condition [t(15) = −4.14, p < .001, d = .66 and t(15) = −4.20, p < .001, d = .73 for auditory and audiovisual conditions relative to the visual condition; Fig. 2a, right column]. RTs were slowest in the no-cue condition [t(15) = −3.48, p < .01, d = .27; t(15) = −6.05, p < .001, d = 1.05 and t(15) = −5.32, p < .0001, d = .97 for visual, auditory and audiovisual conditions relative to the no-cue condition].
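
The pairwise comparisons above report t statistics with n − 1 degrees of freedom, which is consistent with paired (within-subject) tests, together with standardised mean differences. The following minimal Python sketch illustrates both computations on made-up accuracy values. The data, the function names `paired_t` and `cohens_d_pooled`, and in particular the pooled-standard-deviation form of d are assumptions for illustration only; the report does not state which variant of d was computed.

```python
import math
from statistics import mean, stdev


def paired_t(cond_a, cond_b):
    """Paired t statistic and degrees of freedom for two within-subject conditions."""
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / standard_error, n - 1


def cohens_d_pooled(cond_a, cond_b):
    """Standardised mean difference using the average of the two variances.

    Assumption: this is only one of several common definitions of d.
    """
    pooled_sd = math.sqrt((stdev(cond_a) ** 2 + stdev(cond_b) ** 2) / 2)
    return (mean(cond_a) - mean(cond_b)) / pooled_sd


if __name__ == "__main__":
    # Hypothetical per-subject accuracy (% correct); NOT data from the study.
    auditory = [70, 66, 72, 64]
    no_cue = [65, 63, 66, 62]
    t_value, df = paired_t(auditory, no_cue)
    print(f"t({df}) = {t_value:.2f}, d = {cohens_d_pooled(auditory, no_cue):.2f}")
```

A two-tailed p value would then be read from the t distribution with the returned degrees of freedom.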

2.2. Experiment 2

Auditory inputs can reach the cortex more rapidly than visual stimuli (e.g. Schroeder and Foxe, 2004). This raises the possibility that although the sound and blink were presented together, the sound improved performance by providing an advance warning that the blink was imminent. A second experiment (n = 8, 5 women, mean age = 22.7) was run to test this hypothesis. We presented a no-cue condition and a simultaneous auditory-cue condition that were identical to the corresponding conditions in the first experiment, and an early-cue condition in which the auditory cue was presented 100 ms prior to the blink (see Fig. 1b). If the improved performance produced by the sound in the first experiment were due to a warning effect, one would expect that a similar or greater detection advantage would occur with the early cue, since it would provide more opportunity for the warning to be utilized. Except for the indicated changes, procedures were identical to those employed in Experiment 1 (see Fig. 1b).

Results are shown in Fig. 2b. We replicated the major finding from Experiment 1: Accuracy was increased by 4.4% in the simultaneous auditory-cue condition compared to the no-cue condition. A repeated-measures ANOVA with the factor cue-type (no-cue, simultaneous, and early auditory cue) showed a significant main effect [F(2, 14) = 5.16, p < .05, η² = .424; Fig. 2b, left column]. The increased accuracy in the simultaneous auditory-cue condition relative to the no-cue condition was statistically significant [t(7) = 2.69, p < .05, d = .80], with 7 of 8 subjects showing this effect (Fig. 2b, middle column). However, performance in the early auditory-cue condition was significantly poorer than in the simultaneous auditory-cue condition [t(7) = −3.314, p < .05, d = 1.10], and was not significantly different from the no-cue condition [t(7) = −.261, p = .80, d = .09]. Thus, despite the observed enhancement in the simultaneous cue condition, we observed no indication of a warning effect. RTs (no-cue condition: 775 ms) in the two auditory conditions were not significantly different [t(7) = 2.09, p = .08, d = .25], but both were shorter than in the no-cue condition [simultaneous: t(7) = −3.05, p < .05, d = .88; early: t(7) = −3.20, p < .05, d = 1.02; Fig. 2b, right column]. Again, more than 40% of the variance could be explained by the independent variable (factor cue-type), and the standardized mean differences exceeded 0.8; both values indicate a large-sized effect (Cohen, 1988).

2.3. Experiment 3

In the two experiments reported above, we employed an auditory cue with the same duration as the blink. Here, we investigated whether the observed perceptual enhancement depended on the alignment of temporal borders of visual transients and auditory cues. In this experiment (n = 8, 4 women, mean age = 23.9) we presented an “aligned-cue” condition in which the auditory cue was simultaneous with the blink and matched its duration (as in the previous experiments), a no-cue condition, and a “prolonged” cue condition in which an auditory cue was temporally centred on the blink but three times its duration (see Fig. 1b). Results are shown in Fig. 2c. Again, we found a significant effect of cue-type [F(2, 14) = 4.77, p < .05, η² = .405]. Accuracy rates were improved in both the simultaneous [mean: 62.4%; t(7) = 3.27, p < .01, d = .70] and prolonged cue conditions [mean: 63.4%; t(7) = 2.43, p < .05, d = .73] relative to the no-cue condition, but these rates did not differ from each other [t(7) = −.58, p = .58, d = .13; Fig. 2c, left column; see Fig. 2c, middle column for individual results]. RTs (no-cue condition: 717 ms) also did not differ between prolonged and simultaneous cue conditions [t(7) = 1.56, p = .16, d = .09; Fig. 2c, right column], but were again significantly shorter than in the no-cue condition [aligned: t(7) = −2.41, p < .05, d = .45; prolonged: t(7) = −2.54, p < .05, d = .53].
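
The cue timings used across the three experiments can be summarised as onset and offset times relative to blink onset. The sketch below is an illustration under stated assumptions: the function name `cue_window` and the condition labels are hypothetical, and the early cue is assumed to have the same duration as the blink, which the text implies but does not state explicitly. The prolonged cue follows the description of a cue centred on the blink with three times its duration.

```python
def cue_window(condition, blink_ms, early_lead_ms=100):
    """Return (onset, offset) of the auditory cue in ms, with blink onset at t = 0.

    condition: "simultaneous", "early" or "prolonged" (hypothetical labels).
    blink_ms:  individually adjusted blink duration in ms.
    """
    if condition == "simultaneous":
        # Cue starts and ends with the blink.
        return (0, blink_ms)
    if condition == "early":
        # Cue starts 100 ms before the blink; equal duration is an assumption.
        return (-early_lead_ms, -early_lead_ms + blink_ms)
    if condition == "prolonged":
        # Cue starts one blink duration before the blink and ends one blink
        # duration after it, giving three times the blink duration in total.
        return (-blink_ms, 2 * blink_ms)
    raise ValueError(f"unknown condition: {condition}")


if __name__ == "__main__":
    for name in ("simultaneous", "early", "prolonged"):
        onset, offset = cue_window(name, blink_ms=14)
        print(f"{name:>12}: {onset} to {offset} ms (duration {offset - onset} ms)")
```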

2.4. Cross-experiment statistical comparisons

A combined analysis comparing the no-cue condition with the temporally aligned auditory-cue condition across all three experiments shows that the observed accuracy advantage was present in 28 of 32 subjects and remained stable across experiments. A repeated-measures ANOVA with the within-subject factor cue (no cue/auditory) showed a significant effect of cue [F(1, 29) = 31.001, p < .001, η² = .52], and this effect did not interact with the between-subject factor experiment (1–3) [F(2, 29) = 0.01, p = .99, η² = .001].

Fig. 2 – Behavioural results for all three experiments. Left column depicts mean accuracy (and standard deviation), middle column individual subject results (with each subject's performance in the different experimental conditions connected with lines), right column mean response times. All results are plotted relative to the no-cue condition. Stars above bar graphs indicate significant effects between conditions. Stars below bar graphs indicate significant effects against the no-cue condition. 2a) (upper row): Results of Experiment 1. 2b) (middle row): Results of Experiment 2 with the temporally aligned or 100 ms early auditory cue. 2c) (lower row): Results of Experiment 3 with the temporally aligned (short) or prolonged auditory cue.
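
As a consistency check, the effect sizes reported for the ANOVAs can be recovered from the F statistics and their degrees of freedom if one assumes that η² denotes partial eta squared, computed as F·df1 / (F·df1 + df2). The report does not say which variant was used, so this is an assumption, and the function name below is illustrative. The F values, degrees of freedom and reported η² values are taken from the Results sections.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio (assumed variant of eta squared)."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (label, F, df_effect, df_error, reported eta squared), all from the text.
REPORTED = [
    ("Experiment 1, cue-type", 11.45, 3, 45, 0.433),
    ("Experiment 2, cue-type", 5.16, 2, 14, 0.424),
    ("Experiment 3, cue-type", 4.77, 2, 14, 0.405),
    ("Combined, cue", 31.001, 1, 29, 0.52),
    ("Combined, cue x experiment", 0.01, 2, 29, 0.001),
]

if __name__ == "__main__":
    for label, f_value, df1, df2, reported in REPORTED:
        computed = partial_eta_squared(f_value, df1, df2)
        print(f"{label}: computed {computed:.3f}, reported {reported}")
```

Under this assumption the computed values agree with the reported ones to within rounding.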

3. Discussion

In combination, the three experiments suggest that audition can enhance the ability of the visual system to detect brief temporal events. We observed this effect although the auditory and visual stimuli were not spatially aligned.

RT data from our experiments demonstrate an RT-advantage even in the auditory-cue conditions in which we observed no accuracy advantage. This argues that the accuracy enhancement is unlikely to be attributable to a simple speed-accuracy trade-off. However, because reaction times were non-speeded, it is difficult to draw strong conclusions regarding the relationship of the RT and accuracy data. It is conceivable that independent of any sensory enhancement the auditory cues served to signal to observers that a response was required, reducing average RTs.

We also note that in Experiment 2 an advance auditory cue did not significantly improve accuracy relative to the no-cue condition, suggesting it fell outside the window of audio-visual integration (Slutzky and Recanzone, 2001). This result is in accord with previous findings (Bolognini et al., 2005; Frassinetti et al., 2002a; Frassinetti et al., 2002b; Vroomen and de Gelder, 2004a) and argues against the role of temporal warning effects (which could have been produced by the more rapid arrival of auditory than visual signals in cortex) in the production of the observed accuracy enhancement in the simultaneous auditory-cue condition.

In Experiment 3 a prolonged auditory cue led to a similar enhancement in performance as a cue matching the blink duration. Thus, the observed performance enhancement did not require an exact temporal edge-alignment of visual and sound stimuli. This argues against the hypothesis that the enhancement was produced by a sharpening of the visual transient response due to the capture of its temporal boundaries by the boundaries of the auditory signal.
Nevertheless, because the prolonged auditory cue overlapped the boundaries of the visual blink, enhancement effects specifically related to the onset or offset of the blink cannot be ruled out. Future studies may help elucidate this matter by parametrically varying the temporal relationship of the visual and auditory stimuli.

The fact that we observed a similar performance enhancement irrespective of the exact alignment of the audio and visual stimuli also argues against an explanation based on a perceived prolongation of the visual stimulus, since this explanation would lead one to expect improved performance with the prolonged auditory cue. Thus, we think the most likely explanation is that the auditory cue serves as an ‘event marker’ (Johnston and Nishida, 2001) that renders the visual representation more salient (Sheth and Shimojo, 2004), and might be based on neural integration times rather than onset alignments (Arrighi et al., 2005).

There have been previous reports that the yoking of visual and auditory stimuli can enhance visual performance (Bolognini et al., 2005; Frassinetti et al., 2002a; Frassinetti et al., 2002b; Vroomen and De Gelder, 2004b). However, in most of the studies that have evaluated visual detection thresholds, the observed enhancement depended on the spatial alignment of the visual and auditory targets (for a review on audiovisual attentional effects see e.g. Spence and McDonald, 2004). For example, Frassinetti et al. (2002a) and Bolognini et al. (2005) both found improved detection of a masked target in an array of possible target locations when a concurrent auditory stimulus was produced by a speaker aligned with that target, but not by an auditory stimulus in another position (Bolognini et al., 2005; Frassinetti et al., 2002a; Frassinetti et al., 2002b). In contrast, we observed an enhancement with an auditory stimulus that was not aligned with our visual targets and provided no predictive spatial information. An analysis of upper vs. lower targets provided no indication that a biasing towards the lower circle (due to speaker position) had an effect on the observed results.

Frassinetti and others (Bolognini et al., 2005; Frassinetti et al., 2002a; Frassinetti et al., 2002b) have argued that the dependence of the threshold improvement on spatial alignment may be explainable by the integration of stimuli in a joint audiovisual spatial map. Based on the work of Stein (Stein and Meredith, 1993) they suggest the superior colliculus as a candidate structure where this integration could occur. Our results indicate that audiovisual spatial alignment is not required for sounds to increase visual accuracy, although we cannot rule out the possibility that the observed effect would have been larger with spatially aligned stimuli. Conceivably, the enhancement effect observed by Frassinetti et al. (2002a) might reflect a special case of multisensory integration that was contingent on their use of a location paradigm in which a target followed by a substitution mask had to be selected from an array of possible target locations.
In such a paradigm spatially aligned concurrent auditory stimuli might indeed facilitate the processing of the spatial location of the target, possibly via top–down feedback mechanisms (Enns and Di Lollo, 2000).

Stein et al. (1996) have reported that concurrent sounds increase the apparent brightness of weak visual flashes using subjective brightness judgements as a measure. However, they did not investigate whether this effect could alter detection thresholds. In fact, another recent study using a bias-free 2-AFC (alternative-forced-choice) procedure replicated Stein et al.'s finding but indicated that it could be explained by a criterion shift (Odgaard et al., 2003). In a combined fMRI-ERP study Busse et al. (2005) reported an increased detection rate for visual targets paired with a non-aligned sound. However, they used a simple yes-no paradigm and did not report false alarms, so their results may also have depended upon a criterion shift.

Vroomen and de Gelder (2000) have reported that targets in a sequence of complex visual scenes are better discriminated if they are associated with a distinctive sound in a concurrent sound sequence. Based on purely subjective data, they attribute their results to a phenomenal ‘freezing’ effect due to a prolongation of the perceived visual event (Vroomen and De Gelder, 2004b). However, it is not evident whether the improved performance reflects an actual sensory enhancement. The reported effect only occurs in visual sequences with repeated masked presentations of a target that would be readily discriminable if not masked. It might therefore result from facilitated transfer of the target into visual memory (i.e. a mask-related rather than a target-related effect), or audiovisual pop-out effects due to stream segregation (O'Leary and Rhodes, 1984). Vroomen and de Gelder (2000) have in fact argued that their effect is mediated by processes operating at the level of segregation. Accordingly, in their experiments auditory sequences were presented several times before the visual sequences started. This approach was used to ‘activate’ such auditory segregation processes, which (as the authors claim) may have influenced the visual stream segregation. In contrast to Vroomen and de Gelder (2000) we employed simple audio-visual pairings; complex sequence-dependent mechanisms do not apply.

Three possible mechanisms suggest themselves to account for the improved performance we observed. First, the visual temporal response profile could have been sharpened by the boundaries of the co-occurring auditory signal, as Vroomen and De Gelder (2004b) have proposed with the flash-lag effect. Second, the auditory cue could have prolonged the perception of the visual stimulus, commensurate with the mechanism suggested by Vroomen and de Gelder (2004a). Third, the auditory signal could have enhanced the saliency of a sensory representation of the visual blink (Sheth and Shimojo, 2004). Based on our results, the first two interpretations seem unlikely. Our observation of a similar visual enhancement with temporally aligned and prolonged auditory cues argues against the first because the alignment of the auditory and visual temporal boundaries did not increase the enhancement, and against the second because the prolongation did not increase the enhancement. Therefore, the third possibility – an enhancement of the representational visual salience at some level of visual processing – seems to fit best with our data. This interpretation would be in accord with reports (Odgaard et al., 2003; Stein et al., 1996) that flashes paired with sounds appear brighter.
However, in our case the effect would be that the blink would appear phenomenally dimmer. Thus, it seems that the direction of the visual change is not important, and that the underlying mechanism rather serves to enhance visual transients per se.

Finally, the neural underpinnings of this effect remain to be established. An anatomical pathway directly connecting low-level visual and auditory cortical areas has been recently reported and could mediate the enhancement (Falchier et al., 2002; Rockland and Ojima, 2003). Alternatively, the audiovisual correspondence could be determined in polysensory areas, which might back-propagate into visual areas to increase the saliency of visual representations (Falchier et al., 2002; Noesselt et al., 2005).

In conclusion, we provide evidence that auditory information can enhance visual detection performance. This sound-induced enhancement is not specific to a luminance increment, since luminance decrements (blinks) were used in all experiments reported here. Rather, it reflects the operation of a general mechanism that acts to increase the saliency of visual sensory representations.

4. Experimental procedures

The experiments reported in this study were performed in accordance with the ethical standards laid down in the 1964 Declaration of Helsinki. Statistical analyses were corrected when necessary (Greenhouse-Geisser for ANOVAs, Bonferroni for post-hoc t-tests).

In all three experiments, the visual stimuli were presented on a Hewlett-Packard 1311 X-Y display (effectively a large-screen oscilloscope) customized with a fast P15 phosphor (decay to 0.1% in 50 μs). The dots used to construct stimulus displays had a luminance of 3.5 cd/m², and were presented against a dark (<1 cd/m²) background screen in a darkened room (Fig. 1). The central fixation cross subtended 0.3°. Each of the two stimulus circles was constructed from 18 dots and was 2° in diameter. The centres of these circles were positioned 1.5° above and below the fixation cross. A change from the fixation cross to a circle served as visual cue. This circle appeared at the start of the blink and remained for the duration of the blink. The size and luminance of the fixation cross and circle were matched. The auditory cues were 500 Hz 66 dBA sine wave bursts, produced by a function generator that was gated by the computer's parallel port, and presented via a 12.7 cm speaker positioned just below the display. The timing of the stimulus presentations was accurate to 2 ms.

Preliminary threshold determination runs for each subject consisted of 24 trials each. At the beginning of each trial the central fixation cross and the two circles appeared. After fixating the central cross, subjects initiated the presentation sequence by pressing a ‘start’ button. One half second after this button press, one circle briefly blinked off. No cue was presented. Subjects used two response buttons to indicate which circle had blinked. They were informed that one of the two circles would blink on every trial, and asked to guess if they failed to detect the location of the blink. Subjects were asked to respond as accurately as possible (non-speeded reaction time task). The blink duration was initially set to 20 ms, which produced ceiling performance in every subject. It was reduced by 2 ms in each subsequent run, until the subject's accuracy fell to or below 80%. This was the blink duration used initially in the subsequent experiment. This duration was adjusted up or down by 2 ms between the experimental blocks if on any block the no-cue detection rate exceeded 80% or fell below 55%. For Experiment 1, the mean blink duration across subjects was 14.6 ms [±1 ms S.E.M. (standard error of mean)] and similar in Experiments 2 (13.4 ± 1 ms S.E.M.) and 3 (12.5 ± 1.2 ms S.E.M.). Because we used a blink rather than a flash, the duration of the stimulus event was not affected by visual persistence, i.e. after-images could not occur. The sound duration matched the blink duration, except for the prolonged cue in Experiment 3.

The general procedure on each trial of all the experiments was the same as in the threshold determination runs, save that cues could be presented. In Experiment 1 there were 7 blocks of 128 trials, 32 with no cue and 32 each with a visual, auditory, or combined audiovisual cue. The cues, if present, served to mark the exact moment when the blink occurred (see Fig. 1b). In all the conditions, the upper circle blinked on half the trials and the lower circle on half. In Experiment 2 (also 7 blocks of 128 trials), the number of cue and no-cue trials was matched, with the cue trials equally divided between a simultaneous and early-cue condition, in which the cue preceded the blink by 100 ms (see Fig. 1b). In Experiment 3 (7 blocks of 128 trials), the number of cue and no-cue trials was also matched, with the cue trials equally divided between a simultaneous and prolonged cue condition in which the cue began one blink duration before the blink and ended one blink duration after the blink (see Fig. 1b). In all experiments the order of the trials was randomised. The first experimental block was regarded as practice and not included in the data analysis.
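
The threshold determination and block composition described above can be sketched as follows. This is a minimal illustration, not the original experiment code: all function names are hypothetical, `run_accuracy` stands in for running a real 24-trial block with a subject, the lower bound `min_ms` is an added safeguard, and the direction of the between-block adjustment (shorter blinks after high accuracy, longer blinks after low accuracy) is inferred from the description rather than stated in it.

```python
import random


def find_initial_duration(run_accuracy, start_ms=20, step_ms=2,
                          criterion=0.80, min_ms=2):
    """Shorten the blink by step_ms per run until accuracy is at or below criterion.

    run_accuracy(duration_ms) -> proportion correct in one threshold run
    (hypothetical callable). min_ms is an assumed safeguard against looping forever.
    """
    duration = start_ms
    while duration > min_ms and run_accuracy(duration) > criterion:
        duration -= step_ms
    return duration


def adjust_between_blocks(duration_ms, no_cue_rate, step_ms=2,
                          upper=0.80, lower=0.55):
    """Adjust the blink duration between blocks from the no-cue detection rate."""
    if no_cue_rate > upper:
        return duration_ms - step_ms  # too easy: shorten the blink (assumed direction)
    if no_cue_rate < lower:
        return duration_ms + step_ms  # too hard: lengthen the blink (assumed direction)
    return duration_ms


def make_block(cue_types=("none", "visual", "auditory", "audiovisual"),
               trials_per_cue=32, rng=None):
    """Build one randomised block with upper/lower targets balanced per cue type."""
    rng = rng or random.Random()
    trials = [(cue, position)
              for cue in cue_types
              for position in ("upper", "lower")
              for _ in range(trials_per_cue // 2)]
    rng.shuffle(trials)
    return trials


if __name__ == "__main__":
    # Simulated observer: at ceiling above 14 ms, 75% correct at or below 14 ms.
    simulated = lambda ms: 1.0 if ms > 14 else 0.75
    print("initial blink duration:", find_initial_duration(simulated), "ms")
    print("trials in an Experiment 1 block:", len(make_block()))
```

With the default arguments `make_block` reproduces the Experiment 1 composition of 128 trials; passing different `cue_types` and `trials_per_cue` values would be needed to model the matched cue and no-cue counts of Experiments 2 and 3.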

Acknowledgments

TN, DB, MH and HJH were funded by DFG-SFB/TR31-TPA8 and RF by LSA-Project FKZ 17. We thank M.A. Schoenfeld and J. Rieger for the helpful discussions.

REFERENCES

Arrighi, R., Alais, D., Burr, D., 2005. Perceived timing of first- and second-order changes in vision and hearing. Exp. Brain Res. 166 (3-4), 445–454.
Bolognini, N., Frassinetti, F., Serino, A., Ladavas, E., 2005. “Acoustical vision” of below threshold stimuli: interaction among spatially converging audiovisual inputs. Exp. Brain Res. 160, 273–282.
Busse, L., Roberts, K.C., Crist, R.E., Weissman, D.H., Woldorff, M.G., 2005. The spread of attention across modalities and space in a multisensory object. Proc. Natl. Acad. Sci. U. S. A. 102, 18751–18756.
Cohen, J., 1988. Statistical power analysis for the behavioral sciences, 2nd ed. Lawrence Erlbaum Associates, Hillsdale, NJ.
Enns, J.T., Di Lollo, V., 2000. What's new in visual masking? Trends Cogn. Sci. 4, 345–352.
Falchier, A., Clavagnier, S., Barone, P., Kennedy, H., 2002. Anatomical evidence of multimodal integration in primate striate cortex. J. Neurosci. 22, 5749–5759.
Fendrich, R., Corballis, P.M., 2001. The temporal cross-capture of audition and vision. Percept. Psychophys. 63, 719–725.
Frassinetti, F., Bolognini, N., Ladavas, E., 2002a. Enhancement of visual perception by crossmodal visuo-auditory interaction. Exp. Brain Res. 147, 332–343.
Frassinetti, F., Pavani, F., Ladavas, E., 2002b. Acoustical vision of neglected stimuli: interaction among spatially converging audiovisual inputs in neglect patients. J. Cogn. Neurosci. 14, 62–69.
Johnston, A., Nishida, S., 2001. Time perception: brain time or event time? Curr. Biol. 11, R427–R430.
Lovelace, C.T., Stein, B.E., Wallace, M.T., 2003. An irrelevant light enhances auditory detection in humans: a psychophysical analysis of multisensory integration in stimulus detection. Cogn. Brain Res. 17, 447–453.
Marks, L.E., Ben-Artzi, E., Lakatos, S., 2003. Cross-modal interactions in auditory and visual discrimination. Int. J. Psychophysiol. 50, 125–145.
McDonald, J.J., Teder-Salejarvi, W.A., Hillyard, S.A., 2000. Involuntary orienting to sound improves visual perception. Nature 407 (6806), 906–908.
Noesselt, T., Fendrich, R., Bonath, B., Tyll, S., Heinze, H.J., 2005. Closer in time when farther in space: spatial factors in audiovisual temporal integration. Brain Res. Cogn. Brain Res. 25, 443–458.
O'Leary, A., Rhodes, G., 1984. Cross-modal effects on visual and auditory object perception. Percept. Psychophys. 35, 565–569.
Odgaard, E.C., Arieh, Y., Marks, L.E., 2003. Cross-modal enhancement of perceived brightness: sensory interaction versus response bias. Percept. Psychophys. 65, 123–132.
Recanzone, G.H., 2003. Auditory influences on visual temporal rate perception. J. Neurophysiol. 89, 1078–1093.
Regan, D., Spekreijse, H., 1977. Auditory-visual interactions and the correspondence between perceived auditory space and perceived visual space. Perception 6, 133–138.
Rockland, K.S., Ojima, H., 2003. Multisensory convergence in calcarine visual areas in macaque monkey. Int. J. Psychophysiol. 50, 19–26.
Schroeder, C.E., Foxe, J.J., 2004. Multisensory convergence in early cortical processing. In: Handbook of Multisensory Processes. MIT Press, Cambridge.
Shams, L., Kamitani, Y., Shimojo, S., 2000. Illusions. What you see is what you hear. Nature 408 (6814), 788.
Sheth, B.R., Shimojo, S., 2004. Sound-aided recovery from and persistence against visual filling-in. Vision Res. 44, 1907–1917.
Slutzky, D.A., Recanzone, G.R., 2001. Temporal and spatial dependencies of the ventriloquism effect. Neuroreport 12, 7–10.
Spence, C., McDonald, J.J., 2004. The cross-modal consequences of exogenous spatial orienting of attention. In: Calvert, G., Spence, C., Stein, B.E. (Eds.), Handbook of Multisensory Processes. MIT Press, Cambridge.
Stein, B.E., Meredith, M.A., 1993. The Merging of the Senses. MIT Press, Cambridge.
Stein, B.E., London, N., Wilkinson, L.K., Price, D., 1996. Enhancement of perceived visual intensity by auditory stimuli: a psychophysical analysis. J. Cogn. Neurosci. 8, 497–506.
Vroomen, J., de Gelder, B., 2000. Sound enhances visual perception: cross-modal effects of auditory organization on vision. J. Exp. Psychol. Hum. Percept. Perform. 26, 1583–1590.
Vroomen, J., De Gelder, B., 2004a. Perceptual effects of cross-modal stimulation: ventriloquism and the freezing phenomenon. In: Calvert, G., Spence, C., Stein, B.E. (Eds.), Handbook of Multisensory Processes. MIT Press, Cambridge.
Vroomen, J., De Gelder, B., 2004b. Temporal ventriloquism: sound modulates the flash-lag effect. J. Exp. Psychol. Hum. Percept. Perform. 30, 513–518.
Welch, R.B., DuttonHurt, L.D., Warren, D.H., 1986. Contributions of audition and vision to temporal rate perception. Percept. Psychophys. 39, 294–300.