The co-occurrence of multisensory competition and facilitation





Acta Psychologica 128 (2008) 153–161 | www.elsevier.com/locate/actpsy | doi:10.1016/j.actpsy.2007.12.002

Scott Sinnett a,*, Salvador Soto-Faraco b, Charles Spence c

a Brain and Attention Research Laboratory, Department of Psychology, University of British Columbia, Canada
b ICREA & Parc Científic de Barcelona – Universitat de Barcelona, Catalonia, Spain
c Crossmodal Research Laboratory, Department of Experimental Psychology, University of Oxford, United Kingdom

Received 7 September 2007; received in revised form 3 December 2007; accepted 3 December 2007; available online 22 January 2008

* Corresponding author. Tel.: +1 604 822 0069; fax: +1 604 822 6983. E-mail address: [email protected] (S. Sinnett).

Abstract

Previous studies of multisensory integration have often stressed the beneficial effects that may arise when information concerning an event arrives via different sensory modalities at the same time, as exemplified, for example, by research on the redundant target effect (RTE). By contrast, studies of the Colavita visual dominance effect (e.g., [Colavita, F. B. (1974). Human sensory dominance. Perception & Psychophysics, 16, 409–412]) highlight the inhibitory consequences of the competition between signals presented simultaneously in different sensory modalities. Although both the RTE and the Colavita effect are thought to occur at early sensory levels, and the stimulus conditions under which they are typically observed are very similar, the interplay between these two opposing behavioural phenomena (facilitation vs. competition) has yet to be addressed empirically. We hypothesized that the dissociation may reflect two of the fundamentally different ways in which humans can perceive concurrent auditory and visual stimuli. In Experiment 1, we demonstrated both multisensory facilitation (RTE) and the Colavita visual dominance effect using exactly the same audiovisual displays, simply by changing the task from a speeded detection task to a speeded modality discrimination task. In Experiment 2, the participants exhibited multisensory facilitation when responding to visual targets and multisensory inhibition when responding to auditory targets, while the task was kept constant. These results therefore indicate that both multisensory facilitation and inhibition can be demonstrated in reaction to the same bimodal event.

© 2007 Elsevier B.V. All rights reserved.

PsycINFO classification: 2320

Keywords: Multisensory interaction; Intersensory competition; Facilitation; Inhibition; Auditory; Visual

1. Introduction

It is commonly accepted that the human perceptual system frequently integrates information presented to the various senses (e.g., audition, vision, and touch). However, the literature on multisensory perception also clearly shows that the behavioural consequences of multisensory interaction can range from facilitation on the one hand (e.g., faster response latencies and more accurate responding) to competition on the other (for example, in the form of the sensory dominance of one modality at the expense of another, such as the apparent extinction of auditory stimuli when presented simultaneously with visual stimuli in the Colavita effect; see Colavita, 1974).

Previous behavioural studies of multisensory facilitation have shown that reaction times (RTs) to a target presented in one modality can often be speeded up when it is accompanied by a simultaneously presented task-irrelevant event in another sensory modality (see Forster, Cavina-Pratesi, Aglioti, & Berlucchi, 2002; Fort, Delpuech, Pernier, & Giard, 2002; Nickerson, 1973; Zampini, Torresan, Spence, & Murray, 2007). Similarly, if participants are required to detect targets regardless of their modality (e.g., auditory and/or visual), they are typically much faster if the targets are presented in both modalities at the same time (and from approximately the same spatial location, though see Harrington & Peck, 1998). This phenomenon is typically referred to as intersensory facilitation (see, for example, Forster et al., 2002; Nickerson, 1973).

Molholm, Ritter, Javitt, and Foxe (2004) demonstrated RT facilitation in response to the combination of auditory and visual (bimodal) targets, as compared with the RTs to unimodal auditory and visual target stimuli. This bimodal RT advantage, called the redundant target effect (RTE), was greater than expected from the probability summation of the RT distributions of the responses to the unimodal signals (i.e., on the basis of the race model; see Colonius & Diederich, 2006; Miller, 1982). According to the race model, some decrease in the latency of responses to target stimuli presented simultaneously in two different sensory modalities, as compared with either target presented individually, is expected on purely statistical grounds (as long as the two targets are redundant and have some degree of overlap in their respective latency distributions; see Miller, 1991). Molholm and her colleagues claimed that multisensory facilitation effects such as these can occur early in perceptual processing, potentially modulating neural processes in what have traditionally been considered to be unisensory cortices (see Ghazanfar & Schroeder, 2006).

In contrast with these facilitatory effects of multisensory integration, there are also a number of examples of impaired performance under conditions of multisensory stimulation (e.g., Colavita, 1974; Vatakis & Spence, 2007a; Vatakis & Spence, 2007b), suggesting that the individual sensory signals may not always be integrated after all, or perhaps instead that the integration itself can actually impair performance (Spence, Baddeley, Zampini, James, & Shore, 2003). In a classic example of multisensory stimulation leading to interference, Colavita and his colleagues (Colavita, 1974; Colavita, Tomko, & Weisberg, 1976; Colavita & Weisberg, 1979) presented sequences of intermixed light flashes and tones to observers who were instructed to respond as rapidly as possible with one key to visual stimuli and with another key to auditory stimuli. Bimodal trials, consisting of the simultaneous presentation of a visual flash and an auditory beep, were also presented occasionally (e.g., on 5 out of a total of 35 trials in Colavita, 1974). Surprisingly, despite equally fast and accurate responses to isolated flashes and beeps, the participants almost exclusively pressed the visual key on the rare bimodal trials that were interspersed throughout the experiment (e.g., on 98% of such trials in Experiment 1 of Colavita, 1974).¹
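For reference, the race-model (probability-summation) bound appealed to above can be stated formally as follows (after Miller, 1982); this formulation is supplied here for clarity and is a standard rendering rather than a quotation from the text:

$$P(\mathrm{RT}_{AV} \le t)\;\le\;P(\mathrm{RT}_{A} \le t)+P(\mathrm{RT}_{V} \le t)\quad\text{for all } t,$$

where $\mathrm{RT}_{AV}$, $\mathrm{RT}_{A}$, and $\mathrm{RT}_{V}$ denote response times on bimodal, unimodal auditory, and unimodal visual trials, respectively. An RTE that goes beyond mere statistical facilitation is inferred whenever the empirical bimodal cumulative distribution exceeds this bound at some latency.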

¹ In fact, on many occasions, the participants claimed not to have even heard any sound on these bimodal trials (e.g., on 33% of the trials in Experiment 1 of Colavita, 1974), although it should be noted that many subsequent studies of the Colavita visual dominance effect have reported effects that are much smaller in magnitude (though still highly significant), presumably as a result of their having ruled out a number of methodological shortcomings that were present in Colavita's seminal study (e.g., see Sinnett, Spence, & Soto-Faraco, 2007).

How can these two contrasting manifestations of multisensory interaction be reconciled? One obvious possibility lies in the differing nature of the participants' task. For example, the response to bimodal stimuli in the experiments on visual dominance originally reported by Colavita (1974; see also Colavita et al., 1976; Colavita & Weisberg, 1979) required the participants to discriminate (and to make a decision between two possible response keys), whereas a simple detection response (one key for both modalities) has always been required in studies of the RTE. Yet, intriguingly, the dominant accounts of both the RTE and the Colavita effect appeal to sensory-perceptual processes that operate before any effects arise at the level of response mechanisms.

In a recent study of the Colavita visual dominance effect, Sinnett et al. (2007) observed a slowing of participants' RTs on bimodal trials as compared with their responses to either of the unimodal targets, as well as the typical imbalance in errors when using a discrimination task (i.e., visual dominance, consisting of more visual-only than auditory-only responses among the errors made on bimodal trials; cf. Egeth & Sager, 1977). Interestingly, however, the mean latency of participants' erroneous visual responses to bimodal stimuli was shorter than that of their correct visual responses to unimodal visual stimuli, perhaps suggesting that the visual component of a bimodal stimulus is processed earlier when presented in unison with an auditory stimulus. Although it is difficult to speculate on the basis of RT data from trials on which participants responded erroneously, this result is nevertheless at least consistent with a facilitation of response latencies for visual events on bimodal trials occurring despite participants exhibiting an inhibition of their responses to auditory events (Sinnett et al., 2007).

It is, however, rather difficult to compare intersensory facilitation (e.g., Molholm et al., 2004) and sensory dominance (e.g., Colavita, 1974) directly, due to the significant methodological differences between the paradigms that have been used to study these two phenomena experimentally. For example, unimodal and bimodal targets are usually presented equiprobably in studies of the RTE (e.g., Molholm et al., 2004), whereas bimodal targets have typically been presented relatively infrequently (15%; Colavita, 1974) in the majority of previous experiments on the Colavita visual dominance effect (although see Sinnett et al., 2007). Both the absolute spatial positions of the visual and auditory signals and their relative position (i.e., same vs. different) have also varied widely between studies, which makes it difficult to compare the results of the various experiments directly (see Colavita et al., 1976; Harrington & Peck, 1998; Johnson & Shapiro, 1989; Koppen & Spence, 2007). A further difference is that the experiments designed to look at facilitatory effects usually focus on RTs, while the majority of studies of visual dominance concentrate on the accuracy of participants' responses instead (i.e., on the frequency of the different types of errors made on bimodal trials; although see Egeth & Sager, 1977).²

² Note, however, that Egeth and Sager (1977) expanded upon Colavita's (1974) original findings to show that participants' responses to unimodal auditory stimuli were faster than their responses to the same tones on bimodal trials. This result led Egeth and Sager to conclude that the presence of the visual stimulus interfered with the perceptual processing of the auditory stimulus.

In the present study, we used exactly the same set of materials, design, and basic procedure to further investigate multisensory facilitation and inhibition effects in the same group of participants. In Experiment 1, we tested whether simply changing the demands of the task (from a simple speeded detection task to a three-button discrimination task) would determine whether audiovisual enhancement (RTE) or interference (e.g., visual dominance) would be observed. Based on previous research (e.g., Molholm et al., 2004), we predicted that facilitation would be observed when a simple detection task was used and that interference would be seen when the task required some form of discrimination response instead. In Experiment 2, we tested whether the potential dissociation between facilitatory and inhibitory effects could be observed simply by changing the target modality while keeping all other aspects of the task constant (see Mounts & Tomaselli, 2005, for a discussion of facilitatory and inhibitory effects).

2. Experiment 1

2.1. Participants

Twenty-two undergraduate students (15 females) from the University of Oxford took part in Experiment 1 in exchange for a £5 gift voucher. All of the participants reported normal hearing and normal or corrected-to-normal vision.

2.2. Stimuli and apparatus

Fifty line drawings of common objects chosen from the Snodgrass and Vanderwart (1980) database were used as the visual stimuli. The auditory stimuli consisted of 50 familiar sounds selected from an initial pool of 103 sounds (downloaded from http://www.a1freesoundeffects.com on 01/02/2003) on the basis of their clarity and familiarity, as rated by three observers. The sounds were edited to a length of 350 ms and their average amplitudes were equated using Cool Edit software (Syntrillium Software Corp., California). Two loudspeakers, one placed directly to either side of the computer monitor, were used to present the sounds at an average level of 60 dB, as measured from the participant's ear position. These stimuli were presented in two simultaneous streams (one visual and the other auditory, pseudo-randomly ordered), with a stimulus duration of 350 ms and a 150 ms interstimulus interval (silence and a blank screen; see Fig. 1).


The pictures were randomly rotated 30 degrees to the left or to the right in order to increase task difficulty (cf. Rees, Russell, Frith, & Driver, 1999).

2.3. Procedure

The participants sat 60 cm from a computer screen in a dimly illuminated testing booth. They were instructed to monitor the auditory and visual streams and to respond as rapidly as possible to a predefined unimodal auditory target (the sound of a cat meowing), a predefined unimodal visual target (the picture of a stoplight), or a bimodal target (consisting of the combination of both stimuli). These particular targets had been shown to elicit response latencies equivalent to those of other randomly selected target stimuli in a control study (Sinnett et al., 2007). In the three-key condition, the participants were instructed to make one of three keypress responses (J, K, or L on the computer keyboard) as rapidly as possible each time a visual target, an auditory target, or both targets were presented. The correspondence between the response keys (J, K, L) and the type of target (i.e., visual, auditory, or bimodal) was counterbalanced across participants. In the one-key condition (note that task order was counterbalanced across participants), the participants were instructed to respond with only one response key (B) to all three types of target (i.e., as in a typical study of the redundant target effect; e.g., Molholm et al., 2004). In this novel design, the targets were embedded in the visual and acoustic streams and occurred, on average, once every five stimuli. The target frequencies were based on equal unimodal presentations (40% visual and 40% auditory targets; 60 of each) and a smaller number of bimodal targets (20%; 30 in total), closely replicating the presentation parameters used in previous investigations of visual dominance (see, for example, Colavita, 1974; Colavita & Weisberg, 1979; Colavita et al., 1976); a schematic sketch of how such a stream can be assembled is given after Fig. 1 below. The participants were repeatedly presented with the targets prior to the start of the main experimental session in order to familiarize them with the stimuli.

2.4. Results and discussion

2.4.1. One-button response condition

The participants rarely missed a target (unimodal visual miss rate, 3.0%; unimodal auditory, 3.6%; bimodal, 0.5%). Misses were significantly less frequent on bimodal trials than on either unimodal visual or unimodal auditory trials (p = .024 and p = .023, respectively), but no significant difference was observed in the number of misses made to unimodal visual and auditory targets (p = .448). Response latencies to the unimodal visual targets (501 ms; SE = 11.5) were, on average, faster than responses to the unimodal auditory targets (544 ms; SE = 14.9; p < .001; see Fig. 2a). Responses to bimodal targets (451 ms; SE = 11.0) were significantly faster than those made in response to either unimodal visual or unimodal auditory targets (p < .001 for both comparisons).


[Figure 1 appears here: schematic of the concurrent picture and sound streams (sounds such as Bang, Crash, Tick, Ring, Dong, Meow) unfolding over time, with a bimodal target, a unimodal auditory target, and a unimodal visual target marked; stimulus duration 350 ms, interstimulus interval 150 ms.]

Fig. 1. Illustration of the sequence used in Experiment 1. Pairs of pictures and sounds were presented for 350 ms and separated by a blank screen and silence for 150 ms. The participants were required to monitor both streams and to respond using either one key (B) or three keys (J, K, L; counterbalanced across participants) corresponding to predefined unimodal visual, unimodal auditory, or bimodal targets. The bimodal target consisted of the combination of the two unimodal targets.
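To make the trial structure illustrated in Fig. 1 concrete, the following is a minimal sketch of how a stream with these parameters (350 ms stimuli, 150 ms gaps, and a 40/40/20 split of visual, auditory, and bimodal targets occurring on roughly one stimulus pair in five) could be assembled. The function and parameter names are ours and purely illustrative, and the exact randomization constraints are an assumption; the original presentation software is not described at this level of detail.

```python
import random

STIM_MS, ISI_MS = 350, 150  # stimulus duration and inter-stimulus gap (Experiment 1)
# 60 visual, 60 auditory, and 30 bimodal targets: a 40/40/20 split, 150 targets in total.
TARGETS = ["visual"] * 60 + ["auditory"] * 60 + ["bimodal"] * 30

def build_stream(targets=TARGETS, seed=0):
    """Return an ordered list of trial dicts: distractor picture/sound pairs
    with a target inserted after every few pairs (hypothetical scheme)."""
    rng = random.Random(seed)
    order = list(targets)
    rng.shuffle(order)
    stream = []
    for target in order:
        # Assume 3-5 distractor pairs before each target, so that targets
        # occur, on average, once every five stimulus pairs.
        for _ in range(rng.randint(3, 5)):
            stream.append({"type": "distractor", "duration_ms": STIM_MS, "isi_ms": ISI_MS})
        stream.append({"type": target, "duration_ms": STIM_MS, "isi_ms": ISI_MS})
    return stream

trials = build_stream()
n_targets = sum(t["type"] != "distractor" for t in trials)
print(len(trials), n_targets)  # roughly 750 stimulus pairs, 150 of them targets
```

Under a scheme of this kind, the one-key (detection) and three-key (discrimination) conditions can use exactly the same stream, with only the response mapping changed.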

The facilitation observed on the bimodal target trials was further analyzed to determine whether it represented a violation of the race model (e.g., Miller, 1982), which would suggest an interaction between the auditory and visual stimuli, thereby revealing an RTE. The race model is violated when the probability of a response to a bimodal stimulus having occurred by a given latency is higher than the sum of the probabilities for the two unisensory components at that latency minus a combined probability term (i.e., what would be predicted if the observer efficiently relied on the faster of the two unimodal components on each trial). To test the empirical RTs on bimodal trials against this model, the RT distributions for auditory, visual, and bimodal trials were divided into 20 quantiles (5% bins) for each participant. Next, the group-averaged cumulative probability for bimodal trials was compared, at each quantile, with the prediction of the model derived from the unimodal distributions (P_aud + P_vis − P_aud × P_vis, where P_aud and P_vis denote the cumulative response probabilities of the unimodal auditory and visual distributions at that quantile). The probability of finding fast responses to bimodal targets was higher than that predicted by the race model (p < .05) in the quantiles ranging from 6 (30%) to 11 (55%; see Fig. 3). (An illustrative sketch of this computation is provided at the end of this subsection.)

2.4.2. Three-button response condition

Response latencies to the unimodal visual (705 ms; SE = 14.0) and the unimodal auditory (710 ms; SE = 15.5) targets did not differ significantly (p = .707; see Fig. 2b), whereas responses to the bimodal targets (792 ms; SE = 17.1) were significantly slower than those to either the unimodal visual or the unimodal auditory targets (both p < .001). The participants made relatively few errors when responding to either the unimodal visual or the unimodal auditory targets (7.3% and 9.7% of trials, respectively). The incidence of errors on bimodal trials on which participants responded with either the visual or the auditory unimodal response key (i.e., instead of with the bimodal response key) was much higher (25.2%). On these trials, the participants erroneously pressed only the visual response key significantly more often than they erred by pressing only the auditory response key (16.2% vs. 8.9% of trials, respectively; p = .009).

The data obtained from the three-button response task demonstrate visual dominance, in line with our previous observations (see Sinnett et al., 2007). That is, RTs to bimodal targets were significantly slower than those to either type of unimodal target. This result, combined with the higher incidence of visual-only (erroneous) responses to bimodal targets as compared with auditory-only (erroneous) responses, suggests that the processing of auditory stimuli was somehow inhibited to the benefit of visual processing on the bimodal trials (i.e., visual dominance). Yet, when responding with only a single key (detection task) to all stimulus types, a facilitatory RTE was observed, with faster latencies in response to the bimodal stimuli. The key difference between these conditions was the nature of the task (one-button detection vs. three-button discrimination). Although both of these results have been observed separately before (see Colavita, 1974; Colavita & Weisberg, 1979; Colavita et al., 1976; Molholm et al., 2004), to the best of our knowledge, this is the first time that they have been demonstrated within the same experiment and the same participants, with the methodological differences outlined in the Introduction eliminated. Importantly, however, the way in which these opposing tendencies (competition and facilitation) interact during perception has yet to be investigated empirically.
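For readers who wish to see the quantile-based race-model check spelled out, the following is a minimal sketch of the computation described above. It is our own illustrative reading of the analysis (the array names, the choice of common evaluation points, and the helper function are ours), not the authors' analysis code; the group-level comparison at each quantile is left as indicated in the comments.

```python
import numpy as np

def race_model_check(rt_aud, rt_vis, rt_bim, n_quantiles=20):
    """Compare the bimodal RT distribution with the race-model prediction.

    rt_aud, rt_vis, rt_bim: 1-D arrays of one participant's correct RTs (ms)
    for unimodal auditory, unimodal visual, and bimodal targets.
    Returns (p_bim, p_race): cumulative response probabilities for bimodal
    trials and the race-model prediction P_aud + P_vis - P_aud * P_vis,
    both evaluated at the same n_quantiles time points (5% steps for 20).
    """
    probs = np.arange(1, n_quantiles + 1) / n_quantiles
    # Common evaluation points: quantiles of the pooled RT distribution.
    t = np.quantile(np.concatenate([rt_aud, rt_vis, rt_bim]), probs)

    def cdf(rts):
        # Proportion of RTs at or below each evaluation point.
        return np.searchsorted(np.sort(rts), t, side="right") / len(rts)

    p_aud, p_vis, p_bim = cdf(rt_aud), cdf(rt_vis), cdf(rt_bim)
    p_race = p_aud + p_vis - p_aud * p_vis  # independent-race prediction
    return p_bim, p_race

# Example with synthetic data; in the actual analysis the per-participant
# curves would be averaged and compared quantile by quantile across the group.
rng = np.random.default_rng(1)
p_bim, p_race = race_model_check(
    rng.normal(544, 90, 60),   # synthetic unimodal auditory RTs
    rng.normal(501, 80, 60),   # synthetic unimodal visual RTs
    rng.normal(451, 70, 30))   # synthetic bimodal RTs
violated = p_bim > p_race      # True where the empirical curve exceeds the model
```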


[Figure 2 appears here: bar graphs of reaction time (ms, 400–850) to visual, auditory, and bimodal targets in panels A and B, and of % error rate (5–25) for the visual and auditory response keys in panel C.]

Fig. 2. Reaction time (RT) from the one-response (A) and three-response (B) conditions in Experiment 1. Error rates (C) reflect the percentage of responses involving only the unimodal visual or unimodal auditory response keys to bimodal targets. Note the different pattern of RTs to bimodal targets depending on the number of possible responses required in the task.

Fig. 3. Representation of the test of the race model. The cumulative probability (y-axis) of RTs from the first to the 99th percentile (x-axis) for unimodal visual (dashed line), unimodal auditory (dotted line), and bimodal (black line) responses is shown. Note that the predicted RT distribution based on the race model (grey line; see text for details) is exceeded by the empirical distribution of responses to bimodal targets.

According to the most commonly accepted accounts of intersensory facilitation and visual dominance, both phenomena arise from early sensory interactions. As such, they should be elicited whenever a bimodal stimulus is presented, and should therefore co-occur for stimuli such as those used here. The design of Experiment 1 eliminated most of the stimulus-level differences present between previous RTE and Colavita studies in the published literature, thus ruling out several confounding factors as the source of the contrasting effects (i.e., the percentage of bimodal trials, spatial mismatch, and accuracy vs. latency measures). Yet, the current design still incorporated a fundamental difference in terms of the response set, one that is common to all previous implementations of RTE and Colavita experiments. Therefore, it is still unclear whether facilitation and interference did in fact co-occur at the sensory level, or whether this dissociation can be accounted for solely by differences in the response set. Our second experiment was therefore designed to investigate whether competition and facilitation could be observed within the same task (a speeded detection task using only a single response key) simply by changing the modality to which participants had to respond.

3. Experiment 2

The participants in this experiment had to respond to visual or to auditory stimuli in different blocks of trials in which unimodal and bimodal stimuli were intermixed and presented randomly. If intersensory facilitation and visual dominance can, in fact, coexist, we would expect to observe a very specific pattern of results. When participants respond to the auditory targets, we should observe slower responses to bimodal stimuli than to unimodal stimuli, because the to-be-ignored visual component of the bimodal stimulus will interfere (i.e., supporting the visual dominance results of the three-button task in Experiment 1; see also Egeth & Sager, 1977). When responding specifically to the visual component, however, faster RTs should be observed on bimodal trials than on unimodal trials (in accordance with the results from the one-button condition of Experiment 1 and with previous intersensory facilitation research; Forster et al., 2002; Fort et al., 2002; Nickerson, 1973; Zampini et al., 2007).

3.1. Participants

Twenty undergraduate students (12 females) from the University of British Columbia took part in the experiment in exchange for either course credit or $5 (Canadian).


All of the participants reported normal hearing and normal or corrected-to-normal vision.

3.2. Stimuli and apparatus

The materials were the same as those used in Experiment 1 (i.e., the same two targets were used), except that all of the distractors were removed from the streams, so that target presentations were now separated only by a blank screen and silence. Note that, as only one modality was relevant in this experiment, the participants would have been unable to distinguish between potential non-targets and targets if the distractor stream used in Experiment 1 had been retained. This would have made it impossible to create a purely unimodal response condition (without extensive training prior to the task). It should be noted that analogous visual dominance effects have been observed both with and without distractors (see Sinnett et al., 2007). Furthermore, this methodology (no distractors) closely matches previous experiments on the Colavita effect and on the RTE (see Colavita, 1974; Colavita & Weisberg, 1979; Colavita et al., 1976; Molholm et al., 2004).

3.3. Procedure

At the start of each of the four blocks of trials, the participants were instructed to respond to targets in either the auditory or the visual modality with only one response key (B). The procedure was identical to that reported in Experiment 1 except for the following changes: each block of trials contained 20 auditory stimuli, 20 visual stimuli, and 10 bimodal stimuli (the same relative frequencies as in Experiment 1), and the targets, which occurred on average every 2500 ms (as in Experiment 1), were no longer separated by distractor stimuli. A few additional trials (four per block) were introduced that required the participants to respond to the presentation of a red circle on the screen. These trials were included to ensure that the participants remained attentive to the monitor during the auditory blocks (e.g., so that they did not close their eyes) and were not included in the analysis.

3.4. Results and discussion

Response latencies to the unimodal visual targets (445 ms; SE = 16.4) in the visual blocks and to the unimodal auditory targets (445 ms; SE = 14.7) in the auditory blocks did not differ significantly (p = .995; see Fig. 4). However, the participants responded significantly more slowly to the auditory targets (468 ms; SE = 18.9) than to the visual targets (407 ms; SE = 11.6; p < .001) on the bimodal trials. As hypothesized, when testing for the predicted differences between unimodal and bimodal response latencies (one-tailed tests), RTs to the auditory component of the bimodal targets were slowed relative to the RTs to the unimodal auditory targets (p = .031). Additionally, the participants responded significantly more rapidly to the visual stimuli on the bimodal trials than to the unimodal visual targets (p < .001).

Fig. 4. Reaction time (RT) in Experiment 2. Black bars denote RTs to unimodal targets. Grey bars denote RTs to bimodal targets. The target modality is noted on the x-axis. Note the different pattern of RTs to bimodal targets depending on the target modality.

The participants missed very few of the targets overall (1.6% visual and 0.9% auditory; no significant difference, p = .285) and made relatively few false alarms (responses to stimuli in the other modality: 6.6% responses to sounds when monitoring for visual targets and 3.4% responses to visual events when monitoring for auditory targets; significantly different, p = .01). The participants made errors (misses) very infrequently on the bimodal trials, whether responding to the visual target (0.5%) or detecting the auditory target (1.3%; p = .42).

The most important finding to emerge from the analysis of the results of Experiment 2 was the diverging pattern of response latencies on bimodal trials depending on which modality the participants responded to, despite the fact that the actual task and stimuli were otherwise exactly the same. Specifically, a facilitation of response latencies to visual targets and a slowing of response latencies to auditory targets were seen for the same bimodal trials. This finding suggests that, given the presence of a bimodal (auditory and visual) stimulus, the processing of the visual event was facilitated while, at the same time, the processing of the auditory event was inhibited. These results are further supported by the fact that RTs to unimodal visual and unimodal auditory targets were statistically equivalent (and, serendipitously, identical: both 445 ms) when the targets were presented in isolation, within the same blocks as the bimodal trials. Thus, any claim that the modality difference observed here is due to differences in the basic saliency of the stimuli would appear to be unfounded.

It should also be noted that more false alarms (responses to the non-target modality) were made to irrelevant auditory events (i.e., when monitoring for visual targets) than to irrelevant visual events (i.e., when monitoring for auditory targets), although the overall false alarm rate was low (<7%).


This result is intriguing, as one might have expected the participants to exhibit a higher incidence of false alarms to visual stimuli, given that the response latencies suggest that the visual stimuli cannot be ignored as easily as the auditory stimuli. Having said that, it should also be noted that the addition of the control trials (i.e., the red circles designed to ensure task compliance) may have been responsible for this slight difference. For example, when attending to the auditory targets, a certain amount of attentional resources must have been devoted to the visual stream in order to detect the visual catch trials. This could, in turn, have resulted in a reduced false alarm rate to visual non-targets when attending to the auditory targets.

4. General discussion

In Experiment 1, when the participants simply had to detect all of the target stimuli regardless of their sensory modality (single response key), an RTE was observed on the bimodal trials. This was manifested in terms of faster responses to bimodal stimuli than would have been predicted by the race model on the basis of the unimodal response latencies (see Molholm et al., 2004).³ However, when the participants had to respond to the modality of the targets (with a different response key for targets in each modality), an interference effect was observed on the bimodal trials instead. In this version of the task, participants' RTs to the bimodal targets were significantly slower than those to either unimodal target. Furthermore, when erroneous responses to bimodal targets were made, they more often consisted of participants making the visual-only response rather than the auditory-only response.

³ Other demonstrations of the RTE have used semantically related or unrelated bimodal targets (e.g., the sound of a dog accompanied by a picture of a dog; see, for example, Laurienti, Kraft, Maldjian, Burdette, & Wallace, 2004; Molholm et al., 2004). We decided not to manipulate the semantic relationship between the targets in order to avoid confounding semantic congruency and its effects on multisensory processing. Yet, we still observed an RTE. Future experimentation should focus on any possible increased enhancement for semantically matched bimodal stimuli.

Despite both the RTE and the Colavita effect having been attributed to early stages of processing, it could have been argued that this initial finding of audiovisual facilitation and interference depended entirely on the nature of the motor component of the task, and did not necessarily reflect the existence of opposing patterns of multisensory integration at earlier levels of processing. Experiment 2 further explored the nature of these opposing patterns in the data by controlling for task-related factors as the source of the difference. The results demonstrated that this dissociation persists beyond any effects that may exist at the output stage. That is, multisensory competition and facilitation can be found with exactly the same displays and using the same task, for auditory and visual targets, respectively. This novel finding helps to explain the nature of visual dominance and intersensory facilitation in speeded responses, as it suggests that both phenomena can coexist during the processing of a bimodal stimulus.

It can be argued that the detection task used in Experiment 2 (detect only one type of target, auditory or visual, using only a single response key) involved an implicit discrimination in the form of a go/no-go task, because people had to refrain from responding to the unimodal events in the irrelevant modality (see also Chmiel, 1989). Note, however, that this hardly undermines the interpretation of the present results, because one would still need to explain why facilitation was found on the bimodal trials in the visual blocks under exactly the same task conditions that produced inhibition on the bimodal trials of the auditory blocks. Thus, regardless of how one describes the task used in Experiment 2, the main finding remains that both the intersensory facilitation of visual events and the inhibition of auditory processing were found (see also Colavita, 1974; Colavita & Weisberg, 1979; Colavita et al., 1976; Egeth & Sager, 1977, for previous examples of the mixed-block design).

The coexistence of visual dominance and the RTE seen here has important implications for our understanding of multisensory integration. It is commonly thought that multisensory integration aids the perception of multisensory events by creating a single unitary percept. Indeed, this has often been associated with increased neural activation and with more accurate and faster behavioural responses for bimodal stimuli when compared with their unimodal counterparts. For instance, the presence of an irrelevant visual stimulus can sometimes aid the detection of auditory stimuli (see, for example, Lovelace, Stein, & Wallace, 2003). In recent years, the asymmetrical nature of facilitation and inhibition has also been observed in other experimental paradigms (see Odgaard, Arieh, & Marks, 2003, 2004). These authors suggested that an irrelevant visual stimulus can enhance the perceived loudness of an auditory stimulus and, similarly, that in some conditions the presentation of an irrelevant auditory stimulus can enhance the perceived brightness of a visual stimulus. Of particular interest to the present study, however, is Odgaard et al.'s (2003, 2004) suggestion that the light-induced enhancement of auditory perception reflects an early sensory process, whereas the sound-induced enhancement of light perception reflects a response bias instead. The opposing trends in RT seen in the current experiments can be viewed in a similar way. The findings reported here suggest that, as facilitation and interference sometimes coexist, multisensory integration can also operate by promoting the perception of one sensory modality (vision here) while interfering with the perception of stimuli in another modality (audition here). This dissociation is perhaps reminiscent of the dissociation proposed recently by Miller and D'Esposito (2005), who differentiated between a neural network that mediates crossmodal coincidence (which contributes to the detection of corresponding bimodal stimulation) and one that mediates crossmodal binding or fusion (which would underlie multisensory integration itself; see also Soto-Faraco & Alsius, 2007, for a similar argument).


It would be interesting in future research to try to pinpoint the neural substrates responsible for the concurrent facilitation and inhibition effects seen here. Beauchamp, Lee, Argall, and Martin (2004) have shown that the posterior superior temporal sulcus (pSTS) and the middle temporal gyrus (mTG) are associated with multisensory integration, using stimuli that were similar to those used here. In their experiment, the participants were presented with drawings and sounds (e.g., the picture of a dog accompanied by a barking sound). In the unimodal blocks, the task involved responding to simple true-or-false questions concerning two semantic categories (e.g., "Does the animal walk on four legs?" or "Does the tool require electricity?"). In the bimodal blocks, the participants had to determine whether or not the picture–sound pairing was semantically related. The data from their functional magnetic resonance imaging (fMRI) study showed enhanced activity in the above-mentioned regions for congruent bimodal displays. This multisensory enhancement in neural activity was not reflected in the behavioural data, given that responses to bimodal stimuli were slower than those to unimodal visual or auditory stimuli. It should be noted, however, that neural enhancement does not necessarily lead to improved or faster behavioural performance. Regardless, their findings appear to parallel the dissociation seen in the present results, because they suggest an inhibition at the behavioural level when participants have to make a decision between competing response options, while at the same time showing an increase in brain activation for bimodal stimuli as compared with unimodal ones.

In a similar vein, the participants in a study by Besle, Fort, Delpuech, and Giard (2004) had to detect target syllables that were presented either unimodally (auditory or visual) or bimodally. Contrary to Beauchamp et al.'s (2004) findings, they reported faster responses to bimodal stimuli. However, this may be attributable to the redundancy present in the bimodal target (i.e., the bimodal target was composed of both unimodal targets). This result is also in line with our own, as their participants were not required to perform any discrimination at the response level, and instead only had to respond with a single button press to both unimodal and bimodal stimuli.

One could speculate that the concurrent emergence of opposing patterns in the RT data (see the results of Experiment 2) suggests that the consequences of multisensory integration are expressed in unisensory cortices. This notion is supported by research suggesting decreased neural activity for irrelevant stimuli occurring within a sensory modality (see, for example, Duncan, Humphreys, & Ward, 1997) and across different modalities (Ghatan et al., 1995), together with increased activity in brain areas dealing with relevant stimuli (see, for example, Desimone & Duncan, 1995; Duncan et al., 1997).

This position was highlighted by Ghatan, Hsieh, Petersson, Stone-Elander, and Ingvar (1998), who, using positron emission tomography (PET), reported the coexistence of facilitation and inhibition in the human cortex. In their experiment, the participants had to perform a continuous arithmetic task in either the presence or the absence of irrelevant auditory speech. They found that the irrelevant speech did not affect performance, while at the same time they observed a decrease in activity in the auditory cortex. This finding was coupled with an increase in activity for relevant stimuli in higher-order cortices (e.g., the left posterior parietal cortex, which is important for mathematical processing). While, to date, there have been a number of electrophysiological studies of the RTE (e.g., Besle et al., 2004; Giard & Peronnet, 1999; Molholm et al., 2004), no one has, as yet, attempted to investigate the neural underpinnings of the cost–benefit effects involved in the Colavita visual dominance effect and the RTE as reported here. It is possible that, as highlighted above, separate neural networks could lead to facilitation and competition effects (see McCarley, Mounts, & Kramer, 2007; Miller & D'Esposito, 2005; Mounts & Tomaselli, 2005). Understanding the neural substrates underlying the Colavita visual dominance effect certainly represents a promising area for future study.

Acknowledgment

This research was supported by grants from the Spanish Ministerio de Educación y Ciencia (TIN2004-04363-C03-02, SEJ2007-64103), DURSI (SGR2005-1026), and the Agència de Gestió d'Ajuts Universitaris i de Recerca – AGAUR.

References

Beauchamp, M. S., Lee, K. E., Argall, B. D., & Martin, A. (2004). Integration of auditory and visual information about objects in superior temporal sulcus. Neuron, 41, 809–823.
Besle, J., Fort, A., Delpuech, C., & Giard, M. H. (2004). Bimodal speech: Early suppressive visual effects in human auditory cortex. European Journal of Neuroscience, 20, 2225–2234.
Chmiel, N. (1989). Response effects in the perception of conjunctions of colour and form. Psychological Research, 51, 117–122.
Colavita, F. B. (1974). Human sensory dominance. Perception & Psychophysics, 16, 409–412.
Colavita, F. B., Tomko, R., & Weisberg, D. (1976). Visual prepotency and eye orientation. Bulletin of the Psychonomic Society, 8, 25–26.
Colavita, F. B., & Weisberg, D. (1979). A further investigation of visual dominance. Perception & Psychophysics, 25, 345–347.
Colonius, H., & Diederich, A. (2006). The race model inequality: Interpreting a geometric measure of the amount of violation. Psychological Review, 113, 148–154.
Desimone, R., & Duncan, J. (1995). Neural mechanisms of selective attention. Annual Review of Neuroscience, 18, 193–222.
Duncan, J., Humphreys, G., & Ward, R. (1997). Competitive brain activity in visual attention. Current Opinion in Neurobiology, 7, 255–261.
Egeth, H. E., & Sager, L. C. (1977). On the locus of visual dominance. Perception & Psychophysics, 22, 77–86.
Forster, B., Cavina-Pratesi, C., Aglioti, S. M., & Berlucchi, G. (2002). Redundant target effect and intersensory facilitation from visual–tactile interactions in simple reaction time. Experimental Brain Research, 143, 480–487.

Fort, A., Delpuech, C., Pernier, J., & Giard, M. H. (2002). Dynamics of cortico-subcortical cross-modal operations involved in audio–visual object recognition in humans. Cerebral Cortex, 12, 1031–1039.
Ghatan, P. H., Hsieh, J. C., Petersson, K. M., Stone-Elander, S., & Ingvar, M. (1998). Coexistence of attention-based facilitation and inhibition in the human cortex. Neuroimage, 7, 23–29.
Ghatan, P. H., Hsieh, J. C., Wirsen-Meurling, A., Wredling, R., Eriksson, L., Stone-Elander, S., et al. (1995). Brain activation induced by the perceptual maze test: A PET study of cognitive performance. Neuroimage, 2, 112–124.
Ghazanfar, A. A., & Schroeder, C. E. (2006). Is neocortex essentially multisensory? Trends in Cognitive Sciences, 10, 278–285.
Giard, M. H., & Peronnet, F. (1999). Auditory–visual integration during multimodal object recognition in humans: A behavioral and electrophysiological study. Journal of Cognitive Neuroscience, 11, 473–490.
Harrington, L. K., & Peck, C. K. (1998). Spatial disparity affects visual–auditory interactions in human sensorimotor processing. Experimental Brain Research, 122, 247–252.
Johnson, T. L., & Shapiro, K. L. (1989). Attention to auditory and peripheral visual stimuli: Effects of arousal and predictability. Acta Psychologica, 72, 233–245.
Koppen, C., & Spence, C. (2007). Spatial coincidence modulates the Colavita visual dominance effect. Neuroscience Letters, 417, 107–111.
Laurienti, P. J., Kraft, R. A., Maldjian, J. A., Burdette, J. H., & Wallace, M. T. (2004). Semantic congruence is a critical factor in multisensory behavioral performance. Experimental Brain Research, 158, 405–414.
Lovelace, C. T., Stein, B. E., & Wallace, M. T. (2003). An irrelevant light enhances auditory detection in humans: A psychophysical analysis of multisensory integration in stimulus detection. Cognitive Brain Research, 17, 447–453.
McCarley, J. S., Mounts, J. R. W., & Kramer, A. (2007). Spatially mediated capacity limits in attentive visual perception. Acta Psychologica, 126, 98–119.
Miller, J. (1982). Divided attention: Evidence for coactivation with redundant signals. Cognitive Psychology, 14, 247–279.
Miller, J. (1991). Channel interaction and the redundant-targets effect in bimodal divided attention. Journal of Experimental Psychology: Human Perception and Performance, 17, 160–169.


Miller, L. M., & D'Esposito, M. (2005). Perceptual fusion and stimulus coincidence in the cross-modal integration of speech. Journal of Neuroscience, 25, 5884–5893.
Molholm, S., Ritter, W., Javitt, D. C., & Foxe, J. J. (2004). Multisensory visual–auditory object recognition in humans: A high-density electrical mapping study. Cerebral Cortex, 14, 452–465.
Mounts, J. R. W., & Tomaselli, R. G. (2005). Competition for representation is mediated by relative attentional salience. Acta Psychologica, 118, 261–275.
Nickerson, R. S. (1973). Intersensory facilitation of reaction time: Energy summation or preparation enhancement? Psychological Review, 80, 489–509.
Odgaard, E. C., Arieh, Y., & Marks, L. E. (2003). Cross-modal enhancement of perceived brightness: Sensory interaction versus response bias. Perception & Psychophysics, 65, 123–132.
Odgaard, E. C., Arieh, Y., & Marks, L. E. (2004). Brighter noise: Sensory enhancement of perceived loudness by concurrent visual stimulation. Cognitive, Affective, & Behavioral Neuroscience, 4, 127–132.
Rees, G., Russell, C., Frith, C. D., & Driver, J. (1999). Inattentional blindness versus inattentional amnesia for fixated but ignored words. Science, 286, 2504–2507.
Sinnett, S., Spence, C., & Soto-Faraco, S. (2007). Visual dominance and attention: Revisiting the Colavita effect. Perception & Psychophysics, 69, 673–686.
Snodgrass, J. G., & Vanderwart, M. (1980). A standardized set of 260 pictures: Norms for name agreement, image agreement, familiarity, and visual complexity. Journal of Experimental Psychology: Human Learning and Memory, 6, 174–215.
Soto-Faraco, S., & Alsius, A. (2007). Conscious access to the unisensory components of a cross-modal illusion. Neuroreport, 18, 347–350.
Spence, C., Baddeley, R., Zampini, M., James, R., & Shore, D. I. (2003). Crossmodal temporal order judgments: When two locations are better than one. Perception & Psychophysics, 65, 318–328.
Vatakis, A., & Spence, C. (in press). Evaluating the influence of the "unity assumption" on the temporal perception of realistic audiovisual stimuli. Acta Psychologica. doi:10.1016/j.actpsy.2006.12.002.
Vatakis, A., & Spence, C. (2007). Crossmodal binding: Evaluating the "unity assumption" using audiovisual speech stimuli. Perception & Psychophysics, 69, 744–756.
Zampini, M., Torresan, D., Spence, C., & Murray, M. M. (2007). Audiotactile multisensory interactions in front and rear space. Neuropsychologia, 45, 1869–1877.