Neuropsychologia 67 (2015) 111–120
Boosting pitch encoding with audiovisual interactions in congenital amusia
Philippe Albouy a,b,c,d,*, Yohana Lévêque a,b, Krista L. Hyde d, Patrick Bouchet a,b, Barbara Tillmann a,b,1, Anne Caclin a,b,1
a Lyon Neuroscience Research Center, Brain Dynamics and Cognition Team & Auditory Cognition and Psychoacoustics Team, CRNL, INSERM U1028, CNRS UMR5292, Lyon, F-69000, France
b University Lyon 1, Lyon F-69000, France
c Montreal Neurological Institute, McGill University, 3801 University Street, Montreal, QC, Canada H3A 2B4
d International Laboratory for Brain Music and Sound Research, University of Montreal and McGill University, Canada
Article info
Article history: Received 6 August 2014; received in revised form 3 December 2014; accepted 5 December 2014; available online 11 December 2014
Abstract
The combination of information across senses can enhance perception, as revealed for example by decreased reaction times or improved stimulus detection. Interestingly, these facilitatory effects have been shown to be maximal when responses to unisensory modalities are weak. The present study investigated whether audiovisual facilitation can be observed in congenital amusia, a music-specific disorder primarily ascribed to impairments of pitch processing. Amusic individuals and their matched controls performed two tasks. In Task 1, they were required to detect auditory, visual, or audiovisual stimuli as rapidly as possible. In Task 2, they were required to detect as accurately and as rapidly as possible a pitch change within an otherwise monotonic 5-tone sequence that was presented either only auditorily (A condition), or simultaneously with a temporally congruent, but otherwise uninformative visual stimulus (AV condition). Results of Task 1 showed that amusics exhibit typical auditory and visual detection, and typical audiovisual integration capacities: both amusics and controls exhibited shorter response times for audiovisual stimuli than for either auditory stimuli or visual stimuli. Results of Task 2 revealed that both groups benefited from simultaneous uninformative visual stimuli to detect pitch changes: accuracy was higher and response times shorter in the AV condition than in the A condition. The audiovisual improvements of response times were observed for different pitch interval sizes depending on the group. These results suggest that both typical listeners and amusic individuals can benefit from multisensory integration to improve their pitch processing abilities and that this benefit varies as a function of task difficulty. 
These findings constitute a first step towards exploiting multisensory paradigms to reduce pitch-related deficits in congenital amusia, notably by suggesting that audiovisual paradigms are effective in an appropriate range of unimodal performance. © 2014 Elsevier Ltd. All rights reserved.
Keywords: Multisensory integration; Tone deafness; Pitch processing; Auditory perception; Rehabilitation
* Corresponding author at: Montreal Neurological Institute, McGill University, 3801 University Street, Montreal, QC H3A 2B4, Canada. E-mail address: [email protected] (P. Albouy).
1 Both authors contributed equally to this work.
http://dx.doi.org/10.1016/j.neuropsychologia.2014.12.006
0028-3932/© 2014 Elsevier Ltd. All rights reserved.

1. Introduction
It is well established that the combination of sensory information across senses can modify both quantitatively and qualitatively the perceptual experience (Bizley and King, 2012; Calvert and Thesen, 2004; Ernst and Bulthoff, 2004; Jousmaki and Forss, 1998; McGurk and MacDonald, 1976; Meredith and Stein, 1986; Shams et al., 2000; Stein and Meredith, 1993; von Kriegstein, 2012), whether by disrupting (Bertelson and Radeau, 1981) or enhancing perception (von Kriegstein, 2012). Benefits of multisensory integration upon perception have been observed in various processes, from decreasing reaction times (Gielen et al., 1983; Hershenson, 1962; Posner et al., 1976) to facilitating learning (Seitz et al., 2006). In the specific case of audiovisual integration, the simultaneous presentation of auditory information has been shown to improve visual performance for a variety of tasks, such as discrimination of briefly flashed visual patterns (Vroomen and de Gelder, 2000), detection of a dimly flashed light (Frassinetti et al., 2002a; Frassinetti et al., 2002b; Teder-Salejarvi et al., 2005), and visual detection thresholds (Caclin et al., 2011; Cappe et al., 2010; Stein et al., 1989). Reciprocally, it has been shown that visual stimuli that are congruent with auditory stimuli (i.e. in terms of spatial position
P. Albouy et al. / Neuropsychologia 67 (2015) 111–120
and temporal occurrence) can improve auditory performance (von Kriegstein, 2012) by facilitating speech intelligibility in the presence of other distracting sounds (Ross et al., 2007; Sumby and Pollack, 1954; von Kriegstein, 2012) or by improving sound localization (Shelton and Searle, 1980). For pitch processing, audiovisual facilitative interactions have also been reported in the literature. For example, auditory pitch and visual vertical positions have been described as synesthetic corresponding dimensions (Melara and O’Brien, 1987); this means that judging pitch height is easier when low and high tones are paired with visual stimuli of corresponding lower or higher positions along a vertical axis (Lidji et al., 2007; Rusconi et al., 2006). Interestingly, recent research has shown that intersensory effects depend on participants’ performance when the modalities are presented in isolation. For example, an improvement of visual localization performance by concurrent auditory stimulation has been observed only under conditions of induced myopia (Hairston et al., 2003). Similarly, it has been proposed that cochlear implant users integrate audiovisual information better than normal-hearing participants (Rouger et al., 2007). It is relevant to note that these audiovisual benefits were observed using congruent audio and visual information. However, it is also possible to observe audiovisual facilitation even with uninformative stimuli (Bolognini et al., 2005; Caclin et al., 2011), particularly for participants with impaired performance in the modality of interest. Caclin et al. (2011) observed that uninformative sounds (i.e., sounds conveying no information about the answer to be given in the visual task) can improve visual detection thresholds in myopic participants with poor visual-only performance, – a facilitation effect that was not observed in normal-seeing participants and myopic participants with good visual-only performance. 
These effects have been associated with the inverse effectiveness principle of multisensory integration (Meredith and Stein, 1986; Stein and Meredith, 1993; Wallace et al., 1996), which states that cross-modal interactions are maximally effective when the responses to one of the constituents are weak. Further support for this hypothesis can be found in the observation that the behavioral gain in processing bimodal (relative to unimodal) stimuli is associated with increased activity (neural interactions) in the cortex of the weaker unisensory modality (Giard and Peronnet, 1999). Based on this inverse effectiveness principle, the present study aimed at investigating the effectiveness of facilitatory audiovisual interactions in a condition of impaired auditory processing that has been referred to as congenital amusia (Peretz, 2013; Peretz and Hyde, 2003; Stewart, 2011; Tillmann et al., 2014; Williamson and Stewart, 2013). Congenital amusia is a lifelong disorder of music perception and production. Amusic individuals are unable to recognize a familiar tune (without the help of lyrics) or to detect that someone (including themselves) sings out of tune. This disorder cannot be explained by peripheral hearing loss, brain lesions, or general cognitive impairments (Ayotte et al., 2002; Peretz, 2003; Peretz et al., 2002; Peretz and Hyde, 2003). Hyde and Peretz (2004) reported pitch discrimination difficulties in simple tone sequences in congenital amusia for pitch changes smaller than 1/2 tone, whereas controls detected changes of a 1/8 of a tone (Hyde and Peretz, 2004). Converging evidence for a deficit of fine-grained pitch processing was obtained with various psychoacoustic approaches for an independent sample of amusic participants (Foxton et al., 2004). These studies led to the assumption that congenital amusia arises from a failure to extract pitch information with sufficient accuracy. 
This lack of precision in the first processing steps of auditory perception is bound to affect tonal coding and pitch-related processing in general (Peretz, 2013; Stewart, 2011; Tillmann et al., 2014; Williamson and Stewart, 2013). In the present study, we investigated for the first time the potential effectiveness of multisensory interactions in improving
congenital amusics’ abilities in the auditory encoding of pitch information. To this aim, we first assessed whether amusics exhibit typical audiovisual integration capacities in a simple detection task (Task 1) and tested then whether audiovisual integration can improve processing of more complex pitch material (Task 2). Performance in Task 1 will guide the interpretation of findings of Task 2. It has been previously shown that amusic individuals exhibit deficits in the encoding of tone sequences, but less for the processing of isolated pitch sounds (Tillmann et al., 2014). As in Task 1 participants were required to simply detect an isolated stimulus (without requiring any deeper processing), one can hypothesize typical detection capacities for auditory and visual stimuli in amusics, as well as similar AV facilitation in both groups for Task 1. If confirmed, we hypothesized for Task 2 that facilitatory audiovisual interactions (improving auditory performance with uninformative visual stimulation) would boost performance in a pitch change detection task for both amusics and controls (Task 2). Participants thus performed two tasks. In Task 1, we tested amusics’ and controls’ abilities in detecting auditory (A), visual (V), and audiovisual (AV) stimuli with a speeded detection paradigm. In Task 2, we tested the effect of temporally congruent, but otherwise uninformative visual stimuli upon pitch change detection within five-tone sequences, in amusics and matched controls. The auditory task was adapted from Hyde and Peretz (2004)’s study of pitch change detection in congenital amusia (see above). The focus of interest was the facilitatory influence of uninformative visual stimuli on auditory processing for amusic individuals (with impaired pitch processing, thus a deficit in the auditory modality). 
This represents the mirror situation of previous work from our group reporting a boost in visual processing with uninformative auditory stimuli in visually-impaired participants (Caclin et al., 2011).
2. Material and methods
2.1. Participants
Sixteen amusic adults and sixteen non-musician controls matched for gender, age, educational background, and musical training participated in the study. The amusic group was composed of fourteen right-handed participants and two left-handed participants, and the control group was composed of fifteen right-handed participants and one left-handed participant. Standard audiometry ensured that participants did not have peripheral hearing loss of 25 dB or more at any frequency. No participant reported any history of neurological or psychiatric disease. Participants gave their written informed consent, and were paid for their participation. In a previous testing session, all participants were tested with the Montreal Battery of Evaluation of Amusia (Peretz et al., 2003) and with a two-alternative forced-choice task (using a staircase procedure) to evaluate their pitch discrimination thresholds (Tillmann et al., 2009). Participants’ demographic characteristics and data from the pre-tests are presented in Table 1.
2.2. Montreal Battery of Evaluation of Amusia (MBEA)
All participants were tested with the MBEA (Peretz et al., 2003). In the MBEA, six sub-tests assess various components of music perception and memory along melodic dimensions (i.e., sequential variations in pitch) and temporal dimensions (i.e., sequential variations in duration). To be considered amusic, participants had to obtain an average score on the MBEA below the cut-off score (23.4 on average across the six tasks of the battery, maximum score = 30), the cut-off being two standard deviations below the average of a
normal population (Peretz et al., 2003). All amusics obtained scores below the cut-off score (ranging from 18 to 23), and all controls obtained scores higher than the cut-off score (ranging from 23.83 to 28.83).
2.3. Pitch discrimination threshold (PDT)
PDTs were determined using a two-alternative forced-choice task with an adaptive tracking, two-down/one-up staircase procedure (see Tillmann et al., 2009, for details). The average PDT of the amusic group (ranging from 0.13 to 4 semitones) was higher (worse) than that of the control group (ranging from 0.05 to 0.95 semitones). In agreement with previous findings, we observed an overlap in pitch discrimination thresholds between amusic and control groups (Albouy et al., 2013a, 2013b; Foxton et al., 2004; Jones et al., 2009; Liu et al., 2010; Tillmann et al., 2014, 2009).

Table 1. Demographic characteristics of participants and data in behavioral pretests. Educational background is calculated in years of education starting from the first year of primary school in the French system, at about 6 years of age. Results on the Montreal Battery of Evaluation of Amusia (MBEA) are expressed as the number of correct responses (average over the six sub-tests of the battery, maximum score = 30; and average over the three melodic sub-tests, maximum score = 30). Pitch Discrimination Threshold (PDT) scores are reported in semitones. Data are reported as a function of group, along with significance levels on corresponding t-tests; “NS” refers to a non-significant difference (p > .05). Standard deviations are in parentheses.
Characteristics | Amusics (n = 16) | Controls (n = 16) | t-Test
Age in years | 34.87 (12.49) | 34.75 (10.69) | NS
Gender | 10 female, 6 male | 10 female, 6 male |
Education in years | 13.87 (2.06) | 14.65 (2.41) | NS
Musical education in years | 0.71 (1.48) | 0.56 (1.36) | NS
MBEA (Peretz et al., 2003): mean score (cut-off score = 23.4) | 21.17 (1.51) | 26.75 (1.08) | t(30) = 12.32, p < .0001
MBEA (Peretz et al., 2003): melodic sub-tests (cut-off score = 21.6) | 20.25 (2.53) | 27.16 (2.13) | t(30) = 19.80, p < .0001
Pitch Discrimination Threshold (Tillmann et al., 2009): threshold in semitones | 1.12 (1.11) | 0.24 (0.22) | t(30) = 3.06, p = .004

2.4. Experimental set-up for the audiovisual tasks
The experiment took place in a sound-attenuated booth. Presentation software (Neurobehavioral Systems, Albany, CA, USA) was used to control the presentation of stimuli and record participants’ responses given by button presses on a computer mouse. The mouse was custom-made to directly send signals to the parallel port of the computer (interrupt signals), thus providing precise Response Time (RT) measurements. A 36 × 27 cm² CRT computer screen (1024 × 768 pixels) was located 95 cm in front of the participant. The screen background color was grey and had a luminance of 20 cd/m². The visual stimuli were white disks (150-pixel diameter) of 60 cd/m² luminance for the two tasks. Auditory stimuli were piano tones (adapted from Hyde & Peretz, 2004) presented using two speakers located at each side of the screen at a level of 70 dB SPL A, measured at the participants’ head using a Brüel & Kjær type 2239 sound level meter.
2.5. Task 1: Speeded detection of auditory, visual, and audiovisual stimuli
Congenital amusics and matched controls performed one block of a simple detection task comprising auditory, visual, and audiovisual stimuli. Participants were required to press a button as rapidly as possible when a stimulus appeared (A, V, or AV), irrespective of the nature of the stimulation. They responded with their right hand by clicking on the left mouse button with their index finger (the three left-handed participants were free to use their left hand, but all were right-handed mouse users). The sound stimulation was a piano tone of 100 ms duration, played at the pitch level of C6 (1047 Hz), and synthesized using a Roland SC 50 sound canvas (Roland Corporation, Los Angeles, CA, USA). The visual stimulation was a single white disk presented at the center of the screen for a duration of 100 ms. There were 90 trials in total with 30 trials per condition (A, V, AV); the trials were presented in a pseudo-randomized order with the constraint that the same trial type (i.e. A, V, or AV) could not be repeated more than three times in a row. Note that for the AV condition, auditory and visual stimuli were presented simultaneously. Participants were given 1000 ms to respond after the onset of any stimulus (disk, sound, disk and sound simultaneously). Inter-trial intervals varied randomly from 1000 to 2000 ms. Before the main task, participants performed a set of 15 practice trials. For each trial, we measured the RT, calculated from the onset of the stimulus, to investigate for both groups whether a potential benefit of AV integration (shorter RT) could be observed in comparison to unimodal stimulation (A, V).
2.6. Task 2: Pitch change detection task for audio and audiovisual material
The Pitch Change Detection task (PCD) was an adaptation of the pitch task used in Hyde and Peretz (2004). Participants performed the PCD in two different conditions (either Audio only (A) or Audiovisual (AV), where the auditory stimuli were presented with uninformative visual stimuli). In both conditions, they had to indicate whether the fourth tone within a sequence of 5 isochronous tones was changed in pitch in comparison to the other tones of the sequence. The stimuli were 100-ms piano tones played at C6 (1047 Hz, the same piano tone as the one used in Task 1) presented with a 250 ms Inter-Tone-Interval (that is, an SOA of 350 ms), leading to a total sequence length of 1500 ms. For different trials, the fourth tone was displaced by either a 1/4 tone, a 1/8 tone or a 1/16 tone (50, 25, and 12.5 cents respectively, with 100 cents corresponding to a 1/2 tone) upward or downward from C6. In the A condition, participants performed the PCD with the auditory stimulation only. In the AV condition, they performed the PCD with simultaneously presented uninformative visual stimuli displayed on the screen. The visual stimulation consisted of 5 disks (with the same characteristics as the disk used in Task 1) appearing consecutively from the left to the right of the screen at the middle of its vertical axis (x1 = −384, x2 = −171, x3 = 0, x4 = 171, x5 = 384, in pixels relative to the screen center). Each visual disk appeared for a duration of 100 ms simultaneously with one of the tones. Importantly, the visual stimulation was uninformative with respect to whether or not an auditory change was present in the sequence. For each condition (A, AV), there were 192 trials: 96 trials without changed tone and 96 trials with changed tone (with equal proportion (16 trials) of each of the 6 pitch changes). Conditions were presented in four alternating blocks (two of each condition), with the starting condition being counterbalanced across participants.
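The stimulus arithmetic above (cents offsets around C6 and the timing of the five-tone sequence) can be checked with a minimal sketch. It assumes the standard relation between cents and frequency ratio (1200 cents per octave); the function and variable names are hypothetical and not taken from the original study.

```python
C6_HZ = 1047.0   # nominal C6 frequency given in the text
TONE_MS = 100    # tone duration
GAP_MS = 250     # inter-tone interval

def shift_by_cents(freq_hz, cents):
    """Frequency after shifting by `cents` (assumes 1200 cents per octave)."""
    return freq_hz * 2.0 ** (cents / 1200.0)

def tone_onsets_ms(n_tones=5, tone_ms=TONE_MS, gap_ms=GAP_MS):
    """Onset of each tone; the SOA is tone duration plus inter-tone interval."""
    soa = tone_ms + gap_ms
    return [i * soa for i in range(n_tones)]

# The six possible deviants for the fourth tone (1/16, 1/8, 1/4 tone, down and up).
deviant_hz = {c: shift_by_cents(C6_HZ, c) for c in (-50, -25, -12.5, 12.5, 25, 50)}

onsets = tone_onsets_ms()
sequence_end_ms = onsets[-1] + TONE_MS          # offset of the fifth tone
fourth_to_end_ms = sequence_end_ms - onsets[3]  # delay from fourth-tone onset to sequence end
```

Under these assumptions the sequence ends 1500 ms after its onset and 450 ms after the onset of the fourth tone, which matches the values reported in the text.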
Matched participants (a given amusic individual with her/his matched control participant) performed the tasks with the same block order, and they were informed about the condition order. The blocks were separated by 2–3 min of break. Within a block, trials were presented with several constraints: the same trial type (i.e. trial with changed tone, trial without changed tone) could not be repeated more than three times in a row, and trials with changed tone with the same pitch interval size (either 1/4 tone, 1/8 tone or 1/16 tone) could not be repeated twice in a row. For each trial, after the auditory stimulation (that lasted 1500 ms) a question mark (“?”) appeared on the screen, and participants had 2000 ms to respond before the start of the next trial. Note that RTs were calculated from the onset of the question mark (end of the tone sequence, thus 450 ms after the onset of the fourth tone). Participants were asked to respond as accurately and as rapidly as possible once the question mark appeared. They were asked to press the right mouse button (middle finger of the right hand) when they detected a change in the sequence and the left mouse button (index finger) when they were unable to detect a change. Before the task, participants performed one practice block of each condition composed of 15 trials with feedback. No feedback was given during the task itself.
The statistical analyses were performed with repeated-measures ANOVAs with Greenhouse-Geisser correction. Significant interactions were further analyzed using Fisher LSD post hoc tests.
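The trial-order constraints of Task 2 (Section 2.6) can be illustrated with a short sketch. The paper does not describe how the pseudo-randomization was implemented, so the sequential construction with restarts below is only one possible approach. It assumes that each of the four blocks holds half of a condition's 192 trials (48 no-change trials plus 8 trials per signed pitch change) and that "repeated twice in a row" refers to adjacent trials; all names are hypothetical.

```python
import random
from collections import Counter

NO_CHANGE = 0  # hypothetical code for a trial without a changed tone

def violates(seq, trial, max_run=3):
    """True if appending `trial` to `seq` would break an ordering constraint."""
    is_change = trial != NO_CHANGE
    recent = seq[-max_run:]
    # Same trial type (change / no change) more than `max_run` times in a row.
    if len(recent) == max_run and all((t != NO_CHANGE) == is_change for t in recent):
        return True
    # Two adjacent change trials with the same interval size (direction ignored).
    if is_change and seq and seq[-1] != NO_CHANGE and abs(seq[-1]) == abs(trial):
        return True
    return False

def make_block(n_per_change=8, sizes_cents=(50, 25, 12.5), rng=None, max_restarts=1000):
    """Build one block order; trials are signed cents offsets, 0 meaning no change."""
    rng = rng or random.Random()
    template = Counter()
    for size in sizes_cents:
        template[+size] = n_per_change
        template[-size] = n_per_change
    template[NO_CHANGE] = sum(template.values())  # as many no-change as change trials
    for _ in range(max_restarts):
        pool, seq = Counter(template), []
        while sum(pool.values()) > 0:
            allowed = [t for t, n in pool.items() if n > 0 and not violates(seq, t)]
            if not allowed:
                break  # dead end: restart with a fresh pool
            weights = [pool[t] for t in allowed]
            trial = rng.choices(allowed, weights=weights)[0]
            seq.append(trial)
            pool[trial] -= 1
        else:
            return seq
    raise RuntimeError("no valid order found; relax the constraints")

block = make_block(rng=random.Random(0))
```

Drawing each trial in proportion to the number of remaining trials of that kind keeps the two trial types roughly balanced throughout the block, so dead ends (and restarts) should be infrequent.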
3. Results
3.1. Speeded detection of auditory, visual, and audiovisual stimuli (Task 1)
As raw RT data were not normally distributed for each group (Kolmogorov-Smirnov test, all ps < .05), these data were transformed using the natural logarithm for each trial and averaged over all trials for each condition and participant (see footnote 2). RT were analyzed with a 2 × 3 ANOVA with Group (amusics, controls) as between-participants factor and Condition (A, V, AV) as within-participant factor. The ANOVA revealed a significant main effect of Condition (F(1.55, 46.59) = 102.00; ε = 0.77; p < .0001; MSE = 0.98; ηp² = 0.77), with shorter RT in the AV condition in comparison to the A condition and the V condition (all ps < .0001). Additionally, the RT for the A condition were marginally shorter than for the V condition (p = .08). No significant group differences (F(1,30) = 1.32; p = .25; MSE = 0.09; ηp² = 0.04) nor interaction between Group and Condition (F(1.55, 46.59) = 0.94; ε = 0.77; p = .37; MSE = 0.01; ηp² = 0.03) were observed (see Fig. 1.A.). According to the race models (Raab, 1962), a shorter RT in a bimodal condition (known as the redundant-stimulus effect) does not necessarily imply the existence of crossmodal interactions before the response, as the first of the two unimodal processes completed could simply have determined the RT (Besle et al., 2004; Miller, 1982; Ratcliff, 1979). Miller (1982) has shown that under this race model hypothesis, particular assumptions can be made for the distribution of RT:
$$P(RT_{AV} < t) \le P(RT_A < t) + P(RT_V < t)$$

for any response time t, where P(RT < t) represents the cumulative probability density function of the RT (Miller’s inequality, 1982). Violation of Miller’s inequality is taken as evidence for genuine multisensory integration processes. Following the procedure proposed by Ratcliff (1979) and Besle et al. (2004) (see also Miller, 1982), the cumulative probability density function of the RT for each participant in each of the three conditions was divided into 20 fractiles (limits at 0.05, 0.10, …, 0.90, 0.95, see Fig. 1 B.C.D.). To assess whether the benefit of the AV condition was related to multisensory integration processes, we tested Miller’s inequality by comparing the AV cumulative probability density functions with the sum of the A and V cumulative probability density functions with a 2 × 2 × 19 ANOVA with Group (amusics, controls) as between-participants factor and Condition (Audio + Visual: A + V, Audiovisual: AV) and Fractiles (19 values corresponding to RT limits between 20 fractiles) as within-participant factors. The ANOVA revealed 1) a significant main effect of Condition (F(1,30) = 45.31; p < .0001; MSE = 0.04; ηp² = 0.60), with shorter RT in AV in comparison to A + V; 2) an expected significant effect of Fractiles (F(2.14, 64.30) = 397.12; ε = 0.11; p < .0001; MSE = 0.00; ηp² = 0.92); and 3) a significant interaction between Condition and Fractiles (F(2.42, 72.82) = 30.09; ε = 0.13; p < .0001; MSE = 0.00; ηp² = 0.50). To analyze this interaction, post hoc tests were performed and revealed that RT for AV were shorter in comparison to A + V for fractiles 1 to 17 (all ps < .001) and were longer for fractiles 18 and 19 (all ps < .001) (these differences for the longer fractiles are a non-interesting and unavoidable side-effect of the procedure used to create the A + V distribution, which reaches a maximum of two, instead of one for the AV distribution). No significant effect of Group was observed (p = .20), nor any interactions involving the Group factor (all ps > .63) (see Fig. 1.B.).

2 Note that after ln-transformation, samples (controls and amusics) were normally distributed (Kolmogorov-Smirnov tests, all ps > .20).

In a second step, we analyzed RT for unimodal data (A and V) to assess whether RT distributions for A and V conditions were different between amusics and controls. Similarly to what has been done above for the AV and A + V RT distributions, A and V RT distributions were vincentized in 20 fractiles. RT for unimodal stimulations (A or V) were analyzed with a 2 × 2 × 19 ANOVA with Group as between-participants factor and Condition (A, V) and Fractile limits as within-participant factors.
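The fractile-based comparison used in this section (the 19 fractile limits of the AV distribution set against those of the A + V bound) can be sketched for a single participant as follows. This is only an illustration under stated assumptions, not the authors' code: it assumes equal numbers of A and V trials, in which case the summed cumulative function P(RT_A < t) + P(RT_V < t) reaches a probability p where the pooled A and V distribution reaches p/2; it relies on the linear interpolation of `statistics.quantiles`; and it works on raw RT, whereas the paper's statistics were run on ln-transformed RT with ANOVAs across participants. The function names are hypothetical.

```python
from statistics import quantiles

def race_model_fractiles(rt_a, rt_v, rt_av, n_fractiles=20):
    """Return (AV fractile limits, race-model bound fractile limits)."""
    if len(rt_a) != len(rt_v):
        raise ValueError("this shortcut assumes equal numbers of A and V trials")
    # 19 limits at probabilities 0.05, 0.10, ..., 0.95 for the AV distribution.
    q_av = quantiles(rt_av, n=n_fractiles, method="inclusive")
    # With equal trial counts, F_A(t) + F_V(t) = p where the pooled CDF equals p / 2,
    # so the bound's limits are the first 19 of the 39 cut points of the pooled data.
    pooled = list(rt_a) + list(rt_v)
    q_bound = quantiles(pooled, n=2 * n_fractiles, method="inclusive")[: n_fractiles - 1]
    return q_av, q_bound

def violated_fractiles(q_av, q_bound):
    """1-based fractile indices where AV is faster than the race-model bound."""
    return [i + 1 for i, (av, bound) in enumerate(zip(q_av, q_bound)) if av < bound]

# Hypothetical RT samples in ms (30 trials per condition, as in Task 1).
rt_a = [300 + 200 * i / 29 for i in range(30)]
rt_v = list(rt_a)
rt_av_fast = [t - 150 for t in rt_a]   # much faster than either unimodal condition
q_av, q_bound = race_model_fractiles(rt_a, rt_v, rt_av_fast)
fast_violations = violated_fractiles(q_av, q_bound)
```

Because the bound sums two cumulative functions, it rises to two rather than one, which is why the comparison is uninformative at the longest fractiles, as noted in the text.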
The interaction between Fractiles and Condition (A, V) was significant (F(3.34, 100.27) = 16.78; ε = 0.18; p < .0001; MSE = 0.01; ηp² = 0.35), and was further modulated by Group, as indicated by the three-way interaction between Group, Fractiles and Condition (F(3.34, 100.27) = 4.56; ε = 0.18; p < .0001; MSE = 0.01; ηp² = 0.13). Post hoc tests revealed that amusics presented shorter RT in the A condition in comparison to the V condition for fractiles 1 to 11 (all ps < .001) and the inverse pattern (V RT < A RT) for fractiles 16 to 19 (all ps < .001). For controls, shorter RT in the A condition (in comparison to the V condition) was observed for fractiles 1 to 5 (all p values < .001; see footnote 3) and no significant differences were observed between RT of A and V conditions for fractiles 8 to 19 (all ps > .09). None of the direct comparisons between groups reached significance (all ps > .21).

3 This pattern of results (with amusics showing decreased RT in the A condition in comparison to the V condition for fractiles 1 to 11, whereas this A advantage for controls only emerged for fractiles 1 to 5) is driven by one control participant exhibiting very long RT for the A condition. This participant was not considered as a group outlier as her RT data for the A condition (ln(RT_A) = 5.94 (379.93 ms)) did not meet our criterion for outlier rejection (2 SD above the mean of participants of a group, corresponding to ln(RT_A) = 6.04 (419.89 ms)). Note that when we perform these analyses for controls without ctl6 (panel B), shorter RTs in the A condition (in comparison to the V condition) were observed for fractiles 1 to 8 (all p values < .001, and p values for fractiles 9, 10 and 11 were > .09 (fractile 9, p = .09; fractile 10, p = .21; fractile 11, p = .35)) and no significant differences were observed between RTs of A and V conditions for fractiles 12 to 19 (all ps > .05).

Fig. 1. Results of Task 1 (speeded detection). A. Response times of amusic and control groups (black, amusics; white, controls) as a function of the three conditions (Audio, A; Visual, V; Audiovisual, AV). Left y axis corresponds to the ln-transformed RT on which the statistics were performed; right y axis corresponds to the RT data in ms. Error bars represent SEM. B. Density functions of cumulative probabilities for RT in Audio, A; Visual, V; Audio + Visual, A + V; and Audiovisual, AV conditions for all participants. C. Cumulative density functions for control participants. D. Cumulative density functions for amusic participants. For shorter response times (fractiles 1 to 17 in both groups), the cumulative density function for AV responses (circles) is above the sum of the A and V cumulative density functions (diamonds). The blue area between these two curves illustrates the fractiles for which the violation of the race model inequality [P(RT_AV < t) ≤ P(RT_A < t) + P(RT_V < t)] is statistically significant (p < .001). The grey areas illustrate the fractiles for which P(RT_A < t) and P(RT_V < t) are significantly different (p < .001).

3.2. Pitch change detection task for audio and audiovisual material
3.2.1. Performance
One control participant did not understand the PCD task (she responded every two trials) and was thus excluded from the analyses. Performance in the PCD task was analyzed using signal detection theory by calculating, for each participant, each Condition (A, AV), and each Pitch Interval Size (1/4 tone, 1/8 tone, 1/16 tone), discrimination sensitivity with d’ and the response bias c (when necessary, to correct d’ and c, 0 was replaced by 0.01 for the number of false alarms, and 1 by 0.99 for the maximum number of hits). For each participant, these analyses were based on z-transforms of the hit rate (i.e., number of correct responses for trials with changed tone / number of trials with changed tone) and the false alarm rate (i.e., number of incorrect responses for trials without changed tone / number of trials without changed tone). Positive values for c arise when the miss rate (incorrect responses for trials with changed tone / number of trials with changed tone) exceeds the false alarm rate. Positive values thus indicate a tendency to answer “no change detected”, negative values indicate a tendency to answer “change detected”, and c-values around 0 suggest the absence of a response bias. d’ and c were analyzed separately with a 2 × 2 × 3 ANOVA with Group (amusics, controls) as between-participants factor and Condition (A, AV) and Pitch Interval Size (1/4 tone, 1/8 tone, 1/16
tone) as within-participant factors. Performance was significantly superior to chance level (d’ = 0) in each condition for each group (assessed with two-sided t-tests, all ps < .001, see Fig. 2A). The ANOVA revealed that the main effect of Group was significant (F(1,29) = 4.53; p = .04; MSE = 2.78; ηp² = 0.13), with control participants reaching higher performance levels than amusic participants. The main effect of Condition was significant (F(1,29) = 5.13; p = .03; MSE = 0.10; ηp² = 0.15), revealing that participants reached higher performance levels for the AV condition in comparison to the A condition. The main effect of Pitch Interval Size (F(1.79, 51.92) = 38.96; ε = 0.89; p < .0001; MSE = 0.25; ηp² = 0.57) was also significant and further modulated by Group (F(1.79, 51.92) = 4.01; ε = 0.89; p = .02; MSE = 0.25; ηp² = 0.12). Post hoc tests revealed that amusics showed
Condition (A, AV), and Trial Type (with changed tone, without changed tone) as within-participant factors. In addition to the main effect of Group (F(1,29) = 4.84; p = .03; MSE = 0.01; ηp² = 0.14), the effect of Trial Type was significant (F(1,29) = 16.79; p < .0001; MSE = 0.01; ηp² = 0.36), revealing that participants’ performance was decreased for trials with changed tone in comparison to trials without changed tone. Note that the Group by Trial Type interaction was marginally significant (F(1,29) = 3.08; p = .06; MSE = 0.01; ηp² = 0.11). For completeness: this marginally significant interaction was due to a significant difference in performance between amusics and controls only for trials with changed tone (p < .004) but not for trials without changed tone (p = .75). Observing that both amusic and control participants showed decreased performance for trials with a changed tone in comparison to trials without changed tone fits with other data in the literature (using different paradigms, such as short-term memory tasks), reporting increased miss rates rather than false alarm rates in congenital amusia (Tillmann et al., 2009; Albouy et al., 2013b).
Fig. 2. A. Participants’ performance in terms of d’ presented as a function of Group (amusics, controls), Pitch Interval Size (1/4 tone, 1/8 tone, 1/16 tone) and Condition (A, AV). B. Bias data for amusic and control groups in terms of c, presented as a function of Pitch Interval Size (1/4 tone, 1/8 tone, 1/16 tone) and Condition (A, AV). All biases where positive thus revealing a general tendency to respond “no change detected” in both groups.
decreased performance with decreasing pitch interval size (all pso.0001), while controls reached similar performance for 1/4 and 1/8 tone Pitch Interval sizes (p ¼.34), but decreased performance for 1/16 tone Pitch Interval Size (all ps o.001). Moreover, amusics showed decreased performance in comparison to controls for 1/8 (p ¼.03), and 1/16 (p ¼.002) (i.e., the most difficult) pitch interval sizes, but not for the 1/4 tone pitch interval size (p ¼.37). No other significant effects or interactions were observed (all ps4.69). For c (Fig. 2B.), the response bias was significantly superior to zero in each condition for both groups, revealing participants’ general tendency to respond “no change detected” for both A and AV conditions (two sided t-tests, all pso .01). However, the ANOVA revealed that the main effect of Condition was significant (F(1, 29) ¼5.68; p¼ .02; MSE ¼0.09; η2p ¼ 0.16), with higher positive c values for the AV condition than for the A condition. The main effect of Pitch Interval size (F(1.79, 51.93) ¼38.80; ε ¼0.89; p o.0001; MSE ¼0.01; η2p ¼0.57) was significant and was further modulated by Group (F(1.79, 51.93)¼ 3.96; ε ¼ 0.89; p ¼ .024; MSE ¼0.01; η2p ¼0.12). Post hoc test revealed that amusics showed increased response bias with decreasing pitch interval size (all pso.001), while controls showed similar bias data for 1/4 and 1/8 tone pitch interval sizes (p ¼.33) and increased bias for 1/16 tone pitch interval size (po .001). Note, however, that amusic participants’ and controls’ bias data did not significantly differ for each of the pitch interval sizes (all ps4.13). No other significant effects or interactions were observed (all ps4.37). To investigate participants’ overall strategy in Task 2 (missing pitch changes or hearing non-existing changes), additional analyses were performed by investigating the effect of Trial Type (with changed tone, without changed tone) in participants’ performance. 
We analyzed the percentage of correct responses with a 2 2 2 ANOVA, with Group as between-participants factor and
3.2.2. Response times (RT)
As in Task 1, because raw RT data in Task 2 were not normally distributed in either group (Kolmogorov-Smirnov tests, all ps < .05), these data were transformed using the natural logarithm for each trial and averaged for each condition and participant (see footnote 4). We analyzed RT for correct responses to assess whether a facilitatory effect of AV stimulation could be observed. In a first step, we analyzed RT as a function of Pitch Interval Size in different trials (i.e., trials with a changed tone), and then, in a second step, we investigated the effect of Trial Type. To assess whether the observed benefit of the AV condition on RT could be related to task difficulty, we analyzed correct RT data for trials with a changed tone as a function of pitch interval size. The data were analyzed with a 2 × 2 × 3 ANOVA with Group as between-participant factor and Condition (A, AV) and Pitch Interval Size (1/4 tone, 1/8 tone, 1/16 tone) as within-participant factors. The ANOVA revealed that, in addition to the main effect of Group (F(1,29) = 15.76; p < .0001; MSE = 1.3; ηp² = 0.35), the main effect of Pitch Interval Size (F(1.66, 48.18) = 39.32; ε = 0.83; p < .0001; MSE = 0.03; ηp² = 0.57) was significant, revealing that RT for the largest (i.e., easiest) pitch interval size (1/4 tone) were shorter than for the other pitch interval sizes (1/8 tone, 1/16 tone, all ps < .0001) and that RT for the 1/8 tone pitch interval size were shorter than for the 1/16 tone pitch interval size (p < .001). The main effect of Condition was significant (F(1,29) = 11.93; p < .001; MSE = 0.15; ηp² = 0.29), revealing shorter RT for the AV condition than for the A condition. Moreover, the three-way interaction between Group, Condition and Pitch Interval Size (F(1.76, 51.22) = 4.51; ε = 0.88; p = .01; MSE = 0.03; ηp² = 0.14) was significant. Post hoc tests revealed that the benefit of the AV condition (shorter RT) compared to the A condition was observed for different pitch interval sizes depending on the group.
For amusics, shorter RT in the AV condition (in comparison to the A condition) were observed for the 1/4 tone and 1/8 tone pitch interval sizes (all ps < .001), but not for the 1/16 tone pitch interval size (p = .16), while controls showed this benefit of the AV condition for the 1/8 tone (p = .005) and 1/16 tone (p < .0001) pitch interval sizes, but not for the 1/4 tone pitch interval size (p = .15, see Fig. 3). As for the accuracy data, additional analyses were performed by investigating the effect of Trial Type (with changed tone, without changed tone) on participants’ RT for correct responses. Data were analyzed with a 2 × 2 × 2 ANOVA with Group as between-participant factor and Condition (A, AV) and Trial Type as within-participant factors. Analyses revealed a significant main effect of Group (F(1,29) = 16.48; p < .0001; MSE = 0.72; ηp² = 0.36), with shorter RT in controls than in amusics. The main effect of Trial Type was significant (F(1,29) = 38.58; p < .0001; MSE = 0.05; ηp² = 0.57), showing shorter RT for trials with a changed tone in comparison to trials without change. Finally, the main effect of Condition was significant (F(1,29) = 19.09; p < .001; MSE = 0.07; ηp² = 0.39) and revealed that participants showed shorter RT for the AV condition in comparison to the A condition. No other effects or interactions were significant (all ps > .25) (see Fig. 4).

Fig. 3. RT for correct responses for trials with changed tone presented as a function of Group (amusics, controls) and Condition (A, AV) for each Pitch Interval Size (1/4 tone; 1/8 tone; 1/16 tone). Asterisks indicate significant between-condition differences for each Group and Pitch Interval Size (see Results for details).

4 Note that after ln-transformation, data of Task 2 (for controls and amusics) were normally distributed (Kolmogorov-Smirnov tests, all ps > .20).
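The RT preprocessing described for Task 2 (natural-log transform of each trial, restricted to correct responses, then averaging per participant and condition) can be illustrated with a minimal sketch. The function name `mean_log_rt`, the tuple layout of a trial, and the example values are assumptions made for illustration only; they are not taken from the study's analysis scripts or data.

```python
import math
from collections import defaultdict
from statistics import mean


def mean_log_rt(trials):
    """Average ln(RT) over correct trials for each (participant, condition) cell.

    `trials` is an iterable of (participant, condition, rt_ms, correct) tuples;
    this layout is a hypothetical choice for the example.
    """
    logs = defaultdict(list)
    for participant, condition, rt_ms, correct in trials:
        if correct:  # only correct responses enter the RT analysis
            logs[(participant, condition)].append(math.log(rt_ms))
    return {cell: mean(values) for cell, values in logs.items()}


# Made-up trials for one participant in the A and AV conditions.
trials = [
    ("p01", "A", 620.0, True),
    ("p01", "A", 710.0, True),
    ("p01", "A", 950.0, False),  # error trial, excluded from the average
    ("p01", "AV", 560.0, True),
    ("p01", "AV", 600.0, True),
]
cell_means = mean_log_rt(trials)
```

The resulting cell means (one value per participant and condition) are the kind of quantity that would then enter a repeated-measures ANOVA.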
Fig. 4. RT for correct responses presented as a function of Trial Type (with changed tone, without changed tone), Group (amusics, controls), and Condition (A, AV).

4. Discussion
The present study investigated whether audiovisual interactions could improve pitch processing abilities in congenital amusia, even though the visual stimulation was uninformative for the task (Task 2). Results of a simple speeded detection task (Task 1) revealed that both amusics and controls showed shorter response times for audiovisual stimuli than for either auditory or visual stimuli, and this AV facilitation was observed equally for both participant groups, revealing intact basic AV integration capacities in congenital amusia. Results of Task 2 (pitch change detection task, PCD) revealed that participants of both groups exhibited a higher performance level (d’) and shorter response times in the AV condition than in the A condition. For RT, these audiovisual benefits were observed for different ranges of difficulty for each group (that is, for different pitch interval sizes), thus suggesting that
optimal audiovisual facilitation can be observed within an appropriate range of unimodal performance. Results of Task 1 confirmed that not only simple auditory and simple visual processing, but also AV facilitation for stimulus detection, are preserved in congenital amusia. The results of the speeded detection task for auditory, visual, and audiovisual stimuli (Task 1) revealed that both amusics and controls exhibited shorter RT for AV stimuli in comparison to the uni-modal stimuli (A, V). More interestingly, by investigating Miller’s inequality (Besle et al., 2004; Miller, 1982; Ratcliff, 1979) for RT distributions, we observed that this benefit was related to audiovisual integration processes (Besle et al., 2004; Miller, 1982), as RT were shorter for the AV distribution in comparison to the A + V distribution; most importantly, this was observed similarly in both groups. These results are in line with numerous studies showing that the combination of auditory and visual information enhances perceptual abilities, as revealed by response times in typical individuals (Cappe et al., 2010; Ernst and Bulthoff, 2004; Gielen et al., 1983; Hershenson, 1962; Posner et al., 1976; Stein and Meredith, 1993; Stein and Stanford, 2008). This effect is well-known and has been interpreted with the energy-summation hypothesis (Nickerson, 1973): the energy from two stimuli, in the context of a bimodal presentation, is combined and leads to more rapid processing in early stages of the perceptual process, as suggested by shorter detection times. This hypothesis has since received support from neuroimaging studies describing the neural networks where the energy summation could emerge (Bolognini et al., 2013; Stein and Stanford, 2008).
Notably, it has been suggested that multisensory interactions occur within multi-modal cortical and sub-cortical brain regions (Calvert et al., 2000) receiving input from multiple sensory modalities and responding to stimulations from more than one modality (Bell et al., 2001; Bolognini et al., 2013; Cappe et al., 2009; Duhamel et al., 1998; Ghazanfar et al., 2008, 2005; Graziano, 2001; Graziano and Gross, 1993; Hackett et al., 2007; Jones and Powell, 1970; Nagy et al., 2006; Seltzer and Pandya, 1978; Shore, 2005; Shore et al., 2008; Stein and Meredith, 1993; Stein and Stanford, 2008). In addition to the implication of the multi-modal regions, it has been suggested that multisensory interactions can also be based on cross-connections between early sensory areas (Falchier et al., 2002; Rockland and Ojima, 2003), and can also occur within primary sensory areas, which are supposed to receive direct input from a single sensory modality (Bolognini et al., 2013; Driver and Noesselt, 2008; Macaluso, 2006; Stein and Stanford, 2008). It has been proposed that multisensory integration occurs during both early and late perceptual processing steps (Giard and Peronnet, 1999; Fort et al., 2002). The differentiation between the roles of early and later steps of multisensory integration is still under debate, and one can also hypothesize that multimodal facilitation could rather be considered as a combination of both processes (Bauer, 2008), involving local and long-distance synchronization and communication between unimodal and/or heteromodal brain areas. Regardless of the mechanisms supporting audiovisual facilitation, comparable AV benefits were observed in amusics and controls in Task 1. Observing that amusics process AV information as well as typical listeners in simple detection tasks allowed for the investigation of bimodal benefits on amusics’ abilities in the processing of more complex pitch material (Task 2).
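The logic of Miller's inequality referred to above can be illustrated with a minimal sketch. Under a race (separate-activation) account, the cumulative RT distribution for AV stimuli should never exceed the sum of the A and V cumulative distributions; values above that bound are taken as evidence of integration (Miller, 1982). The code below checks this bound on a fixed grid of time points. The function names, the time grid, and the RT values are hypothetical, and this is not the fractile-based procedure used in the study.

```python
from bisect import bisect_right


def ecdf(sample, t):
    """Empirical probability that a response time in `sample` is <= t."""
    ordered = sorted(sample)
    return bisect_right(ordered, t) / len(ordered)


def race_model_violations(rt_a, rt_v, rt_av, time_points):
    """Return the time points t where F_AV(t) > min(1, F_A(t) + F_V(t))."""
    violations = []
    for t in time_points:
        bound = min(1.0, ecdf(rt_a, t) + ecdf(rt_v, t))  # race-model upper bound
        if ecdf(rt_av, t) > bound:
            violations.append(t)
    return violations


# Made-up response times in ms for one participant.
rt_a = [300, 320, 340, 360, 380]
rt_v = [330, 350, 370, 390, 410]
rt_av = [240, 260, 280, 300, 320]
violated_at = race_model_violations(rt_a, rt_v, rt_av, range(240, 421, 20))
```

In this toy data set the AV responses are fast enough that the bound is exceeded at the early time points, which is the pattern interpreted as coactivation rather than statistical facilitation.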
In addition to these multisensory effects, the results of Task 1 improved our understanding of simple auditory and visual processing in congenital amusia: the direct comparison between A and V conditions for small fractiles showed that, like control participants, amusics exhibited shorter detection times for the A stimuli in comparison to the V stimuli and that no significant differences were observed between the two groups (Note however the existence of small differences between amusics and
controls in the shape of the RT distributions of A condition for fractiles 6 to 11 (see footnote 2) and fractiles 16 to 18 (see Results)). Observing comparable detection times between groups for A stimuli is in line with the hypothesis that amusics’ altered encoding of tones is only apparent when task or material requires memory or taking the context into account, with altered memory traces of tones possibly leading to impairments in pitch discrimination and memory (Albouy et al., 2014, 2013a). On the other hand, amusics present preserved sensory extraction of sound information in a simple context (isolated tones) and during tasks requiring stimulus detection only. Taken together, results from Task 1 confirm that, like control participants, amusic individuals are able to encode A, V and most importantly AV information. Based on these conclusions, Task 2 of the present study aimed to investigate the potential effectiveness of bimodal stimulation on amusics’ pitch encoding abilities. In Task 2, we used a PCD task for which amusics were previously shown to be impaired in comparison to controls (Albouy et al., 2014; Hyde and Peretz, 2004; Peretz and Hyde, 2003; Tillmann et al., 2011). As expected, amusics showed decreased performance (as measured with d’) in comparison to controls for the small pitch interval sizes (1/8 tone; 1/16 tone), but not for the large pitch interval size (1/4 tone). This finding is in agreement with previous evidence for amusics’ pitch perception deficit (Albouy et al., 2014, 2013a, 2013b; Ayotte et al., 2002; Foxton et al., 2004; Hyde and Peretz, 2004; Jiang et al., 2013; Jones et al., 2009; Liu et al., 2010; Peretz, 2002, 2013; Peretz and Hyde, 2003; Pfeuty and Peretz, 2010; Tillmann et al., 2014, 2011; Williamson and Stewart, 2013). 
However, it is relevant to note that while Hyde and Peretz (2004) reported that amusics are impaired in detecting changes smaller than 1/2 tone, amusic participants in the present study were able to perform the task as well as controls for the 1/4 tone pitch interval size. The difference between findings across studies might be due to better pitch thresholds in the amusic group of the present study in comparison to the amusic group of Hyde and Peretz (2004) (as revealed by the overlap in PDT values between amusics and controls, see Methods). Note additionally that the amusics in the study of Hyde and Peretz (2004) (mean age = 57, SD = 1.6) were older than the amusic participants of the present study (mean age = 34.87, SD = 12.49) and exhibited slightly inferior performance in terms of percentage of correct responses in the melodic subtests of the MBEA as compared to our participants (amusics in Hyde and Peretz (2004): mean = 59.8%, SD = 2.3; amusics in the present study: mean = 65.32%, SD = 2.53). In Task 2, both participant groups exhibited decreased performance with increasing task difficulty (decreasing pitch interval size). This effect was also reflected in the response bias data (c), showing participants’ increasing tendency to respond ‘no change detected’ in the sequences with decreased interval size. Most interestingly, a benefit in terms of performance (d’) was observed for the bi-modal (AV) condition in comparison to the uni-modal (A) condition. This result demonstrates that synchronous visual stimulation (even though uninformative for the task) can help amusic and control participants to improve their pitch processing abilities. The present findings confirm that amusic participants process visual stimuli as well as control participants do, and are able to use them via audio-visual integration to enhance their pitch perception performance in the tone sequences.
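For readers less familiar with the two signal detection measures discussed here, the following minimal sketch shows the standard definitions: d’ = z(H) − z(FA) for sensitivity and c = −0.5 × (z(H) + z(FA)) for response bias, where H is the hit rate (changes correctly detected) and FA the false alarm rate. With this sign convention, a positive c corresponds to a tendency to respond "no change detected". The log-linear correction for extreme rates, the function name, and the trial counts are assumptions for illustration; the correction actually used by the authors is not stated in this section.

```python
from statistics import NormalDist


def dprime_and_c(hits, misses, false_alarms, correct_rejections):
    """Return (d', c) from the four response counts of a change-detection task."""
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    # Log-linear correction (an assumption here) keeps rates away from 0 and 1.
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    d_prime = z(hit_rate) - z(fa_rate)
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))
    return d_prime, criterion


# Made-up counts: 48 trials with a changed tone, 48 trials without.
d_prime, criterion = dprime_and_c(
    hits=40, misses=8, false_alarms=2, correct_rejections=46
)
```

In this example misses outnumber false alarms, so the criterion comes out positive, the same direction of bias as reported for both groups in Task 2.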
This benefit could be interpreted in the following manner: As perceptual and memory deficits for pitch patterns in congenital amusia have been related to a failure to adequately form pitch traces in memory (altered pitch encoding) (Albouy et al., 2014; Tillmann et al., 2014), we hypothesize that the visual stimuli in Task 2 enhance the construction of memory traces for tones. While visual stimuli in the present experiment were uninformative regarding the pitch
judgment, they could provide relevant temporal information, which benefits perceptual processes (Jones, 1976). This benefit of uninformative visual stimuli in boosting pitch encoding could be explained by the preparation-enhancement hypothesis (Nickerson, 1973): one uni-modal stimulus (referred to as ‘accessory’, the visual stimulus here) enhances and prepares the processing of the other ‘obligatory’ stimulus (the auditory stimulus here). This multisensory enhancement is supported by neuroimaging findings showing that non-auditory inputs can influence early levels of auditory cortical processing, by priming the cortex to receive acoustic signals (Bizley and King, 2012). It has, for example, been proposed that visual input can modulate the phase of oscillatory activity in the auditory cortex, thereby potentially amplifying the response to related auditory signals (Schroeder et al., 2008). Observing comparable benefits of the AV condition in terms of accuracy (Task 2) in both amusic and control participants thus suggests that these mechanisms are preserved in congenital amusia. In addition to this benefit of multisensory integration for accuracy (d’), RT data confirmed that amusics and controls used the uninformative visual stimuli. For both groups, RT were shorter for the AV condition than for the A condition. The analyses of the RT data further revealed that the AV benefit was related to task difficulty, as it varied as a function of pitch interval size in both participant groups. We observed that amusics and controls showed the RT benefit of the AV condition for different pitch interval sizes (which reflect task difficulty): while control participants showed an audiovisual benefit for 1/8 tone and 1/16 tone changes, amusics showed an audiovisual benefit for the 1/8 tone and 1/4 tone changes, but not for the 1/16 tone change.
In contrast, no benefit of the audiovisual condition was observed when the task was too simple for controls (1/4 tone) or too difficult for amusics (1/16 tone). This finding is in line with several pieces of evidence suggesting that multisensory interactions depend on participants’ performance when the modalities are presented in isolation (Caclin et al., 2011; Cappe et al., 2010; Hairston et al., 2003; Rouger et al., 2007; Serino et al., 2007). As proposed by Caclin et al. (2011), this could be understood as a law of inverse effectiveness at the population or participant scale (Meredith and Stein, 1986; Stein and Meredith, 1993). While this principle has been mainly described for uni-sensory impairments as a consequence of a brain lesion (Bolognini et al., 2013, 2005; Frassinetti et al., 2005), our results suggest that this effect could be observed in healthy participants and transposed to the group level (see also Caclin et al., 2011; Cappe et al., 2010). According to this principle, highly salient auditory cues (large pitch interval sizes in the present study) are easily processed, and the presentation of simultaneous visual stimuli has a modest effect on behavioral performance. By contrast, weak cues (small pitch interval sizes in the present study) are supposed to evoke comparatively little neural activation, this activation being therefore subject to substantial enhancement when a visual stimulus is combined with the auditory stimulus. In this latter case, the multisensory stimulation can have a positive effect on behavioral performance by increasing the speed and accuracy of detecting an event (Meredith and Stein, 1986; Stein and Meredith, 1993). This principle is supported by the data of the controls, who showed the AV benefit only for the more difficult conditions (1/8, 1/16), but not for the easiest condition (1/4).
However, amusics’ performance suggests that this effect has some limits: notably, when participants’ performance is too low, no multisensory facilitation can be observed (as revealed by the absence of AV benefit for the smallest pitch interval size (the most difficult condition) in amusics). Taken together, our data thus suggest that the range of efficiency of AV improvement is related to participants’ performance, and thus to the difficulty of the task. This hypothesis is supported by the fact that the optimal audio-visual benefit in the present study, as
revealed by the RT effects, was observed in both participant groups in a similar range of uni-modal performance (from d’ = 4.64 (controls’ performance for the 1/4 tone pitch interval size in the A condition) to d’ = 3.34 (amusics’ performance for the 1/16 tone pitch interval size in the A condition)). In sum, our findings demonstrate that amusic individuals are able to process and benefit from multisensory information to speed up detection as well as to improve pitch processing (i.e., detecting a pitch change). Although the cerebral dynamics of these processes remain to be investigated in this population, previous data suggest that uninformative visual stimuli influence either early levels or late levels of cortical processing (or both), by potentially amplifying and boosting the brain responses to auditory signals. The results presented here are a first promising step toward using multisensory integration for the development of training and rehabilitation programs for impaired pitch processing in amusia, notably by suggesting that AV paradigms are effective in an appropriate range of uni-modal performance.
Acknowledgements
This work was supported by a grant from the “Agence Nationale de la Recherche” (ANR) of the French Ministry of Research ANR-11-BSH2-001-01 to BT and AC. PA was funded by a Ph.D. fellowship of the CNRS. This work was conducted in the framework of the LabEx CeLyA (“Centre Lyonnais d’Acoustique”, ANR-10-LABX-0060) and of the LabEx Cortex (“Construction, Function and Cognitive Function and Rehabilitation of the Cortex”, ANR-11-LABX-0042) of Université de Lyon, within the program “Investissements d’avenir” (ANR-11-IDEX-0007) operated by the French National Research Agency (ANR). We thank Laurie-Anne Sapey-Triomphe, Amandine Picard, and Perrine Teyssier for their contributions at the beginning of this research, and Marie Avillac for her comments on a previous draft.
References
Albouy, P., Cousineau, M., Caclin, A., Tillmann, B., Peretz, I., 2014. Building pitch memory traces in congenital amusia. In: Albouy, P. (Ed.), Behavioral and Neurophysiological Correlates of Auditory Perception and Memory: Evidence from Congenital Amusia, Neurosciences, 1. Université Claude Bernard Lyon, Lyon, pp. 177–206. Albouy, P., Mattout, J., Bouet, R., Maby, E., Sanchez, G., Aguera, P.E., Daligault, S., Delpuech, C., Bertrand, O., Caclin, A., et al., 2013a. Impaired pitch perception and memory in congenital amusia: the deficit starts in the auditory cortex. Brain 136, 1639–1661. Albouy, P., Schulze, K., Caclin, A., Tillmann, B., 2013b. Does tonality boost short-term memory in congenital amusia? Brain Res. 1537, 224–232. Ayotte, J., Peretz, I., Hyde, K., 2002. Congenital amusia: a group study of adults afflicted with a music-specific disorder. Brain 125, 238–251. Bauer, M., 2008. Multisensory integration: a functional role for inter-area synchronization? Curr. Biol. 18, R709–R710. Bell, A.H., Corneil, B.D., Meredith, M.A., Munoz, D.P., 2001. The influence of stimulus properties on multisensory processing in the awake primate superior colliculus. Can. J. Exp. Psychol. 55, 123–132. Bertelson, P., Radeau, M., 1981. Cross-modal bias and perceptual fusion with auditory-visual spatial discordance. Percept. Psychophys. 29, 578–584. Besle, J., Fort, A., Delpuech, C., Giard, M.H., 2004. Bimodal speech: early suppressive visual effects in human auditory cortex. Eur. J. Neurosci. 20, 2225–2234. Bizley, J.K., King, A.J., 2012. What Can Multisensory Processing Tell Us about the Functional Organization of Auditory Cortex? In: Murray, M.M., Wallace, M.T. (Eds.), The Neural Bases of Multisensory Processes. CRC Press, Boca Raton (FL). Bolognini, N., Convento, S., Rossetti, A., Merabet, L.B., 2013. Multisensory processing after a brain damage: clues on post-injury crossmodal plasticity from neuropsychology. Neurosci. Biobehav. Rev. 37, 269–278.
Bolognini, N., Frassinetti, F., Serino, A., Ladavas, E., 2005. “Acoustical vision” of below threshold stimuli: interaction among spatially converging audiovisual inputs. Exp Brain Res 160, 273–282. Caclin, A., Bouchet, P., Djoulah, F., Pirat, E., Pernier, J., Giard, M.H., 2011. Auditory enhancement of visual perception at threshold depends on visual abilities. Brain Res. 1396, 35–44.
Calvert, G.A., Campbell, R., Brammer, M.J., 2000. Evidence from functional magnetic resonance imaging of crossmodal binding in the human heteromodal cortex. Curr. Biol. 10, 649–657. Calvert, G.A., Thesen, T., 2004. Multisensory integration: methodological approaches and emerging principles in the human brain. J. Physiol. Paris 98, 191–205. Cappe, C., Morel, A., Barone, P., Rouiller, E.M., 2009. The thalamocortical projection systems in primate: an anatomical support for multisensory and sensorimotor interplay. Cereb. Cortex 19, 2025–2037. Cappe, C., Murray, M.M., Barone, P., Rouiller, E.M., 2010. Multisensory facilitation of behavior in monkeys: effects of stimulus intensity. J. Cogn. Neurosci. 22, 2850–2863. Driver, J., Noesselt, T., 2008. Multisensory interplay reveals crossmodal influences on ‘sensory-specific’ brain regions, neural responses, and judgments. Neuron 57, 11–23. Duhamel, J.R., Colby, C.L., Goldberg, M.E., 1998. Ventral intraparietal area of the macaque: congruent visual and somatic response properties. J. Neurophysiol. 79, 126–136. Ernst, M.O., Bulthoff, H.H., 2004. Merging the senses into a robust percept. Trends Cogn. Sci 8, 162–169. Falchier, A., Clavagnier, S., Barone, P., Kennedy, H., 2002. Anatomical evidence of multimodal integration in primate striate cortex. J. Neurosci. 22, 5749–5759. Fort, A., Delpuech, C., Pernier, J., Giard, M.H., 2002. Dynamics of cortico-subcortical cross-modal operations involved in audio-visual object detection in humans. Cereb. Cortex. 12, 1031–1039. Foxton, J.M., Dean, J.L., Gee, R., Peretz, I., Griffiths, T.D., 2004. Characterization of deficits in pitch perception underlying ‘tone deafness’. Brain 127, 801–810. Frassinetti, F., Bolognini, N., Bottari, D., Bonora, A., Ladavas, E., 2005. Audiovisual integration in patients with visual deficit. J. Cogn. Neurosci. 17, 1442–1452. Frassinetti, F., Bolognini, N., Ladavas, E., 2002a. Enhancement of visual perception by crossmodal visuo-auditory interaction. Exp. 
Brain Res. 147, 332–343. Frassinetti, F., Pavani, F., Ladavas, E., 2002b. Acoustical vision of neglected stimuli: interaction among spatially converging audiovisual inputs in neglect patients. J. Cogn. Neurosci. 14, 62–69. Ghazanfar, A.A., Chandrasekaran, C., Logothetis, N.K., 2008. Interactions between the superior temporal sulcus and auditory cortex mediate dynamic face/voice integration in rhesus monkeys. J. Neurosci. 28, 4457–4469. Ghazanfar, A.A., Maier, J.X., Hoffman, K.L., Logothetis, N.K., 2005. Multisensory integration of dynamic faces and voices in rhesus monkey auditory cortex. J. Neurosci. 25, 5004–5012. Giard, M.H., Peronnet, F., 1999. Auditory-visual integration during multimodal object recognition in humans: a behavioral and electrophysiological study. J. Cogn. Neurosci. 11, 473–490. Gielen, S.C., Schmidt, R.A., Van den Heuvel, P.J., 1983. On the nature of intersensory facilitation of reaction time. Percept. Psychophys. 34, 161–168. Graziano, M.S., 2001. A system of multimodal areas in the primate brain. Neuron 29, 4–6. Graziano, M.S., Gross, C.G., 1993. A bimodal map of space: somatosensory receptive fields in the macaque putamen with corresponding visual receptive fields. Exp. Brain. Res. 97, 96–109. Hackett, T.A., De La Mothe, L.A., Ulbert, I., Karmos, G., Smiley, J., Schroeder, C.E., 2007. Multisensory convergence in auditory cortex, II. Thalamocortical connections of the caudal superior temporal plane. J. Comp. Neurol. 502, 924–952. Hairston, W.D., Laurienti, P.J., Mishra, G., Burdette, J.H., Wallace, M.T., 2003. Multisensory enhancement of localization under conditions of induced myopia. Exp. Brain Res. 152, 404–408. Hershenson, M., 1962. Reaction time as a measure of intersensory facilitation. J. Exp. Psychol. 63, 289–293. Hyde, K.L., Peretz, I., 2004. Brains that are out of tune but in time. Psychol. Sci. 15, 356–360. Jiang, C., Lim, V.K., Wang, H., Hamm, J.P., 2013. 
Difficulties with pitch discrimination influences pitch memory performance: evidence from congenital amusia. PLoS One 8, e79216. Jones, E.G., Powell, T.P., 1970. An anatomical study of converging sensory pathways within the cerebral cortex of the monkey. Brain 93, 793–820. Jones, J.L., Zalewski, C., Brewer, C., Lucker, J., Drayna, D., 2009. Widespread auditory deficits in tune deafness. Ear Hear 30, 63–72. Jones, M.R., 1976. Time, our lost dimension: toward a new theory of perception, attention, and memory. Psychol. Rev. 83, 323–355. Jousmaki, V., Forss, N., 1998. Effects of stimulus intensity on signals from human somatosensory cortices. Neuroreport 9, 3427–3431. Lidji, P., Kolinsky, R., Lochy, A., Morais, J., 2007. Spatial associations for musical stimuli: a piano in the head? J. Exp. Psychol. Hum. Percept. Perform. 33, 1189–1207. Liu, F., Patel, A.D., Fourcin, A., Stewart, L., 2010. Intonation processing in congenital amusia: discrimination, identification and imitation. Brain 133, 1682–1693. Macaluso, E., 2006. Multisensory processing in sensory-specific cortical areas. Neuroscientist 12, 327–338. McGurk, H., MacDonald, J., 1976. Hearing lips and seeing voices. Nature 264, 746–748. Melara, R.D., O’Brien, T.P., 1987. Interaction between synesthetically corresponding dimensions. J. Exp. Psychol.: Gen. 116, 323–336. Meredith, M.A., Stein, B.E., 1986. Visual, auditory, and somatosensory convergence on cells in superior colliculus results in multisensory integration. J. Neurophysiol. 56, 640–662.
Miller, J., 1982. Divided attention: evidence for coactivation with redundant signals. Cogn. Psychol. 14, 247–279. Nagy, A., Eordegh, G., Paroczy, Z., Markus, Z., Benedek, G., 2006. Multisensory integration in the basal ganglia. Eur. J. Neurosci. 24, 917–924. Nickerson, R.S., 1973. Intersensory facilitation of reaction time: energy summation or preparation enhancement? Psychol. Rev. 80, 489–509. Peretz, I., 2002. Brain specialization for music. Neuroscientist 8, 372–380. Peretz, I., 2003. Brain Specialization for Music: New Evidence from Congenital Amusia. In: Peretz, I., Zatorre, R.J. (Eds.), The Cognitive Neuroscience of Music. Oxford University Press, Oxford, pp. 192–203. Peretz, I., 2013. The Biological foundations of music: Insights from congenital amusia. In: Deutsch, D. (Ed.), The Psychology of Music. Elsevier, pp. 551–564. Peretz, I., Ayotte, J., Zatorre, R.J., Mehler, J., Ahad, P., Penhune, V.B., Jutras, B., 2002. Congenital amusia: a disorder of fine-grained pitch discrimination. Neuron 33, 185–191. Peretz, I., Champod, A.S., Hyde, K., 2003. Varieties of musical disorders. The montreal battery of evaluation of amusia. Ann N Y Acad Sci 999, 58–75. Peretz, I., Hyde, K.L., 2003. What is specific to music processing? Insights from congenital amusia. Trends Cogn Sci 7, 362–367. Pfeuty, M., Peretz, I., 2010. Abnormal pitch–time interference in congenital amusia: evidence from an implicit test. Atten. Percept. Psychophys. 72, 763–774. Posner, M.I., Nissen, M.J., Klein, R.M., 1976. Visual dominance: an information-processing account of its origins and significance. Psychol. Rev. 83, 157–171. Raab, D.H., 1962. Statistical facilitation of simple reaction times. Trans. N. Y. Acad. Sci. 24, 574–590. Ratcliff, R., 1979. Group reaction time distributions and an analysis of distribution statistics. Psychol. Bull. 86, 446–461. Rockland, K.S., Ojima, H., 2003. Multisensory convergence in calcarine visual areas in macaque monkey. Int. J. Psychophysiol. 50, 19–26.
Ross, L.A., Saint-Amour, D., Leavitt, V.M., Javitt, D.C., Foxe, J.J., 2007. Do you see what I am saying? Exploring visual enhancement of speech comprehension in noisy environments. Cereb. Cortex. 17, 1147–1153. Rouger, J., Lagleyre, S., Fraysse, B., Deneve, S., Deguine, O., Barone, P., 2007. Evidence that cochlear-implanted deaf patients are better multisensory integrators. Proc. Natl. Acad. Sci. USA 104, 7295–7300. Rusconi, E., Kwan, B., Giordano, B.L., Umilta, C., Butterworth, B., 2006. Spatial representation of pitch height: the SMARC effect. Cognition 99, 113–129. Schroeder, C.E., Lakatos, P., Kajikawa, Y., Partan, S., Puce, A., 2008. Neuronal oscillations and visual amplification of speech. Trends Cogn. Sci. 12, 106–113. Seitz, A.R., Kim, R., Shams, L., 2006. Sound facilitates visual learning. Curr. Biol. 16, 1422–1427. Seltzer, B., Pandya, D.N., 1978. Afferent cortical connections and architectonics of the superior temporal sulcus and surrounding cortex in the rhesus monkey. Brain Res. 149, 1–24. Serino, A., Farne, A., Rinaldesi, M.L., Haggard, P., Ladavas, E., 2007. Can vision of the body ameliorate impaired somatosensory function? Neuropsychologia 45, 1101–1107.
Shams, L., Kamitani, Y., Shimojo, S., 2000. Illusions. What you see is what you hear. Nature 408, 788.
Shelton, B.R., Searle, C.L., 1980. The influence of vision on the absolute identification of sound-source position. Percept. Psychophys. 28, 589–596.
Shore, S.E., 2005. Multisensory integration in the dorsal cochlear nucleus: unit responses to acoustic and trigeminal ganglion stimulation. Eur. J. Neurosci. 21, 3334–3348.
Shore, S.E., Koehler, S., Oldakowski, M., Hughes, L.F., Syed, S., 2008. Dorsal cochlear nucleus responses to somatosensory stimulation are enhanced after noise-induced hearing loss. Eur. J. Neurosci. 27, 155–168.
Stein, B.E., Meredith, M.A., 1993. Merging of the Senses. MIT Press, Cambridge.
Stein, B.E., Meredith, M.A., Huneycutt, W.S., McDade, L., 1989. Behavioral indices of multisensory integration: orientation to visual cues is affected by auditory stimuli. J. Cogn. Neurosci. 1, 12–24.
Stein, B.E., Stanford, T.R., 2008. Multisensory integration: current issues from the perspective of the single neuron. Nat. Rev. Neurosci. 9, 255–266.
Stewart, L., 2011. Characterizing congenital amusia. Q. J. Exp. Psychol. (Hove) 64, 625–638.
Sumby, W.H., Pollack, I., 1954. Visual contribution to speech intelligibility in noise. J. Acoust. Soc. Am. 26, 212–215.
Teder-Salejarvi, W.A., Di Russo, F., McDonald, J.J., Hillyard, S.A., 2005. Effects of spatial congruity on audio-visual multimodal integration. J. Cogn. Neurosci. 17, 1396–1409.
Tillmann, B., Albouy, P., Caclin, A., 2014. Congenital Amusias. Elsevier, Oxford (In Handbook of Clinical Neurology).
Tillmann, B., Rusconi, E., Traube, C., Butterworth, B., Umilta, C., Peretz, I., 2011. Fine-grained pitch processing of music and speech in congenital amusia. J. Acoust. Soc. Am. 130, 4089–4096.
Tillmann, B., Schulze, K., Foxton, J.M., 2009. Congenital amusia: a short-term memory deficit for non-verbal, but not verbal sounds. Brain Cogn. 71, 259–264.
von Kriegstein, K., 2012. A Multisensory Perspective on Human Auditory Communication. In: Murray, M.M., Wallace, M.T. (Eds.), The Neural Bases of Multisensory Processes.
Vroomen, J., de Gelder, B., 2000. Sound enhances visual perception: cross-modal effects of auditory organization on vision. J. Exp. Psychol. Hum. Percept. Perform. 26, 1583–1590.
Wallace, M.T., Wilkinson, L.K., Stein, B.E., 1996. Representation and integration of multiple sensory inputs in primate superior colliculus. J. Neurophysiol. 76, 1246–1266.
Williamson, V.J., Stewart, L., 2013. Congenital Amusia. In: Dulac, O., Lassonde, M., Sarnat, H.B. (Eds.), Pediatric Neurology, Part I. Elsevier, Newnes, pp. 237–239.