Neuropsychologia 131 (2019) 9–24
Human amygdala response to unisensory and multisensory emotion input: No evidence for superadditivity from intracranial recordings
Judith Domínguez-Borràs a,b,c,*, Raphaël Guex a,b,c, Constantino Méndez-Bértolo d, Guillaume Legendre c,e, Laurent Spinelli a, Stephan Moratti f,g, Sascha Frühholz h, Pierre Mégevand a,e, Luc Arnal c,e, Bryan Strange g,i, Margitta Seeck a, Patrik Vuilleumier b,c,e

a Department of Clinical Neuroscience, University Hospital of Geneva, Switzerland
b Center for Affective Sciences, University of Geneva, Switzerland
c Campus Biotech, Geneva, Switzerland
d Facultad de Psicología, Universidad Autónoma de Madrid, Spain
e Department of Basic Neuroscience, Faculty of Medicine, University of Geneva, Switzerland
f Department of Experimental Psychology, Complutense University of Madrid, Spain
g Laboratory for Clinical Neuroscience, Centre for Biomedical Technology, Universidad Politécnica de Madrid, Spain
h Department of Psychology, University of Zurich, Switzerland
i Department of Neuroimaging, Alzheimer's Disease Research Centre, Reina Sofia-CIEN Foundation, Madrid, Spain
ARTICLE INFO
ABSTRACT
Keywords: Auditory emotion; Visual emotion; Multisensory processing; Amygdala; Intracranial EEG; Local-field potentials
The amygdala is crucially implicated in processing emotional information from various sensory modalities. However, there is a dearth of knowledge concerning the integration and relative time-course of its responses across different channels, i.e., for auditory, visual, and audiovisual input. Functional neuroimaging data in humans point to a possible role of this region in the multimodal integration of emotional signals, but direct evidence for anatomical and temporal overlap of unisensory and multisensory-evoked responses in the amygdala is still lacking. We recorded event-related potentials (ERPs) and oscillatory activity from 9 amygdalae using intracranial electroencephalography (iEEG) in patients prior to epilepsy surgery, and compared electrophysiological responses to fearful, happy, or neutral stimuli presented either as voices alone, faces alone, or voices and faces delivered simultaneously. Results showed differential amygdala responses to fearful stimuli, in comparison to neutral, reaching significance 100–200 ms post-onset for auditory, visual, and audiovisual stimuli. At later latencies, ∼400 ms post-onset, the amygdala response to audiovisual information was also amplified in comparison to auditory or visual stimuli alone. Importantly, however, we found no evidence for either super- or subadditivity effects in any of the bimodal responses. These results suggest, first, that emotion processing in the amygdala occurs at globally similar early stages of perceptual processing for auditory, visual, and audiovisual inputs; second, that overall larger responses to multisensory information occur at later stages only; and third, that the underlying mechanisms of this multisensory gain may reflect a purely additive response to concomitant visual and auditory inputs. Our findings provide novel insights into emotion processing across the sensory pathways, and their convergence within the limbic system.
1. Introduction

The amygdala is a subcortical region with a pivotal role in the encoding of stimulus affective value (Adolphs, 2002; Breiter et al., 1996;
Dolan et al., 2001; LeDoux, 1998; Maren, 2016; Morris et al., 1996; Phelps and LeDoux, 2005; Vuilleumier, 2005). It is known for influencing behavior and cognition through bidirectional projections with the visual and the auditory pathways (Megevand et al., 2017; Freese
∗ Corresponding author. Laboratory for Behavioral Neurology and Imaging of Cognition, Department of Neuroscience, University Medical Center, 1 rue Michel-Servet, CH-1211 Geneva, Switzerland. E-mail addresses:
[email protected] (J. Domínguez-Borràs),
[email protected] (R. Guex),
[email protected] (C. Méndez-Bértolo),
[email protected] (G. Legendre),
[email protected] (L. Spinelli),
[email protected] (S. Moratti),
[email protected] (S. Frühholz),
[email protected] (P. Mégevand),
[email protected] (L. Arnal),
[email protected] (B. Strange),
[email protected] (M. Seeck),
[email protected] (P. Vuilleumier).
https://doi.org/10.1016/j.neuropsychologia.2019.05.027 Received 1 August 2018; Received in revised form 15 May 2019; Accepted 28 May 2019 Available online 31 May 2019 0028-3932/ © 2019 Published by Elsevier Ltd.
Fig. 1. a. Sample auditory and visual stimuli. b. Trial structure. Patients judged the emotional expression of faces, voices, or simultaneous faces and voices presented randomly, by pressing a response button. Stimuli could display a neutral or an emotional (fearful, happy) expression. Trial duration was 3900 ± 300 ms.
and Amaral, 2005, 2006; Gschwind et al., 2012; McDonald, 1998; Yukie, 2002), as well as through modulation of cortical attention networks (Vuilleumier, 2005), memory systems (Hamann, 2001; McGaugh, 2002), and other subcortical neuromodulatory structures (Fast and McGann, 2017; Retson and Van Bockstaele, 2013). Neuroimaging research in humans has provided important insights into how the amygdala responds to a wide variety of emotionally relevant cues. In particular, it is strongly reactive to visual stimuli with emotional value, such as faces or scenes, relative to comparable but neutral images (Costafreda et al., 2008; Glascher et al., 2004; Hariri et al., 2002; Morris et al., 1996; Sabatinelli et al., 2011; Vuilleumier and Pourtois, 2007). The amygdala is also responsive to emotional signals in the auditory domain, such as voices, screams (Arnal et al., 2015), or environmental sounds (Aube et al., 2015; Fecteau et al., 2007; Fruhholz and Grandjean, 2013; Phillips et al., 1998; Schirmer et al., 2008; Wiethoff et al., 2009), although this has been debated for some stimulus categories (Adolphs and Tranel, 1999; Bruck et al., 2011; Buchanan et al., 2000; Costafreda et al., 2008; Grandjean et al., 2005; Klinge et al., 2010; Pourtois et al., 2005; Scott et al., 1997). Intracranial electroencephalographic (iEEG) recordings in epileptic patients allow direct electrophysiological measurement in the amygdala, with a unique spatial and temporal resolution (Lachaux et al., 2003). Studies using this technique have consistently provided evidence for relatively rapid, bottom-up amygdala response to emotional visual stimuli, such as fearful faces (or eye gaze), starting at latencies spanning from ∼70 ms (Mendez-Bertolo et al., 2016) to 140 ms (Pourtois et al., 2010) or 200 ms (Krolak-Salmon et al., 2004; Meletti et al., 2012) after stimulus-onset (see Guillory and Bujarski, 2014; and Murray et al., 2014; for reviews). Longer latencies have also been reported in some cases, possibly due to differences between studies concerning stimulus categories used, task demands, or drug treatments (Pessoa and Adolphs, 2010; Pourtois et al., 2013). However, the spatiotemporal dynamics of human amygdala responses to acoustic emotional input have rarely been studied and remain largely unresolved (Pannese et al., 2015). In addition to emotional information extracted from purely unimodal visual or auditory input, multisensory signals are not only common but also of great importance in real-life environments, for instance to
support social communication across various situations and species (Partan and Marler, 1999). In humans, it has been suggested that the amygdala may possess multisensory integration functions that are sensitive to the combination of emotionally salient information coming from various sensory channels. For instance, a functional neuroimaging study reported greater amygdala response to faces and voices presented together when these were congruent in emotional value, in comparison with emotionally incongruent pairs (Dolan et al., 2001; note that there was no unisensory condition in this study). Similar effects have been observed in other imaging studies (Chen et al., 2010; Klasen et al., 2011; Muller et al., 2011), although some did not report modulation of amygdala (Kreifelts et al., 2007) or described only weak response gain (Robins et al., 2009) in this nucleus. At the single-cell level, there is also evidence that responses of visual-selective neurons in the monkey amygdala are augmented by the simultaneous presentation of auditory stimuli (Kuraoka and Nakamura, 2007), suggesting a response facilitation by multisensory over unisensory inputs. Furthermore, this multisensory capacity has been proposed to rely on direct projections between the amygdala and multimodal associative cortical regions, such as the superior temporal sulcus (Amaral et al., 1992; Muller et al., 2012), which is known to encode emotional and neutral cues at a supramodal level (Peelen et al., 2010). To date, however, the spatiotemporal dynamics of multisensory emotion processing in the human amygdala remain largely undefined, and limited by the relatively poor resolution (spatial and temporal) of non-invasive imaging techniques. Moreover, most iEEG recordings of amygdala in humans have focused on the visual domain, but there is still a lack of iEEG data comparing emotion responses to auditory and visual input, or to the combination of both sensory modalities. Here we recorded intracranial ERPs (iERPs) and oscillatory activity from the amygdala of seven pharmaco-resistant epilepsy patients. First, we tested for emotion- and modality-specific responses to stimuli presented across sensory modalities. To this aim, we specifically compared response patterns to visual, auditory, and audiovisual input, by using faces and voices with either a fearful, happy, or neutral emotion expression (Fig. 1). We focused on amygdala electrode contacts that showed differential responsiveness for either the visual or the auditory
modality, but also on those showing response to audiovisual stimulation. Indeed, research in monkey amygdala suggests that different neuron populations may show preference for visual or auditory information (Kuraoka and Nakamura, 2007; Montes-Lourido et al., 2015), and some neurons may have multimodal profiles, responding indistinctly to various modalities presented alone or in bimodal combination, including audition and vision (Kuraoka and Nakamura, 2007; Nishijo et al., 1988). Moreover, single-cell data in rodents and monkeys suggest that some degree of spatial segregation may exist among amygdala neurons depending on their sensory preference (Bergstrom and Johnson, 2014; Nishijo et al., 1988; Zhang et al., 2013). For instance, fear conditioning studies in rats found that discrete neuronal populations distributed within the lateral amygdala encode either visual or auditory fear-conditioned stimuli (Bergstrom and Johnson, 2014).

Second, we examined whether amygdala activity recorded from these different contacts exhibited a nonlinear enhancement (or reduction) for the bimodal audiovisual stimuli, as compared to the unimodal faces and voices, as is the case in other regions traditionally associated with multisensory integration. In this vein, both single-cell and neuroimaging research on multisensory integration in, for instance, the superior colliculus has previously reported that the combination of visual and auditory stimulation presented together may result in more neuronal spikes or larger response amplitudes than the sum of responses evoked by the same stimuli presented individually (Calvert et al., 2001; Skaliora et al., 2004; see Holmes and Spence, 2005). This is called 'superadditivity', since the multisensory whole is greater than the sum of its unisensory parts (see Holmes and Spence, 2005; Kayser et al., 2012; Sarko et al., 2012). Likewise, subadditivity (i.e., a lower neuronal response to combinations than to unimodal sensory inputs presented alone) can also reflect an effect of multisensory integration (Arnal et al., 2009; Arnal et al., 2011; Cappe et al., 2010). Here we specifically probed for multisensory integration in the amygdala through indices of super- and subadditive responses, similarly to previous work with electrophysiological and BOLD measures (see Holmes and Spence, 2005; Kayser et al., 2007; and Laurienti et al., 2005, for reviews on the use of these criteria). Direct evidence for super-/subadditivity effects of multisensory stimulation in the amygdala is scarce or indirect (Klasen et al., 2011), given the temporal and spatial limitations of BOLD studies (Laurienti et al., 2005), or because it was not formally tested (Dolan et al., 2001; Kuraoka and Nakamura, 2007; Muller et al., 2011).

In sum, the goal of the present study was two-fold. On the one hand, we compared, for the first time with iEEG in humans, whether emotion encoding in the amygdala occurs similarly across auditory, visual, and audiovisual inputs. If this is the case, we may conclude that the role of this region in affective processing in humans is multimodal in nature and, importantly, comparable across modalities. On the other hand, we examined whether this region has a true multimodal integration function. If sensory integration takes place, a non-linear gain in amygdala response for bimodal stimuli, relative to unimodal, should be observed. If no integration is accomplished locally, the response gain for bimodal stimulation should result from a simple summation of auditory and visual inputs.

Finally, our setup allowed for high anatomical resolution within the amygdala, with a distance of only 2 mm between neighboring contacts, and enabled us to test for any systematic macroscopic segregation of sensory modalities within the basolateral nuclei.

Table 1
Patient demographic and clinical data. For each patient, the table reports sex, handedness, age, age at onset of epilepsy (years), aetiology, seizure focus/resection, seizure type (frequency per month), drug treatment, education completed, STAI-state and STAI-trait percentiles, and PANAS percentiles (positive, negative).

2. Methods

2.1. Participants

Seven patients (4 females) with pharmacologically intractable epilepsy were prospectively recruited during their clinical assessment prior to resection surgery (see Table 1 for demographic and clinical details). From these patients, one amygdala was excluded due to a low number of artifact-free trials. Implantation sites were chosen by the neurosurgeon
Fig. 2. Summary of selected electrode contacts and their localization in the amygdala, displayed on coronal sections of the patients' T1 images. Selection was done according to sensory responsiveness (auditory contacts in blue, visual in red, contacts responding indistinctly to both unisensory modalities in orange, and yellow circles for audiovisual-responsive contacts). On the right, iERPs depicting the activity of individual contacts from different amygdalae, according to their stimulus responsiveness. Shaded areas indicate the standard error of the mean (s.e.m.). (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.)
solely on the basis of clinical criteria for identifying the epilepsy focus. The remaining amygdalae (N = 9), corresponding to 5 patients with right-lateralized implants and 2 patients with bilateral implants, were radiologically normal on pre-operative MRI (Fig. 2). All participants gave their informed consent in accord with protocols approved by the ethical committee of the University Hospital of Geneva (Switzerland). Six out of the 7 patients were right-handed and the remaining patient was left-handed. All patients had normal or corrected-to-normal vision. They had no history of head trauma or encephalitis, and had no epileptic seizures during our recordings. Two patients (Patient 03 and 05) had normal performance in memory, attention and executive functions, as assessed by neuropsychological tests in the hospital. Four patients showed mild (Patients 07, 08 and 09) or severe (Patient 04) working memory deficits (verbal or visual). One patient (Patient 04) showed severe difficulties in visuospatial attention tasks, and three showed executive function deficits (Patients 07, 08 and 09). Finally, neuropsychological assessment information was not available for Patient 01 (from whom we could only have access to postictal reports). These cognitive limitations did not prevent our patients from normally performing in our task (see further analyses below).
2.2. Stereotactic electrode implantation

A contrast-enhanced MRI was performed pre-operatively under stereotactic conditions to map vascular structures prior to electrode implantation and to calculate stereotactic coordinates for trajectories. Depth electrodes had eight stainless contacts each (electrode
diameter = 1.1 mm, intercontact spacing = 2 mm; AD-Tech) and were implanted stereotactically.

2.3. Electrode contact localization

Electrode contacts within the amygdala were first localized in the native structural brain space using the T1 MRI image obtained prior to surgery, which was then coregistered with a CT scan performed after implantation to visualize the electrodes' positions. For each electrode and each amygdala, contact localization was verified with the Brainstorm toolbox (Tadel et al., 2011) and custom-written scripts using Matlab (Mathworks, R2013b). The recording site positions of the selected electrode contacts in native space are shown for each patient in Fig. 2. We also located our electrode contacts on a high-resolution amygdala probabilistic atlas with identified sub-nuclei (Tyszka and Pauli, 2016; Fig. 3; Supp. Fig. 1). Contact localization in the normalized atlas space was performed using iELVis (http://ielvis.pbworks.com; Groppe et al., 2017; Megevand et al., 2017). Briefly, the post-implant CT of the brain was first co-registered with a high-resolution pre-implant MRI scan using FSL (https://fsl.fmrib.ox.ac.uk/fsl/fslwiki; Jenkinson et al., 2012) and FreeSurfer (https://surfer.nmr.mgh.harvard.edu/fswiki/FreeSurferWiki; Fischl, 2012). The MRI scan was segmented to identify cortical regions and subcortical structures using FreeSurfer. Electrodes were then identified manually on each high-resolution CT scan using BioImage Suite (http://bioimagesuite.yale.edu/; Papademetris et al., 2006). Electrode coordinates of individual patients were brought to the MNI152 space via an affine transformation using FreeSurfer. Finally, we marked the electrodes on the MNI152-registered
Fig. 3. Selected electrode contacts (auditory- and/or visual-responsive), displayed on coronal (top), sagittal (bottom left) and axial (bottom right) sections of a high-resolution probabilistic atlas of the amygdala (Tyszka and Pauli, 2016). Contacts overlapped with the probable location of the basolateral nuclei. Abbreviations: BL = basolateral nuclei, La = lateral, BM = basomedial, BLV = ventral basolateral, CMN = cortical and medial nuclei, ATA = amygdala transition areas, AAA = anterior amygdala area, AMY = amygdala (other).
version of the CIT168 template brain (Tyszka and Pauli, 2016). As can be seen in Fig. 3 and Supp. Fig. 1, all selected contacts were located within the basolateral (BL) complex of amygdala nuclei with a high probability (Tyszka and Pauli, 2016; see the Statistics and contact selection section). Contacts outside the basolateral nucleus were not considered for analysis.
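The final normalization step amounts to passing each contact coordinate through an affine transform. A minimal sketch is given below; the 4 x 4 matrix A (produced by the FreeSurfer/iELVis registration pipeline described above) and the example coordinate are placeholders, not values from the study.

```matlab
% Map a native-space contact coordinate (mm) to MNI152 space through an
% affine 4 x 4 transformation matrix A (placeholder from the registration).
native_xyz = [31.2, -4.5, -18.0];       % example contact position, native space
mni_hom    = (A * [native_xyz, 1]')';   % homogeneous coordinates through the affine
mni_xyz    = mni_hom(1:3);              % MNI152 coordinate of the contact
```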
duration was identical to that of faces (400 ms), with a 15-ms fade-in to avoid startle responses at their onset. There were 8 different voices per emotion, performed by the same actors across conditions. All vocal stimuli were delivered using an Audio File system (version 1.0; Cambridge Research Systems Limited, England) that allowed for compensating for delays in the auditory stimulus onset, and ensured proper synchronization between the sounds and the EEG triggers. Finally, audiovisual stimuli (AudioVis) were generated through the simultaneous presentation of one face and one voice, taken from the above stimuli, with face and voice having always congruent emotion and gender. We did not include incongruent pairs in order to keep the paradigm brief and simple for patients. All stimuli used in our study (auditory, visual, and combinations of both) were selected among a set of 200 faces and 80 voices after pilot ratings performed by a sample of 14 healthy students (5 males; mean 27 ± 3 yrs) recruited at the University of Geneva. On a scale from 1 to 100 (being 100 the strongest judgement), these participants were asked to determine how likely each stimulus conveyed a fearful, a happy, or a neutral expression, and to judge their perceived arousal levels (see Table 2). We then selected those stimuli with the highest valence judgement scores (i.e. the stimuli obtaining the highest ratings reflecting how likely a fearful face was perceived as fearful, and so on). Among them, we selected the stimuli showing the largest difference in arousal ratings between the emotional and the neutral conditions, while matching arousal levels of the two emotional conditions across the auditory, visual, and audiovisual modalities. Final stimuli were consistently judged across subjects in accordance with each emotion condition (i.e. Neu, Fear, or Happy; Table 2), with an overall mean likeliness rating of 80 ( ± 10) on a 0–100 scale (ANOVA for main effect of Emotion type with 3 factor levels, p = 0.26; main effect of Modality,
2.4. Task and procedure 2.4.1. Stimuli Visual stimuli (Vis) consisted of 24 black and white pictures of faces from the NimStim (Tottenham et al., 2009) and the Karolinska Directed Emotional Faces (Lundqvist et al., 1998) databases (Fig. 1), with either neutral (Neu), fearful (Fear), or happy (Happy) expression (i.e. 8 pictures per emotion displayed by the same actors -4 males, 4 femalesacross conditions). Faces were surrounded by a black frame, and the luminance of all images was standardized. Pictures were presented on a laptop screen (see below for stimulus delivery details), with a resolution of 291 × 374 pixels for all pictures, and presentation duration of 400 ms. Auditory stimuli (Aud) consisted of binaural excerpts of vocal utterances of non-verbal emotional expressions, with either neutral, fearful, or happy prosody (again Neu, Fear or Happy, respectively), taken from The Montreal Affective Voices (MAV; Belin et al., 2008) and the Geneva Multimodal Expression Corpus for Experimental Research on Emotion Perception (GEMEP; Banziger et al., 2012) databases (Fig. 1). Recordings corresponded to middle-aged actors (25–40 years; 4 males, 4 females), thus overall matching with the appearance of faces. Stimuli were equated for mean sound-pressure level (70 dB SPL) and average energy in all frequencies within the sound spectrum, thus resulting into equal perceived loudness across all conditions. Voice
Table 2
Mean stimulus ratings by healthy participants (1–100). For valence, scores determined the extent to which each stimulus was judged as being neutral, fearful and happy, respectively.

            VALENCE                                        AROUSAL
            Auditory      Visual        AudioVisual        Auditory     Visual       AudioVisual
Neutral     88.2 ± 7.9    78.8 ± 16.1   83.2 ± 15.1        17.3 ± 6.4   25.2 ± 8.9   23.6 ± 6.7
Fear        69.2 ± 9.6    81.1 ± 9.9    83.7 ± 4.1         70.1 ± 5.4   72.6 ± 5.3   73.8 ± 4.4
Happy       77.8 ± 4      80.8 ± 4.7    79.8 ± 3.6         69.3 ± 2.2   68.3 ± 5.3   65.9 ± 5.3
again with 3 factor levels, p = 0.36), except in the auditory condition where valence judgements were slightly lower for fear compared to other emotions (mean score 69/100; Emotion x Modality p = 0.013; see Table 2), an effect that could not be avoided regardless of the voices selected, and presumably reflected more stereotypical and recognizable emotion expressions from faces than from voices (see, for instance, Banziger et al., 2009). However, importantly, fearful and happy stimuli were rated higher on arousal levels than neutral stimuli across all three sensory modalities (Emotion effect for Aud: p < 0.001; Vis: p < 0.001; AudioVis: p < 0.001; t-tests for Fear vs. Neu and Happy vs. Neu, p < 0.001). In both the Aud and the Vis modalities, there were no differences in arousal ratings between the Fear and the Happy conditions (p > 0.1) but, interestingly, there were some differences for the AudioVis stimuli (p = 0.001; mean arousal ratings for Fear: 73.8 ± 4.4; for Happy: 65.85 ± 5.3). Nevertheless, despite this difference in arousal scores between the Fear and the Happy stimuli for the AudioVis modality, overall the arousal ratings for these two emotion conditions were similar across all three stimulus modalities (Modality effect for Fear: p = 0.17; for Happy: p = 0.19). Again, and supporting the comparability of arousal ratings across the three modalities for both the Fear and the Happy stimuli, a 2 × 3 ANOVA including only the Fear and Happy Emotion levels revealed no Modality effect (p = 0.76) nor Emotion × Modality interaction (p = 0.097).
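To make the two-step selection criterion explicit, a small illustrative sketch is given below; the rating matrices, variable names, and cut-offs are hypothetical, not the authors' actual selection script.

```matlab
% valence(i,j) / arousal(i,j): rating (1-100) of candidate stimulus i by
% pilot rater j, for one emotion category and one sensory modality.
meanValence = mean(valence, 2);
meanArousal = mean(arousal, 2);

% Step 1: keep the stimuli best recognized as their intended emotion.
[~, order] = sort(meanValence, 'descend');
shortlist  = order(1:20);                          % illustrative cut-off

% Step 2: among these, prefer the largest arousal difference from the
% matched neutral stimuli (meanArousalNeu is the neutral reference value).
arousalDiff = meanArousal(shortlist) - meanArousalNeu;
[~, best]   = sort(arousalDiff, 'descend');
selected    = shortlist(best(1:8));                % 8 stimuli per emotion and modality
```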
correction for multiple comparisons (all with the SPSS statistical package, version 22). 2.6.2. Signal pre-processing We used a bipolar montage in which each channel was re-referenced to its adjacent neighbor, so as to cancel signal influences of the external original reference (Cz), while avoiding signal spillover from remote sources on the local field potential (LFP; Lachaux et al., 2003). Physiologically, a bipolar recording corresponds to the fluctuations of the very local electrical activity relative to its immediate surrounding (Mercier et al., 2017; Lachaux et al., 2003), ensuring optimal localization of intra-amygdala activity (note however that subtle confounds derived from volume conduction of nearby sources cannot be fully abolished; Buzsaki et al., 2012). Signal pre-processing was done with custom-written scripts for the toolbox Fieldtrip (Oostenveld et al., 2011) and implemented with Matlab (Mathworks, R2013b). A notch filter (50, 100, 150 Hz) was applied. In order to avoid artefactual effects derived from filters, which may result into significant wave distortion (Acunzo et al., 2012; Maess et al., 2016; Tanner et al., 2015; Tanner et al., 2016), no further filters were implemented. Data were detrended, and epochs from -800 to 1000 ms were then extracted and baseline-corrected relative to the 200 ms prior to stimulus appearance. Data were then visually inspected trial by trial, and epochs with artifacts such as epileptic spikes were removed (23.3% on average). The average number of trials retained for subsequent analysis for each amygdala was 44.2 ( ± 7.6) for VisNeu, 43.1 ( ± 5.3) for VisFear, 43.8 ( ± 3.3) for VisHappy, 42.3( ± 6.7) for AudNeu, 40.8( ± 9.6) for AudFear, 41.9 ( ± 8.6) for AudHappy, 43.7 ( ± 8.4) for AudioVisNeu, 42.4 ( ± 9) for AudioVisFear, and 44.2 ( ± 8.6) for AudioVisHappy. Trials were not selected according to behavioral performance in order to keep as many trials as possible for analysis. In addition, we ensured that all amygdala contacts included in the analysis were stimulus-responsive (see Statistics and contact selection). The remaining trials were then averaged for every condition and for each contact within each amygdala. Finally, the signal was averaged for all selected contacts within each amygdala (see next) to compute the intracranial event-related potentials (iERPs) for each experimental condition. Signal was downsampled to 256 Hz to reduce computation time.
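For illustration, the preprocessing chain described above (bipolar re-referencing, notch filtering at 50/100/150 Hz, detrending, epoching with a 200-ms pre-stimulus baseline, artifact rejection, down-sampling, and condition-wise averaging) can be approximated with standard FieldTrip calls. This is a minimal sketch under assumed file names, trial definitions and condition codes (patient01_ieeg.TRC, trl, and VIS_FEAR are placeholders), not the authors' custom scripts, whose exact options may differ.

```matlab
cfg                = [];
cfg.dataset        = 'patient01_ieeg.TRC';   % hypothetical raw recording
cfg.trl            = trl;                    % -800 to 1000 ms epochs, from ft_definetrial
cfg.reref          = 'yes';
cfg.refmethod      = 'bipolar';              % each contact re-referenced to its neighbor
cfg.refchannel     = 'all';
cfg.dftfilter      = 'yes';                  % line-noise (notch) removal
cfg.dftfreq        = [50 100 150];
cfg.detrend        = 'yes';
cfg.demean         = 'yes';
cfg.baselinewindow = [-0.2 0];               % 200 ms before stimulus onset
data               = ft_preprocessing(cfg);

% Trial-by-trial visual inspection (e.g., epileptic spikes), then resampling
cfg        = [];
cfg.method = 'trial';
data_clean = ft_rejectvisual(cfg, data);

cfg            = [];
cfg.resamplefs = 256;
data_rs        = ft_resampledata(cfg, data_clean);

% Condition-wise averaging of the retained trials to obtain the iERPs
cfg        = [];
cfg.trials = find(data_rs.trialinfo(:,1) == VIS_FEAR);   % hypothetical condition code
erpVisFear = ft_timelockanalysis(cfg, data_rs);
```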
2.4.2. Task and procedure The experimental task was delivered with the E-Prime 2.1 software (Psychology Software Tools, Pittsburgh, PA) on a laptop connected to both the Audiofile system box and the iEEG system. Patients lay comfortably on their bed and wore in-ear headphones. A total of 504 trials were displayed (56 trials per condition, 9 conditions in total, resulting from the combination of 3 emotions x 3 modalities). In each trial, after a fixation cross was displayed for 1200 ± 300 ms, a face alone (Vis condition), a voice alone (Aud condition), or a simultaneous face and voice (AudioVis condition) was presented. Each stimulus had either a neutral (Neu condition), a fearful (Fear condition), or a happy (Happy condition) emotional expression. Faces were always presented at the center of the screen on a light gray background. At the end of the face display, the fixation cross disappeared until the end of the trial (see trial duration below), and then reappeared before the next trial started. During the Aud stimulation, the fixation cross also disappeared after the sound terminated, with only a gray screen being displayed, until the end of the next trial. Subjects were instructed to press a response button with their right hand (index, middle, or ring finger, counterbalanced across subjects), as fast and as accurately as possible, to categorize the emotion depicted by the stimulus. Responses were given on a USB response pad (Manhattan 176354). The trial ended when the response was given or after a maximum response-window of 2700 ms after stimulus-onset. Therefore, total trial duration was of a maximum of 3900 ± 300 ms.
2.6.3. Statistics and contact selection Statistics were performed using Fieldtrip and custom-written scripts implemented with Matlab. Analyses consisted of cluster-based permutation tests (Maris and Oostenveld, 2007), where the amplitude values at each time-point were permuted between conditions, and within each factor and amygdala. After 1000 permutations, ANOVAs (see next sections) were calculated for each of these time-points (window 0–800 ms post-stimulus onset), with a cluster threshold of p < 0.01 for the main effects and interactions. For each cluster, the ANOVA F-values of the tests were summed, and the largest sum among all clusters entered into the permutation distribution. We only retained time-clusters that reached significance over a window of at least 10 ms, after clustercorrection, and report their cumulative (cum) F-values over time. Note that, as the permutation distribution is data-driven and non-parametric, no degrees of freedom are given. Note also that, despite the fact that 2 out of 7 patients had bilateral amygdalae, our analyses included Amygdala as the sole random factor and the random factor Patient was not included (i.e. Amygdala nested within Patient). We chose this approach due to the low number of observations regarding bilateral samples in our dataset. In fact, a linear mixed model, based on single trial data, including Patient and Amygdala as nested random factors, proved to be no better in explaining our data than a model where only Amygdala was included (non-nested model: AIC= 28336, nested model: AIC = 28338; χ2 = 0; p = 1). To evaluate for any super- or subadditive effects of multisensory input on amygdala response, we compared the bimodal [AudioVis]
2.5. Data acquisition

Intracranial electroencephalogram (iEEG) was continuously recorded (Micromed Electronics Ltd., UK) with a sampling rate of 2048 Hz for all patients except Patient 01 (256 Hz). No online filters were applied, but the hardware bandpass filter was 0.17–470 Hz.

2.6. Data analysis

2.6.1. Behavior
Behavioral responses from one patient were not recorded for technical reasons. For all other patients, hit rate (HR) and hit mean response time (hit-RT) were computed for each condition. These data were then compared by means of a two-factor repeated-measures ANOVA using the factors Modality (Aud, Vis, AudioVis) and Emotion (Neu, Fear, Happy), as well as pairwise post hoc t-tests with Bonferroni
versus the sum of both unimodal [Aud + Vis] conditions with a 3 × 2 ANOVA, using the factors Emotion (Neu, Fear, Happy) and Modality (AudioVis, [Aud + Vis]). Then, we performed the same test but for each Emotion condition separately. For these analyses, we used clusterbased permutation one-way ANOVAs under the parameters explained above. This is similar to the approach previously employed for scalpERPs in multisensory studies (Cappe et al., 2010). Then, to further explore for non-linear enhancements or diminishments in emotion responses, we compared amygdala activity to Fear (versus Neu) for the AudioVis modality with the sum of its unisensory counterparts [(AudioVisFear vs. AudioVisNeu) vs. ((AudFear vs. AudNeu) + (VisFear vs. VisNeu))], again by means of a cluster-based permutation oneway ANOVA. Finally, to test how iEEG effects were linked to behavioral effects, we carried out Pearson correlations (1-tailed) between individual behavioral performance measures (accuracy rate or RTs on correct trials) and individual ERP amplitude values recorded from amygdala, extracted from those time-windows where effects of interest were observed. Additional exploratory analyses were also conducted to rule out effects of laterality, cognitive impairment, and proximity of the amygdalae to seizure focus. To do so, we again extracted mean ERP-amplitude values from time-windows of interest and submitted them to ANOVAs with the relevant control factors (see below). For all these analyses, we carefully selected electrode contacts overlapping with the probable anatomical location of the BL nuclei of the amygdala (Tyszka and Pauli, 2016; Bach et al., 2011; Bzdok et al., 2013; Eickhoff et al., 2007; see Table 3), given the key role of this subregion in the encoding of sensory input through its direct connections with both early and high-level associative sensory areas (Amaral, 1992; Bergstrom and Johnson, 2014; Bzdok et al., 2013). Moreover, activity in the BL amygdala is also known to represent stimulus-value (Bzdok et al., 2013; Maren, 2016; Mendez-Bertolo et al., 2016). Overall, these contacts represented ∼70% of all electrode sites located in the amygdala in our patients (31 contacts out of a total of 44). The remaining amygdala contacts were assigned to other subregions, including the centromedial and superficial nuclei (Bzdok et al., 2013), but these were much fewer and insufficient for detailed anatomical comparisons. The labelling of these subregions was carried out, first, by visual inspection, based on previous mapping work with MRI, particularly functional connectivity parcellation maps established from a meta-analysis of ∼6500 neuroimaging studies (Bzdok et al., 2013), and then confirmed with a standard probabilistic atlas (Tyszka and Pauli, 2016); see Electrode contact localization and corresponding figures: Fig. 3 and Supp. Fig. 1). Contact selection was also done according to sensory responsiveness by means of cluster-based permutation t-tests against zero for the neutral conditions (AudNeu, VisNeu, or AudioVisNeu) during the first 0–600 ms after stimulus-onset. We focused on neutral conditions alone to avoid emotion-related biases in contact selection, while ensuring that differential responses to emotion would not result from a lack of
response to the neutral condition. In total, sensory-responsive BL contacts represented ∼60% of all electrode sites located in the amygdala (27 contacts out of a total of 44). Contacts were labelled as “Auditory responsive” when responding to, at least, the Aud condition (18 out of 44), “Visual responsive” when responding to, at least, the Vis condition10 out of 44), and “AudioVisual responsive” when responding to, at least, the AudioVis condition (20 out of 44; Fig. 2, Supp. Fig. 1 and Table 3). In a first analysis, we focused on the Auditory and Visual responsive contacts only (22 contacts, regardless of the audiovisual response) in order to examine how their activity was modulated by emotion and bimodal inputs (see Table 3; e.g., contacts selected from right amygdala in Patient 04 were No. 2 –Auditory and No. 5 –Visual; see also Fig. 2 for illustration showing contact 5 for this patient). Conversely, in a further analysis, we selected the AudioVisual responsive contacts (regardless of unimodal activity) in order to explore any non-linear multisensory effects that could be present in neuronal populations with selective responses to bimodal stimulation (see Table 3; e.g., contacts selected from the same patient and amygdala were Nos. 5 and 6). Please note however that our recording contacts reflect population-based electrophysiological response, not single-cell selectivity, such that those classified as being “Auditory responsive” may in fact pool activity from auditory and visual-responsive neurons, and vice versa. 2.6.4. Time-frequency analysis To complement ERP analyses with a single-trial approach, stimulusrelated oscillatory responses were also analyzed using Fieldtrip. After signal preprocessing (see above), the extracted EEG epochs from -800 to 2000 ms of each single trial were decomposed into their frequency components, and then averaged for each condition and each amygdala. Time-frequency transforms were carried out with complex-valued Morlet wavelets. The wavelet was characterized by four cycles at the lowest frequencies (< 12 Hz) and seven at higher frequencies (12–128 Hz). The analysis window was thus centered on -800 to 2000 ms, sliding in steps of 5 ms. A baseline correction (−600 to −100 ms before stimulus-onset) was implemented by means of decibel (dB) conversion, and grand average of all amygdalae was calculated for each condition. Additionally, to obtain a closer estimation of higher frequency activity, we implemented an additional analysis using multitapers along frequencies from 30 to 128 Hz in steps of 2.5 Hz (timewindow: -1000 to 2000 ms in steps of 10 ms; sliding time-window: 400 ms; frequency smoothing: 10 Hz; 8 tapers; baseline correction: -1000 to -100 ms, in dB’s). Multitapers are particularly advantageous for high-frequency smoothing (Mitra and Pesaran, 1999). Statistical comparisons were all computed by means of cluster-based permutation tests (Maris and Oostenveld, 2007). To assess the main effects of emotion across modalities and, vice versa, the main effects of sensory modality across emotion categories, we computed ANOVAs with the same parameters as those described above. Here, oscillatory power values at each time-point were averaged for each frequency
Table 3
Basolateral contacts selected for our analyses, according to sensory responsiveness (A, auditory; V, visual; AV, audiovisual). Contact 1 was the most medially located and contact 7 the most lateral. Note that no contact 8 is provided, as the LFP signal was rereferenced to the next contiguous contact. Rows correspond to the nine recorded amygdalae (Patient 01 R, Patient 03 R, Patient 04 L, Patient 04 R, Patient 05 R, Patient 07 L, Patient 07 R, Patient 08 R, Patient 09 R) and columns to contacts 1–7; each cell lists the combination of responsiveness labels (e.g., A, V + AV, A + V + AV) for that contact.
range of interest and permuted between conditions, within each factor and each amygdala. Then, when appropriate, permutation-based paired t-tests (1-tailed) were calculated with Fieldtrip, where permutations were carried out at each time- and frequency point. Clusters of differences were then identified based on temporal and spectral adjacency, and were only retained when reaching significance (with an overall threshold of p < 0.05) after cluster-correction. All statistics were calculated over the time-windows 0–400 ms post-stimulus (encompassing the early period where main Emotion effects were observed in the ERPs) and 400–800 ms post-stimulus (around the period where the ERP Modality effects were significant). Contacts included for these analyses were the same as those used for ERPs. Frequency bands of interest included were theta (4–7 Hz) and alpha (8–12 Hz) rhythms, which are ubiquitous in basolateral amygdala (Pape and Driesang, 1998; Zheng et al., 2017) and have been shown to coordinate intralimbic (Rutishauser et al., 2010; Zheng et al., 2017) and cortico-limbic (Likhtik et al., 2014) communication during emotion processing. We also analyzed gamma (30–80 Hz) oscillations, which have been associated to emotion saliency detection in the amygdala (Oya et al., 2002; Sato et al., 2011) and various cognitive functions including multisensory processing (Mishra et al., 2007; Yuval-Greenberg and Deouell, 2007). Higher gamma (80–128 Hz) oscillations were also examined, as they provide a reliable index of local neuronal spiking (Buzsaki et al., 2012) and may play a role in emotion encoding (Zheng et al., 2017).
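A minimal FieldTrip sketch of the two time-frequency decompositions described in the Time-frequency analysis section (Morlet wavelets with 4 or 7 cycles, and a multitaper estimate for 30–128 Hz with a 400-ms window and 10-Hz smoothing), followed by the decibel baseline correction, is shown below. data_rs stands for the preprocessed single-trial data; this is an assumed approximation, not the authors' scripts.

```matlab
% Morlet wavelets: 4 cycles below 12 Hz, 7 cycles from 12 to 128 Hz
cfg         = [];
cfg.method  = 'wavelet';
cfg.output  = 'pow';
cfg.foi     = 2:1:128;
cfg.width   = 4 + 3*(cfg.foi >= 12);       % 4 or 7 cycles per frequency
cfg.toi     = -0.8:0.005:2.0;              % 5-ms steps
tfr_wavelet = ft_freqanalysis(cfg, data_rs);

% Multitaper (dpss) estimate for 30-128 Hz, 400-ms window, 10-Hz smoothing
cfg           = [];
cfg.method    = 'mtmconvol';
cfg.output    = 'pow';
cfg.taper     = 'dpss';
cfg.foi       = 30:2.5:128;
cfg.t_ftimwin = 0.4 * ones(size(cfg.foi)); % fixed 400-ms sliding window
cfg.tapsmofrq = 10  * ones(size(cfg.foi)); % +/- 10 Hz spectral smoothing
cfg.toi       = -1.0:0.01:2.0;             % 10-ms steps
tfr_mtm       = ft_freqanalysis(cfg, data_rs);

% Baseline correction expressed in decibels
cfg              = [];
cfg.baseline     = [-0.6 -0.1];
cfg.baselinetype = 'db';
tfr_wavelet_db   = ft_freqbaseline(cfg, tfr_wavelet);
```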
and Fear only or Neu and Happy only) and Modality (Aud, Vis, AudioVis), and pairwise comparisons by means of one-way ANOVAs, always using the same cluster-based permutation procedure. The Emotion effect was primarily driven by the difference of Fear vs. Neu, as revealed by a 3 × 2 ANOVA (i.e. excluding the Happy condition) showing a significant modulation around 136–195 ms (cumF = 138.92; pcluster = 0.01), and confirmed by a direct pairwise test comparing all Fear vs. all Neu conditions (141–191 ms; cum-F = 90.75; pcluster = 0.023; Fig. 4). No differential responses were observed for Happy versus Neu in any of these analyses, or for any sensory modality. In addition, one-way ANOVAs for each sensory modality taken separately demonstrated early differential amygdala responses to Fear versus Neu in all three cases, with significant time-clusters spanning from 156 to 176 ms for the Aud condition (cum-F = 42.41; pcluster = 0.027), 172–188 ms for the Vis condition (cum-F= 26.35; pcluster = 0.043), and 105–141 ms for the AudioVis condition (cumF = 52.69; pcluster = 0.03; Fig. 5), all within the early phase of the stimulus-driven response. Finally, given that emotion (Fear vs. Neu) responses became apparent at different latencies depending on the stimulus modality (i.e. shortest for the audiovisual modality, followed by auditory, and longest for the visual modality), we performed cluster-based permutation t-tests against zero (always following the same procedure as described in the Methods; Mendez-Bertolo et al., 2016) on each individual amygdala, and extracted the earliest latency value (in ms) at which the Fear condition would significantly differ from zero for each of the three sensory modalities. Then, we compared these latency values using a one-way ANOVA with the 3 Modality levels (Aud, Vis and AudioVis). This, however, revealed no significant differences between modalities (p > 0.1). Additionally, we performed cluster-based permutation oneway ANOVAs to directly compare, at each time point (successive intervals of 3.9 ms), the main effect of Emotion (i.e. [Fear vs. Neu] difference waves) among all three sensory modalities (Aud, Vis, and AudioVis). This then allowed us to further test for any latency differences in the emotion response across modalities (see parameters of analysis in the Methods). The first time-point with a significant cluster was used as an estimate of differential latency among the effects (Rutishauser et al., 2015; Wang et al., 2018). However, our results again revealed no latency differences for the Fear vs. Neu effect among the three sensory modalities ([Aud vs. AudioVis] [Vis vs. AudioVis], or [Aud vs. Vis], p > 0.09). Together, the results suggest a lack of consistent timing preference for emotion encoding in one sensory modality over the others.
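As an illustration of the cluster-based permutation logic used throughout these analyses, the sketch below shows a pairwise Fear versus Neutral contrast over the iERP time-courses using FieldTrip's Monte Carlo cluster statistics. The authors used custom permutation ANOVAs (cluster threshold p < 0.01, 1000 permutations); this built-in dependent-samples t version is an assumed approximation, with erpFear/erpNeu as hypothetical cell arrays holding one selected-contact average per amygdala.

```matlab
% Cluster-based permutation contrast (Fear vs. Neutral) across the nine
% amygdalae, clustering over time within 0-800 ms post-stimulus.
nAmy                 = numel(erpFear);
cfg                  = [];
cfg.method           = 'montecarlo';
cfg.statistic        = 'ft_statfun_depsamplesT';
cfg.correctm         = 'cluster';
cfg.clusteralpha     = 0.01;          % threshold for forming temporal clusters
cfg.numrandomization = 1000;
cfg.latency          = [0 0.8];
cfg.neighbours       = [];            % single (virtual) channel: cluster over time only
cfg.design           = [1:nAmy, 1:nAmy; ones(1, nAmy), 2*ones(1, nAmy)];
cfg.uvar             = 1;             % unit of observation (amygdala)
cfg.ivar             = 2;             % independent variable (condition)
stat = ft_timelockstatistics(cfg, erpFear{:}, erpNeu{:});
% stat.mask flags the time-points belonging to significant clusters
```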
3. Results 3.1. Behavior Overall, patients showed behavioral accuracy above 70% for emotion categorization in all stimulus conditions, with a mean hit rate of 85% (71.43% for Aud; 91.27% for Vis; 92.76% for AudioVis; see Supp. Fig. 2). No differences were seen between emotion categories or modality conditions (Emotion effect: p = 0.31; Modality effect: p = 0.16; Emotion x Modality: p = 0.14). No differences either were seen for RTs (on correct trials) across the different emotion or modality conditions (mean: 1192 ms after stimulus-onset; 1346 ms for Aud; 1135 ms for Vis; 1096 ms for AudioVis; Emotion effect: p = 0.16; Modality effect: p = 0.15; Emotion x Modality: p = 0.22; see Supp. Table 1). This suggests that patients were generally accurate in recognizing emotion expressions in both faces and voices, but also that there was no response facilitation by the bimodal (AudioVisual) stimulation in comparison with the unimodal, unlike results described in previous literature (e.g. Dolan et al., 2001; Ernst and Bulthoff, 2004; Kreifelts et al., 2007).
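The behavioral comparison described in the Methods (a 3 × 3 repeated-measures ANOVA on hit rates and response times) was run in SPSS; a MATLAB equivalent using fitrm/ranova is sketched below for illustration, with HR as a hypothetical patients-by-conditions matrix.

```matlab
% Repeated-measures ANOVA with within-subject factors Modality and Emotion.
conds    = {'AudNeu','AudFear','AudHappy','VisNeu','VisFear','VisHappy', ...
            'AVNeu','AVFear','AVHappy'};
tbl      = array2table(HR, 'VariableNames', conds);   % one row per patient
modality = categorical({'Aud';'Aud';'Aud';'Vis';'Vis';'Vis';'AV';'AV';'AV'});
emotion  = categorical({'Neu';'Fear';'Happy';'Neu';'Fear';'Happy';'Neu';'Fear';'Happy'});
within   = table(modality, emotion, 'VariableNames', {'Modality','Emotion'});
rm       = fitrm(tbl, 'AudNeu-AVHappy ~ 1', 'WithinDesign', within);
ranova(rm, 'WithinModel', 'Modality*Emotion')         % main effects and interaction
```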
3.2.3. Modality response The Modality effect was mainly driven by significant differences between the Aud and the AudioVis conditions at 426–457 ms (cumF = 42.44; pcluster = 0.047), and between the Vis and AudioVis conditions at 387–496 ms (cum-F = 180.085; pcluster = 0.012). As shown in Fig. 4 and Supplementary Fig. 3, the bimodal stimuli produced larger responses than both unimodal conditions, predominating in the late phase of the iERPs. However, when considering each emotion condition separately, pairwise tests showed no significant bimodal versus unimodal differences, showing that the effect was subtle. An emotion-specific modality effect was only observed for the Happy condition during an intermediate time-window that did not overlap with the main effect of Modality as found above, and only between the Vis and AudioVis stimuli (176–219 ms; cum-F = 65.7; pcluster = 0.035). In addition, only a very short (less than 1-ms, and therefore not considered) effect was observed at 469 ms for the AudioVisNeu versus VisNeu (pcluster = 0.039). Given that no interaction effects were observed during this time-window, we will not further interpret these single effects. Furthermore, to formally test whether the modality response arose later than the emotion response, we performed cluster-based permutation one-way ANOVAs to directly compare Emotion effects (i.e.
3.2. iERPs All the analyses described here were done by selecting the Visual and the Auditory responsive contacts, unless indicated otherwise. 3.2.1. Main effects in global 3 × 3 permutation-based ANOVA We first tested for the main effect of emotion across modalities and, vice versa, for the main effect of sensory modality across emotion categories. These comparisons revealed an early modulation of amygdala activity as a function of the Emotion conditions at 145–184 ms poststimulus onset (cum-F = 46.71; pcluster = 0.032; Fig. 4), whereas an effect of Modality was observed during a later time-window at 402–473 ms after onset (cum-F = 72.06; pcluster = 0.022; Fig. 4). No significant interaction of Modality x Emotion was observed. These effects were decomposed by further analyses comparing the different conditions. 3.2.2. Emotion response Subsequent analyses comprised 3 × 2 ANOVAs with the factors Emotion (Neu, Fear, Happy) and Modality (Aud and AudioVis only, or Vis and AudioVis only), 2 × 3 ANOVAs with the factors Emotion (Neu 16
Fig. 4. Main emotion (top left) and modality (bottom left) effects. Intracranial ERPs from nine amygdalae of seven patients, showing that emotion response overall preceded modality effects. Shaded areas indicate the standard error of the mean (s.e.m.). (right) Scatter plots and lines of best fit illustrating voltage values for the corresponding significant time-windows, (top right) after fearful and neutral stimuli across modalities (window 145–184 ms), versus patients' response time in the emotion categorization task, and (bottom right) after audiovisual stimuli across emotion conditions (window 402–473 ms) versus the mean of unimodal (Aud, Vis) stimuli. Amygdala response for fearful stimuli (and not for neutral) increased linearly as individual behavioral response slowed down. In turn, amygdala response gain for bimodal stimulation (relative to unimodal) increased linearly with similar gain in patients' accuracy. *p < 0.05; **p < 0.01.
Fig. 5. Emotion response. (left) Intracranial ERPs from nine amygdalae of seven patients (a total of 22 contacts) to neutral, fearful and happy stimuli for each sensory modality condition (voices, faces, voices and faces together). Shaded areas indicate the standard error of the mean (s.e.m.). Fear versus Neutral effects were observed across sensory modalities in similar early time-windows. No differential responses were shown for happy versus neutral stimuli. (right) Scatterplots depicting average amplitudes over significant time-points for the Fear and Neutral conditions from each individual amygdala. Right amygdalae are represented by circles, and left amygdalae by triangles.
[allFear vs. allNeu] difference wave) with Modality effects (i.e. [allAudioVis vs. allAud] and [allAudioVis vs. allVis] difference waves). All tests were carried out under the same parameters as those described in the Methods. Similarly as for our latency analysis of emotion, the time-point of the first significant cluster was used as an estimate of the differential latency among conditions (Rutishauser et al., 2015; Wang et al., 2018). This comparison yielded a significant difference between Emotion effects ([allFear vs. allNeu]) and Modality effects (i.e. [allAud
vs. allAudioVis], 383–582 ms, cum-F = 587.97; pcluster = 0.004; and [allVis vs. allAudioVis], 355–566 ms, cum-F = 422.8; pcluster = 0.02), confirming that stimulus modality modulated ERP amplitude later (i.e. from ∼360 ms post-stimulus) than emotion content. 3.2.4. ERP-behavior correlations Complementary analyses were performed to determine whether emotion and modality responses in ERPs were related to behavioral 17
absence of laterality effects must be considered exploratory only.1
indices of stimulus processing. On one hand, response times (from correct trials) were found to highly correlate with the amplitude of the Emotion effect in EEG (time-window 145–184 ms) on trials with fearful stimuli (allFear condition, r = −0.85; p = 0.004; see Fig. 4, top right). No such correlation with RTs was found for neutral stimuli (allNeu condition, r = 0.23; p = 0.3; Fig. 4, top right). This suggests that the larger the amygdala response was to fear at this latency, the slower participants responded to fearful stimuli. When testing the same correlation for each sensory modality separately, no effect reached significance (p > 0.09), suggesting that the overall effect was not unique to a single modality. On the other hand, we also examined how the observed gain in amygdala response for bimodal relative to unimodal stimuli related to behavioral performance. A similar crossmodal index was computed for individual ERPs and behavioral task accuracy, by subtracting the mean of unimodal trials (allAud and allVis) from the bimodal trials (allAudioVis). ERP amplitudes were extracted from the time-window showing a main effect of Modality (402–473 ms post-stimulus) in the previous analyses above. Our correlation analysis revealed a strong linear association between both measures (r = 0.71; p = 0.024), suggesting that the gain in amygdala response observed for bimodal stimulation (relative to unimodal) at ∼400 ms was paralleled by the same gain in accuracy (Fig. 4, bottom right). The same correlations for each emotion condition separately (Neu, Fear, Happy) showed similar patterns (all r > 0.5), but none were significant (all p > 0.05). Together, these correlations establish a direct link between amygdala response and behavioral performance, strengthening the overall validity of our results.
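The ERP-behavior correlation reported above can be expressed compactly as below; ampFear and rtFear are hypothetical per-amygdala vectors (mean iERP amplitude of Fear trials in the 145–184 ms window, and mean correct-trial response time), and the one-tailed test matches the directional (1-tailed Pearson) analysis described in the Methods.

```matlab
% One-tailed Pearson correlation between amygdala Fear-response amplitude
% (145-184 ms window) and response times on correct Fear trials.
tsel    = erpFear{1}.time >= 0.145 & erpFear{1}.time <= 0.184;
ampFear = cellfun(@(erp) mean(erp.avg(tsel)), erpFear)';   % one value per amygdala
[r, p]  = corr(ampFear, rtFear, 'type', 'Pearson', 'tail', 'left');  % expecting r < 0
```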
3.2.7. Cognitive impairment and proximity to seizure focus Because a few patients presented some cognitive impairment (see Methods), we also verified whether our main results might be influenced by deficits observed in working memory, executive functions, or visuospatial attention. We performed similar 3 × 3 ANOVAs as described above, but now including cognitive impairment (in memory, attention and executive functions, respectively) as between-amygdala factors (i.e. No-deficit, Mild deficit, or Marked deficit). Three different ANOVAs (for each of these cognitive domains) were conducted for the two critical ERP time-windows. None of these ANOVAs revealed significant effects of cognitive deficits or interactions with other factors. The same lack of effects was observed when examining cognitive impairment influences on behavioral task performance (HR or hit-RT). Together, this indicates that cognitive impairment did not impact our results. Furthermore, although all amygdalae were radiologically normal on pre-operative MRI and no seizure activity was recorded from our electrode sites, we applied a similar procedure to test whether effects of interest were influenced by proximity to seizure focus. The epileptogenic site was determined by clinical investigations and categorized as Near (ipsilateral temporal lobe) or Distant (elsewhere). Seizure focus was not localized for Patient 01 and therefore not included. None of these control ANOVAs revealed any significant effect of proximity to seizure focus. 3.2.8. Amygdala oscillations Finally, we performed time-frequency analyses to investigate stimulus-related oscillatory responses on each single trial, for the same amygdala contacts as those used for ERP analysis. Main effects of interest were tested as above. First, a 3 × 3 ANOVA revealed an early increase of amygdala activity within the alpha range (8–12 Hz) as a function of the Emotion conditions (266–328 ms post-stimulus; cumF = 56.61; pcluster = 0.045; Fig. 7a). Alpha oscillations are dominant rhythms in the amygdala (Pape and Driesang, 1998) and have been shown to play an important role in intralimbic communication during emotional processing (Zheng et al., 2017). An effect of Modality was observed in the same frequency range during a later time-window at 547–695 ms after stimulus-onset (cum-F = 147.21; pcluster = 0.018; Fig. 7b). No significant interaction of Modality x Emotion was observed. Again, the Emotion effect was primarily driven by a difference of Fear versus Neutral, as revealed by a 3 × 2 ANOVA (i.e. excluding the Happy condition) showing a significant modulation around 246–320 ms (cum-F = 86.92; pcluster = 0.039; paired test for all Fear versus all Neu: 211–359 ms post-stimulus; cum-T = 313.58; pcluster = 0.035; Fig. 7a). Thus, the fear effect appeared in a relatively similar time-window as that observed in the ERPs (∼200 ms). However, please note that latency information in the time-frequency domain depends on the frequency resolution. For effects in the alpha range (8–12 Hz; with a center frequency of 10 Hz), a time uncertainty of ± 200 ms might be expected for each time bin of data with a wavelet length of 4 cycles (see Methods). Increased responses to fear were also apparent in other frequencies, but non-significant. Similarly, alpha responses for Happy versus Neu stimuli showed weak but non-significant modulation (Fig. 7a). 
A comparison of Fear versus Neu performed for each sensory modality separately revealed significant effects in the alpha frequency range for the Aud condition (47–398 ms; cum-T = 595.77; pcluster = 0.01) but failed to reach significance in other conditions (p > 0.3). However, given that no Emotion × Modality interaction was observed in the main ANOVAs, there was no evidence that the Fear effect on alpha activity was specific for one sensory modality over the others.
3.2.5. Super-/subadditivity We then specifically tested whether the bimodal response was explained by a simple summation of the two simultaneous visual and auditory inputs or by non-linear effects instead. First, a 3 × 2 ANOVA comparing the bimodal versus the summed unimodal responses [AudioVis vs. (Aud + Vis)] revealed no main effect of Emotion or Modality, nor any interaction between both factors. Similarly, a oneway ANOVA for each Emotion condition separately revealed no significant differences between the two time-courses (Fig. 6). These results suggest that the enhanced bisensory response (relative to unimodal) was a purely additive effect (Aud + Vis). Similarly, analyses comparing the differential amygdala response to Fear (versus Neu) for the AudioVis modality versus the sum of its unisensory counterparts [(AudioVisFear vs. AudioVisNeu) vs. ((AudFear vs. AudNeu) + (VisFear vs. VisNeu))] yielded no significant increase or decrease, again indicating no super- or subadditivity effects in the bimodal condition relative to the pure sum of unimodal responses (Aud + Vis). Finally, to further explore non-linear effects, we also performed the same permutation-based analyses, but now considering only the AudioVisual responsive contacts. This analysis replicated results described above, with significant effects of Modality (403–493 ms; cumF = 85.81; pcluster = 0.037), but, critically, no differences between the AudioVis modality versus the sum of its unisensory counterparts. Moreover, no Emotion effects were observed for these contacts. 3.2.6. Laterality Given that we recorded 7 right amygdalae and 2 left amygdalae in our patients, we inspected our data to verify/rule out any major influence of laterality on the results (Dolan et al., 2001; Fruhholz et al., 2015; Klasen et al., 2011; Vuilleumier and Pourtois, 2007; Wang et al., 2017). We ran two 3 × 3 ANOVAs with Laterality as between-amygdala factor, after extracting the mean amplitude values of ERPs for the timewindow exhibiting significant main effects of Emotion and Modality, respectively. None of these ANOVAs revealed significant effects of Laterality, nor interactions of Laterality with any other factor (p ≥ 0.1), indicating no major hemisphere preferences in our observed amygdala response. However, due to the small number of observations, the
1 This analysis was added upon request of reviewers.
Fig. 6. Analysis of super-/subadditivity effects. Intracranial ERPs showing amygdala response to audiovisual stimuli compared to the sum of unimodal responses (Aud + Vis), both for all emotion conditions (top left) and for each emotion condition separately (right). Audiovisual responses were not significantly different from the sum of the unimodal responses (Aud + Vis). Shaded areas indicate the standard error of the mean (s.e.m.). (bottom left) Scatterplots depicting average amplitudes over significant timepoints in previous analyses (i.e. modality effects) for the subtraction [Audiovisual – (Aud + Vis)] from each individual amygdala, showing again no consistent differences between the sum of the unimodal responses and the audiovisual responses for this time-window. Right amygdalae are represented by circles, and left amygdalae by triangles.
In turn, the Modality effect was mainly driven by significant differences between the Aud and AudioVis conditions at 535–664 ms (cum-F = 145.34; pcluster = 0.016), and between the Vis and AudioVis conditions at 535–703 ms (cum-F = 268.22; pcluster = 0.009). As shown in Fig. 7b, the bimodal stimuli produced a decreased alpha response relative to both unimodal conditions, predominating in the late phase of the response. As with the ERPs, when considering each emotion condition separately, pairwise tests showed no significant bimodal versus unimodal differences. Then, to obtain a better estimate of higher-frequency activity, we implemented an additional analysis using multitapers (see Methods). A 3 × 2 ANOVA on these data (i.e., excluding the Happy condition, with similar parameters as above) revealed a significant increase in gamma (peaking around 60–70 Hz) for Fear versus Neutral stimuli, 309–340 ms after stimulus onset (Emotion effect: cum-F = 19.54; pcluster = 0.04; Fig. 7a). Again, note that our multitaper approach resulted in a latency uncertainty of ± 200 ms (for our fixed sliding window of 400 ms) and a frequency uncertainty of ± 10 Hz (for our 10 Hz smoothing; see Methods). No clear Modality effects were observed in this frequency range. Finally, no significant Modality × Emotion interaction was observed. Again, when considering each sensory modality separately, pairwise tests showed no significant Fear versus Neutral differences (p > 0.05). Most critically, no super-/subadditivity effects (differences between the AudioVis modality and the sum of its unisensory counterparts) were observed in any frequency range, at any time-point, or for any emotion condition (Fig. 7c). This again suggests that the enhanced bisensory response (relative to unisensory) observed in alpha activity was purely additive (Aud + Vis), as already observed in the ERPs. To further explore non-linear effects, we also performed the same tests but now considering only the AudioVisual-responsive contacts. This analysis replicated the results described above, with significant effects of Modality in the alpha range (531–629 ms; cum-F = 80.68; pcluster = 0.042) but, importantly, no difference between the AudioVis modality and the sum of its unisensory counterparts. Again, no Emotion effects were observed for these contacts.
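The cum-F, cum-T, and pcluster values reported throughout this section come from cluster-based permutation statistics (Maris and Oostenveld, 2007; see Methods). As a rough illustration of that logic only, the sketch below implements a toy paired, sign-flip version in the time domain; the threshold, variable names, and simulated data are assumptions, and the actual analyses used FieldTrip-style F-statistics across the three conditions.

```python
# Toy cluster-based permutation test across amygdalae (paired, sign-flip variant).
import numpy as np
from scipy import stats

def clusters_above(tvals, thresh):
    """Indices of contiguous time points where |t| exceeds the threshold."""
    out, current = [], []
    for i, above in enumerate(np.abs(tvals) > thresh):
        if above:
            current.append(i)
        elif current:
            out.append(current)
            current = []
    if current:
        out.append(current)
    return out

def cluster_perm_test(cond_a, cond_b, n_perm=1000, thresh=2.0, seed=0):
    """cond_a, cond_b: (n_amygdalae, n_times) arrays. Returns (indices, cum-T, p) per cluster."""
    rng = np.random.default_rng(seed)
    diff = cond_a - cond_b
    t_obs, _ = stats.ttest_1samp(diff, 0.0, axis=0)
    obs_clusters = clusters_above(t_obs, thresh)
    obs_sums = [np.abs(t_obs[c]).sum() for c in obs_clusters]

    # Null distribution of the maximum cluster mass under random sign flips
    max_null = np.zeros(n_perm)
    for p in range(n_perm):
        signs = rng.choice([-1.0, 1.0], size=(diff.shape[0], 1))
        t_perm, _ = stats.ttest_1samp(diff * signs, 0.0, axis=0)
        null_clusters = clusters_above(t_perm, thresh)
        max_null[p] = max((np.abs(t_perm[c]).sum() for c in null_clusters), default=0.0)

    p_vals = [float((max_null >= s).mean()) for s in obs_sums]
    return list(zip(obs_clusters, obs_sums, p_vals))

# Toy usage: 9 amygdalae x 500 time points, with a simulated "fear" effect at samples 100-180
rng = np.random.default_rng(1)
fear = rng.standard_normal((9, 500))
fear[:, 100:180] += 1.0
neutral = rng.standard_normal((9, 500))
for idx, cum_t, p in cluster_perm_test(fear, neutral):
    print(f"cluster {idx[0]}-{idx[-1]}: cum-T = {cum_t:.1f}, p_cluster = {p:.3f}")
```

The key design choice is that inference is performed on the summed statistic of contiguous suprathreshold samples, which controls for multiple comparisons across time without assuming where, or for how long, an effect occurs.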
4. Discussion
The present study aimed, for the first time, at examining and comparing the time-course of amygdala responses to affective stimuli presented in a unimodal or bimodal manner. Specifically, we used iEEG to record ERPs and oscillatory activity to auditory, visual, and audiovisual inputs, consisting of either voices, faces, or simultaneous presentations of voices and faces, respectively. This allowed us to determine, with millisecond resolution, whether the electrophysiological response to bisensory emotional stimuli (audiovisual) in the human amygdala reflected a true multimodal integration, characterized by a non-linear modulation (gain or facilitation) relative to their unimodal (auditory and visual) counterparts, as suggested by indirect hemodynamic measures in previous brain imaging studies (Klasen et al., 2011; Robins et al., 2009). Borrowed from standardized databases and selected after careful piloting, our face and voice stimuli had a robust emotional impact, as they were the most discriminable within each emotion category and among those showing the largest difference in arousal ratings compared to neutral categories. In addition, our emotional stimuli (fearful and happy) had similar arousal levels across the auditory, visual, and audiovisual modalities. This therefore ensured optimal comparability of local electrophysiological responses to emotional stimuli across conditions. Another important advantage of our clinical setup was the short inter-contact spacing (2 mm) along depth electrodes, allowing us to record from several contacts within the amygdala region. Overall, our iERP results revealed significant emotion effects in the amygdala that predominated for fearful versus neutral stimuli across all three sensory modalities. Even though these effects were observed in slightly different time-clusters depending on the sensory modality, our latency analyses revealed no significant time differences across the three modality conditions. Therefore, we conclude that these effects appeared in partially overlapping time-windows over early processing stages, spanning from 105 to 188 ms after stimulus onset. We note that the onset of visual-evoked fear responses in our study appears ∼30 ms later than that recently observed (Mendez-Bertolo et al., 2016), potentially reflecting the fact that 8 face identities were repeatedly presented, as opposed to the stimulus-unique presentations used in the prior study.
Fig. 7. a. (left) Time-frequency plots of amygdala activity after fearful (top left) and neutral (bottom left) stimuli across modalities. Effects of fearful versus neutral stimuli were observed in alpha (8–12 Hz, top right, wavelet analysis) and gamma (30–80 Hz, bottom right, multitaper analysis) activity in earlier time windows. b. Modality responses in alpha (wavelet analysis) for Audiovisual versus unimodal (Aud or Vis) stimuli across emotion conditions. Modality effects appeared later than emotion effects. c. Time-frequency plot of amygdala activity to audiovisual stimuli versus the sum of unimodal responses (Aud + Vis), across all emotion conditions. Audiovisual responses were not significantly different from the sum of the unimodal responses (Aud + Vis). Shaded areas indicate the standard error of the mean (s.e.m.).
The 105–188 ms time-window does, however, accord with a later effect of fear reported in Mendez-Bertolo et al. (2016), as well as with previous iEEG recordings from the amygdala in the visual modality (Pourtois et al., 2010; Krolak-Salmon et al., 2004; Meletti et al., 2012) and single-unit recordings in the human amygdala (∼120 ms post-stimulus; Zheng et al., 2017). Altogether, this suggests that similar activity patterns may exist for auditory affective processing in this region, despite the fact that emotion appraisal in vision and audition is subserved by different mechanisms (Pannese et al., 2015). In fact, important structural differences exist between the two sensory pathways, and the hierarchical levels of processing along vision and audition might not be directly comparable (Nelken, 2004). Nevertheless, our findings show that, at least within the amygdala, the time-course of auditory emotion processing may be similar to that of visual emotion processing. We also note that the relative ERP latencies of the amygdala response to emotion appeared earlier here than in a few other reports using single-unit recordings in humans (∼400 ms, Mormann et al., 2008; ∼650 ms post-onset, Wang et al., 2014). However, these results may be difficult to compare directly due to differences in the stimulus categories
used (human faces, but also landmarks, animals, or objects in Mormann et al., 2008, or randomly selected face parts presented in bubbles in Wang et al., 2014), as well as to differences in the (unspecified) subnuclei recorded (Mormann et al., 2008). Note also that, in these studies, emotion was either not studied (Mormann et al., 2008) or tested by comparing fearful and happy faces only (Wang et al., 2014). In the latter study, no neutral faces were used. Conversely, we did not find any differences in amygdala response between the fearful and happy emotion categories. Moreover, whereas firing-rate latencies in Mormann et al. (2008) were estimated at ∼400 ms with automated methods, the earliest reliable response at visual inspection occurred earlier (∼220 ms post-onset; Mormann et al., 2008). In addition, while single-unit action potentials reflect the suprathreshold activity of a local neuronal population, the ERP is mainly generated by summed postsynaptic potentials. Thus, the ERP might reflect both the effect of afferent inputs into the amygdala and subthreshold changes in neuronal membrane potential, both of which would be expected to precede neuronal firing in response to a sensory stimulus (Buzsaki et al., 2012). In sum, despite latency dissimilarities that may be due to several
methodological differences, our iERP results generally dovetail well with previous studies using similar approaches and showing relatively early amygdala responses to emotion. By unveiling clear, distinctive responses to fearful voices in the amygdala, our study provides novel support for the involvement of this brain structure in the processing of vocal emotions, a question previously debated in the literature (Adolphs and Tranel, 1999; Bruck et al., 2011; Klinge et al., 2010; Pourtois et al., 2005; Scott et al., 1997). Overall, even though emotion effects may seem less prominent in some modalities, the scatterplots in Fig. 5 show that a differential effect of fear (versus neutral) was generally consistent across patients and modality conditions (with only a few exceptions, e.g., Patients 03 and 08 for the auditory and visual/audiovisual modalities, respectively, an effect that could be due to noise or to the more lateral location of electrode contacts in these patients, i.e., near white matter). Moreover, ERP responses at the relevant latency (145–184 ms) for fearful stimuli (and not for neutral) were strongly correlated with individual response times. Thus, the larger the amygdala response to fear, the slower participants responded. This may accord with a slowing of motor responses towards fearful stimuli related to freezing-like behavior (e.g. LeDoux, 2000; Sagaspe et al., 2011), or with attentional capture and deeper processing for threatening information (Fox et al., 2002; Vuilleumier and Huang, 2009), both thought to be mediated by the amygdala (Armony et al., 1997). Importantly, this correlation highlights a direct link between emotion processing in the amygdala and behavioral performance, strengthening the overall validity of the results. In contrast, in our study, amygdala responses to happy stimuli were generally not different from those to neutral stimuli across all sensory modalities. Previous single-cell recording and brain imaging studies have observed significant amygdala responses to reward as well as to pleasant emotions or happy faces (Beyeler et al., 2016; Kim et al., 2016; Maren, 2016; Wang et al., 2014; Zhang et al., 2013), or even to pleasant music excerpts (Koelsch et al., 2008; Mueller et al., 2011), but there is evidence in several prior studies for greater activations to fearful compared to happy (Dolan et al., 2001; Mendez-Bertolo et al., 2016; Morris et al., 1998; Morris et al., 1996) or mildly happy faces (Phillips et al., 1998). This suggests that amygdala responses might be biased in favor of negative valence information, and thus show less reliable or consistent activity for positive stimuli, in line with a central role of this structure in threat detection and learning (Phelps and LeDoux, 2005). Note, however, that this statement should be taken with caution: given that only three emotion categories were tested (neutral, fearful, and happy), it cannot be reliably concluded that the amygdala shows emotion selectivity (or emotion semantic coding) for all negative over positive emotions, or for fear over other negative emotions. Importantly, amygdala responses to audiovisual stimulation differed from those to unimodal auditory or visual input at a later stage than the emotion effect (from 387 to 496 ms post-onset). Our direct comparisons between emotion and modality effects confirmed that the former preceded the latter. However, the modality effect was not specific to any emotion condition.
This indicates that there was a general gain in amygdala response to bisensory stimulation, relative to unimodal stimulation, regardless of the emotional content of the stimulus. Critically, however, we did not find evidence, either globally or for any specific emotion category, in favor of a super- or subadditive modulation to faces and voices presented together, despite their congruent gender and emotion expression. Moreover, when comparing the emotion effects (e.g., fearful vs. neutral) on iERPs to audiovisual stimuli with the sum of the unisensory emotion responses (e.g., the fear effect for voices plus the fear effect for faces), we observed no significant difference across the whole time-window examined. These data indicate that the gain in ERP amplitude for the bimodal stimuli appears to be purely additive, resulting from the summation of auditory and visual inputs. Finally, when we isolated contacts that were particularly responsive to audiovisual stimulation, we again confirmed an absence of non-linear effects. Altogether, such a lack of super- or subadditive responses is inconsistent with multimodal
integration (Meredith et al., 1987; Meredith and Stein, 1983; Wallace et al., 1998), considering that this criterion should be met with techniques that reflect activity from large populations of neurons, as is the case with iEEG here or with PET and fMRI in previous studies (Calvert, 2001; Calvert et al., 2001; Holmes and Spence, 2005; Kayser et al., 2012; Sarko et al., 2012; see Laurienti et al., 2005). Amygdala responses to emotion and modality were also reflected by differences in alpha and gamma activity, with temporal patterns similar to those observed in the iERPs. At an early stage, fearful stimuli produced higher alpha power around 211–359 ms post-stimulus, relative to neutral stimuli. A later gamma activity increase for fearful stimuli, relative to neutral, also appeared around 308–339 ms post-stimulus. Again, neither alpha nor gamma changes in response to happy stimuli reached significance (despite an apparent trend), suggesting that the encoding of positive value in the amygdala may be less consistent than that of negative value. Bimodal effects on alpha oscillations occurred later, around 535–703 ms post-stimulus, with a marked reduction of power after audiovisual stimuli compared to unimodal visual or auditory inputs alone. Low-frequency oscillations, including alpha rhythms, are frequent and characteristic of basolateral amygdala neurons (Pape and Driesang, 1998; Zheng et al., 2017), possibly mediating amygdalo-hippocampal communication during the detection of emotionally salient information (Zheng et al., 2017). Further research should elucidate the precise role of alpha increase/decrease dynamics in the amygdala. In turn, gamma activity in the amygdala has been associated with emotion saliency detection (Oya et al., 2002; Sato et al., 2011). More critically, here again, we found no super- or subadditivity effects modulating oscillatory power in any of the frequency ranges tested. A number of factors may account for the lack of super-/subadditive bimodal effects in our results. First and foremost, these findings suggest that the amygdala may not have a true multisensory integration function. This appears to conflict with previous studies using non-invasive functional imaging techniques in humans, where it was suggested that this region may integrate bisensory emotional cues (Chen et al., 2010; Dolan et al., 2001; Klasen et al., 2011; Muller et al., 2011; Robins et al., 2009). However, among these studies, only a few reported or explicitly tested for non-linear effects in the BOLD signal (Klasen et al., 2011; Robins et al., 2009). In addition, a lack of multisensory properties in this region would actually accord with data from monkeys showing, for instance, that amygdala lesions do not impair crossmodal integration of some sensory modalities (tactual-visual) at the behavioral level (Goulet and Murray, 2001). Hence, the amygdala itself may not be critical for multisensory integration (Goulet and Murray, 2001). Moreover, previous imaging studies of multisensory integration in emotion processing have generally tested for effects of congruency (i.e., faces and voices presented together with either congruent/same or incongruent/different emotional expressions), whereas our paradigm was independent of any potential conflict between modalities (all our audiovisual stimuli were congruent in emotion expression and gender).
Therefore, the current results may not necessarily contradict those obtained by manipulating emotion incongruency, but rather suggest that the previously reported effects might at least partly reflect other aspects, such as the differential saliency or uncertainty levels associated with the congruency or incongruency of inputs, rather than genuine integration of multisensory inputs. Another possibility is that super- or subadditivity was not reliably reflected in our EEG recordings. There is no single perfect neurophysiological index of multisensory integration. In fact, even in single-unit studies, nonlinear effects can be absent despite multisensory convergence, and vice versa (Holmes and Spence, 2005; Skaliora et al., 2004). In addition, given that only a minority of neurons demonstrate superadditivity in single-cell recording studies (see Holmes and Spence, 2005; Laurienti et al., 2005), and that iEEG reflects activity from a larger neuronal ensemble, it remains possible that very restricted or localized bimodal neuron populations might be missed by our electrode
contacts. Notwithstanding, we underscore that we used a high-resolution electrode distribution with multiple close contacts (< 2 mm) located throughout each of the recorded amygdalae, and performed bipolar re-referencing, providing higher anatomical precision than numerous other iEEG studies. Moreover, our results were overall consistent across several contacts and patients. Another related possibility for the lack of non-linear multimodal effects is that our recordings were performed outside amygdala subregions that would be selectively involved in multisensory integration. For instance, the integration of visual and auditory information might also take place in the central region, through a convergence of its connections from other amygdala nuclei (Kuraoka and Nakamura, 2007), despite the fact that the lateral region is the main subregion receiving direct inputs from both associative cortical areas (Amaral et al., 1992) and primary sensory areas in the visual and auditory cortices (McDonald, 1998; Turner et al., 1980; Yukie, 2002). To directly test this possibility, we performed an additional analysis on selected contacts that showed stimulus-locked responses and overlapped with the centromedial amygdala (details not shown). Among 4 contacts clearly placed in the latter sector, one responded indistinctly to both auditory and visual stimuli, and the other three were mainly auditory. However, permutation tests did not reveal any super-/subadditivity effects for the bimodal stimuli in any of these contacts. Finally, it is also possible that our audiovisual stimuli did not allow for an effective integration of both sensory modalities. Multisensory integration is believed to take place when stimuli in different senses occur at approximately the same location and the same time, and it is stronger when at least one of the two stimuli is weak or produces small neuronal activity by itself (Holmes and Spence, 2005). We may have failed to fulfill the first of these requirements given the artificial experimental procedure, in hospital settings, of pairing static faces displayed on a screen with voices delivered through headphones. In this regard, an optimal procedure to ensure highly synchronized bimodal stimulation may be the use of dynamic faces (Klasen et al., 2011) or naturalistic videos (Senkowski et al., 2007). The second requirement might also not have been met, given that all expressions were highly recognizable and intense (see Methods). In addition, behaviorally, we found no overall difference in emotion categorization accuracy or response time for the audiovisual versus the unimodal stimulation, unlike the performance gain often observed with multisensory stimuli (Dolan et al., 2001; Ernst and Bulthoff, 2004; Kreifelts et al., 2007). Thus, it remains possible that, in our study, concomitant faces and voices were not perceived as a whole object to be fully integrated. Arguing against this, we underscore that our procedure and stimuli were similar to those used in several prior studies that reported behavioral facilitation and multisensory integration effects in amygdala activity, measured with non-invasive brain imaging. Therefore, even if weak, super- or subadditive effects should still be observed with our stimuli if multisensory integration occurred locally within the amygdala.
Importantly, however, despite this lack of overall behavioral effects, we found that, in the critical time-window ∼400 ms post-onset, the amygdala response gain for bimodal stimuli (relative to unimodal) was highly correlated with a similar bimodal gain in behavioral accuracy (Fig. 4). This would be consistent with sensory integration, which could also take place in other multimodal associative regions outside the amygdala (e.g., superior temporal sulcus; Muller et al., 2012). In any case, future research should further tackle these questions, and explore multimodal emotion processing with different stimuli in several other brain regions. Here, given the scope of our study and the constraints of implantation determined by independent clinicians, we only focused on amygdala activity and no other region was examined. In conclusion, our novel results reveal, on one hand, early emotional responses to fearful face expressions in the human amygdala that, importantly, partially overlap in time (∼200 ms) with those to fearful expressions from the auditory and the audiovisual modalities. On the other hand, we show an apparent gain of amygdala activation to
audiovisual input (relative to unimodal), arising at a later stage (∼400 ms), which did not exhibit any evidence of non-linear additive effects. These findings highlight a potential segregation of visual and auditory emotional input within the basolateral amygdala, rather than genuine multisensory integration. Taken together, our results provide novel insights on emotion processing in the human amygdala across the sensory pathways, and on how sensory stimulation converges within the limbic system.
Acknowledgements
This work was supported by the Swiss National Science Foundation (JDB: Ambizione PZ00P3-148112; MS: 163398; SF: PP00P1_157409/1; PM: 167836), the National Center of Competence in Research (NCCR) for the Affective Sciences at the University of Geneva, and the Private Foundation of the University Hospital of Geneva (CONFIRM RC1-23). BAS was supported by the Spanish Ministry of Economy and Competitiveness (SAF2015-65982-R).
Appendix A. Supplementary data
Supplementary data to this article can be found online at https://doi.org/10.1016/j.neuropsychologia.2019.05.027.
References
Acunzo, D.J., Mackenzie, G., van Rossum, M.C., 2012. Systematic biases in early ERP and ERF components as a result of high-pass filtering. J. Neurosci. Methods 209, 212–218. Adolphs, R., 2002. Neural systems for recognizing emotion. Curr. Opin. Neurobiol. 12, 169–177. Adolphs, R., Tranel, D., 1999. Intact recognition of emotional prosody following amygdala damage. Neuropsychologia 37, 1285–1292. Amaral, D.G., Price, J.L., Pitkänen, A., Carmichael, S.T., 1992. Anatomical organization of the primate amygdaloid complex. In: Aggleton, J.P. (Ed.), The Amygdala: Neurobiological Aspects of Emotion, Memory, and Mental Dysfunction. Wiley–Liss, New York, pp. 1–66. Armony, J.L., Servan-Schreiber, D., Cohen, J.D., Ledoux, J.E., 1997. Computational modeling of emotion: explorations through the anatomy and physiology of fear conditioning. Trends Cognit. Sci. 1, 28–34. Arnal, L.H., Flinker, A., Kleinschmidt, A., Giraud, A.L., Poeppel, D., 2015. Human screams occupy a privileged niche in the communication soundscape. Curr. Biol. 25, 2051–2056. Arnal, L.H., Morillon, B., Kell, C.A., Giraud, A.L., 2009. Dual neural routing of visual facilitation in speech processing. J. Neurosci. 29, 13445–13453. Arnal, L.H., Wyart, V., Giraud, A.L., 2011. Transitions in neural oscillations reflect prediction errors generated in audiovisual speech. Nat. Neurosci. 14, 797–801. Aube, W., Angulo-Perkins, A., Peretz, I., Concha, L., Armony, J.L., 2015. Fear across the senses: brain responses to music, vocalizations and facial expressions. Soc. Cognit. Affect Neurosci. 10, 399–407. Bach, D.R., Behrens, T.E., Garrido, L., Weiskopf, N., Dolan, R.J., 2011. Deep and superficial amygdala nuclei projections revealed in vivo by probabilistic tractography. J. Neurosci. 31, 618–623. Banziger, T., Grandjean, D., Scherer, K.R., 2009. Emotion recognition from expressions in face, voice, and body: the Multimodal Emotion Recognition Test (MERT). Emotion 9, 691–704. Banziger, T., Mortillaro, M., Scherer, K.R., 2012. Introducing the Geneva Multimodal expression corpus for experimental research on emotion perception. Emotion 12, 1161–1179. Belin, P., Fillion-Bilodeau, S., Gosselin, F., 2008. The Montreal Affective Voices: a validated set of nonverbal affect bursts for research on auditory affective processing. Behav. Res. Methods 40, 531–539. Bergstrom, H.C., Johnson, L.R., 2014. An organization of visual and auditory fear conditioning in the lateral amygdala. Neurobiol.
Learn. Mem. 116, 1–13. Beyeler, A., Namburi, P., Glober, G.F., Simonnet, C., Calhoon, G.G., Conyers, G.F., Luck, R., Wildes, C.P., Tye, K.M., 2016. Divergent routing of positive and negative information from the amygdala during memory retrieval. Neuron 90, 348–361. Breiter, H.C., Etcoff, N.L., Whalen, P.J., Kennedy, W.A., Rauch, S.L., Buckner, R.L., Strauss, M.M., Hyman, S.E., Rosen, B.R., 1996. Response and habituation of the human amygdala during visual processing of facial expression. Neuron 17, 875–887. Bruck, C., Kreifelts, B., Wildgruber, D., 2011. Emotional voices in context: a neurobiological model of multimodal affective information processing. Phys. Life Rev. 8, 383–403. Buchanan, T.W., Lutz, K., Mirzazade, S., Specht, K., Shah, N.J., Zilles, K., Jancke, L., 2000. Recognition of emotional prosody and verbal components of spoken language: an fMRI study. Brain Res. Cogn. Brain Res. 9, 227–238. Buzsaki, G., Anastassiou, C.A., Koch, C., 2012. The origin of extracellular fields and currents–EEG, ECoG, LFP and spikes. Nat. Rev. Neurosci. 13, 407–420. Bzdok, D., Laird, A.R., Zilles, K., Fox, P.T., Eickhoff, S.B., 2013. An investigation of the
structural, connectional, and functional subspecialization in the human amygdala. Hum. Brain Mapp. 34 (12), 3247–3266. Calvert, G.A., 2001. Crossmodal processing in the human brain: insights from functional neuroimaging studies. Cerebr. Cortex 11, 1110–1123. Calvert, G.A., Hansen, P.C., Iversen, S.D., Brammer, M.J., 2001. Detection of audio-visual integration sites in humans by application of electrophysiological criteria to the BOLD effect. Neuroimage 14, 427–438. Cappe, C., Thut, G., Romei, V., Murray, M.M., 2010. Auditory-visual multisensory interactions in humans: timing, topography, directionality, and sources. J. Neurosci. 30, 12572–12580. Chen, Y.H., Edgar, J.C., Holroyd, T., Dammers, J., Thonnessen, H., Roberts, T.P., Mathiak, K., 2010. Neuromagnetic oscillations to emotional faces and prosody. Eur. J. Neurosci. 31, 1818–1827. Costafreda, S.G., Brammer, M.J., David, A.S., Fu, C.H., 2008. Predictors of amygdala activation during the processing of emotional stimuli: a meta-analysis of 385 PET and fMRI studies. Brain Res. Rev. 58, 57–70. Dolan, R.J., Morris, J.S., de Gelder, B., 2001. Crossmodal binding of fear in voice and face. Proc. Natl. Acad. Sci. U. S. A. 98, 10006–10010. Eickhoff, S.B., Paus, T., Caspers, S., Grosbras, M.H., Evans, A.C., Zilles, K., Amunts, K., 2007. Assignment of functional activations to probabilistic cytoarchitectonic areas revisited. Neuroimage 36, 511–521. Ernst, M.O., Bulthoff, H.H., 2004. Merging the senses into a robust percept. Trends Cognit. Sci. 8, 162–169. Fast, C.D., McGann, J.P., 2017. Amygdalar gating of early sensory processing through interactions with Locus Coeruleus. J. Neurosci. 37, 3085–3101. Fecteau, S., Belin, P., Joanette, Y., Armony, J.L., 2007. Amygdala responses to nonlinguistic emotional vocalizations. Neuroimage 36, 480–487. Fischl, B., 2012. FreeSurfer. Neuroimage 62, 774–781. Fox, E., Russo, R., Dutton, K., 2002. Attentional bias for threat: evidence for delayed disengagement from emotional faces. Cognit. Emot. 16, 355–379. Freese, J.L., Amaral, D.G., 2005. The organization of projections from the amygdala to visual cortical areas TE and V1 in the macaque monkey. J. Comp. Neurol. 486, 295–317. Freese, J.L., Amaral, D.G., 2006. Synaptic organization of projections from the amygdala to visual cortical areas TE and V1 in the macaque monkey. J. Comp. Neurol. 496, 655–667. Fruhholz, S., Grandjean, D., 2013. Amygdala subregions differentially respond and rapidly adapt to threatening voices. Cortex 49 (5), 1394–1403. Fruhholz, S., Hofstetter, C., Cristinzio, C., Saj, A., Seeck, M., Vuilleumier, P., Grandjean, D., 2015. Asymmetrical effects of unilateral right or left amygdala damage on auditory cortical processing of vocal emotions. Proc. Natl. Acad. Sci. U. S. A. 112, 1583–1588. Glascher, J., Tuscher, O., Weiller, C., Buchel, C., 2004. Elevated responses to constant facial emotions in different faces in the human amygdala: an fMRI study of facial identity and expression. BMC Neurosci. 5, 45. Goulet, S., Murray, E.A., 2001. Neural substrates of crossmodal association memory in monkeys: the amygdala versus the anterior rhinal cortex. Behav. Neurosci. 115, 271–284. Grandjean, D., Sander, D., Pourtois, G., Schwartz, S., Seghier, M.L., Scherer, K.R., Vuilleumier, P., 2005. The voices of wrath: brain responses to angry prosody in meaningless speech. Nat. Neurosci. 8, 145–146. Groppe, D.M., Bickel, S., Dykstra, A.R., Wang, X., Megevand, P., Mercier, M.R., Lado, F.A., Mehta, A.D., Honey, C.J., 2017.
iELVis: an open source MATLAB toolbox for localizing and visualizing human intracranial electrode data. J. Neurosci. Methods 281, 40–48. Gschwind, M., Pourtois, G., Schwartz, S., Van De Ville, D., Vuilleumier, P., 2012. Whitematter connectivity between face-responsive regions in the human brain. Cerebr. Cortex 22, 1564–1576. Guillory, S.A., Bujarski, K.A., 2014. Exploring emotions using invasive methods: review of 60 years of human intracranial electrophysiology. Soc. Cognit. Affect Neurosci. 9, 1880–1889. Hamann, S., 2001. Cognitive and neural mechanisms of emotional memory. Trends Cognit. Sci. 5, 394–400. Hariri, A.R., Tessitore, A., Mattay, V.S., Fera, F., Weinberger, D.R., 2002. The amygdala response to emotional stimuli: a comparison of faces and scenes. Neuroimage 17, 317–323. Holmes, N.P., Spence, C., 2005. Multisensory integration: space, time and superadditivity. Curr. Biol. 15, R762–R764. Jenkinson, M., Beckmann, C.F., Behrens, T.E., Woolrich, M.W., Smith, S.M., 2012. Fsl. Neuroimage 62, 782–790. Kayser, C., Petkov, C.I., Augath, M., Logothetis, N.K., 2007. Functional imaging reveals visual modulation of specific fields in auditory cortex. J. Neurosci. 27, 1824–1835. Kayser, C., Petkov, C.I., Remedios, R., Logothetis, N.K., 2012. Multisensory influences on auditory processing: Perspectives from fMRI and electrophysiology. In: Murray, M.M., Wallace, M.T. (Eds.), The Neural Bases of Multisensory Processes. Boca Raton (FL). Kim, J., Pignatelli, M., Xu, S., Itohara, S., Tonegawa, S., 2016. Antagonistic negative and positive neurons of the basolateral amygdala. Nat. Neurosci. 19, 1636–1646. Klasen, M., Kenworthy, C.A., Mathiak, K.A., Kircher, T.T., Mathiak, K., 2011. Supramodal representation of emotions. J. Neurosci. 31, 13635–13643. Klinge, C., Roder, B., Buchel, C., 2010. Increased amygdala activation to emotional auditory stimuli in the blind. Brain 133, 1729–1736. Koelsch, S., Fritz, T., Schlaug, G., 2008. Amygdala activity can be modulated by unexpected chord functions during music listening. Neuroreport 19, 1815–1819. Kreifelts, B., Ethofer, T., Grodd, W., Erb, M., Wildgruber, D., 2007. Audiovisual integration of emotional signals in voice and face: an event-related fMRI study. Neuroimage
37, 1445–1456. Krolak-Salmon, P., Henaff, M.A., Vighetto, A., Bertrand, O., Mauguiere, F., 2004. Early amygdala reaction to fear spreading in occipital, temporal, and frontal cortex: a depth electrode ERP study in human. Neuron 42, 665–676. Kuraoka, K., Nakamura, K., 2007. Responses of single neurons in monkey amygdala to facial and vocal emotions. J. Neurophysiol. 97, 1379–1387. Lachaux, J.P., Rudrauf, D., Kahane, P., 2003. Intracranial EEG and human brain mapping. J. Physiol. Paris 97, 613–628. Laurienti, P.J., Perrault, T.J., Stanford, T.R., Wallace, M.T., Stein, B.E., 2005. On the use of superadditivity as a metric for characterizing multisensory integration in functional neuroimaging studies. Exp. Brain Res. 166, 289–297. LeDoux, J., 1998. Fear and the brain: where have we been, and where are we going? Biol. Psychiatry 44, 1229–1238. LeDoux, J.E., 2000. Emotion circuits in the brain. Annu. Rev. Neurosci. 23, 155–184. Likhtik, E., Stujenske, J.M., Topiwala, M.A., Harris, A.Z., Gordon, J.A., 2014. Prefrontal entrainment of amygdala activity signals safety in learned fear and innate anxiety. Nat. Neurosci. 17, 106–113. Lundqvist, D., Flykt, A., Öhman, A., 1998. Karolinska Directed Emotional Faces Set in. Department of Neurosciences, Karolinska Hospital, Stockholm, Sweden. Maess, B., Schroger, E., Widmann, A., 2016. High-pass filters and baseline correction in M/EEG analysis-continued discussion. J. Neurosci. Methods 266, 171–172. Maren, S., 2016. Parsing reward and aversion in the amygdala. Neuron 90, 209–211. Maris, E., Oostenveld, R., 2007. Nonparametric statistical testing of EEG- and MEG-data. J. Neurosci. Methods 164, 177–190. McDonald, A.J., 1998. Cortical pathways to the mammalian amygdala. Prog. Neurobiol. 55, 257–332. McGaugh, J.L., 2002. Memory consolidation and the amygdala: a systems perspective. Trends Neurosci. 25, 456. Megevand, P., Groppe, D.M., Bickel, S., Mercier, M.R., Goldfinger, M.S., Keller, C.J., Entz, L., Mehta, A.D., 2017. The Hippocampus and amygdala are integrators of neocortical influence: a CorticoCortical evoked potential study. Brain Connect. 7, 648–660. Meletti, S., Cantalupo, G., Benuzzi, F., Mai, R., Tassi, L., Gasparini, E., Tassinari, C.A., Nichelli, P., 2012. Fear and happiness in the eyes: an intra-cerebral event-related potential study from the human amygdala. Neuropsychologia 50, 44–54. Mendez-Bertolo, C., Moratti, S., Toledano, R., Lopez-Sosa, F., Martinez-Alvarez, R., Mah, Y.H., Vuilleumier, P., Gil-Nagel, A., Strange, B.A., 2016. A fast pathway for fear in human amygdala. Nat. Neurosci. 19, 1041–1049. Mercier, M.R., Bickel, S., Megevand, P., Groppe, D.M., Schroeder, C.E., Mehta, A.D., Lado, F.A., 2017. Evaluation of cortical local field potential diffusion in stereotactic electroencephalography recordings: a glimpse on white matter signal. Neuroimage 147, 219–232. Meredith, M.A., Nemitz, J.W., Stein, B.E., 1987. Determinants of multisensory integration in superior colliculus neurons. I. Temporal factors. J. Neurosci. 7, 3215–3229. Meredith, M.A., Stein, B.E., 1983. Interactions among converging sensory inputs in the superior colliculus. Science 221, 389–391. Mishra, J., Martinez, A., Sejnowski, T.J., Hillyard, S.A., 2007. Early cross-modal interactions in auditory and visual cortex underlie a sound-induced visual illusion. J. Neurosci. 27, 4120–4131. Mitra, P.P., Pesaran, B., 1999. Analysis of dynamic brain imaging data. Biophys. J. 76, 691–708. Montes-Lourido, P., Vicente, A.F., Bermudez, M.A., Gonzalez, F., 2015. 
Neural activity in monkey amygdala during performance of a multisensory operant task. J. Integr. Neurosci. 14, 309–323. Mormann, F., Kornblith, S., Quiroga, R.Q., Kraskov, A., Cerf, M., Fried, I., Koch, C., 2008. Latency and selectivity of single neurons indicate hierarchical processing in the human medial temporal lobe. J. Neurosci. 28, 8865–8872. Morris, J.S., Friston, K.J., Buchel, C., Frith, C.D., Young, A.W., Calder, A.J., Dolan, R.J., 1998. A neuromodulatory role for the human amygdala in processing emotional facial expressions. Brain 121 (Pt 1), 47–57. Morris, J.S., Frith, C.D., Perrett, D.I., Rowland, D., Young, A.W., Calder, A.J., Dolan, R.J., 1996. A differential neural response in the human amygdala to fearful and happy facial expressions. Nature 383, 812–815. Mueller, K., Mildner, T., Fritz, T., Lepsien, J., Schwarzbauer, C., Schroeter, M.L., Moller, H.E., 2011. Investigating brain response to music: a comparison of different fMRI acquisition schemes. Neuroimage 54, 337–343. Muller, V.I., Cieslik, E.C., Turetsky, B.I., Eickhoff, S.B., 2012. Crossmodal interactions in audiovisual emotion processing. Neuroimage 60, 553–561. Muller, V.I., Habel, U., Derntl, B., Schneider, F., Zilles, K., Turetsky, B.I., Eickhoff, S.B., 2011. Incongruence effects in crossmodal emotional integration. Neuroimage 54, 2257–2266. Murray, R.J., Brosch, T., Sander, D., 2014. The functional profile of the human amygdala in affective processing: insights from intracranial recordings. Cortex 60, 10–33. Nelken, I., 2004. Processing of complex stimuli and natural scenes in the auditory cortex. Curr. Opin. Neurobiol. 14, 474–480. Nishijo, H., Ono, T., Nishino, H., 1988. Topographic distribution of modality-specific amygdalar neurons in alert monkey. J. Neurosci. 8, 3556–3569. Oostenveld, R., Fries, P., Maris, E., Schoffelen, J.M., 2011. FieldTrip: open source software for advanced analysis of MEG, EEG, and invasive electrophysiological data. Comput. Intell. Neurosci. 2011, 156869. Oya, H., Kawasaki, H., Howard 3rd, M.A., Adolphs, R., 2002. Electrophysiological responses in the human amygdala discriminate emotion categories of complex visual stimuli. J. Neurosci. 22, 9502–9512. Pannese, A., Grandjean, D., Fruhholz, S., 2015. Subcortical processing in auditory communication. Hear. Res. 328, 67–77. Papademetris, X., Jackowski, M.P., Rajeevan, N., DiStasio, M., Okuda, H., Constable, R.T., Staib, L.H., 2006. BioImage Suite: an integrated medical image analysis suite: an
update. Insight J 2006, 209. Pape, H.C., Driesang, R.B., 1998. Ionic mechanisms of intrinsic oscillations in neurons of the basolateral amygdaloid complex. J. Neurophysiol. 79, 217–226. Partan, S., Marler, P., 1999. Communication goes multimodal. Science 283, 1272–1273. Peelen, M.V., Atkinson, A.P., Vuilleumier, P., 2010. Supramodal representations of perceived emotions in the human brain. J. Neurosci. 30, 10127–10134. Pessoa, L., Adolphs, R., 2010. Emotion processing and the amygdala: from a 'low road' to 'many roads' of evaluating biological significance. Nat. Rev. Neurosci. 11, 773–783. Phelps, E.A., LeDoux, J.E., 2005. Contributions of the amygdala to emotion processing: from animal models to human behavior. Neuron 48, 175–187. Phillips, M.L., Young, A.W., Scott, S.K., Calder, A.J., Andrew, C., Giampietro, V., Williams, S.C., Bullmore, E.T., Brammer, M., Gray, J.A., 1998. Neural responses to facial and vocal expressions of fear and disgust. Proc. Biol. Sci. 265, 1809–1817. Pourtois, G., de Gelder, B., Bol, A., Crommelinck, M., 2005. Perception of facial expressions and voices and of their combination in the human brain. Cortex 41, 49–59. Pourtois, G., Spinelli, L., Seeck, M., Vuilleumier, P., 2010. Temporal precedence of emotion over attention modulations in the lateral amygdala: intracranial ERP evidence from a patient with temporal lobe epilepsy. Cognit. Affect Behav. Neurosci. 10, 83–93. Pourtois, G., Schettino, A., Vuilleumier, P., 2013. Brain mechanisms for emotional influences on perception and attention: What is magic and what is not. Biol. Psychol. 92 (3), 492–512. Retson, T.A., Van Bockstaele, E.J., 2013. Coordinate regulation of noradrenergic and serotonergic brain regions by amygdalar neurons. J. Chem. Neuroanat. 52, 9–19. Robins, D.L., Hunyadi, E., Schultz, R.T., 2009. Superior temporal activation in response to dynamic audio-visual emotional cues. Brain Cogn. 69, 269–278. Rutishauser, U., Ross, I.B., Mamelak, A.N., Schuman, E.M., 2010. Human memory strength is predicted by theta-frequency phase-locking of single neurons. Nature 464, 903–907. Rutishauser, U., Ye, S., Koroma, M., Tudusciuc, O., Ross, I.B., Chung, J.M., Mamelak, A.N., 2015. Representation of retrieval confidence by single neurons in the human medial temporal lobe. Nat. Neurosci. 18, 1041–1050. Sabatinelli, D., Fortune, E.E., Li, Q., Siddiqui, A., Krafft, C., Oliver, W.T., Beck, S., Jeffries, J., 2011. Emotional perception: meta-analyses of face and natural scene processing. Neuroimage 54, 2524–2533. Sagaspe, P., Schwartz, S., Vuilleumier, P., 2011. Fear and stop: a role for the amygdala in motor inhibition by emotional signals. Neuroimage 55, 1825–1835. Sarko, D.K., Nidiffer, A.R., Powers, I.A., Ghose, D., Hillock-Dunn, A., Fister, M.C., Krueger, J., Wallace, M.T., 2012. Spatial and temporal features of multisensory processes: bridging animal and human studies. In: Murray, M.M., Wallace, M.T. (Eds.), The Neural Bases of Multisensory Processes. Boca Raton (FL). Sato, W., Kochiyama, T., Uono, S., Matsuda, K., Usui, K., Inoue, Y., Toichi, M., 2011. Rapid amygdala gamma oscillations in response to fearful facial expressions. Neuropsychologia 49, 612–617. Schirmer, A., Escoffier, N., Zysset, S., Koester, D., Striano, T., Friederici, A.D., 2008. When vocal processing gets emotional: on the role of social orientation in relevance detection by the human amygdala. Neuroimage 40, 1402–1410. Scott, S.K., Young, A.W., Calder, A.J., Hellawell, D.J., Aggleton, J.P., Johnson, M., 1997. Impaired auditory recognition of fear and anger following bilateral amygdala lesions. Nature 385, 254–257. Senkowski, D., Saint-Amour, D., Kelly, S.P., Foxe, J.J., 2007. Multisensory processing of naturalistic objects in motion: a high-density electrical mapping and source estimation study. Neuroimage 36, 877–888. Skaliora, I., Doubell, T.P., Holmes, N.P., Nodal, F.R., King, A.J., 2004. Functional topography of converging visual and auditory inputs to neurons in the rat superior colliculus. J. Neurophysiol. 92, 2933–2946. Tadel, F., Baillet, S., Mosher, J.C., Pantazis, D., Leahy, R.M., 2011. Brainstorm: a user-friendly application for MEG/EEG analysis. Comput. Intell. Neurosci. 2011, 879716. Tanner, D., Morgan-Short, K., Luck, S.J., 2015. How inappropriate high-pass filters can produce artifactual effects and incorrect conclusions in ERP studies of language and cognition. Psychophysiology 52, 997–1009. Tanner, D., Norton, J.J., Morgan-Short, K., Luck, S.J., 2016. On high-pass filter artifacts (they're real) and baseline correction (it's a good idea) in ERP/ERMF analysis. J. Neurosci. Methods 266, 166–170. Tottenham, N., Tanaka, J.W., Leon, A.C., McCarry, T., Nurse, M., Hare, T.A., Marcus, D.J., Westerlund, A., Casey, B.J., Nelson, C., 2009. The NimStim set of facial expressions: judgments from untrained research participants. Psychiatr. Res. 168, 242–249. Turner, B.H., Mishkin, M., Knapp, M., 1980. Organization of the amygdalopetal projections from modality-specific cortical association areas in the monkey. J. Comp. Neurol. 191, 515–543. Tyszka, J.M., Pauli, W.M., 2016. In vivo delineation of subdivisions of the human amygdaloid complex in a high-resolution group template. Hum. Brain Mapp. 37, 3979–3998. Vuilleumier, P., 2005. How brains beware: neural mechanisms of emotional attention. Trends Cognit. Sci. 9, 585–594. Vuilleumier, P., Pourtois, G., 2007. Distributed and interactive brain mechanisms during emotion face perception: evidence from functional neuroimaging. Neuropsychologia 45, 174–194. Vuilleumier, P., Huang, Y.M., 2009. Emotional attention: uncovering the mechanisms of affective biases in perception. Curr. Dir. Psychol. Sci. 18, 148–152. Wallace, M.T., Meredith, M.A., Stein, B.E., 1998. Multisensory integration in the superior colliculus of the alert cat. J. Neurophysiol. 80, 1006–1010. Wang, S., Tudusciuc, O., Mamelak, A.N., Ross, I.B., Adolphs, R., Rutishauser, U., 2014. Neurons in the human amygdala selective for perceived emotion. Proc. Natl. Acad. Sci. U. S. A. 111, E3110–E3119. Wang, S., Yu, R., Tyszka, J.M., Zhen, S., Kovach, C., Sun, S., Huang, Y., Hurlemann, R., Ross, I.B., Chung, J.M., Mamelak, A.N., Adolphs, R., Rutishauser, U., 2017. The human amygdala parametrically encodes the intensity of specific facial emotions and their categorical ambiguity. Nat. Commun. 8, 14821. Wang, S., Mamelak, A.N., Adolphs, R., Rutishauser, U., 2018. Encoding of target detection during visual search by single neurons in the human brain. Curr. Biol. 28, 2058–2069 e2054. Wiethoff, S., Wildgruber, D., Grodd, W., Ethofer, T., 2009. Response and habituation of the amygdala during processing of emotional prosody. Neuroreport 20, 1356–1360. Yukie, M., 2002. Connections between the amygdala and auditory cortical areas in the macaque monkey. Neurosci. Res. 42, 219–229. Yuval-Greenberg, S., Deouell, L.Y., 2007. What you see is not (always) what you hear: induced gamma band responses reflect cross-modal interactions in familiar object recognition. J. Neurosci. 27, 1090–1096. Zhang, W., Schneider, D.M., Belova, M.A., Morrison, S.E., Paton, J.J., Salzman, C.D., 2013. Functional circuits and anatomical distribution of response properties in the primate amygdala. J. Neurosci. 33, 722–733. Zheng, J., Anderson, K.L., Leal, S.L., Shestyuk, A., Gulsen, G., Mnatsakanyan, L., Vadera, S., Hsu, F.P., Yassa, M.A., Knight, R.T., Lin, J.J., 2017. Amygdala-hippocampal dynamics during salient information processing. Nat. Commun. 8, 14413.