Acta Psychologica 178 (2017) 66–72
Neural mechanisms underlying sound-induced visual motion perception: An fMRI study
Souta Hidaka a,⁎, Satomi Higuchi b, Wataru Teramoto c, Yoichi Sugita d,⁎

a Department of Psychology, Rikkyo University, 1-2-26, Kitano, Niiza-shi, Saitama 352-8558, Japan
b Division of Ultrahigh Field MRI, Institute for Biomedical Sciences, Iwate Medical University, 2-1-1 Nishitokuta, Yahaba-cho, Shiwa-gun, Iwate 028-3694, Japan
c Department of Psychology, Kumamoto University, 2-40-1 Kurokami, Chuo-ku, Kumamoto 860-8555, Japan
d Department of Psychology, Waseda University, 1-24-1 Toyama, Shinjuku-ku, 162-8644, Japan
Keywords: Audiovisual interaction; Motion perception; Sound-induced visual motion; Visual apparent motion; Human motion processing area; Sensory association areas

Abstract

Studies of crossmodal interactions in motion perception have reported activation in several brain areas, including those related to motion processing and/or sensory association, in response to multimodal (e.g., visual and auditory) stimuli that were both in motion. Recent studies have demonstrated that sounds can trigger illusory visual apparent motion to static visual stimuli (sound-induced visual motion: SIVM): A visual stimulus blinking at a fixed location is perceived to be moving laterally when an alternating left-right sound is also present. Here, we investigated brain activity related to the perception of SIVM using a 7 T functional magnetic resonance imaging technique. Specifically, we focused on the patterns of neural activities in SIVM and visually induced visual apparent motion (VIVM). We observed shared activations in the middle occipital area (V5/hMT), which is thought to be involved in visual motion processing, for SIVM and VIVM. Moreover, as compared to VIVM, SIVM resulted in greater activation in the superior temporal area and dominant functional connectivity between the V5/hMT area and the areas related to auditory and crossmodal motion processing. These findings indicate that similar but partially different neural mechanisms could be involved in auditory-induced and visually-induced motion perception, and neural signals in auditory, visual, and crossmodal motion processing areas closely and directly interact in the perception of SIVM.
1. Introduction

Our perceptual systems appropriately and flexibly associate and integrate inputs from different sensory modalities (Ernst & Bülthoff, 2004) in order to create robust perceptual representations of dynamic environmental changes. From this viewpoint, crossmodal studies have investigated the manner of interactions of motion information with different sensory modalities. Specifically, the focus of these studies to date has been the interaction/integration of audiovisual motion information. Kitagawa and Ichihara (2002) reported that an adaptation to visual motion in depth (simulated by visual size changes) induced perceptual changes of auditory loudness (simulating auditory motion in depth) in the opposite direction of the adapted visual stimuli. In addition, motion information in one modality (e.g., vision) has been shown to modulate the perception of motion in another modality (e.g., audition). Soto-Faraco, Lyons, Gazzaniga, Spence, and Kingstone (2002) reported that the direction of auditory apparent motion stimuli tended to be misperceived as the same as the direction of visual stimuli that were moving in opposite directions (crossmodal dynamic capture).
⁎ Corresponding authors.
E-mail addresses: [email protected] (S. Hidaka), [email protected] (Y. Sugita).

http://dx.doi.org/10.1016/j.actpsy.2017.05.013
Received 16 August 2016; Received in revised form 17 May 2017; Accepted 25 May 2017
0001-6918/ © 2017 Elsevier B.V. All rights reserved.
Investigations involving detailed controls and sensory pairings other than audiovisual stimuli (Sanabria, Soto-Faraco, & Spence, 2005; Soto-Faraco, Spence, & Kingstone, 2004, 2005) have suggested that there exist common perceptual mechanisms and shared neural substrates for crossmodal motion perception (see Soto-Faraco et al., 2003; Soto-Faraco, Spence, Lloyd, & Kingstone, 2004 for review). Brain imaging studies have confirmed that several brain areas are commonly activated in response to audiovisual motion information (see Hidaka, Teramoto, & Sugita, 2015 for review). Unimodal studies have reported that real (Tootell et al., 1995; Watson et al., 1993; Zeki et al., 1991) and apparent (Goebel, Khorram-Sefat, Muckli, Hacker, & Singer, 1998; Liu, Slotnick, & Yantis, 2004; Muckli, Kohler, Kriegeskorte, & Singer, 2005; Sunaert, Van Hecke, Marchal, & Orban, 1999) visual motion induce activations in the V5/hMT area as well as V3 areas. These areas, which are located in the ventral occipital region (extrastriate cortex) of the brain, are reported to be primarily sensitive to visual motion information. Interestingly, activation in the V5/hMT area (Poirier et al., 2006; Saenz, Lewis, Huth, Fine, & Koch, 2008) has been reported in response to auditory motion as well as the planum
temporale (PT) and parietal areas (Krumbholz, Schönwiesner, Rübsamen, Zilles, Fink, & Von Cramon, 2005; Warren, Zielinski, Green, Rauschecker, & Griffiths, 2002). Alink, Singer, and Muckli (2008) observed V5/hMT activations by presenting auditory motion information alone, although visual motion induced stronger activation than auditory motion. Scheef et al. (2009) reported enhanced activation of the V5/hMT area by sounds that moved in a similar direction as the visual stimuli. These findings suggest that areas that have traditionally been considered sensitive to visual motion are activated by crossmodal motion stimuli. Lewis, Beauchamp, and DeYoe (2000) presented visual and auditory motion information independently and found that visual motion induced activation of the primary visual areas and the V5/hMT area, whereas auditory motion activated the primary auditory areas and the lateral sulcus (auditory belt). Moreover, some overlapping activation was detected in brain areas including the lateral parietal, lateral frontal, anterior midline, and anterior insula areas. Similarly, a number of studies (Baumann & Greenlee, 2007; Bremmer et al., 2001; Grefkes & Fink, 2005) have reported that audiovisual motion information could induce activation of brain areas related to crossmodal integration, such as the superior parietal lobule, superior temporal gyrus (STG), intraparietal sulcus, lateral inferior postcentral cortex, and the premotor cortex (Calvert, 2001; Driver & Noesselt, 2008). These findings suggest that higher sensory association areas are involved in the processing of audiovisual moving stimuli. Studies on the neural mechanisms underlying audiovisual motion perception have thus mainly examined responses to auditory and visual inputs that both contain motion information, and the activated brain areas and their activation patterns have varied depending on the congruency of the motion signals and the types of stimuli.

Recently, a novel phenomenon regarding audiovisual motion perception was reported for a pair of static visual stimuli and auditory apparent motion. When auditory apparent motion was presented through headphones while a visual target blinked at a fixed location, the auditory motion induced illusory motion perception of the visual target when it was presented at a peripheral visual location (around 10°) (sound-induced visual motion: SIVM) (Fig. 1A). The perception of SIVM was shown to be indiscernible from, or comparable to, that of visually-induced visual apparent motion (VIVM) (Hidaka et al., 2009; Teramoto, Manaka, et al., 2010b). Previous studies have reported no effects (Alais & Burr, 2004) or biasing effects instead of perceptual effects (Meyer & Wuerger, 2001) of auditory motion on visual motion perception when the visual stimuli were presented in the central visual field. In contrast, when the visual stimuli were presented in the peripheral visual field, SIVM has been confirmed to occur purely on the basis of motion perception and processing rather than attentional or response/decisional biases (Hidaka, Teramoto, Sugita, Manaka, Sakamoto, & Suzuki, 2011). The neural mechanisms underlying SIVM remain unclear because no direct investigation has been conducted. In the perception of SIVM, auditory motion information triggers motion perception of the static visual stimuli. Brain imaging studies have demonstrated that auditory and visual motion activate motion processing areas, such as the V5/hMT area. In addition, sensory association areas have also been reported to be related to audiovisual motion processing. We could therefore predict that both motion processing and higher association areas are involved in the perception of SIVM. Evidence regarding the neuronal bases of SIVM would contribute to the understanding of the mechanisms underlying crossmodal motion processing in the human brain.

The aim of the current study was to investigate the brain activity related to the perception of SIVM using a 7 T functional magnetic resonance imaging (fMRI) technique. Specifically, we focused on the patterns of neural activity related to SIVM and VIVM. As mentioned above, the key factor in demonstrating SIVM is presenting the visual stimuli in the peripheral visual field. Therefore, we presented the visual stimuli in the right hemifield. We expected to observe neural activation in response to SIVM-triggered visual motion perception in the contralateral hemisphere (i.e., the left hemisphere), which is distinguishable from the bilateral activation that is observed in response to auditory motion information (e.g., Lewis et al., 2000).
2. Methods

2.1. Participants

Twenty-six healthy adults (age range 20–38 years (M = 23.12, SD = 3.84); 13 females) participated in the experiment. All participants had normal hearing and normal or corrected-to-normal vision. This study was conducted in accordance with the Declaration of Helsinki, and the protocol was approved by the local ethics committee of Iwate Medical University. Informed consent was obtained from each participant before the experiment.

2.2. Stimuli and experimental procedures

Visual stimuli were displayed on a custom-ordered 7 T fMRI-compatible LCD display (24 in.; Takashima Seisakusho Ltd., Japan) with a resolution of 1920 × 1200 pixels and a refresh rate of 60 Hz. The 7 T fMRI system has relatively higher sensitivity and spatial resolution for brain responses (Shmuel, Yacoub, Chaimow, Logothetis, & Ugurbil, 2007; van der Zwaag et al., 2009). In addition to the red fixation cross (0.5°; 7.4 cd/m2), a white circle (2.0°; 30.0 cd/m2) was presented at the right side of the fixation with an eccentricity of 7.2°. In the VIVM condition, the visual stimuli were displayed at alternating horizontal locations separated by 1.0° around the 7.2° eccentricity. The duration of the visual stimuli was 400 ms and the stimulus onset asynchrony (SOA) was 500 ms. White noise bursts (22.05 kHz sampling frequency) were presented as auditory stimuli through MR-compatible headphones (Kobatel, Japan) at an A-weighted sound pressure level of 90 dB against background noise (76 dB). The duration of the auditory stimuli was 100 ms, with 5-ms rise and fall times. The SOA between the stimuli was 500 ms. The onsets of the visual and auditory stimuli were synchronized. The auditory stimuli were presented through headphones to both ears simultaneously or alternating between the participants' ears.
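The per-block timing described above can be sketched as a simple event schedule. This is an illustrative reconstruction, not the authors' presentation code; the function and condition labels are ours.

```python
# Sketch of the per-block stimulus schedule: visual flashes (400 ms) and
# noise bursts (100 ms) share synchronized onsets at a 500-ms SOA. In SIVM
# the flash stays fixed and the sound alternates between ears; in VIVM the
# flash alternates and the sound stays in both ears.

def block_schedule(condition, block_dur_s=20.0, soa_s=0.5):
    """Return a list of (onset_s, visual_pos, ear) events for one block."""
    events = []
    n = int(block_dur_s / soa_s)  # 40 audiovisual onsets per 20-s block
    for i in range(n):
        onset = i * soa_s
        if condition == "SIVM":    # fixed flash, alternating ears
            visual_pos, ear = "fixed", ("left" if i % 2 == 0 else "right")
        elif condition == "VIVM":  # alternating flash, both ears
            visual_pos, ear = ("left" if i % 2 == 0 else "right"), "both"
        else:                      # baseline: nothing alternates
            visual_pos, ear = "fixed", "both"
        events.append((onset, visual_pos, ear))
    return events

print(len(block_schedule("SIVM")))  # 40 onsets per block
```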
These stimulus parameters were based on our previous study (Hidaka et al., 2009) in order to maximize the neural activations induced by SIVM. We briefly confirmed that the participants experienced SIVM with these procedures before performing the experiment. In the SIVM condition, the blinking visual stimuli were repeatedly presented at a fixed position while the auditory stimuli alternated between the ears during a 20-s stimulus presentation period (Fig. 1B, top). In contrast, in the VIVM condition, the visual stimuli were presented at alternating horizontal locations, while the auditory stimuli did not alternate (Fig. 1B, middle). In addition, we introduced a baseline condition in which the locations of the visual and auditory stimuli did not change (Fig. 1B, bottom). Each condition was presented three times. Blank periods (20 s) were interleaved between the stimulus presentation periods of each condition. Additional blank periods were applied at the initiation (80 s) and termination (20 s) of the experimental sequence. The experimental sequence took about 7.5 min in total. The presentation order of the conditions was pseudo-randomized across the participants. The participants were asked to maintain fixation during the scans. We adopted a block design rather than an event-related design in order to obtain reliable brain responses during passive observation. Many studies on crossmodal interactions have reported that behavioral tasks tend to induce response/decisional biases and/or result in inadequate perceptual sets for a target sensory modality. Because the use of behavioral tasks would likely induce patterns of neural activation related to responses or decisions, the participants passively observed the stimuli without any task in the current study. A questionnaire study (Uwano et al., 2015) reported that any effects of the participants' physical condition (e.g.,
dizziness) are adequately alleviated by our experimental setup to a degree comparable to that of 1.5 T or 3 T systems. Therefore, the physical conditions and mental states of the participants in our 7 T experiment would be similar to those in 1.5 T and 3 T MRI experiments.

Fig. 1. Schematic illustrations of sound-induced visual motion (SIVM) and experimental conditions. (A) SIVM. When the blinking visual stimulus is presented at a fixed position with sounds alternating between the ears of an observer, the visual stimuli are perceived as moving congruently with the auditory apparent motion. (B) Experimental conditions. In the SIVM condition, visual flashes at a fixed position were paired with auditory apparent motion. In the visually-induced visual motion (VIVM) condition, visual apparent motion was paired with sounds that did not alternate. A condition without any positional shifts of the visual or auditory stimuli was included as the baseline condition.
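The run structure stated above can be checked arithmetically: an 80-s initial blank plus nine 20-s blocks (3 conditions × 3 repetitions), each followed by a 20-s blank (the last serving as the termination period), gives 440 s, which matches the 110 volumes acquired at TR = 4 s (see Section 2.3). A minimal sketch, with variable names of our own choosing:

```python
# Sketch of the block-design timeline: 80-s lead-in blank, then nine 20-s
# stimulus blocks each followed by a 20-s blank. Total = 440 s = 110 volumes
# at TR = 4 s, consistent with the reported ~7.5-min run.

TR = 4.0           # s, repetition time of the functional scans
INIT_BLANK = 80.0  # s, initial blank period
BLOCK = 20.0       # s, stimulus presentation period
BLANK = 20.0       # s, interleaved/terminal blank period

# Order was pseudo-randomized per participant; a fixed order is used here.
conditions = ["SIVM", "VIVM", "baseline"] * 3

onsets = []
t = INIT_BLANK
for cond in conditions:
    onsets.append((cond, t))
    t += BLOCK + BLANK  # each block is followed by a blank

total_s = t
print(total_s)       # 440.0 s
print(total_s / TR)  # 110.0 volumes
```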
2.3. fMRI procedure

We collected fMRI data using a 7 T MRI system (Discovery MR950; GE Healthcare, Milwaukee, WI) with a 32-channel head coil (Nova Medical, Inc., Wilmington, MA). The parameters of the functional scans were as follows: TR = 4 s; TE = 18.4 ms; flip angle = 80°; spatial resolution = 1.5 × 1.5 × 2.0 mm; FOV = 192 mm; slice thickness = 2 mm; inter-slice gap = 0.0 mm; number of slices = 70. The FOV was tilted −30° relative to the AC-PC line in order to encompass the whole brain and minimize susceptibility artifacts, as previously described by Weiskopf, Hutton, Josephs, and Deichmann (2006). To minimize artifacts in the occipital cortex, a posterior-to-anterior phase-encode direction was used. For each participant, 110 functional brain volumes were obtained. The first 15 volumes were discarded from the analysis to allow for T1-equilibration effects; our preliminary EPI tests showed that T1-equilibration effects were relatively longer at 7 T than at 3 T or 1.5 T. We also obtained a T1-weighted anatomical scan for each participant (TR = 7.5 ms; TE = 2.2 ms; flip angle = 12°; spatial resolution = 1 × 1 × 1.2 mm; FOV = 256 mm; slice thickness = 1.2 mm; inter-slice gap = 0.0 mm; number of slices = 170).

2.4. fMRI data analysis

The fMRI data were preprocessed with SPM12 (Wellcome Department of Imaging Neuroscience, London, UK) and statistically analyzed with Matlab (MathWorks, Inc.). The realign and unwarp functions implemented in SPM were used for motion correction. The deviations of head position among the participants were within 0.50 mm in the left-right, anterior-posterior, and superior-inferior directions, smaller than the acquired voxel size. After motion correction, the functional brain images of each participant were coregistered and normalized into the standard space defined by the Montreal Neurological Institute (MNI) template. The images were then smoothed with a 3D Gaussian kernel with a full width at half maximum of 8 mm. Functional images of the ventromedial prefrontal and superior temporal middle regions were missing in some participants due to susceptibility artifacts; however, these areas were not part of the regions of interest in the current study. The areas of activation were determined in each participant in each condition with a general linear model. Each condition (SIVM, VIVM, and baseline) was modeled as a separate regressor that was convolved with
the canonical hemodynamic response function. Next, using the contrast images of each condition from all participants, we performed analyses with a repeated-measures analysis of variance (ANOVA) design, taking into account the factors of condition and participant. Contrast analyses were performed by entering the SIVM and VIVM conditions as factors and comparing them in order to identify the specific activation patterns for each condition. A conjunction analysis (testing the conjunction of null hypotheses) (Friston, Holmes, Price, Büchel, & Worsley, 1999; Price & Friston, 1997) was also performed in order to reveal shared activations between the SIVM and VIVM conditions. A conjunction of the contrasts of each condition against baseline (i.e., SIVM-baseline ∩ VIVM-baseline) would violate the orthogonality or independence required for statistical contrast analyses (e.g., Mumford, Poline, & Poldrack, 2015). Thus, we performed a conjunction analysis with the SIVM and VIVM conditions as factors while masking out the regions activated during the baseline condition. This enabled us to create maps of the areas of activation that were shared by the SIVM and VIVM conditions instead of maps of areas based on the sum of the activations observed in each condition. These analyses were performed at a 5% statistical significance level, corrected for family-wise errors (FWE) at the voxel level. The activation map of the baseline condition at a 5% significance level (FWE corrected) was used as the mask image. The Anatomy Toolbox 2.2 (Eickhoff et al., 2005) was utilized to determine whether the activated brain areas were consistent with cytoarchitectonically-defined areas.
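As a rough illustration of two modeling steps described above — not the SPM implementation itself — a block regressor can be built by convolving a condition boxcar with a canonical double-gamma HRF, and the conjunction with exclusive baseline masking reduces, at the map level, to a voxelwise boolean combination. The HRF shape parameters and block onsets below are conventional values assumed for illustration.

```python
import math
import numpy as np

def gamma_pdf(t, shape):
    """Gamma probability density (scale = 1) at times t (s)."""
    t = np.asarray(t, dtype=float)
    return np.where(t > 0, t ** (shape - 1) * np.exp(-t) / math.gamma(shape), 0.0)

def canonical_hrf(t):
    """Double-gamma HRF: early peak minus a smaller late undershoot."""
    return gamma_pdf(t, 6) - gamma_pdf(t, 16) / 6.0

TR, n_vols = 4.0, 110
boxcar = np.zeros(n_vols)
for onset_s in (80.0, 200.0, 320.0):       # hypothetical onsets of one condition's blocks
    i = int(onset_s / TR)
    boxcar[i:i + int(20.0 / TR)] = 1.0     # 20-s stimulation blocks

hrf = canonical_hrf(np.arange(0.0, 32.0, TR))  # HRF sampled at the TR
regressor = np.convolve(boxcar, hrf)[:n_vols]  # one column of the design matrix

# Conjunction of SIVM and VIVM with exclusive masking of the baseline map:
# a voxel is reported only if significant in both conditions and not in baseline.
sig_sivm = np.array([True, True, False, True])   # toy significance maps
sig_vivm = np.array([True, False, True, True])
sig_base = np.array([False, False, False, True])
conjunction = sig_sivm & sig_vivm & ~sig_base
```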
Based on the findings of previous fMRI studies on apparent and real visual motion perception (Goebel et al., 1998; Liu et al., 2004; Muckli et al., 2005; Sunaert et al., 1999; Tootell et al., 1995; Watson et al., 1993; Zeki et al., 1991), we focused on the V3v (Rottschy et al., 2007), V3A (Kujovic et al., 2013), and V5/hMT (Malikovic et al., 2007) areas. A psychophysiological interaction (PPI) analysis was also performed to investigate functional connectivity in the SIVM and VIVM conditions. We set the left V5/hMT area (Poirier et al., 2006) as the seed region and identified the peak coordinates in each participant based on the effects-of-interest contrasts of the SIVM and VIVM conditions against the blank period. Then, the time courses of the BOLD signals were extracted at the individually identified peak coordinates within a 6-mm sphere. The statistical threshold for detecting the peak coordinates was set at 0.5% (uncorrected) for 22 participants, with the exception of 3 participants at 5% and 1 participant at 6%, because no significant coordinates were found at the stricter threshold. We performed two PPI analyses with the psychological variables of the SIVM-baseline and VIVM-baseline contrasts as factors. The analysis of the SIVM-baseline contrast examined the effects of the audiovisual interaction on motion perception, while the analysis of the VIVM-baseline contrast investigated the effects on visual motion processing. Areas of significant interaction were detected with an ANOVA design at an uncorrected 0.1% statistical level at the voxel level and a 5% statistical level with FWE correction at the cluster level.
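In simplified form, the PPI term is the element-wise product of the seed time course and the psychological variable. The sketch below is illustrative only: SPM's PPI additionally deconvolves the seed signal to the neuronal level before forming the product, and all values here (onsets, seed signal) are synthetic.

```python
import numpy as np

n_vols = 110
rng = np.random.default_rng(0)

seed = rng.standard_normal(n_vols)  # stand-in for the extracted V5/hMT time course
psych = np.zeros(n_vols)            # +1 during condition blocks, 0 during baseline
for onset in (20, 50, 80):          # hypothetical block onsets, in volumes
    psych[onset:onset + 5] = 1.0    # 20-s block = 5 volumes at TR = 4 s

ppi = seed * psych                  # psychophysiological interaction term

# Design matrix: [psychological, physiological, interaction, intercept];
# the test of interest is on the interaction column.
X = np.column_stack([psych, seed, ppi, np.ones(n_vols)])
print(X.shape)  # (110, 4)
```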
Table 1
Locations of the activated areas and statistical results of the contrast and conjunction analyses. For SIVM > VIVM, regions include the bilateral superior temporal gyrus, insula cortex, inferior frontal gyrus, cingulate gyrus, and anterior cingulate gyrus; for SIVM ∩ VIVM with exclusion of baseline, regions include the bilateral superior temporal gyrus, middle temporal gyrus, inferior frontal gyrus, the left middle occipital gyrus (V5/hMT), precentral gyrus, right inferior parietal lobule, lingual gyrus, and middle frontal gyrus. [The tabulated MNI coordinates, T values, and cluster sizes were not recoverable from the extracted layout.] The analyses were corrected for family-wise errors at the voxel level (p < 0.05). Clusters over 10 voxels are listed. SIVM, sound-induced visual motion; VIVM, visually-induced visual motion.

3. Results
We first compared the activation patterns between the SIVM and VIVM conditions. This analysis revealed greater activation in the left STG in the SIVM condition relative to the VIVM condition (Table 1; Fig. 2A, bottom right). Higher activations in the VIVM condition relative to the SIVM condition were not observed. These results simply indicate that the activations were similar in these conditions except for the increased auditory activation that occurred in response to the moving sounds in the SIVM condition. The conjunction analysis was then performed to highlight the common patterns of brain activation in the SIVM and VIVM conditions in areas that did not show activation in the baseline condition. Activations were observed in the bilateral STG as well as the right inferior parietal lobule (IPL) (Table 1; Fig. 2B, bottom left and right). We also observed activation in the left middle occipital gyrus (MOG), which
was assumed to correspond to the V5/hMT area (12.3% of the activated cluster) (Table 1; Fig. 2B, bottom right). We did not identify activated areas corresponding to the V3v or V3A in this analysis. These results show that the moving visual stimuli and the static visual stimuli with moving sounds, both presented on the right side of the visual field, induced activation in the left motion processing area and higher association areas. The PPI analysis was also conducted to investigate functional connectivity in the SIVM and VIVM conditions. The left V5/hMT was set as the seed region because activation of this area was found in the SIVM and VIVM conditions in the conjunction analysis. There were significant interactions of the bilateral STG and left middle temporal
Table 2
Locations of the activated areas and statistical results of the psychophysiological interaction analyses. For SIVM > Baseline, regions include the bilateral superior temporal gyrus, the left middle temporal gyrus, and the anterior cingulate; for VIVM > Baseline, regions include the left middle temporal gyrus. [The tabulated MNI coordinates, T values, and cluster sizes were not recoverable from the extracted layout.] The analyses were corrected for family-wise errors at the cluster level (p < 0.05) and were not corrected for multiple comparisons at the voxel level (p < 0.001). SIVM, sound-induced visual motion; VIVM, visually-induced visual motion.
gyrus (MTG) in the SIVM condition (Table 2; Fig. 3A, bottom left and right). A significant interaction of the left MTG was observed in the VIVM condition (Table 2; Fig. 3B, bottom right).

Fig. 2. Results of (A) the contrast analysis comparing activation in the SIVM and VIVM conditions and (B) the conjunction analysis of activation in the SIVM and VIVM conditions with exclusive masking of activation in the baseline condition. The areas with significant differences are marked in red (p < 0.05, corrected for family-wise errors at the voxel level, cluster size > 10). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Fig. 3. Results of the psychophysiological interaction analyses of the (A) SIVM and (B) VIVM conditions against the baseline condition. The areas with significant differences are marked in red (p < 0.001, uncorrected at the voxel level; p < 0.05, corrected for family-wise errors at the cluster level). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

4. Discussion

Studies on the neural mechanisms of crossmodal motion perception have focused on the scenario in which both the auditory and visual stimuli possessed motion information. These studies demonstrated that motion processing and sensory association areas were commonly activated by audiovisual motion information. In this study, we investigated the neural mechanisms underlying the phenomenon in which auditory motion information induces illusory visual apparent motion of static visual stimuli (SIVM) with 7 T fMRI. Specifically, we examined the similarities and differences in the neural mechanisms underlying SIVM and VIVM.

Based on the conjunction analysis of the SIVM and VIVM conditions with exclusive masking of the activation pattern in the baseline condition, we identified activation in the left MOG, which was assumed to correspond with the cytoarchitectonically-defined V5/hMT area. This area has been reported to activate in response to both real and apparent visual motion (Goebel et al., 1998; Liu et al., 2004; Muckli et al., 2005; Sunaert et al., 1999; Tootell et al., 1995; Watson et al., 1993; Zeki et al., 1991). The conjunction analysis also detected activation in areas related to crossmodal interactions, such as the bilateral STG as well as the right IPL. These areas have been reported in relation not only to auditory processing, including auditory motion perception (Alink, Euler, Kriegeskorte, Singer, & Kohler, 2012; Baumann & Greenlee, 2007; Bremmer et al., 2001; Griffiths et al., 1998; Krumbholz et al., 2005; Poirier et al., 2006; Saenz et al., 2008; Warren et al., 2002), but also to crossmodal interactions and integrations (Calvert, 2001; Driver & Noesselt, 2008) and crossmodal motion perception (Alink et al., 2008; Baumann & Greenlee, 2007; Bremmer et al., 2001; Grefkes & Fink, 2005; Lewis et al., 2000). These findings indicate that activation in areas related to visual and crossmodal motion processing could be similarly involved in the perception of SIVM and VIVM. We also found dominant activation in the left STG in the SIVM condition compared to the VIVM condition. Moreover, the PPI analyses demonstrated significant interactions between the left V5/hMT and the bilateral STG and left MTG in the SIVM condition, while an interaction with only the left MTG was observed in the VIVM condition. These areas are assumed to correspond to areas related to auditory (PT and IPL) (Poirier et al., 2006) and crossmodal (Alink et al., 2008; Baumann & Greenlee, 2007; Bremmer et al., 2001; Lewis et al., 2000) motion processing.

These findings suggest that the modulation of visual motion processing areas (the left V5/hMT) by areas related to auditory and crossmodal motion processing plays a key role in illusory visual motion perception in SIVM. Brain imaging studies of crossmodal interactions in motion perception that involved moving auditory and visual stimuli have demonstrated activation in sensory and crossmodal motion processing areas, while the activated brain areas and their activation patterns varied depending on the congruency of motion signals and the types of stimuli (Alink et al., 2008; Baumann & Greenlee, 2007; Bremmer et al., 2001; Grefkes & Fink, 2005; Lewis et al., 2000; Scheef et al., 2009). In the SIVM condition, the visual and auditory stimuli were presented in an incongruent manner. Nevertheless, the brain integrates this information, and illusory visual motion perception is consequently elicited by auditory motion information (Hidaka et al., 2009, 2011; Teramoto, Manaka, et al., 2010). In accordance with this perceptual evidence, the SIVM condition in the current study resulted in the activation of brain areas related to visual motion processing (V5/hMT) that were also activated in the VIVM condition. Moreover, greater activation and functional connectivity were observed for auditory and crossmodal processing areas in the SIVM condition relative to the VIVM condition, in which static sound and moving visual stimuli were presented. Thus, the activation of both visual motion processing areas and auditory and crossmodal processing areas is necessary for the occurrence of SIVM. In line with the suggestions of Ghazanfar and Schroeder (2006), the results of the current study suggest that both higher association and lower primary sensory areas are involved in the crossmodal integration of motion information. Because we employed passive observation without any task, future studies should use an event-related or rapid-block design containing perceptual tasks in order to separately analyze trials in which illusory visual motion was and was not perceived in the SIVM condition, and thereby firmly establish the commonalities and differences between SIVM and VIVM. The utilization of localizer scans that identify each participant's visual motion processing areas would also be essential for detailed investigations. Furthermore, considering the existence of functional plasticity in the brain regarding crossmodal motion processing (Jiang, Stecker, & Fine, 2014; Poirier et al., 2006; Saenz et al., 2008), it could be beneficial to focus on the establishment processes of common neural representations between SIVM and VIVM by adopting a perceptual learning framework (e.g., Hidaka, Teramoto, Kobayashi, & Sugita, 2011; Teramoto, Hidaka, & Sugita, 2010).

5. Conclusions

The present study explored the similarities and differences in the neural mechanisms underlying SIVM and VIVM in order to further clarify the brain mechanisms of crossmodal motion perception. The results showed that brain areas corresponding to the visual motion sensitive area (V5/hMT) and those involved in crossmodal motion processing (STG) were similarly activated in the SIVM and VIVM conditions. In addition, greater activation was observed in the left STG and PT for SIVM as compared to VIVM. Moreover, dominant functional connectivity was observed between V5/hMT and areas related to auditory and crossmodal motion processing in the SIVM condition. In concordance with previous observations of activated brain areas and activation patterns in response to audiovisual motion information (Alink et al., 2008; Baumann & Greenlee, 2007; Bremmer et al., 2001; Lewis et al., 2000; Scheef et al., 2009), the findings of the current study suggest that similar but partially different neural mechanisms could be involved in auditory-induced and visually-induced motion perception, and that the neural signals underlying the perception of SIVM closely and directly interact among auditory, visual, and crossmodal motion-processing areas.

Funding

This research was supported by Grant-in-Aid for Scientific Research (A) (No. 25245069) and (B) (No. 26285160) from the Japan Society for the Promotion of Science and by Grant-in-Aid for Strategic Medical Science Research (S1491001) from the Ministry of Education, Culture, Sports, Science and Technology of Japan.

Acknowledgements

We thank Tsuyoshi Metoki for his support with data collection and Tomoka Omi for her assistance.
References

Alais, D., & Burr, D. (2004). No direction-specific bimodal facilitation for audiovisual motion detection. Cognitive Brain Research, 19, 185–194.
Alink, A., Euler, F., Kriegeskorte, N., Singer, W., & Kohler, A. (2012). Auditory motion direction encoding in auditory cortex and high-level visual cortex. Human Brain Mapping, 33, 969–978.
Alink, A., Singer, W., & Muckli, L. (2008). Capture of auditory motion by vision is represented by an activation shift from auditory to visual motion cortex. Journal of Neuroscience, 28, 2690–2697.
Baumann, O., & Greenlee, M. W. (2007). Neural correlates of coherent audiovisual motion perception. Cerebral Cortex, 17, 1433–1443.
Bremmer, F., Schlack, A., Shah, N. J., Zafiris, O., Kubischik, M., Hoffmann, K. P., ... Fink, G. R. (2001). Polymodal motion processing in posterior parietal and premotor cortex: A human fMRI study strongly implies equivalencies between humans and monkeys. Neuron, 29, 287–296.
Calvert, G. A. (2001). Crossmodal processing in the human brain: Insights from functional neuroimaging studies. Cerebral Cortex, 11, 1110–1123.
Driver, J., & Noesselt, T. (2008). Multisensory interplay reveals crossmodal influences on ‘sensory-specific’ brain regions, neural responses, and judgments. Neuron, 57, 11–23.
Eickhoff, S. B., Stephan, K. E., Mohlberg, H., Grefkes, C., Fink, G. R., Amunts, K., & Zilles, K. (2005). A new SPM toolbox for combining probabilistic cytoarchitectonic maps and functional imaging data. NeuroImage, 25, 1325–1335.
Ernst, M. O., & Bülthoff, H. H. (2004). Merging the senses into a robust percept. Trends in Cognitive Sciences, 8, 162–169.
Friston, K. J., Holmes, A. P., Price, C. J., Büchel, C., & Worsley, K. J. (1999). Multisubject fMRI studies and conjunction analyses. NeuroImage, 10, 385–396.
Ghazanfar, A. A., & Schroeder, C. E. (2006). Is neocortex essentially multisensory? Trends in Cognitive Sciences, 10, 278–285.
Goebel, R., Khorram-Sefat, D., Muckli, L., Hacker, H., & Singer, W. (1998). The constructive nature of vision: Direct evidence from functional magnetic resonance imaging studies of apparent motion and motion imagery. European Journal of Neuroscience, 10, 1563–1573.
Grefkes, C., & Fink, G. R. (2005). The functional organization of the intraparietal sulcus in humans and monkeys. Journal of Anatomy, 207, 3–17.
Griffiths, T. D., Rees, G., Rees, A., Green, G. G., Witton, C., Rowe, D., ... Frackowiak, R. S. (1998). Right parietal cortex is involved in the perception of sound movement in humans. Nature Neuroscience, 1, 74–79.
Hidaka, S., Manaka, Y., Teramoto, W., Sugita, Y., Miyauchi, R., Gyoba, J., ... Iwaya, Y. (2009). Alternation of sound location induces visual motion perception of a static object. PloS One, 4, e8188.
Hidaka, S., Teramoto, W., Kobayashi, M., & Sugita, Y. (2011a). Sound-contingent visual motion aftereffect. BMC Neuroscience, 12, 44.
Hidaka, S., Teramoto, W., & Sugita, Y. (2015). Spatiotemporal processing in crossmodal interactions for perception of the external world: A review. Frontiers in Integrative Neuroscience, 9, 62.
Hidaka, S., Teramoto, W., Sugita, Y., Manaka, Y., Sakamoto, S., & Suzuki, Y. (2011b). Auditory motion information drives visual motion perception. PloS One, 6, e17499.
Jiang, F., Stecker, G. C., & Fine, I. (2014). Auditory motion processing after early blindness. Journal of Vision, 14(13), 1–18.
Kitagawa, N., & Ichihara, S. (2002). Hearing visual motion in depth. Nature, 416, 172–174.
Krumbholz, K., Schönwiesner, M., Rübsamen, R., Zilles, K., Fink, G. R., & von Cramon, D. Y. (2005). Hierarchical processing of sound location and motion in the human brainstem and planum temporale. European Journal of Neuroscience, 21, 230–238.
Kujovic, M., Zilles, K., Malikovic, A., Schleicher, A., Mohlberg, H., Rottschy, C., ... Amunts, K. (2013). Cytoarchitectonic mapping of the human dorsal extrastriate cortex. Brain Structure and Function, 218, 157–172.
Lewis, J. W., Beauchamp, M. S., & DeYoe, E. A. (2000). A comparison of visual and auditory motion processing in human cerebral cortex. Cerebral Cortex, 10, 873–888.
Liu, T., Slotnick, S. D., & Yantis, S. (2004). Human MT+ mediates perceptual filling-in during apparent motion. NeuroImage, 21, 1772–1780.
Malikovic, A., Amunts, K., Schleicher, A., Mohlberg, H., Eickhoff, S. B., Wilms, M., ... Zilles, K. (2007). Cytoarchitectonic analysis of the human extrastriate cortex in the region of V5/MT+: A probabilistic, stereotaxic map of area hOc5. Cerebral Cortex, 17, 562–574.
Meyer, G. F., & Wuerger, S. M. (2001). Cross-modal integration of auditory and visual motion signals. Neuroreport, 12, 2557–2560.
Muckli, L., Kohler, A., Kriegeskorte, N., & Singer, W. (2005). Primary visual cortex activity along the apparent-motion trace reflects illusory perception. PLoS Biology, 3, 1501.
Mumford, J. A., Poline, J. B., & Poldrack, R. A. (2015). Orthogonalization of regressors in fMRI models. PloS One, 10, e0126255.
Poirier, C., Collignon, O., Scheiber, C., Renier, L., Vanlierde, A., Tranduy, D., ... De Volder, A. G. (2006). Auditory motion perception activates visual motion areas in early blind subjects. NeuroImage, 31, 279–285.
Price, C. J., & Friston, K. J. (1997). Cognitive conjunction: A new approach to brain activation experiments. NeuroImage, 5, 261–270.
Rottschy, C., Eickhoff, S. B., Schleicher, A., Mohlberg, H., Kujovic, M., Zilles, K., & Amunts, K. (2007). Ventral visual cortex in humans: Cytoarchitectonic mapping of two extrastriate areas. Human Brain Mapping, 28, 1045–1059.
Saenz, M., Lewis, L. B., Huth, A. G., Fine, I., & Koch, C. (2008). Visual motion area MT+/V5 responds to auditory motion in human sight-recovery subjects. The Journal of Neuroscience, 28, 5141–5148.
Sanabria, D., Soto-Faraco, S., & Spence, C. (2005). Spatiotemporal interactions between audition and touch depend on hand posture. Experimental Brain Research, 165, 505–514.
Scheef, L., Boecker, H., Daamen, M., Fehse, U., Landsberg, M. W., Granath, D. O., ... Effenberg, A. O. (2009). Multimodal motion processing in area V5/MT: Evidence from an artificial class of audio-visual events. Brain Research, 1252, 94–104.
Shmuel, A., Yacoub, E., Chaimow, D., Logothetis, N. K., & Ugurbil, K. (2007). Spatiotemporal point-spread function of fMRI signal in human gray matter at 7 Tesla. NeuroImage, 35, 539–552.
Soto-Faraco, S., Kingstone, A., & Spence, C. (2003). Multisensory contributions to the perception of motion. Neuropsychologia, 41, 1847–1862.
Soto-Faraco, S., Lyons, J., Gazzaniga, M., Spence, C., & Kingstone, A. (2002). The ventriloquist in motion: Illusory capture of dynamic information across sensory modalities. Cognitive Brain Research, 14, 139–146.
Soto-Faraco, S., Spence, C., & Kingstone, A. (2004a). Cross-modal dynamic capture: Congruency effects in the perception of motion across sensory modalities. Journal of Experimental Psychology: Human Perception and Performance, 30, 330–345.
Soto-Faraco, S., Spence, C., & Kingstone, A. (2005). Assessing automaticity in the audiovisual integration of motion. Acta Psychologica, 118, 71–92.
Soto-Faraco, S., Spence, C., Lloyd, D., & Kingstone, A. (2004b). Moving multisensory research along: Motion perception across sensory modalities. Current Directions in Psychological Science, 13, 29–32.
Sunaert, S., Van Hecke, P., Marchal, G., & Orban, G. A. (1999). Motion-responsive regions of the human brain. Experimental Brain Research, 127, 355–370.
Teramoto, W., Hidaka, S., & Sugita, Y. (2010a). Sounds move a static visual object. PloS One, 5, e12255.
Teramoto, W., Manaka, Y., Hidaka, S., Sugita, Y., Miyauchi, R., Sakamoto, S., ... Suzuki, Y. (2010b). Visual motion perception induced by sounds in vertical plane. Neuroscience Letters, 479, 221–225.
Tootell, R. B., Reppas, J. B., Kwong, K. K., Malach, R., Born, R. T., Brady, T. J., ... Belliveau, J. W. (1995). Functional analysis of human MT and related visual cortical areas using magnetic resonance imaging. The Journal of Neuroscience, 15, 3215–3230.
Uwano, I., Metoki, T., Sendai, F., Yoshida, R., Kudo, K., Yamashita, F., ... Sasaki, M. (2015). Assessment of sensations experienced by subjects during MR imaging examination at 7 T. Magnetic Resonance in Medical Sciences, 14, 35–41.
van der Zwaag, W., Francis, S., Head, K., Peters, A., Gowland, P., Morris, P., & Bowtell, R. (2009). fMRI at 1.5, 3 and 7 T: Characterising BOLD signal changes. NeuroImage, 47, 1425–1434.
Warren, J. D., Zielinski, B. A., Green, G. G., Rauschecker, J. P., & Griffiths, T. D. (2002). Perception of sound-source motion by the human brain. Neuron, 34, 139–148.
Watson, J. D., Myers, R., Frackowiak, R. S., Hajnal, J. V., Woods, R. P., Mazziotta, J. C., ... Zeki, S. (1993). Area V5 of the human brain: Evidence from a combined study using positron emission tomography and magnetic resonance imaging. Cerebral Cortex, 3, 79–94.
Weiskopf, N., Hutton, C., Josephs, O., & Deichmann, R. (2006). Optimal EPI parameters for reduction of susceptibility-induced BOLD sensitivity losses: A whole-brain analysis at 3 T and 1.5 T. NeuroImage, 33, 493–504.
Zeki, S., Watson, J. D., Lueck, C. J., Friston, K. J., Kennard, C., & Frackowiak, R. S. (1991). A direct demonstration of functional specialization in human visual cortex. The Journal of Neuroscience, 11, 641–649.