Vision Research 133 (2017) 121–129
Contents lists available at ScienceDirect
Vision Research journal homepage: www.elsevier.com/locate/visres
Newly acquired audio-visual associations bias perception in binocular rivalry Wolfgang Einhäuser a,⇑, Philipp Methfessel a,b, Alexandra Bendixen b a b
Physics of Cognition Group, Institute of Physics, Chemnitz University of Technology, Chemnitz, Germany Cognitive Systems Lab, Institute of Physics, Chemnitz University of Technology, Chemnitz, Germany
a r t i c l e
i n f o
Article history: Received 15 November 2016 Received in revised form 11 February 2017 Accepted 17 February 2017
Keywords: Rivalry Cross-modal integration Audition Associative learning Optokinetic nystagmus No-report paradigm
a b s t r a c t When distinct stimuli are presented to the two eyes, their mental representations alternate in awareness. Here, such ‘‘binocular rivalry” was used to investigate whether audio-visual associations bias visual perception. To induce two arbitrary associations, each between a tone and a grating of a specific color and motion direction, observers were required to respond whenever this combination was presented, but not for other tone-grating combinations. After about 20 min of this induction phase, each of the gratings was presented to one eye to induce rivalry, while either of the two tones or no tone was played. Observers were asked to watch the rivaling stimuli and listen to the tones. The observer’s dominant percept was assessed throughout by measuring the optokinetic nystagmus (OKN), whose slow phase follows the direction of the currently dominant grating. We found that perception in rivalry was affected by the concurrently played tone. Results suggest a bias towards the grating that had been associated with the concurrently presented tone and prolonged dominance durations for this grating compared to the other. Numerically, conditions without tone fell in-between for measures of bias and dominance duration. Our data show that a rapidly acquired arbitrary audio-visual association biases visual perception. Unlike previously reported cross-modal interactions in rivalry, this effect can neither be explained by a pure attentional (dual-task) effect, nor does it require a fixed physical or semantic relation between the auditory and visual stimulus. This suggests that audio-visual associations that are quickly formed by associative learning may affect visual representations directly. Ó 2017 Elsevier Ltd. All rights reserved.
1. Introduction Perceptual rivalry is characterized by a situation in which a constant sensory stimulus evokes distinct perceptual interpretations that alternate in their access to awareness over time (e.g., Boring, 1930; Breese, 1899; Necker, 1832; Rubin, 1921). Binocular rivalry occurs when two distinct images are presented to the two eyes (Wheatstone, 1838). The dynamics of both forms of rivalry share several statistical properties (Brascamp, Klink, & Levelt, 2015; Klink, van Ee, & van Wezel, 2008; O’Shea, Parker, La Rooy, & Alais, 2009). Besides being a research topic in its own right, rivalry has become a tool to study the perceptual consequences of many perceptual, cognitive and action-related factors. Since the stimulus remains unchanged, such effects can then be attributed to a direct operation of the respective factor on the perceptual representation. ⇑ Corresponding author at: Chemnitz University of Technology, Institute of Physics, 09107 Chemnitz, Germany. E-mail address:
[email protected] (W. Einhäuser). http://dx.doi.org/10.1016/j.visres.2017.02.001 0042-6989/Ó 2017 Elsevier Ltd. All rights reserved.
Examples of this endeavor include the demonstration of direct effects of attention on ambiguous-motion perception (Blaser, Sperling, & Lu, 1999), effects of eye movements on the perception of ambiguous figures (Einhäuser, Martin, & König, 2004; Glen, 1940; Kawabata, Yamagami, & Noaki, 1978; Necker, 1832), effects of manual movements on dynamic rivalry stimuli (Beets et al., 2010; Maruya, Yang, & Blake, 2007; Wohlschläger, 2000), and also effects of higher-level concepts, such as value, on binocular rivalry (Marx & Einhäuser, 2015; Wilbertz, van Slooten, & Sterzer, 2014). Here, we follow this logic and use binocular rivalry to study the effect of a learnt arbitrary audio-visual association on perceptual representations. Visual stimuli can have profound influences on auditory perception and vice versa. This is probably most famously evidenced by the McGurk effect, where conflicting visual and auditory information on a spoken syllable lead to a unique audio-visual perception that is a compromise between both modalities but consistent with neither unimodal stimulus (McGurk & MacDonald, 1976). In the ‘‘ventriloquist effect”, a sound source is spatially linked to visual
122
W. Einhäuser et al. / Vision Research 133 (2017) 121–129
motion even if it does not spatially coincide with the visual stimulus (Pick, Warren, & Hay, 1969). This precedence of vision over audition for a spatial judgement is consistent with an optimal integration of both modalities, when – as it is typically the case – visual localization is more reliable than auditory localization; indeed, spatial localization can be dominated by audition when visual acuity is sufficiently reduced experimentally (Alais & Burr, 2004). Especially in transient presentations, audition frequently dominates vision: when accompanied by multiple tones, a single visual flash is perceived as multiple flashes (Shams, Kamitani, & Shimojo, 2002). Similarly, the visual ‘‘flash-lag” illusion – a brief flash is perceived to lag behind a moving object that is in fact presented at the same location – is reduced by a sound preceding the flash (Vroomen & de Gelder, 2004). Ambiguous visual perception can also be biased by auditory stimulation. The streaming/bouncing stimulus presents a striking example; when two opaque moving discs approach each other and continue their trajectory after the point of contact, two distinct perceptual interpretations are possible (Metzger, 1934): the discs can be perceived to stream by each other or to bounce off each other. While visual factors influence the percept, the co-occurrence of a sound with the time of visual contact shifts the bias profoundly towards the bouncing interpretation (Sekuler, Sekuler, & Lau, 1997). In sum, visual and auditory perception influence each other, and illusions or ambiguous situations in either modality present a good means to reveal such effects. Cross-modal effects on visual rivalry have recently become an area of intense research. Besides using the other modality for the presentation of a distracting task (e.g., Alais, van Boxtel, Parker, & van Ee, 2010), several studies build on intrinsic relations between auditory, tactile or olfactory stimuli on the one hand to a visually ambiguous stimulus on the other hand, to address whether other modalities can bias perception in visual rivalry. In the case of tactile-visual interaction, the rotation of an invisible physical sphere in the participant’s hands biases the concurrent perception of an ambiguous kinetic depth (structure-from-motion) sphere towards the direction of the tactile stimulus (Blake, Sobel, & James, 2004). This effect may exploit the sensitivity of a visual brain area, the medio-temporal visual complex in human (MT+), to such tactile motion (Blake et al., 2004). Remarkably, the effect of touch on rivalry does not require the conscious percept of the visual stimulus: presenting a tactile stimulus that is congruent to the suppressed stimulus fosters its breakthrough to dominance (Lunghi, Binda, & Morrone, 2010) and the presence of a congruent tactile stimulus decreases the detection threshold of a probe on the suppressed stimulus (Lunghi & Alais, 2015). In these cases, the matching across modalities needs to be remarkably precise, for example, down to a few degrees in the case of orientation (Lunghi & Alais, 2013). Together with the observation that passive touch suffices to facilitate the suppressed stimulus (Lunghi & Morrone, 2013), the specificity and the modulation of the suppressed percept indicate that cross-modal effects on rivalry may occur at early processing stages prior to awareness. Nonetheless, these studies not only find effects on the suppressed but also on the dominant stimulus, suggesting that both processes – facilitating the suppressed stimulus and extending the dominance of the currently dominant stimulus – contribute to cross-modal effects on binocular rivalry. Physical similarity to an auditory stimulus can also bias binocular rivalry: an auditory stimulus implying motion biases the perception of conflicting random-dot kinematograms in the direction of the auditory stimulus (Conrad, Bartels, Kleiner, & Noppeney, 2010). A bias towards visual motion congruent with a simultaneously presented sound is also observed for looming/receding stimuli, and stronger when these stimuli are more naturalistic than simple sounds (Conrad et al., 2013). Extending on such physical
relatedness across domains, a semantically related stimulus, such as a bird sound or a car sound when images of cars and birds are competing, biases perception towards the visual stimulus that is congruent with the auditory stimulation (Chen, Yeh, & Spence, 2011). A similar effect is observed for congruent odors: the smell of a rose (induced by phenylethyl alcohol) and the smell of a text marker (induced by butanol) increase dominance of the respective picture (Zhou, Jiang, He, & Chen, 2010). Besides spatial and semantic similarity, a match in temporal structure also facilitates crossmodal effects on visual rivalry: when a visually looming stimulus is paired with an auditory looming stimulus, the former can be held in awareness more easily, and this effect to some extent generalizes to other sounds as long as the rhythmicity helps to keep up attentional control (van Ee, van Boxtel, Parker, & Alais, 2009). van Ee et al. (2009) show similar effects also for tactile stimulation with the appropriate temporal pattern. Tactile and auditory stimulation can bias perception in binocular rivalry towards the visual stimulus that shares the temporal frequency with the other domains; importantly this still holds when perceiving the frequency requires integration across the non-visual domains, which argues that such supramodal integration precedes biasing rivalry (Lunghi, Morrone, & Alais, 2014). Temporal and spatial specificity can also be combined: when an auditory stimulus is presented concurrently with two rivalry gratings, the grating whose spatial frequency is perceptually matched to the amplitude modulation of the auditory stimulus dominates over a grating of a different spatial frequency (Guzman-Martinez, Ortega, Grabowecky, Mossbridge, & Suzuki, 2012). All of the aforementioned examples have in common that the stimulus in the other modality (or modalities) has some intrinsic commonality with the visual stimulus to be biased: the tactile rotation describes the same physical object as the visual stimulus, the temporal and/or spatial frequencies are matched between modalities, the bird and vehicle pictures are semantically matched to their corresponding sounds, motion patterns share the same direction, and the visual spatial frequency is a priori perceived to be more similar to one amplitude modulation pattern than to another. Here we ask, instead, whether an arbitrary audio-visual association that is acquired through a brief period of training (induction) suffices to exert similar influences on visual perception in binocular rivalry. Complementary to previous studies, such effects would provide evidence that cross-modal associations strong enough to interfere with perception do not require a lifelong period of training or relatedness to the same physical object, but can instead be arbitrary and formed quickly. This in turn would indicate that cross-modal representations interfering with rather early stages of vision are highly plastic even in adult observers. Importantly for the interpretation of our data, we circumvent the issue of response bias by using a no-report paradigm (Tsuchiya, Frässle, Wilke, & Lamme, 2015 for a review). Specifically, we use moving gratings as stimuli and exploit the observation that the direction of the slow-phase of the optokinetic nystagmus (OKN) follows the perceived stimulus motion (Enoksson, 1963; Fox, Todd, & Bettinger, 1975; Frässle, Sommer, Jansen, Naber, & Einhäuser, 2014; Marx & Einhäuser, 2015; Naber, Frässle, & Einhäuser, 2011). This allows us to have observers watch a binocular-rivalry stimulus without reporting their percept, while we nonetheless determine their perception at any point in time. In a first step, we induce two audio-visual associations, each between a tone and a grating, by requiring observers to respond to these combinations of tone and grating, but not to other combinations. In a second phase, we present the two gratings, one to each eye, and play either one of the tones or no tone. We hypothesize that playing a tone increases the relative perceptual dominance of the grating associated with the tone at the expense of the other grating’s dominance.
W. Einhäuser et al. / Vision Research 133 (2017) 121–129
123
2. Materials and methods
2.4. Procedure
2.1. Participants
The experiment consisted of two phases, an induction phase and a rivalry phase. The objective of the induction phase was to couple each of the two tones to a specific grating, which was distinguished from the other presented gratings by motion direction and color. Such coupling should result in two specific tonegrating associations to be formed, which could be tested in the subsequent rivalry phase. When referring to the rivalry phase, we will – for simplicity of notation – refer to grating-tone pairs that were coupled during induction, as ‘‘audio-visual association”, ‘‘grating associated with the tone”, etc., although strictly speaking the ‘‘association” is an assumed consequence of the coupling during induction and thus part of the hypothesis to be tested.
Sixteen volunteers (age: 20–26, mean: 22.7, sd: 1.8; 9 female, 7 male) participated. All were naïve to the purpose of the experiment, had normal or corrected-to-normal vision, normal color vision, reported normal hearing and gave written informed consent prior to participation. All procedures conformed to the Declaration of Helsinki; the applicable body (Ethikkommission der Fakultät für Human- und Sozialwissenschaften, TU Chemnitz) determined that the experiments do not require in-depth ethics evaluation.
2.2. Setup A custom-made mirror-stereoscope was used for dichoptic presentation of binocular-rivalry stimuli. Stimuli were displayed on two 210 CRT screens (Samsung, Seoul, South Korea) set to 1024 768 resolution at 85 Hz at a viewing distance of 30 cm. Auditory stimuli were presented diotically through calibrated Sennheiser HD 25-1 (70 X) headphones at 75 dB(A). Throughout the experiment, eye position was recorded binocularly at 500 Hz by an Eyelink-1000 eye-tracking device (SR Research, Ottawa, ON, Canada). Eye tracking was performed through the stereoscope’s mirrors, which were transparent to the infrared light used by the eye tracker. As information from both eyes is highly redundant in binocular rivalry (Naber et al., 2011), only data from one eye was used for analysis. Responses during the induction phase (see Section 2.4 below) were recorded on a USB gamepad, whose state was recorded together with the Eyelink data. Experiments were conducted in a sound-attenuated room with no source of light or sound except for the screens and headphones. Matlab (Mathworks, Natick, MA) with its Psychophysics and Eyelink toolboxes (Brainard, 1997; Cornelissen, Peters, & Palmer, 2002; Pelli, 1997) was used for stimulus presentation.
2.3. Stimuli All visual stimuli were centrally presented isoluminant (10 cd/ m2) square-wave gratings with a spatial frequency of 0.2 cycles/ degree that extended 20 degrees (256 pixels) vertically and 39 degrees (512 pixels) vertically. Two vertical gratings were used, blue (CIE coordinates x = 0.151; y = 0.065) on gray and red (x = 0.623; y = 0.344) on gray, as well as one horizontal grating, which was green (x = 0.287, y = 0.609) on gray. The transitions between the gray and colored phases of each grating were smoothed by reducing the luminance of the chromatic phase gradually by 30% following an inverse square function. The vertical gratings drifted horizontally, the horizontal grating drifted vertically, all at a speed of 18.4 degrees/s (240pix/s). Vertical drift was always upward, horizontal drift was leftward or rightward with the color-direction association being fixed in each observer, but counterbalanced across observers (see Section 2.4). Gratings as well as blank screens were surrounded by a high-contrast pattern, which consisted of squares (width: 1.3 degrees) that were randomly chosen to be either black (<0.1 cd/m2) or white (85 cd/m2) squares. The pattern was identical in both eyes and provided a fixed depth plane to suppress systematic vergence eye movements (Fig. 1A). Two auditory stimuli were used, a high-pitch (1008 Hz) and a low-pitch (400 Hz) sinewave tone of 50 ms duration. Each sine tone was started and ended with 5 ms raised cosine onset and offset amplitude ramps to avoid technical artifacts and to produce perceptually smooth loudness increases and decreases.
2.4.1. Induction phase In each observer, the experiment started with at least 11 ‘‘induction” blocks of 72 trials each. In the induction phase, the same grating was presented to both eyes for 600 ms, followed by a 600 ms blank (10 cd/m2 at the previous location of the gratings). During presentation of the grating, one of the two tones was presented. Sound presentation occurred every 153 ms (13 visual frames), yielding 4 sounds during the grating presentation (at 71 ms, 224 ms, 377 ms and 529 ms). All 6 combinations of sound (2) and visual stimulus (3) occurred equally often (12 times) and in random order. Observers were instructed to press the right back button of the USB gamepad as quickly as possible if one of two prespecified sound-color pairings occurred (‘‘go trials”) and to refrain from pressing a button for any other combination (‘‘no-go” trials, Fig. 1B). If observers responded within 1.2 s after onset in a go trial, this was counted a ‘‘hit”, if they failed to respond a ‘‘miss”. If observers responded within the 1.2 s in a no-go trial, this was determined a ‘‘false alarm”, and if they did not respond a ‘‘correct rejection”. For half of the observers the pairings red/low-pitch and blue/high-pitch were go trials, for the other half the pairings blue/low-pitch and red/high-pitch were go trials. In half of each group the red grating moved rightward and the blue leftward, in the other half the link between color and direction was reversed. Consequently, the coupling in the induction phase for each observer should form two associations between color and motion on the one hand and a specific tone on the other hand. The induction phase was terminated when an observer had performed at least 11 blocks and in the last block no more than one false alarm and one miss had occurred (Fig. 1c); otherwise the induction block was repeated until this criterion was met. 2.4.2. Rivalry phase After successful induction, 6 rivalry blocks were performed. Each rivalry block started with a 72-trial induction block that was identical to the previously described induction blocks. After a five-second blank, one of the drifting vertical gratings (red/blue) was presented to one eye, the other vertical grating to the other eye for 3 min. This period was split into 3 trials of 1 min each, in which either the high-pitch tone, the low-pitch tone or no tone was presented. As for the induction phase, tones were presented for 50 ms in 153 ms intervals, that is, 392 tones for each 1-min rivalry trial (Fig. 1c). The order of the 3 auditory conditions was counter-balanced between the 6 rivalry blocks; the assignment of eye to grating was reversed for half of the rivalry blocks. Observers were instructed to watch the stimuli with keeping their eyes open, that they were free to move their eyes naturally and that no response was to be given during the rivalry trials. Since red and blue gratings drifted in different directions, the currently dominant grating was expected to induce an OKN whose slow-phase direction corresponded to the dominant grating’s direction.
124
W. Einhäuser et al. / Vision Research 133 (2017) 121–129
Fig. 1. (A) Stimulus (rivalry phase). To each eye a drifting grating was presented on identical high contrast patterns. (B) Example induction phase for an observer for whom the red grating drifted rightward and was coupled with the low-pitch tone and the blue grating drifted leftward and was coupled with the high-pitch tone. Plus and minus signs indicate go and no-go trials, respectively; for this example responses were to be given (‘‘go”) whenever either the red grating and the low-pitch tone or the blue grating and the high-pitch tone co-occurred, but not for any other combination (‘‘no-go”). The high-contrast pattern surrounding the stimuli is omitted for illustration purposes, but was also presented during induction. (C) Overview of experimental procedure. (D) Top: excerpt of an example horizontal eye trace, with detected fast phases (gray) and slow phases (red). Bottom: Corresponding OKN slow phase gain, time axis is identical in both plots.
W. Einhäuser et al. / Vision Research 133 (2017) 121–129
2.5. Analysis 2.5.1. Optokinetic nystagmus (OKN) Eye movements were recorded throughout. To extract the currently perceptually dominant grating in the rivalry phase, the direction of the OKN slow phase was determined as described previously (Marx & Einhäuser, 2015; Naber et al., 2011). In brief: OKN fast phases, which have velocity and acceleration profiles similar to saccades, were detected by the saccade detection software implemented in the Eyelink software package with the parameters of 35 degrees/s (velocity threshold) and 9500 degrees/s2 (acceleration threshold). These periods were removed and treated as missing data for analysis, as were periods of eye blinks. For the remaining data, the average horizontal velocity of each individual slow phase was determined by fitting the optimal linear function (in a leastsquares sense) to the data (Fig. 1d). The value of this slope divided by the grating speed (slow-phase gain), and the sign of the slope (slow-phase direction) are used for further analysis. From slow-phase gain and slow-phase direction, we determined three variables that characterize the rivalry phases: (i) the bias towards one motion direction, (ii) the average gain, and (iii) the dominance duration.
2.5.2. Bias towards one motion direction First, we aggregated data over the corresponding trials (highpitch, low-pitch, no tone) of the six rivalry blocks. Since per observer each tone was coupled with one specific motion direction, we will refer to these auditory conditions as ‘‘tone associated with leftward grating”, ‘‘tone associated with rightward grating” and ‘‘no tone”, respectively. To obtain a measure of bias, we subtracted the total time the OKN gain was negative (leftward) from the total time the OKN gain was positive (rightward) and divided this difference by the sum of both values.1 This provides us with a measure of bias towards perceiving rightward motion. This measure would be +1, if only rightward motion were perceived, 1, if only leftward motion were perceived, and 0 if there is no bias to either motion direction. For this measure the hypothesis of an effect of audiovisual associations on rivalry perception predicts a larger value when the tone associated with rightward motion is played than when the tone associated with leftward motion is played, with the no-tone condition falling in-between.
2.5.3. Average gain The bias measure uses the direction of the OKN slow phase, but weighs periods of large gains and small gains equally strongly. Under the assumption that a larger gain in either direction reflects clearer dominance or a more vivid perceptual impression of the corresponding grating, it seems reasonable to take the gain value also into account. Again we aggregated data over all corresponding phases of the six blocks and averaged the gain in this period. Since the gain even under unambiguous presentation is bounded by 1, the average is expected to fall between 1 (leftward grating always dominant) and +1 (rightward grating always dominant), with values around 0 implying equidominance of both motion directions. The hypothesis of the audio-visual association affecting perception predicts that the average gain is larger when the auditory stimulus associated with rightward motion is played than when the auditory stimulus associated with leftward motion is played. 1 Since we defined the gain only for the slow phases, fast phases and blinks are not included in the analysis, and hence the denominator is slightly smaller than the total duration of the condition.
125
2.5.4. Absolute gain As a possible proxy for the vividness of perception, we measured the absolute gain. Unlike for the average gain measure, we here at any point in time took the absolute value of the OKN gain. This measure is bounded by 0 from below. A value close to the upper limit of 1 implies that a single motion direction is exclusively perceived and the eye movement compensates this motion nearly perfectly. While oculomotor factors can contribute to gains clearly below 1 in an individual, within the same observer a lower value of absolute gain in one condition can be interpreted as less vivid or more mixed percepts for this condition than for conditions that show higher values of absolute gain. 2.5.5. Dominance durations If the audio-visual association indeed affects perceptual dominance in the rivalry phases as hypothesized, there are two possible – not mutually exclusive – ways to increase the dominance of one percept relative to the other: the dominance durations of the grating associated with the current tone could increase, or the dominance duration of the other tone could decrease. We estimated the duration of dominance by defining sign changes of the OKN gain (i.e., changes in OKN slow-phase direction) as perceptual transitions and the times between sign changes as the dominance duration of the grating in direction of the OKN slow phase. Again, we aggregated data over all six rivalry blocks. For the ‘‘no tone” condition, all dominance durations (leftward and rightward grating) of the no-tone phases were considered. Dominance durations of phases with tone were split into those corresponding to the grating associated with the tone, and those corresponding to the other percept, which was not associated with the tone. 2.6. Statistical analysis For bias and average gain, there are three auditory conditions: tone associated with leftward drifting grating, no tone, tone associated with rightward drifting grating; for dominance durations there are three conditions: grating associated with concurrent tone (‘‘same”), grating associated with the other tone (‘‘other”) and no tone. In all cases, a one-factor repeated-measures ANOVA with the 3-level factor auditory condition was computed. Reported pvalues are corrected for non-sphericity using the GreenhouseGeisser (GG) adjustment. Uncorrected degrees of freedom and GG-epsilon (e) are reported. Whenever an rmANOVA revealed a main effect at a 5% alpha level, and therefore indicated an effect of the auditory condition on visual perception, post hoc tests were conducted for all pairwise comparisons among the three levels of the auditory-condition factor. To be conservative, we used two-tailed tests and asserted significance to a post hoc test only, if the uncorrected p-value fell below a Bonferroni-adjusted alpha-level of 0.05/3 = 0.0167. Data processing was done in Matlab; statistical analysis used the R software package version 3.1.0 (R Development Core Team, 2014). 3. Results 3.1. Induction performance Observers needed between 11 (the predefined minimum) and 15 induction blocks (mean: 12.4, sd: 1.4) to reach the criterion of maximally one miss and one false alarm (corresponding to 97.2% correct). This corresponds to 18 min (12.4 72 1.2 s) of training on average. In the induction trials preceding each rivalry block, error rates remained low: across observers, correct response rates ranged between 94.2% and 99.8% (mean: 97.8%, sd: 1.4%). Reaction
126
W. Einhäuser et al. / Vision Research 133 (2017) 121–129
times in the induction blocks ranged from 490 ms to 706 ms (mean: 584 ms, sd: 64 ms) and thus were well below the maximum allowed time of 1.2 s (600 ms grating plus 600 ms blank). Hence, possible double accounts of late responses as miss in the current block and as false alarm in the subsequent block are likely to be negligible. Importantly, after training in the initial induction phase, the performance did not noticeably degrade during the rivalry phase. 3.2. Bias towards visual stimulus associated with concurrent tone Over all rivalry blocks and conditions, there was a numerical preference for perceiving the leftward moving grating as indicated by an overall bias of the OKN slow phase towards leftward motion. On top of this preference, the dominant percepts depended on the auditory condition (Fig. 2): when the tone that had been coupled with the leftward motion during induction was played with the rivalry stimulus, there was an average leftward bias (22.1% ± 7.4% to the left, mean ± sem); for the tone coupled with the rightward motion, the average bias was to the right (1.9% ± 7.4% to the right) and the no-tone condition fell numerically in-between (9.1% ± 4.8% to the left). A repeated-measures ANOVA confirmed a main effect of the auditory condition on visual perception [F(2,30) = 5.41, p = 0.03, e = 0.61]. Although post hoc pairwise comparisons between any of the 3 conditions did not reach significance at an adjusted 5% alpha-level of 0.0167 (difference between the two conditions with tones: t(15) = 2.46, p = 0.026; between tone associated with leftward motion and no tone: t(15) = 2.45, p = 0.027; between tone associated with rightward motion and no tone: t(15) = 1.82, p = 0.09; all p-values uncorrected), the significant main effect of the rmANOVA shows that the coupling between tones and gratings during induction had led to specific audio-visual associations that biased visual perception in rivalry. These results are in line with the hypothesis: the coupling of tone and grating in go-trails of the induction phase leads to audio-visual associations being formed. These newly acquired associations persist in the rivalry phase. The numerical order of the results suggests further that visual perception during rivalry is biased towards the visual stimulus associated with the concurrently played auditory stimulus. 3.3. Average gain The average gain showed a similar pattern as the bias (Fig. 3): when the tone associated with rightward motion was played, the average gain was rightward (0.06 ± 0.03) indicating dominance of the rightward-drifting grating, when the other tone was played, the average gain was leftward (0.10 ± 0.04), and the no-tone condition fell in-between (0.03 ± 0.02). These numerical differences were statistically confirmed by a main effect of auditory condition in the repeated-measures ANOVA [F(2,30) = 5.86, p = 0.03, e = 0.55]. None of the pairwise post hoc tests reached significance at the corrected alpha level of 0.0167 (difference between the two conditions with tone: t(15) = 2.48, p = 0.025; difference between tone associated with leftward motion and no tone: t(15) = 2.07, p = 0.056; between tone associated with rightward motion and no tone: t(15) = 2.62, p = 0.019; all p-values uncorrected). Nonetheless, the rmANOVA result again is consistent with the hypothesis that newly formed associations between tone and grating biased observers’ perception in rivalry, and the numerical order of the results suggests that the bias was towards perceiving the grating that had been associated with the concurrent auditory stimulus. The general bias to perceive leftward motion was investigated further for the no-tone condition, where no directional bias should be expected a priori. Indeed, we found only a trend for the bias (t(15) = 1.90, p = 0.08) and no such effect for the average gain
tone associated with leftward grating
no tone tone associated with rightward grating
-30%
left
0 bias
right
30%
Fig. 2. Bias of OKN slow phase direction for the three different auditory conditions; bias was defined as the difference between the aggregated durations of rightward slow phase and leftward slow phase divided by the sum of these durations. Mean and standard error across observers.
tone associated with leftward grating
no tone
tone associated with rightward grating
0.15
left
0 gain
right
0.15
Fig. 3. Average gain for the three different auditory conditions. Mean and standard error across observers.
(t(15) = 1.20, p = 0.25). We also tested whether the bias towards one motion direction depended on which eye which motion direction was presented to (i.e., whether the motion was inward or outward) and observed neither a difference in bias (t(15) = 1.67, p = 0.12) nor in average gain (t(15) = 0.81, p = 0.43). To approximate the vividness or exclusivity of the percept, we measured the absolute gain at each point in time and treated it analogously to the average gain for analysis. We did not find evidence for an effect of the tone on the absolute gain, neither when all conditions were treated separately [F(2,15) = 2.25, p = 0.14, e = 0.80] nor when the conditions with tone were aggregated and compared to the no-tone condition (t(15) = 1.59, p = 0.13).
3.4. Dominance durations Dominance durations, as determined from changes in the sign of the OKN slow phase, depended on the presented tone [F(2,30) = 4.19, p = 0.03, e = 0.81; Fig. 4]. They were longer for the grating that was associated with the concurrently presented tone (2.52 s ± 0.28 s) than for the other grating (1.96 s ± 0.15 s), though this numerical difference did not reach significance at the Bonferroni-adjusted alpha level (t(15) = 2.36, p = 0.032, p-value uncorrected). Although the no-tone condition fell in-between numerically (2.14 s ± 0.20 s), post hoc tests did not show a significant difference to either of the other conditions (to grating associated with tone: t(15) = 1.96, p = 0.07; to other grating: t(15) = 1.15, p = 0.27). Hence, the present data cannot distinguish whether the effect on dominance results from an increase in dominance durations in the grating associated with the current tone, from a reduction of dominance durations in the other grating, or from both mechanisms combined. In any case, the analysis of the dominance durations is consistent with the other two measures: perceptual
W. Einhäuser et al. / Vision Research 133 (2017) 121–129
4. Discussion
grating associated with tone no tone
other grating
0
dominance duration [s]
127
3
Fig. 4. Mean dominance duration for the grating paired with the current tone, the other grating and for the no-tone condition. Mean and standard error across observers.
dominance during binocular rivalry is affected by a newly acquired arbitrary audio-visual association. To avoid any response interference, we deliberately refrained from including any condition that would require observers to report about their perception in rivalry. To nonetheless verify that the measured dominance durations resemble those typically observed in rivalry, we analyzed the distribution of dominance durations as measured by OKN. To capture the ‘‘shape” of this distribution in a single parameter, we determined the kurtosis of the distribution in each individual for each condition. The kurtosis would be 3 for a normal distribution2 and larger than 3 for the leptokurtic (heavy-tailed) distribution typically observed for dominance durations in rivalry. Indeed, we found the kurtosis to be 16.1 ± 3.1 (mean ± sem over observers) for the grating associated with the tone, 11.4 ± 2.2 for the other grating, and 9.8 ± 1.3 for the no-tone condition. All these values were significantly different from 3 (all t(15) > 3.8, all ps < 0.002), showing a leptokurtic distribution of dominance durations for all conditions.
3.5. Relation between the measures The three measures – bias, average gain and dominance durations – test related but distinct aspects of perceptual dominance. Whereas average gain weighs periods with low (absolute) gain less than periods of high absolute gain, the bias measure treats all time points equal, no matter whether there is only a slight bias towards one motion direction. Such small biases can be related to less vivid or mixed percepts. Since an increase in dominance – as measured by gain and bias – can result from increasing one grating’s dominance duration or from decreasing the other grating’s dominance duration, mean dominance duration captures a third aspect. To test whether these measures are consistent within an individual, we quantified the effect strength for each measure in each individual for the rivalry conditions in which a tone was played. The differences between the tone associated with rightward and leftward tone were used for bias and average gain; the difference in mean dominance duration between the grating associated with the concurrent tone and the other grating was used for the dominance duration. We found that the effects of bias and average gain correlated across individuals (r(14) = 0.98, p < 0.001), as did the effects of bias and dominance duration (r(14) = 0.97, p < 0.001) and the effects of average gain and dominance duration (r(14) = 0.98, p < 0.001). Hence, although the measures capture different aspects of dominance, they were highly consistent. 2 We use the definition of kurtosis as standardized forth moment of a distribution (forth moment divided by the squared variance), while some authors define the kurtosis by subtracting 3 from this number, which would yield 0 kurtosis for the Gaussian distribution.
Our findings show that with about 20 min of induction, an arbitrary audio-visual association is formed that subsequently biases perception in binocular rivalry. Importantly, there is no requirement of report or monitoring during the rivalry phase, rendering it unlikely that the observed effect were a mere consequence of instruction or response bias. Given that the visual stimulus remains unchanged during rivalry blocks, our data suggest a direct effect of an acquired arbitrary audio-visual association on visual representations. In the present study, we deliberately refrained from including an ‘‘active” condition, in which observers had to report their percept during rivalry. The rationale was to avoid any possible confounds by the necessity to respond. We demonstrated earlier, however, with the identical setup and similar stimuli, that the OKN is a reliable marker of the dominant perception on a moment-by-moment basis (Marx & Einhäuser, 2015; Naber et al., 2011). Other groups have since obtained a similar degree of correspondence between OKN and perceptual report (Ketkar, Wilbertz, & Sterzer, 2016). Moreover, the dominance durations as measured by OKN follow a leptokurtic (heavy-tailed) rather than a Gaussian distribution, which is a typical result for rivalry (Levelt, 1965). Hence, we consider it as a given that the OKN is a reliable indicator of current perception in our paradigm. When approximating the vividness of the percept by the absolute value of the gain (cf. Einhäuser, Thomassen, & Bendixen, 2017), we did not find any effect of the tone condition on the vividness of the percept as approximated by the absolute gain. This does, however, not imply that the vividness of the percept was independent of condition – it could also mean that the OKN is not sufficiently sensitive to reveal such a difference in the present paradigm. Provided the increased interest in transition phases and mixed percepts (e.g., Knapen, Brascamp, Pearson, van Ee, & Blake, 2011), the effect of audio-visual associations on mixed or piecemeal percepts remains an interesting issue for further research. Perceptual multistability itself is not limited to the visual domain. In fact, it has been described in almost all sensory modalities, including audition (van Noorden, 1975; Warren & Gregory, 1958), touch (Carter, Konkle, Wang, Hayward, & Moore, 2008) and olfaction (Zhou & Chen, 2009). In the case of vision and audition, several attempts have been made to correlate properties of the individual switching dynamics across the two domains. The observed correlations were weak to absent (Kondo et al., 2012; Pressnitzer & Hupé, 2006), which distinguishes these results from similar measures within either modality (Carter & Pettigrew, 2003; Kashino & Kondo, 2012; Sheppard & Pettigrew, 2006). Even though we could recently demonstrate an effect of the dominant percept in auditory multistability on dominance in binocular rivalry (Einhäuser et al., 2017), this required specific tailoring of the visual stimulus to match the auditory perceptual experience. Similarly, studies that investigated auditory effects on visual rivalry have so far used stimuli that were matched across domains (Blake et al., 2004; Chen et al., 2011; van Ee et al., 2009). In the light of these studies, it is remarkable that a newly acquired arbitrary association of a tone with motion direction and color subsequently influences perceptual representation. Several studies that use auditory cues to bias the visual perception have argued that the auditory stimulus modulates attention to the visual stimulus rather than acting on perceptual representations themselves. Indeed, attention can bias perceptual appearance of physical stimulus attributes, such as contrast (Carrasco, Ling, & Read, 2004) or color (Blaser et al., 1999). Attention affects binocular rivalry in two distinct ways: first, the perception of a certain attribute can be boosted by attention to this attribute (Marx &
128
W. Einhäuser et al. / Vision Research 133 (2017) 121–129
Einhäuser, 2015; Ooi & He, 1999; van Ee, van Dam, & Brouwer, 2005). Indeed, this extends to cross-modal effects, as a concurrent auditory or tactile rhythm helps maintaining volitional control over one’s perception (van Ee et al., 2009). Second, distracting visual attention from the stimulus can slow the rivalry process (Paffen, Alais, & Verstraten, 2006) to an extent that withdrawing attention can stop binocular rivalry completely (Brascamp & Blake, 2012; Zhang, Jamison, Engel, He, & He, 2011), although this does not hold for other forms of rivalry (Pastukhov & Braun, 2007). Paying attention to an auditory task slows alternations in visual rivalry, even if the auditory task is unrelated to the visual stimulus (Alais et al., 2010). Given this, two cross-modal effects on rivalry need to be distinguished: dual-task interference and biases to semantically or physically related content. While the former is by definition an attentional effect, the latter does not necessarily require attention, at least not for the semantic relatedness case (Chen et al., 2011). Our current results extend such semantic relatedness to a newly formed, arbitrary audio-visual association. While attentional effects may contribute to the present findings, other explanations are also possible. We note that the presentation is symmetric in the sense that each visual and each auditory stimulus is presented for the same time during all phases – the association is only formed by the required response during induction. Moreover, the effect manifests itself without explicit task, hence there is no incentive towards a volitional preference of one stimulus over the other. Hence, the induction may lead to a form of shared representation that is akin to the semantic relatedness that similarly needs to be acquired though on a vastly different timescale. Such shared representations could recruit responses to non-visual modalities in mid-level visual areas, as has been shown for the influence of touch on bi-stable motion (Blake et al., 2004). It is also conceivable that the association boosts the effective low-level properties (such as effective contrast or effective saturation) of the respective stimulus. Such an increase in effective stimulus strength would solely lead to a reduction in the dominance durations of the non-associated stimulus, but not to an increase in dominance durations of the associated stimulus (Brascamp et al., 2015; Levelt, 1965). This would predict that the dominance durations for the grating associated with the concurrent tone are indistinguishable from the no-tone condition, whereas the dominance duration for the other grating should be smaller than in the notone condition. In the present data, the no-tone conditions falls numerically in-between the other two conditions with respect to dominance durations, with significant differences to neither, such that on the basis of the present data this possibility remains open. It is well possible that both processes, weaker suppression and stronger dominance, contribute. This would be similar to the observations for physically related stimuli between the tactile and visual domains (Lunghi & Alais, 2013; Lunghi & Morrone, 2013), but contrary to the effect of auditory motion on visual motion perception (Conrad et al., 2010). In sum, increases in effective stimulus strength, attentional effects as well as a more central shared audio-visual representation may contribute to the observed cross-modal effects. These explanations are not mutually exclusive. In fact, attentional effects may help shaping the representation as well as mediate a larger perceived stimulus strength. While clearly beyond the present proof-of-principle study, the present no-report paradigm can readily be extended to test other induction protocols as well as interactions of the newly acquired audio-visual associations with attention and stimulus properties. Most contemporary research on audio-visual integration focuses on situations in which auditory and visual information have distinct reliability. A typical result states that when visual information is more reliable, such as in most spatial judgements, visual information dominates (Alais & Burr, 2004), whereas when auditory information is more reliable, such as in most temporal
judgements, auditory information dominates (Burr, Banks, & Morrone, 2009; Shams et al., 2002). In both cases, the integration of cues is often optimal in a statistical (Bayesian) sense: the more reliable cue is weighed more strongly than the less reliable cue (Alais & Burr, 2004; Bresciani, Dammeier, & Ernst, 2008). In the present case, the visual stimulus is ambiguous and thus unreliable, therefore the notion that the unambiguous auditory stimulus determines the dominant overall perceptual experience is at least consistent with the notion of optimal usage of cues. Obviously, there is still some visual evidence for the competing stimulus and therefore rivalry is not completely abolished, but biased towards the percept consistent with the auditory stimulus. To form the audio-visual association, we used a paradigm that couples a specific visual with a specific auditory stimulus through the requirement of a response to exactly these particular pairings. These pairings were as frequent as the reverse pairings, but due to an additional neutral visual stimulus (the green upward grating) the action-relevant events only accounted for a third of the overall induction trials. The successful formation of an audio-visual association under these conditions is consistent with the theory of event coding (Hommel, Müsseler, Aschersleben, & Prinz, 2001). According to this theory, the action, the auditory and its associated visual stimulus form an event file. Upon retrieval of the event, only the audio-visual association matters, as the same action is part of both events and thus uninformative. Irrespective of the underlying mechanism, the current study shows that the chosen induction paradigm is effective in forming an audio-visual association that biases subsequent perception. Nonetheless, it will be an issue for further research whether the effects generalize to associations formed by different induction protocols. For the opposite direction – effects of visual motion on auditory perception as measured by auditory aftereffect sizes – mere exposure to the audio-visual pairing sufficed to induce an effect (Vroomen & de Gelder, 2003). This makes it conceivable that exposure to the audio-visual pairings will have a similar effect on visual rivalry as the present induction protocol, where the exposure to all pairs was equal. Similarly, it is conceivable that associating certain pairings with reward during induction in a classical condition paradigm will also yield audiovisual associations that bias perception. In any case, the present study demonstrates that binocular rivalry, as measured by a noreport paradigm, is sufficiently sensitive to reveal effects of audio-visual associations on visual perception. In conclusion, the present data demonstrates direct effects of newly acquired audio-visual associations on perceptual dominance in binocular rivalry. Attention and effective stimulus strength may contribute to the effect, but induction may also create a shared representation, or a common file, for the auditory and visual stimuli to be associated. When one modality (vision) becomes ambiguous during the rivalry phase, and the other remains reliable (audition), the latter may bias the integrated perception towards the learnt association to maintain a unique and consistent percept across modalities.
Acknowledgments The work was supported by the German Research Foundation (DFG) through SFB/TRR-135 (B4). The authors thank Monique Michl and Julia Trojanek for support in data collection.
References Alais, D., & Burr, D. (2004). The ventriloquist effect results from near-optimal bimodal integration. Current Biology, 14(3), 257–262. Alais, D., van Boxtel, J. J., Parker, A., & van Ee, R. (2010). Attending to auditory signals slows visual alternations in binocular rivalry. Vision Research, 50(10), 929–935.
W. Einhäuser et al. / Vision Research 133 (2017) 121–129 Beets, I. A. M., Hart, B. M. T., Rösler, F., Henriques, D. Y. P., Einhäuser, W., & Fiehler, K. (2010). Online action-to-perception transfer: Only percept-dependent action affects perception. Vision Research, 50, 2633–2641. Blake, R., Sobel, K. V., & James, T. W. (2004). Neural synergy between kinetic vision and touch. Psychological Science, 15, 397–402. Blaser, E., Sperling, G., & Lu, Z. L. (1999). Measuring the amplification of attention. Proceedings of the National Academy of Sciences USA, 96, 11681–11686. Boring, E. G. (1930). A new ambiguous figure. American Journal of Psychology, 42, 444–445. Brainard, D. H. (1997). The psychophysics toolbox. Spatial Vision, 10, 433–436. Brascamp, J. W., & Blake, R. (2012). Inattention abolishes binocular rivalry: Perceptual evidence. Psychological Science, 23(10), 1159–1167. Brascamp, J. W., Klink, P. C., & Levelt, W. J. (2015). The ‘laws’ of binocular rivalry: 50 years of Levelt’s propositions. Vision Research, 109, 20–37. Breese, B. B. (1899). On inhibition. The Psychological Review: Monograph Supplements, 3(1). i-65. Bresciani, J. P., Dammeier, F., & Ernst, M. O. (2008). Tri-modal integration of visual, tactile and auditory signals for the perception of sequences of events. Brain Research Bulletin, 75(6), 753–760. Burr, D., Banks, M. S., & Morrone, M. C. (2009). Auditory dominance over vision in the perception of interval duration. Experimental Brain Research, 198(1), 49–57. Carrasco, M., Ling, S., & Read, S. (2004). Attention alters appearance. Nature Neuroscience, 7, 308–313. Carter, O., Konkle, T., Wang, Q., Hayward, V., & Moore, C. (2008). Tactile rivalry demonstrated with an ambiguous apparent-motion quartet. Current Biology, 18, 1050–1054. Carter, O. L., & Pettigrew, J. D. (2003). A common oscillator for perceptual rivalries? Perception, 32(3), 295–305. Chen, Y.-C., Yeh, S.-L., & Spence, C. (2011). Crossmodal constraints on human perceptual awareness: Auditory semantic modulation of binocular rivalry. Frontiers in Psychology, 2, 212. Conrad, V., Bartels, A., Kleiner, M., & Noppeney, U. (2010). Audiovisual interactions in binocular rivalry. Journal of Vision, 10, 27. Conrad, V., Kleiner, M., Bartels, A., Hartcher O’Brien, J., Bülthoff, H. H., & Noppeney, U. (2013). Naturalistic stimulus structure determines the integration of audiovisual looming signals in binocular rivalry. PLoS One, 8(8), e70710. Cornelissen, F. W., Peters, E. M., & Palmer, J. (2002). The eyelink toolbox: Eye tracking with MATLAB and the psychophysics toolbox. Behavior Research Methods, Instruments, & Computers, 34, 613–617. Einhäuser, W., Martin, K. A. C., & König, P. (2004). Are switches in perception of the Necker cube related to eye position? European Journal of Neuroscience, 20, 2811–2818. Einhäuser, W., Thomassen, S., & Bendixen, A. (2017). Using Binocular Rivalry to tag foreground sounds: towards an objective visual measure for auditory multistability. Journal of Vision, 17(1) 34. Enoksson, P. (1963). Binocular rivalry and monocular dominance studied with optokinetic nystagmus. Acta Ophthalmologica (Copenhagen), 41, 544–563. Fox, R., Todd, S., & Bettinger, L. A. (1975). Optokinetic nystagmus as an objective indicator of binocular rivalry. Vision Research, 15, 849–853. Frässle, S., Sommer, J., Jansen, A., Naber, M., & Einhäuser, W. (2014). Binocular rivalry – Frontal activity relates to introspection and action, but not to perception. Journal of Neuroscience, 34, 1738–1747. Glen, J. S. (1940). Ocular movements in reversibility of perspective. Journal of General Psychology, 23, 143–281. Guzman-Martinez, E., Ortega, L., Grabowecky, M., Mossbridge, J., & Suzuki, S. (2012). Interactive coding of visual spatial frequency and auditory amplitudemodulation rate. Current Biology, 22, 383–388. Hommel, B., Müsseler, J., Aschersleben, G., & Prinz, W. (2001). The theory of event coding (TEC): A framework for perception and action planning. Behavioral and Brain Sciences, 24(5), 849–878. Kashino, M., & Kondo, H. M. (2012). Functional brain networks underlying perceptual switching: Auditory streaming and verbal transformations. Philosophical Transactions of the Royal Society of London Series B: Biological Sciences, 367, 977–987. Kawabata, N., Yamagami, K., & Noaki, M. (1978). Visual fixation points and depth perception. Vision Research, 18, 853–854. Ketkar, M. D., Wilbertz, G., & Sterzer, P. (2016). Combined fMRI and eye-tracking based decoding of a bistable plaid motion perception. Perception, 45(Suppl. 2), 277. Klink, P. C., van Ee, R., & van Wezel, R. J. (2008). General validity of Levelt’s propositions reveals common computational mechanisms for visual rivalry. PLoS One, 3, e3473. Knapen, T., Brascamp, J., Pearson, J., van Ee, R., & Blake, R. (2011). The role of frontal and parietal brain areas in bistable perception. Journal of Neuroscience, 31(28), 10293–10301. Kondo, H. M., Kitagawa, N., Kitamura, M. S., Koizumi, A., Nomura, M., & Kashino, M. (2012). Separability and commonality of auditory and visual bistable perception. Cerebral Cortex, 22, 1915–1922. Levelt, W. J. M. (1965). On binocular rivalry. Soesterberg, The Netherlands: Institute for Perception RVO-TNO. Lunghi, C., & Alais, D. (2013). Touch interacts with vision during binocular rivalry with a tight orientation tuning. PLoS One, 8(3), e58754.
129
Lunghi, C., & Alais, D. (2015). Congruent tactile stimulation reduces the strength of visual suppression during binocular rivalry. Scientific Reports, 9, 9413. Lunghi, C., Binda, P., & Morrone, M. C. (2010). Touch disambiguates rivalrous perception at early stages of visual analysis. Current Biology, 4(20), R143–R144. Lunghi, C., Morrone, M. C., & Alais, D. (2014). Auditory and tactile signals combine to influence vision during binocular rivalry. Journal of Neuroscience, 34(3), 784–792. Lunghi, C., & Morrone, M. C. (2013). Early interaction between vision and touch during binocular rivalry. Multisensory Research, 3(26), 291–306. Maruya, K., Yang, E., & Blake, R. (2007). Voluntary action influences visual competition. Psychological Science, 18, 1090–1098. Marx, S., & Einhäuser, W. (2015). Reward modulates perception in binocular rivalry. Journal of Vision, 15(1), 1–13. 11. McGurk, H., & MacDonald, J. (1976). Hearing lips and seeing voices. Nature, 264 (5588), 746–748. Metzger, W. (1934). Beobachtungen über phänomenale Identität. Psychologische Forschung, 19, 1–60. Naber, M., Frässle, S., & Einhäuser, W. (2011). Perceptual rivalry: Reflexes reveal the gradual nature of visual awareness. PLoS One, 6, e20910. Necker, L. A. (1832). Observations on some remarkable optical phaenomena seen in Switzerland; and on an optical phaenomenon which occurs on viewing a figure of a crystal or geometrical solid. London Edinburgh Philosophical Magazine and Journal of Science, 1, 329–337. Ooi, T. L., & He, Z. J. (1999). Binocular rivalry and visual awareness: The role of attention. Perception, 28(5), 551–574. O’Shea, R. P., Parker, A., La Rooy, D., & Alais, D. (2009). Monocular rivalry exhibits three hallmarks of binocular rivalry: Evidence for common processes. Vision Research, 49(7), 671–681. Paffen, C. L. E., Alais, D., & Verstraten, F. (2006). Attention speeds binocular rivalry. Psychological Science, 17(9), 752–756. Pastukhov, A., & Braun, J. (2007). Perceptual reversals need no prompting by attention. Journal of Vision, 7(10), 1–17. 5. Pelli, D. G. (1997). The VideoToolbox software for visual psychophysics: Transforming numbers into movies. Spatial Vision, 10, 437–442. Pick, H. L., Jr., Warren, D. H., & Hay, J. C. (1969). Sensory conflict in judgments of spatial direction. Perception & Psychophysics, 6(4), 203–205. Pressnitzer, D., & Hupé, J.-M. (2006). Temporal dynamics of auditory and visual bistability reveal common principles of perceptual organization. Current Biology, 16, 1351–1357. R Development Core Team (2014). R: A language and environment for statistical computing. Vienna: R Foundation for Statistical Computing. Rubin, E. (1921). Visuell wahrgenommene figuren. Copenhagen, Berlin, London: Gyldendalske Boghandel. Sekuler, R., Sekuler, A. B., & Lau, R. (1997). Sound alters visual motion perception. Nature, 385, 308. Shams, L., Kamitani, Y., & Shimojo, S. (2002). Visual illusion induced by sound. Cognitive Brain Research, 14(1), 147–152. Sheppard, B. M., & Pettigrew, J. D. (2006). Correlates with binocular rivalry and positive mood state. Perception, 35, 157–169. Tsuchiya, N., Frässle, S., Wilke, M., & Lamme, V. (2015). No-report paradigms: Extracting the true neural correlates of consciousness. Trends in Cognitive Sciences, 19(12), 757–770. van Ee, R., van Boxtel, J. J., Parker, A. L., & Alais, D. (2009). Multisensory congruency as a mechanism for attentional control over perceptual selection. Journal of Neuroscience, 29, 11641–11649. van Ee, R., van Dam, L. C. J., & Brouwer, G. J. (2005). Voluntary control and the dynamics of perceptual bi-stability. Vision Research, 45(1), 41–55. van Noorden, L. P. A. S. (1975). Temporal coherence in the perception of tone sequences [Doctoral dissertation]. Eindhoven, The Netherlands: Technical University Eindhoven. Vroomen, J., & de Gelder, B. (2003). Visual motion influences the contingent auditory motion aftereffect. Psychological Science, 14(4), 357–361. Vroomen, J., & de Gelder, B. (2004). Temporal vetriloquism: Sound modulates the flash-lag effect. Journal of Experimental Psychology. Human Perception and Performance, 30(3), 513–518. Warren, R. M., & Gregory, R. L. (1958). An auditory analogue of the visual reversible figure. The American Journal of Psychology, 71, 612–613. Wheatstone, C. (1838). Contributions to the physiology of vision. Part the first. On some remarkable, and hitherto unobserved, phenomena of binocular vision. Philosophical Transactions of the Royal Society of London, 128, 371–394. Wilbertz, G., van Slooten, J., & Sterzer, P. (2014). Reinforcement of perceptual inference: Reward and punishment alter conscious visual perception during binocular rivalry. Frontiers in Psychology, 5, 1377. Wohlschläger, A. (2000). Visual motion priming by invisible actions. Vision Research, 40, 925–930. Zhang, P., Jamison, K., Engel, S., He, B., & He, S. (2011). Binocular rivalry requires visual attention. Neuron, 71(2), 362–369. Zhou, W., & Chen, D. (2009). Binaral rivalry between the nostrils and in the cortex. Current Biology, 19, 1561–1565. Zhou, W., Jiang, Y., He, S., & Chen, D. (2010). Olfaction modulates visual perception in binocular rivalry. Current Biology, 20, 1356–1358.