Hearing Research 256 (2009) 39–57
Research paper
Electrophysiological and psychophysical asymmetries in sensitivity to interaural correlation steps
Helge Lüddemann *, Helmut Riedel, Birger Kollmeier
Medizinische Physik, Carl von Ossietzky Universität Oldenburg, D-26111 Oldenburg, Germany
Article info
Article history: Received 27 November 2008; received in revised form 12 June 2009; accepted 15 June 2009; available online 23 June 2009
Keywords: Binaural psychoacoustics; Spatial diffuseness; Scale transform; Late auditory evoked potentials; Sustained potential; Auditory scene analysis
Abstract
The binaural auditory system's sensitivity to changes in the interaural cross correlation (IAC), as an indicator for the perceived spatial diffuseness of a sound, is of major importance for the ability to distinguish concurrent sound sources. In this article, we present electroencephalographical and corresponding psychophysical experiments with stepwise transitions of the IAC in continuously running noise. Both the transient and the sustained brain responses display electrophysiological correlates of specific binaural processing in humans. The transient late auditory evoked potentials (LAEP) systematically depend on the size of the IAC transition, the reference correlation preceding the transition, the direction of the transition and on unspecific context information from the stimulus sequence. The psychophysical and electrophysiological data are characterized by two asymmetries. (1) Major asymmetry: for reference correlations of +1 and −1, psychoacoustical thresholds are comparatively lower, and the peak-to-peak amplitudes of LAEP are larger than for a reference correlation of zero. (2) Minor asymmetry: for IAC transitions in the positive parameter range, perceptual thresholds are slightly better and peak-to-peak amplitudes are larger than in the negative range. In all experimental conditions, LAEP amplitudes are linearly related to the dB scaled power ratio of correlated (N0) versus anticorrelated (Nπ) signal components. The voltage gain of LAEP per dB(N0/Nπ) closely corresponds to a constant perceptual distance between two correlations. We therefore suggest that activity in the auditory cortex and perceptual IAC sensitivity are better represented by the dB-scaled N0/Nπ power ratio than by the normalized IAC itself.
© 2009 Elsevier B.V. All rights reserved.
1. Introduction
In realistic acoustical environments, several concurrent sound sources are often present at the same time, each having a different spatial extent and being masked by diffuse ambient noise or reverberation. Binaural listening is generally thought to facilitate the ability to distinguish single sound sources of particular interest from others by their spatial position (Colburn, 1995; Bronkhorst, 2000; Faller and Merimaa, 2004; Beutelmann and Brand, 2006; Nix and Hohmann, 2007).
Abbreviations: AFC, alternative forced choice; ASSR, auditory steady state response; BOLD, blood oxygen level dependent; EEG, electroencephalography; ERB, equivalent rectangular bandwidth (of an auditory filter); fMRI, functional magnetic resonance imaging; IAC, interaural cross correlation; ITD, interaural time difference; ILD, interaural level difference; JND, just noticeable difference; JNT, just noticeable transition; LAEP, late auditory evoked potential; MEG, magnetoencephalography; N1, negative deflection in the LAEP at about 100–130 ms after the stimulus; P2, positive deflection in the LAEP at about 200–230 ms after the stimulus; SDT, signal detection theory; SNR, signal-to-noise ratio; SP, sustained potential
* Corresponding author. Tel.: +49 441 798 5472; fax: +49 441 798 3902. E-mail address: [email protected] (H. Lüddemann).
0378-5955/$ - see front matter © 2009 Elsevier B.V. All rights reserved. doi:10.1016/j.heares.2009.06.010
According to the duplex theory of sound, the interaural time difference (ITD) and the interaural level difference (ILD) between both ears are the most valuable physical signal parameters for human listeners to localize a single sound source in the azimuthal plane (Rayleigh, 1907; Yost and Gourevitch, 1987). In acoustically complex situations, however, the interaural timing and level disparities of the incident sound waves no longer provide consistent spatial information across frequency and time due to the superposition of sounds from different locations. In particular, in the presence of diffuse ambient noise ITDs and ILDs might become inconsistent even within small spectrotemporal portions of the signal and appear to be random variables as the noise level increases (Nix and Hohmann, 2006, 2007). As a consequence, the precise localization of distinct sound sources is substantially impaired (Saberi et al., 1998). Perceptually, this results in a broadening of the sound object’s width in auditory space, i.e., the object is perceived as more or less diffuse. Rather than investigating the effect of spatial diffuseness on performance in lateralization tasks (Saberi et al., 1998; Trahiotis et al., 2001), the aim of this contribution is to understand how the perceived diffuseness per se – or, as its related physical quantity, the interaural cross correlation (IAC) – is represented in the
auditory system. This question is assessed by a combination of electroencephalographical recordings (EEG) of late auditory evoked potentials (LAEP) and closely corresponding psychophysical experiments, using stimuli with stepwise transitions of the IAC in continuously running broadband noise.
1.1. The normalized cross correlation and related quantities
The IAC between the signals l(t) at the left and r(t) at the right ear is defined as the normalized scalar product of l and r and is denoted here by ρ,
\[
\rho = \frac{\int l(t)\, r(t)\, dt}{\sqrt{\int l^{2}(t)\, dt}\; \sqrt{\int r^{2}(t)\, dt}} \tag{1}
\]
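As a minimal sketch of Eq. (1) for sampled signals, the integrals can be replaced by sums over samples. The function name `interaural_correlation`, the signal length and the random seed are illustrative assumptions and are not taken from the original study; only the Python standard library is used.

```python
import math
import random

def interaural_correlation(left, right):
    """Normalized scalar product of two equal-length signals, as in Eq. (1)."""
    if len(left) != len(right):
        raise ValueError("left and right must have the same length")
    cross = sum(l * r for l, r in zip(left, right))
    energy_left = sum(l * l for l in left)
    energy_right = sum(r * r for r in right)
    return cross / math.sqrt(energy_left * energy_right)

rng = random.Random(0)  # fixed seed, for illustration only
noise = [rng.gauss(0.0, 1.0) for _ in range(4800)]
independent = [rng.gauss(0.0, 1.0) for _ in range(4800)]
inverted = [-x for x in noise]

rho_diotic = interaural_correlation(noise, noise)             # identical channels
rho_antiphasic = interaural_correlation(noise, inverted)      # polarity-inverted copy
rho_independent = interaural_correlation(noise, independent)  # close to, but not exactly, 0
```

Identical channels give the upper limit of the normalized correlation and a polarity-inverted copy gives the lower limit, while two independent finite-length noises only approximate zero correlation.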
Due to normalization, the range of ρ is restricted to the interval [−1, +1]. When presented via headphones, correlated noise signals (ρ = +1, also termed N0) are perceived as a compact sound source at a central position in the head, whereas uncorrelated signals (ρ = 0, also termed Nu) are perceived as diffuse, i.e., they are associated with a continuum of simultaneously active sources between both ears. Anticorrelated signals (ρ = −1, also termed Nπ) are often associated with two sound sources, one at the left and the other at the right ear. Hence, by setting ρ to an intermediate value, the overall spatial diffuseness of dichotic noise stimuli can be adjusted continuously, simulating the amount of either consistent or inconsistent information about the spatial distribution of sound sources in a complex acoustical environment. This allows one to systematically investigate how the binaural system deals with different degrees of diffuseness. Noise stimuli with any desired ρ can be generated by mixing two orthonormal noise sources a(t) and b(t) in an appropriate ratio (Culling et al., 2001):
\[
\begin{pmatrix} l \\ r \end{pmatrix} = \begin{pmatrix} a \\ \rho\, a + \sqrt{1-\rho^{2}}\; b \end{pmatrix} \tag{2}
\]
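The asymmetric mixing rule of Eq. (2) can be sketched as follows. The Gram-Schmidt step used here to make the two noise sources exactly orthonormal is an assumption for illustration; the text does not state how orthonormality was obtained. All function names are hypothetical.

```python
import math
import random

def dot(x, y):
    """Scalar product of two equal-length sequences."""
    return sum(xi * yi for xi, yi in zip(x, y))

def orthonormal_noise_pair(n_samples, rng):
    """Two Gaussian noises with unit norm and zero scalar product (Gram-Schmidt)."""
    a = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    b = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    norm_a = math.sqrt(dot(a, a))
    a = [x / norm_a for x in a]
    projection = dot(a, b)
    b = [y - projection * x for x, y in zip(a, b)]  # remove the part of b along a
    norm_b = math.sqrt(dot(b, b))
    b = [y / norm_b for y in b]
    return a, b

def mix_asymmetric(a, b, rho):
    """Eq. (2): left channel is a, right channel is rho*a + sqrt(1 - rho^2)*b."""
    weight_b = math.sqrt(1.0 - rho * rho)
    left = list(a)
    right = [rho * x + weight_b * y for x, y in zip(a, b)]
    return left, right

rng = random.Random(1)  # fixed seed, for illustration only
a, b = orthonormal_noise_pair(4800, rng)
left, right = mix_asymmetric(a, b, 0.707)
measured_rho = dot(left, right) / math.sqrt(dot(left, left) * dot(right, right))
```

Because a and b are orthonormal, the measured correlation of the mixed pair equals the requested value up to floating-point error, and only the right channel differs from the source a.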
Based on a descriptive analysis of binaurally masked tone detection thresholds, van der Heijden and Trahiotis (1997) suggested that the amount of masking produced by a noise having an arbitrary interaural correlation is equal to the addition of the masking effects produced by the diotic (N0) and anticorrelated (Nπ) constituents that compose the masker. Using this concept, they could successfully explain the dependence of binaural masking level differences (BMLD) on the interaural correlation for various masker bandwidths. Motivated by this additivity of masking, one can also use an alternative mixing formula in which stimuli are obtained as a mixture of a diotic noise (N0) and an antiphasic noise (Nπ), which are again built from two orthonormal noise sources a(t) and b(t):
\[
\begin{pmatrix} l \\ r \end{pmatrix} = \sqrt{\frac{1+\rho}{2}} \begin{pmatrix} a \\ a \end{pmatrix} + \sqrt{\frac{1-\rho}{2}} \begin{pmatrix} b \\ -b \end{pmatrix} \tag{3}
\]
or:
\[
N_{\rho} = \sqrt{\frac{1+\rho}{2}}\; N_{0} + \sqrt{\frac{1-\rho}{2}}\; N_{\pi} \tag{4}
\]
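A small sketch of the symmetric mixing rule in Eqs. (3) and (4) is given below. It builds the diotic part from a and the antiphasic part from b and checks the resulting correlation for a few values. The helper names, seed and signal length are illustrative assumptions, and the orthonormalization by Gram-Schmidt is again an assumed implementation detail.

```python
import math
import random

def dot(x, y):
    """Scalar product of two equal-length sequences."""
    return sum(xi * yi for xi, yi in zip(x, y))

def unit(x):
    """Scale a sequence to unit Euclidean norm."""
    norm = math.sqrt(dot(x, x))
    return [xi / norm for xi in x]

def mix_symmetric(a, b, rho):
    """Eq. (3): weighted sum of a diotic part (a, a) and an antiphasic part (b, -b)."""
    w0 = math.sqrt((1.0 + rho) / 2.0)   # amplitude weight of the N0 component
    wpi = math.sqrt((1.0 - rho) / 2.0)  # amplitude weight of the Npi component
    left = [w0 * x + wpi * y for x, y in zip(a, b)]
    right = [w0 * x - wpi * y for x, y in zip(a, b)]
    return left, right

rng = random.Random(2)  # fixed seed, for illustration only
a = unit([rng.gauss(0.0, 1.0) for _ in range(4800)])
raw_b = [rng.gauss(0.0, 1.0) for _ in range(4800)]
projection = dot(a, raw_b)
b = unit([y - projection * x for x, y in zip(a, raw_b)])  # orthogonal to a

measured = {}
for rho in (-1.0, -0.707, 0.0, 0.578, 1.0):
    left, right = mix_symmetric(a, b, rho)
    measured[rho] = dot(left, right) / math.sqrt(dot(left, left) * dot(right, right))
```

The squared weights are the powers of the two components, so their ratio is (1 + ρ)/(1 − ρ), and both channels keep the same power for every ρ.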
The mixing formula (4) illustrates that the IAC (and thus the diffuseness) of any dichotic signal N_ρ is entirely determined by the ratio of its correlated (N0) versus its anticorrelated (Nπ) components. This ratio of mixing coefficients provides an alternative representation of IAC,
\[
\tilde{\rho} = 10 \log \frac{1+\rho}{1-\rho} \tag{5}
\]
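The transform of Eq. (5) and its inverse can be sketched as below, assuming that the logarithm is taken to base 10, as the dB notation suggests. The function names are hypothetical. The limits ρ = ±1 correspond to a pure N0 or pure Nπ signal and are mapped to ±infinity here as an illustrative convention.

```python
import math

def iac_to_db(rho):
    """Eq. (5): dB-scaled N0/Npi power ratio for a normalized IAC rho in [-1, +1]."""
    if rho >= 1.0:
        return math.inf    # no anticorrelated component
    if rho <= -1.0:
        return -math.inf   # no correlated component
    return 10.0 * math.log10((1.0 + rho) / (1.0 - rho))

def db_to_iac(rho_db):
    """Inverse of Eq. (5): with R = 10**(rho_db/10), rho = (R - 1)/(R + 1) = tanh(rho_db*ln(10)/20)."""
    return math.tanh(rho_db * math.log(10.0) / 20.0)

zero_db = iac_to_db(0.0)                  # equal N0 and Npi power
round_trip = db_to_iac(iac_to_db(0.894))  # recovers the input up to rounding
```

The transform is antisymmetric in ρ and expands the scale near ρ = ±1, so equal steps in ρ correspond to increasingly large dB steps as |ρ| approaches 1.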
Except for a proportionality factor of 10, ρ̃ is the Fisher Z-transform of ρ. ρ̃ is a monotonic nonlinear function of the normalized IAC ρ. It is identical to Durlach's "equivalent signal-to-noise ratio" (equivalent SNR) in an NπS0 BMLD paradigm. However, in order to prevent confusion, we denote ρ̃ as the "dB(N0/Nπ) scaled IAC" because Durlach et al. (1986) introduced several different formulae for the computation of an equivalent SNR, each of which corresponds to a particular interaural configuration of masker and signal phase in other BMLD experiments, e.g., NuS0 or N0Sm.
1.2. Psychoacoustics on interaural cross correlation
The just noticeable difference between two values of normalized IAC (ρ-JND) critically depends on the correlation of the reference stimulus, ρ_ref. For stimulus bandwidths greater than 1 ERB (Moore et al., 1988), ρ-JNDs at ρ_ref = +1 are between 0.02 and 0.057, while ρ-JNDs at ρ_ref = 0 range from 0.3 up to 0.72 (Gabriel and Colburn, 1981; Koehnke et al., 1986; Akeroyd and Summerfield, 1999; Culling et al., 2001; Boehnke et al., 2002), i.e., psychoacoustical discrimination thresholds are about 10 times lower for ρ_ref = +1 than for ρ_ref = 0. For intermediate ρ_ref, there is a nonlinear decrease of the ρ-JND as ρ_ref increases from 0 to +1 (Pollack and Trittipoe, 1959a; Culling et al., 2001, 2003). The differences between the absolute JND values that have been reported in the literature are presumably due to different spectral stimulus properties and experimental techniques (Gabriel and Colburn, 1981; Akeroyd and Summerfield, 1999; Culling et al., 2001). Measuring the ρ-JND for ρ_ref = −1, Boehnke et al. (2002) found that thresholds were markedly lower than for ρ_ref = 0 but were twice as large as for ρ_ref = +1. In addition, for ρ_ref = 0 they reported lower thresholds for positive than for negative deviant correlations ρ_dev. The cumulative d′-functions by Culling et al. (2003) also indicate that the discriminability of two stimuli with different correlations is generally worse in the negative than in the positive range of ρ.
In the above-mentioned IAC discrimination experiments, listeners had to rely on auditory memory in order to distinguish stimuli with static but different correlations ρ_ref and ρ_dev which were separated in time. In realistic acoustical situations, however, there are more or less rapid IAC transitions within the ongoing signal which can serve as an additional dynamic cue for binaural scene analysis. In analogy to the importance of common onsets in the monaural case, such rapid changes of the interaural signal properties might provide even more salient cues for binaural object separation than the memory-based comparison of signals with silence in between, or the comparison of quasi-static interaural parameters in temporally subsequent segments of an ongoing sound. Nevertheless, the detection of stepwise IAC transitions and the discrimination of stimuli with static IAC seem to be of quite similar character: Dajani and Picton (2006) periodically switched the IAC between zero and a positive deviant correlation Δρ at a switch rate of 4 Hz, i.e., with 8 IAC transitions per second. They reported that such stimuli with rectangular modulations of the IAC could be distinguished from an uncorrelated noise if Δρ exceeded 0.31. Grantham (1982) investigated the detectability of sinusoidal IAC modulations at various modulation rates. For a modulation rate of 5 Hz, his results indicated that discrimination of the modulated stimuli against uncorrelated reference noise was possible at a modulation amplitude of about 0.45, i.e., with the IAC oscillating between −0.45 and +0.45. In experiments with binaural gaps, i.e., brief segments of deviant correlation ρ_dev which were temporally flanked by reference segments with an IAC of ρ_ref, Boehnke et al. (2002) found qualitatively the same dependence on ρ_ref for the gap duration thresholds as for the ρ-JNDs in corresponding discrimination tasks with static IAC.
In summary, psychoacoustical literature data suggest that IAC sensitivity can be characterized by two asymmetries: First, there
is a "major asymmetry" regarding the absolute value of ρ. This major asymmetry refers to the observation that the sensitivity to IAC deviations is much better for |ρ_ref| = 1 than for ρ_ref = 0. Second, the experiments by Boehnke et al. (2002) and by Culling et al. (2003) indicate a "minor asymmetry" regarding the sign of ρ, since perceptual performance was generally better in the positive than in the negative range of ρ. This minor asymmetry has a smaller influence on thresholds than the major asymmetry. The close correspondence in static and dynamic tasks suggests that the major and the minor asymmetry might provide a conceptually useful framework which refers to general features of binaural perception rather than being specific for a particular psychoacoustical task. Boehnke et al. (2002) discuss the possibility that the two asymmetries might also be helpful to understand the BMLD in a quantitative way. Similar suggestions were made, e.g., by Durlach et al. (1986), Koehnke et al. (1986), Jain et al. (1991) and Breebaart et al. (2001).
1.3. Neurophysiological measures of IAC related brain activity
Ando et al. (1987) measured LAEP to low frequency narrowband noise bursts with different interaural correlations. They reported a decrease of N2 latency with increasing ρ, but no systematic relation between LAEP amplitudes and IAC. In similar MEG experiments, Soeta et al. (2004) found that the magnitude of late auditory evoked fields (LAEF) in response to the onset of such noise bursts decreased with increasing ρ. Using functional magnetic resonance imaging (fMRI), Budd et al. (2003) observed a monotonic, progressive increase of raw blood oxygen level dependent (BOLD) activity in supra-threshold voxels of the primary auditory cortex as the stimulus IAC was increased from ρ = 0 to +1. The increase of raw BOLD activity with ρ observed by Budd et al. (2003) seems to contradict the decrease of LAEF amplitude reported by Soeta et al. (2004).
These opposite trends in MEG and fMRI data are, however, hard to interpret. In both experiments noise bursts with different IAC were separated by silence, so that not only the ongoing binaural stimulus properties but also the monaural signal onset contributed to the brain responses. Due to the coarse temporal resolution of fMRI it is unclear if BOLD activity corresponds to the magnitude of the onset response in the MEG or if it rather reflects a specific binaural sustained cortical activation. In order to overcome this entanglement of monaural and binaural brain responses, binaural interaction is often assessed by methods in which a subtraction of evoked responses to stimuli with different binaural properties is performed. The binaural difference potential is the difference between the binaural and summed monaural responses (e.g., Dobie and Berlin, 1979; Furst et al., 1985; Riedel and Kollmeier, 2002a,b, 2006; Junius et al., 2007). In the mismatch negativity paradigm the response to a rare deviant stimulus is subtracted from the response to a frequent standard (e.g., Schröger, 1996,; Johnson et al., 2003; Damaschke et al., 2005). However, like other indirect methods, the binaural difference potential and the mismatch negativity have the disadvantage of extensive measurement duration and/or a poor signal-to-noise ratio especially near the perceptual threshold. In contrast to any of the methods mentioned so far, specific binaural potentials without monaural contributions could be directly elicited by changes of the ITD or the IAC in an ongoing noise signal with constant monaural stimulus properties (Halliday and Callaway, 1978; Jones, 1991,; McEvoy et al., 1991a,b; Chait et al., 2005, 2007). 
In comparison to the monaural onset response to noise bursts or clicks, the P1-, N1- and P2-components of LAEP and LAEF in response to IAC transitions were generally weaker in amplitude and delayed in latency by about 20–50 ms (Jones, 1991; McEvoy et al., 1991b; Chait et al., 2005). In a recent study Ross et al. (2007) showed that also a reversal of the interaural
phase in amplitude modulated tones elicits a specific binaural brain response. They suggested that their stimuli can be used as a clinical tool to examine the phase locking ability of the periphery at the tone's carrier frequency. In MEG experiments, Chait et al. (2005) presented their subjects with stimuli containing a single IAC change after an initial segment of noise with reference correlations ρ_ref of either 0 or 1. The deviant correlation after the IAC transition was chosen from a set of equally spaced values: ρ_dev was either 0, 0.2, 0.4, 0.6, 0.8 or 1. For the onset response to the reference segment they found the same relation between LAEF amplitude and ρ as Soeta et al. (2004). In contrast to the monaural onset response, however, the MEG elicited by the IAC transitions in the ongoing noise was not consistently larger for ρ_dev near 0. Instead, it monotonically increased with the size of the IAC switch for both reference correlations. Responses to IAC transitions of equal size were generally larger after initial segments with ρ_ref = +1 than for ρ_ref = 0, corresponding to the major asymmetry of psychoacoustical ρ-JNDs (see also Chait et al., 2007). These findings were based on a differential analysis of RMS values from a spatio-temporal average of field powers in a selection of SQUIDs during 200 ms pre- and post-switch intervals. Therefore their data presentation does not allow one to draw further conclusions about the functional relationship between the stimulus parameters and the magnitude of separate components in the transient brain response.
Dajani and Picton (2006) investigated the influence of IAC step size for periodic transitions of ρ between 0 and Δρ at a modulation rate of 4 Hz. The spectral amplitude at the second harmonic of the auditory steady state response (ASSR) increased roughly linearly with Δρ. Since the data were analyzed in the spectral domain, the influence of IAC transitions on separate LAEP components, e.g., N1 and P2, was not further investigated.
For changes between ρ = 0 and 1, they also varied the modulation rate. At lower modulation frequencies the time series of the EEG showed separate N1 and P2 components in response to the single IAC transition. The peak-to-peak amplitudes (P2N1) were roughly equal for transitions from ρ = 1 to 0 compared to the opposite switch, in contrast to the MEG data by Chait et al. (2005, 2007). Furthermore, different baselines in the EEG during correlated and uncorrelated segments might indicate an IAC specific sustained potential (SP). It is therefore unclear if the linearity of ASSR growth functions with Δρ is mainly related to single-switch N1 or P2 components, to an SP or to an interaction of several mechanisms in the case of periodic IAC switches.
1.4. Experimental scope of the present study
The main interest of the present study is the examination of IAC specific brain activity in the EEG without contributions from the monaural auditory pathway and the comparison of electrophysiological data to behavioural measurements, using the same kind of stimuli and the same subjects. This was accomplished by using continuous running noise with stepwise transitions of the IAC in an ongoing signal for the acoustical stimulation in both psychophysical and electrophysiological experiments. In contrast to most previous studies, the stimulus correlations covered the entire parameter range of ρ, including +1, 0 and −1 as well as intermediate values. The qualitative and quantitative investigation of the relation between the stimulus IAC and the EEG aims to clarify the following issues: (1) Is the psychophysical detectability of stepwise IAC transitions different from the discrimination of temporally separated stimuli with static IAC, i.e., the ρ-JND from literature? (2) What are the smallest transitions of ρ which elicit LAEP? How are these electrophysiological thresholds related to the corresponding psychoacoustical thresholds? (3) Is there an IAC specific sustained
potential (SP), and is it possible to distinguish the contributions of static and dynamic IAC differences to the EEG? (4) Are LAEP amplitudes, latencies and the SP mainly determined by the size of the correlation step Δρ or by particular values of ρ_ref and ρ_dev? (5) Do electrophysiological data correlate with the normalized IAC or with other measures of diffuseness such as the ratio of N0 and Nπ powers? (6) Can the asymmetries which were proposed to characterize psychoacoustical thresholds also be found in the LAEP or SP?
2. Methods
2.1. Subjects
Eleven subjects (6 female, 5 male), aged between 20 and 35, participated in both the psychoacoustical tasks and the EEG recordings. All subjects had less than 15 dB audiometric loss in their pure-tone audiograms below 4000 Hz and no history of neurological disorder. Experimental procedures were approved by the ethical review board of the University of Oldenburg. Informed consent was obtained from every subject, and they were paid for their participation.
2.2. Stimuli and apparatus
All stimuli were digitally generated in MATLAB (The Mathworks) by concatenating segments of dichotic Gaussian bandpass noise (bandwidth 100–2000 Hz) with different interaural correlations. The stimulus correlation in each segment was set by mixing two orthonormal noise sources in the appropriate ratio. Each segment was generated from freshly generated running noise. For reasons of computational efficiency, only the right channel was manipulated, according to the asymmetric mixing formula (Eq. (2)). Three such segments were grouped to make a ρ_ref | ρ_dev | ρ_ref-sequence, i.e., the IAC during a sequence was ρ_ref in the initial and in the last segment (denoted as reference segments) and ρ_dev (ρ_dev = ρ_ref + Δρ) during the middle segment (deviant segment). Spectral splatter due to discontinuous waveforms at the segment borders was removed in the spectral domain by setting all components outside the passband to zero. The actual interaural correlations of the segments were measured after filtering the ρ_ref | ρ_dev | ρ_ref-sequence. The maximal accepted difference between the desired and the actually measured stimulus IAC was checked in the domain of normalized IAC, using the following thresholds: 0.003 for a desired ρ of +1, 0.03 for ρ = 0 and 0.003 for ρ = −1. The threshold criterion for intermediate correlations was calculated by linear interpolation between 0.03 and 0.003.
The sequence was used for acoustical stimulation only if the deviation between measured and desired segment correlations was below this threshold. Otherwise the sequence was discarded and replaced by a new one. Hence, the deviation between actual and desired segment correlations was less than 10% of the respective ρ-JNDs reported by Boehnke et al. (2002). In the psychoacoustical measurements such ρ_ref | ρ_dev | ρ_ref-sequences were presented with silence in between so that each sequence was an interval in a 3-alternative-forced-choice (3-AFC) paradigm. In contrast, for the EEG experiments a large number of sequences was joined by crossfading in order to present the subjects with a continuous signal with no silence in between. Subjects were seated on a comfortable chair in a double-walled, electrically and acoustically shielded sound booth. The same equipment was used for stimulus presentation in the psychoacoustical experiments and the EEG recordings. The digitally generated signal was passed to an external 24-bit D/A-converter (RME TDIF-1) at a sampling rate of 48 kHz using SOUNDMEX (HörTech GmbH). The level of the analog signals was then adjusted using a programmable attenuator and a headphone buffer (PA5 and HB7 by Tucker-Davis Technologies). Stimuli were presented with a matched pair of Etymotic Research ER-2 insert phones (less than 1 dB interaural tolerance in the signal band) at a level of 65 dB SPL. Calibration was performed using a Brüel & Kjær microphone B&K 4157 in a 1.29 cm³ coupler, a preamplifier B&K 2669 and an amplifier B&K 2610.
2.3. Psychoacoustics
The detectability of IAC transitions in a ρ_ref | ρ_dev | ρ_ref-sequence was investigated under four experimental conditions: ρ_ref = +1 (+1↓), ρ_ref = −1 (−1↑), ρ_ref = 0 with ρ_dev in the positive range (0↑) and ρ_ref = 0 with ρ_dev in the negative range (0↓). For each of these four conditions, psychoacoustical thresholds for the smallest detectable transition of interaural correlation Δρ = ρ_dev − ρ_ref were measured using a 3-interval, 3-alternative forced choice procedure (3-AFC) with visual feedback after each trial. The diotic noise was generated afresh (running noise) for each interval. All intervals had a duration of 683 ms; they were gated using 20-ms raised-cosine ramps and separated by 500 ms of silence. During the two reference intervals the IAC remained fixed at ρ_ref throughout the entire interval. During the central 250-ms segment of the randomly chosen target interval the IAC was ρ_dev = ρ_ref + Δρ. The size of the IAC transition Δρ was increased after one false response and decreased after two correct responses. This 1-up-2-down step rule estimates the 70.7% correct score within the 3-AFC paradigm (Levitt, 1971). Starting with a comparatively large step size for fast adaptation of the tracking procedure, the step size was reduced after each upper reversal until the final step size was reached (0.01 for ρ_ref = ±1 and 0.03 for ρ_ref = 0). Then, for the last eight reversals, Δρ was varied with this constant, final step size. The mean Δρ in the last eight reversals of a single run was taken as an estimate for the just noticeable IAC transition ρ-JNT. Since most subjects were naive listeners, they repeated each of the different psychoacoustical tasks until the respective thresholds stabilized. After training, all subjects completed five runs for each of the four conditions. The subjects' individual ρ-JNTs were determined as the average over these five runs.
2.4. EEG recordings
In the EEG recordings, stimulus sequences of the type
ρ_ref | ρ_dev1 | ρ_ref | ρ_dev2 | ρ_ref | ... were presented to the subjects. Each deviant segment lasted 800 ms; the duration of the reference segments was 800 ms ± 50 ms jitter. For ρ_ref = +1, the deviant correlations were −1, −0.707, 0, 0.578, 0.707, 0.816, 0.894 and 0.942 (condition +1↓). The deviant correlations for ρ_ref = −1 (−1↑) had the same values as for ρ_ref = +1, but the opposite sign. For ρ_ref = 0, deviant IACs were ρ_dev = 0.578, 0.707, 0.816, 0.894 and 1 in the positive range (0↑) and ρ_dev = −0.578, −0.707, −0.816, −0.894 and −1 in the negative range (0↓). For all ρ_ref, the deviant correlations were presented in random order. The corresponding reverse transitions from ρ_dev back to ρ_ref are denoted by the condition labels ↑+1, ↓−1, ↓0 and ↑0, respectively. Data acquisition was partitioned into blocks of 9 min for ρ_ref = ±1 and 13 min for ρ_ref = 0. For each ρ_ref, 20 blocks were presented to accomplish a total number of 1000 transitions from ρ_ref towards each of the respective ρ_dev. The total recording time amounted to 11 h per subject and was therefore split into 4–5 sessions. During each session all combinations of ρ_ref and ρ_dev were presented with equal frequency. During the EEG recordings, subjects were presented with a quiet movie with subtitles on a low-radiation TFT display in order to keep them vigilant. The EEG was recorded with Ag/AgCl-electrodes at the mastoids (A1, A2) with the reference electrode placed at the vertex (Cz) and
the ground electrode at the forehead (Fpz), according to the extended 10-20 system (Jasper, 1957; Sharbrough et al., 1991). Electrode impedances were brought below 5 kΩ at a test signal frequency of 30 Hz and were checked (and maintained) after each block. Since DC recordings were performed, voltage drift did not completely vanish but was carefully monitored throughout the session. In addition, the channels Fp1, Fp2, FT9 and FT10 were used to monitor ocular artefacts. Inside the shielded room the EEG was preamplified by a factor of 150 and further amplified by the main amplifier (SynAmps 5803) by a factor of 33, resulting in a total amplification of 74 dB. The continuous EEG was filtered by an analog anti-aliasing lowpass with an edge frequency of 200 Hz, digitized with a 1 kHz sampling rate and 16 bit resolution (voltage resolution: 16.8 nV/bit) and stored to hard disk.
2.5. EEG data analysis
Late auditory evoked potentials (LAEP) were obtained by processing the raw data offline in the following way: First, the continuous EEG was epoched into single sweeps corresponding to the ρ_ref | ρ_dev | ρ_ref-sequences in the stimulus. Second, the voltage drift in each sweep was reduced by subtracting a linear fit. For each sweep, the fit was based on two intervals after the decay of the transient responses, namely the last 250 ms of the two reference segments preceding and following the segment with deviant IAC. Because the IAC and the contextual meaning were the same for both detrending intervals, their respective sustained potentials had no effect on the voltage drift estimate. Third, the recordings were filtered in the time domain using an IIR Butterworth lowpass with an upper cutoff frequency of 20 Hz. Data were filtered twice, in forward and reverse direction, so that the effective filter had fourth order but zero phase.
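The detrending and zero-phase filtering steps described above can be sketched in plain Python as follows. Several details are assumptions made only for illustration: a nominal sweep of three 800-ms segments at 1 kHz (ignoring the jitter), a second-order Butterworth section per pass (so that forward plus reverse filtering gives an effective fourth order), coefficients obtained via the bilinear transform, and zero initial filter states without edge padding. All function names are hypothetical.

```python
import math

def linear_fit(times, values):
    """Least-squares slope and intercept of values over times."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    var_t = sum((t - mean_t) ** 2 for t in times)
    cov = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = cov / var_t
    return slope, mean_v - slope * mean_t

def detrend(sweep, fit_indices):
    """Subtract a line that is fitted only to the samples listed in fit_indices."""
    slope, intercept = linear_fit(fit_indices, [sweep[i] for i in fit_indices])
    return [v - (slope * i + intercept) for i, v in enumerate(sweep)]

def butter2_lowpass(cutoff_hz, fs_hz):
    """Second-order Butterworth lowpass coefficients (bilinear transform)."""
    k = math.tan(math.pi * cutoff_hz / fs_hz)
    norm = 1.0 / (1.0 + math.sqrt(2.0) * k + k * k)
    b = [k * k * norm, 2.0 * k * k * norm, k * k * norm]
    a = [1.0, 2.0 * (k * k - 1.0) * norm, (1.0 - math.sqrt(2.0) * k + k * k) * norm]
    return b, a

def iir_filter(b, a, x):
    """Single pass of a second-order recursive filter with zero initial state."""
    y = []
    for n, xn in enumerate(x):
        acc = b[0] * xn
        if n >= 1:
            acc += b[1] * x[n - 1] - a[1] * y[n - 1]
        if n >= 2:
            acc += b[2] * x[n - 2] - a[2] * y[n - 2]
        y.append(acc)
    return y

def zero_phase_lowpass(x, cutoff_hz=20.0, fs_hz=1000.0):
    """Filter forward, then filter the time-reversed result, then reverse again."""
    b, a = butter2_lowpass(cutoff_hz, fs_hz)
    forward = iir_filter(b, a, x)
    backward = iir_filter(b, a, forward[::-1])
    return backward[::-1]

# Nominal sweep layout in samples: reference 0-799, deviant 800-1599, reference 1600-2399.
fit_indices = list(range(550, 800)) + list(range(2150, 2400))  # last 250 ms of each reference
drifting = [0.5 * i + 3.0 for i in range(2400)]                # pure linear drift
flat = detrend(drifting, fit_indices)
smoothed = zero_phase_lowpass([1.0] * 2000)
```

Fitting the line only to the two late reference windows keeps the transient response to the IAC transition out of the drift estimate. A production implementation would normally also handle the filter start-up transients at the sweep edges, which this sketch leaves untreated.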
After filtering, the baseline of each epoch was shifted, yielding a mean of zero during the last 250 ms of the initial reference segment preceding the first correlation switch. Finally, an iterated weighted average of the 1000 single sweeps was computed for all subjects and each of the 24 ρ_ref-ρ_dev combinations. The single sweeps were assigned the inverse power of the noise of the epoch as weighting factors (Hoke et al., 1984). Single epoch noise power estimates were optimized in an iterative procedure (Riedel et al., 2001). The residual noise of the average was estimated as the standard error across sweeps (SEM). Besides resulting in an optimum SNR of the averaged data, the iterated weighted averaging technique ensures that artefacts are appropriately suppressed. Amplitudes and latencies of the transient LAEP components N1 and P2 were obtained in the following way: First, the individual LAEP were automatically scanned for peaks, which were considered significant if their amplitude differed by more than 2 SEM from both neighbouring peaks. Second, in order to remove ambiguous alternatives, peak data were revised manually, selecting only those peaks which were reasonably consistent with the overall pattern of individual LAEP. At small IAC step sizes, no significant peaks could be found for some subjects. If insignificant peaks were excluded from the average over subjects, mean amplitudes would be overestimated. On the other hand, if the amplitudes of insignificant peaks were set to zero, mean amplitudes would be underestimated. Both would result in a systematic bias of mean LAEP amplitudes at small Δρ and thereby change the shape of average amplitude growth functions. Thus, missing peak values were assigned the EEG amplitude at the latency of the next significant peak that was found for the respective subject at a larger step size.
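The idea of an inverse-noise-power weighted average with iteratively refined noise estimates can be illustrated with the sketch below. It is not the procedure of Hoke et al. (1984) or Riedel et al. (2001), whose details are not given in the text. As a simplifying assumption, the noise power of each sweep is estimated as its mean squared deviation from the current average, and the number of iterations is fixed. The simulated data, names and seed are hypothetical.

```python
import math
import random

def weighted_average(sweeps, weights):
    """Sample-wise average of the sweeps using one weight per sweep."""
    total = sum(weights)
    return [sum(w * s[i] for w, s in zip(weights, sweeps)) / total
            for i in range(len(sweeps[0]))]

def iterated_weighted_average(sweeps, n_iterations=5):
    """Alternate between averaging and re-estimating each sweep's noise power."""
    weights = [1.0] * len(sweeps)  # first pass is the plain mean
    for _ in range(n_iterations):
        average = weighted_average(sweeps, weights)
        powers = [sum((v - m) ** 2 for v, m in zip(s, average)) / len(s)
                  for s in sweeps]
        weights = [1.0 / max(p, 1e-12) for p in powers]  # inverse noise power
    return weighted_average(sweeps, weights), weights

def rms_error(estimate, truth):
    """Root-mean-square difference between two equal-length sequences."""
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(estimate, truth)) / len(truth))

rng = random.Random(3)  # fixed seed, for illustration only
template = [math.sin(2.0 * math.pi * i / 200.0) for i in range(200)]  # simulated response
noise_sd = [1.0] * 36 + [10.0] * 4  # four sweeps contaminated by strong artefact-like noise
sweeps = [[t + rng.gauss(0.0, sd) for t in template] for sd in noise_sd]

plain = weighted_average(sweeps, [1.0] * len(sweeps))
weighted, weights = iterated_weighted_average(sweeps)
```

In this simulation the contaminated sweeps receive much smaller weights than the clean ones, so the weighted average lies closer to the simulated response than the plain mean does.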
The amplitudes of LAEP to reverse transitions back to ρref were measured with respect to the baseline EEG during the last 250 ms of their preceding deviant segment. In order to estimate IAC specific shifts in baseline activity, i.e., changes of the sustained response to an ongoing IAC, the LAEP were averaged over the last 250 ms of each reference and deviant segment. This interval was chosen a posteriori after a visual inspection of the data, confirming that the N1-P2-response to the preceding IAC transition had sufficiently decayed, i.e., the SP remained roughly constant during that analysis window.

2.6. Statistical analysis

The influence of the reference correlation ρref on psychoacoustical thresholds was investigated by analysis of variance (ANOVA) with repeated measures and subjects treated as a random factor. The effect of experimental parameters on N1-, P2- and peak-to-peak amplitudes (P2−N1) and on N1- and P2 latencies of the LAEP was investigated using N-way ANOVA with repeated measures and subjects treated as a random factor. If a null hypothesis could be rejected, post hoc comparisons were performed with Scheffé tests. If not stated otherwise, the a priori level of significance was set to α = 0.05. Details about the grouping of LAEP data are provided at the end of Section 3.2.1.

2.7. Control measurements

If only the manipulated (right) channel of a stimulus with IAC transitions was presented diotically, subjects could not distinguish these control stimuli from unmanipulated diotic noise. Furthermore, no event related potentials could be elicited with such diotic stimulation. Hence, there were no monaural cues present in the stimuli.

In the EEG experiments, the switches from and towards the reference situation were desired to elicit independent LAEP for both switches. In order to find the shortest practicable duration of reference and deviant segments, test measurements were performed. With a temporal distance between two subsequent IAC switches exceeding 256 ms, the LAEP could be separated into a first and a second but weaker N1-P2-complex. As the segment duration was reduced to 128 ms, the LAEP could no longer be classified as a sequence of N1-P2-responses and fused into a single transient response.
For a segment duration of about 800 ms, the amplitude of transient LAEP began to saturate, and the last 250 ms before the next switch could be used for the analysis of the SP, because 500 ms after a switch the transient response had sufficiently decayed.

The different segment durations in AFC- and EEG-experiments might complicate the comparison of psychometric and electrophysiological data. In a psychoacoustical control experiment the ρ-JNT was measured for segment durations up to 1600 ms. In each of the four conditions (+1↓, 0↑, −1↑ and 0↓) JNTs decreased with segment duration, but for a given segment duration, the relations between the four respective thresholds were consistently characterized by the same major and minor asymmetry as the data by Boehnke et al. (2002). Therefore, a value of 250 ms was sufficient to investigate the perceptual asymmetries.

3. Results

3.1. Psychoacoustical thresholds

In Fig. 1 the thresholds for the smallest detectable transition (or: just noticeable transition, ρ-JNT) between ρref and ρdev are presented as vertical bars. Mean ρ-JNTs were smallest for deviations from ρref = +1 (0.034), larger for ρref = −1 (0.105), and largest for the uncorrelated reference. For ρref = 0, psychoacoustical performance was better for transitions towards positive ρdev (0.547) compared to transitions towards negative ρdev (0.667). ANOVA revealed a highly significant effect of ρref on the ρ-JNT (p < 10⁻¹²). Post hoc tests indicated that thresholds obey the major
Fig. 1. Psychoacoustical thresholds for the just noticeable transition (ρ-JNT) in interaural correlation. The bars cover the range between ρref and the correlation at the threshold, i.e., their length corresponds to the ρ-JNT. Error bars denote 2 SEM. (A) ρref = −1, (B) ρref = 0, (C) ρref = +1. [Ordinate: IAC at threshold (ρref ± ρ-JNT). Legend: single subjects, subjects' mean, Boehnke (2002).]
asymmetry (α < 10⁻⁹) and the minor asymmetry (α < 0.005). Although there were large interindividual differences in the ability to detect small IAC transitions, these asymmetries were found for every single subject.

3.2. Specific binaural late auditory evoked potentials

Fig. 2 shows LAEP to all IAC transitions for one exemplary subject. The time axis includes two IAC transitions, the first from ρref to ρdev at 0 ms and the second, referring to the reverse switch back to ρref, at 800 ms. The LAEP components N1 and P2 could be clearly identified in the binaurally evoked brain responses of all but one subject, who did not show any N1 at all. The amplitude of both N1 and P2 generally increased with larger step size of the IAC transition. LAEP elicited by the IAC transitions at 0 ms towards any deviant segment (−1↑, 0↓, 0↑ and +1↓) showed a generally larger peak-to-peak amplitude (P2−N1) than those elicited by the corresponding reverse transition 800 ms later back to the reference IAC (↓−1, ↑0, ↓0 and ↑+1). For changes of ρ in negative direction (+1↓, 0↓, ↓0 and ↓−1), the transient LAEP were dominated by
Fig. 2. LAEP for an exemplary subject to stepwise transitions in interaural correlation, recorded at the left mastoid (A1, thick line) and right mastoid (A2, thin line) versus vertex (Cz). Negative voltage is plotted in upward direction (vertex positive). Data were averaged over 1000 sweeps, error bars denote 2 SEM. (A) ρref = −1, (B) ρref = 0, (C) ρref = +1. In each panel, LAEP to two IAC transitions are shown, with the first switch from ρref towards ρdev at 0 ms and the reverse transition from ρdev back to ρref at 800 ms. For every parameter value ρdev, the respective LAEP have been shifted in vertical direction so that the offset qualitatively corresponds to the value of IAC in the deviant segment between 0 and 800 ms. [Abscissa: time in ms; scale bar: 2.5 µV.]
the N1 component, while for changes in positive direction (−1↑, 0↑, ↑0 and ↑+1) the P2 component was more pronounced. At the end of deviant segments the evoked responses also exhibited shifts of the baseline level, i.e., there were changes of the sustained potential (SP) after the decay of the transient response.
3.2.1. Separation of IAC specific effects and context related effects

The six IAC transitions with integer Δρ (−1 → 0, −1 → +1, 0 → −1, 0 → +1, +1 → −1 and +1 → 0) occur twice within the experimental paradigm: first as a transition at 0 ms with the reference segment preceding a deviant segment and second, in another experimental condition, as a reverse transition at 800 ms with the reference segment following a deviant segment. For example, the transition +1 → 0 can be found at 0 ms in the +1↓ condition with ρref = +1 and ρdev = 0 and at 800 ms in the ↓0 condition with ρref = 0 and ρdev = +1. The brain responses to these 2 × 6 IAC transitions constitute six pairs of LAEP. Within such a pair of LAEP the sequence of correlations is the same but the perceptual context is interchanged. These six pairs of LAEP to integer IAC transitions are plotted in Fig. 3 after realignment to common pre-switch baselines and time axes.

For these integer transitions, ANOVA revealed that LAEP amplitudes were significantly larger (α < 10⁻¹² for N1 and P2−N1, α < 5 × 10⁻⁴ for P2) and latencies were significantly shorter (α < 0.05) for transitions from a reference to a deviant segment than vice versa. Post hoc comparisons revealed that the effect of context on N1- and peak-to-peak amplitude was also significant for each single pair of physically identical IAC transitions with reversed context. The purely context dependent effect on average N1 amplitude was 1–2 µV. Average P2 amplitudes, however, only differed by 0.2–1 µV in four of the six pairs, but were unaffected by the context for transitions from zero to ρ = ±1.

For non-integer Δρ, data were only available for transitions either towards deviant or towards reference IAC. In order to examine the IAC specific influence on LAEP amplitude and latency without interaction due to the purely context related effects described above, data were grouped according to the following fixed factors in the further statistical analysis: (1) the recording site, i.e., channel A1 or A2, (2) the context, i.e., whether the IAC switched towards a deviant segment or towards a reference segment, (3) the size of the IAC transition, i.e., the absolute value of the step size |Δρ|, (4) the sign of Δρ, indicating whether the switch direction was positive (conditions −1↑, ↑+1, ↑0 and 0↑) or negative (conditions +1↓, ↓−1, ↓0 and 0↓), and (5) the IAC range, i.e., whether the IAC transition took place in the positive range of ρ (conditions +1↓, ↑+1, 0↑ and ↓0) or in the negative range (conditions −1↑, ↓−1, 0↓ and ↑0). Any combination of ρref and ρdev can be expressed by the latter three factors. They were chosen to allow for an examination of the above-mentioned asymmetries: the major asymmetry can be expressed with the factors "IAC range" and "switch direction", the minor asymmetry corresponds to the factor "IAC range". In the following, this grouping pattern was used for the statistical analysis in the full set of data as well as for more detailed investigations of IAC specific effects in the two subsets of data from either context situation.

Fig. 3. Dependency of LAEP on the context: Each pair of lines corresponds to the same pair of correlation values before and after the transition, according to the labels at the baselines (−1 → 0, −1 → +1, 0 → −1, 0 → +1, +1 → −1, +1 → 0). The correlation change with the ρref segment previous to the transition (ρref → ρdev) is displayed as a thick line, whereas the LAEP elicited by the transition with reversed context (ρdev → ρref, i.e., ρref after the transition) is displayed as a thin line. LAEP were taken from the same exemplary subject as in Fig. 2 and averaged over channels, error bars denote ±2 SEM. [Abscissa: time in ms; ordinate: LAEP in µV; scale bar: 2.5 µV.]

3.2.2. LAEP latency

For LAEP latency, ANOVA revealed significant main effects of the context (p = 0.002 for N1, p = 0.02 for P2), the size of the IAC transition (p < 10⁻¹² for N1 and P2), the switch direction (p = 4.5 × 10⁻³ for N1, p = 1.4 × 10⁻¹⁰ for P2) and the IAC range (p < 10⁻¹² for N1 and P2). For the P2, an interaction between IAC range and switch direction was also found (p = 0.19 for N1, p = 6.6 × 10⁻⁹ for P2). There was no significant influence of the recording channel (p = 0.11 for N1, p = 0.71 for P2). Fig. 4 shows N1 and P2 latencies for transitions from ρref to ρdev, averaged over subjects and channels.
The latency of peaks with an amplitude less than 2 SEM of the individual data did not contribute to the mean. Average N1 and P2 latencies ranged from 120 to 160 ms and from 190 to 260 ms, respectively. For deviations from ρref = ±1, the latency of both N1 and P2 showed a significant monotonic decrease by 25–50 ms with increasing IAC step size. For transitions from ρref = 0, in contrast, there was a smaller variation with Δρ (10–25 ms), and a significant monotonic relation between latency and IAC step size was only found for the N1.

Latency was generally shorter for IAC changes in the positive range of ρ than for the corresponding changes in the negative range, in accordance with the minor asymmetry. This relation is highly significant for IAC transitions towards deviant segments (+1↓ vs. −1↑, 0↑ vs. 0↓, α < 10⁻¹²), while for reverse transitions (↑+1 vs. ↓−1, ↓0 vs. ↑0) significance was found only for N1 latency. Considering only transitions with |Δρ| = 1, post hoc tests revealed that P2 latency could be sorted for both context situations in the following order: t[+1→0] < t[0→+1]   t[−1→0] < t[0→−1]. This order is in agreement with the minor but not with the major asymmetry. N1 latency, however, could not be arranged in a consistent way for both context situations.

A pairwise comparison of LAEP elicited by transitions from ρref to ρdev versus those evoked by the corresponding reverse transitions from ρdev back to ρref 800 ms later revealed that both N1 and P2 latencies were significantly shorter for transitions towards deviant segments.

3.2.3. LAEP amplitude

For N1, P2 and the peak-to-peak amplitude (P2−N1), ANOVA indicated significant main effects of the context (p < 10⁻¹⁰), the size of the IAC transition (p < 10⁻¹²) and the IAC range (p = 0.037 for N1, p < 10⁻¹² for P2 and peak-to-peak amplitude). The switch direction had a highly significant effect on N1 and P2
Fig. 4. Latencies of peak N1 (downward triangles) and P2 (upward triangles), elicited by a correlation change from ρref to ρdev. Latencies were averaged over 11 subjects and both channels (A1, A2), error bars denote ±2 SEM. (A) ρref = −1, (B) ρref = 0, (C) ρref = +1. [Abscissa: normalized deviant IAC ρdev; ordinate: peak latency in ms.]
amplitude (p < 10⁻¹²), but comparatively little influence on peak-to-peak amplitude (p = 0.04, no significance in post hoc comparisons with α = 0.05). ANOVA further indicated an interaction between IAC range and switch direction (p < 10⁻¹²). Amplitude did not depend on the recording channel (p = 0.28 for N1, p = 0.65 for P2, p = 0.79 for peak-to-peak amplitude).

Fig. 5 shows the LAEP peak amplitudes for transitions between ρref and ρdev, averaged over subjects and channels. For any ρref and in both context situations, LAEP amplitude monotonically increased with IAC step size, at least for |Δρ| < 1. P2 and peak-to-peak data showed 4–5 levels of Δρ with significantly different amplitude; for the N1 there were 2 such levels. For transitions from reference to deviant segments (panels A–C), absolute N1 amplitude increased over the entire range of |Δρ|, while the monotonic increase of P2 amplitude was restricted to |Δρ| < 1. For the reverse transitions back to ρref (panels D–F), the opposite relation between N1 and P2 was found: while P2 amplitude was monotonically increasing for all |Δρ|, N1 amplitude was monotonically related to IAC step size only for transitions towards ρref = ±1 with |Δρ| < 1.

In the condition ↑0, the N1 could be identified in the LAEP time series as a local minimum with positive peak voltage. Although the N1 is defined as a negative peak in the LAEP, these data were included in the analysis and shown in Fig. 5E because the respective minima occurred at a reasonable latency and their amplitude differed from the P2 amplitude by more than 2 SEM.

For IAC changes towards ρ = 0, absolute N1 amplitude was significantly smaller, and P2 and peak-to-peak amplitude were significantly larger than for transitions towards ρ = ±1 (α < 10⁻⁴), i.e., N1 and P2 amplitudes were more positive regardless of the context situation. Transitions of corresponding size from ρref to ρdev (Fig. 5A–C) elicited significantly larger responses in the positive than in the negative range of ρ, except for the N1 following ρref = 0. Except for the P2 preceding ρref = 0, the same trend was found for transitions back to ρref (Fig. 5D–F).

Post hoc tests for transitions with |Δρ| = 1 revealed that peak-to-peak amplitudes reflected both asymmetries, since they increased in the following order: A[0→−1] < A[0→+1] < A[−1→0] < A[+1→0]. Although the order of absolute amplitudes was different for the N1 and P2, both could be sorted from more negative to more positive voltages: A[0→−1] < A[0→+1] < A[+1→0] < A[−1→0]. These hierarchies were found in either context situation. Hence, P2 amplitude reflects the major asymmetry, whereas the relations between absolute N1 amplitudes were opposite to the major asymmetry. A minor asymmetry in terms of larger absolute amplitudes for transitions in the positive range of ρ than in the negative range was not consistently apparent for absolute N1 and P2 amplitudes.

Fig. 5G–I compares peak-to-peak amplitudes of LAEP elicited by transitions from ρref to ρdev versus the amplitudes evoked by the corresponding reverse transitions from ρdev back to ρref 800 ms later. In contrast to Fig. 3, where pairs of identical IAC changes were compared with respect to purely context related effects, each pair of bars in Fig. 5G–I refers to IAC transitions between two values of ρ in opposite direction and opposite context. Peak-to-peak amplitude was significantly larger for transitions from ρref to ρdev than for the reverse IAC switch (α < 10⁻¹²). A separate statistical analysis of the experimental conditions further revealed that these amplitude differences were also highly significant for single ρref (α < 10⁻¹² for ρref = ±1, α < 10⁻⁶ for ρref = 0).

3.2.4. Amplitude growth functions and objective threshold estimates

The shape of LAEP amplitude growth functions and objective threshold estimates were assessed for transitions from ρref to ρdev. Fig. 6A–C demonstrates that the increase of LAEP amplitudes can be described by linear functions of the deviant IAC for all ρref and for both peaks, N1 and P2, if the stimulus parameter is expressed in terms of the dB scaled ratio of correlated vs. anticorrelated components during the deviant segment, i.e., as ρ̃dev in dB(N0/Nπ) according to Eq. (5). However, for |Δρ| > 1 such a linear relationship does not apply anymore. Thus, linear functions of the dB(N0/Nπ) scaled deviant IAC, ρ̃dev, were fitted to the data for |Δρ| < 1:
$$
A(\tilde{\rho}_{dev}) = a\,\tilde{\rho}_{dev} + b, \quad \text{i.e.,} \quad A(\rho_{dev}) = 10\,a\,\log_{10}\frac{1+\rho_{dev}}{1-\rho_{dev}} + b. \tag{6}
$$
For the fits of N1, P2 and peak-to-peak amplitude according to Eq. (6), goodness-of-fit values (Press et al., 1992) between 0.976 and 1.0 were achieved for all ρref. In contrast, if data were plotted as functions of the normalized deviant IAC as in Fig. 5A–C, LAEP amplitudes could be described as roughly linearly related to ρdev only for ρref = 0, but not for ρref = ±1. Accordingly, fitting LAEP amplitudes by linear functions of the normalized IAC resulted in goodness-of-fit values that were acceptable only for ρref = 0, whereas the goodness-of-fit for ρref = ±1 ranged from 0.4 to 0.87.
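The fit of Eq. (6) can be illustrated with a small sketch: the normalized IAC is first mapped to the dB(N0/Nπ) scale and a straight line is then fitted by ordinary least squares. The amplitude values below are hypothetical (generated from assumed parameters a = −0.35 µV/dB and b = 6.0 µV), the function names are illustrative, and an unweighted fit is assumed; only the deviant correlations are taken from the figure labels.

```python
import math


def iac_to_db(rho):
    """Map normalized IAC to the dB-scaled power ratio N0/Npi used in Eq. (6)."""
    return 10.0 * math.log10((1.0 + rho) / (1.0 - rho))


def fit_line(x, y):
    """Ordinary least-squares fit y = a*x + b; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, my - a * mx


# Deviant correlations close to rho_ref = +1, as labelled in the figures.
rho_dev = [0.942, 0.894, 0.816, 0.707]
x_db = [iac_to_db(r) for r in rho_dev]

# Hypothetical amplitudes in µV, for illustration only (a = -0.35, b = 6.0).
amps = [-0.35 * x + 6.0 for x in x_db]

a, b = fit_line(x_db, amps)
threshold_db = -b / a  # abscissa intersection of the fitted line
```

On the dB scale the four deviant correlations are spaced by roughly 2.5 to 3 dB, although their spacing in normalized IAC is uneven. This is one reason why a straight line in ρ̃dev can fit data that look curved as a function of ρdev.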
Fig. 5. Amplitude of LAEP elicited by IAC transitions between ρref and ρdev, as a function of the normalized deviant IAC ρdev. All amplitudes were measured with respect to the baseline at the end of the preceding segment and averaged over subjects and channels. Error bars denote ±2 SEM. The three panels in each row correspond to the three reference correlations ρref = −1, 0 and +1, respectively. (A–C) Data for transitions from ρref to ρdev. Grey areas cover the range of ρdev which is below the respective mean psychoacoustical thresholds for the just noticeable IAC transition, ρ-JNT. (D–F) Data for the reverse transitions from ρdev to ρref. (G–I) Pairwise comparison of mean peak-to-peak amplitudes P2−N1 for IAC transitions from ρref to ρdev (dark bars) versus data for the corresponding reverse transitions from ρdev back to ρref (light bars). The number of stars at the top of each pair refers to the level of significance (*: p ≤ 0.05; **: p ≤ 0.01; ***: p ≤ 0.001). [Abscissa: normalized deviant IAC ρdev; ordinate: peak amplitude in µV (A–F), peak-to-peak amplitude P2−N1 in µV (G–I).]
The values of ρ̃dev for which the extrapolated fitted lines in Fig. 6A–C intersect the horizontal axis (0 µV) provide an objective estimate of the deviant IAC at the electrophysiological threshold, ρ̃th = −b/a dB(N0/Nπ). Mean peak-to-peak data yielded threshold estimates of ρ̃th = 16.6 dB(N0/Nπ) for the condition +1↓, −14.2 dB(N0/Nπ) for −1↑, 3.0 dB(N0/Nπ) for 0↑ and −2.3 dB(N0/Nπ) for 0↓. The respective values of normalized IAC ρth can be computed using the inverse dB(N0/Nπ) transform,
Fig. 6. (A–C) Same data as in Fig. 5A–C, but as a function of ρ̃dev, i.e., the dB scaled ratio of N0- vs. Nπ-components in deviant segments (Eq. (5)). The solid lines were fitted to the data; they show the linear relationship between LAEP amplitudes and ρ̃dev for all ρref. The intersection points of the extrapolated linear fits with the abscissa (zero amplitude) provide objective estimates for the deviant IAC at the detection threshold. Grey areas cover the range of ρdev which is below the respective mean psychoacoustical thresholds for the just noticeable IAC transition, ρ-JNT. (D–F) Comparison of behavioural and electrophysiological threshold estimation techniques: The three panels correspond to the reference correlations ρref = −1, 0 and +1, respectively. In each panel, the wide grey bar shows the mean psychophysical ρ-JNT (same data as in Fig. 1) while the thin bars display objective threshold estimates as obtained by linear extrapolation of mean N1- (white), P2- (black) and peak-to-peak amplitudes (grey). Error bars denote the standard deviation over single subject thresholds. [A–C: abscissa, deviant IAC ρ̃dev in dB ratio N0/Nπ; ordinate, peak amplitude in µV. D–F: abscissa, basis of threshold estimate (AFC, N1, P2, P2−N1); ordinate, normalized IAC at threshold (ρref ± ρ-JNT).]
$$
\rho_{th} = \frac{10^{\tilde{\rho}_{th}/10} - 1}{10^{\tilde{\rho}_{th}/10} + 1} = \frac{10^{-b/10a} - 1}{10^{-b/10a} + 1}. \tag{7}
$$
The difference |ρth − ρref| provides an estimate for the electrophysiological ρ-JNT, i.e., the minimal IAC step size |Δρ| necessary to elicit a LAEP (analogous to the psychophysically just noticeable transition). For mean peak-to-peak data, these ρ-JNTs were 0.043 for condition +1↓, 0.073 for −1↑, 0.327 for 0↑ and 0.254 for 0↓. Electrophysiological thresholds were also estimated from N1- and P2 amplitudes; the respective JNTs can be found in Table 1. The relation between these electrophysiological and the psychophysical ρ-JNTs is illustrated in the bar plots in Fig. 6D–F. The error bars describe the standard deviation of individual JNTs which were derived from single subject data, using the same method of linear extrapolation as described for the mean amplitudes. Electrophysiological ρ-JNTs were 2–7 times smaller for ρref = ±1 than for ρref = 0, reflecting the perceptual major asymmetry more strongly than the absolute LAEP amplitudes. This major asymmetry is more pronounced for threshold estimates based on the P2 than for the N1. ρ-JNTs derived from LAEP could further be characterized by a minor asymmetry between the conditions +1↓ and −1↑. For ρref = 0, however, the minor asymmetry was only apparent in LAEP thresholds based on P2 amplitudes.
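As a worked check of Eq. (7), the reported dB-scaled threshold estimates for mean peak-to-peak data can be converted back to normalized IAC and compared with the ρ-JNTs quoted above. The function name and the condition labels in the dictionary are illustrative; the dB values and reference correlations are those given in the text.

```python
def db_to_iac(rho_db):
    """Inverse dB(N0/Npi) transform of Eq. (7): dB ratio -> normalized IAC."""
    ratio = 10.0 ** (rho_db / 10.0)
    return (ratio - 1.0) / (ratio + 1.0)


# (threshold in dB(N0/Npi), reference correlation) per condition, as reported.
thresholds_db = {
    "+1 down": (16.6, 1.0),
    "-1 up": (-14.2, -1.0),
    "0 up": (3.0, 0.0),
    "0 down": (-2.3, 0.0),
}

# Electrophysiological rho-JNT: distance between threshold IAC and reference.
jnt = {cond: abs(db_to_iac(th_db) - ref)
       for cond, (th_db, ref) in thresholds_db.items()}

for cond, value in jnt.items():
    print(f"{cond:8s} rho-JNT = {value:.3f}")
```

The first two conditions reproduce the quoted values of 0.043 and 0.073 to three decimals. For the two conditions with ρref = 0 the results agree only to within about 0.01, presumably because the dB values are quoted to one decimal place.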
In order to obtain a measure of differential sensitivity to small changes of the IAC in the vicinity of ρref, the LAEP amplitude gain per change of Δρ was estimated as the derivative of the fit function at the electrophysiological threshold ρth,
$$
\frac{dA}{d\rho_{dev}}(\rho_{th}) = \frac{20}{\ln 10}\,\frac{a}{1-\rho_{th}^{2}}. \tag{8}
$$
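For readers who want the intermediate step, Eq. (8) follows directly from Eq. (6) once the base-10 logarithm is rewritten in terms of the natural logarithm. The short derivation below is a sketch that uses only the symbols already defined in Eqs. (6) and (7):

```latex
\begin{align*}
A(\rho_{dev}) &= \frac{10\,a}{\ln 10}\,\bigl[\ln(1+\rho_{dev}) - \ln(1-\rho_{dev})\bigr] + b, \\
\frac{dA}{d\rho_{dev}} &= \frac{10\,a}{\ln 10}\left[\frac{1}{1+\rho_{dev}} + \frac{1}{1-\rho_{dev}}\right]
  = \frac{10\,a}{\ln 10}\,\frac{2}{1-\rho_{dev}^{2}}
  = \frac{20}{\ln 10}\,\frac{a}{1-\rho_{dev}^{2}}.
\end{align*}
```

Evaluating this expression at ρdev = ρth gives Eq. (8). The factor 1/(1 − ρth²) grows without bound as ρth approaches ±1, which is why equal slopes a on the dB scale translate into much steeper amplitude growth per unit of normalized IAC near ρref = ±1 than near ρref = 0.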
For the four experimental conditions, the fit parameters yield an increase of mean peak-to-peak amplitudes by 37.9 µV per unit of normalized IAC for the condition +1↓, 22.7 µV for −1↑, 4.2 µV for 0↑ and 2.9 µV for 0↓. Additional values for N1- and P2 data are given in Table 1. Hence, both asymmetries are consistently apparent in the derivative of N1-, P2- and peak-to-peak fit functions. These asymmetries regarding differential sensitivity were more pronounced for the slopes of P2 amplitude fits than for the N1.

For IAC changes with |Δρ| above threshold, the following measure of cumulative sensitivity was considered: the value of a deviant IAC ρeq for which LAEP amplitudes were equal in two conditions with different ρref was computed from the intersection of the respective fit functions. Peak-to-peak amplitude growth functions for the conditions +1↓ and 0↑ intersect at ρeq = 0.786. For the conditions −1↑ and 0↓, amplitudes matched at
Table 1. IAC sensitivity in the EEG.

Condition                                   −1↑       0↓        0↑        +1↓

Electrophysiological threshold (ρ-JNT)
  N1                                        0.134     0.250     0.483     0.089
  P2                                        0.045     0.267     0.173     0.027
  P2−N1                                     0.073     0.254     0.327     0.043

Derivative of amplitude growth function (dA/dρdev)
  N1                                        2.15      1.91      1.96      6.07
  P2                                        27.42     1.04      2.26      40.19
  P2−N1                                     22.68     2.94      4.20      37.88

Deviant IAC with equal amplitude (ρeq)      −1↑ "=" 0↓          0↑ "=" +1↓
  N1                                        −0.462              0.735
  P2                                        −0.891              0.817
  P2−N1                                     −0.761              0.786

Electrophysiological ρ-JNTs, slopes and intersection points of LAEP amplitude growth functions were obtained from fits of mean N1-, P2- and peak-to-peak amplitudes, cf. Section 3.2.4.
ρeq = −0.761. Hence, to elicit LAEP of the same peak-to-peak amplitude, the IAC step size had to be about three times larger in the conditions with ρref = 0 than for ρref = ±1, substantiating the major asymmetry. The imbalance between reference correlations of ±1 and 0 is even more pronounced for the intersection points of P2 fits but weaker (or absent) for the N1 (cf. Table 1).

3.2.5. Sustained potential

Fig. 7 displays the SP shift as observed during the trailing parts of deviant segments. Since the baseline activity corresponding to an ongoing IAC of ρref was always set to zero, these shifts were measured with respect to the absolute value of the SP during the reference segments before and after the deviant segment. For any combination of ρref and ρdev, the respective baseline shifts were estimated separately for every single subject and channel and then averaged.

For ρref = +1, the SP during deviant segments with ρdev > 0 increased by about 0.3 µV compared to the baseline of the surrounding reference segments. For transitions from ρref = +1 towards negative ρdev, a negative shift could be observed. The largest negative voltage shift occurs between ρref = +1 and ρdev = −1; it amounts to −1.2 µV. For ρref = 0, the SP is about 0.5 µV more positive during deviant segments with positive non-integer ρdev, while the SP in deviant segments with negative ρdev tends to be more negative. During diotic deviant segments, however, the baseline activity drops to about −0.5 µV. For ρref = −1, the SP after the IAC change was generally more positive than during the reference segments, except for ρdev = +1. The SP tends to monotonically increase with ρdev, except for segments with ρ = +1. This dependence on ρ was also consistently apparent in individual data.

The observation that the IAC specific SP was more negative during segments with ρ = ±1 than for correlations closer to zero might reflect a major asymmetry, suggesting that more negative sustained potentials were a correlate of higher perceptual sensitivity. Since the SP was usually more positive for positive ρdev than for the corresponding negative ρdev, the data might also be interpreted in terms of a minor asymmetry. However, in contrast to the proposed major asymmetry of the SP, such a minor asymmetry suggests that higher perceptual sensitivity is associated with more positive sustained potentials. Hence, the SP cannot be consistently interpreted as a correlate of both perceptual asymmetries.
Fig. 7. Sustained potential during the deviant segment with respect to the baseline at the end of the surrounding reference segment, averaged over channels and subjects. Error bars denote 2 SEM. Triangles pointing to the right indicate the sustained potential after transitions from ρref = −1, circles denote data for ρref = 0, triangles pointing to the left correspond to ρref = +1. [Abscissa: deviant IAC ρ̃dev in dB ratio N0/Nπ; ordinate: shift of sustained potential in µV.]
4. Discussion

4.1. Psychoacoustical thresholds

Table 2 compares a selection of ρ-JNDs from the literature and IAC modulation thresholds by Dajani and Picton (2006) with the just noticeable IAC transitions (ρ-JNTs) which were measured in the present study. The ρ-JNDs may be interpreted as the 'static case' of IAC sensitivity, as measured in a discrimination task with separate stimuli of different static correlations ρref and ρdev. Such JNDs are entirely based on the internal representation of reference and deviant stimuli in auditory memory. In our ρ-JNT paradigm, however, the just noticeable change of ρ within an ongoing noise might depend not only on auditory memory but also on the transition itself as an additional temporal cue. The experimental paradigm by Dajani and Picton (2006) was even more dynamic, since target stimuli included several IAC transitions in a regular sequence; their IAC modulation thresholds represent the 'dynamic case' of IAC sensitivity that is most similar to our JNT paradigm.

4.1.1. Comparison of IAC transition thresholds to ρ-JNDs

For ρref = +1, ρ-JNDs range from 0.019 (Gabriel and Colburn, 1981) to 0.057 (Culling et al., 2001). Since the thresholds by Culling et al. (2001) were more than twice as large as JNDs reported in other studies at any ρref, they concluded that their "experiments appear to underestimate sensitivity when the results are compared to measurements using other methods". A review of single subject performance in the original articles by Gabriel and Colburn (1981), Koehnke et al. (1986), Akeroyd and Summerfield (1999) and also in this study reveals that mean ρ-JNDs from the literature often tell more about the ratio between experienced listeners and naive subjects who participated in the study than about the average listener: For ρref = +1, single subject ρ-JNDs in the low frequency domain range from 0.006 to 0.1. Gabriel and Colburn (1981) obtained their data from only two subjects.
Presumably these listeners were very experienced and showed markedly lower thresholds than most other individuals. Another issue that complicates threshold comparison is that the ρ-JND is often expressed as its equivalent SNR in an N0Sπ experiment according to Eq. (5) (Durlach et al., 1986). If individual equivalent-SNR thresholds are averaged before the mean equivalent SNR is finally transformed back to the domain of normalized IAC, mean ρ-JNDs appear to be lower than they are, because averaging does not commute with the nonlinear transform between ρ and
Table 2. Overview: IAC sensitivity in psychophysics.

                                                               ρ-JND/ρ-JNT in four conditions
Authors                   Year   Bandwidth (Hz)   Paradigm      +1↓     −1↑     0↑      0↓
Pollack and Trittipoe     1959   100–6800         Static        0.04    –       0.44    –
Gabriel and Colburn       1981   0–1000           Static        0.019   –       0.30    –
Koehnke et al.            1986   446–560          Static        0.022   –       –       –
Akeroyd and Summerfield   1999   100–500          Static        0.023   –       –       –
Culling et al.            2001   450–550          Static        0.057   –       0.72    –
Boehnke et al.            2002   100–4000         Static        0.045   0.086   0.32    0.46
Dajani and Picton         2006   0–8000           Modulated     –       –       0.31    –
Lüddemann et al.          2009   100–2000         Transitions   0.034   0.105   0.55    0.67

ρ-JNDs from literature display the discriminability of stimuli with static IAC and silence in between. The data for Culling et al. (2001) were derived from the parameters of cumulative d′-functions in the vicinity of ρ = +1 and ρ = 0, assuming d′ = 1 at the threshold. Dajani and Picton (2006) rectangularly modulated the IAC in ongoing noise between zero and the ρ-JNT at a modulation rate of 4 Hz, i.e., with 8 IAC transitions per second. ρ-JNTs in the present study refer to the detectability of IAC transitions after an initial segment of reference noise. The stimuli in the psychoacoustical experiments of this study also include a transition back to the reference IAC.
Durlach's equivalent SNR. When averaging is performed on the values of normalized IAC at the individuals' thresholds, the mean ρ-JND at ρ_ref = +1 by Koehnke et al. (1986) increases from 0.022 to the revised value of 0.035. The same might apply to the data by Akeroyd and Summerfield (1999), who also used the equivalent SNR to specify individual thresholds but did not provide further details on how exactly they computed the average over subjects. These effects occur not only during the calculation of the mean over subjects, but also when averaging the reversals in an adaptive staircase procedure in which the equivalent SNR is used as the tracking variable. Thus, the population JND for ρ_ref = +1 is likely to be between 0.03 and 0.045, in good agreement with the average ρ-JNT of 0.034 observed in the present study.

For ρ_ref = −1, the discriminability of signals with static IAC was investigated only by Boehnke et al. (2002). Their ρ-JND of 0.086 was somewhat lower than the corresponding JNT of 0.105, but the difference is still within the range of interindividual variation.

For a reference correlation of zero, fewer data are available than for ρ_ref = +1. The ρ-JNDs for ρ_ref = 0 range from 0.30 (Gabriel and Colburn, 1981) to 0.44 (Pollack and Trittipoe, 1959b) for positive ρ_dev. Culling et al. (2001) report an even larger ρ-JND of 0.72. However, this value is not included in the following comparison, for the same reasons as given in the case of ρ_ref = +1. Thresholds for negative ρ_dev have only been measured by Boehnke et al. (2002), who report a ρ-JND of 0.46. The corresponding just noticeable transitions of IAC within an uncorrelated ongoing noise are, however, markedly larger: mean ρ-JNTs were 0.55 for transitions in positive direction and 0.67 for transitions in negative direction.
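The averaging bias described earlier in this subsection (equivalent SNR versus normalized IAC) can be illustrated numerically. The sketch below assumes that the equivalent N0Sπ signal-to-noise ratio of a correlation ρ is SNR = (1 − ρ)/(1 + ρ); this form of Eq. (5) is an assumption made here, since the equation is not restated in this section. The three individual thresholds are hypothetical values chosen inside the range of 0.006 to 0.1 quoted above, and all function names are illustrative.

```python
import math


def rho_to_snr_db(rho):
    """Equivalent N0Spi SNR in dB for correlation rho (assumed form of Eq. (5))."""
    return 10.0 * math.log10((1.0 - rho) / (1.0 + rho))


def snr_db_to_rho(snr_db):
    """Inverse of rho_to_snr_db."""
    snr = 10.0 ** (snr_db / 10.0)
    return (1.0 - snr) / (1.0 + snr)


# Hypothetical individual rho-JNDs at rho_ref = +1 (illustration only).
individual_jnds = [0.006, 0.03, 0.10]
threshold_rhos = [1.0 - jnd for jnd in individual_jnds]

# (a) Average directly on the normalized IAC scale.
mean_jnd_rho_domain = sum(individual_jnds) / len(individual_jnds)

# (b) Average the equivalent SNRs in dB, then transform the mean back.
mean_snr_db = sum(rho_to_snr_db(r) for r in threshold_rhos) / len(threshold_rhos)
mean_jnd_snr_domain = 1.0 - snr_db_to_rho(mean_snr_db)

print(f"mean JND, averaged in rho domain: {mean_jnd_rho_domain:.4f}")
print(f"mean JND, averaged in dB domain:  {mean_jnd_snr_domain:.4f}")
```

Under these assumptions the dB-domain average yields the smaller mean JND, because averaging in dB corresponds to a geometric rather than an arithmetic mean of the individual SNRs.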
Interindividual variability is unlikely to account for this discrepancy between JND and JNT at ρ_ref = 0, because the lowest individual JNT measured in the 0↓ condition was 0.49, i.e., even our 'best' subject did not reach the threshold reported by Boehnke et al. (2002) for the corresponding JND experiment.

4.1.2. Comparison of dynamic and static binaural cues

Since the ρ-JNTs found in the present study for correlated and anticorrelated reference are within the range of ρ-JNDs from the literature, there seems to be little difference between static and dynamic decision cues for ρ_ref = ±1. For ρ_ref = 0, in contrast, JNTs are about twice as large as the corresponding JNDs. Hence, psychophysical JNTs show a larger major asymmetry than JNDs. A possible explanation is that the optimal cue in a JNT paradigm might depend on the reference correlation. Most subjects reported that they always focused on the first IAC switch regardless of ρ_ref. Such a strategy might work well for ρ_ref = +1, while for ρ_ref = 0 it could be more successful to compare the IAC from pre-switch and post-switch segments as in a JND task with static correlations. This hypothesis is based on the temporal resolution of the binaural system and the variance of sample correlations: the binaural system is assumed to analyze the incoming signals within a moving temporal window of finite duration T describing binaural sluggishness (Kollmeier and Gilkey, 1990; Akeroyd and Summerfield, 1999). Accordingly, the mean correlation ρ_0 of a signal or segment with a duration longer than T cannot be determined from only a single observation at the current window position, because the correlation at the output of the binaural window is a random variable with a variance proportional to 1 − ρ_0² (Gabriel and Colburn, 1981). In order to detect an IAC transition, the binaural system has to draw samples of length T from the input signal and thereby decide if the instantaneous stimulus correlation has just changed. For transitions from ρ_ref = +1, all sample correlations before the IAC switch are +1, with zero variance across samples, at least if there were no internal noise due to processing errors in the auditory system. Therefore the first sample after the transition can be immediately identified as part of the deviant stimulus segment, indicating an IAC transition. In contrast, sample correlations before an IAC switch from ρ_ref = 0 to a deviant correlation are randomly distributed, with most samples having an apparent IAC in the vicinity of zero but some rare samples outside the respective confidence region [−σ, σ]. Accordingly, there is always a chance that the initial samples of the post-switch segment are either ignored as if they were rare events with sample correlations outside the interval [−σ, σ], or that they, due to their own random distribution, do not exceed the switch detection threshold at all. Conversely, some samples within a reference stimulus might be misinterpreted as evidence for a non-existent IAC transition.
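The role of sample variance in this argument can be illustrated with a small simulation. The sketch below is not the model used in this study; it merely draws short windows of Gaussian noise with a prescribed long-term correlation and reports the spread of the windowed sample correlations. The window length of 50 independent samples, the number of windows and all function names are arbitrary assumptions made for illustration.

```python
import math
import random
import statistics


def sample_correlation(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def window_correlations(rho, n_windows=400, n_samples=50, seed=1):
    """Sample correlations of short windows cut from noise with long-term IAC rho."""
    rng = random.Random(seed)
    mix = math.sqrt(1.0 - rho * rho)  # weight of the independent noise component
    correlations = []
    for _ in range(n_windows):
        left = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
        indep = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
        right = [rho * l + mix * i for l, i in zip(left, indep)]
        correlations.append(sample_correlation(left, right))
    return correlations


# Spread of the windowed correlation for three reference correlations.
spread = {rho: statistics.pstdev(window_correlations(rho)) for rho in (0.0, 0.9, 1.0)}
for rho, sd in spread.items():
    print(f"rho = {rho:+.1f}: std of windowed sample correlation = {sd:.4f}")
```

In such a simulation the spread vanishes for ρ = +1 and is largest for ρ = 0, so a single post-switch window stands out immediately against a fully correlated reference but not against an uncorrelated one.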
To minimize the number of missed IAC transitions as well as the false-alarm rate in the case ρ_ref = 0, it might therefore be a more appropriate strategy to estimate the most likely stimulus correlations from several observations before and after the IAC change separately and to compare their mean correlations as in a static IAC discrimination task. Hence, listeners attending to IAC transitions only, while ignoring the static correlations during the pre-switch and post-switch segments, might perform equally well for ρ_ref = ±1 but are likely to show increased miss and false-alarm rates for ρ_ref = 0. This argument might account for the discrepancy between our ρ-JNTs for ρ_ref = 0 and the corresponding ρ-JNDs from the literature. However, it cannot explain the difference to the values reported by Dajani and Picton (2006), who periodically switched the IAC between ρ_ref = 0 and the deviant correlation ρ_dev at a modulation rate of 4 Hz, i.e., with 8 IAC transitions per second. In their paradigm the segments with ρ = 0 and ρ = ρ_dev were half as long as in our psychoacoustical experiments. Although the reduced segment duration could be expected to increase thresholds (cf. Section 2.7), they report a JND of 0.31 for the discrimination of their modulated stimuli from uncorrelated noise.
The most important difference between both experiments is that the stimuli in the paradigm by Dajani and Picton (2006) contained several IAC transitions at a regular repetition rate. An increased false-alarm rate, as proposed for our JNT paradigm, might have been prevented because subjects would presumably have rejected an interval if the number or the timing of possible transitions within that interval did not fit the expected pattern of eight IAC switches per second in alternating direction. The miss rate should even have decreased: due to the inherent random fluctuations of the stimulus, the probability that at least some IAC switches exceeded the detection threshold increased with the number of switches, while the probability that all transitions remained below threshold decreased.

4.2. Electrophysiological data

Since in the present study only the IAC of continuously running noise was varied while there were no changes of the monaural signal properties during the recording session, any contribution of auditory processing on the monaural pathway to the LAEP was avoided. Instead, the LAEP elicited by these IAC changes directly display electrophysiological correlates of specific binaural processing only. The morphology of binaurally elicited LAEP that has been reported by Jones (1991), Jones et al. (1991) and McEvoy et al. (1991a,b) was also found in the present study. While McEvoy et al. (1991a) systematically varied the ITD of continuously concatenated noise segments, the other studies focused on a rather general comparison of coherent, incoherent and/or lateralized noise segments which were presented in an interspersed manner. For these different experimental conditions, mean LAEP latencies ranged from 130 to 150 ms for the N1 and from 220 to 240 ms for the P2, while amplitudes between 2 and 5 µV were reported.
Although these studies did not investigate the dependence of LAEP on the degree of correlation in detail, the absolute amplitudes and latencies observed in our experiments were similar to those values from the literature but cover a wider range: average N1 latencies were between 120 and 160 ms, while latencies from 190 to 260 ms were observed for the P2 (Fig. 4). Mean amplitudes reached up to 6 µV; the smallest values that could be observed in single subjects were limited by the noise floor of about 0.5 µV (Figs. 2, 4 and 5). The amplitudes and latencies reported in the present study might exceed the range of values from the literature mainly because of the comparatively dense sampling of the parameter space, with three different reference correlations and IAC transitions close to the perceptual threshold as well as a set of clearly audible IAC transitions to deviants in the entire range of ρ.

4.2.1. LAEP amplitude growth functions

Even for |Δρ| < 1, LAEP amplitude cannot be consistently described using a unique functional relationship based on the normalized IAC, because the amplitude growth functions had very different curvatures and slopes, depending on ρ_ref and ρ_dev (Fig. 5A–C). These differences are systematically related to the major and minor asymmetry and are thus discussed in Section 4.4. In contrast, the proposed linear relationship between the dB(N0/Nπ) scaled deviant IAC and LAEP amplitude can describe data from all experimental conditions equally well (Fig. 6A–C). The voltage gain per dB(N0/Nπ) was not only constant in each single condition, i.e., independent of the particular value of ρ̃_dev, it was also quite similar across conditions. The peak-to-peak growth functions had slopes between 0.32 and 0.43 µV/dB(N0/Nπ). P2 amplitudes increased by 0.25–0.28 µV/dB(N0/Nπ), except for the condition 0↓. The derivatives of N1 amplitude growth functions were less similar (0.12–0.21 µV/dB(N0/Nπ), except for the condition −1↑).
Both exceptions refer to those conditions with generally smallest amplitudes for the respective peaks and might at least partially be due to the antagonistic effects of SP shifts on absolute N1 and P2 amplitudes (cf. Section 4.2.4). However, in any case the rather few dissimilarities of growth function slopes on the dB(N0/Nπ) scale did not follow a consistent pattern across experimental conditions for N1, P2 and peak-to-peak data, i.e., the voltage gain per dB(N0/Nπ) can also be considered to be independent of ρ̃_ref.

A linear quantitative description of LAEP amplitudes which is independent of particular values of ρ_ref and ρ_dev suggests the use of the dB(N0/Nπ)-transformed analog of IAC step size, |Δρ̃| = |ρ̃_dev − ρ̃_ref|, instead of ρ̃_dev. For ρ_ref = 0, ρ̃_ref is also 0. For ρ_ref = ±1, however, the corresponding values of ρ̃_ref and accordingly |Δρ̃| are infinite. These infinities might be considered as a drawback of the dB(N0/Nπ) scale, but they can be resolved by taking into account processing errors on the auditory pathway. The non-deterministic character of firing patterns in the monaural periphery suggests modeling the effect of such processing errors by adding some uncorrelated noise to the binaural stimulus, resulting in a decorrelation of the left and right inputs prior to binaural feature extraction (van der Heijden and Trahiotis, 1997; Lüddemann et al., 2007).

In contrast to our paradigm, Dajani and Picton (2006) investigated the influence of IAC step size on the EEG with an IAC sequence in which ρ periodically alternated between 0 and Δρ at a modulation rate of 4 Hz, i.e., with a segment duration of 125 ms. Δρ was chosen from a set of equally spaced values (0.2, 0.4, 0.6, 0.8 and 1) and remained fixed for each run of the experiment. These stimuli elicited binaural auditory steady state responses, and the respective EEG recordings were analyzed in the same way as monaural ASSR. The spectral amplitude at the 2nd harmonic of their grand mean data increased roughly linearly with Δρ.
This linearity of mean ASSR growth functions is consistent with the linear increase of LAEP amplitude in response to transitions from ρ_ref = 0 in the present study. The nonlinear increase of our LAEP amplitude growth functions for ρ_ref = ±1, however, rather suggests that the magnitude of the response is not generally linearly related to |Δρ| throughout the entire parameter range. This outcome of the present study is not in contradiction to the data by Dajani and Picton (2006), for the following reasons: First, in the vicinity of ρ = 0 within a range of about −0.6 to +0.6, the dB(N0/Nπ) transform is roughly linearly related to the normalized IAC, with ρ̃ ≈ 8.7·ρ dB(N0/Nπ), i.e., ρ and ρ̃ describe the data equally well in this linear range. Since Dajani and Picton (2006) used only two values (Δρ = 0.8 and 1) in the nonlinear range of the dB(N0/Nπ) scale, their parameter spacing might have been simply too coarse to reveal possible nonlinearities in the data. Second, the presence of peripheral noise not only ensures that the maximal values on the dB(N0/Nπ) scale remain finite, it also reduces the curvature of the dB(N0/Nπ) transform in the vicinity of ρ = +1. Third, a progressive increase of ASSR amplitudes with Δρ approaching 1 might have been prevented by saturation. This could also be the case for the P2 in response to transitions between ρ_ref = 0 and ρ_dev = 1 in our experiments (cf. Section 4.2.3). Fourth, the single subject data in Fig. 8a by Dajani and Picton (2006) indicate that the increase of ASSR amplitude was largest when Δρ was incremented from 0.8 to 1, compared to other increments, e.g., 0.6 to 0.8, 0.4 to 0.6, etc., in agreement with the increased differential sensitivity of our LAEP in the vicinity of ρ_ref = +1. This nonlinear trend might have disappeared in the grand mean data due to their averaging technique.
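The behaviour of the dB(N0/Nπ) scale discussed above can be checked numerically. The following sketch assumes the transform ρ̃ = 10·log10[(1 + ρ)/(1 − ρ)], which is consistent with the slope of about 8.7 dB per unit IAC quoted above, and models peripheral processing errors by a hypothetical parameter `internal_noise` (uncorrelated noise power added at each ear, relative to the stimulus power), so that the effective correlation becomes ρ/(1 + internal_noise). The function name, the parameter and its value of 0.05 are illustrative assumptions, not values taken from this study.

```python
import math


def iac_to_db(rho, internal_noise=0.0):
    """dB-scaled N0/Npi power ratio for a normalized IAC rho.

    internal_noise is a hypothetical parameter: power of uncorrelated noise
    added at each ear, relative to the stimulus power. It reduces the
    effective correlation to rho / (1 + internal_noise).
    """
    rho_eff = rho / (1.0 + internal_noise)
    if abs(rho_eff) >= 1.0:
        # Without internal noise the scale diverges at rho = +1 and rho = -1.
        return math.copysign(math.inf, rho_eff)
    return 10.0 * math.log10((1.0 + rho_eff) / (1.0 - rho_eff))


# Derivative of the transform at rho = 0, in dB per unit IAC.
slope_at_zero = 20.0 / math.log(10.0)

for rho in (0.2, 0.4, 0.6, 0.8, 1.0):
    exact = iac_to_db(rho)
    linear = slope_at_zero * rho
    noisy = iac_to_db(rho, internal_noise=0.05)
    print(f"rho = {rho:.1f}: exact = {exact:6.2f} dB, "
          f"linear = {linear:5.2f} dB, with internal noise = {noisy:5.2f} dB")
```

With these assumptions the linear approximation stays close to the exact transform for small |ρ| and falls increasingly short of it as |ρ| grows, while any non-zero internal noise keeps the scale finite at ρ = ±1.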
4.2.2. IAC specific sustained potential

As there is no zero potential defined for the EEG, the voltage of a sustained potential is usually specified with respect to some
arbitrary reference potential during control intervals, e.g., the baseline value during silence. In our stimulus paradigm, however, continuous noise was interspersed with three different types of reference stimuli (ρ_ref = +1, 0 and −1), and the baseline of the LAEP time series during these reference segments was used to define separate reference potentials for the respective experimental conditions. Hence, for each combination of ρ_ref and ρ_dev, the data in Fig. 7 indicate a shift of the baseline relative to the SP associated with the particular reference IAC of either ρ_ref = +1, 0 or −1, rather than an absolute SP associated with ρ_dev itself. If the IAC related sustained potentials were determined only by the ongoing correlation during the current segment, the three curves in Fig. 7 should exactly match at any ρ except for a constant offset in y-direction according to the SP associated with the respective ρ_ref. After shifting the data in positive direction for ρ_ref = +1 and in negative direction for ρ_ref = −1, the three curves can be reasonably aligned, in the sense of overlapping confidence intervals, for most ρ_dev. However, the curves do not match perfectly, in particular for the negative ρ_dev in the condition −1↑. Moreover, if the SP were unaffected by the IAC in the preceding segments, it should shift by the same amount but in opposite direction in two conditions with complementary values for ρ_ref and ρ_dev. For changes between +1 and 0 in the conditions +1↓ and 0↑ and for changes between +1 and −1 in the conditions +1↓ and −1↑, however, the SP always shifted in negative direction. In conclusion, these discrepancies across conditions suggest that the SP is mainly determined by the current value of ρ, but is possibly also influenced by the IAC sequence in the preceding segments.

4.2.3. LAEP amplitudes in response to large IAC changes

For IAC steps with |Δρ| ≥ 1, LAEP amplitude growth functions were shallower than for |Δρ| < 1, possibly due to saturation.
However, for the largest IAC steps to deviant segments, P2 amplitude even decreased, whereas N1 amplitude diminished for the corresponding changes back to the reference. Saturation can explain neither the decrease itself nor its dependence on context. For ρ_ref = ±1, one could argue that the amplitudes of the P2 in response to deviant segments with ρ_dev = ∓1 and the N1 in response to the respective reverse transitions did not monotonically increase as |Δρ| approached the maximum of 2 because the subjective listening impression associated with a correlation of +1 might be more similar to the percept of a stimulus with ρ = −1 than to the percept of a stimulus with ρ = 0 (Culling et al., 2003). However, such an interpretation is not applicable to our data, for the following reasons. First, if correlations of ρ = +1 and ρ = −1 were more similar to each other than either one to ρ = 0, any transition between ρ = ±1 and ρ = 0 should have elicited larger N1 and P2 amplitudes than transitions with |Δρ| = 2, in contradiction to our data. Second, it seems odd that perceptual similarity should cause a decrease of P2 amplitude only in the context situation with ρ_ref preceding the IAC switch while the analogous behaviour of the N1 was only observed with opposite context, i.e., with ρ_ref after the transition. Finally, perceptual similarity cannot explain why transitions from ρ_ref = 0 to ρ_dev = 0.894 elicited a larger P2 than transitions to ρ_dev = 1.

4.2.4. Influence of sustained potential shifts on the N1-P2-complex

Since both N1 and P2 amplitudes were always measured with respect to the pre-switch baseline, i.e., relative to the SP in their preceding segment, a change of the SP during the generation of N1 and P2 would result in a shift of the N1 and P2 components in the same direction as the SP due to the linear superposition of electric fields from transient and sustained brain activity.
Such a superposition could explain why transient LAEP appear to be biased towards the N1 component in conditions with negative baseline shifts (0↓ and +1↓ with negative ρ_dev) while in conditions with positive changes of the SP (−1↑ and 0↑) the P2 component was comparatively more pronounced (filled symbols in Fig. 5A–C and Fig. 7). Since the SP is shifting back to its initial value after the reverse transitions back to ρ_ref, its effect on N1 and P2 amplitudes in the conditions ↑0, ↑+1, ↓−1 and ↓0 is just opposite to the effect on the N1-P2-complex after normal transitions (open symbols in Fig. 5D–F). This could explain why non-monotonic relations between LAEP amplitude and |Δρ| were only observed either for the P2 or for the N1, according to the reversed direction of the IAC change in opposite contextual situations. In particular, the positive N1 voltages in the condition ↑0 and for ρ_dev = −1 in the condition ↑+1 can be understood as a consequence of such a reverse baseline shift in positive direction. This suggests that the IAC specific SP is already present after about 100 ms of ongoing correlation, and that it contributes to the electric far field with a larger amplitude than the generator of the N1 in situations with reversed context. Previous binaural studies (Jones, 1991; McEvoy et al., 1991a,b) reported neither such a systematic bias of the N1-P2-complex nor a specific binaural SP, presumably because electrode drift was removed from their data by high pass filtering and not by linear detrending as in the present study. Since the influence of such baseline shifts on peak-to-peak amplitudes is determined by the change of the SP during the interval between N1 and P2, it depends on the temporal dynamics of the sustained response. If the SP were monotonically changing within the first 150–300 ms, as suggested by monaural experiments by Lammertmann and Lütkenhöner (2001) and Picton et al. (1978), its influence would increase with latency, having greater effects on the P2 than on the N1.
Consequently, in comparison to a hypothetical unbiased case in which there were no baseline shifts at all, the superposition of the N1-P2-complex with a monotonically increasing SP would result in an increased peak-to-peak difference P2−N1, whereas a baseline shift in negative direction would decrease the peak-to-peak amplitude. Such effects could partially explain why peak-to-peak amplitudes did not monotonically increase with |Δρ| for transitions from ρ_ref = +1 to ρ_dev = −0.707 and to −1 (Fig. 5C). However, even if LAEP were fully saturated in these cases, it remains unclear why the P2 in response to an IAC change from ρ_ref = +1 to ρ_dev = −1 was about 2 µV less than for the smaller change to ρ_dev = 0, although the SP shift in Fig. 7 could only account for a maximal amplitude decrease of about 1 µV. In conclusion, the combined effects of saturation and shifts of the SP might qualitatively explain the observed pattern of N1, P2 and peak-to-peak amplitude growth functions at large |Δρ|. However, the assumption of a gradually evolving sustained response which monotonically approaches some limit value according to Fig. 7 cannot quantitatively account for all LAEP amplitudes in the present study.

4.2.5. LAEP amplitude growth: IAC specific effects or mismatch detection?

If only data from a single experimental condition were considered, one might interpret the systematic increase of LAEP amplitude and the decrease of latency with IAC step size as a result of rather general mismatch detection within the context of the stimulus sequence, following the arguments by, e.g., Jones et al. (1991) and Sonnadara et al. (2006). One might even argue that the systematic differences of amplitude and latency across conditions were not specifically related to the IAC itself but that instead they only occurred in our data because the auditory objects associated with particular correlations were of different perceptual relevance for mismatch detection during auditory scene analysis.
The overall pattern of LAEP in all four conditions, however, clearly indicates that such an interpretation cannot be entirely true, for the following reasons. A strict 'mismatch-only' hypothesis implies that the amplitude and latency of N1 and P2 were entirely determined by the 'magnitude of (perceptual) mismatch' associated with a change of stimulus parameters, independent of the particular values of ρ before and after that transition. Accordingly, any two stimuli which are equal in their 'magnitude of mismatch' should elicit responses with the same amplitude and latency, and vice versa. In particular, if two stimuli were found to elicit LAEP of the same amplitude, then their latencies would also have to be equal. A comparison of such equal-amplitude or equal-latency pairs across the four reference situations in Figs. 4 and 5 demonstrates that the strict 'mismatch-only' hypothesis does not hold: neither did amplitudes match for stimuli with equal latencies, nor did latencies match for stimuli with equal amplitudes. Likewise, stimuli which elicited equal N1 amplitudes usually evoked different P2 amplitudes and vice versa. Last but not least, the order of psychoacoustical thresholds in the four experimental conditions was not consistently represented in N1 and P2 latencies (cf. Section 3.2.2). The SP also has little relation to unspecific mismatch detection: if the SP shifts were related to the perceptual mismatch between deviant and reference IAC, regardless of the particular value of ρ_ref, the baseline shifts in the conditions 0↑ and 0↓ should have the same sign for corresponding positive and negative ρ_dev. Likewise, the curves for ρ_ref = +1 and −1 in Fig. 7 should be roughly symmetric, in contradiction to the data. For the same reasons, the SP cannot be an electrophysiological correlate of expectation. In summary, the amplitudes and latencies of the N1-P2-response and the shifts of the SP exhibit IAC specific features which cannot be explained as a result of unspecific mismatch detection.

4.2.6.
Separation of context related and IAC specific effects

A pairwise comparison of LAEP to transitions with equal step size but opposite direction within the same recording session indicates that peak-to-peak amplitudes were always larger for transitions from a reference to a deviant segment than for the subsequent transition from a deviant back to a reference segment (Fig. 5G–I). It is remarkable that these relations were also found for ρ_ref = 0, because in psychophysics the effects of the major asymmetry are known to be even stronger than the influence of any other signal parameter such as stimulus duration, bandwidth, center frequency or presentation level (Pollack and Trittipoe, 1959a,b; Gabriel and Colburn, 1981; Culling et al., 2001). Accordingly, if there were only a major asymmetry but no influence of the contextual relation between two values of the IAC, the reverse transitions from ρ_dev = Δρ back to ρ_ref = 0 should have elicited larger peak-to-peak amplitudes than IAC changes from the reference ρ_ref = 0 to ρ_dev = Δρ, in particular for |Δρ| = 1. Since this is not the case, the unspecific context information in the stimulus sequence evidently had a stronger influence on peak-to-peak amplitudes than the IAC specific major asymmetry. Although the jittered duration of the reference segments might partially account for the larger amplitude of LAEP to transitions towards ρ_dev, this rather small temporal uncertainty is unlikely to explain why the amplitudes of normal and reverse transitions differed by a factor of two (see review by Näätänen and Picton (1987), pp. 401–404). Control experiments with ρ alternating between +1 and 0 were performed for a constant segment duration (800 ms) and for a jittered segment duration (800 ms ± 50 ms jitter in both diotic and uncorrelated segments). The data showed that the LAEP amplitude did not depend on whether the segment duration in the stimulus sequence was jittered or not. The fact that the amplitude differences in Fig.
5G–I were more pronounced within sessions with ρ_ref = ±1 than for ρ_ref = 0 can be consistently explained by synergistic (ρ_ref = ±1) and antagonistic (ρ_ref = 0) combinations of the effects which are related to the major asymmetry and to the processing of unspecific context information. The context effect must be based on a mechanism with a temporal scope of at least 2400 ms for a ρ_ref | ρ_dev | ρ_ref triplet. Furthermore, the absence of significant interactions between the effects of context and the effects of IAC range or switch direction on LAEP amplitude suggests that this mechanism is unaffected by the IAC during the intervening deviant segments, i.e., it presumably does not show any adaptation to the ongoing signal with a delay shorter than 800 ms. In contrast, for the observation of IAC specific asymmetries, a segment duration of 800 ms was sufficient, no matter if the IAC preceding the transition served as a reference or as a deviant within the stimulus sequence. For the six physically identical IAC transitions which occurred with opposite contextual relations in different conditions (Fig. 3), the absolute and relative amplitude differences and the statistical significance were larger for the N1 than for the P2. Hence, purely context related effects were more pronounced for the N1, whereas the IAC specific asymmetries were represented better by the P2. In conclusion, the information about IAC specific and unspecific properties of an ongoing signal is presumably reflected in independent, disjoint features of its neuronal representation on the auditory pathway, contributing with different weights to the generation of N1 and P2.
4.3. Comparison of psychophysical and electrophysiological thresholds

The electrophysiological data in Fig. 6D–F and in Table 1 show that the largest electrophysiological ρ-JNTs were generally obtained from N1 amplitudes, while P2 amplitudes yielded the smallest ρ-JNTs. This can be understood as a consequence of N1-P2 shifts due to changes of the sustained potential: for the conditions +1↓, 0↑ and −1↑, the data in Fig. 7 indicate shifts of the SP in positive direction for those ρ_dev which contributed to the respective amplitude growth function fits in these conditions. Accordingly, the N1 and P2 growth functions in Fig. 6A–C can be considered to have an offset in positive direction which would not be apparent in the data if there were no SP. Because of this offset the positively shifted P2 growth functions intersect the x-axis at values of ρ̃_dev which are systematically closer to the reference correlation than they would be without the offset. Likewise, the positive offset of the N1 growth functions results in x-axis intercepts with systematically enlarged distance to the reference. In the condition 0↓, in contrast, the SP is weaker than in the other conditions and shifts in opposite direction, so that an opposite but less pronounced effect on N1 and P2 based threshold estimates may be assumed. This interpretation explains the discrepancy between N1 and P2 based threshold estimates for all four experimental conditions. The overall pattern of amplitude growth functions and electrophysiological JNTs suggests that threshold estimates based on the difference P2−N1 are comparatively unaffected by shifts of the SP. According to a quantitative comparison of ρ-JNTs obtained from an extrapolation of peak-to-peak amplitudes versus psychoacoustical JNTs, electrophysiological thresholds tend to be better than their perceptual counterparts, except for ρ_ref = +1 (Fig. 6D–F).
This relation between LAEP based JNTs and perceptual performance is reasonable because thresholds were defined by different criteria: while the extrapolation of LAEP growth functions estimates a 'floor-level' threshold, i.e., the minimum IAC step size that is required to elicit a non-zero LAEP, the psychophysical thresholds were measured using an adaptive procedure with a 1-up-2-down step rule, adjusting |Δρ| until the correct response rate converges to the 70.7% point on the psychometric function within the 3-AFC paradigm. Since the correct response rate in the psychophysical task is already better than chance for values slightly below the behavioural threshold, finite LAEP amplitudes could be expected in response to perceptually just noticeable IAC transitions.
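The difference between the two threshold criteria can be made concrete with a short calculation. A 1-up-2-down track has no net drift where two consecutive correct responses are as likely as any other outcome, i.e., where p² = 0.5 and hence p ≈ 0.707. The sketch below computes this point and, for a purely hypothetical logistic psychometric function with a chance level of 1/3 (3-AFC), compares the tracked step size with the step size at which performance is only 50% correct, i.e., already above chance. The midpoint and slope values are arbitrary assumptions, not fitted to the data of this study.

```python
import math


def equilibrium_percent_correct(n_down):
    """Correct-response probability at which a 1-up-n-down track has no net drift."""
    return 0.5 ** (1.0 / n_down)


def psychometric(step, midpoint, slope, chance=1.0 / 3.0):
    """Hypothetical 3-AFC psychometric function (logistic, chance level 1/3)."""
    detect = 1.0 / (1.0 + math.exp(-slope * (step - midpoint)))
    return chance + (1.0 - chance) * detect


def tracked_step(target, midpoint, slope, lo=0.0, hi=2.0, iterations=60):
    """Bisection for the IAC step size at which psychometric() equals target."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if psychometric(mid, midpoint, slope) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


target = equilibrium_percent_correct(2)  # 1-up-2-down rule
step_at_target = tracked_step(target, midpoint=0.5, slope=10.0)
step_above_chance = tracked_step(0.5, midpoint=0.5, slope=10.0)

print(f"convergence point of the 1-up-2-down rule: {100.0 * target:.1f}% correct")
print(f"step size tracked by the adaptive rule:    {step_at_target:.3f}")
print(f"step size for 50% correct (above chance):  {step_above_chance:.3f}")
```

For any monotonic psychometric function of this kind, the step size at which responses first exceed chance lies below the tracked 70.7% point, which is the sense in which a 'floor-level' criterion is expected to yield smaller thresholds than the behavioural one.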
The peak-to-peak amplitude fit functions would reach values of 0.63, 1.50, 1.03 and −0.37 µV at the perceptual threshold for the conditions −1↑, 0↓, 0↑ and +1↓, respectively (grey areas in Fig. 6A–C). The negative peak-to-peak amplitude of −0.37 µV for ρ_ref = +1 is obviously meaningless. It might therefore be more appropriate to use another criterion than the electrophysiological JNT to estimate perceptual JNTs from LAEP amplitudes: since the slopes of the peak-to-peak amplitude growth function at the 0 µV threshold are roughly inversely proportional to the behavioural thresholds, the psychophysical ρ-JNTs are well approximated by twice the inverse slope of the peak-to-peak amplitude growth function at the 0 µV threshold. This yields values of 0.088, 0.680, 0.476 and 0.053 for the conditions −1↑, 0↓, 0↑ and +1↓, respectively, in good agreement with the actually observed perceptual JNTs. The above relations between JNTs that were derived from mean data were also found in the majority of data from single subjects. However, a further quantitative analysis of individual JNTs was impeded by comparatively large error bars and a number of outliers. Amplitude growth functions with shallow slopes caused huge variations of the respective threshold estimates and led to implausible JNTs. For psychophysical and electrophysiological JNTs, the standard deviation across subjects appears to be much larger for ρ_ref = 0 than for ρ_ref = ±1 (Figs. 1 and 6D–F). The large variance of N1 based threshold estimates for ρ_ref = −1 is due to the generally shallow slope of amplitude growth functions in this condition. If the values of ρ_dev at the threshold are, however, expressed as their equivalent N0/Nπ ratio, the interindividual standard deviation is similar for all ρ_ref.

4.4. Asymmetries

The definition of the terms “major” and “minor asymmetry” was motivated by psychoacoustical findings.
The concept of two asymmetries was chosen to classify whether asymmetries in IAC sensitivity are related to the absolute values of the involved correlations (major asymmetry) or to their signs (minor asymmetry), regardless of the experimental paradigm. It is also applicable to observations in EEG, MEG and fMRI, no matter whether the respective data were collected under static or dynamic test conditions.

4.4.1. Effect of stimulus presentation order in EEG and MEG data

In addition to the discriminability of the two correlations before and after a stepwise IAC change, the detectability of IAC transitions in a dynamic paradigm also depends on which of the two correlations precedes the other. Chait et al. (2005, 2007) reported that the global field power of the MEG during the N1-P2 complex was larger after transitions from ρ = +1 to ρ = 0 than vice versa, i.e., they found a major asymmetry due to the presentation order. In the present study, this temporal order asymmetry with respect to the absolute value of the preceding correlation was found for all transitions with |Δρ| = 1: peak-to-peak amplitudes were about 1.4 times larger for transitions from ρ_ref = ±1 to ρ_dev = 0 than for transitions from ρ_ref = 0 to ρ_dev = ±1. The same effect of temporal order was found for the reverse transitions from ρ_dev to ρ_ref.

In one of their binaural ASSR experiments, Dajani and Picton (2006) presented their subjects with a stimulus in which ρ alternated between +1 and 0 at a modulation rate of 1 Hz, i.e., with a segment duration of 500 ms. In this experimental condition they found peak-to-peak amplitudes of about 2.7 µV regardless of the correlation in the segment preceding the transition, i.e., their data were unaffected by the presentation order. To test whether the absence of a major asymmetry in their data could be a consequence of the highly predictable stimulus sequence in combination with a comparatively short segment duration, we performed a control experiment using the same stimulus sequence and segment duration as Dajani and Picton (2006). The LAEP (average over 1000 epochs) of the two test subjects showed a clear major asymmetry, with peak-to-peak amplitudes being about 1.5 times larger for transitions from ρ = +1 to ρ = 0 than for the reversed presentation order. Hence, the effect of stimulus presentation order can be characterized by the same major asymmetry in psychoacoustics, EEG and MEG.

The large sample variance of uncorrelated noise might reduce the salience of a subsequent IAC transition (cf. Section 4.1.2). This would not only explain why the psychophysical JNTs show a larger major asymmetry than JNDs, it could also account for the major asymmetry in our own LAEP and in the MEG data by Chait et al. (2005, 2007). However, the temporal order asymmetry cannot be consistently applied for negative ρ, since LAEP amplitudes were larger for transitions from −1 to +1 than vice versa, whereas psychophysical thresholds were better for ρ_ref = +1 than for ρ_ref = −1.

4.4.2. Representations of ongoing IAC in EEG, MEG and fMRI data

The discriminability of two static correlations in separate signals with silence in between, and hence also in separate parts of a signal, is generally thought to reflect the internal representation of ongoing IAC and its statistics (Gabriel and Colburn, 1981; Breebaart et al., 2001; Shackleton et al., 2005). Using fMRI, Budd et al. (2003) found a significant positive relationship between BOLD activity and IAC in the primary auditory cortex. Their stimuli were separated by silence and had static correlations between 0 and 1, which were chosen ‘by ear’ in order to resemble approximately equal perceptual step sizes (ρ = 0, 0.33, 0.6, 0.8, 0.93 and 1).
They observed a monotonic increase of raw BOLD activity in supra-threshold voxels with the stimulus IAC and found larger differences in activation between levels of IAC near unity than between levels near zero. The comparatively small increment of static correlation from 0.93 to 1 was even accompanied by a larger increase in cortical activity than an increment from 0 to 0.33. Hence, the fMRI data by Budd et al. (2003) suggest a higher sensitivity to IAC differences in the vicinity of 1 and thus a pronounced major asymmetry.

The nonlinear relation between the normalized IAC and BOLD activity supports the hypothesis that the internal representation of ongoing IAC can be characterized by a nonlinear scale transformation which, in turn, might also account for the major asymmetry in our LAEP data. Such a nonlinear internal representation of ρ could also account for the observation by Chait et al. (2005) that the global field power of the MEG during the N1-P2 complex was always larger after an IAC change from 1 to 1 − |Δρ| than after a transition from 0 to 0 + |Δρ|. However, this major asymmetry could also be an effect of temporal presentation order, according to the precedence of either ρ = 1 or ρ = 0 for each of the various Δρ. The nonlinear increase of our LAEP amplitudes in the conditions with ρ_ref = ±1, however, cannot be due to the temporal order in which two correlations are presented, because all transitions in the respective conditions were preceded by the same reference correlation. An internal representation of ongoing IAC according to the dB(N0/Nπ) transform, in contrast, can explain the shape of LAEP amplitude growth functions in all conditions.

4.4.3. Asymmetries in single cell data

For neurons in the IC of the guinea pig, Shackleton et al. (2005) demonstrated that rate-IAC functions (rICF) can look very different for each neuron.
Typical peaker neurons increased their firing rate with ρ, while the rICFs of typical trougher neurons showed a monotonic decrease with ρ. From a receiver operating characteristic analysis of spike count distributions, Shackleton et al. (2005) estimated IAC discrimination thresholds for single neurons. Their
single neuron ρ-JNDs were in obvious contradiction to the perceptual major asymmetry found in human psychophysics. Moreover, for any reference correlation, single neuron thresholds were distributed over nearly the entire range of IAC change from 0.1 to 2. Shackleton et al. (2005) concluded that a population measure is required to account for behavioural interaural correlation discrimination performance.

Coffey et al. (2006) found that the rICFs of cells in the auditory cortex of unanesthetized rabbits were highly nonlinear. A population measure that adds the progressively increasing rICFs of all peaker neurons and the nonlinearly decreasing rICFs of all trougher units does not allow for an unambiguous identification of the current stimulus correlation, because it would represent the IAC by means of a nonmonotonic overall rate code. In contrast, the opposite dependence of peaker and trougher rICFs on the IAC and the nonlinearity of rICFs in the rabbit’s cortex support the suggestion that the IAC might be represented in the brain as a differential measure regarding two population outputs: a subpopulation of peaker units with an rICF that progressively increases with the IAC might represent the N0 component in the signal. The second subpopulation is thought to correspond to the Nπ component, with a trougher-like rICF that is just opposite to that of the peaker population. The comparison between the outputs of two such nonlinear subpopulation rICFs might be used to obtain a decision statistic which is based either on the ratio of N0 and Nπ powers or on their difference on a dB scale. Such a two-population measure might then account for the linear relationship between LAEP amplitude and dB(N0/Nπ)-transformed IAC, providing an internal representation of IAC from which the major asymmetry emerges.

4.5. Relation between asymmetries and the dB(N0/Nπ) transform

A comparison of the various LAEP features suggests that psychophysical thresholds are estimated best by twice the inverse slope of peak-to-peak growth functions at the electrophysiological threshold. In addition, the deviant correlations ρ_eq for which equal peak-to-peak amplitudes were observed in the two conditions with ρ_ref on either side of ρ_eq provide a measure of IAC sensitivity which is in reasonable agreement with the point of equal supra-threshold discriminability on the cumulative d′-functions by Culling et al. (2001). Hence, the voltage gain per dB(N0/Nπ) might in first approximation correspond to a constant perceptual distance between two correlations by means of d′ in each single condition, at least for |Δρ| < 1. On the dB(N0/Nπ) scale, peak-to-peak amplitudes for all transitions from ρ_ref to ρ_dev are well approximated by linear growth functions with a similar slope of about 0.37 µV/dB(N0/Nπ). It might therefore be proposed that the major asymmetry in peak-to-peak data is sufficiently equalized by the dB(N0/Nπ) transform.

In Section 4.2.1 it has been suggested that internal processing errors in the auditory system cause a decorrelation of the input signal so that a perfectly correlated stimulus is represented in the binaural system with an ‘effective internal IAC’ lower than +1 and thus with a finite value on the dB(N0/Nπ) scale. The close correspondence between the voltage gain per dB(N0/Nπ) and IAC sensitivity and the similarity of fit function slopes on the dB(N0/Nπ) scale allow one to estimate this effective internal IAC for stimuli with ρ = ±1 in the following way: for the conditions +1↓ and 0↑, fit functions have equal peak-to-peak amplitudes for ρ_eq = 0.786, i.e., ρ̃_eq = 9.2 dB(N0/Nπ). If the dB(N0/Nπ) scale compensates for the major asymmetry, ρ̃_eq must have the same distance to the two transformed reference correlations on either side, i.e., zero in the condition 0↑ and ρ̃_ref = 2 ρ̃_eq = 18.4 dB(N0/Nπ) in the condition +1↓. Likewise, the intersection point of peak-to-peak fit functions for the conditions −1↑ and 0↓ suggests
that a signal with ρ = −1 is represented within the auditory system as if its effective internal IAC were about −17.4 dB(N0/Nπ). These estimates for the effective internal IAC of perfectly correlated or antiphasic input signals are reasonable, because they are more extreme than the values of ρ̃_dev at the electrophysiological thresholds in these conditions, 16.6 and −14.2 dB(N0/Nπ), respectively. This suggests that stimuli with correlations of +1 and −1 are represented within the auditory system as if their effective internal IAC had a difference of about 2–3 dB(N0/Nπ) to the electrophysiological threshold, as in the conditions 0↑ and 0↓.

Since constant differences of the transformed measure ρ̃ correspond to equal discriminability on the electrophysiological and perceptual level, the major asymmetry with respect to the normalized ρ might be mainly due to the fact that the inverse slope $(\mathrm{d}\tilde{\rho}/\mathrm{d}\rho)^{-1} = (\ln 10/20)\,(1-\rho^{2})$ is proportional to the variance of sample correlations (Gabriel and Colburn, 1981). In contrast, signal detection theory (SDT) predicts that the discriminability depends on the standard deviation of observations, but not on their variance. Therefore the normalized ρ is unlikely to reflect the internal representation of diffuseness by means of a decision variable. The use of ρ̃, on the other hand, allows one to describe IAC discriminability by equal-variance Gaussian SDT, because, as a scaled Fisher Z-transform of ρ, it normalizes the variance of sample correlations (McNemar, 1955; Breebaart et al., 2001).

However, the dB(N0/Nπ) transform cannot provide a reasonable explanation for the minor asymmetry between the positive and negative range of ρ, simply because $\tilde{\rho}(-\rho) = -\tilde{\rho}(\rho)$. The minor asymmetry could be mediated by different numbers of units in the peaker and trougher subpopulations and the different slopes of typically nonlinear rICFs in the positive and negative range of ρ (Shackleton et al., 2005; Coffey et al., 2006).
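The numerical relations in this section can be checked with a short Python sketch. It assumes the closed form ρ̃ = 10·log10[(1 + ρ)/(1 − ρ)] for the transform; this expression is an assumption in the sense that it is not restated in this section, but it is the integral of the inverse slope quoted above and it reproduces ρ̃_eq = 9.2 dB(N0/Nπ) for ρ_eq = 0.786. All function names are hypothetical.

```python
import math


def rho_to_db(rho: float) -> float:
    """Map a normalized correlation rho in (-1, 1) to dB(N0/Npi) (assumed closed form)."""
    return 10.0 * math.log10((1.0 + rho) / (1.0 - rho))


def db_to_rho(db: float) -> float:
    """Inverse mapping from dB(N0/Npi) back to the normalized correlation."""
    ratio = 10.0 ** (db / 10.0)
    return (ratio - 1.0) / (ratio + 1.0)


def inverse_slope(rho: float) -> float:
    """(d rho_tilde / d rho)^-1 = (ln 10 / 20) * (1 - rho^2), as quoted in the text."""
    return (math.log(10.0) / 20.0) * (1.0 - rho ** 2)


rho_eq = 0.786
rho_tilde_eq = rho_to_db(rho_eq)
print(f"rho_eq = {rho_eq} -> {rho_tilde_eq:.1f} dB(N0/Npi)")

# The transform is a scaled Fisher Z-transform: rho_tilde = (20 / ln 10) * atanh(rho).
assert math.isclose(rho_tilde_eq, 20.0 / math.log(10.0) * math.atanh(rho_eq))

# Odd symmetry: the transform itself cannot produce a minor asymmetry.
assert math.isclose(rho_to_db(-rho_eq), -rho_tilde_eq)

# Normalized correlations implied by the effective internal IAC estimates
# of 18.4 and -17.4 dB(N0/Npi); these derived values are illustrative only.
for db in (18.4, -17.4):
    print(f"{db:+.1f} dB(N0/Npi) -> rho = {db_to_rho(db):+.3f}")
```

Under this assumed closed form, the effective internal correlations corresponding to 18.4 and −17.4 dB(N0/Nπ) come out at roughly +0.97 and −0.96, i.e., slightly inside the nominal range of ±1.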
The minor asymmetry could also be a consequence of the peripheral hair cell transduction, because after half-wave rectification the range of cross correlations at the output of auditory filters is smaller for stimuli with negative ρ than for signals with positive ρ.

van de Par et al. (2001) criticized “the normalization that is typically included in correlation-based models of binaural detection”. Because the N0/Nπ ratio can be directly computed from the stimulus waveforms, as the power ratio between correlated (l + r) and anticorrelated (l − r) signal components, ρ̃ can be extracted from the signal without the need for any normalization, in contrast to the feature extraction mechanisms proposed by many psychophysical models. As a measure which relies on a comparison of N0 and Nπ components, it is consistent with the idea of IAC coding by two populations, peakers and troughers. It is therefore unaffected by binaurally correlated changes of the signal level, in contrast to a rate coding strategy relying on the average output of a single population.

Although the power ratio between correlated (l + r) and anticorrelated (l − r) signal components is insensitive to binaurally correlated envelope fluctuations, the dynamic range of the N0/Nπ ratio is reduced in the presence of an interaural level difference (ILD). Since this ratio depends not only on ρ but also on the ILD, it might also account for the ILD dependence of the ρ-JND reported by Pollack and Trittipoe (1959b).
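The normalization-free extraction described above can be sketched in a few lines of Python. The sketch is only an illustration under stated assumptions: the stimulus is modelled as two Gaussian noises obtained by symmetric mixing of two independent sources (a hypothetical generator, not the stimulus synthesis of the present study), and the helper names are invented for this example.

```python
import math
import random


def make_noise_pair(rho, n, rng):
    """Two unit-variance Gaussian noises with expected correlation rho.

    Assumed generator: symmetric mixing of a common source s and an
    independent source d, so that cov(left, right) = a^2 - b^2 = rho.
    """
    a = math.sqrt((1.0 + rho) / 2.0)
    b = math.sqrt((1.0 - rho) / 2.0)
    left, right = [], []
    for _ in range(n):
        s = rng.gauss(0.0, 1.0)
        d = rng.gauss(0.0, 1.0)
        left.append(a * s + b * d)
        right.append(a * s - b * d)
    return left, right


def n0_npi_db(left, right):
    """Power ratio of the sum (l + r) to the difference (l - r) signal, in dB."""
    p_sum = sum((x + y) ** 2 for x, y in zip(left, right))
    p_diff = sum((x - y) ** 2 for x, y in zip(left, right))
    return 10.0 * math.log10(p_sum / p_diff)


rng = random.Random(1)
rho = 0.5
left, right = make_noise_pair(rho, 20000, rng)

measured_db = n0_npi_db(left, right)
expected_db = 10.0 * math.log10((1.0 + rho) / (1.0 - rho))
print(f"measured {measured_db:.2f} dB, expected about {expected_db:.2f} dB")

# A gain applied to both ears leaves the ratio unchanged (no normalization needed).
gain = 3.0
louder_db = n0_npi_db([gain * x for x in left], [gain * y for y in right])
print(f"with common gain: {louder_db:.2f} dB")

# An ILD limits the ratio even for a perfectly correlated pair:
# with right = g * left, the ratio is ((1 + g) / (1 - g))^2.
ild_db = 6.0
g = 10.0 ** (-ild_db / 20.0)
mono = [rng.gauss(0.0, 1.0) for _ in range(1000)]
ceiling_db = n0_npi_db(mono, [g * x for x in mono])
print(f"ceiling at {ild_db:.0f} dB ILD: {ceiling_db:.2f} dB")
```

The ratio is computed from two running powers only, which is why a common level change cancels, whereas a level difference between the ears leaves residual power in the difference channel and thereby caps the attainable ratio.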
5. Summary

The perceptual thresholds for the detection of IAC transitions in ongoing noise showed the same asymmetries as ρ-JNDs for the discriminability of signals with static IAC from other studies. IAC transitions elicited specific binaural LAEP in the human EEG, consisting of a transient N1-P2 response accompanied by a change of the sustained potential. The transient brain response was substantially influenced by unspecific context information in the stimulus
sequence, having greater impact on the N1 than on the P2. Nevertheless, the LAEP in the present study clearly exhibited IAC-specific features which cannot be explained as a result of unspecific mismatch detection. While the sustained potential was dominated by the current value of ρ itself, the latencies of the transient response showed a systematic dependence on both ρ_ref and Δρ. However, neither the sustained responses nor the LAEP latencies could be consistently interpreted as an electrophysiological correlate of psychophysical IAC sensitivity by means of IAC-specific asymmetries or unspecific perceptual mismatch.

In contrast, the dependencies of LAEP amplitudes and derived quantities on ρ_ref and Δρ were in reasonable agreement with the pattern of psychophysical thresholds and supra-threshold IAC discriminability. The amplitudes of N1 and P2 were mainly determined by the IAC step size |Δρ|, and they were linearly related to the dB(N0/Nπ)-transformed IAC for |Δρ| up to 1. The perceptual IAC sensitivity is best represented by the local derivatives of amplitude growth functions, i.e., the voltage gain per change of the normalized ρ. In the domain of the dB(N0/Nπ)-transformed IAC, the peak-to-peak voltage increased by about 0.37 µV per dB(N0/Nπ), regardless of ρ_ref and ρ_dev. This constant slope closely corresponds to a constant perceptual distance between two correlations. Considering the effect of processing errors in the monaural periphery, electrophysiological thresholds by means of the dB(N0/Nπ)-transformed IAC appear to be equal for all conditions, and they amount to about 3 dB(N0/Nπ).

In conclusion, the dB(N0/Nπ) transform compensates for the major asymmetry regarding differential and cumulative IAC sensitivity and thus allows for a more consistent quantitative description of data from different experiments than the normalized IAC.

Acknowledgements

This work was supported by the Deutsche Forschungsgemeinschaft (DFG, Project KO 942/19-1).
The authors would like to thank Anita Gorges and Frank Grunau for technical assistance and Steve Colburn and several anonymous reviewers for their comments on previous versions of the manuscript.

References

Akeroyd, M.A., Summerfield, A.Q., 1999. A binaural analog of gap detection. J. Acoust. Soc. Am. 105 (5), 2807–2820.
Ando, Y., Kang, S.H., Nagamatsu, H., 1987. On the auditory-evoked potential in relation to the IACC of sound field. J. Acoust. Soc. Jpn. E 8 (5), 183–190.
Beutelmann, R., Brand, T., 2006. Prediction of speech intelligibility in spatial noise and reverberation for normal-hearing and hearing-impaired listeners. J. Acoust. Soc. Am. 120 (1), 331–342.
Boehnke, S.E., Hall, S.E., Marquardt, T., 2002. Detection of static and dynamic changes in interaural correlation. J. Acoust. Soc. Am. 112 (4), 1617–1626.
Breebaart, J., van de Par, S., Kohlrausch, A., 2001. Binaural processing model based on contralateral inhibition: I. Model structure. J. Acoust. Soc. Am. 110 (2), 1074–1088.
Bronkhorst, A.W., 2000. The cocktail party phenomenon: a review of research on speech intelligibility in multiple talker conditions. Acta Acust. United Acust. 86, 117–128.
Budd, T.W., Hall, D.A., Goncalves, M.S., Akeroyd, M.A., Foster, J.R., Palmer, A.R., Head, K., Summerfield, A.Q., 2003. Binaural specialisation in human auditory cortex: an fMRI investigation of interaural correlation sensitivity. Neuroimage 20 (3), 1783–1794.
Chait, M., Poeppel, D., de Cheveigne, A., Simon, J.Z., 2005. Human auditory cortical processing of changes in interaural correlation. J. Neurosci. 25 (37), 8518–8527.
Chait, M., Poeppel, D., Simon, J.Z., 2007. Stimulus context affects auditory cortical responses to changes in interaural correlation. J. Neurophysiol. 98, 224–231.
Coffey, C.S., Ebert Jr., C.S., Marshall, A.F., Skaggs, J.D., Falk, S.E., Crocker, W.D., Pearson, J.M., Fitzpatrick, D.C., 2006. Detection of interaural correlation by neurons in the superior olivary complex, inferior colliculus and auditory cortex of the unanesthetized rabbit. Hear. Res. 221 (1–2), 1–16.
Colburn, H.S., 1995. Computational models of binaural processing. In: Hawkins, H.L., McMullen, T.A., Popper, A.N., Fay, R.R. (Eds.), Springer Handbook of Auditory Research, vol. 6, Auditory Computation. Springer, New York, pp. 332–400.
Culling, J.F., Colburn, H.S., Spurchise, M., 2001. Interaural correlation sensitivity. J. Acoust. Soc. Am. 110 (2), 1020–1029.
Culling, J.F., Hodder, K.I., Colburn, H.S., 2003. Interaural correlation discrimination with spectrally-remote flanking noise: constraints for models of binaural unmasking. Acta Acust. United Acust. 89, 1049–1058.
Dajani, H.R., Picton, T.W., 2006. Human auditory steady-state responses to changes in interaural correlation. Hear. Res. 219 (1–2), 85–100.
Damaschke, J., Riedel, H., Kollmeier, B., 2005. Neural correlates of the precedence effect in auditory evoked potentials. Hear. Res. 205 (1–2), 157–171.
Dobie, R.A., Berlin, C.I., 1979. Binaural interaction in brainstem-evoked responses. Arch. Otolaryngol. 105 (7), 391–398.
Durlach, N.I., Gabriel, K.J., Colburn, H.S., Trahiotis, C., 1986. Interaural correlation discrimination: II. Relation to binaural unmasking. J. Acoust. Soc. Am. 79 (5), 1548–1557.
Faller, C., Merimaa, J., 2004. Source localization in complex listening situations: Selection of binaural cues based on interaural coherence. J. Acoust. Soc. Am. 116 (5), 3075–3089.
Furst, M., Levine, R.A., McGaffigan, P.M., 1985. Click lateralization is related to the beta component of the dichotic brainstem auditory evoked potentials of human subjects. J. Acoust. Soc. Am. 78 (5), 1644–1651.
Gabriel, K.J., Colburn, H.S., 1981. Interaural correlation discrimination: I. Bandwidth and level dependence. J. Acoust. Soc. Am. 69 (5), 1394–1401.
Grantham, D.W., 1982. Detectability of time-varying interaural correlation in narrow-band noise stimuli. J. Acoust. Soc. Am. 72 (4), 1178–1184.
Halliday, R., Callaway, E., 1978. Time shift evoked potentials (TSEPs): method and basic results. Electroencephalogr. Clin. Neurophysiol. 45 (1), 118–121.
Hoke, M., Ross, B., Wickesberg, R., Lütkenhöner, B., 1984. Weighted averaging – theory and application to electric response audiometry. Electroencephalogr. Clin. Neurophysiol. 57 (5), 484–489.
Jain, M., Gallagher, D.T., Koehnke, J., Colburn, H.S., 1991. Fringed correlation discrimination and binaural detection. J. Acoust. Soc. Am. 90 (4 Pt. 1), 1918–1926.
Jasper, H.H., 1957. The ten twenty electrode system of the international federation. Electroencephalogr. Clin. Neurophysiol. 10 (appendix), 371–375.
Johnson, B.W., Hautus, M., Clapp, W.C., 2003. Neural activity associated with binaural processes for the perceptual segregation of pitch. Clin. Neurophysiol. 114 (12), 2245–2250.
Jones, S.J., 1991. Memory-dependent auditory evoked potentials to change in the binaural interaction of noise signals. Electroencephalogr. Clin. Neurophysiol. 80 (5), 399–405.
Jones, S.J., Pitman, J.R., Halliday, A.M., 1991. Scalp potentials following sudden coherence and discoherence of binaural noise and change in the inter-aural time difference: a specific binaural evoked potential or a “mismatch” response? Electroencephalogr. Clin. Neurophysiol. 80 (2), 146–154.
Junius, D., Riedel, H., Kollmeier, B., 2007. The influence of externalization and spatial cues on the generation of auditory brainstem responses and middle latency responses. Hear. Res. 225 (1–2), 91–104.
Koehnke, J., Colburn, H.S., Durlach, N.I., 1986. Performance in several binaural-interaction experiments. J. Acoust. Soc. Am. 79 (5), 1558–1562.
Kollmeier, B., Gilkey, R.H., 1990. Binaural forward and backward masking: evidence for sluggishness in binaural detection. J. Acoust. Soc. Am. 87 (4), 1709–1719.
Lammertmann, C., Lütkenhöner, B., 2001. Near-DC magnetic fields following a periodic presentation of long-duration tonebursts. Clin. Neurophysiol. 112 (3), 499–513.
Levitt, H., 1971. Transformed up–down methods in psychoacoustics. J. Acoust. Soc. Am. 49 (2 Pt. 2), 467–477.
Lüddemann, H., Riedel, H., Kollmeier, B., 2007. Logarithmic scaling of interaural cross correlation: a model based on evidence from psychophysics and EEG. In: Kollmeier, B., Klump, G., Hohmann, V., Langemann, U., Mauermann, M., Uppenkamp, S., Verhey, J. (Eds.), Hearing: from Sensory Processing to Perception – 14th International Symposium on Hearing. Springer, Berlin.
McEvoy, L.K., Picton, T.W., Champagne, S.C., 1991a. Effects of stimulus parameters on human evoked potentials to shifts in the lateralization of a noise. Audiology 30 (5), 286–302.
McEvoy, L.K., Picton, T.W., Champagne, S.C., 1991b. The timing of the processes underlying lateralization: psychophysical and evoked potential measures. Ear. Hear. 12 (6), 389–398.
McNemar, Q., 1955. Psychological Statistics. John Wiley & Sons Inc., New York. p. 147.
Moore, B.C., Glasberg, B.R., Plack, C.J., Biswas, A.K., 1988. The shape of the ear’s temporal window. J. Acoust. Soc. Am. 83 (3), 1102–1116.
Näätänen, R., Picton, T.W., 1987. The N1 wave of the human electric and magnetic response to sound: a review and an analysis of component structure. Psychophysiology 24 (4), 375–425.
Nix, J., Hohmann, V., 2006. Sound source localization in real sound fields based on empirical statistics of interaural parameters. J. Acoust. Soc. Am. 119 (1), 463–479.
Nix, J., Hohmann, V., 2007. Combined estimation of spectral envelopes and sound source direction of concurrent voices by multidimensional statistical filtering. IEEE Trans. Signal. Process. 15 (3), 995–1008.
Picton, T.W., Woods, D.L., Proulx, G.B., 1978. Human auditory sustained potentials: I. The nature of the response. Electroencephalogr. Clin. Neurophysiol. 45 (2), 186–197.
Pollack, I., Trittipoe, W.J., 1959a. Binaural listening and interaural noise cross correlation. J. Acoust. Soc. Am. 31 (9), 1250–1252.
Pollack, I., Trittipoe, W.J., 1959b. Interaural noise correlation: examination of variables. J. Acoust. Soc. Am. 31 (12), 1616–1618.
Press, W.H., Teukolsky, S.A., Vetterling, W.T., Flannery, B.P., 1992. Numerical Recipes in C: The Art of Scientific Computing, second ed. Cambridge University Press, Cambridge, New York, Port Chester, Melbourne, Sydney. Chapter 15, “Modeling of Data”.
Rayleigh, L., 1907. On our perception of sound direction. Philos. Mag. 13 (1), 214–232.
Riedel, H., Granzow, M., Kollmeier, B., 2001. Single-sweep-based methods to improve the quality of auditory brain stem responses: Part II. Averaging methods. Z. Audiol. 40 (2), 62–85.
Riedel, H., Kollmeier, B., 2002a. Auditory brain stem responses evoked by lateralized clicks: is lateralization extracted in the human brain stem? Hear. Res. 163 (1–2), 12–26.
Riedel, H., Kollmeier, B., 2002b. Comparison of binaural auditory brainstem responses and the binaural difference potential evoked by chirps and clicks. Hear. Res. 169 (1–2), 85–96.
Riedel, H., Kollmeier, B., 2006. Interaural delay-dependent changes in the binaural difference potential of the human auditory brain stem response. Hear. Res. 218 (1–2), 5–19.
Ross, B., Tremblay, K.L., Picton, T.W., 2007. Physiological detection of interaural phase differences. J. Acoust. Soc. Am. 121 (2), 1017–1027.
Saberi, K., Takahashi, Y., Konishi, M., Albeck, Y., Arthur, B.J., Farahbod, H., 1998. Effects of interaural decorrelation on neural and behavioral detection of spatial cues. Neuron 21 (4), 789–798.
Schröger, E., 1996. Interaural time and level differences: integrated or separated processing? Hear. Res. 96 (1–2), 191–198.
Schröger, E., Wolff, C., 1996. Mismatch response of the human brain to changes in sound location. Neuroreport 7 (18), 3005–3008.
Shackleton, T.M., Arnott, R.H., Palmer, A.R., 2005. Sensitivity to interaural correlation of single neurons in the inferior colliculus of guinea pigs. J. Assoc. Res. Otolaryngol. 6 (3), 244–259.
Sharbrough, F., Chatrian, G.E., Lesser, R.P., Lüders, H., Nuwer, M., Picton, T.W., 1991. American electroencephalographic society guidelines for standard electrode position nomenclature. J. Clin. Neurophysiol. 8 (2), 200–202.
Soeta, Y., Hotehama, T., Nakagawa, S., Tonoike, M., Ando, Y., 2004. Auditory evoked magnetic fields in relation to interaural cross-correlation of band-pass noise. Hear. Res. 196 (1–2), 109–114.
Sonnadara, R.R., Alain, C., Trainor, L.J., 2006. Effects of spatial separation and stimulus probability on the event-related potentials elicited by occasional changes in sound location. Brain Res. 1071 (1), 175–185.
Trahiotis, C., Bernstein, L.R., Akeroyd, M.A., 2001. Manipulating the “straightness” and “curvature” of patterns of interaural cross correlation affects listeners’ sensitivity to changes in interaural delay. J. Acoust. Soc. Am. 109 (1), 321–330.
van de Par, S., Trahiotis, C., Bernstein, L.R., 2001. A consideration of the normalization that is typically included in correlation-based models of binaural detection. J. Acoust. Soc. Am. 109 (2), 830–833.
van der Heijden, M., Trahiotis, C., 1997. A new way to account for binaural detection as a function of interaural noise correlation. J. Acoust. Soc. Am. 101 (2), 1019–1022.
Yost, W.A., Gourevitch, G., 1987. Directional Hearing. New York.