brain research 1490 (2013) 153–160
Available online at www.sciencedirect.com
www.elsevier.com/locate/brainres
Research Report
Involuntary attentional capture by speech and non-speech deviations: A combined behavioral–event-related potential study
M. Reiche a,*, G. Hartwigsen b, A. Widmann a, D. Saur b, E. Schröger a, A. Bendixen a
a Department for Psychology, University of Leipzig, Seeburgstr. 14-20, D-04103 Leipzig, Germany
b Language and Aphasia Laboratory, Department of Neurology, University of Leipzig, Liebigstr. 20, D-04103 Leipzig, Germany
Article info
Article history:
Accepted 26 October 2012
Available online 1 November 2012
Keywords:
Distraction paradigm
Auditory attention
Linguistic material
Sine tones
Frequency deviant
Phoneme deviant
Event-related potential (ERP)
MMN
N1
P3a
Response time
Error rate

Abstract
This study applied an auditory distraction paradigm to investigate involuntary attention effects of unexpected deviations in speech and non-speech sounds on behavior (increase in response time and error rate) and event-related brain potentials (ΔN1/MMN and P3a). Our aim was to systematically compare identical speech sounds with physical vs. linguistic deviations and identical deviations (pitch) with speech vs. non-speech sounds in the same set of healthy volunteers. Sine tones and bi-syllabic pseudo-words were presented in a 2-alternative forced-choice paradigm with occasional phoneme deviants in pseudo-words, pitch deviants in pseudo-words, or pitch deviants in tones. Deviance-related ERP components were elicited in all conditions. Deviance-related negativities (ΔN1/MMN) differed in scalp distribution between phoneme and pitch deviants within phonemes, indicating that auditory deviance-detection partly operates in a deviance-specific manner. P3a as an indicator of attentional orienting was similar in all conditions, and was accompanied by behavioral indicators of distraction. Yet smaller behavioral effects and prolonged relative MMN–P3a latency were observed for pitch deviants within phonemes relative to the other two conditions. This suggests that the similarity and separability of task-relevant and task-irrelevant information is essential for the extent of attentional capture and distraction.
© 2012 Elsevier B.V. All rights reserved.
1. Introduction
* Correspondence to: Institut für Psychologie, Universität Leipzig, Seeburgstr. 14-20, 04103 Leipzig, Germany. Fax: +49 341 9735969.
E-mail addresses: [email protected] (M. Reiche), [email protected] (G. Hartwigsen), widmann@uni-leipzig.de (A. Widmann), [email protected] (D. Saur), [email protected] (E. Schröger), alexandra.bendixen@uni-leipzig.de (A. Bendixen).
0006-8993/$ - see front matter © 2012 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.brainres.2012.10.055

Previous studies have demonstrated that unexpected, task-irrelevant sounds in our environment can involuntarily attract our attention and may thus impair the processing of task-relevant stimulus information (Escera et al., 1998; Grillon et al., 1990; Schröger and Wolff, 1998). This process, often referred to as auditory distraction, can result in specific behavioral and electrophysiological consequences such as performance deterioration and elicitation of certain components of the event-related brain potential (ERP). Involuntary attention has been previously associated with different cognitive subprocesses including change detection and attentional orienting. These processes are reflected in distinct ERP
components. The Mismatch Negativity (MMN) is a negative deflection of the ERP with a peak latency of approximately 150 ms and a frontocentral scalp distribution that can be elicited by an irregular stimulus in a homogeneous series irrespective of whether the stimuli are attended or not (Näätänen et al., 1978; for reviews, see e.g. Kujala et al., 2007; Näätänen et al., 2011). MMN is assumed to reflect a mechanism that automatically detects unexpected stimulus characteristics by comparing a representation of the current stimulus with regular characteristics extracted from preceding stimuli (Winkler, 2007). In repetitive sound sequences, change detection can also be accomplished by simpler mechanisms based on differential adaptation to the frequently and the infrequently presented stimuli, indicated by an increase in the auditory N1 component (hereafter referred to as ΔN1) (see Horváth et al. (2008), for a detailed discussion). Another ERP component often observed within the context of auditory distraction is the P3a, which peaks at around 300 ms and is elicited by large deviant or novel sounds (Squires et al., 1975). P3a is often interpreted as an indicator of an attention switch towards an unexpected stimulus (Escera et al., 2000, 2001; Snyder and Hillyard, 1976).

It has been argued that task-irrelevant information is more likely to interfere with the processing of task-relevant information when the two types of information are difficult to separate (Garner, 1974) than when they are parts of different stimulus streams or occur in different spatial locations. To obtain distractive effects on attention, task-irrelevant
stimulus characteristics need to pass an attentional filter (Schröger, 1997) that depends on the type of task-relevant information. For a given amount of deviation, a distractor that is not part of the task-relevant object may fail to pass the attentional filter, while a distractor that is part of the task-relevant object would succeed. In other words, deviant stimuli are more likely to cause distraction if the channel separation between task-relevant and task-irrelevant information is minimized (Broadbent, 1971; Schröger and Wolff, 1998). Making use of this relation has led to the development of the auditory distraction paradigm by Schröger and Wolff (1998). In the original paradigm, subjects are asked to distinguish between long and short tones via button presses. Occasionally, these tones slightly change in frequency. Although the frequency change itself is irrelevant to the duration discrimination task, it can lead to reliable behavioral and electrophysiological distraction effects (Schröger et al., 2000), because task-relevant and task-irrelevant information are embedded into a single sound (object) and distracting information thus cannot easily be ignored.

Previous studies demonstrated that electrophysiological correlates of auditory distraction can also be elicited within speech stimuli (Grimm et al., 2008). However, no systematic comparison between speech and tonal material has been performed within an auditory distraction paradigm. Deviance detection mechanisms in speech processing are known to be different from those in tonal material. MMN for physical deviations in linguistic material has been reported to be diminished or
Fig. 1 – Experimental design for the three conditions.
even absent, which is often attributed to the categorical nature of speech perception (Maiste et al., 1995; Tampas et al., 2005). This explanation suggests that phonemes are readily categorized as abstract entities rather than being processed according to their exact physical features. Moreover, even if a physical deviation (e.g., in pitch) is detected in linguistic material, it might more easily be filtered out by attentional processes as it is more distant from the relevant (linguistic) information. Consequently, it remains unclear whether similar effects of attentional capture and distraction can be obtained with linguistic relative to tonal material.

Our study aimed at investigating distraction in the processing of speech sounds and sinusoidal tones by carefully controlling (a) the distance/domain of distracting information and task information (speech vs. non-speech) and (b) the type of deviation (phoneme vs. pitch). To accomplish this, we used an adapted version of the auditory distraction paradigm by Schröger and Wolff (1998) that included both the original manipulation of tonal material (containing occasional pitch deviations) and an additional manipulation of linguistic material (one condition with occasional pitch deviations within speech material, another condition with occasional phoneme deviations within speech material) (cf. Fig. 1). Importantly, the stimulus material in all three conditions was designed such that the task-irrelevant information commenced at stimulus onset, while the task-relevant information was presented 250 ms after stimulus onset. We expected behavioral distraction effects (increased response times and error rates) caused by task-irrelevant information for all conditions because in all conditions, task-irrelevant information is embedded in the same perceptual object as task-relevant material.
Phoneme deviants within speech material are more similar to the task-relevant information as compared to pitch deviants within speech material. Consequently, phoneme deviants should be more likely to interfere with the processing of task-relevant information and yield larger distraction effects than pitch deviants. Based on previous results (e.g. Grimm et al. (2008)), we expected that both pitch deviation and phoneme deviation within the pseudo-words would elicit electrophysiological correlates of distraction showing up as DN1/MMN and P3a. Due to differences in the processing of the presented material (categorical speech perception vs. feature analysis of sine tones), deviance detection should differ between the conditions. Specifically, detection of pitch deviation should be diminished in pseudo-words as compared to sine tones due to categorical speech perception. Additionally, deviance-related negativities should differ in type and location between phoneme and pitch deviations in speech material because phoneme deviations are more likely to interfere with the task-relevant material.
2. Results
2.1. Behavioral data
Behavioral results are displayed in Fig. 2. The expected error rate increase for deviant relative to standard stimuli (main effect of Stimulus type, F(1,16) = 18.77, p = 0.001) was modulated by condition as reflected in the significant interaction of Stimulus type × Condition [F(2,32) = 5.21, p = 0.011]. Accordingly, post-hoc paired t-tests between deviant and standard error rates revealed significantly higher error rates in response to deviants in the phoneme–phoneme condition [t(16) = 3.83, p = 0.001] and the tone–pitch condition [t(16) = 2.83, p = 0.012] but not for the phoneme–pitch condition [t(16) = 1.17, p = 0.26].

The expected increase in response times for deviant relative to standard stimuli (main effect of Stimulus type, F(1,16) = 19.00, p = 0.0005) was again modulated by condition as reflected in the significant interaction of Stimulus type × Condition [F(2,32) = 3.95, p = 0.029]. Response times were significantly longer in response to deviants compared to standards in all three conditions [phoneme–phoneme: t(16) = 4.67, p = 0.0003; phoneme–pitch: t(16) = 3.06, p = 0.008; tone–pitch: t(16) = 2.51, p = 0.023]. Post-hoc pair-wise comparisons between conditions were conducted to explore the effect of Condition on the amount of distraction measured by response times (difference between deviant and standard response times). The observed distraction effects were largest in the phoneme–phoneme condition [phoneme–phoneme vs. phoneme–pitch: t(16) = 2.43, p = 0.0270; phoneme–phoneme vs. tone–pitch: t(16) = 2.20, p = 0.04294; note that these differences fail to reach significance with Bonferroni correction for multiple comparisons]. In contrast, there were no significant differences in the distraction effect between the phoneme–pitch and the tone–pitch condition [t(16) = 0.5681, p = 0.5779].

Fig. 2 – Behavioral results. Mean error rates and response times for the different stimulus types and conditions. Error bars represent standard error of mean. Significant standard-deviant differences (i.e., distraction effects) are marked by asterisks (*p < 0.05; two-tailed).
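The note on Bonferroni correction in the pair-wise comparisons can be made concrete with a short check. The sketch below is illustrative only: it assumes that the three between-condition comparisons form the family of tests and that the family-wise alpha is 0.05 (neither is stated explicitly in the text), and it uses the uncorrected two-tailed p-values reported above. The function name `bonferroni` is a hypothetical helper.

```python
# Uncorrected two-tailed p-values of the pair-wise comparisons of the
# response-time distraction effect, as reported in the text.
p_values = {
    "phoneme-phoneme vs. phoneme-pitch": 0.0270,
    "phoneme-phoneme vs. tone-pitch": 0.04294,
    "phoneme-pitch vs. tone-pitch": 0.5779,
}

ALPHA = 0.05  # assumed family-wise error rate (not stated in the text)


def bonferroni(p_values, alpha=ALPHA):
    """Compare each p-value against alpha divided by the number of tests."""
    threshold = alpha / len(p_values)
    decisions = {name: p < threshold for name, p in p_values.items()}
    return threshold, decisions


threshold, significant = bonferroni(p_values)
print(f"Corrected threshold: {threshold:.4f}")
for name, p in p_values.items():
    print(f"{name}: p = {p}, significant after correction: {significant[name]}")
```

Under these assumptions the corrected threshold is 0.05 / 3, so none of the three comparisons survives the correction, in line with the note in the text.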
2.2. ERP data
ERPs and scalp potential maps are shown in Fig. 3. Component latencies of the MMN and P3a as well as difference wave mean amplitudes are presented in Table 1. The RMANOVA with the factors Condition, Frontal–mastoidal and Hemisphere for the difference waves in the MMN time window yielded a main effect of Frontal–mastoidal [F(1,16) = 24.31, p = 0.00015], indicating significant MMN components in all conditions because MMN usually inverts polarity at the mastoids (Schröger, 2005).
An additional Condition × Hemisphere interaction [F(2,32) = 4.56, p = 0.018] indicated that MMN lateralization varied between conditions. Two-tailed post hoc paired t-tests revealed a right-lateralized MMN component for the phoneme–pitch condition [t(16) = 3.27, p = 0.005]. No Hemisphere effects were obtained in the phoneme–phoneme condition [t(16) = 0.62, p = 0.54] and in the tone–pitch condition [t(16) = 0.14, p = 0.89]. Apart from a main effect of Hemisphere [F(1,16) = 6.20, p = 0.024], no further significant effects were observed in this RMANOVA (all p > 0.09).

Significant P3a components were elicited in all conditions (all p values < 0.05). The RMANOVA with the factors Condition and Hemisphere for the P3a revealed a significant Condition × Hemisphere interaction [F(2,32) = 3.78, p = 0.034], but no main effects of Condition [F(2,32) = 0.80, p = 0.459] or Hemisphere [F(1,16) = 0.28, p = 0.605]. To further investigate the interaction, an additional RMANOVA with the factor Hemisphere (2 levels: left vs. right) obtained from difference waves on electrode positions FC1 and FC2 was calculated for each condition. Effects of Hemisphere were observed in the phoneme–phoneme condition [F(1,16) = 5.182, p = 0.037], indicating a right-lateralized P3a. No Hemisphere effects were observed for the phoneme–pitch [F(1,16) = 3.472, p = 0.081] and tone–pitch conditions [F(1,16) = 0.307, p = 0.587].

For comparing ERP component latencies between conditions, we did not analyze absolute latencies because the comparison would have been biased, given that the task-irrelevant deviant information could not be extracted at stimulus onset in the phoneme–pitch condition. Therefore, we analyzed the relative latency difference between the components. The MMN–P3a delay was significantly modulated by Condition as revealed by the corresponding RMANOVA [F(2,32) = 5.61, p = 0.008]. Post hoc paired t-tests revealed a significantly delayed P3a relative to MMN in response to deviants in the phoneme–pitch condition compared to the phoneme–phoneme condition [t(16) = 3.00, p = 0.004; Bonferroni-corrected] and compared to the tone–pitch condition [t(16) = 2.82, p = 0.006; Bonferroni-corrected]. No difference was observed between the phoneme–phoneme and tone–pitch conditions [t(16) = 0.13, p = 0.45].

Fig. 3 – ERP results. Left: Grand-average ERPs of standard, deviant, and deviant-minus-standard difference waves of all three conditions at electrode positions FC1, FC2 and at the mastoids (M1, M2). Right: Scalp potential maps of the MMN and P3a components in the respective time window for all conditions.

Table 1 – Difference wave mean amplitudes and standard errors of mean for the MMN component at FC1, FC2, left and right mastoid and for the P3a component at FC1 and FC2 (in μV) and component latencies of the difference waves (in ms), determined with a relative criterion of 80% of the peak amplitude on FCz. Standard errors are given in parentheses.

                     Phoneme–phoneme   Phoneme–pitch    Tone–pitch
Amplitude (μV)
  MMN   FC1          0.919 (0.31)      0.521 (0.44)     1.619 (0.34)
  MMN   FC2          0.837 (0.31)      0.995 (0.47)     1.642 (0.32)
  MMN   M1           0.191 (0.24)      0.712 (0.34)     0.362 (0.26)
  MMN   M2           0.093 (0.26)      0.126 (0.28)     0.078 (0.29)
  P3a   FC1          1.388 (0.40)      2.120 (0.48)     2.202 (0.78)
  P3a   FC2          1.728 (0.32)      1.838 (0.44)     2.316 (0.73)
Latency (ms)
  MMN                103.18 (0.39)     177.88 (0.32)    125.29 (0.73)
  P3a                279.06 (0.86)     405.76 (0.45)    303.53 (0.56)
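The relative MMN–P3a delay analyzed above can be illustrated with the group-level latencies listed in Table 1. This is only a descriptive sketch: the assignment of latency values to conditions follows our reading of Table 1, and the reported statistics were computed on jackknifed latencies rather than on these group values. The helper name `mmn_p3a_delay` is hypothetical.

```python
# Group-level component latencies in ms, taken from Table 1
# (assignment of values to conditions follows our reading of the table).
latencies_ms = {
    "phoneme-phoneme": {"MMN": 103.18, "P3a": 279.06},
    "phoneme-pitch": {"MMN": 177.88, "P3a": 405.76},
    "tone-pitch": {"MMN": 125.29, "P3a": 303.53},
}


def mmn_p3a_delay(latency):
    """Delay of the P3a relative to the MMN, rounded to 0.01 ms."""
    return round(latency["P3a"] - latency["MMN"], 2)


delays_ms = {cond: mmn_p3a_delay(lat) for cond, lat in latencies_ms.items()}
for cond, delay in delays_ms.items():
    print(f"{cond}: MMN-P3a delay = {delay} ms")
```

With these values the delay is about 176 ms and 178 ms in the phoneme–phoneme and tone–pitch conditions, but about 228 ms in the phoneme–pitch condition, which matches the direction of the reported effect.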
3. Discussion
The present study systematically compared auditory distraction in tonal and linguistic stimuli as a function of stimulus material (sine tones vs. pseudo-words) and deviating stimulus dimension (pitch vs. phoneme), thus controlling for channel separation of task-relevant and task-irrelevant information. We found signs of behavioral distraction (higher error rates or slower response times in response to deviants compared to standards) in all conditions. However, behavioral distraction effects were weakest in the phoneme–pitch condition. MMN and P3a components of equal amplitude were elicited in all conditions, yet relative MMN–P3a latency was prolonged in the phoneme–pitch condition. On the ERP level, MMN-like components were elicited in response to task-irrelevant stimulus deviations in all conditions. However, morphology of the deviance-related negativities was markedly different between conditions. In the tone–pitch condition, an MMN with typical frontocentral
scalp distribution was observed as reported previously (e.g., Schröger and Wolff, 1998; Schröger et al., 2000). In contrast, MMN showed a strong right-hemispheric dominance in the phoneme–pitch condition (cf. Fig. 3). This is consistent with previous observations of within-category deviations in phonemes (Xi et al., 2010). The most likely explanation for this finding is that the pitch change within the phonemes was processed as a prosodic cue. For instance, previous functional magnetic resonance imaging studies have revealed larger hemodynamic activation in perisylvian areas of the right hemisphere for prosodic information (e.g. Meyer et al., 2002). In our study, MMN peaked clearly after the N1 component in both pitch deviation conditions (phoneme–pitch, tone–pitch), and could thus be separated in time from adaptation-based effects on N1 (ΔN1; Horváth et al., 2008). In contrast, in the phoneme–phoneme condition, MMN overlapped the N1 component in time, and MMN topography resembled N1 topography. Therefore, it is plausible to assume that the observed peak in the present phoneme–phoneme condition partly reflects an N1 amplitude difference (ΔN1) rather than an MMN component. This indicates that information about the deviation was rapidly available based on adaptation differences (Horváth et al., 2008). In sum, and consistent with previous studies on categorical speech perception (Maiste et al., 1995; Tampas et al., 2005; Winkler et al., 1999; Xi et al., 2010), fundamentally different modes of deviance detection were employed and observed in the three conditions of the present study. This confirms the notion that deviance detection mechanisms depend on stimulus material and dimension of deviance (cf., e.g. Giard et al. (1995)).

Rinne et al.
(2006) showed that different modes of deviance detection within tonal material (only MMN for intensity decrements, both ΔN1 and MMN for intensity increments) can lead to similar attentional consequences (measured by P3a elicitation). In line with this observation, morphologically similar P3a components were elicited in all three conditions of the present study. Although some differences in P3a lateralization were observed, these were considerably less pronounced than the morphological differences in the deviance-related negativities. P3a is assumed to reflect an involuntary switch of attention towards the deviant, indicating that the deviant might be of potential significance (Escera et al., 2000, 2001; Escera and Corral, 2007). The attention
switch is related to the processing cost for deviants observed at the behavioral level. An alternative but basically compatible interpretation of P3a is that it might reflect an activation of the locus coeruleus–norepinephrine (LC-NE) system by salient, potentially motivationally relevant stimuli, i.e., a CNS counterpart of the orienting response binding processing resources (Nieuwenhuis et al., 2005, 2011; Sokolov, 1975). Analyses of the delay between MMN and P3a showed a significantly later P3a relative to MMN for pitch deviants in the phoneme discrimination task, indicating that the attention switch occurred later in the phoneme–pitch condition as compared to the other two conditions.

On the behavioral level, we found distraction effects (increased error rates and prolonged response times for deviants compared to standards) comparable with those reported in previous studies (Grimm et al., 2008; Schröger and Wolff, 1998). This suggests that despite fundamentally different modes of deviance detection, similar electrophysiological and behavioral markers of distraction can be obtained for linguistic and non-linguistic material. On the other hand, distraction was modulated by the interplay of deviance type and stimulus material. Both error rate increase and response time prolongation were reduced, and relative MMN–P3a latency was prolonged, for pitch deviants in linguistic material compared to phoneme deviants in linguistic material and compared to pitch deviants of equivalent size in non-linguistic material. The smaller impact of pitch deviants in linguistic material can be explained in the framework of categorical speech perception (Maiste et al., 1995; Tampas et al., 2005). According to this framework, phonemes are processed in an abstract manner rather than by their exact physical features. A physical deviation (in this case, pitch deviation) is therefore more distant from the relevant (linguistic) information, and hence less likely to capture attention.
Our findings thus support the notion that channel separation (i.e., dissimilarity of task-relevant and task-irrelevant information) is an important factor in determining attentional and behavioral consequences of task-irrelevant deviations. In order to exploit the full potential of the one-stimulus oddball distraction paradigm in future studies, we recommend that interfering task-relevant and task-irrelevant stimulus dimensions also be chosen for linguistic material.

Aside from these considerations on the mechanisms underlying attentional capture and distraction, the present study provides a methodological advance in the systematic comparison of distraction with different types of auditory stimuli, and may thus prove useful for investigating attention processes in other contexts, such as in the characterization of specific patient groups' auditory deficits. For instance, it is still a matter of debate whether dyslexia is primarily characterized by dysfunctional processing of speech stimuli or of acoustic stimuli in general. According to Schulte-Körne and Bruder (2010), the main reason for this discordance in the literature is a lack of methodological approaches to systematically compare speech and non-speech sounds on the level of MMN. A method that allows such a comparison would also be beneficial to further illuminate pre-attentive processes in children and older people (Horváth et al., 2009; Wetzel and Schröger, 2007) as well as in patients with schizophrenia (Strelnikov, 2010) or unilateral neglect (Deouell et al., 2000). As we obtained significant
distraction effects in all three conditions of the present experiment, we conclude that this provides a useful baseline for the comparison with patient groups with specific impairments in one of the underlying processes. If one aims at obtaining equally strong distraction effects for all conditions with this paradigm, we suggest omitting the physical matching and instead adjusting the paradigm according to the effect sizes (e.g., presenting task-relevant information in the phoneme–pitch condition later, such that the task-irrelevant information can reveal its full distractive potential).

In conclusion, this is the first study to systematically compare speech and non-speech sounds in the one-stimulus auditory oddball distraction paradigm within the same group of healthy volunteers. Our results show that the adapted distraction paradigm provides a valid means to investigate processes of involuntary attention and distraction in healthy subjects. Despite fundamental differences in deviance detection as reflected by deviance-specific ΔN1 and MMN, equivalent indicators of distraction were observed for linguistic and non-linguistic stimulus material. All deviations elicited a P3a component, the suggested cognitive counterpart of the orienting response, and all deviations led to costs for the primary task in terms of speed and accuracy. However, distraction effects were modulated by the interplay of deviance type and stimulus material: pitch deviants in linguistic material caused smaller distraction effects than pitch deviants in tonal material, and also smaller distraction effects than phoneme deviants in linguistic material. This indicates that distractor-task separation is the relevant parameter that must be controlled for when transferring the distraction paradigm to other material types.
In future studies, this paradigm may be applied to investigate the electrophysiological correlates of involuntary attention and distraction processes in patient groups with deficiencies related to speech and language.
4. Experimental procedures
Eighteen paid, healthy subjects participated in the experiment. One subject was rejected from further analyses due to technical problems during the recording, leading to 50% data loss in one condition. The remaining 17 participants (8 female; 20–27 years old, mean age: 24 years) gave written consent according to the Declaration of Helsinki after being informed about the nature of the experiment. All subjects were native German speakers and right-handed (laterality index > 95%) according to a German version of the Edinburgh Handedness Inventory (Oldfield, 1971). Subjects were comfortably seated in a test room in front of an LCD display.

Sine tones or bi-syllabic pseudo-words, which were equally meaningless for all subjects, were presented in a 2-alternative forced-choice paradigm (2AFC) via Neuroscan air tube insert earphones (10 Ω) in three experimental conditions. In each condition, a set of 4 stimuli was used that differed in two stimulus dimensions. In the task-relevant dimension, stimulus pairs were equiprobable, while they were distributed in a 4:1 ratio in the task-irrelevant dimension. Subjects had to distinguish the stimuli in the task-relevant dimension (phoneme identity or tone duration) via mouse button presses with the left and right index fingers
(e.g., left button press for short tones and right button press for long tones). Stimulus onset asynchrony (SOA) varied between 1250 ms and 1550 ms in equiprobable 12.5 ms steps for each condition. The task-relevant information was available after 250 ms in each of the three conditions.

In the phoneme–phoneme condition, stimuli consisted of pseudo-words composed of concatenated naturally spoken syllables (MI-TA, GI-TA, MI-TO, GI-TO; duration MI/GI: 250 ms, TA/TO: 150 ms, always adding up to 400 ms). Task-relevant information was the phoneme of the second syllable (TA vs. TO; p = 0.5), while the task-irrelevant deviation was the phoneme of the first syllable (i.e. "GI" instead of "MI" or vice versa; p = 0.8 and p = 0.2). In the phoneme–pitch condition, two pseudo-words (MI-TA, MI-TO; duration 400 ms) were used. The first syllable ("MI") was either presented in original pitch, or pitch-shifted upwards by 11% using the pitch transformation algorithm provided in CoolEdit 2000. Task-relevant information was again the phoneme of the second syllable (TA vs. TO; p = 0.5), while the task-irrelevant deviation was the pitch of the first syllable (p = 0.8 and p = 0.2). Finally, in the tone–pitch condition, sine tones of 250 or 500 ms duration (rise-and-fall time 10 ms) and frequencies of 277 Hz or 311 Hz were presented. Task-relevant information was sound duration (p = 0.5) while the task-irrelevant deviation was the pitch of the sounds (p = 0.8 and p = 0.2). Note that the point of discrimination was 250 ms for both the short (250 ms) and the long (500 ms) tone, as the participant could already classify the tone as long at the time when the short tone would have ended. A graphical illustration of the experimental designs for each condition can be seen in Fig. 1.

The experiment started with a training block for each condition consisting of 44 trials. Training blocks were repeated per condition until subjects responded correctly in more than 90% of the trials.
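The tone–pitch stimuli and the SOA jitter described above can be sketched in a few lines. This is a minimal illustration, not the authors' stimulus code: the sampling rate of 44.1 kHz and the linear shape of the 10 ms rise and fall ramps are assumptions (the text reports neither), and the names `sine_tone` and `soa_values_ms` are hypothetical.

```python
import math
import random

SAMPLE_RATE = 44100  # Hz; assumed, not reported in the text


def sine_tone(freq_hz, duration_ms, ramp_ms=10.0, sample_rate=SAMPLE_RATE):
    """Sine tone with linear onset/offset ramps (ramp shape is an assumption)."""
    n = round(sample_rate * duration_ms / 1000)
    n_ramp = round(sample_rate * ramp_ms / 1000)
    samples = []
    for i in range(n):
        # Gain rises from 0 to 1 over the first ramp and falls back at the end.
        gain = min(1.0, i / n_ramp, (n - 1 - i) / n_ramp)
        samples.append(gain * math.sin(2 * math.pi * freq_hz * i / sample_rate))
    return samples


# The four tone-pitch stimuli: two frequencies crossed with two durations.
tones = {
    (freq, dur): sine_tone(freq, dur)
    for freq in (277, 311)
    for dur in (250, 500)
}

# SOA jitter: 1250 ms to 1550 ms in equiprobable 12.5 ms steps (25 values).
soa_values_ms = [1250 + 12.5 * k for k in range(25)]
next_soa_ms = random.choice(soa_values_ms)
```

Drawing the SOA with `random.choice` gives each of the 25 values the same probability on every trial, which is one simple way to realize "equiprobable" steps; a balanced shuffle over a block would be another.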
Each condition was tested in two blocks, once in the first and once in the second half of the experiment. The order of the three conditions was counterbalanced across participants and identical within the first and second halves of the experiment. Each block started with a task instruction on the display, consisted of 244 stimuli of which 48 were deviants, and was separated into two halves by a 15-second break. Deviant positions were pseudorandomized with the restriction of at least two standards preceding each deviant. Stimulus probabilities in the task-irrelevant dimension were exchanged between first and second halves of each block, that is, the standard stimulus of the first half was presented as deviant stimulus in the second half and vice versa (MI vs. GI, original vs. pitch-shifted MI; low vs. high sound). The order of block halves was counterbalanced across the experimental halves and across participants. Stimulus–response mapping was counterbalanced across participants. After each block participants received feedback on their performance in terms of the percentage of correct responses and the mean response time.

EEG was measured with Ag/AgCl electrodes from 32 scalp positions: Fz, FCz, Cz, Pz, Fp1, AF3, F3, F7, FC1, FC5, C3, T7, CP1, CP5, P3, P7, PO3, O1, Fp2, AF4, F4, F8, FC2, FC6, C4, T8, CP2, CP6, P4, P8, PO4, O2 and both mastoids (M1, M2). The vertical EOG was measured with electrodes placed above and below the right eye; horizontal EOG was obtained from electrodes placed at the outer canthi of the left and right
eyes. The reference electrode was placed at the tip of the nose. EEG and EOG signals were amplified with a Brainamp DC (Brainvision) and recorded with a sampling rate of 500 Hz.

All standard stimuli immediately following a deviant stimulus, as well as the initial two stimuli for each block half, were discarded from the behavioral and ERP analysis. For the remaining stimuli, response times were computed from correct button presses in the time range of 250 ms to 1200 ms after sound onset separately for each condition. Error trials or trials with responses outside that range were rejected from the EEG analysis. To statistically analyze the effects of stimulus type and condition on the behavioral data, within-subject analyses of variance for repeated measures (RMANOVA) with the factors Stimulus type (2 levels: standard vs. deviant) and Condition (3 levels: phoneme–phoneme, phoneme–pitch, tone–pitch) were calculated for response times and error rates.

EEG was band-pass filtered offline from 0.5–30 Hz (1813 point Kaiser windowed sinc FIR filter, Kaiser beta = 5.65) and epoched from 100 to 600 ms relative to stimulus onset. Artifact rejection was accomplished with sorted averaging (Rahne et al., 2008), including only those trials in the analysis that improve the signal-to-noise ratio of the individual average per condition. Grand average ERPs from standards and deviants as well as deviant minus standard difference waves were computed for each condition. The mean amplitudes of the difference waves were measured within a time window of 30 ms around the grand average peak latency of the respective components. Subsequently, deviance-related negativities in all conditions will be referred to as "MMN". Possible ΔN1 contributions will be addressed in the Discussion. MMN components were statistically tested with an RMANOVA including the factors Condition (3 levels: phoneme–phoneme, phoneme–pitch, tone–pitch), Frontal–mastoidal (2 levels: frontal vs.
mastoid) and Hemisphere (2 levels: left vs. right) obtained from difference waves at electrode positions FC1, FC2, M1, M2. P3a difference waves were tested with an RMANOVA including the factors Condition (3 levels: phoneme–phoneme, phoneme–pitch, tone–pitch) and Hemisphere (2 levels: left vs. right) obtained from difference waves at electrode positions FC1 and FC2. Mastoidal effects were not analyzed because unlike for MMN, no polarity inversion for electrodes below the Sylvian fissure is expected for P3a. Latencies of the MMN and P3a components in the difference waves were determined with a relative criterion technique (80% of the peak amplitude) using a jackknifing algorithm (Kiesel et al., 2008). To test whether there are latency differences of the P3a relative to the MMN, an RMANOVA for the latency difference between MMN and P3a with the factor Condition (3 levels: phoneme–phoneme, phoneme–pitch, tone–pitch) was conducted. To illustrate the topographical distribution of ERP components, scalp potential maps were generated using a spherical spline interpolation (Perrin et al., 1989). All reported t-tests are two-tailed. Mauchly’s test indicated that the sphericity assumption was not violated in any of the RMANOVAs.
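The jackknife-based latency estimation with an 80% relative criterion can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis code: the latency is taken as the first sample on the leading flank of the peak at which the amplitude reaches the criterion, and the test statistic is corrected by dividing t by (n − 1) (equivalently F by (n − 1)²), as is commonly done for jackknifed subsample scores. All function names are hypothetical.

```python
def relative_criterion_latency(wave, times_ms, fraction=0.8, polarity=1):
    """Time at which the wave first reaches `fraction` of its peak amplitude.

    polarity=1 for positive components (e.g. P3a), -1 for negative ones (e.g. MMN).
    The criterion is searched backwards from the peak along its leading flank.
    """
    signed = [polarity * v for v in wave]
    peak_idx = max(range(len(signed)), key=signed.__getitem__)
    criterion = fraction * signed[peak_idx]
    idx = peak_idx
    while idx > 0 and signed[idx - 1] >= criterion:
        idx -= 1
    return times_ms[idx]


def jackknife_latencies(subject_waves, times_ms, fraction=0.8, polarity=1):
    """Latency of each leave-one-out average (one value per omitted subject)."""
    n = len(subject_waves)
    total = [sum(column) for column in zip(*subject_waves)]
    latencies = []
    for wave in subject_waves:
        # Average of all subjects except the current one.
        leave_one_out = [(t - v) / (n - 1) for t, v in zip(total, wave)]
        latencies.append(
            relative_criterion_latency(leave_one_out, times_ms, fraction, polarity)
        )
    return latencies


def corrected_t(t_value, n_subjects):
    """Correct a t-value computed on jackknifed scores (assumed correction)."""
    return t_value / (n_subjects - 1)


# Toy example: three identical triangular "difference waves" sampled every 10 ms.
times_ms = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
toy_wave = [0, 1, 2, 3, 4, 5, 4, 3, 2, 1, 0]
toy_latencies = jackknife_latencies([toy_wave, toy_wave, toy_wave], times_ms)
```

Because leave-one-out averages are much less noisy than single-subject waves, the criterion crossing can be located reliably, at the price of the artificially reduced variance that the correction step compensates for.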
Acknowledgments

This work was supported by the German Research Foundation (Deutsche Forschungsgemeinschaft, DFG, SCH 375/20-1 to E.S. and HA 6314/1-1 to G.H.). The experiment was realized
using Cogent 2000 developed by the Cogent 2000 team at the FIL and the ICN. The authors wish to thank Joseph Classen, M.D., for valuable comments, and Alessandro Tavano, Ph.D., for providing the pseudo-word stimuli.
References
Broadbent, D.E., 1971. Decision and Stress. Academic Press, London.
Deouell, L.Y., Bentin, S., Soroker, N., 2000. Electrophysiological evidence for an early (pre-attentive) information processing deficit in patients with right hemisphere damage and unilateral neglect. Brain 123, 353–365.
Escera, C., Alho, K., Schröger, E., Winkler, I., 2000. Involuntary attention and distractibility as evaluated with event-related brain potentials. Audiol. Neurootol. 5, 151–166.
Escera, C., Alho, K., Winkler, I., Näätänen, R., 1998. Neural mechanisms of involuntary attention to acoustic novelty and change. J. Cogn. Neurosci. 10, 590–604.
Escera, C., Corral, M.J., 2007. Role of mismatch negativity and novelty-P3 in involuntary auditory attention. J. Psychophysiol. 21, 251–264.
Escera, C., Yago, E., Alho, K., 2001. Electrical responses reveal the temporal dynamics of brain events during involuntary attention switching. Eur. J. Neurosci. 14, 877–883.
Garner, W.R., 1974. The Processing of Information and Structure. Wiley, New York.
Giard, M.H., Lavikainen, J., Reinikainen, K., Perrin, F., Bertrand, O., Pernier, J., Näätänen, R., 1995. Separate representation of stimulus frequency, intensity, and duration in auditory sensory memory: an event-related potential and dipole-model analysis. J. Cogn. Neurosci. 7, 133–143.
Grillon, C., Courchesne, E., Ameli, R., Elmasian, R., Braff, D., 1990. Effects of rare nontarget stimuli on brain electrophysiological activity and performance. Int. J. Psychophysiol. 9, 257–267.
Grimm, S., Schröger, E., Bendixen, A., Bäß, P., Roye, A., Deouell, L.Y., 2008. Optimizing the auditory distraction paradigm: behavioral and event-related potential effects in a lateralized multi-deviant approach. Clin. Neurophysiol. 119, 934–947.
Horváth, J., Czigler, I., Birkás, E., Winkler, I., Gervai, J., 2009. Age-related differences in distraction and reorientation in an auditory task. Neurobiol. Aging 30, 1157–1172.
Horváth, J., Czigler, I., Jacobsen, T., Maess, B., Schröger, E., Winkler, I., 2008. MMN or no MMN: no magnitude of deviance effect on the MMN amplitude. Psychophysiology 45, 60–69.
Kiesel, A., Miller, J., Jolicoeur, P., Brisson, B., 2008. Measurement of ERP latency differences: a comparison of single-participant and jackknife-based scoring methods. Psychophysiology 45, 250–274.
Kujala, T., Tervaniemi, M., Schröger, E., 2007. The mismatch negativity in cognitive and clinical neuroscience: theoretical and methodological considerations. Biol. Psychol. 74, 1–19.
Maiste, A.C., Wiens, A.S., Hunt, M.J., Scherg, M., Picton, T.W., 1995. Event-related potentials and the categorical perception of speech sounds. Ear Hear. 16, 68–90.
Meyer, M., Alter, K., Friederici, A.D., Lohmann, G., von Cramon, D.Y., 2002. FMRI reveals brain regions mediating slow prosodic modulations in spoken sentences. Hum. Brain Mapp. 17, 73–88.
Näätänen, R., Gaillard, A.W.K., Mäntysalo, S., 1978. Early selective-attention effect on evoked-potential reinterpreted. Acta Psychol. 42, 313–329.
Näätänen, R., Kujala, T., Winkler, I., 2011. Auditory processing that leads to conscious perception: a unique window to central auditory processing opened by the mismatch negativity and related responses. Psychophysiology 48, 4–22.
Nieuwenhuis, S., Aston-Jones, G., Cohen, J.D., 2005. Decision making, the P3, and the locus coeruleus–norepinephrine system. Psychol. Bull. 131, 510–532.
Nieuwenhuis, S., De Geus, E.J., Aston-Jones, G., 2011. The anatomical and functional relationship between the P3 and autonomic components of the orienting response. Psychophysiology 48, 162–175.
Oldfield, R.C., 1971. Assessment and analysis of handedness—Edinburgh Inventory. Neuropsychologia 9, 97–113.
Perrin, F., Pernier, J., Bertrand, O., Echallier, J.F., 1989. Spherical splines for scalp potential and current-density mapping. Electroencephalogr. Clin. Neurophysiol. 72, 184–187.
Rahne, T., von Specht, H., Mühler, R., 2008. Sorted averaging—application to auditory event-related responses. J. Neurosci. Methods 172, 74–78.
Rinne, T., Särkkä, A., Degerman, A., Schröger, E., Alho, K., 2006. Two separate mechanisms underlie auditory change detection and involuntary control of attention. Brain Res. 1077, 135–143.
Schröger, E., 1997. On the detection of auditory deviations: a preattentive activation model. Psychophysiology 34, 245–257.
Schröger, E., 2005. The mismatch negativity as a tool to study auditory processing. Acta Acust. United Acust. 91, 490–501.
Schröger, E., Giard, M.H., Wolff, C., 2000. Auditory distraction: event-related potential and behavioral indices. Clin. Neurophysiol. 111, 1450–1460.
Schröger, E., Wolff, C., 1998. Behavioral and electrophysiological effects of task-irrelevant sound change: a new distraction paradigm. Cogn. Brain Res. 7, 71–87.
Schulte-Körne, G., Bruder, J., 2010. Clinical neurophysiology of visual and auditory processing in dyslexia: a review. Clin. Neurophysiol. 121, 1794–1809.
Snyder, E., Hillyard, S.A., 1976. Long-latency evoked-potentials to irrelevant, deviant stimuli. Behav. Biol. 16, 319–331.
Sokolov, E.N., 1975. The neuronal mechanisms of the orienting reflex. In: Sokolov, E.N., Vinogradova, O.S. (Eds.), Neuronal Mechanisms of the Orienting Reflex. Erlbaum, Hillsdale, NJ, pp. 217–338.
Squires, N.K., Squires, K.C., Hillyard, S.A., 1975. Two varieties of long-latency positive waves evoked by unpredictable auditory stimuli in man. Electroencephalogr. Clin. Neurophysiol. 38, 387–401.
Strelnikov, K., 2010. Schizophrenia and language—shall we look for a deficit of deviance detection? Psychiatry Res. 178, 225–229.
Tampas, J.W., Harkrider, A.W., Hedrick, M.S., 2005. Neurophysiological indices of speech and nonspeech stimulus processing. J. Speech Lang. Hear. Res. 48, 1147–1164.
Wetzel, N., Schröger, E., 2007. Cognitive control of involuntary attention and distraction in children and adolescents. Brain Res. 1155, 134–146.
Winkler, I., 2007. Interpreting the mismatch negativity. J. Psychophysiol. 21, 147–163.
Winkler, I., Lehtokoski, A., Alku, P., Vainio, M., Czigler, I., Csépe, V., Aaltonen, O., Raimo, I., Alho, K., Lang, H., Iivonen, A., Näätänen, R., 1999. Pre-attentive detection of vowel contrasts utilizes both phonetic and auditory memory representations. Cogn. Brain Res. 7, 357–369.
Xi, J., Zhang, L., Shu, H., Zhang, Y., Li, P., 2010. Categorical perception of lexical tones in Chinese revealed by mismatch negativity. Neuroscience 170, 223–231.