Clinical Neurophysiology 124 (2013) 2161–2171
Neurophysiological evidence of differential mechanisms involved in producing opposing and following responses to altered auditory feedback
Weifeng Li, Zhaocong Chen, Peng Liu, Baofeng Zhang, Dongfeng Huang, Hanjun Liu ⇑
Department of Rehabilitation Medicine, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou 510080, China
Article info
Article history: Accepted 6 April 2013; Available online 14 June 2013
Keywords: Auditory feedback; Voice control; Opposing response; Following response; Event-related potential
Highlights
Greater P2 amplitudes were associated with following vocal responses compared with opposing responses to pitch perturbations of 200 cents.
Upward directions of 200 cents elicited greater P2 amplitudes than downward directions in the production of opposing responses only.
Differential neural mechanisms may be underlying the processing of opposing and following responses to pitch perturbations in voice auditory feedback.
Abstract
Objective: When hearing perturbations in voice auditory feedback, people produce responses that mostly oppose the perturbation direction, whereas a few responses follow the direction of the feedback perturbation. The causes of opposing and following responses, however, remain poorly understood. The present event-related potential (ERP) study sought to examine the neurophysiological processing of opposing and following responses to pitch feedback perturbations during self-monitoring of vocal production.
Method: Twelve Mandarin-native speakers participated in the experiment. Vocal and neurophysiological responses to pitch perturbations (±50 and ±200 cents) in voice auditory feedback were measured. Individual-trial responses were categorized according to the response direction and then separately averaged in groups of opposing and following responses. ERPs indexed by the P1-N1-P2 complex corresponding to the two types of vocal responses were also obtained.
Results: Opposing and following vocal responses did not differ in magnitude, but opposing responses made up a greater proportion than following responses for 50 cents stimuli. The amplitudes and latencies of the P1 and N1 components showed no significant differences across conditions, whereas there was a direction × magnitude effect on the P2 response. Following responses elicited greater P2 amplitudes than opposing responses only when pitch feedback was perturbed downward by 200 cents, and upward pitch perturbations elicited greater P2 amplitudes than downward perturbations only in the production of opposing responses.
Conclusion: These findings demonstrate that cortical processing of opposing responses is different from that of following responses, which can be modulated by the physical properties of feedback perturbation.
Significance: Different neural mechanisms are involved in the production of opposing and following responses to feedback perturbations during self-monitoring of vocal production.
© 2013 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.
⇑ Corresponding author. Tel.: +86 20 87332298; fax: +86 20 87750632. E-mail address: [email protected] (H. Liu).
1388-2457/$36.00 © 2013 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved. http://dx.doi.org/10.1016/j.clinph.2013.04.340
W. Li et al. / Clinical Neurophysiology 124 (2013) 2161–2171
1. Introduction
In the study of the motor control system, particular attention has been focused on the organization or coordination of the peripheral system. One valuable experimental approach for examining motor control is to introduce unexpected perturbations to the ongoing motor acts and observe the nature and the time of the responses, which are believed to reflect the organization of the motor system and the reflexive structure of the ongoing movement (Lofqvist and Lindblom, 1994). Most commonly, an unexpected perturbation to the peripheral system such as the hand results in a reflexive response that opposes the perturbation, which is termed a compensatory or opposing response. This approach has
been applied for investigating the control of different movements such as finger movement (Rothwell et al., 1982), respiratory control (Davis and Sears, 1970), and posture control (Nashner and McCollum, 1985). The opposing feature of those reflexes in the ongoing movement is equivalent to a negative feedback loop, which tends to keep the motor control stable in the face of perturbations.
Speech production is a highly skilled motor behavior, which is accomplished through the coordination of multiple muscles and accompanying speech articulators. Several lines of research have been conducted to examine how the lips, jaw, tongue and larynx are coordinated during speech in the face of perturbations (Folkins and Zimmermann, 1982; Abbs and Gracco, 1984; Gracco and Abbs, 1985; Folkins and Canty, 1986; Shaiman, 1989; Munhall et al., 1994). Kinematic records of the scale and timing of articulatory movements revealed opposing responses induced by mechanical perturbations to the lips and jaw during the production of vowels and consonants. For example, electrical stimulation to the lower-lip movement during speech elicited compensatory movements of the jaw and/or the upper lip (Folkins and Zimmermann, 1982; Abbs and Gracco, 1984), and compensatory changes in upper and lower lip movements occurred in response to mechanical perturbation to the jaw movement (Folkins and Canty, 1986). Moreover, compensations for perturbations in the articulatory movement can vary according to the task specificity (Shaiman, 1989; Shaiman and Gracco, 2002). For example, Shaiman (1989) reported that jaw perturbations elicited compensatory kinematic adjustments in the lips and/or jaw during the production of [aebae] but no labial compensations during the production of [aedae].
Another central part in speech motor control is the incorporation of sensory feedback arising from auditory and somatosensory pathways in the brain.
In recent decades, the perturbation approach has been extended to examine the role of auditory feedback in speech motor control. In this paradigm, acoustic parameters such as intensity, fundamental frequency (F0), or formant frequency (e.g. F1 or F2) of voice auditory feedback are unexpectedly perturbed and fed back to subjects simultaneously during self-monitoring of speech. The behavioral/neural responses to feedback perturbations are thought to reflect how auditory feedback is involved in the sensorimotor control of voice. It has been demonstrated that, during ongoing vocal production, people decrease their voice intensity in the context of increased noise loudness feedback (Lane and Tranel, 1971; Siegel and Pick, 1974) and compensate for unexpected loudness feedback perturbations (Bauer et al., 2006; Liu et al., 2007). Similar patterns of compensations for feedback perturbation have also been observed in the frequency domain, where people lower or increase their voice F0 or formant frequencies when hearing their voice auditory feedback shifted upwards or downwards (Burnett et al., 1998; Houde and Jordan, 1998; Jones and Munhall, 2002; Purcell and Munhall, 2006; Liu and Larson, 2007). Similar to other motor behaviors, opposing vocal responses observed in these studies are believed to help stabilize voice/speech production at a desired level or reach a specific target.
While compensation for feedback perturbations during self-monitoring of speech has been well documented, it was reported that nearly 10–20 percent of vocal responses did not oppose the direction of the perturbation (Burnett et al., 1998; Larson et al., 2007; Liu et al., 2007). Rather, subjects changed their voice pitch/loudness in the same direction as the perturbation, which has been termed a following response.
To date, much less attention has been paid to the characteristics and mechanisms of following vocal responses to altered auditory feedback (Burnett et al., 1998; Hain et al., 2000; Behroozmand et al., 2012; Korzyukov et al., 2012). It was reported that following responses were frequently elicited when feedback perturbations became larger in size (Burnett et al., 1998; Liu et al., 2010), or when subjects misperceived the perturbation direction (Larson et al., 2007). In addition, there were greater proportions of following responses when perturbation direction was predictable than when it was unpredictable (Korzyukov et al., 2012). Despite the efforts to characterize following responses to altered auditory feedback, the underlying mechanism of this response remains poorly understood. This is partly because the number of following responses is relatively small compared with opposing responses in most previous research. For example, there were 121 opposing but 10 following responses in one pitch-shift study (Larson et al., 2008), and 183 opposing but 5 following responses in one loudness-shift study (Bauer et al., 2006). This precluded statistical analyses of the scale and timing of following responses. More importantly, it reflects a limitation of the current methodology for vocal response measurement. An event-related averaging technique (Bauer and Larson, 2003; Chen et al., 2007) has been widely used to analyze vocal response data, in which a number of individual-trial responses are averaged for each condition to generate an overall response. By doing this, random noise embedded in the individual-trial responses is averaged out of the averaged response. This methodology, however, overlooked the fact that individual-trial responses differed in their direction: some individual-trial responses opposed the stimulus direction, whereas others followed it (see Fig. 1). In that case, the previously reported opposing or following responses were measured by averaging a mix of both opposing and following individual trials, which may not reflect the true nature of these two types of responses.
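To make the averaging concern concrete, the following Python sketch uses made-up, noise-free traces (not data from this study) to show how averaging a mix of opposing and following trials attenuates the result, whereas averaging after sorting by direction recovers each response. The trial counts, the trace values, and the name `average_traces` are illustrative assumptions, and sorting by the sign of a single sample is a deliberate simplification.

```python
from statistics import mean


def average_traces(trials):
    """Point-by-point average of equal-length traces (lists of cents values)."""
    return [mean(samples) for samples in zip(*trials)]


# Hypothetical single-trial responses in cents, with the baseline at 0.
opposing_trial = [0.0, 5.0, 10.0, 5.0, 0.0]       # voice moves against the shift
following_trial = [0.0, -5.0, -10.0, -5.0, 0.0]   # voice moves with the shift

# Hypothetical mix: 11 opposing and 9 following trials in one condition.
trials = [opposing_trial] * 11 + [following_trial] * 9

# Conventional event-related average over all trials.
mixed_average = average_traces(trials)

# Pre-sorting: split by the sign of the middle (peak) sample, then average.
opposing_average = average_traces([t for t in trials if t[2] > 0])
following_average = average_traces([t for t in trials if t[2] < 0])

print("mixed peak:", mixed_average[2])          # strongly attenuated
print("opposing peak:", opposing_average[2])    # full-size opposing response
print("following peak:", following_average[2])  # full-size following response
```

In this toy case the mixed average still looks like a small opposing response, although every single trial moved by 10 cents in one direction or the other.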
Therefore, characterizing opposing or following responses requires a method that determines and categorizes the type of each individual-trial response prior to averaging. Recently, Larson and his colleagues proposed a pre-sorting averaging method to solve this problem, in which an automatic sorting algorithm was used to categorize individual trials into groups of opposing and following responses according to their response direction prior to averaging (Behroozmand et al., 2012; Korzyukov et al., 2012). Both studies showed that the generation of following responses was modulated as a function of the predictability of perturbation direction. In addition, Behroozmand et al. (2012) reported that magnitudes of opposing responses became smaller as the size of the feedback perturbation increased, whereas no such effect was found for following responses. They suggested that differential neural mechanisms may be involved in the production of opposing and following vocal responses.
As far as we are aware, there has been no published report specifically addressing the neural processing of opposing vs. following responses to altered auditory feedback. Several neurophysiological studies have been conducted to explore the neural mechanisms of sensorimotor control of vocal production using the perturbation paradigm (Behroozmand et al., 2009, 2011a,b; Hawco et al., 2009; Behroozmand and Larson, 2011; Chen et al., 2012; Liu et al., 2011b, 2013). It should be noted, however, that these studies did not categorize individual trials of electroencephalographic (EEG) data based on the type of their corresponding vocal responses, and the procedures of artifact rejection from vocal and EEG data were performed separately. Liu et al. (2011b) inspected vocal and EEG individual-trial responses simultaneously, but they did not categorize them into opposing and following responses. Korzyukov et al.
(2012) applied the pre-sorting averaging method to the measurement of vocal responses but not to their EEG data. Therefore, characteristics of the brain activity involved in the production of following response to altered auditory feedback remain unknown. In the present study, we sought to examine the behavioral and neurophysiological characteristics of opposing and following vocal
Fig. 1. Waterfall displays of individual trials to pitch perturbations of 50 cents produced by one participant, in which 24 of 200 trials were included. Based on the response direction relative to the stimulus, all individual trials (top) can be categorized into opposing trials (bottom left) and following trials (bottom right). The dashed lines indicate the onset of pitch perturbation.
responses to pitch perturbations. A new method that is similar to but slightly different from the current pre-sorting averaging method (Behroozmand et al., 2012) was used to determine the type of individual-trial responses. More importantly, this method was applied to the analyses of both vocal and EEG data, which enabled us to separately examine the neurophysiological characteristics of opposing vs. following responses. A 2 × 2 experimental design was used in the present study, crossing perturbation direction (upward and downward) with perturbation magnitude (50 and 200 cents). Perturbation direction was included because there is evidence that neurophysiological responses to downward pitch perturbations differ from those to upward perturbations (Liu et al., 2011b), and perturbation magnitude was manipulated because of its modulatory effect on the production of opposing and following responses (Burnett et al., 1998; Liu et al., 2010; Behroozmand et al., 2012). We expected that differential neural mechanisms would be involved in the production of opposing and following responses, and that the neurophysiological characteristics associated with them would be modulated according to the physical properties of feedback perturbations (e.g. direction, magnitude).
2. Methods
2.1. Subjects
Twelve right-handed, Mandarin-native young adults (6 females and 6 males; 19–25 yr, mean = 22 yr) participated in the experiment. All subjects reported no history of speech, hearing, or neurological disorders. All of them passed a hearing screening at a threshold of 25 dB HL for pure-tone frequencies of 0.5–4 kHz. All subjects gave informed consent in compliance with a protocol approved by the Institutional Review Board of The First Affiliated Hospital at Sun Yat-sen University of China.
2.2. Apparatus
All subjects were tested in a sound-treated booth throughout the experiment. The voice signals were recorded by a Genuine Shupu microphone (model SM-306) and amplified by a MOTU Ultralite Mk3 firewire audio interface. The amplified voices were pitch-shifted through an Eventide Eclipse Harmonizer. Max/MSP (v.5.0, Cycling '74) software was used to manipulate the acoustic parameters such as duration, direction, and magnitude of pitch perturbation. This program also generated transistor-transistor logic (TTL) control pulses to signal the onset and offset of pitch perturbations. The rise/fall time of pitch perturbations was about 10 ms. The pitch-shifted voices were fed back to subjects through insert earphones (ER1, Etymotic Research Inc.). The original voice, the pitch-shifted feedback, and TTL pulses were digitized at 10 kHz by a PowerLab A/D converter (model ML880, AD Instruments), and recorded using LabChart software (v.7.0, AD Instruments). TTL pulses were also sent to the EEG recording system via a FireWire cable. The experimental system was physically calibrated prior to the recording such that the intensity of voice feedback the subject heard was 10 dB SPL (sound pressure level) higher than that of the subject's voice output. This gain was used to partially mask the air-borne and bone-conducted feedback (Behroozmand et al., 2009).
2.3. Procedure
During the experiment, subjects were asked to sustain a vowel sound /u/ for about 5–6 s at their habitual and comfortable voice,
during which their voice feedback was randomly pitch-shifted five times. The first stimulus was presented with a delay of 500–1000 ms after vocal onset, and the succeeding stimuli occurred with an inter-stimulus interval (ISI) of 700–900 ms. Subjects were requested to take a 2–3 s break before initiating the next vocalization. Production of 40 consecutive vocalizations constituted a block, resulting in a total of 200 trials per block. Two directions (upward and downward) and two magnitudes (50 and 200 cents) of pitch perturbation were used in the experiment, leading to four experimental blocks: +50 cents, −50 cents, +200 cents, and −200 cents. The direction and magnitude of pitch perturbation were kept constant within each block. The cent is a logarithmic unit of musical interval, and 100 cents equals one semitone. The duration of each perturbation was fixed at 200 ms throughout the recording. The order of these four experimental conditions was randomized across all subjects.
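The timing constraints of one block can be sketched as follows. This is an illustration only, not the Max/MSP program used in the study: the function name, the uniform sampling of delays, and the reading of the ISI as running from the offset of one perturbation to the onset of the next are all assumptions.

```python
import random

N_STIMULI = 5          # perturbations per vocalization (from the text)
N_VOCALIZATIONS = 40   # vocalizations per block (from the text)
PERTURB_MS = 200       # duration of each perturbation (from the text)


def perturbation_onsets(rng, first_delay=(500, 1000), isi=(700, 900)):
    """Return perturbation onset times in ms after vocal onset.

    Assumption: the ISI is measured from the offset of one perturbation
    to the onset of the next, and delays are drawn uniformly.
    """
    onsets = [rng.uniform(*first_delay)]
    for _ in range(N_STIMULI - 1):
        onsets.append(onsets[-1] + PERTURB_MS + rng.uniform(*isi))
    return onsets


rng = random.Random(0)  # fixed seed so the sketch is reproducible
block = [perturbation_onsets(rng) for _ in range(N_VOCALIZATIONS)]

# 40 vocalizations x 5 perturbations = 200 trials per block.
trials_per_block = sum(len(onsets) for onsets in block)

# Latest possible end of the fifth perturbation under the assumption above.
latest_possible_offset_ms = 1000 + (N_STIMULI - 1) * (PERTURB_MS + 900) + PERTURB_MS

print(trials_per_block, latest_possible_offset_ms)
```

Under this reading the last perturbation ends no later than 5.6 s after vocal onset, which is broadly in line with the 5–6 s vocalizations described above.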
2.4. Behavioral data analysis
Vocal data were analyzed offline in IGOR PRO software (v.6.0, Wavemetrics Inc.). Voice signals were converted to analog voice F0 contours in Hertz using an autocorrelation method in Praat (Boersma, 2001). The voice F0 contour was converted to a cent wave using the formula $\text{cents} = 100 \times (39.86 \times \log_{10}(F_0/\text{reference}))$, in which reference denotes an arbitrary reference note of 195.997 Hz (G4). All individual trials were segmented using a window with a pre-stimulus period of 200 ms (baseline F0) and a post-stimulus period of 700 ms. Then a mathematical algorithm was used to automatically categorize individual trials into groups of opposing and following responses. In detail, the mean amplitude of a 200 ms pre-stimulus window (−200 to 0 ms) was subtracted from the peak amplitude of a post-stimulus window (0–200 ms), leading to a positive or a negative value that was used to determine the type of each individual-trial response relative to perturbation direction. In the downward condition, for example, positive or negative values meant that individual-trial responses opposed or followed the direction of perturbation, respectively, and they were defined as opposing or following responses accordingly. By contrast, in the upward condition, individual-trial responses were defined as following when positive values were obtained and as opposing when negative values were obtained. Given the complexity in determining the type of vocal response (e.g. trials with high variability of voice F0), an additional manual inspection was performed for each condition to avoid errors of automatic response categorization. After that, information regarding the type of individual trials (i.e. opposing or following) was sent to the procedure of EEG analysis. By doing this, the neurophysiological responses associated with opposing and following vocal responses were obtained and examined across conditions.
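The conversion and sorting rule described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the IGOR PRO procedure used in the study: the function names and the sample trial are hypothetical, and "peak amplitude" is interpreted here as the post-stimulus sample that deviates most from the baseline mean, which the text does not specify.

```python
import math

REFERENCE_HZ = 195.997  # reference note given in the text


def f0_to_cents(f0_hz, reference_hz=REFERENCE_HZ):
    """Convert F0 in Hz to cents using the formula given in the text."""
    return 100 * (39.86 * math.log10(f0_hz / reference_hz))


def classify_trial(times_ms, cents, direction):
    """Label one trial 'opposing' or 'following'; direction is 'up' or 'down'."""
    if direction not in ("up", "down"):
        raise ValueError("direction must be 'up' or 'down'")
    baseline = [c for t, c in zip(times_ms, cents) if -200 <= t < 0]
    post = [c for t, c in zip(times_ms, cents) if 0 <= t <= 200]
    baseline_mean = sum(baseline) / len(baseline)
    # Assumption: the peak is the sample farthest from the baseline mean.
    peak = max(post, key=lambda c: abs(c - baseline_mean))
    difference = peak - baseline_mean
    if difference == 0:
        return "undetermined"  # flat trial; left for manual inspection
    voice_moved_up = difference > 0
    if direction == "down":
        return "opposing" if voice_moved_up else "following"
    return "following" if voice_moved_up else "opposing"


# Hypothetical trial sampled every 50 ms: flat at 200 Hz, then rising.
times_ms = list(range(-200, 201, 50))
f0_hz = [200.0, 200.0, 200.0, 200.0, 200.0, 201.0, 203.0, 202.0, 201.0]
cents = [f0_to_cents(f) for f in f0_hz]

print(classify_trial(times_ms, cents, "down"))  # voice rose against a downward shift
print(classify_trial(times_ms, cents, "up"))    # same rise follows an upward shift
```

Because 39.86 is approximately 12/log10(2), the formula gives close to 1200 cents per octave, so doubling the reference frequency yields roughly 1200 cents.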
Note that some individual EEG trials were defined as bad trials during the artifact rejection (see below), and this information was sent to the IGOR program so that the corresponding vocal trials were also removed prior to averaging. This was done to ensure that averaged vocal and EEG responses were calculated using the same number of trials. Finally, these categorized vocal trials were separately averaged to generate an overall opposing or following response for each condition. The validity of vocal response was defined using the same criteria as previous research (Liu et al., 2010). An acceptable response was defined as a contour that exceeded two standard deviations (SDs) of the pre-stimulus (−200 to 0 ms) mean, beginning at least 60 ms after perturbation onset and lasting at least 50 ms. The magnitude of an averaged vocal response was measured as the difference in cents between the pre-stimulus (−200 to 0 ms) mean and the peak value of the voice contour following the response onset. The number of individual trials that opposed or followed the perturbation direction was also collected to examine the distribution of opposing and following responses across conditions.
2.5. EEG recording and analysis
A 64-electrode Geodesic Sensor Net was used to record the EEG signals with a Net Amps 300 amplifier (Electrical Geodesics Inc., Eugene, OR) at a sampling frequency of 1 kHz referenced to the vertex (Cz). Horizontal and vertical electro-oculograms (EOG) were determined from electrodes placed above and below the eyes and at the outer canthus. The impedances of individual sensors were maintained below 50 kΩ during the recording (Ferree et al., 2001). After data acquisition, the EEG signals were sent to Net Station software (v.4.4, Electrical Geodesics Inc., Eugene, OR) for off-line analyses. All channels were digitally band-pass filtered from 1 to 20 Hz. Individual trials were segmented into epochs from 200 ms before to 500 ms after the onset of the pitch perturbation. Segmented trials were then inspected for artifact contamination such as excessive muscular activity, eye blinks, and eye movement using the Artifact Detection toolbox in Net Station. Additional visual inspection of all individual trials was conducted to avoid errors of automatic artifact rejection. Note that individual trials associated with bad vocal trials defined by the IGOR program were also rejected from averaging. Artifact-free segmented trials were then categorized into groups of opposing and following responses according to the results from the measurement of vocal response and then separately averaged to generate overall responses across conditions. All responses were re-referenced to the average of electrodes on each mastoid. The baseline of each averaged response was corrected, and the amplitude and latency of the P1-N1-P2 complex across conditions for each subject were extracted and submitted to statistical analyses.
2.6. Statistical analyses
Values of vocal and neurophysiological responses across conditions were analyzed using a repeated-measures analysis of variance (RM-ANOVA) in SPSS (v.16.0). The magnitudes of vocal response were subjected to a three-way RM-ANOVA, including three within-subject factors of direction (upward vs. downward), magnitude (50 vs. 200 cents) and type (opposing vs. following). The percentage of opposing and following responses was subjected to two-way RM-ANOVAs as a function of perturbation direction and magnitude, respectively. Paired t-tests were also used to compare the percentages of opposing vs. following responses in each condition. Regarding the amplitude and latency of the P1-N1-P2 complex, four-way RM-ANOVAs were used in the statistical analyses, including direction (upward vs. downward), magnitude (50 vs. 200 cents), type (opposing vs. following), and site (FC1, FCz, FC2, C1, Cz, C2, P1, Pz, P2). Appropriate subsidiary RM-ANOVAs were calculated if there were any significant higher-order interactions. Probability values were corrected using Greenhouse–Geisser when the assumption of sphericity was violated.
3. Results
3.1. Behavioral data
Grand-averaged voice F0 responses to pitch perturbations across all subjects are illustrated in Fig. 2. The blue and red contours denote the opposing and following vocal responses, respectively. All subjects in the present study produced a mix of opposing and following vocal responses. As can be seen, despite the changes in the direction and magnitude of pitch perturbation,
Fig. 2. Grand-averaged vocal responses to pitch perturbations as a function of type, direction and magnitude. The blue and red traces indicate opposing and following responses, respectively. Vertical bars represent the standard errors of averaged traces, and time 0 indicates the onset of perturbation.
subjects produced similar opposing and following responses. A direction × magnitude × type RM-ANOVA was performed on the magnitude of vocal response, and the results revealed no significant main effects of direction (F(1, 11) = 1.286, p = 0.281), magnitude (F(1, 11) = 0.168, p = 0.690), or type (F(1, 11) = 0.539, p = 0.478). Interactions among these factors did not reach significance either (p > 0.05). Statistical analyses were also performed to examine the percentage of opposing and following responses across stimulus direction or magnitude. Results of the RM-ANOVAs failed to show significant differences in the percentage of opposing or following responses as a function of direction or magnitude (p > 0.05). Paired t-tests showed that the upward direction elicited more opposing responses (55%) than following responses (45%) to 50 cents stimuli (t = 2.795, d.f. = 11, p = 0.017), but the percentages did not differ significantly for 200 cents (52% vs. 48%; t = 0.707, d.f. = 11, p = 0.494). For the downward direction, no significant differences were found between the percentage of opposing vs. following responses for either 50 cents (49% vs. 51%; t = 0.451, d.f. = 11, p = 0.661) or 200 cents stimuli (50% vs. 50%; t = 0.020, d.f. = 11, p = 0.984).
3.2. ERP results
Neurophysiological characteristics of opposing and following responses to pitch perturbations were examined by the measurement of the P1-N1-P2 complex. Fig. 3 shows the grand-averaged ERP waveforms associated with opposing (blue) and following (red) vocal responses across stimulus direction and magnitude. Fig. 4 shows the topographic distribution of grand-averaged ERPs (P2) as a function of stimulus direction and magnitude. As can be seen, opposing and following responses did not differ much across the conditions except for the downward 200 cents stimuli.
A four-way RM-ANOVA of P1 amplitude revealed no significant main effects of direction (F(1, 11) = 0.405, p = 0.537), type (F(1, 11) = 0.193, p = 0.669), or magnitude (F(1, 11) = 0.558, p = 0.471), and no significant interactions between any of them (p > 0.05); only the main effect of site was significant (F(8, 88) = 51.562, p < 0.001). Similarly, none of the main effects of direction (F(1, 11) = 0.004, p = 0.948), type (F(1, 11) = 0.061, p = 0.810), magnitude (F(1, 11) = 0.135, p = 0.721), or site (F(8, 88) = 0.930, p = 0.496), nor the interactions between any of them (p > 0.05), reached significance for the N1 amplitude, although opposing responses seemed to elicit relatively larger N1 amplitudes (absolute value) than following responses as shown in Fig. 3. For the P2 amplitude, the results revealed no significant main effects of direction (F(1, 11) = 1.900, p = 0.195), magnitude (F(1, 11) = 1.309, p = 0.277), or type (F(1, 11) = 0.523, p = 0.485). Given that there was a significant type × magnitude × direction interaction (F(1, 11) = 5.424, p = 0.040), however, several subsidiary three-way RM-ANOVAs were performed to further examine the differences across conditions.
First, type × magnitude × site RM-ANOVAs were performed on P2 amplitude as a function of direction. For the downward stimuli, a significant type × magnitude interaction (F(1, 11) = 4.965, p = 0.048) led to further type × site RM-ANOVAs for the 50 and 200 cents conditions. It was found that following responses elicited significantly larger P2 amplitudes than opposing responses to 200 cents stimuli (F(1, 11) = 6.724, p = 0.025), while they did not differ significantly in the case of 50 cents stimuli (F(1, 11) = 0.026, p = 0.876) (see Figs. 3 and 4). For the upward stimuli, neither the main effects of type (F(1, 11) = 0.337, p = 0.573) or magnitude (F(1, 11) = 2.957, p = 0.113) nor the type × magnitude interaction (F(1, 11) = 0.600, p = 0.455) reached significance.
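The degrees of freedom reported above follow from the fully within-subject design with 12 subjects. The short Python sketch below reproduces them; the helper name `rm_anova_df` is hypothetical, and the values are the uncorrected degrees of freedom, before any Greenhouse–Geisser adjustment.

```python
def rm_anova_df(levels, n_subjects):
    """Uncorrected (effect, error) degrees of freedom for a within-subject effect.

    `levels` lists the number of levels of each factor in the effect,
    for example [2] for a two-level main effect or [2, 2, 2] for a
    three-way interaction of two-level factors.
    """
    effect_df = 1
    for k in levels:
        effect_df *= k - 1
    error_df = effect_df * (n_subjects - 1)
    return effect_df, error_df


N_SUBJECTS = 12  # from Section 2.1

print(rm_anova_df([2], N_SUBJECTS))        # direction, magnitude, or type
print(rm_anova_df([9], N_SUBJECTS))        # site (nine electrodes)
print(rm_anova_df([2, 2, 2], N_SUBJECTS))  # type x magnitude x direction
```

The two-level factors and their interactions give F(1, 11), and the nine-level site factor gives F(8, 88), matching the values in the text.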
Second, direction × magnitude × site RM-ANOVAs were performed as a function of response type. For the opposing
Fig. 3. Grand-averaged ERP waveforms associated with opposing (blue) and following (red) responses across perturbation direction and magnitude at electrodes of FCz (left) and Cz (right). Vertical bars represent the standard errors of averaged traces, and time 0 indicates the onset of perturbation.
response, there was a significant direction × magnitude interaction (F(1, 11) = 5.767, p = 0.035); subsidiary direction × site RM-ANOVAs were thus performed as a function of stimulus magnitude. The results showed that the upward direction elicited significantly larger P2 amplitudes than the downward direction (F(1, 11) = 8.281, p = 0.015) in the case of 200 cents (see Fig. 5), whereas the direction effect failed to reach significance in the case of 50 cents (F(1, 11) = 0.313, p = 0.587). For the following response,
Fig. 4. Topographic distribution of the grand-averaged ERPs of P2 component to pitch perturbations for opposing (left) and following (right) responses across perturbation direction and magnitude.
nothing but a main effect of site (F(8, 88) = 24.362, p < 0.001) reached significance. Finally, direction × type × site RM-ANOVAs were performed across stimulus magnitude. Both the 50 and 200 cents conditions revealed no significant main effects of direction (50 cents: F(1, 11) = 0.356, p = 0.563; 200 cents: F(1, 11) = 3.648, p = 0.083) or type (50 cents: F(1, 11) = 0.063, p = 0.807; 200 cents: F(1, 11) = 2.743, p = 0.126) but a significant main effect of site (50 cents: F(8, 88) = 18.455, p < 0.001; 200 cents: F(8, 88) = 26.550, p < 0.001).
Additionally, four-way RM-ANOVAs were performed on the latencies of the P1-N1-P2 complex. For these three components, the results revealed no significant main effects of direction (P1: F(1, 11) = 0.103, p = 0.755; N1: F(1, 11) = 2.733, p = 0.127; P2: F(1, 11) = 3.099, p = 0.106), type (P1: F(1, 11) = 0.025, p = 0.876; N1: F(1, 11) = 4.280, p = 0.063; P2: F(1, 11) = 0.480, p = 0.503), or site (P1: F(8, 88) = 2.930, p = 0.089; N1: F(8, 88) = 2.376, p = 0.117; P2: F(8, 88) = 0.555, p = 0.812). However, the latencies differed significantly between the two stimulus magnitudes (P1: F(1, 11) = 30.704, p < 0.001; N1: F(1, 11) = 14.248, p = 0.003; P2: F(1, 11) = 16.110, p = 0.002), where 50 cents elicited significantly longer latencies than 200 cents (P1: 76 ± 3 vs. 64 ± 3 ms; N1: 153 ± 7 vs. 129 ± 4 ms; P2: 252 ± 7 vs. 233 ± 7 ms).
Fig. 5. Grand-averaged ERP waveforms associated with opposing and following responses to upward (red) and downward (blue) perturbation of 50 and 200 cents at electrodes of FCz (left) and Cz (right). Vertical bars represent the standard errors of averaged traces, and time 0 indicates the onset of perturbation.
4. Discussion
The present study examined the behavioral and neurophysiological characteristics of opposing and following responses to pitch perturbation in voice auditory feedback. The results showed that all subjects produced a mix of opposing and following vocal responses when hearing altered auditory feedback. Opposing and following vocal responses did not differ in the response
magnitude, but there were more opposing individual-trial responses than following responses to +50 cents. The neurophysiological findings revealed a direction × magnitude effect on the P2 amplitudes associated with opposing and following responses. Greater P2 amplitude was associated with following responses than with opposing responses to −200 cents, and the upward direction elicited greater P2 amplitude than the downward direction in the production of opposing responses to 200 cents. These findings support our hypothesis that neurophysiological characteristics of opposing and following responses can be modulated according to the physical properties of feedback perturbations, suggesting that differential neural mechanisms may be recruited in the production of opposing and following responses to pitch feedback perturbation.
4.1. Behavioral findings
Multiple lines of evidence have demonstrated the existence of following vocal responses, in which people adjust their voice in the same direction as the perturbation (Burnett et al., 1997, 1998; Larson, 1998; Larson et al., 2000, 2001; Bauer and Larson, 2003; Bauer et al., 2006). Note that previously reported following responses were obtained by averaging all individual-trial responses, whereas in the present study they were calculated by averaging only the trials that changed in the same direction as the perturbation. Despite this methodological difference, the present findings are comparable to previous research to some extent. Our results showed a greater proportion of opposing than following responses in the case of 50 cents, while the proportions did not differ in the case of 200 cents, indicating that smaller perturbations elicit greater proportions of opposing to following responses compared with larger perturbations. Similar results were also reported in previous research (Burnett et al., 1998; Liu et al., 2010). A directional effect was also found as reflected by the significant difference in the percentage of opposing vs.
following response to +50 cents only. The effect of perturbation direction was also reported in other studies. Larson et al. (2001) identified 16 of 200 voice F0 responses as following responses, 12 of which were elicited by downward perturbations. Korzyukov et al. (2012) found that the difference in the percentage of opposing/following responses between the unpredictable and predictable conditions (i.e. perturbation direction randomized or fixed) reached significance only when the perturbation direction was downward. Taken together, these findings indicate that the generation of opposing and following responses can be modulated as a function of stimulus specificity. The present study also revealed that the magnitudes of opposing and following vocal responses did not differ across the experimental conditions. By contrast, several other studies reported that opposing responses had larger magnitudes than following responses (Larson et al., 2000, 2001). Again, individual-trial responses were not categorized into groups of opposing and following responses prior to averaging in these two studies. Moreover, opposing responses greatly outnumbered following responses (184 vs. 16 in Larson et al. (2001); 319 vs. 5 in Larson et al. (2000)), which precluded a statistical comparison between the two response types. On the other hand, Korzyukov et al. (2012) sorted the individual trials into groups of opposing and following responses, but they did not report the statistics of vocal response magnitudes. Similarly, Behroozmand et al. (2012) used this pre-sorting technique to examine the difference in magnitude between opposing and following responses. Their results revealed that opposing and following responses did not differ in magnitude when the perturbation direction was fixed, which is consistent with the present finding. When the perturbation direction was randomized, however, opposing responses were associated with larger magnitudes than following responses.
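The difference between grand averaging and the pre-sorting technique discussed above can be illustrated with a minimal sketch. It assumes that each trial has already been reduced to a single signed response value in cents, which is a simplification of how responses are extracted from F0 contours; the function names and the example values are hypothetical and are not taken from the present data. The cents conversion uses the standard definition of 1200 times the base-2 logarithm of the frequency ratio.

```python
import math
from statistics import mean


def hz_to_cents(f0_hz, baseline_hz):
    """Convert an F0 value in Hz to cents relative to a baseline F0."""
    return 1200.0 * math.log2(f0_hz / baseline_hz)


def classify_trial(response_cents, perturbation_cents):
    """Label one trial by the sign of its response relative to the perturbation.

    Assumption: the trial is summarized by a single signed value in cents.
    """
    if response_cents == 0 or perturbation_cents == 0:
        return "none"
    same_sign = (response_cents > 0) == (perturbation_cents > 0)
    return "following" if same_sign else "opposing"


def presort_and_average(responses_cents, perturbation_cents):
    """Sort trials into opposing/following groups, then average each group."""
    groups = {"opposing": [], "following": []}
    for value in responses_cents:
        label = classify_trial(value, perturbation_cents)
        if label in groups:
            groups[label].append(value)
    summary = {}
    for label, values in groups.items():
        summary[label] = {
            "n": len(values),
            "mean_cents": mean(values) if values else None,
        }
    # Grand average over all trials, as in studies without pre-sorting.
    summary["grand_mean_cents"] = mean(responses_cents)
    return summary


if __name__ == "__main__":
    # Hypothetical responses (in cents) to a +50 cents perturbation.
    trials = [-20.0, -15.0, 18.0, -22.0, 17.0]
    print(presort_and_average(trials, 50.0))
```

In this hypothetical set, the grand average is a small opposing value, while the pre-sorted groups have similar magnitudes in opposite directions, which is why magnitude comparisons depend on whether trials are categorized before averaging.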
In addition, the magnitude of the opposing response was modulated by perturbation magnitude, becoming smaller when perturbation
magnitude increased from 100 or 200 cents to 500 cents, whereas no such modulatory effect was found for the following response (Behroozmand et al., 2012). Overall, the behavioral difference between opposing and following responses appears to depend on the experimental task. Further studies using the pre-sorting averaging technique should be conducted to address this question.

4.2. Neurophysiological findings

More interestingly, the present study revealed stimulus-dependent neurophysiological processing of opposing and following responses to altered auditory feedback. Following responses were associated with greater P2 amplitudes than opposing responses to −200 cents, and +200 cents elicited greater P2 amplitudes than −200 cents in the production of opposing responses. As far as we are aware, this is the first study to examine the brain activity involved in the production of opposing and following responses. These findings support our hypothesis that differential neural mechanisms can be recruited when people produce opposing and following responses to pitch feedback changes during self-monitoring of speech. In recent years, several neurophysiological studies have addressed the neural mechanisms underlying auditory feedback control of voice; in these studies, neurophysiological responses were obtained by averaging a mix of opposing and following individual trials (Behroozmand et al., 2009; Hawco et al., 2009; Liu et al., 2011b; Korzyukov et al., 2012). The present study categorized individual-trial responses prior to averaging, and thus revealed some findings that contrast with previous research. For example, there was no systematic change in N1/P2 amplitudes associated with either opposing or following responses as a function of perturbation magnitude in the present study, whereas other studies reported greater N1/P2 amplitudes for larger perturbations (Behroozmand et al., 2009; Hawco et al., 2009; Behroozmand and Larson, 2011; Liu et al., 2011b).
In addition to this methodological difference, all participants in the present study were native Mandarin speakers, whereas native English speakers were recruited in most previous research. One recent study revealed no modulatory effect of perturbation magnitude on the P2 responses produced by native Mandarin speakers (Chen et al., 2012). Therefore, this inconsistency appears very likely to be due to the specificity of language experience. The present study also revealed that upward and downward directions elicited significantly different P2 responses to 200 cents but not to 50 cents perturbations, indicating a directional effect elicited by larger perturbations. Similarly, it was reported that the directional effect on N1/P2 amplitudes to pitch perturbations reached significance for 200 cents and 500 cents but not for 100 cents (Liu et al., 2011b). Interestingly, the upward direction elicited larger P2 amplitudes than the downward direction in the present study, whereas Liu et al. (2011b) reported the opposite pattern, with larger N1/P2 amplitudes to downward than to upward stimuli. Note that the directional effect observed in the present study was found for the opposing response but was absent for the following response. Thus, these contrasting findings are very likely due to the methodological difference. Regardless, these findings are in line with the results reported by Liu et al. (2011b) that neurophysiological processing of auditory feedback during self-monitoring of speech can be influenced by the direction of feedback perturbation, and they further support our hypothesis that the neural mechanisms underlying opposing and following responses are modulated by stimulus specificity.

4.3. Neural mechanisms of opposing vs. following response

Given that opposing and following responses differ in their direction relative to the feedback perturbation, it has been suggested
that they have different functions in the online control of vocal production (Burnett et al., 1998; Hain et al., 2000). The opposing response functions as a negative feedback loop that stabilizes vocal production at a desired level, whereas the following response serves a destabilizing function, tending to move vocal production away from the target through a positive feedback loop. Although there is some behavioral evidence suggesting that differential neural mechanisms may underlie the production of opposing and following responses (Behroozmand et al., 2012), no neurophysiological research has been conducted to support this speculation. The present study revealed a significant difference in P2 amplitude between opposing and following responses to −200 cents, whereas no significant difference was observed for the other stimuli. These findings indicate that stimulus-dependent differential cortical mechanisms are involved in the production of opposing vs. following responses. Effects of stimulus specificity on vocal responses to altered auditory feedback have been well documented in previous behavioral studies (Burnett et al., 1998; Larson et al., 2000, 2007; Liu and Larson, 2007; Macdonald et al., 2010). In recent years, ample evidence has accumulated that differential neural mechanisms underlie the processing of small and large perturbations in voice auditory feedback (Hyde et al., 2008; Behroozmand et al., 2009; Zarate et al., 2010; Behroozmand and Larson, 2011; Liu et al., 2011b). For example, larger pitch perturbations elicited greater neurophysiological responses (Behroozmand et al., 2009; Liu et al., 2011b) or engaged more auditory brain activity (Hyde et al., 2008) than smaller pitch perturbations.
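The stabilizing and destabilizing roles attributed above to negative and positive feedback loops can be illustrated with a toy discrete-time sketch. This is not a model proposed in the cited studies: the update rule, the gain value and the function name are assumptions chosen only for illustration, and real vocal responses typically compensate for only part of the perturbation rather than converging fully as this idealized loop does.

```python
def simulate_loop(perturbation_cents, gain, mode, n_steps=20, target_cents=0.0):
    """Toy feedback loop for voice pitch under a constant feedback perturbation.

    Assumptions (illustrative only): pitch is expressed in cents relative to
    the intended target, the heard pitch is the produced pitch plus the
    perturbation, and each step corrects a fixed fraction (gain) of the
    heard error. mode="opposing" applies negative feedback and
    mode="following" applies positive feedback.
    """
    if mode not in ("opposing", "following"):
        raise ValueError("mode must be 'opposing' or 'following'")
    sign = -1.0 if mode == "opposing" else 1.0
    produced = target_cents
    trajectory = [produced]
    for _ in range(n_steps):
        heard = produced + perturbation_cents   # altered auditory feedback
        error = heard - target_cents            # mismatch with the target
        produced = produced + sign * gain * error
        trajectory.append(produced)
    return trajectory


if __name__ == "__main__":
    opposing = simulate_loop(50.0, 0.2, "opposing")
    following = simulate_loop(50.0, 0.2, "following")
    # Negative feedback moves produced pitch against the perturbation and the
    # heard error shrinks; positive feedback moves with it and the error grows.
    print(opposing[:4])
    print(following[:4])
```

With negative feedback the heard error is multiplied by (1 − gain) at each step and shrinks toward zero, whereas with positive feedback it is multiplied by (1 + gain) and grows, which corresponds to the stabilizing and destabilizing functions described in the text.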
Furthermore, according to the two-strategy hypothesis of the online control of voice (Burnett et al., 1998), a compensatory strategy is involved in the correction of small perturbations in voice auditory feedback, while a following strategy is recruited to process larger perturbations. If so, compensation for the feedback perturbation may be weakened, whereas the tendency to follow the perturbation may be more pronounced, in the face of larger perturbations compared with small ones, leading to the allocation of more neural resources (i.e. larger cortical responses) to following responses than to opposing responses. This may account for the present finding that opposing responses differed from following responses in P2 amplitudes only in the case of 200 cents but not 50 cents. In addition to the two-strategy hypothesis (Burnett et al., 1998), several other hypotheses have also been proposed to interpret the generation of opposing and following responses (Hain et al., 2000; Larson et al., 2007). Hain et al. (2000) proposed a model based on the comparison of the planned F0 with the perceived F0. According to this model, an opposing response is generated when the feedback perturbation is perceived as an internal reference, whereas a feedback perturbation that is considered an external reference leads to a following response. The model did not specify, though, which feedback perturbations are considered internal or external references. As suggested by the efference copy mechanism involved in self-monitoring of speech (Houde et al., 2002; Heinks-Maldonado et al., 2005; Behroozmand and Larson, 2011), however, a perturbation leading to a small mismatch between the intended and actual feedback is considered a self-produced source or internal reference, while a perturbation leading to a large mismatch is treated as an external reference.
Thus, Hain et al.’s hypothesis is partly in line with the two-strategy hypothesis. In addition, Larson et al. (2007) reported that more following responses were elicited when voice pitch and loudness were perturbed in opposite directions than when pitch or loudness was perturbed alone, suggesting that misperception of perturbation direction could be partly responsible for the generation of following responses. Overall, these hypotheses regarding the production of
opposing and following responses are based on how the feedback perturbation is perceived during central processing. However, if the production of opposing or following responses were purely related to the perception of the feedback perturbation, subjects would always produce either opposing or following responses rather than a mix of the two. It is thus suggested that the causes of opposing or following responses are not only related to how the feedback perturbation is perceived. Rather, other mechanisms such as muscular control of the voice may also be involved in this process. In a recent laryngeal electromyography (EMG) study (Liu et al., 2011a), it was found that one group of laryngeal muscles (e.g. cricothyroid) showed a decrease while the other showed an increase in EMG response to feedback perturbation during the production of either an opposing or a following vocal response, indicating that the muscles were controlled by two opposite mechanisms at the same time. That is, the pathways mediating compensatory mechanisms may be active in some muscles while following mechanisms are active in other muscles at the same time (Liu et al., 2011a). A predominance of compensatory mechanisms over following mechanisms may cause the muscles to produce opposing responses, and vice versa, which may account for the finding of an alternating set of opposing and following responses. This provides evidence that mechanisms involved in the motor control of laryngeal muscles are partly responsible for the production of opposing or following responses. Taken together, complicated neuromuscular mechanisms may be involved in producing opposing and following responses, including mechanisms related to stimulus perception, muscular control, or a translation from stimulus perception to muscular control (Liu et al., 2011a).

5. Conclusion

The present study examined the behavioral and neurophysiological characteristics of opposing and following responses to pitch perturbations in voice auditory feedback. The results showed that the magnitudes of opposing and following responses did not differ across conditions, but a greater proportion of opposing relative to following responses was found for small perturbations. The neurophysiological findings revealed greater P2 amplitudes associated with following responses compared with opposing responses when pitch feedback was perturbed by −200 cents. Moreover, the upward direction elicited greater P2 amplitudes than the downward direction in the production of opposing responses, whereas no such directional effect was observed for following responses. These findings demonstrate that neurophysiological characteristics of opposing and following responses can be modulated by stimulus specificity. It is suggested that differential neural mechanisms may underlie the production of opposing and following responses.

Acknowledgement

The present study was supported by the National Natural Science Foundation of China (Nos. 30970965 and 31070990).

References

Abbs JH, Gracco VL. Control of complex motor gestures: orofacial muscle responses to load perturbations of lip during speech. J Neurophysiol 1984;51:705–23.
Bauer JJ, Larson CR. Audio-vocal responses to repetitive pitch-shift stimulation during a sustained vocalization: improvements in methodology for the pitch-shifting technique. J Acoust Soc Am 2003;114:1048–54.
Bauer JJ, Mittal J, Larson CR, Hain TC. Vocal responses to unanticipated perturbations in voice loudness feedback: an automatic mechanism for stabilizing voice amplitude. J Acoust Soc Am 2006;119:2363–71.
Behroozmand R, Karvelis L, Liu H, Larson CR. Vocalization-induced enhancement of the auditory cortex responsiveness during voice F0 feedback perturbation. Clin Neurophysiol 2009;120:1303–12.
Behroozmand R, Korzyukov O, Larson CR. Effects of voice harmonic complexity on ERP responses to pitch-shifted auditory feedback. Clin Neurophysiol 2011a;122:2408–17.
Behroozmand R, Korzyukov O, Sattler L, Larson CR. Opposing and following vocal responses to pitch-shifted auditory feedback: evidence for different mechanisms of voice pitch control. J Acoust Soc Am 2012;132:2468–77.
Behroozmand R, Larson CR. Error-dependent modulation of speech-induced auditory suppression for pitch-shifted voice feedback. BMC Neurosci 2011;12:54.
Behroozmand R, Liu H, Larson CR. Time-dependent neural processing of auditory feedback during voice pitch error detection. J Cogn Neurosci 2011b;23:1205–17.
Boersma P. Praat, a system for doing phonetics by computer. Glot Int 2001;5:341–5.
Burnett TA, Freedland MB, Larson CR, Hain TC. Voice F0 responses to manipulations in pitch feedback. J Acoust Soc Am 1998;103:3153–61.
Burnett TA, Senner JE, Larson CR. Voice F0 responses to pitch-shifted auditory feedback. A preliminary study. J Voice 1997;11:202–11.
Chen SH, Liu H, Xu Y, Larson CR. Voice F0 responses to pitch-shifted voice feedback during English speech. J Acoust Soc Am 2007;121:1157–63.
Chen Z, Liu P, Wang EQ, Larson CR, Huang D, Liu H. ERP correlates of language-specific processing of auditory pitch feedback during self-vocalization. Brain Lang 2012;121:25–34.
Davis JN, Sears TA. The proprioceptive reflex control of the intercostal muscles during their voluntary activation. J Physiol 1970;209:711–38.
Ferree TC, Luu P, Russell GS, Tucker DM. Scalp electrode impedance, infection risk, and EEG data quality. Clin Neurophysiol 2001;112:536–44.
Folkins JW, Canty JL. Movements of the upper and lower lips during speech: interactions between lips with the jaw fixed at different positions. J Speech Hear Res 1986;29:348–56.
Folkins JW, Zimmermann GN. Lip and jaw interaction during speech: responses to perturbation of lower-lip movement prior to bilabial closure. J Acoust Soc Am 1982;71:1225–33.
Gracco VL, Abbs JH. Dynamic control of the perioral system during speech: kinematic analyses of autogenic and nonautogenic sensorimotor processes. J Neurophysiol 1985;54:418–32.
Hain TC, Burnett TA, Kiran S, Larson CR, Singh S, Kenney MK. Instructing subjects to make a voluntary response reveals the presence of two components to the audio-vocal reflex. Exp Brain Res 2000;130:133–41.
Hawco CS, Jones JA, Ferretti TR, Keough D. ERP correlates of online monitoring of auditory feedback during vocalization. Psychophysiology 2009;46:1216–25.
Heinks-Maldonado TH, Mathalon DH, Gray M, Ford JM. Fine-tuning of auditory cortex during speech production. Psychophysiology 2005;42:180–90.
Houde JF, Jordan MI. Sensorimotor adaptation in speech production. Science 1998;279:1213–6.
Houde JF, Nagarajan SS, Sekihara K, Merzenich MM. Modulation of the auditory cortex during speech: an MEG study. J Cogn Neurosci 2002;14:1125–38.
Hyde KL, Peretz I, Zatorre RJ. Evidence for the role of the right auditory cortex in fine pitch resolution. Neuropsychologia 2008;46:632–9.
Jones JA, Munhall KG. The role of auditory feedback during phonation: studies of Mandarin tone production. J Phon 2002;30:303–20.
Korzyukov O, Sattler L, Behroozmand R, Larson CR. Neuronal mechanisms of voice control are affected by implicit expectancy of externally triggered perturbations in auditory feedback. PLoS One 2012;7:e41216.
Lane H, Tranel B. The Lombard sign and the role of hearing in speech. J Speech Hear Res 1971;14:677–709.
Larson CR. Cross-modality influences in speech motor control: the use of pitch shifting for the study of F0 control. J Commun Disord 1998;31:489–503.
Larson CR, Altman KW, Liu H, Hain TC. Interactions between auditory and somatosensory feedback for voice F(0) control. Exp Brain Res 2008;187:613–21.
Larson CR, Burnett TA, Bauer JJ, Kiran S, Hain TC. Comparisons of voice F0 responses to pitch-shift onset and offset conditions. J Acoust Soc Am 2001;110:2845–8.
Larson CR, Burnett TA, Kiran S, Hain TC. Effects of pitch-shift onset velocity on voice F0 responses. J Acoust Soc Am 2000;107:559–64.
Larson CR, Sun J, Hain TC. Effects of simultaneous perturbations of voice pitch and loudness feedback on voice F0 and amplitude control. J Acoust Soc Am 2007;121:2862–72.
Liu H, Behroozmand R, Bove M, Larson CR. Laryngeal electromyographic responses to perturbations in voice pitch auditory feedback. J Acoust Soc Am 2011a;129:3946–54.
Liu H, Larson CR. Effects of perturbation magnitude and voice F0 level on the pitch-shift reflex. J Acoust Soc Am 2007;122:3671–7.
Liu H, Meshman M, Behroozmand R, Larson CR. Differential effects of perturbation direction and magnitude on the neural processing of voice pitch feedback. Clin Neurophysiol 2011b;122:951–7.
Liu H, Zhang Q, Xu Y, Larson CR. Compensatory responses to loudness-shifted voice feedback during production of Mandarin speech. J Acoust Soc Am 2007;122:2405–12.
Liu P, Chen Z, Jones JA, Wang EQ, Chen S, Huang D, et al. Developmental sex-specific change in auditory-vocal integration: ERP evidence in children. Clin Neurophysiol 2013;124:503–13.
Liu P, Chen Z, Larson CR, Huang D, Liu H. Auditory feedback control of voice fundamental frequency in school children. J Acoust Soc Am 2010;128:1306–12.
Lofqvist A, Lindblom B. Speech motor control. Curr Opin Neurobiol 1994;4:823–6.
Macdonald EN, Goldberg R, Munhall KG. Compensations in response to real-time formant perturbations of different magnitudes. J Acoust Soc Am 2010;127:1059–68.
Munhall KG, Löfqvist A, Kelso JAS. Lip-larynx coordination in speech: effects of mechanical perturbations to the lower lip. J Acoust Soc Am 1994;95:3605–16.
Nashner L, McCollum G. The organization of human postural movements: a formal basis and experimental synthesis. Behav Brain Sci 1985;8:135–72.
Purcell DW, Munhall KG. Compensation following real-time manipulation of formants in isolated vowels. J Acoust Soc Am 2006;119:2288–97.
Rothwell JC, Traub MM, Marsden CD. Automatic and “voluntary” responses compensating for disturbances of human thumb movements. Brain Res 1982;248:33–41.
Shaiman S. Kinematic and electromyographic responses to perturbation of the jaw. J Acoust Soc Am 1989;86:78–88.
Shaiman S, Gracco VL. Task-specific sensorimotor interactions in speech production. Exp Brain Res 2002;146:411–8.
Siegel GM, Pick Jr HL. Auditory feedback in the regulation of voice. J Acoust Soc Am 1974;56:1618–24.
Zarate JM, Wood S, Zatorre RJ. Neural networks involved in voluntary and involuntary vocal pitch regulation in experienced singers. Neuropsychologia 2010;48:607–18.