BRAIN AND LANGUAGE 56, 321–333 (1997)
ARTICLE NO. BL971740
Voice Onset Time in Ataxic Dysarthria

HERMANN ACKERMANN AND INGO HERTRICH
Department of Neurology, University of Tübingen, Germany

In eight patients with a purely ataxic syndrome due to cerebellar atrophy, the voice onset time (VOT) of word-initial stop consonants was measured from the acoustic signal. The subjects had been asked to produce sentence utterances containing one of the German minimal pair cognates “Daten” (/datən/, “data”) and “Taten” (/tatən/, “deeds”). In addition, a master tape comprising the target words from patients and controls in randomized order was played to six listeners for perceptual evaluation. Two major findings emerged. First, the cerebellar subjects presented with a reduced categorical distinction between the VOT of voiced and unvoiced stop consonants. Second, the patients’ target words with an initial unvoiced plosive gave rise to a significantly increased number of misassignments at perceptual evaluation. To some extent, comparable VOT disruptions have been noted in apraxia of speech and basal ganglia disorders. Thus, different pathomechanisms might result in similar VOT abnormalities. © 1997 Academic Press
Address correspondence and reprint requests to Hermann Ackermann, M.D., M.A., Department of Neurology, University of Tübingen, Hoppe-Seyler-Str. 3, D-72076 Tübingen, Germany. Fax: (Germany) 7071-296507.
0093-934X/97 $25.00 Copyright © 1997 by Academic Press. All rights of reproduction in any form reserved.

INTRODUCTION
A variety of clinical and experimental data indicate that the cerebellum might represent an internal clock providing temporal computations in the motor, perceptual, and cognitive domains (Braitenberg, 1967; Keele & Ivry, 1990; Ivry & Baldo, 1992). For example, cerebellar patients show increased variability during rhythmic finger tapping in response to external stimuli (Ivry & Keele, 1989). These deficits can be attributed to impaired central pacemaking rather than disordered motor execution. Moreover, subjects with cerebellar dysfunction are less accurate in tasks requiring perceptual discrimination of time intervals indicated by tones or evaluation of the velocity of moving visual stimuli (Ivry & Keele, 1989; Ivry & Diener, 1991). Speech production includes the temporal coordination of respiratory, laryngeal, and orofacial muscle activities. For example, the release of vocal tract occlusion and the initiation of glottal vibrations, respectively, have to be
properly adjusted during the production of stop consonant–vowel sequences. Thus, the time lag between stop consonant burst and vowel onset, i.e., voice onset time (VOT), can be considered a measure of the timing of orofacial and laryngeal events (Keller, 1990). Lisker and Abramson (1964, 1967) found relatively discrete temporal domains of this measure for voiced and unvoiced stop consonants in normal speakers: the former have a shorter VOT (short-lag VOT) than their unvoiced cognates (long-lag VOT) or even present with a voicing lead, i.e., glottal vibrations start prior to the stop consonant release burst. In the initial position of isolated English words, for example, the VOT of /d/ usually amounts to less than 20 msec, whereas the respective /t/-productions have values exceeding 30 msec. There is, however, some overlap of the VOT of voiced and unvoiced plosives within sentence contexts. Similar data have been obtained from German speakers with respect to stop consonant production (Stock, 1971; Haag, 1979; Ziegler & von Cramon, 1986). Since VOT conveys phonological information, this durational parameter must be adequately controlled. Thus, the following hypothesis can be inferred: If the cerebellum represents an internal clock required for all temporal computations, impaired categorical segregation of short- and long-lag VOT in cerebellar subjects must be expected. Patients suffering from cerebellar degeneration may have lengthened VOT values (subjects A2–A4 in Kent, Netsell, & Abbs, 1979) as well as increased variation coefficients of word-medial unvoiced stop consonants (Hertrich & Ackermann, 1994b). However, the ability to produce distinct temporal VOT domains of voiced and unvoiced stops has not yet been examined in cerebellar disorders. 
Subjects with Broca’s aphasia or apraxia of speech, respectively, due to a lesion of the anterior language zones of the dominant hemisphere may show a significant overlap of the VOT of voiced and unvoiced stop consonant targets, giving rise to perceived voicing errors (Blumstein, Cooper, Zurif, & Caramazza, 1977; Freeman, Sands, & Harris, 1978; Blumstein, Cooper, Goodglass, Statlender, & Gottlieb, 1980; Itoh, Sasanuma, Tatsumi, Murakami, Fukusako, & Suzuki, 1982; Hoit-Dalgaard, Murry, & Kopp, 1983; Tuller, 1984; Ziegler & von Cramon, 1986; Blumstein, 1988). On these grounds the anterior language zones of the dominant hemisphere have been assumed to be prerequisite for the proper timing of articulatory gestures (Buckingham, 1991). This suggestion is not at variance with the cerebellar timing hypothesis. The neocerebellar cortex sends ascending fiber tracts via the dentate nucleus to the frontal lobe, which target, among others, the anterior language zones. On the other hand, the frontal cortex including Broca’s area projects to pontine nuclei in the brainstem which, in turn, represent a major source of afferent input to the cerebellum. Leiner, Leiner, and Dow (1993) suggest that these cortico-cerebello-cortical loops contribute to linguistic functions. For example, positron emission tomographic (PET) studies revealed activation of cerebellar structures during lexical processing (Petersen, Fox, Posner, Mintun, & Raichle, 1989). Besides semantic aspects of verbal performance, the reciprocal connections of neocerebellum and frontal cerebral cortex might participate in the computation of linguistically relevant time intervals. Assuming, therefore, that the anterior language zones of the left hemisphere are relevant for the timing of articulatory gestures and that the cerebellum represents a universal internal clock, disorders of the cerebellum should give rise to deficits in VOT similar to those observed with damage to the frontal language area. Recent studies on speech timing in basal ganglia disorders, however, provide some evidence that VOT does not depend exclusively upon the reciprocal connections between frontal cortex and cerebellum. In Parkinsonian patients, Lieberman, Kako, Friedman, Tajchman, Feldman, and Jiminez (1992) noted an overlap of the VOT of voiced and unvoiced stop consonants at syllable-initial position similar to that found in Broca’s aphasia. These authors attribute the observed VOT disruptions to impaired striatal pathways acting on the prefrontal area. Besides Parkinson’s disease, Huntington’s chorea represents another paradigm of basal ganglia dysfunction. A previous study from our laboratory noted a reduced mean VOT of unvoiced stop consonant targets in the latter disorder; the voiced cognates were not considered (Hertrich & Ackermann, 1994a). Thus, it can be expected that choreic syndromes give rise to an increased VOT overlap as well. There is some further evidence that the basal ganglia contribute to the temporal organization of speech utterances. For example, a subgroup of Parkinsonian patients presents with involuntary acceleration of syllable repetitions during oral diadochokinesis tasks (Logigian, Hefter, Reiners, & Freund, 1991; Ackermann, Gröne, Hoch, & Schönle, 1993). 
These alterations of syllabic pacing might affect spontaneous speech as well, since increased speaking rates have been noted in subjects with Parkinson’s disease (Darley, Aronson, & Brown, 1975). However, “hastened speech” in Parkinson’s disease probably reflects the release of tremor oscillations rather than impaired temporal computations caused by deficient clock mechanisms (Ackermann et al., 1993). The cerebellum and the basal ganglia, therefore, seem to contribute in a differential way to the temporal organization of speech utterances. In consideration of the available data, VOT overlap seems to represent a rather nonspecific sign of central motor articulatory disorders. However, reduced contrasts between voiced and unvoiced stops might reflect different distributional patterns of VOT: increased variability in the presence of unchanged means, as well as systematic shifts of the means without a concomitant increase of variability, can both give rise to VOT overlap. This suggestion raises the question of whether disorders of the cerebellum result in a distinct pattern of VOT distribution in terms of means and variability measures as compared to apraxia of speech and basal ganglia disorders, and whether the observed alterations can be interpreted as disturbed clock mechanisms.
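The two routes to VOT overlap named in the preceding paragraph can be made concrete with a small numerical sketch. The separation index used here (difference of the category means divided by their pooled standard deviation) is an illustrative assumption and not a measure taken from the study, and all VOT values are hypothetical.

```python
import math


def separation_index(mean_d, sd_d, mean_t, sd_t):
    """Distance between two VOT category means in units of their pooled SD.

    Illustrative measure only; assumes equal token counts per category.
    """
    pooled_sd = math.sqrt((sd_d ** 2 + sd_t ** 2) / 2.0)
    return abs(mean_t - mean_d) / pooled_sd


# Hypothetical values in msec, chosen only to illustrate the argument.
baseline = separation_index(mean_d=15.0, sd_d=3.0, mean_t=75.0, sd_t=8.0)
# Route 1: unchanged means, increased variability.
more_variable = separation_index(mean_d=15.0, sd_d=12.0, mean_t=75.0, sd_t=28.0)
# Route 2: unchanged variability, means shifted toward each other.
shifted_means = separation_index(mean_d=30.0, sd_d=3.0, mean_t=35.0, sd_t=8.0)

for label, value in [
    ("baseline", baseline),
    ("increased variability", more_variable),
    ("shifted means", shifted_means),
]:
    print(f"{label:>22}: {value:.2f}")
```

Both manipulations lower the index relative to the baseline, which is why an overlap alone does not reveal which distributional pattern produced it.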
TABLE 1
Clinical Data of the Patients with an Ataxic Syndrome Due to Cerebellar Atrophy

                                    Cerebellar signs
Subject  Age  Sex  Dur  Her   Gait  Arms  Oculomotor   Other disorders
CA1      30   F    3    -     2     1     1            -
CA2      56   F    3    -     2     1     1            PNP (legs)
CA3      56   M    5    -     1     1     1            -
CA4      57   M    5    -     2     1–2   1            -
CA5      71   F    5    -     2     1     1            PVWML
CA6      30   F    8    +     2     1     1            -
CA7      68   F    9    -     1     1     1            syrinx
CA8      65   M    23   +     2–3   2     1            -

Note. Age in years; Dur, disease duration (years); Her, heredity; 1, slight; 2, moderate; 3, severe; PNP, polyneuropathy; PVWML, ischaemic periventricular white matter lesions.
MATERIALS AND METHODS
Subjects
The present study considered purely ataxic syndromes due to cerebellar atrophy (CA), a feasible neuroanatomical model of cerebellar dysfunction. Table 1 shows the relevant clinical data of the eight subjects (CA1 to CA8) included. At clinical examination none of them had any signs of extracerebellar motor dysfunction. The integrity of the corticospinal tracts to distal arm and leg muscles was further confirmed by means of transcranial magnetic stimulation of the motor cortex. All patients showed atrophy of the cerebellum at nuclear magnetic resonance imaging (MRI). Extensive investigations of blood serum and cerebrospinal fluid were unremarkable in all subjects. Patients CA6 and CA8 had a definite family history compatible with autosomal-dominant cerebellar ataxia (ADCA III; Harding, 1984). The remaining subjects had a diagnosis of idiopathic cerebellar atrophy. All patients were fully able to cooperate. None had signs of cognitive decline at clinical examination. A former neuropsychological study including tests of planning abilities, verbal and nonverbal memory functions, and acquisition of cognitive skills had provided evidence that cerebellar atrophy does not give rise to significant impairments within these domains (Daum, Ackermann, Schugens, Reimold, Dichgans, & Birbaumer, 1993). Patient CA2 presented with clinical signs of polyneuropathy of the legs. However, nerve conduction measures were still within the normal range. MRI of the spinal cord revealed syringomyelia in patient CA7. This disorder was clinically silent. Most probably, it represents a distinct disease entity independent of cerebellar atrophy. A slightly reduced volume of the pons was suspected in subject CA2 from the MRI scans. Clinical examination and auditory-evoked potentials, however, did not reveal any deficits of brainstem functions. Subject CA5 presented with a few discrete hypodense regions within the periventricular white matter at MRI. 
Since this individual was unremarkable at neuropsychological tests and electrophysiological investigations, these lesions were considered silent lacunar infarctions. The remaining patients did not show any evidence of extracerebellar pathology. Using the dysarthric dimensions introduced by Darley and co-workers (1975), a certified speech pathologist, unacquainted with the results of the acoustic analyses, performed a perceptual evaluation of the recorded test materials and samples of spontaneous speech. All cerebellar subjects presented with articulatory impreciseness as well as with harsh and/or breathy voice. With the exception of three individuals (CA2, CA3, CA5) all of them showed slowed speech
rate at perceptual evaluation. Further dysarthric features included mild hypernasality (CA4, CA7), excessive loudness variation (CA6, CA8), and scanning speech in terms of ‘‘excess and equal stress’’ (CA6, CA8). None of the cerebellar patients had significantly reduced overall intelligibility during connected speech. Prolongation of syllable and utterance durations seems to represent a salient feature of cerebellar dysarthria (Ackermann & Hertrich, 1994). To obtain a quantitative parameter of the severity of patients’ dysarthria, the length of articulation test sentences was determined at the acoustic speech signal (see below). Normal data were obtained from 10 speakers (three men: age 56, 66, 74 years; seven women: age 26–72 years), who never had suffered from diseases of the central nervous system or the cranial nerves, respectively.
Speech Material
The present study considered the German nouns “Daten” (/datən/, “data”) and “Taten” (/tatən/, “deeds”) for investigation. These two lexical items represent a minimal pair with respect to the initial alveolar stop. In most German dialects the stop consonant /t/, when produced between vowels, can be characterized as aspirated and voiceless; /d/ represents the voiced cognate. Usually, the latter consonant is not produced with voicing lead, i.e., glottal vibrations preceding burst release, in southern German dialects. The stressed syllable of the target word, i.e., /da/ and /ta/, respectively, bears the main focus of the obtained test sentences (see below). “Daten” and “Taten” were printed in bold large letters on separate cards. In order to obtain data for a different study, three further items were added (“getVte,” V = /a/, /i/, /u/). These latter stimuli represent a part of the articulation test used in former investigations on speech rate and rhythm in neurological dysarthrias (Ackermann & Hertrich, 1993, 1994; Hertrich & Ackermann, 1994a). Within the context of the present study these items were used for the determination of utterance durations. The latter parameter was considered an objective measure of speech tempo.
Recording Procedure
The experimenter presented 10 cards of each of the five stimuli in a quasi-randomized order. Subjects had to produce the visually shown target word embedded into the carrier phrase “Ich habe . . . gelesen” (“I have read . . .”). Thus, 10 productions of each of the five test sentences were obtained. Speech examination was done in a sound-treated room using a DAT recorder (Sony PCM 2000) and a condenser microphone (Sony C-48). The mouth–microphone distance amounted to about 20 cm. Prior to analysis, the recordings were bandpass filtered (50–8000 Hz), digitized at a sampling rate of 20 kHz, and stored on a personal computer (486 IBM-compatible).
Measurement Procedure
Acoustic analysis was performed by means of commercially available software (Computerized Speech Lab CSL 4300, Kay Elemetrics Corp., USA). For segmentation the speech signal was displayed on a PC monitor screen using a horizontal resolution of 1 msec and a maximal vertical resolution of 16 bit. The acoustic events considered for analysis were visually identified with the support of auditory playback and marked by a cursor.
Perceptual Evaluation
All recorded utterances were checked by the experimenter (HA) with respect to dysfluencies such as iterations or audible inspirations, incomplete stop consonant production, and devoicing
of vocalic intervals. In order to evaluate whether aberrant VOT values give rise to perceived sound errors, a master tape was created comprising the words Daten and Taten both from patients and controls in randomized order. A single target word from patient CA4 included a speech error with consecutive self-correction (“Ta . . . äh . . . Daten”) and, therefore, had to be discarded from analysis. The target words did not comprise iterations, audible inspirations, or devoiced vocalic segments. Thus, the master tape comprised 179 Daten- and 180 Taten-utterances. All word-initial stop consonants were preceded by intervals without acoustic energy both at visual and auditory evaluation. The occurrence of complete occlusions prior to the target words might have been due to the fact that this segment represents an interword pause. In order to avoid a bias of the listeners, the stimuli obtained from the patients and controls were not blocked by group but combined to a single sequence. Three certified speech pathologists, a clinical linguist, and a psychologist, unacquainted with the patients and controls, as well as the second author assessed the utterances and noted the perceived item. The master tape was presented by means of a loudspeaker adjusted to comfortable loudness. As a measure of perceived voicing contrast the percentages of /d/-targets classified as /t/-productions and vice versa were determined. Since the ratings of the second author corresponded to those of the remaining listeners, results were pooled across these six subjects.
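The perceived voicing contrast described above is a simple proportion of pooled listener judgements. A minimal sketch, assuming each judgement is stored as an (intended word, perceived word) pair; the function name, the data layout, and the example counts are hypothetical and not taken from the study.

```python
def misassignment_percentage(judgements, target):
    """Percentage of judgements on `target` stimuli that named the other word."""
    perceived = [heard for intended, heard in judgements if intended == target]
    if not perceived:
        raise ValueError(f"no judgements for target {target!r}")
    errors = sum(1 for heard in perceived if heard != target)
    return 100.0 * errors / len(perceived)


# Hypothetical example: 10 Taten tokens of one speaker rated by 6 listeners
# (60 pooled judgements), 26 of which were heard as Daten.
judgements = [("Taten", "Daten")] * 26 + [("Taten", "Taten")] * 34
err_t = misassignment_percentage(judgements, "Taten")
print(f"Err/t/ = {err_t:.1f}%")
```

Pooling across listeners before dividing, as done here, treats every judgement equally, which matches the statement that results were pooled across the six listeners.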
Acoustic Parameters
In accordance with the definition of Lisker and Abramson (1964), VOT of the initial alveolar stop of the target words Daten and Taten was determined as the time interval between the beginning of the burst and vowel onset. These measurements were performed in line with previous studies (Ackermann & Hertrich, 1993; Hertrich & Ackermann, 1994a). In order to detect possible systematic shifts of these parameters, the individual means of the VOT of /d/ and /t/, respectively, were computed. Moreover, two measures of the categorical contrast between the VOT of voiced and unvoiced alveolar stops were considered: (a) the individual differences between mean VOT of /d/ and mean VOT of /t/; (b) the F value from an analysis of variance performed with the VOT data of each subject separately. The F values can be considered a quantitative parameter of categorical distinction since this measure reflects both the distance between the individual means and the respective variabilities. Slowed speech in terms of prolonged acoustic segments has been noted in cerebellar dysarthria (Kent et al., 1979; Ackermann & Hertrich, 1994). Therefore, durational parameters might be considered a quantitative measure of the severity of dysarthria. The length of the articulation test sentences “Ich habe getVte gelesen” (V = /a/, /u/, /i/) was determined from the onset of vowel /a/ in “habe” to the beginning of the first vowel /e/ in “gelesen.” The mean of the 30 utterances obtained from each subject provided the parameter “utterance duration.” A former investigation using similar speech material had determined measurement reliability (Hertrich & Ackermann, 1994a). In patients with Huntington’s chorea an error of about the same order of magnitude occurred as in control speakers (measurement error of VOT in normals, 2.3 msec; measurement error in patients with Huntington’s chorea, 2.1 msec).
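The per-subject contrast measures defined above (mean VOT of /d/ and /t/, their difference, and the F value of a one-way analysis of variance with the two targets as groups) can be sketched as follows. This is a minimal illustration using the Python standard library; the function name and the VOT values are hypothetical, and the study's own software and data are not reproduced here.

```python
from statistics import mean, variance


def vot_contrast(vot_d, vot_t):
    """Means, mean difference, and one-way ANOVA F for one speaker's VOT values."""
    groups = [list(vot_d), list(vot_t)]
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(groups[0] + groups[1])
    # Between-group sum of squares: group size times squared distance of the
    # group mean from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares, recovered from the sample variances.
    ss_within = sum((len(g) - 1) * variance(g) for g in groups)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return {
        "mean_d": mean(groups[0]),
        "mean_t": mean(groups[1]),
        "diff": mean(groups[1]) - mean(groups[0]),
        "f_value": f_value,
    }


# Hypothetical VOT values in msec for 10 productions of each target word.
clear = vot_contrast(
    vot_d=[12, 15, 14, 18, 16, 13, 17, 15, 14, 16],
    vot_t=[70, 82, 75, 68, 90, 77, 73, 85, 79, 71],
)
overlapping = vot_contrast(
    vot_d=[22, 35, 28, 41, 30, 19, 33, 38, 26, 31],
    vot_t=[30, 25, 44, 36, 29, 39, 27, 42, 33, 35],
)
print(clear)
print(overlapping)
```

Because F grows with the distance between the two means and shrinks with the within-category scatter, a single number summarizes both aspects of the contrast, as stated in the text.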
Statistical Analysis
Since, as a rule, the data considered were not normally distributed, Kruskal–Wallis analysis of variance by ranks was performed in order to test for differences between the CA and the control group. Statistical analysis included seven parameters. Under these conditions comparisons may achieve significance by chance. Therefore, the Bonferroni–Holm correction for the adjustment of significance levels was applied [α* = α/(k − i + 1) (k = number of analyses performed, i = ith lowest significance value p)]. Considering an alpha-level of .05, the lowest of the seven p values has to be < .05/7 (= .0071) in order to be significant, the second lowest < .05/6 (= .0083), the third lowest < .05/5 (= .01), and so on.

TABLE 2
Perceptual and Acoustic Data Obtained from Patients with Cerebellar Atrophy (CA) and Control Subjects

Subject  Err/d/  Err/t/  VOT/d/  SD     VOT/t/  SD     F value    Uttdur
CA5      0.0     0.0     13.2    2.0    52.0    13.3   178.45     0.904
CA2      0.0     1.6     20.0    7.0    80.1    24.5a  104.05     0.977
CA3      10.0a   10.0a   23.4a   10.1a  47.4    20.4a  14.69b     1.085
CA1      0.0     0.0     22.7a   3.3    98.4    7.6    777.20     1.197
CA7      16.6a   20.0a   20.1    14.0a  57.4    29.7a  20.70b     1.559a
CA4      0.0     88.8a   15.0    0.9    33.5b   30.2a  10.02b     1.737a
CA6      1.6     43.3a   30.0a   4.2    46.3b   15.1   15.90b     1.989a
CA8      3.3a    98.3a   31.2a   6.5    29.4b   7.0    0.39b      3.037a

Controls (N = 10)
Min      0.0     0.0     13.4    1.6    47.0    3.7    90.13      0.695
Max      1.6     5.0     21.2    7.2    103.1   16.4   1,763.23   1.347
Mean     0.7     0.5     16.7    3.2    76.3    8.5    689.68     0.977
SD       0.9     1.6     3.2     1.9    20.9    4.0    551.63     0.184

Note. Err/d/ (Err/t/), /d/-targets (/t/-targets) perceived as /t/-targets (/d/-targets) in percentage of evaluated stimuli; VOT/t/ (VOT/d/), mean VOT value of the /t/-targets (/d/-targets) in msec; SD, standard deviation; Uttdur, utterance duration; Min (Max), lower (upper) limit of the normal range. Note ranking of subjects according to mean utterance duration.
a Above normal range.
b Below normal range.
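The Kruskal–Wallis statistic and the Bonferroni–Holm rule described under Statistical Analysis can be sketched as follows, using the values reported in Table 3. The function names are hypothetical. The H formula below omits the correction for tied ranks, so it is applied only to the F-value row; rows with many tied scores, such as the error percentages, would presumably need that correction. The patient mean rank of 5.625 is an assumption derived from the fixed rank total (it is printed as 5.62 in Table 3).

```python
import math


def kruskal_wallis_h(mean_ranks, sizes):
    """Kruskal-Wallis H from group mean ranks and group sizes (no tie correction)."""
    n_total = sum(sizes)
    expected_rank = (n_total + 1) / 2.0
    spread = sum(n * (r - expected_rank) ** 2 for r, n in zip(mean_ranks, sizes))
    return 12.0 / (n_total * (n_total + 1)) * spread


def chi_square_upper_tail_df1(x):
    """P(X > x) for a chi-square variable with one degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def holm_significant(p_values, alpha=0.05):
    """Bonferroni-Holm step-down: map each variable to True if it stays significant."""
    ordered = sorted(p_values.items(), key=lambda item: item[1])
    k = len(ordered)
    decisions = {}
    rejecting = True
    for i, (name, p) in enumerate(ordered, start=1):
        # The i-th smallest p value is compared with alpha / (k - i + 1);
        # once one comparison fails, all larger p values fail as well.
        rejecting = rejecting and p < alpha / (k - i + 1)
        decisions[name] = rejecting
    return decisions


# F-value row of Table 3: 10 controls, 8 patients, control mean rank 12.60.
n_controls, n_patients = 10, 8
n_all = n_controls + n_patients
control_mean_rank = 12.60
# Assumption: all ranks sum to N(N + 1)/2, which fixes the patient mean rank.
patient_mean_rank = (n_all * (n_all + 1) / 2 - control_mean_rank * n_controls) / n_patients

h_f_value = kruskal_wallis_h([control_mean_rank, patient_mean_rank], [n_controls, n_patients])
p_f_value = chi_square_upper_tail_df1(h_f_value)
print(f"H = {h_f_value:.4f}, p = {p_f_value:.4f}")

# p values as reported in Table 3.
table3_p = {
    "VOT/d/": 0.0684, "VOT/t/": 0.0621, "VOT diff": 0.0330, "F-value": 0.0059,
    "Uttdur": 0.0209, "Err/d/": 0.3232, "Err/t/": 0.0047,
}
decisions = holm_significant(table3_p)
print(sorted(name for name, keep in decisions.items() if keep))
```

With two groups the statistic has one degree of freedom, which is why the complementary error function suffices for the p value in this sketch.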
RESULTS
Perceptual Evaluation of Target Utterances
Table 2 (second and third column) provides the results of the perceptual ratings in percentages of misassignments. Only few voicing errors occurred with respect to the /d/- and /t/-productions of the control subjects. Three patients (CA1, CA2, CA5) were also largely unremarkable in these regards. Perceived voicing errors affected the /d/- and /t/-targets to a similar extent in two individuals (CA3, CA7). The remaining cerebellar subjects presented with a different pattern: their /d/-productions were almost correctly identified whereas the utterances with unvoiced alveolar stop showed a considerably increased number of misassignments (43.3–98.3%). Group comparison using Kruskal–Wallis analysis of variance by ranks revealed a significant effect with respect to perceived voicing errors of the /t/-targets (Table 3).

Speech Tempo (Utterance Durations)
With respect to the parameter “utterance duration,” no significant differences emerged between controls and patients. However, four patients (CA4,
TABLE 3
Kruskal–Wallis Analysis of Variance by Ranks

                      Mean rank
Variable   Controls   Cerebellar subjects   ChiQ      p value
VOT/d/     7.45       12.06                 3.3212    0.0684
VOT/t/     11.60      6.87                  3.4816    0.0621
VOT diff   11.90      6.50                  4.5474    0.0330
F-value    12.60      5.62                  7.5868    0.0059*
Uttdur     6.90       12.75                 5.3368    0.0209
Err/d/     8.50       10.75                 0.97577   0.3232
Err/t/     6.70       13.00                 8.0075    0.0047*

Note. ChiQ, χ² test; *significant at level p < .05; for further abbreviations, see Table 2.
CA6, CA7, CA8) had reduced speech tempo in terms of sentence lengths exceeding the normal range.

Acoustic Measures of the Voiced/Unvoiced Contrast
The individual mean VOT values of the targets /d/ and /t/ as well as the corresponding standard deviations and F values are shown in Table 2. Kruskal–Wallis analysis yielded a significant group difference (controls versus patients) for the computed F values indicating reduced categorical VOT distinction between voiced and unvoiced stops (Table 3). Acoustic measurements of utterance durations indicated dysarthric deficits of varying severity in the cerebellar group. Therefore, individual VOT distributions were analyzed in more detail. Three subjects (CA1, CA2, CA5) presented with a clear-cut segregation of the VOT of the /d/- and /t/-targets (Fig. 1). Accordingly, these patients had the highest F values. The displays of each of the five remaining patients showed an overlap of VOT (Fig. 1). All the latter subjects had a smaller F value than each of the controls. As concerns the patients with VOT overlap, different distributional patterns in terms of mean values and standard deviations can be noted. For example, two patients (CA6, CA8) presented with an increased mean VOT of /d/ concomitant with a reduced mean of the /t/-productions. The standard deviations were, however, still within the normal range. Subject CA8 even showed a rather complete assimilation of the voiced and unvoiced stop consonants. In contrast, patients CA3 and CA7 had increased standard deviations of the VOT both of voiced and unvoiced stops in the presence of largely unaltered mean VOT values.

FIG. 1. Individual distributions of voice onset time (VOT) from word-initial voiced (= VOT of /d/; filled columns) and unvoiced alveolar stops (= VOT of /t/; unfilled columns), respectively, in eight patients suffering from cerebellar atrophy (CA1–CA8). VOT is plotted in milliseconds on the abscissa. The ordinate provides the number of VOT occurrences during intervals of 10 ms (e.g., 20 ms = interval 15–24 ms, 40 ms = interval 35–44 ms; the columns in between refer to the interval 25–34 ms). Patients’ displays are ordered, from the top of the left panel to the bottom of the right panel, according to mean utterance durations.

Comparison of Perceptual and Acoustic Data
Three CA subjects (CA1, CA2, CA5) had distinct temporal domains of the VOT of /d/ and /t/. In accordance with these findings, their target productions were correctly identified by the listeners. These three individuals belong to the four least affected ones with respect to severity of dysarthria in terms of utterance duration. Thus, it might be assumed that in these instances the cerebellar disorder did not yet interfere with VOT production. As concerns the patients with VOT overlap, comparison of the perceptual and acoustic data, tentatively, allows the differentiation of two patterns of stop consonant deviations: two subjects (CA3, CA7) produced perceived errors of /d/- and /t/-targets to a similar degree concomitant with increased
variability of short- and long-lag VOT; the remaining three patients (CA4, CA6, CA8) had almost exclusively /t/-target misassignments in the presence of a long-lag VOT below the normal range. With respect to utterance durations, subjects CA3 and CA7 were less impaired than the latter three patients. Moreover, the most dysarthric individual (CA8) showed a complete assimilation of voiced and unvoiced stops. Thus, conceivably, these various patterns of perceptual and acoustic findings in cerebellar disorders might represent subsequent stages of speech dissolution.

DISCUSSION
The present study revealed a reduced contrast between voiced and unvoiced word-initial alveolar stops in subjects suffering from cerebellar atrophy. Disruption of long-lag VOT primarily seems to account for these findings. At first glance, a differential influence of cerebellar disorders on voiced and unvoiced stops is at variance with the assumption of cerebellar clock mechanisms. To obtain distinct temporal domains of short- and long-lag VOT values, the former must be adequately controlled as well. However, the production of voiced and unvoiced stops differs with respect to the underlying laryngeal mechanisms. Cooper (1977) assumed that unvoiced stops pose higher demands on temporal control than their voiced cognates. During production of long-lag VOT the glottis must be in a spread position at the time of burst release. Perceived aspiration is due to the resulting glottal airflow turbulences. Prior to burst release, therefore, the vocal folds have to be abducted (laryngeal devoicing gesture). Their subsequent adduction has to be triggered, as suggested by Cooper, within a relatively small time interval in order to produce a fairly constant VOT. In contrast, the time point of vocal fold adduction seems to be less critical with respect to short-lag VOT. Moreover, photoelectric glottographic investigations indicate that voiced unaspirated plosives between two vowels generally are produced with a closed glottis (Dixit, 1989). Presumably, VOT of voiced stops then simply reflects the aerodynamic coupling between burst release and vocal fold vibrations; i.e., the abrupt increase of the difference between sub- and supraglottal pressure following the opening of the vocal tract initiates vibrations of the vocal folds. Thus, the findings of a differential influence of cerebellar dysfunctions on the VOT of voiced and unvoiced stops are compatible with the assumption of disordered clock mechanisms. 
Admittedly, Weismer (1980) has provided acoustic and conceptual evidence that the production of voiceless stops does not require precise temporal control of the adduction gesture preceding initiation of vocal fold vibrations. He suggests that the overall time course of the abduction–adduction sequence follows, rather, a preprogrammed order. On the grounds of this model, therefore, the time point of adduction does not represent a controlled variable. Given a preprogrammed time course of the laryngeal abduction–adduction sequence, nevertheless, the onset of the devoicing gesture has to be temporally coordinated with supralaryngeal vocal
tract opening. Even under these conditions the production of unvoiced stops, thus, requires more complex temporal coordination as compared to the voiced cognates. It cannot be excluded, however, that pathomechanisms other than disordered temporal computations are also relevant for the observed VOT disruptions in cerebellar disorders. Using the technique of cineradiography, Kent and Netsell (1975), for example, found reduced velocity of most orofacial movements in a patient suffering from cerebellar degeneration. Bradykinesia does not necessarily reflect deficits of temporal computations but could be due to impaired force generation. Patients with apraxia of speech may show shifts of mean VOT comparable to those observed in cerebellar disorders (Freeman et al., 1978, Figs. 1–3; Blumstein et al., 1977, Fig. 6; Blumstein et al., 1980, Fig. 1). Moreover, voicing errors in aphasics with articulatory deficits particularly represent substitutions of an unvoiced sound by a voiced one (Shankweiler & Harris, 1966). The findings of similar VOT disruptions following disorders of the anterior language zones and the cerebellum are compatible with the notion that both structures contribute to speech timing via their reciprocal fiber connections. Comparable alterations of VOT have been noted, however, in Parkinson’s disease and can be expected in Huntington’s chorea (see Introduction). Thus, either the basal ganglia cooperate with the anterior language zones and the cerebellum as concerns the temporal organization of speech utterances or different pathomechanisms may result in similar VOT disruptions. Besides a shift of mean VOT, disturbed clock mechanisms can be expected to give rise to increased individual variability of durational parameters. Four of the cerebellar subjects showed enlarged standard deviations of long-lag VOT across 10 sentence productions in comparison with the control group. 
It is noteworthy that most individuals with increased variability did not present with a shift of mean VOT. Conceivably, cerebellar dysfunctions interfere with speech production by a variety of mechanisms (see above). Alternatively, increased variability of VOT and shifts of the respective means, which ultimately may develop into complete assimilation of voiced and unvoiced stops, may represent successive stages of speech dissolution in cerebellar disorders. In summary: Cerebellar disorders, first, give rise to different individual VOT distributions in terms of means and variability measures. Second, at least partially, similar VOT disruptions seem to occur in dysfunctions of the basal ganglia and the anterior language zones. Either all these cerebral structures, i.e., frontal cortex, cerebellum, and basal ganglia, interact during the computation of temporal relationships or a variety of pathomechanisms result in similar deviations of VOT.

REFERENCES
Ackermann, H., Gröne, B. F., Hoch, G., & Schönle, P. W. 1993. Speech freezing in Parkinson’s disease: A kinematic analysis of orofacial movements by means of electromagnetic articulography. Folia Phoniatrica, 45, 84–89.
Ackermann, H., & Hertrich, I. 1993. Dysarthria in Friedreich’s ataxia: Timing of speech segments. Clinical Linguistics and Phonetics, 7, 75–91. Ackermann, H., & Hertrich, I. 1994. Speech rate and rhythm in cerebellar dysarthria: An acoustic analysis of syllabic timing. Folia Phoniatrica et Logopaedica, 46, 70–78. Blumstein, S. E. 1988. Approaches to speech production deficits in aphasia. In F. Boller & J. Grafman (Eds.), Handbook of neuropsychology (Vol. 1). Amsterdam: Elsevier. Pp. 349–365. Blumstein, S. E., Cooper, W. E., Goodglass, H., Statlender, S., & Gottlieb, J. 1980. Production deficits in aphasia: A voice-onset time analysis. Brain and Language, 9, 153–170. Blumstein, S. E., Cooper, W. E., Zurif, E. B., & Caramazza, A. 1977. The perception and production of voice-onset time in aphasia. Neuropsychologia, 15, 371–383. Braitenberg, V. 1967. Is the cerebellar cortex a biological clock in the millisecond range? Progress in Brain Research, 25, 334–346. Buckingham, H. W., Jr. 1991. Explanations for the concept of apraxia of speech. In T. Sarno (Ed.), Acquired aphasia (2nd ed.). New York: Academic Press. Pp. 271–312. Cooper, W. E. 1977. The development of speech timing. In S. J. Segalowitz & F. A. Gruber (Eds.), Language development and neurological theory. New York: Academic Press. Pp. 357–373. Darley, F. L., Aronson, A. E., & Brown, J. R. 1975. Motor speech disorders. Philadelphia: Saunders. Daum, I., Ackermann, H., Schugens, M. M., Reimold, C., Dichgans, J., & Birbaumer, N. 1993. The cerebellum and cognitive functions in humans. Behavioral Neuroscience, 107, 411–419. Dixit, R. P. 1989. Glottal gestures in Hindi plosives. Journal of Phonetics, 17, 213–237. Freeman, F. J., Sands, E. S., & Harris, K. S. 1978. Temporal coordination of phonation and articulation in a case of verbal apraxia: A voice onset time study. Brain and Language, 6, 106–111. Haag, W. K. 1979. An articulatory experiment on voice onset time in German stop consonants. Phonetica, 36, 169–181. 
Harding, A. E. 1984. The hereditary ataxias and related disorders. Edinburgh: Churchill Livingstone.
Hertrich, I., & Ackermann, H. 1994a. Acoustic analysis of speech timing in Huntington’s disease. Brain and Language, 47, 182–196.
Hertrich, I., & Ackermann, H. 1994b. Voice onset time in neurological dysarthrias. In R. Aulanko & A.-M. Korpijaakko-Huuhka (Eds.), Proceedings of the Third Congress of the International Clinical Phonetics and Linguistics Association. Publications of the Department of Phonetics (Vol. 39). Helsinki: Department of Phonetics, University of Helsinki. Pp. 59–66.
Hoit-Dalgaard, J., Murry, T., & Kopp, H. 1983. Voice onset time production and perception in apraxic subjects. Brain and Language, 20, 329–339.
Itoh, M., Sasanuma, S., Tatsumi, I. F., Murakami, S., Fukusako, Y., & Suzuki, T. 1982. Voice onset time characteristics in apraxia of speech. Brain and Language, 17, 193–210.
Ivry, R. B., & Keele, S. W. 1989. Timing functions of the cerebellum. Journal of Cognitive Neuroscience, 1, 136–152.
Ivry, R. B., & Diener, H. C. 1991. Impaired velocity perception in patients with lesions of the cerebellum. Journal of Cognitive Neuroscience, 3, 355–366.
Ivry, R. B., & Baldo, J. V. 1992. Is the cerebellum involved in learning and cognition? Current Opinion in Neurobiology, 2, 212–216.
Keele, S. W., & Ivry, R. B. 1990. Does the cerebellum provide a common computation for diverse tasks? A timing hypothesis. Annals of the New York Academy of Sciences, 608, 179–211.
Keller, E. 1990. Speech motor timing. In W. J. Hardcastle & A. Marchal (Eds.), Speech production and speech modelling. Dordrecht, The Netherlands: Kluwer. Pp. 343–364.
Kent, R. D., & Netsell, R. 1975. A case study of an ataxic dysarthric: Cineradiographic and spectrographic observations. Journal of Speech and Hearing Disorders, 40, 115–134.
Kent, R. D., Netsell, R., & Abbs, J. H. 1979. Acoustic characteristics of dysarthria associated with cerebellar disease. Journal of Speech and Hearing Research, 22, 627–648.
Leiner, H. C., Leiner, A. L., & Dow, R. S. 1993. Cognitive and language functions of the human cerebellum. Trends in Neurosciences, 16, 444–447.
Lieberman, P., Kako, E., Friedman, J., Tajchman, G., Feldman, L. S., & Jiminez, E. B. 1992. Speech production, syntax comprehension, and cognitive deficits in Parkinson’s disease. Brain and Language, 43, 169–189.
Lisker, L., & Abramson, A. S. 1964. A cross-language study of voicing in initial stops: Acoustical measurements. Word, 20, 384–422.
Lisker, L., & Abramson, A. S. 1967. Some effects of context on voice onset time in English stops. Language and Speech, 10, 1–28.
Logigian, E., Hefter, H., Reiners, K., & Freund, H.-J. 1991. Does tremor pace repetitive voluntary motor behavior in Parkinson’s disease? Annals of Neurology, 30, 172–179.
Petersen, S. E., Fox, P. T., Posner, M. I., Mintun, M., & Raichle, M. E. 1989. Positron emission tomographic studies of the processing of single words. Journal of Cognitive Neuroscience, 1, 153–170.
Shankweiler, D., & Harris, K. S. 1966. An experimental approach to the problem of articulation in aphasia. Cortex, 2, 277–292.
Stock, D. 1971. Untersuchungen zur Stimmhaftigkeit hochdeutscher Phonemrealisationen. Hamburg: Buske.
Tuller, B. 1984. On categorizing aphasic speech errors. Neuropsychologia, 22, 547–557.
Weismer, G. 1980. Control of the voicing distinction for intervocalic stops and fricatives: Some data and theoretical considerations. Journal of Phonetics, 8, 427–438.
Ziegler, W., & von Cramon, D. 1986. Timing deficits in apraxia of speech. European Archives of Psychiatry and Neurological Sciences, 236, 44–49.