Journal of Phonetics (1992) 20, 411-440
Patterns of voicing-conditioned vowel duration in French and English Christiane Laeufer Department of French and Italian, Ohio State University, Columbus, OH 43210-1229, U.S.A. Received 17th July 1989, and in revised form 30th September 1991
Because the variation in vowel duration conditioned by following obstruent voicing has been found to be much greater in English than in other languages, a phonological rule of vowel-lengthening has been proposed for English. This conclusion, however, has been based on studies that do not take into account a number of other factors which influence vowel duration. A comparison of vowel duration in English and French monosyllables shows that, particularly sentence-finally and medially under focus, when inter-language differences in final obstruent realization are taken into account, obstruent voicing has a substantial influence on vowel duration in both languages. In words occurring in unstressed medial position, the variation in vowel duration is much smaller, but equally so in both languages, once inter-language differences in prosodic structure and syllable structure are taken into account. These results suggest that languages universally exhibit fairly similar, physiologically conditioned, voicing-dependent variation in vowel durations. Under certain conditions of word length, syllabic structure, stress, position in word and utterance, and speaking tempo, voicing-induced differences are enhanced, while under others they are obscured. A general phonetic account in terms of timing control thus seems more appropriate than a language-specific phonological rule, even for a language like English.
0095-4470/92/040411 + 30 $08.00/0 © 1992 Academic Press Limited
1. Introduction
Earlier studies have shown that vowels are generally longer before voiced than before voiceless obstruents. They also suggest that the magnitude of this durational difference is much larger in English than in other languages. Thus, in their measurements from word list readings, Zimmerman & Sapon (1958) found a ratio of vowel durations in voiceless over voiced context of 86% for Spanish, compared to 63% for English (such ratios are the usual way of calculating the voicing effect on vowel duration). Chen's (1970) results from word list readings revealed ratios of 61% for English, as opposed to 78% for Korean, 82% for Russian, and 87% for French. Based on these and other single-language studies (e.g. House, 1961), it is commonly assumed that obstruent voicing is a universally important factor influencing the duration of preceding vowels and that English, but not the other languages examined, exploits this natural tendency by using it as the basis for a low-level phonological rule of vowel lengthening before voiced obstruents. That is, it is claimed that English, and English alone, has a rule of the form V → V: / __ voiced obstruent (e.g. Chomsky & Halle, 1968). Most of these earlier comparisons, however, do not take into account a number of confounding factors, such as the length of the test words, the immediate segmental environment, or the presence in some of the languages of an obstruent devoicing rule which tends to neutralize the voicing distinction word-finally (Delattre, 1962; Mack, 1982). Moreover, comparisons between languages have usually been based on separate studies of each language, ignoring factors such as different positions in the word and utterance, as well as different speaking modes, which have large effects on vowel duration and segment duration in general. Mack's (1982) comparison between French and English monosyllabic words read in isolation does take these factors into account. Since she nevertheless found a substantial difference between the English (53%) and French (74%) ratios, Mack suggests there may indeed exist a learned vowel lengthening rule before voiced obstruents in English. This conclusion, however, may still be premature. Studies of English suggest that the effect of obstruent voicing on preceding vowel duration is highly variable. It is largest in utterance- or phrase-final context and significantly reduced in medial context (Cooper & Danly, 1981; Luce & Charles-Luce, 1985). It is much smaller in polysyllabic words than in monosyllables (Sharf, 1962; Klatt, 1973; Harris & Umeda, 1974; Port, 1981), and smaller in spontaneous connected speech than in read word or phrase lists, to the point of being practically non-existent in non-phrase-final and non-prepausal position (Umeda, 1975; Klatt, 1973, 1975, 1976; Crystal & House, 1982, 1988).
There is also a significant interaction between vowel identity and the voicing ratio, the effect of obstruent voicing being smaller for inherently shorter vowels than for longer ones (Peterson & Lehiste, 1960; Sharf, 1962; Cooper & Danly, 1981; Crystal & House, 1982; Luce & Charles-Luce, 1985). The voicing effect also decreases as speaking rate is increased (Port, 1977, 1981; Harris & Umeda, 1974). Furthermore, the effect is significantly more pronounced in the context of fricatives than stops (House & Fairbanks, 1953; Peterson & Lehiste, 1960). Phonetic realization of postvocalic consonants also plays a role: in particular, flapping of /t/ to [ɾ̥] and /d/ to [ɾ] in American English has been found to greatly reduce differential vowel duration (Fox & Terbeek, 1977). Finally, stress and syllable structure play a role: Davis & Summers (1989) report little and, in five cases out of nine, statistically insignificant vowel lengthening as a function of consonant voicing in unstressed syllables before heterosyllabic obstruents. Similar contextual variability in the magnitude of the voicing effect is apparent in French. The effect on vowel duration is larger utterance-, sentence- and phrase-finally than medially; it is also more pronounced in stressed than in unstressed position (Landschultz, 1967; Wajskop, 1978). The voicing effect is also smaller in connected speech than in word or phrase list readings (O'Shaughnessy, 1981, 1984; Bartkova & Sarin, 1987). Furthermore, there is a reduction in differential vowel duration from mono- to polysyllables (O'Shaughnessy, 1981). Intrinsic vowel duration increases with aperture and so do voicing-induced duration differences (Delattre, 1939; Durand, 1946; Di Cristo, 1980; Durand, 1985). Voiced fricatives are commonly described by French linguists as having the greatest lengthening influence on preceding vowels, and the voicing effect is stronger in the context of fricatives than stops (e.g. Delattre, 1939, 1966; Belasco, 1953).
Obstruent devoicing in assimilation contexts often brings about some shortening of preceding vowels and
a reduction in the voicing effect (Thorsen, 1966; Debrock, 1977). Finally, differential vowel duration is also reduced when the following consonants are not tautosyllabic with the vowels (Landschultz, 1967). The consonant voicing effect on vowel duration thus appears to be highly context-sensitive in both languages. The high degree of variability in English casts doubt upon an account in terms of an either/or phonological rule of vowel lengthening and suggests an account in terms of a very low-level phonetic gradient similar to the one assumed to be at work in the other languages. The shared context-dependent effects on differential vowel duration are offset by a number of language-dependent effects. Indeed, while Mack's (1982) comparison of French and English vowel duration in the context of following stops remedies some of the shortcomings of earlier studies, it fails to consider certain other language-specific conditioning factors which may be at least partially responsible for the important cross-linguistic difference found. One relevant factor is language-specific vowel duration. Most English stressed vowels are much longer than their closest French equivalents (Delattre, 1951, 1965). At least in English, greater vowel length has been shown to accentuate any vowel duration differences, and thus to lower the ratios of the durations in the voiceless vs. voiced contexts. Since there do not exist enough (near-)homophonous pairs in French and English with short/lax English vowels, testing should be extended to unstressed phrasal contexts. Differences in the phonetic realization of postvocalic consonants, including closure and release intervals, constitute another potentially important factor influencing the preceding vowel duration effect. In prepausal position, English final stops are often reported to have short closures and tend to lack audible release bursts (Rositzke, 1943; Umeda & Coker, 1974; Kahn, 1976; Crystal & House, 1982).
French stops, on the other hand, have long closures and strong aspirated releases (Delattre, 1951, 1953; Malecot & Lindheimer, 1966; Kohler, 1979, 1984). Furthermore, as Mack (1981) notes for stops and Flege & Hillenbrand (1986) for fricatives, prepausal voiced obstruents are commonly followed by a vocalic release in French, but not in English. It is possible that these embryonic schwas make French final obstruents more like initial ones, which may reduce differences in preceding vowel duration. These cross-linguistic differences could be factored out by extending the study to fricatives and extending the sentential contexts tested to medial, prevocalic context, where stops are apparently released in both languages, and where French obstruents have no separate schwa-like releases. This paper reports an experiment that replicates and extends Mack's study of French and English in order to provide additional insight into the nature of the voicing contrast and the determination of universal and language-particular timing effects. More specifically, the aim is to isolate further those factors which influence voicing-conditioned vowel duration in languages and to assess their relative contributions to any vowel duration effects.
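The voicing ratio used throughout the studies cited above (vowel duration before a voiceless obstruent expressed as a percentage of the duration before the voiced counterpart) can be sketched as follows. This is only an illustration: the function name and the example durations are hypothetical and are not taken from any of the cited studies.

```python
def voicing_ratio(dur_before_voiceless_ms: float, dur_before_voiced_ms: float) -> float:
    """Vowel duration before a voiceless obstruent as a percentage of the
    vowel duration before the corresponding voiced obstruent.

    A lower percentage means a larger voicing effect.
    """
    if dur_before_voiced_ms <= 0:
        raise ValueError("duration before the voiced obstruent must be positive")
    return 100.0 * dur_before_voiceless_ms / dur_before_voiced_ms


# Hypothetical durations, chosen only to illustrate the arithmetic.
large_effect = voicing_ratio(150.0, 250.0)   # 60.0, a large voicing effect
small_effect = voicing_ratio(180.0, 200.0)   # 90.0, a small voicing effect
print(large_effect, small_effect)
```

Because the measure is a quotient, the same absolute difference in milliseconds yields a lower percentage when both vowels are short than when both are long, which is one reason absolute durations and ratios are best reported side by side.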
2. Methods
2.1. Subjects
Five French and five English-speaking subjects participated. Each speaker group consisted of four females and one male who were paid for their participation.

TABLE I. Stimulus words and carrier phrases

(a) Words

English
  bock    [bak]     bog     [bag]
  peck    [pek]     peg     [peg]
  caught  [kɔt]     cawed   [kɔd]
  moat    [mout]    mode    [moud]
  bus     [bʌs]     buzz    [bʌz]
  face    [feɪs]    phase   [feɪz]
  safe    [seɪf]    save    [seɪv]

French
  bac     [bak]   "ferry"      bague   [bag]   "ring"
  bec     [bek]   "beak"       bègue   [beg]   "stammerer"
  cotte   [kɔt]   "tunic"      code    [kɔd]   "code"
  monte   [mɔ̃t]   "go up"      monde   [mɔ̃d]   "world"
  bus     [bys]   "bus"        buse    [byz]   "buzzard"
  fesse   [fes]   "buttock"    fez     [fez]   "fez"
  sauf    [sof]   "unhurt"     sauve   [sov]   "unhurt" (f.sg.)

(b) Contexts

                                         English                  French
  Phrase-medial in focus:                Yell _x_ at Mary.        Dis _x_ à Marie.
  Phrase-final in focus, followed by     Yell _x_.                Dis _x_.
  phrase-medial, unfocused:              Yell _x_ at MARY.        Dis _x_ à MARIE.

According to self-report, their foreign language experience had been at best very limited. The English speakers were all born and lived in Central Ohio (U.S.A.), the French speakers in Paris; their speech was judged to be free of local or socially non-standard features. The English speakers were all in their early thirties and the French in their early twenties.
2.2. Test materials
Twenty-eight minimal pairs of monosyllabic words which are similar in English and French were used: 14 with final stops and 14 with final fricatives. The seven pairs which were analyzed for the present study are shown in Table I(a). It was ensured auditorily that in the speech of the subjects the vowels in the two members of each pair were identical in quality. (For example, none of the five English speakers made a distinction between the vowels of bog and bock.) The words were inserted into three different frame sentences given in Table I(b), where capital letters signal sentential prominence or focus, that is, the word which represents new information. Five tokens of each of the test words in each frame sentence were presented in random order lists along with some foils. The subjects first read the words in phrase-medial position under focus, as in answer to the question WHAT shall I yell at Mary? Then, after a short break, the words were read in phrase-final position under focus, as in answer to the question WHAT shall I yell?; and medially in non-focused position, as in answer to the question WHO shall I yell x AT? Written instructions were given in the appropriate language to read the sentences at a comfortable rate of speech and as naturally as possible, without pausing within sentences and without varying the tempo or prosodic contour.
2.3. Recordings
The speech materials were recorded in sound-treated rooms on professional-quality cassette recorders. Prior to the recording , the subjects listened to a cassette
containing altogether 12 model sentences initially prompted by the appropriate question and were given a few minutes to practice reading the sentences. The questions were only used orally prior to the model sentences and did not appear on the sheets from which the speakers read. Nor were the words under investigation capitalized, as is sometimes done to mark focus in writing. The speakers were told to repeat any mispronounced sentence.
2.4. Data characteristics and duration measurements
All sentences were digitized with a sampling frequency of 10 kHz at a 12-bit amplitude resolution. A waveform editor was used to determine the release characteristics of the final obstruent in the target words and to measure the durations of the vowel, the following fricative or stop closure, and (in the case of released stops) of the consonant release.
2.4.1. Vowel durations
Vowels following initial voiceless stops and fricatives were measured from the upwards zero crossing of the waveform for the first full pitch period, as shown in Fig. 1(b). Following voiced obstruents, vowel onsets were determined by the usually quite apparent change in waveform shape and amplitude, as shown in Fig. 2(b). Analogous criteria were used for measuring vowel offsets. There was usually a sudden marked decrease of periodic energy before voiceless obstruents. Before voiced obstruents, the end of the vowel was usually manifested by a decrease in amplitude and/or a decrease in the regularity of the pitch periods, which was used as a demarcation point together with a perceptual verification (Fig. 3(b)).
2.4.2. Final obstruent durations
Final stop closures were measured from the vowel offset, determined as explained above, to the vertical spike marking the instant of the release of supraglottal constriction. Final fricative intervals were measured from the vowel offset to the end of the periodic (voiced) or aperiodic (voiceless) energy.
2.4.3. Final releases
Vocalic releases. Each token was analyzed with respect to the type of release of its final consonant. In the final context, most French voiced obstruents, as well as occasional voiceless obstruents, were followed by a "vocalic" release or so-called détente vocalique (Delattre, 1951), heard as a short schwa and visible in the waveform as a low-amplitude vowel of variable duration (see Fig. 3(c), (d)).
The presence of a vocalic release was determined auditorily and visually, based on an increase in amplitude and periodicity following the final consonants. The determination of the onset was based on the same criteria as the ones used to locate the onset of the main vowels. The offset was located at the end of the periodic vibrations. To determine the mono- or "quasi-bisyllabic" status of French words containing schwa releases, a group of five non-French-speaking native Germans were asked to listen to the French tokens interspersed with some identical carrier sentences (Dis __.) containing two-syllable words with regular French final vowels. These words were chosen so as to be identical to the tokens used in this study, except for the final vowel. They were recorded by the investigator, a native speaker of Eastern French, with stress on the initial syllable to ensure prosodic similarity: becquée "peck", béguin "crush", coter "to rate", coder "to code", monter "to go up", monder "to hull", fessée "spanking", fais-en "make some", sofa "sofa" and sauver "to save". Initial stress occurs in some geographic and other varieties of French (e.g. Eastern French and in the speech of radio and TV announcers). German speakers were chosen because in German bisyllabic words ending in schwa are very common and German also has an alternative pattern, found in loan words and former compounds, with full vowels in the unstressed last syllable (e.g., Tenor "content", Leichnam "corpse"). The speakers were asked to write down a 1 or 2 (for one or two syllables, forced-choice judgement) on blank sheets next to vertical lists of numbers corresponding to the numbered sentences of the original recording.

Figure 1. Waveforms of (a) a sample token of French Dis code à Marie illustrating (b) an initial voiceless stop and (c) a (resyllabified) final voiced stop, with measurement points indicated by the tick marks below each waveform. Panel (a) shows 1700 ms of waveform. The other panels show an expanded view of 300 ms of the portion following the mark at the top of the panel directly above.

Voiceless releases. In both contexts under focus (medial and final, when not released in a schwa), French word-final voiceless stops were very strongly released, with both longer release duration and greater release amplitude, compared to English (see
Fig. 2(c), with its suggestion of several bursts of noise). These strong release bursts were measured from the instant of release noise to cessation of the high-amplitude noise. The voiced stops had "softer", less intense releases in both languages. In all three sentential contexts, English stops were followed at times by what could be called a "pseudo"-release, indicative of a more passive separation of the articulators than that of aspirated releases, or possibly, in the dialect investigated here, indicative of less oral air pressure buildup during closure, due to glottalization. Pseudo-releases were indicated in the soundwave by a small vertical spike marking the instant of the burst release and lacking the following interval of high-intensity noise typical of fully released stops. Sentence-finally, English stops also appeared at times completely "unreleased", without any indication on the waveform of vocal tract reopening after closure.

Figure 2. Sample token of (a) French Dis bac à Marie illustrating (b) a voiced initial stop and (c) a non-resyllabified final voiceless stop.
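
The vowel-onset criterion of Section 2.4.1 (the upward zero crossing of the first full pitch period) and the conversion of sample positions into durations at the reported 10 kHz sampling rate can be sketched as follows. The measurements in this study were made by hand in a waveform editor; the code below is only an illustrative approximation, and the function names and the synthetic waveform are assumptions introduced for the example.

```python
import math

SAMPLE_RATE_HZ = 10_000  # sampling frequency reported in Section 2.4


def upward_zero_crossings(samples):
    """Indices i at which the waveform rises from <= 0 (sample i-1) to > 0 (sample i)."""
    return [i for i in range(1, len(samples)) if samples[i - 1] <= 0 < samples[i]]


def samples_to_ms(n_samples, sample_rate_hz=SAMPLE_RATE_HZ):
    """Convert a number of samples into milliseconds."""
    return 1000.0 * n_samples / sample_rate_hz


# Synthetic stand-in for a token: 100 samples of silence (a voiceless closure),
# then three periods of a 100 Hz "voiced" sine (100 samples per period).
# The half-sample phase offset keeps any sample from landing exactly on zero.
silence = [0.0] * 100
voiced = [math.sin(2 * math.pi * (n + 0.5) / 100) for n in range(300)]
waveform = silence + voiced

crossings = upward_zero_crossings(waveform)
vowel_onset_ms = samples_to_ms(crossings[0])
pitch_period_ms = samples_to_ms(crossings[1] - crossings[0])
print(crossings, vowel_onset_ms, pitch_period_ms)
```

On real speech the first upward crossing is not necessarily the start of the first full pitch period, so a candidate found this way would still need the visual and perceptual checks described above.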
Figure 3. Sample tokens of French Dis SAUVE from speaker KD (a) without and (c, d) with vocalic releases; (b) sample token of a final voiced fricative.

2.4.4. Syllabic affiliation of final consonants
One factor potentially interfering with vowel durations in the medial, prevocalic, contexts tested here is the phonetic syllabification of the final consonants of the target words. In prevocalic connected speech contexts, (underlyingly) word-final consonants may change their original syllabic affiliation. In English, such resyllabification is said to be merely partial. While preserving its original syllabic position, the consonant also becomes affiliated with the contiguous syllable: e.g. She stood a while. /ʃi.stud.ə.waɪl/ → [ʃi.stuɾə.waɪl] (Kahn, 1976; Rudes, 1977; Bailey, 1978; Laeufer, 1989). This partial resyllabification is claimed by many researchers to constitute the prosodic condition for flapping in American English. In French, resyllabification is said to be total in faster speech (unless followed by emphatic or other initial stress). A totally resyllabified consonant is dissociated from its original syllable and associated with a new, following, syllable: thus, mauvaise eau "bad water" /mɔ.vɛz.o/ → [mɔ.vɛ.zo], homophonous with mauvais zoo "bad zoo" [mɔ.vɛ.zo] (e.g. Pernod, 1937; Delattre, 1951; Laeufer, 1987, 1989). Each token of the present corpus was therefore also analyzed for the syllabic affiliation of its final consonant. A two-way distinction was made between exclusive syllabification with the preceding word and resyllabification with the following word (partial or total), as the criteria used were generally unambiguous in indicating either resyllabification or its absence. The presence of a silent interval between the consonant and the vowel of the following preposition was the clearest evidence for syllable-final position. In the absence of such an interval, the criterion for strictly syllable-final position of voiceless stops was the presence in the magnified soundwave of a long period (>20 ms) of low-amplitude noise following the release (see Fig. 2(c)). For voiceless fricatives, it was the occurrence of an obvious gradual decrease in the amplitude of the frication noise resulting in near zero amplitude. For
voiced obstruents, the criteria were a gradual reduction of periodicity ending in an interval of voicelessness, and/or the presence of a period of low-amplitude noise following the release of stops. The criteria for resyllabification were the absence of any intervening silent intervals, and (for voiceless stops) the presence of short releases with sustained amplitude, similar in duration and shape to the initial /k/ release shown in Fig. 1(b). For voiced stops, the criterion was the presence of a more regular overall waveform with sustained periodic amplitude or a smooth gradual rise in amplitude towards the following vowel, supported by perceptual appraisal (Fig. 1(c)). For fricatives, the criteria for syllable-initial position were the absence of a gradual decrease in the amplitude of the frication noise (voiceless) or of the periodic vibrations (voiced).
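
The two-way classification described in Section 2.4.4 can be summarized as a small decision procedure. This is a hedged sketch and not the procedure actually used, which relied on visual inspection of the waveform and perceptual appraisal. The class, its field names, and the reuse of the 20 ms noise threshold for voiced stops are assumptions made for illustration; the text states that threshold only for voiceless stops.

```python
from dataclasses import dataclass


@dataclass
class FinalConsonantToken:
    """Hypothetical summary of what was observed in one token's waveform."""
    manner: str                            # "stop" or "fricative"
    voiced: bool
    silent_interval: bool                  # silence before the following vowel
    post_release_noise_ms: float = 0.0     # low-amplitude noise after a stop release
    gradual_amplitude_decay: bool = False  # frication or periodicity fades out


NOISE_THRESHOLD_MS = 20.0  # stated for voiceless stops; assumed here for voiced stops


def is_resyllabified(token: FinalConsonantToken) -> bool:
    """Return True if the final consonant is judged (partly or totally) resyllabified."""
    # A silent interval is the clearest evidence for syllable-final position.
    if token.silent_interval:
        return False
    if token.manner == "stop":
        # A long stretch of low-amplitude noise after the release marks final position.
        if token.post_release_noise_ms > NOISE_THRESHOLD_MS:
            return False
        # For voiced stops, fading periodicity also marks final position.
        if token.voiced and token.gradual_amplitude_decay:
            return False
        return True
    # Fricatives: a gradual decay of frication noise or voicing marks final position.
    return not token.gradual_amplitude_decay


final_k = FinalConsonantToken("stop", voiced=False, silent_interval=False,
                              post_release_noise_ms=35.0)
linked_z = FinalConsonantToken("fricative", voiced=True, silent_interval=False)
print(is_resyllabified(final_k), is_resyllabified(linked_z))
```

Coding the observations as explicit fields makes it easy to see which cue decided each token, but it cannot replace the perceptual verification mentioned in the text.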
3. Results
3.1. Prosodic structure and tempo
Prior modelling and practice successfully ensured a constant prosodic contour in the speakers' renditions of each sentence type, apart from some small variation in test word prominence in the unfocused context. At the neutral rate of speech used by the subjects, the three prosodic contexts investigated were short enough to consist of single intonational phrases in both languages (see Pierrehumbert, 1980 for the definition of the intonational phrase in English; Vaissiere, 1974, 1975, 1980; Hirst & Di Cristo, 1984 for French). Sentence-finally, the test word occurs under focus, as it is mentioned for the first time. In English, focus is indicated phonologically by the presence on the stressed syllable of the word of a nuclear pitch accent (that is, the most prominent phrasal accent). In most varieties of French, there is no comparable word stress and no phrase-internal pitch accent which could function as nuclear accent. Instead, focus is indicated by the presence of a prosodic unit-final boundary tone on the last syllable of the word. Medially under focus, the English sentences contain one intermediate or lower-level phrase (see Beckman & Pierrehumbert, 1986) with a high nuclear accent on the test word, followed by a low phrase accent (Fig. 4(a)). The French sentences, in contrast, contain two intermediate phrases ("rhythm groups" in traditional descriptions of French prosody), dis X and à Marie, with focal prominence on the first noun (Fig. 4(b)). In both languages, the test words reach F0 values ranging from 50 to 70 Hz above the values for the preceding verb. Medially in unfocused position, both the English and French sentences comprise single intermediate phrases. In English the nuclear accent is on Mary and the test word has either no pitch accent at all or a high prenuclear accent (found mostly in the slowest speaker's renditions). The latter is illustrated in Fig. 5(a), where the F0 dip between the two high tones is caused by the segmentals. Even when unaccented, however, the test words preserved their regular word stress and were always more prominent than the (unstressed) preceding verb. In French, sentential prominence also falls on Marie by virtue of its position at the end of a prosodic unit, and the test word is accorded only slightly more prominence than the initial verb dis and at times less, as illustrated in Fig. 5(b). When more prominent than the preceding verb (always the case in the English data), the test words reach in both languages F0 values ranging from 30 to 50 Hz higher than those for the verb.

Figure 4. Sample F0 contours of (a) English Yell BUZZ at Mary and (b) French Dis SAUVE à Marie. Pitch accents are marked (*), boundary tones (%); H = high tone, L = low tone.

Possible tempo differences between the two speaker groups which could confound the pattern of voicing effects were assessed by calculating the mean duration for each subject and the average mean duration across subjects of one of the sentence types in the medial context under focus (see Table II(a)). In Table II(b) the mean duration of the focused words, measured from the onset of their initial consonant to the offset of their final consonant, is subtracted from the mean sentence duration to obtain the duration of the carrier phrase. There was a 17% (133 ms) difference in the duration of the carrier phrases in the two languages. This is likely to be due entirely to the inherently longer vowel in the stressed syllable of Mary and to the presence of two more short segments in the English frame sentences (the /l/ of yell and the /t/ of at). The French and English sentences were thus read at similar rates of speech.
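
The carrier-phrase arithmetic can be retraced from the figures given in Table II. The sketch below is only a check of that subtraction; the variable names are illustrative, and expressing the difference relative to the English carrier duration is an assumption (it is the base that reproduces the reported 17%).

```python
from statistics import mean

# Per-speaker mean sentence durations in ms, as listed in Table II(a).
english_sentence_ms = [1135.78, 1156.75, 1262.41, 1197.47, 1207.23]
french_sentence_ms = [998.56, 1074.70, 1018.35, 1008.56, 1112.03]

# Mean test-word durations in ms, from Table II(b).
english_word_ms = 412.64
french_word_ms = 396.13

# Carrier duration = mean sentence duration minus mean test-word duration.
english_carrier_ms = mean(english_sentence_ms) - english_word_ms
french_carrier_ms = mean(french_sentence_ms) - french_word_ms

carrier_difference_ms = english_carrier_ms - french_carrier_ms
# Assumption: the percentage is taken relative to the English carrier.
carrier_difference_pct = 100.0 * carrier_difference_ms / english_carrier_ms

print(round(english_carrier_ms, 2), round(french_carrier_ms, 2))
print(round(carrier_difference_ms, 2), round(carrier_difference_pct))
```

Taking the French carrier as the base instead would give roughly 21%, so the choice of base matters when comparing such tempo figures across studies.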

Figure 5. Sample F0 contours of (a) English Yell bog at MARY and (b) French Dis buse à MARIE.

3.2. Target-word-final consonants
3.2.1. Incidence of flapping in English
In the medial contexts, the English alveolar stops in caught/cawed and moat/mode were often realized as flaps, that is, very short stops of up to about 40 ms resulting
from a rapid closure followed by an immediate release. Existing studies suggest that these tend to neutralize the vowel duration differences in English. The degree of neutralization seems to differ across dialects (Sheldon, 1973; Donegan, 1974; Fisher & Hirsch, 1976; Port, 1977; Fox & Terbeek, 1977; Zue & Laferriere, 1979; Huff, 1980; Price, 1981). In the present study, some /t/s were realized as voiceless, rather than voiced, flaps (presumably under the influence of spelling). Vowels before these voiceless flaps were on average somewhat shorter than vowels before flapped /d/: 28.7 ms shorter in focused context, 14.8 ms shorter in unfocused context. The difference in closure duration between [ɾ̥] and [ɾ] was, on the other hand, minimal: [ɾ̥] = 34.4 ms, [ɾ] = 32.3 ms in focused context; [ɾ̥] = 26.0 ms, [ɾ] = 28.2 ms in unfocused context. Before flapped /t/ realized as voiced [ɾ], vowel durations were virtually equal to those before flapped /d/, differences in either direction not exceeding 11.6 ms in focused and 5.5 ms in unfocused context. Since vowel duration differences in the tokens with flaps were significantly smaller than those in the tokens with non-flapped (alveolar and other) consonants, they
were excluded from the results to be compared with French. Flaps are also excluded from the results of consonant durations.

TABLE II. Mean duration (in ms) of the sentence type Yell caught at Mary, Dis cotte à Marie with focus on caught/cotte, as produced by the individual speakers; means averaged over the two speaker groups; mean duration of the English and French test words and of the carrier sentences external to the test words. Initials are those of the individual speakers' names, except XX whose name is not known

(a) Mean duration of the whole sentence as read by each individual speaker

  Yell caught at Mary        Dis cotte à Marie
  JO    1135.78              BF     998.56
  MV    1156.75              KD    1074.70
  DT    1262.41              KDA   1018.35
  DS    1197.47              NR    1008.56
  BA    1207.23              XX    1112.03

(b)
                                                          English    French    Difference
  (1) Mean duration of the whole sentence averaged
      over speakers within a group                        1191.93    1042.44   149.49
  (2) Mean duration of the test words                      412.64     396.13    16.51
  (3) Mean duration of the carrier sentence                779.29     646.31   132.98 (17%)

3.2.2. Syllabic affiliation
In medial context under focus, the number of tokens with resyllabified final consonants was relatively small in both languages: 28% with voiced and 8% with voiceless consonants in French; 16% with voiced and 4% with voiceless consonants in English (excluding flaps). This small incidence of resyllabification is presumably due to the presence of nuclear accent in English and of a following prosodic unit boundary in French. No consistent vowel duration differences emerged between vowels before exclusively final and vowels before resyllabified consonants, possibly due to the prosodic conditions and the small number of resyllabified tokens. Since in this context the great majority of tokens contained non-resyllabified obstruents in both languages, the few tokens with resyllabified consonants are excluded from the vowel duration results presented below. In the medial unfocused context, in which the sentences consisted of single intermediate intonational phrases in both languages, resyllabification was more frequent: 83% of voiced and 75% of voiceless consonants in French; 46% of voiced and 32% of voiceless consonants in English (excluding flaps). Again, the vowel duration results that follow are based on the pattern that is characteristic of each language as manifest in the majority of tokens, in this case resyllabified tokens in French and non-resyllabified ones in English.

TABLE III. Tabulation of French final consonants realized with vocalic releases by each individual speaker in sentence-final context; sum and percentages for all speakers together. Speaker NR is male, the other speakers are females. Maximum N = 5 tokens per word per speaker, i.e. a total of 25 tokens per word type and a total of 175 tokens altogether

[Per-speaker columns (BF, KD, KDA, NR, XX) are not recoverable from this copy; only the totals are given.]

  Word      Total N    Total %
  bague        18         72
  bègue        20         80
  code         20         80
  monde        16         64
  buse         12         48
  fez          11         44
  sauve        13         52
  Voiced      110         63

  bac           1          4
  bec           1          4
  cotte         4         16
  monte         5         20
  bus           4         16
  fesse         7         28
  sauf          6         24
  Voiceless    28         16

3.2.3. Release and constriction durations
Table III lists, for each French speaker separately and all five speakers together, the sums and percentages of obstruents in the final context articulated with a vocalic release. A total of 63% of the voiced obstruents, 74% of the voiced stops and 48% of the voiced fricatives, were followed by schwa releases. The male speaker pronounced
only words ending in stops with schwa releases, whereas the four female speakers also produced words ending in fricatives with vocalic releases. Schwa releases were also found occasionally (1 and 9%) after words ending in voiceless consonants for two of the speakers (both female) and more frequently in another female speaker's voiceless tokens (18%). An orthographic "e" is not a prerequisite, as vocalic releases occur nearly as often in fez (44%) as in buse (48%), and in sauf (24%) as in fesse (28%) or monte (20%).

TABLE IV. Word-final stop release durations (in ms) in final and medial context under focus; mean duration differences between French and English. The durations correspond to the weighted averages of the individual word pairs

                                    Voiceless            Voiced
                                    N       ms           N       ms
  (1) Final context
      French                        89     107.44        26     45.50
      English                      100      55.50       100     20.10
      Fr. - Eng.                            51.94               25.40
  (2) Medial context under focus
      French                        94      31.90        69     19.30
      English                       71      16.70        56     10.20
      Fr. - Eng.                            15.20                9.10

TABLE V. Stop closure and fricative interval durations (in ms), mean duration differences, and duration percentages (ratios) in final and focused medial contexts. The durations correspond to the weighted averages of the individual word pairs

                                    French                      English
                                    N      ms        %          N      ms        %
  (1) Final context
      Stops:      Voiceless         89    164.60               100    132.67
                  Voiced            26    100.20    60.87      100     90.11    67.92
                  Difference               64.40                       42.56
      Fricatives: Voiceless         58    237.43                75    198.23
                  Voiced            39    134.28    56.56       75    126.54    63.83
                  Difference              103.15                       71.69
  (2) Medial context under focus
      Stops:      Voiceless         94    101.54                71     71.19
                  Voiced            69     71.36    70.27       56     54.96    77.20
                  Difference               30.18                       16.23
      Fricatives: Voiceless         67    166.43                70    126.34
                  Voiced            57    101.40    60.93       72     92.80    73.45
                  Difference               65.03                       33.54

The mean durations of stop releases in each language are shown in Table IV for
the two contexts under focus, where the majority of tokens in both languages contained non-resyllabified consonants. The French final stop releases are roughly twice as long as the English . The average cross-linguistic difference in release duration for the four words ending in stops combined was found in a two-way analysis of variance to be significant in sentence-final context (main effect for language: F(1, 13) = 21.18, p < 0.01, main effect for voicing: F(1, 13) = 10.07, p < 0.05). French voiceless stops were also found to have significantly longer releases than the English in sentence-medial context (F(1, 7) = 16.61, p < 0.05). Table V reports the durations and ratios for English and French (non-vocalically released) final consonants in the same two contexts. French obstruents, especially voiceless ones, were found in a two-way analysis of variance to have significantly longer stop closures and fricative intervals than English obstruents (main effect for language: F(1, 25) = 42.87, p < 0.01, main effect for voicing: F(1, 25) = 29.68, p < 0.05 in final context; main effect for language: F(1, 25) = 25.63, p < 0.01, main effect for voicing: F(1, 25) = 16.54, p < 0.05 in medial context). Furthermore, the durational range of voiceless-voiced obstruents is about a third to a half larger than the English (F(1, 12) = 20.91, p < 0.01 in final , F(1, 12) = 18.69, p < 0.01 in medial context) . Longer absolute durations combined with a larger range result in a lower ratio for voiced over voiceless consonants in French than in English . Gottfried (1984) also reports longer closures for syllable-final /t/ in French than in
English. Flege & Hillenbrand (1986) similarly found longer voiceless than voiced durations, larger ranges, and lower ratios for French fricatives read from a word list (French fricatives: 299 ms (voiceless) - 138 ms (voiced) = 160 ms or 46%; English fricatives: 254 ms - 174 ms = 80 ms or 68%). The smaller French voiced durations, compared to English, are presumably due to the presence of vocalic releases after all voiced fricatives, which shortens the (in that case, quasi-intervocalic, rather than truly final) consonants. The French results are also comparable to those reported in van Dommelen (1981) and Kohler, van Dommelen & Timmerman (1981) (tabulated from Tables II-V) for utterance-final context: 218.2 (voiceless) - 100.4 (voiced) = 117.7 ms or 46% (53% for preceding vowels). For French stops, Kohler (1979) reports a 67.7 ms (55%) duration difference utterance-finally. For English alveolar stops in words read in isolation, Smith (1978) reports a 29 ms (76%) difference.

3.3. Vowel duration results

3.3.1. Final context

Table VI displays the results from final context for each word pair separately.

TABLE VI. Vowel durations (in ms), mean duration differences, and duration percentages (ratios) across minimal pairs in final context

French
Word pair            N (voiced / voiceless)   Voiced    Voiceless   Difference   %
bague / bac          25 / 25                  199.96    156.15       43.81       78.09
bègue / bec          25 / 25                  183.71    140.41       43.30       76.43
code / cotte         25 / 25                  164.60    121.60       43.00       73.88
monde / monte        25 / 25                  239.05    190.53       48.52       79.70
buse / bus           25 / 25                  199.17    139.80       59.37       70.19
fez / fesse          25 / 25                  203.37    141.00       62.37       69.33
sauve / sauf         25 / 25                  204.95    142.40       62.55       69.48
voiced / voiceless   175 / 175                199.26    147.41       51.85       73.98

English
Word pair            N (voiced / voiceless)   Voiced    Voiceless   Difference   %
bog / bock           25 / 25                  316.50    201.73      114.77       63.74
peg / peck           25 / 25                  230.47    146.93       83.54       63.75
cawed / caught       25 / 25                  338.29    212.58      125.71       62.84
mode / moat          25 / 25                  341.36    219.67      121.69       64.35
buzz / bus           25 / 25                  348.00    219.33      128.67       63.03
phase / face         25 / 25                  386.18    232.33      153.85       60.16
save / safe          25 / 25                  378.92    229.74      149.18       60.63
voiced / voiceless   175 / 175                334.25    208.90      125.35       62.50

The results suggest a rather large dissimilarity between the two languages, with a mean
voicing ratio of 74% in French and 62% in English. The 70% ratio before fricatives in French is close to the 72% found by Landschultz (1967) in clause-final position (computed from her table on p. 34). The 77% ratio before stops is close to the 74% found by Mack (1982) in isolation. The English 64% for lax vowels before stops is consistent with Chen's (1970) 67% and with House & Fairbanks' (1953) and Luce & Charles-Luce's (1985) 69%. The average difference in vowel duration for the seven word pairs combined is significant at the 0.01 level in both languages (English: t(6) = 14.22, p < 0.01; French: t(6) = 14.77, p < 0.01). The two-way ANOVA performed for this context revealed a significant main effect for language (F(1, 24) = 52.72, p < 0.01) and for voicing environment (F(1, 24) = 36.35, p < 0.01). The language × voicing interaction also reached a high level of significance (F(1, 24) = 7.23, p < 0.01). In other words, there is a significant difference in vowel durations in the voiced versus voiceless environments in both languages, and there is also a significant difference between the two languages in the magnitude of the effect.

TABLE VII. Vowel durations (in ms), mean duration differences, and duration percentages (ratios) across minimal pairs in French tokens with vocalic releases and tokens with voiceless releases in final context. The overall durations correspond to the weighted averages of the individual word pairs

Vocalic release
Word pair            N (voiced / voiceless)   Voiced    Voiceless   Difference   %
bague / bac          18 / 1                   183.38    148.56       34.82       81.01
bègue / bec          20 / 1                   180.73    144.13       36.60       79.75
code / cotte         20 / 4                   148.90    116.76       32.14       78.41
monde / monte        16 / 5                   226.36    182.91       43.45       80.80
buse / bus           12 / 4                   187.67    135.18       52.49       72.03
fez / fesse          11 / 7                   186.78    136.46       50.32       73.06
sauve / sauf         13 / 6                   194.42    138.68       55.74       71.33
voiced / voiceless   110 / 28                 186.89    143.24       43.65       76.64

Voiceless release
Word pair            N (voiced / voiceless)   Voiced    Voiceless   Difference   %
bague / bac          7 / 24                   210.03    157.38       52.65       74.93
bègue / bec          5 / 24                   191.32    139.53       51.79       72.93
code / cotte         5 / 21                   172.30    124.72       47.58       72.39
monde / monte        9 / 20                   255.24    195.31       59.93       76.52
buse / bus           13 / 21                  217.02    143.28       73.74       66.02
fez / fesse          14 / 18                  221.88    145.80       76.08       65.71
sauve / sauf         12 / 19                  228.99    148.16       80.83       64.70
voiced / voiceless   65 / 147                 213.83    150.60       63.23       70.43

In Table VII, mean French vowel durations are divided into two sets of results
depending on the type of following obstruent release. In general, vowel durations were somewhat shorter and the differences between the voiced and the voiceless environment smaller before consonants with vocalic releases than before consonants with unvoiced releases. The overall 74% ratio found in final context thus combines the higher 77% ratio for obstruents with vocalic releases and the lower 70% for consonants with unvoiced releases. A correlation showed a statistically significant, albeit weak, negative relationship between the durations of the preceding vowels and the durations of the release schwas (r = -0.154, t = -2.324, p = 0.021). Figure 3 above illustrates this finding: in the absence of a vocalic element (a), the /o/ of sauve measures 153 ms. The presence of a very brief schwa shortens it to 134 ms (c); the presence of a relatively long schwa shortens it to 121 ms (d). The 74% ratio reported in Mack (1982) for French words read in isolation (by Southern speakers) is comparable to the 77% found here for stops sentence-finally, and it is presumably also due to the presence of final schwa releases, as is expected of subjects from Southern France, where historic word-final schwas (still preserved in the spelling) have not been lost and quasi-bisyllabic renditions are the norm (e.g. Carton, Rossi, Auteserre & Leon, 1983).

TABLE VIII. Vowel durations (in ms), mean duration differences, and duration percentages (ratios) across minimal pairs in medial context under focus, excluding flapped and resyllabified tokens. The overall durations correspond to the weighted averages of the individual word pairs

French
Word pair            N (voiced / voiceless)   Voiced    Voiceless   Difference   %
bague / bac          20 / 23                  165.90    129.11       36.79       77.82
bègue / bec          19 / 23                  165.67    129.67       36.00       78.27
code / cotte         14 / 24                  143.25    110.53       32.72       77.16
monde / monte        16 / 24                  212.56    165.88       46.68       78.04
buse / bus           19 / 22                  151.54    109.44       42.10       72.22
fez / fesse          20 / 23                  169.92    115.79       54.13       68.14
sauve / sauf         18 / 22                  157.79    111.50       46.29       70.67
voiced / voiceless   126 / 161                166.66    124.25       42.41       74.55

English
Word pair            N (voiced / voiceless)   Voiced    Voiceless   Difference   %
bog / bock           23 / 25                  233.09    175.83       57.26       75.43
peg / peck           24 / 25                  183.08    120.92       62.16       66.05
cawed / caught       4 / 9                    215.68    152.64       63.04       70.77
mode / moat          5 / 12                   223.67    164.70       58.97       73.64
buzz / bus           25 / 23                  221.78    151.54       70.24       68.33
phase / face         24 / 23                  233.61    156.21       77.40       66.87
save / safe          23 / 24                  241.05    165.33       75.72       68.59
voiced / voiceless   128 / 141                221.71    153.31       68.40       69.15

Similarly, Flege & Hillenbrand's (1986)
production data obtained in Paris and based on word list readings of pairs ending in /s-z/, with a voicing ratio of 70%, identical to the one found here for fricatives, included "a schwalike vowel after /z/ (but not after /s/) in every instance" (p. 513).

3.3.2. Medial context under focus

In medial context under focus, the durations are generally somewhat shorter and the effect smaller, as shown in Table VIII. The overall English ratio of 69% is close to the 66% found by Peterson & Lehiste (1960) and the 67% found by Klatt (1973) in a similar context. The 73% found for phonetically long vowels before stops (bog/bock, cawed/caught, mode/moat) compares with Cooper & Danly's (1981) 77%. The overall French ratio is here 75%. The French ratios for individual word pairs are, in this context too, higher than the English ones, in particular those in the context of final stops, and the absolute durations are shorter. In both languages the mean difference between the shorter and the longer vowel variants in the two consonantal environments is still significant (t(6) = 14.87, p < 0.001 for French, t(6) = 21.77, p < 0.01 for English). However, the two-way ANOVA revealed that the language × voicing interaction was not significant here at the 0.05 level.

3.3.3. Medial unfocused context

In phrase-medial unfocused position, differential vowel durations in both languages are much smaller than in focused position, yet they are very consistent, as Table IX shows. The French results for sauf/sauve are consistent with those of Landschultz (1970) for unstressed /o/ before /f, v/ (computed from the table on p. 34). None of the French duration differences is significant at the 0.05 level, nor is the overall difference of 11.9 ms. The individual differences are all below 20 ms. The English mean durations are here also greater than the French, the durational differences larger, and the ratios lower. The difference between vowel durations as a function of consonant voicing, although much smaller than in the other two contexts in English (1/2 of the difference found medially under focus and 1/4 of the difference found in final position), still achieves significance (t(6) = 10.96, p < 0.01). The language × voicing interaction did not reach significance in the ANOVA performed for this context.

TABLE IX. Vowel durations (in ms), mean duration differences, and duration percentages (ratios) across minimal pairs in medial unfocused position, excluding non-resyllabified French tokens, and flapped and resyllabified English tokens. The overall results correspond to the weighted averages of the individual word pairs

French
Word pair            N (voiced / voiceless)   Voiced    Voiceless   Difference   %
bague / bac          19 / 18                  127.40    121.87        5.53       95.66
bègue / bec          21 / 17                  119.90    112.43        7.47       93.77
code / cotte         20 / 21                  114.67     98.50       16.17       85.90
monde / monte        24 / 16                  151.71    136.90       14.81       90.24
buse / bus           20 / 19                  117.00     98.50       18.50       84.19
fez / fesse          19 / 18                  115.29    100.83       14.46       87.46
sauve / sauf         22 / 22                  104.67     91.86       12.81       87.76
voiced / voiceless   145 / 131                118.64    106.76       11.88       89.99

English
Word pair            N (voiced / voiceless)   Voiced    Voiceless   Difference   %
bog / bock           21 / 22                  183.85    157.16       26.69       85.48
peg / peck           23 / 21                  135.20     99.05       36.15       73.26
cawed / caught       3 / 4                    162.26    129.84       32.42       80.02
mode / moat          3 / 6                    154.22    133.23       20.99       86.39
buzz / bus           17 / 24                  158.21    122.10       36.11       77.18
phase / face         15 / 23                  172.70    133.75       38.95       77.45
save / safe          16 / 19                  183.04    137.10       45.94       74.90
voiced / voiceless   94 / 119                 164.21    130.32       33.89       79.36

TABLE X. Vowel durations (in ms) before voiced (upper numbers) and voiceless (lower numbers) non-vocalically released consonants, mean duration differences, and duration percentages (ratios) before stops and fricatives in the three prosodic contexts. The durations correspond to the weighted averages of the individual word pairs

                                   French                     English
                                   N       ms       %         N       ms       %
(1) Final context
  Stops        Voiced              26    207.22              100    306.65
               Voiceless           89    154.23              100    195.23
               Difference                 52.99    74.43            111.42    63.66
  Fricatives   Voiced              39    222.63               75    371.03
               Voiceless           58    145.75               75    227.13
               Difference                 76.88    65.47            143.90    61.22
(2) Medial context under focus
  Stops        Voiced              79    171.84               49    213.88
               Voiceless           95    133.80               69    153.52
               Difference                 38.05    77.86             60.36    71.78
  Fricatives   Voiced              72    159.75               57    232.15
               Voiceless           91    112.24               70    157.69
               Difference                 47.51    70.26             74.45    67.93
(3) Medial unfocused context
  Stops        Voiced              86    128.42               56    158.88
               Voiceless           71    117.42               60    129.82
               Difference                 11.00    91.44             29.06    81.71
  Fricatives   Voiced              93    112.32               75    171.32
               Voiceless           87     97.06               75    130.98
               Difference                 15.26    86.42             40.33    76.46
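The percentage columns in the tables of this section are voiceless-over-voiced duration ratios, so a lower percentage means a larger voicing effect. As a minimal sketch, assuming only that definition, the ratios for the final context of Table X can be recomputed from the tabulated means. The function and variable names are illustrative and not part of the study, and small deviations from the printed percentages (on the order of 0.01) presumably reflect rounding in the published means.

```python
def voicing_ratio(voiced_ms: float, voiceless_ms: float) -> float:
    """Vowel duration before a voiceless obstruent as a percentage of the
    duration before its voiced counterpart (lower = larger voicing effect)."""
    return 100.0 * voiceless_ms / voiced_ms


# (voiced ms, voiceless ms, ratio printed in Table X), final context only.
TABLE_X_FINAL = {
    ("French", "stops"): (207.22, 154.23, 74.43),
    ("French", "fricatives"): (222.63, 145.75, 65.47),
    ("English", "stops"): (306.65, 195.23, 63.66),
    ("English", "fricatives"): (371.03, 227.13, 61.22),
}

# Recompute each ratio and keep it next to the printed value for comparison.
recomputed = {
    key: (round(voicing_ratio(voiced, voiceless), 2), printed)
    for key, (voiced, voiceless, printed) in TABLE_X_FINAL.items()
}

for (language, manner), (ratio, printed) in recomputed.items():
    print(f"{language:8s} {manner:11s} recomputed {ratio:6.2f}  printed {printed:6.2f}")
```

The same function applies unchanged to the other contexts and to the per-word rows of Tables VI to IX.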
3.3.4. Segmental context

In both languages differential vowel durations are greater before fricatives than stops, confirming the findings of previous studies (e.g. for French, Delattre, 1951; Belasco, 1953; for English, House & Fairbanks, 1953; Peterson & Lehiste, 1960; Raphael, 1972; Klatt, 1975; Summers, 1987). The difference as manifested in the ratios is, however, much larger in French than in English in the two focused contexts, as shown in Table X. In final context, the French 70% ratio combines the higher 74% before stops and the much lower 65% before fricatives; and the English 62% ratio combines the 64% ratio before stops and the 61% before fricatives. In medial position under focus the corresponding ratios are 78% before stops and 70% before fricatives in French; 72% and 68% in English. In medial unfocused position French has ratios of 91% before stops and 86% before fricatives and English 82% and 76%. Tests of simple main effects in each language indicate that the effect of manner of articulation is significant in the two focused contexts in French (F(1, 10) = 23.43, p < 0.01 in final and F(1, 10) = 18.87, p < 0.01 in medial context) and in the unfocused context (F(1, 10) = 6.34, p < 0.05); and it is significant in all three contexts in English (F(1, 10) = 13.70, p < 0.05 in final, F(1, 10) = 10.34, p < 0.05 in medial focused, and F(1, 10) = 5.17, p < 0.05 in unfocused context). The interaction language (French, English) × manner (stops, fricatives) was significant (F(1, 10) = 13.45, p < 0.01 in final, F(1, 10) = 25.78, p < 0.01 in medial unfocused context), except in the medial context under focus, where the French and English ratios before fricatives are very close.

4. Discussion

Based on a small number of studies, it has been generally accepted in the literature that English exhibits a much greater amount of differential vowel duration imputable to obstruent voicing than do other languages as diverse as Spanish, Russian, Korean, French (Chen, 1970; Mack, 1982; but see Kohler, 1979), German (Kohler & Künzel, 1978; Kohler, 1979), Hindi (Maddieson & Gandour, 1976), Dutch (Slis & Cohen, 1969, 1970), Danish (Fischer-Jørgensen, 1968), Arabic and Japanese (Port, Al-Ani & Maeda, 1980); and Hungarian, Italian, Icelandic, Norwegian, Swedish, Armenian, Bengali, Assamese and Marathi (see Maddieson, 1977, and Kohler, 1984, and references therein). A low-level allophonic rule of vowel-lengthening before a voiced obstruent has therefore been claimed to be operative in English, but not in French or any of the other languages studied. The results from the present comparison of French and English voicing-induced vowel durations in several prosodic contexts suggest instead that the difference between French and English is not simply the presence vs. absence of a rule of vowel lengthening, but that there is a great deal of variability in the voicing effect in both of these languages and that the variability is determined by the same kinds of factors in both languages.

4.1. Shared effects
Besides the cross-linguistic differences in absolute and relative vowel durations noted above, there are a number of notable similarities between French and English. In both languages, the voicing-dependent vowel duration differences are
statistically significant at the 0.01 level in final and in medial focused position. In unfocused medial position, on the other hand, French has very minimal "lengthening" in the voiced context (11.9 ms) which, although not statistically significant, is nevertheless consistently present in all word pairs. English has here a greater amount of lengthening (33.9 ms), statistically significant at the 0.01 level. Although smaller in magnitude than the vowel duration differences found for English, the differences in French are thus statistically significant in two out of the three contexts. Furthermore, the English difference in medial unfocused context (33.9 ms) is smaller than the French in the other two contexts (51.8 and 42 ms). In both languages the lengthening influence of voiced consonants is largest in sentence-final context , where it combines with prepausal lengthening. It is less pronounced medially under focus, where it combines with nuclear accent in English and with a rhythm group boundary tone in French; and it is least pronounced in non-focused position, where practically identical durations before voiced and voiceless consonants are found in French, and the English difference is also much reduced. Thus, in both languages similar prosodic factors determine similar variations in the size of the voicing effect. The English data also suggest that even in monosyllabic words, the putative vowel lengthening rule is not operative to the same extent in all contexts. In particular, in medial unfocused position there is a much smaller variation in vowel duration and therefore ratios which are higher than the French ratios in the other two contexts . 
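The significance levels cited here rest on t-tests over the seven word pairs of each language (six degrees of freedom). The text does not spell out the exact procedure, so the following is only a sketch under the assumption that a paired t-test was run on the per-pair mean differences. It uses the French final-context differences from Table VI, and the names are illustrative. Under that assumption the statistic comes out close to the reported t(6) = 14.77, though not identical, which is unsurprising given that the published means are rounded.

```python
import math
import statistics

# Voiced-minus-voiceless vowel duration differences (ms) for the seven French
# word pairs in final context, copied from Table VI.
french_final_diffs = [43.81, 43.30, 43.00, 48.52, 59.37, 62.37, 62.55]


def paired_t(diffs):
    """t statistic and degrees of freedom for H0: mean paired difference = 0."""
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # statistics.stdev is the sample standard deviation (n - 1 denominator).
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff / standard_error, n - 1


t_stat, df = paired_t(french_final_diffs)
print(f"mean difference = {statistics.mean(french_final_diffs):.2f} ms")
print(f"t({df}) = {t_stat:.2f}")
```

Applying the same function to the unfocused French differences from Table IX would show how a small but consistent difference behaves under the same assumed test.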
Conversely, in French, which is commonly described in the general linguistics literature as having little vowel lengthening before voiced obstruents (although voiced fricatives have been traditionally termed "lengthening consonants" by French linguists), much more pronounced variations in vowel duration exist under certain conditions, namely in the context of non-vocalically released fricatives under focus and when intensified by the effects of prepausal lengthening. Furthermore, in both languages, variations in voicing-conditioned vowel duration are not restricted to variable amounts of lengthening in the voiced environment under different circumstances, but seem to be part of a complex interaction of timing effects which extend to the environment before voiceless obstruents and involve, compared across contexts, shortening as well as lengthening. The sizable vowel duration difference between the two languages in the voiceless environment, which is also variable across contexts, likewise argues against a language-specific rule of vowel lengthening in English. Moreover, when English and French vowels have similar durations, the voicing effect is also very similar. Compare, for example, the overall mean vowel duration in the environment of voiceless obstruents for English in the focused context (Table VIII) with the overall duration for French in the final context (Table VII, voiceless releases). Figure 6 illustrates this similarity for vowels in the environment of voiced obstruents (compare, for instance, English vowels in the medial focused context with French vowels in the final context). The figure also shows that vowel durations and the voicing effect covary positively in both languages, that is, the longer the vowels, the larger the voicing effect and vice-versa.

Figure 6. French and English mean voicing ratios as a function of mean durations of the vowels in the voiced context in four relevant prosodic contexts. Filled symbols represent English, empty ones French. Circles stand for final context, triangles for medial focused context, squares for non-resyllabified unfocused context, and diamonds for resyllabified unfocused context. (Horizontal axis: mean vowel durations in ms, 50 to 350; vertical axis: voicing ratio, 50 to 100.)

4.2. Language-particular effects

The cross-linguistic differences in the magnitude of the voicing effect are also partly ascribable to other, concurrent, language-dependent differences between the two languages, such as differences in stop release and in the duration and the syllable affiliation of postvocalic consonants.

4.2.1. Closure duration and fricative interval
The significantly longer closure durations and fricative intervals in French, the larger durational ranges of voiceless-voiced obstruents, and the resulting lower ratios of (non-vocalically released) final voiced over voiceless consonants, compared to English, could be affecting the pattern of voicing-related vowel duration. The French and English VC dyads differ fundamentally in their internal organization. In English, shorter consonants with a smaller durational range follow longer vowels with a greater durational range. In French, the opposite pertains. The cross-linguistic difference could be related to the prosodic systems of the two languages. In English, a stress-based language (e.g. Dauer, 1983), stressed vowels are very long, especially when also accented, compared to French, a non-stress-based language. Because of the long duration of the English nucleus, the other segments in the stressed syllable rhyme are quite short. The English flapping rule of alveolar stops can be viewed as an extreme, phonologized, form of this timing effect. The cross-linguistic difference in consonant durations could further be linked to the different status assigned to the voicing feature in each language (e.g. Flege & Hillenbrand, 1987). In French, the contrast is carried more by characteristics of the consonant, in particular, the presence or absence of glottal pulsing during the stop closure or fricative interval (e.g. Delattre, 1951). The consonants therefore need to
be long enough to perceptually convey their voiced or voiceless status. The relative shortness of the voiced consonant, compared to the voiceless, is presumably linked to the difficulty of maintaining strong glottal pulsing for too long (e.g. Lisker, 1977; Ohala, 1983; Westbury, 1983). In English monosyllables, on the other hand, final obstruents are often (at least partly) devoiced (e.g. Shockey, 1974; Haggard, 1978; Flege & Brown, 1982; Veatch, 1990) and the contrast is thus carried more by characteristics of the vowel, in particular by the relative duration of vowels before phonologically voiced vs. voiceless consonants (Denes, 1955; Walsh & Parker, 1981; Wardrip-Fruin, 1982). French thus seems to grant greater emphasis to the postvocalic final consonants, English to the vowels, which helps explain the comparatively greater magnitude of the voicing effect on vowel durations in English. Importantly, in the unfocused context, where English vowels are prosodically less prominent and phonologically voiced consonants have glottal pulsing throughout or over most of their closure or fricative interval, English vowel duration differences are much less marked and thus more French-like.

While the internal organization of French and English VC dyads sheds some light on cross-linguistic disparities in absolute and relative segment durations, it does not by itself help explain the significantly smaller voicing effect in the context of French stops, as opposed to fricatives, and the greater cross-linguistic similarity in the fricative than in the stop context. A specific characteristic of stops with a potentially strong cross-linguistic differentiating effect was therefore investigated, namely differences between French and English stop releases.

4.2.2. Stop releases

First the possibility was investigated that French words with vocalic releases are phonetically equivalent to bisyllables, which could explain the shorter absolute vowel durations and smaller duration differences in the voiced vs. the voiceless environment found before French consonants with vocalic, as opposed to unvoiced, releases. The vocalically released tokens are not equivalent to regular French bisyllables with full final vowels and open first syllables, such as be-guin or sau-ver. First, there exists in French a distributional constraint against the occurrence of lower-mid /ɔ/ in open "stressed" syllables: compare pot [po] "pot" with Paul [pɔl], or sot [so] with sotte [sɔt] "stupid" (Masc. and Fem. sg.). Since the realization with vocalic releases of the words code and cotte preserves the open-mid quality of the vowel, the first syllable presumably remains closed. Second, the German listeners' ratings of the mono- vs. bisyllabic status of the French tokens suggest that only those in which the release schwa attains a certain duration are heard as bisyllables. Thus, for example, four out of five listeners judged the token corresponding to the waveform in Fig. 3(d) to be bisyllabic, but only two judged the one corresponding to the waveform in Fig. 3(c) to be bisyllabic. The realizations with vocalic releases are thus basically monosyllabic with somewhat reduced stressed vowel durations, in particular in the voiced context, resulting in a smaller voicing effect. The reduction in the voicing effect is important enough to need taking into consideration in comparisons of voicing-related vowel durations in French and a language like English which does not release its stops in a vowel. The disparity between the English and French voicing ratios is reduced when tokens with vocalic releases are taken out of the pooled French results and only
realizations with non-vocalic final releases are compared with English (Tables VI and VII). The separate voicing ratios in the environment of stops and fricatives are likewise more similar between the two languages (Table X). Thus, before stops with voiceless releases, the French durational ranges (voiced-voiceless environment) are larger and the ratios lower than before stops with vocalic releases. In other words, when French stops are articulated more like English stops (that is, without vocalic releases) the voicing effect is larger and thus more English-like. The cross-linguistic similarity is, however, limited, possibly due to additional important differences between the voiceless releases of French and English stops. The disparity in the release duration of voiceless stops (French stops having significantly longer releases than English stops) is presumably related to differences in obstruent closure duration. There is a greater supra-glottal pressure buildup during longer (French) stops than shorter (English) ones. In voiceless stops, the transglottal airflow might be further increased in French by a more active opening of the glottis, compared to English stops, which have been found to involve some glottal narrowing or even glottal closure in some American dialects (e.g. Benguerel, Hirose, Sawashima & Ushijima, 1978, for French; Coker, Umeda & Browman, 1973, for English). Together these factors predict a much greater oral air pressure buildup during the (longer) French stop closures than during the (shorter) English ones, resulting in greater and longer final turbulence (aspiration). The cross-linguistic difference in the voiceless releases of stops may be related to vowel duration as a subtle manifestation of the same type of emphasis on (closure and release) characteristics of the consonant that produces vocalic releases in French and makes French final obstruents more like initial ones.
The French voiceless stop releases may thus also have a certain reducing effect on voicing-conditioned vowel duration, which fricatives escape due to their different articulatory characteristics.
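To make the size of the release disparity concrete, the sketch below recomputes the French-minus-English differences listed in Table IV and adds the French-to-English ratio, which Table IV does not print but which underlies the earlier remark that French releases are roughly twice as long as English ones. The data are copied from Table IV; the function and variable names are illustrative only.

```python
# Mean word-final stop release durations in ms as (French, English),
# copied from Table IV.
RELEASES_MS = {
    ("final", "voiceless"): (107.44, 55.50),
    ("final", "voiced"): (45.50, 20.10),
    ("medial focused", "voiceless"): (31.90, 16.70),
    ("medial focused", "voiced"): (19.30, 10.20),
}


def release_comparison(french_ms: float, english_ms: float):
    """Return (French minus English difference in ms, French/English ratio)."""
    return round(french_ms - english_ms, 2), round(french_ms / english_ms, 2)


summary = {key: release_comparison(*pair) for key, pair in RELEASES_MS.items()}

for (context, voicing), (diff_ms, ratio) in summary.items():
    print(f"{context:15s} {voicing:9s} diff = {diff_ms:6.2f} ms  Fr/Eng = {ratio:.2f}")
```

In every cell the ratio stays near 2, while the absolute difference is largest for voiceless stops in final context.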
4.2.3. Syllable structure effects

A final disparity between French and English which seems to be related to the magnitude of the consonant-voicing effect on preceding vowel duration comes from differences in preferred patterns of syllable structure. The effect of syllable structure is most apparent in medial unfocused context. In French, the final consonants of the test words were, in large part, syllabified with the following preposition. Assuming existing accounts of French syllabification are correct in claiming that in this context consonants tend to undergo total resyllabification (e.g. Grammont, 1930; Durand, 1946; Pernod, 1937; Delattre, 1951; 1965; Laeufer, 1987, 1989), the French data reflect differential vowel durations in truly bisyllabic words with open first syllables. In English, on the other hand, although the nuclear accent or sentential stress was on Mary in this context, the test words had either a perceivably less prominent prenuclear pitch accent in one speaker's reading or (lexical) word stress (without additional pitch accent) in the renditions of the other speakers, and were thus not entirely destressed at the rate of speech used by the speakers. Assuming with existing accounts of English syllabification that under such prosodic conditions resyllabified final consonants still retain their affiliation with their word of origin (Kahn, 1976; Rudes, 1977; Bailey, 1978; Laeufer, 1989), the final consonants are presumably ambisyllabic.
There are thus notable differences in preferred syllable structure between French and English in medial unfocused context. In French the test word-final consonants are typically syllabified with the following vowel-initial preposition. In English, on the other hand, following a vowel with some degree of stress, such consonants often do not undergo resyllabification. A second pattern, more frequent with voiced than with voiceless consonants, exists in which final consonants remain affiliated with their word of origin and at the same time serve as onsets to the following vowel-initial syllable. Together with the existence of a majority of French words ending in open syllables (Juilland, 1965), the presence of total resyllabification presumably contributes to the traditional characterization of French as having a marked preference for open syllables (Delattre, 1951; Leon, 1966; Dauer, 1983); and together with the existence of a majority of English words ending in closed syllables (especially when stressed), the absence of total resyllabification contributes to the characterization of English as having a marked preference for closed syllables (Delattre, 1951; Dauer, 1983). Importantly, in both languages, vowel duration differences emerged depending on whether or not the postvocalic consonant occurred in absolute syllable-final position. As Table XI shows, the French tokens with unresyllabified consonants (17% of the voiced and 25% of the voiceless tokens) had longer vowel durations before the voiced and slightly shorter durations before the voiceless final consonants, resulting in a larger durational range and a lower ratio than the resyllabified tokens.
Although only vowel durations in the voiced tokens were found to be significantly different depending on whether or not the following consonants were strictly syllable-final (p < 0.05), it is still notable that, when French final consonants are syllabified more like English consonants, the voicing effect is larger and thus more English-like. Similarly, the English tokens with resyllabified consonants (46% of the voiced and 32% of the voiceless tokens) had somewhat shorter vowel durations than the tokens with strictly syllable-final consonants, resulting in a slightly smaller durational range and a higher ratio. The difference in vowel duration between the unresyllabified and the resyllabified tokens was not significant at the 0.05 level. The very existence of such a difference, however, is still notable, and so is the fact that when English consonants are syllabified more like French consonants in this context (by not being strictly syllable-final), the voicing effect is smaller and thus more French-like. Also noteworthy is the fact that the non-resyllabified tokens in the two languages average very similar durational ranges and ratios. Moreover, Davis & Summers (1989) found that preceding a heterosyllabic consonant (e.g. anti-grain vs. anti-crane, adopt vs. atop), English unstressed vowel lengthening as a function of consonant voicing was much smaller and statistically less significant than (stressed) lengthening preceding a tautosyllabic consonant (e.g. tabbing vs. tapping). The two studies combined suggest that syllable structure has basically the same effect in English as in French: tautosyllabicity increases the voicing effect and heterosyllabicity decreases it. Hence, the commonly assumed large dissimilarity between French and English might be explained in terms of other, more basic linguistic differences, including the realization of word-final consonants, the relative vowel and following consonant durations, prosodic structure, and patterns of preferred syllable structure. When these are factored out, differential vowel durations appear to be relatively similar in the two languages. When they are factored in, language-internal variations in differential vowel durations come to the fore which are also quite similar in the two languages.

TABLE XI. Vowel durations (in ms) before voiced and voiceless consonants, mean duration differences, and duration percentages (ratios) before resyllabified and non-resyllabified consonants in medial unfocused context, excluding flapped tokens. The durations correspond to the weighted averages of the individual word pairs

                                French                   English
                             N       ms        %       N       ms        %
(1) Before non-resyllabified consonants
    Voiced                  30   134.90               94   164.21
    Voiceless               44   102.00              119   130.32
    Difference; ratio             32.90    75.61            33.89    79.36
(2) Before resyllabified consonants
    Voiced                 145   118.64               62   152.90
    Voiceless              131   106.76               41   127.58
    Difference; ratio             11.88    89.99            25.32    83.44
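The difference and ratio figures reported in Table XI follow directly from the mean durations: the difference is the voiced mean minus the voiceless mean, and the ratio is the voiceless mean expressed as a percentage of the voiced mean. The following is a minimal sketch of that arithmetic, using the means and token counts from Table XI; the names `Cell`, `TABLE_XI`, `summarize` and `share_nonresyllabified` are hypothetical and introduced only for illustration, and the assignment of the English token counts to rows is read off the table layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """One language-by-syllabification cell of Table XI (names are illustrative)."""
    n_voiced: int
    ms_voiced: float
    n_voiceless: int
    ms_voiceless: float

    @property
    def difference(self) -> float:
        # Mean vowel duration before voiced minus before voiceless consonants (ms).
        return self.ms_voiced - self.ms_voiceless

    @property
    def ratio(self) -> float:
        # Voiceless-context duration as a percentage of voiced-context duration.
        return 100.0 * self.ms_voiceless / self.ms_voiced


# Means (ms) and token counts as given in Table XI, flapped tokens excluded.
TABLE_XI = {
    ("French", "non-resyllabified"): Cell(30, 134.90, 44, 102.00),
    ("French", "resyllabified"): Cell(145, 118.64, 131, 106.76),
    ("English", "non-resyllabified"): Cell(94, 164.21, 119, 130.32),
    ("English", "resyllabified"): Cell(62, 152.90, 41, 127.58),
}


def summarize(table):
    """Return {(language, context): (difference in ms, ratio in %)}, rounded to 2 places."""
    return {key: (round(c.difference, 2), round(c.ratio, 2)) for key, c in table.items()}


def share_nonresyllabified(table, language, voiced=True):
    """Percentage of a language's tokens whose final consonant was not resyllabified."""
    non = table[(language, "non-resyllabified")]
    res = table[(language, "resyllabified")]
    n_non = non.n_voiced if voiced else non.n_voiceless
    n_res = res.n_voiced if voiced else res.n_voiceless
    return 100.0 * n_non / (n_non + n_res)


if __name__ == "__main__":
    for (language, context), (diff, ratio) in summarize(TABLE_XI).items():
        print(f"{language:8s}{context:20s}difference = {diff:5.2f} ms   ratio = {ratio:5.2f}%")
```

A lower ratio corresponds to a larger voicing effect, so the non-resyllabified cells (75.61% for French, 79.36% for English) show the stronger effect in both languages. The share helper reproduces the 17% and 25% quoted for French; it is not applied to English here because the table counts exclude flapped tokens.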
5. Conclusion
The results from this comparison between obstruent voicing-conditioned vowel duration in French and English in several prosodic and phrasal contexts replicate those of previous single-language studies of French and English which have found that the effect of obstruent voicing on preceding vowel duration is highly variable. In particular, phrase-finally and medially under focus, vowel duration differences were found to be greatly intensified in both languages by the presence of focus, final lengthening, closed syllable structure, and non-vocalically released obstruents. In medial unfocused context, they were found to be reduced due to the absence of focus and longer word length resulting from ambisyllabification or resyllabification. Similarly, they are enhanced in both languages by following fricatives rather than stops. Thus, both French and English exhibit across contexts the same type of variation in the magnitude of the voicing effect, and this variation is determined by the same factors. Besides being dependent on a number of linguistic factors like the ones isolated here, differential vowel duration has also been found to be highly dependent on various extralinguistic factors, including speaking mode and speaking rate, which enhance or obscure its effects, making an account in terms of a simple rule-governed allophonic alternation between vowel variants lengthened (in the voiced environment) and unlengthened (in the voiceless) problematic in any language. A more accurate description is called for involving a phonetic gradient, that is, a parameter of continuously varying vowel duration which interacts in complicated ways with factors such as the ones isolated here. Differential vowel duration thus belongs, in English as in other languages, along with other highly context-sensitive phonetic timing phenomena, to the level of phonetic realization which derives those aspects of the representation involving continuous time.
I would like to thank A. Rialland of the Université de la Sorbonne Nouvelle (Paris III) and the C.N.R.S. for having arranged for the French recordings; and Y. Zoubir for having made the recordings at the Institut d'Etudes Linguistiques et Phonétiques. I am also grateful to three anonymous reviewers and the editor for very helpful comments on a previous draft.
References
Bailey, C.-J. N. (1978) Gradience in English syllabization and a revised concept of unmarked syllabization. Bloomington: Indiana University Linguistics Club.
Bartkova, K. & Sorin, C. (1987) A model of segmental duration for speech synthesis in French, Speech Communication, 6, 245-260.
Beckman, M. E. & Pierrehumbert, J. (1986) Intonational structure in Japanese and English, Phonology Yearbook, 3, 255-309.
Belasco, S. (1953) The influence of force of articulation of consonants on vowel duration, Journal of the Acoustical Society of America, 25, 1015-1016.
Benguerel, A.-P., Hirose, H., Sawashima, M. & Ushijima, T. (1978) Laryngeal control in French stop production: a fiberscopic, acoustic and electromyographic study, Folia Phoniatrica, 30, 175-198.
Carton, F., Rossi, M., Autesserre, D. & Léon, P. (1983) Les accents des Français. Paris: Hachette.
Chen, M. (1970) Vowel length variation as a function of the voicing of the consonant environment, Phonetica, 22, 125-159.
Chomsky, N. & Halle, M. (1968) The sound pattern of English. New York: Harper & Row.
Coker, C. H., Umeda, N. & Browman, C. P. (1973) Automatic synthesis from text, IEEE Transactions on Audio Electroacoustics, AU-21, 293-297.
Cooper, W. & Danly, M. (1981) Segmental and temporal aspects of utterance-final lengthening, Phonetica, 38, 106-115.
Crystal, H. & House, A. (1982) Segmental durations in connected speech signals: preliminary results, Journal of the Acoustical Society of America, 72, 705-716.
Crystal, H. & House, A. (1988) Segmental durations in connected-speech signals: current results, Journal of the Acoustical Society of America, 83, 1553-1573.
Dauer, R. M. (1983) Stress-timing and syllable-timing reanalyzed, Journal of Phonetics, 11, 51-62.
Davis, S. & Summers, W. V. (1989) Vowel length and closure duration in word-medial VC sequences, Journal of the Acoustical Society of America, 85, Supplement 1, S28.
Debrock, M. (1977) An acoustic correlate of the force of articulation, Journal of Phonetics, 5, 61-80.
Delattre, P. (1939) La durée des voyelles en français. Paris: d'Artrey.
Delattre, P. (1951) Principes de phonétique française à l'usage des étudiants anglo-américains. Middlebury, VT: Middlebury College Store.
Delattre, P. (1953) Les modes phonétiques du français, French Review, 27, 59-63.
Delattre, P. (1962) Some factors of vowel duration and their cross-linguistic validity, Journal of the Acoustical Society of America, 34, 1141-1143.
Delattre, P. (1965) Comparing the phonetic features of English, French, German and Spanish: an interim report. New York: Chilton Books/Heidelberg: Julius Groos Verlag.
Delattre, P. (1966) Durée vocalique et consonnes subséquentes. In Studies in French and comparative phonetics (P. Delattre, editor), pp. 130-132. The Hague: Mouton.
Delattre, P. (1969) An acoustic and articulatory study of vowel reduction in four languages, International Review of Applied Linguistics, 7, 295-327.
Delgutte, B. (1978) Technique for the perceptual investigation of F0 contours with application to French, Journal of the Acoustical Society of America, 64, 1319-1332.
Denes, P. (1955) Effect of duration on the perception of voicing, Journal of the Acoustical Society of America, 27, 761-764.
Di Cristo, A. (1980) La durée intrinsèque des voyelles du français, Travaux de l'Institut de Phonétique d'Aix, 7, 211-235.
Dommelen, W. A. van (1981) Kontextbedingte Vokaldehnung im Französischen, Arbeitsberichte des Instituts für Phonetik der Universität Kiel (AIPUK), 16, 95-108.
Donegan, P. (1974) On the writer/rider distinction: a brief experimental study, OSU Working Papers in Linguistics, 17, 180-198.
Durand, M. (1946) Voyelles longues et voyelles brèves. Paris: Klincksieck.
Durand, P. (1985) Variabilité acoustique et invariance en français: consonnes occlusives et voyelles. Paris: Editions du Centre National de la Recherche Scientifique.
Fischer-Jørgensen, E. (1968) Les occlusives françaises et danoises d'un sujet bilingue, Word, 24, 112-153.
Fisher, W. M. & Hirsch, I. J. (1976) Intervocalic flapping in English, Chicago Linguistic Society, 12, 183-198.
Flege, J. E. & Brown, W. S. (1982) The voicing contrast between English /p/ and /b/ as a function of stress and position-in-utterance, Journal of Phonetics, 10, 335-345.
Flege, J. E. & Hillenbrand, J. (1986) Differential use of temporal cues to the /s/-/z/ contrast by native and non-native speakers of English, Journal of the Acoustical Society of America, 79, 508-517.
Flege, J. E. & Hillenbrand, J. (1987) Differential use of closure voicing and release bursts as cues to stop voicing by native speakers of French and English, Journal of Phonetics, 15, 203-208.
Fox, R. A. & Terbeek, D. (1977) Dental flaps, vowel duration and rule ordering in American English, Journal of Phonetics, 5, 27-34.
Gottfried, T. L. (1984) Effects of consonant context on the perception of French vowels, Journal of Phonetics, 12, 91-114.
Grammont, M. (1930) Traité pratique de prononciation française. Paris: Delagrave.
Haggard, M. (1978) The devoicing of voiced fricatives, Journal of Phonetics, 6, 95-102.
Harris, M. & Umeda, N. (1974) Effect of speaking mode on temporal factors in speech: vowel duration, Journal of the Acoustical Society of America, 56, 1016-1018.
Hirst, D. & Di Cristo, A. (1984) French intonation: a parametric approach, Die Neueren Sprachen, 83, 554-569.
Hogan, J. T. & Rozsypal, A. J. (1980) Evaluation of vowel duration as a cue for the voicing distinction in the following word-final consonant, Journal of the Acoustical Society of America, 67, 1764-1771.
House, A. S. (1961) On vowel duration in English, Journal of the Acoustical Society of America, 33, 1174-1178.
House, A. S. & Fairbanks, G. (1953) The influence of consonant environment upon the secondary acoustical characteristics of vowels, Journal of the Acoustical Society of America, 25, 103-113.
Huff, C. (1980) Voicing and flap neutralization in New York City English, Research in Phonetics, 1, 233-256.
Juilland, A. (1965) Dictionnaire inverse de la langue française. The Hague: Mouton.
Kahn, D. (1976) Syllable-based generalizations in English phonology. Bloomington: Indiana University Linguistics Club.
Klatt, D. H. (1973) Interaction between two factors that influence vowel duration, Journal of the Acoustical Society of America, 54, 1102-1104.
Klatt, D. H. (1975) Vowel lengthening is syntactically determined in connected discourse, Journal of Phonetics, 3, 129-140.
Klatt, D. H. (1976) Linguistic uses of segmental duration in English: acoustic and perceptual evidence, Journal of the Acoustical Society of America, 59, 1208-1221.
Kohler, K. J. (1979) Parameters in the production and the perception of plosives in German and French, Arbeitsberichte des Instituts für Phonetik der Universität Kiel (AIPUK), 12, 261-292.
Kohler, K. J. (1984) Phonetic explanation in phonology: the feature fortis/lenis, Phonetica, 41, 150-174.
Kohler, K. J. & Künzel, H. J. (1978) The temporal organisation of closing-opening movements for sequences of vowels and plosives in German, Arbeitsberichte des Instituts für Phonetik der Universität Kiel (AIPUK), 10, 117-167.
Kohler, K. J., Dommelen, W. A. van & Timmerman, G. (1981) Die Merkmalpaare stimmhaft/stimmlos und lenis/fortis in Produktion und Perzeption der französischen Obstruenten, Arbeitsberichte des Instituts für Phonetik der Universität Kiel (AIPUK), 14, 1-125.
Laeufer, C. (1987) Constraints on the domain of phrasal resyllabification in French, OSU Working Papers in Linguistics, 36, 75-100.
Laeufer, C. (1989) French linking, English flapping, and the relationship between syntax and phonology. In Studies in Romance Linguistics (C. Kirschner & J. DeCesaris, editors), pp. 225-247. Amsterdam: Benjamins.
Landschultz, K. (1967) Quantité vocalique en français: relations quantitatives des voyelles accentuées suivies d'une consonne fricative, Annual Report of the Institute of Phonetics, University of Copenhagen (ARIPUC), 2, 109-118.
Léon, P. (1966) Prononciation du français standard. Paris: Didier.
Lisker, L. (1977) Factors in the maintenance and cessation of voicing, Phonetica, 34, 304-306.
Luce, P. & Charles-Luce, J. (1985) Contextual effects on vowel duration, closure duration, and the consonant/vowel ratio in speech production, Journal of the Acoustical Society of America, 78, 1949-1957.
Mack, M. (1981) French and English word-final stop consonants: monolingual and bilingual production, Brown University Working Papers in Linguistics, 66-72.
Mack, M. (1982) Voicing-dependent vowel duration in English and French: monolingual and bilingual production, Journal of the Acoustical Society of America, 71, 173-178.
Maddieson, I. (1977) Further studies on vowel length before aspirated consonants, UCLA Working Papers in Phonetics, 38, 82-89.
Maddieson, I. & Gandour, J. (1976) Vowel length before aspirated consonants, UCLA Working Papers in Phonetics, 31, 47-52.
Malécot, A. & Lindheimer, E. (1966) The contribution of releases to the identification of final stops in French, Studia Linguistica, 20, 99-109.
Ohala, J. J. (1983) The origin of sound patterns in vocal tract constraints. In The production of speech (P. F. MacNeilage, editor), pp. 189-216. New York: Springer-Verlag.
O'Shaughnessy, D. (1981) A study of French vowel and consonant durations, Journal of Phonetics, 9, 390-404.
O'Shaughnessy, D. (1984) A multispeaker analysis of durations in read French paragraphs, Journal of the Acoustical Society of America, 76, 1664-1672.
Pernod, N. (1937) La liaison en français, liaison et enchaînement, Publication of the Modern Language Association, 22, 333-338.
Peterson, G. E. & Lehiste, I. (1960) Duration of syllabic nuclei in English, Journal of the Acoustical Society of America, 32, 693-703.
Pierrehumbert, J. (1980) The phonology and phonetics of English intonation. Ph.D. Dissertation, MIT, Cambridge.
Port, R. F. (1977) The influence of speaking tempo on the duration of stressed vowel and medial stop in English Trochee words. Bloomington: Indiana University Linguistics Club.
Port, R. F. (1981) Linguistic timing factors in combination, Journal of the Acoustical Society of America, 69, 262-274.
Port, R. F., Al-Ani, S. & Maeda, S. (1980) Temporal compensation and universal phonetics, Phonetica, 37, 235-252.
Price, P. J. (1981) A cross-linguistic study of flaps in Japanese and American English. Ph.D. Dissertation, University of Pennsylvania, Philadelphia.
Raphael, L. J. (1972) Preceding vowel duration as a cue to the perception of the voicing characteristic of word-final consonants in American English, Journal of the Acoustical Society of America, 51, 1296-1303.
Rositzke, H. (1943) The articulation of final stops in general American speech, American Speech, 18, 39-42.
Rudes, B. (1977) Another look at syllable structure. Bloomington: Indiana University Linguistics Club.
Sharf, D. J. (1962) Duration of post-stress intervocalic stops and preceding vowels, Language and Speech, 5, 26-30.
Sheldon, D. (1973) A short experimental investigation of the phonological view of the writer-rider contrast in U.S. English, Journal of Phonetics, 1, 339-346.
Shockey, L. R. (1974) Phonetic and phonological properties of connected speech, Ohio State University Working Papers in Linguistics, 17, 1-144.
Slis, I. H. & Cohen, A. (1969) On the complex regulating the voiced-voiceless distinction (part I), Language and Speech, 12, 80-102.
Slis, I. H. & Cohen, A. (1970) On the complex regulating the voiced-voiceless distinction (part II), Language and Speech, 13, 137-155.
Smith, B. L. (1978) Temporal aspects of English speech production: a developmental perspective, Journal of Phonetics, 6, 37-67.
Summers, W. V. (1987) Effects of stress and final-consonant voicing on vowel production: articulatory and acoustic analyses, Journal of the Acoustical Society of America, 82, 847-863.
Thorsen, O. M. (1966) Voice assimilation of stop consonants and fricatives in French and its relation to sound duration and intra-oral air pressure, Annual Report of the Institute of Phonetics of the University of Copenhagen, 1, 67-76.
Umeda, N. (1975) Vowel duration in American English, Journal of the Acoustical Society of America, 58, 434-445.
Umeda, N. & Coker, C. H. (1974) Allophonic variation in American English, Journal of Phonetics, 2, 1-5.
Vaissière, J. (1974) On French prosody, Research Laboratory of Electronics (MIT), Quarterly Progress Report, 114, 212-223.
Vaissière, J. (1975) Further note on French prosody, Research Laboratory of Electronics (MIT), Quarterly Progress Report, 115, 251-262.
Vaissière, J. (1980) La structuration acoustique de la phrase française, Ann. Scu. Norm. Sup. Pisa III, 10, 529-560.
Veatch, T. C. (1990) Final devoicing of fricatives in English. Paper given at the 120th Meeting of the Acoustical Society of America, May 1990.
Walsh, T. & Parker, F. (1981) Vowel length and 'voicing' in a following consonant, Journal of Phonetics, 9, 305-308.
Wajskop, M. (1978) Indices temporels des occlusives intervocaliques en français, Rapport d'Activités de l'Institut de Phonétique, Université de Bruxelles, U, 71-98.
Wardrip-Fruin, C. (1982) On the status of temporal cues to phonetic categories: preceding vowel duration as a cue to voicing in final stop consonants, Journal of the Acoustical Society of America, 71, 187-195.
Westbury, J. (1983) Enlargement of the supraglottal cavity and its relation to stop consonant voicing, Journal of the Acoustical Society of America, 73, 1322-1336.
Zimmerman, S. A. & Sapon, S. M. (1958) Note on vowel duration seen cross-linguistically, Journal of the Acoustical Society of America, 30, 152-153.
Zue, V. & Laferriere, M. (1979) Acoustic study of medial /t, d/ in American English, Journal of the Acoustical Society of America, 66, 1039-1050.