Journal of Phonetics (1981) 9, 273-281
Durational relationship between Japanese stops and vowels
Yayoi Homma
Department of Foreign Languages, Osaka Gakuin University, Kishibe, Suita-shi, Osaka 564, Japan
Received 28th April 1980
Abstract:
The durational relationship between Japanese stops and vowels is examined by measuring closure duration, voice onset time, and vowel duration in two- and three-mora words with single and geminated stops. The results reveal that these three variables are closely related to a fixed word duration, although the duration of each syllable in a word differs. Both universal and language-specific durational principles are at work in Japanese, and at the word level the acoustic measurements fit well with the linguistic uses of segmental duration in Japanese.
0095-4470/81/030273+09 $02.00/0 © 1981 Academic Press Inc. (London) Ltd.
1. Introduction
Japanese is a mora-counting language. Phonologically, the length of an utterance depends on the number of moras it contains. For example, kan ("a can") and kana ("syllabary") are two-mora words that linguistically have the same duration. The ratio of the durations of gaka ("painter") and gakka ("lesson") corresponds to the ratio of their mora counts, namely 2 : 3, because the first part of the geminated stop /kk/ counts as one mora. An examination of the durational relationship between Japanese stops and vowels in several contexts, measuring closure duration, voice onset time (VOT), and vowel duration, might reveal acoustic evidence for the linguistic uses of segmental duration in Japanese. This paper contrasts the results of experiments on Japanese, a mora-counting language, with those on English, a stress-counting language.

Many investigators have conducted experiments on closure duration, VOT, and vowel duration in English. They have found that voiceless stops have greater closure duration than voiced stops. According to Lisker (1957), the average closure duration for /p/ was about 120 ms and the average value for /b/ was 75 ms in trochee words. Most investigators agree that, when other factors are constant, closure duration is longer for labials than for alveolars and velars (Lehiste, 1970). VOT increases as the place of closure moves toward the back (Lisker & Abramson, 1964). VOT is longer in stressed syllables than in unstressed syllables (Lisker & Abramson, 1967). VOT is also closely related to the quality of the following vowel (Klatt, 1975; Port & Rotunno, 1979). Vowel duration in English varies greatly with the environment. Vowels are longest before pauses. Stressed vowels are longer than unstressed ones (Klatt, 1976). Vowels before voiced consonants are longer than vowels before voiceless consonants (House & Fairbanks, 1953; Denes, 1955; House, 1961; Peterson & Lehiste, 1960). The number of syllables (Klatt, 1973; Harris & Umeda, 1974) and speaking rate (Lehiste, 1970; Port, 1976) are also important factors influencing vowel duration.

For Japanese, some experiments on closure duration and vowel duration have been reported. Closure duration is longer for voiceless stops than for voiced ones (Han, 1962; Okada, 1969, 1971), and /p/ is longer than /t/ and /k/ (Han, 1962). Vowel duration is longer in voiced environments (Okada, 1969; Homma, 1973). Pitch accent does not have a significant influence on vowel duration (Homma, 1973). Temporal compensation works within a word (Homma, 1973; Maeda, 1979). VOT in Japanese, however, has received the least attention (Homma, 1980). The specific objectives of this paper are (1) to confirm the results of the previous experiments on Japanese; (2) to discuss VOT; and (3) to study the durational relationship between stops and vowels.
2. Experiment
The purpose of this experiment was to make acoustic measurements of closure duration, VOT, and vowel duration in two- and three-mora words with single and geminated stops.
2.1. Methods
Twenty-four real and nonsense test words were prepared. The words contained the vowel /a/ and the stops /p, t, k, b, d, g/ in both initial and medial position.

Test words
             (A) two moras               (B) three moras
/p/ /b/      papa  paba  bapa  baba      pappa  pabba  bappa  babba
/t/ /d/      tata  tada  data  dada      tatta  tadda  datta  dadda
/k/ /g/      kaka  kaga  gaka  gaga      kakka  kagga  gakka  gagga
These words were placed in the following sentence frame.
Kore wa ___ desu. ("This is ___.")
The sentences containing the test words were arranged in random order on a list. The list was read three times by four Japanese speakers, three males and one female, at a natural speed with the accent on the first syllable. A total of 288 sentences were recorded, and wide-band spectrograms were made of each utterance with a Kay Sonagraph (6061B) in the Phonetic Laboratory of Indiana University.
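The design above can be enumerated mechanically. The following is a minimal sketch, not part of the original procedure: the names `STOP_PAIRS` and `build_test_words` are hypothetical, and the script only checks that the word pattern described in the text yields 24 words and 288 recorded sentences.

```python
from itertools import product

# (voiceless, voiced) stop pairs by place of articulation, as in the word list.
STOP_PAIRS = [("p", "b"), ("t", "d"), ("k", "g")]
REPETITIONS = 3  # each list was read three times
SPEAKERS = 4     # three male speakers and one female speaker


def build_test_words():
    """Return the 24 test words of the form C1-a-C2-a and C1-a-C2C2-a."""
    words = []
    for voiceless, voiced in STOP_PAIRS:
        stops = (voiceless, voiced)
        # (A) two-mora words with a single medial stop, e.g. "papa"
        for c1, c2 in product(stops, repeat=2):
            words.append(f"{c1}a{c2}a")
        # (B) three-mora words with a geminated medial stop, e.g. "pappa"
        for c1, c2 in product(stops, repeat=2):
            words.append(f"{c1}a{c2}{c2}a")
    return words


words = build_test_words()
n_sentences = len(words) * REPETITIONS * SPEAKERS
print(len(words), n_sentences)
```

The stops within each place of articulation are fully crossed in initial and medial position, which is why each row of the word list has four single-stop and four geminated-stop items.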
2.2. Measurement procedures
2.2.1. Closure duration. In word-initial position, closure duration was difficult to measure because there was sometimes a pause before the test word. Therefore, only the closure duration of the medial stop was measured, from the end of the formant transition of the first vowel to the burst of the medial stop.
2.2.2. Voice onset time. The release of a plosive consonant, especially a voiceless stop, involves three phases: explosion, frication, and aspiration. The frication and aspiration phases together make up the voice onset time (VOT) (Lisker & Abramson, 1964). VOT is one of the most important cues distinguishing voiced from voiceless stops. Klatt (1975) measured frication and aspiration separately, but in this experiment VOT was measured as a whole, as the interval between the release of the stop and the onset of glottal vibration. The onset of glottal vibration appears on the spectrogram as the beginning of the regularly spaced vertical striations.
2.2.3. Vowel duration. The duration of the first and second vowels was measured from the onset of glottal vibration to the closure for the following stop, shown by the abrupt cessation of energy in all the formants.
2.3. Results and discussion
2.3.1. Closure duration. Table I gives the results.

Table I. Average closure duration of the medial stops for each speaker in milliseconds
Stop     S1    S2    S3    S4    Mean
/p/      90    64    72    80     77
/b/      50    59    56    55     55
/t/      87    56    42    62     62
/d/      37    41    22    39     35
/k/      75    56    51    62     61
/g/      33    49    41     ?     41
/pp/    184   180   199   170    183
/bb/    144   161   181   148    159
/tt/    190   178   154   159    170
/dd/    130   148   156   141    144
/kk/    169   179   174   176    175
/gg/    109   124   151   151    134
The following points were observed. (1) Closure duration in Japanese was longer for voiceless stops than for voiced stops, just as in English (Lisker, 1957). (2) Labials were longer than dentals and velars, but the difference was not very large. (3) There was very little difference between dentals and velars. (4) The ratio of closure duration between single and geminated stops was about 1 : 3.

Table II shows the average closure duration in milliseconds and as a percentage. In (B), the average closure duration for voiced stops was taken as 100%, and in (C), the closure duration of single stops was taken as 100%.

Table II. (A) Average closure duration (ms); (B) closure duration ratio between voiced and voiceless stops (%); (C) closure duration ratio between single and geminated stops (%)
              (A) ms                (B) %                 (C) %
              Single  Geminated     Single  Geminated     Single  Geminated
Voiceless       67      176          152      121          100      263
Voiced          44      146          100      100          100      332

According to Table II, geminated stops were less influenced by the voicing feature than single stops. This suggests that the last part of a voiced geminated stop is voiceless before the burst. These findings are consistent with the fact that a VOT could be measured not only for voiceless geminated stops but also for voiced ones, though seldom for single voiced stops, a phenomenon we will discuss later. The voicing effects on closure duration and the ratio between single and geminated stop closure by and large agreed with Han's measurements (1962). She reported that "/p, t, k, c, s/ show approximately 20 to 40% increase in duration over their corresponding voiced ones" and that "the duration of short and long consonants is, on the average, in the ratio of 1.0 to 2.6 and often 1.0 to 3.0."
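The percentages in Table II can be re-derived from the speaker means of Table I. The sketch below is illustrative only: the helper names are hypothetical, and it assumes that each average was rounded to the nearest millisecond before the ratios were taken, which is an assumption that happens to reproduce the published figures.

```python
# Mean closure durations in ms, copied from the Mean column of Table I.
MEAN_CLOSURE_MS = {
    "p": 77, "b": 55, "t": 62, "d": 35, "k": 61, "g": 41,
    "pp": 183, "bb": 159, "tt": 170, "dd": 144, "kk": 175, "gg": 134,
}


def rounded_mean(stops):
    """Average the listed stops and round to the nearest millisecond."""
    return round(sum(MEAN_CLOSURE_MS[s] for s in stops) / len(stops))


def percent(value, reference):
    """Express value as a whole-number percentage of reference."""
    return round(100 * value / reference)


# (A) average closure duration by voicing and gemination
single = {"voiceless": rounded_mean(["p", "t", "k"]),
          "voiced": rounded_mean(["b", "d", "g"])}
geminated = {"voiceless": rounded_mean(["pp", "tt", "kk"]),
             "voiced": rounded_mean(["bb", "dd", "gg"])}

# (B) voiceless closure as a percentage of voiced closure (voiced = 100%)
voicing_effect = {
    "single": percent(single["voiceless"], single["voiced"]),
    "geminated": percent(geminated["voiceless"], geminated["voiced"]),
}

# (C) geminated closure as a percentage of single closure (single = 100%)
gemination_effect = {
    "voiceless": percent(geminated["voiceless"], single["voiceless"]),
    "voiced": percent(geminated["voiced"], single["voiced"]),
}

print(single, geminated)
print(voicing_effect)
print(gemination_effect)
```

Under this rounding assumption, the voicing effect shrinks from 152% in single stops to 121% in geminated stops, which is the asymmetry the text describes.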
2.3.2. Voice onset time. Table III shows the results.

Table III. Voice onset time of the initial and medial stops for each speaker in milliseconds

(A) VOT of the initial stops
Context   S1    S2    S3    S4    Mean
/p-p/     18    34    14    29     24
/p-b/     24    43    19    31     29
/t-t/     26    42    26    34     32
/t-d/     16    53    24    33     32
/k-k/     48    51    35    46     45
/k-g/     63    62    57    60     61
/g-k/     14    20    10    12     14
/g-g/      8    19    17    13     14

(B) VOT of the medial stops
Stop      S1    S2    S3    S4    Mean
/p/        3     7     3    13      7
/t/        8    16    18    23     16
/k/       22    26    21    25     24
/pp/       0    16     7    20     11
/tt/       1    14    10    28     13
/kk/      19    37    20    36     28
/bb/       0     4     1     2      2
/dd/       0     7    11    12      8
/gg/      25    25    14    23     22
The following points were observed. (1) VOT in Japanese clearly increased as the place of closure moved toward the back of the mouth, just as in English (Lisker & Abramson, 1964, 1967). (2) VOT was shorter before voiceless stops, but the difference was negligible except with velars. (3) VOT was longer in accented syllables, as in English (Lisker & Abramson, 1967), although Japanese has pitch accent, not stress accent. The average VOT of initial /p, t, k/ was 37 ms, and the average of medial /p, t, k/ was 16 ms. (4) Gemination of stops did not affect VOT. The average value of /pp, tt, kk/ was 17 ms. (5) As mentioned above, VOT was observed in voiced geminated stops, but was very rare in single voiced stops, except for /g/ in word-initial position.

Klatt (1975) reported that the average VOT in English was 61 ms for voiceless stops and 18 ms for voiced ones in stressed monosyllabic words. That voiced stops have a measurable VOT indicates that English voiced stops are not truly voiced in initial position. Although comparing VOT in English and Japanese is not a simple task, it may be safe to say that English has longer VOT than Japanese (Homma, 1980). Lisker & Abramson (1967) found that although the effects of stress on VOT were rather limited, in English "stress and VOT are not strictly independent of one another." They reported that the difference between mean values for /p, t, k/ in isolated words as against sentences was about 25 ms, and that the difference between stressed and unstressed /p, t, k/ was about 6 ms in sentences. In Japanese, accent and VOT were clearly related. The difference between accented and unaccented /p, t, k/ was about 21 ms.
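The averages quoted in this section follow from the Mean column of Table III. The following sketch, with hypothetical variable names, recomputes them; it assumes the accent difference was taken between the two rounded averages, since the text gives 37 ms, 16 ms, and about 21 ms.

```python
# Mean VOT values in ms, copied from the Mean column of Table III.
INITIAL_VOT_MS = {"p-p": 24, "p-b": 29, "t-t": 32, "t-d": 32, "k-k": 45, "k-g": 61}
MEDIAL_SINGLE_VOT_MS = {"p": 7, "t": 16, "k": 24}
MEDIAL_GEMINATE_VOT_MS = {"pp": 11, "tt": 13, "kk": 28}


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


# Initial stops sit in the accented first syllable; medial stops do not.
initial_avg = round(mean(INITIAL_VOT_MS.values()))
medial_avg = round(mean(MEDIAL_SINGLE_VOT_MS.values()))
geminate_avg = round(mean(MEDIAL_GEMINATE_VOT_MS.values()))

# Accent effect: difference between the two rounded averages (an assumption).
accent_difference = initial_avg - medial_avg

print(initial_avg, medial_avg, geminate_avg, accent_difference)
```

The near-identical medial averages for single and geminated voiceless stops are what observation (4) rests on, while the gap between initial and medial averages underlies observation (3).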
2.3.3. Vowel duration. Tables IV and V show the average duration of the first vowel /a/ and the second vowel /a/ respectively.

Table IV. Average vowel duration of the first /a/ for each speaker in milliseconds
Context   S1    S2    S3    S4    Mean
/p-p/     90    63    59    69     70
/p-b/    101    59    86    78     81
/b-p/    103    93    81    96     93
/b-b/    104    97    90   108    100
/t-t/     94    49    71    62     69
/t-d/    113    63    85    77     85
/d-t/    111    98    99   100    102
/d-d/    124   105    97   108    109
/k-k/     74    64    85    65     72
/k-g/    102    82    75    84     86
/g-k/    107   111   113   108    110
/g-g/    126   108   107   114    114
Table V. Average vowel duration of the second /a/ for each speaker in milliseconds
Context   S1    S2    S3    S4    Mean
/p-d/    106    76   107    92     95
/b-d/    114    98   108   110    108
/pp-d/    85    78    92    87     86
/bb-d/   104    92   103   103    101
/t-d/    106    75    96    87     91
/d-d/    123   100   109   115    112
/tt-d/    88    73    84    81     82
/dd-d/   102    98    89   104     98
/k-d/    100    64    83    89     84
/g-d/    123    96    92    95    102
/kk-d/    89    68    79    80     79
/gg-d/    95    94   105   100     99
The following points were observed. (1) Voicing of both the preceding and the following stop had a lengthening effect on vowel duration. In light of this we may ask whether the preceding or the following consonant has more influence on vowel duration in Japanese. Okada (1969) reported that the following consonant had a slightly stronger influence, while Homma (1973) found a stronger effect of the preceding consonant, and Maeda (1979) obtained the same results as Homma. Table VI shows the average vowel duration in milliseconds and as a percentage under the influence of (A) the preceding stop and (B) the following stop. Table VI is based on Table IV. For all the stops, vowel duration in voiced environments was taken as 100%.

Table VI. Effects of voicing of the stops on vowel duration
              (A) The preceding stop     (B) The following stop
              ms       %                 ms       %
Voiceless      77      73                86       90
Voiced        105     100                96      100
Comparing (A) and (B), we find that the preceding stop has more influence than the following stop. A voiceless stop shortens the vowel by approximately 27% in (A) and by approximately 10% in (B). Therefore, we can conclude that in Japanese the preceding consonant has a stronger effect on vowel duration. (2) As seen in Table VII, vowel duration of the first /a/ increased slightly as the place of closure of the adjacent stops moved toward the back.

Table VII. Comparison of the first and second vowels in milliseconds
                  Labial   Apical   Velar   Mean
The first /a/       86       91      96      91
The second /a/      98       96      91      95
This trend parallels the VOT results. In the second /a/, however, vowel duration decreased in the same direction. (3) There was no significant difference in duration between accented and unaccented vowels. The second, unaccented vowels were somewhat longer, as in Homma (1973). This is very different from English. (4) Gemination of the stop had a slight influence on the following vowel. Vowels were a little shorter after geminated stops. The average difference was about 8 ms.

Contextual effects on vowel duration in Japanese are very different from those in English. First, in English, voicing of the following consonant has a much stronger influence on vowel duration (Peterson & Lehiste, 1960; Chen, 1970) than in Japanese. Secondly, in English, stress-accented vowels are much longer than unstressed vowels. Klatt (1976) reported that the average duration of stressed vowels is approximately 130 ms and that of unstressed vowels approximately 70 ms in connected discourse. In Japanese, the mean of the first /a/, in accented syllables, was 91 ms, and that of the second /a/ was 95 ms.
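The comparison of preceding and following stops in Table VI can be recomputed from the Mean column of Table IV. This is a hedged sketch: the function and variable names are hypothetical, and it assumes each group average was rounded to the nearest millisecond before the percentage was taken, which reproduces the published values.

```python
# Mean duration (ms) of the first /a/, keyed by (preceding stop, following stop),
# copied from the Mean column of Table IV.
MEAN_FIRST_VOWEL_MS = {
    ("p", "p"): 70, ("p", "b"): 81, ("b", "p"): 93, ("b", "b"): 100,
    ("t", "t"): 69, ("t", "d"): 85, ("d", "t"): 102, ("d", "d"): 109,
    ("k", "k"): 72, ("k", "g"): 86, ("g", "k"): 110, ("g", "g"): 114,
}
VOICELESS = {"p", "t", "k"}


def group_mean(position, voiceless):
    """Rounded mean vowel duration for one voicing value of one stop.

    position 0 selects the preceding stop, position 1 the following stop.
    """
    values = [
        duration
        for context, duration in MEAN_FIRST_VOWEL_MS.items()
        if (context[position] in VOICELESS) == voiceless
    ]
    return round(sum(values) / len(values))


def voicing_effect(position):
    """Return (voiceless ms, voiced ms, voiceless as % of voiced)."""
    voiceless_ms = group_mean(position, True)
    voiced_ms = group_mean(position, False)
    return voiceless_ms, voiced_ms, round(100 * voiceless_ms / voiced_ms)


preceding = voicing_effect(0)
following = voicing_effect(1)
print(preceding, following)
```

The preceding-stop contrast leaves the vowel at 73% of its voiced-context value, against 90% for the following-stop contrast, which is the basis for the claim that the preceding consonant matters more in Japanese.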
2.3.4. Durational relationship. Table VIII shows the average word duration in milliseconds and as a ratio. It is of particular interest that the acoustically measured word-duration ratio fits the linguistic mora ratio of 2 : 3 so well.

Table VIII. Average word duration in milliseconds
                   With single medial stop   With geminated medial stop
Labial                     264                        376
Apical                     260                        372
Velar                      279                        401
Mean                       268                        383
Ratio                      2                          2.9
Number of moras            2                          3
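The ratio row of Table VIII can be checked by scaling the geminated-word duration so that the single-stop word counts as two units. The sketch below uses hypothetical names and the values printed in the table.

```python
# Average word durations in ms from Table VIII:
# place of articulation -> (single medial stop, geminated medial stop).
WORD_DURATION_MS = {
    "labial": (264, 376),
    "apical": (260, 372),
    "velar": (279, 401),
}

single_mean = sum(single for single, _ in WORD_DURATION_MS.values()) / len(WORD_DURATION_MS)
geminated_mean = sum(gem for _, gem in WORD_DURATION_MS.values()) / len(WORD_DURATION_MS)

# Scale so that a two-mora word equals 2; a three-mora word would ideally be 3.
MORAS_SINGLE = 2
scaled_ratio = MORAS_SINGLE * geminated_mean / single_mean

print(round(single_mean), round(geminated_mean), round(scaled_ratio, 1))
```

A perfectly mora-timed pattern would give 3.0 here, so the measured value falls slightly short of the phonological ratio while remaining close to it.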
Although Han (1962) wrote that the unit of duration in Japanese is associated with the syllable, it may be more appropriate to say that the domain of the durational pattern is not the syllable but the word. The following examples illustrate this.

[Figure: segment durations in milliseconds for /papa/ and /gaga/]
Word      VOT   First vowel   Closure   Second vowel   Word duration
/papa/     18        67          77          98             260
/gaga/     13       109          40         105             267
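The arithmetic behind the /papa/ and /gaga/ examples can be written out explicitly. In this sketch the assignment of each value to a segment is read from the figure and is an assumption, although it is consistent with the word durations of 260 ms and 267 ms and the first-syllable durations of 85 ms and 122 ms given in the text; all names are hypothetical.

```python
# Segment durations in ms for the two example words, as read from the figure.
SEGMENTS_MS = {
    "papa": {"vot": 18, "first_vowel": 67, "closure": 77, "second_vowel": 98},
    "gaga": {"vot": 13, "first_vowel": 109, "closure": 40, "second_vowel": 105},
}


def first_syllable(segments):
    """First-syllable duration: initial VOT plus the first vowel."""
    return segments["vot"] + segments["first_vowel"]


def word_duration(segments):
    """Word duration: the sum of all measured segments."""
    return sum(segments.values())


syllable_gap = first_syllable(SEGMENTS_MS["gaga"]) - first_syllable(SEGMENTS_MS["papa"])
word_gap = word_duration(SEGMENTS_MS["gaga"]) - word_duration(SEGMENTS_MS["papa"])

for word, segments in SEGMENTS_MS.items():
    print(word, first_syllable(segments), word_duration(segments))
print(syllable_gap, word_gap)
```

The first syllables differ by 37 ms while the whole words differ by only 7 ms, because the shorter closure in /gaga/ offsets most of its longer vowels.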
In these examples, /papa/ and /gaga/ are two-mora words with almost the same word duration. When we compare the two words, however, we find a large difference in vowel duration and in closure duration. In /papa/ the vowels are shorter and the closure is longer because of the voiceless environment, while in /gaga/ the vowels are longer and the closure is shorter. As a result, the first syllable of /papa/ lasts 85 ms and that of /gaga/ 122 ms. In spite of this large difference in first-syllable duration, the two words have almost the same word duration. This means that temporal compensation works within a word, not within a syllable or mora. The previous experiments by Homma (1973) and Maeda (1979) support this observation.
3. Conclusions
In this durational study of Japanese, the following points were observed.
3.1. Closure duration
(1) Closure duration was longer for voiceless stops than for voiced ones. (2) Closure duration was longer for labials than for apicals and velars. (3) The ratio of closure duration between single stops and geminated stops was about 1 : 3. This ratio implies that a geminated stop is not simply a doubled stop segment; its duration also includes the length corresponding to a larger unit, namely a mora, as Han (1962) pointed out.
3.2. Voice onset time
(1) VOT increased as the place of closure moved toward the back of the mouth. (2) VOT was shorter before voiceless stops than before voiced ones, but the difference was negligible except with velars. (3) VOT was clearly related to accent but not to gemination of stops. (4) Japanese stops have shorter VOT than English stops.
3.3. Vowel duration
(1) As in English, vowel duration was longer before voiced consonants than before voiceless consonants, but the size of the effect differed drastically. One reason for this may be that vowel duration in Japanese is influenced more by the preceding consonant than by the following one. (2) Vowel duration was independent of accent. The mean duration of the second, unaccented vowel was in fact longer. (3) The place of articulation of the adjacent stops affected vowel duration. As the place of closure moved toward the back, both VOT and vowel duration became longer in the first syllable. In the second syllable, on the contrary, vowel duration became shorter in this direction.

English is a rhythmic, stress-timed language. Primary stresses tend to recur at roughly equal intervals, regardless of the number of syllables between them. Although there is a certain limit to compressibility (Klatt, 1973), the more syllables a rhythmic unit has, the shorter its segments become (Lehiste, 1970; Homma, 1978). Therefore, stress and the number of syllables have great effects on vowel duration. Japanese, on the other hand, is a mora-counting language: given a certain number of moras, word duration is relatively fixed, although the duration of each syllable in a word is phonetically different. As a result, temporal compensation is observed within a word, not within a syllable. In voiceless environments we find VOT, shorter vowel duration, and longer closure duration; in voiced environments we find almost no VOT, but longer vowel duration and shorter closure duration. In other words, closure duration, voice onset time, and vowel duration work together to yield a fixed word duration. Thus the difference in word duration is small.
The present experiment clearly revealed that the durational relationship between Japanese stops and vowels shows not only universal but also language-specific characteristics, and that at the word level the acoustic measurements fit well with the linguistic uses of segmental duration in Japanese.

I appreciate the guidance and comments of Professor R. F. Port at Indiana University on earlier versions of this paper.

References
Chen, M. (1970). Vowel length variation as a function of the voicing of the consonant environment. Phonetica, 22, 129-159.
Denes, P. (1955). Effect of duration on the perception of voicing. Journal of the Acoustical Society of America, 27, 761-764.
Han, M. S. (1962). The feature of duration in Japanese. Tokyo: Phonetic Society of Japan, Study of Sounds, 10, 65-75.
Harris, M. S. & N. Umeda (1974). Effect of speaking mode on temporal factors in speech: vowel duration. Journal of the Acoustical Society of America, 56, 1016-1018.
Homma, Y. (1973). An acoustic study of Japanese vowels: their quality, pitch, amplitude, and duration. Study of Sounds, 16, 347-368.
Homma, Y. (1978). Vowel duration in English. Osaka: Osaka Gakuin University, Gaikokugo Ronshu, 6, 51-67 (in Japanese).
Homma, Y. (1980). Voice onset time in Japanese stops. Bulletin of the Phonetic Society of Japan, 163, 7-9.
House, A. S. (1961). On vowel duration in English. Journal of the Acoustical Society of America, 33, 1174-1178.
House, A. S. & G. Fairbanks (1953). The influence of consonant environment upon the secondary acoustical characteristics of vowels. Journal of the Acoustical Society of America, 25, 105-113.
Klatt, D. H. (1973). Interaction between two factors that influence vowel duration. Journal of the Acoustical Society of America, 54, 1102-1104.
Klatt, D. H. (1975). Voice onset time, frication and aspiration in word-initial consonant clusters. Journal of Speech and Hearing Research, 18, 686-705.
Klatt, D. H. (1976). Linguistic uses of segmental duration in English: acoustic and perceptual evidence. Journal of the Acoustical Society of America, 59, 1208-1221.
Lehiste, I. (1970). Suprasegmentals. Cambridge, Massachusetts; London: M.I.T. Press.
Lisker, L. (1957). Closure duration and the intervocalic voiced-voiceless distinction in English. Language, 33, 42-49.
Lisker, L. & A. S. Abramson (1964). A cross-language study of voicing in initial stops: acoustical measurements. Word, 20, 384-422.
Lisker, L. & A. S. Abramson (1967). Some effects of context on voice onset time in English stops. Language and Speech, 10, 1-28.
Maeda, S. (1979). Timing control in Japanese speech production. Nara: Tenri University, Tenri Daigaku Gakuho, 121, 1-21.
Okada, T. (1969). The influence of voiced or voiceless consonants on vowel duration. Kyoto: Literary Association of Doshisha University, Jimbungaku, 115, 68-84 (in Japanese).
Okada, T. (1971). A spectrographic study of the duration correlation between some Japanese vowels and consonants. Literary Association of Doshisha University, Doshisha Studies in English, 1, 32-49 (in Japanese).
Peterson, G. E. & I. Lehiste (1960). Duration of syllable nuclei in English. Journal of the Acoustical Society of America, 32, 693-703.
Port, R. F. (1976). Influence of Speaking Tempo on the Duration of Stressed Vowel and Medial Stop in English Trochee Words. Bloomington: Indiana University Linguistics Club.
Port, R. F. & R. Rotunno (1979). Relation between voice-onset time and vowel duration. Journal of the Acoustical Society of America, 66, 654-662.