The perception of duration within sequences of four intervals*


Journal of Phonetics (1979) 7, 313-316

Ilse Lehiste
Ohio State University, Department of Linguistics, Columbus, Ohio, U.S.A. 43210

Received 6th January 1978

This experiment grew out of a study of isochrony, the notion that in stress-timed languages such as English, stresses follow each other at regular intervals. In a previous study (Lehiste 1973) I investigated the production of sentences consisting of four speech measures, and established the duration of the intervals between stresses. While these durations were not always equal, speech measures of the same type showed relatively little variation. Then I reproduced the temporal patterns of the four-measure sentences as non-speech stimuli. The durations of the measures were replicated as noise-filled intervals separated by clicks. Listeners judged both the actual sentences and the sequences of filled intervals, deciding in each case which of the four measures (or four intervals) was longest or shortest. In the case of spoken sentences, listeners had considerable difficulty in identifying the measures which were actually the longest or the shortest. With non-speech materials, the corresponding intervals were identified with much greater success.

I reasoned that if listeners cannot identify the actually longest or shortest measures in spoken English sentences, the measures must seem to them to have equal duration; isochrony would then be a perceptual phenomenon. The observation that listeners do better with non-speech materials supports the hypothesis that isochrony characterizes the perception of spoken English.

The measurements made on spoken utterances did reveal some differences in the duration of speech measures. Given the hypothesis that isochrony is a perceptual phenomenon, it is possible that actual differences of the magnitude observed in the first study may not be perceived at all. As listeners do better with non-speech stimuli, just noticeable differences in duration established for non-speech can be considered the base-line against which the perceptibility of durational differences in speech may be measured.
The experiment reported in this paper constitutes an attempt to establish such a base-line. I chose three basic reference durations: 300, 400 and 500 ms. These durations corresponded to the range observed in actual productions of sentences consisting of four speech measures. For each reference duration, the length of each of the four intervals, one at a time, was decreased and increased in nine 10-ms steps. Three of the four intervals were always of the same duration; one of the four was either shorter or longer. The intervals were filled with noise and separated from each other by clicks. The sequences were produced on a Glace-Holmes synthesizer at the Ohio State University. An example of a sequence might be 400-480-400-400, i.e. a sequence in which the duration of the second interval has been increased by 80 ms relative to the durations of the other intervals. The set of 252 sequences was randomized and presented to thirty listeners, who were asked to identify one of the four intervals as "longest". In a second presentation, the listeners were requested to mark one of the intervals as "shortest". The order of "longest" and "shortest" judgments was reversed for half of the listeners. For each reference duration, four tokens of sequences of equal intervals were included in the listening test. The judgments of the listeners regarding sequences of equal intervals were used as the basis for evaluating their performance in the perception of actual differences.

*Paper presented at the 8th International Congress of Phonetic Sciences, Leeds, Aug. 21, 1975.
0095-4470/79/040313+04 $02.00/0 © 1979 Academic Press Inc. (London) Ltd.

Table I Identification of intervals as "longest" in each of four positions (averaged from four different tokens, 30 judgments per token)

Reference duration (ms)   1. interval   2. interval   3. interval   4. interval
300                       10·5          7·5           7·0           5·0
400                       11·5          7·5           7·75          3·25
500                       15·0          5·75          6·75          2·5

Table II Identification of intervals as "shortest" in each of four positions (averaged from four different tokens, 30 judgments per token)

Reference duration (ms)   1. interval   2. interval   3. interval   4. interval
300                       13·75         5·0           4·0           7·25
400                       6·5           7·25          6·75          9·5
500                       7·25          7·25          3·75          11·75

Tables I and II present the listener judgments concerning sequences of equal intervals. With equal probability, thirty listeners, and four choices, each of the intervals should have been judged "longest" or "shortest" 7·5 times. The judgments of intervals in second and third position came close to this value; the durations of intervals in first and fourth position were evidently subject to perceptual bias. Table I shows the "longest" judgments by listeners in sequences of equal intervals. It is evident that listeners tended to hear the first interval as longer than the others; furthermore, the number of judgments as "longest" increased with the reference duration. The increase of "longest" judgments in the first interval seems to have taken place at the expense of the fourth interval, which the listeners tended to underestimate. Table II shows "shortest" judgments in sequences of equal intervals. For reference durations of 400 and 500 ms, the fourth interval was judged shortest more frequently than other intervals; for the reference duration of 300 ms, it was the first interval that was judged shortest. (The same interval had been most frequently judged "longest" in the first half of the test.) Estimating the duration of the first interval seems to have presented the greatest difficulty to the listeners; it is conceivable that the length of the sequences may have exceeded the short-term memory span of at least some of the subjects.
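The stimulus design and the chance baseline described above can be sketched in a few lines of Python. This is only an illustrative reconstruction, not the procedure used to generate the original stimuli. It assumes that each interval was changed by 10 to 100 ms in 10-ms increments (ten values in each direction, the largest being the 100-ms change mentioned in connection with Table III); that is one possible reading of the "nine 10-ms steps" in the text, and it happens to reproduce the reported total of 252 sequences. The names `build_sequences` and `chance_count` are hypothetical.

```python
from itertools import product

REFERENCES_MS = (300, 400, 500)    # reference durations given in the text
POSITIONS = range(4)               # four intervals per sequence
CHANGES_MS = range(10, 101, 10)    # ASSUMPTION: changes of 10, 20, ..., 100 ms
EQUAL_TOKENS = 4                   # equal-interval tokens per reference duration


def build_sequences():
    """Return every stimulus as a tuple of four interval durations in ms."""
    sequences = []
    # One interval at a time is shortened (-1) or lengthened (+1).
    for ref, pos, change, sign in product(
        REFERENCES_MS, POSITIONS, CHANGES_MS, (-1, 1)
    ):
        seq = [ref] * 4
        seq[pos] = ref + sign * change
        sequences.append(tuple(seq))
    # Four tokens of the all-equal sequence for each reference duration.
    for ref in REFERENCES_MS:
        sequences.extend([(ref,) * 4] * EQUAL_TOKENS)
    return sequences


def chance_count(listeners=30, choices=4):
    """Expected judgments per interval if all choices are equally likely."""
    return listeners / choices


sequences = build_sequences()
print(len(sequences))                      # 252
print((400, 480, 400, 400) in sequences)   # True
print(chance_count())                      # 7.5
```

Under these assumptions the count is 3 x 4 x 10 x 2 = 240 changed sequences plus 3 x 4 = 12 equal-interval tokens, and the chance value of 7·5 is the figure against which the entries of Tables I and II are compared.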
The difficulty in judging the duration of the first interval becomes even more noticeable in the listeners' performance with sequences in which the duration of one of the intervals was actually changed relative to the reference duration. Table III presents the durations of intervals at which listeners agreed (at the 0·01 confidence level) that the interval was "longest"; Fig. 1 shows the increment needed for listeners to agree on the "longest" judgment. The level of significance was established by the chi-square measure relative to the judgments made by the subjects when they listened to sequences of equal intervals.

Table III Duration of intervals at which listeners agreed (at the 0.01 level of confidence) that the interval was "longest" (N = 30)

Reference duration (ms)   1. interval   2. interval   3. interval   4. interval
300                       390           340           340           370
400                       500+          470           440           460
500                       600+          570           530           560

(500+ and 600+ indicate that the 0.01 level of significance was not reached even at 100 ms increments.)

Figure 1 Increment needed for "longest" judgments (increment in ms, shown for each of the four interval positions at reference durations of 300, 400 and 500 ms).

The figure is to be read as follows. For the first interval, with a reference duration of 300 ms, the duration had to be increased by 90 ms before listeners reached agreement (at the 0·01 level) that the first interval was the longest. The sequence 390-300-300-300 received the following number of "longest" judgments: 21, 4, 5, and 0 (21 listeners heard the first interval as longest, four the second, five the third; no listener identified the fourth interval of this particular sequence as longest). This was compared with the averages for equal intervals, which for the 300 ms reference duration were 10·5-7·5-7·0-5·0 (cf. Table I). The same procedure was followed in each case. It is readily apparent that here, too, the listeners had greatest difficulty in judging the duration of the first interval; for reference durations of 400 and 500 ms, even an increment of 100 ms did not yield significant agreement. On the other hand, listener performance was best in judging the duration of the third interval: here the increase that was needed for agreement among listeners was the smallest.

Table IV gives the durations of intervals at which listeners agreed that the interval was "shortest"; Fig. 2 shows the decrement needed for listeners to reach agreement for "shortest" judgments. The figure is to be read in a similar way to the previous one. For example, in the first position, with reference duration of 300 ms, the interval had to be shortened by 70 ms to achieve agreement. Again it was the first interval that seems to have presented the greatest problem to the listeners. The third interval was again the interval in which the listeners were able to perceive the smallest change in duration.
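The comparison described for the sequence 390-300-300-300 can be worked through as a small Python sketch. The paper does not spell out the exact form of its chi-square computation, so the following is an assumption: a standard goodness-of-fit statistic with the equal-interval averages from Table I as expected counts, evaluated with 3 degrees of freedom. The critical value 11.345 is the tabulated chi-square value for 3 degrees of freedom at the 0.01 level; the function name `chi_square` is hypothetical.

```python
def chi_square(observed, expected):
    """Pearson goodness-of-fit statistic: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# "Longest" judgments for the sequence 390-300-300-300 (from the text).
observed = [21, 4, 5, 0]
# Equal-interval averages for the 300 ms reference duration (Table I).
expected = [10.5, 7.5, 7.0, 5.0]

# ASSUMPTION: four categories, hence 3 degrees of freedom.
CRITICAL_0_01_DF3 = 11.345

stat = chi_square(observed, expected)
print(round(stat, 2))            # 17.7
print(stat > CRITICAL_0_01_DF3)  # True
```

On this reading the statistic is about 17.7, which exceeds the 0.01 critical value, in line with the report that agreement was reached at a 90-ms increment. Because the expected counts come from the listeners' own equal-interval judgments and not from the uniform value of 7·5, the first-position bias is already built into the baseline.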

Table IV Duration of intervals at which listeners agreed (at the 0.01 level of confidence) that the interval was "shortest" (N = 30)

Reference duration (ms)   1. interval   2. interval   3. interval   4. interval
300                       230           250           270           270
400                       340           320           370           340
500                       400           420           470           450

Figure 2 Decrement needed for "shortest" judgments (decrement in ms, shown for each of the four interval positions at reference durations of 300, 400 and 500 ms).
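The increments and decrements plotted in Figs 1 and 2 follow directly from Tables III and IV by subtracting the reference duration. The Python sketch below is illustrative only: the table values are copied from the paper, the entries reported as 500+ and 600+ are represented as `None` because no threshold was reached within 100 ms, and the helper name `changes_needed` is hypothetical.

```python
# Threshold durations in ms, by reference duration, for interval positions 1-4.
# None marks the "500+" / "600+" entries of Table III (no agreement reached).
TABLE_III_LONGEST = {
    300: [390, 340, 340, 370],
    400: [None, 470, 440, 460],
    500: [None, 570, 530, 560],
}
TABLE_IV_SHORTEST = {
    300: [230, 250, 270, 270],
    400: [340, 320, 370, 340],
    500: [400, 420, 470, 450],
}


def changes_needed(table):
    """Absolute change from the reference duration for each position."""
    return {
        ref: [None if d is None else abs(d - ref) for d in durations]
        for ref, durations in table.items()
    }


increments = changes_needed(TABLE_III_LONGEST)
decrements = changes_needed(TABLE_IV_SHORTEST)

for ref in (300, 400, 500):
    print(ref, increments[ref], decrements[ref])
```

For the 300 ms reference this gives increments of 90, 40, 40 and 70 ms and decrements of 70, 50, 30 and 30 ms, matching the 90-ms and 70-ms examples quoted in the text for the first interval.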

If my earlier assumption is correct and the difficulty in identifying actually longer or shorter intervals may be interpreted to mean that in such cases the intervals must sound alike in duration, then quite large differences in duration are not really perceptible. Most of the differences that I had observed in the production of four-measure sentences were actually smaller than the differences that emerged as the limits of perception in this study. In addition to my own work referred to earlier, it has been shown also by Fujisaki, Nakamura & Imoto (1973) that accuracy in the perception of duration in word context is inferior to accuracy of discrimination when the same signal is presented in isolation as a non-speech stimulus. It may thus be assumed that in listening to actual speech, even somewhat larger differences than the just noticeable differences established in the present study might not be perceived.

In the ongoing debate concerning isochrony, the validity of the claim has sometimes been dismissed because measurements have shown differences in the duration of intervals between stresses. I believe that such measurements should be re-interpreted in the light of the present results. It may well be that the actually measured differences were not perceptible at all to listeners who were listening in the mode appropriate for spoken English.

References

Fujisaki, H., Nakamura, K. & Imoto, T. (1973). Auditory perception of duration of speech and nonspeech stimuli. Research Institute of Logopedics and Phoniatrics, Annual Bulletin No. 7, pp. 45-64. University of Tokyo.

Lehiste, I. (1973). Rhythmic units in production and perception. Journal of the Acoustical Society of America 54, 1228-1234.