Journal of Phonetics (1982) 10, 105-111
Recovery from categorical boundary shifts for the vowel duration cue to consonant voicing
Donald J. Sharf
University of Michigan, Ann Arbor, Michigan, U.S.A.
Ralph N. Ohde
Bill Wilkerson Hearing and Speech Center and the Vanderbilt University School of Medicine
Received 27th May 1981
Abstract:
Subjects identified the consonant in synthesized VC stimuli as /t/ or /d/ in pre-experimental, stimulus repetition, and recovery conditions. The stimuli varied in vowel duration in 25 ms steps between 125 and 375 ms. During the stimulus repetition condition, 22 trials were presented during each of which either the 125 or 375 ms stimulus was heard 85 times and then five of the stimuli were presented for identification. During recovery conditions, subjects again identified the stimuli after intervals of 1, 4, 7, and 28 min. Results were as follows: (1) at least 50% recovery from boundary shift occurred for both repeated stimuli at 1 min but there was significantly greater recovery for the 375 ms stimulus than for the 125 ms stimulus; (2) recovery for both stimuli was no greater than 80% at 28 min; and (3) correlations between boundary shift and recovery were very low and not significant. Results appeared to reflect two stages of recovery, a rapid change in the first minute and a more gradual change during the following half hour.
Introduction

The selective adaptation paradigm involves the use of a repeated stimulus to produce a shift in the category boundary of a set of stimuli which vary along an acoustic parameter. The category boundary corresponds to a point on an acoustic continuum which separates stimuli into two phonetic categories. After adaptation, this boundary generally shifts toward the category of the repeated stimulus. Shifts in category boundary have been established for phonetic features, e.g. voicing, place of articulation, and frication (Eimas & Corbit, 1973; Cooper, 1974; Cole & Cooper, 1975) and for acoustic energy variables, e.g. intensity and number of stimulus repetitions (Hillenbrand, 1975; Sawusch, 1977; Ohde & Sharf, 1979). Moreover, there is evidence of peripheral and central involvement in boundary shifts for speech stimuli (Sawusch, 1977; Eimas et al., 1973; Ades, 1974).

However, recovery processes for boundary shifts using speech stimuli are poorly understood even though there is evidence of recovery processes in adaptation to basic psychophysical properties (Small, 1963, 1973). Although it seems likely that the recovery process from boundary shift should reflect the basic nature of the phenomenon, models of selective adaptation to speech have not accounted for recovery. Explanations of the phenomenon range between neurophysiological models which assume that specialized neural mechanisms or feature detectors are fatigued (Eimas & Corbit, 1973; Cooper, 1975, 1979) and cognitive models which assume changes in decision criteria (Elman, 1979) or the contrast of anchor and ambiguous stimuli (Diehl et al., 1978; Simon & Studdert-Kennedy, 1978; Sawusch & Nusbaum, 1979).

Previous studies have indicated that recovery begins soon after the adaptation session (Miller & Morse, 1979) but that it may take a half hour or more to be completed (Eimas & Corbit, 1973). Moreover, the findings of a study of recovery from boundary shift using VOT stimuli by Sharf & Ohde (1981) reveal a complicated process, since two stages of recovery were indicated: a rapid recovery period in the first minute and a much slower recovery period over the next half hour.

If the recovery process reflects basic mechanisms of boundary shift, then it is important to examine the recovery pattern for different acoustic properties which influence perception of speech. Since it was found that the boundary shifts for the vowel duration cue are similar to those for VOT (Eimas & Corbit, 1973; Ohde & Sharf, 1979; Williams & Sharf, 1979), it is reasonable to predict comparable recovery processes for these acoustic properties. Thus, in order to determine if the pattern of recovery for VOT could be generalized to an acoustic property which differentiates the same phonetic feature, a study was done of the recovery from boundary shifts for the vowel duration cue to consonant voicing.

0095-4470/82/010105+07 $02.00/0 © 1982 Academic Press Inc. (London) Ltd.
Method

Test stimuli

An eleven-step VC continuum from /t/ to /d/ was generated using a computer simulation of a terminal analog synthesizer connected in cascade (Klatt, 1972, 1980). The sampling rate for the synthesizer output was 10 kHz and the output was low-pass filtered with a cutoff frequency of about 4800 Hz. The first three formants of the stimuli terminated at 400, 1696, and 3114 Hz, which were appropriate transition values for the alveolar consonant /d/. The steady-state portions of the vowel for the first, second, and third formants were 742, 1618, and 2751 Hz, respectively. Formants four and five were set at constant values of 3250 and 3650 Hz, respectively. Fundamental frequency rose from 125 Hz at voicing onset to 130 Hz in 60 ms and then fell linearly to 100 Hz at the end of the stimulus. The eleven stimuli varied in duration between 125 and 375 ms in 25 ms steps. The duration of the final transition for each stimulus was 35 ms. The duration and formant values were chosen on the basis of research which demonstrated that subjects differentiated /t/ and /d/ in such a continuum (Denes, 1955; Raphael, 1972; Williams & Sharf, 1979). These stimuli were used in generating identification tapes and adaptation tapes which included repetitions of the endpoint stimuli (see Fig. 1).
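The duration continuum and the fundamental frequency contour described above can be illustrated with a short sketch. It is not the synthesizer used in the study; the function names are hypothetical, the rise to 130 Hz is assumed to be linear, and the final value of 100 Hz is an assumed reading of the text.

```python
def duration_continuum(start_ms=125, stop_ms=375, step_ms=25):
    """Return the stimulus durations in ms, both endpoints included."""
    return list(range(start_ms, stop_ms + 1, step_ms))


def f0_contour(t_ms, total_ms, onset_hz=125.0, peak_hz=130.0,
               peak_ms=60.0, final_hz=100.0):
    """Piecewise-linear F0 (Hz) at time t_ms for a stimulus of total_ms.

    Assumptions (not stated in this form in the paper): the rise from
    onset_hz to peak_hz is linear, and final_hz is 100 Hz.
    """
    if not 0 <= t_ms <= total_ms:
        raise ValueError("t_ms must lie within the stimulus")
    if t_ms <= peak_ms:
        # Rising portion: onset value up to the peak at peak_ms.
        return onset_hz + (peak_hz - onset_hz) * t_ms / peak_ms
    # Falling portion: peak value down to final_hz at stimulus offset.
    return peak_hz + (final_hz - peak_hz) * (t_ms - peak_ms) / (total_ms - peak_ms)


durations = duration_continuum()
print(len(durations), durations[0], durations[-1])
```

Because the fall always ends at stimulus offset, the slope of the falling portion is shallower for the longer stimuli in this sketch.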
Subjects and testing

Subjects were University of Michigan students who reported normal hearing. They identified the stimuli as "at" or "ad" under the following conditions: (1) prior to the experimental condition (Baseline); (2) following periods of repetition of the 125 and 375 ms stimuli which involved 22 trials per condition (Stimulus Repetition); and (3) following recovery intervals of 1, 4, 7, and 28 min (Recovery). Fifteen subjects participated in the 1, 4, and 7 min recovery conditions and fourteen subjects participated in the 28 min recovery condition. Seven of the subjects participated in all four recovery conditions.

Figure 1. Spectrograms of the endpoint stimuli in the "at-ad" continuum.

The experimental testing took place in a sound-treated room with a noise level between 30 and 35 dBA based on sound level meter measurements. During the 1, 4, and 7 min intervals, subjects were told to sit quietly and do nothing. During the 28 min interval, subjects were shown slides with no sound presented and no talking permitted. This was done to relieve the monotony of the interval for the subjects. Since the slide presentation was non-verbal, it was not expected to have any effect on recovery. The stimuli were presented binaurally from a tape recorder (Ampex, Model 350) through matched and calibrated earphones (Telephonics, TDH-39) at a sound pressure level of 70 dB.

At the beginning of each experimental session, the test stimuli were presented to subjects for identification as /t/ or /d/ in order to determine a baseline function for the perception of these stimuli. This identification tape contained a randomization of 11 replications of each of the 11 stimuli with 2.5 s between them. The first 11 stimuli were used for practice. Following the identification task, the subjects listened to a series of 22 trials in which either the 125 or 375 ms stimulus was repeated 85 times. The interstimulus interval between stimulus presentations was 350 ms. After presentation of the repeated stimulus, there was 1.5 s of silence and then one random block of five test stimuli was presented for identification. Immediately after identification of the last stimulus, a stopwatch was started to time the recovery interval. The test stimuli were again presented to the subjects for identification after the recovery interval. Intervals of 1, 4, 7, and 28 min were chosen since a previous study (Sharf & Ohde, 1981) indicated that these intervals would demonstrate a wide range of recovery. Category boundaries were determined for each subject for the baseline, stimulus repetition, and recovery tasks by means of regression analysis.
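The text does not specify the form of the regression used to locate category boundaries. As one possible reading, the sketch below fits a least-squares line to the proportion of "ad" responses over the transition region of the identification function and takes the boundary as the duration at which the fitted line crosses 50%. The function name, the 5% and 95% cutoffs for the transition region, and the sample data are all illustrative assumptions.

```python
def category_boundary(durations_ms, prop_ad, lo=0.05, hi=0.95):
    """Estimate the 50% crossover (ms) from an identification function.

    Assumption: a straight line is fitted by least squares to the points
    whose response proportions lie strictly between lo and hi. The paper
    says only that regression analysis was used.
    """
    pts = [(x, p) for x, p in zip(durations_ms, prop_ad) if lo < p < hi]
    if len(pts) < 2:
        raise ValueError("need at least two points in the transition region")
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_p = sum(p for _, p in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxp = sum((x - mean_x) * (p - mean_p) for x, p in pts)
    if sxp == 0:
        raise ValueError("flat identification function; no crossover")
    slope = sxp / sxx
    intercept = mean_p - slope * mean_x
    # Duration at which the fitted line predicts 50% "ad" responses.
    return (0.5 - intercept) / slope


# Hypothetical identification data for one subject (not from the study).
durations = list(range(125, 376, 25))
prop_ad = [0.0, 0.0, 0.0, 0.1, 0.3, 0.5, 0.7, 0.9, 1.0, 1.0, 1.0]
print(round(category_boundary(durations, prop_ad), 1))
```

A probit or logistic fit would be an equally plausible reading of "regression analysis"; the linear version is used here only because it is the simplest to state.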
The difference between baseline and stimulus repetition boundary values was considered the boundary shift, and the per cent recovery was defined as follows:

$$\text{per cent recovery} = \left[1 - \frac{\text{Recovery} - \text{Baseline}}{\text{Stimulus Repetition} - \text{Baseline}}\right] \times 100$$

As a control for the effect of the recovery period on the variability of category boundaries, 13 subjects participated in three additional test sessions in which they completed identification conditions separated by 1, 4, and 7 min intervals. These subjects were also University of Michigan students who reported normal hearing; four of them had previously participated in the main experiment.

Results
There were small differences among the recovery conditions in baseline boundary values. For the 125 ms stimulus these values were 223.2, 230.9, 218.7, and 225.1 ms for the 1, 4, 7, and 28 min conditions, respectively, and for the 375 ms stimulus, they were 225.5, 221.8, 219.4, and 223.5 ms for the 1, 4, 7, and 28 min conditions, respectively. Differences in boundary shifts were also rather small. For the 125 ms stimulus, the shifts were 27.5, 31.5, 28.0, and 36.2 ms for the 1, 4, 7, and 28 min conditions, respectively, and for the 375 ms stimulus, they were 23.7, 22.3, 26.5, and 25.2 ms for the 1, 4, 7, and 28 min conditions, respectively. The boundary shifts obtained are comparable to those found in a study of vowel duration by Williams & Sharf (1979). All of the differences between baseline and repetition boundaries were found to be statistically significant (p < 0.001 using one-tailed correlated t-tests).

Figure 2. Per cent recovery for the 125 and 375 ms stimuli at 1, 4, 7, and 28 min recovery intervals. [Graph not reproduced: horizontal axis, recovery interval (min); legend, 125 ms and 375 ms stimuli with standard deviations.]
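The per cent recovery measure defined in the Method can be written as a small function. This is a sketch for illustration only: the function name is hypothetical and the boundary values in the example are invented, not taken from the data.

```python
def percent_recovery(baseline_ms, repetition_ms, recovery_ms):
    """Per cent recovery from a boundary shift.

    Implements [1 - (Recovery - Baseline) / (Stimulus Repetition - Baseline)] x 100,
    where each argument is a category boundary in ms.
    """
    shift = repetition_ms - baseline_ms
    if shift == 0:
        raise ValueError("no boundary shift; per cent recovery is undefined")
    return (1 - (recovery_ms - baseline_ms) / shift) * 100


# Illustrative values: a boundary that moved from 220 to 190 ms during
# stimulus repetition and had returned to 205 ms at the recovery test.
print(percent_recovery(220.0, 190.0, 205.0))
```

The measure is 0 when the recovery boundary equals the stimulus repetition boundary and 100 when it has returned to baseline. Because the shift appears in both numerator and denominator, the sign of the shift does not matter, so the same function serves for both the 125 and 375 ms conditions.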
Mean per cent recovery is shown for the 125 and 375 ms stimuli in Fig. 2. Recovery is at least 50% after one minute for both stimuli but it is significantly greater for the 375 ms voiced stimulus than for the 125 ms voiceless stimulus (t = 2.82; p < 0.02 using a two-tailed correlated t-test). Recovery for the 125 ms stimulus increases to 72% at 4 min and stays at that level through 28 min. Recovery for the 375 ms stimulus decreases to about the same level as that for the 125 ms stimulus at 4 min, remains at about that level after 7 min, and then increases slightly at 28 min. Subject variability is considerable for all recovery conditions as indicated by the rather large standard deviations for the 125 ms stimulus (33.8, 32.5, 31.0, and 24.9 for the 1, 4, 7, and 28 min recovery conditions, respectively) and the 375 ms stimulus (31.0, 20.0, 35.0, and 35.3 for the 1, 4, 7, and 28 min recovery conditions, respectively).

There was an additional source of subject variability that was primarily characteristic of the 375 ms stimulus conditions. For each recovery condition using this stimulus, four subjects experienced boundary shifts opposite to the expected direction. An opposite shift with the 125 ms stimulus occurred only for two subjects in the 28 min recovery condition. Only the results for subjects who experienced the appropriate shift were included in further analysis.

An attempt was made to determine if the degree of recovery was related to the magnitude of boundary shift. The assumption that per cent recovery would be inversely proportional to boundary shift was tested by calculating correlations between boundary shift and per cent recovery across subjects within each recovery condition. The correlations obtained were low and non-significant for the 125 ms stimulus (0.15, -0.37, -0.14, and -0.05 for the 1, 4, 7, and 28 min recovery conditions, respectively) and for the 375 ms stimulus (0.38, 0.43, -0.01, and -0.43 for the 1, 4, 7, and 28 min recovery conditions, respectively).

Since the procedure for estimating recovery involved obtaining two identification scores for the test stimuli which were separated by various time intervals, it was deemed important to determine that substantial changes in category boundaries did not occur as a function of the recovery intervals themselves. The findings revealed non-significant (p > 0.05 using two-tailed correlated t-tests) mean differences in category boundaries of 4.0, 0.9, and 0.4 ms for the 1, 4, and 7 min control conditions, respectively.

Discussion
The results of this study may be compared to those of the Sharf & Ohde (1981) study of recovery involving VOT stimuli in which only the 55 ms voiceless stimulus was used. There is close similarity in the recovery patterns for the two voiceless stimuli for the 1, 4, and 7 min recovery conditions (51, 67, and 70%, respectively, for VOT and 52, 72, and 72%, respectively, for vowel duration) but 90% recovery occurs at 28 min for VOT and only 72% recovery occurs for vowel duration. Although the standard deviations decreased across the recovery conditions for the VOT stimuli (38, 35, 29, and 15 for the 1, 4, 7, and 28 min conditions, respectively) and for the vowel duration stimuli (34, 33, 31, and 25 for the 1, 4, 7, and 28 min conditions, respectively), the decrease was not as great for the latter.

The major difference between the recovery curves for the two vowel duration stimuli occurs in the early stage of recovery. Although recovery for both stimuli is most rapid during the first minute, it is significantly greater for the 375 ms stimulus than for the 125 ms stimulus. This difference was apparently not related to the magnitude of boundary shift since there was no significant difference in mean boundary shift between the two conditions (t = 0.54; p > 0.05 using a two-tailed correlated t-test).

The finding that recovery from boundary shift is not complete after 7 min does not parallel findings for threshold fatigue or loudness adaptation. Recovery from loudness adaptation is between 70 and 80% complete after the first minute with complete recovery after 3 min (Egan, 1955; Elliott & Fraser, 1970; Thwing, 1955). For threshold fatigue, recovery from 1 or 2 min fatiguing periods occurs in about 5 min when the fatiguing intensity is not greater than 80 dB (Hirsh & Bilger, 1955; Jerger, 1956; Small, 1973).

Considerable subject variability appears to be characteristic of boundary shift, particularly at recovery intervals of 1, 4, and 7 min. Unlike the recovery pattern for VOT stimuli, subject variability is still rather high for vowel duration stimuli at the 28 min recovery interval. There is surprisingly little information on subject variability for loudness adaptation or threshold fatigue. However, Small (1973) suggests that subject variability is great during the initial portion of the threshold fatigue recovery function but that it is no longer a problem after 2 min.

No relationship was found between per cent recovery and magnitude of boundary shift for individuals in this study. On the other hand, Small (1973) indicates that the time required for recovery from threshold fatigue and loudness adaptation depends on the amount of fatigue and adaptation from which recovery must proceed.
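The two statistics relied on in this comparison, the correlation between boundary shift and per cent recovery and the correlated (paired) t-test, can be sketched as follows. The function names and the sample values are hypothetical; the paper does not report per-subject data, and the correlation is assumed here to be a Pearson product-moment coefficient.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def paired_t(a, b):
    """t statistic for a correlated (paired) t-test, with n - 1 df."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    if var_d == 0:
        raise ValueError("t undefined when all differences are equal")
    return mean_d / math.sqrt(var_d / n)


# Hypothetical per-subject values for one recovery condition.
shift_ms = [22.0, 31.0, 18.0, 40.0, 27.0]
recovery_pct = [60.0, 45.0, 80.0, 55.0, 70.0]
print(round(pearson_r(shift_ms, recovery_pct), 2))
```

The significance of either statistic would still have to be read from the appropriate distribution with the relevant degrees of freedom, which this sketch leaves out.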
In summary, the findings of this study support the suggestion by Sharf & Ohde (1981) that there are two stages in the process of recovery from boundary shifts. The first stage is rapid, reaching 50% recovery or greater in 1 min. The second stage of recovery is much more gradual. It may reach near maximum for the voiceless VOT stimulus after 28 min but not for vowel duration stimuli. Subject variability is quite high at all recovery intervals but decreases as recovery reaches near maximum. This extended second stage of recovery, high subject variability, and lack of correlation between recovery and boundary shift do not parallel findings for psychoacoustic functions. These results suggest the importance of an understanding of the recovery process to the development of models of categorical boundary shift.

This research was partly supported by NIH research grant NS07040 and a University of Michigan Rackham research grant. Portions of this article were presented at the meeting of the American Speech-Language-Hearing Association, November, 1979.

References

Ades, A. E. (1974). Bilateral component in speech perception? Journal of the Acoustical Society of America, 56, 610-616.
Cole, R. A. & Cooper, W. E. (1975). Perception of voicing in English affricates and fricatives. Journal of the Acoustical Society of America, 58, 1280-1287.
Cooper, W. E. (1974). Adaptation of phonetic feature analyzers for place of articulation. Journal of the Acoustical Society of America, 56, 617-627.
Cooper, W. E. (1975). Selective adaptation to speech. In: Cognitive Theory. Vol. 1, (F. Restle, R. M. Shiffrin, N. J. Castellan, H. R. Lindman, and D. P. Pisoni eds), Hillsdale, New Jersey: Lawrence Erlbaum Associates.
Cooper, W. E. (1979). Speech Perception and Production: Studies in Selective Adaptation. Norwood, New Jersey: Ablex Publishing Corporation.
Denes, P. (1955). Effect of duration on the perception of voicing. Journal of the Acoustical Society of America, 27, 761-764.
Diehl, R., Elman, J. & McCusker, S. (1978). Contrast effects on stop consonant identification. Journal of Experimental Psychology: Human Perception and Performance, 4, 599-609.
Egan, J. P. (1955). Perstimulatory fatigue as measured by heterophonic loudness balance. Journal of the Acoustical Society of America, 27, 111-120.
Eimas, P. D., Cooper, W. E. & Corbit, J. D. (1973). Some properties of linguistic feature detectors. Perception and Psychophysics, 13, 247-252.
Eimas, P. D. & Corbit, J. D. (1973). Selective adaptation of linguistic feature detectors. Cognitive Psychology, 4, 99-109.
Elliott, D. N. & Fraser, W. (1970). Fatigue and adaptation. In: Foundations of Modern Auditory Theory. Vol. 1, (J. V. Tobias ed.), New York: Academic Press.
Elman, J. L. (1979). Perceptual origins of the phoneme boundary effect and selective adaptation to speech: A signal detection theory analysis. Journal of the Acoustical Society of America, 65, 190-207.
Hillenbrand, J. M. (1975). Intensity and repetition effects on selective adaptation to speech. Research on Speech Perception Progress Report, Indiana University, 2, pp. 56-137.
Hirsh, I. J. & Bilger, R. C. (1955). Auditory-threshold recovery after exposure to pure tones. Journal of the Acoustical Society of America, 27, 1186-1194.
Jerger, J. F. (1956). Recovery pattern from auditory fatigue. Journal of Speech and Hearing Disorders, 21, 39-46.
Klatt, D. H. (1972). Acoustic theory of terminal analog speech synthesis. Proceedings of the 1972 International Conference on Speech Communication and Processing. IEEE Catalog No. 72 Ch0567-7 AE, pp. 131-135.
Klatt, D. H. (1980). Software for a cascade/parallel formant synthesizer. Journal of the Acoustical Society of America, 67, 971-995.
Miller, C. L. & Morse, P. A. (1979). Selective adaptation effects in infant speech perception paradigms. Journal of the Acoustical Society of America, 65, 789-798.
Ohde, R. N. & Sharf, D. J. (1979). Relationship between adaptation and the percept and transformations of stop consonant voicing: effects of the number of repetitions and intensity of adaptors. Journal of the Acoustical Society of America, 66, 30-45.
Raphael, L. J. (1972). Preceding vowel duration as a cue to the perception of the voicing characteristic of word-final consonants in American English. Journal of the Acoustical Society of America, 51, 1296-1303.
Sawusch, J. R. (1977). Peripheral and central processes in selective adaptation of place of articulation in stop consonants. Journal of the Acoustical Society of America, 62, 738-750.
Sawusch, J. R. & Nusbaum, H. C. (1979). Contextual effects in vowel perception. I. Anchor-induced effects. Perception and Psychophysics, 25, 292-302.
Sharf, D. J. & Ohde, R. N. (1981). Recovery from adaptation to stimuli varying in voice onset time. Journal of Phonetics, 9, 79-87.
Simon, H. J. & Studdert-Kennedy, M. (1978). Selective anchoring and adaptation of phonetic and nonphonetic continua. Journal of the Acoustical Society of America, 64, 1338-1368.
Small, A. M. (1963). Auditory adaptation. In: Modern Developments in Audiology. (J. Jerger ed.), New York: Academic Press.
Small, A. M. (1973). Psychoacoustics. In: Normal Aspects of Speech, Hearing, and Language. (F. D. Minifie, T. J. Hixon and F. Williams eds), Englewood Cliffs, New Jersey: Prentice-Hall.
Thwing, E. J. (1955). Spread of perstimulatory fatigue of a pure tone to neighboring frequencies. Journal of the Acoustical Society of America, 27, 741-748.
Williams, P. D. & Sharf, D. J. (1979). Effect of adaptation on the perception and production of vowel duration preceding stop consonants. Journal of Phonetics, 7, 81-92.