Journal of Phonetics (1989) 17, 205-212
Perception of word-final devoicing in Polish
Louisa M. Slowiaczek* and Helena J. Szymanska
Loyola University of Chicago, Chicago, IL, U.S.A.
Received 21st December 1988, and in revised form 24th February 1989
Researchers examining the phonetic characteristics of a number of neutralization rules have found that underlying contrasts that should be neutralized are phonetically preserved. In particular, earlier results from a production experiment suggested that the rule of word-final devoicing in Polish is not neutralizing. The present investigation extended this work by testing whether the acoustic measures identified in productions from the original study are functional in perception. Native Polish and English listeners identified Polish monosyllabic words using a two-alternative forced-choice procedure. Results for the Polish subjects revealed better than chance performance in identifying words from the minimal pairs examined in the production study, although the magnitude of these results was less than expected. Moreover, the results suggested a bias to choose the voiceless alternative. Data obtained from English-speaking subjects revealed similar results, suggesting that poor identification performance and a bias to respond with the voiceless alternative are not a function of Polish listeners' familiarity with the rules of the language. The combined results suggest that differences in measurements obtained in the production study are not the primary cue used to distinguish these items reliably in perception.
*Now at Department of Psychology, University at Albany, State University of New York, Albany, NY 12222, U.S.A., to where all correspondence should be sent.
0095-4470/89/030205+08 $03.00/0 © 1989 Academic Press Limited
1. Introduction
Within phonological theory a fundamental distinction is made between rules that are neutralizing and those that are non-neutralizing. Neutralization rules are those that involve the merging of two distinct, underlying sounds into one at the time of production (for a review of the issues and associated phonetic experiments, see Dinnsen, 1982, 1983). However, recent experimental research has examined a number of putative neutralization rules and found that underlying contrasts are, in fact, phonetically preserved (Charles-Luce, 1985; Dinnsen & Charles-Luce, 1984; Dinnsen & Garcia-Zamor, 1971; Port, Mitleb, & O'Dell, 1981; Slowiaczek & Dinnsen, 1985). Word-final devoicing is one such neutralization rule in which an underlying voice contrast at the end of words is realized as voiceless during production. The phonetic characteristics of the word-final devoicing rule have been examined in Catalan (Charles-Luce & Dinnsen, 1987; Dinnsen & Charles-Luce, 1984), German (Charles-Luce, 1985; Dinnsen & Garcia-Zamor, 1971; Fourakis & Iverson, 1984; Port et al., 1981), and Polish (Giannini & Cinque, 1978; Slowiaczek & Dinnsen, 1985) and results have indicated that the word-final, underlying voice distinction is maintained during production in terms of one, or some combination, of temporal measurements. In fact, evidence from several studies has suggested a distinction between complete and incomplete voicing neutralization (Charles-Luce, 1987; Fourakis & Iverson, 1984; Port et al., 1981).
With regard to Polish, in particular, Slowiaczek & Dinnsen (1985) identified at least two phonetic parameters relevant to the final voice contrast for the five Polish speakers tested, i.e., vowel duration and voicing into closure of labial stops. Vowel duration measurements revealed that vowels are approximately 10% longer before final obstruents that are underlyingly voiced than before final obstruents that are underlyingly voiceless. Moreover, vowel duration differentiated underlying voiced and voiceless obstruents across all places and manners of articulation that were tested. In addition, the parameter of voicing into closure aided in the differentiation of underlying voiced and voiceless labial stops. In particular, the production of underlying voiced labial stops revealed greater voicing into closure than underlying voiceless labial stops. Based on these results, Slowiaczek & Dinnsen (1985) argued that the rule of word-final devoicing in Polish is non-neutralizing. This interpretation is supported by the similar findings obtained from the phonetic studies of word-final devoicing in German and Catalan. In general, while these languages vary in the way the voice contrast is preserved, they do not vary with regard to the neutralizing status of the rule (Charles-Luce, 1985; Charles-Luce & Dinnsen, 1987; Dinnsen & Charles-Luce, 1984; Dinnsen & Garcia-Zamor, 1971; Fourakis & Iverson, 1984; Port et al., 1981). Examination of the characteristics of word-final devoicing has suggested that the rule is, in fact, non-neutralizing.
However, these findings relate only to production. While Polish speakers may produce differences between underlying voiced and voiceless word-final obstruents, these differences may not be perceptually salient to Polish listeners when the words are presented in isolation. Several perceptual studies examining this issue have been conducted for German and Catalan. Port and his colleagues (Port & O'Dell, 1984; Port et al., 1981) obtained evidence that native German speakers could correctly identify naturally spoken words that differed in underlying voicing. In particular, Port et al. (1981) found that listeners could correctly identify 72% of all items presented to the subjects and 63%, 67%, and 86% of items containing p/b, t/d, and k/g, respectively. These values were all significantly better than chance performance. Similarly, in an examination of Catalan, Charles-Luce (unpublished) found that listeners were able to distinguish between underlyingly voiced and underlyingly voiceless items that were examined in the Dinnsen & Charles-Luce (1984) production study. Therefore, previous work suggests that when phonetic differences have been found between underlying voiced and voiceless word-final obstruents, these differences are (sometimes) perceptually audible.
The current research was designed to determine whether the acoustic measures identified in the Polish production study conducted by Slowiaczek & Dinnsen (1985) are functional in perception. Native Polish and English listeners identified an auditorily presented Polish word by choosing between two alternatives on an answer sheet. If subjects are able to identify these words correctly, the production and perception results will converge on the non-neutralizing character of the word-final devoicing rule in Polish. However, if subjects are unable to identify these words, it might suggest that the differences obtained between items in the original production study are not used to distinguish these items perceptually. Such a result would suggest that the neutralizing status of the word-final devoicing rule may be operational primarily in perception.
2. Method
2.1. Subjects
Twenty-one native Polish subjects were recruited from a Polish community in Chicago, Illinois. Subjects ranged in age from 21 to 75 years (mean = 35 years). The number of years the subjects had been in the United States varied between one month and 37 years (mean = 6.41 years; median = 3 years). In addition, 18 native English-speaking subjects were recruited from the population of students at Loyola University of Chicago. The English subjects ranged in age from 19 to 40 years (mean = 25 years). All subjects were paid $10.00 for their participation in the experiment and had no reported history of a speech disorder or hearing loss.
2.2. Materials
Thirteen pairs of words originally recorded and used in the production study conducted by Slowiaczek & Dinnsen (1985) were obtained for use in the current experiment. The items in a pair differed with regard to the underlying voice feature of the final segment. In addition, in the original production study each item in a pair was recorded in two sentence contexts such that the word following the target in one sentence began with a vowel, and the word following the target in the other sentence began with a consonant. The specific pairs used in the current perceptual experiment were selected based on the vowel duration differences between the two words in a minimal pair. The parameter of vowel duration was used because it revealed the most consistent results in the original production study. Thus, the minimal pairs were selected in the following manner. The vowel duration was obtained for each word in the original experiment and those vowel durations associated with the underlyingly voiceless word were subtracted from those vowel durations associated with the underlyingly voiced word in a particular pair.
This yielded a difference score for each of the four repetitions of each of the original 15 minimal pairs produced by each of the five speakers in each of the two contexts in the Slowiaczek & Dinnsen (1985) study (i.e., difference scores for 600 total pairs). Thirteen of the minimal pairs associated with the largest difference scores in both a consonant and a vowel context were selected for the current study. Thus, 26 token pairs were selected (13 minimal pairs × 2 contexts), or about 4% of the original 600 token pairs examined by Slowiaczek & Dinnsen. The specific speaker who produced the pair was not manipulated or balanced in the experimental design of the current study. Of the 26 selected pairs of items, 11 were produced by speaker 1, seven by speaker 2, none by speaker 3, five by speaker 4, and three by speaker 5. While the vowel durations were 10% longer before underlying voiced obstruents than before underlying voiceless obstruents for the complete set of stimulus items in the Slowiaczek & Dinnsen (1985) study, the vowel durations for the underlying voiced obstruents were 55% longer for the subset of items used in the current study. These pairs are presented in Table I. Thus, 52 items were used in the current study (13 pairs × 2 underlying voice contrasts × 2 original contexts). Each of the 52 items was excised from the original production sentences using a digital waveform editor (Luce & Carrell, 1981). These 52 items were repeated twice for a total of 104 trials. Three tape recordings were made of the 104 items in three separate random orders. A 3 s pause was inserted between successive words on the tape. A set of 10 practice words was spliced onto the beginning of each of the three experimental tapes. An answer sheet was created for each of the three experimental tapes. The answer sheet presented a pair of words next to a trial number. If the word /karp/ was presented on a particular trial, the words "karp" and "karb" appeared on the answer sheet next to that trial number. The order of the words in the pair on the answer sheet was balanced.

TABLE I. The 13 minimal pairs used in the experiment with the associated gloss

Word    Gloss                 Word    Gloss
karp    carp                  karb    notch
grup    group (gen. pl.)      grób    grave
jot     letter j              jod     iodine
grat    old thing             grad    hail
kot     cat                   kod     code
pot     perspiration          pod     under
lok     curl                  log     logarithm
łuk     bow                   ług     lye
paf     flop                  paw     peacock
kasz    cereal (gen. pl.)     każ     order!
pięć    five                  piędź   span
choć    although              chodź   come
mieć    to have               miedź   copper

2.3. Procedure
The Polish-speaking subjects were shown a list of the items to be used in the experiment. Subjects were told that on each trial they would hear an isolated word that was cut from the sentence in which it was originally produced. Subjects were instructed to listen carefully to the word presented on each trial and then to circle on an answer sheet the alternative that best corresponded to the presented word. Ten practice trials were presented before the experimental trials. Subjects were encouraged to ask questions prior to starting the experimental trials. Once the experimental trials began, they continued uninterrupted until each of the 104 trials had been presented.
The English-speaking subjects were provided with the same instructions and practice trials given to the Polish subjects prior to the experimental session. However, while the Polish subjects were obviously familiar with words in their language, the English subjects were unaware of the nature of the stimuli. Therefore, the English subjects were asked to listen to each of the 13 pairs of items as often as they liked as they reviewed the list of stimuli at the beginning of the session. In addition, these subjects were told to note the difference in sound between the two items in a pair, although specific cues were not identified for the subjects (e.g., "Notice differences in vowel duration"). This procedure was included to familiarize the English subjects with the stimuli.
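The stimulus counts and the tape construction described in the Method can be checked with a short sketch. It is an illustration only: the function names `build_items` and `build_tape`, the seed values, the pseudo-random shuffling, and the particular scheme for balancing word order on the answer sheet are all assumptions, not a description of how the original tapes and answer sheets were prepared. The word list follows Table I in plain ASCII spelling.

```python
import random

# Voiceless/voiced members of the 13 minimal pairs (ASCII spellings, illustration only).
PAIRS = [
    ("karp", "karb"), ("grup", "grob"), ("jot", "jod"), ("grat", "grad"),
    ("kot", "kod"), ("pot", "pod"), ("lok", "log"), ("luk", "lug"),
    ("paf", "paw"), ("kasz", "kaz"), ("piec", "piedz"), ("choc", "chodz"),
    ("miec", "miedz"),
]
CONTEXTS = ("consonant", "vowel")  # what followed the target in the source sentence
REPETITIONS = 2                    # each excised item occurs twice per tape

# Production study: 15 pairs x 4 repetitions x 5 speakers x 2 contexts.
ORIGINAL_TOKEN_PAIRS = 15 * 4 * 5 * 2
SELECTED_TOKEN_PAIRS = len(PAIRS) * len(CONTEXTS)


def build_items():
    """Return the excised items: 13 pairs x 2 voicing values x 2 contexts."""
    items = []
    for voiceless, voiced in PAIRS:
        for context in CONTEXTS:
            for word, voicing in ((voiceless, "voiceless"), (voiced, "voiced")):
                items.append({"word": word, "voicing": voicing,
                              "context": context, "pair": (voiceless, voiced)})
    return items


def build_tape(seed):
    """Return one randomized trial order with a balanced answer sheet.

    Hypothetical balancing scheme: within each pair, the printed order of
    the two alternatives alternates from one occurrence to the next.
    """
    rng = random.Random(seed)
    trials = [dict(item) for item in build_items() for _ in range(REPETITIONS)]
    rng.shuffle(trials)
    seen = {}
    for trial in trials:
        n = seen.get(trial["pair"], 0)
        seen[trial["pair"]] = n + 1
        trial["sheet"] = trial["pair"] if n % 2 == 0 else trial["pair"][::-1]
    return trials


tapes = [build_tape(seed) for seed in (1, 2, 3)]  # three separate random orders
print(ORIGINAL_TOKEN_PAIRS, SELECTED_TOKEN_PAIRS, len(build_items()), len(tapes[0]))
print(f"{SELECTED_TOKEN_PAIRS / ORIGINAL_TOKEN_PAIRS:.1%} of the original token pairs")
```

Under these assumptions the counts reproduce the figures given in the text: 600 original token pairs, 26 selected token pairs (about 4%), 52 excised items, and 104 trials per tape.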
Figure 1. Percent correct identification of words with underlyingly voiced and voiceless word-final segments produced in two contexts in which the target word was followed by a consonant (•) and in which the target word was followed by a vowel (□) for Polish listeners (a) and English listeners (b).
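The scores plotted in the figure are proportions of correct responses per condition, and the Results compare overall scores with the 0.50 chance level of a two-alternative task. As a rough illustration of both steps, the sketch below tabulates per-cell proportions from trial records and computes a one-sample t statistic against chance. The function names and all data are hypothetical; the source does not give per-listener scores or state exactly how the tests were computed, although a one-sample test is consistent with the reported degrees of freedom (20 for 21 Polish listeners, 17 for 18 English listeners).

```python
import math
import statistics


def proportion_correct(trials):
    """Proportion of correct responses per (voicing, context) cell.

    `trials` is a list of (voicing, context, correct) tuples, where
    `correct` is True when the circled alternative matched the stimulus.
    """
    cells = {}
    for voicing, context, correct in trials:
        hits, total = cells.get((voicing, context), (0, 0))
        cells[(voicing, context)] = (hits + int(correct), total + 1)
    return {cell: hits / total for cell, (hits, total) in cells.items()}


def t_against_chance(scores, chance=0.5):
    """One-sample t statistic and degrees of freedom for scores vs. chance."""
    n = len(scores)
    mean = statistics.fmean(scores)
    sd = statistics.stdev(scores)  # sample standard deviation (n - 1 denominator)
    return (mean - chance) / (sd / math.sqrt(n)), n - 1


# Hypothetical responses from one listener, four trials per cell.
demo_trials = (
    [("voiceless", "consonant", c) for c in (True, True, True, False)]
    + [("voiced", "consonant", c) for c in (True, False, True, False)]
    + [("voiceless", "vowel", c) for c in (True, True, False, False)]
    + [("voiced", "vowel", c) for c in (True, True, False, True)]
)
cells = proportion_correct(demo_trials)

# Hypothetical overall scores for five listeners (not data from the experiment).
demo_scores = [0.55, 0.60, 0.65, 0.70, 0.50]
t_value, df = t_against_chance(demo_scores)
print(cells)
print(f"t({df}) = {t_value:.2f}")
```

With real data, `demo_scores` would hold one overall proportion per listener, so a group mean only slightly above 0.50 can still differ reliably from chance when listeners are consistent with one another.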
3. Results
Percent correct identification scores for the two underlying representations (+voice vs. -voice) and the two contexts (vowel vs. consonant) were obtained for each subject. Mean percent correct identification scores across all Polish and all English subjects are provided in Fig. 1. Overall percent correct for Polish listeners was 0.61 and for English listeners was 0.59. No significant difference was found between the performance of Polish and English subjects (t(37) < 1.0, N.S.).
3.1. Polish subjects
Results indicated that overall percent correct performance for Polish listeners (0.61) was significantly better than chance (t(20) = 3.21, p < 0.01). A two-way analysis of variance (underlying voicing × context) revealed that subjects were significantly better at identifying items with underlying voiceless word-final segments (0.69) than with underlying voiced word-final segments (0.52) (F(1, 20) = 11.49, p < 0.002). Percent correct performance for both the consonant and vowel contexts was 0.61. Thus, the context in which the original items were recorded did not affect identification of the words (F(1, 20) < 1.0, N.S.). Finally, the interaction between underlying voicing and context was not significant (F(1, 20) = 3.56, N.S.).
Because Polish subjects are familiar with the word-final devoicing rule in their language, it is possible that subjects were biased to respond with the voiceless alternative. In order to test for any effects of bias, a signal detection analysis was conducted. The analysis revealed a mean sensitivity (d') of 0.59. This value is significantly different from a d' of zero (t(20) = 11.41, p < 0.01) and indicates that subjects were somewhat sensitive to stimulus differences. In addition, the analysis revealed an overall bias (β) of 1.13. This value suggests that bias was minimal, since a value of 1.00 reflects no bias in perception. Finally, an unbiased estimate of percent correct (P(C)max) was determined to be 0.6141. This unbiased estimate is not noticeably different from the overall percent correct obtained by subjects in the experiment. Therefore, although a slight bias was found, the nature of the results cannot be fully explained based on biased perception alone.
3.2. English subjects
As with the Polish subjects, overall percent correct performance (0.59) was significantly better than chance (t(17) = 2.58, p < 0.05). A two-way analysis of variance (underlying voicing × context) revealed a significant effect of underlying voicing, such that underlying voiceless segments were identified better (0.66) than underlying voiced segments (0.52) (F(1, 17) = 6.386, p < 0.02). Percent correct performance in the consonant context was 0.59 and in the vowel context was 0.60. Thus, as with the Polish listeners, a main effect of context was not obtained (F(1, 17) < 1.0, N.S.). Finally, a significant interaction was obtained between underlying voicing and consonant or vowel context (F(1, 17) = 4.764, p < 0.04). Post hoc tests revealed a significant difference between performance on voiced and voiceless segments in a consonant context (F(1, 34) = 8.98, p < 0.05), but no significant difference in a vowel context (F(1, 34) = 3.59, N.S.).
A signal detection analysis of responses for the English subjects revealed an average sensitivity (d') of 0.51, which was significantly different from a d' of zero (t(17) = 6.8, p < 0.05). The overall bias for English subjects was 1.09 and the unbiased estimate of percent correct for these subjects was 0.59. Again, these values suggest a slight bias. However, the bias is too small to account for the obtained results.
4. Discussion
In the current experiment, evidence was obtained that suggests that Polish and English listeners are able to identify differences between words on which a putative neutralization rule has operated. However, identification performance was better for items that were underlyingly voiceless in comparison to those that were underlyingly voiced. Although no main effect of the context in which the words were recorded was found, an interaction between underlying voicing and context was obtained for the English listeners.
Several conclusions may be drawn from the present results. First, both Polish and English-speaking subjects were not always guessing when responding to the stimuli in the current experiment. This conclusion is suggested by the fact that overall performance for both groups of subjects was significantly better than chance (50%). Second, the context in which the original items were recorded seemed to have little effect on identification performance. This result is not entirely surprising given the fact that the effect of context on the acoustic measurements obtained in the original study was only minimal. The interaction between context and underlying voicing for English listeners indicates a greater difference in identification of underlying voiced and voiceless items in the consonant context than in the vowel context. The cause of this interaction may be related to a voice assimilation rule. In the consonant context, a regressive voice assimilation rule may be operating and thus contributing to the perception of the devoiced final segment of the preceding target word as voiceless. Hence, identification performance was greater (0.67) in the condition in which an underlying voiceless segment was followed by a voiceless consonant than in the condition in which an underlying voiced segment was followed by a voiceless consonant (0.51). In the vowel context an assimilation rule would not contribute to the perceived devoicing and as a result the difference in identification performance between the underlyingly voiced (0.54) and voiceless (0.65) segments was not as great. The failure to find an interaction for the Polish listeners is apparently a function of the greater variance found in the Polish data (MSE = 102.38 for Polish subjects, MSE = 35.92 for English subjects).
Of greatest interest, however, is the fact that performance for both groups of subjects was better for underlying voiceless segments than for underlying voiced segments. Originally, this was interpreted as a bias on the part of Polish subjects to respond with the voiceless alternative because of their knowledge of the word-final devoicing rule in their language. That is, the results were believed to be a function of the knowledge Polish listeners possess rather than their perception of the stimulus items on the tape. However, this interpretation is ruled out by the fact that English-speaking subjects (who were unaware of the devoicing rule in Polish) were also better at identifying words with an underlying voiceless word-final segment. In fact, the major factors examined in the present experiment yielded very similar results for both the Polish and English subjects. Thus, these results seem to indicate that the perception of more voiceless word-final segments on the stimulus tape is a function of the information available in the signal itself rather than information available to the listener beyond the stimulus (e.g., language-specific rules).
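The argument about bias leans on the signal detection measures reported in the Results. As a rough check, and only under stated assumptions, the sketch below computes sensitivity, bias, and unbiased proportion correct from the group mean scores. It assumes the standard equal-variance Gaussian model, with d' = z(H) - z(F), β equal to the likelihood ratio at the criterion, and P(C)max equal to Φ(d'/2), and it assumes that "voiced" was treated as the signal. The source does not state which formulas or which signal definition were used, and the reported values are means over individual listeners rather than values derived from group means. The function name `sdt_measures` is hypothetical.

```python
from math import exp
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def sdt_measures(hit_rate, false_alarm_rate):
    """Equal-variance Gaussian d', beta, and unbiased proportion correct."""
    z_hit = STD_NORMAL.inv_cdf(hit_rate)
    z_fa = STD_NORMAL.inv_cdf(false_alarm_rate)
    d_prime = z_hit - z_fa
    beta = exp((z_fa ** 2 - z_hit ** 2) / 2)  # likelihood ratio at the criterion
    pc_max = STD_NORMAL.cdf(d_prime / 2)       # proportion correct with no bias
    return d_prime, beta, pc_max


# Assumption: "voiced" is the signal, so
#   hit         = "voiced" response to an underlyingly voiced word
#   false alarm = "voiced" response to an underlyingly voiceless word
# Rates are the group means from the Results, not per-listener data.
GROUP_MEANS = {
    "Polish": {"voiced_correct": 0.52, "voiceless_correct": 0.69},
    "English": {"voiced_correct": 0.52, "voiceless_correct": 0.66},
}

results = {
    group: sdt_measures(p["voiced_correct"], 1 - p["voiceless_correct"])
    for group, p in GROUP_MEANS.items()
}
for group, (d_prime, beta, pc_max) in results.items():
    print(f"{group}: d' = {d_prime:.2f}, beta = {beta:.2f}, P(C)max = {pc_max:.2f}")
```

Under these assumptions β comes out close to the reported 1.13 and 1.09, while d' computed from group means (about 0.55 and 0.46) is somewhat lower than the reported means of 0.59 and 0.51. A β above 1.00 on this definition corresponds to a mild reluctance to respond "voiced", which is the direction of the voiceless advantage discussed above.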
These results are consistent with the previously reported perceptual work in German and Catalan (Charles-Luce, unpublished; Port et al., 1981; Port & O'Dell, 1984) in which listeners were able to identify correctly, better than chance, naturally spoken words differing in underlying voicing. In fact, the magnitude of the findings in the current study is similar to that reported by Port et al. (1981) (i.e., 63% and 67% for p/b and t/d, respectively). In summary, the fact that subjects in the current experiment were able to differentiate the items in the minimal pairs better than chance suggests that subjects were, to a certain extent, able to use phonetic differences between underlyingly voiced and voiceless items to differentiate them. However, the better performance for underlyingly voiceless items cannot be ignored. Because the advantage for voiceless items was not a function of the subjects' knowledge of the rule of word-final devoicing, these listeners must simply hear more "voiceless" items on the experimental tape. Thus, despite the better than chance performance observed in the present investigation, subjects do appear to be "perceptually neutralizing" the word-final segments of words used in this study. They are at the very least not attending consistently to the 55% difference in vowel durations for underlyingly voiced and voiceless obstruents used in this study and, presumably, would not consistently attend to the 10% difference in vowel durations obtained in the Slowiaczek & Dinnsen (1985) production study. In conclusion, the data suggest that subjects are performing a complex task and, while they attend to the actual stimulus being presented, they do not rely primarily on the phonetic differences identified in earlier studies to differentiate these minimal pairs. The fact that subjects in the current experiment perceive more "voiceless" items than "voiced" items suggests that they are, in fact, perceiving word-final obstruents as neutralized.
Thus, while the nature of the word-final devoicing rule has been questioned based on production studies, for the Polish minimal pairs examined in the present experiment the integrity of the rule appears to be maintained in perception. Additional investigations into the phenomenon of word-final devoicing and more general neutralization rules will, no doubt, clarify their role in phonological systems.
The authors would like to thank the following people for their assistance during various stages of this research project: B. Dugoni, T. Dye, D. B. Pisoni, A. Proske and R. S. Tindale. This research was supported by a small grant from Loyola University of Chicago. An earlier version of this paper was presented at the meeting of the Acoustical Society of America, November 1988 in Honolulu, Hawaii.
References
Charles-Luce, J. (1985) Word-final devoicing in German: Effects of phonetic and sentential contexts, Journal of Phonetics, 13, 309-324.
Charles-Luce, J. (1987) The effects of semantic context on voicing neutralization. In Research on speech perception: progress report no. 13, Bloomington, IN: Indiana University.
Charles-Luce, J. & Dinnsen, D. A. (1987) A reanalysis of Catalan voicing, Journal of Phonetics, 15, 187-190.
Dinnsen, D. A. (1982) On the phonetics of phonological neutralization. Paper presented at the Working Group on Language/Speech Behavior, XIII International Congress of Linguists, Tokyo.
Dinnsen, D. A. (1983) On the characterization of phonological neutralization. Research in Phonetics, Vol. 3.
Dinnsen, D. A. & Charles-Luce, J. (1984) Phonological neutralization, phonetic implementation and individual differences, Journal of Phonetics, 12, 46-60.
Dinnsen, D. A. & Garcia-Zamor, M. (1971) The three degrees of vowel length in German, Papers in Linguistics, 4, 111-126.
Fourakis, M. & Iverson, G. (1984) On the "incomplete neutralization" of German finals, Phonetica, 41, 140-149.
Giannini, A. & Cinque, U. (1978) Phonetic status and phonemic function of the final devoiced stops in Polish. Speech Laboratory Report. Napoli: Instituto Universitario Orientale.
Luce, P. A. & Carrell, T. D. (1981) Creating and editing waveforms using WAVES. Research on speech perception, progress report no. 7. Bloomington, IN: Speech Research Laboratory, Department of Psychology, Indiana University.
Port, R. & O'Dell, M. (1984) Neutralization of syllable final voicing in German. In Research in phonetics: report no. 4, pp. 93-134. Bloomington, IN: Indiana University.
Port, R., Mitleb, F. M. & O'Dell, M. (1981) Neutralization of obstruent voicing in German is incomplete. Paper presented at the 102nd Meeting of the Acoustical Society of America, Miami Beach, Florida.
Slowiaczek, L. M. & Dinnsen, D. A. (1985) On the neutralizing status of Polish word-final devoicing. Journal of Phonetics, 13, 325-341.