Journal of Phonetics (1990) 18, 453-464
Who do phoneticians represent?
Francis Nolan
Department of Linguistics, University of Cambridge, Sidgwick Avenue, Cambridge CB3 9DA, U.K.
1. Characterizing the positions
The six position papers are striking in their diversity. A skeletal summary brings out the range of ways in which the authors have interpreted the notion of phonetic representation: as a practical descriptive tool for linguistic analysis (Ladefoged, 1990); as a cognitive representation in speech perception (Nearey, 1990); as a gestural plan underlying speech production (Browman & Goldstein, 1990); as one or more levels of representation at the point where grammars interface with the physical world (Keating, 1990; Rischel, 1990); and as a statement of purely non-cognitive aspects of the speech event in the physical world (Pierrehumbert, 1990). To clarify the different views which the authors have taken, the rest of Section 1 will pose various questions about the nature of a phonetic representation which the position papers, explicitly or implicitly, seem to me to answer in different ways.¹ Section 2 then discusses Pierrehumbert's view in more detail. Finally, Section 3 investigates the relation between producer, perceiver, and phonetician in relation to phonetic information.
1.1. Whose representation is it?
In much work on grammar, it is possible not to worry about where the phenomenon being represented has its existence. A phrase-structure tree for a sentence is in some sense part of the language; it is a linguist's description of the language; it is a way of representing knowledge shared by language users whether producing or receiving. Because phonetics, however, is centred on the transmission of linguistic structure from a producer to a receiver in the physical world, such representational neutrality is less obviously appropriate.
The traditional IPA-type phonetic representation discussed by Ladefoged (1990) is, of all those discussed, the most clearly a linguist's description. A linguistic phonetician making a narrow, "impressionistic", transcription of a language from scratch is monitoring the language-native's² productions, and evaluating his or her reactions to attempted replications, and so the data are from production and perception; but the transcription itself is a record of neither production nor perception. As Ladefoged notes, only in the earliest stages of the process will the transcription be totally impressionistic; quickly the phonetician begins to make hypotheses about what is significant and insignificant for the language-native, and to allow the transcription to be influenced accordingly, even before any formal attempt to work out the phonological system. Already, perhaps, the answer as to whose the representation is has begun to shift. It is becoming the language-native's; or, as Ladefoged would prefer, given his view that language does not belong to an individual, the language community's.
¹ Since Rischel's paper takes the form of a usefully wide-ranging review of issues to do with phonetic representation, rather than a presentation of one position, I shall refer to his paper least in this comparison.
² I have tried to avoid the biases, towards production and towards the phonic medium, of the term "native speaker".
0095-4470/90/030453 + 12 $03.00/0
© 1990 Academic Press Limited
Keating's categorical phonetic representation is likewise a representation "neutral with respect to articulation and acoustics/perception". In keeping with earlier work in generative phonology, to which her ideas are heir, it is clearly psychologically real, and common to producer and perceiver. Further, it is not merely a description of knowledge about an utterance, but it is also seen as a mental reality in the processing of it, being "a representation that can be input to articulatory planning by a speaker, or the result of perceptual processing by a listener" (p. 324). Keating's other two proposed phonetic representations are an acoustic parametric representation and an articulatory parametric representation. The latter is presumably a model of something internal to the speaker. Browman & Goldstein's gestural score is a detailed instantiation of such a representation which might serve as input to the production mechanism or, as in their model, a computer simulation of it.
Nearey (1990) reanalyses data from perceptual experiments requiring more than one phonetic judgment (e.g. CV syllables in which the C varied over [s] and [f] and the V over [i] and [u]). His interpretation using linear logistic analysis supports the reality of phoneme-sized units in perception. The implications of his paper are thus for phonetic representation in the mental processing of utterances by the receiver.
Pierrehumbert's (1990) paper is alone in taking a strongly non-cognitive view of phonetic representations. She states that, at the phonetic level, representation "is not cognitive, because it concerns events in the world rather than events in the mind" (p. 377). Different techniques of experimental phonetics yield a variety of phonetic representations, such as acoustic records or X-ray pictures; and abstraction from these records may produce phonetically relevant representations such as fundamental frequency or (I infer) vocal tract cross-sectional area functions. But these are still in the territory of physics, not psychology. For Pierrehumbert, then, phonetic representations are nobody's.
1.2. In what domain is the representation stated?
This question concerns the "vocabulary" of a representation. Is it acoustic, or articulatory, or auditory? At first sight this might appear to be the same as the previous question; Rischel's Section 2.1, for instance, does not distinguish them. Surely, if we choose to use the term phonetic representation for a cognitive representation in the process of speech production, does it not follow that its vocabulary will be essentially articulatory; and, correspondingly, that it will be auditory in the case of a perceptual phonetic representation?
On the contrary, there is a long pedigree to a hypothesis that the vocabulary of representations in speech perception is in fact articulatory. In its most prominent formulations the view was known as the motor theory of speech perception (see e.g. Liberman, Cooper, Shankweiler & Studdert-Kennedy, 1967; Liberman & Mattingly, 1985). Perceptual decoding is performed not directly on an auditory transformation of the acoustic signal, but on an inferred articulatory representation. In early versions of the theory, concrete articulations were inferred; in the more sophisticated later versions, it is the states of control structures underlying observable articulations which are inferred. Browman & Goldstein's gestures may constitute candidate elements for such a representation.
Equally possible, in principle, is a model in which the phonetic plan input to speech production is a set of auditory specifications. These would then be interpreted, with reference to stored knowledge of articulatory-auditory relationships, by the vocal organs. This possibility, however, does not surface in the present papers.
There is some difference of emphasis as regards the vocabulary for the linguistically-oriented phonetic representations. The new IPA Principles, quoted by Ladefoged, speak of "phonetic categories which describe how each sound is made" (p. 338). It is not clear how much significance to attach to this apparently exclusively articulatory view. Rischel (1990) is perhaps more realistic in seeing the output of phonetic studies as "sets of descriptive dimensions anchored in either articulatory, acoustic, or auditory phonetics, or in a convergence between the findings from all three phases of human communication by spoken language" (p. 398). The vocabulary of linguistic phonetic representations, though based on apparently articulatory or auditory terms, is in my view really wedded neither to production nor perception. Nearey, in his background assumptions, appears to take a similar view: "phonetic units are not defined primarily either in purely articulatory nor in purely acoustic/auditory terms", even though inevitably they are "constrained by the general capabilities of the human articulatory and perceptual systems" (p. 352).
The need for a dimension in the phonetic vocabulary derives from knowledge of what languages do (mainly the phonological contrasts they exhibit) rather than from observing the separate activity of speakers and hearers. It is then a matter of finding a convenient correlate for that dimension, which may be auditory or articulatory.
Pierrehumbert is bound by her position to remain fairly close to the physical world, but makes the important point that physical representations may involve a significant degree of abstraction, such as the parameter values used to synthesize speech via an electrical transmission line analogue of the vocal tract. Even the superficially straightforward notion of a fundamental frequency contour in speech is an idealization (her Section 2.2.1).
The most innovative proposal, in a sense, is that of Browman & Goldstein. On the face of it, their "gestural score" looks as if it consists of the kind of articulatory parameters long familiar to phoneticians. But the notion of a "gesture" is more abstract than that. It is a "coordinative structure", a functional marshalling of several components of the vocal mechanism (e.g. upper lip, lower lip, jaw) to carry out a task such as the formation and release of labial closure. A gesture is characterized by its dynamic behavior, which is modelled using concepts from the physics of springs, such as mass, stiffness, and damping. From the "score" indicating the phasing of various gestures, the changing vocal tract shape can be computed, and thence the acoustic output.
1.3. How many phonetic representations are there?
Those who address this question all think more than one, though their respective sets range along different dimensions. Ladefoged notes the International Phonetic Association's pragmatism in drawing attention to the fact that there are many styles of transcription which fall within the Association's scope, and also presents an Abercrombian categorization of different types of transcription. Keating defines three phonetic representations: categorical, articulatory parametric and acoustic parametric. Pierrehumbert probably ends up with the most; any recorded measurement of speech, or abstraction from it, can be referred to as a phonetic representation.
1.4. Are phonetic representations segmented?
Some support emerges for traditional phoneme-sized segmentation. It is, not surprisingly, still the basis of the revised IPA, and Ladefoged reports no challenge to it at the recent convention. As he points out, the framework is firmly grounded in phonological principles; and through its history these have been dominated by the phoneme. Will innovations from current less linear phonological models, such as autosegmental phonology, influence the next revision?
Keating's categorical phonetic representation does modify the segment, in that she describes non-continuant sounds in terms of two successive specifications, one for the closure phase and one for the release.
Most interesting, from the point of view of segmentation, is Nearey's paper. Of the units which might be proposed as basic to perception, such as syllables, demisyllables, triphones, diphones, and segments (i.e. phones), he concentrates on models using diphones and segments. "Pure" and "trans-segmentally biased" versions of the segment models are distinguished, the latter allowing for sensitivity to immediate phonological context and being intermediate between pure segmental and diphone models. Linear logistic analysis is used to reanalyse data from three perceptual experiments exemplifying three types of phonological effect: extrinsic allophonic variation; coarticulation; and phonotactic constraints. In each case Nearey concludes that it is the trans-segmentally biased segment model which best matches perceptual behaviour.
Browman & Goldstein (1990) claim that their dynamical description of gestures "simplifies the relation between categorical and continuous characterizations of articulation" (p. 300). I think this claim relates mainly to the degree of context-independence of a gesture, whose dynamical specification promotes the achievement of its task regardless of the current state of the vocal tract. As far as I can see, traditional segmentation is absent from the gestural score. How the score is derived from a linguistic representation, and how far the latter would consist of segments, is not dealt with. It is suggested that segments, or other linguistic units, might turn out to correspond to "constellations" of gestures, but only one piece of evidence for a segment (English /l/) is cited.
Finally, given her "physical" view of phonetic representations, Pierrehumbert would not expect to find clearly demarcated segments in them. The articulation of speech is essentially continuous, and such discontinuities as occur in the acoustic signal do not correspond reliably to boundaries between phonological segments. Segments are mental constructs; although she does admit (Section 3.1.2) to the reinforcement of segmentation by the "weakly segmental character" of the speech signal, namely the fact that "evidence for any particular distinctive element tends to be localized (rather than occurring at arbitrary distances)" (p. 390).
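The idea that a gesture's dynamical specification achieves its task regardless of the current state of the vocal tract can be illustrated with a minimal sketch. It assumes, purely for illustration, a single tract variable obeying a critically damped mass-spring equation, solved in closed form; the function name, the parameter values and the arbitrary units are hypothetical and are not taken from Browman & Goldstein's model.

```python
import math


def gesture_position(t, start, target, stiffness=100.0, mass=1.0, start_velocity=0.0):
    """Position at time t of a critically damped mass-spring system.

    Illustrative assumption: one tract variable (e.g. lip aperture) is pulled
    towards `target` with damping fixed at the critical value 2*sqrt(mass*stiffness),
    so that it approaches the target without oscillating.
    """
    omega = math.sqrt(stiffness / mass)       # natural frequency in rad/s
    offset = start - target                   # initial displacement from the target
    slope = start_velocity + omega * offset   # constant fixed by the initial velocity
    return target + (offset + slope * t) * math.exp(-omega * t)


if __name__ == "__main__":
    closure = 0.0  # hypothetical target value for a labial closure
    # Three different starting states of the tract variable, same gesture.
    for start in (2.0, 10.0, -3.0):
        trajectory = [gesture_position(step / 10, start, closure) for step in range(11)]
        print(start, [round(x, 3) for x in trajectory])
```

Each trajectory begins at a different value but converges on the same target, which is one way of picturing the context-independence discussed above. A fuller model would coordinate several articulators and phase several gestures against one another.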
2. A review of Pierrehumbert's position
Pierrehumbert's paper presents, at first sight, a clear-cut and radical position. It is that phonetic representations are confined to the physical world (they are those representations which derive from instrumental analysis of speech, or which directly control the physical parameters of synthesizers) and that they have no cognitive status. I want to deal with this position in some depth because I think it is a helpfully provocative one. I also think it is misguided; and, furthermore, that Pierrehumbert's paper is unable to sustain it consistently. Her position turns out to be an attack on only a narrowly-defined class of possible cognitive phonetic representations.
Pierrehumbert contrasts phonological and phonetic representations as follows:
"Phonological representations are qualitative, cognitive, and relatively accessible to introspection; phonetic representations are quantitative, non-cognitive, and relatively inaccessible to introspection" (pp. 378-379).
"Qualitative" and "quantitative" do not, of course, denote here the phonetic dichotomy between timbre and length, but might, if I understand Pierrehumbert, be glossed as discrete, or categorical, on the one hand, and continuous, or gradient, on the other.
Pierrehumbert singles out for approval models which map a phonological representation directly, via realization rules, onto a representation in the physical world.³ The success she has had in predicting F0 contours from a phonological representation in the form of high and low tonal elements is beyond question; nor can the utility of another model she cites, the acoustic theory of speech production, be doubted. However, when we look more closely at the latter model, at any rate, it may not give Pierrehumbert's position quite the support she imagines. She regards it as exemplifying "the best understood aspect of the 'semantics'⁴ of phonological representations", namely, "the relation of phonemes, or their distinctive feature representations, to dimensions of articulatory control and acoustic variation" (p. 382). In fact, the acoustic theory of speech production deals successfully, in my view, with the realization not of phonemes but of phones (or allophones); that is, of elements at a level of linguistic representation which contains appreciable subphonemic detail. If Pierrehumbert were asked to give a specific instance of successful modelling within the scope of the acoustic theory of vowel production, I suspect she might, for instance, be tempted to cite the correct prediction of three or four formant frequencies and amplitudes via a single-tube vocal tract analogue for an English vowel phoneme such as /i:/. But in fact the prediction would be of values for the phone [i:], such as might occur in see, and would not well express the acoustic realization of the phone [ĩ:], which might occur in mean, and for which neither the values of the spectral peaks for [i:] nor the assumption of a single acoustic tube would be especially appropriate.
Pierrehumbert's rejection of the phone (for this is one facet of her position) is clouded by some passages from which the distinction between phone and phoneme fails to emerge clearly. For instance, she glosses phonemes as "an inventory of elements which could be substituted for each other in the same position" (p. 383). Such elements (if they change lexical identity, as in the case of the consonants at the end of lip, lit and lick) are indeed crucial to the notion of the phoneme; but equally crucial is the belief that it is possible to group a member of one such inventory with an (in some sense) equivalent member from various other inventories (e.g. the labial of lip with that of pill). The very concept of the phoneme is based on the notion of phonological units being realized in systematically different ways according to context. This is, of course, well known to Pierrehumbert; the latter half of her Section 3.1 is given over to the problems caused by "syntagmatic" relations between phonological units; that is, the kind of contextual variation some of which is captured in a phonetic representation as traditionally conceived of (cf. [i:] and [ĩ:] above). It turns out, for all that the "best understood aspect of the 'semantics' of phonological representations is the relation of phonemes ... to ... acoustic variation" (p. 382), that "our understanding of what absolute values occur in what context is neither comprehensive nor exact" (p. 384).
³ Including representations which are abstractions from, or schematizations of, the speech signal itself, such as the one she describes as consisting of "critical events in a schematic F0 contour" (p. 385).
⁴ Pierrehumbert sees the relation of phonological entities to the physical world as comparable to that of concepts (see her Section 1.3).
Ultimately, then, there is no dispute about the wealth of "subphonological" detail which has to be modelled. The dispute is rather about whether a non-physical, probably symbolic, phonetic representation is appropriately part of that mapping, and whether such a representation has cognitive reality. Perhaps there is no principled criterion for deciding absolutely whether there is a need for a non-physical phonetic representation. But one pragmatic criterion is presumably how easy it is to express the mapping from phonological units to their physical realization directly. In turn, it might be expected that the more closely contextual ("syntagmatic") effects correspond to regularities in the physical world, the less need there will be for any further linguistic level of representation.
If the frontness of velars correlates with the frontness of an adjacent vowel, it may be possible (ignoring the fact that the details actually differ between languages) to allow this to emerge naturally from the physical properties of the vocal organs-modelled perhaps within the task-dynamic framework described by Browman & Goldstein. And most intonation models use mathematical functions to interpolate between phonologically specified points in intonation contours. But how far does contextual variation really correspond to regularities in the physical world? Take, for instance, the word tooting. There are good phonological reasons for regarding the initial and medial consonants as the same phonological unit. In most accents of English the first occurrence of this phonological unit will be realized as an aspirated alveolar stop . The realization of the second, however, is articulatorily and acoustically diverse across the different accents. In much American English it will be a short voiced stop or flap; in standard southern British it will be a voiceless stop with some aspiration; in various urban British accents it will be a glottal stop; in north-east England a glottalized alveolar; and in some Irish English a voiceless apico-postalveolar flapped fricative. It would be hard to find a general principle in either physical domain (articulatory or acoustic) which would readily model the realization of this phoneme in every accent. Rather, each accent displays its own idiosyncratic behavior. Historically, we might be able to seek explanations for the various patterns in the interplay of articulatory and perceptual factors; but synchronically, the realization in any one accent is as idiosyncratic a fact about that accent as, for instance, the morphophonological relatedness of [k] and [s] of words such as electric and electricity is about English. 
The various realizations of /t/ would seem to be good candidates for representation in terms of Keating's categorial phonetic level.
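One way to picture the point about tooting is as a lookup table: the medial realization has to be listed accent by accent, because no single physical principle is offered that would generate every entry. The following is a minimal sketch under that assumption; the accent labels and realization descriptions are taken from the discussion above, while the dictionary layout and the function name are purely illustrative.

```python
# Realizations of the second /t/ in "tooting", keyed by accent.
# The descriptions come from the text; the structure is an illustrative assumption.
MEDIAL_T = {
    "much American English": "short voiced stop or flap",
    "standard southern British": "voiceless stop with some aspiration",
    "various urban British accents": "glottal stop",
    "north-east England": "glottalized alveolar",
    "some Irish English": "voiceless apico-postalveolar flapped fricative",
}

# Realization of the first /t/ in most accents of English.
INITIAL_T = "aspirated alveolar stop"


def realize_t(accent, position):
    """Return a categorical description of /t/ for an accent and a word position."""
    if position == "initial":
        return INITIAL_T
    if position == "medial":
        # Listed per accent: a stored, accent-specific fact rather than a computed one.
        return MEDIAL_T[accent]
    raise ValueError("position must be 'initial' or 'medial'")


if __name__ == "__main__":
    for accent in MEDIAL_T:
        print(accent, "|", realize_t(accent, "initial"), "|", realize_t(accent, "medial"))
```

The table stands in for whatever learned, language-specific knowledge maps the phonological unit onto a categorical phonetic description; it makes no claim about how that knowledge is actually stored.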
Again, consider assimilation at word boundaries. Kerswill (1987, pp. 42, 44) notes a dialect-specific difference: where, in the phrase that girl for instance, standard southern British can assimilate the final voiceless alveolar of that to the following place of articulation (tha[k g]irl), Durham dialect would instead assimilate the voicing feature (tha[d g]irl). Although such "connected speech processes" may be gradual rather than discrete (cf. Nolan, in press) and hence more difficult to handle at any categorial level, the fact that language-specific habits are involved suggests that they should be modelled as linguistic, rather than physical, phenomena.
Why does Pierrehumbert deny cognitive reality to the output of whatever learned rules or habits generate patterns like the above? Firstly because "representation at this level ... concerns events in the world rather than events in the mind" (p. 377). Are we to assume that the mind does not make its own representation of events in the world, even events which are so closely tied to clearly cognitive linguistic activity? To deny the existence of a cognitive phonetic representation is virtually to say we have no internalized knowledge about speaking.
Secondly, "the phonetic domain is not well accessed by introspection" (p. 377). Neither are the principles of sentence construction in a particular language, nor the putative universal of cyclic rule application in phonology, except via specialized technology (in this case, not phonetic instrumentation, but the technology of formal grammar); but both are squarely cognitive, if a grammar is seen as in some sense describing what a language-native knows. Accessibility to introspection in fact depends on what question is asked. It is doubtful whether the phonemic principle is spontaneously accessible to those not (alphabetically) literate (cf. Rischel, 1990, p. 397); but it can be elicited rather simply by recourse to questions about the lexical identity of phonetic strings. Language-natives' knowledge, however, is far from being limited to the phonemic partitioning of their lexicon. Presented with some of the alternative pronunciations of tooting discussed above, an English speaker would immediately be aware of different accents, might name some, and even offer fair imitations. Over-emphasis on phonological representations geared to lexical discrimination results in neglect of the other knowledge which language-natives can bring to bear on utterances.
In short, I think Pierrehumbert's insistence on the purely non-cognitive status of phonetic representations is misguided. So, apparently, does she, given that her Section 3.2 deals with the mental representation of various sub-phonological, non-categorial information. Her real point seems to be that we should not expect a mental phonetic representation to be a close analogue of IPA narrow transcription, with its close adherence to phoneme-like segmentation. Many would agree with her, and pressure from the continuous nature of speech on one side, and less rigidly segmental phonological representations on the other, may be irresistible. I have even contributed a suggestion myself, albeit a rather simplistic one, for a cognitive representation containing the controllable phonetic aspects of speech (Nolan, 1983, Section 2.3.7). Many fruitful issues will arise from serious attempts to discover the form of such a representation, and the information it should contain.
3. Information in whose mind?
If some phonetic representations are indeed cognitive, the question arises as to whose knowledge about an utterance they are modelling. In traditional impressionistic phonetics, it is a working assumption that the same linguistic phonetic information is available to the producer of an utterance and to its perceiver; and that the linguistic phonetician, once adequately sensitized, can become aware of this information too. A similar assumption pervades a lot of work in phonology; hence, for instance, the classic generative phonological position expressed by Chomsky & Halle (1968, p. 294) that the phonetic representation is "the speaker-hearer's interpretation ... of the signal".
This is not the view of all phonologists, however. Jakobson, Fant & Halle (1952, p. 12) saw speech communication as involving a series of stages ("articulatory, acoustical, aural, perceptual"), the transformation between each of which results in some loss of information:
Each of the consecutive stages, from articulation to perception, may be predicted from the preceding stage. Since with each subsequent stage the selectivity increases, this predictability is irreversible.
Only by examining the stage where information is most reduced and which is accessible to the hearer as well as the speaker, runs the argument, can the information structure of speech be captured. This motivation for defining distinctive features in acoustic (or, ideally, auditory) terms is often not sufficiently appreciated.
To appreciate the potential complexity of the situation for cognitive phonetic representations, consider the following table, which is adapted from the one in Nolan (1986, p. 9):
                                  (a)   (b)   (c)   (d)   (e)   (f)
Articulated by language-native     +     +     -     -     -     +
Perceived by language-native       +     +     -     -     +     -
Perceived by phonetician           +     -     -     +     -    +/-
In each column the + or - indicates whether or not a distinction is articulated by a native speaker, perceived by a native hearer, and perceived by a trained phonetician. I shall consider each of the listed possibilities in turn.
(a) An articulated distinction is perceived by both. This should be overwhelmingly the commonest situation. For instance, the speaker utters a minimal pair, and both language-native and phonetician perceive it.
(b) The phonetician alone fails to perceive. The language-natives are coping nicely; the phonetician should consider alternative careers. Realistically, however, training will never equip a phonetician for all possible distinctions, and this failure is inevitable from time to time. Trial and error will usually allow the phonetician to latch on to what the language-natives find salient.
(c) No distinction is made, and none perceived. All is well; the speaker said the same thing twice.
(d) No distinction is made, but the phonetician hears one. This may not be as paradoxical as it sounds. It may be a case of phonetic free-variation on the part of one speaker. If two speakers are involved, the paradox may result from the process of abstraction by which linguistic properties in the signal are separated from
non-linguistic properties. For instance, the phonetician may have noted a difference in vowel nasalization between two otherwise identical utterances from two speakers, but it turns out merely to be a speaker-specific characteristic.
(e) The native hearer perceives a distinction which is not there. This again seems paradoxical, but is explicable if phonetic processing proceeds in parallel with other linguistic levels. Knowledge of higher-level linguistic structure imposes different phonetic structure on identical phonetic material. This view is in the spirit of generative phonology, where, according to Chomsky & Halle (1968, p. 294)
phonetic representation is understood ... not as a direct record of the speech signal, but rather as a representation of what the speaker of a language takes to be the phonetic properties of an utterance, given his hypothesis as to its surface structure and his knowledge of the rules of the phonological component.
Thus for Chomsky & Halle the fact that many of the potentially unbounded number of decreasing stress levels within an utterance, derived by the cyclic application of their stress assignment rules, might lack distinguishable auditory correlates, does not prevent the perceiver from being aware of all those levels.
Under (e) might also be considered the kind of phenomena which Sapir (1933) drew attention to. Sapir reports cases where his informants refused to accept as homophonous certain morphemes which in his own judgment as a phonetician were pronounced identically. Sapir's conclusion is that the informants were being influenced by their knowledge of the distinct phonological behaviour in other morphological environments.
(f) The native speaker produces a distinction which is not perceived. It may or may not be perceptible to a phonetician. This case also seems curious: why would a speaker produce a distinction which cannot be heard? Instances are relatively rare, and some are disputed; nevertheless they are crucial to understanding the nature of mental phonetic representations. They may be rare partly because they go against the grain of most linguistic and phonetic research which assumes that description can be neutral between producer and receiver, and so may have been overlooked. At least the following four types of case can be distinguished:
(i) Historical phoneme mergers. Here, the distinction between two phonemes is being lost in a historical change. Speakers of the relevant dialect produce small but measurable differences between originally distinct forms, but hearers can't tell which word is which. Such cases have been reported from time to time in sociolinguistic studies. Trudgill (1974, pp. 120-129), for Norwich English, discusses mergers in progress of the fear-fair word sets and the boot-boat word sets.
Costa & Mattingly (1981) show experimentally that there is a residual produced vowel length difference between (arhotic) New England cod and card, but listeners could not exploit it when subsequently tested on their own productions. (ii) Phonological neutralization. In neutralization, the opposition between two phonemes is said to be suspended in a particular phonetic environment. The voicing opposition in obstruents, for instance, is suspended in a number of languages word-or syllable-finally, where characteristically only the voiceless member of a pair has traditionally been said to occur. A number of instrumental studies, however, have claimed that in morphophonologically
462
F. Nolan
distinct forms (such as German Rad and Rat) residual phonetic cues to the underlying voicing opposition persist even in the position of neutralization. These cues may comprise factors such as the duration of the obstruent, or the duration of the preceding vowel. Findings of this general type are reported, for instance, by Charles-Luce (1985) for German, Chen (1970) for Russian, and Slowiaczek & Dinnsen (1985) for Polish. 5 The status of any residual cues for native listeners is not clear-cut. German listeners in Port & Crawford (1989) achieved better than chance (50%) identification--over 70% in the case of tokens dictated in a disambiguation task. But for the production condition "most similar to every day use of language", they suggest (p. 272) that the residual difference is " almost certainly ... not useful" perceptually. For Polish, Slowiaczek & Szymanska (1989) also obtained around 60% identification on tokens chosen to exhibit the clearest acoustic differences , but conclude that while the Polish neutralization rule "has been questioned based on production studies . . . the integrity of the rule appears to be maintained in perception" (pp. 211-212). It seems then that listeners are likely to cope much less well, if at all, with these residual cues than with the cues to straightforward phonological oppositions. (iii) Phonological assimilation. In connected speech, the identity of a segment can be lost when it becomes more like an adjacent one; as for instance when (assuming identical prosodic patterns) the realization of The road collapsed becomes identical to that of The rogue collapsed. Nolan (in press) reports experiments using electro-palatography (EPG) and listening tests. 
It emerged from controlled experimental data that even for tokens where, on the EPG traces, there was no evidence of any gesture towards the alveolar region, listeners could still tell significantly better than chance (66% correct) which sentence was which in a sentence minimal pair like the above. That was when the sentences were presented as a pair, with a one-second interval. It suggests that even when the alveolar gesture was lost, the speaker was still encoding the underlying phonological distinction, perhaps in the tongue shape for the preceding vowel. However, in an identification task in which each token was presented in isolation, so that no short-term comparison was possible (a task with more relevance to everyday speech perception), it appeared that listeners were not able to make use of residual cues in the absence of any alveolar gesture.

(iv) Prosodic neutralization. A prosodic opposition may be neutralized when placed at a different point in a larger prosodic unit. Thus compound-noun vs. noun-phrase pairs like 'GREENhouse and 'green 'HOUSE, which are clearly cued by pitch and duration when occurring as nuclear and pre-nuclear accents, lose their clear pitch contrast in post-nuclear position in the tone unit (e.g. I 'know very 'WELL it's a-). Faure, Hirst & Chafcouloff (1980) report that listeners were not able to identify members of such stress pairs in post-nuclear position significantly better than chance. Hirst (1983: fn. 2) further reports that the speakers in Faure et al. had nevertheless produced significant durational differences, the noun phrases being on average 20% longer.

If the phenomena under (f) prove to be reliable, they suggest the need for distinct phonetic representations for producer and perceiver. The producer's phonetic plan seems to be influenced by underlying structure in ways which undercut the narrowest of phonetic categories.6 Some at least of this information is missed by perceivers. Keating's (1990) symmetrical view that "Speakers ... use the categorical representation to arrive at an articulatory one, while listeners ... use an acoustic representation to arrive at the categorical one [emphasis added]" (p. 323) may be too neat.

5 The general applicability of some such findings has, however, been questioned, on the grounds that they may be an artefact caused by orthographic distinctions. Fourakis & Iverson (1984), whose subjects produced German strong verb paradigms from memory, found no such residual cues; nor did Jassem & Richter (1989), whose Polish subjects' productions were elicited in a dialogue.
6 A point which supports Pierrehumbert's scepticism about strictly segmented, categorial phonetic representations, but not, in my view, about the cognitive status of some sort of phonetic representation.

References
Browman, C. & Goldstein, L. (1990) Gestural specification using dynamically-defined articulatory structures, Journal of Phonetics, 18, 299-320.
Charles-Luce, J. (1985) Word-final devoicing in German: effects of phonetic and sentential contexts, Journal of Phonetics, 13, 309-324.
Chen, M. (1970) Vowel length variation as a function of the voicing of the consonantal environment, Phonetica, 22, 129-159.
Chomsky, N. & Halle, M. (1968) The sound pattern of English. New York: Harper and Row.
Costa, P. J. & Mattingly, I. G. (1981) Production and perception of phonetic contrast during phonetic change, Haskins Laboratories Status Report on Speech Research, SR67/68, 191-196.
Dinnsen, D. A. (1985) A re-examination of phonological neutralization, Journal of Linguistics, 21, 265-279.
Faure, G., Hirst, D. & Chafcouloff, M. (1980) Rhythm in English: isochronism, pitch and perceived stress. In The melody of language (L. Waugh & C. van Schooneveld, editors). Baltimore: University Park Press.
Fourakis, M. & Iverson, G. K. (1984) On the 'incomplete neutralisation' of German final obstruents, Phonetica, 41, 140-149.
Hirst, D. (1983) Structures and categories in prosodic representations. In Prosody: models and measurements (A. Cutler & D. R. Ladd, editors). Berlin: Springer.
Jakobson, R., Fant, G. & Halle, M. (1952) Preliminaries to speech analysis. Cambridge, Mass.: MIT Press.
Jassem, W. & Richter, L. (1989) Neutralization of voicing in Polish obstruents, Journal of Phonetics, 17, 317-325.
Keating, P. A. (1990) Phonetic representations in a generative grammar, Journal of Phonetics, 18, 321-334.
Kerswill, P. E. (1987) Levels of linguistic variation in Durham, Journal of Linguistics, 23, 25-49.
Ladefoged, P. (1990) Some reflections on the IPA, Journal of Phonetics, 18, 335-346.
Liberman, A. M., Cooper, F. S., Shankweiler, D. & Studdert-Kennedy, M. (1967) Perception of the speech code, Psychological Review, 74(6), 431-461.
Liberman, A. M. & Mattingly, I. G. (1985) The motor theory of speech perception revised, Cognition, 21, 1-36.
Nearey, T. (1990) The segment as a unit of perception, Journal of Phonetics, 18, 347-373.
Nolan, F. (1983) The phonetic bases of speaker recognition. Cambridge: Cambridge University Press.
Nolan, F. (1986) The implications of partial assimilation and incomplete neutralisation, Cambridge Papers in Phonetics and Experimental Linguistics, 5.
Nolan, F. (in press) The descriptive role of segments: evidence from assimilation. To appear in Papers in Laboratory Phonology II (G. Docherty & D. R. Ladd, editors). Cambridge: Cambridge University Press.
Pierrehumbert, J. (1990) Phonological and phonetic representation, Journal of Phonetics, 18, 375-394.
Port, R. & Crawford, P. (1989) Incomplete neutralization and pragmatics in German, Journal of Phonetics, 17, 257-282.
Rischel, J. (1990) What is phonetic representation? Journal of Phonetics, 18, 395-410.
Sapir, E. (1933) La réalité psychologique des phonèmes, Journal de Psychologie Normale et Pathologique, 30, 247-265. English version, "The psychological reality of phonemes", in Selected Writings of Edward Sapir (D. G. Mandelbaum, editor), 1949, Berkeley: University of California Press.
Slowiaczek, L. M. & Dinnsen, D. A. (1985) On the neutralizing status of Polish word-final devoicing, Journal of Phonetics, 13, 325-341.
Slowiaczek, L. M. & Szymanska, H. J. (1989) Perception of word-final devoicing in Polish, Journal of Phonetics, 17, 205-212.
Trudgill, P. (1974) The social differentiation of English in Norwich. Cambridge: Cambridge University Press.