Temporal coordination of consonants in the speech of children: preliminary data


Journal of Phonetics (1973) 1, 181-217

Sarah Hawkins
Department of Linguistics, University of Cambridge

Received May 1973

Abstract:

A comparison was made between the speech of adults and children with respect to modifications in the durations of consonants, according to whether they were in the context of other consonants of various types or of vowels only. Word-initial and -final clusters in English monosyllabic words were examined. Seven children, aged between 4 and 7 years, were recorded in their homes at weekly intervals until the full sample of words had been collected. Oscillograms of the recordings were measured for consonant and some vowel durations. The results were compared with similar adult data in the study by Haggard (1971). The chief conclusion was that mature forms of temporal reorganization of consonants are not acquired in children of this age where pre- or postvocalic /l/ is involved in a cluster, and probably also where /s/ occurs. Very few convincing age trends were found within the range of age examined. Ways are discussed of delimiting the generality of the observed differences, and if possible of determining the particular aspects of the articulations which cause the difficulty.

Introduction

This paper describes an initial investigation into certain aspects of the speech of children aged between 4 and 7 years. The acoustic durations of consonant segments were compared when produced word-initially and -finally, either clustered with one or more other consonants or in the context of vowels only.

There has been considerable interest recently in temporal constraints in speech, connected very largely with attempts to identify the units of speech production and to derive or distinguish between appropriate descriptive models (Chistovich & Kozhevnikov, 1965; Allen, 1969, 1972a; Bradshaw, 1970; Lehiste, 1970a, b, 1972; Ohala, 1970, 1972). Conclusions generally have been based upon correlations between segment durations, or upon differences in relative variance between the relevant segments' distributions. Negative correlations and sequences whose total variance is less than the sum of the component variances have been interpreted as evidence of integration of the components into a higher order unit. These studies have all identified some interesting aspects of the general problem, but most of the relevant questions remain unanswered, not least because of the difficulty of applying appropriate techniques of analysis to the data obtained.

Methodological considerations include criteria for the segmentation of the acoustic waveform to correspond maximally to the inferred articulatory changes, discussed in detail by Naeser (1969), Peterson & Lehiste (1960) and Lisker (1972). Problems arise in identifying which of the observed effects are due to aerodynamic and physiologically conditioned constraints, and in the choice of meaningful statistics to apply to the measured durations. This latter point has involved some difficulty and controversy which is far from resolved. One of the chief problems is that speaking rate tends to vary quite considerably over repeated utterances of the same word or phrase. This will of course affect the size of segment variances, and correlations between segments will be correspondingly reduced. Various methods for removing such effects have been applied, particularly in the work by Allen, Lehiste and Ohala, quoted above. On the whole, these normalization procedures are not satisfactory, so that conclusions based upon data affected by them are of dubious value. Quite detailed criticisms have been made by Ohala (1970, Chapter 3), and by Gregorski & Shockey (1972).

A further problem is the interpretation of the results. For instance, it has been assumed that negative correlations between adjacent segments indicate their organization into a superordinate unit of timing, and that no correlation, or a positive one, implies an absence of such structuring. Haggard (1971) points out, however, that while the first two assumptions may be correct, a positive correlation does not necessarily mean the segments are independent in duration. They may both undergo identical changes according to some higher level plan. Moreover, the status of the coefficient is thrown into some doubt by the frequent presence of negative correlations of an equal degree of significance but between apparently quite unrelated nonadjacent segments (Gregorski & Shockey, 1972).

On a more theoretical level, it is quite reasonable to assume there is more than one "unit of production" in speech, even when confining oneself to phonological rather than to semantic-syntactic considerations.
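The variance and correlation criteria discussed above can be made concrete with a small numerical sketch. The durations below are invented for illustration and are not data from any of the studies cited; the function names are likewise hypothetical. In the first pair the two segments compensate for one another, so their correlation is negative and the variance of the summed duration is smaller than the sum of the separate variances. In the second pair both segments simply scale with a common speaking-rate factor, which gives a positive correlation without any compensation between them.

```python
from statistics import mean, pvariance


def covariance(xs, ys):
    """Population covariance of two equal-length lists of durations."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def correlation(xs, ys):
    """Pearson correlation coefficient of two lists of durations."""
    return covariance(xs, ys) / (pvariance(xs) * pvariance(ys)) ** 0.5


def variance_of_sum(xs, ys):
    """Variance of the total duration of the two-segment sequence."""
    return pvariance([x + y for x, y in zip(xs, ys)])


# Invented durations in ms: the second segment compensates for the first,
# so the sequence total is constant.
c1 = [100.0, 110.0, 90.0, 105.0, 95.0]
c2 = [80.0, 70.0, 90.0, 75.0, 85.0]

# Invented durations in ms: both segments scale with a shared rate factor.
rates = [0.9, 1.0, 1.1, 0.95, 1.05]
a_rate = [100.0 * r for r in rates]
b_rate = [80.0 * r for r in rates]

print(correlation(c1, c2))                 # negative
print(variance_of_sum(c1, c2), pvariance(c1) + pvariance(c2))
print(correlation(a_rate, b_rate))         # positive
```

Under these assumptions the rate-scaled pair shows why uncontrolled variation in speaking rate pushes correlations in the positive direction and can mask compensation of the first kind.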
The syllable is commonly taken to be a basic unit (Chistovich & Kozhevnikov, 1965), which has been claimed to be more fundamental to production than smaller elements (Bondarko, 1969). Although the CV syllable seems to be less fundamental in at least English and Dutch, evidence that some form of syllable does form a major unit comes from slips of the tongue (Boomer & Laver, 1968; Fromkin, 1971; MacKay, 1969, 1972; Nooteboom, 1967, 1969); from coarticulation studies of various types (e.g. MacNeilage & deClerk, 1969; Lehiste, 1970b and Öhman, 1966; but see Allen, 1969 and 1972a, especially p. 189, and Ohala, 1970); and from perception studies (e.g. Huggins, 1968, 1972). However, there is also clear evidence from similar data of the importance of units both larger and smaller than the syllable. Fromkin (1971), summarizing the data on slips of the tongue, presents evidence for units of phonemic segments, and (more rarely) distinctive features of phones, as well as for words and syntactic/semantic units of syllable length or longer. [It is interesting to note that, in short term memory studies, distinctive features are more frequently confused than they seem to be from Fromkin's data (Wickelgren, 1965a, b, 1966).] Fromkin & MacKay (1972) present similar evidence for the independence in production of morphemic units, for instance of inflection. This is supported in studies of temporal coarticulation between segments (Lehiste, 1970b; Haggard, 1971) and in perceptual studies (e.g. Gibson & Guinet, 1971). The task of determining the interrelationships and relative importance to production and perception of these different units will be an intriguing one. But while understanding is still rudimentary, it remains difficult to determine which units should be studied, and in which particular way, for evidence of the nature of their cohesion.
It is probable that units at different levels will not all be organized in the same way, and there is certainly no a priori reason why instances of apparently the "same" unit-type should not prove to be organized differently. [Cf. Fromkin's (1971) conclusions on the relative independence of distinctive features.] This notion was explicitly incorporated by Haggard (1971) into a tentative model for the control of the relative durations of consonants in clusters. He distinguished between homorganic and nonhomorganic clusters, subdividing the former into compatible and incompatible sequences depending upon the extent to which the articulator has to change its position for two successive segments. A further distinction was made between sequences in which the articulation of the first consonant (C1) was relatively more open than the second (C2), and vice versa. Using these dimensions, and assuming that central commands govern articulation onset, the time of offset being determined by local feedback of the time spent in the target area, Haggard was able to derive a model which described his data reasonably satisfactorily.

This built-in flexibility of approach appears on a priori grounds to be very promising. But Haggard's formulation itself contains a number of inadequacies and he has not pursued the model. These largely stem from an attempt to keep the parameters involved markedly fewer and less complex than it seems they must be, while attempting to make very specific predictions about durational relationships in various clusters. The assumptions are based only upon considerations of the place of articulation of successive gestures, with no concern for manner differences except insofar as these influence duration through aerodynamic and physiological forces. But in some languages, including English, manner of articulation can influence the duration of at least preceding vowels (Denes, 1955; House & Fairbanks, 1953; Naeser, 1970a; Peterson & Lehiste, 1960). Slis (1971, 1972) has presented evidence for Dutch that the commands for articulations associated with greater articulatory effort (e.g. tense vs. lax consonants, such as /p/ vs. /b/) take place at the same rate but occur earlier in time in nonsense words than those associated with less effort, thus reducing the duration of the preceding vowel.
But Fujimura (1961) interprets similar English data as indicating that a faster rate of articulation is involved. Lisker (1972, pp. 168-169) quotes yet other interpretations of this same phenomenon. Whatever the facts prove to be, it would seem to be essential to try to incorporate this type of effect explicitly into a model of temporal coarticulation. On a more general level, Lehiste (1972) has evidence for her hypothesis that timing patterns are connected with major changes in the manner of articulation; resonants (/m, n, ŋ, r, l/) are not thought to constitute such major changes, being fused with the vowel in this sense. The data, from monosyllabic English words, are quite convincing for long vowels when only one pre- or postvocalic resonant is involved, although some inconsistencies for short vowels in other than voiced environments have yet to be explained. Should Lehiste's hypothesis prove to be generally useful, then attempts to treat consonant cluster modifications independently of the syllabic nucleus could perhaps be seriously misleading. The two approaches are not necessarily incompatible though, for Lehiste is concerned with somewhat larger units than Haggard.

There is no allowance made in Haggard's model for the components of the gesture for a single phone to be controlled by at least partially independent channels. For example, in a nonhomorganic cluster, particularly a closed-open one such as /pr/, the major components of the C2 gesture may be formed with or during the C1 articulation. But, in the case of /pr/, the tongue tip will not be raised to execute the complete /r/ until after the /p/ release. What governs this comparatively delayed tongue-tip movement is a moot question. But whether it is some form of context-sensitive inhibition, the relative delay of a separate or semi-independent onset command channel, or some other factor, it seems mistaken to try to treat the /r/ gesture in this context as an entity.
Similar arguments can be applied to the treatment of some homorganic clusters. For example, Haggard's model fails to allow for compromise articulations substituting for incompatibly homorganic sequences, as in the affrication of /dr/ clusters.

Despite these criticisms, however, Haggard has made some useful distinctions between sequence types which are highly likely to be relevant to the degree of temporal coordination one could reasonably expect to find between segments. He has also based his work on a relatively large body of data from adult speech. Moreover, the main measurements he made, those of mean duration and the statistics resulting, are largely free from the methodological pitfalls mentioned earlier of such measures as relative variance and the correlation coefficient. This, combined with the use of British English speakers, makes his work an appropriate starting point for the study of similar parameters in the speech of English children. In this study, then, we set out to compare principally the modifications in mean duration of consonant segments in clusters and in the contexts of vowels only, along the same lines as Haggard, although by following his methods fairly closely we are not necessarily endorsing his theoretical model.

Even though the parameters are not yet fully understood for adults, it is relevant to examine temporal coordination in the speech of children, for it might help to indicate in what ways speech gestures become integrated into their adult form. One could reasonably expect either a greater stereotypy or a greater variability in the synchronization of gestures than one finds with adults. Qualitative as well as quantitative differences may well appear. These may throw light on whether the constraints on production are the same in children as in adults, with a gradual reduction in phone duration as the chief characteristic of increasing maturity, or whether there are also changes in the constraints themselves.

In order to make comparison with Haggard's data possible, the choice of clusters examined was made from those he studied, and they were elicited in a list form rather than in a more natural phrase context.
The list provides stressed syllables with a marked tendency for phone elongation, particularly of the final one (e.g. Mattingly, 1968), so that a standard situation is set up in which the absolute magnitude of context effects may be exaggerated. Despite the objections discussed above, sequences such as initial /dr/ were included in the material, since we were interested not so much in the predictions of the model as in a direct comparison of the data of Haggard's adults and the children. Moreover, there is good preliminary evidence that this particular type of sequence, at least, shows some developmental trends in very young children which it would be interesting to follow through to older children. This is mentioned in more detail in the discussion and is the work of Menyuk & Klatt (1968), Menyuk (1971) and Kornfeld (1971).

It should be noted, though, that although there are considerable similarities with the methods and materials of Haggard's experiment, the data for the present investigation were not collected in quite the same way, so they may not in fact be completely comparable with Haggard's. Firstly, to facilitate measurements of initial segments, each word was preceded by an unstressed /ə/. Haggard's words were completely isolated. Secondly, when each word was produced, it was repeated three times, since otherwise the time needed to collect an adequate sample of each cluster with the technique described in the method would have been prohibitively long. The children were encouraged to pause between each repetition, but did not always do so. Besides reducing the comparability with Haggard's data, this procedure has the disadvantage of introducing an undesirable rhythm into the speech. Checks were made for this in the measured data (and impressionistically beforehand) and clearly affected items were discarded, usually the last of a group of three words.
But in general there was as much variability between items of the same ordinal position of utterance as between those in different positions. In addition, of course, since all measurements are relative and this procedure applied to all words, differences dependent upon such an artifact should be cancelled out.

The terminology "abbreviated by" is a convenient way of referring to contrasts between conditions. It is not meant to imply necessarily any causal direction in what follows. The following phonetic symbols are used:

/ɪ/ as in hid
/a:/ as in hard
/aɪ/ as in hide
/i/ as in heed
/e/ as in head
/eɪ/ as in hate
/a/ as in had
/l̥/ = voiceless aspirated /l/

Method

Subjects

The children studied were obtained through private contacts. Table I gives the age, sex and the period of first testing of each one.

Table I

                                              Recordings
Name  Sex  Age at testing  Date of birth     First               Last
SW    m    7               November 1964     22 November 1971    7 February 1972
SVW   f    6               September 1965    4 January 1972      20 April 1972
SL    f    5               May 1966          17 January 1972     6 March 1972
NB    m    5               July 1966         8 November 1971     14 December 1971
AG    m    4               April 1967        16 November 1971    2 February 1972
CB    f    4               May 1967          23 November 1971    12 February 1972
MW    m    4               October 1967      15 November 1971    24 January 1972

The parents of these children all had either academic or professional jobs. SW and MW were brothers. The children lived in three parts of the town, and attended five different schools between them. All had a fair degree of experience outside the house.

Materials

Five groups of words were drawn up. Three groups involved word-initial two-consonant clusters; there was one group of initial three-consonant clusters; two groups involved word-final two-consonant clusters. In addition, the phrases which Haggard (1971) had used in his second experiment, elf a flit and else a slit, were included as a separate group. Each cluster was composed of a consonant common to the group (the principal consonant) and one (or two) adjacent consonants. Within a group, the adjacent consonants contrasted in manner or place with the principal consonant, so that one homorganic and one nonhomorganic cluster were formed. In all postvocalic clusters, the first member of the cluster, C1, was relatively more open in articulation than the second member, C2. The reverse relationship held between the elements of the prevocalic two-consonant clusters, i.e. closed-open. The three-consonant clusters were both open-closed-open.

Fifteen sublists were composed, of five words each, or six for the three-consonant group. In each sublist, two words contained a consonant cluster from the same one of the five groups of words to be studied. The other words contained either the principal or an adjacent consonant alone, in the same word-initial or final position as it occurred in the cluster. All words in a sublist contained the same vowel. The number of vowels used in a given group varied from 1 to 5, to a large extent depending on the possibility of making complete subgroups of words familiar to children. This arrangement made it possible to compare the duration of each consonant in a cluster with its duration when alone, and also to compare the effect of accompanying consonants on the durations of the principal ones.

In order to speed the collection of data and to make the child's task less monotonous, it was decided to let the consonant at the non-critical end of a word type vary. For instance, the initial /sl/ cluster with vowel /ɪ/ includes the following words: slit, slip, slim, sling, slink; examples of any or all of these can contribute to the /sl/ sample of a given child. This is a weak aspect of the design, for it affects vowel length and may affect non-adjacent segments too. As far as possible, these effects have been isolated and dealt with separately in the analysis, but it is probably not a necessary procedure for keeping the child interested, and should not be included in future work. The complete list of words used is given in the Appendix.

No attempt was made to control for part of speech, but all words were single morphemes. Lehiste (1970a, b) and Haggard (1971) found morpheme boundaries can override any abbreviation effect which would otherwise be present in a cluster. As far as possible, the words selected were all fairly familiar to young children and occurred frequently in story books. It was occasionally necessary to include unfamiliar words, however. Where this happened, special care was taken in explaining the meaning to the child and he was given practice until he could say the word fluently, before it was recorded.

Collecting the data

The children were all visited once weekly in their homes, except for NB who was visited twice weekly. They were read stories. At suitable stopping places, usually after every page or so, they were asked to repeat three times any relevant words which had occurred in the text or pictures. (Some children often gave more than three repetitions of a word, and this was not discouraged.) Each word was to be preceded by /ə/, to allow measurement of initial stops. The only exception to this was the word else, which the children tended to volunteer as /əʔels/. They easily learned to precede all but else in this way.
This general procedure was changed during the last few sessions for each child, to one where the words requested after each page were not necessarily related to the story, but were a randomized list of all words for which the data were not yet complete. This was because the less common words occurred either rarely or not at all in children's stories. They were still collected in blocks of three to five, as before. A given word was not asked for again once at least 36 acceptable utterances had been obtained. This number was chosen to allow for up to 6 items to be unmeasurable from the oscillograms even though they had sounded "normal" when they were recorded, while still leaving a sample of at least 30 utterances per word type.

The order of saying words in all but the last few sessions was thus not fully randomized, and samples of the most common words were usually complete several sessions before the final one. However, in practice the order was fairly random, for even though the text constrained the particular words asked for, the individual child's comments, general interests and degree of tiredness determined the number asked for and their order. When two or more words tended always to be asked for together, e.g. leap and spring, then the order of requesting was always varied within a session. Moreover, since the child's utterances of two different words were always separated at least by the experimenter asking for the next one, the production of a given word is unlikely to have been influenced unduly by the preceding word. Utterances which were clearly affected by position within the group of three repetitions were excluded from the analysis; this was usually the last one, which would be spoken very slowly. Variations in subjectively perceived rate and loudness were controlled as much as possible within and between sessions. Each child selected his own preferred rate of speaking over all sessions.

Session length varied with the child's behaviour. An average session was about an hour long, during which time different children would produce between 300 and 500 utterances on average. Data were discarded from the first session and also from the second unless the child was confident with the routine.

Recordings were made on a Uher 4000 Report-L portable tape recorder. The tape speed was 7½ in/s. Until 24 January, the standard M514 microphone supplied with the Uher was used, but after this date an AKG lavalier microphone was used (D109). It was hung round the neck and also clipped to the clothing when possible. It was not possible to keep it a constant distance from the child's mouth.

Measurement

The recordings obtained were played back at half speed on a Robuk tape recorder into a Siemens Oscillomink, which had a paper speed of 20 cm/s. Tracings were read to the nearest millimeter, with all measures of 0·5 mm being assigned to the next higher value. The measurement precision was of the order of 2·5 ms, but in voiced sounds the effective precision was of one pitch period, about 3·5 to 4·5 ms. The effective band limit was 2·4 kHz. The second channel of the Oscillomink received the recordings through a high-pass filter blocking frequencies below 2500 Hz. This trace, rectified and integrated, was written out in parallel with the unfiltered signal, and assisted particularly in measuring fricatives. This is generally known as the duplex oscillogram technique. Figure 1 illustrates the way this equipment was set up.

Figure 1. Circuit diagram of the high pass filter.
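The stated measurement precision of about 2·5 ms follows from the playback and paper speeds given in the Measurement section, and the arithmetic can be sketched as below. The function and variable names are hypothetical and introduced only for illustration; the sketch assumes that the time scale of the trace is set solely by the paper speed and the half-speed playback.

```python
def ms_per_mm(paper_speed_cm_s, playback_ratio):
    """Real-time milliseconds represented by 1 mm of oscillogram trace.

    playback_ratio is playback speed divided by recording speed
    (0.5 for half-speed playback).
    """
    paper_speed_mm_s = paper_speed_cm_s * 10.0
    # At half speed, one second of playback covers only half a second of speech,
    # so each second of real speech is spread over more paper.
    mm_per_real_second = paper_speed_mm_s / playback_ratio
    return 1000.0 / mm_per_real_second


def duration_ms(trace_length_mm, paper_speed_cm_s=20.0, playback_ratio=0.5):
    """Convert a trace length read from the oscillogram into a duration in ms."""
    return trace_length_mm * ms_per_mm(paper_speed_cm_s, playback_ratio)


def f0_from_period(period_ms):
    """Fundamental frequency in Hz corresponding to a pitch period in ms."""
    return 1000.0 / period_ms


resolution = ms_per_mm(20.0, 0.5)
print(resolution)                 # 2.5 ms per mm
print(duration_ms(40))            # a 40 mm segment corresponds to 100.0 ms
print(f0_from_period(3.5), f0_from_period(4.5))
```

On this reading, the pitch periods of 3·5 to 4·5 ms quoted for voiced sounds correspond to fundamental frequencies of roughly 220 to 290 Hz.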

The identification of a segment boundary is usually relatively simple, since it is reflected acoustically by a change of excitation type or an abrupt change of articulatory closure. Particular difficulty is experienced with the division between a vowel and following /l/, and frequently with the termination of prevocalic /r/, especially with child speakers. Postvocalic /l/ was therefore measured both by itself and combined with the preceding vowel, partly in an attempt to reduce measurement error, and partly to see if vowel+/l/ tended to undergo the same duration modifications in and out of clustered contexts as /l/ alone.

Similar reasoning influenced the decision to measure voiceless stops in two ways: from the transition off the preceding vocalic segment (stop T), and from where closure of the tongue or lips was judged to be complete (no periodic or aperiodic excitation in the oscillogram; stop C). The closure period is a more satisfactory measure than the stop T segment, from the point of view of comparison with measures of stops following unvoiced fricatives. But with children in particular, the point of closure is very ill-defined, and in fact is very frequently not identifiable, in that some noise may be clearly present throughout the acoustic signal. A more reliable segmentation point, and therefore one which is also of interest in its own right, is the point of transition off the preceding segment.

Alternative measures were also used for the segmentation of prevocalic /l/ from a preceding fricative. The frequency response of the oscillograph is such that the level of aspiration of a voiceless semi-vowel is very similar to the level of friction of a preceding fricative. However, with some subjects, the division in an /sl/ sequence is signalled by a sharp burst in the trace where the tongue sides go down to form the /l/. In an /fl/ sequence, there may be a fairly marked change in amplitude of the aperiodic noise associated with the change to the /l/ articulation. The former is more easily identifiable than the latter, and subjects vary in the extent to which either is present in their speech records. Wherever possible, these more refined measures were used with all /fl, sl/ clusters, along with the simpler but less accurate assignment of all aperiodic excitation to the fricative, and all nonaperiodic sections, including any silence, to the /l/. The measurement of /w/ after /s/ was usually simpler, because although there is a period of voiceless /w/, it is generally not accompanied by aspiration of anything like the amplitude of an /s/, so that segmentation is less ambiguous and more similar to the refined measures in the clusters with prevocalic /l/.

Results

The results for each subject are kept separate to help in the identification of more than one pattern of behaviour, should these exist. Results are also recorded separately where more than one vowel was used with a given consonant cluster, since in several cases there are quite marked differences in the modifications observed according to the vowel used. It has already been noted that most samples of a particular type of sequence consisted of more than one word. In most cases the duration of the relevant segments was not affected by this. Where durations were affected, measurements are shown for the sample as a whole, or where subsamples were widely unequal in size, for the largest subsample.

The lowest n for any total sample was intended to be 30. In general, sample sizes varied between about 25 and 45 tokens. There were many cases where a word used in a particular list also contained a segment appropriate to another list with the same vowel. Where the means and ranges of the two distributions were similar, measures for this segment were included in the main sample. This greatly increased some samples, notably of final unclustered /t/, sometimes to over 100 tokens. Differences between means were tested for significance using Student's t test. All significance levels noted for the means are for two-tailed tests.

Notes to the Tables

(1) Significance levels are indicated by superscripts according to the following key:

t tests, 2-tailed
p             0·05    0·02    0·01    0·002    ≈0·001
superscript   1       1a      2       3        4

(2) A bracketed superscript indicates the value approaches but does not quite attain the particular significance level.
(3) Unless indicated, a superscript attached to a mean duration indicates the significance of the difference between that segment and the corresponding unclustered segment.
(4) E beside a value indicates that it is longer than the unclustered segment (elongated). (E) indicates an elongation which is statistically nonsignificant but fairly large in absolute terms.
(5) ns = not significant.
(6) Subjects are listed throughout in order of age.
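As an illustration of the kind of comparison summarized by these superscripts, the sketch below computes a two-sample Student's t statistic. It assumes the classical pooled-variance form of the test, which the text does not specify, and the token durations are invented for the example rather than taken from the recordings. The critical value of 2·306 is the standard tabled two-tailed 0·05 value for 8 degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance


def student_t(sample_a, sample_b):
    """Pooled-variance t statistic and degrees of freedom for two independent samples."""
    na, nb = len(sample_a), len(sample_b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(sample_a) + (nb - 1) * variance(sample_b)) / df
    standard_error = sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return (mean(sample_a) - mean(sample_b)) / standard_error, df


# Invented token durations in ms (hypothetical, for illustration only).
alone_ms = [120.0, 125.0, 130.0, 135.0, 140.0]
cluster_ms = [70.0, 75.0, 80.0, 85.0, 90.0]

t_value, df = student_t(alone_ms, cluster_ms)
CRITICAL_T_05_DF8 = 2.306  # tabled two-tailed 0.05 critical value for df = 8

print(t_value, df)
print(abs(t_value) > CRITICAL_T_05_DF8)  # True: significant at the 0.05 level
```

With the sample sizes of 25 to 45 tokens reported above, the degrees of freedom would be correspondingly larger and the critical values smaller.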

Table II. Mean durations in ms of prevocalic /l/ preceded by /f, s/ (unrefined measures)

                                        Vowel
Subject  Context      /ɪ/         /i/        /e/        /a/         /aɪ/
SW**     /l/ alone    102·67      120·73     102·92     110·82      123·09
         after /f/    201·43⁵                199·00⁵    *236·11⁵    340·35⁵
         after /s/    191·76⁵     263·85⁵    196·89⁵    230·06⁵     336·25⁵
SVW      /l/ alone    122·16      121·81     111·07     103·43      108·68
         after /f/    59·53⁵                 62·89⁵     63·75⁵      61·18⁵
         after /s/    71·22⁵      63·22⁵     69·93⁵     59·29⁵      61·40⁵
SL       /l/ alone    140·78      139·00     127·75     127·07      139·94
         after /f/    72·89⁵                 70·53⁵     56·55⁵      67·37⁵
         after /s/    75·46⁵      72·50⁵     74·26⁵     73·27⁵      82·69⁵
NB       /l/ alone    123·68      127·46     134·57     122·91      131·80
         after /f/    81·70⁵                 83·00⁵     78·89⁵      74·68⁵
         after /s/    102·56⁽⁵⁾   83·15⁵     98·51⁵     91·06⁵      99·55⁵
AG       /l/ alone    138·58      120·51     120·92     127·28      122·18
         after /f/    92·6?⁵                 84·32⁵     78·04⁵      72·44⁵
         after /s/    83·93⁵      78·06⁴     87·39⁵     87·37⁵      87·91⁵
CB       /l/ alone    110·49      121·00     115·97     118·64      121·69
         after /f/    63·99⁵                 65·56⁵     66·79⁵      58·26⁵
         after /s/    81·77⁵      79·36⁵     50·94⁵     66·44⁵      56·75⁵
MW       /l/ alone    113·05      117·20     140·32     116·79      144·15
         after /f/    49·14⁵                 58·47⁵     56·92⁵      63·83⁵
         after /s/    73·02⁵      83·04⁵     66·97⁵     62·79⁵      73·29⁵

*Mean for total sample (see text, p. 188). Means for subsamples were as follows: /l/ after /f/ in flag, 224·88 (P < 0·001 vs. /l/ in isolation); in flash and flat, 251·83 (P ≈ 0·001 vs. /l/ in isolation); n = 21 for flag; n = 15 for flash + flat. Difference between the two subsamples significant (P < 0·05).
**These measures for SW are not meaningful as /l/ in isolation was measured without the following vowel, as with all subjects, but in the clusters it was measured together with the vowel.
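The relation between the subsample means in this footnote and the starred total-sample mean of 236·11 in Table II can be checked with a short calculation. The sketch assumes that the total-sample mean is the token-weighted mean of the subsample means; the function name is hypothetical. The same check is applied to the subsample figures given in the footnote to Table III (138·13 with n = 20 and 147·63 with n = 19, against a starred mean of 142·76).

```python
def combined_mean(subsamples):
    """Token-weighted mean of several subsamples given as (mean, n) pairs."""
    total_n = sum(n for _, n in subsamples)
    return sum(m * n for m, n in subsamples) / total_n


# Table II footnote: /l/ after /f/ for SW, vowel /a/.
table2_total = combined_mean([(224.88, 21), (251.83, 15)])

# Table III footnote: /f/ before /l/ for MW, vowel /a/.
table3_total = combined_mean([(138.13, 20), (147.63, 19)])

print(round(table2_total, 2))  # 236.11
print(round(table3_total, 2))  # 142.76
```

Both starred values are recovered to two decimal places, which is consistent with the weighted-mean assumption.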

Differences in mean segment duration

List 1: Prevocalic /l, w/ preceded by /f, s/. The mean durations of these segments are summarized in Tables II-VIII for unrefined and refined measurements respectively. With the unrefined measures, /l/ was strongly abbreviated after both /f/ and /s/. In general, the degree of abbreviation did not differ significantly between the two clusters. Where there is a significant difference, the /l/ is always longer after the homorganic /s/ than after /f/ except in one case (CB, with vowel /e/), and the general trend is clearly in this direction. Abbreviation was much less evident with the refined measures. The slight tendency for longer durations in the clusters (Table VI) should be treated with caution as it is strongest for AG, whose data were the most difficult to measure in this respect and so have only small ns, with most likelihood of measurement error. There are fewer significant differences between the duration of /l/ after /f/ compared with /s/ in the refined measures: six instances as opposed to ten in the unrefined. Four of these six are also significant in the unrefined measures and the greater length in the homorganic cluster is maintained in these cases. But this homorganic-nonhomorganic difference is not consistent in the refined measures, with no statistically significant difference between them. The modifications in /l/ duration are probably similar to those Haggard found for adults,


Table III Mean durations in ms of /f/ followed by prevocalic /l/ (unrefined measures)

                                    Vowel
                   /ɪ/           /e/           /a/           /aɪ/
SW   /f/ alone     166·84        161·10        150·66        154·27
     before /l/    191·61⁵E      169·50        148·68        160·07(E)
SVW  /f/ alone     173·17        167·84        159·21        153·88
     before /l/    199·73⁵E      188·2?⁴E      174·41²E      187·15⁽⁵⁾E
SL   /f/ alone     175·37        173·77        167·30        178·00
     before /l/    193·40⁴E      201·58⁽⁵⁾E    190·54⁴E      177·82
NB   /f/ alone     122·35        116·99        114·93        105·07
     before /l/    136·76²E      119·94        115·56        113·39(E)
AG   /f/ alone     204·72        166·71        148·81        171·91
     before /l/    261·92⁵E      240·68⁵E      169·80²E      191·47⁽¹⁾E
CB   /f/ alone     140·73        130·15        131·79        134·07
     before /l/    151·60¹E      132·15        141·93⁽¹⁾E    145·28(E)
MW   /f/ alone     132·50        146·42        150·28        126·91
     before /l/    134·57        136·32        *142·76       150·39⁴E

*Mean for total sample. Means for subsamples were as follows: /f/ in flag alone: 138·13 (n = 20); in flash + flat + flap: 147·63 (n = 19). Neither mean is significantly different from /f/ in isolation.

at least for the unrefined measures. He had no unclustered comparison for /fl/ or for his refined measures, which were only carried out on the phrases elf a flit and else a slit, rather than on isolated words. Adult data are being collected on this point now, and a more complete comparison will be made later. Haggard's phrase data, however, do show the same tendency for /l/ to be longer after /s/ than /f/, particularly with the unrefined measures.

/w/ was fairly consistently substantially shorter after /s/ than when alone. In general, it was abbreviated slightly less than /l/ following /s/. This direction is consistent with Haggard's findings with adults, but he found a much smaller degree of /w/ abbreviation than with these children.

The fricatives before /l/ in the unrefined measures have a marked tendency to be longer than when alone, the effect being stronger for /s/ (with the exceptions of SVW and MW). The refined measures maintain the relatively longer /s/, but show abbreviation rather than elongation in the cluster. Except with the vowel /ɪ/ in MW's speech and /i/ in SVW's, /s/ was never significantly abbreviated before /w/, and generally tended to be longer than in isolation, sometimes significantly so. In general, /s/ was rather shorter before /w/ than before /l/ with the unrefined measures, CB being the only strong exception. This is strongly reversed when /sw/ is compared with the refined /sl/ measures. The adult data using unrefined measures show it to be clearly abnormal to lengthen /s/ at least before /l/, and probably abnormal to lengthen /f/ too. Although no overall significant abbreviation was found for /s/ before /l, w/ with adults, /s/ was significantly abbreviated before /w/ when this was considered separately.

Considering the clusters as a whole, /sw/ types show most tendency for overall abbreviation relative to the durations of their unclustered components. The /sl/ measures tend to change either towards overall lengthening, or at least less relative abbreviation than occurs in /fl/ or /sw/. The exception to this is MW. For all subjects, the relative durations of the segments of /sw/ clusters are never longer than both /sl/ and /fl/, and are usually the shortest


of the three types. While direction and degree of modification of the components of /fl/ clusters are quite frequently not significantly different from those of /sw/, the components of the homorganic /sl/ clusters almost always differ from both the nonhomorganics, /fl/ and /sw/.

Table IV Mean durations in ms of /s/ followed by prevocalic /l, w/ (unrefined measures)

/I/

sw

/sf alone

175·79 before /1/ 1 { J 92·97 2E before fwf < > 179·24 176·36 **195·422 E 196·362 E

SVW /s/ alone before /1/ before fw/ SL

/sf alone before /1/ before /w/

NB

/s/ alone before /1/ before fwf

AG

/sf alone before /1/ before fw/

CB

2

3

Vowel /e/

/i/

jar/

182·23 189·87 L 175·00

170·48 207 ·93 5 E

165·96 190·063 E

170·11 200·60 5 E

183·37 { 180·86 1 167·07 2

168·22 200·86 5 E

168·97 *195·50 4 E

167·24 203 ·24 5 E

f 1

4

/a/

199·93 { 219·08 4 E 204·35

192·87 208·97 4 E 202·86 1E

188·73 206·824 E

182·50 *200·45 4 E

181·14 *211·25( 5 )£

113 ·75 136·19 2 E 148·47 5 E

131·88 145·19(E) 136·00

115·14 151·43 2 E

120·36 139·522 E

107·70 146·70 5 E

169·15 { 228·14 5 E 203·604 E

190·00 213-54 1E 203·04(E)

203-42 267·50 5 E

173·31 255·59 5 E

191 ·78 235·95 5 E

163-92 { 157·21 173-44

145·84 168·94(5 )£

149·79 166·94 1"E

160·78 167·25(E)

167·03 { 180·33(1)£ ) 155·58

154·16 152·39

179·17 151·03 4

154·52 162·50(E)

/sf alone before /1/ before fwf

165·90 *154·61 157·70

MW fs/ alone before /1/ before fwf

180·53 *150·23 4 151·55 4

1

(

3

*Mean for the total sample. See below for means of the subsamples.
**Mean for the largest subsample (n = 30). See below for means of the total sample and the other subsample.

Measures of /s/ before /l/ which differ according to the word used with a given vowel (ms)

Subject  Vowel   Words                       n     Mean
SVW      /ɪ/     total                       39    188·21(E)
                 slit, slim, sling           30    195·42²E
                 slip, slink                  9    164·17
SVW      /a/     total                       35    195·50⁴E
                 slap                        28    201·07⁽⁵⁾E
                 slant                        7    173·21
SL       /a/     total                       39    200·45⁴E
                 slap                        21    208·81⁴E
                 slant                       18    190·69
SL       /aɪ/    total                       40    211·25⁽⁵⁾E
                 slight, sly, slide          29    208·71⁽⁵⁾E
                 slice                       11    217·96³E
MW       /ɪ/     total                       43    150·23⁴
                 slit, slip                  22    158·98
                 slim, sling, slink, slid    21    141·07⁵
CB       /ɪ/     total                       51    154·61
                 slim, slid, slip            36    159·03
                 slink                       15    144·00¹
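The relation between the total-sample means and the subsample means in the breakdown above can be checked arithmetically: each total should equal the n-weighted mean of its subsamples. The following is a minimal sketch, not part of the original analysis. The function and variable names are illustrative, and the subject labels attached to the /a/ (SVW) and /aɪ/ (SL) blocks are an assumption inferred from the starred entries of Table IV.

```python
# Subsample sizes (n) and mean /s/ durations in ms, copied from the word-by-word
# breakdown accompanying Table IV. The subject labels for the /a/ (SVW) and
# /aɪ/ (SL) blocks are an assumption inferred from the starred totals.
SUBSAMPLES = {
    ("SVW", "ɪ"): [(30, 195.42), (9, 164.17)],
    ("SVW", "a"): [(28, 201.07), (7, 173.21)],
    ("SL", "a"): [(21, 208.81), (18, 190.69)],
    ("SL", "aɪ"): [(29, 208.71), (11, 217.96)],
    ("MW", "ɪ"): [(22, 158.98), (21, 141.07)],
    ("CB", "ɪ"): [(36, 159.03), (15, 144.00)],
}

# Total-sample means as printed in the same breakdown.
REPORTED_TOTALS = {
    ("SVW", "ɪ"): 188.21,
    ("SVW", "a"): 195.50,
    ("SL", "a"): 200.45,
    ("SL", "aɪ"): 211.25,
    ("MW", "ɪ"): 150.23,
    ("CB", "ɪ"): 154.61,
}


def pooled_mean(groups):
    """Weighted mean of subsample means, each weighted by its token count."""
    total_n = sum(n for n, _ in groups)
    return sum(n * mean for n, mean in groups) / total_n


def check_totals(subsamples, reported, tolerance=0.01):
    """Return {key: (pooled, reported, agrees)} for every breakdown."""
    results = {}
    for key, groups in subsamples.items():
        pooled = pooled_mean(groups)
        results[key] = (pooled, reported[key], abs(pooled - reported[key]) <= tolerance)
    return results


if __name__ == "__main__":
    for (subject, vowel), (pooled, rep, ok) in check_totals(SUBSAMPLES, REPORTED_TOTALS).items():
        print(f"{subject} /{vowel}/: pooled {pooled:.2f} ms, reported {rep:.2f} ms, agrees: {ok}")
```

Agreement to within rounding error only shows that the printed totals are consistent with the printed subsample sizes and means; it says nothing about the significance levels attached to them.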


Table V Mean durations in ms of prevocalic /w/ after /s/

s jwj alone after jsj

sw

jwj alone after /s/

svw

/i/

/II

s

/II

/i/

87·00 68·3 33

92·63 50·89 5

AG

119·74 87·74 5

111·45 92 ·94 2

108-45 75·91 5

99·30 53·86 5

CB

122·25 56·965

109·04 59·83 5

MW

114·19 74·75 5

114·86 62·66 5

(5)

jw j alone after jsj

SL

124·21 77·32 5

121 ·21 71 ·43 5

jwj alone after /s/

NB

115·88 65 ·05 5

123·94 68 ·1?5

Table VI Mean durations in ms of prevocalic /l/ preceded by /f, s/ (refined measures)

sw /1/ alone after /f/ after /s/ SVW /1/ alone after /f/ after /s/ SL /1/ alone after /f/ after /s/

NP

/1/ alone

AG

after /f/ /1/ alone after /f/ after jsj

CB

{ 3

MW /1/ alone after /f/ after /s/

/i/

102·67 90·20 2 103·64 122·16 107·404 111 ·25 2

120·73

140·78 123·184 *108·71 5

105·21 1 121 ·81 89·06 5 139·00 94-485

123·68 101·324 138·58 { 140·95 ns 123·21

/1/ alone after /f/ after /s/

/II

5

{

<>

3

110·49 5 79·17 *121·85(E) 113·05 4 { 92·78 106·01

Vowel /e/

/a /

/ai/

102·92 90·29 2 92·50

110·82 90·24 5 105 ·06

111-07 106·90 109·75

103-43 103 ·28 98 ·33 127·07 5 { 84-38 4 109·67 2

123·09 5 { 91 ·90 2 107·85 2 108·68 101 ·25 99·05 1 a

127·75 110·63( 5 ) 111·044

131 ·80 93 ·75 5

unmeasurable 120·51 102·75 1 121 ·00 109·79 117-20 120·81

*Mean of total sample. below : n . SL Vowel /If total 31 slip, slit 13 slim 9 sling, slink, slid 9 CB Vowel /If total 27 slim, slid, slip 19 slink 8

120·92 135·75(E) 128 ·97

139·94 107·30( 5 ) *102·00( 5 )

127·28 114·67 120·00

122·18 142·90 1E 130·71

115 ·97 { 101·25 1 " 78·05 5

118·64 103·08 97·83 4

121·69 95·89 2 90·80( 5 )

140·32 108·98 5 104-41 5

116·79 101-43 4 105·25 1

144·15 111 ·50 4 103·82( 5 )

Means for the individual subsamples are given 108·7JS 105·96( 5 ) 114·44 3 106·94(5 ) 121 ·85(E) 129·85(E) 104·69

n Vowel /al / total 20 102·00(5 ) slight, sly, slide 12 97·08 5 slice 8 109·38 4


Table VII Mean durations in ms of /f/ before prevocalic /l/ (refined)

Vowel /ɪ/

/e/

/a /

jar/

/f/ alone before /1/ SVW /f/ alone before /1/

166·84 157·57

161 · 10 133·025

154·27 135·502

173-17 156·504

167-84 144·85 3

150·66 108·795 159·21 130·78 4

SL

175 ·37 143 ·03( 5 )

173 ·77 158·042

167·30 159·73

178·00 139·80( 5 )

sw

NB

/f/ alone before /1/ jf j alone before /1/

122·35 107·94 1

AG

/f/ alone before /1/ CB jf j alone before /1/ MW jf j alone before /1/

unmeasurable

153-88 147·75

105·07 96·79

204·72 200·83

166 ·71 21 4· 25"'(E)

148·81 130·50 1

171 ·91 144·21 1

140·73 125·46 1• 132·50 91· l1 5

130·15 94·25 4

131·79 108·08 3

146-42 86-48 5

150·28 95 ·00 5

131 ·07 111·96 1" 126·91 103-423

Table VIII Mean durations in ms of /s/ before prevocalic /l/ (refined)

jrj

sw

/i/

Vowel /e/

/a /

jar/

/s/ alone before /1/

175·79 151 ·06( 4 )

182·23 149·17( 5 )

170·48 170·37

165·96 151·99 1

170·11 160·00

SVW /s/ alone before /1/

176·36 *145·90( 5 )

183-37 152·89( 5 )

168·22 159·08

168·97 *156·25 1

SL

199·93 *188·23

192·87 187·41

188·73 170· 15 4

182·50 165 ·87 1•

167·24 166·98 181 ·14 *1 86·25

/s/ alone before /1/

NB

unmeasurable

AG

/s/ alone before /1/

169·15 190·80 1E

190·00 1 68 ·25 1

203-42 226·03 1 £

173·31 199·30 1E

191·78 196·79

CB

/s/ alone before /1/ j sj alone before /1/

165·90 127·13( 5 )

163-92 141·04 1

145-84 136·64

160·78 132·14(5 )

180·53 115·8]5

167·03 141·69(4 )

154·16 114·17 5

149·79 131 ·08 1 " 179·35 ]10·94 5

MW

154·52 132·064

*Mean of the total sample. See below for the means of the subsamples.

Measures of /s/ before /l/ which differ according to the word used with a given vowel, refined (ms)

Subject  Vowel   Words                       n     Mean
SVW      /ɪ/     total                       36    145·90⁽⁵⁾
                 slit, slim, sling           28    150·54⁴
                 slip, slink                  8    129·69⁵
SVW      /a/     total                       30    156·25¹
                 slap (pron. /slap/)         25    162·40
                 slant (pron. /sla:nt/)       5    125·50⁵
SL       /ɪ/     total                       31    188·23
                 slip, slit                  13    180·39²
                 slim                         9    192·78
                 sling, slink, slid           9    195·00
SL       /aɪ/    total                       20    186·25
                 slight, sly, slide          12    190·21(E)
                 slice                        8    180·31


Table IX Mean durations in ms of the components of 3-segment clusters plus /st/

/sf

/p/

PT

/t/

PC

/r/

TT

TC 4

sw

svw

SL

175·79

163 ·18

alone in spring in string in step

{ 182·61E 2 197·63 3 E 195-445 E

a lone in spring in string in step

176·36 155·54 2 163·63 181-40 10 E

157·08

alone in spring in string in step

199·93 170-41 (4 ) 186·72( 2 ) 181 ·63

149·33

4

[

97·73 5

134·27

74·3]5

51·505 55·54

5

I

5

74·80

(5)

56·34 109·73 5 E

124·64

44·08 5

5

67·06 98·95 5 E

108·67 5 sf 42 ·93 5 l 69·77

69·60 4 E 67-68

141 ·39

103·75 4{

61-64

s{

5

I __ _ 5 65·11

78·1P 96 ·86 135·18 58·78 5 89·89 5

77-07 (3)

NB

AG

36·67 97·67 5 E

alone in spring in string

113-75 138·58(4 )£ 151-45 4 E

143·79

alone in spring in string in step

169·15 211·65 4 E 219·844 E 199·53

117·75 195·93 5 142·042 E

5

162·17

55·63 5

5

76-45 2 E

193-44

137·74 { 54·25 5 100·894

125·69 5

119·93 { 70-46 4 4 ( ) 101·84 1

80·92 4 70·47 4

CB

MW

alone in spring in string in step

165·90 147·63la 134·61 (S) 140·76

119·69

alone in spring in string in step

180·53 150·15 4 161·72 1 146·39

129-44

CSH alo ne in spring in string

JAR

alone in spring in string

169·65 { 146·67 2 2 167·17

156·10 { 110·50 5 2 128·57(S)

45·80 5

I

86·75

33·75

67·93( 5 )£

r

4

I

46·07 5 5 89·09 E

152·30

85·69( 2 )£ 62 ·15

I

30·15 5 93·00 5 E

132·57 5

{ 58·26 5 100·604

38·17

109·67

4{

33-83

117·50

l

68 ·04 5

Adult data 121 ·67 43·67 5 73·33< 5 lE

5

64·14< 5 >£ 50·90 (S)

116·81

40·56 5

25·36 3

154·17 56·50 5 118.44< 5 >

1{

140·33 46·83 5 75 ·83 5

101 ·41 40 ·50 5 80·18 2


Measurements were also made of initial /st/, although with no complete list for comparison. These data are shown in Table IX. /s/ duration was tested against the value for unclustered initial /s/ with vowel /e/, as in Tables IV and VIII. The clustered /s/ was never significantly shorter than the unclustered, and it was quite significantly lengthened in the speech of SW and SVW. This is in contrast to the significant abbreviation of /s/ before /t/ that Haggard found with adults.

List 2: Prevocalic /r/ preceded by various stops. The duration modifications of /r/ in the clusters do not correspond with Haggard's finding of abbreviation after both voiced and voiceless stops. But this is not necessarily significant as the segmentation criteria differ in the two studies. Haggard defined the boundary between the stop and following prevocalic /r/ as at the end of the stop burst, which usually coincided with the growth in amplitude of periodic vibration after voiced stops. With voiceless stops, this usually appeared somewhat later. In the children's speech data, it was considered more reliable to take the burst itself as the boundary, so a corresponding reduction in stop duration and increase in /r/ duration should be expected. The extent to which this difference will affect the measured durations is not yet known. Adult data using the changed criteria are now being collected, but to date we have only a small amount for comparison (15 to 22 tokens of each word from two adults, one male and one female).

The general picture that emerges, for adults and children, is of elongation of the aspiration + /r/ segment after voiceless lingual stops, and abbreviation or no change after /p/ and voiced stops. Comparing between place-of-stop contexts with voicing constant, /r/ is longest in duration after the alveolar stop and shortest after the bilabial. In most cases, place differences within the voiceless category are all strongly significant for adults and children. Amongst the voiced contexts, bilabial and velar environments are both highly¹ significantly different from the alveolar, but not from each other. (Exceptions are SW, SVW and CB.) Whether significant or not, however, the tendency for longer /r/ after velar than after bilabial stops is generally maintained in the voiced contexts. Except for MW and NB, the distribution of /r/ durations after voiceless stops overlaps little or not at all with its distribution after voiced ones.

¹ For CB and JAH these differences are only of moderate significance.

This implies that although it is reasonable to include the aspiration period in the following segment, in that the /r/ articulation does occur some time before voicing onset, particularly after voiceless stops, the values so obtained for /r/ duration are probably not comparable across voicing contexts. Haggard did indeed find a difference in the same direction, which he attributed to "spreading tenseness" of the voiceless stop leading to a systematic lengthening of the /r/ gesture (i.e. to a local slowing by the voiceless member's more stringent requirements upon articulation). But the size of his difference was smaller, overlapping at the alveolar articulation, and with no significant place differences within voicing. The effects found here then appear to depend largely upon differences in the aspiration period of the preceding voiceless stop, particularly as, after a voiceless alveolar stop, the /r/ segment was shortest in Haggard's data and longest in the children's and the two adults' speech. To get a clearer picture of normal patterns and any adult-child differences, separate measurements should be made of the aspiration period of voiceless stops prevocalically and before /r/, and also of the voiced portion alone of the clustered /r/. These will be done for both adults and children.

Interpretation of the effect of prevocalic /r/ on a preceding stop is also difficult at the


moment due to lack of normative data, for Haggard did not measure these at all. As far as comparison is possible there appear to be no strongly abnormal trends in the children's data. We must await further normative data before detailed comparisons are worthwhile.

List 3: Initial three-segment clusters, /spr, str/. The data for the prevocalic three-segment clusters (and also for /st/) are given in Table IX. Once again, the only adult data available as yet are from CSH and JAR. These are appended to Table IX. The adult modifications are more consistent than in the two-segment clusters with prevocalic /r/. Overall, the nonhomorganic cluster, /spr/, shows more abbreviation relative to the unclustered durations of the individual segments than the homorganic /str/. But this is somewhat modified by the elongation of the nonhomorganic stop and the abbreviation or lack of change in the homorganic one.

The most outstanding difference in the children's data is the high incidence of /s/ and /t/ elongation. Comparison with the unrefined measures of /s/ before /l/ in List 1 (Table IV) shows that every child who lengthens /s/ in a three-segment cluster also lengthens it before /l/, but not vice versa. With the refined /sl/ measures, of course, where only AG shows a small amount of lengthening, this relationship does not hold. Adult data on the degree of abbreviation to be expected with refined measures of fricative + semivowel prevocalically may expose some further relationships. As with the adults, /s/ in the homorganic cluster is longer than in the nonhomorganic cluster in all but CB's case, but the difference only achieves significance for two subjects, SW and SL, whereas it is significant at beyond the 0·01 level for both adults.

The children, like the adults, all lengthen /p/ in the cluster (compared with the unclustered /pC/ measure), usually to well beyond the 0·001 level (except for SW, who shortens it, P < 0·001). However, the children differ from the adults in that they also lengthen /t/ in the cluster, to a somewhat less significant degree than /p/, again with the exception of SW and also AG, who shorten it significantly. It is interesting to note that for SVW, SL and AG, and also CSH, the closure measures of /p, t/ do not differ significantly from each other in isolation, but do in the cluster, with /t/ significantly shorter than /p/. For SW and JAH, even though the unclustered durations are significantly different, the difference when in the cluster is greater. The reverse is the case, however, for CB and MW, with a nonsignificant difference between clustered /p/ and /t/ and one of P < 0·001 between the unclustered segments, with a different direction for each child. NB is alone in having both differences of equal degree (P < 0·02). This stands out as one of the few instances of a reasonably clearcut age trend in any of the clusters. SVW, SL, AG and SW conform to the (apparent) adult pattern, whereas CB and MW clearly do not, with NB representing an intermediate position.

/r/ duration is modified similarly by adults and children. It is usually strongly abbreviated compared with its unclustered duration, but always significantly more abbreviated in the nonhomorganic cluster than in the homorganic. But the absolute amount of abbreviation in each case is smaller in the children's speech by about 10 ms (77·25 vs. 67·68 ms for /p/; 42·87 vs. 30·59 ms for /t/). This is not very impressive, but with more adult data may prove to be, for note that JAH abbreviates a surprisingly small amount, and may be unusual in this.

The children's tendency to lengthen /s/ before /t/ in step has already been mentioned as contrasting with Haggard's finding of severe abbreviation of /s/ in this context. Comparing clustered /s/ durations across all contexts (before prevocalic /l, w, t, pr, and tr/), it


will be seen that SW, AG and NB (for whom no step measurements were possible) are on the whole consistent elongators. CB and MW generally abbreviate, and SVW and SL exhibit some lengthened, some abbreviated and some unchanged values. Interestingly, both these last two children tend to lengthen the /s/s in two-segment clusters and abbreviate in the "more difficult" three-segment clusters. More complete data on these types of clusters are now being collected to allow a more satisfactory analysis to be made.
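The "about 10 ms" adult-child difference in /r/ abbreviation quoted earlier in List 3 can be made explicit with a short calculation. This is only an illustrative sketch: the names are hypothetical, and reading the first figure of each quoted pair as the adult value is an assumption based on the statement that the children's abbreviation is the smaller one.

```python
# Amounts of /r/ abbreviation in ms as quoted in the text, stored as
# (adult, child). Which figure belongs to which group is an assumption:
# the text says the children's abbreviation is the smaller of the two.
R_ABBREVIATION_MS = {
    "after /p/": (77.25, 67.68),
    "after /t/": (42.87, 30.59),
}


def adult_minus_child(pairs):
    """Adult-minus-child difference in abbreviation for each stop context (ms)."""
    return {context: round(adult - child, 2) for context, (adult, child) in pairs.items()}


def mean_gap(pairs):
    """Average of the adult-minus-child differences across contexts (ms)."""
    gaps = adult_minus_child(pairs)
    return sum(gaps.values()) / len(gaps)


if __name__ == "__main__":
    for context, gap in adult_minus_child(R_ABBREVIATION_MS).items():
        print(f"{context}: adults abbreviate /r/ by {gap:.2f} ms more than the children")
    print(f"mean gap: {mean_gap(R_ABBREVIATION_MS):.2f} ms")
```

The two gaps bracket 10 ms, one slightly below and one slightly above, which is consistent with the rounded figure given in the text.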

List 4: Postvocalic /l/ followed by /p, t, f, s, m, n/. Some problems were encountered in segmenting /l/ from the preceding vowel, and so measurements were made both of the total vowel + /l/ durations and of /l/ alone. This decision was not made until after AG's data had been measured, and so only /l/ is measured for vowel /e/, and only vowel + /l/ for vowel /ɪ/.

Tables X and XI summarize the changes in /l/ and vowel + /l/ (Vl) duration due to clustering with various types of consonant. By definition, an /l/ measurement prior to a stop measured from the point of closure extends to the end of voicing, while before a stop T measurement, /l/ extends only to the point of transition onto the following segment. Comparisons of the clustered /l/ with the unclustered (postvocalic final) /l/ therefore are only really direct in conjunction with stop C measures, as in Table X, for the stop T measures imply more abbreviation than actually occurred. An attempt was made, with MW's data, to measure unclustered postvocalic /l/ to its transition-off as well as to the end of voicing, but the results did not warrant the increased complexity of the data and time required for the analysis.

In general /l/ and /Vl/ are abbreviated before all final consonants compared with their isolated congeners, except before /n/, where there is a consistent but generally nonsignificant tendency for at least /Vl/ and usually /l/ also to be longer in the cluster.² The only other exception to overall /l/ abbreviation is NB, whose /l/ is significantly longer before both stops (/pC/ and /tC/), P < 0·001, and nonsignificantly longer before /s/. The degree of abbreviation differs according to the segment following /l/. Within each manner class, /l/ is almost invariably longer before the homorganic segment. The difference is usually significant for fricatives and nasals (but see footnote re /ln/), but only consistently for the three youngest children's stops. Haggard found only a nonsignificant difference in the same direction; this is interesting as the four oldest children have an adult pattern before "easy" stops and conform with the younger children in the "more difficult" fricatives.

² To a large extent this effect is a function of the unfamiliarity of the cluster, and hence of the word which had to be used, kiln. All children had difficulty with this, tending to substitute kilm, even after a considerable amount of practice, with the result that they usually said it exaggeratedly slowly. Thus although undoubted difficulty exists with this cluster (as with adults) wider conclusions should not be based on it.

Comparing across manner and within place, for the /e/ vowel only, /l/ is usually less abbreviated before a bilabial stop than before a fricative, in most cases significantly so, and for the older children this difference is supported in the /Vl/ measures. There are no consistent differences across subjects in the context of final alveolar consonants, for either /l/ or /el/. Most subjects show no differential influence of the stop and fricative on /l/ or /el/, but SVW has both shorter after /s/ than after /t/ (P < 0·001), and MW has the reverse difference for both measurements. This, for bilabials at least, contrasts sharply with Haggard, who found /l/ to be longer in the context of both fricatives than of their corresponding voiceless stops, the difference, if


Table X Mean duration in ms of postvocalic /l/ followed by /p, t, f, s/. Stops are measured from the point of closure, and /l/ is measured (i) together with the preceding vowel and (ii) by itself

/1/ measure before

sw

jell /1/ SVW jel j jl j

SL

NB AG

CB

measure measure measure measure

fell measure j l/ measure /el / measure /1/ measure jell measure (1, measure

jell /1/ MW jell /1/

measure measure measure measure

f l/ measure alone

fpC/

353·31 168-45 367-42 228 ·13 394·85 175·69 429·96 225 ·96

307·21 4 137·50 4 299·13 5 188·944 285 ·07 5 148·68 2 413 ·17 255 ·174 E

317·37( 4 ) 141 ·15 3 315·87 5 199·78 2 298·33( 5 ) 163·33 429·21 260·50< 5 >E

277·99 5 127·13 5 251·00 5 140·08 5 247·15 5 ))9·65 4 350·52 5 163·75 5 -

222·57 457·17 242·73 448 ·61 236·54

158·21 5 397·904 237·74 334·05 5 181 ·96 5

186·1]2

160·37 4 191 ·91 1 1 4 405 ·00 - - --437·87 193·574 207·60 3 4 385·33 4 371 ·8JS 170-42 5 214·58 1

ftC/

/f/

4

400·00 205·82 2 357·205 189-47( 5 )

/s/ 336·48 147·90 2 4 286·97 5 160·6JS 5 5 - --302·56 4 151 ·60 1• 427·50 5 - -235-46(E) 2

Table XI Mean duration in ms of postvocalic /l/ followed by /m, n, p, t/. Stops are measured from the transition-off the /l/, i.e. /l/ before stops is measured only up to its transition, and (i) together with the preceding vowel and (ii) by itself

/1/ measure before /1/ measure alone*

sw

jrl, elf measure /1/ measure SVW /II, el/ measure jl j measure SL /rl, ell measure /1/ measure NB /rl, el/ measure /1/ measure AG /II, elf measure /1/ measure CB /rl, elf measure /1/ measure MW /II, el/ measure (If measure

315·61 142·50 359·17 228 ·55 369-43 185·29 437 ·55 245·43 393·51 506·06 267·12 389·88 223·84

/m/

/n/

267·344- · -2-296·78 4 157·94(E) 121·142 5 5 374·53(E) 261 ·62 157·16 5 175·78 5 5 5 440·07 E 355·26 190·07(E) 187·50 370·74< 5 >- -5--482·39 3 E 0 194·93< 5 > -1- -225 ·27 387·37 408·71 5

5

526·33(E) 396·40 4 181·06< 5 >---237· 58 337·57 5 - - - " '·-- 2 437·50(E) 2 2 268 ·27(E) 176·62 5

/pT/

/tT/

252·50 5 82·94 5 194·63 5 84·07 5

265 ·145 88·92 5 204·71 5 91·06 5 237·34 5 95 ·16 5 299·19 5 130·59 5

211-46 5 71 ·94 5 286 ·50 5 129·33 5

4 4

92·29 5 120·945 5 334·33(5 ) 301·86 141·37 5 140·00 5 244·46 5 - - -1-266·67 5 91·42 5 98 ·26 5

*This is with vowel /ɪ/, as used before final /m, n/. For measure with vowel /e/, as used before final /p, t/, see Table X above.


anything, being more rather than less marked for alveolars. Haggard suggested that this was due to the removal of the periodic source in the anticipation of voicelessness, for a fricative can be made with the glottis only partly open. If this is the case, it would appear that the children are not anticipating the voicelessness of the following segment in the same way as the adults, and are thus not able to produce the stop so rapidly. However, this argument is weakened by the fact that the strongest supporters of the trend are the older children, while the youngest (MW) goes clearly against it. Whilst there is no need to expect a strict parallel between maturity of speech and physical age, a complete reversal is rather surprising.

Table XII presents the durations of the voiceless stops (both T and C measures) postvocalically and following /l/, and the corresponding values for fricatives and nasals are given in Table XIII. On the whole there was little evidence of the expected abbreviation from the influence of the preceding /l/ except with /f/ and to some extent with /m/. Haggard found a somewhat more reliable effect for labials than for alveolars (P < 0·01 vs. P < 0·025), due to /t/ being the only abbreviation "of convincing magnitude" amongst the latter. Part of this lack of abbreviation (and indeed, quite frequent elongation) may be due to the children's tendency to sometimes protract final phones unnaturally in the somewhat formal situation. In order not to reduce the sample artificially and possibly invalidly, exclusions of items with unusually long segments were kept to a minimum once the token had been accepted on an impressionistic basis. To some extent the lack of abbreviation may be artifactual, therefore, but it is nevertheless interesting, for the elimination criteria should have affected clustered and unclustered phones equally. It appears, then, that even if the children would normally abbreviate final phones after /l/ in a more adult way, they are more likely to choose to distort these than the same final phones postvocalically. Such an exaggerated effect may in fact stem from a basic articulatory difficulty of the clustered phones encouraging a slower articulatory rate.

Considered as a whole, including the vowel, the relative abbreviation (or absence of elongation) of the nonhomorganic clusters was in most cases more significant than of the homorganic clusters, relative to the unclustered segments. This was so in all cases for final fricatives, and with the exception of CB and SVW with final stops, and of AG with final nasals. Comparing clusters differing in manner, five out of the seven children had the cluster with the bilabial stop less abbreviated overall than that with the fricative (i.e. /elp/ > /elf/). Among the alveolar clusters, the reverse relationship held, with /els/ > /elt/ for all except SL, for whom they were equal.

List 5: Final /t/ preceded by postvocalic /f, s/. Haggard found that /t/ was abbreviated by all fricatives he tested, /f, s, ʃ/, but probably most by /f/, compared with the isolated duration. Table XIV shows that some children conform with both aspects of this adult pattern over all three vowels used, notably SW and AG and to some extent MW. SL, NB and CB, however, generally lengthen clustered /t/ significantly. SVW has the adult pattern for vowel /e/, but a slight tendency towards lengthening with the other two vowels. These differences correlate in no consistent way with age, unless a curvilinear function is assumed, for which there is no other evidence. Except for NB, there is no clear tendency for a child who lengthens /t/ after a fricative also to lengthen either segment of the clusters with postvocalic /l/ (List 4, Tables X-XIII), or to lengthen the fricatives themselves before /t/ (Table XV).

Table XV gives the durations of postvocalic fricatives word-finally and before /t/.


Table XII Mean durations in ms of final /p, t/ after postvocalic /l/

/p'/

sw svw SL NB AG

CB

MW

alone after /1/ alone after /1/ alone after /1/ alone after /1/ alone after /1/ alone after /1/ alone after /1/

/pc/

155·70 156·54 186-45 182-41 127·23 124·51 188·79 208-42(£)

91 ·57 103 ·24(£) 85·78 78·75 45·19 51·69 68·57 85 ·58(£)

263·83 252 ·57 169·85 206-45 4 £ 179·17 152·504

193·23 187-43 92·65 110·00(£) 72-78 62·91

/t'/ 158·24 152·77 185·90 163·46 3 123 ·26 107·66 1 194·76 198 ·75 269 ·54 248 ·83 159·58 186·67 2 £ 208·64 205·76

WI 96·02 102·57 94·70 56·744 49 ·58 57·64(£) 59·67 72-93(£) 193· 18 183·89 96·02 121·002 £ 104·07 115-46(£)

Haggard found that a final /t/ abbreviated /s/ but did not affect the duration of /f/. With the children, however, /f/ as well as /s/ was generally abbreviated in the cluster, although usually by a less significant amount than the homorganic fricative. Possibly this effect is wholly or partially explained by the much longer durations of these segments in the children's speech increasing the ease of abbreviation in any favourable clustered environment and working against the selective abbreviation of the homorganic fricative due to source interference, the explanation Haggard suggested for his effect. (The mean durations of /f/ and /s/ in Haggard's adults were: /f/, 115·0 unclustered vs. 119·5 before /t/; /s/, 157·0 unclustered vs. 115·0 before /t/, all in ms.)
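The selective abbreviation in Haggard's adult means quoted above is easier to see when each clustered duration is expressed as a change relative to the unclustered one. The sketch below is only a worked illustration with hypothetical names; the four durations are those given in the parenthesis above.

```python
# Haggard's adult mean durations in ms, as quoted in the text above.
HAGGARD_ADULT_MEANS_MS = {
    "f": {"unclustered": 115.0, "before_t": 119.5},
    "s": {"unclustered": 157.0, "before_t": 115.0},
}


def cluster_change(unclustered, clustered):
    """Return (absolute change in ms, percentage change) of clustered vs. unclustered."""
    difference = clustered - unclustered
    return difference, 100.0 * difference / unclustered


if __name__ == "__main__":
    for fricative, means in HAGGARD_ADULT_MEANS_MS.items():
        diff_ms, diff_pct = cluster_change(means["unclustered"], means["before_t"])
        # Negative values mean abbreviation in the cluster, positive values elongation.
        print(f"/{fricative}/ before /t/: {diff_ms:+.1f} ms ({diff_pct:+.1f}%)")
    # prints:
    # /f/ before /t/: +4.5 ms (+3.9%)
    # /s/ before /t/: -42.0 ms (-26.8%)
```

On these figures the adult /f/ is essentially unchanged while the homorganic /s/ loses roughly a quarter of its duration, which is the asymmetry the children's data fail to reproduce.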

Phrase data: elf a flit vs. else a slit. The phrases elf a flit and else a slit were included primarily as a direct parallel with Haggard's data, with which the list utterances differ in that they were produced with a preceding /ə/ and in groups of three rather than singly. Their

Table XIII Mean durations in ms of final /f, s, m, n/ after postvocalic /l/

/f/

sw svw SL NB AG

CB

MW

alone after /1/ alone after /1/ alone after /1/ alone after /1/ alone after /1/ alone after /1/ alone after /1/

229·24 182·62 4 199·35 187·67 261·12 160·69 5 190·44 133·13 5 25 1·93 213·66 1 346·80 230·00 5 252·88 195·58 4

/s/ 267 ·39 3ll ·93 4 E 250·76 242·88 236·25 218·53 156·86 193 ·61 2 £ 322·11 348 ·82(£) 300·67 363·24 2 £ 249-42 249 ·58

/m/ 151-45 127· 124 179·75 171·89 181 ·82 155 ·83 2 187·50 168-46 187·88 189·01 222 ·57 167·74 4 241·76 194·93< 5 )

/n/ 151 ·22 154·39 170.69 178·52 199·82 ] 69·86 2 168 ·94 201 ·70 4 £ 250·39 244·93 218 ·05 185·17 2 266·74 236·15


chief interest was intended to lie in the correlational data discussed below. They differ from the other utterances in that they were not produced as three single words, and the faster rate at which they were spoken means there is no comparable unclustered congener of each clustered segment. A comparison of the /l/ and vowel durations before and after the fricatives can nevertheless be usefully made, especially since, in contrast with List 1, all non-relevant segments are constant within and between cluster types, allowing better comparisons of the vowels in particular.

Table XIV Mean duration in ms of final /t/* after postvocalic /f, s/

Vowel

/ɪ/

SW /t/ alone

100·37 78·11 4 79-49 3

96·02 74·26( 4 ) 78 ·31 2 n23 94·70 75·93 70·09 3 85 ·52 1 £ 76·57 2 50·16 49·58 66·844 £ 64·142 £ 73 ·69 5 £ 76 ·54 5 E 57·94 59·67 81·04< 5 lE { 83·3PE 3 80·29 3 £ ( ) 110·00 5 E

/t/ after /f/

/t/ after /s/ SVW /t/ alone /t/ after /f/ /t/ after /s/ SL /t/ alone /t/ after /f/ /t/ after /s/ NB /t/ alone /t/ after /f/ /t/ after /s/ AG /t/ alone /t/ after /f/ /t/ after /s/ CB /t/ alone /t/ after /f/ /t/ after /s/ MW /t/ alone /t/ after /f/ /t/ after /s/

/e/

2

170·92 { 73·57 5 95 ·07 5 77-28 100·302 £ 90·6JlE 82·53 81·93 89·07

193·18 74·47 5 86·95 5 96 ·02 102·93 96·99 104-07 { 86·92°) 3 114·78(E)

/a/ 92·67 78·50 3 81 ·70 1 69·38 n68 78 ·79 59·30 52·30 82·18 4 £ 70·58 88·44 5 E 95 ·00' "£ 159·14 90·07 5 82·405 69·07 { 96·29 4 £ 1 80·95(£) 104-40 75·31 4 84-42 2

*Only /tC/ measures (i.e. the closure period) are given for the segment when alone, as the period from the transition off the preceding vowel segment is not comparable with the measurement of /t/ following a fricative.

In the unrefined measures of flit and slit, measures of /l/ tend to be rather longer after /s/ than /f/ (Table XVI). Haggard's adult data showed this same difference. No child has /l/ actually longer after /f/ than after /s/ in the refined measures, but the tendency for it to be longer after /s/ is rather weaker (Table XVII). Vowel lengths are less consistent. The only significant difference is in SW's speech, where nonhomorganic clusters are associated with a longer vowel (P < 0·05). This is in the opposite direction to his /l/ difference with the refined measures, with the result that the total refined /lɪ/ duration in the nonhomorganic sequence approximately equals that in the homorganic sequence, as it does for the other five subjects, though for them this is because neither refined /l/ nor /ɪ/ alone differs significantly in the two clusters. (From Table XVII it can be seen that for MW the difference is only as large as 14 ms, and for the others it is less than 8 ms. This is rather smaller than in Haggard's refined measures, where /lɪ/ is longer after /s/ than /f/ in five of the six subjects for whom refined measures were made, although the absolute durations involved are more than half as long again in the children's speech.) The direction of the unrefined and refined /l/ differences in the two prevocalic clusters tends to be the same in the phrases as in List 1 (Tables II and III). Vowel differences are less consistent, but this may be because of the failure to control for the final consonant of the list words.

Table XV Mean duration in ms of postvocalic /f, s/ before final /t/

/s/

/f/

/ɪ/

SW alone before /t/ SVW alone before /t/ SL alone before /t/ NB alone before /t/ AG alone before /t/ CB alone before /t/ MW alone before /t/

247·57 210·43(4 )

/e/

229·24 205·27 1

175 ·93 199·35 192·96(E) 189·44 186·09 261 ·12 153 ·163 162·97 5 * 190-44 *190·44 132·71 5 180·97 337·08 251 ·93 325·83 292·96 1 E 322·11 346·80 327·58 238 ·79 5 229·55 252·88 197·98( 5 ) 180·1F

/a/

/ɪ/

197·50 168 ·81 (3 )

294·81 234·94<4 >

190·58 149·02( 5 )

205·23 177·59 2 246 ·74 195·06< 5 >

207·50 171·10 1" *190-44 157·92 3 233 ·48 239·21 284·08 193 ·36( 5 ) 197·81 162·922

*156·86 147·79 356·41 312-4)1 333 ·38 289 ·62 2 235 ·69 166·865

/e/

/a/

267 ·39 216·38 4 250·76 180·83 5 236 ·25 199·36( 5 )

238·94 185·46 4 179·74 152·43 4 201 ·52 180·06(5 )

*156·86 249·17 5 E 322·11 274·38 2 300·67 246·54 2 249·42 220·16 2

*156·86 197·24< 5 >E 193-43 204·95 300·07 202·64 5 192·59 166·80 1

*Samples for each vowel were very small, owing to the difficulty of measuring NB's exceptionally quiet final fricatives; since the distributions of the measurable tokens for each vowel did not differ significantly, they were pooled.

/el/ is significantly longer before /s/ than /f/ for SL and MW, and fairly markedly so for the other subjects except SVW and AG. Only NB has a different direction of the difference between homorganic and nonhomorganic clusters dependent on whether /l/ is measured with or without the preceding /e/, and this is not significant statistically. If anything, this is a more consistent result than Haggard's, for, with /l/ measures only, his 12 subjects showed all three possible differences between the two types of cluster, in approximately equal numbers. His subjects also had a slight tendency to display different directions in the difference between /l/ and vowel in homorganic and nonhomorganic clusters according to whether they were prevocalic in the word or postvocalic. The children show the opposite trend, with only SW, AG and CB differing in the direction of their difference.

Summary of the chief points of interest in the mean durational data. From this study, the following points emerge as representing the main differences between adults and children in the durational modifications they make to consonants in different contexts.

(1) Segments in clusters with initial fricatives tend to be lengthened, or less abbreviated, in children's speech. This is particularly true when such clusters contain prevocalic /l/.

(2) In children's speech, postvocalic /l/ tends to be significantly longer before homorganic than before nonhomorganic consonants, compared with adult speech. There is some indication that this is a fairly global tendency in younger children, while for older children it applies in the context of articulatorily "difficult" fricatives, but not in the context of "easier" stops.

(3) Initial clusters of stop + prevocalic /r/ are probably also lengthened, especially in homorganic cases, indicating some articulatory difficulty. Confirmation of this point must await further adult data for comparison.

Table XVI Mean durations in ms of /l/ with and without the vowel in the phrases elf a flit and else a slit (unrefined measures)

SW SVW SL NB AG

CB MW

Vowel + /l/ /l/ alone Vowel + /l/ /l/ alone Vowel + /l/ /l/ alone Vowel + /l/ /l/ alone Vowel + /l/ /l/ alone Vowel + /l/ /l/ alone Vowel + /l/ /l/ alone

/elf/

/els/

/flɪt/

/slɪt/

218·31

232·34

188·68

177-50

212·18 125·90 177-1 84·58 267·32 122·89 328·54

210·00 125·19 195·00 97·62 276·25 11 0·83 310·35

156·54 63-46 154·31 61 ·63 226·25 89·29 275 ·14

151 ·67 53-46 182·74 83·17 261 ·88 .95·63 280·28

350·64 160·05 301·86 161·57

184·72

336·16 154·01 277-10 143·71

2

( 5)

Ia

63-01

242·18 49·60

4

165·65 46·85 265·44 70·74

(4) Other differences between children and adults do occur in the data, but are not sufficiently consistent to be regarded as indicative of the existence of qualitatively or quantitatively different organizational principles at this stage.

Correlations between segments

Pearson product-moment correlation coefficients were calculated between the segments of all consonant clusters, sometimes including the vowel, as in the mean duration analyses. The only direct comparison possible with Haggard's data was the phrase data, elf a flit and else a slit, and also the sequence /dr/ for which he correlated over about 50 tokens for each of 10 subjects in a later paper (1972). These comparisons are not revealing: the children's data neither correspond closely with the adults', nor do they show a clear trend in a different direction. The values Haggard obtained do not themselves conform to any strong and orderly pattern, but those consistencies which can be extracted are not supported in the children's data. Taken as a whole, then, the children's coefficients are not informative and they are not presented here. They are generally around zero for all clusters in each child's speech, and the direction of their signs is fairly inconsistent. The meaning of the occasional high values which do occur is dubious, in that the same child will often have a very different value for a cluster identical with the highly correlated one except for vowel context. There is no a priori reason, of course, why this apparent inconsistency should not reflect stable tendencies dependent upon vowel context for particular consonant sequences. These could occur only or mainly in the speech of children, or also in adult speech. But in this particular case, and given the problems inherent in interpreting correlations between speech segments, as discussed briefly below, it seems likely that random factors have played a large part in determining the obtained values. Given that the present data do not show any interesting or convincingly consistent patterns, then, it is considered advisable to ignore this section of the analysis, rather than to conclude that children's speech exhibits less organization than adults' in this respect.

Table XVII Mean durations in ms of segments /flɪ, slɪ/ (refined measures)

SW SVW SL NB AG

/f/ /l/ /ɪ/ /s/ /l/ /ɪ/

123·15 139·48 148·98 99·64 75·28
79·84 89·79 106·80 115·71 73·06
140·81 92·40 95·08 134·29 227·22
129·52 150·83 198·31
93·71 87·50 110·57
128·15 99·38 99·44
86·39
91·25
186·94
144·91 120·23 108·82
84·73 82·96 98·46
224·64 128·98 193·53

CB MW

Discussion

The data presented in this paper indicate that there are some aspects of the timing relationships within clustered consonants that tend to differ fairly consistently between children's and adults' speech, but that these differences are not invariant within or across subjects, nor do they show a convincing age trend. This last point is not unprecedented in studies of other aspects of the speech of normal children of about this age range. Kreshek, Fisher & Rutherford (1972), for example, found little or no developmental trend in the production of /r/ phones between three and nearly five years, and Port & Preston (1972) found no evidence of a refinement in the precision of VOT for voiceless apical stops between two and four and a half years. They also note that Sachs has equivalent evidence for voiceless bilabials in children of five years of age (pp. 147-8). It is possible that after acceptable but not necessarily mature forms of particular sounds or sound sequences have been produced for some time, the child's speech does not change markedly in these respects for several years, possibly until nearer puberty. It is also frequently observed, of course, that chronological age correlates fairly poorly with articulatory ability, within certain limits. In the sample used here, MW, the youngest child, was judged as having remarkably mature speech in many respects, whereas others, particularly NB and CB, tended to sound rather younger than they were. A follow-up study on these same children has just been completed, about one year later, and this may help to illuminate any maturational trends which do occur.

Table XVIII Mean durations in ms of /f, s/ in elf, else, and unrefined measures of flit, slit

        /elf/     /els/     /flɪt/     /slɪt/
SW      112·70    113·39    154·10     175·55
SVW     125·51    130·07    169·17     184·23*
SL      118·39    140·68    194·69*    226·65*
NB       74·28     78·13    123·21*    131·88*
AG      108·75    145·14    106·07     171·60*
CB      159·36    171·06    138·01     152·28
MW      114·51    130·14    122·74     137·72

*All these durations, somewhat surprisingly, were larger than the isolated values as taken from List 1 (Tables III and IV). Only in SL's speech, however, did the elongations achieve significance (P < 0·001 for /f/, and P ≈ 0·001 for /s/).

Before discussing the implications of the main differences found between children and adults, brief mention should be made of some methodological considerations. These relate largely but by no means exclusively to the correlational data. They will therefore be discussed mainly with respect to the correlations, but their wider implications should not be ignored.

A major problem in this connection is that of variations in speech rate, which will always reduce the value of a correlation coefficient and increase variances. Attempts have been made to avoid this by normalizing the data: by selecting from the total sample only those tokens with the same or very similar durations (as used by Gregorski & Shockey, and criticised by them in their 1972 paper), or by some other method for eliminating differences in overall duration. (See, for example, Allen, 1969.)
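The token-selection form of normalization just described can be explored with a small simulation, offered only as an illustrative sketch. It assumes two adjacent segments whose durations are generated independently; the means, standard deviations, sample size and the 5 ms selection window are all hypothetical values chosen for the example, not figures from this study or from the papers cited.

```python
import random


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical, statistically independent durations (ms) of two adjacent segments.
tokens = [(rng.gauss(100.0, 15.0), rng.gauss(150.0, 15.0)) for _ in range(20000)]

# "Normalize" by keeping only tokens whose total duration is within 5 ms of 250 ms.
selected = [(a, b) for a, b in tokens if abs(a + b - 250.0) < 5.0]

r_all = pearson_r([a for a, _ in tokens], [b for _, b in tokens])
r_selected = pearson_r([a for a, _ in selected], [b for _, b in selected])

print(f"all tokens:      n = {len(tokens)}, r = {r_all:+.2f}")
print(f"selected tokens: n = {len(selected)}, r = {r_selected:+.2f}")
```

Because the two segments must sum to nearly the same total in every retained token, a long first segment can only be paired with a short second one. The coefficient for the selected tokens is therefore strongly negative even though the underlying durations were generated independently, and the number of usable tokens falls sharply.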
This procedure, besides having such undesirable general effects as reducing the number of degrees of freedom by unknown amounts, actually introduces high negative correlations between component segments, and so is particularly unsuitable when used with this statistic. A second difficulty is determining which are the relevant segments to measure. It is by no means clear, for instance, that it is reasonable to correlate just the elements of a consonant cluster without the accompanying vowel of the syllable. Although the data are not fully supportive for English (Allen, 1969, 1972a; Ohala, 1970), Chistovich & Kozhevnikov (1965) found evidence based on correlations for a basic CV syllable structure in Russian, and Lehiste (1970a, b) found similar evidence for a VC unit in English. It is by no means impossible that the basic unit upon which measurements, particularly correlations, can appropriately be performed is even larger than the syllable. Perceptual evidence indicates that subjects are more sensitive to modifications in segment duration, and require greater compensatory adjustments in adjacent segments, when the segments in question lie across word and syllable boundaries rather than between them. There is some evidence for an attempt to maintain constant vowel onset times, at least of stressed syllables (Huggins, 1972). Allen (1972b, c) has similarly shown that the point of consonant release/vowel onset in stressed syllables is basic to the identification of the rhythmic 'syllable beat' of English speech. [But the word is certainly not the minimal perceptual unit: Fry's (1970) reaction time experiments indicate that subjects are capable of dealing with units less than the syllable.] There is little difference between correlations on segments including and excluding the vowel in the present data, except for a possible very slight trend towards higher coefficients with measures of /el/ rather than just /l/ in the word final clusters.
(This may only reflect the greater measurement error associated with segmenting postvocalic /l/.) But none of these measures extend over more than one syllable. Connected with this issue is the problem of determining the right number of segments in a given unit. The consequences for conclusions about the control of speech of selecting either too few or too many elements in a total unit are discussed in some detail by Ohala (1970, p. 145), mainly with respect to variances, but with relevance also to correlational analyses.

Even if these points could be satisfactorily dealt with, a further problem is the status of the correlation coefficient itself. As Gregorski & Shockey (1972) point out, it is often found that the correlation between two nonadjacent segments in an utterance is very similar to that between two adjacent ones. Since one cannot logically place importance always on the latter but never on the former, an interpretation of language programming based on correlations between adjacent segments only is hardly adequate.

All these problems are encountered in the present data. Rate variations were controlled as much as possible, and rejections of data made on an impressionistic basis, but it is very difficult to keep rate constant with children, especially over a number of recording sessions. The question of appropriate segmentation has already been discussed to some extent, particularly with respect to the status of the aspiration period of a voiceless stop. On the whole, comparisons of the mean durations of particular segments in different environments are less affected by these types of problem than are correlational analyses. In exploratory work with children they can be interpreted much more confidently. The rest of this discussion will be restricted, therefore, to comments on the mean values.

The most outstanding difference between the children and adults which has appeared with the present adult data is the lengthening of segments in the clusters with initial fricatives (Lists 1 and 3). In the unrefined measures of List 1 (prevocalic /l, w/ preceded by /f, s/), the fricative measures appear to be lengthened relative to their unclustered durations. But comparison with refined measures indicates that most of the elongation occurs during the voiceless period of the /l/ articulation.
Although good adult data are not yet available, the evidence we have does imply that this is not found in adults. What causes this difficulty remains to be seen. It may be that the rate of onset and/or offset of the /l/ articulation is slower than in adults, with or without a prolongation of the steady state portion also. These possibilities may be resolvable by reference to spectrographic measures and possibly speech synthesis, but they are perhaps less plausible than a lack of coordination between voicing onset and the positioning of the upper articulators. Evidence on the control of VOT in apical stops (Port & Preston, 1972), and informal observations of an apparently fairly haphazard control of voicing offset before voiceless stops and fricatives in the present study, suggest that the synchronization of laryngeal and oral positionings may well be a prime factor here. It would perhaps be worthwhile looking at those clusters of 'voiceless' fricative + prevocalic /l/ where the fricative has a large element of voicing in it, of which there are several instances. It may be that in these, and particularly where the voicing is in the latter part of the fricative, the /l/ duration is not abnormally long, or is at least shorter than in clusters with properly voiceless fricatives.

Although most of the elongation effect seems to stem from a long period of /l/, then, two other points suggest that this is not the sole cause of the difficulty. Firstly, there is a smaller but not negligible incidence of /s/ elongation before /w/ and /t/ (and these measures were probably less subject to segmentation errors than those involving the division between /s/ and /l/). Secondly, there are some children who lengthen /s/ in the initial three-segment clusters also; although, as we noted earlier, every child with a longer /s/ in such a cluster also has a longer (unrefined) /s/ before /l/.
Since some children, however, lengthen /s/ only in the two-segment cluster with /l/, a particular difficulty involving /l/ is still implied.

Aspects of the data from final clusters also support the idea that there is some disfluency in the production of /l/, and possibly of fricatives too. There was a significant tendency for postvocalic /l/ to be longer before homorganic than before nonhomorganic consonants of the same manner class. This was particularly marked for fricatives, but with stops it was only significant for the younger children. As noted in the Results, this contrasts with adults, who showed only a non-significant difference in the same direction. It implies an exaggeration of the greater difficulty mature speakers have with homorganic clusters (Bradshaw, 1970). There are indications, though, that this extra difficulty is not general to all homorganic clusters, for such strong and consistent lengthening is not found with final /st/ vs. /ft/ clusters, even though they contain fricatives (Tables XIV and XV). There is also, in fact, a less marked tendency in children compared with adults for /s/ in the /str/ cluster to be significantly longer than in /spr/. With the exceptions of SW and MW (see Table IX), this is plausibly due to the tendency for the children's /s/ to be longer in both contexts than in isolation, presumably owing to the greater complexity of the three-segment clusters. This is hypothesized to outweigh and militate against any differential effect of homorganicity. Nevertheless, it does indicate that homorganicity may not be as important an influence in children's productions as it is in adults'.

This encourages the idea that both pre- and postvocalic /l/ articulations are relatively more difficult for the child than for the adult in clustered context. Clusters involving them are more likely to show disfluencies than many other types of cluster. The same clusters also suggest that /s/ articulations involve greater difficulty than /f/, in that final sequences of /el/ + /s/ were less abbreviated overall than those with final /t/, whereas the /elp/ clusters were less abbreviated than the /elf/ ones.
This again fits in with the data for initial clusters, and is not an unreasonable notion in the light of the order of acquisition of the two phonemes, and the frequency of /s/ misarticulations in children.

From these data, then, we cannot clearly distinguish all the factors which might be contributing to the observed results, but they seem to be connected mainly with /l/ articulations, possibly with the synchronization of voicing in particular and, to a somewhat lesser extent, with /s/ articulations. More complete adult data should afford a direct comparison of durational modifications in both unrefined and refined measures of /fl, sl/. In particular, they should provide a reference level for the relative duration of /l/ in the context of fricatives, and specifically of the ratios of "true" fricative to voiceless /l/ frication, and of the latter to voiced /l/ duration.

In the follow-up study mentioned above, words are used covering all possible initial consonant clusters in English except for /s/ + nasals, /tw, kw/ and /θr, θw/. Relationships between segments within these clusters should help to distinguish the following points. Firstly, whether there is a generalized difficulty for all lingual liquids or whether it is greater for /l/ than for /r/ articulations. (It must be remembered, of course, that it is characteristic of children's speech to circumvent a true /r/ articulation with versions of /w/ or /v/. Certainly, however, when the children were asked to say words as fast as possible in the follow-up, the subjective speed of producing initial clusters with prevocalic /l/ was far slower than with prevocalic /r/, and there were far more unacceptable attempts.) Secondly, whether fricatives also present timing problems, and if so, to what extent and in which contexts. And thirdly, whether there is more difficulty associated with clusters containing any manner of lingual consonant, compared with bilabials. This last point, of course, can be divided into alveolar vs. velar (and palatal) within the lingual class.

From data on the order of acquisition and correct usage of phonemes, and on those most commonly misarticulated by children with deviant speech, it seems very likely that greater difficulty would be associated with liquids than with many other phonemes. /s/ would also be expected to be difficult (Crocker, 1969; Irwin, 1947; Jakobson, 1968; Menyuk, 1968; Sander, 1972; Snow, 1963, 1964; Templin, 1957, cited in Locke, 1972). Most of these studies have been concerned more with the age at which sounds are correctly used by a particular percentage of children in unclustered contexts, although one or two have attempted to use distinctive feature analysis to describe underlying patterns more parsimoniously [Crocker, Jakobson and Menyuk in the works cited, and also Compton (1970), McReynolds & Huston (1971) and Moskowitz (1970)]. In the present situation, where the chief interest is in the temporal coordination of acceptable tokens of the relevant sounds into relatively complex clusters of consonants, it is probably pertinent to consider other factors also. In particular, the acquisition of mature durational relationships between segments may be to a large extent independent of the ability to produce an acceptable version of the phone, so that there could be little relationship between the traditional age norms for a given phoneme and the adequacy of its temporal integration in various contexts.

One point which may turn out to be particularly interesting has been stated by Menyuk & Klatt (1968) and Menyuk (1971), and is being studied in more detail by Kornfeld (1971). They hypothesize that initial clusters in English and perhaps other languages are coded as single underlying consonants containing the feature specifications of both the phonemes involved. They present preliminary supportive evidence on acoustic distinctions in children's speech which correlate with adult phonemic differences but are not necessarily perceived as distinct by adults. They also have some data on differences in the type of errors made during the learning of nonsense words composed of English and non-English sequences of consonants, according to the increasing age of the children.
If durational information is also an aspect of such encoding, then firstly, it may be much more important than has been previously assumed to take the detailed distinctive features of the accompanying context into consideration in comparing durational differences, at least for young children, if not for adults also. Secondly, one might expect to find a more consistent relationship between the durations of the total cluster and the associated vowel than between the elements within a cluster, at a given speaking rate. This could be examined to some extent in the present data, but will be done more effectively on the more recently collected data, where vowel durations are more comparable between samples owing to the more controlled selection of final consonants.

The points remaining to be discussed chiefly concern aspects of the data which seem to be worth investigating further, but which do not in their present form afford any insight into the child's production of speech. A major problem already noted in the Results is the status of the aspiration period, particularly of voiceless stops, in clusters of stop + prevocalic /r/ (List 2). In clustered contexts, the aspiration segment of adults' voiceless stops increases as the articulation moves further back in the mouth (Peterson & Lehiste, 1960; Sharf, 1971). In both adult and children's clustered data, though, we found that the aspiration + /r/ segment was consistently longest with the alveolar stop. The same alveolar > velar > bilabial difference occurred with voiced consonants also, but here, in contrast with the voiceless stops, the velar-bilabial difference was generally not significant. Inspection of the acoustic records suggests that, for both voicing types, most if not all of this difference results from an increased period of aspiration or fricative /r/ in the homorganic cluster.
This is probably due to an effort to reduce the articulatory load and is an aspect of the affrication of such clusters mentioned earlier. It remains to be seen whether children of this age group and adults make similar modifications for both the aspiration period and the voiced portion of the /r/. Separate measures of these, and of the aspiration included in the preceding consonant measure, will be made on the present children's data and on their follow-up data, and on data to be collected with adults. It is not foreseen that the problem will then be solved, but at least we will have a firmer factual foundation on which to base our conclusions.

A further aspect of the aspiration period which should perhaps be investigated is whether children also produce the ratios of voiceless (aspirated) and voiced (vocalic) portions of vowels after voiceless stops which Peterson & Lehiste (1960) report for adults. (Assuming that differences in the overall duration of the vocalic segment following voiced and voiceless stops are produced correctly; see below.) This introduces the whole question of vowel durations, which has so far been ignored.

Coarticulatory effects extending into vowels are well documented, though somewhat less for durational aspects than for observed or inferred changes in vocal tract configurations. Naeser (1970b) has investigated children's acquisition of vowel duration in American English, and found that it is acquired very early, before 21 months in her study. It is produced at first independently from voicing differences in the final consonants which condition it in adult speech. [The pattern of results did not distinguish between whether this is physiologically conditioned or learned, but Peterson & Lehiste (1960) quote Zimmerman & Sapon (1958) as suggesting that differential vowel durations of this sort are language-specific.] This would suggest that we should not find differences in the relative durations of vowels between adults and children of three years and older in these contexts, at least in words which the child can produce easily (Naeser used only CVC nonsense words). There are some respects, however, in which we might reasonably expect a difference to occur. These are largely connected with Lehiste's (1972) notion of timing units being tied to major changes in manner of articulation.
(Her evidence comes from the temporal fusion of resonants with vowels to form single syllabic nuclei.) This implies that there should be less independence between, for example, /r/ and /ɪ/ of the word script than between /s, k, r/. This itself has not been shown for adults yet. If children have difficulty in achieving the adults' coordination of at least /l/, as we have hypothesized, then one would expect that for children there would be a difference between the degree of coordination and temporal fusion found within elements of a syllabic nucleus not including this articulation and one in which it did occur. In the latter, the semivowel might not fuse with the vowel to the same extent. Similarly, there might be a difference within children's productions of the more difficult syllabic nuclei according to whether the /l/ was well or poorly articulated. (Final clusters with postvocalic /l/ give particularly frequent examples of this.)

The measurements that were made of /l/ with and without the following or preceding vowel (Lists 1 and 4) do not on the whole show much difference between comparisons using the /l/ alone or combined with the vowel (Tables X, XI and XVI). This is consistent with Lehiste's hypothesis and implies that in this respect the children's speech probably did not differ significantly from the adults'. These measurements were made on the whole sample, though, treating identically both clear and imprecise /l/ articulations that were nevertheless acceptable. If the samples were divided according to the adequacy of /l/ production, one might find that with a poorly articulated /l/, the timing relationships conformed with the adults', but that the vowel and /l/ segments behaved more as independent units when /l/ was clearly and precisely produced. In Ohala's terms, the former situation would imply a timing-dominant system, and the latter an articulation-dominant system (1970, p. 143). If this difference were found with children, then it would suggest some degree of interplay between the two types of system during development. The subjective impression of an exaggerated rhythmicity and syllable structure in the speech of very young children, when they cannot yet produce many clusters of consonants, or even many singleton consonants, suggests that quite a rigid timing-dominant system might be very basic. It is possible that in acquiring the full articulatory repertoire of the language, the demands and limitations of such a system have to be overcome.

Appendix: Segmentation Criteria

This section will describe only the criteria used with the oscillographic material. Informal checks on reliability have been made and compare favourably with the literature (Peterson & Lehiste, 1960, p. 694; Naeser, 1969, p. 13). As far as possible, the criteria set out below were the only ones used in deciding where to segment the utterances. They can by no means be applied mechanically, however, and in practice a great deal of judgement is involved, particularly with some segment types. But with the same person doing all the segmentation, and with frequent checks between different examples of the same word spoken by the same and different subjects, the criteria can generally be consistently applied. This means that conclusions about differences in relative durations can usually be made quite validly.

Initial and final voiceless fricatives

The onset of all fricatives, clustered and unclustered, word-initially and word-finally, was taken to be where the high-pass filtered trace showed a rise in amplitude, coupled with the cessation in the unfiltered trace of periodic excitation in the preceding vocalic segment. Irregular and/or very low frequency voicing sometimes continued for a short time (approx. 10 ms), but this was included in the fricative measure if aperiodic excitation was clearly present (cf. Peterson & Lehiste, 1960).
If voicing occurred again within the defined fricative segment, it too was ignored. The end of all fricatives, except for the 'refined' measures of initial /f, s/ before /l/, was taken to be where the filtered trace fell to the baseline, or at the beginning of voicing in the following segment if the filtered trace did not reach the baseline before voicing began. The refined measure of initial /f/ before /l/ marked the division between the two segments at the sharp rise in amplitude and usually in frequency of the filtered trace. The end of the /s/ segment before prevocalic /l/ with the refined measures was fixed at a sharp and often high intensity spike burst in the filtered trace. This was usually visible in the unfiltered trace also, but less clearly. It was followed by a period of friction with similar intensity and frequency to the true /s/ segment. This was regarded as voiceless /l/.

Semivowels /l, r, w/

In word-initial, unclustered position, i.e. after /ə/, the onset of these liquids was measured from the point where there was a marked reduction in amplitude of the envelope, coupled with a decline in amplitude or complete absence of the higher frequency components of the waveform, and a general change in the waveshape. The segments were considered to end where the amplitude envelope was at or very near its maximum again (or at the point of maximum change in its growth in amplitude) and the high frequency components were mainly present but not necessarily at their maximum values. The sections thus delineated usually coincided with a severe amplitude reduction or complete absence of the signal in the filtered trace. Where the unfiltered signal was


ambiguous as to the correct segmentation point, this cue from the filtered one was used where possible.

In prevocalic clusters after fricatives, the beginning of the /l/ and /w/ segments was taken at the defined offset of the fricative, for refined and unrefined measures (see above). The end of the liquid was signalled, as before, by a rise in amplitude and change in shape of the waveform. For /l/, this was usually abrupt, but for /w/, the curve was much more gradual, and correspondingly more difficult to segment. Again, the rise in amplitude of the high frequency components, as registered in the filtered trace, was often critical in establishing the segmentation point.

In initial clusters after voiced and voiceless stops, the onset of prevocalic /r/ was defined as at the spike burst at the release of the stop (see below). Criteria for the end of the /r/ were as for prevocalic /w/.

Postvocalic /l/ was particularly difficult. Peterson & Lehiste (1960), who also noted the problem, found that third formant movements and the attainment of a steady low fundamental frequency were useful criteria in many examples. Relative intensity differences were also used. The first two criteria are only really applicable with spectrograms (as used by Peterson & Lehiste), but intensity differences were utilized wherever possible in the present study, along with the change in waveshape and reduction of high frequency components. As Peterson & Lehiste found, though, the changes were often very gradual and smooth, and establishment of a single boundary was often dubious. The problem was exacerbated with the children's speech, since postvocalic /l/ articulations are frequently very poorly formed. Additional measurements were also made, therefore, of the entire syllabic nucleus: from the beginning of the vowel, defined as the onset of periodic excitation in the case of else and help, or the defined end of the preceding consonant in the other cases, to the end of the /l/ segment.
This was taken to be the end of periodic excitation in the case of final /l/, or the defined beginning of the next consonant in the clustered cases.

Voiceless plosives

The stop T measures were taken from the transition off the preceding voiced segment (/ə/ or the syllabic nucleus of a word), for both initial and final stops. This was defined as the point of greatest change in the envelope of the preceding segment. Initial and final stop C measures were from the cessation of all detectable excitation in either trace, signalling the attainment of complete closure and minimal airflow. All voiceless stops were measured to the burst at the release of closure. Stops following initial or postvocalic fricatives were measured as for stop C (from the defined end of the fricative, signalled in the high-pass trace).

The aspiration period of voiceless plosives

The status of the aspiration period of voiceless plosives is equivocal, and no clear criteria have been developed for its assignment. The problem is not trivial, however. Lisker, for example, notes that "the decision as to the aspiration phase might well affect the magnitudes of reported consonant durations by as much as 75% (Lisker & Abramson, 1964)" (1972, p. 163). Peterson & Lehiste (1960, p. 694) and Lisker (1972, p. 163) suggest that its perceptual relevance in judgements of the durations of plosives and the following vocalic segments might provide the best solution, but this important point has received little investigation. Peterson & Lehiste incline towards the view that it should be included as part of the consonant, but advance arguments both for and against this.
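To make the size of this effect concrete, the arithmetic can be sketched as follows. This is only an illustration: the class `StopLabels`, the function `stop_duration` and the three time points are hypothetical and are not measurements from this study. The sketch assumes that each voiceless plosive has been labelled with a closure onset, a burst and the onset of voicing in the following segment.

```python
from dataclasses import dataclass


@dataclass
class StopLabels:
    """Hypothetical boundary labels for one voiceless plosive (times in ms)."""
    closure_ms: float   # attainment of complete closure
    burst_ms: float     # burst at the release of closure
    voicing_ms: float   # onset of periodic excitation in the next segment


def stop_duration(labels: StopLabels, aspiration_with_consonant: bool) -> float:
    """Consonant duration under either convention for the aspiration phase."""
    closure = labels.burst_ms - labels.closure_ms
    aspiration = labels.voicing_ms - labels.burst_ms
    # Either count the aspiration phase with the stop, or leave it to the
    # following vocalic segment and report the closure interval alone.
    return closure + aspiration if aspiration_with_consonant else closure


# Illustrative values only (assumed, not taken from the recordings).
example = StopLabels(closure_ms=100.0, burst_ms=180.0, voicing_ms=240.0)
without_aspiration = stop_duration(example, aspiration_with_consonant=False)
with_aspiration = stop_duration(example, aspiration_with_consonant=True)
increase_percent = 100.0 * (with_aspiration - without_aspiration) / without_aspiration
```

With these assumed values the reported consonant duration depends heavily on the convention chosen, which is why the assignment needs to be stated explicitly whenever plosive durations are compared across studies.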


In a later article, Lehiste (1972) reasonably included the aspiration period of initial voiceless plosives in the syllabic nucleus duration, in connection with her hypothesis that the production and perception of timing take place with reference to major changes in the manner of articulation. Certainly, the articulatory gestures involved are quite distinct from those required to produce the stop, so until we have firm perceptual evidence on the question, there is at least as much justification for excluding as for including it with the consonant, despite the traditional habit of including it. At the present moment, then, there is no clearly appropriate way in which to assign the aspiration phase. In this particular study, it was regarded as part of the vocalic segment, but it might have been better to regard it as part of the consonant.

Voiced plosives

These were only studied word-initially. Their onset was taken from the abrupt reduction in amplitude of the preceding /ə/, plus the absence of all high frequency components. This was very often marked by a sudden dip in fundamental frequency, signalled by a longer cycle in the waveform. The end of the segment, as for voiceless stops, was measured at the burst release. This was accompanied or followed soon after by a growth in amplitude of the periodic component of the following vowel or /r/, in most cases. Occasionally, no clear burst was present, in which case the latter cue was used for segmentation (given an absence of frication on release also).

Nasals /m, n/

These were only studied word-finally, after the vowel /ɪ/ or after postvocalic /l/. The onset was marked by a clear reduction in amplitude and change in waveshape (particularly after the vowel), with an absence or severe reduction of all high frequency excitation. As with voiced plosives, this was usually accompanied by a sudden dip in the frequency of the glottal cycles. This cue was particularly useful in deciding on the division between postvocalic /l/ and /m, n/. The point of offset of nasal articulations was defined as the end of all periodic excitation.
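The criteria in this appendix were applied by eye to oscillograms, with a good deal of judgement. Purely as an illustration of the logic of the fricative criteria given at the start of the appendix, the sketch below applies a simplified version to frame-level data. Everything in it is an assumption made for the example: the function names, the two thresholds, the 1 ms frame length and the idea of a per-frame periodicity flag are hypothetical, and the voicing-based fallback for the fricative offset is omitted.

```python
from typing import Optional, Sequence


def fricative_onset(
    hp_env: Sequence[float],
    periodic: Sequence[bool],
    frame_ms: float = 1.0,
    rise_threshold: float = 0.2,
    max_overlap_ms: float = 10.0,
) -> Optional[int]:
    """Index of the first frame taken to begin the fricative, or None.

    hp_env   : amplitude envelope of the high-pass filtered trace (assumed units)
    periodic : True where the unfiltered trace shows periodic excitation
    """
    max_overlap = int(round(max_overlap_ms / frame_ms))
    for i, amp in enumerate(hp_env):
        if amp < rise_threshold:
            continue  # the filtered trace has not risen yet
        # Count how long periodic excitation persists from this frame on.
        overlap = 0
        while i + overlap < len(periodic) and periodic[i + overlap]:
            overlap += 1
            if overlap > max_overlap:
                break
        # A short stretch of continuing voicing (about 10 ms) is tolerated
        # and counted as part of the fricative.
        if overlap <= max_overlap:
            return i
    return None


def fricative_offset(hp_env: Sequence[float], onset: int, baseline: float = 0.05) -> int:
    """First frame at or after onset where the filtered trace is below baseline."""
    for i in range(onset, len(hp_env)):
        if hp_env[i] < baseline:
            return i
    return len(hp_env)


# Hypothetical traces in 1 ms frames: vowel, fricative, silence.
hp_env = [0.02] * 20 + [0.5] * 60 + [0.0] * 10
periodic = [True] * 25 + [False] * 65  # voicing runs 5 ms into the friction

onset = fricative_onset(hp_env, periodic)
offset = fricative_offset(hp_env, onset)
duration_ms = (offset - onset) * 1.0
```

A fixed threshold of this kind cannot reproduce the judgement described above, particularly for gradual transitions such as postvocalic /l/, so the sketch should be read as a statement of the onset and offset rules rather than as a replacement for manual segmentation.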

The Test Words Used

As mentioned in the Method, more than one word was used to elicit the same consonant cluster with a given vowel. Usually, however, the vast majority of each sample consisted of utterances of only one or two different words. Those which were used only rarely (for some children, never) are listed in brackets.

List 1: Prevocalic /l, w/ preceded by /f, s/

flit (flip)

swim

swing (switch) (swish)

slim slip (slit) (slid) (sling) (slink)

lit lift lick (limp) (lip) (live) wing wish witch (whip) (wind) (wink)

fit fill fish fist (fin)

sit (sip) (sick) (sing) (sink)

sweet sweep (swede)

sleep

flag flat (flap) (flash) (flask)

slant (slap)

fled

flight (fly)

sledge

slight slice slide (sly)

seat sea (seed) (seal) (seize) lamp land last laugh (lad)

week weep (weed) (wheel)

less let left (led) (leg) light line (lie) (like) (life) (live)

fell felt (fed) (fetch) (fence) fight fire (fine) (find)

fat fast

leap (lean) (leak) (lead) (leaf) sat sad (sack) (sash) (sang) (sand) set sell said

sight (side) (sign) (sigh)

List 2: Prevocalic /r/ preceded by various stops

preach (preen)

breathe breeze

tree (treat)

reach read (real)

creep (cream) (creak) dream

peace peak (peach) (pea)

keep (key) beach (beak) (bead) (beam)

green (greed) train (trail)

brake

crate (crane)

teach tea (teeth)

deep

geese rail rake (rain) (race) (raise) (rage)

drake drain

tail (take) (taste)

cake (cave) (case) (cage)

bake

date (daze) (day)

pin pit pig (pick) (pitch)

tin (tip) (tim) (tit) (tick)

List 3: Initial three-segment clusters

spring


string

ring


List 4: Postvocalic /l/ followed by /p, t, f, s, m, n/

fell step help felt (sell) (yelp) (belt) (melt) (yell) (bell) (shell)

self (shelf)

else

film

kiln

List 5: Final /t/ preceded by /f, s/

lift fist (mist) (gift) (twist) (swift) (shift)

left

draught³ (raft) (craft)

deaf

fill (bill) (hill) (kill) (mill)

lit sit slit flit (hit) (bit) (fit) (pit)

swim slim (dim) (brim)

miss (kiss) (hiss)

rest (best) (nest) (chest)

set wet (as above)

less mess (as above)

fast last (past) (mast) (cast) (blast)

sat fat flat (cat) (pat) (mat) (hat)

grass glass (class) (brass)

set wet (pet) (yet) (net) (met) (let) less mess dress (press) (cress) (guess) pin tin (bin) (fin)

cliff (biff) (stiff) (sniff)

deaf

laugh (staff)

³ A problem with this sublist was that the children all pronounced the words with final fricatives with a long vowel /a:/, but had a short vowel for the group with final unclustered /t/ (/sat/ etc.). This had not been anticipated, since the experimenter used /a/ in all positions. (Since some children seemed uncertain what to say when asked to repeat, for instance, /ə fast/, the experimenter adopted their pronunciation for these words.)

I should like to acknowledge the help of Mark Haggard, my research supervisor, in carrying out this work.


References

Allen, G. D. (1969). The structure of timing in speech production. Paper presented to the 78th A.S.A. meeting, San Diego, U.S.A.
Allen, G. D. (1972a). Timing control in speech production: some theoretical and methodological issues. Paper presented to the Phonetics Symposium, University of Essex Language Centre, 11-13 Jan. Also in Language Centre's Occasional Papers 13, 170-201.
Allen, G. D. (1972b). The location of rhythmic stress beats in English: an experimental study I. Language and Speech 15(1), 72-100.
Allen, G. D. (1972c). The location of rhythmic stress beats in English: an experimental study II. Language and Speech 15(2), 179-195.
Bondarko, L. V. (1969). The syllable structure of speech and distinctive features of phonemes. Phonetica 20, 1-40.
Boomer, D. S. & Laver, J. D. M. (1968). Slips of the tongue. British Journal of Disorders of Communication.
Bradshaw, J. L. (1970). Phonetic homogeneity and articulatory lengthening. British Journal of Psychology 61, 499-507.
Chistovich, L. A. & Kozhevnikov, V. A. (1965). Speech: articulation and perception. Washington D.C.: Joint Publications Research Service, JPRS 30,543.
Compton, A. J. (1970). Generative studies of children's phonological disorders. Journal of Speech and Hearing Disorders 35, 315-339.
Crocker, J. R. (1969). A phonological model of children's articulation competence. Journal of Speech and Hearing Disorders 34, 203-213.
Denes, P. (1955). The effect of duration on the perception of voicing. Journal of the Acoustical Society of America 27, 761-764.
Fromkin, V. (1971). The nonanomalous nature of anomalous utterances. Language 47, 27-52.
Fry, D. B. (1970). Reaction-time Experiments in the Study of Speech Processing. Nouvelles Perspectives en Phonetique, Institut de Phonetique, Universite Libre de Bruxelles: Conferences et Travaux, I, 15-35.
Fujimura, O. (1961). Bilabial stop and nasal consonants: A motion-picture study and its acoustical implications. Journal of Speech and Hearing Research 4, 233-247.
Gibson, E. J. & Guinet, L. (1971). Perception of inflections in brief visual presentations of words. Journal of Verbal Learning and Verbal Behaviour 10, 182-189.
Gregorski, R. & Shockey, L. (1972). A note on temporal compensation. Working Papers in Linguistics, No. 12, Computer and Information Science Research Center, Ohio State University, pp. 87-88.
Haggard, M. P. (1971). Some effects of clusters on segment durations, and a preliminary model. Speech Synthesis and Perception, Progress Report No. 5, Psychological Laboratory, Cambridge, pp. 1-50.
Haggard, M. P. (1972). Temporal coordination-supplementary data. Speech Synthesis and Perception, Progress Report No. 6, Psychological Laboratory, Cambridge, pp. 26-27.
House, A. S. & Fairbanks, G. (1953). The influence of consonant environment on the secondary acoustic characteristics of vowels. Journal of the Acoustical Society of America 25, 105-113.
Huggins, A. W. F. (1968). The perception of timing in natural speech I: Compensation within the syllable. Language and Speech 11, 1-11.
Huggins, A. W. F. (1972). On the perception of temporal phenomena in speech. Journal of the Acoustical Society of America 51, 1279-1290.
Irwin, O. C. (1947). Data cited in G. A. Miller, Language and Communication, p. 144 (chap. 7). New York: McGraw-Hill, 1951.
Irwin, O. C. (1948). Infant speech. Journal of Speech and Hearing Disorders 13, 224-225, 320-326. Cited by E. H. Lenneberg, The Biological Foundations of Language, chap. 4, London: Wiley, 1967.
Jakobson, R. (1968). Child Language, Aphasia, and Phonological Universals. Janua Linguarum, Series Minor, 72, The Hague: Mouton.
Kornfeld, J. R. (1971). What initial clusters tell us about a child's speech code. Q.P.R., Research Laboratory of Electronics, M.I.T., No. 101, pp. 218-221.
Kreshek, J., Fisher, H. & Rutherford, D. (1972). A study of /r/ phones in the speech of 3-year-old children. Folia Phoniatrica 24, 301-312.
Lehiste, I. (1970a). Suprasegmentals. Cambridge, Mass. & London: MIT Press.
Lehiste, I. (1970b). Temporal organisation of spoken language. Working Papers in Linguistics, No. 4, Technical Report No. 70-26, Computer and Information Science Research Center, Ohio State University, pp. 96-114.
Lehiste, I. (1972). Manner of articulation, parallel processing, and the perception of duration. Paper given at the Phonetics Symposium, University of Essex Language Centre, 11-13 January. Reprinted in the Language Centre's Occasional Papers 13, 1-24. Also in Working Papers in Linguistics No. 12, Computer and Information Science Research Center, TR-72-6, Columbus, Ohio, pp. 33-52.


Lisker, L. (1972). On time and timing in speech. In (T. Sebeok, Ed.) Current Trends in Linguistics, Vol. 12, The Hague: Mouton.
Locke, J. L. (1972). Ease of articulation. Journal of Speech and Hearing Research 15, 194-200.
MacKay, D. G. (1969). Forward and backward masking in motor systems. Kybernetik 6, 57-64.
MacKay, D. G. (1972). The structure of words and syllables: evidence from errors in speech. Cognitive Psychology 3, 210-227.
MacNeilage, P. F. & deClerk, J. (1969). On the motor control of coarticulation in CVC monosyllables. Journal of the Acoustical Society of America 45, 1217-1233.
McReynolds, L. V. & Huston, K. (1971). A distinctive feature analysis of children's misarticulations. Journal of Speech and Hearing Disorders 36, 155-167.
Mattingly, I. G. (1968). Synthesis by rule of General American English. Haskins Labs. Status Report Supplement.
Menyuk, P. (1968). The role of distinctive features in children's acquisition of phonology. Journal of Speech and Hearing Research 11, 138-146.
Menyuk, P. (1971). Clusters as single underlying consonants: Evidence from children's productions. International Congress of Phonetic Sciences, August 1971, Montreal.
Menyuk, P. & Klatt, D. H. (1968). The child's production of initial consonant clusters. Q.P.R., Research Laboratory of Electronics, M.I.T., No. 91, pp. 205-213.
Moskovitz, A. I. (1970). The 2 year old stage in the acquisition of English phonology. Language 46, 426-441.
Naeser, M. A. (1969). Criteria for the segmentation of vowels on duplex oscillograms. Technical Report No. 124, Wisconsin Research and Development Center for Cognitive Learning, University of Wisconsin, Madison.
Naeser, M. A. (1970a). Influence of initial and final consonants on vowel duration in CVC syllables. Technical Report No. 130, Wisconsin Research and Development Center for Cognitive Learning, University of Wisconsin, Madison.
Naeser, M. A. (1970b). The American child's acquisition of differential vowel duration. Technical Report No. 144 (in 2 parts), Wisconsin Research and Development Center for Cognitive Learning, University of Wisconsin, Madison.
Nooteboom, S. G. (1967). Some regularities in phonemic speech errors. I.P.O. Annual Progress Report, No. 2, Eindhoven.
Nooteboom, S. G. (1969). The tongue slips into patterns. Nomen: Leyden studies in Linguistics and Phonetics (A. G. Sciarone et al., Eds.) The Hague: Mouton, pp. 114-132. (Cited by Fromkin, 1971.)
Ohala, J. J. (1970). Aspects of the control and production of speech. Working Papers in Phonetics, No. 15, UCLA.
Ohala, J. J. (1972). The regulation of timing in speech. Paper presented to the 1972 Conference on Speech Communication and Processing, Newton, Mass.
Ohman, S. E. G. (1966). Coarticulation in VCV utterances: spectrographic measurements. Journal of the Acoustical Society of America 39, 151-168.
Ohman, S. E. G. (1967). A numerical model of coarticulation. Journal of the Acoustical Society of America 41, 310-320.
Peterson, G. E. & Lehiste, I. (1960). Duration of syllable nuclei in English. Journal of the Acoustical Society of America 32, 693-703.
Port, D. K. & Preston, M. S. (1972). Early apical stop production: a voice onset time analysis. Haskins Labs. Status Report pp. 125-149.
Sander, E. K. (1972). When are speech sounds learned? Journal of Speech and Hearing Disorders 37, 55-63.
Sharf, D. J. (1971). Perceptual parameters of consonant sounds. Language and Speech 14, 169-177.
Slis, I. H. (1971). Articulatory effort and its durational and EMG correlates. Phonetica 23, 171-188.
Slis, I. H. (1972). The influence of articulatory effort on the timing of speech. Paper presented to the Phonetics Symposium of the University of Essex Language Centre, 11-13 January. Reprinted in the Language Centre's Occasional Papers 13, 128-150.
Snow, K. (1963). A detailed analysis of articulation responses of 'normal' first grade children. Journal of Speech and Hearing Research 6, 277-290.
Snow, K. (1964). A comparative study of sound substitutions used by 'normal' first grade children. Speech Monographs 31, 135-141.
Stevens, K. N. & House, A. S. (1955). Development of a quantitative description of vowel articulation. Journal of the Acoustical Society of America 27, 484-494.
Templin, M. C. (1957). Certain language skills in children. Institute of Child Welfare Monographs, No. 26, Minneapolis, University of Minnesota.
Wickelgren, W. A. (1965a). Acoustic similarity and intrusion errors in short term memory. Journal of Experimental Psychology 70, 102-108.


Wickelgren, W. A. (1965b). Distinctive features and errors in short term memory for English vowels. Journal of the Acoustical Society of America 38, 583-588.
Wickelgren, W. A. (1966). Distinctive features and errors in short term memory for English consonants. Journal of the Acoustical Society of America 39, 388-398.
Zimmerman, S. A. & Sapon, S. M. (1958). In Journal of the Acoustical Society of America 30, 152-153. Cited by Peterson & Lehiste (1960).