Speech patterns of children and adults elicited via a picture-naming task: An acoustic study


Speech Communication 32 (2000) 267–285

www.elsevier.nl/locate/specom

S.P. Whiteside a,*, C. Hodgson b

a Department of Human Communication Sciences, University of Sheffield, Sheffield S10 2TA, UK
b Stanley Health Centre, Durham DH9 0XE, UK

Received 8 June 1999; received in revised form 12 December 1999; accepted 12 January 2000

Abstract

This brief study presents some acoustic phonetic characteristics that reflect both the voice characteristics and motor speech behaviour of 20 pre-adolescent boys and girls (6-, 8- and 10-year-olds) and 9 adults, in speech data that were elicited via a picture-naming task. The acoustic phonetic characteristics that were investigated included formant frequency values, coarticulation (or gestural overlap) and temporal patterns. Both voice characteristics and motor speech behaviour presented evidence of age and sex differences, and age by sex interactions. In addition there were significant correlations between formant frequencies and their associated formant frequency changes (or excursions). There was also evidence of individual differences in the patterns of maturation, which did not conform to chronological age. These data are presented and discussed with reference to the sexual dimorphism of the vocal apparatus, the development of vocal characteristics, and motor speech development and behaviour. © 2000 Elsevier Science B.V. All rights reserved.

Zusammenfassung

Diese kurze Studie stellt einige akustisch phonetische Charakteristiken dar, die sowohl die Stimmcharakteristiken, als auch die Sprachmotorik von 20 Jungen und Mädchen vor dem Jugendalter (6-, 8- und 10-jährige) und 9 Erwachsenen reflektieren. Die zugrundeliegenden Daten wurden mit Hilfe von Bildbeschreibungen erstellt. Die untersuchten akustisch phonetischen Charakteristiken beinhalteten Formantfrequenzwerte, Koartikulation (oder Überlagerung mit Gestik) und zeitliche Schemata. Sowohl die Stimmcharakteristiken als auch die Sprachmotorik zeigten Alters- und Geschlechtsunterschiede auf und auch Wechselwirkungen zwischen Alter und Geschlecht. Zusätzlich wurden bedeutende Korrelationen zwischen den Formantfrequenzen und den damit verbundenen Formantfrequenzänderungen festgestellt. Es wurden außerdem Unterschiede in den individuellen Reifeprozessen aufgezeigt, die nicht mit dem chronologischen Alter übereinstimmten. Diese Daten werden unter Bezug auf den geschlechtlichen Dimorphismus des Stimmapparates, die Entwicklung der Stimmcharakteristiken und die Entwicklung der Sprachmotorik dargestellt und erörtert. © 2000 Elsevier Science B.V. All rights reserved.

Résumé

Cette courte étude présente certaines caractéristiques de la phonétique acoustique qui reflètent et les caractéristiques de la voix et les mouvements articulatoires de 20 pré-adolescents (âgés de 6, 8 et 10 ans) et de 9 adultes. Les données étudiées ont été obtenues à l'aide d'un exercice où les sujets désignaient le nom d'objets présentés sous forme d'images.

* Corresponding author. E-mail address: s.whiteside@sheffield.ac.uk (S.P. Whiteside).

0167-6393/00/$ - see front matter © 2000 Elsevier Science B.V. All rights reserved. PII: S0167-6393(00)00013-3


Les caractéristiques de la phonétique acoustique examinées comprennent les valeurs de la fréquence des formants, la coarticulation, le chevauchement gestuel et les patrons temporels. Les caractéristiques vocales et les mouvements articulatoires nous révèlent des différences selon l'âge et le sexe, et des interactions entre l'âge et le sexe. En plus, il y avait d'importantes corrélations entre les fréquences des formants et la variation (ou bien les déviations) associée à ces fréquences. On a constaté également des différences individuelles quant aux modes de maturation qui ne se conformaient pas à l'âge strictement chronologique des sujets. Nous rapportons et discutons de ces données par rapport au dimorphisme sexuel de l'appareil vocal, du développement de caractéristiques vocales et du mécanisme de la parole et de son comportement. © 2000 Elsevier Science B.V. All rights reserved.

Keywords: Motor speech development; Formant frequencies; Temporal parameters; Age, sex and individual differences

1. Introduction

During the development of speech production the child is faced with mastering perceptual and motor skills that eventually become the highly coordinated and automatized input and output schemas in adult perceptual and production systems. The process of developing these fine motor speech skills, which is the focus of this discussion, co-occurs with the maturation of the vocal apparatus. The development of fine motor skills is underpinned by skilled control over the timing and co-ordination of the articulators. Evidence for the graduation towards this goal is documented by some researchers to include the babbling behaviour of pre-lexical infants (Davis and MacNeilage, 1994; Locke et al., 1995). In addition, the patterns of development of motor speech behaviour in the speech of some 2–3-year-old children (Davis and MacNeilage, 1990; Stoel-Gammon, 1983) have included samples which show the juxtaposition of similar articulatory postures. It is still not clear, however, exactly when the child achieves adult or near-adult levels of independent articulatory gestures (Locke, 1997), but there is some evidence to suggest that a significant amount of the graduation towards this goal occurs approximately during the third year of life (Goodell and Studdert-Kennedy, 1993). In addition there are other accounts which suggest that children are continually developing and refining their speech motor skills between the ages of 3 and 7 (e.g., Sharkey and Folkins, 1985; Sereno et al., 1987; Nittrouer et al., 1989, 1996; Sussman et al., 1992; Nittrouer, 1993). Furthermore, there are some accounts which suggest that the development of motor speech skills continues after the age of 7

years and that "the maturation of the motor skills in speech is not completed until the child enters puberty" (Kent, 1976).

The development of skilled motor speech schemas involves mastering both the timing and coordination of articulators to produce smooth and overlapping yet distinct gestures, which in turn produce acoustic patterns that can be recovered perceptually from the speech signal. The development of these overlapping yet distinct gestures includes the ability to master coarticulatory gestures or gestural overlap (e.g., Goodell and Studdert-Kennedy, 1993), which lend themselves to the economy of gestures in speech production (Lindblom, 1983). Studies of the development of coarticulation patterns in children have reported a range of coarticulatory behaviours. For example, anticipatory coarticulation effects of vowels on the preceding consonant have been found to exceed those of adult values (e.g., Nittrouer et al., 1989). In addition, although coarticulatory effects have been found to be similar for both children aged 3–7 years and adults, these effects were found to be less consistent for the children (e.g., Sereno et al., 1987). These inconsistencies in the speech patterns of children may be indicative of the process of refining speech motor skills. Further evidence for this process of refinement of speech motor skills comes from Nittrouer et al. (1989), who found that children were able to produce more spectral contrasts between the fricatives /s/ and /ʃ/ from the ages of 3 to 7 years. Here, they found decreases in the frequency values of the /ʃ/ centroids, in addition to increases in fricative ratios (si/ʃi and su/ʃu), with increasing age. This pattern of development is explained as being the result of the development of the


constriction shape of lingual gestures required for the production of /s/ and /ʃ/. Alongside this, however, the children in their study also exhibited a reduction of vowel contextual effects on the preceding fricative (Nittrouer et al., 1989), which was replicated in a later study (Nittrouer et al., 1996). This reduction in phonetic contrast or decreased coarticulation compared to adults is explained as evidence for children organising their articulatory gestures over the course of a syllable. Nittrouer et al. (1989) suggest that it is only gradually that children begin to produce articulatory gestures that are more segmental in nature, such as those exhibited by adults.

In another study, Sussman et al. (1992) investigated whether F2 locus equations were functional in capturing developmental patterns in the CV coarticulation of /bVt/, /dVt/ and /gVt/ syllables of 3–5-year-old children. By plotting F2 midpoints against corresponding F2 onset points to establish simple regression functions, they were able to investigate the degree of coarticulation, where steep and shallow regression slopes were interpreted as being indicative of higher and lower levels of coarticulation, respectively. They found that for all the children's data, the slopes were significantly different between labial (/b/) and alveolar (/d/), in addition to differences between alveolar (/d/) and velar (/g/) places of articulation. No significant differences were found between the slopes of labial and velar places of articulation. In order to capture any possible age-related changes in F2 locus patterns, the data of the 4- and 5-year-olds in their study were compared. Results showed no significant age-related changes for either slope or y-intercepts. However, when the slope data of the 4- and 5-year-olds were analysed separately, they found that the slopes of the 5-year-olds exhibited more contrast, with significantly different slopes across all three places of articulation.
This was distinct from the data of the 4-year-olds, which had trends reflecting the overall group results (/b/ versus /d/ and /g/ versus /d/ but not /b/ versus /g/ contrasts). No sex differences were found for the children's slope means or y-intercept means. By comparing results taken from a previous study (Sussman et al., 1991), Sussman et al. (1992) also noted that the adult slope data exhibited more of a /b/ versus /g/ separation than those of the children, and that the slope data for the labial /b/ came closest to matching adult data. However, for alveolar /d/, the children displayed variable but sometimes much higher slope values, and therefore patterns of greater anticipatory coarticulation, than the more consistently lower slope values of the adults. Sussman et al. (1992) explain their results as evidence for the mastering of articulatory skills which require either the reduction or maintenance of coarticulatory kinematics in specific phonetic contexts.

The range in coarticulation patterns that have been reported for the speech of children (Nittrouer, 1993; Nittrouer et al., 1989, 1996; Sereno et al., 1987; Sussman et al., 1992) suggests that children adopt varying patterns of gestural coordination and overlap, which may be so for a number of reasons. The first of these may be the different phonetic contexts, and therefore the different gestural and articulatory manoeuvres, that may be required in speech production. For example, children and adults show similarities in coarticulation patterns of consonant–vowel sequences where consonants do not involve lingual gestures, such as /b/ (Sussman et al., 1992). This could be explained by the fact that the production of /bV(t)/ syllables involves articulatory gestures that are not mechanically linked (i.e. involve separate articulators), and are therefore more readily mastered independently, and interphased more efficiently, than those consonant–vowel sequences which are mechanically linked and both require lingual gestures, such as /dV(t)/ syllables (Sussman et al., 1992) and /ʃiʃi/, /ʃuʃu/, /sisi/, /susu/ syllables (Nittrouer et al., 1989, 1996).
This explanation has been suggested previously (Nittrouer et al., 1996) and may go some way towards explaining the distinct child–adult differences in consonant–vowel coarticulation patterns involving acoustic CV patterns associated with /d/ (Sussman et al., 1992) and /s, ʃ/ (Nittrouer et al., 1989, 1996). It has been suggested that some coarticulation patterns are evidence for "advanced speech production skills whereas others may be a sign of articulatory immaturity" (Repp, 1986), and this statement also goes some way towards explaining the wide range of


reports on coarticulation patterns in children's speech. It is therefore necessary to stress that for some phonetic contexts (e.g., /bɑː/), high levels of coarticulation may be evidence for increased motor efficiency, while for other phonetic contexts (e.g., /diː/), high levels of coarticulation may be evidence for maturing motor speech skills.

While some studies have used picture-naming paradigms in eliciting speech data (Nittrouer, 1993), many of the studies reporting on children's coarticulation patterns have largely adopted repetition or imitation paradigms in eliciting their data. These methods of elicitation attempt to maintain maximum control over the data elicited, in order to maximise the similarities between the speech data of children and adults, but they leave little room for self-paced speech output. A picture-naming task was therefore employed in this study to elicit speech from both the children and adults investigated here.

A factor that deserves investigation in the examination of developmental patterns of formant frequencies, their patterns of change (or excursion), and associated temporal patterns, is that of sex-related developmental differences. For example, although there are many accounts in the literature that document sex-related differences of formant frequencies in adults (Childers and Wu, 1991; Peterson and Barney, 1952), fewer have documented sex-linked developmental patterns in children (Bennett, 1981; Busby and Plant, 1995). Furthermore, although there is evidence in the literature to show that women display longer utterance durations, and slower articulation rates in experimental situations (Byrd, 1992, 1994; Swartz, 1992; Whiteside, 1996), less is known about sex-linked developmental patterns in the temporal structure of children's speech. Further related areas pertain to patterns of coarticulation, assimilation and phonological distinction.
There is evidence from the adult literature which suggests that women show less phonetic reduction, and therefore maintain greater phonological distinction, than men (Byrd, 1992, 1994; Whiteside, 1996). Following from this, one might predict that male children may display higher levels of coarticulation compared to their female peers. The question posed, therefore, is whether these sex-linked

coarticulation patterns are present in developmental data. There is some evidence to suggest that no sex-linked differences occur in the coarticulation patterns represented by F2 locus equations in the speech of 3–5-year-old children (Sussman et al., 1992). However, little is known about sex-linked differences that may be present in the coarticulation patterns of older, preadolescent children, who were the focus of the present study.

The aim of this brief study was therefore to investigate speech patterns elicited via a picture-naming task in three groups of preadolescent male and female children aged 6, 8 and 10 years, and a group of men and women, and thereby add to the developmental data on preadolescent children. Voice characteristics and motor speech behaviour of the four groups were examined through the investigation of formant frequency and temporal patterns, in speech data that comprised a selection of both consonants and vowels. The formant frequency patterns included formant frequencies and their patterns of change between the acoustic events of consonants and vowels. Temporal structures that were investigated included the total duration of utterances, the total duration of pauses, the incidence and patterns of pauses, total speaking time and articulation rate. In addition, the relationships between the temporal patterns and associated formant frequency changes were examined to gauge patterns of acoustic and motor behaviour. The study therefore aims to answer the following specific questions: (i) Are there developmental (age) differences in the data of a group of preadolescent children? If so, how do these age differences manifest themselves?; (ii) Are there sex differences in the data?
If so, what is the nature of these differences?; (iii) Are there any patterns in the speech data that are due to an interaction between factors of age and sex?; (iv) What are the relationships between formant frequencies and their associated formant frequency changes (excursions), and do these relationships inform us about the dynamic patterns of speech production and coarticulation and gestural overlap?; and (v) How do the temporal and formant frequency patterns serve to inform us about motor speech planning, initiation and execution?
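The locus-equation approach of Sussman et al. (1992) summarised earlier in this section can be sketched as a simple linear regression. The sketch below assumes the usual convention of regressing F2 onset on F2 midpoint; the function name and the F2 values are hypothetical and serve only to show how a slope is obtained, not to reproduce any published result.

```python
import numpy as np


def locus_equation(f2_mid_hz, f2_onset_hz):
    """Fit F2_onset = slope * F2_mid + intercept by ordinary least squares."""
    x = np.asarray(f2_mid_hz, dtype=float)
    y = np.asarray(f2_onset_hz, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)  # degree-1 polynomial fit
    return float(slope), float(intercept)


# Hypothetical F2 values (Hz) for one speaker and one stop place of articulation.
# They are made up for illustration and lie exactly on onset = 0.8 * mid + 300.
f2_mid = [900.0, 1200.0, 1500.0, 1800.0, 2100.0]
f2_onset = [1020.0, 1260.0, 1500.0, 1740.0, 1980.0]

slope, intercept = locus_equation(f2_mid, f2_onset)
print(f"slope = {slope:.2f}, intercept = {intercept:.1f} Hz")
```

Under the interpretation reported above, a steeper slope would be read as a higher level of coarticulation between the consonant and the following vowel.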


2. Methodology

2.1. Subjects

Three groups of 6-, 8- and 10-year-old children and one group of adults participated in the study. The three groups of children were made up as follows: six 6-year-olds (mean age 6.0 years), six 8-year-olds (mean age 8.2 years) and eight 10-year-olds (mean age 10.0 years). Equal numbers of male and female children participated in each age group. The 20 children used in the study were matched for height and weight within their age bands, by selecting children that were of average build within their age groups. The group of nine adults consisted of five women (mean age 37.6 years) and four men (mean age 37.1 years). All subjects were from the same geographical area in the North East of England, and had a non-rhotic accent. All subjects were monolingual speakers of English, were healthy, and had no speech, language or hearing difficulties.

2.2. Data collection

The recordings were made in a quiet room using a portable DAT recorder and a stereo microphone. The microphone was laterally offset at a distance of 30 cm (12 inches) from the subject. Picture cards were used to elicit a possible total of nine target phrases from each of the subjects. The target phrases were The red/green/blue bar/jar/car. The choice of phrases was dictated by the need to use objects that were imageable and therefore easily named by the children in the picture-naming task. Also included in the elicitation were four distracters, for example, The red boat, The green balloon. Prior to the actual collection of the data, subjects were asked to name the colours and the pictures separately. This was done to establish picture names and therefore avoid confusion during the recording session.

2.3. Acoustic analyses

The elicited data were digitised with a sampling rate of 10 kHz using a Kay Computerised Speech Lab (CSL-Model 4300). All acoustic analyses were


performed on the Kay CSL. In order to assess both voice characteristics and motor speech behaviour, the following selected acoustic parameters were obtained from the speech data.
(i) The second formant (F2) and third formant (F3) frequency values for schwa preceding /r/.
(ii) F2 and F3 values for /r/.
(iii) F2 and F3 values for /e/.
(iv) F2 onset and midpoint values for /ɑː/ in the syllables /bɑː/, /dʒɑː/ and /kɑː/. It is important to highlight at this stage that the phrase-final vowel /ɑː/ had no r-colouring in the accent of the subjects investigated in this study.
(v) F2 and F3 changes between consonants and vowels.
(vi) A number of temporal parameters.
The full details of the parameters and analyses are outlined below.

2.4. Schwa formant frequencies

The second formant (F2) has been found to be a reliable indicator of the anterior versus posterior position of lingual gestures (Bladon and Al-Bamerni, 1976; Nittrouer et al., 1989; Recasens, 1987, 1991; Sussman et al., 1992, 1998). F2 data have therefore been used by numerous investigators to gauge the extent of gestural overlap between consonants and vowels in speech production. In addition, F2 data have been found to be reliable in modelling coarticulatory effects on schwa (e.g., van Bergem, 1994). Therefore, in order to determine the degree of coarticulation with schwa, F2 values for schwa were calculated for each phrase using a facility on the CSL which averages LPC coefficients over a specified frame length from a selected point. A frame length of 10 ms was used and the selected point was the temporal midpoint of schwa. The temporal midpoint of schwa was determined by locating the temporal midpoint of the speech pressure waveform. In addition, wideband FFT spectrograms were generated to validate the measures obtained from the LPC analyses.
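As a rough illustration of the kind of measurement described above, the sketch below takes a 10 ms frame centred on a chosen midpoint of a signal sampled at 10 kHz, fits a linear predictor, and reads resonance frequencies from the roots of the predictor polynomial. It is not the CSL's procedure: the least-squares fit, the function names and the model order are assumptions made for illustration, and real speech would normally also need pre-emphasis, windowing and a higher model order.

```python
import numpy as np


def extract_frame(signal, fs, t_mid, frame_ms=10.0):
    """Return a frame of length frame_ms centred on time t_mid (seconds)."""
    half = int(round(fs * frame_ms / 1000.0 / 2.0))
    centre = int(round(t_mid * fs))
    start = max(centre - half, 0)
    return np.asarray(signal[start:centre + half], dtype=float)


def lpc_least_squares(frame, order):
    """Fit x[n] ~ sum_k c[k] * x[n-k], k = 1..order, by least squares."""
    # Each row holds the `order` samples that precede x[n], most recent first.
    past = np.array([frame[n - order:n][::-1] for n in range(order, len(frame))])
    coeffs, *_ = np.linalg.lstsq(past, frame[order:], rcond=None)
    # Polynomial z^order - c1 z^(order-1) - ... - c_order, highest power first.
    return np.concatenate(([1.0], -coeffs))


def resonances_hz(lpc_poly, fs):
    """Convert complex roots of the predictor polynomial to frequencies in Hz."""
    roots = np.roots(lpc_poly)
    roots = roots[np.imag(roots) > 0]  # keep one root of each conjugate pair
    return np.sort(np.angle(roots) * fs / (2.0 * np.pi))


# Synthetic test signal (not speech): two damped cosines at 500 Hz and 1500 Hz.
fs = 10000
n = np.arange(400)
signal = (0.995 ** n) * np.cos(2 * np.pi * 500 * n / fs) \
    + (0.99 ** n) * np.cos(2 * np.pi * 1500 * n / fs + 0.3)

frame = extract_frame(signal, fs, t_mid=0.01)  # 100 samples at 10 kHz
print(resonances_hz(lpc_least_squares(frame, order=4), fs))
```

A least-squares predictor is used here because it recovers the two resonances of this noise-free synthetic signal exactly; an autocorrelation-based LPC with a tapered window is the more common choice for recorded speech.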
F2 is less sensitive to changes in the degree and nature of some lingual constrictions and, in some cases, may therefore not be sensitive enough to gauge the gestural activity of some consonants that require specific types of constriction in their production. A group of consonants that fall into this category are consonants produced with some degree of retroflexion. Some


accounts suggest consonants such as the alveolar approximant /r/ may be produced either by using some degree of retroflexion or by bunching the tongue (Laver, 1994). The approximant /r/ is acoustically characterised by a lowering of the third formant (Dalston, 1975; Pickett, 1980), an acoustic characteristic that has been employed in synthesizing stimuli for experiments investigating its perception (Slawinski and Fitzgerald, 1998). Along these lines, it has been shown that in the case of retroflex plosives, the third formant (F3) is a more reliable indicator of coarticulation (or gestural overlap) than F2 (e.g., Krull and Lindblom, 1996; Krull et al., 1995). Therefore, in order to examine the degree of gestural overlap of schwa with /r/, F3 values, in addition to those of F2, were calculated for schwa in the same way for the phrases beginning 'The red ___'.

2.5. Formant frequencies of /r/

The midpoint for /r/ was located in 'red'. LPC values were calculated as described for schwa above. F2 and F3 values were also obtained for /r/ in 'red'.

2.6. Formant frequencies of /e/

The midpoints of /e/ were located in 'red', using wideband spectrograms. LPC values were calculated as described above. Both F2 and F3 values were recorded from the calculations.

2.7. F2 onset and midpoint data for CV syllables

Formant frequency values for the onsets and midpoints of the vowel /ɑː/ were calculated using the LPC method as follows: for jar (/ˈdʒɑː/), bar (/ˈbɑː/) and car (/ˈkɑː/), F2 values were calculated at two discrete points, namely at the onset of the vowel and the midpoint of the vowel.

2.8. Examining coarticulation patterns using formant frequency changes

Coarticulation patterns (gestural overlap) were investigated by calculating formant frequency changes, where less change was to be interpreted as

evidence for greater levels of coarticulation or gestural overlap. Patterns of formant frequency changes for the following acoustic events associated with the consonants and vowels were investigated.
(i) To monitor gestural overlap between schwa and /r/ in phrases beginning 'The red ___', F2 and F3 changes between schwa and the alveolar approximant /r/ in 'red' were calculated. These were labelled as R1 and R2, respectively.
(ii) To monitor gestural overlap between /r/ and /e/ in 'red', F2 and F3 changes between /r/ and /e/ were calculated. These were labelled as R3 and R4, respectively.
(iii) To monitor the extent of gestural overlap in the syllables 'bar' (/ˈbɑː/), 'jar' (/ˈdʒɑː/) and 'car' (/ˈkɑː/), the F2 changes between the onset and midpoint values were calculated. These were labelled as BarF2, JarF2 and CarF2, respectively.

2.9. Temporal parameters

The following temporal parameters were measured for each subject:
(i) Total utterance durations for each phrase, measured from the start to the end of the speech pressure waveform.
(ii) The duration of pauses after 'The'.
(iii) Pauses preceding 'bar'/'jar'/'car'; this included periods of silence and/or total acoustic closures.
(iv) The total duration of pauses was then calculated for each phrase.
(v) The total duration of pauses was then subtracted from the total utterance duration to give a measure of total speaking time.
(vi) The total speaking time was then used to calculate articulation rate as the number of syllables per second.

2.10. Relationships between acoustic parameters

Relationships between formant frequency parameters, formant frequency change parameters, and temporal acoustic parameters found in this study were examined for all groups combined. This was done to move away from group differences and draw out more of the individual differences for the subjects in this study. The details of how these comparisons were done are defined below.
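The derived temporal measures of Section 2.9 follow from simple arithmetic on the measured durations, and can be sketched as below. The class and field names are hypothetical, the example durations are invented for illustration, and the syllable count of three assumes that each target phrase ('The' + colour + object) is produced as three monosyllables.

```python
from dataclasses import dataclass

SYLLABLES_PER_PHRASE = 3  # assumption: e.g. "The red bar" = the + red + bar


@dataclass
class PhraseTiming:
    """Measured durations for one phrase, in seconds (hypothetical container)."""
    utterance_s: float          # (i) start to end of the waveform
    pause_after_the_s: float    # (ii) pause after "The"
    pause_before_noun_s: float  # (iii) silence and/or closure before bar/jar/car

    @property
    def total_pause_s(self) -> float:
        # (iv) total duration of pauses
        return self.pause_after_the_s + self.pause_before_noun_s

    @property
    def speaking_time_s(self) -> float:
        # (v) utterance duration minus total pause duration
        return self.utterance_s - self.total_pause_s

    @property
    def articulation_rate(self) -> float:
        # (vi) syllables per second of speaking time
        return SYLLABLES_PER_PHRASE / self.speaking_time_s


# Invented durations, for illustration only.
timing = PhraseTiming(utterance_s=1.5, pause_after_the_s=0.1, pause_before_noun_s=0.2)
print(timing.total_pause_s, timing.speaking_time_s, timing.articulation_rate)
```

Because pauses are removed before the rate is computed, two speakers with the same utterance duration can still differ in articulation rate if one of them pauses for longer.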


2.11. Temporal patterns and associated formant frequency change parameters (FFCPs)

Formant frequency change parameters (R1, R2, R3, R4, BarF2, JarF2 and CarF2) and associated temporal patterns, such as the durations of relevant phrases and pauses, were examined using Pearson's product moment correlation coefficient.

2.12. Relationships between formant frequencies and associated formant frequency change parameters (FFCPs)

The relationships between acoustic parameters and their associated frequency change parameters (R1, R2, R3, R4, BarF2, JarF2 and CarF2) were examined using Pearson's product moment correlation. For example, R1 values were correlated with the F2 values of schwa and the F2 values of /r/.
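The two steps described in Sections 2.8 and 2.12 (forming a change parameter, then correlating it with a contributing formant value) can be sketched as follows. The study correlated values across individual subjects, which are not listed in the paper; the sketch instead uses the eight group means for F2 of schwa and F2 of /r/ from Table 1, so the coefficient it prints illustrates the computation only and is not a result of the study. The function name is hypothetical.

```python
import numpy as np

# Group means (Hz) from Table 1, in the order:
# F 6 yr, F 8 yr, F 10 yr, F adult, M 6 yr, M 8 yr, M 10 yr, M adult.
f2_schwa = np.array([2213.6, 2245.1, 2062.9, 1751.5, 1977.3, 1851.1, 1684.2, 1406.7])
f2_r = np.array([1767.2, 1756.4, 1702.9, 1520.3, 1605.9, 1505.6, 1404.1, 1302.7])

# R1 is defined as F2 of /r/ in 'red' minus F2 of the preceding schwa.
r1 = f2_r - f2_schwa


def pearson_r(x, y):
    """Pearson's product moment correlation coefficient."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum()))


print(np.round(r1, 1))
print(pearson_r(r1, f2_schwa))
```

The R1 values computed this way agree with the R1 group means reported in Table 3 to within rounding.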


3. Results

3.1. Acoustic parameters: formant frequencies

Mean and standard deviation values for the formant frequencies of vowels and consonants used in the investigation of coarticulation (or formant frequency changes) are given in Table 1 by age and sex. Table 2 gives the results of a two-way (age × sex) ANOVA, which indicated significant age differences for all formant frequency measures (with a significance level of either p < 0.0001 or p < 0.05; see Table 2). A series of pairwise comparisons (least significant difference (LSD)) indicated the following significant (p < 0.05) differences between the data of the 6-, 8- and 10-year-olds and the adults:
1. F2 and F3 of schwa preceding 'red', F2 and F3 of /e/ in 'red', F2 onset and F2 midpoints of /ɑː/ in 'bar', F2 midpoints of /ɑː/ in 'jar' and 'car': all between-group comparisons except those between the 6- and 8-year-olds,

Table 1
Mean and standard deviation values of formant frequencies by sex and age group (standard deviations in parentheses)

Parameter                                   Sex   Age 6 years (N = 6)   Age 8 years (N = 6)   Age 10 years (N = 8)   Adults (N = 9)
F2 of schwa preceding /r/ in 'red' (Hz)     F     2213.6 (173.1)        2245.1 (121.5)        2062.9 (139.4)         1751.5 (170.0)
                                            M     1977.3 (229.4)        1851.1 (166.4)        1684.2 (210.7)         1406.7 (77.8)
F2 of /r/ in 'red' (Hz)                     F     1767.2 (288.4)        1756.4 (228.7)        1702.9 (108.7)         1520.3 (86.6)
                                            M     1605.9 (232.6)        1505.6 (206.2)        1404.1 (191.4)         1302.7 (104.6)
F3 of schwa preceding /r/ in 'red' (Hz)     F     3303.8 (421.3)        3427.8 (287.2)        3081.6 (340.9)         2727.8 (166.3)
                                            M     3515.6 (239.7)        3041.8 (487.7)        2811.9 (451.5)         2395.5 (163.4)
F3 of /r/ in 'red' (Hz)                     F     2554.4 (76.8)         2545.8 (205.6)        2644.9 (436.6)         2439.3 (179.5)
                                            M     2613.8 (384.4)        2286.3 (393.0)        2323.1 (312.6)         1984.7 (287.7)
F2 of /e/ in 'red' (Hz)                     F     2433.2 (94.5)         2410.0 (165.3)        2340.1 (101.5)         1998.9 (138.4)
                                            M     2381.6 (158.8)        2272.6 (108.1)        2087.3 (206.9)         1728.3 (109.9)
F3 of /e/ in 'red' (Hz)                     F     3586.6 (163.4)        3318.5 (456.9)        3432.1 (354.0)         2742.6 (164.9)
                                            M     3714.1 (249.6)        3508.6 (262.9)        2879.3 (456.2)         2497.5 (131.6)
F2 onset of /ɑː/ ('bar') (Hz)               F     1695.4 (204.4)        1647.0 (185.0)        1508.4 (69.6)          1357.0 (67.9)
                                            M     1467.0 (70.7)         1359.0 (77.4)         1287.6 (96.2)          1122.5 (104.5)
F2 midpoint of /ɑː/ ('bar') (Hz)            F     1589.6 (162.7)        1543.1 (97.5)         1426.1 (67.9)          1235.9 (65.0)
                                            M     1356.1 (94.4)         1363.1 (113.0)        1269.3 (64.3)          1055.6 (133.9)
F2 onset of /ɑː/ ('jar') (Hz)               F     2306.6 (137.4)        2333.4 (110.0)        2227.4 (129.9)         1889.6 (93.3)
                                            M     2157.0 (147.8)        1981.4 (150.0)        2092.6 (147.4)         1789.0 (151.1)
F2 midpoint of /ɑː/ ('jar') (Hz)            F     1680.0 (105.7)        1621.6 (117.1)        1551.4 (123.4)         1310.8 (117.3)
                                            M     1399.5 (94.0)         1502.6 (119.5)        1365.6 (69.6)          1088.4 (124.3)
F2 at release of /k/ burst ('car') (Hz)     F     1833.4 (327.7)        1898.3 (243.9)        1791.6 (155.3)         1652.7 (238.1)
                                            M     1632.8 (190.8)        1656.1 (167.5)        1727.9 (210.8)         1545.8 (111.8)
F2 midpoint of /ɑː/ ('car') (Hz)            F     1628.0 (203.2)        1604.9 (73.7)         1518.8 (107.6)         1269.1 (105.8)
                                            M     1389.6 (85.7)         1387.5 (120.1)        1310.8 (100.1)         1107.8 (94.3)


Table 2
Results of two-factor ANOVAs (sex × age) for formant frequency parameters

Parameter                                   Age, F(3,63)            Sex, F(1,63)            Age × sex, F(3,63)
F2 of schwa preceding /r/ in 'red' (Hz)     33.7, p < 0.0001 a      68.4, p < 0.0001 a      0.6, ns
F2 of /r/ in 'red' (Hz)                     7.3, p < 0.0001 a       27.6, p < 0.0001 a      0.4, ns
F3 of schwa preceding /r/ in 'red' (Hz)     20.1, p < 0.0001 a      5.6, p < 0.05 b         2.2, ns
F3 of /r/ in 'red' (Hz)                     4.3, p < 0.05 b         10.0, p < 0.05 b        1.7, ns
F2 of /e/ in 'red' (Hz)                     48.6, p < 0.0001 a      25.1, p < 0.0001 a      2.0, ns
F3 of /e/ in 'red' (Hz)                     32.7, p < 0.0001 a      2.4, ns                 5.4, p < 0.05 b
F2 onset of /ɑː/ ('bar') (Hz)               29.6, p < 0.0001 a      80.7, p < 0.0001 a      0.3, ns
F2 midpoint of /ɑː/ ('bar') (Hz)            40.5, p < 0.0001 a      60.6, p < 0.0001 a      0.4, ns
F2 onset of /ɑː/ ('jar') (Hz)               30.8, p < 0.0001 a      31.2, p < 0.0001 a      3.0, p < 0.05 b
F2 midpoint of /ɑː/ ('jar') (Hz)            41.8, p < 0.0001 a      56.8, p < 0.0001 a      1.4, ns
F2 at release of /k/ burst ('car') (Hz)     3.0, p < 0.05 b         9.4, p < 0.05 b         0.7, ns
F2 midpoint of /ɑː/ ('car') (Hz)            33.0, p < 0.0001 a      59.6, p < 0.0001 a      0.4, ns

a Significant at p < 0.0001.
b Significant at p < 0.05.
ns: not significant at p < 0.05.

with falling F2 and F3 values with increasing age;
2. F2 of /r/ in 'red': all inter-group comparisons except those between the 8-year-olds and the 10-year-olds and between the 8-year-olds and the 6-year-olds; again the pattern of change with age was falling F2 values;
3. F3 of /r/ in 'red': inter-group comparisons of the adults and the 10-year-olds and those between the adults and 6-year-olds. The adults had lower formant frequencies in both cases;
4. F2 onset of /ɑː/ in 'jar': inter-group comparisons between the adults and all three groups of children, with the adults having lower F2 values for all comparisons; and
5. F2 at release of /k/ in 'car': inter-group comparisons between the adults and 8- and 10-year-olds, with the adults having lower F2 values in both cases.

In addition, all formant frequency parameters except one (F3 of /e/ in 'red') showed significant sex effects (with a significance level of either p < 0.0001 or p < 0.05). Although this parameter followed the same trends as the other parameters, it did not reach significance (see Table 2). The significant sex differences were due to the females having higher formant frequencies than the males. Significant interaction effects of age × sex were found for the parameters F3 of /e/ in 'red' and the

F2 onset of /ɑː/ in 'jar'. The former of these interaction effects was due to different developmental trends in the male and female data. Although the males showed a steady decrease in F3 of /e/ with age, the females showed a decrease in value between age 6 and 8 years, followed by a slight rise between age 8 and 10 years, before decreasing again thereafter. In the case of the F2 onset of /ɑː/ in 'jar', the females showed a slight increase in F2 values between age 6 and 8 years before falling with increasing age thereafter. This pattern contrasted with that of the males, who showed a decrease between age 6 and 8 years, a subsequent increase between age 8 and 10 years and a fall thereafter.

3.2. Formant frequency changes

Mean and standard deviation values for the formant frequency change parameters (FFCPs) between vowels and consonants in the investigation of coarticulation (or gestural overlap) are given in Table 3 by age and sex. Table 4 gives the results of a two-way (age × sex) ANOVA, which indicated significant age differences for all formant frequency changes (at either p < 0.0001 or p < 0.05; see Table 4), with the exception of BarF2 and JarF2 (though the latter showed a significant difference in a pairwise comparison; see below).

Table 3
Mean and standard deviation (in parentheses) values for formant frequency change parameters (FFCPs) by age and sex. All values are given in Hz.

R1: F2 /r/ in 'red' minus F2 /ə/ in 'The'
R2: F3 /r/ in 'red' minus F3 /ə/ in 'The'
R3: F2 /e/ in 'red' minus F2 /r/ in 'red'
R4: F3 /e/ in 'red' minus F3 /r/ in 'red'
BarF2: F2 midpoint /ɑː/ minus F2 onset /ɑː/ in 'bar'
JarF2: F2 midpoint /ɑː/ minus F2 onset /ɑː/ in 'jar'
CarF2: F2 midpoint /ɑː/ minus F2 onset at release in 'car'

FFCP | Sex | Age 6 years | Age 8 years | Age 10 years | Adults
R1 | F | -446.4 (234.5) | -488.8 (224.4) | -360.0 (112.0) | -231.2 (170.5)
R1 | M | -371.4 (280.8) | -345.5 (149.1) | -280.1 (121.8) | -104.0 (132.8)
R2 | F | -749.4 (431.9) | -882.0 (366.7) | -436.7 (322.3) | -288.5 (192.7)
R2 | M | -901.9 (437.2) | -755.5 (182.8) | -488.8 (249.9) | -410.8 (282.8)
R3 | F | 666.0 (297.9) | 653.6 (347.4) | 637.2 (132.8) | 478.6 (164.2)
R3 | M | 775.8 (271.6) | 767.0 (161.9) | 683.2 (257.1) | 425.6 (184.9)
R4 | F | 1032.2 (167.5) | 772.8 (525.9) | 787.2 (469.6) | 303.3 (241.5)
R4 | M | 1100.4 (494.5) | 1222.4 (505.6) | 556.3 (330.2) | 512.8 (314.0)
BarF2 | F | -105.8 (170.2) | -103.9 (220.5) | -82.3 (79.4) | -121.1 (65.3)
BarF2 | M | -110.9 (109.48) | 4.1 (70.0) | -18.3 (77.8) | -66.9 (92.8)
JarF2 | F | -626.6 (116.8) | -711.8 (142.5) | -676.0 (130.8) | -578.8 (127.4)
JarF2 | M | -757.5 (81.6) | -478.8 (120.7) | -727.0 (157.6) | -700.6 (222.9)
CarF2 | F | -205.4 (245.6) | -293.4 (235.7) | -272.8 (158.2) | -383.6 (231.2)
CarF2 | M | -243.1 (159.0) | -268.6 (131.6) | -417.1 (181.22) | -438.0 (145.5)

Table 4
Results of two-factor ANOVAs (sex and age group) for the FFCPs given in Table 3

FFCP | Age | Sex | Sex × age
R1 | F(3, 63) = 7.6, p < 0.0001 (a) | F(1, 63) = 6.0, p < 0.05 (b) | F(3, 63) = 0.2, ns
R2 | F(3, 63) = 10.7, p < 0.0001 (a) | F(1, 63) = 0.5, ns | F(3, 63) = 0.7, ns
R3 | F(3, 63) = 5.4, p < 0.05 (b) | F(1, 63) = 0.9, ns | F(3, 63) = 0.5, ns
R4 | F(3, 63) = 9.7, p < 0.0001 (a) | F(1, 63) = 1.6, ns | F(3, 63) = 2.4, ns
BarF2 | F(3, 63) = 1.2, ns | F(1, 63) = 4.0, p < 0.05 (b) | F(3, 63) = 0.6, ns
JarF2 | F(3, 63) = 1.9, ns | F(1, 63) = 0.2, ns | F(3, 63) = 5.4, p < 0.05 (b)
CarF2 | F(3, 63) = 3.0, p < 0.05 (b) | F(1, 63) = 1.4, ns | F(3, 63) = 0.7, ns

(a) Significant at p < 0.0001. (b) Significant at p < 0.05. ns: not significant at p < 0.05.

A series of pairwise comparisons (LSD) indicated the following significant (p < 0.05) differences between the data of the 6-, 8- and 10-year olds and the adults:
1. R1: inter-group comparisons between the adults and all groups of children, with the adults showing smaller changes in F2 than the children;
2. R2: inter-group comparisons between the adults and the 6- and 8-year olds, with the adults showing smaller changes in F3. No significant differences were found between the adults and the 10-year olds for this parameter;

3. R3: inter-group comparisons between the adults and all three groups of children, with the adults showing smaller frequency changes in all cases;
4. R4: all inter-group comparisons except those between the 6- and 8-year olds, with smaller F3 changes being observed from age 8 years through to the adult group;
5. JarF2: inter-group comparison between the 8- and 10-year olds, with the 8-year olds showing smaller F2 changes;
6. CarF2: inter-group comparisons between the adults and the 6- and 8-year olds, with the


adults showing greater F2 changes in both cases.

In addition, significant sex effects (p < 0.05) were observed for R1 and BarF2, with the females showing greater formant frequency changes than the males in both cases (see Tables 3 and 4). A significant age × sex interaction was found only for JarF2. This interaction effect was due to different developmental trends in the male and female data. For the males, there was a decrease in F2 change between age 6 and 8 years, followed by an increase between age 8 and 10 years, and a subsequent decrease thereafter. The trend for the females was reversed between age 6 and 10 years, followed by a decrease thereafter.

3.3. Temporal parameters

Mean and standard deviation values for the temporal parameters are given in Table 5 by age and sex. Table 6 gives the results of a two-way (age × sex) ANOVA, which indicated significant age differences for all temporal measures at either p < 0.0001 or p < 0.05 (see Table 6). A series of pairwise comparisons (LSD) indicated the following significant (p < 0.05) differences between the data of the 6-, 8- and 10-year olds and the adults:
1. total utterance duration, duration of pauses after 'The', pauses preceding 'bar'/'jar'/'car', and the total duration of pauses: inter-group comparisons between the 6-year olds and the 8- and 10-year olds and adults, with the 6-year olds showing longer temporal measures for these parameters (and therefore longer inter-word pauses) for all the comparisons;
2. total speaking time and articulation rate: all inter-group comparisons except those between the 10-year olds and the adults, with decreases in speaking time and increases in articulation rate being observed between age 6 and 8 years and between age 8 and 10 years.

Significant sex effects were found for pauses preceding 'bar'/'jar'/'car' (F(1, 234) = 10.8, p < 0.05), total speaking time (F(1, 234) = 5.6, p < 0.05) and articulation rate (F(1, 234) = 9.8, p < 0.05), with females showing longer pauses and total speaking times and lower articulation rates than the males. Significant interaction effects of age × sex were found for pauses preceding 'bar'/'jar'/'car' (F(3, 234), p < 0.05) and articulation rate (F(3, 234), p < 0.05). These interaction effects were due to different developmental trends in the male and female data. In the case of the pauses preceding 'bar'/'jar'/'car', the females showed a decrease in the duration of pauses between age 6 and 10, with a slight increase thereafter. This pattern contrasted with that of the males, who showed a decrease between age 6 and 8 years, a subsequent increase between age 8 and 10, and a decrease thereafter. Age × sex interaction effects for articulation rate were due to more marked increases in the articulation rate of the males

Table 5
Temporal parameters by age and sex: mean (standard deviation)

Parameter | Sex | Age 6 years (N = 6) | Age 8 years (N = 6) | Age 10 years (N = 8) | Adults (N = 9)
Phrase durations (ms) | F | 1533.2 (721.4) | 1006.5 (156.0) | 889.0 (138.0) | 850.0 (79.2)
Phrase durations (ms) | M | 1516.8 (765.0) | 839.2 (77.7) | 853.7 (166.5) | 761.6 (77.4)
Pauses after 'the' (ms) | F | 191.3 (199.5) | 45.2 (67.8) | 29.8 (36.0) | 19.5 (23.0)
Pauses after 'the' (ms) | M | 281.3 (552.1) | 24.1 (29.8) | 54.9 (88.0) | 15.2 (28.0)
Pauses before 'bar/jar/car' (ms) | F | 269.11 (386.6) | 64.2 (46.3) | 40.8 (36.1) | 45.5 (39.0)
Pauses before 'bar/jar/car' (ms) | M | 150.4 (162.0) | 27.7 (30.0) | 48.6 (35.2) | 37.2 (40.6)
Total pause time (ms) | F | 460.4 (510.6) | 109.4 (70.5) | 70.5 (43.9) | 65.0 (37.9)
Total pause time (ms) | M | 431.7 (641.8) | 51.8 (33.5) | 103.5 (87.3) | 52.4 (45.2)
Total speaking time (ms) | F | 1072.2 (331.6) | 897.1 (124.9) | 818.4 (141.8) | 785.0 (84.4)
Total speaking time (ms) | M | 1085.1 (165.5) | 787.5 (77.5) | 750.2 (103.1) | 709.2 (105.9)
Articulation rate (sylls. per s) | F | 3.1 (0.9) | 3.4 (0.5) | 3.8 (0.7) | 3.9 (0.4)
Articulation rate (sylls. per s) | M | 2.8 (0.4) | 3.9 (0.4) | 4.1 (0.5) | 4.3 (0.6)


Table 6
Results of a two-factor ANOVA (age × sex) of temporal parameters

Parameter | Age | Sex | Age × sex
Phrase durations (ms) | F(3, 234) = 47.1, p < 0.0001 (a) | F(1, 234) = 2.9, ns | F(3, 234) = 0.5, ns
Pause after 'the' (ms) | F(3, 234) = 13.1, p < 0.0001 (a) | F(1, 234) = 0.7, ns | F(3, 234) = 0.7, ns
Pause before 'bar/jar/car' (ms) | F(3, 234) = 21.7, p < 0.0001 (a) | F(1, 234) = 5.6, p < 0.05 (b) | F(3, 234) = 2.7, p < 0.05 (b)
Total pause time (ms) | F(3, 234) = 24.4, p < 0.0001 (a) | F(1, 234) = 0.2, ns | F(3, 234) = 0.3, ns
Total speaking time (ms) | F(3, 234) = 54.2, p < 0.0001 (a) | F(1, 234) = 9.8, p < 0.05 (b) | F(3, 234) = 1.5, ns
Articulation rate (sylls. per s) | F(3, 234) = 45.3, p < 0.0001 (a) | F(1, 234) = 10.8, p < 0.05 (b) | F(3, 234) = 4.2, p < 0.05 (b)

(a) Significant at p < 0.0001. (b) Significant at p < 0.05. ns: not significant at p < 0.05.

between age 6 and 8 years compared to the females. In addition, although the females showed slight increases in articulation rate between age 10 years and the adults, this shows a plateau effect compared to the more marked increases for the males (see Fig. 1).

3.4. Correlations between formant frequency change parameters (FFCPs) and associated temporal parameters

The results of Pearson's product moment correlation coefficients showing the relationship between formant frequency changes and associated temporal parameters are given in Table 7 across all groups combined. The following significant correlations (p < 0.05 or p < 0.01) were found for formant frequency change parameters and their associated temporal parameters (see Table 7):
(i) R2 with 'red' phrase durations.
(ii) R2 with durations of pauses preceding 'red'.
(iii) R4 with 'red' phrase durations.
(iv) R4 with durations of pauses preceding 'red'.
(v) BarF2 with durations of 'bar' phrases.
(vi) BarF2 with durations of pauses preceding 'bar'.
(vii) JarF2 with durations of 'jar' phrases.

3.5. Examining relationships between acoustic parameters and formant frequency change parameters (FFCPs)

The results of Pearson's product moment correlation coefficients showing the relationship between formant frequency changes and associated formant frequencies are also given in Table 7 across all groups combined. All formant frequency changes and their associated formant frequencies showed significant correlations (p < 0.05 or p < 0.01), with the exception of R1 with F2 of /r/, R2 with F3 of /r/, and BarF2 with F2 midpoint of /ɑː/ in 'bar' (see Table 7).

4. Discussion

4.1. Formant frequency parameters

Fig. 1. Mean articulation rate (syllables per second) by age and sex.

This study aimed to investigate whether there were any age, sex, and age by sex related differences in the speech patterns of three groups of children and one group of adults, elicited via a


Table 7
Results of Pearson's product moment correlations between FFCPs and associated acoustic parameters, and between FFCPs and associated temporal parameters (all age groups combined)

FFCP | Acoustic parameter used to calculate FFCP (Hz) | Pearson's r | Temporal parameter used in correlation | Pearson's r
R1 | F2 of schwa preceding /r/ in 'red' | -0.635 (a) | Durations of 'red' phrases | -0.049 (ns)
R1 | F2 of /r/ in 'red' | 0.080 (ns) | Durations of pauses preceding 'red' | 0.045 (ns)
R2 | F3 of schwa preceding /r/ in 'red' | -0.617 (b) | Durations of 'red' phrases | -0.372 (b)
R2 | F3 of /r/ in 'red' | 0.175 (ns) | Durations of pauses preceding 'red' | -0.366 (b)
R3 | F2 of /r/ in 'red' | -0.420 (b) | Durations of 'red' phrases | -0.008 (ns)
R3 | F2 of /e/ in 'red' | 0.574 (b) | Durations of pauses preceding 'red' | -0.049 (ns)
R4 | F3 of /r/ in 'red' | -0.286 (b) | Durations of 'red' phrases | 0.311 (b)
R4 | F3 of /e/ in 'red' | 0.721 (b) | Durations of pauses preceding 'red' | 0.262 (b)
BarF2 | F2 onset of /ɑː/ ('bar') | -0.441 (b) | Durations of 'bar' phrases | -0.253 (a)
BarF2 | F2 midpoint of /ɑː/ ('bar') | 0.134 (ns) | Durations of pauses preceding 'bar' | -0.278 (a)
JarF2 | F2 onset of /ɑː/ ('jar') | -0.475 (b) | Durations of 'jar' phrases | -0.226 (b)
JarF2 | F2 midpoint of /ɑː/ ('jar') | 0.246 (a) | Durations of pauses preceding 'jar' | -0.105 (ns)
CarF2 | F2 at release of /k/ burst ('car') | -0.601 (b) | Durations of 'car' phrases | 0.138 (ns)
CarF2 | F2 midpoint of /ɑː/ ('car') | 0.351 (b) | Durations of pauses preceding 'car' | 0.066 (ns)

(a) Significant at p < 0.05. (b) Significant at p < 0.01. ns: not significant at p < 0.05.

picture-naming task. Results showed robust age- and sex-linked differences across the vast majority of formant frequency parameters (see Tables 1 and 2). Age-linked differences were generally characterised by a drop in formant frequencies with increasing age. This reflects patterns of physical maturation of the vocal tract and supports evidence from previous studies (Bennett, 1981; Busby and Plant, 1995; Eguchi and Hirsh, 1969; Kent, 1976). In addition, sex-linked differences were generally characterised by females having higher formant frequencies than the males, a pattern that was generally found across the formant frequency parameters for both the children and the adults (see Tables 1 and 2). Again, this replicates previous reports and provides further acoustic evidence for sexual dimorphism in both pre-adolescent children's (Bennett, 1981; Busby and Plant, 1995; Whiteside and Hodgson, 1999) and adults' (Childers and Wu, 1991; Peterson and Barney, 1952) vocal tracts, with females generally having smaller supralaryngeal vocal tracts overall than their male peers. What is worthy of note at this point, however, is that in addition to overall vocal tract differences between males and females, one needs to consider the non-uniform male-female differences that occur in the vocal tract, with males having larger pharyngeal cavities compared to females (Fant, 1966), for example. These non-uniform differences are illustrated in some of the data reported here. For example, the production of /r/ gestures (which may be retroflexed or bunched) will involve constrictions that create a cavity posterior to the constriction, which will include the pharynx. This may therefore, in part, explain the marked male-female differences for F2 and F3 for /r/, with the exception of the F3 data for the 6-year old boys, who had similar F3 values to the 6-year old females (see Table 1). The observation that the males show more marked changes in F2 and F3 values across age compared to the females could be interpreted as evidence for development in the back cavity (including the pharynx) being greater for the males. These data, in addition to illustrating the non-uniform differences in the vocal tract between males and females, could be interpreted as evidence for sex differences in the formant frequencies of pre-adolescent children, a finding that replicates those of others (Busby and Plant, 1995).

Only two parameters (F3 of /e/ in 'red' and F2 onset in 'jar') showed significant age × sex interactions. This suggests that for the subjects in this study, age and sex linked developmental changes


by and large, followed steady changes with increasing age (see Fig. 2 for an example of this).

4.2. Formant frequency changes (excursions)

An investigation into age- and sex-linked differences for formant frequency change parameters (FFCPs), which serve as an index of the degree of gestural overlap, indicated significant age differences for 5 out of the 7 FFCPs that were investigated. Of these 5 FFCPs, 4 were generally characterised by decreases in formant frequency changes, and therefore increasing patterns of coarticulation, with increasing age. These 4 FFCPs were associated with schwa-/r/ acoustic changes and /r/-/e/ acoustic changes for both F2 and F3. An example of this is given in Fig. 3, which illustrates the patterns of F2 change for schwa-/r/ by age and sex. The FFCP that showed an age difference with increases in frequency change with age, and therefore decreases in coarticulation patterns, was CarF2, whereas BarF2 and JarF2 showed no significant age differences. These data suggest that patterns of coarticulation in adults and children for different gestural-acoustic patterns should be interpreted in light of the phonological/phonetic contexts that are being investigated. The age-linked differences for CarF2 could be interpreted as further evidence for increased segmental and perceptual awareness, as

Fig. 2. Mean F2 of /r/ in 'red' by age and sex (Hz).

Fig. 3. F2 changes between /r/ and schwa preceding /r/ in 'red' by age and sex (in Hz).

the children gradually develop their gestural organisation to effect more marked acoustic distinctions, by producing more perceptible changes between the acoustic events associated with the release of the velar closure and the open back vowel. The gestural requirements for the production of 'car' are mechanically linked, and also involve linked gestures that are in close proximity (i.e., velar closure for the consonant and pharyngeal approximation for the back open vowel). Therefore, it is suggested that the refinement of the gestural organisation, and therefore of the motor speech skills required for the production of this CV syllable, may take longer to master. This mirrors previous evidence for /dV(t)/ (Sussman et al., 1992) and /ʃiʃi/, /ʃuʃu/, /sisi/, /susu/ syllables (Nittrouer et al., 1989, 1996), and also supports previous discussions on this issue (Nittrouer et al., 1996). Along similar lines of argument, the proposal that gestures that are not mechanically linked (e.g., in the case of BarF2) may be more readily mastered independently, and therefore inter-phased more efficiently, is supported by some of the data in this investigation. The BarF2 parameter reported here showed no significant age group differences, therefore suggesting similar patterns of gestural organisation for all four groups that were investigated. This provides some corroboration for previous evidence (Sussman et al., 1992;


Turnbaugh et al., 1985) where children between 3 and 5 years of age were found to display mature, adult-like coarticulation patterns for some plosive-vowel gestural sequences. Given these previous studies, it is perhaps not surprising that no age differences were found for the BarF2 data reported here. The lack of age differences for JarF2, however, appeared to be due to a large amount of individual variation, which could therefore explain the inconsistent developmental patterns given in Table 3 for this parameter.

The 4 FFCPs that were associated with schwa-/r/ and /r/-/e/ changes for both F2 and F3 all showed patterns of decreases in formant frequency changes, and therefore increasing patterns of coarticulation, with increasing age. These patterns are suggestive of lower levels of inter-gestural overlap between the alveolar approximant and both vowel gestures of schwa and /e/ in the younger children. It is suggested that for these consonant-vowel gestures, we are observing an example of low levels of coarticulation, which could be interpreted as evidence for lower levels of maturity in motor speech skills, which are being constrained by a maturing motor speech system. It is posited that the complex perceptuo-motor patterns that are associated with the alveolar approximant /r/ may take longer to develop fully; /r/ is documented as being among the sounds that are mastered later (Locke, 1983). It is the transitions to and from the 'dipped' or lowered patterns of F2 and F3 which cue the perception of /r/, with faster and slower transitions being noted for F2 and F3, respectively (Dalston, 1975). This indicates, therefore, the importance of both spectral and temporal cues for both the perception and production of /r/.
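The formant frequency change parameters discussed above are simple differences between two formant measurements (see the definitions in Table 3). The following is a minimal Python sketch of that arithmetic for R1 and R3. The formant values, the dictionary keys and the function names (`ffcp`, `red_f2_changes`) are hypothetical and chosen only for illustration; they are not data or procedures from this study.

```python
def ffcp(later_hz: float, earlier_hz: float) -> float:
    """Formant frequency change: later measurement minus earlier one (Hz)."""
    return later_hz - earlier_hz


# Hypothetical F2 values in Hz for one child and one adult (NOT study data).
child = {"f2_schwa": 1900.0, "f2_r": 1450.0, "f2_e": 2150.0}
adult = {"f2_schwa": 1500.0, "f2_r": 1300.0, "f2_e": 1750.0}


def red_f2_changes(speaker: dict) -> tuple:
    """Return (R1, R3) for one speaker, following the Table 3 definitions."""
    r1 = ffcp(speaker["f2_r"], speaker["f2_schwa"])  # R1: F2 /r/ minus F2 schwa
    r3 = ffcp(speaker["f2_e"], speaker["f2_r"])      # R3: F2 /e/ minus F2 /r/
    return r1, r3


child_r1, child_r3 = red_f2_changes(child)
adult_r1, adult_r3 = red_f2_changes(adult)

# A smaller absolute change is read in the text as more gestural overlap.
print(f"child: R1 = {child_r1:.1f} Hz, R3 = {child_r3:.1f} Hz")
print(f"adult: R1 = {adult_r1:.1f} Hz, R3 = {adult_r3:.1f} Hz")
```

With these illustrative numbers the adult shows smaller absolute changes than the child for both parameters, which is the direction of the age trend described for R1 and R3.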
If we look at the inter-syllabic levels of coarticulation for R1 and R2, one might be tempted to suggest that the lower levels of gestural overlap for the 6-year old children may have been due to the long durations of inter-word pauses. However, lower levels of gestural overlap were also observed for the 8-year olds, who displayed much shorter inter-word pauses than the 6-year olds (see Table 5). In addition, there were no significant differences between the 10-year olds and the adults for R2, suggesting that the gestures adopted to produce /r/ were anticipated in a similar fashion by both groups. It is therefore suggested that the lower levels of anticipatory inter-syllabic coarticulation between schwa and /r/ for the 6- and 8-year olds were by and large due to lower levels of perceptual and motor skill and maturity in the production of /r/. This suggestion is supported further if we examine the extent of intrasyllabic coarticulation indexed by R3 and R4. Here, we see a similar pattern emerging, with a gradual decrease in formant frequency changes with age. More saliently, however, R4 shows similar patterns of reduced coarticulation for the 6- and 8-year olds, with the 10-year olds showing more gestural overlap between /r/ and /e/. These data therefore suggest that the 10-year old children in this study showed greater levels of articulatory maturity for /r/ than the 6- and 8-year olds.

Significant sex differences were found for only 2 of the 7 FFCPs (R1 and BarF2; see Table 4, and Fig. 3 for R1), therefore suggesting that although the degree of coarticulation generally increased with age, the subjects in this study displayed minimal evidence for sex-linked patterns of coarticulation. The sex differences for R1 and BarF2, however, provide some evidence for males showing greater levels of coarticulation, and therefore more gestural overlap, than their female peers. This result relates to some evidence in the literature which has reported that women show less phonetic reduction, and therefore greater phonological distinction, than men in experimental situations (Byrd, 1992, 1994; Whiteside, 1996). A significant age × sex interaction was found only for JarF2. However, this result could be attributed to the high levels of individual variation mentioned in the previous discussion, and will therefore not be discussed further.

4.3. Temporal parameters

All temporal parameters in this study showed significant age differences, with a decrease in the length of inter-word pauses, shorter speaking times and an increase in articulation rate, with increasing age. The 6-year old children showed significantly longer temporal measures for total utterance durations, and the durations and total


duration of pauses, compared to the 8- and 10-year olds and the adults. The significantly longer temporal measures for the 6-year old children were largely due to the long durations of inter-word pausing adopted by some of the children. Once the pausing behaviour was subtracted from the total utterance duration to give total speaking time and articulation rate, we still observed a general decrease for total speaking time, and an increase for articulation rate, with increasing age. However, the differences between the 10-year olds and the adults were smaller and did not reach significance in the pairwise LSD comparisons. This pattern of decreasing durations of speech patterns with increasing age replicates the findings of others who present both cross-sectional (Nittrouer, 1993) and longitudinal (Smith and Kenney, 1998) data. It also reflects the patterns of maturation of motor speech skills, which become faster with increasing age. The similarity between the total speaking time and articulation rate parameters of the 10-year olds and adults suggests that it is around this age (i.e., 10 or 11) that the temporal structure and organisation of children's speech converges with that of adults. This suggestion is upheld by evidence from previous studies that have examined the developmental patterns of durations of vowel-fricative sequences (Smith and Kenney, 1998) and voice onset time (Eguchi and Hirsh, 1969).

Results above indicated significant sex differences for pauses preceding 'bar'/'jar'/'car', total speaking time, and articulation rate, with females showing longer pauses and total speaking times and lower articulation rates than the males. The sex-linked pattern for the pauses preceding 'bar'/'jar'/'car' was found for all speaker groups except the 10-year olds, but was more marked for the 6- and 8-year old groups.
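The temporal measures discussed in this section are related by simple arithmetic: total pause time is the sum of the two inter-word pauses, total speaking time is the phrase duration minus total pause time, and articulation rate is syllables per second of speaking time. The Python sketch below illustrates this using the group means for the 8-year old females from Table 5. The function name `temporal_measures` is hypothetical, and the syllable count of three is an assumption (a three-syllable phrase of the form 'The red car'); neither is stated explicitly in the text.

```python
SYLLABLES_PER_PHRASE = 3  # assumption: a three-syllable phrase such as "The red car"


def temporal_measures(phrase_ms, pause_after_the_ms, pause_before_noun_ms,
                      syllables=SYLLABLES_PER_PHRASE):
    """Return (total pause in ms, total speaking time in ms, rate in syll/s)."""
    total_pause_ms = pause_after_the_ms + pause_before_noun_ms
    speaking_ms = phrase_ms - total_pause_ms          # pauses removed
    rate = syllables / (speaking_ms / 1000.0)         # syllables per second
    return total_pause_ms, speaking_ms, rate


# Group means (ms) for the 8-year old females, taken from Table 5.
total_pause, speaking, rate = temporal_measures(1006.5, 45.2, 64.2)
print(f"total pause: {total_pause:.1f} ms")
print(f"speaking time: {speaking:.1f} ms")
print(f"articulation rate: {rate:.2f} syllables per second")
```

The pause and speaking-time results reproduce the tabulated group means (109.4 ms and 897.1 ms). The rate computed from group means comes out slightly below the tabulated 3.4 syllables per second, which would be expected if the tabulated value is an average of per-utterance rates rather than a rate computed from averaged durations.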
With the exception of the 6-year olds, total speaking time and articulation rate showed a marked sex difference for all groups (see Fig. 1). The data of the adult group replicate previous evidence on sex-linked differences in the utterance durations of men and women (Byrd, 1992, 1994; Whiteside, 1996). The evidence from the children in this study therefore indicates that sex-linked differences in total speaking time and articulation rate emerge


sometime after the age of 6 years, and explains the significant age × sex interaction effect found for articulation rate. It is therefore suggested that even in the speech of children elicited in an experimental situation, there is some evidence that females have lower articulation rates compared to their male peers. The issue of whether this phenomenon is determined by sociophonetic factors remains to be answered, but there are indications in the adult literature which suggest that these factors should not be overlooked (Byrd, 1992, 1994).

4.4. Temporal patterns and associated frequency change parameters

Table 7 shows the results of Pearson's product moment correlation coefficients for the relationship between formant frequency changes and associated temporal parameters across all groups combined. The significant negative and positive correlations for the changes in F3 between schwa-/r/ and /r/-/e/, respectively, with the temporal parameters suggest that the third formant frequency changes associated with /r/ were related to duration, with longer durations being associated with greater F3 changes. No positive correlations were found between F2 changes and associated temporal parameters. It is suggested in the data reported here that gestures associated specifically with /r/ are not significantly affected by anterior-posterior lingual positions but more specifically by the gestures that effect a lowering of the third formant (this may include, for example, the retroflexion and/or bunching of the tongue). The significant correlations observed for BarF2 with durations of 'bar' phrases, BarF2 with durations of pauses preceding 'bar', and JarF2 with durations of 'jar' phrases suggest that for these cases there were spectro-temporal relationships, with longer durations being associated with greater formant frequency changes.

4.5. Relationships between formant frequencies and associated formant frequency change parameters

The results of Pearson's product moment correlation coefficients showing the relationship


between formant frequency changes and associated formant frequencies across all groups combined (Table 7) showed significant correlations (p < 0.05 or p < 0.01), with the exception of R1 with F2 of /r/, R2 with F3 of /r/, and BarF2 with F2 midpoint of /ɑː/ in 'bar'. These results suggest that the extent of formant frequency changes was determined to some extent by the nature of the consonant-vowel sequences (Table 7). For example, R1 and R2 showed negative correlations only with the F2 and F3 values of schwa, which suggests that the extent of formant frequency changes was negatively correlated with the levels of anticipatory coarticulation that were already present in schwa. In addition, R3 and R4 showed greater positive correlations with the F2 and F3 values of the vowel /e/ than with the F2 and F3 values for /r/. This suggests that, for the combined group, the extent of formant frequency changes for both F2 (R3) and F3 (R4) was related more to the formant frequencies for /e/ than to those for /r/. The extent of changes for R3 and R4 could be attributed to production, perceptual and spectro-temporal characteristics of /r/ and how they relate to /e/ in this phonetic context. From a production perspective, the formant frequency changes indicate that there are greater changes in articulatory gestures between /r/ and /e/, which could be interpreted as evidence for more extreme articulatory configurations and, therefore, evidence for more distinct articulatory gestures. From a perceptual and spectro-temporal perspective, however, the results could be explained as follows. Slow rises in F3 (and to some extent F2) from the 'dipped' (lowered) F2 and F3 formant frequencies are necessary for the perception of /r/. High formant frequency values that are observed for the midpoint of /r/, and any associated slow articulation patterns, will require larger formant frequency changes over a longer period to effect similar perceptible formant frequency changes compared to those cases where formant frequency changes occur over a shorter period of time. The former and latter cases describe most of the data observed for the 6- and 8-year olds, and the 10-year olds and adults, respectively, and illustrate the strong link between perceptual and motor systems in speech production.
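The correlations reported in Table 7 are Pearson product moment coefficients. As a reminder of what that statistic measures, the sketch below computes it from first principles in Python. The data points and the helper name `pearson_r` are hypothetical and serve only to illustrate a negative relationship of the kind reported between R1 and the F2 of schwa; they are not values from this study, and no significance test is included.

```python
import math


def pearson_r(xs, ys):
    """Pearson product moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2 points)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values (NOT study data): F2 of schwa (Hz) and the R1 change (Hz).
f2_schwa = [1500.0, 1650.0, 1800.0, 1900.0, 2000.0]
r1_change = [-150.0, -250.0, -380.0, -420.0, -520.0]

r = pearson_r(f2_schwa, r1_change)
print(f"r = {r:.3f}")  # strongly negative for these illustrative points
```

In this toy example a higher schwa F2 goes with a larger (more negative) change towards /r/, so the coefficient is negative, mirroring the sign of the R1 correlation in Table 7.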

In addition, the extent of formant frequency changes for BarF2, JarF2 and CarF2 showed strong negative correlations with the onset values of the vowel /ɑː/, compared to either some positive correlation (JarF2 and CarF2) or no correlation (BarF2) with the midpoint (minima) values (see Table 7). This suggests that the extent of formant frequency changes for BarF2, JarF2 and CarF2 was inversely related to the levels of anticipatory coarticulation that had already been achieved at the onset points of the vowel /ɑː/ in all three contexts.

4.6. Individual differences

An issue that has thus far not been raised is that of individual differences in motor speech behaviours, and also the varying rates at which children may develop perceptual and motor speech skills. Rates of development vary across children and may or may not coincide with chronological age. These individual differences explain some of the results. For example, the data reported for F3 of /e/ showed that although males showed an overall decrease in frequency with age, the females showed an increase between age 8 and 10 years. Furthermore, the results for F2 onset of 'jar' indicated that while females showed an overall slight increase in F2 values between age 6 and 8 years, they fell thereafter, a pattern which differed from that of the males (see results section and Table 1). The variation across age groups is further illustrated in Fig. 4, which depicts the relationship between R1 and F2 of schwa preceding /r/. Here we see clear evidence of overlap for these parameters between some of the adult and 10-year olds' data, and between some of the data for the 6-, 8- and 10-year olds. This supports previous studies which have shown that there are individual differences between the rates at which children mature in both their vocal characteristics (Smith and Kenney, 1998) and articulatory behaviour (Nittrouer, 1993).
Individual differences in motor speech behaviour and the level of maturation are also observed in the spectrotemporal organisation of the data reported here. An example of this is illustrated in Fig. 5, which shows a significant correlation (r = −0.347, p < 0.01) between R2 and the durations of 'red' phrases, which have had the pauses preceding 'red' subtracted. Although there is evidence for most of the 6- and 8-year olds showing longer durations which are associated with greater frequency changes, there are some 6- and 8-year olds who show overlapping values with both the 10-year olds and the adults. This is a pattern that highlights the extent of individual differences in the data reported here.

Fig. 4. F2 change between /r/ and schwa (R1) against F2 schwa values by group.

Fig. 5. F3 change between /r/ and schwa (R2) against 'red' phrase durations with the pauses preceding 'red' subtracted, by group. Overall result of Pearson's product-moment correlation coefficient is −0.347 (significant at p < 0.01).

5. Summary and conclusions

The data reported here, which were elicited via a picture-naming task, a paradigm which encouraged self-paced speech production, provided evidence for both sex-linked and age-linked differences. Sex differences were largely characterised by higher formant frequencies and evidence of longer pausing behaviour, longer speaking time and lower articulation rates in the females' data. The higher formant frequencies are consistent with smaller vocal tracts for both adult (Childers and Wu, 1991) and preadolescent females (Bennett, 1981; Busby and Plant, 1995). The data also provide further evidence for overall male–female differences in the vocal tract, in addition to the non-uniform nature of differences between the vocal tracts of males and females, where larger back cavities are typical of the male vocal tract. The larger values for the temporal parameters are reminiscent of previous findings for adults in experimental situations (Byrd, 1992, 1994; Whiteside, 1996). When age-linked differences occurred, these were characterised by higher formant frequencies, larger formant frequency changes and larger values for temporal parameters in the younger children. These findings are consistent with a maturing and developing vocal tract, as well as a maturing perceptual and motor speech system. In addition, there was evidence for individual differences, suggesting that the rate of maturation of speech motor behaviour is not necessarily linked to chronological age. There was also evidence to suggest that some perceptuomotor skills are mastered earlier than others, and that this depends on their phonetic characteristics (both in production and perception terms). This agrees with previous findings (e.g., Nittrouer, 1993; Nittrouer et al., 1989, 1996; Sussman et al., 1992) and suggests that the development of perceptual skills and the development of motor skills are closely related. Most of the parameters reported here did not display age by sex interactions, with age differences following similar trends for both the males and females.
There were, however, a few parameters that displayed age by sex interactions. An example of this was articulation rate. This parameter showed divergence between the males' and females' data at some point after age 6 years, with the males exhibiting higher articulation rates, an observation which suggests that articulation rate may have developed at different rates for the male and female children in this study (see Fig. 1), and which deserves further investigation. Furthermore, when data from all groups were combined, the formant frequency change parameters showed evidence of significant correlations with associated formant frequencies and temporal parameters. However, there was also evidence to suggest that these patterns of correlation varied among different individuals, therefore demonstrating both the validity and importance of considering individual differences in the investigation of motor speech behaviour.

In conclusion, it is necessary to reiterate the suggestion that some coarticulation patterns may be evidence for "advanced speech production skills whereas others may be a sign of articulatory immaturity" (Repp, 1986). The data reported here could be interpreted as evidence for both of these phenomena. For example, the BarF2 data could be interpreted as evidence for more "advanced speech production skills", where we observe no age differences. On the other hand, R1, R2, R3 and R4 data could be interpreted as evidence for "articulatory immaturity", where we see clear evidence for age-linked differences. However, alongside the notion of "articulatory immaturity", we need to consider that the development of motor speech skills is taking place in parallel with the physical maturation of sub-laryngeal, laryngeal and supralaryngeal speech production systems, and continually developing perceptual processes. It may be the case, therefore, that some speech patterns which are interpreted as evidence of articulatory immaturity may not just be the result of developing motor speech skills, but may also reflect the constraints imposed by physical maturational processes.
The results of this study have posed some questions that merit further investigation in future studies of motor speech development.

Acknowledgements

We wish to thank all the participants who made this study possible.

References

Bennett, S., 1981. Vowel formant frequency characteristics of preadolescent males and females. J. Acoust. Soc. Amer. 69, 231–238.
Bladon, R.A.W., Al-Bamerni, A., 1976. Coarticulation resistance in English. J. Phonetics 4, 137–150.
Busby, P.A., Plant, G.L., 1995. Formant frequency values of vowels produced by preadolescent boys and girls. J. Acoust. Soc. Amer. 97, 2603–2606.
Byrd, D., 1992. Preliminary results on speaker-dependent variations in the TIMIT database. J. Acoust. Soc. Amer. 92, 593–596.
Byrd, D., 1994. Relations of sex and dialect to reduction. Speech Communication 15, 39–54.
Childers, D.G., Wu, K., 1991. Gender recognition from speech: Part II. Fine analysis. J. Acoust. Soc. Amer. 90, 1841–1856.
Dalston, R.M., 1975. Acoustic characteristics of English /w, r, l/ spoken correctly by young children and adults. J. Acoust. Soc. Amer. 57, 462–469.
Davis, B.L., MacNeilage, P.F., 1990. Acquisition of correct vowel production: a quantitative case study. J. Speech Hear. Res. 33, 16–27.
Davis, B.L., MacNeilage, P.F., 1994. Organization of babbling: a case study. Language and Speech 37, 341–355.
Eguchi, S., Hirsh, I., 1969. Development of speech sounds in children. Acta Otolaryngologica (Suppl.) 257, 5–48.
Fant, G., 1966. A note on vocal tract size factors and nonuniform F-pattern scalings. QPSR-STL, KTH-Stockholm 4, 22–30.
Goodell, E.W., Studdert-Kennedy, M., 1993. Acoustic evidence for the development of gestural coordination in the speech of 2-year-olds: a longitudinal study. J. Speech Hear. Res. 36, 707–727.
Kent, R.D., 1976. Anatomical and neuromuscular maturation of the speech mechanism: Evidence from acoustic studies. J. Speech Hear. Res. 19, 421–446.
Krull, D., Lindblom, B., 1996. Coarticulation in apical consonants: acoustic and articulatory analyses of Hindi, Swedish and Tamil. In: Swedish Phonetics Conference, Nässlingen, 29–31 May 1996. TMH-QPSR 2, pp. 73–76.
Krull, D., Lindblom, B., Shia, B-E., Fruchter, D., 1995. Crosslinguistic aspects of coarticulation: an acoustic and electropalatographic study of dental and retroflex consonants. In: Proceedings of the 13th International Congress Phon. Sc. Stockholm. Vol. 3, pp. 436–439.
Laver, J., 1994. Principles of Phonetics. Cambridge University Press, Cambridge.
Lindblom, B., 1983. Economy of speech gestures. In: MacNeilage, P.F. (Ed.), The Production of Speech. Springer, New York, pp. 217–245.
Locke, J.L., 1983. Phonological Acquisition and Change. Academic Press, New York.
Locke, J.L., 1997. A theory of neurolinguistic development. Brain & Language 58, 265–326.
Locke, J.L., Lambrecht-Smith, S., Roberts, J., Guttentag, C., 1995. Phonetic development of infants at risk for developmental dyslexia. In: Powell, T.W. (Ed.), Pathologies of Speech and Language: Contributions of Clinical Linguistic Phonetics and Linguistics. International Clinical Linguistics and Phonetics Association.
Nittrouer, S., 1993. The emergence of mature gestural patterns is not uniform: evidence from an acoustic study. J. Speech Hear. Res. 36, 959–972.
Nittrouer, S., Studdert-Kennedy, M., McGowan, R.S., 1989. The emergence of phonetic segments: evidence from the spectral structure of fricative-vowel syllables spoken by children. J. Speech Hear. Res. 32, 120–132.
Nittrouer, S., Studdert-Kennedy, M., Neely, S.T., 1996. How children learn to organise their speech gestures: further evidence from fricative-vowel syllables. J. Speech Hear. Res. 39, 379–389.
Peterson, G., Barney, H., 1952. Control methods used in the study of vowels. J. Acoust. Soc. Amer. 24, 629–637.
Pickett, J.M., 1980. The Sounds of Speech Communication. University Park, Baltimore.
Recasens, D., 1987. An acoustic analysis of V-to-C and V-to-V coarticulatory effects in Catalan and Spanish VCV sequences. J. Phonetics 15, 299–312.
Recasens, D., 1991. An electropalatographic and acoustic study of consonant-to-vowel coarticulation. J. Phonetics 19, 177–196.
Repp, B., 1986. Some observations on the development of anticipatory coarticulation. J. Acoust. Soc. Amer. 79, 1616–1619.
Sereno, J.A., Baum, S.R., Marean, G.C., Lieberman, P., 1987. Acoustic analyses and perceptual data on anticipatory coarticulation in adults and children. J. Acoust. Soc. Amer. 81, 512–519.
Sharkey, S., Folkins, J., 1985. Variability of lip and jaw movements in children and adults: implications for the development of speech motor control. J. Speech Hear. Res. 28, 3–15.
Slawinski, E.B., Fitzgerald, L.K., 1998. Perceptual development of the categorization of the /ɹ-w/ contrast in normal children. J. Phonetics 26, 27–43.
Smith, B.L., Kenney, M.K., 1998. An assessment of several acoustic parameters in children's speech production development: longitudinal data. J. Phonetics 26, 95–108.
Stoel-Gammon, C., 1983. Constraints on consonant-vowel sequences in early words. J. Child Language 10, 455–457.
Sussman, H.M., McCaffrey, H.A., Matthews, S.A., 1991. An investigation of locus equations as a source of relational invariance. J. Acoust. Soc. Amer. 90, 1309–1325.
Sussman, H.M., Hoemeke, K., McCaffrey, H.A., 1992. Locus equations as an index of coarticulation for place of articulation distinctions in children. J. Speech Hear. Res. 35, 769–781.
Sussman, H.M., Dalston, E., Gumbert, S., 1998. The effect of speaking style on a locus equation characterization of stop place of articulation. Phonetica 55, 204–225.
Swartz, B.L., 1992. Gender differences in voice onset time. Perceptual Motor Skills 75, 983–992.
Turnbaugh, K., Hoffman, P., Daniloff, R.D., Absher, R., 1985. Stop-vowel coarticulation in 3-year olds, 5-year olds and adults. J. Acoust. Soc. Amer. 77, 1256–1258.
van Bergem, D.R., 1994. A model of coarticulatory effects on the schwa. Speech Communication 14, 143–162.
Whiteside, S.P., 1996. Temporal-based acoustic phonetic patterns in read speech: some evidence for speaker sex differences. J. Int. Phonetic Assoc. 26, 23–40.
Whiteside, S.P., Hodgson, C., 1999. Acoustic characteristics in 6–10-year olds children's voices: some preliminary findings. Logopedics Phoniatrics Vocology 24, 6–13.