Journal of Phonetics (1985) 13, 381-406
Pitch and duration in Welsh stress perception: the implications for intonation
Briony Williams
IBM UK Scientific Centre, Athelstan House, St. Clement St, Winchester, Hants SO23 9DR, England
Received 12th August 1984, and in revised form 24th August 1985
Previous work has shown that stress in Welsh is not directly related to the usual acoustic cues of F0, intensity and duration of the vowel. An experiment was conducted to assess the relative contribution of F0 pattern and segmental duration to the perception of stress in Welsh. A stress-related minimal pair of words was recorded and edited to give stimulus continua. These were resynthesized with three types of pitch pattern superimposed. Ten Welsh listeners made stress judgements on the randomized stimuli. The results point to a complex interaction of F0, segmental duration and stress perception in Welsh. Pitch-prominence alone was no cue to stress, but functioned as such solely in terms of recognized intonational categories. A certain type of segmental durational effect also influenced stress judgements. The patterns of co-occurrence between pitch patterns and stress location that are found in the intonational units of the language were the means by which stress was indirectly perceived when the stimuli showed a change in F0. Various theories of intonation in English are discussed with reference to this interaction between stress and pitch.
1. Introduction
1.1. Past work on the acoustic correlates of stress¹
The realization of word stress on the acoustic level has been thoroughly investigated for English. The consensus view is that a stressed vowel is most often characterized by pitch-prominence (markedly higher or lower F0, or a glide in F0). Other important stress cues are the greater amplitude and longer duration of the stressed vowel (see Fry, 1955, 1958; Lieberman, 1960). A similar influence of pitch-prominence has been found for other languages, e.g. Polish, where the stressed penultimate syllable is usually characterized by higher F0 (Jassem, 1959), and French, where the stressed final syllable is associated with higher F0 (Rigault, 1962).
For Welsh, however, the situation is different. In polysyllabic words, stressed syllables and unstressed syllables alike may or may not be pitch-prominent, in the sense of having either markedly higher or lower F0 or a pitch glide. Also, a stressed vowel may or may not have greater amplitude than an unstressed vowel. Perhaps most perversely, the stressed vowel is often shorter in duration, particularly when compared with the supposedly unstressed final vowel of a word. There seems to be some relation between stress and isochrony in Welsh (for details of the lack of intrinsic acoustic cues to stress in Welsh, see Williams, 1982, 1983a).
¹ The experimental part of this study was carried out while the author was a research student at the University of Cambridge.
0095-4470/85/040381+26 $03.00/0 © 1985 Academic Press Inc. (London) Ltd.
1.2. The post-stress consonant in Welsh
An earlier, purely auditory, study of the phonetics of Welsh (R. O. Jones, 1967) found that a voiceless intervocalic stop after a stressed syllable had "extra duration", that is, easily perceptible additional length. This observation was supported by the present author's measurements from recordings of connected speech taken at the Welsh National Eisteddfod at Machynlleth. Each consonant was classified according to the following four categories: after stressed vowel, after unstressed vowel, before stressed vowel, and before unstressed vowel. Thus, for example, a consonant between a stressed and an unstressed vowel would be entered in the categories of "after stressed vowel" as well as "before unstressed vowel". No account was taken of syllable boundaries. The only significant duration difference between these categories was the greater length of consonants after stressed vowels compared with those after unstressed vowels. The mean consonant lengths were 94 ms and 81 ms respectively (p < 0.05). Thus, although the location of syllable boundaries varied, this does not seem to have affected the durational link between a stressed vowel and its following consonant.
The stress-related effects of vowel and consonant length combine to form a picture of the stressed vowel in Welsh polysyllables as being shorter and having a longer following consonant than an unstressed vowel, particularly in relation to the supposedly unstressed final vowel. In polysyllables, it is the penultimate vowel which is regularly stressed; of the few exceptions, the majority have stress on the final vowel. In (stressed) monosyllables, the vowel is similar to that in the final syllable of polysyllables, in that it has markedly longer duration and an unreduced quality. However, the stressed monosyllable differs from the final syllable of polysyllables in having increased intensity and, apparently, in playing a part in rhythmic organization.
The implication of the above remarks on Welsh stress is that stress and acoustic prominence in Welsh do not always coincide. The only stress cues seem to be of a relational nature, concerning the relative timing of vowel and consonant, or the temporal arrangement of syllables into feet.
1.3. Characteristics of Welsh intonation
Very little work has been carried out on the patterns exhibited by pitch movement in Welsh; thus only the grosser features will be mentioned here. A study of conversational Welsh (Thomas, 1967) describes the characteristic patterns found in the "head" of the "tone-unit" in Welsh. The descriptive framework is based on the tradition of O'Connor & Arnold (1961). Within the head, unstressed syllables form a rising pitch contour after each stressed syllable. The stressed syllables are usually each lower in pitch than the preceding stressed syllable (interspersed with ascending unstressed syllables), but may sometimes form a continuous ascending pitch contour.
The higher pitch of a following unstressed ultima has earlier been observed auditorily by D. M. Jones (1949), who notes "the frequently higher pitch of the final syllable", and acoustically by Watkins (1972), who additionally notes the higher pitch of immediately post-stress unstressed syllables in the English spoken by some native Welsh speakers. For words in nuclear position, the pattern is slightly different, since the stressed syllable may begin one of several possible nuclear pitch contours; thus the following unstressed syllable may be lower in pitch. In some cases, however, the great length of the unstressed ultima allows it to carry a pitch glide, usually falling, which begins at a higher pitch than that of the penult. This "rise-fall" pattern is very characteristic of Welsh and of the English spoken by Welsh speakers. The point to note, however, is that the supposedly unstressed syllable, by carrying a pitch glide, is made pitch-prominent. This could not occur in English, and is a common source of errors made by English speakers in placing lexical stress in Welsh words.²
The implication of the above remarks on Welsh intonation is that stress and pitch-prominence in Welsh do not always coincide. This is in contrast to the situation in English, and will be discussed in greater detail in Section 4.2.3.
1.4. Aim of the experiment
The results summarized in Section 1.2 had been obtained from the measurement of examples of speech production. It was decided to test the conclusions for speech perception also, in order to ascertain whether the discovered lengthening of the post-stress consonant was in fact significant from a functional point of view. As well as investigating this segmental phenomenon, it was proposed to study the role of segmental and suprasegmental cues in the perception of stress. To this end, the interaction between consonant length and F0 pattern was examined.
2. Method
2.1. Segmental processing
A pseudo-minimal pair of Welsh words was chosen, differentiated by stress placement: ymladd /ˈəmlað/ "to fight", and ymlâdd /əmˈla:ð/ "to tire oneself out". The words, embedded in sentences, were among a list read out by a (female) native Welsh speaker in a soundproofed room, using an AKG 451 microphone and an Ampex AG440 open-reel tape recorder (full details of experimental procedure may be found in Williams, 1983a). A sound editing program on a Computer Automation "Alpha" LSI-2/40 microcomputer was then used. The recorded speech was digitized at a sampling rate of 10 kHz after analog low-pass filtering at 5 kHz. The waveform was displayed on a scope, and two manually controlled cursors were used for the demarcation of points on the waveform; all editing was carried out in the time domain. Each word was excised from its carrier sentence, and the length of the /m/ segment was modified, giving a series of stimuli with /m/ lengths ranging from 255 ms to 13 ms in steps of either 15 ms or 11 ms (representing an integral number of cycles). The original words having been added, the total number of stimuli was 40; i.e. 20 from each of the two source words. The stimuli were then reduced to linear predictive coding coefficients and resynthesized using a CED 301 digital speech synthesizer associated with the minicomputer. The stimuli were arranged in random order.
² See Zwicky (1972) for several examples where the author has incorrectly assigned stress to the final syllable.
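The stimulus counts and step sizes given above can be checked against each other with a little arithmetic. The following is only a sketch under stated assumptions: it assumes that each source word contributed 19 edited stimuli (20 minus the original word) and that the 255 ms and 13 ms endpoints were both edited stimuli. The function names are illustrative and are not taken from the original editing program.

```python
# Sketch: which mixes of 15 ms and 11 ms steps span a 255 ms -> 13 ms continuum?
SAMPLE_RATE_HZ = 10_000            # sampling rate stated in the text
LONGEST_MS, SHORTEST_MS = 255, 13  # /m/ length range stated in the text
EDITED_PER_WORD = 19               # assumption: 20 stimuli per word minus the original


def step_mixes(longest, shortest, n_stimuli, steps=(15, 11)):
    """Return every (n_big, n_small) split of the steps that spans the range exactly."""
    n_steps = n_stimuli - 1
    big, small = steps
    return [
        (n_big, n_steps - n_big)
        for n_big in range(n_steps + 1)
        if n_big * big + (n_steps - n_big) * small == longest - shortest
    ]


def ms_to_samples(ms, rate_hz=SAMPLE_RATE_HZ):
    """Convert a duration in whole milliseconds to a sample count."""
    return ms * rate_hz // 1000


print(step_mixes(LONGEST_MS, SHORTEST_MS, EDITED_PER_WORD))  # [(11, 7)]
print(ms_to_samples(15), ms_to_samples(11))                  # 150 110
```

Under these assumptions the only consistent mix is eleven 15 ms steps and seven 11 ms steps. The order in which the two step sizes were applied is not stated in the text and is not implied by this check.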
2.2. F0 patterns
Three F0 patterns were superimposed on the stimuli. In the first set (a), all F0 variation was cancelled out by superimposing a monotone at 280 Hz on resynthesis. In the second and third sets (b and c), F0 patterns were superimposed using a manually operated "pen" linked to a calibrated digitizer pad to "draw" the required F0 patterns while the speech was being output to tape. Since the utterances used as input to this process already had a degree of F0 variation that differed according to source word, the resultant F0 patterns both differed from the superimposed contour and also retained the distinction by source word, as far as the acoustic shape of the utterance was concerned. The input and output for this process are schematised in Fig. 1 (according to source word). The acoustic and perceptual results of this procedure are discussed in Section 3.3 below.
2.3. Procedure
These stimuli were recorded in differing random orders, and the resultant tape, containing lists of 40 words each, was presented to a group of ten Welsh listeners equipped with headphones. The listeners' task was to identify the word in each case (and thereby to make stress judgements), using forced-choice response sheets. A statistical analysis was then carried out on the results, grouped by the three lists of stimuli: list a, monotone; list b, fall/step-down; and list c, step-up and fall. A check on agreement between subjects revealed no great discrepancies, thus all listeners' judgements were included in the statistical analysis of results.³
3. Results
3.1. Nasal length effect
A significant correlation was found between the length of the nasal and listeners' stress judgement, for all F0 conditions. Figure 2 represents the judgements for list a; the other lists gave broadly similar results. The x-axis shows increasing length of /m/, calculated as a percentage of the preceding /ə/ length. The y-axis shows the number of subjects judging the given stimulus to be ymlâdd (i.e. with stress on the final syllable). The judgements are divided by source word, and a regression line has been drawn. The negative slope of the trend represents the fact that a stimulus was more likely to be identified as ymladd the greater the length of the nasal, i.e. a longer /m/ implied stress on the penult. This is what had been predicted from earlier measurements, the findings of which are now seen to hold for perception as well as for production.
³ For full details of the processing of results, see Williams (1983a, b).
[Figure 1. Input and output F0 patterns by list number. Panels for list b and list c, each showing input and output F0 patterns for stimuli from original ymlâdd and from original ymladd.]
The (product-moment) correlation coefficient between nasal length and number of ymlâdd judgements was calculated for each list, as follows:

Nasal length effect (by source word)

List                    Correlation            Significance
                        ymlâdd    ymladd       ymlâdd    ymladd
a (monotone)            -0.84     -0.43        0.0001    0.0577
b (fall/step-down)      -0.40     -0.61        0.0914    0.0042
c (step-up and fall)    -0.53     -0.45        0.0166    0.0442
The first figure in each case represents the value from original ymlâdd (stress on second syllable). The degree of correlation is nowhere very high, but the correlations are statistically significant and consistent across most of the differing F0 conditions, suggesting that the post-stress consonant cue to stress is little affected by pitch variation. In only two cases was the cue not significant at the p < 0.05 level, and in those cases it was significant at the p < 0.10 level.
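For readers who wish to see how a product-moment correlation and its significance level relate, a minimal sketch follows. It is not the author's analysis procedure: it assumes that each coefficient was computed over the 20 stimuli from one source word in one list, and that significance was assessed with the usual two-tailed t-test on r, neither of which is stated explicitly in the text. Because the published coefficients are rounded to two decimal places, the p-values printed here can only approximate the published ones.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_from_r(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))


def two_tailed_p(t, df, slices=20_000):
    """Two-tailed p-value, integrating the Student t density with Simpson's rule."""
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def density(x):
        return norm * (1 + x * x / df) ** (-(df + 1) / 2)

    upper = abs(t)
    h = upper / slices  # slices must be even for Simpson's rule
    total = density(0.0) + density(upper)
    for i in range(1, slices):
        total += (4 if i % 2 else 2) * density(i * h)
    return max(0.0, 1 - 2 * total * h / 3)


N_STIMULI = 20  # assumption: one coefficient per source word per list
for r in (-0.84, -0.61, -0.40):
    t = t_from_r(r, N_STIMULI)
    print(f"r = {r:+.2f}  t = {t:6.2f}  p = {two_tailed_p(t, N_STIMULI - 2):.4f}")
```

The function pearson_r is included so that the same sketch can be applied to raw (nasal length, judgement count) pairs, which are not reproduced in this paper.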
3.2. F0 pattern effect
The F0 patterns of the three lists appeared to influence listeners' stress judgements, as shown below:

F0 pattern effect, by list number

List                    No. of ymlâdd judgements
a (monotone)            188
b (fall/step-down)      229
c (step-up and fall)    203
Since the total number of stimuli in a list was 40, and the number of listeners was ten, the median number of ymlâdd judgements was 200. List c does not differ appreciably from this number; list a is somewhat lower, while list b is much higher. This suggests that, in terms of F0 pattern, listeners tended to hear ymlâdd in list b and ymladd in list a, while the F0 pattern of list c was not associated with any preference. The differing F0 patterns of lists b and c are further characterized in Section 3.3.4, while in Section 4.2 they are seen to give an insight into the functioning of intonation.
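The comparison with the figure of 200 can be made explicit with a few lines of arithmetic. This is an illustrative sketch only; the dictionary keys and function names are hypothetical, and the counts are those given in the table of ymlâdd judgements by list.

```python
# Sketch: shift of each list's ymlâdd count away from the 200 mid-point.
N_STIMULI, N_LISTENERS = 40, 10
TOTAL_JUDGEMENTS = N_STIMULI * N_LISTENERS  # 400 judgements per list

YMLADD_FINAL_STRESS_COUNTS = {  # number of ymlâdd judgements, from the text
    "a (monotone)": 188,
    "b (fall/step-down)": 229,
    "c (step-up and fall)": 203,
}


def shift_from_midpoint(count, total=TOTAL_JUDGEMENTS):
    """Signed difference between a count and half the possible judgements."""
    return count - total // 2


def percent_of_total(count, total=TOTAL_JUDGEMENTS):
    """Share of all judgements in a list, as a percentage."""
    return 100 * count / total


for name, count in YMLADD_FINAL_STRESS_COUNTS.items():
    print(f"{name:22s} shift {shift_from_midpoint(count):+3d}  "
          f"({percent_of_total(count):.2f}% ymlâdd)")
```

The shifts are -12, +29 and +3 judgements for lists a, b and c respectively, which is the pattern described in the text.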
3.3. Source word and F0 pattern effects
3.3.1. Source word effect: correlation
It was found that the origin of the stimulus exerted a significant effect on listeners' judgements. In Fig. 2 this is seen graphically as a clear separation between two sets of data points. The (product-moment) correlation coefficient between source word number (1 for ymladd and 2 for ymlâdd) and number of ymlâdd judgements was calculated for each list, as follows:

Source word effect

List                    Correlation    Significance
a (monotone)            0.71           0.0001
b (fall/step-down)      0.65           0.0001
c (step-up and fall)    0.85           0.0001
The degree of correlation is higher than for the corresponding "nasal length effect" coefficients, and the significance is greater. The source word effect is clearly stronger than the nasal length effect (while not obliterating the latter).
3.3.2. Acoustic factors
The origin of the source word effect was investigated. At first, it was thought that this lay in the differing F0 patterns of the source words, which in turn led to differing F0 patterns in the stimuli derived from each source word, even given the superimposition of other F0 patterns. The difference in F0 patterns may be seen in Figs 3-6. These figures show the waveform of each stimulus below the F0 curve (plotted on a linear Hertz scale). Each example is a reasonably typical case of the class of stimuli in question. The F0 patterns seem to group according to source word rather than list number.
[Figure 2. Scatter plot and regression line for list a (monotone). x-axis: /m/ length as percentage of /ə/ length. y-axis: number of subjects judging the stimulus to be ymlâdd. ◇, from ymladd; □, from ymlâdd.]
[Figures 3-6. F0 curves (Hz) plotted against time, with the waveform of each stimulus shown below the F0 curve.]
However, the source word correlation coefficient for list a lies approximately mid-way between those for lists b and c. The stimuli in list a were produced on a monotone, giving no F0 cue to source word at all. The implication is that the source word effect in lists b and c was similarly independent of F0.
3.3.3. Auditory factors
Furthermore, auditory characterization of the F0 in each of lists b and c revealed that, while the source word could be distinguished according to F0 in list b, careful listening was required to do the same in list c. This was because the slope of the fall on the second syllable was the main distinguishing factor in list c, and this difference was not as perceptually salient as might be supposed from the F0 plots shown. One reason for this could be that in stimuli from original ymladd, where the F0 varies more widely, the peak F0 falls not on a syllable nucleus but on /l/, the onset consonant of the second syllable; such a location for the F0 peak might well render it less perceptually salient than if it had fallen on the vowel, as it does in stimuli from original ymlâdd. Thus, at the level of perception, F0 differences are not completely reliable as distinguishing factors.
Finally, the stimuli from the different source words were consistently distinguished by the duration of the second vowel. The /a/ of original ymlâdd had a duration of 201 ms, while that of original ymladd was 145 ms, a difference of 56 ms. This durational difference is well above the perceptual threshold for sounds of this order of duration (Lehiste, 1970, Chapter 2). Also, inspection of the spectral pattern of the /a/ in each source word revealed that these vowels were virtually indistinguishable in formant pattern. Thus the difference between them was purely durational.
3.3.4. Revised interpretation of F0 patterns
The conclusion reached was that the origin of the source word effect lay in the consistent difference in duration of the second vowel.
While the F0 patterns had seemed the probable origin, they in fact showed certain common auditory features according to list number, as schematized in Fig. 7. In list b, the perceived pitch at the end of the first syllable rose or fell in some cases, and in other cases remained level, as indicated by the alternative lines in Fig. 7, while the pitch of the second syllable was never higher than that of the first. In list c, on the other hand, the pitch at the end of the first syllable showed no variation, while that at the beginning of the second syllable was at a higher level than the pitch of the first, falling to a lower level than that of the first syllable.
[Figure 7. Perceived pitch by list number and source word. Panels for list b and list c, each divided by source word.]
3.3.5. F0 pattern effect
Thus the pitch patterns of lists b and c may be distinguished by means of the second syllable. This was low and/or low falling in list b, and high and falling in list c. The source word effect was attributed to the pronounced difference in /a/ duration between stimuli from different source words, while the "F0 pattern effect" was attributed to the starting height and direction of the pitch of the second syllable.
3.4. F0 pattern effects on measurements
The effects of F0 pattern (and of nasal length and source word) may be seen in Fig. 8. The axes are as in Fig. 2, but the data points are here represented by a single regression line for each subset according to list number and source word. The source word effect is seen in the clear separation of the top three lines (from original ymlâdd) from the bottom three lines (from original ymladd), while the nasal length effect is seen in the negative slope of each line, to varying degrees. The effects of F0 pattern are represented by the relationship between the three types of line in each of the two "source word" groups.
3.4.1. Influence of F0 pattern on size of nasal length effect
It is clear from the table in Section 3.1 that the F0 pattern had an effect on the size of the nasal length effect, which is greatest in list a and smallest in list c. In addition, the size of the nasal length effect according to source word seems also to have been affected. In lists a and c, the nasal length effect is stronger where the source word was ymlâdd, while the opposite applies in list b.
3.4.2. Influence of F0 pattern on the source word effect
The F0 pattern seems also to have a marked influence on the source word effect. This can be seen graphically in Fig. 8, and more particularly in the slopes of the regression lines as given below:

Slope of regression lines by list no. and source word

List                    Original ymlâdd    Original ymladd
a (monotone)            -0.0192            -0.0083
b (fall/step-down)      -0.0072            -0.0121
c (step-up and fall)    -0.0102            -0.0102
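To give a feel for what these slopes mean, the sketch below converts each slope into a predicted change in the number of ymlâdd judgements for a given increase in /m/ length. It rests on an assumption that is not stated outright in the text: that the slopes are expressed in the units of the Fig. 8 axes, i.e. listeners (out of ten) per percentage point of /m/ length relative to /ə/ length. The names used are illustrative.

```python
# Sketch: reading the regression slopes as "listeners per percentage point".
# Assumption: slopes are in the units of the Fig. 8 axes (not confirmed by the text).
SLOPES = {
    ("a", "ymlâdd"): -0.0192, ("a", "ymladd"): -0.0083,
    ("b", "ymlâdd"): -0.0072, ("b", "ymladd"): -0.0121,
    ("c", "ymlâdd"): -0.0102, ("c", "ymladd"): -0.0102,
}


def predicted_change(slope, delta_percent):
    """Predicted change in ymlâdd judgements for a change in relative /m/ length."""
    return slope * delta_percent


for (list_name, source_word), slope in SLOPES.items():
    change = predicted_change(slope, 100)
    print(f"list {list_name}, from original {source_word}: "
          f"{change:+.2f} judgements per 100 percentage points")
```

On this reading, lengthening the nasal by 100 percentage points in list a would be expected to cost stimuli from original ymlâdd about two ymlâdd judgements out of ten, but stimuli from original ymladd fewer than one.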
[Figure 8. Regression lines by list number and source word. x-axis: /m/ length as percentage of /ə/ length. Solid line, list a; dashed line, list b; dot-dashed line, list c.]
3.4.3. Influence of F0 pattern on consistency between subjects
The effect of F0 pattern on consistency of judgements between subjects was also studied. For this purpose, two types of "consistency index" were calculated. These were based on the number of stimuli in each list that had been judged to be ymlâdd by at least five of the ten subjects: such stimuli were referred to as "majority sites".
(a) The number of majority sites in each list was divided by the number of stimuli in the list that had been judged to be ymlâdd by at least one subject. A low figure would indicate that many ymlâdd judgements were made on non-majority sites: in other words, that there was a high degree of "scatter" of the judgements.
(b) The second type of "consistency index" was calculated by dividing the number of majority sites in each list by the number of stimuli in the list. A low figure would indicate that not many stimuli were majority sites: in other words, that there was little agreement among subjects.

Consistency between subjects, by list number

List                    Type a    Type b    Mean
a (monotone)            0.4359    0.4250    0.4305
b (fall/step-down)      0.6944    0.6250    0.6597
c (step-up and fall)    0.7097    0.5500    0.6299
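The two consistency indices defined in (a) and (b) above can be sketched as follows. The input is the number of listeners (out of ten) who judged each stimulus in a list to be ymlâdd. The example counts are invented purely for illustration and are not the experimental data; the function name is likewise hypothetical.

```python
# Sketch of the two "consistency indices"; the example data are invented.
def consistency_indices(counts, n_listeners=10):
    """Return (type_a, type_b) for per-stimulus counts of ymlâdd judgements.

    A "majority site" is a stimulus judged ymlâdd by at least half the listeners.
    type_a = majority sites / stimuli judged ymlâdd by at least one listener
    type_b = majority sites / all stimuli in the list
    """
    majority_sites = sum(1 for c in counts if c >= n_listeners / 2)
    judged_at_least_once = sum(1 for c in counts if c >= 1)
    type_a = majority_sites / judged_at_least_once if judged_at_least_once else float("nan")
    type_b = majority_sites / len(counts)
    return type_a, type_b


# Hypothetical ten-stimulus list: four majority sites, seven stimuli judged at least once.
toy_counts = [10, 9, 7, 5, 4, 2, 1, 0, 0, 0]
type_a, type_b = consistency_indices(toy_counts)
print(f"type a = {type_a:.4f}, type b = {type_b:.4f}")  # type a = 0.5714, type b = 0.4000
```

As a plausibility check, and not as a statement of the raw data, the published list a values of 0.4359 and 0.4250 are what this calculation would give for 17 majority sites among 39 stimuli judged at least once, in a list of 40.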
It is clear that the different F0 patterns had an effect on the consistency of stress judgements between listeners. List a showed the least consistency by far, and list b the most consistency. This may be partially related to perceived "naturalness", since a monotone is manifestly quite unnatural in speech. Among the two lists (b and c) that would have been perceived as more "natural", there is a slight difference in consistency of judgements which may reflect the influence of intonation (see Section 4.2).
4. Discussion
4.1. Segmental durational effects
4.1.1. Source word effect
The source word effect had been found to be due to the differing length of the /a/ in the second syllable of each source word. The fact that [a:] and [a] can differ in length but not appreciably in quality in Welsh has been noted by Ball (1983), who made measurements of a-initial diphthongs in Welsh. He concluded that:
a perceptible length difference does exist between the first element of the ae diphthong and the first element of the ai and au diphthongs. This would therefore support a transcription /a:ɨ/ vs. /ai, aɨ/. However, there did appear to be some qualitative difference as well. It appears though that this difference is so small ... that it should not take precedence over the length distinction (op. cit., p. 88).
As is clear from Fig. 8, the source word effect (seen in the separation between the two groups of three lines) is very strong. However, it by no means overrides the effect of /m/ length and F0 pattern, which can be studied in the light of their varying behaviour according to source word.
4.1.2. Nasal length effect
The perceived nasal length effect supports the earlier observations made from measurements of Welsh speech. The effect is interesting in that it forms a new type of cue to stress, which has tended to be seen in terms of acoustic properties of the stressed vowel itself. The discovery of a stress cue based on temporal relations on the segmental level could be seen as offering indirect support to the view of stress as realized by temporal relations on the suprasegmental level, i.e. rhythm (see Ladd, 1980, Chapter 2, for an exposition of this view).
The fact that a nasal is bound in this way to the preceding vowel suggests a different syllabification strategy from that normally used. Previous studies of syllabification have indicated that, given an intervocalic consonant or consonant cluster, a syllable should be deemed to start with as many consonants as is possible, given the phonotactic constraints of the language. This is the "Maximal Syllable Onset Principle" noted by Selkirk for English (Selkirk, 1980b, p. 9). The same principle is used for Welsh by R. Jones, who states that "statistical evidence favours a C element as the onset marginal of a syllable rather than as the final marginal of a preceding syllable" (op. cit., p. 250). Jones' syllabification rules for Welsh are based on the statistical method used by O'Connor & Trim (1953) for English, and give similar results.
It is not only in Welsh, however, that a consonant seems to syllabify with the preceding vowel (for purposes of stress, at least). In a recent study of the timing of F0 patterns in Danish, Thorsen has found that:
as far as its tonal manifestation goes, stress begins with the vowel, because - contrary to expectations, perhaps - a stressed vowel repels rather than attracts preceding homosyllabic consonants (Thorsen, 1984a, p. 29).
She also cites evidence from Swedish indicating that:
the most reasonable account of rhythmic phenomena is achieved if the onset of the rhythmical unit is taken to be the onset of the stressed vowel rather than, say, the onset of the first prevocalic consonant (loc. cit.).
It is true that Thorsen is here discussing a consonant before, rather than after, a stressed vowel, but the same principle may well hold. Similarly, in Spanish, the duration of a vowel is dependent on the voicing factor of the following, rather than the preceding, consonant. This is also found in English (see summary of past work in Weismer, Dinnsen & Elbert, 1979). Moreover, in the speech of English-speaking children who regularly omit a word-final stop, the length of the preceding vowel can give an indication of the voicing factor of the omitted stop (Weismer et al., 1979). From a diachronic point of view, the deletion of a consonant may be followed by a lengthening of the preceding, rather than the following, vowel; this is the phenomenon most often understood by the term "compensatory lengthening" (Clements, 1982).
In addition, there are other factors in Welsh suggesting an alternative syllabification to that normally used. For example, phonological vowel length in Welsh is determined by the nature and number of the following consonants (Rhys Jones, 1977, pp. 21-22). Also, a slightly greater tendency towards isochrony in Welsh is seen if the rhythmic "foot" is taken to begin at the onset of the stressed vowel, assigning any syllable-initial consonants to the previous foot (Williams, 1983a, p. 43). Thus, as far as duration and rhythm are concerned, syllable boundaries in Welsh seem to occur at the onset of the vowel, ignoring preceding consonants. This conclusion is supported by the nasal length effect investigated here, which operates in relation to the preceding vowel rather than to the tautosyllabic following vowel.
4.1.3. F0 pattern and conflicting/reinforcing segmental cues
There were thus two types of influence on stress judgements: the suprasegmental influence of F0 pattern, and the segmental influence of nasal duration and of /a/ duration.
Neither type of influence was strong enough to override the other completely, but it seems that each was greater when the other influence was ambiguous or counter to expectations. For example, the source word effect is strongest when the pitch pattern gives no help, as in list c (and to a lesser extent, in list a). Likewise, the nasal length effect is strongest when the pitch pattern is uncommon and shifts judgements, as in lists a and b (see Section 4.2). Similarly, the pitch pattern effect was at its strongest when cues of segmental length were contradictory. This can be seen in Fig 8, where, for words from original ymladd, a longer [a:] length contradicted the expectations set up by the longer nasal, leading to a greater reliance on pitch pattern for longer durations of the nasal, as can be seen in the increasing separation of the regression lines as the cues become more conflicting. Conversely, for words from original ymladd, a shorter [a] contradicted the expectations set up by the shorter nasal, leading to a greater reliance on pitch pattern for shorter durations of the nasal, as can be seen in the decreasing separation between the regression lines for lists a and b as the cues become more mutually reinforcing. List c seems to allow listeners to rely almost exclusively on fa/ length . It was not possible to quantify these conflicting and reinforcing cues according to list number, source word , and duration of nasal above or below a (somewhat arbitrary) threshold value. This was because the small number of observations in each of the 12 categories, and the amount of scatter present, led to a lack of significance in the results obtained. However, the general trend is observable graphically from Fig. 8. 4.2. Suprasegmental effects 4.2.1 . Summary of pitch effects From the findings made so far (see Section 3 and Fig. 
8), the main effects of the different pitch patterns can be summarized as follows:
List a (monotone)
  pitch pattern shifts judgements towards ymladd
  much disagreement between listeners
  nasal length effect strong
  source word effect moderate
List b (fall/step-down)
  pitch pattern shifts judgements towards ymlâdd
  much consistency between listeners
  nasal length effect moderate
  source word effect weak
List c (step-up and fall)
  pitch pattern does not shift judgements
  moderate consistency between listeners
  nasal length effect weak
  source word effect strong
4.2.2. Effect of monotone
The monotone pitch pattern seems to have been found the most confusing by listeners, who here showed a higher rate of disagreement. In intonational terms, a constant F0 gives
no clues at all as to stress pattern, and thus mainly durational cues were available to listeners. Both durational effects were of at least moderate strength, indicating their importance here in making stress judgements. A recent experiment involving English has demonstrated "a significant difference between intelligibility performance for normal versus flat F0 passages" (Wingfield et al., 1984). This conclusion is certainly borne out by the present experiment.
A constant F0 is very rarely found in Welsh intonation, and thus the pitch pattern of list a may be regarded as "marked" in some way. This may perhaps be seen in the effect it had on stress judgements, shifting them towards the ymladd form. A possible explanation for this shift is the fact that the stress pattern of this member of the word-pair (stress on penult) is the regular pattern, and so this member of the pair could have been favoured by default.
4.2.3. Effect of list b pattern
The pitch pattern of list b (level, then a fall or step-down) was associated with the highest rate of agreement between listeners, suggesting that the pitch pattern contained an unmistakeable cue to stress pattern. This cue seems to be strong enough to weaken both the source word effect and the nasal length effect.
The nature of this cue to stress pattern may be ascertained with reference to the intonational patterns of Welsh. If the second syllable of the list b pattern were counted as stressed, the whole word would constitute a low falling nucleus preceded by an unstressed syllable. If the first syllable were counted as stressed, the resultant intonational configuration would not be a possible nucleus, nor would it be a pattern occurring in the head (see Section 1.3 for a summary of intonation patterns in Welsh). Thus the intonational odds are weighted in favour of an interpretation which counts the second syllable as stressed; a weighting which does not occur in the monotone pattern of list a.
The pitch pattern of list b thus influences stress judgements towards the ymlâdd form. It could be argued that this is due to the glide or step-down on the second syllable, which gives it "pitch-prominence". While the second syllable is admittedly pitch-prominent, this does not necessarily entail a one-to-one mapping of pitch-prominence on perceived prominence. That this is so can be seen from the pattern of judgements obtained for list c.
4.2.4. Effect of list c pattern
The pitch pattern of list c (level, then a step-up and steep fall) did not shift stress judgements towards either member of the word-pair. Listeners showed moderate consistency, while the source word effect was at its strongest and the nasal length effect was fairly weak.
Considered in intonational terms, the pitch pattern of list c gives no cue as to stress placement. If stress is deemed to fall on the first syllable, the resultant pattern is one found either in the head (a common constituent pattern) or in the nucleus (a "rise-fall" nuclear tone). If stress is deemed to fall on the second syllable, the resultant pattern would qualify as a "high fall" nuclear tone preceded by an unstressed syllable. Thus there is no intonational bias towards either interpretation, unlike the situation in list b.
In connection with list b, it will be noted that the second syllable in the list c pattern is unmistakeably pitch-prominent, bearing a steeply falling glide. Yet there is no sign of the expected bias in judgements towards stress on the second syllable. Thus pitch-prominence in itself cannot be the whole cue to stress, and the fact that a pitch-prominent
syllable was the favoured location of stress in list b is merely contingent upon the particular intonational factors applying to the list b pattern. In fact, under the "rise-fall" interpretation of the list c pattern, it would be the unstressed syllable that carried the pitch glide and was thereby made pitch-prominent. Since the "rise-fall" interpretation was approximately as popular as the "high fall" interpretation (where the stressed syllable is also the pitch-prominent one), it must be concluded that it is not pitch-prominence that conditions stress judgements, but rather pitch pattern considered in terms of the possible intonational categories of the language.
4.3. Implications for the theory of intonation
4.3.1. The "accent" analyses
Such a conclusion is at odds with the "pitch-accent" system developed by Bolinger for English (Bolinger, 1958). This system defines three "accents", or pitch patterns imparting pitch-prominence. A syllable that is pitch-prominent is labelled "accented", supposedly without reference to any independent notion of stress. Yet, as has been shown (Ladd, 1980, Chapter 2), other factors are made use of in practice. For example, under the strict interpretation of the theory, ambiguities sometimes arise when a pitch configuration can equally well be interpreted as one or the other of two "accent" types. It is in such cases that reference is made to stress differences. As Ladd remarks: "The basis for his inconsistent handling of the pitch movement seems to be the very rhythmic intuitions he is trying to explain" (op. cit., p. 36). An even more direct counter-example is given by the common Welsh intonation pattern of a falling glide on the unstressed final syllable (the list c pattern). The "pitch accent" theory might thus assign "accent A" to the unstressed syllable.
As Bolinger formulates his definition, it is also possible that his "accent A" would be assigned to the penult, more in line with the recognized stress pattern of Welsh. However, there is no indication as to the circumstances under which each of these alternatives would apply. In fact, the circumstances must crucially involve the already known stress pattern of the word. This somewhat defeats the point of attempting to define accents without reference to stress, and suggests that pitch patterns are to be viewed mainly in terms of their interaction with stress.
Another analysis of intonation that makes reference to "accents" is that proposed by Vanderslice & Ladefoged (1972). Their "accents" are formulated in terms of the binary features Accent (pitch obtrusion, mainly upwards), Cadence (falling pitch), and Endglide (rising pitch). They also make use of the features Heavy (of a syllable containing an unreduced vowel), and Intonation (marking the location of the nuclear syllable). Only [+heavy] syllables can be candidates for [+accent], only [+accent] syllables can be candidates for [+intonation], and only [+intonation] syllables can be candidates for [+cadence] and/or [+endglide].⁴
4 This looks very like a hierarchy of prosodic units (as in Selkirk, 1980a, b), but clumsily expressed in terms of binary features. These have no formal relationship to one another and so are unable to express the generalisation that these differences are gradations, the values at each level being dependent on the values at the previous level, as indicated.
Vanderslice & Ladefoged's analysis would likewise have difficulty with the "list c" pitch pattern, since here pitch obtrusion (denoting [+accent] in their system) occurs not on the stressed syllable but on the unstressed syllable. Moreover, even their feature Heavy (equivalent to Bolinger's "long" syllable) would lead to difficulties, since the
stressed penult in Welsh often contains the reduced vowel schwa, while the unstressed ultima can never contain it. This leads to the absurd situation in which the stressed penult, being [-heavy], cannot qualify even for [+accent], while the unstressed ultima may be honoured with the nuclear attributes of Cadence and/or Endglide. Clearly, in Welsh, it is not enough to equate pitch obtrusion with stress or "accent".
4.3.2. Some previous phonetic studies
Early acoustic phonetic experiments into stress in English led to the conclusion that a stressed syllable could be picked out by its higher F0 or F0 glide, with longer duration and greater intensity contributing to perceived stress (Fry, 1955, 1958; Lieberman, 1960). Referring to such studies, Ladd comments that "the apparent primacy of pitch obtrusion is to some extent an experimental artifact" (Ladd, 1980, p. 41), explaining that experimenters bent on discovering fixed properties intrinsic to the syllable could find these only at the highest stress levels, and thus concluded that lower-level stress distinctions were an illusion. However, this "illusion" is the means by which stress is perceived by the hearer, writes Ladd, and it is "a very powerful and consistent one based on rhythmic patterns of the whole utterance" (loc. cit.).
This assertion is supported by observations that can be made from an experiment with synthetic pitch contours superimposed on an English sentence (Gårding & Gerstman, 1960). The authors found that the semantically emptier syllables "he" and "-ing" were least likely to be judged prominent even when associated with a pitch peak (thus disproving the "accent" analyses of intonation). More importantly, when the pitch peak was half-way between two rhythmically "strong" syllables, listeners tended to judge the first of these two as having the prominence.
The authors speculate that this fact "suggests, perhaps, that there is a closer association between stress and rise of pitch than between stress and fall of pitch" (op. cit., p. 59). Ladd, however, explains it in terms of "scooped contours", i.e. a pitch rise occurring a short time after the syllable with which it is associated. This pitch rise, contra the accent analyses, does not lead to the perception of an "accent" on the unstressed syllable bearing it. Thus listeners must be making use of their prior knowledge of the possible candidates for stressed syllables. This knowledge would be based, on one hand, on their linguistic knowledge of the type of word and syllable more likely to be stressed; and on the other hand, on their perception of rhythmic prominence on the token in question. Thus, native speaker knowledge of both rhythmic and pitch-prominence patterns influences stress perception, as Ladd observes:
As native speakers we "know" that one of the possible intonational configurations associated with sentence stress is a scooped fall, and we identify the most prominent syllable not on the basis of a simple equation like "(rising) pitch peak = prominence", but in accordance with our knowledge of rhythmic patterns and intonational configurations, and how they can match up (op. cit., pp. 40-41).
In a similar fashion, stress perception in Welsh seems to be mediated by knowledge of the abstract system of all possible intonational configurations, together with independent knowledge of the location of stress in the given case. In the experiment referred to by Ladd, the utterance was of sufficient length to set up a rhythmic momentum in terms of which the stressed syllables could be identified. In the experiment using isolated short Welsh words, no rhythm could be set up; thus other temporal cues formed the basis of "rhythmic" stress perception (i.e. cues of segmental
duration).⁵ Two countervailing influences were then at work: the listener could hear a shorter or longer nasal or /a/ and directly infer that stress lay on the appropriate vowel; also, the listener could hear a particular pitch pattern and, using knowledge of the possible intonational configurations of Welsh, could indirectly deduce which was the stressed syllable.
4.3.3. "British" and "American" phonological approaches
The "British" and "American" schools of suprasegmental analysis (exemplified in the work of Kingdon (1958), O'Connor & Arnold (1961), and Crystal (1969) on the one hand, and Trager & Smith (1951), Hockett (1958), and Chomsky & Halle (1968) on the other) both distinguish conceptually between stress and intonation (see Ladd (1980, Chapter 1) or Crystal (op. cit., Chapter 2) for a review of the two schools of thought). However, the two approaches disagree on where to draw the line between the two phenomena.
The "American" approach, as set out by Trager & Smith, defines "pitch contours" in terms of "pitch phonemes", with no reference to stress. This, like the accent analyses, implies that any pitch pattern must be meaningful and contrastive, and must be thus to the same degree as any other sequence, an implication that is not borne out by the facts. Furthermore, because of their view that the main prominence of a sentence is an instance of the highest degree of stress rather than of pitch, they are forced into the implication that it is merely by chance that this stress level coincides nearly always with the greatest pitch prominence. Clearly, a generalization is being missed here.
The "British" approach defines unitary pitch contours of various types, making no reference to stress. The main prominence of a sentence is seen as a purely intonational (i.e. pitch-related) phenomenon, which just happens to occur always at a stressed syllable.
Again, a generalization is being missed, and the situation can only be described by means of rather ad hoc rules linking the pitch pattern to the stress pattern. This can lead to a confusion in the theoretical position taken. For example, when describing the pitch-range contrasts appropriate to individual nuclear syllables, Crystal begins by stating that these syllables "may only be stressed ... and therefore the unstressed contrasts just noted will not apply" (Crystal, 1969, p. 148). On the origin of this stress on the nuclear syllable, he writes:
It is also obligatory for the articulation of a kinetic tone in English that there is an increase in intensity on the syllable carrying the glide, which perceptually seems equivalent to the loudness of the term stress in the simple syllabic loudness system ... This loudness increase is within the definition of the nuclear syllable, and is not transcribed separately (op. cit., p. 143).
5 Intensity was not considered in this experiment, having been found in previous measurements not to be a stress cue in Welsh. Since this parameter was not altered, while stress judgements varied widely, it seems safe to say that it had no overriding influence on stress judgements.
In the second quotation Crystal seems to attribute this extra loudness to a linguistically non-significant phonetic correlate of a pitch glide, which (although not strictly necessary in purely articulatory terms) he sees as required by English but not made use of linguistically. In the first quotation, on the other hand, he identifies this loudness with linguistic stress. This simpler interpretation is the more plausible, and is also the one actually utilized by Crystal in his attribution to nuclear syllables of the same "pitch-range" features as occur with other stressed syllables. However, this position is not
clearly set out, and is certainly not formalized explicitly in the units and notation employed by the theory. The fact that a nuclear tone must coincide with a stressed syllable has indeed been noted, but has not been formally integrated into an otherwise very sophisticated theory. In this implicit appeal to chance to account for a conspicuous regularity, the "British" approach reveals the same inadequacy as the "American" approach, as regards the interaction of stress and pitch phenomena.
4.3.4. Metrical approaches
The original formulation of the metrical theory of intonation (Liberman, 1975) makes use of the (surface) syntactic structure as the basis for a metrical structure, which in turn forms the basis for both intonation assignment (according to the metrical structure of the intonational tune) and the metrical grid. The grid in its turn forms the basis for stress assignment.
This approach answers the objections raised above to the accent approach in that the tune, having metrical structure in its own right, contains an explicit specification of stress placement in relation to pitch levels (or "tones"). The congruence between this metrical structure and that of the text ensures that the relation between stress and pitch level is constant over all instances of the tune. Also, the mediating role of the metrical structure of the text provides a formalization of the insight that it is only in terms of the interaction between stress and pitch that questions about intonation can have any meaning; in the experiment with Welsh described earlier, the one could be inferred from the other only by referring to their co-functioning in intonation. Since the metrical structure forms the theoretical basis for both stress assignment and pitch assignment, it is by means of this entity that the above insight is formally integrated into the theory.
A more recent variation on the metrical theme (Pierrehumbert, 1980, 1981) grafts "pitch accents" onto Liberman's metrical trees, eliminates tunes and any metrical structure for the pitch accents, introduces rules for handling "downstep", and converts an essentially declarative definition of suprasegmentals into a more procedural model. This approach likewise answers the objection raised by Ladd against an earlier "pitch accent" analysis, since the definitions of some possible pitch accents given by Pierrehumbert (unlike earlier such analyses) contain an explicit representation of the co-occurrence restrictions with stress as an integral part of the definition. In this, the theory is equivalent to Liberman's. However, it differs in that Pierrehumbert's approach, by eliminating the metrical structure of the tune, has lost the formal linking of pitch and stress (by means of the metrical structure of the text and related congruence rules) which reflects what is essential in the perception of pitch and stress contrasts.
Yet Pierrehumbert agrees that pitch and stress, though conceptually distinct, are critically linked in the functioning of intonation. Referring to Fry's classic experiment with an English noun/verb word-pair, she writes:
A given F0 pattern could be compatible with more than one conclusion about the location of stress, if more than one assumption about where the accent is located was consistent with a well-formed intonational analysis for the contour. However, some F0 contours do not display this kind of ambiguity, but instead permit only one conclusion about the stress pattern. It is only in the second kind of case that F0 can serve as a cue for stress (op. cit., p. 103).
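The condition stated in this quotation can be illustrated with a minimal sketch, offered only as a reading aid and not as part of the experimental procedure. It assumes a hypothetical table, READINGS, that records for each experimental list the well-formed intonational readings mentioned in Sections 4.2.2 to 4.2.4 under each candidate stress position; the names READINGS and stress_cued_by_pitch are invented for the illustration, and treating the monotone list a as having no well-formed reading at all is a simplification.

```python
# Hypothetical encoding of the intonational readings discussed in Sections
# 4.2.2-4.2.4. For each experimental list and each candidate stress position,
# the value lists the well-formed analyses mentioned in the text.
# Assumption: list a (monotone) is treated as having no well-formed reading.
READINGS = {
    "a": {"penult": [], "ultima": []},
    "b": {"penult": [], "ultima": ["low fall nucleus after unstressed syllable"]},
    "c": {
        "penult": ["head pattern", "rise-fall nucleus"],
        "ultima": ["high fall nucleus after unstressed syllable"],
    },
}


def stress_cued_by_pitch(list_name):
    """Return the stress position that F0 points to, or None if F0 is no cue.

    F0 counts as a cue only when exactly one stress position yields a
    well-formed intonational analysis of the contour.
    """
    licensed = [pos for pos, analyses in READINGS[list_name].items() if analyses]
    return licensed[0] if len(licensed) == 1 else None


for name in ("a", "b", "c"):
    print(name, stress_cued_by_pitch(name))
```

Under these assumptions only list b returns a stress position (the ultima), while lists a and c return no cue, for different reasons: list a licenses no reading, and list c licenses readings under both stress placements.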
The above quotation encapsulates what has been learned from the experiment with Welsh described earlier; Pierrehumbert's first case corresponds to the list c pattern, her
second case to the list b pattern, while the list a pattern is not covered by this rubric since it is not particularly well-formed under any interpretation. Thus the results obtained with the different experimental pitch patterns would seem to bear out Pierrehumbert's observation above. So it is all the more surprising that her theory neglects to incorporate this insight formally, having dispensed with the theoretical apparatus (the metrical structure of the tune) that originally allowed stress and pitch assignment to be linked through a common basis in the derivation.⁶
4.3.5. A glance at intonation and function
The insight described above was gained by considering the function of an intonation pattern rather than its form, in the first instance. Such an approach allows non-contrastive differences to be filtered out, so that a simpler set of contrasts is obtained which can then be more easily systematized. This approach is supported by Ladd who writes, of the British tradition of suprasegmental analysis, that "it identifies the meaningful distinctions first" (Ladd, op. cit., p. 13). Having identified the working units, it is then possible to go on to a phonological analysis of those units. The units are to be identified on the basis of their function in making syntactic or semantic contrasts, as Householder writes:
But if we postpone our choice of units until after we have established our grammatical contrasts ... we will, I think, arrive at a system ... whose units will be much more efficiently used (allowing not 768 or 243 possible contours, but 18, all of which occur, and many of which can be shown to be distinctive) (Householder, 1957, pp. 237-238).
This emphasis on the priority of semantic and syntactic contrasts over formal pitch contrasts, at least in the determination of units, seems to be shared by Vanderslice & Ladefoged. They state (of their pitch features Dip and Scoop) that:
... we are not sure of their role in the phonological description of English intonation patterns. These additional features clearly convey indexical or paralinguistic information, but we have not been able to find examples of their use in differentiating syntactically distinct sentences (Vanderslice & Ladefoged, op. cit., p. 822).
This is a case where the phonologically defined intonation features have been found to be motivated by independent considerations of "meaning conveyed"; it is the kind of meaning conveyed that is causing difficulty for the authors, who are unsure of the status of paralinguistic information in the determination of phonological units. Although they seem to take a reverse approach, in that they first define their units on formal grounds and then seek syntactically based motivation, this uncertainty on their part points to an acknowledgement of the essential priority of the latter activity, since without an indisputable syntactic motivation for the units, they feel unable to assign full phonological status to entities otherwise very clearly definable on formal grounds.
6 The fact that Pierrehumbert's pitch accents are linear rather than hierarchical in nature also exposes her theory to the criticisms raised by Thorsen (1984b), who presents an elegant demonstration of the inadequacy of linear pitch accents in accounting for certain pitch relations holding across sentences in a text of more than one sentence. This inadequacy contrasts with the more complete account given of the facts by a theory allowing hierarchical structure in the intonation pattern.
A more robust version of this approach is espoused by Crystal, who does not hesitate to do away with paralinguistic information altogether, and to use a similar criterion for
establishing prosodic features as for establishing segmental phonemes. He writes, of prosodic features:
only those features are recognised which are judged to be significant, i.e. contrastive; namely, those whose omission from an utterance would cause a linguistically untrained group of native English speakers to state that the utterance was "different" in meaning from the original ... (Crystal, op. cit., p. 127).
This is a strong statement of the position taken by the workers cited above: namely, that the syntactic/semantic contrastivity of prosodic features should take precedence over purely formal variations in phonological pattern. The same view is supported by the experiment with Welsh described earlier. It was found that only certain variations in F0 affected listeners' judgements, and that this effect could be determined by the addition of stress placement to the equation. Taking the degree of agreement between listeners as a measure of the phonological status of the patterns in question, it is possible to solve the equation and establish certain functionally and thus phonologically valid intonation patterns. This means that the effect of the list b pattern, with the highest degree of agreement, is evidence for positing a configuration "fairly high unstressed syllable plus low stressed syllable", or H + L* in Pierrehumbert-style notation (though this is not meant to imply that the pattern is to be viewed as a pitch accent). Similarly, the effect of the list c pattern, with a moderate degree of agreement between listeners, allows one to posit two configurations: a stressed syllable falling from high to low pitch, preceded by a fairly low unstressed syllable, or else a low stressed syllable followed by a step up to a falling unstressed syllable. In Pierrehumbert-style notation these would be, respectively, L- + H*L-L% and L* + H-L-L%. The effect of the list a pattern, with the lowest degree of agreement, allows one to conclude that a sequence of syllables on the same pitch is not systematically integrated into the intonational system of Welsh. It is true that the above intonational categories had previously been deduced from taxonomic studies (rather than inferred from a functional examination of the effect of intonation on meaning); however, the formal categories observed previously are now found to have contrastive status in the sense discussed above.
They can accordingly be assigned full phonological status, an assignment that could not have been made with certainty before the functional validation had been performed.
4.4. Concluding remarks
Thus it seems that prosodic units must be defined in terms of their function in distinguishing different meanings (in a stricter or looser sense of "meaning"). One could now ask why it is that meaning plays such a large part in the description of prosodic phenomena. A clue may be found in the results of a recent experiment involving degraded speech under various prosodic conditions (Wingfield et al., 1984), which indicated that since prosodic cues to the structure of the speech presented had been found to improve intelligibility,
... it must be because this knowledge [of the speech structure] helps listeners to derive syntactic and semantic contextual information which directly facilitates individual lexical recognition (op. cit., p. 133).
In other words, intonation allows one to infer syntactic structure, which in turn allows one to use stored linguistic knowledge of permitted combinations of words to infer the likely composition of the speech in terms of individual words.
This indirect process is not necessary when the speech is of a high quality; as the experimenters found, speech with "list" intonation on every word (which was extremely clear but gave no clue as to syntactic structure) was the most intelligible at the slowest rate used (229 words per minute). Thus listeners were able to identify words directly when these were undegraded, and no doubt supplied the syntactic structure from their own knowledge of the language. On the other hand, degraded speech (at 460 words per minute) was most intelligible when normal intonation was supplied and the listeners could therefore identify words indirectly by means of the indicated syntactic structure. As the authors comment: When the perceptual demands of the speech task become most difficult, the facilitation offered by normal prosody ... becomes more important (op. cit., p. 132).
Since most conversation takes place under acoustic conditions that are less than ideal, any cue to sentence structure that can be supplied by the speaker will be at a premium. Intonation is such a cue, indicating syntactic structure and, to some extent, semantic or pragmatic units. As intonation is thus so closely linked to meaning in actual language use, it seems a natural consequence that there should also be a link with meaning in the definition of intonational units.

I should like to thank Dr Frank Gooding of the University College of North Wales, Bangor, and Ms Menai Williams, of the Normal College, Bangor, for their help in arranging the experiment. Also I should like to thank Dr Peter Alderson and Brian Pickering, at the IBM UK Scientific Centre, for their help in the production of diagrams, and Dr D. R. Ladd of Edinburgh University for his comments on an earlier version of this paper. Finally, I wish to thank Dr Francis Nolan of Cambridge University, for his continual support and encouragement.
References
Ball, M. J. (1983). A spectrographic investigation of three Welsh diphthongs, Journal of the International Phonetic Association, 13, 82-89.
Bolinger, D. L. (1958). A theory of pitch accent in English, Word, 14, 109-149.
Chomsky, N. & Halle, M. (1968). The Sound Pattern of English. New York: Harper & Row.
Clements, G. N. (1982). Compensatory lengthening: an independent mechanism of phonological change, Unpublished MS, distributed by Indiana University Linguistics Club.
Crystal, D. (1969). Prosodic Systems and Intonation in English. Cambridge: Cambridge University Press.
Fry, D. B. (1955). Duration and intensity as physical correlates of linguistic stress, Journal of the Acoustical Society of America, 27, 765-768.
Fry, D. B. (1958). Experiments in the perception of stress, Language and Speech, 1, 126-152.
Gårding, E. & Gerstman, L. J. (1960). The effect of changes in the location of an intonation peak on sentence stress, Studia Linguistica, 14, 57-59.
Hockett, C. F. (1958). A Course in Modern Linguistics. New York: Macmillan.
Householder, F. (1957). Accent, juncture, intonation, and my grandfather's reader, Word, 13, 234-245.
Jassem, W. (1959). The phonology of Polish stress, Word, 15, 252-269.
Jones, D. M. (1949). The accent in modern Welsh, Bulletin of the Board of Celtic Studies, XIII, 63-64.
Jones, R. O. (1967). A structural phonological analysis and comparison of three Welsh dialects, M.A. dissertation, University College of North Wales, Bangor.
Kingdon, R. (1958). The Groundwork of English Stress. London: Longman.
Ladd, D. R. (1980). The Structure of Intonational Meaning. Bloomington and London: Indiana University Press.
Lehiste, I. (1970). Suprasegmentals. Cambridge, MA: M.I.T. Press.
Liberman, M. Y. (1975). The intonational system of English, Ph.D. thesis, M.I.T. (distributed by Indiana University Linguistics Club, June 1978).
Lieberman, P. (1960). Some acoustic correlates of word stress in American English, Journal of the Acoustical Society of America, 32, 451-454.
O'Connor, J. D. & Arnold, G. F. (1961, 2nd edn 1973). Intonation of Colloquial English: A Practical Handbook. London: Longman.
O'Connor, J. D. & Trim, J. L. M. (1953). Vowel, consonant and syllable: a phonological definition, Word, 9, 103-122.
Pierrehumbert, J. B. (1980). The phonology and phonetics of English intonation, Ph.D. thesis, M.I.T.
Pierrehumbert, J. B. (1981). Synthesizing intonation, Journal of the Acoustical Society of America, 70, 985-995.
Rigault, A. (1962). Rôle de la fréquence, de l'intensité et de la durée vocaliques dans la perception de l'accent en français. In Proceedings of the Fourth International Congress of Phonetic Sciences, pp. 735-748. The Hague: Mouton.
Rhys Jones, T. J. (1977). Living Welsh. London: Hodder & Stoughton.
Selkirk, E. O. (1980a). The role of prosodic categories in English word stress, Linguistic Inquiry, 11, 563-605.
Selkirk, E. O. (1980b). On prosodic structure and its relation to syntactic structure, Unpublished paper presented to the Sloan Workshop on the Mental Representation of Phonology, University of Massachusetts, 18-19 November 1978; distributed by Indiana University Linguistics Club.
Thomas, C. H. (1967). Welsh intonation: a preliminary study, Studia Celtica, 2, 8-28.
Thorsen, N. (1984a). F0 timing in Danish word perception, Phonetica, 41, 17-30.
Thorsen, N. (1984b). Intonation and text in standard Danish, with special reference to the abstract representation of intonation, Paper presented to the Fifth International Phonology Meeting, 25-28 June, Eisenstadt, Austria.
Trager, G. L. & Smith, H. L. Jr (1951). An Outline of English Structure. Studies in Linguistics, Occasional Papers, 3. Washington: American Council of Learned Societies; Norman, OK: Battenburg Press.
Vanderslice, R. & Ladefoged, P. (1972). Binary suprasegmental features and transformational word-accentuation rules, Language, 48, 819-838.
Watkins, T. A. (1972). The accent in Cwm Tawe Welsh, Zeitschrift für Celtische Philologie, XXIV, 6-9.
Weismer, G., Dinnsen, D. A. & Elbert, M. (1979). A clinical study of the voicing distinction and final stop deletion, Unpublished MS, distributed by Indiana University Linguistics Club.
Williams, B. J. (1982). Some problems in the description of stress in modern Welsh, Cambridge Papers in Phonetics and Experimental Linguistics, 1.
Williams, B. J. (1983a). Stress in modern Welsh, Ph.D. thesis, University of Cambridge.
Williams, B. J. (1983b). The interaction of F0 and duration in the perception of stress in Welsh, Cambridge Papers in Phonetics and Experimental Linguistics, 2.
Wingfield, A., Lombardi, L. & Sokol, S. (1984). Prosodic features and the intelligibility of accelerated speech: syntactic versus periodic segmentation, Journal of Speech and Hearing Research, 27, 128-134.
Zwicky, A. M. (1972). On casual speech. In Papers from the Eighth Regional Meeting of the Chicago Linguistic Society, 14-16 April 1972. Chicago: Chicago Linguistic Society.