Journal of Phonetics (1975) 3, 235-255
Integrating different levels of intonation analysis
J. 't Hart and R. Collier*
Instituut voor Perceptie Onderzoek, Den Dolech 2, Eindhoven, The Netherlands
Received 5th June 1975
Abstract:
This paper deals with a partly experimental approach to the complex relationship that exists between the abstract, global structures of intonation and the concrete, atomistic features of the course of the fundamental frequency. More specifically, we have introduced three levels of description and have attempted to establish links between them: a concrete and atomistic level of the perceptually relevant pitch movements; a concrete and global level of the audible pitch contours and the measurable fundamental frequency curves; and finally, an abstract and global level of the intonation patterns. In the corresponding three main parts of the paper it will be shown (1) how pitch movements can perceptually segment the fundamental frequency continuum; (2) how a "grammar" can be designed that is capable of generating all and only the acceptable combinations of pitch movements, i.e. pitch contours; (3) how listeners categorize different pitch contours into meaningful classes, i.e. intonation patterns. The investigation was based on Dutch utterances.
Introduction
It has been observed that "most descriptions of intonation have been at one or the other of two extremes: atomistic or global" (Bolinger, 1972, p. 51). Global intonation studies are characterized by the fact that they attempt to describe pitch as an overall structure that spreads over the entire length of an utterance. Examples of this approach are Armstrong & Ward (1926) and Jones (1957), who analyse English intonation in terms of two basic "Tunes", each with a number of variants. The perceptual identity of these Tunes is such that it is not affected by the variable length of utterances: "these Tunes may be spread over a large number of syllables, or they may be compressed into smaller spaces" (Jones, 1957, p. 279). Being an elastic, overall structure, the Tune is a powerful concept that allows intonational generalizations up to the level of infinitely long sentences. Its main disadvantage, however, is that it does not reveal the detailed internal composition of the pitch contour. It is too global.
Opposed to the "Tune" approach is, for instance, Palmer's (1933) proposal to describe pitch in terms of tonal "Nuclei". In this framework nuclear pitch movements become associated with accented syllables and are at the centre of "Tone Units", which further consist of an optional "(pre-)head" and "tail". Thus it becomes possible to decompose the pitch contour into a number of successive tone units. This descriptive alternative has been elaborated further by Kingdon (1958), Halliday (1967), Crystal (1969), and others. It is considered a net improvement over the Tune approach: "the analysis of the formal structure of intonation patterns made little progress while classification was based (as in the bankrupt Armstrong-Ward approach) on the complete Tune rather than on its constituent elements" (Windsor Lewis, 1971, p. 81). Yet the Tone Unit has its own drawbacks. Indeed, it appears that intonational generalizations on the level of the utterance are now hindered by "the absence of any agreed framework for handling prosodic matters above the tone unit" (Crystal, 1969, p. 255). The sentence, which used to be the regular domain of Jones' Tune approach, now turns out to be "too variable a unit for practical intonation analysis and description" (Crystal, 1969, p. 275). Clearly, the successive refinements that have been introduced within the boundaries of the tone unit have made it increasingly difficult to see the wood for the trees. The approach has become too atomistic. The same objection can be raised against the "Level" approach (Pike, 1945; Wells, 1945; Trager & Smith, 1951), which "looks for meaningless sub-units which bear the same relationship to intonation that segmental phonemes bear to words" (Bolinger, 1972, p. 51).
Thus it seems as if intonation studies can only be either "atomistic" or "global". Yet considerable progress in intonation research might be achieved if it were possible to reconcile the "atomistic" and the "global" points of view, combining the advantages of both approaches into one single framework which could allow a description "from atomistic to global", and vice versa. In this study we will describe the ways in which we have attempted to solve this first problem.
*Belgian National Science Foundation (NFWO).
What characterizes intonational research (and speech research in general) is the fact that the phenomena under investigation can be dealt with at various levels, each corresponding to a different degree of abstraction. The most concrete, directly measurable manifestation of intonation is to be found in the acoustic and physiological domains. At this level one can analyse the fundamental frequency of the speech signal, or study the physiological mechanisms that control the rate of vocal cord vibration. A first degree of abstraction is introduced by using the human ear as an analytical tool.
The ear can be trained (or otherwise aided) to become sensitive to small variations in pitch, and is then capable of performing a detailed perceptual analysis. Yet even the trained phonetician's ear is inherently limited in its resolving power, and this will have a smoothing effect on the outcome of a perceptual analysis as compared to the output of an (ideal) F0 meter. A higher degree of abstraction is obtained when the human ear listens to the intonational aspects of the speech signal in a broad fashion, as it does in everyday speech communication. Then the listener is merely concentrating on the overall melodic shape of the utterance and his selective attention is guided by the need to detect the communicatively relevant pitch events only. The highest level of abstraction is the one where intonation is viewed as a mental category that is integrated in the speaker-listener's linguistic competence. Only at the latter level is intonation completely void of tangible, material aspects. It is pure "form". Even if the perceptual level can be divided into two distinct degrees of abstraction, which in turn are more abstract than the physiological and acoustic levels, the fact remains that at all these levels intonation appears to possess some "substance". For this reason, as well as for the sake of simplicity, we will combine the acoustic, physiological and perceptual aspects of intonation into one class of "concrete" phenomena, and will restrict the use of the term "abstract" to its mental representation. Although it is perfectly legitimate to limit one's research efforts to just one of the aforementioned levels, it is equally compelling to try to combine the corresponding phenomena across each of these levels into one framework. Therefore, a second problem that can inspire research on intonation is to find a way to relate intonation patterns, which we have previously defined as abstract mental categories, to their concrete realizations ('t Hart &
Cohen, 1973). The latter appear, for instance, in the form of F0 curves (the course of the fundamental frequency as obtained through acoustic analysis), or in the form of pitch contours (which are defined as the stylized perceptual equivalents of the F0 curves, and are composed of pitch movements). More complete definitions of pattern, contour and pitch movement have been given in 't Hart & Cohen (1973, p. 311). Since the intonation patterns are abstract, global structures and the pitch movements are concrete, atomistic elements, it appears that the second problem (viz. abstract vs. concrete) is interwoven with the first (viz. global vs. atomistic). These relationships can be schematized as follows:
              Abstract              Concrete
Global        Intonation pattern    F0 curve / Pitch contour
Atomistic                           Pitch movement
We take the position that this double problem can be answered if we find a solution to the following three sub-problems:
(1) It seems necessary, at the outset of the investigation, to try to find plausible descriptive units. The size of such units should obviously lie between that of the course of pitch during an entire utterance and that of one pitch period. The quasi-continuous character of the F0 curve makes it rather unlikely that it will automatically provide cues for the desired segmentation. Moreover, even if the F0 curve were readily segmentable, its capriciousness would still produce segments that are not sufficiently invariant for our purpose. The desired discreteness and invariance of the descriptive units can only be found in the perceptual domain. Therefore it is necessary to introduce an acoustical analogue of the natural F0 curve that is stylized in such a way that it can be segmented into invariant elements and, at the same time, produces a perceptual impression that is equivalent in all of its relevant melodic detail to that evoked by a natural F0 curve. The construction of such an acceptable acoustic analogue (i.e. a pitch contour) and the retrieval of suitable descriptive units will be the main subject of section 1.
(2) Once suitable descriptive units have been found, each pitch contour that has been analysed up to that point can be described as a sequence of those units. But it remains to be shown that the same units are necessary and sufficient for a comparable description of pitch contours still to be produced by any speaker of the language. Therefore a generative algorithm should be developed that is capable of arranging the descriptive units into combinations corresponding to all and only the acceptable pitch contours of the language. The design of such an algorithm and its verification will be dealt with in section 2.
(3) The problem that will concern us in section 3 deals with the question of how the vast diversity of observable pitch contours is related to their abstract representation in the mind of the speaker-listener. It is not implausible to assume that this relationship is such that the infinite set of different pitch contours ultimately derives from a limited set of abstract categories which we have called intonation patterns.
The first two questions are directly related to the opposition of global vs. atomistic; the third question deals with the issue of abstract vs. concrete. The relation between the three questions themselves can be discussed with the aid of Fig. 1. The relationship aimed at between concrete F0 curves and abstract intonation patterns is indicated by a broken line and question marks, since this relationship cannot be established directly. Yet it is our conviction that this relationship can be established through a "perceptual detour": the segmentation of the continuously changing global F0 curves by applying the criterion of perceptual relevance opens the way to the desired atomistic description in terms of pitch movements (question 1); if, next, we try to specify all the possible ways in which the atomistic units can combine into pitch contours that are perceptually equivalent to the natural F0 curves, we arrive at a global description (question 2). In doing so, we can hope to close the circle that links the concrete global and atomistic entities; once we know the composition of contours in their atomistic detail we can experimentally manipulate
Figure 1. The relationship between the abstract and concrete, and the global and atomistic aspects of the intonational problem.
their build-up and investigate the perceptual and communicative consequences of such manipulations on the abstract level of the intonation pattern (question 3). Moreover, by virtue of the perceptual equivalence between contours and F0 curves we may conclude that the relation uncovered between patterns and contours also holds between patterns and curves.
The investigation was based on pitch contours in Dutch utterances. We are convinced, however, that our research methods and many of our results may be relevant to the study of intonation in general.
(1) Question 1: from continuous F0 curves to discrete pitch movements
From an intuitive point of view it seems plausible to assume that not all measurable F0 changes are equally relevant to the perception of pitch in speech. Among those physical variations that are well above threshold there may be some that strike our ears while others pass unnoticed. In other words, when listening to an utterance we are not following its pitch period by period (as an ideal pitch meter would), but are only sensitive to a certain number of pitch events. It becomes interesting then to explore the relationship between the continuity of the physical variations and the discreteness of the corresponding perceptual events.
Method: analytic listening
In trying to relate F0 curves to those pitch events the listener is sensitive to, one needs an analysis technique that is perceptual, and hence subjective, but which produces results that can be verified. This preoccupation with objectivity in auditory analysis has led to the elaboration of a technique called "analytic listening". In analytic listening the listener tries to concentrate on the purely melodic aspect of intonation while abstracting from any relation to the verbal information. As has been explained on earlier occasions ('t Hart & Cohen, 1964; Cohen & 't Hart, 1967), speech pitch can be measured perceptually when successive short portions (30-50 ms) of an utterance are isolated by means of an electronic gate and matched against an equally long, artificial sound with approximately the same timbre and loudness, whose periodicity can be adjusted and subsequently measured on a frequency counter. A similar pitch matching technique is used in experimenting with the Intonator, an apparatus that replaces the natural course of fundamental frequency by an artificial one, whose variation can be controlled externally. In this case the entire original contour, rather than a gated-out portion, can be matched against a complete, adjustable copy.
In the course of the investigation to be reported here the procedure was as follows. F0 measurements of an utterance, which was played back from a tape loop, were obtained with the aid of two different pitch meters (a Trans-Pitchmeter, designed by Frøkjær-Jensen, and a Period Sieve, designed by Willems, 1970). At the same time the Intonator was set to provide the visually best possible copy of the F0 curve. Subsequently, this copy was stylized, i.e. some of the original F0 changes were completely smoothed out (and replaced by a straight line of slowly falling pitch, the so-called declination line), while others were reduced to an elementary shape.
Stylization was guided by the requirement that the physically simplified contour should, in all of its detail, sound melodically identical to the original course of the pitch. This correspondence between the melodic impressions evoked by the natural F0 curve and the stylized pitch contour will be referred to as "perceptual equivalence". It has been systematically investigated with the present authors as sole judges, but on numerous occasions it has been confirmed in confrontations with other listeners. The ultimate setting of the Intonator in this procedure is one which contains the smallest possible number of pitch movements. These are supposed to be the perceptually relevant, atomistic units into which the continuous, global course of F0 can apparently be segmented. In the stylized contour their slopes and durations were given standard values, whose validity was checked when they were applied as constituent elements for subsequently investigated new utterances.
The search for relevant pitch movements
Before the present investigation was started, the perceptual analysis of various speech materials (sentences read aloud, spoken news bulletins, radio commentaries, fragments of spontaneous conversations) had already resulted in a set of relevant pitch movements (Cohen & 't Hart, 1967; 't Hart & Cohen, 1973). The latter work presented an inventory of these pitch movements and showed how they operate as constituents in the various contours that can be derived from one type of basic pattern, the hat pattern. The inventory consisted of the items shown in Table I. The completeness or incompleteness of this inventory was subsequently checked in the following way. From 1.5 h of tape-recorded informal conversation between four native speakers of Dutch about 100 utterances were selected that met the double criterion of
Table I  Preliminary inventory of pitch movements

Symbol   Description
1        Prominence-lending rise, early in the syllable
2        Non-prominence-lending rise, as late as possible in the syllable
A        Prominence-lending fall, or "final fall", rather late in the syllable
B        Non-prominence-lending fall, or "non-final fall", very early in the syllable or in between two syllables
D        Gradual fall, covering several syllables
E        Half-fall
0, Ø     Declination line: in all stretches of the contour where none of the above-listed movements takes place, there is a slowly downward drifting pitch, at a low level (e.g. before a rise, or after a fall), transcribed as 0; or at a high level (e.g. after a rise, or before a fall), symbolized by Ø.
being intonationally finished and grammatically well formed, and not lasting longer than 5 s (in view of the capacity of the Intonator). We carried out a perceptual analysis by means of the Intonator, using the technique described above. A fair number of the pitch contours thus analysed were easily recognizable as being derived from the hat pattern, and allowed a stylization by means of the already known elementary pitch movements of this pattern. On the other hand there were utterances for whose stylization the hat pattern recipe was not fully adequate: it led to contours that sounded acceptable as such but were not perceptually equivalent to the input sentence. In such cases the required perceptual equivalence could only be obtained by the introduction of new perceptually relevant pitch movements. In this way three new types of rise (symbolized as 3, 4 and 5) and one new type of fall (symbolized as C) were uncovered. At the same time the features of the half-fall, E, became clearer. Table II presents the main characteristics of the inventory thus obtained of 12 perceptually relevant pitch movements.

Table II  Inventory of 12 perceptually relevant pitch movements in Dutch
Declination:  Low 0;  High Ø

              Rise     Fall
Early         1 (+)    B (−)
Late          3 (+)    A (+)
Very late     2 (−)    C (−)
Gradual       4        D
Half          5        E

(+) = + Prominence; (−) = − Prominence
More elaborate definitions of the previously known pitch movements of Table I have been given in 't Hart & Cohen (1973). The newly uncovered pitch movements can be defined as follows.
Rise 3
In some contours the prominence-lending rise cannot be stylized as standard rise 1. Instead, a rising pitch movement should be applied which is located typically late in the syllable.¹ In a separate adjustment experiment the peak of rise 1 was located at 40 ± 30 ms after the vowel onset, while that of rise 3 was situated at 90 ± 45 ms (Collier, 1970). Quite often rise 3 is found to have a larger excursion than rise 1. As a standard recipe it is feasible to take the same slope as for rise 1, viz. 4 semitones per 100 ms, but for rise 3 the duration of the rise should be 150 rather than 100 ms.
Rise 4
In some contours the gradually rising pitches of successive syllables between two pitch accents can be stylized as one smooth, continuous rise during these syllables. The number of syllables is not fixed and, since the pitch at the end of this movement is approximately as high as the peak of the preceding accentuated syllable, the slope of this rise 4 is directly dependent on the time interval between the two pitch accents. We have called this movement "inclination" since it can be considered as the inversion of the "declination" line.
Rise 5
In quite a number of contours with "inclination", and in some standard "hat patterns", an extra rise may be observed immediately preceding the final fall, A, and occurring on the same syllable. This rise can be stylized as one of standard slope, but of half the duration of rise 1. If rise 5 is deleted, either the prominence given by the final fall, A, becomes audibly weaker than in the original utterance, or the inclination may be insufficient to make the second pitch peak as high as the first (cf. supra).
Fall C
In some contours, during the last 20 to 50 ms of phonation in the utterance, F0 goes down rapidly to an immaterial value.
This movement, although probably a mere relaxation phenomenon, is perceptually relevant: its omission in the stylized contour is readily noticed. Its position in the final syllable should be very late to avoid undesired prominence on that syllable.
Fall E
This fall has been called "half-fall" since, after it has been completed, the pitch gives the impression of being neither high (level of declination line Ø) nor low (level of 0). Often the occurrence of this pitch movement suggests a jump of the pitch over a musical interval, viz. a minor third. Apparently, fall E manifests itself in three different conditions. First, it may replace the complex A&2 (final fall and non-prominence-lending rise on one syllable), either in utterance-final position or before a major syntactic break (continuation). It might then be regarded as a contraction of the two movements A and 2. A second condition, also in utterance-final position, is one in which it is not equivalent to A&2.
¹ The position of a pitch movement in the syllable is defined with respect to the vowel onset moment.
It then causes quite a different attitudinal impression and may be similar to what Abe (1962) has observed in English, viz. a "call contour" in which the "lower pitch is suspended in mid-air". A third type of occurrence is in between rise 1 and fall A. In this position it can repeat itself and hence give a terrace-like shape to the contour, comparable to the "descending stress series" of Pike (1945, p. 70). In such cases fall E normally lends prominence, which is not generally the case in the former two types of occurrence.
Discussion and conclusion
It appears that our smallest perceptual units are pitch movements, i.e. changes of pitch that are a function of time. These pitch movements can be characterized in terms of a number of perceptual dimensions: direction (rising or falling), rate of change (abrupt or gradual), timing (early, late or very late in the syllable), and prominence (yielding a pitch accent or not). This last dimension is not independent of the others, especially not of the timing dimension (van Katwijk, 1974, p. 87). The interval covered by rises and falls seems to be immaterial. This is not merely due to an insufficiency in the refinement of the description, since whenever necessary in the process of stylization we have indeed introduced a distinction between full-size and half-size pitch movements. However, distinctions of the type "low rise" vs. "high rise", for example, appear not to be indispensable elements in the perceptual characterization of isolated pitch movements. On the other hand, listeners seem to be very sensitive to differences in the timing of the pitch movements, i.e. their position with respect to the vowel onset in the syllable. Finally, we observed that, whenever a syllable contains more than one pitch movement, the configuration can be stylized as a concatenation of two or more single pitch movements with the same characteristics as when they occur in isolation.
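As a rough numerical illustration of the standard recipes quoted earlier for rises 1, 3 and 5 (a common slope of 4 semitones per 100 ms, with durations of 100 ms, 150 ms and half the duration of rise 1), the following Python sketch computes the excursion of each rise and the frequency reached at its end. The function names and the 100 Hz starting frequency are assumptions introduced only for this example; they are not part of the Intonator procedure.

```python
# Standard slope of the abrupt rises as quoted in the text:
# 4 semitones per 100 ms.
SLOPE_ST_PER_MS = 4 / 100

# Standard durations in ms: rise 1 = 100, rise 3 = 150,
# rise 5 = half the duration of rise 1.
RISE_DURATION_MS = {"1": 100, "3": 150, "5": 50}


def excursion_semitones(symbol: str) -> float:
    """Size of the rise in semitones: slope multiplied by duration."""
    return SLOPE_ST_PER_MS * RISE_DURATION_MS[symbol]


def shift_frequency(start_hz: float, semitones: float) -> float:
    """Frequency reached after moving `semitones` away from `start_hz`.

    Twelve semitones correspond to a doubling of frequency.
    """
    return start_hz * 2 ** (semitones / 12)


if __name__ == "__main__":
    start_hz = 100.0  # hypothetical starting pitch, for illustration only
    for symbol in ("1", "3", "5"):
        st = excursion_semitones(symbol)
        end_hz = shift_frequency(start_hz, st)
        print(f"rise {symbol}: {st:.1f} semitones, "
              f"{start_hz:.0f} Hz -> {end_hz:.1f} Hz")
```

Working in semitones rather than hertz keeps the recipe independent of the speaker's pitch range, which is presumably why the slopes are stated in those units.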
In conclusion, by finding 12 types of relevant pitch movements we have apparently managed to reduce the infinite number of possible F0 variations to a limited set of elementary perceptual units. To a large extent we owe this achievement to the application of our "analytic listening technique" (with the Intonator). The fundamental reason why such a reduction is possible resides in the fact that the physical tolerance region for each perceptual unit (in each of its dimensions) is apparently fairly large and partially overlaps that of others. It follows that the relationship between the "global" and "continuous" physical F0 curves on the one hand, and the "atomistic" and "discrete" perceptually relevant pitch movements on the other, can only be established within broad limits. We shall therefore conclude this section with an explicit statement of our present claims.
(1) Given an arbitrary F0 curve of a Dutch utterance that is not made audible, there will remain some physical variations for which it cannot be predicted with certainty to which type of relevant pitch movement they can be reduced. The claim is, however, that through the use of the analytic listening technique all possible F0 variations can be reduced to only the pitch movements inventorized above.
(2) Given an arbitrary Dutch pitch contour, described in terms of our relevant pitch movements, we cannot with certainty predict the corresponding F0 curve of the natural utterance in all its details. In a first approximation there will be a considerable amount of discrepancy, particularly in perceptually irrelevant portions of the curve (e.g. in stretches of the declination line, which does not really manifest itself as a straight, somewhat tilted line).
However, the number of discrepancies can be reduced if we incorporate some rules regarding the intrinsic pitch of the vowels (Lehiste & Peterson, 1961), to the effect that closed vowels have higher pitches than open vowels, and rules about micro-intonation (Cohen & 't Hart, 1967), which assign higher pitch to vowels than to consonants, and make
the shape of the pitch elevation dependent on the nature of the preceding consonant. Nevertheless, there will remain serious discrepancies, among others those due to the sometimes enormous perceptual tolerances. But the claim is that these deviations, although objectively demonstrable, are not relevant from a perceptual point of view. In short, it must be admitted that the fairly gross tolerance regions of the relevant pitch movements make a unique correspondence between actual F0 curves and Intonator stylizations not entirely possible. On the other hand, it is by virtue of their very existence that a relation between continuous curves and discrete pitch movements can be established at all.
(2) Question 2: from simple pitch movements to complex contours
It has been shown in the preceding section how a measured F0 curve can be decomposed into a number of relevant pitch movements which, together, constitute a stylized pitch contour whose perceptual effect is equivalent to that of the original F0 curve. In other words, global F0 curves and pitch contours can be related through the intermediate atomistic pitch movements in terms of which both can be described. Yet the definition of what constitutes a possible (i.e. acceptable) pitch contour or F0 curve in a language includes more than an enumeration of its ingredients: the order of the constituents also has to be taken into account. This can be done by writing a grammar that makes clear what combination restrictions govern the concatenation of single pitch movements in complex contours.
Method: prediction and verification
The method applied consisted of the following steps:
(1) Designing a tentative grammar
(2) Checking it against a corpus
(3) Analysing the discrepancies between predicted and observed contours, in so far as they are systematic
(4) Re-modelling the grammar
(5) Checking it against independent material.
Let us now elaborate somewhat on each of these steps. At the outset of this investigation a number of frequently occurring pitch movement combinations were already known (e.g. those contours derived from the hat pattern). Other combinations, particularly some containing the new pitch movements described above, were less familiar, although apparently possible. To this set of already observed regularities we added a number of combinatory possibilities that we had not yet encountered, but that we considered to be possible. Thus, we attempted to extrapolate from our limited state of empirically founded knowledge to a tentative statement about all possible pitch movement combinations (contours) of Dutch. To that end, explicit predictions were worked out. In formalizing these predictions we used the pitch movement symbols defined in the preceding section. The ensemble of predictions was called a grammar. Its characteristics will be presented below.
In order to check the predictions of the grammar, we chose three samples of contemporary, quasi-spontaneous stage speech. The samples lasted from 5 to 10 min each. All the utterances (661 in all) were submitted to a perceptual analysis that consisted of assigning one of the 12 elementary pitch movement symbols to every successive syllable in each utterance. All the utterances were analysed one after the other, without any selection.
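The checking step can be pictured as a coverage computation: every transcribed utterance is a string of pitch movement symbols, and the grammar either accounts for it or does not. The Python sketch below is only an illustration under stated assumptions. The grammar is represented as a list of regular expressions, the three toy patterns are hypothetical and are not the authors' grammar, and high declination is written as "H" so that every symbol is a single ASCII character.

```python
import re
from typing import List

# Hypothetical toy patterns over one-character symbols (NOT the authors'
# grammar): "0" = low declination, "H" = high declination,
# digits = rises, capital letters = falls.
TOY_GRAMMAR: List[str] = [
    r"0*1H*A0*",       # rise 1, stretch of high declination, fall A
    r"0*1(B1)*H*A0*",  # the same, with repeated fall B + rise 1 pairs
    r"0*1H*A0*2",      # as the first pattern, closed by a late rise 2
]


def is_predicted(contour: str, grammar: List[str]) -> bool:
    """True if at least one pattern matches the whole transcription."""
    return any(re.fullmatch(pattern, contour) for pattern in grammar)


def unpredicted(contours: List[str], grammar: List[str]) -> List[str]:
    """Transcriptions the grammar does not account for."""
    return [c for c in contours if not is_predicted(c, grammar)]


def coverage(contours: List[str], grammar: List[str]) -> float:
    """Fraction of transcriptions accounted for by the grammar."""
    if not contours:
        return 0.0
    hits = sum(is_predicted(c, grammar) for c in contours)
    return hits / len(contours)


if __name__ == "__main__":
    # Invented transcriptions, one string per utterance.
    corpus = ["01HHA0", "1B1B1A", "01HA02", "0A1"]
    print(f"coverage: {coverage(corpus, TOY_GRAMMAR):.2f}")
    print("not predicted:", unpredicted(corpus, TOY_GRAMMAR))
```

Listing the unmatched transcriptions, rather than only counting them, is what makes it possible to look for systematic discrepancies in the next step.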
The utterances thus transcribed were subsequently matched with the predicted, "possible" contours. In examining the discrepancies, special attention was paid to those that occurred in a systematic way, and thus apparently enabled us to introduce suitable and desirable generalizations, to be incorporated in an improved version of the grammar. Finally, this new version was checked both against the first corpus and against an independent one, consisting of three additional samples of quasi-spontaneous stage speech plus a sample of genuinely spontaneous conversation (698 utterances in all).
Developing a grammar of pitch contours
The grammar (in its successive versions) is intended as a prediction, i.e. a statement concerning future events based on the understanding of previously observed phenomena. This prediction deals with the nature and the number of different pitch contours in Dutch utterances of indefinite length.
Some characteristics of the first, tentative grammar
In its first version, the grammar explicitly states all the permissible successions of pitch movements, i.e. all and only those that we expected to occur. Thus it describes every possible pitch contour as a typical string of successive, discrete pitch movements in an utterance. In fact, this grammar is merely a condensed list of all expectedly possible pitch contours. For that purpose, and to avoid the need of having as many contours as there are utterances, a few abbreviative conventions are introduced. Thus, optional elements are put between round brackets, e.g. ... Ø (5) A 0 ..., in which the half rise 5 may or may not occur. Optional and recursive elements are put between square brackets, e.g. A[0]2, meaning that the number of syllables on the low declination line between fall A and rise 2 may vary between zero and an indefinite number. Further, the occurrence of "optional alternatives" ('t Hart & Cohen, 1973) is taken into account by grouping them together in
braces, e.g. {B / D}, meaning that either the non-final fall, B, or the gradual fall, D, may
occur. Finally, it is expected that within a string of pitch movements some combinations will repeat themselves. Thus, the expected repetition of prominence-lending rise 1 and non-final fall B (viz. 1B1B1B ...) is formalized as 1[R], where R = B1.
Verification of the first, tentative grammar
In order to make the transcription of the three samples of quasi-spontaneous stage speech as reliable as possible, we adopted the following strategy:
(a) The descriptive unit chosen was the pitch movement within one syllable, deprived of its contextual relationships. (b) The perceptual relevance of the chosen descriptive units (pitch movements) was established prior to and independently of the present analysis. (c) Approximately one-half of the material has been independently analysed by two trained listeners, the present authors. Their transcriptions were subsequently compared and all discrepancies between the two analytical sets of data were subjected to further examination, if necessary with other means [see (d)]. The other half of the material was divided for analysis between the two listeners, who each made a list of doubtful cases, which were also subjected to further inspection. (d) Analytic means other than the trained ear were introduced whenever necessary: either acoustic analysis or perceptual analysis with the help of the Intonator.
(e) Since a set of acoustic correlates was known for each pitch movement, it was possible to convert the transcription into commands for the Intonator, thus making the transcription "audible" at once. Comparing the original utterance with the synthetic, transcription-based version of it constitutes the ultimate verification of the validity of the analysis. Since this procedure is laborious it was applied to a limited number of utterances only, with the investigators themselves as sole judges. We estimate that thanks to these safety measures our final transcription will not contain more than 5% uncertainties (as a percentage of the total number of syllables). The confrontation of the tentative grammar with the transcribed corpus showed that about 30% of the contours had not been accounted for in the grammar. On the other hand, many predicted strings did not occur in the material. Re-modelling of the grammar Among the discrepancies between the tentative grammar and the material the following occurred in a systematic way:
(a) In the tentative grammar the non-final fall, B, could be preceded by a rise of type 1 only. The combination of B and a following rise 1 was expected to be recursive. Upon inspection of the discrepancies it appeared that the non-final fall, B, whether in a recursive combination or not, could be accompanied by rises 3 and 4 as well. Moreover, combinations of B with rises 1, 3 and 4 were also found to occur in a mixed way, e.g. ...1B4B.... This suggests that a complex containing B should be considered as "rise + fall" (rather than as "fall + rise", as in the previously mentioned recurrent B1 complex), irrespective of the particular type of rise [see Fig. 2 (a)].
(b) A similar observation could be made with respect to continuations accompanying major syntactic breaks. Indeed, contrary to the prediction of the tentative grammar that continuation can only occur in contours derived from the "hat pattern", the material contained instances in which continuation is associated equally well with contours of different composition, e.g. 01&B45&A02030C [see Fig. 2 (b)].
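Observation (a) can be illustrated with two regular expressions. This is only a sketch: it ignores the optional 0 elements, treats every movement symbol as a single character, and the names TENTATIVE, REVISED and fits are introduced here for illustration and do not come from the paper.

```python
import re

# Tentative grammar: 1[R] with R = B1, i.e. rise 1 followed by any number of "fall B + rise 1".
TENTATIVE = re.compile(r"1(?:B1)*")

# Revised reading: one or more "rise + fall" complexes, where the rise may be 1, 3 or 4.
REVISED = re.compile(r"(?:[134]B)+")


def fits(pattern, movements):
    """Return True if the whole string of movement symbols matches the pattern."""
    return pattern.fullmatch(movements) is not None


if __name__ == "__main__":
    for pattern_name, pattern, movements in [
        ("tentative", TENTATIVE, "1B1B1"),
        ("tentative", TENTATIVE, "1B4B1"),
        ("revised", REVISED, "1B4B"),
    ]:
        print(pattern_name, movements, fits(pattern, movements))
```

The mixed sequence 1B4B is rejected by any pattern that allows only rise 1 next to B, but is accepted once the complex is read as "rise + fall" with a free choice of rise.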
Figure 2. Substitution phenomena observed in the comparison of contours in the corpus.
Comparisons such as those presented in Fig. 2 suggest that one or more successive movements can be replaced by others in otherwise identical environments. For instance, in contour 2 one of the 1B complexes of contour 1 is replaced by a 4B complex; in contour 5 the sequence 10A02, found in contour 4, is replaced by 45A02. The substitution of one or more successive pitch movements by others is, of course, audible, and may have consequences for the interpretation of the utterance. The relevance of this observation to the description of the purely melodic aspects of intonation is that it shows that certain successions of pitch movements are more coherent than others, in that they build recurrent clusters. Therefore it seems possible to introduce a descriptive unit of a size in between that of the pitch movement and that of the pitch contour. We have called this unit a pitch block. (To be clear, a block may contain only one pitch movement, and a contour may consist of only one block.)
Unlike the tentative grammar, which considered a pitch contour merely as a string of pitch movements, the new version of the grammar reflects this intermediate level of description and specifies the internal structure of the blocks, together with the ways in which they can combine. The combinatory restrictions are as follows: it is found that (1) some blocks always occur in isolation or as a contour-final element; (2) some other blocks occur before major syntactic breaks; (3) other blocks never occur in positions (1) and (2) but rather precede blocks in those positions. Therefore, the blocks are grouped into three categories:
E-blocks or "end" blocks
C-blocks or "continuation" blocks
P-blocks or "prefix" blocks
The first part of the grammar consists of a statement which formalizes the most general conditions imposed on the blocks, viz. those concerning their position in the contour:
contour = [[P] C] [P] E.
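The positional formula can be read as a regular pattern over the three block categories. The following is a minimal sketch under that reading, taking square brackets as "optional and recursive" as defined for the tentative grammar; the names CONTOUR and is_well_formed are hypothetical, and the sketch covers only this first part of the grammar, not the internal structure of the blocks or the later restrictions.

```python
import re

# [[P] C] [P] E: any number of (any number of P, then one C),
# then any number of P, then exactly one obligatory E.
CONTOUR = re.compile(r"(?:P*C)*P*E")


def is_well_formed(blocks: str) -> bool:
    """True if a string of block categories (e.g. 'PCPE') fits the positional formula."""
    return CONTOUR.fullmatch(blocks) is not None


if __name__ == "__main__":
    for candidate in ["E", "PE", "CE", "CPE", "PCE", "PCPE", "PCPCPPE", "P", "EP"]:
        print(candidate, is_well_formed(candidate))
```

A string without a final E, or with material after the E, is rejected, which mirrors the statement that E is the only obligatory element.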
This formula generates as possible contours E, PE, CE, CPE, PCE, PCPE, PCPCPPE, etc., in which E is the only obligatory element and [P], [C] and [[P] C] are optional recursive elements whose number is free. The second part of the grammar is to be seen in Fig. 3. This scheme generates the internal
Figure 3. The generation of intonational blocks. Arrows indicate the continuation possibilities after a chosen initial pitch movement. The choice among elements within braces is free.
structure of the blocks (omitting the optional 0 and Ø elements). Six P-blocks, 8 C-blocks, and 8 E-blocks are distinguished:
P-blocks
P1 = 1B or 1D
P2 = 3B or 3D
P3 = 4B or 4D
P4 = B or D
P5 = 1E
P6 = E
C-blocks
C1 = 1(5)A2 or 1(5)D2
C2 = 1(5)E
C3 = 1(2)
C4 = 2
C5 = 3(2)
C6 = 4(5)A2 or 4(5)D2
C7 = (5)A2 or (5)D2
C8 = (5)E
E-blocks
E1 = 1(5)A(2) or 1(5)D(2)
E2 = 1(5)E
E3 = 1(2)
E4 = 2
E5 = 3C(2) or 3D(2)
E6 = 4(5)A(2) or 4(5)D(2)
E7 = (5)A(2) or (5)D(2)
E8 = (5)E
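The round-bracket convention used in these block definitions can be expanded mechanically. The sketch below is an illustration only: the function name expand is hypothetical, and it assumes that each bracketed element is independently optional, as stated for the abbreviative conventions of the tentative grammar.

```python
import re
from itertools import product


def expand(spec: str) -> list[str]:
    """List every variant of a block definition such as '1(5)A(2)'.

    Elements in round brackets are optional; everything else is obligatory.
    """
    # Each match is either an optional element "(x)" or a run of obligatory symbols.
    parts = re.findall(r"\(([^()]+)\)|([^()]+)", spec)
    choices = [("", optional) if optional else (required,) for optional, required in parts]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    print(expand("1(5)A(2)"))  # the variants abbreviated by the first alternative of E1
    print(expand("(5)E"))      # the variants abbreviated by E8
```

Under this reading, the first alternative of E1 abbreviates four strings of movements and E8 abbreviates two.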
(The apparent resemblance between C- and E-blocks is due to the fact that most E-blocks can be transformed into C-blocks by making obligatory in the latter the terminal rise that is optional in the former.)
The third part of the grammar contains some restrictions on the otherwise unlimited combinatory possibilities given in the first statement. These restrictions depend on the particular internal structure of the chosen block or blocks. The first restriction states that P-blocks ending in low pitch cannot be followed by any block beginning with high pitch. Conversely, P-blocks ending in high pitch cannot be followed by any block beginning with low pitch. The second restriction specifies that blocks C4 and E4 cannot be preceded by any P-block. The third restriction prohibits the two movements in P1 and P2 from occurring on different syllables if they precede either C5 and C6, or E5 and E6. In such cases, the variants 1&B and 3&B (two movements on the same syllable) are the only permissible ones.
The number of restrictions is fairly small. This implies that many combinations of blocks are expected to be possible, even if they have not been observed in our data. The expectation of their occurrence is based merely on the actual observation of analogous combinations. For instance, if P2 + E5 has been observed, then P2 + C5 should be equally possible.
Verification of the second grammar
As mentioned earlier, the second grammar was checked against the same corpus that was used to verify the first grammar, as well as against a second, independent corpus. The two analysed corpuses together consisted of 1359 utterances. When the seven samples were confronted with the predictions of the second grammar, 80 contours were not in agreement with the expectations. In about half of these cases intonation errors were involved, accompanying disorganizations on the segmental level (e.g. stuttering), or contours unfinished due to the interruption of the speaker.
In the other half the discrepancies were mainly of two kinds:
(a) the gradual rise 4 (inclination) sometimes occurred in anticipation of rise 1, or it seemed to operate as a base line for the entire contour;
(b) some pitch movements occurred earlier or later in the syllable than is specified in their definition (cf. Table II), to the effect
that blocks had to be transcribed as e.g. 1C or 3A (where the grammar invariably predicts 1A and 3C).
We have not attempted to incorporate such possible systematic discrepancies into a new version of the grammar. In its present outline the grammar accounts for 94% of the total number of contours.
Discussion and conclusion
Even if the predictive adequacy of the grammar appears to be demonstrated by the rather high percentage cited above, it remains possible that it is the mere result of the grammar being too powerful in its predictions. Indeed, it predicts many contours that have not yet been observed. If we examine how P-blocks can combine with themselves or with C- and E-blocks, it appears that eight of the 18 permitted PP-combinations have occurred in the analysed sample; of the 24 permitted PC-combinations five were realized; of the 24 permitted PE-combinations 16 occurred. Obviously, if we want to make sure whether the grammar is too powerful or not, we might check it against another, much larger speech sample. Rather than doing this we preferred to examine the probabilistic structure of the contours encountered so far. We did so with the purpose of answering a double question: (a) what are the chances that unpredicted, but really existing pitch phenomena will still occur? (b) what are the chances that predicted pitch phenomena do not really exist?
A first observation deals with the frequency of occurrence of the various blocks: for instance, 94·1% of the P-blocks are of the type P1 and 0·2% are of the type P6; of the C-blocks 45% are of the type C3 and less than 0·5% are of the type C6; E-blocks are of the type E1 in 65·1% of the cases and of the type E8 in 0·3% of the cases. Since it is improbable that a very frequent block will have been overlooked, it follows that the probability of an unpredicted block occurring is less than 0·005. A second observation concerns the combination of blocks. In 't Hart (1971) it has been worked out that
57% of the contours have no P in front of E or C
30% have 1 P
7% have 2 Ps
3% have 3 Ps
1% have 4 Ps
2% have 5 or more Ps
The probability that a P-block will occur is approximately 0·40. The probability for a second P to occur after a first one is approximately 0·30, and this probability remains constant for the occurrence of still more Ps. Similarly, the probability for a C-block occurring is established at 0·15, while the probability that more Cs will follow a first one is 0·20 and also constant. The probabilistic structure of the pitch contours observed in the corpus can be represented as in Fig. 4. From this figure it can be derived that even moderately complicated contours will seldom show up. For instance, a contour of the type nP + C + nP + E, with n ≥ 1, will occur in only 2% of the cases. The study of the probabilistic structure of pitch contours explains why many of the predicted block combinations (i.e. contours) did not occur. Indeed, the chance of an infrequent block being realized in all its permitted combinations is so small that it cannot be expected to be found in a sample of only 1359 utterances. To have a fair chance that all predicted contours will occur at least once, we would need a sample about a hundred
Figure 4. Structure of a pitch contour with transitional probabilities.
times larger. As long as the grammar has not been confronted with a corpus of that size, it cannot be considered a weakness that it generates numerous not yet observed contours. On the other hand, the probabilistic structure of the contours is such that the possibility of unpredicted contours showing up in large numbers is practically reduced to nil. In other words, the grammar most certainly accounts for all the pitch phenomena that are reasonably frequent, but the grammar cannot feasibly be verified with regard to very infrequent events. Apparently, the description of global pitch contours in terms of atomistic, elementary movements necessitates the intermediate stage of the "blocks". When contours are considered as concatenations of blocks it becomes possible to make interesting generalizations with regard to their internal structure. This structure appears to be of the type [[P] C][P] E. This means that a well-formed contour is of necessity composed of at least an E-block, optionally preceded by one or more P- and/or C-blocks in the prescribed order. The internal composition of such blocks and their combination restrictions are further specified in the grammar. The blocks can rightly be considered as structural elements of a contour and not merely as arbitrary conglomerates of pitch movements. This is evident from the fact that the restrictions are much more severe within the blocks themselves than at their boundaries. Their non-arbitrary character also appears from the fact that, as a result of the first property, it is nearly always possible to replace, within one given utterance, any P-, C- or E-block by any other of the same category, while it is forbidden to replace any block by one of another category (since this would violate the sequential constraints expressed in the formula above). The status of the grammar as a predictive device is not such that it allows of unique specifications with regard to the pitch contour of an arbitrary utterance. 
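Returning to the probabilistic structure discussed above, the 2% figure quoted for contours of the type nP + C + nP + E can be approximated from the probabilities given in the text. The sketch below is an illustration only: it assumes that the four quoted probabilities apply independently at each position, which the text does not state, and the names P_FIRST, P_MORE, C_FIRST, C_MORE and prob_n_prefixes are introduced here for the purpose.

```python
# Probabilities quoted in the text; independence between positions is an assumption.
P_FIRST = 0.40  # a first P-block occurs
P_MORE = 0.30   # a further P-block follows a P-block
C_FIRST = 0.15  # a first C-block occurs
C_MORE = 0.20   # a further C-block follows a C-block


def prob_n_prefixes(n: int) -> float:
    """Probability of exactly n successive P-blocks in front of a C- or E-block."""
    if n == 0:
        return 1 - P_FIRST
    return P_FIRST * P_MORE ** (n - 1) * (1 - P_MORE)


# nP + C + nP + E with n >= 1: at least one P, exactly one C, at least one P, then E.
p_complex = P_FIRST * C_FIRST * (1 - C_MORE) * P_FIRST

if __name__ == "__main__":
    print(f"nP + C + nP + E: {p_complex:.4f}")
    for n in range(4):
        print(f"exactly {n} P-block(s): {prob_n_prefixes(n):.3f}")
```

Under these assumptions the product comes to roughly 0·02, in line with the 2% mentioned in the text, and the geometric decay of prob_n_prefixes is of the same order as the observed 57%, 30%, 7%, 3% distribution of P-blocks.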
Yet its power is such that it can handle the following tasks:
(1) Given an arbitrary utterance in Dutch that has been realized with an acceptable pitch contour, the grammar will recognize this contour as a "possible" one.
(2) Given an arbitrary sentence whose pitch contour still has to be realized (e.g. in a synthesis-by-rule process) and whose pitch accents and syntactic boundaries have already been marked, the grammar will specify all the contours that lead to an acceptable intonational result, but it cannot indicate among the alternatives the most appropriate candidate (as required by attitude, context).
By finding a grammar that rearranges relevant pitch movements into contours that are perceptually equivalent to the course of pitch in natural connected discourse, we have at the same time adduced strong evidence for regarding these elementary movements as the adequate segments for the decomposition of such contours. In conclusion, we have closed the circle that links F0 curves to pitch contours through
the mediation of pitch movements. These movements are units of such a nature that the pitch contours they build are perceptually equivalent to the F0 curves. Having related three observable aspects of intonation we have also bridged to some extent the gap between atomistic and global approaches to the phenomenon. Now it remains to be shown how these observable properties of intonation are linked up with the abstract categories of the patterns.
(3) Question 3: from concrete pitch contours to abstract intonation patterns
In the preceding two sections of this paper we have shown how the structure of F0 curves can be described in terms of relevant pitch movements. These constitute the elements with which pitch contours can be generated that are perceptually equivalent to the original F0 curves. To this end a grammar has been worked out that accounts for the composition of a great variety of concrete pitch contours. At the same time, the outline of the grammar reflects a number of assumptions that enable us to introduce some non-trivial generalizations: e.g. by specifying block E1 as 1A (omitting the optional elements 5 and 2) the grammar indicates similarity of contours such as 1&A (one syllable), 1A (two syllables), 10A, 010A, 010A0, 0100A00, etc. Likewise, in stating that the number of optional recursive elements (P- and C-blocks) is free, the grammar considers as similar e.g. contours with varying numbers of repetitive P-blocks: 010A0, 010B010A0, 01&B010B010A0, etc. This similarity of so many analytically different contours suggests the existence of some kind of resemblance at a level different from the concrete one. Indeed it has been our hypothesis that these various contours are the observable manifestations of more abstract intonational categories, called intonation patterns.
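The two generalizations just described can be sketched as a simple string reduction. This is an illustration under stated assumptions, not part of the paper's method: every 0 is treated as an optional declination-line element, & only marks two movements on one syllable, and the function names skeleton and strip_prefix_blocks are hypothetical. Only the P1 variant 1B is handled.

```python
import re


def skeleton(contour: str) -> str:
    """Drop the optional 0 elements and the same-syllable marker & from a contour."""
    return re.sub(r"[0&\s]", "", contour)


def strip_prefix_blocks(movements: str) -> str:
    """Remove leading P1 blocks (1B), which the grammar treats as optional and recursive."""
    return re.sub(r"^(?:1B)+", "", movements)


if __name__ == "__main__":
    for contour in ["1&A", "10A", "0100A00", "010B010A0", "01&B010B010A0"]:
        print(contour, "->", strip_prefix_blocks(skeleton(contour)))
```

All five example contours reduce to the same string, 1A, which is the sense in which the grammar treats them as similar.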
In an attempt to test the hypothesis concerning the relationship between contours and patterns we conducted a number of perceptual experiments to be described below. The purpose of these experiments was to find out whether the vast diversity of contours can be reduced to a more restricted set of mutually resembling intonational classes. The discovery of such classes might then be interpreted as an indirect indication of the existence of abstract underlying structures, the intonation patterns.
Method: broad listening
We applied a method of subjective comparison through which the degree of similarity between analytically different contours can be judged. To this end, listeners were confronted with small sets of utterances and were instructed to concentrate on their melodic shape as a whole in order to find global similarities between them. They were told that paying attention to the word content and to the number of pitch accents would distract them from their main task. This way of listening is called "broad listening". It is not merely a kind of less accurate or superficial "analytic listening", but differs fundamentally from it in that it introduces a new dimension, viz. that of a classification of intonational wholes. Indeed, we assume that in broad listening one can only hear differences between pitch contours if one can associate them with different "meanings", so that objectively different contours cluster into interpretative categories. However, a subject may find it difficult to formulate his interpretation of a contour. Some investigations have circumvented this problem by instructing the subject to select a suitable label of the type "surprised, interested, bored, ...", etc., but it appears that the content and the number of these labels influence the experimental result (Crystal, 1969, pp. 307-8; Collier,
1973, p. 19). Therefore, in these experiments we preferred not to make use of labels but to elicit similarity judgments. The experimental paradigms were such that the subjects were instructed either to match contours with each other, or to sort them into a number of groups. In the matching experiments the contours were arranged so as to build pairs of stimuli, or were grouped in sets of four. In the sorting experiments the contours were put together in groups of up to 20 items. In both the matching and the sorting experiments the stimuli were either synthesized on the Intonator (and consequently had a typical stylized shape), or were spoken in a carefully controlled way, or borrowed from recordings of spontaneous conversation. In one of the experiments to be described here more extensively the stimuli could be considered as equidistant steps on a quasi-continuum.
In search of abstract categories
In all, there were five experiments, described in Collier & 't Hart (1972) and Collier (to appear). Two of them will be discussed here.
Matching within sets of four contours
In this experiment the following three contours were used:
(1) 1A, the elementary shape of the "hat" pattern;
(2) 1&B4A, containing the gradual rise, 4;
(3) 3C, with the typically late pitch movements 3 and C.
Use was made of both utterances in natural speech and of Intonator-processed stimuli. The material consisted of three sets of four sentences, each set being provided with one of the contours mentioned. All 12 utterances differed with respect to the word content and the location of the pitch accents. The stimuli were recorded on Language Master cards (Bell and Howell). The cards were grouped into quartets (X, A, B, C), so that each contained two items representing the same type of contour, viz. X and one of the A-B-C group. Ten subjects were asked to match each X stimulus with A, B and C, and to indicate which item in the latter group resembled X. They were allowed to indicate more than one pair of resembling contours or no pair at all. They were free to play back the cards as many times as necessary. The results are presented in Table III.
Table III. Results of the matching experiment in percentages of the maximum possible score (M = 40).
When predicted as resembling    X = 3C    X = 1&B4A    X = 1A
Judged as resembling              93         83          74
Judged as not resembling           5         12          14
No reaction                        2          5          12
From these results we may conclude that the various contours seem to derive from three patterns which are so different that their realizations are not confused in the great majority of the cases. The subjects' ability to categorize pitch contours, as revealed in this experiment, was confirmed in a comparable experiment with spontaneous speech stimuli, showing that this ability is independent of the degree of stylization of the stimuli.
Sorting of stimuli on a quasi-continuum
The second experiment aimed at exploring how subjects react to a quasi-continuous change from a small to a large excursion of the falling pitch movement on a prominent syllable. The experimental set-up was as follows: the utterance "Dat heb ik al gedaan" (I've done that already) was given 12 different synthetic contours consisting of a fixed-size rise 1 on the syllable "heb", and a falling pitch movement on "-daan" that was varied stepwise as sketched in Fig. 5. To the 12 stimuli were added another four, which were exact replicas of four of the original 12 (viz. 15 = 16, 12 = 13, 10 = 11, and 7 = 8). The 16 utterances were recorded on Language Master cards and the subjects were instructed to find intonational resemblances among them. They were not informed about the possible number of categories involved. Thus, in this experiment they were told to "build less than sixteen piles of cards". Ten subjects took the test. Their judgments were subjected to a hierarchical clustering analysis (Johnson, 1967). The outcome of the minimum and the maximum method of this analysis is depicted in Fig. 6.
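The kind of analysis referred to here can be sketched in a few lines. The following is an illustration, not the procedure actually used in the experiment: it assumes that the distance between two stimuli is the number of subjects who put them in different piles, and it reads the minimum and maximum methods as what are now usually called single and complete linkage. The function names and the toy sorting data are hypothetical.

```python
from itertools import combinations


def pile_distance(sortings, items):
    """Distance between two items = number of subjects who put them in different piles."""
    dist = {frozenset(pair): 0 for pair in combinations(items, 2)}
    for piles in sortings:  # one list of piles per subject
        pile_of = {item: index for index, pile in enumerate(piles) for item in pile}
        for a, b in combinations(items, 2):
            if pile_of[a] != pile_of[b]:
                dist[frozenset((a, b))] += 1
    return dist


def cluster(dist, items, linkage, n_clusters):
    """Agglomerative clustering; pass min for the minimum method, max for the maximum method."""
    clusters = [frozenset([item]) for item in items]

    def between(c1, c2):
        # Cluster-to-cluster distance: smallest or largest pairwise distance.
        return linkage(dist[frozenset((a, b))] for a in c1 for b in c2)

    while len(clusters) > n_clusters:
        # Merge the two clusters that are closest under the chosen linkage.
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: between(*pair))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return sorted(sorted(c) for c in clusters)


# Hypothetical toy data: three subjects sorting six stimuli into piles.
ITEMS = [1, 2, 3, 4, 5, 6]
SORTINGS = [
    [[1, 2, 3], [4, 5, 6]],
    [[1, 2], [3, 4], [5, 6]],
    [[1, 2, 3], [4], [5, 6]],
]

if __name__ == "__main__":
    distances = pile_distance(SORTINGS, ITEMS)
    print("minimum method:", cluster(distances, ITEMS, min, 3))
    print("maximum method:", cluster(distances, ITEMS, max, 3))
```

With real data the two linkage rules need not agree on the number of clusters at a given level, which is the kind of discrepancy between the minimum and maximum method that the figure is meant to show.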
Figure 5. Contours with quasi-continuous variation of the excursion of the falling pitch movement as applied in the sorting experiment. (F0 against time in ms; curves labelled by stimulus number.)
In Fig. 6, the minimum method suggests the existence of only two groups, whereas the maximum method distinguishes three clusters. This discrepancy does not allow the conclusion that there are certainly three groups, but both methods agree in distinguishing at least two. In this respect the outcome gives support to the previously introduced distinction between contours with a full-size fall (1A) and those with a half-fall (1E), and corroborates the hypothesis that they derive from basically different intonation patterns.
Discussion and conclusion
We think that the outcome of the experiments can be interpreted in favour of the listeners' hypothesized ability to interpret pitch contours in terms of abstract categories. Some experiments, like the first one discussed above, share the outcome that, although objective, atomistic differences are present, subjective similarities exist thanks to which the listeners are able to group resembling stimuli together. On the other hand, experiments such as our second one give rise to subjective dissimilarities, in spite of the objective continuity in the variation of some physical parameter (e.g. the size of the F0 fall), thus suggesting the existence of discrete boundaries. This outcome is in agreement with
Figure 6. Outcome of the hierarchical clustering analysis as applied to the results of the sorting experiment. (Two panels, minimum method and maximum method; clustering level against stimulus number.)
one of the main results reported by Gårding and Abramson (1965), viz. that "each contour has a considerable margin within which changes can be made without any effect on perception, as long as these changes do not disturb a certain pattern". Apparently, detailed differences as revealed by the analytic listening technique are responsible for the perceptual distinction between contours derived from different abstract intonational categories. Second, it appears that in order for two contours to be judged as resembling each other, it is not necessary for them to be simil
enabled us to segment the stylized analogues of the F0 curve of level (2) in such a way that the recombination of the segments yields a stylized pitch contour which is perceptually equivalent to the original F0 curve. In fact these segments are more powerful than that: as shown in section 2, a set of rules can be given (called a grammar) according to which the perceptually relevant pitch movements can be combined to generate virtually all pitch contours that were observed to occur in a speech sample which had a duration of more than one hour (the grammar is even able to generate many more contours than we observed in the material). Once we had these tools at our disposal, we were equipped to try to bridge the gap towards the abstract and global level of description, viz. by studying the categorizing reactions of listeners to well-defined, analytically insightful variations of the pitch contour. These experiments have indeed suggested the existence of abstract categories, intonation patterns, whose characteristic atomistic features were explicit beforehand.
Although it is true that the relation aimed at between intonation patterns and F0 curves has been unravelled, this is not unconditionally valid. As is demonstrated above, the perceptual detour is essential (see Fig. 1: from F0 curves via pitch movements and pitch contours to intonation patterns). Thus, for example, only in exceptional cases would we be able to decide, on the basis of visual inspection of two F0 curves alone, whether or not the utterances concerned would sound (intonationally) different. Nor can we, in F0 curves of two utterances which are said to have different intonations, always and infallibly indicate what atomistic detail is responsible for that difference.
But as soon as the utterances are made audible, we can segment them into the perceptually relevant pitch movements, and demonstrate that they do indeed contain different components of which we know that they systematically give rise to different interpretations on the level of the patterns.
The total number of discernible intonation patterns must be fairly low, possibly not exceeding 10. At any rate, it must be many times less than the number of different pitch contours that can occur in any given corpus. Two major principles are operative in this reduction. The first one is Jones' "elasticity" (as expressed in the quotation given in the first paragraph of our Introduction). Indeed, our experiments have confirmed, for instance, that a contour with a "pointed hat" (1&A) is grouped into the same cluster as one in which there are five or 10 syllables between the initial rise 1 and the final fall A. The second principle is that of "optionality" (as expressed in the formula [[P] C] [P] E, which describes the general shape of a contour; see section 2). Contours consisting only of the E-block, 1&A or 10 ... 0A, are taken together with contours consisting of the same E-block preceded by one or more prefix blocks.
Thanks to the fact that the discovered regularities have been formalized (in the so-called "grammar" of section 2) it is now possible to incorporate them into a synthesis-by-rule programme (Slis, 1971). The existing programme can now be expanded in order to include a fairly large range of possible "ways of intonating" for any given utterance. However, as has been mentioned earlier (discussion of section 2), the grammar does not indicate among these alternatives the most appropriate candidate as required by attitude, syntax, or context.
This is obviously related to the fact that in establishing the relationship between the abstract and the concrete aspects of intonation and between the global and the atomistic levels of its description we have deliberately restricted ourselves to the melodic aspects only. So far we have not said anything about the functional aspect of intonation, for example we cannot offer any explanation for the motives of the speaker that lead to the choice of a particular intonation pattern. In that sense our investigation has not answered the question about the communicative value of intonation. However, the regularities in
intonation which we have been able to express in the grammar constitute the first essential condition for intonation to have any communicative value at all, since they reflect the intonational competence that is shared by both speakers and listeners of Dutch.
We would like to thank Messrs B. L. Cardozo, A. Cohen, C. A. A. J. Greebe, A. van Katwijk and S. G. Nooteboom for their many valuable comments on earlier versions of the manuscript.
References
Abe, I. (1962). Call contours. Proceedings of the IVth International Congress of Phonetic Sciences, Helsinki 1961. Pp. 519-23. The Hague: Mouton.
Armstrong, L. E. & Ward, I. C. (1926). Handbook of English Intonation. Leipzig: Teubner; Cambridge: Heffer & Sons Ltd.
Bolinger, D. L. (Ed.) (1972). Intonation. Harmondsworth: Penguin.
Cohen, A. & 't Hart, J. (1967). On the anatomy of intonation. Lingua 19, 177-92.
Collier, R. (1970). The optimum position of prominence lending pitch rises. IPO Annual Progress Report 5, 82-5.
Collier, R. (1974). Intonation from a structural linguistic viewpoint: a criticism. Linguistics 129, 5-28.
Collier, R. (to appear). Perceptual and linguistic tolerance in intonation, to be published in International Review of Applied Linguistics.
Collier, R. & 't Hart, J. (1972). Perceptual experiments on Dutch intonation. Proceedings of the VIIth International Congress of Phonetic Sciences, Montreal 1971. Pp. 880-4. The Hague, Paris: Mouton.
Crystal, D. (1969). Prosodic Systems and Intonation in English. Cambridge: Cambridge University Press.
Gårding, E. & Abramson, A. S. (1965). A study of the perception of some American English intonation contours. Studia Linguistica 19, 61-79.
Halliday, M. A. K. (1967). Intonation and Grammar in British English. The Hague: Mouton.
't Hart, J. (1971). Concatenation of intonational blocks. IPO Annual Progress Report 6, 21-4.
't Hart, J. & Cohen, A. (1964). Gating techniques as an aid in speech analysis. Language and Speech 7, 22-39.
't Hart, J. & Cohen, A. (1973). Intonation by rule: a perceptual quest. Journal of Phonetics 1, 309-27.
Johnson, S. C. (1967). Hierarchical clustering schemes. Psychometrika 32, 241-54.
Jones, D. (1957). An Outline of English Phonetics. 8th edition. Cambridge: Heffer & Sons Ltd.
van Katwijk, A. (1974). Accentuation in Dutch: an Experimental Linguistic Study. Assen: van Gorcum.
Kingdon, R. (1958). The Groundwork of English Intonation. London: Longmans.
Lehiste, I. & Peterson, G. E. (1961). Some basic considerations in the analysis of intonation. Journal of the Acoustical Society of America 33, 419-25.
Palmer, H. E. (1933). A New Classification of English Tones. Tokyo: Institute for Research in English Teaching.
Pike, K. L. (1945). The Intonation of American English. Ann Arbor: University of Michigan Press.
Slis, I. H. (1971). Rules for the synthesis of speech. IPO Annual Progress Report 6, 28-31.
Trager, G. L. & Smith, H. L. (1951). An Outline of English Structure. Studies in Linguistics, Occasional Papers, 3.
Wells, R. (1945). The pitch phonemes of English. Language 21, 27-29.
Willems, L. F. & de Vries, H. (1970). The Phonetograph. IPO Annual Progress Report 5, 181-5.
Windsor Lewis, J. (1969). A Guide to English Pronunciation. Oslo: Universitetsforlaget.