Journal of Phonetics (1986) 14, 139-144
Stability and change
Anders Löfqvist
Department of Logopedics and Phoniatrics, Lund University, Sweden
In their paper, Kelso, Saltzman & Tuller (1986) discuss two complementary aspects of motor control, stability and flexibility. Both of these aspects are related to the problem of motor equivalence, i.e. how the same, or similar, motor acts are executed under different conditions using different patterns of muscle activity. In the following, I will address some problems associated with these two aspects, in particular how to assess stability in the face of change.

Stability has commonly been defined as coherence of patterns of muscle activity and/or movement. Following Lee (1984), we can describe this coherence at the spatial, temporal, and scaling levels. Spatially, the same set of muscles should always be activated. Temporally, synchronicity, fixed temporal order, or fixed-phase relationships should occur among events. Finally, scaling should occur among elements once all elements are activated above threshold. The spatial and, partly, the temporal definitions, in particular the requirement of always activating the same set of muscles, have not been shown to hold for speech motor activities but rather for postural control (cf. Nashner & McCollum, 1985). For speech, the argument about stability has mostly been based on stable relationships among articulatory intervals across variations in stress and speaking rate (e.g. Tuller, Kelso & Harris, 1982; Löfqvist & Yoshioka, 1984). I will return to this shortly.

Flexibility in motor acts has been shown for speech using the experimental paradigm of perturbing one part of the system and examining the rapid and functional compensations that normally follow such perturbations (cf. Abbs, Gracco & Cole, 1984; Kelso, Tuller, Vatikiotis-Bateson & Fowler, 1984).

Returning to stability, it seems to me that one central problem has to do with the measurement and definition of stability, i.e. how do you find it and how do you know that it is stable?
As an illustration of this problem, I will use some recently obtained experimental material. The experiment I will discuss is basically a replication of the study by Tuller & Kelso (1984) using Swedish speech materials and a Swedish speaker. Movements of the jaw, the lower lip, and the upper lip were recorded using a magnetometer system described by Branderud (1985). For the lower lip, the movements of the jaw were subtracted on-line, so that the signal represented the net movements of the lower lip.

The speech materials consisted of nonsense words of the form /ba₁C₁a₂C₂/, with C being one of the labial consonants /p, b, f, v, m/. Swedish has a constraint on the durational properties of a VC sequence: either the vowel is long and the following consonant short, or the vowel is short and the consonant long. In the present case, the two possible sequences /ba:Ca:C/ and /baC:aC:/ were used. Stress was placed on the first or second syllable, and the material was produced at two self-selected rates.

0095-4470/86/010139+06 $03.00/0 © 1986 Academic Press Inc. (London) Ltd.
Measurements were made of two different periods and latencies. First, the period was defined as the interval from onset of jaw lowering for the first vowel to onset of jaw lowering for the second vowel; this is the vowel-to-vowel cycle as commonly defined in this type of study (e.g. Tuller, Kelso & Harris, 1982; Tuller & Kelso, 1984). With this interval as the period, two different consonant latencies were measured. One was related to lower lip movement and was taken as the interval from onset of jaw lowering for the first vowel to onset of lower lip raising for the medial consonant. The other was related to upper lip activity and was measured from onset of jaw lowering for the first vowel to onset of upper lip lowering for the medial consonant. However, this second latency was only measured for the consonants /p, b, m/, since no upper lip movement occurred for the fricatives /f, v/.

The second period was taken as the interval from onset of lip movement for the first consonant to onset of lip movement for the second consonant. This consonant-to-consonant cycle could be defined relative to lower and upper lip activity associated with the consonants. For both of these consonant-related cycles, the latency was taken as the interval from onset of lip activity for the first consonant to onset of jaw lowering for the second vowel.

Correlations and regressions were calculated for each combination of cycle, latency, and consonant type. In addition to calculating these measures for all the productions collapsed across stress and rate changes, within-category correlations and regressions were also calculated, i.e. separate values for each stress and rate category as well as consonant type.

Figure 1. Plots of vowel-to-vowel period and vowel-to-consonant latency for sequences with labial stops. (a) /babab/: r = 0.99; y = 0.84x - 38. (b) /bapap/: r = 0.99; y = 0.7x - 39. Horizontal axis: interval from onset of jaw lowering for V1 to onset of jaw lowering for V2 (ms). Plotting symbols distinguish stress ('CVCVC, CV'CVC) and rate (normal, fast).

Results for one of the cycle and latency measurements for the /ba:Ca:C/ sequences are shown in Figures 1 and 2. Table I summarizes the correlations for the vowel-to-vowel period; this table presents both across- and within-category correlations. In Figures 1 and 2 there is overall a strong positive correlation between the two articulatory intervals across stress and rate variations. The results from the other word type, /baC:aC:/, were comparable. Taking the consonant latency as defined by upper lip movement did not change the overall picture but also resulted in very high correlations. However, the results for the consonant-defined cycle showed much lower and much more variable correlations, irrespective of whether upper or lower lip movement was used as a basis for the measurements.

Taken together, these results for Swedish show very good agreement with similar results for American English (Tuller & Kelso, 1984) and French (Gentil, Harris, Horiguchi & Honda, 1984). That is, overall high correlations are obtained between the duration of the vowel-to-vowel cycle in VCVC sequences and the duration of the consonant-related latency when the data are pooled across variations in stress and speaking rate. Lower and more variable correlations are found when the cycle is defined between consonants and the latency measured from consonant to vowel.
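The pooled and within-category calculations described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the analysis procedure actually used: the onset times in `TOKENS` are invented for the example, the names `intervals`, `summarize`, and `by_category` are hypothetical, and ordinary least squares with latency regressed on period is assumed.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical tokens (NOT data from the study): onset times in ms for
# jaw lowering for V1, lower lip raising for the medial consonant, and
# jaw lowering for V2, labelled by rate and stress category.
TOKENS = [
    # (rate, stress, jaw_v1, lip_c, jaw_v2)
    ("normal", "'CVCVC", 0.0, 210.0, 300.0),
    ("normal", "'CVCVC", 0.0, 195.0, 280.0),
    ("normal", "CV'CVC", 0.0, 160.0, 240.0),
    ("normal", "CV'CVC", 0.0, 172.0, 255.0),
    ("fast", "'CVCVC", 0.0, 140.0, 215.0),
    ("fast", "'CVCVC", 0.0, 128.0, 200.0),
    ("fast", "CV'CVC", 0.0, 100.0, 165.0),
    ("fast", "CV'CVC", 0.0, 112.0, 180.0),
]


def intervals(token):
    """Return (period, latency) in ms for one token."""
    _, _, jaw_v1, lip_c, jaw_v2 = token
    period = jaw_v2 - jaw_v1   # vowel-to-vowel period
    latency = lip_c - jaw_v1   # vowel-to-consonant latency
    return period, latency


def summarize(tokens):
    """Pearson r plus least-squares slope and intercept (latency on period).

    Needs at least two tokens with differing periods and latencies.
    """
    pairs = [intervals(t) for t in tokens]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    r = sxy / sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return r, slope, intercept


def by_category(tokens):
    """Summaries per rate category, per stress category, and pooled ("total")."""
    groups = defaultdict(list)
    for token in tokens:
        groups[token[0]].append(token)  # rate category
        groups[token[1]].append(token)  # stress category
    groups["total"] = list(tokens)      # pooled across stress and rate
    return {name: summarize(group) for name, group in groups.items()}


if __name__ == "__main__":
    for name, (r, slope, intercept) in by_category(TOKENS).items():
        print(f"{name:8s} r = {r:.2f}  latency = {slope:.2f} * period {intercept:+.1f}")
```

Grouping by rate and by stress separately, with a pooled total, mirrors the layout of the correlations reported in Table I; a real analysis would repeat this for each consonant type.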
Coherence and stability have been defined here based on correlations. High correlations indicate stability, whereas low correlations would presumably be taken as evidence for instability, or at least variability. This leads to the problem of measuring stability, in particular whether high correlations form a sufficient criterion for stability. One particular issue here is whether overall high correlations are due to making correlations between classes. That is, most of the temporal variation is due to stress and rate changes, and the overall high correlations might thus simply reflect this fact, whereas the within-class correlations could be much smaller. The results given in Table I indicate that this is not necessarily true. It is thus not the case that the inter-class correlations are higher than those for the intra-class comparisons; the two exceptions marked with asterisks in Table I will be discussed shortly. This is in contrast to the results presented by Munhall (1985), who found no reliable intra-class correlations for intra-articulator (tongue body) measurements.

Table I. Inter- and intra-class correlations for the interval from onset of jaw lowering for V1 to onset of jaw lowering for V2, and the interval from onset of jaw lowering for V1 to onset of lower lip raising for the medial consonant.

Consonant   Normal   Fast    'CVC    CV'C    Total
bV:b        0.99     0.95    0.99    0.99    0.99
pV:p        0.99     0.97    0.99    0.99    0.99
vV:v        0.97     0.98    0.99    0.97    0.98
fV:f        0.94     0.93    0.98    0.92    0.95
mV:m        0.99     0.98    0.99    0.99    0.99
b:Vb:       0.98     0.85*   0.99    0.98    0.99
p:Vp:       0.95     0.79*   0.98    0.97    0.98
v:Vv:       0.96     0.93    0.99    0.98    0.99
f:Vf:       0.97     0.89    0.99    0.98    0.99
m:Vm:       0.96     0.97    0.99    0.98    0.99

Figure 2. Plots of vowel-to-vowel period and vowel-to-consonant latency for sequences with labial fricative and nasal consonants. (a) /bavav/: r = 0.98; y = 0.77x - 31. (b) /bafaf/: r = 0.92; y = 0.71x - 38. (c) /bamam/: r = 0.99; y = 0.83x - 35. Horizontal axis: interval from onset of jaw lowering for V1 to onset of jaw lowering for V2 (ms). Plotting symbols distinguish stress ('CVCVC, CV'CVC) and rate (normal, fast).

There is an additional problem with the type of correlations used in most of these
studies. We are correlating the whole with one of its parts (cf. Barry, 1983). This in itself would result in rather high correlations. Following Munhall (1985), we can calculate the expected correlations using the formula $SD / \sqrt{\sum SD^{2}}$, that is, the standard deviation of the part divided by the square root of the summed variances of the component intervals. Using Fisher's r to z transformation, it is then possible to examine whether the observed correlations are higher than would be expected on the basis of part-whole correlations. In the present case, both the intra- and inter-class correlations are mostly outside the 99% confidence range. Only two intra-class correlations fall within this range; those are marked by asterisks in Table I. The correlations in the present study are thus uniformly high and do not appear to be a statistical artifact due to part-whole correlations. They would thus support the notion of stability between the measured articulatory intervals across changes in stress and speaking rate.

However, the picture is quite different if we instead use regression analysis. Here, there are significant differences in both slopes and intercepts, as can be seen in Figures 1 and 2. The differences between consonants are perhaps less problematic than the variability between stress and rate categories. Furthermore, Bartlett's test of homogeneity indicated that the residual variances were often not the same. In general, the intercepts are non-zero, indicating that the ratio between the measured intervals does not remain constant. We should note, however, that in the present material, the words with short vowel and long consonant tended to have intercepts at or close to zero. From the results of the regression analyses it is thus debatable whether stability occurs.

There is one final point that needs some clarification. Uniformly high correlations have usually been reported when the period is defined by vowel-related articulatory activities and the latency measured from vowel to a following consonant.
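As an aside on the part-whole check discussed above, the comparison can be sketched in a few lines. This is a hedged illustration rather than the procedure actually followed: it assumes the whole is the sum of the part and an uncorrelated remainder, it builds the confidence range around the expected correlation using the standard Fisher z standard error 1/sqrt(n - 3), and all function names and numerical values are hypothetical.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist


def expected_part_whole_r(sd_part, sd_rest):
    """Expected r between a part and (part + rest).

    Assumes part and rest are uncorrelated, so the variance of the
    whole is the sum of the two variances.
    """
    return sd_part / sqrt(sd_part ** 2 + sd_rest ** 2)


def fisher_interval(r, n, level=0.99):
    """Confidence range around r via Fisher's r-to-z transformation."""
    z = atanh(r)                       # r to z
    se = 1.0 / sqrt(n - 3)             # standard error of z
    crit = NormalDist().inv_cdf(0.5 + level / 2.0)
    return tanh(z - crit * se), tanh(z + crit * se)  # back to r


def exceeds_part_whole(r_observed, sd_part, sd_rest, n, level=0.99):
    """True if the observed r lies above the range expected from
    part-whole overlap alone."""
    r_expected = expected_part_whole_r(sd_part, sd_rest)
    _, upper = fisher_interval(r_expected, n, level)
    return r_observed > upper


if __name__ == "__main__":
    # Hypothetical SDs (ms) and sample size, for illustration only.
    print(expected_part_whole_r(30.0, 40.0))
    print(fisher_interval(0.6, 28))
    print(exceeds_part_whole(0.99, 30.0, 40.0, 28))
```

With the hypothetical standard deviations of 30 ms for the part and 40 ms for the remainder, the expected part-whole correlation is 0.6; an observed correlation is then treated as more than a part-whole artifact only if it lies above the upper end of the range around that value.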
These high correlations have been taken as support for the view that the vowel-to-vowel cycle plays an important organizing role in speech (cf. Fowler, 1983). It seems to me that this claim rests on one assumption and one fact. The assumption has been that the interval from onset of jaw lowering for the first vowel to onset of jaw lowering for the second vowel is, in fact, a reasonable measure of the vowel-to-vowel cycle. The fact is that correlating this interval with the latency from onset of jaw lowering for the first vowel to onset of lip activity associated with the medial consonant consistently gives higher correlations than a consonant-defined period and a latency measured from consonant to vowel, as illustrated by the present results.

However, it seems that we should entertain the possibility of another explanation for these findings. First, while it is possible to argue that the vowel-to-vowel period is what is actually measured, we can say with equal justification that we are measuring the duration of the vowel and the following consonant. Jaw lowering for a vowel will start during the preceding consonant. In a sequence /ba₁C₁a₂C₂/, jaw lowering for the first and second vowels will begin during the initial /b/ and the first consonant, respectively. Variations in stress and rate will mostly affect the duration of the first vowel and also the duration of the first consonant. The interval between the onsets of jaw lowering for the first and second vowels will thus reflect the duration of the first vowel and the first consonant. We are thus correlating the duration of the vowel plus the consonant with the duration of the vowel.

Secondly, it is well known that a vowel and a following consonant often serve as a unit for durational contrast (cf. Lehiste, 1970; Bannert, 1979; see also Sock, 1984). That is, some languages have restrictions on the types of syllables that can occur.
In Swedish, the combinations allowed are V:C and VC:, and this is also the pattern found in many other Germanic languages. It has also been argued that the ratio V/(V + C) is a useful measure in these cases. It is thus entirely plausible that this
explains why the high correlations occur for the vowel-defined period and the vowel-to-consonant defined latency, while other measures show lower and more variable correlations. Note, incidentally, that on this analysis the consonant-to-consonant cycle would rather be taken as a measure of the consonant and the following vowel, i.e. the CV syllable. Correlating the duration of the consonant plus the vowel with the duration of the consonant usually gives small and variable correlations, as shown in this and other studies. While the CV syllable type may occur universally, it does not appear to play any particular role for durational patterns, which are more commonly defined in terms of the VC unit. From perceptual studies, it is also known that the temporal relation between a vowel and a following consonant is important (e.g. Bannert, 1976).

This analysis is at least superficially in conflict with the arguments in favor of the CV syllable as a basic unit given by Kelso, Saltzman & Tuller (1986). I should perhaps add that the phase analysis they suggest instead of the durational intervals discussed here is also, in its present form, concerned with the VC transition and not with the vowel-to-vowel cycle.

Irrespective of these points of disagreement, it seems to me that the general approach outlined by Kelso, Saltzman & Tuller is a good and productive one. The search for invariance at the articulatory and acoustic levels has a long history in speech research, mostly marked by disillusions. The measurements of relational invariance that they propose provide a fresh line of attack on this classical problem.

References

Abbs, J., Gracco, V. & Cole, K. (1984) Control of multimovement coordination: Sensorimotor mechanisms in speech motor programming, Journal of Motor Behavior, 16, 195-231.
Barry, W. (1983) Some problems of interarticulator phasing as an index of temporal regularity in speech, Journal of Experimental Psychology: Human Perception and Performance, 9, 826-828.
Bannert, R. (1976) Mittelbairische Phonologie auf akustischer und perzeptorischer Grundlage. Lund: Gleerup.
Bannert, R. (1979) The effect of sentence accent on quantity. In: E. Fischer-Jørgensen, J. Rischel & N. Thorsen (eds.), Proceedings of the Ninth International Congress of Phonetic Sciences, Vol. 2, pp. 253-259. Copenhagen: Institute of Phonetics.
Branderud, P. (1985) Movetrack - a movement tracking system. PERILUS IV, pp. 20-29, Department of Linguistics, Stockholm University.
Fowler, C. (1983) Converging sources of evidence on spoken and perceived rhythms of speech: cyclic production of vowels in monosyllabic stress feet, Journal of Experimental Psychology: General, 112, 386-412.
Gentil, M., Harris, K. S., Horiguchi, S. & Honda, K. (1984) Temporal organization of muscle activity in simple disyllables, Journal of the Acoustical Society of America, 75, S23(A).
Kelso, J. A. S., Saltzman, E. & Tuller, B. (1986) The dynamical perspective on speech production: data and theory, Journal of Phonetics, 14, 29-59.
Kelso, J. A. S., Tuller, B., Vatikiotis-Bateson, E. & Fowler, C. (1984) Functionally specific articulatory cooperation following jaw perturbations during speech: evidence for coordinative structures, Journal of Experimental Psychology: Human Perception and Performance, 10, 812-832.
Lee, W. (1984) Neuromotor synergies as a basis for coordinated intentional action, Journal of Motor Behavior, 16, 135-170.
Lehiste, I. (1970) Suprasegmentals. Cambridge, Mass.: M.I.T. Press.
Löfqvist, A. & Yoshioka, H. (1984) Intrasegmental timing: laryngeal-oral coordination in voiceless consonant production, Speech Communication, 3, 279-289.
Munhall, K. (1985) An examination of intra-articulator relative timing, Journal of the Acoustical Society of America, 78, 1548-1553.
Nashner, L. & McCollum, G. (1985) The organization of human postural movements: a formal basis and experimental synthesis, The Behavioral and Brain Sciences, 8, 135-172.
Sock, R. (1984) Une compensation temporelle, en fonction de la vitesse d'élocution dans le timing de l'opposition de quantité vocalique du wolof de Gambie, Bulletin de l'Institut de Phonétique de Grenoble, 13, 25-84.
Tuller, B. & Kelso, J. A. S. (1984) The timing of articulatory gestures: Evidence for relational invariants, Journal of the Acoustical Society of America, 76, 1534-1543.
Tuller, B., Kelso, J. A. S. & Harris, K. S. (1982) Interarticulator phasing as an index of temporal regularity in speech, Journal of Experimental Psychology: Human Perception and Performance, 8, 460-472.