Journal of Voice
Vol. 3, No. 3, pp. 213-224 © 1989 Raven Press, Ltd., New York
Role of the Thyroarytenoid Muscle in Regulation of Fundamental Frequency
Ingo R. Titze, Erich S. Luschei, and *Minoru Hirano
Department of Speech Pathology and Audiology, The University of Iowa, Iowa City, Iowa, U.S.A. and *Department of Otolaryngology, Head and Neck Surgery, Kurume University, Kurume, Japan
Summary: Thyroarytenoid muscle activity is shown to combine with cricothyroid muscle activity to regulate fundamental frequency of phonation. The relative amount of activity in these muscles, as measured electromyographically, is illustrated on a muscle activation plot (MAP) for four subjects vocalizing at different pitches and loudnesses. Electrical stimulation of the thyroarytenoid muscle in various regions of the MAP suggests that both positive and negative changes in fundamental frequency (F0) can occur with increased thyroarytenoid activity. At lower fundamental frequencies and lower vocal intensities, F0 correlates positively with thyroarytenoid activity, but at higher fundamental frequencies and low intensity (especially in falsetto voice) an increase in thyroarytenoid activity tends to lower F0. A biomechanical body-cover model of fundamental frequency control is used to explain this phenomenon.
Key Words: Thyroarytenoid, Body-cover, EMG, Fundamental frequency, Electrical stimulation, Pitch control.
Address correspondence and reprint requests to Dr. Ingo R. Titze, Dept. of Speech Pathology and Audiology, The University of Iowa, Iowa City, IA 52242, U.S.A.

The role of the thyroarytenoid (TA) muscle in fundamental frequency regulation is not fully understood. As suggested by Hirano (1,2), the thyroarytenoid muscle should be able to stiffen the body of the vocal fold while slackening the cover. This nonuniform stiffening of adjacent vocal fold tissues, which presumably occurs most markedly when the vocal fold is allowed to shorten, creates an uncertainty about the effective stiffness of the tissue in vibration. If the vibrating cross-sectional area is primarily nonmuscular tissue (cover), then the effective stiffness should be lowered by TA contraction. On the other hand, if the vibrating cross-sectional area is primarily muscular tissue, then the effective stiffness could be raised. The fundamental frequency (F0) would be expected to change approximately as the square root of the effective stiffness (or equivalently, the effective longitudinal tension).

The possibility of both positive and negative changes of F0 with increasing TA activity was recently predicted quantitatively by Titze et al. (3) on the basis of rotational equilibrium mechanics around the cricothyroid (CT) joint. The mechanical properties were based on measurements made on three canine larynges in vivo and several samples of muscle tissue in vitro. The approach showed considerable promise for quantifying the body-cover theory of F0 control, but more data are needed to raise the confidence level of the results. More importantly, the canine larynx is not an ideal model of the human larynx when vocal fold tissue morphology is in question. Owing to the virtual absence of the vocal ligament in the canine (1), the cover seems to be more loosely connected to the body in the dog than in the human. This makes the distribution of cover tissue and body tissue in the vibrating portion
of the vocal fold potentially different in the two species.

Given this morphological difference, we felt it appropriate to test the preliminary results obtained with the canine model on some human subjects before a larger study with more animals was initiated. A clear disadvantage in the use of human subjects is that no direct mechanical quantities, such as force, elongation, or vibrating cross section, can be measured easily. The protocol outlined in this study is therefore designed to give only indirect confirmation of the amount of involvement of the body and cover. It provides some qualitative assurance that future refinement with the animal model is justified.

Our assumption is that low-intensity and high-F0 phonation involves less of the body in vibration than high-intensity and low-F0 phonation. This assumption is based simply on differences in vibrational amplitude. Loud and low phonations have greater amplitudes of vibration than soft and high phonations. By examining the effective depth of vibration with the use of videostroboscopy, the amount of involvement of the body in vibration was assessed (qualitatively). The basic hypotheses were:
(a) F0 changes with increased TA tension should be more positive when amplitude of vibration is high than when amplitude of vibration is low.
(b) F0 changes with increased TA tension should be more positive when CT activity is low than when CT activity is high.

The second hypothesis grew directly out of the consideration that, at low CT activity, the TA muscle is more able to increase the net tension of the vibrating portion of the vocal fold because the tension in the cover is rather low. At high CT activity, on the other hand, when the tension in the cover is quite large, it is difficult for the body to bring about a substantial increase in the net tension of the vibrating system.

METHODS AND PROCEDURE

Four adult males, ranging in age from 28 to 47 years, served as subjects for the study.
None had any known vocal pathology at the time of the experiment. Two subjects had had extensive training and experience as singers, whereas the other two had had no voice training. Bipolar fine wire electrodes were placed in the
thyroarytenoid and cricothyroid muscles of each subject. The electrodes were made by passing two 0.003-in diameter, Teflon-insulated, stainless steel wires through a 1.5-in 25-gauge hypodermic needle. The distal 1-2 mm of the end of each wire was bared of insulation by scraping or heating and was then formed into a hook. The electrodes were sterilized by autoclaving. The skin at the penetration site over the lower portion of the laryngeal framework was anesthetized by a small subcutaneous injection of lidocaine. The electrodes were inserted through the skin and directed into the appropriate muscles. All insertions were performed by one of the authors (M.H.), who has extensive experience with this procedure. The location of the electrodes was evaluated by observing the electromyographic activity (EMG) recorded from the electrode pair during various maneuvers. High levels of EMG during closure of the glottis (swallowing, glottal attack, Valsalva) were considered to be from the thyroarytenoid. EMG activity that increased when the pitch of vocalization was raised was considered to be from the cricothyroid. If EMG increased during depression of the chin against a manual force, it was assumed that the electrodes were picking up strap muscle activity. Such electrodes were removed and new electrodes inserted.

In the first part of the procedure, EMG activity was recorded on an instrumentation tape recorder (DC-2.5 kHz bandpass) while the subject maintained phonation at various low, middle, or high pitches. At each pitch, the subject produced phonation in four ways: soft, medium, and loud levels of intensity, and in some cases a soft breathy voice. Each phonation was approximately 4-6 s long, except for "breathy," which was typically much shorter. Three or four tokens were recorded for each condition. Constancy of pitch during the four conditions was established by having the subject listen to a pitchpipe between tokens.
The exact low, middle, and high pitch used with each subject differed somewhat; the pitches chosen were the ones that the subject found relatively easy to maintain, but typically covered one octave or more. The EMG was additionally recorded during swallowing, glottal attack, voluntary glottal adduction, singing scales over the entire pitch range, and the production of swell-tones. After EMG activity had been recorded during all of the above conditions, the electrodes to the thyroarytenoid were connected to an isolated, constant-current electrical stimulator. One millisecond
pulses were delivered to the thyroarytenoid at a rate of 2/s during phonation at each pitch and loudness. The current level was set for each subject by finding a level that subjectively produced a twitch in the throat, but that did not cause pain; currents ranged from 1 to 7 mA among the subjects. Once the stimulus current was set for any one subject, however, it was held at that level for all phonations. Approximately 100 stimuli were recorded during each pitch-loudness combination, except for "breathy," in which about 50 stimuli were recorded. An electroglottographic signal (EGG), low-pass filtered slightly above F0, was recorded during the stimulation procedure for later F0 extraction.

The relationship between EMG activity and all of the pitch-loudness combinations, as well as other maneuvers, was studied by playing out the rectified (but unfiltered) EMG records on a direct-writing oscillograph, which had a frequency response of DC to 2 kHz. The average amplitude of the EMG was measured during each pitch-loudness combination. In the case of the thyroarytenoid, the measurements were normalized to the EMG amplitude associated with swallowing. Cricothyroid was normalized to the level for the highest tone in the scales. For one subject (IT), however, both CT and TA activities were renormalized to a loud G4 tone because activities for this experimental tone were greater than for swallow or for the highest tone in the preliminary scales.

Changes in the fundamental frequency of phonation, ΔF0, produced by the electrical stimulation of the thyroarytenoid, were studied by triggering an oscilloscope from each cycle of the EGG and using the horizontal gate pulse from the oscilloscope as an input to an instantaneous frequency meter. The full range of the meter was set to 200 Hz for low pitches, and to 600 Hz for all higher pitches.
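The frequency-meter step can be illustrated in software. The following is a minimal sketch, not the authors' analog procedure: it assumes the EGG has already been reduced to a list of cycle-onset times in seconds, converts each period to an instantaneous frequency, and forms a stimulus-locked average of the kind used to extract ΔF0. All function names and the sample-and-hold lookup are hypothetical choices made for illustration.

```python
from bisect import bisect_right


def instantaneous_f0(cycle_times):
    """Return (cycle start time, frequency in Hz) for each glottal cycle."""
    return [(start, 1.0 / (end - start))
            for start, end in zip(cycle_times, cycle_times[1:])
            if end > start]


def f0_at(track, t):
    """Sample-and-hold lookup: F0 of the cycle in progress at time t."""
    starts = [start for start, _ in track]
    index = max(bisect_right(starts, t) - 1, 0)
    return track[index][1]


def stimulus_locked_average(track, stimulus_times, offsets):
    """Mean F0 at each time offset after a stimulus, averaged over stimuli."""
    return [sum(f0_at(track, s + dt) for s in stimulus_times) / len(stimulus_times)
            for dt in offsets]


def delta_f0(averaged, baseline_index=0):
    """Express an averaged F0 trace relative to its value at the baseline offset."""
    return [value - averaged[baseline_index] for value in averaged]


if __name__ == "__main__":
    # Hypothetical data: steady 100 Hz phonation, stimuli every 0.5 s (2/s).
    cycles = [i * 0.01 for i in range(201)]
    track = instantaneous_f0(cycles)
    trace = stimulus_locked_average(track, [0.5, 1.0, 1.5], [0.0, 0.05, 0.10])
    print(delta_f0(trace))
```

Averaging over many stimuli suppresses F0 fluctuations that are not time-locked to the stimulus, which is why slow periodic modulation such as vibrato can still contaminate the result.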
The output of the frequency meter was averaged by a laboratory computer, synchronized by stimulus time marks that had been recorded on the tape recorder. Fifty stimuli were generally included in each average. There were two independent averages from each condition, except for "breathy." These two averages were almost always very similar to one another.

RESULTS

EMG

The EMG data for the vocal fold tensor muscles CT and TA are shown graphically on a muscle activation plot (MAP) for each of the four subjects in Figs. 1-4. Figures 1 and 2 are for the two untrained vocalists and Figs. 3 and 4 are for the two trained vocalists. For each MAP, activity in the cricothyroid muscle (a_ct) is plotted against activity in the thyroarytenoid muscle (a_ta). Both muscle activities are normalized to maximum activity, which is assigned the value 1.0. The subjects' fundamental frequencies (in hertz) are shown for pairs of data points connected by lines. Arrows within each pair represent the direction in which loudness was increased at a given fundamental frequency. For all subjects (except SA in Fig. 3), two pairs of data points were obtained by repeating the experiment with new EMG insertions. The two experimental sessions were conducted 9 months apart.

A first observation is that none of the subjects show pronounced activity in the lower right quadrant of the tensor MAP. This is apparently the glottal stop region, where a_ct is small and a_ta is large, resulting in hyperadducted vocal folds that are unlikely to produce phonation.

Subjects JJ and EL (Figs. 1 and 2, respectively) sometimes found it difficult to maintain a given pitch when loudness was increased. Subject JJ had a range of about one and a half octaves (131-392
FIG. 1. Muscle activation plot (MAP) for subject JJ, an untrained vocalist. Open circles are for first experimental session, and filled circles are for second experimental session. Fundamental frequencies in hertz are indicated for each pair of soft-loud phonations. Arrows point in the direction from soft to loud. Cricothyroid muscle activity a_ct and thyroarytenoid muscle activity a_ta are normalized to the maximum value observed.
FIG. 2. Muscle activation plot (MAP) for subject EL, an untrained vocalist. Open circles are for first experimental session, and filled circles are for second experimental session. Fundamental frequencies in hertz are indicated for each pair of soft-loud phonations. Arrows point in the direction from soft to loud. Cricothyroid muscle activity a_ct and thyroarytenoid muscle activity a_ta are normalized to the maximum value observed.
Hz), whereas subject EL phonated over more than two octaves (98-440 Hz). Neither of them showed consistent EMG trends with increasing loudness. Both CT and TA activity seemed to increase or decrease in unsystematic ways from soft to loud phonation at given pitch levels. The trained vocalists, on the other hand, showed more consistent increases in TA activity with increasing loudness. More will be said about loudness regulation in later discussions.

All subjects were able to cover the male speech range of F0 (from below 100 Hz to about 250 Hz) with a_ct and a_ta both less than 0.5 (in the lower left quadrant of the MAP). This quadrant of the MAP will be called the speech quadrant. As F0 is raised in chest register, the tendency is to move diagonally from the lower left quadrant to the upper right quadrant (both a_ct and a_ta greater than 0.5) for all subjects. As the phonation becomes more falsetto-like, however, the tendency is toward the upper left quadrant, where a_ta < 0.5 and a_ct > 0.5. It would appear that the upper two quadrants are where major register transitions would take place in singing, with the falsetto-head transitions occurring in the upper left quadrant and the head-chest transition occurring in the upper right quadrant.
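The quadrant language used above can be made concrete with a small sketch. It assumes EMG amplitudes have already been measured in arbitrary units and are normalized to a chosen reference amplitude, as in the Methods; the function names are hypothetical, and the treatment of points lying exactly on the 0.5 boundary is an arbitrary choice that the text does not specify.

```python
def normalize(emg_amplitude, reference_amplitude):
    """Scale an average EMG amplitude by the reference (maximum) amplitude."""
    if reference_amplitude <= 0:
        raise ValueError("reference amplitude must be positive")
    return emg_amplitude / reference_amplitude


def map_quadrant(a_ct, a_ta, boundary=0.5):
    """Name the MAP quadrant for normalized CT and TA activities.

    a_ct is the vertical axis and a_ta the horizontal axis. Values exactly
    on the boundary are assigned to the lower or left side (an assumption).
    """
    upper = a_ct > boundary
    right = a_ta > boundary
    if upper and right:
        return "upper right"
    if upper:
        return "upper left"
    if right:
        return "lower right"
    return "lower left"


if __name__ == "__main__":
    # Hypothetical amplitudes in arbitrary units, with hypothetical references.
    a_ct = normalize(12.0, 40.0)
    a_ta = normalize(9.0, 30.0)
    print(a_ct, a_ta, map_quadrant(a_ct, a_ta))
```

In the terms of the text, "lower left" is the speech quadrant, "upper left" is where falsetto-like phonation tends to fall, and "lower right" is the glottal stop region.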
FIG. 3. Muscle activation plot (MAP) for subject SA, a trained vocalist. Fundamental frequencies in hertz are indicated for each pair of soft-loud phonations. Arrows point in the direction from soft to loud. Cricothyroid muscle activity a_ct and thyroarytenoid muscle activity a_ta are normalized to the maximum value observed.

FIG. 4. Muscle activation plot (MAP) for subject IT, a trained vocalist. Empty circles are for first experimental session, and filled circles are for second experimental session. Fundamental frequencies in hertz are indicated for each pair of soft-loud phonations. Arrows point in the direction from soft to loud. Cricothyroid muscle activity a_ct and thyroarytenoid muscle activity a_ta are normalized to the maximum value observed.

Thyroarytenoid stimulation

As described in Methods and Procedure, the TA muscle was stimulated in all subjects. Only one subject (EL) showed consistently large changes in F0 with moderate to low currents (<5 mA). It is conceivable that the insertions for this subject were near a nerve ending and that, as a result, some effective nerve stimulation was achieved. The F0 changes are shown graphically in Fig. 5 for several pitch and loudness conditions. An additional numerical summary of the F0 changes is given in Table 1. In approximately one-half of the conditions, two independent experiments (involving different electrode insertions approximately 9 months apart) are represented as data pairs, both in Fig. 5 and Table 1. It was felt that this verification was needed to raise the confidence level in the results.

TA stimulation produced a consistently positive ΔF0 in the low to medium low pitch range (G2 = 98 Hz to E3 = 164 Hz). This is in the speech quadrant of the MAP, where activity levels of both TA and CT are below 0.5. Negative ΔF0 was observed at very high pitches (A4 = 440 Hz), where CT contraction was near 1.0. These were falsetto (or near falsetto) phonations, where TA muscle activity was less than CT muscle activity. In the medium pitch range (A3 = 220 Hz to E4 = 330 Hz), the fundamental frequency change was sometimes positive and sometimes negative. Wherever a discrepancy existed, however, the softer phonations produced a less positive (or negative) ΔF0. In the A3 and E4 tones in Fig. 5, soft phonation produced a ΔF0 of -16 Hz and -17 Hz, respectively. These are apparently cases for which the cover was primarily in vibration. Stimulation of the TA muscle probably shortened the vocal folds, thereby slackening the cover and reducing tension. For loud phonation on the same frequencies, a strong positive ΔF0 was seen, which suggests that part of the body was in vibration. These results (from this subject) begin to support our two basic hypotheses, that changes in F0 with changes in TA activity should be more positive when CT activity is low or when vibrational amplitude is large. It will be shown in the next section that vibrational amplitude did indeed increase with loudness in this subject.

Stimulation data from the other three subjects did not show such clear trends. First of all, the F0 changes were generally smaller. Second, they were more variable from token to token, and even within tokens. Sometimes a negative ΔF0 obtained at the beginning of a sustained vowel token would disappear (or even change sign) toward the end of the
token with repeated stimulation at 2 Hz. Either the placement of the electrodes, the amount of current delivered, or the time-varying properties of the contracting muscle were less than ideal for obtaining large twitches. There were also measurement artifacts. For the two singers, the presence of vibrato compromised our ability to line up and average F0 variations that could be attributed solely to the muscle twitch. In addition, for subject IT an initial positive ΔF0 was sometimes followed by a negative ΔF0 in a period of about 100 ms (Fig. 6). This biphasic frequency change, which occurred mainly at low frequencies, is observable also in the data of Larson and Kempster (4). It may suggest that tension in the TA muscle first increases isometrically, but then decreases when the vocal folds shorten. There is no secondary evidence for this, however, and other explanations could possibly be given. In any case, it was difficult to assign a net positive or negative ΔF0 in these responses.

Table 2 shows a summary of results for subject IT, one of the two subjects with voice training. In the middle of the fundamental frequency range, subject IT demonstrated primarily small negative changes in F0. According to our two hypotheses, this would mean that either his TA muscle was not involved much in vibration (because of small amplitudes), or that the vocal fold cover remained quite stiff throughout all the phonations. Given that the productions were perceived to be quite intense (both from a muscular effort and an auditory point of view), the second explanation involving a generally stiff cover seems more plausible.

Thus two contrasting tissue morphologies may be manifested in subjects EL and IT, giving rise to an almost inverted ΔF0 pattern over the intensity and frequency ranges.
If we assume that EL has a relatively lax cover, small amounts of thyroarytenoid stimulation would raise the overall stiffness of the vocal folds isometrically and produce primarily positive F0 changes. In contrast, if we assume that subject IT has a relatively stiff cover, small amounts of thyroarytenoid stimulation will not add much stiffness to the vibrating tissue, but may cause a small length reduction. This in turn may reduce the tension in the cover and lower the fundamental frequency. The explanation would also account for the monophasic ΔF0 for subject EL and the biphasic ΔF0 for subject IT.

FIG. 5. Change in fundamental frequency (ΔF0) with thyroarytenoid muscle stimulation for different pitches and loudnesses. The subject is EL. Pairs of ΔF0 patterns indicate results for two experimental sessions, session one on top and session two on bottom. [Panel columns: Breathy, Soft, Medium, Loud. Panel rows: A4 (440 Hz), E4 (330 Hz), A3 (220 Hz), E3 (164 Hz), G2 (98 Hz). Horizontal axis: time, with a 200 ms scale bar.]

The remaining two subjects had less complete data sets for stimulation. Subject SA, whose thyroarytenoid muscle was not stimulated in the second experimental session (due to three unsuccessful hooked-wire insertions), exhibited no changes in F0 at high notes (F0 = 349 Hz) in the first experiment. In the middle of the frequency range (220 Hz), a +6 Hz change was recorded at loud phonations, with no significant changes at soft phonation. At low notes (117 Hz), small negative changes were encountered over various intensities. These did not exceed 3 Hz in magnitude. Subject JJ exhibited no ΔF0 at 262 Hz, 3-4 Hz increases at 196 Hz, and 10-27 Hz increases at 131 Hz, all in the first experimental session. In the second session, TA stimulation produced no ΔF0 for subject JJ at any intensity or frequency. We assumed that the electrodes were
ineffectively placed, or perhaps shorted out after EMG measurements. Nevertheless, the fragmentary stimulation data from subjects JJ and SA seem to follow some of the trends observed in the other subjects. JJ is similar to EL, with generally positive ΔF0 over the low to middle frequency range. The trend for SA may be similar to that of IT because small negative frequency changes occur in the low to middle F0 range, but the data are too sparse to make a firm comparison.

It is interesting to compare our results on TA stimulation with those of Larson and Kempster and their co-workers (4,5). Negative ΔF0 was observed only once in each of their two studies, presumably
TABLE 1. Summary of ΔF0 (in hertz) for subject EL (a)
Fundamental frequency
Hertz
A4
440
- 15 -20
- 18 - 10
E4
330
--17
-+6
A3
220
+ 17 - 16 b
+21
+ 17 +24
+22.5 --
Breathy
E3
164
+ 16.5 --
G2
98
+7.5 --
Soft
+9 + 14
Medium
-
-
Loud
+ 18 +22 + 17 +24 + 16.5 13.5
+
(a) Upper and lower numbers refer to first and second experimental sessions, respectively. (b) Very soft.
because fundamental frequencies were kept in the speech range (lower left quadrant of the MAP). If we were to count "frequency of occurrence" of negative ΔF0 in the lower left quadrant across our four subjects, it would fall below 20%. It is likely, therefore, that our results are not at odds with those of Larson and Kempster, but simply reflect a wider range of TA and CT activity in our study. Nevertheless, the occasional occurrence of negative ΔF0 in this speech range still needs clarification on physiologic and biomechanical grounds.

Vibrational amplitude and loudness

Three of the four subjects repeated some of the phonations at previously described pitches and loudnesses while a rigid fiberscope (Wolf 4450-47) was inserted into the mouth and positioned near the posterior pharyngeal wall. A stroboscopic light
FIG. 6. Biphasic ΔF0 pattern observed in subject IT. [Horizontal axis: time in ms, with ticks from 50 to 200.]
source (B & K 4914) was used to illuminate the vocal folds, and the image was recorded with a Storz Mini 9000 Solid State CCD video camera and a Panasonic NV-8950 VHS videocassette recorder. Subsequently, the maximum glottal width was measured on a TV monitor screen with calipers. Measurements were repeated and averaged over five cycles. Normalization of the maximum glottal width to the loudest and lowest phonation was possible because this condition consistently produced the greatest values. We assume that the amplitude of vibration is proportional to the maximum glottal width.

Figure 7 shows the results for the three subjects at three pitch levels: low, medium, and high. Relative amplitude (normalized to the low-loud condition) is plotted against three loudnesses: soft, medium, and loud. Note that there is a uniform increase in relative amplitude with increased loudness. For all subjects, relative amplitude was greatest for low pitch, less for medium pitch, and least for high pitch. On the average, the relative amplitude doubled (from 0.4 at soft phonation to 0.8 at loud phonation) when data for all pitch levels and all subjects were collapsed. Specifically for subject EL (circles), relative amplitude increased from an average of 0.45 to an average of 0.90 when the data were collapsed over all pitch conditions. Thus we can assume a 2:1 increase in amplitude from soft to loud in subsequent discussions.

A MODEL OF FUNDAMENTAL FREQUENCY REGULATION

In a previous study (3), we derived an equation relating vocal fold strain ε to thyroarytenoid muscle
220
I . R . TITZE E T AL. TABLE 2. Summary of M~ofor subject 17~ Fundamental frequency
Hertz
Breathy
440 392
A4 G4
330 262 220 165 147 131 98
E4
Ca A3 E3 D3
C3 G2
Soft
Loud
+ 13 positive phase 1 First - 2 negative phase J~ experimental session
--
+ 10 positive phase ] Second - 8 negative phase f experimental session -4 - 4.5 - 3.5 - 1 --3 + 3.5 positive phase - 4 . 0 negative phase
+8
-- 2 -3 - 2.5 --+ 2 positive phase - 5 negative phase
a Except where indicated, results are for the first experimental session.
activity a_ta and cricothyroid muscle activity a_ct in the canine larynx. The equation was

$$\varepsilon = G\,(R\,a_{ct} - a_{ta}) \tag{1}$$

where R is a torque ratio relating the maximum torque that can be produced by the CT muscle to the maximum opposite torque that can be produced by the TA muscle. (The torque axis passes through the CT joints.) R can be thought of as a mechanical advantage that CT has over TA. G in Eq. 1 is a gain factor that depends on rotational stiffness, laryngeal geometry, and maximum TA force produced. It is a measure of how sensitive vocal fold elongation is to muscle contraction. For the canine larynx, G and R were found to have dimensionless values of 0.10 and 3.5, respectively. For the human larynx, the torque ratio of 3.5 seems to be appropriate, but a higher gain (G = 0.17) is needed to achieve fundamental frequencies ranging from about 50 to 500 Hz.

FIG. 7. Relative amplitude of vibration versus loudness condition for three subjects at low pitch (short dashed lines), medium pitch (long dashed lines), and high pitch (solid lines). Circles are for subject EL, triangles for SA, and squares for JJ.

The fundamental frequency was derived to be

$$F_0 = F_{0p}\left(1 + \frac{A_a}{A}\,\frac{\sigma_{am}}{\sigma_p}\,a_{ta}\right)^{1/2} \tag{2}$$

where F0p is the passive fundamental frequency (when there is no thyroarytenoid contraction), Aa/A is the ratio of the cross section of the TA muscle in vibration to the total cross-sectional area in vibration, σam is the maximum active stress (force per unit area) in the TA muscle, and σp is the mean passive stress of all the combined tissues in vibration. Four auxiliary equations

$$L = L_0\,(1 + \varepsilon - 0.623) \tag{3}$$

$$\sigma_p = 10^{4}\,e^{9.2\,\varepsilon}\ \text{dyn/cm}^2 \tag{4}$$

$$\sigma_{am} = 10^{6}\,(1 + 0.6\,\varepsilon)\ \text{dyn/cm}^2 \tag{5}$$

$$F_{0p} = \frac{1}{2L}\left(\frac{\sigma_p}{\rho}\right)^{1/2} \tag{6}$$

were also introduced in the previous paper. In Eq. 3, L is the membranous vocal fold length at any given strain ε, with L0 being the reference length (1.6 cm for males on average and 1.0 cm for
females). The factor -0.623 is necessary to transform typical strains used in in vivo dog experiments to typical strains found in human vocal fold length adjustments (6). In Eq. 4, the passive tissue stress is modeled with an exponential stress-strain curve (6,7), while in Eq. 5, the maximum active stress follows the data on canines reported by Alipour-Haghighi et al. (8). Finally, Eq. 6 is the familiar formula for the fundamental frequency of a vibrating string, ρ being the tissue density (1.03 g/cm^3).

Equations 1 through 6 can be combined to yield one equation in four unknowns, a_ct, a_ta, F0, and Aa/A. In the process, the variables L, σp, σam, ε, and F0p are eliminated by substitution. In order to relate the theoretical results to experimental results discussed earlier in Figs. 1-4, it is useful to obtain a theoretical MAP by solving for a_ct as a function of a_ta, with F0 and Aa/A being parameters. This requires a numerical solution of Eqs. 1 through 6 because some of the equations are transcendental in ε.

Such a numerical solution is shown graphically in Fig. 8. Normalized cricothyroid activity a_ct is plotted against normalized thyroarytenoid activity a_ta, with families of curves representing constant F0 contours. F0 varies in 50 Hz increments from a low of 100 Hz (lower left corner) to a high of 450 Hz (upper left corner). Solid lines are for Aa/A = 0.3 and dashed lines are for Aa/A = 0.6. These ratios are estimates to represent the soft and loud conditions, respectively, in the experimental data. In other words, less than one-third of the vibrating cross section is assumed to be TA muscle in soft phonation, whereas nearly two-thirds of the vibrating cross section is assumed to be muscle in loud phonation. This doubling of Aa/A agrees with the doubling of vibrational amplitude noted earlier. Several observations are in order with regard to Fig. 8.
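The numerical solution described above can be sketched as follows. This is an illustrative reading of Eqs. 1 through 6 with the human constants quoted in the text (G = 0.17, R = 3.5, L0 = 1.6 cm, ρ = 1.03 g/cm^3), not the authors' program. The coefficients in Eqs. 3 to 5 are taken as they are read from the printed equations, and the grid-scan-plus-bisection root finder is an assumed method chosen because F0 need not be monotonic in a_ct once a_ta is large.

```python
import math

G = 0.17    # gain factor (human value quoted in the text)
R = 3.5     # CT-to-TA torque ratio
L0 = 1.6    # reference membranous length in cm (male average)
RHO = 1.03  # tissue density in g/cm^3


def fundamental_frequency(a_ct, a_ta, area_ratio):
    """F0 in Hz from normalized activities and the ratio Aa/A."""
    strain = G * (R * a_ct - a_ta)                        # Eq. 1
    length = L0 * (1.0 + strain - 0.623)                  # Eq. 3 (cm)
    passive_stress = 1.0e4 * math.exp(9.2 * strain)       # Eq. 4 (dyn/cm^2)
    max_active_stress = 1.0e6 * (1.0 + 0.6 * strain)      # Eq. 5 (dyn/cm^2)
    passive_f0 = math.sqrt(passive_stress / RHO) / (2.0 * length)  # Eq. 6
    active_term = area_ratio * (max_active_stress / passive_stress) * a_ta
    return passive_f0 * math.sqrt(1.0 + active_term)      # Eq. 2


def contour_points(target_f0, a_ta, area_ratio, steps=200):
    """Return every a_ct in [0, 1] where F0 crosses target_f0 at fixed a_ta.

    The interval is scanned on a uniform grid and each sign change is refined
    by bisection. Roots where the curve only touches the target are missed.
    """
    def residual(a_ct):
        return fundamental_frequency(a_ct, a_ta, area_ratio) - target_f0

    roots = []
    for i in range(steps):
        lo, hi = i / steps, (i + 1) / steps
        f_lo, f_hi = residual(lo), residual(hi)
        if f_lo == 0.0:
            roots.append(lo)
        elif f_lo * f_hi < 0.0:
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                if residual(lo) * residual(mid) <= 0.0:
                    hi = mid
                else:
                    lo = mid
            roots.append(0.5 * (lo + hi))
    if residual(1.0) == 0.0:
        roots.append(1.0)
    return roots


if __name__ == "__main__":
    # Trace the 300 Hz contour for the soft condition (Aa/A = 0.3).
    for a_ta in (0.0, 0.2, 0.4):
        print(a_ta, contour_points(300.0, a_ta, 0.3))
```

Repeating the scan over a range of a_ta values and target frequencies traces out a family of constant-F0 curves. Under these assumptions the 300 Hz contour at a_ta = 0 lands close to a_ct = 0.76.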
First, the lowest F0s are attainable with low a_ct and low a_ta (lower left quadrant of the MAP, where speech is produced), the highest F0s are attainable with high a_ct and low a_ta (upper left quadrant, where falsetto is produced), and intermediate F0s are found in all quadrants. Since the constant F0 lines are continuous, an infinite number of combinations of CT and TA activity are possible for every F0. For example, 300 Hz can be obtained in the upper left quadrant with a_ta = 0 and a_ct = 0.76. With only a gradual increase in a_ct (solid line for Aa/A = 0.3), a_ta can vary all the way into the upper right quadrant. With a combined reduction in both a_ct and a_ta, the 300 Hz curve extends all the way into the lower left quadrant. Thus we see that the
motor system is dealing with a peripheral mechanism that offers multiple solutions to accomplish the same task. These redundancies may be removed, however, when more stringent requirements are put on the vocalization, such as specific loudnesses, qualities, or dynamic F0 changes. For example, a 300 Hz tone may have distinctly different qualities in the three quadrants. It is likely to be a falsetto sound in the upper left quadrant, a chest sound in the upper right quadrant, and a pressed sound in the lower left quadrant.

The second point to make is that changes in F0 with increased TA activity are all positive in the lower left quadrant. A positive increment in a_ta (toward the right) will approach a higher F0 curve, both for the solid lines (soft phonation) and the dashed lines (loud phonation). This is a direct result of the downward bending of the curves in the lower left quadrant. Similar statements can be made about the lower right quadrant, but phonation generally does not occur in this "stop" region.

Near the top of the upper quadrants, ΔF0 can be negative with increased a_ta. Displacement to the right from a given F0 curve will approach a lower F0 if the curves have a positive slope. Thus, F0 and a_ta are inversely related at the higher frequencies, especially for Aa/A = 0.3 (solid lines corresponding to soft phonation). This theoretical result agrees with experimental findings on subject EL. Note that the two 440 Hz phonations in Fig. 2 occurred near the top of the MAP. Recall also that ΔF0 was always negative for this frequency with TA stimulation (Fig. 5 and Table 1). Although in Fig. 8 the 440 Hz curve is slightly off the graph above the upper right quadrant, it is evident that negative ΔF0 would be predicted for both sets of data, especially if Aa/A is closer to the 0.3 value.

The 350 Hz curve is even more interesting. Here both positive and negative changes in F0 are predicted with increasing a_ta.
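The sign argument above can be checked with a finite difference in a_ta. The sketch below is a hedged illustration only: it evaluates Eqs. 1 through 6 with the human constants quoted in the text, takes the coefficients of Eqs. 3 to 5 as read from the printed equations, and uses a hypothetical increment of 0.05 in a_ta to stand in for a small stimulated increase in TA activity.

```python
import math


def f0(a_ct, a_ta, area_ratio, gain=0.17, torque_ratio=3.5, l0=1.6, rho=1.03):
    """F0 in Hz from Eqs. 1 through 6 (constants as quoted for the human larynx)."""
    strain = gain * (torque_ratio * a_ct - a_ta)
    length = l0 * (1.0 + strain - 0.623)
    passive = 1.0e4 * math.exp(9.2 * strain)
    active_max = 1.0e6 * (1.0 + 0.6 * strain)
    passive_f0 = math.sqrt(passive / rho) / (2.0 * length)
    return passive_f0 * math.sqrt(1.0 + area_ratio * active_max / passive * a_ta)


def delta_f0(a_ct, a_ta, area_ratio, increment=0.05):
    """Change in F0 for a small increase in TA activity at fixed CT activity."""
    return f0(a_ct, a_ta + increment, area_ratio) - f0(a_ct, a_ta, area_ratio)


if __name__ == "__main__":
    # Hypothetical operating points: one in the speech quadrant, one near the top.
    for label, a_ct, a_ta in (("lower left", 0.2, 0.1), ("near top", 1.0, 0.0)):
        for area_ratio in (0.3, 0.6):
            print(label, area_ratio, delta_f0(a_ct, a_ta, area_ratio))
```

With these assumed values the difference is positive at the lower-left point and negative at the point near the top of the MAP, for both Aa/A = 0.3 and Aa/A = 0.6, which is the direction of change the text describes for those regions.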
Upward sloping lines in the upper left quadrant give rise to negative ΔF0 with increasing a_ta for soft phonation, whereas the downward sloping lines for loud phonation give rise to positive ΔF0. Experimental results from subject EL confirm this sign reversal. Note that ΔF0 in Fig. 5 (330 Hz) is negative for soft phonation and positive for loud phonation. Muscle activity level was in the upper right quadrant (Fig. 2) for this frequency. In this same quadrant, Fig. 8 predicts opposite slopes for the soft and loud conditions at 350 Hz. Below 250 Hz, ΔF0 is predicted to be positive for all loudness conditions. This is again in agreement with Fig. 5, with the exception of one very soft phonation in experimental session two that showed a negative ΔF0 at 220 Hz. However, a slightly louder phonation in experimental session one showed the predicted result. This suggests that the model is quite accurate in explaining the general direction of F0 change in various quadrants.
FIG. 8. Theoretical muscle activation plot (MAP): a_ct (vertical axis) versus a_ta (horizontal axis), each from 0 to 1.0. Solid lines of constant F0 are for A_a/A = 0.3 and dashed lines for A_a/A = 0.6.
A final point of interest is the mechanism by which loudness is increased at constant fundamental frequency. Recall that the trained subjects SA and IT systematically increased TA activity with
increased loudness (horizontal or slightly rising or falling lines in Figs. 3 and 4), excepting a few data pairs in the lower left quadrant for subject IT. It appears that increased TA activity is used by singers in conjunction with the major mechanism for increasing loudness, increased subglottal pressure. Much less systematic approaches were observed for the two nonsingers (Figs. 1 and 2). Furthermore, the singers kept the CT activity more constant, even though some changes in both directions were observed. In previous EMG studies of singers performing a crescendo at constant pitch (9,10), the CT activity usually decreased while the TA activity increased. This indicated that pulmonary pressure was one of the major mechanisms for increasing loudness. Given that F0 increases at a rate of 2-6 Hz/cm H2O with increased subglottal pressure (11), some F0 compensation can be achieved by reduced CT activity.
Figure 8 predicts that crescendos at constant F0 should be trajectories between the solid lines and the dashed lines if pulmonary pressure is increased and the amplitude of vibration grows. Should TA activity stay constant in the process, the trajectories would be vertical (downward). This implies an automatic reduction in CT. If TA activity is increased together with pulmonary pressure, the trajectories are sloping or straight lines from left to right, as in the data of subjects SA and IT. The slopes of the trajectories should be positive at high F0 and zero or slightly negative at intermediate F0s. At low fundamental frequencies, the situation is more complicated because of the highly curved nature of the constant F0 lines in Fig. 8. In the lower left quadrant, it may be necessary for TA to decrease in order to follow a trajectory from the solid to the dashed line. Given these many possibilities, it is not surprising that the subjects showed great variability in their approaches to intensity control, especially the untrained vocalists. A next step in the experimental protocol would be to include subglottal pressure as a controlled variable. This would increase the complexity of the experiment, but would add considerable insight into both frequency and intensity control.
DISCUSSION AND CONCLUSIONS
The results of this study are both satisfying and slightly disappointing. It is unfortunate that only one out of four subjects showed large and consistent changes in fundamental frequency with stimulation of the thyroarytenoid muscle.
This leaves the general applicability of our biomechanical model of fundamental frequency control in some doubt at this time. On the other hand, the model has not been invalidated by those data sets that showed the strongest and most consistent trends. Both of the specific hypotheses stated in the introduction were supported by theory and measurement. We might safely say that we have modeled the F0 control mechanism for a single subject. Perhaps this subject's larynx is closest to the canine larynx from which the basic model was derived. It remains to be seen if adjustments in biomechanical and geometric parameters of the model can be made to match the response of any human larynx, and if better responses to stimulation can be obtained.
Notwithstanding some of these limitations, the present study has shed considerable light on the mechanism of F0 regulation. An explanation has been offered for why increased thyroarytenoid muscle activity can both raise and lower F0. When the cover is lax and the amplitude of vibration is sufficiently large to include a portion of the muscle in vibration, increased thyroarytenoid activity will raise F0. This is because the increase in tension in the muscle outweighs the decrease in the tension in the cover that may result from a small decrease in vocal fold length. This effect increases with increasing vibrational amplitude, i.e., with increased loudness, because more of the muscle is involved in vibration. Presumably, this situation is most often encountered in speech. It accounts for the generally positive correlations observed between F0 and TA in a speech environment (12,13). If the amplitude of vibration is very small, such that none of the muscle is in motion, F0 can only drop with increased TA activity. This is a plausible condition for the canine larynx, in which the cover is 2-3 mm thick, but does not appear to be a plausible condition in the human larynx. The presence of the vocal ligament and the thinner cover (about 1 mm) would seem to prevent this complete independence of the body and cover in humans.
When the cover is very tense (large cricothyroid activity with elongated vocal folds), the active tension in the TA muscle cannot match the tension in the cover and the vocal ligament. Greater contraction of the muscle will lower F0 because the small gain in muscle tension is outweighed by reduced tension in the cover that results from a small decrease in length.
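The competition described above between a gain in active muscle tension and a loss of passive cover tension can be illustrated numerically. The following is only a toy sketch, not the model used in this study. It assumes an ideal-string formula for F0 (the square root of effective stress over density, divided by twice the length), an exponential passive length-tension curve for the cover, and a linear dependence of strain on a_ct and a_ta. Every constant and function name is a hypothetical placeholder chosen to give plausible magnitudes.

```python
import math

# Hypothetical placeholder values; none are taken from the paper's model.
REST_LENGTH_M = 0.016       # resting vocal fold length, m
DENSITY = 1040.0            # tissue density, kg/m^3
STRAIN_GAIN = 0.2           # strain per unit of net activation
CT_WEIGHT = 3.0             # assumed leverage of CT relative to TA on length
COVER_STRESS_REST = 5.0e3   # passive cover stress at zero strain, Pa
COVER_STEEPNESS = 8.0       # exponent of the assumed passive length-tension curve
MAX_ACTIVE_STRESS = 1.0e5   # assumed maximum active stress in the TA muscle, Pa


def fundamental_frequency(a_ct, a_ta, area_ratio):
    """Toy body-cover estimate of F0 in Hz.

    area_ratio plays the role of A_a/A, the fraction of the vibrating
    cross section that is active muscle.
    """
    # CT lengthens the fold, TA shortens it (assumed linear relation).
    strain = STRAIN_GAIN * (CT_WEIGHT * a_ct - a_ta)
    length = REST_LENGTH_M * (1.0 + strain)
    # Passive cover stress rises steeply with elongation.
    passive = COVER_STRESS_REST * math.exp(COVER_STEEPNESS * strain)
    # Active stress counts only for the muscle fraction in vibration.
    active = area_ratio * MAX_ACTIVE_STRESS * a_ta
    return math.sqrt((passive + active) / DENSITY) / (2.0 * length)


def delta_f0(a_ct, a_ta, area_ratio, step=0.05):
    """Change in F0 for a small increase in TA activation."""
    return (fundamental_frequency(a_ct, a_ta + step, area_ratio)
            - fundamental_frequency(a_ct, a_ta, area_ratio))


if __name__ == "__main__":
    cases = (("lax cover", 0.1, 0.1),
             ("intermediate", 0.5, 0.3),
             ("tense cover", 0.9, 0.1))
    for label, a_ct, a_ta in cases:
        for ratio in (0.3, 0.6):
            change = delta_f0(a_ct, a_ta, ratio)
            print(f"{label:12s} A_a/A={ratio:.1f}  dF0={change:+.2f} Hz")
```

With these placeholder values, the sketch gives a rise in F0 for the lax cover, a drop for the tense cover, and at the intermediate setting a drop for A_a/A = 0.3 but a rise for A_a/A = 0.6. The signs follow the qualitative pattern described in the text; the magnitudes carry no meaning.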
The length-tension curve for the cover is very steep at large elongations, making the loss of tension very dramatic with only minor reductions in length. At intermediate levels of CT and TA contraction, our results show that an increase in TA can both raise and lower F0. This depends critically on the amount of muscle in vibration. For louder phonations, where the vibrational amplitude is larger, there is more likely to be an F0 rise.
A next step in the development of this model is to include the dependency of F0 on pulmonary pressure. The important link is the amplitude of vibration, which needs to be quantified in detail. Recent experimentation on excised larynges (7,11) has shown that changes in F0 with subglottal pressure can be accounted for by dynamic (or amplitude-dependent) stiffness in the vocal fold tissue.
Acknowledgment: This research was supported by the National Institutes of Health, Grant No. NS 16320-08. The authors appreciate the assistance of David Druker, Steve Austin, and Linnie Southard in preparation of the manuscript and data reduction. We are also grateful for the many helpful suggestions and stimulating thoughts that were provided by Ronald Scherer, Ph.D., and David Garrett, Ph.D. Finally, the medical assistance of Steven Gray, M.D., and his assistants from the Department of Otolaryngology-Head and Neck Surgery at the University of Iowa is greatly appreciated.
REFERENCES
1. Hirano M. Phonosurgery: basic and clinical investigations. Official Report of the 78th Annual Convention of the Oto-Rhino-Laryngeal Society of Japan, 1975.
2. Hirano M. Morphological structure of the vocal cord as a vibrator and its variations. Folia Phoniatr (Basel) 1976;26:89-94.
3. Titze IR, Jiang J, Druker DG. Preliminaries to the body-cover theory of pitch control. J Voice 1988;1:314-9.
4. Larson CR, Kempster GB. Voice fundamental frequency changes following discharge of laryngeal motor units. In: Titze IR, Scherer RC, eds. Vocal fold physiology: biomechanics, acoustics, and phonatory control. Denver: The Denver Center for the Performing Arts, 1983:91-104.
5. Kempster GB, Larson CR, Kistler MK. Effects of electrical stimulation of cricothyroid and thyroarytenoid muscles on voice fundamental frequency. J Voice 1988;2:221-9.
6. Titze IR. Physiologic and acoustic differences between male and female voices. J Acoust Soc Am 1989;85 (in press).
7. Titze IR, Durham P. Passive mechanisms influencing fundamental frequency control. In: Baer T, Sasaki C, Harris KS, eds. Laryngeal function in phonation and respiration. Boston: Little, Brown, 1987:291-303.
8. Alipour-Haghighi F, Titze IR, Perlman AL. Tetanic contraction in vocal fold muscle. J Speech Hear Res (in press).
9. Hirano M, Vennard W, Ohala J. Regulation of register, pitch, and intensity of voice. Folia Phoniatr (Basel) 1970;27:1-20.
10. Hirano M. Vocal mechanisms in singing: laryngological and phoniatric aspects. J Voice 1988;2:51-69.
11. Titze IR. On the relation between subglottal pressure and fundamental frequency in phonation. J Acoust Soc Am 1989;85:901-6.
12. Atkinson JE. Correlation analysis of the physiological factors controlling fundamental voice frequency. J Acoust Soc Am 1978;63:211-22.
13. Shipp T, Doherty ET, Morrisey P. Predicting vocal frequency from selected physiologic measures. J Acoust Soc Am 1979;66:678-84.