Vocal fold vibrations in dysphonia: model vs. measurement


Journal of Phonetics (1986) 14, 429-434

D. G. Childers
Department of Electrical Engineering, University of Florida, Gainesville, FL 32611, U.S.A.

Y. A. Alsaka
Department of Electrical Engineering and Communication Sciences, University of Central Florida, Orlando, FL 32816, U.S.A.

D. M. Hicks and G. P. Moore
Department of Speech, University of Florida, Gainesville, FL 32611, U.S.A.

Several investigators have conjectured that the electroglottographic (EGG) waveform is inversely proportional to the lateral area of contact of the vocal folds, but this conjecture is essentially impossible to substantiate by direct observation. We have developed a simple model of the vocal folds that estimates the vocal fold lateral contact area as a function of time. The EGG waveform is calculated directly as the reciprocal of this lateral contact area function. We show how the model may be used to simulate EGG waveforms for normal male voices in modal register and vocal fry, and to predict EGG waveforms for models of vibrating vocal folds that have a nodule or polyp on one fold.

1. Introduction

For some years we have been assessing electroglottography as a method for investigating the functioning of the vocal folds. For electroglottography to be useful to the clinician or speech researcher, the electroglottographic (EGG) waveform, or electroglottogram, must be related to a model of vocal fold vibratory behavior. The elementary descriptive EGG model suggested by Rothenberg (1981) has been updated by our group until we now believe that the segments of the EGG waveform are related to vocal fold vibratory events as depicted in Fig. 1. These conclusions are drawn from our observations of ultra-high speed laryngeal films synchronized with EGG and speech waveforms (Childers, Moore, Naik, Larar & Krishnamurthy, 1982; Childers, Naik, Larar, Krishnamurthy & Moore, 1983; Childers, Smith & Moore, 1984; Childers & Krishnamurthy, 1985).

The segments of the EGG waveform are most easily described starting with point 5 in Fig. 1. At this point the glottis has just closed, i.e. the glottal area has just become zero. For a normal male voice in modal register this typically occurs as the lower margins of the vocal folds initially come into contact. Contact initiates at the anterior of the folds and progresses to the posterior in a zipper-like manner, as shown in Fig. 1. We conjecture that (1) the lateral contact area of the vocal folds continues to increase from point 5 to 6 to 1 as the folds roll into contact inferiorly to superiorly; (2) the maximum vocal fold lateral contact area occurs at point 1 and is maintained until point 2; and (3) following point 2, the lower vocal fold margins begin to separate posteriorly to anteriorly in a zipper-like manner as seen in Fig. 1. The anterior-posterior angle for the separation of the vocal folds differs from that for closure. At point 3, the lower margins of the folds are separated and the upper margins are moving apart but lag the lower margins. The vocal folds are completely separated at point 4, i.e. the glottis has just opened. The folds continue to move laterally in the interval from point 4 to 5, achieving a maximum glottal opening in this interval. The folds then begin to move medially (close) again, with glottal closure occurring at point 5, and the cycle repeats.

Figure 1. Elementary descriptive EGG model with artistic rendition of a flexible, elastic one mass vocal fold model. Horizontal axis: time. Segments of the EGG waveform:
1-2 Vocal folds maximally closed; complete closure may not be obtained; flat portion idealized.
2-3 Folds parting, usually from lower margins toward upper margins.
3 When this break point is present, it usually corresponds to folds opening along upper margin.
3-4 Upper fold margins continue to open.
4-5 Folds apart, no lateral contact; idealized.
3-5 Open phase.
5 Glottal area zero; folds in contact along lower margin; idealized.
5-6 Folds closing from lower to upper margin.
6-1 Rapid increase in vocal fold contact.

The plausibility of the conjectures mentioned above is substantiated in this paper using our EGG model.

The argument that the EGG measures vocal fold contact area has been taken up by several researchers (Fourcin, 1981; Titze & Talkin, 1982; Baer, Löfqvist & McGarr, 1983; Baer, Titze & Yoshioka, 1983; Childers et al., 1983; Gilbert, Potter & Hoodin, 1984; Childers & Krishnamurthy, 1985). The argument is based on impedance concepts discussed more fully in Titze & Talkin (1981) and Childers, Hicks, Alsaka & Moore (1986).

0095-4470/86/030429+06 $03.00/0 © 1986 Academic Press Inc. (London) Ltd.


2. Simple triangular unitary mass model of the vocal folds

Our simple model is an extension of the two mass vocal fold articulatory speech synthesis model of Ishizaka & Flanagan (1972) and uses ideas similar to those of Titze, Baer, Cooper & Scherer (1983), Titze & Talkin (1981) and Titze (1984), as well as information from our own observations. High speed films of the vocal folds of males in modal register show that there exists a phase difference along the length of the vocal folds during their vibration, and therefore during the closing (opening) phase. During closure, contact between the folds first occurs over a small portion of their length. Closure continues, zipper-like, along the length of the folds until the glottis is closed. Similar behavior occurs during the opening phase. The angles of vocal fold closure and opening differ from one another. We can model this behavior as in Fig. 2 by creating an angle, θ, between the left and right vocal folds. Figure 2 is a top view of the vocal folds and a lateral section. The contact area is now proportional to L11. This is similar to the approach taken by Titze (1984).

In addition to the longitudinal vocal fold phase differences described above, a vertical (superior-inferior) phase difference between the upper and lower margins of the vocal folds has also been observed. An artistic rendition of the elastic, flexible unitary mass vocal fold model and its relation to EGG waveform events is depicted in Fig. 1. Note the vertical phase difference between the edges of each mass as well as the longitudinal phase difference. The latter phenomenon resembles the action of a zipper being closed and opened. The manner in which our model functions is similar to that shown in Fig. 2 for the flexible one mass model.

This model works as follows. The upper and lower glottal areas (AG2 and AG1, respectively) are calculated using the Flanagan-Ishizaka two mass vocal fold articulatory synthesis model. The displacements of these parallel masses of the vocal folds (both upper and lower) from the mid-sagittal line are calculated for a particular time instant, n, as

$$d_1(n) = \frac{\mathrm{AG1}(n)}{2L}, \qquad d_2(n) = \frac{\mathrm{AG2}(n)}{2L},$$

Figure 2. Triangular unitary mass vocal fold model, top and lateral views. This figure can be used for vocal fold opening or closing and for upper and lower margin displacements.

Figure 3. Simulated EGG, DEGG, and glottal area waveforms with θo = 1.0°, θc = 0.2°, and a 0.7 ms phase lag between upper and lower vocal fold margins (upper graphs). Measured glottal area and EGG waveforms (lower graphs). Horizontal axis: time (ms).

where n is a time index and L is the length of the vocal folds. These displacement values are used to position the upper and lower margins at the posterior ends of the vocal folds in the model in Fig. 2. This modified model has a triangular glottal area with an angle, θ, as shown. The phase difference between d_1(n) and d_2(n) is maintained along the complete length of the unitary model of the vocal folds. The vocal fold contact area is calculated using this triangular, phase shifted configuration. Several conditions may be specified in the computer program implementation of this model: (1) the folds are not in contact (no contact area); (2) the lower margins of the folds are in contact and the upper margins are not (lateral area of contact is triangular); (3) the lower margins are not in contact and the upper margins are (lateral area of contact is again triangular); and (4) both upper and lower vocal fold margins are in contact and possibly out of phase (lateral area of contact is trapezoidal, the condition shown in Fig. 2). With these conditions, the EGG waveform is specified as

$$\mathrm{EGG}(n) = \frac{k}{A(n) + C},$$

where n is the time index, A(n) is the lateral contact area, C is a constant proportional to the shunt impedance specified for the case when A(n) = 0, and k is a scaling constant.
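The four contact conditions and the EGG expression can be illustrated with a minimal Python sketch. It is not the authors' program, and several things in it are assumptions made only for illustration: the EGG expression is taken in the form EGG(n) = k / (A(n) + C), so that it stays finite when A(n) = 0; each margin is treated as a straight edge whose gap from the midline is d at the posterior end and shrinks by tan θ per unit length toward the anterior end; the gap is interpolated linearly between the lower and upper margins; and the fold length, fold depth, k, C, and the cosine stand-ins for AG1(n) and AG2(n) are placeholder values rather than output of the two mass model. All function names are hypothetical.

```python
import math

# Assumed geometry in cm (illustrative values, not taken from the paper).
FOLD_LENGTH = 1.4   # L, length of the vocal folds
FOLD_DEPTH = 0.3    # vertical distance between lower and upper margins


def displacement(glottal_area, length=FOLD_LENGTH):
    """d(n) = AG(n) / (2 L): displacement of a margin from the midline."""
    return glottal_area / (2.0 * length)


def contact_length(d, theta_deg, length=FOLD_LENGTH):
    """Length along the fold over which a tilted straight edge touches the midline.

    Assumption: the gap is d at the posterior end and decreases by tan(theta)
    per unit length toward the anterior end, so contact starts anteriorly.
    """
    if d <= 0.0:
        return length                      # edge at or past the midline everywhere
    slope = math.tan(math.radians(theta_deg))
    if slope <= 0.0:
        return 0.0                         # parallel edges that never meet
    return max(0.0, length - d / slope)


def contact_area(d_lower, d_upper, theta_deg,
                 length=FOLD_LENGTH, depth=FOLD_DEPTH, rows=200):
    """Lateral contact area, summing contact length over thin horizontal rows.

    The displacement of each row is interpolated linearly between the lower
    and upper margins (an assumption), which yields zero, triangular, or
    trapezoidal contact regions depending on which margins touch.
    """
    dz = depth / rows
    area = 0.0
    for i in range(rows):
        w = (i + 0.5) / rows               # 0 at lower margin, 1 at upper margin
        d = (1.0 - w) * d_lower + w * d_upper
        area += contact_length(d, theta_deg, length) * dz
    return area


def egg_sample(area, k=1.0, c=0.1):
    """EGG(n) = k / (A(n) + C); k and c are placeholder constants."""
    return k / (area + c)


# One cycle with cosine stand-ins for AG1(n) and AG2(n) (NOT two mass model output).
PERIOD_MS, LAG_MS = 8.0, 0.7               # assumed period; lag as in Fig. 3
THETA_OPEN, THETA_CLOSE = 1.0, 0.2         # degrees, as in Fig. 3
egg = []
for i in range(80):
    t = 0.1 * i                            # time in ms
    ag1 = 0.1 * math.cos(2 * math.pi * t / PERIOD_MS)             # lower margin
    ag2 = 0.1 * math.cos(2 * math.pi * (t - LAG_MS) / PERIOD_MS)  # upper margin lags
    theta = THETA_CLOSE if t < PERIOD_MS / 2 else THETA_OPEN
    area = contact_area(displacement(ag1), displacement(ag2), theta)
    egg.append(egg_sample(area))
```

Under these assumptions the simulated EGG is largest (k / C) when the folds are apart and smallest when both margins are in contact along their full length. Switching the angle between a closing and an opening value makes the two flanks of the waveform asymmetric, and increasing the lag lengthens the interval in which only one margin is in contact.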

Figure 4. Simulated EGG for various opening and closing angles and various lag differences between upper and lower vocal fold margins. Horizontal axis: time (ms).

Figure 5. Simulated EGG (upper graph) for a simulated nodule present on one vocal fold, θo = 2.0, θc = 0.5. Measured EGG (lower graph) for a nodule present on one vocal fold (time scale adjusted for illustrative purposes). Horizontal axis: time (ms).

The glottal area is calculated in the model using the projected triangular glottal area configuration, not the projected area given by AG1 and AG2 of the two mass vocal fold articulatory speech synthesis model.

3. Simulation results

To help orient the reader, the first example is illustrated in Fig. 3 for the following conditions: opening angle, θo = 1.0°, closing angle, θc = 0.2°, and a lag of 0.7 ms between the upper and lower vocal fold margins. These values have been found to simulate features of an actual EGG quite well, as can be seen by comparing the model with the measured waveform.

Figure 4 illustrates the effects of varying the opening and closing angles and changing the lag between the upper and lower margins of the vocal folds. Large lag values simulate vocal fry (pulse register) quite well.

Figure 5 compares an EGG waveform generated by simulating a nodule on one vocal fold with an EGG waveform measured from a patient with an actual nodule on one vocal fold. The flat hump in the rising EGG segment is where the folds are opening but the nodule remains in contact with the opposite fold. Our model can shift the location of this flat segment by moving the nodule either anteriorly or posteriorly. Further, the duration of the flat segment is controlled by the size of the nodule, i.e. its contact area.

This research was supported in part by NIH grant NS17078, NSF grant ECS-8116341 and the University of Florida, College of Engineering Center of Excellence Program for our Mind-Machine Interaction Research Center.

References

Baer, T., Löfqvist, A. & McGarr, N. S. (1983) Laryngeal vibrations: a comparison between high-speed filming and glottographic techniques, Journal of the Acoustical Society of America, 73(4), 1304-1308.
Baer, T., Titze, I. & Yoshioka, H. (1983) Multiple simultaneous measures of vocal fold activity. In Vocal fold physiology: contemporary research and clinical issues (D. M. Bless & J. H. Abbs, editors), pp. 229-237. San Diego, California: College-Hill Press.
Childers, D. G., Hicks, D. M., Alsaka, Y. A. & Moore, G. P. (1986) Modeling vocal fold vibrations and the electroglottographic waveform, Journal of the Acoustical Society of America, 80(5), in press.
Childers, D. G. & Krishnamurthy, A. K. (1985) A critical review of electroglottography, CRC Critical Reviews in Biomedical Engineering, 12, 131-161.
Childers, D. G., Smith, A. M. & Moore, G. P. (1984) Relationships between electroglottograph, speech, and vocal cord contact, Folia Phoniatrica, 36, 105-118.
Childers, D. G., Naik, J. M., Larar, J. N., Krishnamurthy, A. K. & Moore, G. P. (1983) Electroglottography, speech, and ultra-high speed cinematography. In Vocal fold physiology and biophysics of voice (I. R. Titze & R. Scherer, editors), pp. 202-220. Denver: The Denver Center for the Performing Arts.
Childers, D. G., Moore, G. P., Naik, J. M., Larar, J. N. & Krishnamurthy, A. K. (1982) Assessment of laryngeal function by simultaneous, synchronized measurement of speech, electroglottography and ultra-high speed film. In Transcripts of the eleventh symposium care of the professional voice (L. Van Lawrence, editor), pp. 234-44. The Juilliard School, New York; II, Medical/Surgical Sessions: Papers, The Voice Foundation.
Fourcin, A. J. (1981) Laryngographic assessment of phonatory function, Proceedings Conf. Assessment of Vocal Pathology (C. L. Ludlow & M. O. Hart, editors), pp. 88-96. ASHA Rep. No. 11. Rockville, MD: American Speech-Language and Hearing Association.
Gilbert, H. R., Potter, C. R. & Hoodin, R. (1984) Laryngograph as a measure of vocal fold contact area, Journal of Speech and Hearing Research, 27, 173-178.
Ishizaka, K. L. & Flanagan, J. L. (1972) Synthesis of voiced speech from a two mass model of the vocal cords, Bell System Technical Journal, 51, 1233-1268.
Rothenberg, M. (1981) Some relations between glottal air flow and vocal fold contact area, Proceedings Conf. Assessment of Vocal Pathology (C. L. Ludlow & M. O. Hart, editors), pp. 88-96. ASHA Rep. No. 11. Rockville, MD: American Speech-Language and Hearing Association.
Titze, I. R. (1984) Parameterization of the glottal area, glottal flow, and vocal fold contact area, Journal of the Acoustical Society of America, 75, 570-580.
Titze, I. R., Baer, T., Cooper, D. & Scherer, R. (1983) Automated extraction of glottographic waveform parameters and regression to acoustic and physiologic variables. In Vocal fold physiology: contemporary research and clinical issues (D. M. Bless & J. H. Abbs, editors), pp. 146-154. San Diego, California: College-Hill Press.
Titze, I. R. & Talkin, D. (1981) Simulation and interpretation of glottographic waveforms, Proceedings Conf. Assessment of Vocal Pathology (C. L. Ludlow & M. O. Hart, editors), pp. 48-55. ASHA Rep. No. 11. Rockville, MD: American Speech-Language and Hearing Association.