Airflow measurements: Theory and utility of findings




Journal of Voice

Vol. 7, No. 1, pp. 38-46 © 1993 Raven Press, Ltd., New York

Creighton J. Miller and Raymond Daniloff

Southeastern Louisiana University, Hammond, and Louisiana State University, Baton Rouge, Louisiana, U.S.A.

Summary: The biomechanical movements of vocal structures are shown to control the flow of expiratory air for the production of speech sounds. Laminar and turbulent air flow are discussed as they relate to sound generation. The theoretical and therapeutic utility of air flow profiles in differentiating linguistic contrasts and vocal characteristics is discussed, along with the relative utility and accuracy of various flow-measuring devices and strategies.

Key Words: Air flow; Aerodynamic impedance; Air pressure; Respiratory air flow; Pneumotachography.

Speech and singing are powered by an aerodynamic process in which articulatory and laryngeal modulations of air flow create the raw acoustic waves that are subsequently modified by vocal-tract resonance. Measurements of these air flow patterns offer a relatively noninvasive method for exploring the dynamics of speech production and also provide general pedagogical and medical information (1). For speech production, the vocal tract functions much like a brass instrument; the dynamic valving of air flow by lips against the mouthpiece (larynx) generates acoustic waves that resonate within the piping of the instrument (the speaker's vocal tract). Regardless of their location (e.g., lips, tongue, or vocal folds), constrictions within the vocal tract smaller in cross section than approximately 0.3-0.5 cm² introduce aerodynamic impedances that significantly restrict the expiratory flow of air during speech and may also act as acoustic sources, giving rise to significant levels of sound (1,2). Constrictions >0.5 cm² have only minimal effects on air flow through the vocal tract but by their location and size determine the acoustic impedances, and thus the resonance patterns, of the vocal tract.

Address correspondence and reprint requests to Dr. C. J. Miller at Department of Special Education, Box 879, SLU Station, Southeastern Louisiana University, Hammond, LA 70402-0879, U.S.A.

GENERAL CONSIDERATIONS

The ultimate source of energy for speech production is compressed air. A combination of elastic recoil forces and respiratory muscle forces (both inspiratory and expiratory muscles may contribute opposing forces) compresses the thorax, driving a copious outward flow of compressed air. The potential energy of this pressurized expiratory stream of air is partially converted into the kinetic energy of an acoustic vibration through one or more of the following processes.

1. Transience: the abrupt onset of substantial air flow through the vocal tract after the rapid opening of an articulatory closure blocking the expiratory air stream. This is the acoustic "burst" or "explosion" characteristic of stop-plosive sounds.
2. Frication: aerodynamic turbulence caused by relatively high rates of air flow through an articulatory constriction. Although unimpeded air flow through the vocal tract is essentially uniform, or laminar, when the jet of compressed air encounters a constriction, turbulent eddies are generated at the downstream orifice of the constriction. The "Reynolds number," a relationship among (a) volume velocity of flow, (b) viscosity of the medium (air), and (c) the dimensions of the constriction, determines the critical point at which significant aerodynamic turbulence (and thus acoustic noise) will occur (2).
3. Phonation: quasiperiodic sound waves that arise when the expiratory air stream forces the vocal folds through cycles of opening and closing; the resulting train of air bursts causes the train of acoustic transients characteristic of "voicing." This phenomenon arises in a manner similar to the buzz of the lips during production of a "raspberry."

AIR FLOW PARAMETERS

The long-term air flow pattern for continuous speech "ripples" in a series of alternating peaks and troughs. Maximum air flows (peaks) occur when the vocal tract is most open (e.g., for vowels and approximant sounds) and minimum air flows (troughs) occur as the tract is fully closed for stop consonants. These correspondences are the direct result of well-known processes that govern "fluid dynamics" in any closed system. Generally stated, for laminar flow, the driving pressure, P, is related to the volume velocity of fluid flow, U, and total system impedance, Z, by the following relation:

$$Z\ (\mathrm{dyne \cdot s/cm^5}) = \frac{P\ (\mathrm{dyne/cm^2})}{U\ (\mathrm{cm^3/s})}$$

Expressed for vocal-tract aerodynamics, this equation relates the driving pressure of expiratory force (as measured subglottally, in the trachea), Psg, and air flow through the vocal tract (the total of that seen at lips and nares), Uvt, to overall vocal-tract impedance, Zvt (due to all constrictions):

$$Z_{vt} = \frac{P_{sg}}{U_{vt}}$$
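As a worked illustration of the relation above, the following minimal Python sketch converts a driving pressure given in cm H2O and a flow given in L/s into the CGS units of the equation and returns the impedance. The function name and the numerical values (8 cm H2O, 0.15 L/s, 0.9 L/s) are illustrative assumptions, not measurements reported in this article.

```python
# Minimal sketch of Zvt = Psg / Uvt in CGS units.
# All names and sample values below are illustrative assumptions.

DYNE_PER_CM2_PER_CMH2O = 980.665  # 1 cm H2O expressed in dyne/cm^2
CM3_PER_LITRE = 1000.0


def vocal_tract_impedance(psg_cmh2o: float, uvt_l_per_s: float) -> float:
    """Return Zvt in dyne*s/cm^5 for a pressure in cm H2O and a flow in L/s."""
    if uvt_l_per_s <= 0:
        # Complete closure: no flow, so impedance is functionally infinite.
        return float("inf")
    pressure = psg_cmh2o * DYNE_PER_CM2_PER_CMH2O   # dyne/cm^2
    flow = uvt_l_per_s * CM3_PER_LITRE              # cm^3/s
    return pressure / flow


# Same assumed driving pressure, three assumed flow conditions.
z_vowel = vocal_tract_impedance(8.0, 0.15)    # moderate flow
z_burst = vocal_tract_impedance(8.0, 0.90)    # high flow at a stop release
z_closure = vocal_tract_impedance(8.0, 0.0)   # stop closure, no flow

print(f"vowel:   {z_vowel:.1f} dyne*s/cm^5")
print(f"burst:   {z_burst:.1f} dyne*s/cm^5")
print(f"closure: {z_closure}")
```

With pressure held fixed, the higher flow corresponds to the lower impedance, which is the inverse relation the text relies on.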

DRIVING PRESSURE

As is obvious from the above equations, when Psg is constant, Uvt varies inversely with Zvt. Therefore, for cases in which Psg does not change significantly, the reciprocal of the monitored expiratory air flow pattern is proportional to Zvt. Air pressures and flows observed during quiet expiration are the result of elastic recoil forces in the pulmonary system acting through the relatively unconstricted airways. Because these recoil forces vary directly with lung volume, Psg varies similarly. During speech and singing, combinations of inspiratory and/or expiratory muscle activity can be used to provide a relatively stable driving pressure "platform" for sound generation despite the decline of recoil pressure as lung volume decreases. This relative stabilization of Psg allows inference of Zvt as the reciprocal of Uvt.

A complicating factor is introduced by the suprasegmental properties of stress and emphasis (often referred to as prosody). These are largely implemented by increased "respiratory effort," creating transient increases in respiratory driving pressure. The increased Psg causes (a) higher intensity for sound generated at any vocal-tract constriction, and/or (b) higher fundamental frequency for voiced sounds. Because the expiratory driving pressure (Psg) is only approximately constant across relatively short intervals during normal speech, precise evaluations of Zvt from speech air flow patterns must incorporate simultaneous measures of Psg. Netsell and Hixon (3) have described a simple yet effective technique for measuring Psg by monitoring intraoral pressure.

Because the unconstricted vocal tract offers virtually no impedance to laminar direct current (DC) air flow, Zvt is the algebraic sum of impedances arising from laryngeal constrictions (Zlar) and supralaryngeal, articulatory constrictions (Zart), i.e.,

$$Z_{vt} = Z_{lar} + Z_{art}$$

For voiceless consonants Zart is large, whereas Zlar is effectively 0, because the vocal folds are motionless and held in an open position. In the extreme case of voiceless bilabial stops, Zvt is functionally infinite, i.e., air flow past the constriction is reduced to 0 during articulatory closure. During a sufficiently long (>15-20 ms) labial closure, air pressure measured behind the bilabial constriction approximates Psg. Netsell and Hixon (3) therefore use oral air pressures during the production of voiceless consonants within the stream of continuous speech to provide a running estimate of Psg in adjacent sounds/syllables.

For vowels and most voiced approximant consonants, the vocal tract is sufficiently open that Zart is approximately 0, while laryngeal vibration for phonation ensures a significant Zlar. Variations in Uvt during the production of these sounds are directly attributable to laryngeal control and/or to changes in Psg (the latter partially subject to observation as described above). During the production of voiced obstruent consonants, both laryngeal and articulatory constrictions contribute to the combined vocal impedance, Zvt.

In all of these cases the relative contributions of Zart, Zlar, and Psg to the observable Uvt can only be differentiated by considering phonetic information, simultaneously recorded acoustic data, supportive articulatory information, etc. One exception occurs if air flow from the nares is measured independently of air flow from the oral cavity (i.e., Uvt = Uoral + Unasal): in this case, nasal versus nonnasal distinctions and functional evaluations of velopharyngeal competence are possible solely on the basis of the air flow records.

MEASUREMENT PRINCIPLES

The manner in which air flow measures should be derived depends heavily on the purposes to which the data are to be applied. For example, the range of air flows and the time frame over which oscillations occur from minimum to maximum values are measures best applicable to the long-term flow characteristics of continuous speech. When studying air flow during more "static" activities (e.g., sustained vocalization or production of a particular fricative sound), a simple average flow rate is appropriate. The examples listed below describe the more common methods for relating air flow measures to speech acts of general interest.

During silent breathing, the twin halves of the respiratory cycle, inhalation and exhalation, have comparable durations. During speech production, the impediment to expiratory air flow caused by vocal-tract constrictions greatly prolongs the exhalation phase relative to that for inhalation. Trained speakers and singers learn to optimize combined vocal-tract impedances and thus minimize air flows, greatly prolonging vocalization on a single expiratory breath.

Across the time course of sustained productions for continuant, or "static," speech sounds, a smooth, steady air flow pattern implies a constancy of the valving mechanism(s) involved, and thus, appropriately coordinated control of the mechanisms for speech production. In contrast, major fluctuations in the air flow pattern, untoward or unusual ripples, etc., imply an unsteadiness of control.
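The contrast between a steady and a fluctuating sustained-flow record can be summarized with a mean, a standard deviation, and their ratio. The sketch below is one simple way to do this; the function name, the use of the coefficient of variation as the steadiness index, and both sample traces are illustrative assumptions rather than a procedure taken from the cited sources.

```python
# Minimal sketch: summarize the steadiness of a sustained-flow record.
# The sample traces are invented for illustration (units: ml/s).
from statistics import mean, stdev


def flow_steadiness(flow_ml_s):
    """Return (mean, SD, coefficient of variation) of a flow trace."""
    m = mean(flow_ml_s)
    sd = stdev(flow_ml_s)
    return m, sd, sd / m


steady_trace = [150, 152, 149, 151, 150, 148, 150]
unsteady_trace = [150, 210, 95, 180, 120, 230, 65]

steady_mean, steady_sd, steady_cv = flow_steadiness(steady_trace)
unsteady_mean, unsteady_sd, unsteady_cv = flow_steadiness(unsteady_trace)

print(f"steady:   mean={steady_mean:.1f} ml/s, CV={steady_cv:.3f}")
print(f"unsteady: mean={unsteady_mean:.1f} ml/s, CV={unsteady_cv:.3f}")
```

Both traces have the same mean flow, so only the variability measure separates them; this is why the text pairs flow magnitude with variance.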
Thus, for example, a speaker's ability to appropriately monitor and maintain glottal impedance during voicing may be evaluated by measuring air flow magnitude and variance during a sustained vowel at any preselected vocal register, vocal frequency, and vocal effort level (4). If the vowel vocalization is prolonged to the limits of the subject's abilities, the task is then a maximum performance test (5), and reflects standard engineering practice of testing systems to the limits of rated performance. Similar evaluations of articulatory control and/or of laryngeal-articulatory coordination may be derived from air flow measurements taken during the production of appropriate continuant consonants.

Characteristic air flows (and thereby, vocal-tract impedances) can also be derived for each phoneme of a language. As noted above, the articulation of continuant sounds such as vowels, fricatives, nasals, etc., can be steadily sustained with a single unchanging set of vocal-tract constrictions, yielding a single, steady, and easily measured air flow pattern that may then be represented as a specific volume velocity value. Stops, affricates, and other "transient" consonants are produced with dynamic articulation and/or voicing, and have correspondingly dynamic patterns for both air flow and vocal-tract impedances. Accordingly, the latter sounds are often better represented by characteristic air flow patterns than by specific values of volume velocity.

The aforementioned dynamic air flow patterns with transients highlight a major consideration in air flow measurements for speech that is related to the broader question of the "normalcy" of speech and speech measures. Although recognizable and acceptable forms of most speech sounds can be produced either in isolation, within syllables or words, or within phrases, virtually any measure taken from the "same" sound in these differing contexts may itself also be different in each case. Speech sounds produced in isolation exhibit neither normal durations nor the coarticulatory-allophonic variations common to continuous speaking. Indeed, many sounds such as stops or glides are characterized by transitions into a neighboring sound, so that their production in isolation is distinctly unnatural. The minimum "natural" environment for measuring air flow for speech is the CV syllable (consonant-vowel syllable), because within the syllable, coarticulation and normal sound durations will tend to emerge (6).
If the influences of rhythm, rate, and intonation on phoneme articulation are to be realistically sampled, air flow measures for speech sounds are best made within utterances that are structured as short "carrier" phrases. Conversational speech and singing generally consist of streams of syllables produced at the rate of 3-4/s. The aforementioned "peaks and troughs" of air flow for continuous speech occur at this 3-4/s rate (a period of about 250-330 ms) in correspondence with the rate of syllable production. In the CVCV-type stream of sounds, the articulators move rhythmically from the "target" position for one sound toward the target position for the next, never actually achieving or maintaining the idealized postures seen for either phoneme in isolation.

The stream of articulations in flowing speech consists of a succession of articulatory targets (TA, TB, TC) with transition movements (ab, bc, cd) sandwiched between them: TA ab TB bc TC. The transitions reflect the combined movement of the articulators and voicing from the preceding target sound toward that of the oncoming target sound; each transition reflects the merged influence of the adjacent target sounds. Each target is merely approximated, very briefly, and then followed by a rapid transitional movement of the articulators toward the next target. The distance that each articulator must move between adjacent targets interacts with speaking rate to determine the extent to which targets are achieved (hit) or unachieved (missed). Rapid shifts in air flow often occur at a transition because of the abrupt shift from one manner of articulation to another or because of a change in voicing during the transition.

Peak and trough air flows occur when oropharyngeal pressure, size of articulatory constriction, and glottal opening jointly maximize (peak flow) or minimize (minimum flow) the P/Z ratio. The fine structure¹ of the air flow profile depends not only on driving pressure, constriction size, and timing of articulator movement, but also on the mechanical properties of oropharyngeal tissue, the size and shape of oropharyngeal cavities, and the biomechanics of release or formation of articulatory constrictions (7). Additional research in speech physiology is needed to generate predictive models of vocal-tract aerodynamics, including prediction of air flow and air pressure profiles. A satisfactory aerobiomechanical model of laryngeal and supralaryngeal function will enable the researcher and clinician to use aerodynamic parameters such as flow rate to infer biomechanical abnormalities of function.
For example, dynamic flow profiles offer the opportunity to measure how briskly articulatory constrictions are achieved and released, how long they are held, how early or late voicing or devoicing occurs, etc.
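The kind of timing information mentioned above (when peaks and troughs occur, how far apart they are, how quickly flow rises) can be read from a sampled flow trace with very little code. The sketch below uses a synthetic cosine-shaped trace at 4 syllables per second as a stand-in for a real recording; the sampling rate, flow amplitudes, and function name are all assumptions made for illustration.

```python
# Minimal sketch: locate peaks and troughs in a sampled flow trace and
# derive simple timing measures. The trace is synthetic, not measured data.
import math


def local_extrema(flow):
    """Return (peak_indices, trough_indices) among interior samples."""
    peaks, troughs = [], []
    for i in range(1, len(flow) - 1):
        if flow[i - 1] < flow[i] >= flow[i + 1]:
            peaks.append(i)
        elif flow[i - 1] > flow[i] <= flow[i + 1]:
            troughs.append(i)
    return peaks, troughs


FS = 200           # assumed sampling rate (Hz)
SYLLABLE_RATE = 4  # syllables per second, upper end of the 3-4/s range

# One second of flow (L/s): troughs at closures, peaks at open vowels.
flow = [0.30 - 0.25 * math.cos(2 * math.pi * SYLLABLE_RATE * i / FS)
        for i in range(FS)]

peaks, troughs = local_extrema(flow)

# Time between successive flow peaks, in milliseconds.
peak_intervals_ms = [(b - a) * 1000 / FS for a, b in zip(peaks, peaks[1:])]

# Steepest sample-to-sample rise in flow, in L/s per second.
max_rise = max((flow[i + 1] - flow[i]) * FS for i in range(len(flow) - 1))

print("peak samples:", peaks)
print("trough samples:", troughs)
print("peak-to-peak intervals (ms):", peak_intervals_ms)
print(f"steepest rise: {max_rise:.2f} L/s per s")
```

At 4 syllables per second the peak-to-peak interval comes out at 250 ms, matching the period quoted for the syllable-rate ripple; real traces would need smoothing before this kind of simple neighbor comparison is reliable.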

CHARACTERISTIC AIR FLOW VALUES

Table 1 [from Baken (4)] presents characteristic peak (maximum) air flow rates achieved during the production of selected English consonants, as measured in prevocalic (CV) and postvocalic (VC) syllables. Very often, the peak flow is not maintained throughout the sound, but shows a modest declination as constriction and/or voicing achieve constancy toward the middle or end of the sound. Note that relatively low air flow magnitudes occur for the approximant consonants /m, n, l, r/, their principal vocal-tract impedance being that of the vibrating larynx. Voiced obstruents /z, ð, v/ present air flow rates of similar magnitude to those for the approximants, but their voiceless cognates /s, θ, f/ have air flow rates of double or even triple magnitude, suggesting that the fricative slit alone generally offers much less impedance than does the vibrating larynx. Despite the fact that they exhibit zero translabial flow during closure, stop consonants give rise to the highest peak air flows, due to the explosive air burst that escapes the oral cavity at articulatory release. Furthermore, maximum achieved air flows during voiceless stop sounds are almost double those for voiced stops; the delay in voicing onset for the former means that at the moment of release, air flow is completely unimpeded by the open larynx.

TABLE 1. Peak airflow during the production of selected English consonants (four male and four female normal adults; "normal vocal effort"; pitch and loudness uncontrolled; single production of each syllable; pneumotachograph system). Flow in L/s.

Phonemes (as listed in the source; the list is incomplete): p, b, t, d, k, g, f, v, θ, s, z, tʃ, dʒ, m, n, l, r

| CV syllables, Mean | CV syllables, SD | VC syllables, Mean | VC syllables, SD |
|---|---|---|---|
| 0.933 | 0.065 | 1.019 | 0.057 |
| 0.472 | 0.159 | 0.675 | 0.229 |
| 0.968 | 0.136 | 0.821 | 0.157 |
| 0.410 | 0.147 | 0.481 | 0.259 |
| 0.940 | 0.159 | 0.882 | 0.226 |
| 0.372 | 0.103 | 0.455 | 0.263 |
| 0.352 | 0.073 | 0.525 | 0.113 |
| 0.095 | 0.059 | 0.338 | 0.108 |
| 0.652 | 0.194 | 0.869 | 0.230 |
| 0.126 | 0.077 | 0.365 | 0.124 |
| 0.466 | 0.053 | 0.455 | 0.254 |
| 0.583 | 0.128 | 0.479 | 0.108 |
| 0.583 | 0.128 | 0.479 | 0.108 |
| 0.249 | 0.039 | 0.303 | 0.119 |
| 0.881 | 0.106 | 0.583 | 0.183 |
| 0.525 | 0.119 | 0.424 | 0.101 |
| 0.168 | 0.080 | 0.287 | 0.145 |
| 0.155 | 0.064 | 0.244 | 0.113 |
| 0.133 | 0.072 | 0.213 | 0.108 |
| 0.143 | 0.047 | 0.132 | 0.084 |

Source: ref. 16.

¹ By fine structure we mean the temporal patterning of minima and maxima in the air flow profile and first/second derivatives thereof, as they relate to successive vocal-tract gestures. These data represent the dynamic parameters of the aeromechanical system.

CLINICAL UTILITY OF FLOW MEASUREMENTS

Sustained phonation

Mean air flow rates for sustained phonation of vowels provide estimates of glottal impedance, and hence, glottal integrity. Considerable normative data exist for air flow rates under these conditions (4,5). In particular, variations in air flow as a function of vocal register, vocal effort, vocal frequency, age, sex, etc., have been investigated in detail. Table 2 [from Baken (4)] exemplifies the complex interrelations among these variables for sustained vowel productions by normal female speakers. Table 3 [from Baken (4)] presents similar data for pre- and postremediation air flow norms for speakers with various vocal pathologies. It should be noted that when assessing air flow as a function of vocal effort, fundamental frequency, voice register, etc., care must be taken to provide the singer/speaker with sufficient practice and acoustic models to be imitated (5).

In general, laryngeal pathology/dysfunction is indicated by excessive air flow rates (incomplete closure of the vocal folds) or by highly variable air flow rates during the sustained phonation (imprecise laryngeal control). The implication of such findings is an abnormally restricted range of achievable vocal frequencies and intensities, leading to limited speech production abilities for the affected individual.

Similar evaluations of laryngeal and/or articulatory control can also be achieved by monitoring speakers' abilities to sustain greatly prolonged continuant consonants. Much less normative data are generally available for air flow during sustained consonantal forms; however, a common clinical estimate of laryngeal impedance control compares the speaker's ability to sustain prolonged /s/ and /z/ sounds. In general, the maximum sustainable duration of /z/ should be nearly double that of /s/. Failure to achieve this ratio implies reduced laryngeal control and/or vocal pathology (4).

At this point, it should be reemphasized that air flow measures should routinely be allied with simultaneous acoustic, glottographic, and other articulatory recordings. Netsell and colleagues (8) offer a model protocol for diagnosis of air flow-based disorders that uses a wide variety of additional measures of articulatory function to supplement the interpretation of air flow data.

TABLE 2. Mean airflow in the loft register: females. Mean air flow (ml/s) by intensity level (rows, % intervals) and pitch level (columns, % intervals).

| Intensity level | Pitch 10 | Pitch 20 | Pitch 30 | Pitch 40 | Pitch 50 | Pitch 60 | Pitch 70 | Pitch 80 | Pitch 90 | Mean for intensity level |
|---|---|---|---|---|---|---|---|---|---|---|
| 10 | 92.4 | 96.9 | 91.7 | 87.8 | 100.8 | 132.8 | 130.7 | 121.9 | 131.8 | 109.6 |
| 20 | 121.1 | 120.6 | 114.4 | 123.4 | 133.9 | 149.2 | 154.4 | 151.8 | 158.6 | 136.4 |
| 30 | 162.0 | 158.6 | 138.0 | 154.2 | 144.3 | 165.6 | 167.2 | 159.1 | 180.2 | 158.8 |
| 40 | 185.7 | 187.8 | 177.3 | 164.6 | 148.7 | 183.3 | 169.5 | 173.9 | 186.1 | 175.2 |
| 50 | 208.9 | 200.5 | 174.2 | 203.1 | 179.4 | 184.1 | 198.4 | 182.0 | 188.5 | 191.0 |
| 60 | 207.5 | 217.4 | 183.9 | 216.4 | 188.8 | 200.5 | 227.8 | 190.1 | 188.8 | 202.4 |
| 70 | 242.4 | 232.0 | 194.3 | 224.5 | 205.5 | 220.6 | 229.7 | 215.2 | 206.2 | 218.9 |
| 80 | 239.8 | 240.4 | 211.5 | 246.4 | 202.6 | 237.8 | 242.2 | 223.1 | 216.4 | 228.9 |
| 90 | 266.4 | 269.7 | 223.9 | 249.0 | 198.4 | 232.9 | 268.7 | 213.3 | 224.0 | 236.3 |
| Mean for pitch level | 191.8 | 189.8 | 167.7 | 185.5 | 166.9 | 189.6 | 198.7 | 181.2 | 186.7 | |

Source: ref. 17.

TABLE 3. Mean airflow during sustained phonation: laryngeal pathology. Airflow in ml/s.

Conditions (as listed in the source): uncompensated unilateral paralysis; polyps; carcinoma; chronic laryngitis; vocal nodule; minor inflammation; spasmodic dysphonia.

Untreated (values in source order):

| N | Mean | SD |
|---|---|---|
| 10 | 442.2 | 204.3 |
| 11 | 312.8 | 154.9 |
| 4 | 478.0 | 182.3 |
| 10 | 218.5 | 91.9 |
| 4 | 236.5 | 26.9 |
| 5 | 224.0 | 171.5 |
| 5 | 212.4 | 79.9 |
| 8 | 153.9 | 35.5 |
| 8 | 160.9 | 66.4 |
| 11 | 133.8 | 29.2 |
| 8 | 110.4 | 49.7 |

Posttreatment:

| Treatment | N | Mean | SD | Source |
|---|---|---|---|---|
| Teflon inj. | 4 | 200.5 | 34.7 | |
| Teflon inj. | 3 | 147.7 | 46.4 | |
| Radiation or surgery | 3 | 109.3 | 6.5 | b |

Means and SDs compiled from individual case data presented in two sources. a: Ref. 18. b: Ref. 19.

Continuous speech

Articulatory integrity as reflected by the steadiness and tightness of oral constriction for fricative consonants can be estimated from mean sustained flow values, but it is clear that defective articulatory control is most obvious in the flow profiles for continuous speech (4). With its steady stream of sounds, natural speech requires exquisitely timed articulatory gestures that are mirrored by a highly complex air flow pattern. Air flow trajectories for continuous speech are chunked into breath groups (9), so named because their organization depends primarily on respiratory dynamics. Each breath group consists of a single phonation-filled expiratory outflow, both preceded and followed by inhalations that replenish the air supply. Although the breath group may be punctuated by intentional pauses or prolongations for expressive purposes, the general pattern is very characteristic: expiratory flow rises rapidly to a plateau (or very gradual decline), which is maintained for a substantial portion of the entire breath group. Progressively declining lung volume and utterance-final relaxation of effort eventually lead to a rapid drop in the air flow rate just before termination of the epoch at a duration of 3 or 4 s (marked by inhalation). The breath group has long been known to underlie the sentential/phrasal and intonational organization of speech as well as the "vocal effort" underlying loudness control.

Internal to the breath group structure of continuous speech, the rates of change and overall durations for peak and trough events in the air flow pattern, their relative timing, and trends in the overall structure of sentential/phrasal air flow profiles all depend directly on dynamically changing vocal-tract impedance patterns. These become particularly complex for intervocalic consonants and consonant clusters, which present the greatest challenges to speakers' powers of articulatory control. Continuous speech flow profiles can be averaged over various time windows to yield estimates of the average combined glottal and supraglottal impedance. Such estimates (mean, SD) could be used to reflect on the general glottal/pulmonic integrity of the speaking voice provided that vocal effort was controlled (4).

Continuous speech air flow patterns display a frequency spectrum with peaks in the 0-2-Hz range, reflecting the sequencing of breath groups. Spectral peaks in the 4-6-Hz range reflect the syllable rate for conversational speech. At the upper frequency limits (60-400 Hz) there is a peak representing the fundamental of the volume velocity waveform, Ug. Spectral analysis of air flow data has not been explored in much detail, despite the fact that applying such methods with carefully designed speech samples could yield very useful diagnostic and biomechanical implications.

AIR FLOW TRANSDUCERS
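The spectral structure described at the end of the preceding section (breath groups below about 2 Hz, syllables at a few hertz, the glottal fundamental at 60-400 Hz) also indicates the bandwidth a flow transducer has to cover. The sketch below is a minimal illustration of such an analysis on a synthetic flow signal, using a direct discrete Fourier transform from the Python standard library. The signal components (a 4-s breath-group cycle and a 4-Hz syllable ripple), the sampling rate, and the function names are assumptions chosen for the example; the glottal band is omitted because it would require a much higher sampling rate.

```python
# Minimal sketch: find the dominant low-frequency components of a
# synthetic continuous-speech flow signal with a direct DFT.
import cmath
import math


def dft_magnitude(x, k):
    """Magnitude of DFT bin k of sequence x, normalized by length."""
    n = len(x)
    total = sum(v * cmath.exp(-2j * math.pi * k * i / n)
                for i, v in enumerate(x))
    return abs(total) / n


def dominant_frequency(x, fs, f_lo, f_hi):
    """Frequency (Hz) of the largest DFT bin between f_lo and f_hi."""
    n = len(x)
    bins = [k for k in range(1, n // 2) if f_lo <= k * fs / n <= f_hi]
    best = max(bins, key=lambda k: dft_magnitude(x, k))
    return best * fs / n


FS = 50         # assumed sampling rate (Hz)
DURATION = 8    # seconds of signal
N = FS * DURATION

# Synthetic flow (L/s): mean level + breath-group cycle + syllable ripple.
flow = [0.20
        + 0.10 * math.sin(2 * math.pi * 0.25 * i / FS)   # 4-s breath groups
        + 0.05 * math.sin(2 * math.pi * 4.0 * i / FS)    # 4 syllables/s
        for i in range(N)]

breath_hz = dominant_frequency(flow, FS, 0.1, 2.0)
syllable_hz = dominant_frequency(flow, FS, 3.0, 8.0)

print(f"breath-group component: {breath_hz} Hz")
print(f"syllable component:     {syllable_hz} Hz")
```

The direct DFT is used only to keep the example dependency-free; for recordings of realistic length an FFT routine would be the usual choice.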

Pneumotachograph

Transducers capable of faithfully measuring air flow fluctuations for speech must have a frequency response of at least several hundred hertz, and >1,000 Hz if the glottal flow wave is to be faithfully reproduced (10). The ideal device would have (a) a wide frequency response, (b) input:output linearity over a wide range of air flow rates, (c) high-output signal levels, and (d) minimal aerodynamic, acoustic, and mechanical interference (e.g., encumbrance to the face of the speaker). The last requirement is very nearly intractable for many transducer systems: to effectively monitor air flow, transducers must sample the entire oronasal output. This is normally accomplished by directing all speech air flow into a mask system to which the transducers are attached. Unless the mask is circumferential to the head, it will press on the face, mechanically loading the lips and jaw. If the mask is circumferential (e.g., a helmet-type mask), facial movement is minimally impeded; however, the relatively large volume of air within the helmet can cause resonances and compliant loading, which may alter the acoustic characteristics of the subject's voice. Even more importantly, relatively large mask volumes can trap "dead" air spaces that are not fully circulated (and thus oxygenated) with outside air during the speaking tasks. This oxygen-deficient air can load the speaker's respiratory system, forcing progressively faster and deeper breathing unless the helmet is steadily flushed with fresh air by mechanical means.

The most commonly employed transducer system for monitoring speech air flow consists of a small aerodynamic impedance (either a fine-mesh screen or a bundle of very thin tubes) through which the entire breath stream is directed. As air flows through the device, a pressure differential, or "pressure drop," occurs across the resistive element. The magnitude of the pressure drop is directly and linearly related to volume flow, i.e., P ∝ U. A differential pressure transducer to sense the pressure drop, plus a heating element to prevent condensation (which would otherwise alter the resistance of the device), completes the "pneumotachographic" transducer ensemble. This system has all the potential hazards described above for mask-based devices; furthermore, if the screen mesh is large enough to minimize impedance loading of the airway, only very small pressure changes accompany air flow rates like those seen in normal speech.

The Rothenberg (10) circumferentially vented screen-mask pneumotachographic system is an excellent compromise in terms of backloading, encumbrance, and aerodynamic fidelity. It can be partitioned at the upper lip to separate oral and nasal flow streams. Further advantages of pneumotachographic systems include faithful recording of direction of air flow, and relative insensitivity to turbulence in the breath stream, which might otherwise cause gross inaccuracy. Once turbulence occurs, the simple linear relationship of Psg, Uvt, and Zvt becomes complicated, depending on second- and third-order flow derivatives (11).

Hot-wire anemometer

The hot-wire anemometer flow transducer is predicated on the flow of the breath cooling a wire heated by an electrical current; the cooling lowers the wire's resistance and engenders a reciprocal rise in current flow. Thus, as linear flow rate increases, resistance falls, and current flow increases in the anemometer circuit. Unfortunately, the relationship between flow and cooling is highly nonlinear, and must be compensated for electronically. A single wire cannot sense direction of flow, and turbulence in the breath stream causes further nonlinearities in system response. Also, unless ultrathin, the diameter of the wire limits frequency response. Finally, the anemometer senses only linear flow velocity, not volume velocity. Use of several ultrathin wires and electronic compensation circuitry has recently improved frequency response and linearity to the point of competition with pneumotachography (4,12). Turbulent air flow, common for consonants and also present with some approximants, still plagues this system. Furthermore, any mouth fluid touching the hot wire can "caramelize," coating the wire and distorting readings (13), but because only a few fine wires need to protrude into the air stream (interfering only minimally with free breathing), further development is warranted.

Body plethysmograph

The body plethysmograph (airtight chamber) is a method for measuring speech air flows that avoids significantly encumbering the speaker while providing a relatively wide frequency response. In the standard device, the speaker is positioned within a rigid, airtight box with his/her head protruding through a neck seal. A screen-and-pressure-transducer pneumotachograph is mounted on the wall of the box as the only port through which air may pass into and out of the plethysmograph. The speaker's thorax expands during inhalation and, conversely, progressively deflates as breath is expelled, thus causing pressure changes within the box that are equalized by air flow through the port formed by the pneumotachograph system. At moderate speaking levels, the directionality, magnitudes, and temporal patterning for air flows at the pneumotachograph port will be directly proportional to those that occur in the speaker's vocal tract.
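The proportionality just described assumes that changes in chest volume come from air leaving or entering the lungs. A rough sense of how much volume change gas compression alone could contribute can be had from Boyle's law. The sketch below is an order-of-magnitude illustration only: it assumes isothermal compression, a lung volume of 3 L, and two lung pressures (8 and 40 cm H2O) picked as stand-ins for conversational and high-effort speech. None of these values, nor the function name, comes from the article.

```python
# Minimal sketch: thoracic volume change due to gas compression alone
# (isothermal, Boyle's law), with no air leaving the lungs.
# Lung volume and pressures are assumed, illustrative values.

ATM_CMH2O = 1033.2  # approximate standard atmospheric pressure in cm H2O


def compression_volume_ml(lung_volume_ml: float,
                          lung_pressure_cmh2o: float) -> float:
    """Volume reduction (ml) when lung gas is compressed from atmospheric
    pressure to atmospheric pressure plus lung_pressure_cmh2o.

    From P1*V1 = P2*V2:  dV = V1 * p / (Patm + p).
    """
    return lung_volume_ml * lung_pressure_cmh2o / (ATM_CMH2O + lung_pressure_cmh2o)


conversational = compression_volume_ml(3000.0, 8.0)
high_effort = compression_volume_ml(3000.0, 40.0)

print(f"conversational effort: {conversational:.1f} ml")
print(f"high effort:           {high_effort:.1f} ml")
```

Under these assumptions the compression component grows roughly in proportion to lung pressure, so it matters most at high vocal effort.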
When the speaker uses significant vocal effort (high lung pressures), the system may lose some accuracy because at these levels, reductions in the speaker's thoracic/abdominal dimensions may result either from expiratory air flow or from the muscular compression of lung tissue and air itself. If moderate lung pressures and volumes typical of conversational speech are used, the plethysmograph is a reasonably accurate, minimally encumbering, acoustically ideal flow-measuring system. Disadvantages arise from the facts that plethysmographs are both cumbersome (thus lacking portability) and expensive to build and operate.

If the body plethysmograph is used to evaluate relatively high-frequency glottal flow waveforms (e.g., 100-400 Hz), the long time constant (substantial capacitance) of the large air volume in the box precludes accurate measurement of the rapidly changing air flow patterns. Much more accurate air flow measurements can be achieved with a variation on the plethysmograph that encompasses the subject's head rather than the subject's body. With this approach the dome must be flushed with a steady (DC) stream of fresh, oxygenated air: the time-variant (AC) air flow pattern of the subject's speech is superimposed on the DC air flow through the helmet and, due to the small volume and much shorter time constant, may be accurately measured at frequencies like those of the glottal waveform. As is the case with pneumotachograph masks, the helmet plethysmograph distorts speech at frequencies above the fundamental, thus impeding detailed acoustic analyses.

Electroaerometer

Another measurement system of venerable lineage is the electroaerometer. This device consists of a mask that gathers the oral/nasal flow; the air stream forces open a low-mass valve, whose degree of opening is proportional to the volume velocity. The degree of valve opening is measured via a photocell system. The system demands ultra-low-mass valves on low-resistance bearings. It still has frequency-response limitations because of inertia, and it radically alters the speech signal acoustically during flow measurement.
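The time-constant argument made earlier for the body and helmet plethysmographs can be illustrated with a first-order model, in which the enclosed air acts as an acoustic compliance and the pneumotachograph port acts as a resistance. The Python sketch below assumes adiabatic compression, so that compliance is V / (gamma * P0). The enclosure volumes and the port resistance are hypothetical round numbers chosen only to show the scaling; none of them comes from the article.

```python
import math

GAMMA = 1.4        # ratio of specific heats for air (adiabatic assumption)
P0 = 101_325.0     # standard atmospheric pressure, Pa


def acoustic_compliance(volume_m3, gamma=GAMMA, p0=P0):
    """Compliance of an enclosed air volume, in m^3/Pa."""
    return volume_m3 / (gamma * p0)


def time_constant(volume_m3, port_resistance):
    """First-order time constant (s); port_resistance is in Pa*s/m^3."""
    return port_resistance * acoustic_compliance(volume_m3)


def cutoff_hz(volume_m3, port_resistance):
    """Frequency (Hz) above which the port flow no longer tracks the source."""
    return 1.0 / (2.0 * math.pi * time_constant(volume_m3, port_resistance))


if __name__ == "__main__":
    port_resistance = 1.0e4   # Pa*s/m^3, hypothetical screen resistance
    body_box_volume = 0.5     # m^3, hypothetical body plethysmograph
    helmet_volume = 0.01      # m^3, hypothetical head enclosure
    for name, volume in (("body box", body_box_volume), ("helmet", helmet_volume)):
        print(f"{name}: cutoff = {cutoff_hz(volume, port_resistance):.1f} Hz")
```

With these assumed values the body-box cutoff is only a few hertz, whereas the helmet cutoff is roughly fifty times higher and falls within the 100-400 Hz range of glottal flow waveforms. Because the cutoff scales inversely with enclosed volume, the smaller enclosure is the one that can follow the glottal waveform.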


Distributed vocal-tract pressure transducer

Another technique (10) was designed specifically to capture the glottal flow wave (necessitating a relatively high-frequency response). Relying on the fact that pressure gradients and particle velocities are linearly related during plane wave propagation, these authors monitored intrapharyngeal air pressures with minimicrophones encased in a length of small-diameter tubing. Cranen and Boves (13) determined that the glottal-wave particle velocity profile was measured with excellent fidelity, even when vocal frequencies exceeded 1,500 Hz. This approach does not encumber the face as mask-based systems do, and is to be preferred if a clear acoustic signal is to be simultaneously recorded along with the glottal air flow measure. Table 4 (4) summarizes the essential properties of the flow transducers discussed.

TABLE 4. Summary of air flow transducers

| Transducer | Additional equipment required | Frequency response | Linearity | Notes |
|---|---|---|---|---|
| Pneumotachograph | Differential pressure transducer system | Good-excellent | Excellent | Preferred method; stable, reliable, rugged |
| Warm-wire anemometer | Heater current supply; none (complete system) | Poor-good | Very poor but can be compensated | Insensitive to flow direction; very delicate |
| Plethysmograph | Pressure transducer system | Poor-fair | Good-excellent | Large, expensive, but useful for other measures |
| Electroaerometer | None (complete system) | Moderate | Fair | Simple to use |
| Differential oropharyngeal pressure catheter | Preamplifiers | Excellent | Excellent | Easy insertion, small encumbrance |

GLOTTAL VOLUME VELOCITY

The glottal volume velocity (air flow), Ug, directly reflects the dynamics of vocal-fold vibration and is therefore of interest both to clinicians and theoreticians. Ug may be indirectly evaluated with the Rothenberg mask system through inverse filtering, a technique that mathematically cancels out vocal-tract resonance effects on speech air flows, thus leaving only the lower-frequency laryngeal air flow pattern. As mentioned previously, Cranen and Boves (13) employed direct intrapharyngeal measures of air pressure to calculate air particle velocity and thereby to estimate air flow volume velocity. Their results suggest excellent accuracy of Ug estimates up to at least 1,500 Hz.

CONCLUSIONS

Air flow measurements are being used to explore developmental, aging, and gender differences in normal and pathological speech. Increasingly, the diagnosis of vocal disorders (14) and of neuromuscular disorders of the larynx and velum (8) has come to rely on air flow measurements. With the Rothenberg mask have come numerous studies of the glottal volume velocity wave as related to glottal mechanics for normal speaking and singing voices (15). Recent advances have reduced the back loading of the air stream and excessive distortion of the acoustic signal by conventional face-mask-pneumotachographic transducer arrays (10), but further development of nonencumbering systems with a wide frequency bandwidth is needed. Air flow/pressure data will, however, come into their own as powerful diagnostic measures only after new models of biomechanical/aerodynamic vocal-tract performance are created. These models will relate articulatory/glottal dynamic parameters and temporal coordination directly to key air flow/pressure wave parameters in a cause-effect fashion. This will allow investigators to infer biomechanics of phonation-articulation from air flow events, that is, to infer physiology from noninvasive air flow/pressure measurements.

REFERENCES

1. Warren DW. Aerodynamics of speech. In: Lass NJ, McReynolds LV, Northern JL, Yoder DE, eds. Speech, language and hearing, vol I: normal processes. Philadelphia: WB Saunders, 1982:219-45.
2. Shadle CH. The effect of geometry on source mechanisms of fricative consonants. J Phonetics 1991;19:409-24.
3. Netsell R, Hixon T. A noninvasive method for clinically estimating subglottal air pressure. J Speech Hear Disord 1978;43:326-30.


4. Baken R. Clinical measurements of speech and voice. San Diego: College Hill Press, 1987.
5. Kent R, Kent J, Rosenbeck J. Maximum performance tests of speech production. J Speech Hear Disord 1987;52:367-87.
6. Kozhevnikov V, Chistovich L. Speech: articulation and perception [Translation]. Washington, DC: Joint Publications Research Service, 30543; U.S. Department of Commerce, 1965.
7. Mueller E, Brown WS. Variation in the supraglottal air pressure waveform and their articulatory interpretation. In: Lass NJ, ed. Speech and language: advances in basic research and practice, vol 4. New York: Academic Press, 1980.
8. Netsell R, Lotz W, Barlow C. A speech physiology examination for individuals with dysarthria. In: Yorkston K, Beukelman D, eds. Recent advances in clinical dysarthria. Boston: College Hill, 1989:3-35.
9. Lieberman P. Intonation, perception and language. Boston: MIT Press, 1968.
10. Rothenberg MA. New inverse-filtering technique for deriving the glottal airflow waveform during voicing. J Acoust Soc Am 1973;53:1632-45.
11. Van den Berg J, Zantema J, Doornenbal P. On the air resistance and the Bernoulli effect of the human larynx. J Acoust Soc Am 1957;29:626-31.
12. Teager H, Teager S. The effects of separated airflow on vocalizations. In: Bless DM, Abbs JH, eds. Vocal fold physiology. San Diego: College Hill Press, 1983:124-43.
13. Cranen B, Boves L. On the measurement of glottal flow. J Acoust Soc Am 1988;84:888-900.
14. Hirano M. Phonosurgery: basic and clinical investigations. Otologia Fukuoka 1975;21:239-440.
15. Sundberg J. The science of the singing voice. Chicago: Northern Illinois University Press, 1987.
16. Isshiki N, Ringel R. Airflow during the production of selected consonants. J Speech Hear Res 1964;7:20%32.
17. McGlone R. Airflow in the upper register. Folia Phoniatr 1970;22:231-8.
18. Yanagihara N, von Leden H. The cricothyroid muscle during phonation. Ann Otol Rhinol Laryngol 1967;75:987-1007.
19. Hirano M, Koike Y, von Leden H. Maximum phonation time and air usage during phonation. Folia Phoniatr 1968;20:185-201.