Laboratory advances for voice measurements

Journal of Voice

Vol. 8, No. 1, pp. 8-17 © 1994 Raven Press, Ltd., New York

The G. Paul Moore Lecture

Laboratory Advances for Voice Measurements Wilbur James Gould and Gwen Susan Korovin Voice Laboratory, Lenox Hill Hospital, New York, New York, U.S.A.

Accepted August 30, 1993. Address correspondence and reprint requests to Dr. G. S. Korovin at 47 East 77th Street, New York, NY 10021, U.S.A.

Dramatic advances have been made in the area of voice research over the past two decades. Going back even further, G. Paul Moore, whom we honor with this lecture, was one of the pioneers in this field. To illustrate the advances in voice research, and the greater interest in and awareness of it, we need only look at the increasing number of meetings and conferences dealing with this topic. As early as 1959, Moore was a major contributor to a National Institutes of Health (NIH)-sponsored group that was chosen to look at "Research Potentials in Voice Physiology" (1). Over the years, the NIH has continued to support research in voice. A full research laboratory with several full-time scientists has been established. The NIH has also set up a comprehensive institute, the National Institute on Deafness and Other Communication Disorders. This new organization provides funds and support for the entire field of communication disorders, of which voice is so important a part. The organization held its first conference on advances in laboratory research in Bethesda, MD, this fall (2). As the meeting was a great success, a second annual meeting is already planned.

G. Paul Moore, along with Hans von Leden, organized the first International Voice Conference (3). This provided a forum for increased exchange of information among the many represented countries. The first symposium, held in 1971, was entitled a "Short Course on Basic Concepts of Voice for Teachers of Singing and Speech." This, as you are aware, is our 20th symposium, and it has extended well beyond just "basic" concepts. Until three years ago, the presentations were chronicled in the transactions of the symposium. Three years ago, we saw the establishment of the Journal of Voice. The publication reports on the advancements presented at the symposium and other important developments throughout the year. The establishment of the Journal of Voice shows increased recognition of voice as a distinct specialty. The Journal has continued to attain more widespread recognition and credibility.

At the 10th Symposium, Dr. Robert Coleman felt that it was risky to predict the direction of research (4). He felt this was because different investigators may not agree and because national funding patterns for research are so variable. However, I am pleased to say that voice research has gone in a positive direction. Many of our investigators are in agreement, and funding in this field has become increasingly available.

A further example of the positive direction is the status of voice laboratories in the western hemisphere. Until 1970, there were three such laboratories in the United States and Canada. By 1984, 35 laboratories had been established (5). We now have 93 voice laboratories registered with The Voice Foundation. The majority of these are related to a residency training program. Although some voice laboratories have but rudimentary resources, others have extensive staff including voice scientists and laryngologists with extraordinary training. The surge of laboratories indicates the increasing level of interest, and promises even greater advancement as these laboratories are more extensively utilized. Through cooperation and collaboration, the voice laboratories can establish bases of knowledge, and ultimately contribute to improved patient care. Research into voice function provides a valuable objective means to measure speech production in

the clinical setting. As stated by von Leden, "current adaptation of electronic facilities complement the eyes and ears of the examiner and furnish precise measurement for prognostic and therapeutic purposes" (6). Before considering laboratory analytic studies, the laryngologist must make full use of his or her eyes and ears. A complete history, including careful listening to the patient's voice, is necessary. A good physical examination is also needed. The mirror is still one of the best ways to visualize the larynx. Mirrors that magnify up to three times can be extremely useful. Other ways to examine the larynx are discussed later.

Phonation requires proper functioning of three interrelated systems: respiratory, laryngeal, and articulatory. Subglottal aerodynamic power is converted into laryngeal and acoustic dynamics with fine tuning by the kinesthetic and auditory senses. Recent research has been addressing all these aspects of phonation. Let us now discuss more specifically the advances in laboratory research and take care to point out how these may be directly applied to care of the patient; indeed, the ultimate goal of all voice research is to help in the diagnosis and care of patients.

RESPIRATORY ANALYSIS

Analysis of respiration used in the vocal laboratory ranges from simple to highly complex. On the simple end of the analytic spectrum is the spirometer. It consists of a cylinder, resting in water, in and out of which a patient breathes while movements are recorded. This can record vital capacity. A ventilograph can provide information on basal minute ventilations and maximum breathing capacity. If a pneumotachometer is added to this equipment, moment-to-moment airflow speed can be determined. A body plethysmograph, first described by Mead in 1960, provides the most accurate and complete pulmonary function studies. By placing the patient in a sealed box, the face is entirely freed for full phonation.
The changes in volume within the box, which occur as the patient changes thoracic air volume, are recorded by connecting the box to a spirometer face. The procedure has been modified by Hixon and Warren to measure airflow. The plethysmograph and pneumotachograph represent two types of airflow transducers that convert airflow into an appropriate electrical signal. A third

type of transducer sometimes used in measuring airflow is a warm wire anemometer. Changes in resistance are used as an index of airflow. Steady airflow (DC) and intermittent airflow (AC) may be measured. A fourth type of transducer, which is rarely used in the research laboratory, is an electroaerometer, first described by Smith in 1960 and Van den Berg in 1962. From such measurements, vital capacity, tidal volume, expiratory flow volume, and inspiratory capacity can be obtained. These can then be compared with predicted values (6-8). The primary goal with all these studies is to determine the usable air volumes. The body plethysmograph, mentioned previously, or helium pulmonary studies are useful to measure residual volume.

Hixon (9,10) extended simple respiratory evaluation to study the respiratory element of vocal production. Hixon's kinematic method evaluates the respiratory efforts by considering the chest wall to be a two-part system, consisting of the rib cage and abdomen. Together, the rib cage and abdomen may displace a volume equal to that displaced by the lungs. To estimate these volumes, one looks at the changes in diameter (i.e., the anterior-posterior (AP) diameter) using magnetometers. A relative diameter chart with the abdominal diameter on the x axis and rib cage diameter on the y axis can be drawn. Each point on the chart thus represents a unique combination of rib cage and abdominal diameters. Through research, a well-defined set of kinematic patterns associated with speech in normal subjects has been documented. Various kinematic patterns associated with different types of dysfunction have been determined. Other investigators have extended Hixon's work by converting AP diameter measurements to displacement measures. It has been found that most normal individuals favor the rib cage over the abdomen in volume displacement.
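The two-part chest wall model lends itself to a small numerical illustration. The sketch below assumes, as a simplification, that the change in lung volume is a weighted sum of the changes in rib cage and abdominal AP diameter. The calibration coefficients `k_rib_cage` and `k_abdomen`, the function names, and the sample values are hypothetical and are not taken from the studies cited here.

```python
def lung_volume_change(d_rib_cage_cm, d_abdomen_cm, k_rib_cage, k_abdomen):
    """Estimate lung volume change (litres) from AP diameter changes (cm).

    Assumes a linear two-part model; the k_* coefficients (litres per cm)
    are hypothetical and would come from a per-subject calibration.
    """
    return k_rib_cage * d_rib_cage_cm + k_abdomen * d_abdomen_cm


def rib_cage_fraction(d_rib_cage_cm, d_abdomen_cm, k_rib_cage, k_abdomen):
    """Fraction of the estimated volume change contributed by the rib cage."""
    total = lung_volume_change(d_rib_cage_cm, d_abdomen_cm, k_rib_cage, k_abdomen)
    if total == 0:
        raise ValueError("no net volume change; fraction is undefined")
    return (k_rib_cage * d_rib_cage_cm) / total


# Illustrative values only: 1.0 cm rib cage and 0.5 cm abdominal change.
volume = lung_volume_change(1.0, 0.5, k_rib_cage=0.6, k_abdomen=0.4)
fraction = rib_cage_fraction(1.0, 0.5, k_rib_cage=0.6, k_abdomen=0.4)
print(f"volume change: {volume:.2f} L, rib cage share: {fraction:.0%}")
```

A rib cage share above one half corresponds to the pattern described above, in which the rib cage is favored over the abdomen in volume displacement.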
Using and expanding upon the kinematic method as a basis for respiratory evaluation, investigators can compare various "normal" and "abnormal" speech and singing patterns (11).

LARYNGEAL FUNCTION ANALYSIS

The larynx is the sound generator. The laryngeal sound source is comprised of quasiperiodic pulses of air which cause vibration of the vocal folds. Basic studies of dynamic air pressure and airflow are

needed to expand theories of phonation. The capabilities of the larynx can be assessed by visual analysis, acoustic analysis, and aerodynamic studies. Results can be used to define normal laryngeal function and pathologic conditions, which in turn may facilitate better patient care.

VISUAL ANALYSIS

Stroboscopy is currently our most valuable clinical tool in visual analysis. The stroboscope can be used with either a flexible fiberoptic laryngoscope or rigid laryngoscope. The flexible fiberoptic laryngoscope was first introduced by Sawashima and Hirose in 1968 (12). Next to a simple mirror examination, it is probably the most widely available method of evaluating the larynx. It has the advantages of ease of examination and of preventing unnecessary direct laryngoscopies under anesthesia, which had to be performed more frequently before its introduction. Aspects of respiration, phonation, glottal effort closure, and swallowing may all be visualized and qualitatively evaluated.

Although the fiberoptic laryngoscope is a great clinical aid, some disadvantages exist (13). It requires a powerful light source if videotaping is being done simultaneously. Distortion is a significant problem. A study done by Hibi et al. (14) further defined the distortion as related to either the fiberscope lens system or the angle and axis view along with the distance of the tip of the scope from the object. They were able to correct some of the distortion by using computer processing and simultaneous measurements.

Because of some of the problems associated with flexible laryngoscopy, rigid laryngoscopy was introduced by Gould and Andrews in 1971 (15,16). Yanagisawa (17) first described a simple method of videolaryngoscopy using a right-angled telescope and home video camera. Rigid laryngoscopy offers a clearer, sharper, larger, and brighter picture. The higher resolution videoimages can be combined easily with stroboscopy.
Anatomic or structural changes in the larynx can be visualized more easily and documented. Minor mucosal changes and early lesions of the vocal cords can thus be detected. However, the rigid laryngoscope does have disadvantages. A patient with a hyperactive gag reflex may be unable to tolerate the examination. A 75% success rate using the rigid scope is reported in the literature. Normal speech and singing voice may be distorted during the examination. Another problem

with the stroboscope is in evaluating the vocal folds in the absence of voice or very low intensity voice. The stroboscope may not be "generated" (14). Rarely, the microscope and mirror used in conjunction with the stroboscope have been used for visual analysis. Because it is much more cumbersome than using the rigid or fiberoptic scope, it is used in a smaller percentage of people. However, this method is more widely used in Europe.

By increased utilization of the stroboscope in routine clinical examination, much more can be learned about both normal and abnormal vocal production. As presented by Dr. Wilbur Gould at the 19th Annual Voice Symposium (18), the rigid stroboscope allows for better analysis of the undersurface of the vocal cord and immediate subglottic area. From simple mirror or fiberoptic examination from above, all of the laryngeal structures may appear normal. By passing the rigid scope down close to the level of the vocal cords and changing the angle of visualization, subtle variations can be noted. There may be swelling or subglottic polyps. There may be a change in the angular configuration of the subglottic tissue. All of these may have an effect on vocal production. Further studies of these variables are needed.

Stroboscopy will continue to be a most important part of our vocal evaluation. Perhaps in the future, the stroboscope could be used intraoperatively. Such use could evaluate whether a lesion has been completely removed or whether further work is necessary. Because evaluation of the stroboscopic examination is subjective, many authors have attempted to establish a rating system. Hirano, Bless, and Feder (19) and Gelfer and Bultemeyer (20) are proponents of such a system. It allows us to establish standards and have a source of comparison for results. By continuing to come up with simpler rating systems, standardization may some day become a reality.
Historically, high-speed photography with vocal fold vibration analysis has proven to be the most accurate means of evaluation. The efforts of Drs. Moore, von Leden, Gould, Timcke, Farnsworth, and more recently Dr. Hirano deserve special mention. Motion picture analysis is of the greatest value when frame-by-frame analysis is done. The first high-speed motion pictures were taken at Bell Laboratories in 1937. In the late 1950s, high speed film research was advanced to a quantitative level (21-25). Hirose (26) and Booth and Childers (27) have made further advances by using digital image processing to collect and analyze high-speed

films. Colton et al. (28) have taken a further step by applying digital processing to the analysis of laryngeal images using a small, low-cost microcomputer. The storage of video frames using digital image processing deserves special mention here. Digital storage eliminates extensive equipment and excessive wear and tear on tape heads, allows for ease in selection of precise frames for study, and can provide magnification of image details. As digital processing and storage become more widely used, further applications in the clinical area are expected. Perhaps laryngeal images could be analyzed digitally from still pictures, slides, motion pictures, or videotapes. It may eventually be possible to analyze sequential changes within the larynx from various videofiberoptic recordings.

Regardless of the type of equipment used for visual evaluation, perhaps the most important aspect is the setting of standards that will allow finer points of evaluation. As the standards evolve, they should be acceptable and objective in nature. As we improve our techniques, we can further assess adynamic areas, closure, extent of wave motion, and asymmetry of mucosal wave motion. Thus, scarring or early malignancy may be demonstrated. The mucosal wave can be measured and should allow an objective analysis that can be an even more valuable tool in evaluation.

ACOUSTIC ANALYSIS

The primary tool for voice quality evaluation by the clinician is listening to the patient talk. Although quite useful, the information obtained cannot be quantified. The simplest form of acoustic analysis involves a tape recording. Low-cost AC recording can now be converted to DC recording using a converter. Multichannel DC records can then be used for data storage (W. S. Winholtz, unpublished data). Great advances have been made in data collection and storage in recent years. Digital audio tape (DAT) has been gradually replacing the standard analog system.
Digital recording gives the stored data an extremely long-lasting quality. Its precision allows more exact comparison among future recordings of the same individual and between individuals. The fidelity is greater, especially at higher pitches. Data are immediately computer capable.

Data manipulation and direct analysis can thus be performed at any time. As we are aware, the previously used analog system is sometimes too slow and is often inadequate. We have experienced loss of quality of recordings when stored over time. With DAT, these problems should now be eliminated (Sony Research Division, unpublished data). A headmount type of microphone allows for greater standardization of mouth-to-microphone distance and aids the subject who cannot adequately maintain a constant distance. Acoustic analyses have shown no significant differences in results of acoustic measures using a standard mount microphone versus a miniature headmount microphone (W. S. Winholtz, unpublished data).

Sound spectrographic analysis is the primary tool used in the research and clinical laboratory. Basic spectrography is based on the Fourier theorem, which states that any periodic waveform can be analyzed into a series of sine waves with different frequencies, amplitudes, and phase relations. The fundamental (repetition) frequency and harmonics (integral multiples of the repetition frequency) can be determined. The fundamental frequency value gives a clue to abnormalities, but does not establish a cause for the problem. One may also want to determine mean speaking fundamental frequencies. This is measured during spontaneous speech such as the reading of the Rainbow passage. One can measure frequency range, but this is difficult to assess and varies widely. Newer computer programs allow for automatic tracking of fundamental frequencies and intensity.

The sound spectrograph allows for comparison of the degree of hoarseness, although somewhat subjectively. Clinical terms such as rough, breathy, asthenic, and strained may be used. Many indices have been developed and reported in the literature to further standardize the results. In 1967, Yanagihara (29) first did spectral analysis of speakers with varying degrees of hoarseness. The spectrograms obtained could be placed into four categories which correlated with the severity of the hoarseness. In these categories, the first and second formants were replaced by varying degrees of noise components. Thus Yanagihara (29) developed a classification of hoarseness based on noise and loss of harmonic components. In hoarseness, harmonics are replaced by noise energy. Kojima et al. (30) expanded this

theory to develop an actual harmonics to noise ratio. At first, the computations were complex. Soon afterwards, a simpler harmonics to noise ratio was developed by Yumoto et al. (31). Indices of breathiness and strain have since been developed. Such indices allow quantification of the spectrographic features that appear in a hoarse voice. Definitions such as jitter (frequency perturbation) and shimmer (amplitude perturbation) have been clarified. These reflect the variations in vocal fold vibratory patterns and vocal fold instability, respectively. Von Leden reported a strong tendency for frequent and rapid changes in the regularity of the vibrating pattern in pathologic conditions (re jitter) (6). Studies by Wendahl and Lieberman have concurred (32,33). Measuring sound spectrography involves filtering of the speech signal with bandpass filters. The earlier spectrographs used analog techniques. Digital spectrographic analysis has provided a further advancement in processing of the acoustic signal. Recently, special circuits have been designed to perform "on-line" digital spectrum analysis. A new spectrograph which is a hybrid of analog and digital methods is now available. Although the input is digitalized, the analysis of the signal is accomplished with analog circuitry. The advantages of this new instrument are that digital storage results in less noise than magnetic recording, input signals are given sequential numerical values allowing for controlled access, and the input can be obtained at one rate and read out of memory at a different rate (28). The increased use of computer techniques has achieved even more accurate analysis. Application of complex computational algorithms can now be achieved. Baken (34), at the 19th Annual Symposium, presented his model using fractal analysis of voice. His model evaluates the irregularity of the physiologic and acoustic aspects of speech. 
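The jitter and shimmer measures mentioned earlier in this section can be made concrete with a short calculation. The sketch below uses one common formulation, the mean absolute difference between consecutive cycles expressed as a percentage of the mean value. This is an assumption for illustration and not necessarily the definition used in the studies cited here; the function names and sample values are hypothetical.

```python
def mean_relative_perturbation(values):
    """Mean absolute cycle-to-cycle difference as a percentage of the mean."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value


def jitter_percent(periods_ms):
    """Frequency perturbation computed from successive cycle periods."""
    return mean_relative_perturbation(periods_ms)


def shimmer_percent(peak_amplitudes):
    """Amplitude perturbation computed from successive cycle peak amplitudes."""
    return mean_relative_perturbation(peak_amplitudes)


# Illustrative cycle measurements (not real patient data).
periods_ms = [10.0, 10.2, 9.8, 10.0]
peak_amplitudes = [1.00, 0.95, 1.05, 1.00]
print(f"jitter:  {jitter_percent(periods_ms):.2f}%")
print(f"shimmer: {shimmer_percent(peak_amplitudes):.2f}%")
```

A perfectly regular series of cycles yields zero for both measures, so larger values indicate greater cycle-to-cycle irregularity.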
Titze and Scherer (35) at the University of Iowa and the Recording and Research Center at The Denver Center of The Performing Arts have demonstrated how acoustic measures have collaborative potential among different laboratories. Titze has developed a program called GLIMPES (glottal imaging by processing external signals), in which acoustic measures and estimates of vocal fold motion are obtained from simultaneous recording of a number of signals. These signals include output from the microphone, the electroglottograph (EGG), a neck accelerometer, a photoglottograph, and an oral flow mask (35).

Four fundamental frequency extraction methods have been incorporated into the GLIMPES software package. These include peak-picking with waveform matching, zero-crossing with waveform matching, peak-picking with local smoothing, and zero-crossing with local smoothing. Further information regarding fundamental frequency, sampling frequency, signal-to-noise ratio, and amplitude and frequency modulation can be obtained using these extraction methods, allowing for greater accuracy (34).

Acoustic analysis has proved helpful in the clinical setting. Berke et al. (36) analyzed five acoustic measures to compare voice before and after surgery. Lehman et al. (37) used acoustic analysis along with videostroboscopy to study objectively patients after radiation therapy for stage I squamous cell carcinoma of the glottis. Results showed abnormality in the post-irradiation voice. Rontal et al. (38) studied post-Teflon injection patients' voices using sound spectrograms. Further studies are needed to compare radiation therapy and surgery patients. Studies are also needed to compare laser and more traditional surgical procedures. The spectral analysis may be useful for evaluation and extension of surgical techniques.

Another use of sound spectrography is that of studying the effects of endotracheal intubation on acoustic characteristics of voices. Horii and Fuller (39) showed that even short-term intubation had its effect on voice. This can be seen in selective measures of waveform characteristics. Lin and Gould (40) have used acoustic analysis to evaluate the voice after tonsillectomy. Results showed a change in the fourth formant.

Spectrum analysis has been very useful in studying singing. Miller and Schutte (41) have used it to track formant frequencies in singing. Horii has used acoustic analysis to modify the definition of vibrato.
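To make the idea of frequency vibrato concrete, the sketch below estimates the rate and extent of a periodic variation in a fundamental frequency contour. It is a minimal illustration on a synthetic contour, not the method of any study cited here. The function name, the frame rate, and the use of upward mean crossings to count cycles are all assumptions chosen for simplicity.

```python
import math


def vibrato_rate_and_extent(f0_contour_hz, frame_rate_hz):
    """Return (rate in Hz, extent in Hz) of a frequency vibrato.

    Rate is estimated from the spacing of upward crossings of the mean F0;
    extent is taken as half the peak-to-peak F0 excursion.
    """
    if len(f0_contour_hz) < 3:
        raise ValueError("contour too short")
    mean_f0 = sum(f0_contour_hz) / len(f0_contour_hz)
    deviation = [f - mean_f0 for f in f0_contour_hz]

    crossings = []  # times (s) at which the contour rises through its mean
    for i in range(1, len(deviation)):
        prev, cur = deviation[i - 1], deviation[i]
        if prev < 0.0 <= cur:
            fraction = -prev / (cur - prev)  # linear interpolation between frames
            crossings.append((i - 1 + fraction) / frame_rate_hz)
    if len(crossings) < 2:
        raise ValueError("fewer than two vibrato cycles found")

    rate_hz = (len(crossings) - 1) / (crossings[-1] - crossings[0])
    extent_hz = (max(f0_contour_hz) - min(f0_contour_hz)) / 2.0
    return rate_hz, extent_hz


# Synthetic contour: 220 Hz tone with a 5.5 Hz, +/-5 Hz vibrato, 200 frames/s.
frame_rate = 200.0
contour = [220.0 + 5.0 * math.sin(2.0 * math.pi * 5.5 * n / frame_rate)
           for n in range(400)]
rate, extent = vibrato_rate_and_extent(contour, frame_rate)
print(f"rate: {rate:.2f} Hz, extent: {extent:.2f} Hz")
```

The same calculation applied to an intensity contour instead of an F0 contour would give a comparable description of amplitude vibrato.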
In 1960 the United States Standards Institute defined vibrato as "a family of tonal effects in music that depend on periodic variations of one or more characteristics of the sound wave." Horii (42) feels clearer definitions of vibrato, frequency vibrato, and amplitude vibrato will be needed once more particular characteristics are known. Studies of vibrato extend beyond the realm of singing voice only. They can be applied to studies of pathologic vocal tremor, such as in patients with spastic dysphonia, Parkinson's disease, amyotrophic lateral sclerosis, and head injuries. A vocal demodulator that produces low-frequency amplitude and frequency variations and measures the level and frequency of tremor components in sustained phonation has been developed. This has been used to study patients with vocal tremor and individuals using vibrato (43,44).

Acoustic analysis can also be useful in the objective analysis of register. Many different terms have been used to describe subjective qualities of voice. The attempted implementation of stricter definitions has been the source of some controversy. Hollien (45) has suggested that, to avoid confusion, we adopt completely new designations for the narrowly defined registers. These registers include modal, pulse, and loft. Modal may include such terms as chest, head, low, mid, or high. Pulse may mean vocal fry, glottal fry, or strohbass. Loft means upper reaches or falsetto. Loft and pulse register differ from modal register by the shape and tension of the vocal folds.

Titze (46) has further studied transitions between registers using acoustic analysis. He classified transitions between registers as either periodicity (relating to fundamental frequency) or timbre (loss or gain of high frequency sound noise). He feels a third transition in which the vocal fold vibration shifts to whistling may also occur. Although subglottal resonances induce involuntary register transitions, regulation of vocal fold adduction causes voluntary timbre transitions. He questions the terms modal register and head register, but agrees with the terms chest, pulse, and falsetto registers. Both teachers and scientists need to clarify their labels if we are to speak a common language. By using acoustic and physiologic information these goals may eventually be achieved.

In the works is a voice analysis network in which laboratories all around the country could tie into a central computer for analysis of signals (47). This could lead to further standardization of voice analysis.
The formation of such a large database could provide much greater information about vocal disease and an accurate directory of participating clinicians and researchers and their ongoing projects. Pertinent resources and funding could then be more carefully and appropriately matched.

AERODYNAMIC ANALYSIS

Valuable information can be obtained from studies of airflow efficiency during phonation. The central feature is a pneumotachograph, as previously mentioned. This allows for measurement of airflow

rate and air volume. Sound pressure level is measured by a recorder and an audiofrequency spectrometer. Additional components can be added for special studies such as measurement of subglottal pressure. Rothenberg's (48) mask allowed for more accurate analysis of airflow. In addition, he used various filtering techniques to extend the utility of the pneumotachograph, and further defined the glottal airflow waveform.

Studies of airflow in the average population have defined accepted limits. Air loss measurement gives clues to a deficiency of glottal closure. After maximal inhalation, a relatively short period of maximally sustained phonation implies air usage greater than expected, and a lesion, a gap created during surgery, or vocal cord paresis is suspected. The amount of air loss not only gives information about possible pathology, but may also give a clue to the amount of closure to be obtained if surgical correction is advisable (49).

A noninvasive technique relating pulmonary air source capabilities to utilization of airflow at the larynx was discussed by Smitheran and Hixon (50) and used by Holmberg and Leanderson (51). Subglottal pressure and glottal airflow volume are used to calculate laryngeal airflow resistance for normal and disordered voices. They obtained flow resistance calculations for chronic laryngitis, polyps in the postoperative stage, nodules, and various paralyses and dysphonia.

Traditionally, sound and airflow had to be measured by separate instrumentation. A new hot wire flow meter which permits automatic calculation of the ratio of sound to airflow (AC and DC) was introduced in 1980 (52). This allows quicker determination of the vocal efficiency index.

GLOTTOGRAPHIC ANALYSIS

The electroglottograph (electrolaryngograph) is a noninvasive method of recording cycle-to-cycle vocal fold function. Its signal is typically assumed to be proportional to the dynamic contact area of the medial surface of the vocal folds.
The EGG has been extensively reviewed in articles by Fourcin, Rothenberg, Childers, and Abberton (53-55). Rothenberg described a model waveform to emphasize the various phases of vocal fold contact. Although the EGG provides easily obtainable data, there are many limitations of the method. Placement of the electrodes is critical. Electrical resistance at the skin electrode must be lowered by

keeping the electrodes clean, well lubricated, and firmly attaching them to the skin. Excessive subcutaneous soft tissue could alter the result. As stated by Karnell, no efficient cost-effective and/or valid means of interpreting the morphology of EGG waveforms in live subjects is thus far available (56). Numerous studies are being done to compare EGG measurements with other glottal function measures. The EGG has been combined with transillumination, acoustic studies, high speed laryngeal photography, and stroboscopic photography. These studies attempt to define when closure occurs relative to the EGG signal, e.g., the study by Scherer et al. (57). The use of digital video methods and processing capabilities in conjunction with EGG may provide further information. Many investigators have attempted to use the EGG to document voice quality. Various objective measurements of hoarseness, breathiness, and strain have been suggested using analysis of the EGG waveform. By computing quotients between segments of the waveform and the entire vibratory period, different quotients have been defined. These include the closed, open, surface, and adduction quotients. Earlier research promised the possibility of using a differentiated EGG (DEGG) to give more definite identification of the opening and closing moments of the curve. Closer analysis showed that this method was not significantly more dependable than the simpler EGG (58). Newer research has come up with photoglottography (PGG). This method assesses glottal opening by measuring transglottal illumination. A photosensor measures transillumination from the skin surface below the vocal folds and acts as a shutter during phonation. Theoretically, light is transmitted in proportion to the size of the glottal opening. Initially PGG required the insertion of a light source or photosensor to the level of the nasopharynx or oropharynx at minimum (59). Gerratt et al. 
(60) report on a minimally invasive transoral illumination technique and an automated system to help identify glottal events. The validity of this method, as with EGG, remains questionable.

Further clinical application of the EGG may be extended to the area of teaching voice and singing. The EGG may be useful as a form of immediate biofeedback, although its limitations must be kept in mind here as well. The EGG may be useful in illustrating register. Kitzing (61) has written that the waveform varies in a characteristic manner that depends on the register. Although pulse register may be indicated by dicrotic excitations and long closed phases, modal register may be seen as rounded closed phases, and falsetto may show a more pointed closed phase. Professional singers who can equalize register transitions without voice breaks or perturbations may show such transitions nicely on the EGG. This may be applicable to teaching register transitions to beginning students in the future.

Once further research is carried out and there is true identification of laryngeal activity being measured, EGG may offer a means of assessing vocal fold activity in a noninvasive manner which is comfortable for the patient and provides little interference with vocal function. As Childers (62) warns, EGG "remains a poorly understood tracking device, in its present form ... not capable of contributing much to clinical diagnosis and treatment of voice disorders." Only through future research will we further define its clinical usefulness.

ULTRASONOGRAPHY

In 1970, Hertz (63) first used ultrasound to visualize vocal cord movement. Kelsey et al. (64) have studied pharyngeal wall motion using this method. Ultrasonography involves the use of a transducer which produces a frequency in the ultrasonic range. Sound waves are reflected off target structures and picked up by a sensor. Ultrasonoglottography involves the reflection of high-frequency sound from the air/tissue interface of the vocal fold motion during phonation. At this time, it is still in the laboratory research phase. It may someday become suitable for use in the clinical laboratory for viewing and recording dynamic behavior of laryngeal structures with minimal interference to vocal production. It is both noninvasive and radiation free. Now widely available for other areas of the body, it offers a promise for future clinical use (some speech trainers have been using it as a form of biofeedback) (65,66).
ELECTROMYOGRAPHY

Electromyography (EMG) is an objective method available to study laryngeal muscle activity. It provides information about the electrical activity resulting from contraction of muscles or motor units. These studies help determine which laryngeal muscles are being used during different respiratory and phonation conditions. Hollien (67) describes four things which EMG can tell the investigator: (a) whether a muscle is operating; (b) when a muscle starts and stops contracting; (c) whether paired muscles fire in synchrony; and (d) to what extent a muscle is contracting. EMG can differentiate peripheral laryngeal nerve paralysis from central neurologic disorders and arytenoid fixation. It has prognostic value in cases of paralysis (67).

Conventional techniques involve needle electrodes placed into the cricothyroid and thyroarytenoid muscles. The lateral and posterior cricoarytenoid muscles are reached via a transoral approach. Fine detail about a particular muscle or about synchronicity of muscle use can be obtained via these electrodes. Surface electrodes have also been used. These give information regarding the whole muscle. They are easy to place and are comfortable but lack precision and limit high-frequency response.

Ludlow (68) has been working extensively on EMG studies in her laboratory and has presented this work at the Bethesda conference. Sanders' group has done a great deal of research on recurrent laryngeal nerve activation using a transcutaneous approach. Sanders et al. (69) found frequency-dependent vocal fold movement due to different contraction times of abduction and adduction muscles in dogs. Extensions of such use of the EMG could eventually improve prognosis in recurrent laryngeal nerve trauma and help in evaluating functional deficits caused by tumor. Crumley, in his Triological thesis, used EMG to study 28 laryngeal reinnervation procedures done in dogs (70). EMG has been used in spastic dysphonia to determine the specific muscle to be injected with botulinum toxin. Neuromyography, the direct stimulation of a laryngeal nerve, and reflex-myography, which uses the afferent system of the superior laryngeal nerve, are both being tried in the research laboratory (71).
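The four kinds of information that Hollien lists can be made concrete with a simple signal-processing sketch. The code below is an illustration under stated assumptions, not a method described in the cited work: it assumes a digitized EMG trace, rectifies it, smooths it with a moving average, and applies an amplitude threshold. The function names, the threshold, the window length, and the synchrony tolerance are all hypothetical choices.

```python
from typing import List, Optional, Tuple


def moving_average(values: List[float], window: int) -> List[float]:
    """Smooth a sequence with a trailing moving average of `window` samples."""
    smoothed = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the sample leaving the window
        smoothed.append(running / min(i + 1, window))
    return smoothed


def emg_activity(signal: List[float], sample_rate_hz: float,
                 threshold: float, window: int = 5
                 ) -> Optional[Tuple[float, float, float]]:
    """Return (onset_s, offset_s, mean_level) spanning the first to the last
    supra-threshold sample of the rectified, smoothed signal, or None if
    the muscle never exceeds the threshold (not operating)."""
    envelope = moving_average([abs(x) for x in signal], window)
    active = [i for i, e in enumerate(envelope) if e > threshold]
    if not active:
        return None  # (a) the muscle is not operating
    onset, offset = active[0], active[-1]  # (b) start and stop of activity
    segment = envelope[onset:offset + 1]
    mean_level = sum(segment) / len(segment)  # (d) extent of contraction
    return onset / sample_rate_hz, offset / sample_rate_hz, mean_level


def fire_in_synchrony(onset_a_s: float, onset_b_s: float,
                      tolerance_s: float = 0.010) -> bool:
    """(c) Treat paired muscles as synchronous if their onsets differ by
    no more than `tolerance_s` seconds (an arbitrary illustrative value)."""
    return abs(onset_a_s - onset_b_s) <= tolerance_s


if __name__ == "__main__":
    # Synthetic trace: 100 quiet samples, a 100-sample burst, 100 quiet samples.
    burst = [1.0 if i % 2 == 0 else -1.0 for i in range(100)]
    trace = [0.0] * 100 + burst + [0.0] * 100
    print(emg_activity(trace, sample_rate_hz=1000.0, threshold=0.5))
```

Real EMG analysis must also contend with noise, electrode placement, and motor unit recruitment, none of which this sketch attempts to model.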

SUPRAGLOTTAL STUDIES

Studies of the spatial, aerodynamic, and acoustic characteristics of the area superior to the larynx are being carried out extensively. A project originally begun by Fujimura and more recently undertaken by Abbs involves using radar X-ray devices to track minute lead pellets through the oral structures during speech. Areas of the tongue, palate, and lips can be visualized (5). Nasal airflow devices, including the rhinomanometer, have provided more useful information. Palatal sensor devices may also provide some valuable information. Some of the instrumentation used in craniofacial research, including measures of intraoral airflow and pressure, lingual and palatal pressure, oral sensory perception, and visualization techniques, may eventually see wider usefulness in the study of vocal production. Facial muscle activity studies are also being done (63).

Studies of speech intelligibility involve looking at the adequacy of both the phonatory source and the supraglottal/articulatory levels. This has clinical application in designing treatment programs for patients with neurologic disorders (72). By combining acoustic, aerodynamic, and physiologic measures, we can evaluate the interaction of all components and render a treatment plan.

AUDITORY FUNCTION

Auditory perception is important to the patient, who needs to monitor his or her own voice. Hearing tests and further auditory evaluations as necessary may become part of the full vocal evaluation. The recent advances in hearing aids and surgical procedures available to improve hearing are of great help to the hearing-impaired voice patient.

COMBINATION OF TECHNIQUES

In studying vocal production in the laboratory, a variety of types of analyses must be combined. Hill et al. (73) compared videolaryngoscopy, videolaryngostroboscopy, EGG, and acoustic perturbation analyses in two normal subjects and 26 voice patients. All four techniques were successful in the normal subjects and in 42% of patients. Difficulty with stroboscopy was found in 23%, with EGG in 46%, and with acoustic analysis in 35%. The results indicate that complementary and overlapping interpretation of these measures is useful for clinical vocal analysis.

CONCLUSIONS

Utilizing a combination of vocal evaluation techniques offers a quantitative approach to evaluating the newer surgical procedures such as cordal injections and laryngeal framework surgery. The latter involves changing the position or tension of the vocal cord. Outcomes of various techniques for spastic dysphonia, including sectioning of the recurrent laryngeal nerve, xylocaine injections, and botulinum injections, can be more objectively evaluated.


Nerve-muscle pedicles for attempted reinnervation in vocal paralysis cases should be studied. Future extensions of these techniques to artificial laryngeal muscle or an artificial larynx all have their basis in the research laboratory of today. In addition, advances being made in the fields of molecular biology and genetics will see further application to voice research.

Voice measurement has come a long way since Dr. Moore began his specialized studies involving high-speed photography of vocal fold movement. Various methods of visual analysis have provided valuable information about the intricacies of vocal fold vibration. Simpler instrumentation has made this type of analysis available to a larger number of clinicians and researchers. Aerodynamic and acoustic analyses performed in the voice laboratory have also seen great advances. By combining data obtained from all the pertinent studies, diagnostic and therapeutic programs can be customized to the needs of the patient. In addition, researchers can determine the appropriate direction for future investigations. Cooperation among laryngologists, voice scientists, and speaking and singing voice therapists is necessary for further advancement. We must continue to work as a team. If we use the example set during the last 20 years, the next 20 hold great promise for the specialty of voice care.

REFERENCES

1. Brewer DW. Concluding session. In: Brewer DW, ed. Research potentials in voice physiology. New York: University Publishers, 1964:365-77.
2. Assessment of speech and voice production: research and clinical applications. Proceedings of an NIDCD Conference. Bethesda, Maryland, 1990.
3. Brewer DW. Voice research: the next ten years. J Voice 1989;3:7-17.
4. Gould WJ. The inception and development of the symposium. In: Lawrence VL, ed. Transcripts of the tenth symposium on care of the professional voice. New York: The Voice Foundation, 1981:1-42.
5. Gould WJ. The clinical voice laboratory: clinical application of voice research. J Voice 1988;1:305-9.
6. Von Leden H. Larynx-measurement of function. In: Scientific foundations of otolaryngology.
7. Baken RJ. Clinical measurement of speech and voice. Boston: College Hill Press, 1987.
8. Gould WJ. The clinical voice laboratory: clinical application of voice research. Ann Otol Rhinol Laryngol 1984;93:346-50.
9. Hixon TJ, Putnam AHE. Voice disorders in relation to respiratory kinematics. Semin Speech Lang 1983;4:217-31.
10. Hixon TJ. Speech breathing kinematics and mechanism inferences therefrom. In: Grillner S, Lindblom B, Lubker J, Persson A, eds. Speech motor control. Oxford: Pergamon Press, 1982:75-93.
11. Subtelny JD, McCormack RM, Worth JH, et al. Synchronous recording of speech with associated physiological and pressure-flow dynamics: instrumentation and procedures. Cleft Palate J 1968;5:93-116.
12. Sawashima M, Hirose H. New laryngoscopic technique by use of fiber optics. J Acoust Soc Am 1968;43:168-70.
13. Yanagisawa E, Owens TW, Strothers G, Honda K. Videolaryngoscopy: a comparison of fiberoptic and telescopic documents. Ann Otol Rhinol Laryngol 1983;92:430-6.
14. Hibi S, Bless D, Hirano M, Yoshida T. Distortions of videofiberoscopy imaging: reconsideration and correction. J Voice 1988;2:168-75.
15. Gould WJ. The Gould Laryngoscope. Transactions of American Academy of Ophthalmology and Otolaryngology, May, 1973, 139-41.
16. Andrews AN Jr., Gould WJ. Laryngeal and naso-pharyngeal indirect telescope. Ann Otol Rhinol Laryngol 1977;88:627.
17. Yanagisawa E, Casuccio JR, Suzui N. Videolaryngoscopy using a rigid telescope and video home system color camera. A useful office procedure. Ann Otol Rhinol Laryngol 1981;90:346-50.
18. Gould WJ. Treatment of vocal fold lesions: an overview from a clinical perspective. Care of the Professional Voice Symposium, June 7, 1990, Philadelphia.
19. Bless DM, Hirano M, Feder RJ. Videostroboscopic evaluation of the larynx. Ear Nose Throat J 1987;66:48-58.
20. Gelfer MP, Bultemeyer DK. Evaluation of vocal fold vibratory patterns in normal voices. J Voice 1990;4:335-45.
21. Timcke R, Von Leden H, Moore GP. Laryngeal vibration: measurement of the glottic wave. Part I. Normal vibratory cycle. Arch Otolaryngol 1958;68:1-19.
22. Timcke R, Von Leden H, Moore GP. Laryngeal vibration: measurement of the glottic wave. Part II. Pathologic larynx. Arch Otolaryngol 1959;69:438-44.
23. Moore I, Von Leden H, Timcke R. Laryngeal vibration: measurement of the glottic wave. Part III. Pathologic larynx. Arch Otolaryngol 1960;71:16-35.
24. Gould WJ, Jako GJ, Tanabe M. Advances in high speed motion picture photography of the larynx. Trans Am Acad Ophthalmol Otolaryngol 1974;78:276-8.
25. Farnsworth DW. High speed motion pictures of the human vocal cords. Bell Lab Rec 1940;18:203-8.
26. Hirose H, Kiritani S, Imagawa H. High speed digital image analysis of laryngeal behavior in running speech. Ann Bull RILP 1987;21:25-40.
27. Booth J, Childers D. Automated analysis of ultra high speed laryngeal films. IEEE Trans Biomed Eng 1979;BME-25:185-92.
28. Colton RH, Casper JK, Brewer DW, Conture EG. Digital processing of laryngeal images: a preliminary report. J Voice 1989;3:132-42.
29. Yanagihara N. Significance of harmonic changes and noise component in hoarseness. J Speech Hear Res 1967;10:531-41.
30. Kojima M, Gould WJ, Lambiase A, Isshiki N. Computer analysis of hoarseness. Acta Otolaryngol 1980;89:547-54.
31. Yumoto E, Gould WJ, Baer T. Harmonics to noise ratio as an index of the degree of hoarseness. J Acoust Soc Am 1982;71:1544-50.
32. Wendahl RW. Laryngeal analog synthesis of jitter and shimmer. Auditory parameter of hoarseness. Folia Phoniatr 1966;18:98-108.
33. Lieberman P. Some acoustic measures of the fundamental periodicity of normal and pathologic larynges. J Acoust Soc Am 1963;35:344-53.
34. Baken RJ. Irregularity of vocal period and amplitude: a first approach to the fractal analysis of voice. J Voice 1990;4:185-97.


35. Scherer RC, Gould WJ, Titze IR, Meyers AD, Sataloff RT. Preliminary evaluation of selected acoustic and glottographic measures for clinical phonatory function analysis. J Voice 1988;2:230-44.
36. Berke GS, Gerratt BR, Hanson DG. An acoustical analysis of the effects of surgical therapy on voice quality. Otolaryngol Head Neck Surg 1983;91:502-8.
37. Lehman JJ, Bless DM, Brandenburg JH. An objective assessment of voice production after radiation therapy for stage 1 squamous cell carcinoma of the glottis. Otolaryngol Head Neck Surg 1988;98:121-9.
38. Rontal E, Rontal M, Rolnick M. The use of spectrograms in the evaluation of vocal cord injection. Laryngoscope 1975;85:47-56.
39. Horii Y, Fuller B. Selected acoustic characteristics of voices before intubation and after intubation. J Speech Hear Res 1990;33:505-10.
40. Lin PT, Gould WJ, Fukazawa T. Acoustic analysis of voice in tonsillectomy. J Voice 1989;3:81-6.
41. Miller DG, Schutte HK. Feedback from spectrum analysis applied to the singing voice. J Voice 1990;4:329-34.
42. Horii Y. Acoustic analysis of vocal vibrato: a theoretical interpretation of data. J Voice 1989;3:36-43.
43. Aronson AE, Ramig LO, Winholtz WS, Silber SR. Rapid voice tremor, or 'flutter', in amyotrophic lateral sclerosis. Ann Otol Rhinol Laryngol 1992;101:511-8.
44. Ramig LO, Scherer RC, Klasner ER, Titze IR, Horii Y. Acoustic analysis of voice in amyotrophic lateral sclerosis: a longitudinal case study. J Speech Hear Disord 1990;55:2-14.
45. Hollien H. A review of vocal registers. In: Lawrence V, ed. Transcripts of the twelfth symposium on care of the professional voice. New York: The Voice Foundation, 1983:1-5.
46. Titze IR. A framework for the study of vocal registers. J Voice 1988;2:183-94.
47. Gates GA. First National Conference on research goals and methods in otolaryngology: recommendations of the writing groups. Ann Otol Rhinol Laryngol 1982;91(suppl 100).
48. Rothenberg M. Some relations between glottal air flow and vocal contact area. ASHA Rep 1981;11:88-96.
49. Gould WJ. Effect of respiratory and postural mechanisms upon action of the vocal cords. Folia Phoniatr 1971;23:211-24.
50. Smitheran JR, Hixon TJ. A clinical method for estimating laryngeal airway resistance during vowel production. J Speech Hear Disord 1981;46:138-46.
51. Holmberg E, Leanderson R. Laryngeal aerodynamics and voice quality. In: Lawrence V, ed. Transcripts of the eleventh symposium on the care of the professional voice: medical/surgical sessions. New York: The Voice Foundation, 1983:124-9.
52. Kitajima K, Isshiki N, Tanabe M. Use of a hot-wire flow meter in the study of laryngeal function. Studia Phonol 1978;12:25-30.
53. Abberton E, Fourcin AJ. Laryngographic assessment of normal voice. A tutorial. Clinical Linguistics and Phonetics 1989;3:281-96.
54. Childers DG, Krishnamurthy AK. A critical review of electroglottography. CRC Crit Rev in Bioeng (CRC Crit Rev Biomed Eng) 1985;12:131-61.


55. Rothenberg M. Some relations between glottal air flow and vocal contact area. ASHA Rep 1981;11:88-96.
56. Karnell MP. Synchronized videostroboscopy and electroglottography. J Voice 1989;3:68-75.
57. Scherer RC, Gould WJ, Titze IR, Meyers AD, Sataloff RT. Preliminary evaluation of selected acoustic and glottographic measures for clinical phonatory function analysis. J Voice 1988;2:230-44.
58. Childers DG, Hicks DM, Moore GP, Alsaka YA. A model for vocal fold vibratory motion, contact area and the electroglottogram. J Acoust Soc Am 1986;80:1309-20.
59. Harden RJ. Comparison of glottal area changes as measured from ultra-high speed photographs and photoelectric glottographs. J Speech Hear Res 1975;18:728-38.
60. Gerratt BR, Hanson DG, Berke GS, Precoda K. Photoglottography: a clinical synopsis. J Voice 1991;5:98-105.
61. Kitzing P. Clinical applications of electroglottography. J Voice 1990;4:238-49.
62. Childers DG, Hicks DM, Moore GP, Eskenazi L, Lalwani AL. Electroglottography and vocal fold physiology. J Speech Hear Res 1990;33:245-54.
63. Hertz CH, Lindstrom M, Sonesson B. Ultrasonic recording of the vibrating vocal folds: a preliminary report. Acta Otolaryngol 1970;69:223-30.
64. Kelsey C, Woodhouse RJ, Minifie F. Ultrasonic observations of coarticulation in the pharynx. J Acoust Soc Am 1969;46:1016-8.
65. Zagzebski JA, Bless DM, Ewanowski SJ. Pulse echo imaging of the larynx using rapid ultrasonic scanners. In: Bless DM, Abbs JH, eds. Vocal fold physiology: contemporary research and clinical issues. San Diego: College Hill Press, 1983:210-23.
66. Hamlet SL. Ultrasound assessment of phonatory function. ASHA Rep 1981;11:128-40.
67. Hollien H. Status report on instrumentation useful for craniofacial research. Cleft Palate J 1976;13:138-55.
68. Ludlow CL. Neurophysiological assessment of patients with vocal motor control disorders and assessment of speech and voice production: research and clinical applications. Proceedings of an NIDCD Conference, Bethesda, Maryland, 1990.
69. Sanders I, Aviv J, Biller HF. Transcutaneous electrical stimulation of the recurrent laryngeal nerve: a method of controlling vocal cord position. Otolaryngol Head Neck Surg 1985;95:152-7.
70. Crumley RL. Experiments in laryngeal reinnervation. Laryngoscope 1982;92(Suppl 30):1-27.
71. Thumfart WF. From larynx to vocal ability. New electrophysiological data. Acta Otolaryngol 1988;105:425-31.
72. Ramig LO. The role of phonation in speech intelligibility: a review and preliminary data from patients with Parkinson's disease. Intelligibility in speech disorders: theory, measurements and management. In: Kent RD, ed. Reference manual for communicative sciences and disorders: speech and language. Amsterdam: John Benjamin, 1992.
73. Hill FP, Meyers AD, Scherer RC. A comparison of four clinical techniques in the analysis of phonation. J Voice 1990;4:198-204.
