Journal of Voice
Vol. 10, No. 3, pp. 269-283 © 1996 Lippincott-Raven Publishers, Philadelphia
Meaningful Features of Voice Range Profiles from Patients with Organic Vocal Fold Pathology: A Preliminary Study
*†Alison Behrman, †Carolyn J. Agresti, †Esther Blumstein, and ‡†Geeta Sharma
*Communication Sciences Program, Hunter College/CUNY, New York; †Department of Otorhinolaryngology, The New York Hospital-Cornell Medical Center, New York; and ‡Cornell University Medical Center, New York, New York, U.S.A.
Summary: This preliminary study identifies features that have the potential to be meaningful descriptors of voice range profiles (VRPs) for 15 patients with organic vocal fold pathologies before and after laryngeal surgery. This study also explores the utility of the VRP as an outcome measure of change in vocal function after surgery. Potentially meaningful features for these patients are the semitone range, intensity level of the lower contour, frequency locus of the lower frequency values, smoothness of the contours, and the presence of intermittencies in the VRP contours. These features are not suggested for differential diagnosis, but for aiding the understanding of each individual patient's phonatory status. Initial use of these features suggests that the VRP may be a useful outcome measure for these patients.
Key Words: Voice range profile--Phonetogram--Laryngeal surgery--Voice outcome measure.
The voice range profile (VRP) is a graphic display of the covariance of a patient's dynamic and frequency ranges, generally without specification of vocal quality. It is referred to by other labels in the literature, including phonogram (1), phonetogram (2,3), and F0-SPL profile (4). From the earliest investigations of Wolfe (5) and Stout (6), through the resurgence of interest in the VRP that began with Damste (2) in the 1970s, the VRP has been shown to reflect information about voice source and vocal tract characteristics. Komiyama (1), Coleman et al. (4), and Klingholz and Martin (7) describe a characteristic VRP shape of the maximum intensity at each frequency (the upper contour) and the minimum intensity at each frequency (the lower contour) over the maximal frequency range. Titze (8)
and Gramming (9) examine the influence of specific harmonic and spectral characteristics on those contours. The profile contours have been examined for differences in vocal capabilities across groups of trained and untrained vocal users, male and female, and healthy and dysphonic individuals (1,4,10-12). Differences have been noted between the contours of the VRP when individuals are pushed to the physiologic limits of their intensity and frequency ranges, and when individuals maintain satisfactory, or musical, vocal quality (10,13).
The International Association of Logopedics and Phoniatrics (IALP) established an International Voice Committee in 1986 to examine issues and make recommendations concerning the standardization of voice evaluation procedures (14). A conclusion of the Committee was that the VRP is more useful as a within-subject measure than as a between-group measure. Indeed, the literature appears to support that conclusion. It does not appear that there are distinctive, between-group profile characteristics that differentially diagnose vocal pathologies, nor is it even clear whether group data consistently reflect distinct profile shapes that differentiate normal and dysphonic groups, or pre- and posttherapy groups.
Address correspondence and reprint requests to Alison Behrman, Department of Communication Sciences, Hunter College/CUNY, 425 East 25th Street, Room N-1306, New York, NY 10010-2590, U.S.A. Portions of this paper were presented at the Voice Foundation's 25th Annual Symposium, Care of the Professional Voice, Philadelphia, June 1995.
Gramming (12), in a study of males and females with nonorganic dysphonia, found that as a group, the VRPs of females showed a significant increase in the upper contour as a result of therapy, but no change for the lower contour. There was no significant change in the male group's upper and lower profile contours as a result of therapy. There was, however, a significant increase in fundamental frequency (Fo) range across therapy for both males and females. For similar groups of patients, Gramming and Akerlund (15) found that the lower contour of the female group data was not significantly different from that of normal females, while the upper contour was significantly lower than that of normal females. For males with nonorganic dysphonia, however, the lower contour was significantly increased compared to the normal male group, and the upper contour, while not significantly different, was lower than that of the normals. All VRPs were normalized and no information about differences in Fo range was provided. Akerlund (16), in contrast, found that as a group, the lower profile contour of the male patients with functional dysphonia was significantly lowered and the upper contour was unchanged. The group data for women demonstrated no change in the lower contour across therapy but a significant increase in the upper contour. Ohlsson and Lofqvist (17), comparing a group of nondysphonic subjects with a group of patients with nonorganic dysphonia, found no significant differences. They did find, however, that a group of subjects with laryngeal pathology had VRPs with reduced areas.
The IALP Committee (14) recommended that the voice range profile be studied further to achieve a classificatory scheme for the different kinds of upper and lower contour patterns observed. Numerical algorithms for characterizing relevant profile characteristics have been explored.
Airainer and Klingholz (18) and Klingholz and Martin (7) derive statistics based on elliptical dimensions of dysphonic profiles to characterize the profile contour shape. Sulter et al. (19) quantify profile shape using Fourier descriptors and derive area-related parameters quantifying maximal frequency and dynamic range, as well as area indices of habitual speaking register. To develop a classificatory scheme as called for by the IALP, however, more within-subject, pathologic profile data need to be examined to determine precisely which contour characteristics are meaningful to analysis of phonatory function. There is insufficient literature documenting the prominent characteristics of VRPs of dysphonic patients and no documentation of the use of the VRP as a measure of the within-subject change in phonatory function subsequent to surgical intervention. The majority of the studies of the VRPs of dysphonic subjects are restricted to functional dysphonia (2,9,15,16). Sulter et al. (19) analyze one dysphonic male but do not indicate the nature of the pathology. They do, however, express the same need as the IALP Committee: A large number of VRPs from different pathologies need to be analyzed to build a knowledge base of what constitutes normal and abnormal VRP features.
This exploratory study was undertaken to provide preliminary data toward answering two questions. First, what are the important features of the upper and lower contours of the VRPs of patients with organic pathology? Important features are those that may be helpful to understand the phonatory function for an individual patient and to distinguish between normal-appearing and abnormal-appearing profiles. Second, is the VRP a clinically useful, within-subject measure of change in vocal function as a result of surgical intervention? Results and subsequent interpretations should be regarded as preliminary and exploratory because the number of patients studied is small. Nevertheless, the data could well provide direction for more robust exploration.
METHODS
Subjects
VRPs were elicited from patients undergoing laryngeal surgery by the second author (C.A.) at the Department of Otorhinolaryngology at The New York Hospital-Cornell Medical Center. The data are part of a larger, ongoing study to assess the utility of a number of acoustic and aerodynamic tests as effective outcome measures of laryngeal surgery and therapeutic intervention. Results from 15 patients are reported here.
They represent all patients within a 9-month period who underwent vocal fold surgery and who had pre- and postoperative VRP testing, but no significant additional medical, motoric, or cognitive deficits that might plausibly affect the patient's ability to perform the test tasks. Table 1 lists the pertinent data for each dysphonic patient. Eight healthy women and two healthy men, ranging in age from 19 to 70, were also tested to make certain that distinctive features in the dysphonic patients' VRPs were due to laryngeal
pathology and not to procedural or instrumental differences. Each healthy subject was tested twice to simulate "preoperative" and "postoperative" testing. The average amount of time between the two testing trials for the healthy subjects was approximately 2 weeks.
TABLE 1. Pertinent data for all dysphonic patients included in this study; the left and right vocal folds are indicated by the letters "L" and "R," respectively
Patient  Sex  Age  Vocal fold organic pathology                 Surgical procedure
1        M    30   L intracordal cyst                           Blunt dissection + laser
2        M    24   R polyp                                      Sharp excision
3        F    27   R intracordal cyst                           Blunt dissection + laser
4        F    69   L papilloma                                  Sharp excision
5        F    49   L paralysis                                  Thyroplasty type I
6        F    77   R paralysis                                  Thyroplasty type I
7        F    11   Bilateral intubation granulomas              Laser excision
8        F    56   Bilat. polyps + ant. polypoid degeneration   Sharp excision + laser
9        M    85   L paralysis                                  Thyroplasty type I
10       F    75   L paralysis                                  Thyroplasty type I
11       M    62   L paralysis                                  Thyroplasty type I
12       F    78   R paralysis                                  Thyroplasty type I
13       F    24   L polyp                                      Laser excision
14       F    47   R polyp                                      Sharp excision
15       M    74   L paralysis                                  Thyroplasty type I
Instrumentation
All VRPs were elicited using the Kay Elemetrics Voice Range Profile Model 4326. The instrumentation consisted of a microphone mounted on a headset for transduction of the acoustic signal. The microphone was positioned just below the center of the lower lip of the patient and was connected to an IBM-compatible PC. The acoustic signal was digitized at a sample rate of 51.2 k samples/s with 16-bit resolution. The software displayed a "piano keyboard," as in Fig. 1, which could be "played" by clicking on a piano key with the mouse button. The stimulus tone was sounded through a speaker connected to the computer. The patient was required to match the pitch of the stimulus tone while phonating /a/. The output gain of the stimulus tone was adjustable as the needs of the patient required. Stimulus frequencies were limited to semitone pitches from D -2 to G #. The patient's mean Fo had to be within a range equal to halfway between the frequency just below and just above the stimulus tone to be considered acceptable, and had to have a minimum duration of 160 milliseconds, the longest sampling duration option allowed by the software. If the phonation was considered acceptable, a green mark, corresponding to the intensity (y-axis) and Fo
(x-axis) at which the patient phonated, was plotted on the screen. The intensity was measured using the dB(C) scale with fast-meter damping.
Procedures
All VRPs were elicited by one speech-language pathologist. Assessment of the vocal quality of each patient was made separately by the examining clinician and by a second speech-language pathologist (the third and first authors, respectively), both experienced in vocal rehabilitation, as well as by an otolaryngologist (the second author) who specializes in laryngeal surgery. Otolaryngologic examination was conducted by the second author. Videostroboscopy, using the Kay Elemetrics RLS 9100 videostroboscope system, was performed by the second author, with reviews of the videotapes by all of the authors as a group. Criteria used to rate the videos were generally consistent with those described by Hirano and Bless (20).
Because the VRP is a behavioral test requiring complex cooperative behavior on the part of the patient, understanding the elicitation methodology is necessary for correctly interpreting the VRP (21). Three points are noteworthy. First, phonation was elicited at each Fo of the musical (semitone) scale across the entire extent of the patient's frequency range, and the dynamic range was tested at each frequency. This is in contrast to the more common method of eliciting the patient's maximum phonational frequency range and testing the dynamic range at decile increments of the frequency range.
FIG. 1. The display screen of the Kay Elemetrics Voice Range Profile Model 4326.
Second, the lower contour, representing the softest phonation, was tested first at each frequency, then the upper contour, representing loud phonation, was tested at each frequency. The patient was not required to phonate throughout his or her dynamic range at each frequency (a difficult task for many patients and requiring a different biomechanical and aerodynamic coordination than that used in ordinary speaking). Third, the upper contour was not elicited as maximal physiologic intensity, but rather as comfortably loud phonation, as explained herein. The clinician was seated in front of the computer, and the patient was seated next to the clinician facing the computer screen. The purpose and method of the VRP were explained to the patient. Water was provided for the patient to sip at will. A stimulus tone was presented to the patient, with instructions to phonate /a/ at the pitch of the stimulus tone. Specific instructions to the patient are described herein for each of the contours.
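The pitch-matching acceptance rule described under Instrumentation can be sketched in a few lines of Python. This is only an illustration and not the Kay Elemetrics software's implementation: the function names are hypothetical, and the sketch assumes that "halfway" means the arithmetic midpoint in hertz between the stimulus tone and each neighboring equal-tempered semitone.

```python
SEMITONE = 2 ** (1 / 12)  # frequency ratio of one equal-tempered semitone


def acceptance_band(stimulus_hz):
    """Return (low, high) bounds in Hz around a stimulus tone.

    Assumption: each bound is the arithmetic midpoint between the stimulus
    and its neighboring semitone; the source text does not spell this out.
    """
    below = stimulus_hz / SEMITONE
    above = stimulus_hz * SEMITONE
    return (below + stimulus_hz) / 2, (stimulus_hz + above) / 2


def is_acceptable(mean_fo_hz, duration_ms, stimulus_hz, min_duration_ms=160):
    """True if the phonation matches the stimulus pitch and lasts long enough."""
    low, high = acceptance_band(stimulus_hz)
    return low <= mean_fo_hz <= high and duration_ms >= min_duration_ms
```

Under these assumptions, a phonation with a mean Fo of 430 Hz lasting 160 ms would be accepted for a 440-Hz stimulus, whereas one at 466 Hz (the next semitone up) or one lasting only 100 ms would not.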
Lower contour
Instructions to the patient were to vocalize /a/ as quietly as possible, "just louder than a whisper." Intermittent aphonic breaks were encouraged as a sign that the boundary of vocal fold oscillation was being reached. The patient was cued with hand signals to phonate more softly. Visually, the patient was encouraged to make the green marks go as far down the y-axis as possible for a given tone. The patient was encouraged to take as many breaths as required during sampling at each frequency, rather than trying to phonate only after a single inspiration. As reviewed by Coleman (20), the percent of vital capacity at which a patient is phonating can significantly affect the minimum intensity of the phonation. Inspiratory checking (maintenance of contraction of the muscles of inspiration, the diaphragm and external intercostal muscles) may be necessary to limit the subglottal pressure and airflow during quiet phonation. Especially for untrained speakers, some degree of compensatory intrinsic laryngeal muscle contraction is also likely, which would probably alter the resistance of the vocal folds to the airflow.
The lower contour of the VRP is the physiologic minimum intensity achievable for each frequency. Because subglottal pressure is the major regulator of intensity, especially at the lower frequencies (22,23), the lower contour of the VRP is believed to represent the intensity achieved by using phonation threshold pressure (PTP) for each frequency, the minimum subglottal pressure required to initiate phonation (24). Because the pressure required to initiate vocal fold vibration is greater than the pressure required to sustain vibration (24), this methodology was believed to more accurately reflect the lowest possible intensity for each frequency. The patient was free to use whatever biomechanic and aerodynamic settings were necessary to produce the softest phonation at a given frequency.
Sampling began at the Fo that was comfortable for the patient, near the patient's habitual speaking Fo, and then proceeded up the musical scale through falsetto until the patient was unable to sustain phonation for at least 160 ms. The clinician then returned to the comfortable starting frequency and proceeded down the musical scale until the patient could not sustain modal phonation for 160 ms. All phonations that were judged perceptually as predominantly glottal fry (pulse register) and that were also below approximately 78 Hz were rejected. The clinician then resampled portions of the lower contour until a stable contour was achieved (until the patient did not decrease intensity at a given frequency below that which had already been produced at that session). The lower contour was always elicited first, because it was believed to be less fatiguing and effortful for the patient to produce than the upper, louder contour.
Upper contour
Constraints on the upper physiologic limit of intensity are difficult to quantify. Even in a private and enclosed area, patients are often inhibited from vocalizing to their maximum loudness in a nonemergency situation. Indeed, such behaviors are contraindicated in patients with vocal fold pathology. Additionally, both pre- and postoperative patients frequently expressed concern about causing further damage to their voices with loud vocalizing. Therefore, instructions to the patient were to vocalize /a/ just below the intensity level at which the patient felt concern about pain or vocal damage. This level is referred to as maximal comfortable loudness. At each frequency sampled, the patient was encouraged with hand signals to phonate louder, until the clinician was confident that the sample was representative of maximal comfortable loudness. As with the lower contour, the patient was encouraged to take as many breaths as required during sampling at each frequency. Visually, the patient was encouraged to make the green marks go as far up the y-axis as was comfortable for a given frequency. The order of frequency sampling for the upper contour was similar to that for the lower contour, starting at comfortable Fo and proceeding upward, and then returning to comfortable Fo and proceeding downward. The clinician then resampled portions of the upper contour until a stable contour was achieved (until the patient did not increase intensity at a given frequency above that which had already been produced at that session).
Data Analysis
For each individual VRP, the minimum and maximum intensity values for each elicited frequency were determined by placing the mouse cursor on the graphic display. The software displayed the frequency and intensity values for each location, and the values were noted. Values were judged to be artifact and not included if the frequency or intensity value was an isolated point that the individual could not replicate. In general, this occurred most frequently during elicitation of the minimum intensity from patients whose vocal quality was very breathy. The data were input into separate plotting software (SigmaPlot) for graphing. The VRPs of the healthy individuals were normalized to compare minimum contour intensity levels of specific percentiles of the total range with values from the literature. That is, for each VRP, the frequencies in hertz were converted to semitones relative to the lowest frequency and expressed as percentiles of the total range. The total range was then divided into 10 equal increments, and all intensity measures falling within the same increment were averaged (9).
All three authors visually examined each VRP of the dysphonic patients to assess profile normality. It is recognized that examiner bias most likely influences perceptual judgments about the normality of the data. Vocal quality and patient history probably result in expectations of how a patient will perform, and therefore those expectations may influence data interpretation (25). In an attempt to remove that bias, a second method was used to assess each VRP. Each VRP from the healthy and dysphonic individuals, marked only with the sex of the speaker, was shown to three experienced voice scientists, with instructions to categorize the profiles as "normal" or "abnormal," using whatever criteria the rater desired. Each of the two VRPs per individual (preoperative and postoperative of the patients; trials one and two of the healthy subjects) was shown on a separate page, and the order of all the VRPs was randomized. Five of the profiles were duplicated to check intrarater reliability. The elicitation methodology for the lower and upper contours was explained to each rater before the task. After the task was completed, each rater explained what criteria he or she had used.
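The normalization step described under Data Analysis (hertz converted to semitones relative to the lowest frequency, the total range divided into 10 equal increments, and intensities averaged within each increment) might be sketched as follows. This is an illustrative reading of the description, not the authors' actual procedure: the function name is hypothetical, and placing the highest frequency in the last increment and returning None for empty increments are assumptions.

```python
import math


def normalize_vrp(points, n_bins=10):
    """Average intensity per equal increment of the semitone range.

    points: list of (frequency_hz, intensity_db) pairs for one contour.
    Returns a list of n_bins mean intensities (None where a bin is empty).
    """
    f_min = min(f for f, _ in points)
    # Semitones above the lowest elicited frequency.
    semis = [(12 * math.log2(f / f_min), db) for f, db in points]
    total = max(s for s, _ in semis)
    if total == 0:
        raise ValueError("contour must span more than one frequency")
    bins = [[] for _ in range(n_bins)]
    for s, db in semis:
        # Assumption: the top of the range falls in the last increment.
        idx = min(int(s / total * n_bins), n_bins - 1)
        bins[idx].append(db)
    return [sum(b) / len(b) if b else None for b in bins]
```

For example, a hypothetical contour with points at 100, 200, and 400 Hz (60, 70, and 80 dB) spans 24 semitones; with two increments, the first holds the 100-Hz point and the second holds the other two.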
RESULTS
Independent Rater Judgments
Table 2 shows the summary statistics for the healthy and dysphonic subjects. Four of the 15 dysphonic patients could not sustain sufficiently periodic phonation for 160 ms at more than one Fo to create a VRP preoperatively. These four had unilateral vocal fold paralysis. (The other three patients in this study with vocal fold paralysis were able to phonate sufficiently preoperatively to create a VRP.) All patients, however, were able to create a VRP postoperatively. Because four patients were too dysphonic to create a preoperative VRP, considerable caution is urged in interpretation of average data for the dysphonic group. Additionally, because of some unique, abnormal characteristics in some of the patients' VRPs, such as gaps in the profiles (discussed herein), averaged values may appear more "normal" than they actually are.
TABLE 2. Summary statistics for healthy subjects and dysphonic patients; caution should be used in interpreting group data for the dysphonic patients because 4 of the 15 patients could not produce a preoperative VRP at all, and therefore are not included in the "preop" statistics
                      "Preop"   "Postop"   Combined
Healthy subjects
  Semitone range
    Mean range         30.3      30.4       30.3
    Mean SD             6.9       6.2        5.6
    Mean maximum       40.0      39.0       40.0
    Mean minimum       19.0      21.0       19.0
  Intensity (dB)
    Mean maximum       82.7      92.1       91.9
    Mean SD             5.6       4.6        5.0
    Mean minimum       64.2      64.2       63.9
    Mean SD             4.3       3.7        3.6
    Mean range         28.5      28.1       28.0
Dysphonic patients
  Semitone range
    Mean range         13.5      19.5       18.0
    Mean SD             5.0       6.3        7.2
    Mean maximum       22.0      32.0       34.0
    Mean minimum        8.0      10.0        8.0
  Intensity (dB)
    Mean maximum       79.5      85.5       83.6
    Mean SD             9.4       5.8        7.7
    Mean minimum       71.2      73.8       72.5
    Mean SD             7.0       4.2        5.3
    Mean range          8.3      11.7       11.1
VRP, voice range profile.
All of the independent raters judged all of the VRPs from the healthy individuals as being normal profiles. Of the postoperative VRPs from the dysphonic patients, rater 1 judged three as normal, rater 2 judged five as normal, and rater 3 judged eight as normal. All of the postoperative VRPs judged as normal by rater 1 were similarly judged by the other two raters, and those VRPs judged as normal by rater 2 were similarly judged by rater 3. Rater 1, unlike the other two, is also a singer and specified stricter dynamic and frequency range criteria than the other two raters. Two preoperative VRPs were judged as normal, both by two of the raters. The postoperative VRPs for those dysphonic patients were also judged as normal by both of those raters.
For rater 3, the entire task was repeated, but the age of each individual was added to the VRP. The only difference in the results of this task was that three additional postoperative VRPs were judged to be normal. The ages of those individuals were 47, 50, and 78 years, respectively. Because of the relative youth of two of those individuals, it is not clear whether the change in judgment was due to the additional information about the age or to intrarater variability. The rater did comment during the task that some of the VRPs appeared "borderline." All three raters were 100% consistent in their ratings of the duplicate profiles.
The criteria used by all of the raters to include a VRP in the normal category were frequency range, dynamic range, minimum intensity level, and the absence of gaps in the upper and lower contours. From numerical values specified by some of the raters and by assessing those VRPs judged to be normal by them, the minimum frequency range appeared to be approximately 2 octaves. The minimum dynamic range required by the raters to judge a profile as normal appeared to vary considerably, but in general a 30-dB range at the widest profile area was required. Minimum intensity level varied considerably as well, but in general at least the expected habitual speaking frequencies for the individual needed to be close to 60 dB. When there were isolated frequencies within the maximal frequency range at which the patient could not phonate, resulting in a gap in the contour, the VRP was considered abnormal.
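To make the raters' approximate criteria concrete, the sketch below summarizes a profile by its semitone range, its widest dynamic range, and any gaps, and then applies the two numerical thresholds mentioned above (about 2 octaves and about 30 dB). It is a hedged illustration only: the function names and the data layout are hypothetical, the thresholds are the approximate values inferred in the text rather than validated cutoffs, and the minimum intensity criterion is left out because no tolerance around 60 dB was specified.

```python
def summarize_vrp(contours):
    """Summarize one profile.

    contours: dict mapping a semitone number (int, e.g., a MIDI note) to
    the (min_db, max_db) pair measured at that semitone.
    """
    notes = sorted(contours)
    semitone_range = notes[-1] - notes[0]
    widest_db = max(hi - lo for lo, hi in contours.values())
    # A gap is a semitone inside the range at which nothing was recorded.
    gaps = [n for n in range(notes[0], notes[-1] + 1) if n not in contours]
    return {"semitone_range": semitone_range, "widest_db": widest_db, "gaps": gaps}


def meets_rater_criteria(summary, min_semitones=24, min_db_range=30):
    """Apply the approximate range and gap criteria (thresholds are assumptions)."""
    return (summary["semitone_range"] >= min_semitones
            and summary["widest_db"] >= min_db_range
            and not summary["gaps"])
```

A hypothetical profile spanning 24 contiguous semitones with a 32-dB dynamic range would pass both checks, while the same profile with one missing semitone would be flagged because of the gap.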
All of the raters expressed some difficulty with the task, largely due to the artificial nature of assessing isolated patient data without other additional information about each individual. The most frequent issue raised was not having the clinical judgment available regarding whether the patient had indeed used maximal effort (i.e., had been "pushed" by the clinician) to attain the minimum and maximum frequencies and the minimum intensity level.
Important Features of the VRPs
Five features of the VRP are suggested as being potentially meaningful descriptors of the profiles of the dysphonic patients studied: the semitone range, the intensity level of the lower contour, the smoothness of the upper and lower contours, the frequency locus of the lower frequency values, and the presence of gaps in the contours. That is, these features show the potential for distinguishing between normal-appearing and abnormal-appearing profiles. These features also show the potential for assisting in the description of the phonatory function of the individual patients. The selection of these five features was made from a combination of input from the three independent raters, as well as examination of the VRPs by the authors.
One feature that appears to be a potentially meaningful descriptor of VRPs for the patients studied is the semitone range. Only 3 of the 15 postoperative VRPs had a semitone range of approximately 2.5 octaves (Nos. 1, 10, 14). Two preoperative profiles had a semitone range of approximately 2.0 octaves (Nos. 5, 10). The other postoperative profiles, as well as the profiles of the 11 patients who could phonate preoperatively, had semitone ranges below 2 octaves. Excluding Patients 7 and 8, whose entire semitone range shifted across surgery (discussed herein), two patients (Nos. 2, 13) demonstrated a consistent semitone range across surgery, and none demonstrated a decrease, leaving 11 patients who demonstrated an increased range. This suggests that the semitone range may be an important descriptor of change in phonatory status as a result of surgery.
A second feature of the VRPs of the dysphonic patients studied that may reveal important information about phonatory function is the intensity level of the lower contour. The preoperative VRP from one patient (No. 1) had a lower contour that was <60 dB over the lowest octave of the profile.
The pre- and postoperative VRPs from the other 14 patients all had lower contours >60 dB over the lower octave of the profile. Using the intensity level of the lower contour to assess change in phonatory function across surgery did not appear helpful for many of the patients studied, either because the patient was too dysphonic to create a preoperative VRP or because the presence of gaps in the contours prevented clear assessment of overall minimum intensity level. The minimum intensity level, however, may provide information about phonatory function. Closer examination of the VRP of Patient 9 (Fig. 3) (an 85-year-old man with left vocal fold paralysis) 3 weeks after thyroplasty Type I reveals that the overall shape of the VRP is essentially normal, with increased intensity as Fo increases, as well as a narrower dB range at the extremes of the Fo range. The intensity level of the lower contour, which is elevated overall, suggests a possibly elevated PTP, or at least an alteration in the relationship between the PTP and the biomechanic characteristics of the vocal folds. This altered relationship is consistent with the videostroboscopic findings of a nonmobile left vocal fold and mild compression of the ventricular folds. It is conceivable that these result in increased damping. The nonmobile vocal fold could result in increased energy loss from the airflow, where less of the transglottal pressure converts to oscillatory driving pressure. It is also possible that the compression of the ventricular folds alters the acoustic back pressure from the vocal tract, increasing the inefficiency in the transfer of energy from the airflow to the tissue (24). These are only hypotheses, which require additional testing and experimentation to substantiate.
Although the independent raters clearly identified overall dynamic range as an important judgment criterion, it was not included as a potentially meaningful descriptor of the profiles of these patients.
The unique elicitation methods for the upper contour that resulted in maximal comfortable phonation rather than maximal physiologic phonation need to be assessed more broadly in normal, healthy individuals before a comparison could be made with dysphonic patients. However, the intensity level of the lower contour, together with the smoothness of the upper and lower contours (to be discussed), indirectly addresses the intensity range of the profiles. For example, the postoperative profile of Patient 3 (Fig. 2) clearly shows an increased intensity range in the middle frequencies. However, by discussing the change in intensity level of the lower contour and the change in the smoothness of the lower contour across surgery, the intensity range is addressed.
[Figure 2: six panels, Pt #1 through Pt #6; x-axis, Fo (Hz); y-axis, intensity, 60 to 100 dB.]
FIG. 2. Voice range profiles (VRPs) for Patients 1 through 6. The solid lines with the unfilled circles represent the preoperative profiles. The dashed lines with the filled circles represent the postoperative profiles. For those plots in which no preoperative profile is presented, the patient was too dysphonic to create a VRP. Caution should be used in comparing profiles of different patients, because the x-axis scale (frequency) is not consistent for each patient. This inconsistency was necessary to provide clear detail for some of the smaller profiles.
[Figure 3: six panels, Pt #7 through Pt #12; x-axis, Fo (Hz); y-axis, intensity, 60 to 100 dB.]
FIG. 3. VRPs for Patients 7 through 12.
A third feature of the VRPs of the dysphonic patients that appears to reveal important information about phonatory function is the frequency locus of the lower frequency values. For 2 of the 15 dysphonic patients studied, one adult and one child (Patients 7 and 8 in Fig. 3), the frequencies at which the lower end of the VRP area is located are not within the expected habitual speech frequency range for the patient's sex and age. The postsurgical profiles for each of these patients are increased in range and relocated closer to expected habitual speech frequencies. Both presurgical VRPs were rated as abnormal in the independent rating, and only one rater judged one of the postsurgical profiles in this group to be normal (Patient 7).
An example of a dramatic change in frequency locus is found in the profile of Patient 7 (Fig. 3), an 11-year-old girl with bilateral intubation granulomas. The presurgical profile frequency range extends from 440 to 987 Hz, while 3 weeks postoperatively the frequency range is in a lower register, extending from 196 to 494 Hz. Vocal quality was high pitched and severely strained, with frequent aphonic breaks preoperatively and within normal limits postoperatively. Preoperative videostroboscopic assessment revealed an exophytic, pedunculated mass attached posteriorly at the junction of the vocal folds and the arytenoids. The posterior edge of the vocal folds appeared irregular; the middle third appeared edematous. An hourglass type glottic closure resulted from the large lesion obstructing closure posteriorly, while the middle third of the vocal folds approximated due to excessive edema. The open phase predominated during phonation. Of particular note is that the mass flopped into the subglottic region on inhalation, where it remained during phonation. (Frequently, a lesion of this nature is expected to project into the supraglottic area during phonation.)
Interestingly, the patient frequently used inspiratory phonation. Vibration was confined to a small portion of the vocal folds, which overall demonstrated increased stiffness with decreased mucosal wave. Typically, children would be expected to have a more marked mucosal wave than adults, due to the thick, pliable mucosa. Videostroboscopy 3 weeks after surgery revealed all parameters to be within normal limits, with smooth vocal fold edges and complete glottic closure. The increased stiffness and decreased vibratory region of the vocal folds likely contributed to the unusually
high frequency range of the VRP preoperatively. For both Patients 7 and 8, comparison of the pre- and postoperative VRPs suggests significant change in the phonation status across surgery.
A fourth feature that may be an important descriptor of the profiles is the smoothness of the upper and lower contours. Most of the patients' VRPs are remarkable not only for decreased frequency and/or dynamic range, but also for unique shapes, especially in the areas where register transitions might be expected to occur. (See especially the profiles of Patients 1, 3, and 6 in Fig. 2; 7, 8, and 9 in Fig. 3; and 13, 14, and 15 in Fig. 4.) Indeed, although none of the independent raters specified contour smoothness as a criterion, all three commented on the "unusual" shapes of many of the profiles during the rating task.
An example of a change in contour smoothness across surgery is demonstrated in the VRPs from Patient 14 (Fig. 4), a 47-year-old woman with a right vocal fold polyp. The frequency range of the VRP increased from 21 to 30 semitones 3 weeks after sharp excision of the polyp. The postoperative VRP was judged normal in the independent rating. The postoperative frequency range is equivalent to the mean range for the healthy subjects. The overall mean intensity of the lower contour dropped from 76 to 73 dB. There is, however, a noticeably smoother transition postoperatively at 349 Hz, the region generally associated with a register transition from modal to falsetto, with a greatly increased dynamic range in the higher frequencies. Videostroboscopy revealed a moderately irregular right vocal fold, smooth left vocal fold, incomplete glottic closure, and predominant open phase during phonation preoperatively. Three weeks after surgery, the left vocal fold was slightly irregular, the right was smooth, glottic closure was complete, and all other parameters were within normal limits.
According to traditional biomechanical theory, the transition between registers is due to a change in the balance of the contractile forces between cricothyroid and thyroarytenoid muscles as the F0 is increased (25,26). However, it is possible that register transition may also be dependent on changes in the subglottal resonances (27,28) and sufficient collisional force between the vocal folds (29). It is conceivable that the decreased smoothness in register transition may be due in part to a preoperative inability to alter glottal geometry to effectively compensate for changes in subglottal resonance, as well as a decreased collisional force.
FIG. 4. VRPs for Patients 13 through 15. (Panels: Pt #13, Pt #14, Pt #15; horizontal axis: Fo (Hz); vertical axis ticks at 60, 80, and 100.)
Abrupt changes in the slope of the upper and lower contours have been discussed frequently in the literature as a function of formant frequency and register transition (1,8,9). For many of the abrupt slope changes in this study, especially in the lower frequencies, formant frequency was probably not a dominant factor, because of the relatively high frequency of the first formant for the vowel /a/. Change in the mode of vibration, due to register transition from modal to falsetto, could well have resulted in slope discontinuities, that is, either an abrupt narrowing of the dynamic range (equivalent to an abrupt change in slope of both contours) or a gap in contours. A fifth feature of the VRPs that may be an important descriptor of the patient profiles is the presence of intermittent gaps in the upper and lower contours. These gaps result from a patient's inability to sustain sufficiently periodic phonation for 160
ms at intermittent musical fundamentals within his or her tested range. The VRPs from 5 of the 15 dysphonic patients demonstrated this feature (Patients 1 and 4 in Fig. 2; Patients 11 and 12 in Fig. 3; Patient 13 in Fig. 4). Three of these patients demonstrated postsurgical contours that contained no gaps within the elicited frequency range (Nos. 1, 4, 13). Two VRPs demonstrated gaps postsurgically (Nos. 11, 12), but there were considerably fewer gaps, and each of the postsurgical VRPs showed an overall increased frequency and dynamic range. Vocal fold pathologies were varied and included intracordal cyst, polyps, and paralysis. Vocal quality preoperatively also varied among the patients, from mild to severe dysphonia. None of the presurgical VRPs were judged to be within normal limits by the independent raters. One of the postoperative VRPs was judged as being normal by all three raters (No. 1).
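The notion of "fewer gaps" in a contour can be made concrete with a small counting routine. The following Python sketch is illustrative only and is not the procedure used in this study: it assumes a contour is stored as a list of intensity values at successive musical fundamentals, with `None` marking a fundamental at which no sufficiently periodic phonation was obtained. The function name `count_gaps` and the example values are hypothetical.

```python
from typing import Optional, Sequence


def count_gaps(contour: Sequence[Optional[float]]) -> int:
    """Count runs of missing points strictly inside the elicited range.

    Missing points before the first or after the last measured value are
    treated as lying outside the patient's range, not as gaps.
    """
    measured = [i for i, value in enumerate(contour) if value is not None]
    if not measured:
        return 0
    first, last = measured[0], measured[-1]

    gaps = 0
    in_gap = False
    for value in contour[first:last + 1]:
        if value is None:
            if not in_gap:  # a new run of missing points begins here
                gaps += 1
                in_gap = True
        else:
            in_gap = False
    return gaps


# Hypothetical lower contour (dB) at successive semitone steps.
example = [None, 70.0, 72.0, None, None, 75.0, None, 78.0, None]
print(count_gaps(example))  # two interior gaps
```

Counting runs rather than individual missing points treats several adjacent missing fundamentals as a single gap; whether that is the clinically preferable definition is an open choice.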
An example of gaps in both pre- and postoperative profiles is found in the VRPs of Patient 11 (Fig. 3), a 62-year-old man with a history of non-small cell lung cancer and metastasis to the right cervical lymph nodes, who presented with left vocal fold paralysis. Vocal quality was severely breathy and wet, with a pulsed quality at all frequencies tested. Three months after Type I thyroplasty was performed, videostroboscopy revealed normal glottic closure, with absence of vibratory movement of the left vocal fold. Three months postoperatively, the patient's vocal quality was markedly less breathy, but a pulsed quality was maintained at all frequencies. The preoperative VRP of this patient is highly fragmented. While the postoperative VRP demonstrates a marked increase in frequency and dynamic range, the area in which the patient's mean speaking fundamental frequency is located, approximately 100 Hz, shows a more restricted dynamic range. Although phonation in the region
DISCUSSION
The first purpose of this preliminary study was to explore those features of the pre- and postoperative VRPs of 15 patients with organic vocal fold pathology that may be helpful in understanding the phonatory function of those individual patients. The second purpose was to provide initial data for determining whether the VRP is a clinically useful within-subject measure of change in vocal function subsequent to surgery.
The preliminary data presented in this study suggest there are five features of the upper and lower contours of the VRPs that may be meaningful descriptors of the types of profile patterns observed and that were used to judge change in phonatory function: semitone range, intensity level of the lower contour, frequency locus at which the lower frequency values of the profile are located, smoothness of the upper and lower contours, and continuity of the two contours. There was no attempt to use these profile features to differentially diagnose organic pathology, nor was there an attempt to use these profile features as patient subgroup descriptors. Rather, they were explored as features that may assist in both characterizing the nature of the individual's dysphonia and assessing the change in vocal function after surgery. All results and interpretations need to be assessed as preliminary and exploratory in nature, because the number of patients studied was small.
The mean frequency range for the healthy subjects of 30 semitones (2.5 octaves) is slightly less than that reported by Coleman et al. (4), but is consistent with octave range data reported by Gramming (9) for healthy individuals. The test-retest mean difference of <1 semitone is consistent with Awan (10) and Colton and Hollien (30).
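Frequency ranges appear in this report both in hertz and in semitones. As a minimal sketch of the standard equal-tempered conversion (12 semitones per octave, so the interval is 12 times the base-2 logarithm of the frequency ratio), the following Python snippet converts a pair of frequency limits to a semitone range. The function name `semitone_range` is hypothetical, and the endpoints are the Patient 7 values quoted in the Results; the semitone figures it prints are calculated here for illustration and are not values reported by the study.

```python
import math


def semitone_range(f_low_hz: float, f_high_hz: float) -> float:
    """Return the interval between two frequencies in equal-tempered semitones."""
    if f_low_hz <= 0 or f_high_hz <= 0:
        raise ValueError("frequencies must be positive")
    return 12.0 * math.log2(f_high_hz / f_low_hz)


# Patient 7 frequency limits as quoted in the text (Hz).
pre_op = semitone_range(440.0, 987.0)
post_op = semitone_range(196.0, 494.0)
print(f"preoperative: {pre_op:.1f} semitones")
print(f"postoperative: {post_op:.1f} semitones")

# The healthy-subject mean of 30 semitones expressed in octaves.
print(30 / 12)
```

Working in semitones rather than hertz makes ranges at different frequency loci directly comparable, which matters when a profile shifts register across surgery as Patient 7's did.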
The mean frequency ranges for the dysphonic patients of 13.5 and 19.5 semitones pre- and postoperatively, respectively, are both well below the data for normal subjects, while the increase in semitone range across surgery is greater than would be expected from normal test-retest differences (10,30,31). Therefore, the postoperative increase of 6 semitones can be attributed to the surgical procedures.
The mean minimum intensity of 64 dB for the healthy subjects is greater than the mean of 58 dB for males and 55 dB for females found by Coleman et al. (4), as well as the mean of approximately 60 dB reported by Gramming (9). The overall mean of the lower contour of the normalized VRPs of the healthy individuals from the lowest fundamental frequency up to 60% of the fundamental frequency range was 61.5 dB, which compares to the mean of 60 dB reported by Gramming (9). The test-retest mean differences in the lower and upper contours of <1 dB for the healthy subjects and almost 2 dB in the lower contour for the dysphonic patients are within the average 3 to 4 dB differences reported by Gramming (9) and Awan (10). The 6-dB increase in the upper contour of the dysphonic patients after surgery, however, is greater than expected, and therefore can be attributed to surgery. The mean maximum intensity for all individuals was considerably less than the data reported by Coleman et al. (4) (113 dB for females and 117 dB for males), as expected, given that the upper contour was maximal comfortable intensity, not maximal physiologic intensity.
It should be noted that there may have been a bias toward higher intensity levels for the lower contours of the dysphonic patients. The microphone, although generally protected with a windscreen, was positioned just below the center of the lower lip of each patient. For those patients with significantly breathy phonation, the resulting high frequency turbulent noise may have resulted in an increase in intensity independent of increased harmonic energy. Coleman (20) states that possibly as much as a 15-dB gain in amplitude of the acoustic waveform may be achieved by the addition of significant noise.
The raters either expressed overtly or demonstrated by the profiles they classified as normal that the three criteria of frequency range, dynamic range, and intensity level of the lower contour were not assessed independently. Minimum intensity levels below approximately 60 dB, especially in the lower frequencies, and a larger dynamic range in the middle frequencies were commonly cited by the raters as criteria for normal VRPs.
It appeared that a large increase in overall dynamic range compensated for restricted frequency range and/or higher than expected minimum intensity level and vice versa. In other words, although all of the raters attempted to impose objective and stringent criteria on their task, clinical judgment and interpretation of the overall VRP shape remained an important factor. This is an important point to bear in mind when, as diagnosticians, we try to reduce graphically presented patient data to numerical indices alone. Only approximately one third of the postoperative VRPs were judged normal by the independent raters. Indeed, the postoperative mean semitone
range and the mean upper and lower contour values were not within normal limits. There are five plausible reasons why more of the postoperative VRPs did not appear normal. First, the mean number of weeks elapsed between pre- and postoperative testing for the patients in this study was approximately 4 weeks, with a minimum of 1 week and a maximum of 12. It is likely that resolution of surgical changes and complete return of mucosal wave function take >4 weeks. Second, the recent surgery may have made the patients hesitate to "perform" vocally. Third, for certain of the organic pathologies, especially unilateral paralysis, establishment of completely normal phonatory function is not expected. The goal of the surgery in those cases is to provide optimal vocal function given structural and functional limitations. Fourth, fully one third of the dysphonic patients were aged >70 years, and therefore it is reasonable to expect some diminution of vocal function related to presbylarynx. However, Ramig and Ringel (32) demonstrate that physiologic age may be a more robust predictor of change in phonatory function than chronologic age. Patients who were clearly too ill to participate in vocal testing were not included in this study. Nevertheless, the dysphonic subjects in this study were not selected for health, and a number of patients had medical histories remarkable for substantial disease. Therefore, the number of patients of the 15 in this study who would be classified as physiologically aged may indeed be greater than the number who are simply aged >70 years. Fifth, due to the organic pathology, some of the patients may have established negative compensatory behaviors that persisted after surgery. In fact, videostroboscopy did reveal some degree of persistent hyperfunction after surgery in many of the patients, in addition to a hyperfunctional vocal quality.
The fact that more postoperative profiles did not appear to be normal, however, is not a negative reflection on the VRP's utility as a within-subject measure of change in vocal function after surgery. For these patients, the VRP provided information about the status of phonatory function relative to the presurgical status and relative to the potential optimal vocal performance for that individual patient.
Somewhat surprisingly, there was almost no difference in the judgments of normal versus abnormal VRPs between the independent raters and the authors. The only real difference was that none of the preoperative VRPs were judged normal by the authors during the weekly voice laboratory patient review meetings. Quite likely, this was due to a bias effect based on the prior knowledge that the patient was a candidate for vocal fold surgery. It should be noted, however, that as a group, the independent raters demonstrated greater confidence in their assignment of VRPs to the abnormal group than to the normal group. (This is likely due to the artificial nature of the task, as discussed earlier; within routine clinical practice, the VRP is considered along with other patient data.)
Interestingly, the fact that the upper contour did not represent physiological maximal intensity did not appear to confound the raters, as demonstrated by the 100% consistency in identifying the healthy individuals' VRPs as normal. It appears that the task modification of using maximal comfortable intensity, while providing no information about physiologic maximal intensity, is a workable compromise for clinical settings with patients who present with laryngeal pathology.
In spite of the variable judgment criteria, instrumentation, and elicitation methodology, the results indicate that there appears to be a conceptualization of what constitutes a normal-appearing VRP that is common to voice specialists. This is important, because it is a prerequisite to the use of the VRP as a standardized assessment and outcome measure. It suggests that the VRP as an assessment measure may be more robust with respect to inevitable variations in methodology than previously believed.
Some caution should be used in interpretation of these VRPs. First, as Airainer and Klingholz (18) state, these data are but snapshots of phonatory behavior at one moment in time. They are, however, larger snapshots than would be achieved by simply having the patient phonate sustained vowels at self-selected comfortable pitch and loudness.
Second, although all efforts were made to maximize vocal performance, the cooperation and motivation of the patient are a significant influencing variable on the data. Third, the lack of vocal warm-up before elicitation of the VRPs may have differentially affected the healthy and dysphonic groups. Elliot et al. (33) found that the effect of vocal warm-up on PTP varied considerably among trained singers. Dysphonic patients often report that their voice is considerably improved after it is warmed up. Because the lower contour was elicited first in this study, it is possible that a prior period of vocal warm-up exercises would have resulted in greater frequency and dynamic ranges for the dysphonic patients than a similar warm-up period for the
healthy subjects. A fourth caution in interpretation of the VRPs in this study is the risk that they overestimate the phonatory abilities of the dysphonic patients more than the healthy subjects due to the short sampling duration of 160 ms. The healthy subjects are more likely than the dysphonic patients to be able to sustain speech at each of the points elicited in their respective VRPs. However, the profiles of each of the dysphonic patients, both pre- and postoperatively, in general were consistent with the other data, both perceptual and instrumental, for each patient. From a purely clinical interpretation, the profiles of the dysphonic patients did not appear to overestimate phonatory function.
Further study of the profile features identified in this preliminary study from larger groups of dysphonic patients could be usefully applied to increasing our understanding of phonatory behavior after surgery. Further study of the VRP over time for individual patients may provide information eventually leading to the use of the VRP as a prognostic test of the optimal vocal performance a patient could reasonably expect to achieve after a specific surgical procedure. Continued efforts to standardize the ways to quantify the relevant features of the upper and lower contour of the VRP could well improve the clinical utility of the VRP.
Acknowledgment: We thank the anonymous reviewers for their useful comments and the three speech scientists who served as independent raters. A.B. is ever grateful for the interest and insights offered by R. J. Baken and R. F. Orlikoff.
REFERENCES
1. Komiyama S. Phonogram--a new method evaluating voice characteristics. Otologica (Fukuoka) 1972;18:428-40.
2. Damste PH. The phonetogram. Practica Oto-Rhino-Laryngol 1970;32:185-7.
3. Schutte H, Seidner W. Recommendations by the Union of European Phoniatricians (UEP): standardizing voice area measurement/phonetography. Folia Phoniatr (Basel) 1983;35:286-8.
4. Coleman R, Mabis J, Hinson J. Fundamental frequency-sound pressure level profiles of adult male and female voices. J Speech Hear Res 1977;20:197-204.
5. Wolfe SK, Stanley D, Sette WJ. Quantitative studies on the singing voice. J Acoust Soc Am 1935;6:255-66.
6. Stout B. The harmonic structure of vowels in singing in relation to pitch and intensity. J Acoust Soc Am 1938;10:137-46.
7. Klingholz F, Martin F. Die quantitative Auswertung der Stimmfeldmessung. Sprache-Stimme-Gehor 1983;7:106-10.
8. Titze IR. Acoustic interpretation of the voice range profile (phonetogram). J Speech Hear Res 1992;35:21-34.
9. Gramming P. The phonetogram; an experimental and clinical study [Dissertation]. Malmo, Sweden: Lund University, 1988.
10. Awan SN. Phonetographic profiles and Fo-SPL characteristics of untrained versus trained vocal groups. J Voice 1991;5:41-50.
11. Akerlund L, Gramming P, Sundberg J. Phonetogram and averages of sound pressure levels and fundamental frequencies of speech: comparison between female singers and nonsingers. J Voice 1992;6:55-63.
12. Gramming P. Non-organic dysphonia: phonetograms for pathological voices before and after therapy. Scand J Logopedics Phoniatrics 1988;1:3-16.
13. Coleman R. Performance demands and the performer's vocal capabilities. J Voice 1987;1:209-16.
14. International Association of Logopedics and Phoniatrics (IALP) voice committee discussion of assessment topics. J Voice 1992;6:196-8.
15. Gramming P, Akerlund L. Non-organic dysphonia. Phonetograms for normal and pathological voices. Acta Otolaryngol (Stockh) 1988;106:468-76.
16. Akerlund L. Averages of sound pressure levels and mean fundamental frequencies of speech in relation to phonetograms: comparison of nonorganic dysphonia patients before and after therapy. Acta Otolaryngol (Stockh) 1993;113:102-8.
17. Ohlsson AC, Lofqvist A. Phonetograms of normal and pathological voices. Working Papers Logopedics Phoniatrics (Lund) 1986;3:94-106.
18. Airainer R, Klingholz F. Quantitative evaluation of phonetograms in the case of functional dysphonia. J Voice 1993;7:136-41.
19. Sulter AN, Wit HP, Schutte HK, Miller DG. A structured approach to voice range profile (phonetogram) analysis. J Speech Hear Res 1994;37:1076-85.
20. Hirano M, Bless DM. Videostroboscopic examination of the larynx. San Diego: Singular, 1993.
21. Coleman RF. Sources of variation in phonetograms. J Voice 1993;7:1-14.
22. Isshiki N. Regulatory mechanism of voice intensity variation. J Speech Hear Res 1964;7:17-29.
23. Isshiki N. Vocal intensity and air flow rate. Folia Phoniatr 1965;17:92-104.
24. Titze IR. Phonation threshold pressure: a missing link in glottal aerodynamics. J Acoust Soc Am 1992;91:2926-35.
25. Sonninen A. Is the vocal cords the same at all different levels of singing. Acta Otolaryngol Suppl (Stockh) 1954;11:219-31.
26. van den Berg J. Vocal ligaments versus registers. In: Trojan F, ed. Current problems in phoniatrics and logopedics; vol 1. Basel: Karger, 1960:19-34.
27. van den Berg J. Register problems. In: Bouhuys A, ed. Sound production in man. Ann N Y Acad Sci 1968;155:129-34.
28. Titze IR. A framework for the study of vocal registers. J Voice 1988;2:183-94.
29. Vilkman E, Alku P, Laukkanen A-M. Vocal-fold collision mass as a differentiator between registers in the low-pitch range. J Voice 1995;9:66-73.
30. Colton R, Hollien H. Phonational range in the modal and falsetto registers. J Speech Hear Res 1972;15:708-13.
31. Teitler N. Examiner bias: influence of patient history on perceptual ratings of videostroboscopy. J Voice 1995;9:95-105.
32. Ramig LA, Ringel RL. Effects of physiological aging on selected acoustic characteristics of voice. J Speech Hear Res 1983;26:22-30.
33. Elliot N, Sundberg J, Gramming P. What happens during vocal warm-up? J Voice 1995;9:37-44.