J. COMMUN. DISORD. 17 (1984), 319-324
FUNDAMENTAL FREQUENCY AND INTENSITY MEASUREMENTS IN LARYNGEAL AND ALARYNGEAL SPEAKERS
GORDON W. BLOOD
Communication Research Laboratory, Communication and Theatre, Miami University, Oxford, Ohio
Fundamental frequency measurements and voice sound pressure levels were examined in 10 tracheoesophageal, 10 esophageal, and 10 laryngeal speakers during three phonation tasks. The results indicated that tracheoesophageal speakers had a fundamental frequency approximately 25 Hz higher than that of esophageal speakers. Intensity levels for laryngeal and tracheoesophageal speakers were similar. Discussion includes support for this new voice restoration technique.
INTRODUCTION
Investigators have examined the acoustic characteristics of alaryngeal speech. Researchers have studied fundamental frequency (Damste, 1958; Curry and Snidecor, 1961; Shipp, 1967; Hoops and Noll, 1969; Weinberg and Bennett, 1972; Angermeier and Weinberg, 1981) and intensity characteristics (Hoops and Noll, 1969; Filter and Hyman, 1975; Blood, 1981) of esophageal speakers. Recently, a surgical-prosthetic method for voice restoration was developed by Blom and Singer (1979). The Blom-Singer prosthesis is inserted after an endoscopic procedure “punctures” the posterior tracheal wall and anterior esophagus. The prosthesis opens on one end like a “duck’s bill,” allowing pulmonary esophageal voice production. Voice is produced by occluding the tracheostoma. The tracheoesophageal puncture (TEP) has been reported to produce fluent speech (Blom and Singer, 1979; Singer and Blom, 1979, 1980). Although few articles have been published on the acoustic features of TEP speakers, a number of papers at national conventions attest to the interest in this new alternative for voice restoration (Weinberg, Horii, and Smith, 1980; Robbins, Fisher, and Logemann, 1979; Robbins et al., 1981; Singer and Blom, 1979). The purpose of the present study was to determine if there are differences in the fundamental frequency and intensity characteristics among tracheoesophageal, esophageal, and laryngeal speakers. This information will contribute to our understanding of this new voice restoration technique and its overall intelligibility and acceptability.

Address correspondence to: Dr. Gordon W. Blood, Communication Research Laboratory, Communication and Theatre, Miami University, Oxford, OH 45056.
© 1984 by Elsevier Science Publishing Co., Inc., 52 Vanderbilt Ave., New York, NY 10017. 0021-9924/84/$03.00
METHODS
Subjects
The 30 subjects included 10 esophageal, 10 tracheoesophageal, and 10 laryngeal speakers. The esophageal speakers ranged in age from 61 to 76 yr, with a mean age of 67 yr. Each had used esophageal speech for at least 3 yr as a primary mode of communication. The TEP speakers ranged in age from 52 to 68 yr, with a mean of 58 yr, and had received an average of 7 hr of training (range 2 to 20 hr). Three speech-language pathologists familiar with esophageal speech rated each speaker on vocal effectiveness (Curry and Snidecor, 1961) and speech acceptability (Shipp, 1967). All subjects were rated as above-average speakers. The laryngeal speakers ranged in age from 52 to 64 yr, with a mean of 59 yr, and presented no voice or articulation disorders.
Procedure
Recordings were made in a sound-treated room (IAC, 402A) using an Ampex model AG 500-2 tape recorder with a microphone 30 cm from the speaker’s lips. Each speaker sustained the vowel “ah” at a comfortable level, counted to four and held the word “four,” and then read the Rainbow Passage into the microphone. The second sentence of the Rainbow Passage was extracted for the analysis. Four of the TEP speakers and five of the esophageal speakers were unable to read. For these subjects, the examiner read the passage aloud and the subject repeated it into the microphone.
Data Analysis
All data were analyzed using a Kay Textronix 6087 Visipitch for continuous fundamental frequency analysis. The Visipitch is equipped with a light beam cursor that allows a digital display of the fundamental frequency at any specified point on the frequency wave. A 4-sec time frame was employed. Sustained “ah,” the word “four,” and segments of the second sentence of the Rainbow Passage were placed on the screen and the average fundamental frequency was calculated. The Visipitch counter will only display the average number of pitch periods for 1 sec of voiced continuous speech. The circuitry extracted all the voiced sounds of the second sentence and displayed a modal pitch for each 1-sec interval.

Intensity measures were obtained from graphic level recordings (GLR; General Radio Company, type 1521-B) prepared from the tapes. The GLR had a 0-50 dB potentiometer, and was set for fast stylus speed and 10 mm/sec paper speed. Prior to recording the speech sample, a 1000-Hz calibration tone, with a level equivalent to a speech level of 80 dB SPL re: 20 µPa, was recorded on the tape and served as a reference for analysis. The intensity peaks were measured in relation to the 80-dB reference line.

Research in the investigation of fundamental frequency routinely uses a linear unit. Although a logarithmic unit is used to report intensity, and the human response is logarithmic in both instances, the investigator deferred to the conventional use of a linear unit for frequency.

RESULTS
Mean, SD, and range values for the fundamental frequency of all subjects are presented in Table 1. A 3 × 3 analysis of variance (speakers × phonation task) was performed. Results revealed a significant difference among speakers (p ≤ 0.01), no significant difference among the phonation tasks, and no significant interaction. The speakers’ data were submitted to a post hoc Newman-Keuls analysis to determine significant differences among means. All groups were significantly different from each other (p ≤ 0.01). The mean for the TEP speakers was approximately 25 Hz higher than that of the esophageal speakers, and approximately 30 Hz lower than that of the laryngeal speakers.
Table 1. Means, SDs, and Ranges of Fundamental Frequency (Hz) for All Subjects (n = 30) During Three Conditions

Subjects              Sustaining “ah”    Sustaining “four”    2nd Sentence of Rainbow Passage
Tracheoesophageal
  Mean                89.3^a             89.7^a               88.3^a
  SD                  18.6               15.2                 20.1
  Range               60-120             63-113               71-142
Esophageal
  Mean                63.6^a             63.8^a               64.6^a
  SD                  10.8               13.5                 14.5
  Range               57-96              57-102               56-104
Laryngeal
  Mean                119.5^a            122.6^a              120.8^a
  SD                  4.4                5.8                  5.6
  Range               112-131            118-132              115-130

^a All means are significantly different at the p ≤ 0.01 level.
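The group differences quoted in the Results (about 25 Hz and about 30 Hz) can be checked from the means in Table 1, and the same gaps can be restated on a logarithmic scale to illustrate the point about linear versus logarithmic units. The sketch below is illustrative only and is not part of the original analysis. It assumes that averaging the three task means without weighting is acceptable because each group has the same n for every task. The names `group_mean` and `semitone_difference` are hypothetical helpers introduced here.

```python
import math

# Mean fundamental frequency (Hz) per task, copied from Table 1.
# Task order: sustained "ah", sustained "four", 2nd sentence of Rainbow Passage.
TABLE1_MEAN_F0_HZ = {
    "tracheoesophageal": [89.3, 89.7, 88.3],
    "esophageal": [63.6, 63.8, 64.6],
    "laryngeal": [119.5, 122.6, 120.8],
}


def group_mean(task_means):
    """Unweighted mean of the task means (assumes equal n per task)."""
    return sum(task_means) / len(task_means)


def semitone_difference(f_high_hz, f_low_hz):
    """Interval between two frequencies in semitones (12 per octave)."""
    return 12.0 * math.log2(f_high_hz / f_low_hz)


group_f0 = {group: group_mean(means) for group, means in TABLE1_MEAN_F0_HZ.items()}

# Differences on the linear (Hz) scale used in the paper.
tep_minus_eso_hz = group_f0["tracheoesophageal"] - group_f0["esophageal"]
lar_minus_tep_hz = group_f0["laryngeal"] - group_f0["tracheoesophageal"]

# The same gaps on a logarithmic (semitone) scale.
tep_vs_eso_st = semitone_difference(group_f0["tracheoesophageal"], group_f0["esophageal"])
lar_vs_tep_st = semitone_difference(group_f0["laryngeal"], group_f0["tracheoesophageal"])

for group, f0 in group_f0.items():
    print(f"{group:18s} {f0:6.1f} Hz")
print(f"TEP - esophageal: {tep_minus_eso_hz:.1f} Hz ({tep_vs_eso_st:.1f} semitones)")
print(f"laryngeal - TEP:  {lar_minus_tep_hz:.1f} Hz ({lar_vs_tep_st:.1f} semitones)")
```

On these table values both gaps fall between 5 and 6 semitones, so the two steps are closer to each other on a logarithmic scale than the Hz figures alone suggest.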
Table 2. Means, Standard Deviations (S.D.), and Ranges of Intensity Levels (in dB SPL) in All Subjects (n = 30) During Three Conditions

Subjects              Sustaining “ah”    Sustaining “four”    2nd Sentence of Rainbow Passage
Tracheoesophageal
  Mean                80                 81                   82
  SD                  8.6                8.1                  8.2
  Range               65-91              69-94                66-90
Esophageal
  Mean                73^a               72^a                 70^a
  SD                  6.4                5.2                  7.4
  Range               65-78              69-79                68-81
Laryngeal
  Mean                84                 85                   84
  SD                  5.1                6.2                  7.9
  Range               80-93              79-96                81-96

^a Esophageal speakers’ group means are significantly different (p ≤ 0.01) from those of laryngeal and tracheoesophageal speakers.
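The intensity differences reported in the text (roughly 12 dB and 9 dB) can be reproduced from the means in Table 2. The sketch below is illustrative and is not the author's procedure. It assumes that the group level is the arithmetic mean of the three task means in dB, which matches how the paper appears to summarize them. The helper names `spl_to_pascals` and `pressure_ratio` are hypothetical and use the standard relation between dB SPL and sound pressure re: 20 µPa.

```python
REFERENCE_PRESSURE_PA = 20e-6  # 20 micropascals, the dB SPL reference pressure

# Mean intensity (dB SPL) per task, copied from Table 2.
# Task order: sustained "ah", sustained "four", 2nd sentence of Rainbow Passage.
TABLE2_MEAN_SPL_DB = {
    "tracheoesophageal": [80, 81, 82],
    "esophageal": [73, 72, 70],
    "laryngeal": [84, 85, 84],
}


def group_mean(task_means):
    """Arithmetic mean of the task means in dB (an assumption, see lead-in)."""
    return sum(task_means) / len(task_means)


def spl_to_pascals(level_db):
    """Convert a level in dB SPL to RMS sound pressure in pascals."""
    return REFERENCE_PRESSURE_PA * 10 ** (level_db / 20)


def pressure_ratio(db_difference):
    """Sound pressure ratio that corresponds to a level difference in dB."""
    return 10 ** (db_difference / 20)


group_spl = {group: group_mean(means) for group, means in TABLE2_MEAN_SPL_DB.items()}

lar_minus_eso_db = group_spl["laryngeal"] - group_spl["esophageal"]
tep_minus_eso_db = group_spl["tracheoesophageal"] - group_spl["esophageal"]
lar_minus_tep_db = group_spl["laryngeal"] - group_spl["tracheoesophageal"]

print(f"laryngeal - esophageal: {lar_minus_eso_db:.1f} dB")
print(f"TEP - esophageal:       {tep_minus_eso_db:.1f} dB "
      f"(pressure ratio {pressure_ratio(tep_minus_eso_db):.2f})")
print(f"laryngeal - TEP:        {lar_minus_tep_db:.1f} dB")
print(f"80 dB SPL calibration tone: {spl_to_pascals(80):.3f} Pa")
```

Because the decibel scale is logarithmic, a gap of about 9 dB corresponds to nearly a threefold difference in sound pressure.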
Mean, SD, and range values for intensity levels of all subjects are presented in Table 2. The raw data were subjected to a two-way analysis of variance with two factors: speakers (laryngeal, tracheoesophageal, and esophageal) and phonation tasks (“ah,” “four,” and Rainbow Passage). Results revealed a significant difference among groups (p ≤ 0.01). To determine significance among means, a Newman-Keuls procedure was employed. A significant difference was found between the mean of the esophageal speakers and the means of both the TEP and laryngeal speakers. The TEP speakers’ mean intensity level was approximately 80 dB SPL, while the laryngeal speakers’ intensity was 84 dB SPL. The esophageal speakers’ intensity level was approximately 12 dB lower than that of the laryngeal speakers. Examination of Table 2 also reveals that the TEP speakers were approximately 9 dB more intense than the esophageal speakers.

DISCUSSION
The mean fundamental frequency for the laryngeal speakers agrees with the data reported by Hollien and Shipp (1972) for this age group. The mean fundamental frequency for the esophageal speakers is also in agreement with the studies of Snidecor and Curry (1959) and Filter and Hyman (1975). For this study the tracheoesophageal speakers yielded a mean fundamental frequency of approximately 89 Hz. Intensity measures suggest that TEP speakers’ intensity levels approximate laryngeal speakers’ intensity levels more closely than esophageal speakers’ levels do. The data for the second sentence of the Rainbow Passage should be interpreted with caution. The equipment used only allows a 1-sec modal pitch of voiced continuous speech. This study may have been influenced by the fact that nine of the subjects were nonreaders. It is possible that the level of the voice of the investigator reading the sentences affected the intensity levels. Further research that controls for this stimulus variable should be conducted. This study suggests that use of the TEP procedure may provide a better voice for some laryngectomized patients. The TEP procedure is reported to provide quicker and more successful voice restoration, and these data indicate that the frequency and intensity features of TEP speech are superior to those of esophageal speech. It appears that other and better treatment options are now available to the laryngeal cancer patient.

This research was supported by a Faculty Research Grant from Miami University. Special thanks to St. Elizabeth’s Hospital, Dayton, Ohio, and Diana Emery, consultant for Kay Elemetrics.
REFERENCES
Angermeier, C. B. and Weinberg, B. (1981). Some aspects of fundamental frequency control by esophageal speakers. J. Speech Hear. Res. 24:85-91.
Blom, E. and Singer, M. (1979). Surgical-prosthetic approaches for postlaryngectomy voice restoration. In Keith and Darley (eds.), Laryngectomee Rehabilitation. Houston, Texas: College Hill Press.
Blood, G. W. (1981). The interactions of amplitude and phonetic quality in esophageal speech. J. Speech Hear. Res. 24:308-312.
Curry, E. and Snidecor, J. (1961). Physical measurement and pitch perception in esophageal voice. Laryngoscope 71:415-424.
Damste, P. H. (1958). Oesophageal speech. Groningen, Netherlands: Hoitsema.
Filter, M. D. and Hyman, M. (1975). Relationship of acoustic parameters and perceptual ratings of esophageal speech. Percept. Motor Skills 40:63-68.
Hollien, H. and Shipp, T. (1972). Speaking fundamental frequency and chronological age in males. J. Speech Hear. Res. 15:155-159.
Hoops, H. R. and Noll, J. D. (1969). Relationship of selected acoustic variables to judgments of esophageal speech. J. Commun. Dis. 2:1-13.
Robbins, J. A., Fisher, H. B., and Logemann, J. A. (1979). Frequency, temporal and intensity characteristics of neoglottis voice production. (Paper presented at Annual Convention of American Speech and Hearing Association, Atlanta.)
Robbins, J., Fisher, H., Logemann, J., Hillenbrand, J., and Blom, E. (1981). A comparative acoustic analysis of laryngeal speech, esophageal speech and speech production after tracheoesophageal puncture. (Paper presented at the Annual Convention of the American Speech-Language-Hearing Association, Los Angeles.)
Shipp, T. (1967). Frequency, duration, and perceptual measures in relation to judgments of alaryngeal speech acceptability. J. Speech Hear. Res. 10:417-427.
Singer, M. I. and Blom, E. D. (1979). Tracheoesophageal puncture: A surgical-prosthetic method for post laryngectomy speech restoration. (Third International Symposium on Plastic Reconstructive Surgery of the Head and Neck, New Orleans.)
Singer, M. and Blom, E. (1980). An endoscopic technique for restoration of voice after laryngectomy. Ann. Otol. Rhinol. Laryngol. 89:529-533.
Snidecor, J. C. and Curry, E. T. (1959). Temporal and pitch aspects of superior esophageal speech. Ann. Otol. 68:623-636.
Weinberg, B. and Bennett, S. (1972). Selected acoustic characteristics of esophageal speech produced by female laryngectomees. J. Speech Hear. Res. 15:211-216.
Weinberg, B., Horii, Y., and Smith, B. E. (1980). Long-time spectral and intensity characteristics of esophageal speech. J. Acoust. Soc. Am. 67:1781-1784.