ARTICLE IN PRESS Whisper and Phonation: Aerodynamic Comparisons Across Adduction and Loudness *Ramya Konnai, †Ronald C. Scherer, ‡Amy Peplinski, and §Kenneth Ryan, *West Bloomfield, Michigan, †Bowling Green, Ohio, ‡Grovetown, Georgia, and §Morgantown, West Virginia
Summary: Introduction. Speakers are known to produce whisper differently, especially with respect to glottal configuration that influences glottal aerodynamics. Differences in whisper production and phonation types imply important linguistic information in many languages, are identified in vocal pathologies, are used to communicate mood and emotion, and are used in vocal performance. Objective. The present study focused on investigating the aerodynamic differences between whisper and phonation at different loudness and adduction levels. Methods. Three men and five women between 20 and 40 years of age participated in the study. Smooth syllable strings of the syllable /bæp:/ were whispered and phonated at three different loudness levels (soft, medium, and loud) and three voice qualities (breathy, normal, and pressed). The voice qualities are associated with different adduction levels. This resulted in 18 treatment combinations (three adduction levels × three loudness levels × two sexes). Results. A regression analysis was performed using a PROC MIXED procedure with SAS statistical software. Under similar production conditions, subglottal pressure was significantly lower in whisper than in phonation in 10 of 18 combinations, mean glottal airflow was significantly higher in whisper than in phonation in 13 of 18 combinations, and flow resistance was significantly lower in whisper than in phonation in 14 of 18 combinations, with the female subjects demonstrating these trends more frequently than the male subjects do. Of importance, in general, compared with phonation under similar production conditions, whisper is not always accompanied by lower subglottal pressure and higher airflows. Conclusion. 
Results from this study suggest that the typical finding of lower subglottal pressure, higher glottal airflow, and decreased flow resistance in whisper compared with phonation cannot be generalized to all individuals and depends on the “whisper type.” The nine basic production conditions (three loudness levels and three adduction levels) resulted in data that may help explain the wide range of variation of whisper production reported in earlier studies. Key Words: Whisper–Phonation–Aerodynamics–Loudness–Adduction. INTRODUCTION Whispering is a socially significant form of communication. Cirillo1 surveyed 350 people to find out when and why people whispered and found that 38% of the subjects indicated that they whisper in private, often quite frequently. People whisper to (1) avoid disturbing someone (eg, in “silence zones” of libraries and hospitals), (2) communicate a secret message to a specific person and confirm affiliation with the person, (3) initiate a playful encounter or for fun, and (4) attract the attention or induce curiosity in members of an audience.1 Actors and singers use “stage whisper” for special effects,2 and children whisper during play. Patients with aphonia communicate by whispering.3 Furthermore, “soft whisper” is therapeutically prescribed for some patients with vocal pathologies.4 The study of this unique physiological action of the larynx (whispering) is important to the understanding of certain pathologic vocal phenomena, such as aphonia and vocal fold paralysis.
Accepted for publication February 24, 2017. From the *Department of Neurology, Henry Ford West Bloomfield Hospital, West Bloomfield, Michigan; †Department of Communication Sciences and Disorders, Bowling Green State University, Bowling Green, Ohio; ‡Truven Health Analytics, Atlanta, Georgia; and the §Department of Statistics, West Virginia University, Morgantown, West Virginia. Address correspondence and reprint requests to Ramya Konnai, Department of Neurology, Henry Ford West Bloomfield Hospital, 6777 West Maple Road, West Bloomfield, MI 48322.
Journal of Voice, Vol. ■■, No. ■■, pp. ■■-■■ 0892-1997 © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved. http://dx.doi.org/10.1016/j.jvoice.2017.02.016
Whisper especially is known to be produced by different speakers differently. Some researchers suggest that there is high inter- and intra-subject variation during whisper production, especially with respect to glottal configuration that influences glottal aerodynamics.5–8 At the level of the glottis, individuals demonstrate various configurations such as no vocal fold contact, various degrees of closeness of the two vocal folds, and compression of the anterior and middle thirds or the entire length of the membranous vocal folds. At the level of the supraglottis, there may be various degrees of false vocal fold gap or anterior-posterior displacement of the epiglottis and arytenoid cartilages. The high variability in whisper production data may be related to how the individual whispers, that is, with what loudness and adduction levels. In general, whisper may be thought of as “soft,” but “loud whisper” or “stage whisper” may be used during performance and by individuals with voice disorders. When one whispers loudly along with vocal strain as in severe muscle tension dysphonia, glottal adduction may be increased as well. The loudness and adduction variations in whisper production should affect the acoustic and aerodynamic characteristics of whisper. Thus, understanding the effects of intentionally varying loudness and adduction on acoustic and aerodynamic measures may help to explain the variability of these measures in the clinic and in earlier whisper studies. To our knowledge, only two studies have attempted to report the aerodynamic variability in whisper production. Monoson and Zemlin compared two types of whisper (quiet and loud or forced
whisper) with breathy and normal phonation in five female young adults.9 In their study, mean flow was the greatest for forced whisper (0.328 L/s) followed by breathy phonation (0.258 L/s), quiet whisper (0.203 L/s), and normal phonation (0.120 L/s). These mean airflow values for whisper are lower than the mean airflow values (0.90–1.71 L/s) for whisper found by Sundberg et al.10 This difference could be due to the nature of the subjects used in the studies. Monoson and Zemlin investigated young adult female subjects, whereas Sundberg et al investigated one subject, a 69-year-old man who was 6 feet 7 inches tall, thus suggesting that the size of the larynx may make a large difference in subsequent airflow values. In addition, gender effects have been found for mean glottal airflow, with male subjects demonstrating significantly greater airflow rates than female subjects do.11–16 In their study, Sundberg et al10 examined aerodynamic and glottal measures for different levels of loudness and adduction in whisper. Their subject produced four types of whisper: hyperfunctional (more compressed), neutral, hypofunctional (more abducted), and post-phonatory, at three loudness levels (soft, medium, and loud). Measurements were made of the glottal area, glottal flow, and subglottal pressure (Ps) (via tracheal puncture). For this subject, whisper was produced with a wide range of values across measures, namely, Ps, 1.3–17 cm H2O; glottal flow, 0.9–1.71 L/s; glottal area, 0.065–1.76 cm²; and glottal perimeter, 1.09–1.76 cm. Relatively highly adducted glottal configurations resulted in whisper that tended to have higher Ps and lower glottal areas and flows than for relatively highly abducted glottal configurations during whisper, with neutral and post-phonation whisper values in between. In more adducted and abducted whisper, the glottis assumed a rectangular or elliptical shape for this subject. 
Prior investigations of glottal configuration during whisper revealed vocal folds with straight medial edges or a glottis with a toed-in configuration.6 Sundberg et al10 found that glottal flow changed more for small changes of area when the area was already small than when it was already large (see also Scherer et al17). The authors derived an equation for whisper aerodynamics (relating glottal flow, Ps, and glottal area), as well as an equation involving nondimensional terms (pressure coefficient and Reynolds number). Although the study by Sundberg et al10 is the first of its kind to offer generalized expressions for whisper aeromechanics because of the wide range of whisper conditions, the subject sampling (one subject) limits broader issues of individual differences. According to Luchsinger and Arnold,3 whisper differs from phonation in a number of ways: (1) The glottis shows the shape of an inverted Y and the vocal folds are incompletely touching over their anterior-posterior length. (2) The vocal fold tension is much lower than in phonation, and the folds do not vibrate; as a result, the escaping air is set into non-periodic frictional turbulence so that a noise is produced instead of a tone with periodic vibrations. (3) The expiratory air volume is greatly increased; whispering is therefore “much more strenuous” than speaking in a normal voice. (4) The subglottal air pressure is much lower than it is during phonation.3 The current study challenges the latter two assertions. Netsell et al15 recorded estimates of subglottal air pressure and mean volume velocity of airflow during phonation from 30 normal
adults during repetition of consonant-vowel syllables. Results suggested no gender differences in subglottal air pressure, but men used significantly higher flows than women did in all speaking tasks. Women had greater laryngeal airway resistances than men did for the [i] vowel. Women also had greater resistances during the [i] vowel than during the [a] vowel, and the men did not. Thus, it is important to include both male and female subjects in the attempt to explain variability of aerodynamic measures across gender. The present study is an attempt to obtain the variability of whisper aerodynamics when individuals whisper over a wide range of loudness and adduction levels, and to compare the aerodynamics of whisper productions with phonation also produced over a wide range of loudness and adduction. METHODS Subjects Eight subjects (three men and five women) with an age range of 20–27 years (mean age of 23 years) participated in the study. All subjects were nonsmokers and native speakers of English with no history of voice or speech problems, hearing loss, or professional voice training. Equipment A Glottal Enterprises aerodynamic flow mask system (MSIF-2 S/N 2049S; Syracuse, NY) was used to obtain oral air pressure and airflow. Calibrations for pressure and flow were completed for a wide range of flows and pressures, using a calibrated pneumotachograph for calibrating flow and a U-tube manometer for calibrating the oral pressure transducer. A headband microphone system (AKG C-420, AKG Acoustics, Vienna, Austria) with preamplifier (APHEX 107, Aphex Systems, Sun Valley, CA) was used to record the audio signal simultaneously with the aerodynamic recordings. The mouth-to-microphone distance was held constant at 6 cm for all subjects. All signals were simultaneously recorded into a 16-bit DATAQ A/D system (model DI720 Series, DATAQ Instruments, Akron, OH) and analyzed using custom-written “Sigplot” software written in MATLAB code. 
Syllables for analysis Subjects were instructed to produce at least five trials of the five-syllable series of /bæp:/, that is, /bæp:bæp:bæp:bæp:bæp:/, smoothly on one breath for each condition of whisper and phonation. Typically, Ps is estimated from intraoral pressure measured during the production of syllables with voiceless plosives such as /pæp/. Although this is a widely used method for obtaining estimates of Ps, a common source of error in this measurement can occur if the intrasyllable Ps is not constant.18 Rothenberg19 described ways to reduce this error, including the use of repeated “/bæp/” syllables instead of the more classically used “/pa/.” Frazer20 studied the differences in intraoral pressure for smoothly produced strings of /bi:p:/ and /pi:p:/ and found similar estimates of Ps between them, suggesting that smoothly produced “beep” sequences may be just as useful, or more so, than “peep” sequences for studies estimating Ps, especially in cases where
using less air would be a benefit. Hence, the use of /bæp:/ was thought to be efficient for measuring Ps during whisper. Breathy productions did not permit a full string of five /bæp:/ syllables. However, shortening the vowel portion of the syllable allowed more syllables within the breath group. The middle three syllables within each trial were analyzed. There were a total of 721 valid syllables for analysis. Each syllable yielded a single mean Ps estimate, mean flow, and derived flow resistance. Procedure The subjects passed an initial screening when they were able to perceptually discriminate and produce the three voice qualities and loudness levels after the experimenter had demonstrated them. For the purposes of this study, the various voice qualities are described as different “adduction levels” because breathy voice is associated with less adduction compared with normal, and normal compared with pressed voice, but it is noted that the actual levels of adduction and glottal configurations are not known because the larynx was not visualized in this study. Subjects passed a hearing screening at 500 Hz, 1 kHz, and 2 kHz at 20 dB. The subjects were trained to refine their skills to differentiate and produce (1) phonation and whisper, (2) three adduction levels for both phonation and whisper (perceptually identified as breathy, normal, and pressed), (3) each of the voice qualities at each of the three loudness levels of soft, medium, and loud; in addition, they were trained to (4) produce the /bæp:/ sequence very smoothly so that pressure measurements (and thereby flow measurements also) would be valid, because lung and tracheal pressure would remain relatively constant throughout a given utterance. Subjects received visual feedback of the oral pressure signal by viewing it on an oscilloscopic display during training. The visual feedback was to help subjects produce flat intraoral pressure signals during lip occlusion. 
Training concluded when subjects were able to correctly produce at least three valid tokens of the voice and whisper types at the different loudness and adduction conditions. The time taken for training subjects varied from 45 to 75 minutes. Although the experimenter demonstrated the various voice and whisper types, subjects were instructed to produce their own version of the different voice and whisper types and were coached in their validity or appropriateness relative to the description of the nine sound source types (three adductions, three loudness levels) for both whisper and phonation. Pitch was not controlled for the phonation tasks, considering that it was likely to increase with loudness in vocally untrained subjects. Normal adduction and medium loudness were specified as the subject’s preferred “comfortable” production experienced during a short dialogue with the experimenter sitting approximately 3 feet away for both phonation and whisper. Loud voice was above the subject’s comfortable loudness level by at least 5 dB, and soft voice was below the subject’s comfortable loudness level by at least 5 dB. A sound level meter placed at a distance of 15 cm from the lips was used to verify that there was at least a 5 dB difference between the soft and medium, and the medium and loud productions during sustained vowels. Subjects were instructed to take a deep enough breath before beginning the /bæp:/ syllable train to produce the entire syllable train on a continuous
expiration. Subjects were also instructed to use constant effort throughout all syllable strings and to produce them very smoothly. During the recordings, the subjects replicated the same procedures that they practiced during training. Each subject was comfortably seated inside an IAC sound-treated booth (4 feet × 6.3 feet × 6.5 feet) (Model 402A, Industrial Acoustics Company, Bronx, NY). One experimenter was present with the subject inside the sound booth. A second experimenter outside the sound booth monitored and recorded the signals using WinDaq Pro software (DATAQ Instruments). The subject placed the mask over his or her face with the thin oral pressure tube positioned between the lips in the corner of the mouth. The end of the tube was not obstructed by the tongue. Each subject also wore a headband microphone, with the microphone located off to the side of the corner of the mouth at a distance of 6 cm. The experimenter who was alongside the subject monitored the mask fit and seal, mouth-to-mic distance, and the correct productions of the tokens during the recording. The subject first produced the different phonation conditions followed by the whisper conditions. Within each condition of phonation and whisper, the “normal” or “comfortable” adduction condition was produced first at normal loudness followed by soft and then loud. This was followed by the production of the “breathy” condition at normal, soft, and loud levels. Finally, the “pressed” condition was produced at normal, soft, and loud levels. Five or more successful trials were recorded for each of the nine conditions (three levels of adduction and three levels of loudness) for phonation and nine similar conditions for whisper. A 10-minute rest interval was given after recordings of phonation were completed before recording the whisper tokens. Subjects were asked to repeat any token that they or the experimenters considered to be a poor example of the intended condition. 
Any token considered unacceptable by either the subject or the experimenters was excluded from analysis. Data analysis The independent variables in the study were source (whisper and phonation), adduction (three levels), and loudness (three levels). The dependent variables were measures of aerodynamics (estimated mean Ps, mean airflow, and derived laryngeal flow resistance). The microphone, airflow, oral air pressure, and sound intensity signals were recorded into separate channels of the DATAQ A/D converter using a sampling rate of 20,000 Hz per channel. The digitized signals were analyzed using custom software (Sigplot) and the calibration equations for the flow and oral air pressure signals. Averages of oral air pressures of the flat portions of adjacent lip occlusions were calculated, as well as the average flow midway in the vowel. The experimenter chose the points on the pressure and flow signals for analysis (these were not automatically determined by the software). The software then calculated the average values for pressure, flow, and flow resistance. RESULTS Statistical approach The main and interaction effects along with subject variability were analyzed through regression analyses by using the PROC
TABLE 1. A List of PROC MIXED Fixed Effect Model Terms for Responses: Log Subglottal Pressure, Log Airflow, and Log Flow Resistance.
Fixed main effects: source, adduction, loudness
Fixed two-factor interactions for Ps: adduction*source, adduction*loudness, gender*source, source*loudness, gender*loudness
Fixed two-factor interactions for flow: source*adduction, source*loudness, adduction*loudness, source*gender
Fixed two-factor interactions for flow resistance: source*adduction, source*loudness, source*gender
Each model treated the subject as a random effect.
MIXED procedure in the SAS (SAS Institute Inc, Cary, NC) statistical program. There were 18 treatment combinations (three adduction levels × three loudness levels × two sources [whisper and phonation]). A model was run using all the variables, which is referred to as the “full model,” with a subsequent “reduced model” to remove nonsignificant terms within the full model. The reduced model with significant terms for log Ps, log airflow, and log flow resistance is listed in Table 1.
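Because the reduced models are fit to log Ps, log airflow, and log flow resistance, a short sketch may help show why back-transformed predictions behave like medians rather than means. The values below are hypothetical and chosen only for illustration (they are not study data), and the function names are assumptions. Exponentiating the mean of the natural logs gives the geometric mean, which for right-skewed, roughly log-normal data estimates the median and sits below the arithmetic mean. Interval end points computed on the log scale are back-transformed the same way.

```python
import math
import statistics

# Hypothetical Ps values in cm H2O, for illustration only (not study data).
ps_values = [4.0, 5.0, 6.0, 8.0, 16.0]


def back_transformed_center(values):
    """Average the natural logs, then exponentiate back to physical units."""
    log_values = [math.log(v) for v in values]
    return math.exp(statistics.mean(log_values))


def back_transform_interval(log_low, log_high):
    """Map an interval computed on the log scale back to physical units."""
    return math.exp(log_low), math.exp(log_high)


geometric_mean = back_transformed_center(ps_values)
arithmetic_mean = statistics.mean(ps_values)

print(f"back-transformed center: {geometric_mean:.2f} cm H2O")
print(f"arithmetic mean:         {arithmetic_mean:.2f} cm H2O")
```

In this toy sample the back-transformed value falls below the arithmetic mean, which is the expected pattern whenever the data are right-skewed.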
Probability plots showed that the residuals of Ps, airflow, and flow resistance deviated from normality with nonconstant variance, thus not satisfying the assumptions of normal distribution and constant variance. The model was therefore adjusted by using natural log values instead of the original values of the dependent variables, the residual analysis for which did satisfy the normality criterion. To draw useful conclusions following the analyses, the predictions and confidence intervals were transformed back into the natural physical units. The values to be reported here are therefore median values of the data, rather than average values, because of the use of natural logarithms of the values for the statistical analyses. The analyses resulted in comparison of predicted median values with significance at the 0.002 level, which takes into account the multiple comparisons employed in the regression analyses. Analysis of subglottal pressure The bubble plot of Figure 1 provides a convenient visual representation of the numerical information in Table 2 and graphically shows the relationship between the three adduction levels and three loudness levels for Ps in male and female subjects. The area of each bubble is proportional to the median value of Ps for that variable combination from Table 2. The dashed circles refer to whisper productions and the solid circles refer to phonation productions. A dot in the middle of a pair of concentric bubbles indicates that the difference between the median values of Ps between whisper and phonation was statistically significantly different for the combination of gender, adduction, and loudness at the 0.2% level of significance. This comparisonwise level of 0.2% was based on a Bonferroni multiple
FIGURE 1. Bubble plot of predicted subglottal pressure in phonation and whisper.
TABLE 2. Summary of Median Subglottal Pressure Values (cm H2O) (Values in Parentheses Are the 95% Confidence Intervals).

| Adduction | Source | Soft, Female | Soft, Male | Medium, Female | Medium, Male | Loud, Female | Loud, Male |
|---|---|---|---|---|---|---|---|
| Pressed | Whisper | 7.74 (6.09, 9.85) | 11.92 (8.77, 16.20) | 10.20 (8.02, 12.96) | 16.85 (12.40, 22.88) | 17.20 (13.54, 21.86) | 24.19 (17.82, 32.83) |
| Pressed | Phonation | 9.83 (7.73, 12.49) | 12.06 (8.88, 16.37) | 11.62 (9.14, 14.76) | 15.30 (11.26, 20.77) | 18.27 (14.37, 23.24) | 20.48 (15.08, 27.82) |
| Normal | Whisper | 3.46 (2.72, 4.40) | 5.33 (3.92, 7.24) | 5.38 (4.23, 6.84) | 8.89 (6.54, 12.07) | 9.65 (7.59, 12.23) | 13.57 (9.99, 18.43) |
| Normal | Phonation | 5.08 (4.00, 6.46) | 6.23 (4.59, 8.46) | 7.08 (5.57, 9.01) | 9.33 (6.87, 12.67) | 11.85 (9.32, 15.06) | 13.28 (9.78, 18.04) |
| Breathy | Whisper | 4.14 (3.25, 5.26) | 6.37 (4.69, 8.65) | 6.29 (4.95, 7.99) | 10.38 (7.64, 14.10) | 10.78 (8.48, 13.69) | 15.15 (11.17, 20.56) |
| Breathy | Phonation | 6.84 (5.38, 8.70) | 8.39 (6.18, 11.40) | 9.33 (7.34, 11.87) | 12.29 (9.05, 16.69) | 14.92 (11.75, 18.96) | 16.73 (12.32, 22.71) |

Bolded pairs of cell values correspond to significant differences at a 0.2% level between the whisper and the phonation values at a fixed level of gender, adduction, and source. (Bolding was based on the standard error of cell differences and was not a comparison of the provided 95% confidence intervals for each cell).
comparisons correction to a 5% family-wise level of significance (18 comparisons: 0.05/18 = 0.00278). The value of 0.2% is rounded down to the first significant digit to err toward a more stringent test. Table 2 and Figure 1 indicate that the median Ps was higher for phonation than for whisper in 15 of the 18 conditions, an important trend, with values differing significantly for 9 of the 18 conditions where phonation Ps was higher than whisper Ps. For one condition, loud male pressed, whisper pressure was significantly higher than phonation pressure. At normal adduction and medium loudness, there was no significant difference in Ps between whisper and phonation in men, unlike in women, for which Ps was higher in phonation than in whisper. Women demonstrated significant differences in Ps between whisper and
phonation in seven of nine conditions, and men with only three of nine differences. Thus, in general, women differentiated between whisper and phonation relative to Ps more often than the men did.
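The comparisonwise significance level used throughout the results can be reproduced with a few lines of arithmetic. The sketch below assumes a standard Bonferroni correction (family-wise level divided by the number of comparisons) followed by rounding down to the first significant digit; the function names are illustrative and not taken from the study's SAS code.

```python
import math

FAMILY_WISE_ALPHA = 0.05  # 5% family-wise level of significance
N_COMPARISONS = 18        # whisper vs. phonation at each gender x adduction x loudness cell


def bonferroni_alpha(family_alpha, n_comparisons):
    """Per-comparison level under a Bonferroni correction."""
    return family_alpha / n_comparisons


def round_down_to_first_significant_digit(x):
    """Truncate a positive number to one significant digit (e.g. 0.00278 to 0.002)."""
    exponent = math.floor(math.log10(x))
    scale = 10 ** exponent
    return math.floor(x / scale) * scale


per_comparison = bonferroni_alpha(FAMILY_WISE_ALPHA, N_COMPARISONS)
working_alpha = round_down_to_first_significant_digit(per_comparison)

print(f"Bonferroni level: {per_comparison:.5f}")
print(f"Working level:    {working_alpha:.3f}")
```

Rounding down rather than to the nearest digit keeps the test on the conservative side, which matches the stated intent of erring toward a more stringent criterion.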
Analysis of glottal airflow Significant differences between the median airflow values for whisper and phonation are listed in Table 3 and Figure 2. The bubble plot of Figure 2 shows the relationship between the three adduction levels and the three loudness levels for airflow in male and female subjects. The significant differences between the airflow values of whisper and phonation are again indicated by using a dot in the center of the bubbles in the bubble plot.
TABLE 3. Summary of Median Glottal Airflow Values (cm3/s), Comparison Between Whisper and Phonation (95% Confidence Intervals).

| Adduction | Source | Soft, Female | Soft, Male | Medium, Female | Medium, Male | Loud, Female | Loud, Male |
|---|---|---|---|---|---|---|---|
| Pressed | Whisper | 311 (257, 375) | 280 (223, 351) | 596 (494, 717) | 537 (428, 673) | 808 (671, 972) | 728 (582, 911) |
| Pressed | Phonation | 234 (195, 282) | 316 (252, 396) | 200 (166, 241) | 270 (216, 338) | 206 (171, 249) | 278 (221, 349) |
| Normal | Whisper | 338 (280, 407) | 304 (243, 382) | 583 (484, 703) | 526 (420, 659) | 1039 (862, 1252) | 936 (747, 1174) |
| Normal | Phonation | 203 (169, 245) | 274 (219, 344) | 189 (157, 228) | 255 (204, 320) | 169 (140, 203) | 228 (182, 285) |
| Breathy | Whisper | 630 (523, 760) | 568 (453, 712) | 890 (739, 1078) | 802 (640, 1006) | 1504 (1252, 1805) | 1355 (1085, 1693) |
| Breathy | Phonation | 480 (398, 578) | 648 (517, 811) | 535 (444, 645) | 722 (576, 905) | 807 (672, 970) | 1089 (870, 1362) |

Bolded values correspond to significant differences between the whisper and phonation values.
FIGURE 2. Bubble plot of predicted airflow in phonation and whisper.
There were differences between the glottal airflow of whisper (more) and phonation (less) in 16 of 18 conditions, with 13 being statistically significant. The differences in flow occurred mostly for the medium and loud conditions (10 of 12 significant differences), regardless of the adduction level. Women demonstrated significant differences between whisper and phonation in all nine conditions, unlike men with four of nine (airflow was not significantly different across adduction for soft productions by the
men). In general, airflow differences between whisper and phonation were more prominent as loudness increased. Analysis of flow resistance Table 4 and Figure 3 show the flow resistance values for the different combinations. There were differences between the glottal flow resistance for whisper (less) and phonation (more) in 17 of 18 conditions, 14 of which were statistically significant. All
TABLE 4. Summary of Median Glottal Flow Resistance (kPa/[L/s]), Comparison Between Whisper and Phonation (95% Confidence Intervals).

| Adduction | Source | Soft, Female | Soft, Male | Medium, Female | Medium, Male | Loud, Female | Loud, Male |
|---|---|---|---|---|---|---|---|
| Pressed | Whisper | 2.56 (1.7, 3.7) | 4.34 (2.6, 7.0) | 1.70 (1.1, 2.4) | 2.87 (1.7, 4.6) | 2.05 (1.4, 2.9) | 3.47 (2.1, 5.6) |
| Pressed | Phonation | 4.05 (2.7, 5.9) | 3.65 (2.2, 5.9) | 6.09 (4.1, 8.9) | 5.50 (3.4, 8.8) | 8.46 (5.7, 12.3) | 7.63 (4.7, 12.3) |
| Normal | Whisper | 0.99 (0.6, 1.4) | 1.67 (1, 2.7) | 0.95 (0.6, 1.3) | 1.61 (0.9, 2.6) | 0.92 (0.6, 1.3) | 1.56 (0.9, 2.5) |
| Normal | Phonation | 2.54 (1.7, 3.7) | 2.30 (1.4, 3.7) | 3.74 (2.5, 5.4) | 3.38 (2, 5.4) | 6.61 (4.5, 9.6) | 5.96 (3.6, 9.6) |
| Breathy | Whisper | 0.66 (0.4, 0.9) | 1.12 (0.6, 1.8) | 0.73 (0.5, 1) | 1.23 (0.7, 2) | 0.67 (0.4, 0.9) | 1.14 (0.7, 1.8) |
| Breathy | Phonation | 1.40 (0.9, 2) | 1.27 (0.7, 2) | 1.75 (1.2, 2.5) | 1.58 (0.9, 2.5) | 1.81 (1.2, 2.6) | 1.63 (1.0, 2.6) |

Bolded values correspond to significant differences between the whisper and the phonation values.
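The unit kPa/(L/s) implies a pressure divided by a flow. As a rough check on how the three measures relate, the sketch below assumes that flow resistance is estimated Ps divided by mean airflow, with Ps converted from cm H2O to kPa and flow from cm3/s to L/s. The function name and the use of cell medians are assumptions for illustration. The tabulated flow resistance values come from a model fit to log flow resistance, and a ratio of medians is not in general the median of per-syllable ratios, so only approximate agreement should be expected.

```python
CM_H2O_TO_KPA = 0.0980665  # 1 cm H2O expressed in kPa
CM3_PER_LITER = 1000.0


def flow_resistance(ps_cm_h2o, flow_cm3_per_s):
    """Flow resistance in kPa/(L/s) from Ps in cm H2O and airflow in cm3/s."""
    pressure_kpa = ps_cm_h2o * CM_H2O_TO_KPA
    flow_l_per_s = flow_cm3_per_s / CM3_PER_LITER
    return pressure_kpa / flow_l_per_s


# Female, soft, pressed medians: Ps from Table 2, airflow from Table 3.
whisper_resistance = flow_resistance(7.74, 311)
phonation_resistance = flow_resistance(9.83, 234)

print(f"whisper:   {whisper_resistance:.2f} kPa/(L/s)")
print(f"phonation: {phonation_resistance:.2f} kPa/(L/s)")
```

For this cell the sketch gives values close to the tabulated medians of 2.56 and 4.05 kPa/(L/s), and it preserves the direction of the whisper-phonation difference.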
FIGURE 3. Bubble plot of predicted glottal flow resistance of phonation and whisper.
nine whisper-phonation differences were significant for the women, compared with five of nine for the men. In general, flow resistance differences were more prominent for normal and pressed adduction at medium and loud conditions. Again, the men did not have significant differences for the soft condition across adduction levels. DISCUSSION Subglottal pressure Whisper vs. phonation The general observation of lower Ps in whisper is in agreement with the literature.3 The median Ps for normal adduction and medium loudness whisper was 5.38 cm H2O for women and 8.89 cm H2O for men. These values are higher than those found by Stathopoulos et al21 who reported a mean Ps of 4.00 cm H2O for female subjects and 4.61 cm H2O for male subjects during whispering. Both the present study and the study by Stathopoulos et al21 used young adults (ie, younger than 28 years of age) and a similar method of measurement of intraoral pressure. However, the utterances used in the two studies were different. The present study used /bæp:/ strings as stimuli, whereas Stathopoulos et al21 used /pi/ strings. Also, Stathopoulos et al21 instructed their subjects to produce a “comfortable whisper at a normal loudness and rate” (ie, their subjects were not asked to produce whisper at different loudness levels), whereas the present study trained subjects at three loudness levels, the medium level corresponding to “comfortable.” The differences in utterances and instruction for whisper may have resulted in the higher Ps values of the present study. In addition, the current study reports median values rather than mean values as in Stathopoulos et al.21
The explanation for the finding that the female subjects had more conditions of higher Ps for phonation than for whisper (nine of nine, with seven being significant) compared with men (six of nine, with three being significant) is not clear. Further study of the effects of laryngeal size, aerodynamic power, and acoustic efficiency differences between adult men and women may prove informative. Ps range Pressed loud whisper in men had the highest median Ps value (24.19 cm H2O) compared with all other treatment combinations, followed by pressed loud phonation in men (20.48 cm H2O). Normal soft whisper in women had the lowest median Ps value (3.46 cm H2O) followed by breathy soft whisper in women (4.14 cm H2O). The male subject in the Sundberg et al study produced his pressed loud whisper with an average of approximately 10 cm H2O, which is closer to the average man’s soft pressed whisper (11.92 cm H2O) in this study, suggesting a relatively wide range of individual differences. The normal adduction low Ps whisper of the subject in Sundberg et al’s study was produced with an average of about 3 cm H2O, compared with the 6.37 cm H2O for the men’s soft breathy whisper, again suggesting individual differences. The full Ps range across productions by all subjects for whisper was close to that of phonation. The Ps range for phonation was 1.56–30.55 cm H2O, and for whisper was 1.54–32.4 cm H2O. The Ps range for whisper obtained in this study was wider than that found in the Sundberg et al10 study, where they reported a Ps range of 1.3–17 cm H2O for whisper in their one male subject. In the present study, the subjects were instructed to whisper /bæp:/
at least 5 dB below their comfortable (medium) loudness for soft whisper and at least 5 dB above their comfortable loudness for loud whisper. This difference in the way vocal loudness was varied and the stimuli used in both studies may account for the Ps range difference. The inclusion of eight subjects and both genders in the current study could have resulted in the increased range of Ps, as well as a possible “exaggeration” of the conditions to make sure there were nine distinct conditions matching the directions for targeting those conditions. Airflow Whisper vs. phonation Whisper had greater median airflow than phonation in 16 of 18 treatment combinations, 13 of which were significant (Table 3), and all cases for the women were significant. Male subjects had significantly different values primarily for medium and loud conditions (curiously, none of their soft productions had significant differences). The finding that whisper generally produced more mean airflow than did phonation is supported by other studies.9,21 In this study, female soft whisper with normal adduction had a median flow of 338 cm3/s, whereas the mean flow of female “low intensity whisper” in the study by Monoson and Zemlin9 was 203 cm3/s. In this study, loud whisper for the female subjects with normal adduction had a median flow of 1039 cm3/s, whereas Monoson and Zemlin’s study reported a mean flow of 328 cm3/s for “forced whisper.” Thus, the flows for this study were higher by a factor of 1.6–3.2 for women. It would be reasonable to infer that the subjects in the Monoson and Zemlin study used greater adduction levels because their airflows were considerably lower. Median airflow for normal whisper (ie, normal adduction-medium loudness) in female and male subjects was similar to that reported by Stathopoulos et al.21 Median airflow for normal whisper was 583.25 cm3/s in women and 525.75 cm3/s in men in this study. 
Stathopoulos et al21 reported a mean airflow of 500 cm3/s in women and 530 cm3/s in men during “comfortable whisper.”

Flow range

Median airflow was highest for breathy loud whisper in women (1504 cm3/s) followed by breathy loud whisper in men (1355 cm3/s). Normal adduction loud phonation in women had the lowest median airflow (169 cm3/s). This was contrary to the expectation that a pressed phonation would have the least mean airflow (which was 206 cm3/s for women). It is possible that female subjects did not completely close the posterior glottis for the pressed loud phonation but instead hyperadducted the anterior glottis only. This could have resulted in greater mean airflow compared with the normal adduction loud condition where both the anterior and the posterior glottises are most likely in an adducted position. However, the higher Ps value for the pressed loud phonation (18.27 cm H2O) compared with the normal adduction loud phonation (11.85 cm H2O) would tend to drive flow at a higher rate for the pressed loud phonation condition. Similar comparisons hold for the male subjects: more flow and higher pressure for the pressed loud phonation (278 cm3/s, 20.48 cm H2O) than for normal adduction loud phonation (228 cm3/s, 13.28 cm H2O).
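The pressed versus normal comparison for loud phonation can be restated as simple ratios. The following is a minimal sketch that uses only the median values quoted in the paragraph above; the dictionary layout and the function name `pressed_to_normal_ratios` are illustrative assumptions, not part of the study's analysis.

```python
# Median values quoted in the text for loud phonation:
# (subglottal pressure in cm H2O, mean airflow in cm^3/s).
LOUD_PHONATION = {
    "women": {"normal": (11.85, 169.0), "pressed": (18.27, 206.0)},
    "men": {"normal": (13.28, 228.0), "pressed": (20.48, 278.0)},
}


def pressed_to_normal_ratios(values):
    """Return (pressure ratio, flow ratio) of pressed over normal adduction."""
    ps_normal, flow_normal = values["normal"]
    ps_pressed, flow_pressed = values["pressed"]
    return ps_pressed / ps_normal, flow_pressed / flow_normal


for sex, values in LOUD_PHONATION.items():
    ps_ratio, flow_ratio = pressed_to_normal_ratios(values)
    # Report how much pressure and flow rise from normal to pressed adduction.
    print(f"{sex}: Ps x{ps_ratio:.2f}, flow x{flow_ratio:.2f}")
```

With these medians, Ps rises by a factor of about 1.54 and flow by a factor of about 1.22 from normal to pressed adduction in both sexes. The sketch only restates the quoted medians as ratios; it does not test either of the explanations offered above.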
Journal of Voice, Vol. ■■, No. ■■, 2017
For whisper, the highest airflow for male subjects, 1355 cm3/s, was for loud breathy production and the least airflow for male subjects, 280 cm3/s, was for soft pressed production. In the Sundberg et al10 study, the highest mean airflow was found for loud breathy whisper, and the lowest mean airflow was found for medium pressed whisper. Thus, the two studies are consistent for the highest flow, and “close” for the lowest flow in whisper.

An interesting contrast between whisper and phonation is the following. A strong finding in this study was that, for both male and female subjects, airflow in loud whisper was much greater than in soft whisper (the difference was 497–873 cm3/s) across all three levels of adduction (Table 3). This was likely due to the greater Ps in loud whisper than in soft whisper at all three levels of adduction. In contrast, for phonation, the average difference between the airflow of soft and loud phonation was only 37 cm3/s for normal and pressed adduction. For breathy adduction, the flow was greater for loud phonation by 384 cm3/s (averaging male and female subjects). Greater closed quotient with increased Ps during phonation may help explain the difference of only about 37 cm3/s for the normal and pressed phonations, because the longer the glottal closed time, the less flow through the glottis, whereas the larger and more constant glottal area for breathy phonation and the whisper conditions would require flow to increase with increase in Ps.

Flow resistance

Whisper vs. phonation

Whisper had lower flow resistance than phonation in 17 of 18 treatment combinations (Table 4), with 14 being significant. For men, a significant difference between whisper and phonation was present in only five of nine conditions, with none for the soft conditions across adduction (similar to the airflow results). For women, a significant difference was present between whisper and phonation for all nine conditions.
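To make the units of flow resistance concrete, the sketch below assumes the conventional definition, Ps divided by mean airflow, and converts to kPa/(L/s) using 1 cm H2O = 0.0980665 kPa. The function name `flow_resistance` and the `MEDIANS` table are illustrative. The inputs are the loud phonation medians quoted earlier in the Discussion. Because a ratio of medians is not the median of per-token ratios, these estimates are not expected to reproduce the Table 4 values.

```python
KPA_PER_CM_H2O = 0.0980665  # conventional conversion: 1 cm H2O = 98.0665 Pa


def flow_resistance(ps_cm_h2o, flow_cm3_per_s):
    """Estimate flow resistance in kPa/(L/s) from Ps (cm H2O) and flow (cm^3/s)."""
    if flow_cm3_per_s <= 0:
        raise ValueError("airflow must be positive")
    pressure_kpa = ps_cm_h2o * KPA_PER_CM_H2O
    flow_l_per_s = flow_cm3_per_s / 1000.0
    return pressure_kpa / flow_l_per_s


# Median (Ps, airflow) pairs quoted in the text; assumed here to be usable
# together as a rough per-condition estimate.
MEDIANS = {
    "women, normal loud phonation": (11.85, 169.0),
    "women, pressed loud phonation": (18.27, 206.0),
    "men, normal loud phonation": (13.28, 228.0),
    "men, pressed loud phonation": (20.48, 278.0),
}

for label, (ps, flow) in MEDIANS.items():
    print(f"{label}: {flow_resistance(ps, flow):.2f} kPa/(L/s)")
```

Under these assumptions the estimate is higher for pressed than for normal loud phonation in both sexes (roughly 8.7 vs. 6.9 for women and 7.2 vs. 5.7 for men), even though the pressed condition also had more flow.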
The range of flow resistance in whisper was 0.18–17.39 kPa/(L/s). The flow resistance difference between loud and soft whisper was small (−0.52 to 0.01 kPa/[L/s]) for all three levels of adduction and in both men and women. Solomon and Markon22 studied three women and three men who repeated /pi/ at a rate of 1.5 syllables per second. Their flow resistance increased across low-effort whisper, high-effort whisper, breathy voice, and normal voice, with values of 1.163, 1.893, 2.105, and 3.848 kPa/(L/s), respectively. The best comparisons with the current study (values in kPa/[L/s]) might be the following: their low-effort whisper (1.163) with normal adduction soft whisper (0.99 female, 1.67 male); their high-effort whisper (1.893) with a production that is between normal adduction loud whisper (0.92 female, 1.56 male) and pressed loud whisper (2.05 female, 3.47 male); their breathy voice (2.105) with breathy medium loudness phonation (1.75 female, 1.58 male); and their normal voice (3.848) with normal adduction medium loudness phonation (3.74 female, 3.38 male). This comparison suggests that the more detailed parameterization of the present study can be useful in inferring the more specific laryngeal configuration and aerodynamics of other studies. This comparison points out the central intent of this study: to separate laryngeal function into a wider range of controllable
independent variables to determine the more subtle production effects.

Other observations

Breathy-loud phonation and pressed-soft phonation were two conditions that were difficult to produce for some subjects (but probable productions in some communication situations). The notion that it is difficult to increase loudness during breathy adduction was also reported by Holmberg et al.23 Pressed adduction is produced with increased medial compression of the vocal folds, and the vocal processes of the arytenoids are adducted close to each other. However, during the production of soft phonation, there is typically incomplete glottal closure.24 Because the laryngeal configurations for pressed phonation and soft phonation contrast with each other, it is reasonable that subjects would have difficulty producing them, despite the training sessions.

CONCLUSION

The larynx can be viewed as a complex mechanism for sound production which, at the glottal level, has two rather distinct sound sources: phonation and whisper. Both of these sound sources are capable of a wide range of glottal configurations, aerodynamic output, and acoustic characteristics. The loudness and quality for phonation and whisper (and pitch for phonation) depend on the neuromuscular control of both the laryngeal and the respiratory systems, and in particular the Ps and the glottal adduction. In their daily lives, people are capable of controlling a wide range of Ps (primarily to regulate loudness) and a wide range of glottal adduction (to regulate quality in the breathy-normal-pressed sense) for both phonation and whisper. Parameterizing both loudness and adduction at the same time leads to a finer resolution of how the larynx and the body work to make sounds. With such a refinement, the two basic laryngeal sound sources, phonation and whisper, can be compared.
It was the general goal of this study to examine such consequences of controlling loudness (assumed to be highly related to Ps) and breathy-normal-pressed quality (assumed to be highly related to glottal adduction) for both phonation and whisper. The prominent findings were the following: (1) Ps values were similar for phonation and whisper in men for normal adduction across the three levels of loudness but were significantly different (lower for whisper) for women at each loudness level (Table 2). (2) Glottal airflow tends to be higher for whisper than phonation for medium and loud conditions across all adduction levels for both male and female subjects (ie, for all 12 conditions; Table 3). For soft productions, whisper flow was greater than phonation flow for the female subjects but not for the male subjects across the three adduction levels. The notion that whisper has more airflow than phonation is therefore typical but cannot be generalized to all levels of loudness for male subjects. (3) For phonation, flow was greater for pressed adduction than for normal adduction at each level of loudness (Table 3) for both female and male subjects, an unexpected
finding. It may not be typical for pressed phonation to be accompanied by posterior glottal closure. (4) Similar to the results for flow, the difference in laryngeal flow resistance between whisper and phonation was significant (lower for whisper) for all nine adduction-loudness conditions for the female subjects. For the male subjects, the difference was significant for the loud conditions and for the normal and pressed conditions at medium loudness, but for none of the soft conditions (Table 4).

The findings from this study suggest that parameterizing the controllable variables in both whisper and phonation can lead to a more detailed understanding of laryngeal sound sources, especially a deeper orientation to the variability of results reported in studies with looser controls over laryngeal configuration and aerodynamics. Also, although the sample sizes are small in the current study, results suggest that there is a difference between adult women and men regarding the use of pressures and airflows, with women showing more contrast between phonation and whisper than men do, and thus further experiments on those differences are suggested.

A clinical implication of this study is a recommendation to record phonation and whisper at different loudness and adduction levels during aerodynamic assessment and measurement. This study explains some of the variability in whisper production depending on the loudness and adduction levels, which will be useful in interpretation of aerodynamic measures in the clinic. The results also pertain to speech synthesis relative to expectations of pressures and flows under different dynamic communication situations, with the added necessity of considering the role of the posterior glottis for both phonation and whisper and potential differences between male and female productions.

REFERENCES

1. Cirillo J. Communication by unvoiced speech: the role of whispering. An Acad Bras Cienc. 2004;76:413–423. 2. Poyatos F.
Nonverbal Communication across Disciplines. New York: John Benjamins Publishing Co; 2002:31. 3. Luchsinger R, Arnold GE. Voice Speech-Language: Clinical Communicology—Its Physiology and Pathology. Boston: Wadsworth Pub Co; 1965:118. 4. Hufnagle J, Hufnagle K. Is quiet whisper harmful to the vocal mechanism? A research note. Percept Mot Skills. 1983;57:735–737. 5. Hicks DM, Sweat LL. Whisper: help or hindrance to the professional voice. Paper presented at the Twelfth Symposium: Care of the Professional Voice, New York City;1983. 6. Solomon NP, McCall GN, Trosset MW, et al. Laryngeal configuration and constriction during two types of whispering. J Speech Hear Res. 1989;32:161–174. 7. Rubin AD, Praneetvatakul V, Gherson S, et al. Laryngeal hyperfunction during whispering: reality or myth? J Voice. 2006;20:121–127. 8. Fleischer S, Kothe C, Hess M. Glottal and supraglottal configuration during whispering. Laryngorhinootologie. 2007;86:271–275. 9. Monoson P, Zemlin W. Quantitative study of whisper. Folia Phoniatr (Basel). 1984;36:53–65. 10. Sundberg J, Scherer R, Hess M, et al. Whispering—a single-subject study of glottal configuration and aerodynamics. J Voice. 2010;24:574–584. 11. Holmberg EB, Hillman RE, Perkell JS. Glottal airflow and transglottal air pressure measurements for male and female speakers in soft, normal, and loud voice. J Acoust Soc Am. 1988;84:511–529.
12. Higgins MB, Saxman JH. A comparison of selected phonatory behaviors of healthy aged and young adults. J Speech Lang Hear Res. 1991;34:1000–1010. 13. Isshiki N, Von Leden H. Hoarseness: aerodynamic studies. Arch Otolaryngol. 1964;80:206–213. 14. Koike Y, Hirano M, Von Leden H. Vocal initiation: acoustic and aerodynamic investigations of normal subjects. Folia Phoniatr (Basel). 1967;19:173–182. 15. Netsell R, Lotz WK, DuChane AS, et al. Vocal tract aerodynamics during syllable production: normative data and theoretical implication. J Voice. 1991;5:1–9. 16. Yanagihara N, Koike Y, Von Leden H. Phonation and respiration: function in normal subjects. Folia Phoniatr (Basel). 1966;18:323–340. 17. Scherer RC, Sundberg J, Konnai R. Whisper. In: Benninger MS, Sataloff RT, eds. Sataloff’s Comprehensive Textbook of Otolaryngology: Head and Neck Surgery: Laryngology, Vol. 4. Philadelphia, PA: Jaypee Brothers Medical Publishers (P) Ltd.; 2014. 18. Hertegård S, Gauffin J, Lindestad P. A comparison of subglottal and intraoral pressure measurements during phonation. J Voice. 1995;9:149–155.
19. Rothenberg M. Rethinking the interpolation method for estimating subglottal pressure. Proceedings of the 10th International Conference on Advances in Quantitative Laryngology; 2013; Cincinnati, Ohio. 20. Frazer B. Approximating subglottal pressure from oral pressure: a methodological study [thesis]. Bowling Green (OH): Bowling Green State University; 2014. 21. Stathopoulos E, Hoit JD, Hixon TJ, et al. Respiratory and laryngeal function during whispering. J Speech Hear Res. 1991;34:761–767. 22. Solomon NP, Markon JR. Speech breathing consequences of varying laryngeal airway resistance. Poster presented at: The 29th Annual Voice Foundation Symposium, Philadelphia, PA; 2000. 23. Holmberg EB, Hillman RE, Perkell JS. Relationships between intra-speaker variation in aerodynamic measures of voice production and variation in SPL across repeated recordings. J Speech Lang Hear Res. 1994;37:484–495. 24. Rammage LA, Peppard RC, Bless DM. Aerodynamic, laryngoscopic, and perceptual-acoustic characteristics in dysphonic females with posterior glottis chinks: a retrospective study. J Voice. 1992;6:64–78.