ARTICLE IN PRESS Comparing Chalk With Cheese—The EGG Contact Quotient Is Only a Limited Surrogate of the Closed Quotient *Christian T. Herbst, †Harm K. Schutte, *Daniel L. Bowling, and ‡Jan G. Svec, *Vienna, Austria, †Groningen, The Netherlands, and ‡Olomouc, Czech Republic
Summary: The electroglottographic (EGG) contact quotient (CQegg), an estimate of the relative duration of vocal fold contact per vibratory cycle, is the most commonly used quantitative analysis parameter in EGG. The purpose of this study is to quantify the CQegg’s relation to the closed quotient, a measure more directly related to glottal width changes during vocal fold vibration and the respective sound generation events. Thirteen singers (six females) phonated in four extreme phonation types while independently varying the degree of breathiness and vocal register. EGG recordings were complemented by simultaneous videokymographic (VKG) endoscopy, which allows for calculation of the VKG closed quotient (CQvkg). The CQegg was computed with five different algorithms, all used in previous research. All CQegg algorithms produced CQegg values that clearly differed from the respective CQvkg, with standard deviations around 20% of cycle duration. The difference between CQvkg and CQegg was generally greater for phonations with lower CQvkg. The largest differences were found for low-quality EGG signals with a signal-to-noise ratio below 10 dB, typically stemming from phonations with incomplete glottal closure. Disregarding those low-quality signals, we found the best match between CQegg and CQvkg for a CQegg algorithm operating on the first derivative of the EGG signal. These results show that the terms “closed quotient” and “contact quotient” should not be used interchangeably. They relate to different physiological phenomena. Phonations with incomplete glottal closure having an EGG signal-to-noise ratio below 10 dB are not suited for CQegg analysis.
Key Words: Electroglottography–EGG–Contact quotient–Closed quotient.
Accepted for publication November 8, 2016.
This research will be submitted to the Voice Foundation’s 46th Annual Symposium: Care of the Professional Voice, from May 31 to June 4, 2017.
From the *Bioacoustics Laboratory, Department of Cognitive Biology, University of Vienna, Althanstrasse 14, 1090 Vienna, Austria; †Voice Research Lab Groningen, Wasaweg 9, 9723 JD Groningen, The Netherlands; and the ‡Voice Research Lab, Department of Biophysics, Faculty of Science, Palacký University Olomouc, 17. listopadu 12, 771 46 Olomouc, Czech Republic.
Address correspondence and reprint requests to Christian T. Herbst, Bioacoustics Laboratory, Department of Cognitive Biology, University of Vienna, Althanstrasse 14, 1090 Vienna, Austria. E-mail: [email protected]
Journal of Voice, Vol. ■■, No. ■■, pp. ■■-■■ 0892-1997 © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved. http://dx.doi.org/10.1016/j.jvoice.2016.11.007
INTRODUCTION
The human singing voice is capable of producing a wide range of different vocal timbres. This is achieved, among other means, by varying the voice source quality at the laryngeal level. Both trained and untrained singers can influence glottal configuration by two independent means: (1) cartilaginous adduction (ie, adduction of the posterior glottis, controlled along the dimension of “breathy” to “pressed” via the lateral cricoarytenoid and the interarytenoid muscles) and (2) membranous medialization (ie, vertical bulging of the vocal fold via contraction of the thyroarytenoid muscle, induced by the choice of voice register).1 Assessment of glottal configuration is essential in (singing) voice research, pedagogy, and therapy. Although direct endoscopic observation produces the best insights, it is in many cases impractical owing to its invasive nature. Often, electroglottography (EGG), pioneered by Fabre in 1957,2 is used as a low-cost, noninvasive alternative. In EGG, a high-frequency, low-voltage current is passed between two electrodes placed on each side of the thyroid cartilage. Changes in vocal fold contact area (VFCA)
during vocal fold vibration result in admittance variations, and the resulting EGG signal is proportional to the relative VFCA.3,4 The most commonly used quantitative analysis parameter derived from the EGG signal is the EGG contact quotient (CQEGG),5 a concept originally introduced by Davies et al,6 which was also referred to as EGG “duty cycle,”7 “larynx closed quotient,”8 “quasi-closed quotient,”9 or “closed quotient.”10–12 In essence, the CQEGG is an estimation of the relative duration of vocal fold contact during one glottal cycle. To arrive at the CQEGG, one “contacting” (t1) and one “de-contacting” event (t2) are defined per glottal cycle (see Figure 1), and the duration of the “contact phase” (t2 − t1) is divided by the period of the analyzed glottal cycle. The CQEGG is expressed in the range of 0–1, or as 0%–100% relative (time-normalized) cycle duration. It should be noted that the EGG signal provides information only on changes in contact between the vocal folds; it cannot indicate whether there is full glottal closure. Therefore, the term “contact quotient” is more appropriate than the term “closed quotient” for the CQEGG. In previous research, two different approaches have been applied to deriving the (de)contacting events t1 and t2: (1) a threshold-based method (see Figure 1A–C), where the (de)contacting events are determined by the moments when the locally normalized EGG waveform crosses a given threshold (typically set at 20%, 25%, or 35%);5,7,13 or (2) a method operating on the first derivative of the EGG waveform (dEGG), where the (de)contacting events are constituted by positive peaks (for contacting) and negative peaks (for de-contacting) of the dEGG signal, corresponding to the moments of maximum increase or decrease of the relative VFCA (see Figure 1D).14,15 Additionally, a hybrid method was proposed where the contacting event is constituted by the positive dEGG peak, and the de-contacting event is derived via a threshold set at ca. 0.43 (three-sevenths; see Figure 1E).8 As can be clearly seen from Figure 1, different methods result in different CQEGG values,13,14,16 with discrepancies up to 30% of the glottal cycle.17 Apparently, the choice of the method for calculating the CQEGG is vital.
FIGURE 1. Overview of different methods to calculate the EGG contact quotient: (A–C) threshold-based methods; (D) dEGG-based method; (E) hybrid method (see text); and (F) videokymographic footage at the position of maximum vibration amplitude (perpendicular to the glottal axis), related to the shown EGG waveform.
CQEGG measurements have been applied in many ways, including assessing registers in singing,14,17–20 and discriminating “breathy,” “normal,” and “pressed” phonation.16 These, and other studies,14,21 appear to suggest that the CQEGG is a somewhat viable approximation of the actual closed quotient, if only to a certain degree. Indeed, there is mounting evidence suggesting that the discrepancy between CQEGG measurements and closed quotient measurements is systematic, as derived from either the glottal flow,14,22 from videokymography,17 or from laryngeal high-speed videoendoscopy.19,20 Accordingly, if these measurements are to be used properly, it is critical to establish expected discrepancy ranges between CQEGG and closed quotient measurements for phonation with different laryngeal configurations (ie, with different choices of posterior glottal adduction and membranous medialization or vocal registers). This issue is addressed here by analyzing EGG data from a previously used database of 13 singers phonating at known glottal configurations.1 CQEGG measurements, as computed by five different CQEGG algorithms (see Figure 1), are related to the known videokymographically derived closed quotient. Specifically, the following questions are addressed in this study:
(1) To what extent does the CQEGG deviate from the videokymographically derived closed quotient at different laryngeal configurations in singing?
(2) Which of the specified algorithms for calculating the CQEGG provides results closest to the respective closed quotients?
(3) Are there certain boundary conditions that need to be met for an EGG signal to be suitable for CQEGG calculation, or can CQEGG algorithms be indiscriminately applied to any EGG waveform?
METHODS
Participants and phonatory tasks
The data analyzed in this study consist of a unique database of four “extreme” singing types from 13 trained and untrained singers: aBducted falsetto (FaB), aDducted falsetto (FaD), aBducted chest (CaB), and aDducted chest (CaD). The participants’ demographics, experimental protocols, and data acquisition methods are described in detail in a previous publication.1 All subjects were asked to sing target notes at their primary register transition (typically at or around pitch D4, fundamental frequency [f0] ≈ 294 Hz). The target notes were reached by descending (in the case of falsetto register) and ascending scales (chest register), to guarantee phonation in the designated register. The singers were explicitly asked not to blend the registers. The choice of aBducted vs aDducted phonation was instigated
via the degree of breathiness in the voice. Proper target note execution was assessed in a previous publication.1
Data acquisition and analysis
Phonation was monitored by simultaneous acoustic and EGG recordings, and videokymographic (VKG) endoscopy. In videokymography,23 a line perpendicular to the glottal axis is repeatedly scanned 7200 times per second, and the successive line images are concatenated along a global time axis into a kymogram. A kymogram depicts the time-varying lateral deflections of the vocal folds at the position of the VKG scan line. From this representation, the VKG closed quotient (CQVKG) was computed for each phonation at the line of maximum vibratory amplitude of the vocal folds, perpendicular to the glottal axis. The duration of vocal fold contact, assessed by counting pixels within each analyzed kymogram, was divided by the respective period, also assessed by counting pixels. This process was repeated for three consecutive glottal cycles located at half the duration of each target note,1 and the computed closed quotients were averaged over all three glottal cycles for each target note, resulting in the CQVKG data reported here. Facilitated by a split screen feature,24 VKG data acquisition allows for simultaneous documentation of glottal configuration (via embedded laryngeal endoscopy) and vocal fold vibratory features (via the kymograms). The respective glottal configurations pertaining to the degree of posterior adduction were assessed by measuring the relative area of the observed posterior glottal chink with the image processing software Fiji.25 This process was described in detail in a previous publication.1 The EGG signal was captured with a Glottal Enterprises EG2PC electroglottograph (Glottal Enterprises, Syracuse, NY) with the high-pass filter cutoff frequency set to 2 Hz. During data acquisition, the EGG signal was monitored with a Tektronix TDS 210 oscilloscope (Tektronix Inc., Beaverton, OR).
Within the EGG signal of each target note, individual glottal cycles were identified with a previously described algorithm.26 In short, the (time-varying) f0 of each target note was assessed with an autocorrelation method,27 and the individual EGG cycles were consecutively found by cross-correlating an ideal EGG waveform,28 stretched to the local period (as derived from the f0 information). Within each target note, the EGG waveforms of 30 consecutive glottal cycles extracted from the middle of that note were considered for further analysis. Each glottal cycle was temporally normalized (stretched) to a length of 200 samples by means of forward and inverse Fourier transform. All 30 glottal cycles of each target note were normalized in amplitude, and the normalized cycles were then superimposed on each other to arrive at one single “averaged” EGG glottal cycle for each target note. This analysis step was introduced to reduce the effects of noise in the EGG signal on the subsequent CQEGG calculations. A typical averaged EGG cycle, and its first derivative (dEGG), is illustrated in Figure 1. A total of 51 averaged EGG cycles were computed, constituted by four phonations (FaB, FaD, CaB, CaD) of 13 participants. Because of a malfunction in the equipment, no EGG signal for one participant’s FaD phonation was available. Five CQEGG measurements were conducted for each averaged EGG cycle: three of these were performed with threshold methods, with the threshold at 20%, 25%, and 35%, respectively (see Figure 1A–C for an illustration). One further measurement was made based on the dEGG signal, considering the dEGG maximum value (ie, the strongest peak) for the incident of contacting (t1) and the dEGG minimum for the incident of de-contacting (t2; see Figure 1D). This approach slightly differs from Henrich’s DECOM algorithm14 as it does not consider multiple dEGG peak cases. The last CQEGG measurement was performed according to the suggestion of Davies et al6 and Howard et al.8,21 In this hybrid algorithm, the incident of contacting (t1) is derived from the strongest dEGG peak, and the incident of de-contacting (t2) is determined via a threshold method at ca. 43% (three-sevenths; see Figure 1E). An overview of all analyzed EGG waveforms and all resulting CQEGG values is provided in the supplementary material. For each target note and each CQEGG algorithm, the difference between the CQEGG and the CQVKG was calculated as ΔCQ = CQVKG − CQEGG.
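The five CQEGG measurements and the ΔCQ comparison described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: all function names, the circular central-difference derivative, and the synthetic test cycle are hypothetical. The threshold variant simply counts samples at or above the threshold, which equals (t2 − t1)/period only for a cycle with a single contact hump.

```python
import math

def normalize(cycle):
    """Rescale one EGG cycle to the range 0..1 (local amplitude normalization)."""
    lo, hi = min(cycle), max(cycle)
    return [(x - lo) / (hi - lo) for x in cycle]

def degg(cycle):
    """First derivative via a circular central difference (assumed scheme)."""
    n = len(cycle)
    return [(cycle[(i + 1) % n] - cycle[i - 1]) / 2.0 for i in range(n)]

def cq_threshold(cycle, threshold):
    """Threshold method: share of the cycle spent at or above the threshold."""
    norm = normalize(cycle)
    return sum(1 for x in norm if x >= threshold) / len(norm)

def cq_degg(cycle):
    """dEGG method: t1 at the strongest positive peak, t2 at the strongest negative peak."""
    d = degg(cycle)
    t1 = d.index(max(d))
    t2 = d.index(min(d))
    return ((t2 - t1) % len(cycle)) / len(cycle)

def cq_hybrid(cycle, threshold=3.0 / 7.0):
    """Hybrid method: t1 from the dEGG peak, t2 where the signal drops below 3/7."""
    norm = normalize(cycle)
    d = degg(cycle)
    n = len(norm)
    t1 = d.index(max(d))
    above = False
    for step in range(n):
        value = norm[(t1 + step) % n]
        if value >= threshold:
            above = True
        elif above:
            return step / n  # first sample below the threshold after contact
    return 1.0  # no de-contacting event found within one cycle

def delta_cq(cq_vkg, cq_egg):
    """Difference as defined in the text: delta CQ = CQVKG - CQEGG."""
    return cq_vkg - cq_egg

def synthetic_cycle(n=200):
    """Hypothetical single-hump cycle: fast rise, slower fall, then a flat open phase."""
    cycle = []
    for i in range(n):
        if 20 <= i < 40:
            cycle.append(math.sin(0.5 * math.pi * (i - 20) / 20) ** 2)
        elif 40 <= i < 120:
            cycle.append(math.cos(0.5 * math.pi * (i - 40) / 80) ** 2)
        else:
            cycle.append(0.0)
    return cycle

if __name__ == "__main__":
    cycle = synthetic_cycle()
    for thr in (0.20, 0.25, 0.35):
        print(f"threshold {thr:.2f}: CQ = {cq_threshold(cycle, thr):.3f}")
    print(f"dEGG:           CQ = {cq_degg(cycle):.3f}")
    print(f"dEGG 3/7:       CQ = {cq_hybrid(cycle):.3f}")
```

On this synthetic cycle the three threshold values give progressively smaller quotients as the threshold rises, and the derivative-based variants differ from all of them, which mirrors the method dependence described in the Introduction.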
The ΔCQ values were averaged per phonation type and overall, and the respective standard deviations of ΔCQ were computed (see Results section; Table 1). Preliminary inspection of the EGG signals for all target notes suggested surprisingly low signal-to-noise ratios (SNR) for a majority of the aBducted falsetto phonations. To calculate SNR, the root-mean-square (RMS) energy of both the target note EGG signal (RMSvoiced) and the EGG signal without phonation (RMSunvoiced) was computed. For the “unvoiced” condition, a 100 ms portion of the EGG signal immediately following the target note phonation was selected. The EGG SNR for each target note signal was finally calculated as $\mathrm{SNR} = 20 \log_{10}(\mathrm{RMS}_{voiced}/\mathrm{RMS}_{unvoiced})$. This approach was feasible because the EGG device used to record these signals did not contain an automated gain controller.

TABLE 1. Differences Between CQVKG and CQEGG per CQEGG Algorithm, Averaged Per Phonation Type, and Overall (Last Column)
Algorithm        FaB (aBducted falsetto)   FaD (aDducted falsetto)   CaB (aBducted chest)   CaD (aDducted chest)   Overall
Threshold 20%    −0.45 (±0.21)             −0.15 (±0.12)             −0.10 (±0.11)          −0.04 (±0.05)          −0.17 (±0.20)
Threshold 25%    −0.46 (±0.20)             −0.11 (±0.12)             −0.13 (±0.16)          −0.01 (±0.05)          −0.18 (±0.22)
Threshold 35%    −0.39 (±0.18)             −0.05 (±0.11)             −0.11 (±0.17)          0.05 (±0.06)           −0.13 (±0.22)
dEGG             −0.31 (±0.19)             0.03 (±0.08)              −0.04 (±0.11)          0.07 (±0.09)           −0.07 (±0.19)
dEGG 3/7         −0.33 (±0.19)             0.02 (±0.10)              −0.03 (±0.11)          0.09 (±0.08)           −0.06 (±0.20)
Standard deviations are indicated in parentheses. Positive values indicate that CQVKG was larger than the respective CQEGG; negative values indicate that CQEGG was greater than CQVKG.

RESULTS
A summary of the computed CQEGG data from all five CQEGG algorithms, in relation to the respective CQVKG values, grouped per phonation type, is given in Figure 2. Visual inspection suggests a moderate agreement between CQEGG and CQVKG for the CaD phonation type for all algorithms, as well as for the FaD and CaB phonation types as derived from the dEGG-based algorithms (Figure 2D and E). Not surprisingly, threshold-based methods produced increasingly smaller values with greater threshold values, as the time-normalized distance between the rising and the falling edge of the EGG waveform typically becomes smaller at higher threshold values. There is a slight tendency for the dEGG-based algorithms and the threshold 35% algorithm to underestimate the CQEGG in relation to the CQVKG for phonations with CQVKG greater than 0.3 (for the dEGG-based algorithms this is probably caused by the phenomenon that the maximum rate of change of VFCA does not necessarily occur at the moments of glottal closure and opening).29 The FaB phonations had greatly inflated CQEGG values as compared with the CQVKG values in all CQEGG algorithms, suggesting a systematic underlying issue concerning the applicability of those algorithms to the respective EGG signals. Quantitative data for the CQEGG vs CQVKG comparisons, based on the calculated ΔCQ parameter (see Methods section), are provided in Table 1. There, several trends are revealed:
• In the “heavier” phonations, that is, either in chest register or with aDducted vocal folds, the CQEGG is generally in better agreement with the respective CQVKG value.
• Regardless of CQEGG algorithm, the FaB phonations were systematically characterized by the largest disagreement between CQEGG and CQVKG (ie, the greatest ΔCQ values), corroborating the insights from visual inspection of Figure 2 (see above).
• Overall, the two dEGG-based algorithms (dEGG and dEGG 3/7) had the lowest ΔCQ values, suggesting the best agreement between CQEGG and CQVKG. Yet even these algorithms resulted in a mean error (ΔCQ) of 6% and 7% respectively, with standard deviations in the range of 20% of normalized glottal cycle duration.
The systematic issue with overestimation of the CQEGG in the FaB phonations can be explained by considering the EGG waveform depicted in Figure 3: The archetypical EGG waveform for breathy phonation without vocal fold adduction (regularly produced with a membranous posterior glottal chink1,30) is sinusoidal (comparable with the signals reported for “Zone 1” in Schutte and Seidner, Figures 3 and 431) and typically polluted with a substantial degree of measurement noise. This noise was filtered out to a certain degree in the example shown in Figure 3 by averaging 30 consecutive glottal cycles (see Methods section), yet the corresponding dEGG waveform still contains a considerable degree of noise (see Figure 3). Because of the sinusoidal nature of the EGG waveform, all five CQEGG algorithms produced greatly inflated results in the range of about 0.6–0.7 (ca
FIGURE 2. Correlations between CQVKG and CQEGG for all CQEGG algorithms, participants, and phonation types.
FIGURE 3. (A–E) CQEGG calculation based on a typical EGG waveform derived from aBducted falsetto phonation. (F) Videokymographic footage related to the shown EGG waveforms. Note that although the kymographic contact quotient is zero (close inspection of the videokymogram suggests that the vocal folds are not quite making contact), all CQEGG algorithms produce paradoxically large CQEGG values.
60%–70% cycle duration). This is in stark contrast to the respective CQVKG, which, for the shown example, was zero (no glottal closure found in the endoscopic and VKG footage, again corroborating Schutte and Seidner’s findings31). The relative level of noise in the EGG signal was quantified via the SNR. Interestingly, most of these cases with gross overestimation of CQEGG in relation to CQVKG stemmed from EGG signals with rather low SNRs. This phenomenon is illustrated in Figure 4, where ΔCQ is shown as a function of SNR for all five algorithms, grouped by phonation type. Visual inspection of that figure suggests that the worst outliers had an SNR below 10 dB. These cases were almost entirely made up of FaB phonations (nine out of a total of 13), owing to the sinusoidal nature of the respective EGG signals (compare Figure 3). Based on this finding, it would appear that EGG signals with an SNR below 10 dB contributed most to distorting the correlation of CQEGG and CQVKG depicted in Figure 2. By inversion of this argument, the more general correlation of these two variables might be maintained by only considering EGG signals with an SNR above 10 dB. The result of this is depicted in Figure 5. In contrast to Figure 2, in Figure 5 the CQVKG was plotted as a function of CQEGG. This was done because in a typical research or pedagogical application, only the CQEGG is known, based on which the “true” closed quotient as a measure that is more closely related to glottal opening and closing events (and thus laryngeal sound generation) might be approximated. For this purpose, second-order polynomial regression fits were created per CQEGG algorithm, based on all phonations with an EGG SNR greater than 10 dB.
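The SNR measure defined in the Methods section, together with the 10 dB cutoff discussed here, can be sketched as follows. This is a minimal illustration, assuming the voiced and unvoiced EGG segments are already available as lists of samples; the function names and the `min_db` parameter are hypothetical and not part of any published EGG software.

```python
import math

def rms(samples):
    """Root-mean-square value of a signal segment."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def egg_snr_db(voiced, unvoiced):
    """SNR in dB: 20 * log10(RMS_voiced / RMS_unvoiced)."""
    return 20.0 * math.log10(rms(voiced) / rms(unvoiced))

def passes_snr_criterion(voiced, unvoiced, min_db=10.0):
    """True if the segment exceeds the SNR cutoff (10 dB assumed, as in the text)."""
    return egg_snr_db(voiced, unvoiced) > min_db

if __name__ == "__main__":
    # Hypothetical segments: the voiced part has ten times the noise amplitude.
    noise = [1.0, -1.0] * 50
    voiced = [10.0, -10.0] * 50
    print(f"SNR = {egg_snr_db(voiced, noise):.1f} dB, "
          f"usable: {passes_snr_criterion(voiced, noise)}")
```

Such a comparison of two segments is only meaningful when the recording gain is constant across both, which is why the absence of an automated gain controller matters for this measure.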
As can be seen from Figure 5D, the regression for the dEGG algorithm had the greatest coefficient of determination (R2 = 0.808), suggesting that for the analyzed data (EGG SNR > 10 dB) the dEGG algorithm is the most reliable predictor of CQVKG within a CQEGG range of 0.2–0.8. At least to a certain degree, the quadratic nature of the regression fit allows for correction of the dEGG algorithm’s aforementioned underestimation of the CQEGG in relation to the CQVKG at higher CQEGG values in the range of 0.4–0.7. The performance of CQVKG estimation based on the polynomial regression fit for the dEGG algorithm
$$CQ_{VKG\_estd} = 1.1\,(CQ_{EGG})^{2} + 2.149\,CQ_{EGG} - 0.271$$ (Eq. 1)
was assessed by substituting all CQEGG values derived from EGG signals with an SNR greater than 10 dB (n = 38) into Equation 1, and then calculating the magnitude of the estimation error (ie, the difference between the known CQVKG and the estimated closed quotient [CQVKG_estd]) as
$$e = CQ_{VKG} - CQ_{VKG\_estd}$$ (Eq. 2)
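The fit-and-evaluate procedure behind Equations 1 and 2 might be sketched as follows. This is a minimal sketch under stated assumptions, not the authors’ analysis code: the function names are hypothetical, the quadratic fit is obtained from the normal equations with plain Python, the percentile uses the nearest-rank definition, the error is taken as an absolute value, and the data pairs in the demo are invented for illustration only.

```python
import math

def fit_quadratic(xs, ys):
    """Least-squares fit y ~ a*x^2 + b*x + c via the normal equations."""
    s = [sum(x ** k for x in xs) for k in range(5)]                 # sums of x^k
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]  # sums of y*x^k
    m = [
        [s[4], s[3], s[2], t[2]],
        [s[3], s[2], s[1], t[1]],
        [s[2], s[1], s[0], t[0]],
    ]
    # Gauss-Jordan elimination with partial pivoting on the augmented matrix.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [mr - f * mc for mr, mc in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))

def estimation_errors(cq_egg, cq_vkg, coeffs):
    """Absolute difference between measured CQVKG and the regression estimate."""
    a, b, c = coeffs
    return [abs(y - (a * x * x + b * x + c)) for x, y in zip(cq_egg, cq_vkg)]

def percentile(values, p):
    """Nearest-rank percentile (p in 0..100) of a list of values."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

if __name__ == "__main__":
    # Invented (CQEGG, CQVKG) pairs, for illustration only.
    cq_egg = [0.22, 0.30, 0.35, 0.41, 0.48, 0.55, 0.61, 0.70]
    cq_vkg = [0.15, 0.28, 0.30, 0.45, 0.50, 0.63, 0.60, 0.74]
    coeffs = fit_quadratic(cq_egg, cq_vkg)
    errors = estimation_errors(cq_egg, cq_vkg, coeffs)
    print("coefficients (a, b, c):", coeffs)
    print("50th percentile of |e|:", percentile(errors, 50))
    print("95th percentile of |e|:", percentile(errors, 95))
```

Reporting percentiles of the error rather than only a coefficient of determination shows how far individual estimates can stray even when the overall fit looks good.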
A histogram of the estimation error e for all data points is shown in Figure 6. The 50th percentile at e = 0.046 suggests that in half of the cases the true CQVKG was either over- or underestimated by more than 0.046, amounting to roughly 5% cycle duration. The 95th percentile at 0.189 suggests that 95% of all cases had an estimation error e of less than 0.189. In other words, disregarding outliers that would account for 5% of all cases, the method can only guarantee a CQVKG estimate to within roughly 20% of cycle duration. This suggests that although the computed polynomial regression approach is adequately capable of predicting a general trend when relating dEGG-based CQEGG data to CQVKG data, it has only limited usefulness for reconstructing individual data points, in spite of the relatively good coefficient of determination (R2 = 0.808).
FIGURE 4. Difference between CQEGG and CQVKG as a function of EGG signal-to-noise ratio (SNR) for all participants and phonation types, illustrated for all five CQEGG algorithms.
FIGURE 5. Relationship of CQEGG with CQVKG when considering only data points where the EGG signal had an SNR above 10 dB.
FIGURE 6. Distribution of error magnitudes when estimating the measured CQVKG values by substituting the CQEGG data calculated with the dEGG method into the regression fit derived in Figure 5D. Only data points where the EGG SNR was above 10 dB were considered (n = 38). The 50th and the 95th percentile are indicated with vertical dashed lines.
DISCUSSION
Overall, the analyzed data revealed only an approximate agreement between CQEGG and CQVKG, which is in agreement with previous research.14,19,20,22,32–34 The additional insights gained here stem mainly from differentiation of results according to known glottal configuration, assessed by laryngeal endoscopy and videokymography, facilitated by the “split screen” feature of the VKG camera used. In particular, phonation in the fully aDducted chest register (CaD) tended to exhibit the best correlation between the two measures, regardless of the CQEGG algorithm used, whereas phonation in the aBducted falsetto register (FaB) typically resulted in the greatest discrepancies between CQEGG and CQVKG. The FaB phonations were consistently produced with incomplete glottal closure, having a posterior glottal chink reaching into the membranous portion of the vocal folds (see Figure 4 in Ref. 1). As a consequence, the measured CQVKG was zero or only slightly above in all cases. In contrast, the respective CQEGG values were typically in the range of 0.2–0.7, resulting in a gross mismatch of closed quotient vs contact quotient.
Such spurious results are caused by a fundamental flaw of currently available CQEGG algorithms: they are programmed to estimate incidents of contacting and de-contacting (see the t1 and t2 markers in Figure 1 and Figure 3) by default, regardless of whether vocal fold contact is present in the first place. In other words, currently available CQEGG algorithms always attempt to produce a CQEGG value greater than zero, even in cases where the comparable closed quotient would be zero owing to the absence of glottal closure. Because this is mostly an issue with aBducted (breathy) falsetto phonations, we advise against computing the CQEGG of those phonations with currently available algorithms, as this is bound to result in spurious data devoid of physiological meaning
and thus difficult to interpret. Most likely this would apply not only to EGG data from singing, but also to EGG data from speech. The data presented in Figure 4 suggest that the degree of mismatch between CQEGG and CQVKG is related to the analyzed EGG signal’s SNR, resulting in substantially greater discrepancies if the SNR is below 10 dB. Not surprisingly, it is mostly the FaB phonations that fall into this category. As those phonations are produced with limited or without glottal closure, the vocal fold contact variation is small as compared with EGG signals stemming from phonation with complete vocal fold closure. Consequently, in that class of EGG signals, the inherent system noise introduced by the equipment and the measurement setup (eg, skin impedance, EGG circuit noise, potential electromagnetic interference, etc) has a greater relative impact, thus resulting in a lower SNR. When discarding CQEGG data derived from EGG signals with an SNR below 10 dB, the correspondence between CQEGG and CQVKG was improved (yet still being far from perfect), particularly for the dEGG algorithm (see Figure 5D). Using a polynomial regression fit based on the available data (Equation 1), the relation between CQEGG and CQVKG can be described fairly well. However, the predictive reliability of this approach for individual data points is still limited (see Figure 6 and the Results section). Nevertheless, in view of these observations, the aforementioned recommendation not to include a certain class of EGG signals (typically devoid of glottal closure) in CQEGG analysis can be substantiated quantitatively. Based on our results, we suggest adding a 10 dB EGG SNR quality criterion to all available CQEGG algorithms to avoid inflated CQEGG data for EGG signals lacking vocal fold contact. 
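The point made earlier in this section, that a threshold algorithm returns a sizeable quotient even when no contact occurs, can be illustrated with a pure sinusoid standing in for a contact-free EGG signal. This is a hypothetical sketch: the sinusoid is an idealization of the FaB waveforms, and the sample-counting threshold function is an assumed simplification rather than any published algorithm.

```python
import math

def cq_threshold(cycle, threshold):
    """Share of the amplitude-normalized cycle at or above the threshold."""
    lo, hi = min(cycle), max(cycle)
    norm = [(x - lo) / (hi - lo) for x in cycle]
    return sum(1 for x in norm if x >= threshold) / len(norm)

def sinusoidal_cycle(n=200):
    """One period of a sine wave, used here as an idealized contact-free signal."""
    return [math.sin(2.0 * math.pi * i / n) for i in range(n)]

if __name__ == "__main__":
    cycle = sinusoidal_cycle()
    for thr in (0.20, 0.25, 0.35):
        # Analytical value for a continuous sinusoid: 0.5 + asin(1 - 2*thr) / pi
        expected = 0.5 + math.asin(1.0 - 2.0 * thr) / math.pi
        print(f"threshold {thr:.2f}: CQ = {cq_threshold(cycle, thr):.3f} "
              f"(analytical {expected:.3f})")
```

For thresholds between 20% and 35%, the sinusoid yields quotients of roughly 0.60 to 0.70 regardless of its amplitude, because local normalization removes any information about how small the signal was. This is one reason a separate signal quality check, such as the suggested SNR criterion, is needed before the quotient is computed.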
The analyzed data show clear differences between CQEGG values as computed by different CQEGG algorithms, which is in agreement with findings from previous studies.13,14,16,17 Our data show that overall the dEGG-based algorithm produces CQEGG data that best correspond to respective closed quotient values. Based on this observation, a previously issued recommendation in favor of 20% or 25% threshold algorithms17 needs to be revised on the grounds that in the previous study only two male singers were investigated. Rather, the findings of the current study would suggest using a dEGG-based algorithm by default, if only for cases of nonpathologic phonation. Free software implementations of dEGG-based algorithms are available in the form of Nathalie Henrich’s DECOM algorithm (implemented in Matlab: http://voiceresearch.free.fr/egg/index.html) and as an integral part of the main author’s EGG wavegram package26 (implemented in Python: http://homepage.univie.ac.at/christian.herbst/index.php?page=wavegram). A limitation of this study is that the closed quotient was determined based on kymograms obtained from a single line,23 perpendicular to the glottal axis, rather than from the whole glottal area waveform (GAW), originating from high-speed video data.35,36 This limitation is related to the nature of the VKG technique, which delivers the kymographic images in real time and takes into account only a single line perpendicular to the glottal axis. The place of maximum vibration amplitude, which is usually around the middle of the vibrating vocal fold length, has been considered to be the most representative place for describing the
vibratory behavior of the vocal folds. Care was taken to always place the VKG scan line at that position; however, slight deviations from that position might have introduced small variation into the CQVKG data. Recent evidence presented by Lohscheller et al37 (see Figure 4 of their study) suggests that the endoscopic closed quotient, when measured only at one point along the anterior-posterior (A-P) axis of the vocal folds, becomes smaller as that measurement point is moved posteriorly. The minimum CQ found along the A-P axis determines the overall CQ as derived from the GAW. As this minimum is likely to be found toward the posterior end of the A-P axis, our CQVKG measurements, taken approximately at the middle of the A-P axis, might have a tendency to be higher than hypothetical CQ measurements taken from the GAW (which, unfortunately, are not available), particularly in breathy (non-adducted) phonations.
CONCLUSION
The distinction between EGG “contact quotient” (CQEGG) and “closed quotient” is essential when interpreting phonatory data. As a first approximation, the closed quotient directly relates to the incidents of glottal closure and opening, that is, the incidents of initiation and, more importantly, cessation of glottal airflow per cycle, which constitute the acoustic excitation of the sound source.38 The CQEGG, on the other hand, is a surrogate quantity that attempts to emulate the properties of the closed quotient without having access to the direct information (because EGG measures only the relative VFCA and not the area of the glottis or glottal airflow directly3,39).
Vocal fold (de)contacting does not happen in an instant of time (with zero duration), but extends over an interval, caused by phase differences in the vocal fold vibration, taking place along both the inferior-superior and the A-P vocal fold dimension.29,40,41 The arbitrary estimation of (de)contacting intervals via instants (compare the t1 and the t2 events in Figure 1 and Figure 3) constitutes a fundamental issue inherent to all CQEGG algorithms. (In support of this argument, a recent publication has shown that there can be differences of up to 10% of cycle duration between dEGG peaks and opening or closing events.29) Therefore, while the closed quotient does relate to a real physiological phenomenon that is very relevant for laryngeal sound generation, the CQEGG is at best an imprecise approximation thereof. The CQEGG thus requires a different interpretation as compared with the closed quotient, and these two measures should never be used interchangeably.22 In particular, the CQEGG cannot be treated as an ersatz closed quotient when endoscopic (high-speed) video or airflow data are not available. Nevertheless, the CQEGG may be an unavoidable tool for preliminary assessment of laryngeal vibratory behavior when more invasive data acquisition methods are not available. In such cases, however, very careful interpretation is required when performing statistical tests on quantitative CQEGG data: Even statistically significant results may be meaningless if the measure is applied in an inappropriate context. Summarizing the findings of this study, we conclude that the CQEGG is only a limited surrogate of the closed quotient and should not be considered as a measure directly related to laryngeal sound generation events. EGG signals without vocal fold contact (typically derived from breathy phonations like aBducted falsetto)
are not suitable for CQEGG analysis. Consequently, EGG signals with an SNR below 10 dB should not be analyzed. The choice of CQEGG algorithm systematically influences the results. In this study, a CQEGG algorithm operating on the first derivative of the EGG signal (dEGG) gave the “best” results in relation to corresponding closed quotient data. The relation between the dEGG-based CQEGG and the CQVKG was modeled with a second-order polynomial regression. Although this model could adequately explain the overall relation between CQEGG and closed quotient (R² = 0.808), it was less well suited for predicting individual closed quotient values from known CQEGG data. In this light, qualitative approaches, such as the newly introduced EGG wavegram visualization technique,26 are suggested as promising alternatives to quantitative CQEGG analysis.

Acknowledgments

Our sincere thanks go to Dr. Qingjun Qiu for his help in acquiring the videokymographic footage analyzed in this study. We thank Dr. Donald Miller for his feedback and comments on the manuscript. This work has been supported by an APART grant from the Austrian Academy of Sciences (to C.T.H.), and by the Czech Science Foundation (GACR) project no. GA16-01246S (to J.G.S.). The title phrase “comparing chalk with cheese,” as suggested by an anonymous reviewer, was chosen over “comparing apples with oranges,” because apples and oranges can actually be compared quite well.42,43

SUPPLEMENTARY DATA

Supplementary data related to this article can be found online at doi:10.1016/j.jvoice.2016.11.007.

REFERENCES

1. Herbst CT, Qiu Q, Schutte HK, et al. Membranous and cartilaginous vocal fold adduction in singing. J Acoust Soc Am. 2011;129:2253–2262.
2. Fabre P. Un procédé électrique percutané d’inscription de l’accolement glottique au cours de la phonation: glottographie de haute fréquence; premiers résultats [A non-invasive electric method for measuring glottal closure during phonation: high frequency glottogr]. Bull Acad Nat Med. 1957;141:66–69.
3. Hampala V, Garcia M, Svec JG, et al. Relationship between the electroglottographic signal and vocal fold contact area. J Voice. 2016;30:161–171.
4. Scherer RC, Druker DG, Titze IR. Electroglottography and direct measurement of vocal fold contact area. In: Fujimura O, ed. Vocal Fold Physiology: Voice Production, Mechanisms and Functions, Vol. 2. New York: Raven Press; 1988:279–290.
5. Orlikoff RF. Assessment of the dynamics of vocal fold contact from the electroglottogram: data from normal male subjects. J Speech Hear Res. 1991;34:1066–1072.
6. Davies P, Lindsey GA, Fuller H, et al. Variation in glottal open and closed phases for speakers of English. Proc Inst Acoust. 1986;8:539–546.
7. Rothenberg M, Mahshie JJ. Monitoring vocal fold abduction through vocal fold contact area. J Speech Hear Res. 1988;31:338–351.
8. Howard DM. Variation of electrolaryngographically derived closed quotient for trained and untrained adult female singers. J Voice. 1995;9:163–172.
9. Hacki T. Electroglottographic quasi-open quotient and amplitude in crescendo phonation. J Voice. 1996;10:342–347.
10. Verdolini K, Chan R, Hess M, et al. Correspondence of electroglottographic closed quotient to vocal fold impact stress in excised canine larynges. J Voice. 1998;12:415–423. http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=9988028&dopt=Abstract.
11. Morris RJ, Okerlund D, Bernadin S. Differentiated electroglottograph and audio signal measurements of vocal fold closed quotient during a register change: single note data. J Acoust Soc Am. 2015;137:2405. doi:10.1121/1.4920762.
12. Esposito CM. An acoustic and electroglottographic study of White Hmong tone and phonation. J Phon. 2012;40:466–476.
13. Sapienza C, Stathopoulos ET, Dromey C. Approximations of open quotient and speed quotient from glottal airflow and EGG waveforms: effects of measurement criteria and sound pressure level. J Voice. 1998;12:31–43.
14. Henrich N, d’Alessandro C, Doval B, et al. On the use of the derivative of electroglottographic signals for characterization of nonpathological phonation. J Acoust Soc Am. 2004;115:1321–1332.
15. Childers D, Hicks DM, Eskanazi L, et al. Electroglottography and vocal fold physiology. J Speech Hear Res. 1990;33:245–254.
16. Kankare E, Laukkanen A-M, Ilomäki I, et al. Electroglottographic contact quotient in different phonation types using different amplitude threshold levels. Logoped Phoniatr Vocol. 2012;37:127–132.
17. Herbst CT, Ternström S. A comparison of different methods to measure the EGG contact quotient. Logoped Phoniatr Vocol. 2006;31:126–138.
18. Henrich N, d’Alessandro C, Doval B, et al. Glottal open quotient in singing: measurements and correlation with laryngeal mechanisms, vocal intensity, and fundamental frequency. J Acoust Soc Am. 2005;117:1417–1430.
19. Echternach M, Dippold S, Sundberg J, et al. High-speed imaging and electroglottography measurements of the open quotient in untrained male voices’ register transitions. J Voice. 2010;24:644–650. doi:10.1016/j.jvoice.2009.05.003, S0892-1997(09)00070-8 [pii].
20. Yokonishi H, Imagawa H, Sakakibara K-I, et al. Relationship of various open quotients with acoustic property, phonation types, fundamental frequency, and intensity. J Voice. 2016;30:145–157.
21. Howard DM, Lindsey GA, Allen B. Toward the quantification of vocal efficiency. J Voice. 1990;4:205–212.
22. La FMB, Sundberg J. Contact quotient versus closed quotient: a comparative study on professional male singers. J Voice. 2015;29:148–154.
23. Svec JG, Schutte HK. Videokymography: high-speed line scanning of vocal fold vibration. J Voice. 1996;10:201–205.
24. Qiu Q, Schutte HK. A new generation videokymography for routine clinical vocal-fold examination. Laryngoscope. 2006;116:1824–1828.
25. Schindelin J, Arganda-Carreras I, Frise E, et al. Fiji: an open-source platform for biological-image analysis. Nat Methods. 2012;9:676–682. doi:10.1038/nmeth.2019, nmeth.2019 [pii].
26. Herbst CT, Fitch WT, Švec JG. Electroglottographic wavegrams: a technique for visualizing vocal fold dynamics noninvasively. J Acoust Soc Am. 2010;128:3070–3078. doi:10.1121/1.3493423.
27. Boersma P. Accurate short-term analysis of the fundamental frequency and the harmonics-to-noise ratio of a sampled sound. In: Proceedings of the Institute of Phonetic Sciences, Vol. 17. Amsterdam: 1993:97–110.
28. Titze IR. A four-parameter model of the glottis and vocal fold contact area. Speech Commun. 1989;8:191–201.
29. Herbst CT, Lohscheller J, Svec JG, et al. Glottal opening and closing events investigated by electroglottography and super-high-speed video recordings. J Exp Biol. 2014;217:955–963.
30. Södersten M, Hertegard S, Hammarberg B. Glottal closure, transglottal airflow, and voice quality in healthy middle-aged women. J Voice. 1995;9:182–197.
31. Schutte H, Seidner WW. Registerabhängige Differenzierung von Elektroglottogrammen. Sprache Stimme Gehör. 1988;12:59–62.
32. Baer T, Löfqvist A, McGarr NS. Laryngeal vibrations: a comparison between high-speed filming and glottographic techniques. J Acoust Soc Am. 1983;73:1304–1308.
33. Mecke A-C, Sundberg J, Granqvist S, et al. Comparing closed quotient in children singers’ voices as measured by high-speed-imaging, electroglottography, and inverse filtering. J Acoust Soc Am. 2012;131:435–441.
34. Schutte HK, Miller DG. Measurement of closed quotient in a female singing voice by electroglottography and videokymography. In: Schutte HK, ed. 5th International Conference Advances in Quantitative Laryngology, Groningen, the Netherlands, April 27–28, 2001. (CD-ROM). Groningen Voice Research Lab, University of Groningen; 2001.
35. Deliyski DD, Petrushev PP, Bonilha HS, et al. Clinical implementation of laryngeal high-speed videoendoscopy: challenges and evolution. Folia Phoniatr Logop. 2008;60:33–44. doi:10.1159/000111802, 000111802 [pii].
36. Hertegard S. What have we learned about laryngeal physiology from high-speed digital videoendoscopy? Curr Opin Otolaryngol Head Neck Surg. 2005;13:152–156.
37. Lohscheller J, Svec JG, Döllinger M. Vocal fold vibration amplitude, open quotient, speed quotient and their variability along glottal length: kymographic data from normal subjects. Logoped Phoniatr Vocol. 2013;38:182–192. doi:10.3109/14015439.2012.731083.
38. Fant G. Glottal source and waveform analysis. STL-QPSR. 1979;85–107. 1/1979.
39. Baken RJ. Electroglottography. J Voice. 1992;6:98–110.
40. Childers DG, Hicks DM, Moore GP, et al. A model for vocal fold vibratory motion, contact area, and the electroglottogram. J Acoust Soc Am. 1986;80:1309–1320.
41. Hess M, Ludwigs M. Strobophotoglottographic transillumination as a method for the analysis of vocal fold vibration patterns. J Voice. 2000;14:255–271.
42. Barone JE. Comparing apples and oranges: a randomised prospective study. BMJ. 2000;321.
43. Sandford SA. Apples and oranges—a comparison. Available at: http://www.improbable.com/airchives/paperair/volume1/v1i3/air-1-3-apples.html. Published 1995. Accessed December 13, 2016.