ARTICLE IN PRESS

A Cepstral Analysis of Normal and Pathologic Voice Qualities in Iranian Adults: A Comparative Study

*,†Arezoo Hasanvand, *Abolfazl Salehi, and *Mona Ebrahimipour, *†Tehran, Iran

Summary: Introduction. The use of frequency-based analysis as an accurate method of voice analysis motivated us to evaluate the voice qualities of healthy versus dysphonic Iranian people.
Methods. Two hundred normal and dysphonic participants of either gender, aged between 20 and 50 years, were divided into four equal groups. A 5-second prolongation of the vowel /a/ and a sample of text reading were used for the analysis. “Speech Tool” software was employed for cepstral peak prominence (CPP) and cepstral peak prominence-smoothed (CPPS) analyses. The t test and Mann-Whitney U test were used for statistical analysis.
Results. Significant differences between the dysphonic and control groups were found for CPP and CPPS in the reading task (males and females) and for CPPS in the sustained vowel (males and females). However, the two male groups showed no difference in CPP in the sustained vowel. Moreover, significantly lower CPP and CPPS were observed in the sustained vowel and reading tasks for the dysphonic females compared to the control group and either group of males.
Discussion and Conclusion. In spite of the different characteristics of consonant-vowel contexts in the Persian language, the results of this study suggested that both CPP and CPPS are appropriate to differentiate between normal and dysphonic voices in connected speech and that CPPS is promising for sustained phonation in Persian. The results also suggested that the male groups, in both the normal and dysphonic samples, had better CPP and CPPS values.

Key Words: Acoustic–Voice analysis–Dysphonia–Voice quality–Cepstrum peak prominence.
Accepted for publication October 19, 2016.
From the *Department of Speech Therapy, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran; and the †Student Research Committee, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran.
Address correspondence and reprint requests to Abolfazl Salehi, Department of Speech Therapy, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran. E-mail: [email protected]
Journal of Voice, Vol. ■■, No. ■■, pp. ■■-■■
0892-1997
© 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
http://dx.doi.org/10.1016/j.jvoice.2016.10.017

INTRODUCTION

Human voice, as a complex tonal wave, is a product of vocal fold vibration and is modified by an expiratory airstream and resonatory filtering functions.1 It has features such as pitch, loudness, and quality. A thorough assessment of voice quality, as a discriminative factor to identify voices, includes an auditory-perceptual assessment and more objective and measurable evaluations, such as acoustic voice analysis. Auditory-perceptual assessment by experienced listeners is still regarded as the most reliable method and a necessary component in the assessment of pathological voices in clinical and research settings.2–5 However, many studies have reported that auditory-perceptual assessments, specifically in research fields, lack objective reproducibility due to individual bias, the listener’s level of familiarity, and listener-specific characteristics.2–6 Although advances in digital technology and new methods of speech processing have made it possible in recent years to create objective and efficient methods to assess voice disorders and follow up their treatment, auditory-perceptual voice assessments should not be replaced by acoustic assessments. Nevertheless, the principles of evidence-based practice call for quantitative and objective criteria for a more accurate diagnosis and follow-up procedure.7

Nowadays, acoustic analysis with the help of computer-based systems is one of the essential parts of voice assessment in many studies. Other advantages of acoustic methods as objective voice assessments include the following: easy, quick, measurable, and repeatable use; relatively low cost; noninvasiveness; numerical output and the possibility of documenting the assessment in the early stages; determining the treatment progress; and provision of some criteria and incentives for patients to treat their voices.8–11

Given the importance of acoustic analysis, there are different methods that could facilitate the diagnosis of voice disorders and prediction of dysphonia. A normal vocal acoustic signal will have small cycle-to-cycle variability in frequency and amplitude. The periodicity of the signal can be calculated by time-based12–15 and frequency-based16–18 acoustic analyses. Among the time-based methods, the most common measurements that reflect disruptions in voice quality are perturbation parameters such as jitter, shimmer, and harmonics-to-noise ratio. These time-based parameters rely on an exact demarcation of cycle-to-cycle boundaries (ie, where a cycle of vibration starts or ends). However, boundary detection approaches are highly valid and reliable only in periodic voices.7 In fact, normal acoustic signals are periodic and have a small cycle-to-cycle variability in frequency or amplitude.19 In contrast, rough voices have high jitter and shimmer levels.
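As a rough illustration of the perturbation measures named above, the following Python sketch computes a "local" jitter and shimmer value from per-cycle measurements. It assumes the cycle boundaries have already been detected, which is exactly the step that becomes unreliable in aperiodic voices; the function names and the choice of the local (consecutive-cycle) definition are assumptions made for illustration and are not taken from this study.

```python
def local_perturbation_percent(values):
    """Mean absolute difference between consecutive cycles,
    expressed as a percentage of the mean value."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value


def jitter_percent(periods_s):
    """Cycle-to-cycle variability of the period (frequency perturbation)."""
    return local_perturbation_percent(periods_s)


def shimmer_percent(peak_amplitudes):
    """Cycle-to-cycle variability of the peak amplitude."""
    return local_perturbation_percent(peak_amplitudes)
```

A perfectly periodic signal gives 0% for both measures; any error in locating a cycle boundary changes two neighboring periods at once, which is why these values become unstable when boundaries are hard to find.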
Therefore, in these acoustic samples, a small error in determination of fundamental frequency (F0) causes a considerable error in perturbation measurements of frequency, amplitude, and harmonics-to-noise ratio due to the irregularity of phonation and difficulty in the exact determination of cycle-by-cycle F0.7,19 For this reason, despite the ability of these acoustic parameters to quantify acoustic signals and their widespread use in research and assessment fields, some doubts have arisen during the past two decades.9,20–22 On the other hand, a sustained mid-vowel is used in the traditional acoustic analysis because sustained vowels possess a relatively fixed nature of phonation and comfortable and standardized production with no voiceless parts, prosodic changes in frequency and
amplitude, pause, speed, stress, language loading, and relative influences of language, accent, etc, in the phonetic context of continuous speech.9,11,23,24 However, the presence of both types of tasks, that is, continuous speech and sustained vowel, is necessary to create a more accurate and multidimensional image of voice. Moreover, if our goal is to get an “ecologically valid” picture and show people’s true voices in their daily lives, it is necessary to use both tasks of sustained vowels and continuous speech.11 These properties make time-based assessments unreliable in continuous speech. Additionally, the study performed by Carding et al25 suggested that none of the time-domain methods are sensitive enough for determining the voice quality changes associated with treatment. To overcome the limitations of these early acoustic analyses, a number of researchers created multiparameter approaches to assess voice quality.13,26–28 Among the so-called frequency-based measurement methods, cepstral peak prominence (CPP) and cepstral peak prominence-smoothed (CPPS) are strongly correlated with roughness and breathiness in the tasks of sustained vowels and continuous speech and are regarded as the best measures for assessing voice quality even in severely dysphonic voice samples.7,16,17,29–33 According to Hillenbrand and Houde,34 “a cepstrum is a log power spectrum of a log power spectrum. 
For periodic signals, the first power spectrum shows energy at harmonically related frequencies and the second spectrum displays a strong component corresponding to the regularity of the harmonic peaks.” In other words, CPP gives information about the organization of harmonics in a signal.16,34–37 CPPS, on the other hand, is a smoothed CPP that provides a clearer signal representation by averaging the cepstrum across quefrency and time.7,16,31,34,38 Quefrency, the independent variable of the cepstrum, has units of time; a cepstral peak at a given quefrency corresponds to the inverse of the frequency spacing between the harmonics (ie, the fundamental period). A normal acoustic signal is periodic and harmonic in the frequency spectrum, and thus its cepstral peak rises further above the linear regression line in the cepstral analysis.39 In contrast, a voice with poorly defined harmonics has a cepstral peak that lies closer to the regression line, and the resulting lower CPP indicates the abnormality of the acoustic signal.40 This measurement is not based on a pitch-tracking mechanism. It is based on a peak-to-average calculation, and it is exactly for this reason that it can reflect voice quality validly and reproducibly even if acoustic signals are severely dysphonic.19 Another feature of this acoustic parameter is its independence from the recording technique and the volume of the voice sample.39 Previous studies have also shown that CPP predicts dysphonia in samples of sustained vowels and connected speech with the greatest sensitivity, specificity, and predictive value, and has a strong relationship with the perceptual classification of dysphonia.16,17,34,41 Several studies have been done to establish this acoustic measure as a reference in clinical practice. Maryn et al41 conducted a meta-analysis indicating that CPPS is a robust acoustic measure of the overall severity of dysphonia; therefore, the crucial role of cepstral analysis in clinical voice evaluation became more noticeable. 
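The cepstral definition quoted above, and the idea of measuring the peak against a regression line, can be sketched in a few lines of Python. This is a simplified, single-frame illustration and not the algorithm implemented in any particular analysis program: the small pure-Python FFT (used only to keep the example self-contained), the Hann window, the 60 to 300 Hz search range for the peak, and the regression range starting at 1 ms are all assumptions made here.

```python
import cmath
import math


def fft(values):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return [complex(values[0])]
    even = fft(values[0::2])
    odd = fft(values[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def log_power_db(values):
    """Log power spectrum, in dB, of a real-valued sequence."""
    return [10.0 * math.log10(abs(c) ** 2 + 1e-12) for c in fft(values)]


def cepstral_peak_prominence(frame, sample_rate, f0_min=60.0, f0_max=300.0):
    """Return (CPP in dB, quefrency of the peak in seconds) for one frame."""
    n = len(frame)
    if n < 2 or n & (n - 1):
        raise ValueError("frame length must be a power of two")
    # Hann window to limit spectral leakage (an assumption of this sketch).
    windowed = [s * (0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    # "A log power spectrum of a log power spectrum."
    cepstrum = log_power_db(log_power_db(windowed))
    # Bin q of the cepstrum corresponds to a quefrency of q / sample_rate s.
    lo = int(sample_rate / f0_max)
    hi = min(int(sample_rate / f0_min), n // 2 - 1)
    peak = max(range(lo, hi + 1), key=lambda q: cepstrum[q])
    # Least-squares regression line over quefrencies from 1 ms upward.
    qs = range(int(0.001 * sample_rate), n // 2)
    mean_q = sum(qs) / len(qs)
    mean_c = sum(cepstrum[q] for q in qs) / len(qs)
    slope = (sum((q - mean_q) * (cepstrum[q] - mean_c) for q in qs)
             / sum((q - mean_q) ** 2 for q in qs))
    line_at_peak = mean_c + slope * (peak - mean_q)
    # CPP: height of the cepstral peak above the regression line.
    return cepstrum[peak] - line_at_peak, peak / sample_rate
```

For a strongly harmonic frame the peak appears near the fundamental period (for example, about 5 ms for a 200 Hz voice) and stands well above the line, whereas for a noise-like frame the largest value in the search range is only slightly above it. A smoothed variant in the spirit of CPPS would average the cepstra of neighboring frames and neighboring quefrency bins before the peak search.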
Their survey was based on a meticulous and methodical review of acoustic measurements, providing a
strong theoretical framework to use CPPS as a valid and reliable acoustic tool to measure dysphonia. Titze’s emphasis on building acoustic databases42 motivated us to gather acoustic data to differentiate between normal and dysphonic groups based on an accurate method. The purpose of this study was to make a cepstral comparison between normal and pathologic voice qualities in Iranian adults.

METHODOLOGY

In this cross-sectional study, a convenience sample of 200 participants aged between 20 and 50 years was selected. Half of this population was dysphonic patients (50 males and 50 females). Similarly, the control group consisted of 50 males and 50 females. Several factors were considered in the selection of the control participants, who were native Persian speakers with a normal voice. In particular, the subjects did not report any upper respiratory tract infection in the last 3 weeks and had no recent history of laryngitis, speech or auditory problems, or surgery of the oropharynx and larynx. Furthermore, the subjects were in good health, had never had any professional training in singing, were all nonsmokers, were free of neurologic diseases, and had no history of voice disorder within the last 5 years. Moreover, the female participants were not in their menstrual period at the time of sampling. Subsequently, an otorhinolaryngologist and a speech and language pathologist evaluated the subjects’ voices by using a comprehensive voice assessment form. This form covered medical history, oral examination, and respiratory assessment. For the dysphonic group, a comprehensive otolaryngologic examination was performed using clinical and videostroboscopic assessment. All the subjects gave their written consent to participate in the study. 
The selected participants stood in an upright position in a sound-insulated room in the Voice Research Laboratory of the University of Social Welfare and Rehabilitation Sciences, where a high-quality unidirectional dynamic microphone (SHURE Prolog SM58, Niles, IL, USA) was placed about 15 cm from their mouths, off-center, to record their voices. They were asked to read a standardized written text at a comfortable loudness and pitch. They were also asked to sustain the vowel /a/ for 5 seconds. To familiarize them with the process, they were allowed to practice a few times before the recording was started. The whole procedure was repeated three times. The “Speech Tool” software (James Hillenbrand, Western Michigan University, Kalamazoo, MI, USA), which uses the Hillenbrand algorithm and is available at http://homepages.wmich.edu/~hillenbr/, was used for voice analysis. The data were analyzed using SPSS Statistics software for Windows (SPSS Corp., Chicago, IL., USA). To assess the normality of distributions, a one-sample Kolmogorov-Smirnov test was applied. In the case of a normal distribution, the statistical differences between the normal and dysphonic participants in the male and female groups were examined using the parametric independent samples t test. In the case of a non-normal distribution, the nonparametric Mann-Whitney U test was used. The significance level was set at P < 0.05.
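To make the nonparametric comparison concrete, the following Python sketch computes the quantities that such a test typically reports: the mean rank of each group, the U statistic, and a Z value from the normal approximation. It is a minimal illustration written for this text, not the SPSS implementation; the function names are hypothetical, and the Z value omits the tie and continuity corrections that statistical packages may apply.

```python
import math


def midranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def mann_whitney(group_a, group_b):
    """Mean ranks, U for group_a, and Z (normal approximation, no tie correction)."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = midranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n_a])
    rank_sum_b = sum(ranks[n_a:])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2.0
    mean_u = n_a * n_b / 2.0
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12.0)
    return {
        "mean_rank_a": rank_sum_a / n_a,
        "mean_rank_b": rank_sum_b / n_b,
        "u_a": u_a,
        "z": (u_a - mean_u) / sd_u,
    }
```

Because the test works on ranks rather than on the measured values, the group summaries it yields are mean ranks, not means in dB.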
ARTICLE IN PRESS Arezoo Hasanvand, et al
Cepstral Analysis of Adult Iranian Voice Qualities
TABLE 1. Results of CPPS for the Male Groups

                  Control (n = 100)   Dysphonic (n = 100)
                  Mean rank           Mean rank             Z       P value†
CPPS /a/ (dB)     99.94               26.62                 9.744   0.000
CPPS RT* (dB)     100.32              25.86                 9.895   0.000

* Read text. † P < 0.05 indicated significant difference.
Abbreviation: CPPS, cepstral peak prominence-smoothed.
RESULTS

Of the 200 voice samples analyzed in the study, 100 samples were dysphonic. As presented in Table 1, the mean ranks of CPPS for the dysphonic and normal male voices were 26.62 and 99.94, respectively, in the sustained vowel task and 25.86 and 100.32, respectively, in the text reading task. The difference in CPPS between the normal and dysphonic male voices was statistically significant in both tasks (sustained vowel /a/ and text reading) (Figures 1–2). Similarly, the CPP values in the sustained vowel and text reading tasks for the male groups are shown in Table 2. Although there was a significant difference between the normal and dysphonic male groups in text reading, no significant difference was found for CPP in the sustained vowel task (Figure 3). In the text reading task, the Iranian dysphonic males had a significantly lower CPP (11.386 dB) compared to the control group (18.713 dB), as shown in Figure 4 and Table 2. Levene’s test for homogeneity of variances was significant (F = 10.599, P = 0.001). Likewise, CPP and CPPS were computed for both dysphonic and control female groups in the sustained vowel and text reading tasks, and the results are shown in Tables 3 and 4. As illustrated in Figures 5–8, the differences in CPP and CPPS in both tasks were statistically significant with P < 0.05 using
FIGURE 1. Cepstral peak prominence-smoothed (CPPS) differences between dysphonic and nondysphonic males in sustained vowel.
FIGURE 2. Cepstral peak prominence-smoothed (CPPS) differences between dysphonic and nondysphonic males in connected speech task.
TABLE 2. Result of CPP for the Male Groups

                 Control (n = 100)     Dysphonic (n = 100)
                 Mean      SD          Mean      SD          P value†,‡
Age (year)       41.050    2.189       38.120    1.680
CPP /a/ (dB)     12.716    1.130       12.625    0.981       0.630
CPP RT* (dB)     18.713    3.273       11.386    2.117       0.000

* Read text. † Student’s t test. ‡ P < 0.05 indicated significant difference.
Abbreviations: CPP, cepstral peak prominence; SD, standard deviation.
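A group comparison of this kind can be recomputed from summary statistics alone. The Python sketch below implements the unequal-variance (Welch) form of the independent samples t statistic, which is the variant usually read when a homogeneity-of-variance test is significant; whether this exact variant was the one reported in the table is an assumption. The function name and the group size used in the usage comment are likewise illustrative assumptions.

```python
import math


def welch_t(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Welch's t statistic and its approximate degrees of freedom,
    computed from group means, standard deviations, and sizes."""
    var_of_mean_1 = sd_1 ** 2 / n_1
    var_of_mean_2 = sd_2 ** 2 / n_2
    t = (mean_1 - mean_2) / math.sqrt(var_of_mean_1 + var_of_mean_2)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (var_of_mean_1 + var_of_mean_2) ** 2 / (
        var_of_mean_1 ** 2 / (n_1 - 1) + var_of_mean_2 ** 2 / (n_2 - 1)
    )
    return t, df


# Illustrative call with the CPP RT means and SDs of the male groups;
# the group size of 50 per group is an assumption taken from the Methods.
# t, df = welch_t(18.713, 3.273, 50, 11.386, 2.117, 50)
```

With equal group sizes and equal standard deviations, the degrees of freedom reduce to those of the classical Student test, so the two variants differ only when the group variances differ.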
FIGURE 3. Cepstral peak prominence (CPP) differences between dysphonic and nondysphonic males in sustained vowel.
FIGURE 4. Cepstral peak prominence (CPP) differences between dysphonic and nondysphonic males in connected speech.
the Mann-Whitney U test and t test. As predicted, the dysphonic females had significantly lower CPP and CPPS than the control group in the sustained vowel and text reading tasks. Again, Levene’s test for homogeneity of variances was significant (F = 9.455, P = 0.003).
TABLE 3. Result of CPP and CPPS in Text Reading Task for the Female Groups

                  Control (n = 100)     Dysphonic (n = 100)
                  Mean      SD          Mean      SD          P value†
Age (Year)        43.103    2.271       44.231    1.908
CPP RT* (dB)      16.975    2.978       9.583     2.045       0.000‡
CPPS RT* (dB)     5.411     1.204       2.505     0.777       0.000‡
FIGURE 5. Cepstral peak prominence (CPP) differences between dysphonic and nondysphonic females in connected speech.

DISCUSSION

Voice as a multidimensional phenomenon can be described effectively by measures that provide us with a better understanding of clinical dimensions. Acoustic voice analysis has been widely used to quantify voice characteristics because of its noninvasiveness and repeatability.43,44 There are numerous acoustic parameters to evaluate voice characteristics, all of which are typically based on frequency and amplitude perturbation measurements.45,46 Nonetheless, assessment of the severity of dysphonia and tracking of treatment outcomes should be based on both sustained vowels and continuous speech to improve ecological validity.18 With the move from time-domain to frequency-domain analysis, cepstrum-based measures were introduced as robust ways to estimate the periodicity of harmonics relative to the overall energy in a signal.41 Overall energy is represented in cepstral peaks and, therefore, better harmonic
* Read text. † Student’s t test. ‡ P < 0.05 indicated significant difference. Abbreviations: CPP, cepstral peak prominence; CPPS, cepstral peak prominence-smoothed; SD, standard deviation.
TABLE 4. Results of CPP and CPPS in Sustained Vowel /a/ for the Female Groups

                  Control (n = 100)   Dysphonic (n = 100)
                  Mean rank           Mean rank             Z       P value
CPP /a/ (dB)      91.87               42.76                 6.526   0.000
CPPS /a/ (dB)     100.50              25.50                 9.967   0.000

Abbreviations: CPP, cepstral peak prominence; CPPS, cepstral peak prominence-smoothed.
FIGURE 6. Cepstral peak prominence-smoothed (CPPS) differences between dysphonic and nondysphonic females in connected speech.
FIGURE 7. Cepstral peak prominence (CPP) differences between dysphonic and nondysphonic females in sustained vowel.

FIGURE 8. Cepstral peak prominence-smoothed (CPPS) differences between dysphonic and nondysphonic females in sustained vowel.

structures show higher CPP values relative to the background noise, whether the noise arises from hoarseness or breathiness.35 Considering the advantages of cepstrum analysis, an optimal evaluation is achieved by comparing normal and dysphonic voices in connected speech, which is more representative of how the voice is used on a regular basis. In this study, normal voices, with their well-defined harmonic structure, produced a stronger cepstral peak than breathy and hoarse voices, whose harmonic structure is poorly defined. Furthermore, a reduction in the cepstral peak was hypothesized for the speakers with vocal lesions compared to the normal controls, because benign localized lesions of the vocal folds cause hoarse and breathy voices.7,19 Indeed, CPP and CPPS values were lower in the pathologic condition than in the control group in the current study, which could be caused by a glottic chink in the clinical group, resulting in a flat harmonic structure. Thus, CPP and CPPS values tended to decrease as the estimated noise level increased.7 The lower the CPP and CPPS values are, the more abnormal the voices appear. The cepstral peak represents the most harmonic part of the signal, that is, the energy that stands out above the background noise; this measure was therefore lower in the clinical group.19

The results of the CPP and CPPS analyses in the present study point to an influence of gender on voice characteristics in the sustained vowel task. This finding was concordant with the previous studies.40,47,48 Although the male (normal and dysphonic) participants showed no significant difference in CPP values in the sustained vowel and thus had better overall voice qualities compared to the female groups, the findings in the normal and dysphonic female groups revealed significant differences in CPP. This might be related to the presence of a posterior phonatory gap as a common sign in the normal Iranian female population.49 As noted by Klatt and Klatt,50 a posterior phonatory gap produces a breathy voice quality that is perceived as aspiration noise. This noise, which stems from a glottal closure pattern in women that differs from that of men, reduces harmonic components and consequently plays an important role in discriminating between men’s and women’s voices. Therefore, in the Iranian population, women are more prone to have a weaker or breathy voice with probably less amplitude than men. In this study, both CPP and CPPS values were lower in the dysphonic female group than in their normal counterparts, whereas in Brinca et al’s study,51 the CPPS values were lower only in connected speech. 
Such differences may be due to the different aspects of different languages as well as the effects of different sociocultural or psychological factors on voice characteristics.52 The findings obtained from this study are congruent with the results of several other studies,7,19,33,34,46,52–55 especially that of Maryn et al,41 who concluded that CPP and CPPS were suitable for connected speech analysis and that CPPS was sensitive in both connected speech and vowels. This finding suggests that this cepstral measure (CPPS) can be used as the most robust acoustic measure for all grades of dysphonia. The findings of the present study were in line with Heman-Ackah’s cutoff point for dysphonia.39 In the present study, almost all the normal-voiced participants had high CPPS values. Therefore, the efficacy of a given treatment procedure can be tracked by the use of this measurement, which enables the voice clinician to assess voice characteristics in speakers with vocal pathologies. The findings suggest that, based on cepstral analysis, speakers with vocal lesions exhibit clinical evidence of voice disorders. The existence of a benign localized lesion in the clinical group’s vocal folds led to deviations in their voices and thus to lower CPP and CPPS values in the speakers with vocal pathologies. After gathering the data, a comparison of CPP and CPPS in the normal and dysphonic groups guided us to a better understanding of the acoustic characteristics of voice in the Iranian population. Nevertheless, these findings could be combined with other diagnostic tools for voice disorders to cover all aspects of
the underlying voice production. In addition, visualization studies are needed to unfold the real link between cepstral data and phonatory correlates. To expand the usefulness of the findings for the assessment of all population groups using CPP and CPPS, data should be gathered in the Iranian population with different pathologic voices to provide a cutoff point between normal and pathological voices in Persian speakers. Moreover, given the fast growth of the older adult population in Iran, there is a need to collect data on the aging voice, especially in people who are older than 50 years or of menopausal age, by using a robust and sensitive instrument for all types of voices. It would also be worthwhile to study these parameters in children.

CONCLUSION

Cepstral measures can provide the information required to quantify voice quality. The results of the present study provide a baseline database that can be used for other acoustic studies in the Persian-speaking population. In the Persian language, CPP and CPPS are appropriate parameters to differentiate between normal and dysphonic voices of different severities. CPPS is the more suitable measure for sustained phonation, whereas both CPP and CPPS are suitable for analyzing connected speech. Although CPPS is suitable for the analysis of both sustained vowel and connected speech tasks in the Persian language, repeating this study is suggested in other populations with different languages. These data can be used as a reference to assess normal versus various pathologic voices in all the Persian-speaking countries.

Acknowledgment

The authors would like to express their gratitude to Dr. James Hillenbrand for making his cepstral analysis software available to them.

REFERENCES

1. Brackett IP. Parameters of voice quality. In: Travis LE, ed. Handbook of Speech Pathology and Audiology. New York, NY: Prentice-Hall, Inc.; 1971:441–464.
2. Kent R. Hearing and believing: some limits to the auditory-perceptual assessment of speech and voice disorders. Am J Speech Lang Pathol. 1996;5:7–23.
3. Kreiman J, Gerratt BR, Precoda K, et al. Individual differences in voice quality perception. J Speech Hear Res. 1992;35:512–520.
4. Kreiman J, Gerratt BR, Precoda K. Listener experience and perception of voice quality. J Speech Hear Res. 1990;33:103–115.
5. Kreiman J, Gerratt BR. Sources of listener disagreement in voice quality assessment. J Acoust Soc Am. 2000;108:1867–1876.
6. Lowell SY, Colton RH, Kelley RT, et al. Predictive value and discriminant capacity of cepstral- and spectral-based measures during continuous speech. J Voice. 2013;27:393–400.
7. Heman-Ackah YD, Michael DD, Goding GS. The relationship between cepstral peak prominence and selected parameters of dysphonia. J Voice. 2002;16:20–27.
8. Buder EH. Acoustic analysis of voice quality: a tabulation of algorithms 1902–1990. In: Voice Quality Measurement. 2000:119–244.
9. Parsa V, Jamieson DG. Acoustic discrimination of pathological voice: sustained vowels versus continuous speech. J Speech Lang Hear Res. 2001;44:327–339.
10. Portney LG, Watkins MP. Foundations of Clinical Research: Applications to Practice. Upper Saddle River, NJ: Prentice Hall; 2000.
11. Maryn Y, Corthals P, Van Cauwenberge P, et al. Toward improved ecological validity in the acoustic measurement of overall voice quality: combining continuous speech and sustained vowels. J Voice. 2010;24:540–555.
12. García MJV, Cobeta I, Martín G, et al. Acoustic analysis of voice in Huntington’s disease patients. J Voice. 2011;25:208–217.
13. Ma EP-M, Yiu EM-L. Multiparametric evaluation of dysphonic severity. J Voice. 2006;20:380–390.
14. Niebudek-Bogusz E, Fiszer M, Kotylo P, et al. Diagnostic value of voice acoustic analysis in assessment of occupational voice pathologies in teachers. Logoped Phoniatr Vocol. 2006;31:100–106.
15. Olszewski AE, Shen L, Jiang JJ. Objective methods of sample selection in acoustic analysis of voice. Ann Otol Rhinol Laryngol. 2011;120:155–161.
16. Hillenbrand J, Cleveland RA, Erickson RL. Acoustic correlates of breathy vocal quality. J Speech Lang Hear Res. 1994;37:769–778.
17. Awan SN, Roy N. Acoustic prediction of voice type in women with functional dysphonia. J Voice. 2005;19:268–282.
18. Maryn Y, De Bodt M, Roy N. The Acoustic Voice Quality Index: toward improved treatment outcomes assessment in voice disorders. J Commun Disord. 2010;43:161–174.
19. Kumar BR, Bhat JS, Prasad N. Cepstral analysis of voice in persons with vocal nodules. J Voice. 2010;24:651–653.
20. De Bodt M. A framework of voice assessment: the relation between subjective and objective parameters in the judgement of normal and pathological voice [unpublished doctoral dissertation]. Antwerp: University of Antwerp; 1997.
21. Kreiman J, Gerratt BR. Perception of aperiodicity in pathological voice. J Acoust Soc Am. 2005;117:2201–2211.
22. Titze IR. Workshop on Acoustic Voice Analysis: Summary Statement. Iowa City: National Center for Voice and Speech; 1995.
23. Zraick RI, Wendel K, Smith-Olinde L. The effect of speaking task on perceptual judgment of the severity of dysphonic voice. J Voice. 2005;19:574–581.
24. Askenfelt AG, Hammarberg B. Speech waveform perturbation analysis: a perceptual-acoustical comparison of seven measures. J Speech Lang Hear Res. 1986;29:50–64.
25. Carding P, Steen I, Webb A, et al. The reliability and sensitivity to change of acoustic measures of voice quality. Clin Otolaryngol Allied Sci. 2004;29:538–544.
26. Awan SN, Roy N. Toward the development of an objective index of dysphonia severity: a four-factor acoustic model. Clin Linguist Phon. 2006;20:35–49.
27. Wuyts FL, De Bodt MS, Molenberghs G, et al. The Dysphonia Severity Index: an objective measure of vocal quality based on a multiparameter approach. J Speech Lang Hear Res. 2000;43:796–809.
28. Yu P, Ouaknine M, Revis J, et al. Objective voice analysis for dysphonic patients: a multiparametric protocol including acoustic and aerodynamic measurements. J Voice. 2001;15:529–542.
29. Heman-Ackah YD, Michael DD, Baroody MM, et al. Cepstral peak prominence: a more reliable measure of dysphonia. Ann Otol Rhinol Laryngol. 2003;112:324–333.
30. Halberstam B. Acoustic and perceptual parameters relating to connected speech are more reliable measures of hoarseness than parameters relating to sustained vowels. ORL J Otorhinolaryngol Relat Spec. 2004;66:70–73.
31. Heman-Ackah YD. Reliability of calculating the cepstral peak without linear regression analysis. J Voice. 2004;18:203–208.
32. Hartl DM, Hans S, Vaissière J, et al. Objective acoustic and aerodynamic measures of breathiness in paralytic dysphonia. Eur Arch Otorhinolaryngol. 2003;260:175–182.
33. Eadie TL, Baylor CR. The effect of perceptual training on inexperienced listeners’ judgments of dysphonic voice. J Voice. 2006;20:527–544.
34. Hillenbrand J, Houde RA. Acoustic correlates of breathy vocal quality: dysphonic voices and continuous speech. J Speech Lang Hear Res. 1996;39:311–321.
35. Balasubramanium RK, Shastry A, Singh M, et al. Cepstral characteristics of voice in Indian female classical carnatic singers. J Voice. 2015;29:693–695.
36. Noll AM. Short-time spectrum and “cepstrum” techniques for vocal-pitch detection. J Acoust Soc Am. 1964;36:296–302.
37. Noll AM. Cepstrum pitch determination. J Acoust Soc Am. 1967;41:293–309.
38. Maryn Y, Dick C, Vandenbruaene C, et al. Spectral, cepstral, and multivariate exploration of tracheoesophageal voice quality in continuous speech and sustained vowels. Laryngoscope. 2009;119:2384–2394.
39. Heman-Ackah YD, Sataloff RT, Laureyns G, et al. Quantifying the cepstral peak prominence, a measure of dysphonia. J Voice. 2014;28:783–788.
40. Balasubramanium RK, Bhat JS, Fahim S, et al. Cepstral analysis of voice in unilateral adductor vocal fold palsy. J Voice. 2011;25:326–329.
41. Maryn Y, Roy N, De Bodt M, et al. Acoustic measurement of overall voice quality: a meta-analysis. J Acoust Soc Am. 2009;126:2619–2634.
42. Titze IR. Toward standards in acoustic analysis of voice. J Voice. 1994;8:1–7.
43. Maryn Y, Weenink D. Objective dysphonia measures in the program Praat: smoothed cepstral peak prominence and acoustic voice quality index. J Voice. 2015;29:35–43.
44. Tatar EC, Sahin M, Demiral D, et al. Normative values of voice analysis parameters with respect to menstrual cycle in healthy adult Turkish women. J Voice. 2015;30:322–328.
45. Awan SN, Giovinco A, Owens J. Effects of vocal intensity and vowel type on cepstral analysis of voice. J Voice. 2012;26:670.e15–e20.
46. Moers C, Möbius B, Rosanowski F, et al. Vowel- and text-based cepstral analysis of chronic hoarseness. J Voice. 2012;26:416–424.
47. Awan SN. Analysis of Dysphonia in Speech and Voice: An Application Guide. Montvale, NJ: KayPENTAX; 2011.
48. Garrett R. Cepstral- and spectral-based acoustic measures of normal voices [theses and dissertations]. Milwaukee, WI: University of Wisconsin-Milwaukee. 217 p. 2013.
49. Khoddami SM, Mehri A, Jahani Y. The role of sex in glottic closure pattern in people with normal voice. Audiology. 2011;20:64–72.
50. Klatt DH, Klatt LC. Analysis, synthesis, and perception of voice quality variations among female and male talkers. J Acoust Soc Am. 1990;87:820–857.
51. Brinca LF, Batista APF, Tavares AI, et al. Use of cepstral analyses for differentiating normal from dysphonic voices: a comparative study of connected speech versus sustained vowel in European Portuguese female speakers. J Voice. 2014;28:282–286.
52. Samareh Y. Phonetic of Persian Language. 9th ed. Tehran, Iran: Iran University Publishers; 1931.
53. Alpan A, Schoentgen J, Maryn Y, et al. Assessment of disordered voice via the first rahmonic. Speech Commun. 2012;54:655–663.
54. Watts CR, Awan SN. Use of spectral/cepstral analyses for differentiating normal from hypofunctional voices in sustained vowel and continuous speech contexts. J Speech Lang Hear Res. 2011;54:1525–1537.
55. Lowell SY, Colton RH, Kelley RT, et al. Spectral- and cepstral-based measures during continuous speech: capacity to distinguish dysphonia and consistency within a speaker. J Voice. 2011;25:e223–e232.