Acoustic and Perceptual Appraisal of Vocal Gestures in the Female Classical Voice


Dianna T. Kenny and Helen F. Mitchell
New South Wales, Australia

Summary: Long-term average spectra (LTAS) have identified features in the sounds of singers and have compared different vocal qualities based on energy changes that occur during different vocal tasks. In this study, we compared the perceptual ratings of vocal quality of expert pedagogues with acoustic measures performed on LTAS. Fifteen expert judges rated 24 samples with six repeats of six advanced singing students under two conditions: “optimal” (O), which represented the application of the maximal open throat technique; and “suboptimal” (SO), which represented the application of the reduced open throat technique. LTAS were performed on each singing sample, and two conventional assessments of peak energy height [singing power ratio (SPR)] and peak area [energy ratio (ER)] were calculated on each LTAS. Perceptual scores, SPR, and ER were rank ordered. We then compared perceptual rankings with rankings of acoustic measures (SPR and ER) to assess whether these acoustic measurements matched the perceptual judgments of vocal quality. Although we found the expected significant relationship between SPR and ER, there was no relationship between perceptual ratings of vocal samples or singers based on SPR or ER. These findings suggest that because LTAS measures are not consistent with perceptual ratings of vocal quality, such measurements cannot define a voice of quality. Future research with LTAS to assess vocal quality should consider alternative measures that are more sensitive to subtle differences in vocal parameters. Key Words: Long-term average spectra—Perceptual rankings—Voice quality.

Accepted for publication December 9, 2004. From the Australian Centre for Applied Research in Music Performance (ACARMP), Sydney Conservatorium of Music, The University of Sydney, New South Wales, Australia. Address correspondence and reprint requests to Dianna Kenny, Australian Centre for Applied Research in Music Performance (ACARMP), The Conservatorium of Music, The University of Sydney, New South Wales, Australia 2006.
E-mail: [email protected]
Journal of Voice, Vol. 20, No. 1, pp. 55–70
0892-1997/$32.00 © 2006 The Voice Foundation
doi:10.1016/j.jvoice.2004.12.002

INTRODUCTION

What qualities of the singing voice define good vocal quality and what constitute its specific vocal features? This question has come into sharp focus since the advent of acoustic analysis. Few empirical studies classify good sound quality in singing, although perceptual studies have attempted to describe the features of good singing1–3 and link these to acoustic features of voice. Expert listeners are generally reliable in their judgments of overall quality. Ratings of good and poor performance are correlated most strongly with


tone quality and intonation.1,2,4 Strong correlations have also been observed among different descriptors that assess quality, such as color/warmth, resonance/ring, clarity/focus, and appropriate vibrato, which indicate that these factors converge on the same underlying construct of overall vocal quality. Similar findings were reported in Merritt et al,5 where listeners matched their perception of performance anxiety of speakers with a variety of vocal and physical features. The ratings on the separate vocal and performance features (eg, physical ease, presence, gesture, eye contact, pace, clarity, and eye contact) all intercorrelated highly. Rating the separate features added nothing to the overall rating of the total performance. It seems that judges apply personal constructs to assist with their judgment strategies as they often cannot articulate the components of sound quality on which their judgments were based.6 Vocal pedagogues in the singing studio must assess each voice and devise a technical and aesthetic program to improve its basic sound. “Open throat” is one technical component taught in many modern singing studios, and pedagogues agree that it produces an identifiable quality in the sound, which is described as “even and consistent,” “balanced and coordinated,” “round,” and “warm.” Mitchell et al,4 in subsequent studies, compared the open throat technique (O) with a reduction of the technique described as suboptimal (SO) in the same advanced singing students and found that experts could identify the technique in 83% of samples.7,8 Applied research in voice must address the goals of singing pedagogy in assessing and refining a voice of quality and generate perceptually viable measures to systematically define voice. Ideally, we need exemplars of vocal quality and to find measurements that enable us to rank voices in accordance with acoustic measures. 
One key question addressed in this article is whether acoustic features are associated with perceptual preferences of overall quality. To date, research into the singing voice has described acoustic features of voice and its visual representation with few or no links to either perceptual preferences or pedagogy. Recent research has struggled to identify acoustic cues that attract the highest perceptual rankings from

expert listeners.2,3,9 The voice can be represented acoustically in many ways. In current singing research, long-term average spectra (LTAS) are widely applied to represent timbre and vocal features, both in speech and in singing. An LTAS gives an overall impression of an entire excerpt by identifying certain consistent features contained in the sound over time.10 In singing samples, it averages out short-term variations in phonetic structure11 and is a measure of the decibel level of the time-average of the power of the acoustic signal at each frequency.

A conventional way of assessing LTAS is to reduce the information it contains to a single meaningful number, by computing the ratio of energies in a low- and a high-frequency band.12,13 In singing, measures of spectra compare energy peak height [singing power ratio (SPR)] and peak area [energy ratio (ER)] between 0–2 and 2–4 kHz. The difference between the height of the major peaks between 0–2 and 2–4 kHz12 quantifies the relative energy between 2 and 4 kHz, although it does not account for the shape of these energy peaks. Assessing the areas under the LTAS curve at 0–2 and 2–4 kHz, rather than at the highest peaks of energy, also identifies the areas in which the energy is reinforced or reduced.13 Both SPR and ER enable effective comparison between individual singers and between groups of singers.

LTAS exemplars of voice types14 and singing genres15,16 are available. LTAS curves have also differentiated male and female voices,17 singers and speakers,18 solo voice and choral voice,19 and pop or country from opera singers15,16 based on the changes in energy distribution necessary for these different vocal tasks. In singing literature, particular LTAS models or exemplars are associated with particular vocal qualities; high-range energy within the LTAS identifies voices that produce a “ringing quality”2,3 or carrying power or amplification over an orchestra,9,13,14 which is considered essential to operatic voices.
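The definition of LTAS given above (the time-average of the power at each frequency, expressed in decibels) can be illustrated with a minimal Python sketch. It is not the analysis procedure used in this study. It assumes a mono signal held as a list of floats, non-overlapping Hann-windowed frames, and a naive discrete Fourier transform; the function names `ltas_db` and `bin_frequencies` and the 64-sample frame length are hypothetical choices made only for illustration.

```python
import cmath
import math


def ltas_db(samples, frame_len=64):
    """Average the power spectrum over consecutive frames; return dB per bin."""
    # Periodic Hann window limits leakage between neighbouring bins.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1
    power_sum = [0.0] * n_bins
    n_frames = 0
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = [samples[start + n] * window[n] for n in range(frame_len)]
        for k in range(n_bins):
            # Naive DFT: adequate for a sketch; an FFT is the practical choice.
            coeff = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                        for n in range(frame_len))
            power_sum[k] += abs(coeff) ** 2
        n_frames += 1
    if n_frames == 0:
        raise ValueError("signal is shorter than one frame")
    eps = 1e-12  # assumption: small floor to avoid log10(0) in empty bins
    # Average the power over time first, then convert to decibels.
    return [10 * math.log10(p / n_frames + eps) for p in power_sum]


def bin_frequencies(sample_rate, frame_len=64):
    """Centre frequency in Hz of each bin returned by ltas_db."""
    return [k * sample_rate / frame_len for k in range(frame_len // 2 + 1)]


# Illustrative input only: a synthetic 1-kHz tone sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * n / sample_rate) for n in range(512)]
spectrum = ltas_db(tone)
freqs = bin_frequencies(sample_rate)
print(freqs[spectrum.index(max(spectrum))])
```

Because the power is averaged before the logarithm is taken, brief spectral events contribute in proportion to their energy rather than their level in decibels. The levels returned here are relative to digital full scale and would still need a calibration offset to be read as SPL.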
In Barnes et al,20 the LTAS provides information about the formant structure. However, it is premature to assert a connection between projection as demonstrated in LTAS and vocal quality, as Thorpe et al13 and Barnes et al20 have been the only studies on vocal projection in which the precise standard of the singing subjects has been specified and a homogeneous group of singers of national and international reputation have been used.21 However, with

conservatory students, Vurma and Ross9 reported no correlation between measurable carrying power and the perception of tone quality by experts. They also found no association between these measures and length of training. To date, such LTAS patterns have not been consistently associated with “good” or “beautiful” singing. Bartholomew22 associated “good voice quality” with high energy but did not resolve whether quality was associated with energy. Since Bartholomew22 suggested that male singers produced a specific energy around 2800 Hz, and Sundberg14 identified the “singer’s formant” as a critical component of operatic singing, the spectral distribution of the classical and operatic voice has been measured against the standards established by this pioneering research. From this research, the energy produced in an operatic voice to project over an orchestra is consistently associated with performers of the highest standards in operatic singing.13,20 Ekholm et al2 matched acoustic features to specific descriptors, but like earlier studies,22 they reported relationships between acoustic and descriptive features and presented exemplars of high energy only in the most highly rated voices.

LTAS have also been used to “validate” perceptual cues. For example, Howard et al23 and White24 conducted studies in which raters were asked to identify child voices as male or female. Both studies found typically different LTAS for each gender and similarities between the spectra of girls identified as boys and boys identified as girls. LTAS is therefore reliable in distinguishing male from female child and adult voices.23–25 However, this is a long way from determining vocal quality among like voices, particularly singing voices (eg, experienced adult female classical singers). A variation on the study type that applies perceptual ratings to validate acoustic findings uses acoustic parameters to predict qualities, such as emotion in voice.
These studies have found judges to be inconsistent and unreliable26 and/or to have a wide range of responses to the proposed emotional content of the music.27–30 Where listeners are consistent in identifying emotional content, stimuli have a tendency to be exaggerated, for example, in vibrato parameters,30 dynamics,31 or tone onset.32


Moreover, in a previous study, Mitchell and Kenny33 tested the energy distribution of singing with and without the use of open throat technique. Although visual inspection of LTAS illustrated small differences, conventional measures performed on these LTAS (SPR and ER) did not conclusively differentiate the high energy produced with and without open throat across all singers. There is no doubt that an operatic singer must produce energy in such a way as to amplify his/her voice over an orchestra, but the visual presentation of operatic vocal quality does not necessarily reflect its perception by expert listeners. Similarly, Titze et al34 found the acoustic output in “twangy” quality boosted energy around 3 kHz. If vocal qualities other than operatic also produce marked energy between 2 and 4 kHz, tone quality may not easily be demonstrated in the acoustic measure of LTAS. The timbral effects of the open throat technique have not been tested with respect to the LTAS parameters and rankings or overall perceptual rankings. Identifying trends in ER and SPR across all singers may result in an average that does not represent an actual voice and may not adequately represent the timbre of any individual voices. In this study, we tested whether the spectral measures made on LTAS in female classical singers could assess the difference in vocal quality between O and SO techniques and whether the ratings of pedagogues of overall quality matched LTAS.

METHOD

Participants

Listeners

Listeners were 15 experienced singing pedagogues, 12 women and 3 men, aged between 37 and 76 years, with a mean of 54 years. Six participants had a postgraduate qualification in singing, five had a diploma of music or singing, and three had a bachelor's degree in music. One cited extensive international performing experience as their qualification in singing. All had taught singing for 4 to 40 years, with an average of 20 years. Thirteen of fifteen participants taught singers in a Conservatorium of Music.
Overall, the singing studios comprised an average of 39.5% operatic students and 36.7% classical students. For 11 pedagogues, most of their studio comprised these two genres. Six pedagogues also taught


≥20% of musical theater students in their studio. Eleven taught a proportion of international and national level singers, nine at a big city or regional/touring level, and eight at the local community level. Participants were either known to the researchers via affiliations with key music centers in Australia or volunteered in response to an advertisement in a national singing organization newsletter. Participants were sent information about the project and were invited to take part in a perceptual study of singing technique. They were required to participate in a single listening session at a time and location convenient to them.

Before commencement, participants completed a questionnaire detailing their musical and teaching experience and current singing studio. Each participant was asked for information related to age, number of years teaching, and highest qualifications attained in music and/or singing. Participants were asked to classify their singing studio according to their students. Specifically, they gave the proportion of their private studio and studio at a musical conservatorium. Finally, participants were asked to classify their studio according to their primary singing genres (opera, contemporary music theater, musical theater, concert/oratorio/recital, choral) and the level at which their students performed (superstar, international, national/big city, regional/touring, local community, full-time students of singing, amateur) and estimate the percentage each played in the performing career of their students.21

Singers

Six female singers, three sopranos and three mezzo-sopranos, volunteered to participate in this project. They were advanced students, with excellent technique, of an experienced singing pedagogue, who is a lecturer in vocal studies and opera at a state Conservatorium of Music in Australia. It is the premier institution for musical education in the country and has produced singers of international repute.
Criteria for participant selection included singers who (1) had a good classical singing technique for their level of training and experience and (2) understood and demonstrated skillful control of “open throat” or “retraction” techniques in their singing.

Before the voice recording, participants completed a questionnaire seeking information on age, years of singing study, number of years of study with each singing teacher and highest qualifications attained or currently undertaken in music and/or singing, and singer type (soprano or mezzo-soprano). The participants were also asked to classify the genres of singing they performed in public (opera, classical, choral, music theater, and contemporary) and to estimate the percentage each style played in their total performing career. Singer participants were aged between 23 and 30 years, with a mean of 26 years. All had studied singing for at least 7 years (average 9.8 years) and had spent an average of 5 years studying with their present singing teacher. Each singer held a qualification in singing or music (four had Bachelor of Music degrees and two had diplomas, in music and/or singing) and five of six were currently undertaking a second degree in singing (three postgraduate Diploma of Opera and two Bachelor of Music degrees). All defined most of their singing as operatic (>50%), with the second most common style as classical (>20%), in accordance with the Bunch and Chapman21 taxonomy of singing voices. All reported that they were in good health and could perform the tasks.

Singers were sent information about the project and were invited to take part in an acoustic and perceptual study of singing technique. They were required to attend a single recording session lasting no more than an hour and were told that the object of the study was to investigate acoustical and perceptual features of the open throat in singing and to discover the sound qualities associated when a singer uses some form of open throat technique compared with when a singer does not use the open throat technique.

Singer protocol

A protocol was developed to assess the effect of “open throat” technique on singing.
Two musical tasks were chosen to investigate the application of the technique in two song excerpts. Before singing, each singer determined the sequence of their tasks by selecting a blank card, the reverse side of which represented one task, to reduce the possible effects of task order.

The musical tasks

Musical tasks were chosen to test different demands of good singing, but they were not musically difficult. They were designed to test the application of open throat and contained musical features derived from a previous qualitative study on the application of the technique,4 where application or lack of application of the technique was deemed to be particularly valuable or noticeable. These features were high tessitura, sustained or legato singing, dynamic range control, and vocal agility. The Mozart song Ridente la Calma, K 152, bars 1–27 (Figure 1A), was selected as it is a nominally simple song in the Italian language1 with a mixture of common musical statements involving repeated legato lines as well as the initial stylized leaps of a major 4 and short scale figures. All six singers sang this aria in the same key (F major). The third verse of the Schubert lied, Du bist die Ruh D. 776 (Op. 59, No. 3) (Figure 1B), bars 54 to 80, was chosen for its demanding vocal control, sustained musical line, and high climactic tessitura. The three sopranos and the three mezzo-sopranos sang this in an appropriate key depending on soprano or mezzo-soprano voices (E-flat, D-flat, and C major).

Experimental conditions

Singers sang each of the two song excerpts under three conditions: O, SO, and loud SO (LSO). O involved the maximal application of open throat. SO involved a reduced (open throat) technique but still with an acceptable singing technique and without consciously altering any other aspect of their technique. It was hypothesized, from interviews with pedagogues,4 that the SO condition would result in a reduction of sound pressure level (SPL), so a third condition, LSO, involved the same instruction as the SO condition, but with the added instruction that the singer should try to achieve a louder dynamic than in SO.
This condition addressed SPL as an additional variable to the SO condition and was similar to technical conditions established by Foulds-Elliott et al26 in her study of emotional connection. Each task was performed twice in the O and SO conditions, whereas the LSO condition was performed only once because of the slight possibility


of adverse effects on vocal health. In total, each singer performed each musical task five times.

Instructions

A pedagogue was present during the recording sessions to provide accompaniment for warm-ups and practice of the tasks where necessary and to instruct singers to achieve the required vocal postures for each experimental condition. For example, she instructed them to pay attention to producing the most open sound in their throat in the O condition, and to a lesser degree in the SO and LSO conditions. Some singers asked how to produce the LSO condition, and the pedagogue instructed them to “use more twang,” as taught in their lessons.35

Recording

Participants were given time to warm up in the singing studio and become familiar with the room before recording. Recording levels for each singer were set during this time. The voice was recorded with a high-quality microphone (AKG C-477, AKG Acoustics, Vienna, Austria) positioned on a head boom, a constant 7-cm distance from the lips of the singer. This placement ensured that the direct energy of the voices was recorded rather than room reflections, which enabled us to use a studio environment with low ambient noise rather than an anechoic studio.36 The signal was then amplified with a Behringer Ultragain preamplifier and digitally recorded to a CD recorder (Marantz CDR 630, Marantz Japan Inc., Kanagawa, Japan). Calibration was carried out in each recording by playing pink noise samples immediately after each recording session at the same recording gain applied for recording the voice. For calibration of absolute SPLs, a sound level meter (Rion NL-06 SPL, Rion Co. Ltd, Tokyo, Japan) was placed adjacent to the AKG microphone 7 cm from a speaker (Bose Lifestyle, Bose Corporation, Framingham, MA), from which the pink noise was played. The SPL shown on the sound level meter was noted for the pink noise signal and applied later for calibration.
Pink noise enables calibration of an audio playback system across the frequency range for tasks such as perceptual testing and comparative analysis on a computer. Pink noise is at least as good as any other steady-state known signal. A pure sine wave


FIGURE 1. A. We used bars 1 to 7 as perceptual stimuli of the Mozart song Ridente la Calma, K 152. B. We used bars 54 to 60 as perceptual stimuli of the Schubert lied, Du bist die Ruh D. 776 (Op. 59, No. 3).

tone is more susceptible to interference in the environment and hence would give less repeatable results. This data set was applied as a basis for the perceptual and acoustic evaluations.

Preparation of perceptual CDs

The audio recordings of the singers were digitally extracted from the CDs on a standard PC to wave audio format in stereo at a 16-bit, 44,100-Hz sample rate with Audiograbber software (www.audiograbber.com). This methodology ensured, as closely as possible, that the original recorded sound was played back to the judges; no filtering or normalization was applied, to minimize the effects of digital artifacts. The recorded pink noise for each singer was applied to equalize the peak levels of each sample to ensure that relative SPL for each singer was the same. The amplification tool in Cool Edit Pro 1.2 (www.adobe.com) calculated the SPL necessary to equalize the peak SPL of each sample, thus

making each recording relative to the level (in decibels) of the other samples produced on the day. The files were then edited in Cool Edit. We used the first 7 bars of Mozart and bars 54 to 60 of the Schubert musical tasks for the perceptual study. Each consisted of an entire musical phrase lasting around 20 seconds. Only samples with correct intonation were selected in the study. We used 24 CD tracks in the study: six singers singing Mozart O and SO and Schubert O and SO. CD tracks of the samples were presented in random order (generated at www.randomizer.org), and six additional repeat tracks were generated, three O and three SO, of which three were the Mozart task and three were Schubert.

Pilot studies

Two pilot studies were conducted to consider the methodological issues discovered in this perceptual study. The first pilot study was conducted, with a single expert pedagogue, to determine whether the three experimental conditions performed by the singers were detectable in a perceptual study. The

samples of three subjects were presented in each singing condition (O, SO, and LSO) in random order. LSO compared with SO did not produce a sufficiently discernible voice quality to be correctly distinguished by an expert pedagogue. The pilot test listener marked all LSO samples as SO and remarked that it was not possible to differentiate between the two conditions, SO and LSO. As LSO was created as acoustic confirmation of SO, it is unreasonable to expect the human ear to detect so small a difference in timbre. The LSO condition was generated to ensure acoustic differences could be attributed to the open throat technique and not solely to loudness. The LSO condition was therefore deleted from the perceptual study.

The second pilot study addressed normalization of SPL of perceptual samples. Normalization reduces the possible effects of SPL as a variable. To test the perceptual difference in normalized samples, samples were normalized with the amplification tool in Cool Edit, saved, and then reduced back to their original SPL. Pairs of samples, each containing one normalized and returned and one original, were presented to two expert pedagogues. Pedagogues described a discernible difference in quality, saying, for example, “the overall recording quality is reduced” or the overall timbre had changed, with differences in the “woofy” and “tweety” balance in the recording. Although other perceptual studies have normalized SPL across stimuli,23 singing pedagogue subjects in Wapnick and Ekholm1 commented on the difficulty of evaluating voices not presented with realistic loudness. On the basis of the pilot study, we decided not to normalize samples but instead make them relative to known SPL, based on SPL at the recording.
Perceptual test

Procedure

The perceptual test was conducted in a quiet environment, and samples were played from a Sony CD Walkman (DEJ885W; Sony Corporation, Tokyo, Japan) via closed-back stereo monitoring headphones (Sennheiser HD 270, Sennheiser, Tullamore, Ireland). This setup enabled the study to be conducted in the singing studios of the participants. With headphones, it was possible to eliminate room


effects from the listening environment. This methodology was favored for optimal sound quality over playback through a computer sound card37 or sending tapes to participants.1,2 Only one listener took the test on any occasion, and each listener took the test once. Levels were checked before each test was conducted and set at a consistent and comfortable SPL for each subject. Before presentation of stimuli, participants were given information on the two singing conditions, O and SO, and were presented with the musical score of each musical task. They were asked to rate the overall vocal quality of each sample on a 10-point scale at a standard relevant to advanced singing students at a national Conservatorium of Music. They were also asked to give qualitative comments and recommendations about the sound quality. Each listener had a trial session with four samples, two O and two SO, in each musical task before starting the perceptual test.

Acoustic analysis

The audio recordings were digitized at a 16-kHz sample rate with Phog Version 2.0 (Hitech, Sweden) software and analyzed with Soundswell Version 4.0 (Hitech, Sweden), with A-weighting for SPL pink noise calibration and subject SPL measurement. LTAS analyses (bandwidth 300 Hz) were performed on the 24 musical task files.25,38 LTAS contained only voiced data.11,25 A 300-Hz LTAS bandwidth is less sensitive than a narrow-band analysis to movement in the partials when the singer is at higher fundamental frequencies. A complication with LTAS is to account for loudness, which is problematic as frequencies above 2 kHz increase faster than those below 2 kHz as SPL increases.39 To relate these LTAS plots to the known decibel of the calibration tone, an LTAS was performed on steady 5-second portions of the pink noise sample for each singer, with the same 300-Hz bandwidth. The Sect tool in Soundswell does not compute the overall equivalent level (Leq) for an LTAS.
Therefore, to calculate the absolute SPL of each calibration tone, it was necessary to find the mean SPL of the pink noise LTAS. Each point of the LTAS curve was linearized [$y = 10^{x/10}$], then all data points were summed, and this total was converted back to decibels [$x = 10 \log_{10}(y)$]. The mean SPL of


pink noise (in decibels down from full scale) was subtracted from the known SPL of the calibration noise measured during recording to produce the calibration offset. This calibration offset (in decibels) was applied to each LTAS. In earlier research with LTAS methodology, White25 arrived at a “calibration offset” by calculating the mean SPL of a sung /i/ vowel by adding the peak level (in decibels) of the LTAS with the known level. Applying the LTAS area to calculate the mean SPL of pink noise was preferable to calculating the LTAS peak height and was considered a valid method for evaluating the overall level (in decibels) of the pink noise at 7 cm.

Acoustic analysis focused on two frequency areas: 0–2 kHz and 2–4 kHz. From the LTAS plots, the highest peaks in the 0–2-kHz and 2–4-kHz regions were labeled P1 and P2, respectively. Peak levels (in decibels) and peak center frequencies (in Hertz) were calculated for each singer in each task and condition. We used two measures to quantify these LTAS data.

The SPR, described by Omori et al,12 compares the peak levels of P1 and P2. The SPR is the difference between the level of P1 and P2 in decibels (LP1 – LP2) and is a measure of spectral attenuation between 0–2 kHz and 2–4 kHz. When the energy is focused in the 0–2-kHz region, it results in higher SPR results and there is typically less energy reinforcement in the 2–4-kHz region. A low SPR indicates a stronger energy peak >2 kHz. Although SPR does not prove the presence of the formant of a singer, it enables inter- and intrasinger comparison of spectral energy in the voice.12,40

The ER, described by Thorpe et al,13 compares overall energy between the area 0–2 kHz (A1) and the area 2–4 kHz (A2). It is calculated by taking the difference between average energy values for the two frequency areas (A1 – A2). A low ER represents a greater reinforcement in the 2–4-kHz region, whereas a high ER represents a smaller energy boost >2 kHz.
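The calibration offset, SPR, and ER described above can be sketched in a few lines of Python. This is an illustrative reading of the text, not the Soundswell procedure. It assumes an LTAS supplied as parallel lists of frequencies (Hz) and levels (dB), treats the "highest peak" in a band as the maximum level in that band, assigns a point at exactly 2 kHz to the upper band, and obtains band areas by summing linearized power as described for the pink noise; with equally spaced points and equal band widths this gives the same difference as averaging. All function names are hypothetical.

```python
import math


def band(freqs_hz, levels_db, lo_hz, hi_hz):
    """LTAS levels whose frequency lies in [lo_hz, hi_hz)."""
    return [lv for f, lv in zip(freqs_hz, levels_db) if lo_hz <= f < hi_hz]


def summed_level_db(levels_db):
    """Linearize each point (10^(x/10)), sum, and convert back to dB."""
    return 10 * math.log10(sum(10 ** (lv / 10) for lv in levels_db))


def calibration_offset_db(known_spl_db, noise_ltas_db):
    """Known SPL of the calibration noise minus its level in the recording."""
    return known_spl_db - summed_level_db(noise_ltas_db)


def singing_power_ratio(freqs_hz, levels_db):
    """SPR = L_P1 - L_P2: maximum level in 0-2 kHz minus maximum in 2-4 kHz."""
    p1 = max(band(freqs_hz, levels_db, 0, 2000))
    p2 = max(band(freqs_hz, levels_db, 2000, 4000))
    return p1 - p2


def energy_ratio(freqs_hz, levels_db):
    """ER = A1 - A2: summed level in 0-2 kHz minus summed level in 2-4 kHz."""
    a1 = summed_level_db(band(freqs_hz, levels_db, 0, 2000))
    a2 = summed_level_db(band(freqs_hz, levels_db, 2000, 4000))
    return a1 - a2


# Made-up LTAS points for illustration only (not data from this study).
freqs = [500, 1000, 1500, 2500, 3000, 3500]
levels = [60.0, 70.0, 65.0, 50.0, 60.0, 55.0]
print(singing_power_ratio(freqs, levels), energy_ratio(freqs, levels))
```

Both measures are differences of decibel values, so adding the same calibration offset to every LTAS point leaves SPR and ER unchanged; the offset matters only when absolute levels are compared.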
Areas below the LTAS curve were calculated in the same way as the calibration tone. When there is less reinforcement in the 2–4-kHz region, ER results follow SPR.

Study design

The reliability of the ratings of the judges was tested with the scores of the six repeats against the original scores. Intraclass correlation coefficients

TABLE 1. Ranked Mean Perceptual Scores for Each Singer With Standard Deviations in Each Musical Task (Mozart and Schubert) in Each Condition (O and SO)

Rank  Singer  Task      Condition  Mean  SD
1     5       Schubert  O          7.60  1.88
2     5       Mozart    O          7.07  1.49
3     1       Mozart    O          6.93  1.44
4     3       Mozart    O          6.93  2.19
5     1       Schubert  O          6.93  1.49
6     4       Schubert  O          6.80  1.93
7     4       Mozart    O          6.73  1.83
8     2       Schubert  O          6.67  1.72
9     3       Schubert  O          6.27  1.87
10    2       Mozart    O          5.80  1.37
11    6       Schubert  O          5.40  1.64
12    5       Schubert  SO         5.40  2.03
13    1       Mozart    SO         5.20  1.21
14    2       Schubert  SO         5.20  1.52
15    6       Mozart    O          4.67  1.54
16    5       Mozart    SO         4.40  1.59
17    3       Schubert  SO         4.13  1.64
18    6       Schubert  SO         3.87  1.25
19    1       Schubert  SO         3.73  1.75
20    3       Mozart    SO         3.67  1.29
21    4       Schubert  SO         3.53  1.64
22    2       Mozart    SO         3.47  1.41
23    4       Mozart    SO         3.47  1.30
24    6       Mozart    SO         2.67  1.23

(ICCs) were calculated with a two-way random effects model (absolute agreement). The study design was a repeated-measures [24 ratings per listener] randomized complete block with a 2 [Task (Mozart vs. Schubert)] × 2 [Condition (O and SO)] factorial structure. The main effects for task and condition and interaction effects (contrasts) were calculated for each dependent measure. Because the design of this study included four samples from each of the six singers, the 24 samples are potentially nonindependent, which would constitute a violation of the assumptions of correlation and other parametric tests. Before analyses of these data, each dependent measure was tested for serial dependency with autocorrelation. The criterion value for serial dependency was P > 0.404. Results indicated that each of the three measures was independent: perceptual ratings, P = 0.111; ER, P = 0.238; SPR, P = 0.151.
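For readers unfamiliar with the two-way random-effects, absolute-agreement ICC for single ratings, the following minimal Python sketch shows the standard mean-squares form of the coefficient. It is offered as an illustration under stated assumptions, not as the statistical software used here: each row is taken to be one rated item and each column one rating occasion (for example, the original and the repeat), and the function name `icc_2_1` and the demonstration scores are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rating.

    `ratings` is a list of rows; each row holds one item's scores,
    one score per rating occasion (column).
    """
    n = len(ratings)      # number of rated items
    k = len(ratings[0])   # number of ratings per item
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for items (rows), occasions (columns), and residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    # Absolute agreement: systematic column differences lower the coefficient.
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Made-up (original, repeat) scores on a 10-point scale, for illustration only.
demo = [[7, 7], [5, 6], [8, 8], [4, 4], [6, 5], [3, 3]]
print(round(icc_2_1(demo), 3))
```

Unlike a consistency-type coefficient, this form penalizes a listener who rates every repeat uniformly higher or lower than the original, which is why it suits a test of whether exact scores were reproduced.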


FIGURE 2. Highest ranked perceptual samples. LTAS, exemplars of the qualitative positive and negative descriptions of the sound quality of pedagogues, with details of rank, singer, task, condition, and perceptual score.
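To make the two LTAS summary measures concrete, the sketch below derives an SPR and an ER from a sampled LTAS curve. The definitions are assumptions, because the measurement procedure is described outside this section: SPR is taken as the highest peak level in 0–2 kHz minus the highest peak level in 2–4 kHz (in dB), and ER as the ratio of summed power in the same two bands (in dB). The function names and the six-bin spectrum are hypothetical.

```python
import math


def band_levels(freqs_hz, levels_db, lo_hz, hi_hz):
    """Return the LTAS levels (dB) whose frequencies fall in [lo_hz, hi_hz)."""
    return [lv for f, lv in zip(freqs_hz, levels_db) if lo_hz <= f < hi_hz]


def singing_power_ratio(freqs_hz, levels_db):
    """Assumed SPR: highest peak in 0-2 kHz minus highest peak in 2-4 kHz (dB)."""
    low_peak = max(band_levels(freqs_hz, levels_db, 0, 2000))
    high_peak = max(band_levels(freqs_hz, levels_db, 2000, 4000))
    return low_peak - high_peak


def energy_ratio(freqs_hz, levels_db):
    """Assumed ER: summed power in 0-2 kHz over summed power in 2-4 kHz (dB)."""
    # Convert dB levels to linear power before summing (equal bin widths assumed).
    low = sum(10 ** (lv / 10) for lv in band_levels(freqs_hz, levels_db, 0, 2000))
    high = sum(10 ** (lv / 10) for lv in band_levels(freqs_hz, levels_db, 2000, 4000))
    return 10 * math.log10(low / high)


# Hypothetical six-bin LTAS, not measured data.
freqs = [500, 1000, 1500, 2500, 3000, 3500]
levels = [80, 70, 60, 60, 50, 40]
print(singing_power_ratio(freqs, levels), energy_ratio(freqs, levels))
```

Under these assumed definitions, a smaller value of either measure means relatively more energy in the 2–4 kHz band.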

RESULTS

Descriptives
Mean rating scores and standard deviations, calculated by averaging all measured notes sung by each singer in each condition (O and SO) and each musical task (Mozart and Schubert), are presented in Table 1. Data are ranked by mean score.

Intrajudge reliability and test–retest reliability
Intraclass correlations with a two-way random effects model of absolute agreement, ICC(2,1), compared the exact scores of the repeated samples with the original ratings. For the O condition, ICC = 0.7843 (95% CI, 0.4822–0.9613), F = 9.52, P < 0.0001. For the SO condition, ICC = 0.7579 (95% CI, 0.4338–0.9558), F = 8.60, P < 0.0001; that is, listeners matched their exact rating in the first rendition and the repeat in nearly five of six cases. Therefore, judges were reliable in their absolute ratings.

Overall ratings
Overall mean scores by condition were significantly lower in SO than in O in both musical tasks. For mean score, there was a main effect for condition (F(1, 5) = 111.872, P < 0.001) but not for musical task (F(1, 5) = 1.497, P = 0.276); that is, ratings were significantly higher for O than for SO in both the Mozart and the Schubert tasks. There was no interaction between musical task and condition (F(1, 5) = 0.355, P = 0.577). Singer 5 was rated highest overall in both the Mozart and the Schubert tasks (7.07 and 7.60, respectively), and singers 1 and 3 were ranked equal third with a score of 6.93. For the O condition, the mean scores were 6.34 for Mozart and 6.61 for Schubert. Apart from one outlier (singer 6 in the Mozart task, 4.67), the O condition scores ranged from 5.40 to 7.60. For SO, singer 5 was rated highest for Schubert with a score of 5.40. For SO, the mean scores were 3.81 for Mozart and 4.31 for Schubert. SO scores ranged from 2.67 to 5.40. O samples were consistently rated higher than SO samples.

LTAS analysis
Figures 2–5 present LTAS performed on selected samples in the perceptual study. LTAS may provide acoustical interpretation of the perceptual ratings, or conversely, acoustic cues that influenced ratings may be evident in the LTAS. In their qualitative interpretation of each sample, listeners categorized their responses into positive and negative comments about the vocal and technical quality and suggestions for improvement. In addition to the LTAS, each figure provides the overall perceptual ranking, task, condition, and score of each sample. Finally, each figure presents exemplars of the qualitative responses given by listeners for these samples. These analyses were conducted to explore the perceptual responses to the individual samples. They illustrate the personal response of the listeners to the most important qualities contained in each sample: the voice, the technique, and the overall musical performance.

The LTAS plots of singer 5 (Figure 2) are markedly different from those of the other singers; that is, above 2 kHz, she produces a different distribution of energy. Singers 1 and 3 produce similar LTAS plots in both tasks and are still rated highly (Figure 3). Lower scoring O samples showed lower amplitudes below 3 kHz and the center frequency of energy above 2 kHz, P2, at a higher frequency. Midscoring samples included one O and three SO samples. Figure 4 presents these LTAS data. For lower scoring SO samples, all LTAS plots show less defined or less prominent peaks, particularly above 2 kHz. The lowest scoring SO samples (Figure 5) show a series of spectral peaks and a spectral roll-off with rising frequency, with no amplified peaks above 2 kHz.

FIGURE 3. Third-ranked perceptual samples. LTAS, exemplars of the qualitative positive and negative descriptions of the sound quality of pedagogues, with details of rank, singer, task, condition, and perceptual score.

[Figure 4 plot: LTAS curves, SPL (dB at 7 cm) against frequency (Hz), for Singer 6 Schubert O, Singer 5 Schubert SO, Singer 1 Mozart SO, and Singer 2 Schubert SO.]

Positive comments: "good foundation tone, voice is still building"; "good round sound"; "still solid core"; "a tone with roundness, depth"
Negative comments: "sound is constricted and unstable"; "lacks freedom"; "lacks vibrancy and warmth"; "little colour"

Rank  Singer  Task      Condition  Mean score  SD    Min  Max
11    6       Schubert  O          5.40        1.64  3    8
12    5       Schubert  SO         5.40        2.03  2    8
13    1       Mozart    SO         5.20        1.21  3    7
14    2       Schubert  SO         5.20        1.52  3    7

FIGURE 4. Mid-ranked perceptual samples. LTAS, exemplars of the qualitative positive and negative descriptions of the sound quality of pedagogues, with details of rank, singer, task, condition, and perceptual score.

FIGURE 5. Lowest ranked perceptual samples. LTAS, exemplars of the qualitative positive and negative descriptions of the sound quality of pedagogues, with details of rank, singer, task, condition, and perceptual score.

Because spectral differences were unique to each singer, the SPR was calculated for each task and condition to compare the energy peaks in the 0–2 and 2–4 kHz bands. The SPR results of the perceptual samples are presented in Table 2, ranked from the best to the worst SPR values. Lower SPR results indicate more energy between 2 and 4 kHz. The SPR ranks did not correspond to the perceptual rankings. In this study, ER was calculated for each musical task in each condition for each subject. The ER data of the samples of each singer are presented in Table 3, ranked from the best to the worst ER. For each singer, the change in ER from O to SO corresponded to the change in SPR in both the Mozart and the Schubert tasks; that is, ER decreased with condition across singers. The rankings of ER did not correspond to perceptual rankings.

Relationships between the rankings
Table 4 presents the rankings of each singer and the respective musical task and experimental condition, ordered by perceptual rank from highest to lowest. Pearson correlation coefficients were calculated among the three dependent measures (perceptual rating, SPR, and ER). Table 5 shows that although there is a very highly significant relationship between ER and SPR (r = 0.960, as expected), there is no relationship between either of these acoustic measures and perceptual judgments. Spearman rho correlation coefficients were calculated for the rankings of samples and singers on each dependent measure. Table 6 displays these results. These data indicate that there was no relationship between perceptual ratings of samples or singers and ranked ER or SPR.

A series of ICCs, ICC(2,1), was calculated to determine the degree to which the judges' rankings of samples and singers were consistent with the rankings based on ER and SPR. In the first ICC, perceptual ratings were compared with ER for samples. The ICC was 0.012 (F = 0.987, P = 0.511), which indicated no consistency between the two rankings. When this test was completed for singers, the ICC of 0.029 (F = 0.972, P = 0.527) indicated no consistency between the two rankings. In the second ICC, perceptual ratings were compared with SPR for both samples and singers. The ICC for samples was 0.47 (F = 1.89, P = 0.07), which indicated moderate consistency between the two rankings. When this test was conducted for singers, the ICC was 0.029 (F = 0.972, P = 0.527), which indicated no consistency between the two rankings. The ICC between rankings of singers on ER and SPR was 0.6 (F = 2.5, P = 0.016), which indicated a significant relationship between the rankings of singers.
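The comparison between perceptual and acoustic orderings reported in this section can be sketched as follows: Pearson's r on the raw values, and Spearman's rho obtained by applying Pearson's r to the ranks, with tied values given their mean rank. This is a minimal illustration, not the authors' analysis script; the function names and the five pairs of values are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical mean ratings and one acoustic measure for five samples.
ratings = [7.5, 6.5, 5.5, 4.5, 3.0]
measure = [20.0, 13.0, 19.0, 18.0, 12.0]
print(round(pearson(ratings, measure), 3), round(spearman(ratings, measure), 3))
```

Handling ties by mean rank matters for data like Table 1, where several samples share the same mean score.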

DISCUSSION

In this study, we recorded six female singers who performed two songs with two different singing techniques: “open throat” (O) and “reduced open throat” (SO). As expected, the audio samples of the O production were judged by expert listeners to be sung with a better technique than the audio samples of the SO production. These perceptual rankings, however, did not match the measurements of two spectral parameters (SPR and ER, as measured on LTAS). We conclude that these two spectral parameters are not useful or relevant in assessing the vocal quality differences between “open throat” and “reduced open throat” singing techniques. Because the maximal open throat produced the vocal quality that expert listeners judged perceptually to be the highest overall, we tentatively argue that optimal open throat is key to vocal quality and that it is this aspect of vocal quality that has not been detected in LTAS.

Reliability of judges
Listeners were consistent in their ratings of overall quality, producing exactly the same score, on average, for five of the six repeated samples. There were no statistically significant differences between ratings of the two musical tasks. Overall preferences for particular singers were also remarkably similar, and judges ranked their favorite singers (5, 1, and 3) consistently higher than the others. Of these, singer 5 scored the highest overall ratings.

Role of perceptual studies in assessing voice
This study is a methodological improvement over previous twinned acoustic and perceptual studies.2,3,23 We reduced the information in LTAS to a single number to make it possible to compare intersinger differences across pairs of LTAS as well as intrasinger differences across pairs of LTAS.12,13,20,33 Neither ER nor SPR measurements were corroborated by the perceptual ratings of overall quality. LTAS may successfully differentiate between different qualities (opera versus country, pop, or speech), but this does not identify the timbral subtleties that may be contained in LTAS. Ideally, a meaningful acoustic measurement would be comparable with a perceptual rating by expert listeners. Although there may be acoustic cues in these singing samples that elicit listener preferences, providing a link between these and a visual representation of superior quality in LTAS presents a complex challenge for the field. Although more consistency seems to exist between LTAS and energy, we still do not know how this affects quality because the available studies did not include a perceptual component.10,13,14,20

Also, SPR and ER may be uncorrelated with the perceived “goodness” of vocal quality because there are at least three ways in which a singer can increase the energy in the higher harmonics, each of which affects both SPR and ER in a similar direction: (1) production of the formant (good quality), (2) employing pressed phonation (bad vocal quality, at least from a classical perspective), and (3) deliberate addition of twang to boost carrying power. Acoustic measures such as SPR and ER performed on LTAS were never intended to provide vocal quality descriptors,22 but when the lowest ranked singer on the perceptual ratings achieved the highest rankings in SPR and ER, as in this study, these findings indicate that such measures on LTAS cannot define vocal quality as perceived by ear.

TABLE 2. Ranked SPR for Each Singer, in Each Musical Task (Mozart and Schubert) in Each Condition (O and SO)

Rank  Singer  Task      Condition  SPR
1     3       Mozart    SO         11.06
2     6       Mozart    SO         12.01
3     3       Mozart    O          13.00
4     1       Mozart    SO         16.10
5     5       Schubert  SO         16.59
6     4       Mozart    O          17.65
7     5       Mozart    SO         17.83
8     2       Mozart    O          18.13
9     1       Schubert  O          18.51
10    1       Mozart    O          18.70
11    1       Schubert  SO         18.83
12    6       Schubert  O          18.90
13    5       Mozart    O          19.51
14    4       Mozart    SO         19.65
15    6       Mozart    O          19.81
16    5       Schubert  O          20.28
17    2       Mozart    SO         21.18
18    6       Schubert  SO         21.56
19    3       Schubert  O          22.34
20    3       Schubert  SO         23.76
21    4       Schubert  O          24.27
22    4       Schubert  SO         25.33
23    2       Schubert  O          25.65
24    2       Schubert  SO         33.20

TABLE 3. Ranked ER for Each Singer, in Each Musical Task (Mozart and Schubert) in Each Condition (O and SO)

Rank  Singer  Task      Condition  ER
1     6       Mozart    SO         9.41
2     3       Mozart    O          11.10
3     3       Mozart    SO         11.53
4     5       Mozart    O          14.37
5     5       Schubert  SO         14.60
6     2       Mozart    O          14.93
7     5       Mozart    SO         15.10
8     6       Mozart    O          15.26
9     5       Schubert  O          15.45
10    1       Mozart    O          15.72
11    1       Mozart    SO         15.82
12    1       Schubert  O          15.95
13    4       Mozart    O          16.13
14    6       Schubert  O          16.15
15    2       Mozart    SO         16.18
16    1       Schubert  SO         16.90
17    4       Mozart    SO         17.10
18    6       Schubert  SO         18.96
19    3       Schubert  O          19.51
20    4       Schubert  O          20.01
21    3       Schubert  SO         20.27
22    4       Schubert  SO         20.68
23    2       Schubert  O          22.33
24    2       Schubert  SO         29.32

The LTAS may contain subtle information on classical singing quality, such as differences between the open throat technique and an SO open throat condition. The LTAS of the O and SO conditions produced in this study were both different from LTAS produced in other studies examining speech quality,11,18 pop singing,16 or country singing.15 We therefore argue that energy distribution above 2 kHz may indicate a fundamental quality of classical and operatic singing but is not necessarily an indicator of overall vocal quality. Measures of SPR and ER may concur more closely with perceptual studies of vocal projection, carrying power, or potential amplification over an orchestra.

TABLE 4. Singer, Task, Condition, and Rankings for Perceptual Score, SPR, and ER, Sorted by Perceptual Score Ranking from Highest to Lowest

Perceptual Rank  SPR Rank  ER Rank  Singer  Task      Condition
1                16        9        5       Schubert  Optimal
2                13        3        5       Mozart    Optimal
3                10        10       1       Mozart    Optimal
4                5         2        3       Mozart    Optimal
5                9         11       1       Schubert  Optimal
6                21        20       4       Schubert  Optimal
7                6         13       4       Mozart    Optimal
8                23        23       2       Schubert  Optimal
9                19        19       3       Schubert  Optimal
10               8         6        2       Mozart    Optimal
11               11        14       6       Schubert  Optimal
12               4         4        5       Schubert  Suboptimal
13               3         12       1       Mozart    Suboptimal
14               24        24       2       Schubert  Suboptimal
15               15        8        6       Mozart    Optimal
16               7         7        5       Mozart    Suboptimal
17               20        21       3       Schubert  Suboptimal
18               18        18       6       Schubert  Suboptimal
19               12        16       1       Schubert  Suboptimal
20               1         5        3       Mozart    Suboptimal
21               22        22       4       Schubert  Suboptimal
22               17        15       2       Mozart    Suboptimal
23               14        17       4       Mozart    Suboptimal
24               2         1        6       Mozart    Suboptimal

Exemplars of vocal quality rated by listeners
We compared exemplars of LTAS sampled from high-, mid-, and low-perceptual ranks and presented these with representative qualitative descriptors. LTAS of the highest-ranking singers showed an increase in energy between 2 and 4 kHz, whereas LTAS of mid- and low-scoring singers lacked the unified peak of energy increase above 2 kHz. Previous similar findings9,13,14,18,41 have interpreted these visual cues in LTAS as representative of classical vocal quality. Because the third-ranking singers (1 and 3) produced this unified energy peak above 2 kHz (the most “masculine” LTAS shape) in both the Mozart and the Schubert O samples, they should, on this interpretation, have had the “best” voices perceptually. However, their LTAS were different from those of the highest-ranking singer (singer 5), who had a wider distribution of energy above 2 kHz. The low SO plots (Figure 5) showed a spectral roll-off more consistent with plots of speech than with singing,11,42 but with reduced energy from 0 to 2 kHz. Therefore, measurements of comparative energy between 0–2 and 2–4 kHz, such as SPR or ER, were inconclusive indicators of vocal quality.

TABLE 5. Pearson Correlations Among Perceptual Rating, SPR, and ER of the 24 Samples

       Rating               SPR
SPR    0.101 (P = 0.638)
ER     0.064 (P = 0.767)    0.960 (P < 0.001)

High-scoring singers were praised for their overall vocal quality, or combinations of positive qualities that resulted in a “balanced” sound.4 Judges recognized an overall good quality in the mid-scoring voices, but they identified technical flaws that affected vocal production. In general, judges were more likely to comment on faulty vocal production or technique, particularly in those ranked lowest. Negative comments across all singers were more descriptive and focused on technical suggestions for improvement in the sound. Conversely, those singers ranked highly attracted little comment. It seems that a beautiful voice is difficult to describe.

LTAS, and measures applied to LTAS, are not sensitive to quality differences that were identified by other analyses, such as vibrato. In previous studies, these voices were examined for vibrato and spectral differences in O and SO,7,33 but although the differences were significant for vibrato parameters, these differences were not confirmed by spectral analysis, which suggests that current measures that assess voice, such as LTAS, are not sufficiently sophisticated to represent voice. Spectrographic work in other fields43–47 has not conclusively identified the same voice in different recordings despite simultaneous perceptual and visual cues. If this is the case, it may not be possible to convincingly illustrate good singing by visual representation alone. Indeed, with any acoustic analysis, isolation may lead to incorrect interpretation of acoustic cues in the singing voice. The danger of such a method is the creation of an ideal, but nonexistent, voice of quality.

TABLE 6. The Spearman Rho for Rankings of Samples and Singers for Perceptual Rating, SPR, and ER

Samples (n = 24)
       Rating                SPR
SPR    0.308 (P = 0.143)
ER     -0.006 (P = 0.977)    0.169 (P = 0.431)

Singers (n = 6)
       Rating                SPR
SPR    0.214 (P = 0.315)
ER     -0.314 (P = 0.135)    0.014 (P = 0.947)

In measuring vocal quality in singing, the interpretation of LTAS is as subjective as is the perceptual assessment of expert listeners. Our listeners were consistent in their perceptual judgments and reliable in their assessment of vocal quality. Although spectral assessments, such as LTAS, allow for a comparable holistic visual assessment of the voice, future voice research must assign a prominent place to perceptual rating and avoid confusing spectral energy distributions with vocal quality. Ideally, LTAS should provide an acoustic resource to assess the spectra of a performance,10 rather than of a single note.12,48,49 An LTAS plot, like an extended listening sample, provides an overall impression of the most regular features of the sound of a singer. Given the advantages of LTAS, in that it stabilizes to a regular pattern over time10,33 while retaining the individuality generated by its musical stimuli and the individual singer, one would expect the final LTAS curves to show a positive relationship with perceptual ratings. This result was not confirmed in this study.

Current acoustic information available to singing students and singing pedagogues has limited application in the singing studio. Vocal quality is based on multiple factors, which are unlikely to be independent. The next generation of acoustic analyses must complement the human ear in integrating the complex dimensions of the human voice. Future efforts could well be directed toward establishing perceptual benchmarks of vocal quality and their associated acoustical parameters. To remain viable, acoustic research in voice must attempt to emulate the human ear.

Acknowledgments: We thank Ms. Maree Ryan for her pedagogical advice and support. We also thank Mr. Peter Thomas and Dr. Densil Cabrera for their advice on acoustical matters and the six singers for their willing participation in the project. We are grateful for the helpful comments of the reviewers in the revision of this paper.

REFERENCES

1. Wapnick J, Ekholm E. Expert consensus in solo voice performance evaluation. J Voice. 1997;11:429–436.
2. Ekholm E, Papagiannis GC, Chagnon FP. Relating objective measurements to expert evaluation of voice quality in Western classical singing: critical perceptual parameters. J Voice. 1998;12:182–196.
3. Robison CW, Bounous B, Bailey R. Vocal beauty: a study proposing its acoustical definition and relevant causes in classical baritones and female belt singers. J Singing. 1994;51:19–30.
4. Mitchell HF, Kenny DT, Ryan M, Davis PJ. Defining open throat through content analysis of experts’ pedagogical practices. Logoped Phoniatr Vocol. 2003;28:167–180.
5. Merritt L, Richards A, Davis PJ. Performance anxiety: loss of the spoken edge. J Voice. 2001;15:257–269.
6. Thompson WF, Diamond PCT, Balkwill L-L. The adjudication of six performances of a Chopin etude: a study of expert knowledge. Psychol Music. 1998;26:154–174.
7. Mitchell HF, Kenny DT. The impact of “open throat” technique on vibrato rate, extent and onset in classical singing. Logoped Phoniatr Vocol. 2004;29:171–182.
8. Mitchell HF, Kenny DT. Can experts identify “open throat” technique as a perceptual phenomenon? Musicae Scientiae. In press.
9. Vurma A, Ross Y. Priorities in voice training: carrying power or tone quality. Musicae Scientiae. 2000;4:75–93.
10. Jansson EV, Sundberg J. Long-time average spectra applied to analysis of music. Part I: method and general applications. Acustica. 1975;34:15–19.
11. Löfqvist A, Mandersson B. Long-time average spectrum of speech and voice analysis. Folia Phoniatr Logoped. 1987;39:221–229.
12. Omori K, Kacker A, Carroll LM, Riley WD, Blaugrund SM. Singing power ratio: quantitative evaluation of singing voice quality. J Voice. 1996;10:228–235.
13. Thorpe C, Cala S, Chapman J, Davis P. Patterns of breath support in projection of the singing voice. J Voice. 2001;15:86–104.
14. Sundberg J. Articulatory interpretation of the “singing formant.” J Acoust Soc Am. 1974;55:838–844.
15. Cleveland TF, Sundberg J, Stone RE. Long-term average spectrum characteristics of country singers during speaking and singing. J Voice. 2001;15:54–60.
16. Borch DZ, Sundberg J. Spectral distribution of solo voice and accompaniment in pop music. Logoped Phoniatr Vocol. 2002;27:37–41.
17. Mendoza E, Valencia N, Munoz J, Trujillo H. Difference in voice quality between men and women: use of the long-term average spectrum (LTAS). J Voice. 1996;10:59–66.
18. Barrichelo VMO, Heuer RJ, Dean CM, Sataloff RT. Comparison of singer’s formant, speaker’s ring and LTA spectrum among classical singers and untrained normal speakers. J Voice. 2001;15:344–350.
19. Rossing TD, Sundberg J, Ternstrom S. Acoustic comparison of soprano solo and choir singing. J Acoust Soc Am. 1987;82:830–836.
20. Barnes JJ, Davis PJ, Oates J, Chapman J. The relationship between professional operatic soprano voice and high range spectral energy. J Acoust Soc Am. 2004;116:530–538.
21. Bunch M, Chapman J. Taxonomy of singers used as subjects in scientific research. J Voice. 2000;14:363–369.
22. Bartholomew W. A physical definition of “good voice quality” in the male voice. J Acoust Soc Am. 1934;6:25–33.
23. Howard DM, Szymanski J, Welch GF. Listeners’ perception of English girl and boy choristers. Music Percept. 2002;20:35–49.
24. White P. Long-term average spectrum (LTAS) analysis of sex and gender-related differences in children’s voices. Logoped Phoniatr Vocol. 2001;26:97–101.
25. White P. A study of the effects of vocal intensity variation on children’s voices using long-term average spectrum (LTAS) analysis. Logoped Phoniatr Vocol. 1998;23:111–120.
26. Foulds-Elliott S, Thorpe C, Cala S, Davis P. Respiratory function in operatic singing: effects of emotional connection. Logoped Phoniatr Vocol. 2000;25:151–168.
27. Behrens G, Green S. The ability to identify emotional content of solo improvisations performed vocally and on three different instruments. Psychol Music. 1993;21:20–30.
28. Howes P, Callaghan J, Davis PJ, Kenny DT, Thorpe C. The relationship between measured vibrato characteristics and perception in Western operatic singing. J Voice. 2004;18:216–230.
29. Siegwart H, Scherer K. Acoustic concomitants of emotional expression in operatic singing: the case of Lucia in Ardi gli incensi. J Voice. 1995;9:249–260.
30. Sundberg J, Iwarsson J, Hagegard H. A singer’s expression of emotions in sung performance. In: Fujimura O, Hirano M, editors. Vocal Fold Physiology: Voice Quality Control. San Diego, CA: Singular Publishing Group; 1995:217–231.
31. Kamenetsky S, Hill D, Trehub S. Effect of tempo and dynamics on the perception of emotion in music. Psychol Music. 1997;25:149–160.
32. Gabrielsson A, Juslin PN. Emotional expression in music performance: between the performer’s intention and the listener’s experience. Psychol Music. 1996;24:68–91.
33. Mitchell HF, Kenny DT. The effects of open throat technique on long term average spectra (LTAS) of female classical voices. Logoped Phoniatr Vocol. 2004;29:99–118.
34. Titze I, Bergan CC, Hunter EJ, Story B. Source and filter adjustments affecting the perception of the vocal qualities twang and yawn. Logoped Phoniatr Vocol. 2003;28:147–155.
35. Estill J. Primer of Basic Figures. 2nd ed. Santa Rosa, CA: Estill Voice Training Systems; 1996.
36. Cabrera D, Davis P, Barnes J, Jacobs M, Bell D. Recording the operatic voice for acoustic analysis. Acoust Australia. 2002;30:103–108.
37. Erickson ML, Perry SR. Can listeners hear who is singing? A comparison of three-note and six-note discrimination tasks. J Voice. 2003;17:353–369.
38. White P. Formant frequency analysis of children’s spoken and sung vowels using sweeping fundamental frequency production. J Voice. 1999;13:570–582.
39. Nordenberg M, Sundberg J. Effect on LTAS of vocal loudness variation. TMH-QPSR. 2003;45:93–100.
40. Lundy DS, Roy S, Casiano RR, Xue JW, Evans J. Acoustic analysis of the singing and speaking voice in singing students. J Voice. 2000;14:490–493.
41. Sundberg J. Level and center frequency of the singer’s formant. J Voice. 2001;15:176–186.
42. Novak A, Vokral J. Acoustic parameters for the evaluation of voice of future voice professionals. Folia Phoniatr Logoped. 1995;47:279–285.
43. Koenig BE. Spectrographic voice identification: a forensic survey. J Acoust Soc Am. 1986;79:2088–2090.
44. Hakes J, Shipp T, Doherty ET. Acoustic properties of straight tone, vibrato, trill and trillo. J Voice. 1987;1:148–156.
45. Hoit JD, Jenks CL, Watson PJ, Cleveland TF. Respiratory function during speaking and singing in professional country singers. J Voice. 1996;10:39–49.
46. Hollien HF. The Acoustics of Crime: The New Science of Forensic Phonetics. New York: Plenum Press; 1990.
47. Rose P, editor. Forensic Speaker Identification. New York: Taylor & Francis; 2002.
48. Cleveland TF. Acoustic properties of voice timbre types and their influence on voice classification. J Acoust Soc Am. 1977;61:1622–1629.
49. Sundberg J. The Science of the Singing Voice. DeKalb, IL: Northern Illinois University Press; 1988.