Acoustical Comparison Between Samples of Good and Poor Vibrato in Singers


*Jose A. Diaz and †Howard B. Rothman
*Carabobo, Venezuela; †Gainesville, Florida

Summary: The purpose of this research was to analyze samples of frequency vibrato taken from recordings of eight different singers, which were classified as examples of good or poor singing. The samples were analyzed by a software package, which makes use of the linear prediction coding (LPC) method to determine the time varying rate and extent of the frequency vibrato wave. Four parameters, which relate to the periodicity of the samples, were extracted from the time varying rate and extent and investigated in order to verify or reject the hypothesis that the best vibrato samples were the most symmetric ones. Ten samples per singer were analyzed, 5 good and 5 poor, for a total of 80 samples. The results show that the samples judged as good were the most periodic ones.

Key Words: Vibrato, Singers’ voice, Speech processing, Spectrogram, Linear prediction coding (LPC).

INTRODUCTION

Historically, in order to determine the quality of a vibrato sample, experts in the area of vocal music and the singing voice would listen to, analyze, and judge the quality of vibrato samples.¹ This is a subjective process; however, there is agreement among experts. Most experts, that is, singers, vocal pedagogues, vocal critics, and scientists who study vibrato, agree that the samples with better quality are the most periodic ones,² that is, the ones whose rate and extent are more constant and regular over time.

The main objective of this research was to study in detail the periodicity features of vibrato samples, in order to verify or reject the hypothesis that the best samples were the most periodic ones. In order to accomplish this objective, signal processing algorithms and software were used to extract and analyze some parameters related to the periodicity of the vibrato wave. There was also an expectation that the parameters directly related to vibrato quality could be identified. This research presents a novel approach to vibrato analysis because there does not appear to be any other analysis of vibrato in singers or musical instruments in which the time varying rate and extent of the vibrato samples were analyzed at a resolution as good as or better than 80.58 samples per second, or through the use of linear prediction coding (LPC).³⁻⁹

Accepted for publication August 9, 2002. Presented at the II Iberoamerican Congress on Acoustics held in Madrid from October 16 to 20, 2000. From the *Department of Electrical Engineering, University of Carabobo and †Department of Communication Sciences and Disorders, University of Florida, Gainesville, Florida. Address correspondence and reprint requests to Jose A. Diaz, Department of Electrical Engineering, University of Carabobo, Venezuela, P.O. Box 025685, Miami, FL 33102-5685. E-mail: [email protected]
Journal of Voice, Vol. 17, No. 2, pp. 179–184. © 2003 The Voice Foundation. 0892-1997/2003 $30.00+0. doi:10.1016/S0892-1997(03)00002-X


FIGURE 1. Spectrogram of vibrato sample.

FIGURE 2. Example of frequency vibrato.

SAMPLE SELECTION

For this research, eight professional singers (5 male, 3 female) with years of experience were chosen from a database of vibrato samples that is kept at the Department of Communication Sciences and Disorders at the University of Florida. Four different individuals with singing experience or with experience in studying the singing voice listened to all samples from the eight singers in order to classify the samples produced by each singer as samples of good or poor vibrato. The judges were instructed to classify the samples depending on their quality, as good (pleasing to the ear) or poor (disagreeable to the ear). The results provided by each judge were compared, and the samples that were judged as good or poor by all four judges were chosen for analysis; the other samples were rejected because the judges did not unanimously agree on their quality. From the chosen samples, five samples of good vibrato and five of poor vibrato were randomly selected for each singer, which resulted in a total of 80 samples (8 singers times 10 samples/singer). All samples selected fulfilled the following criteria:

1. Length greater than 1.5 s.
2. No change in pitch.
3. No change in vowel.
4. Absence of accompanying musical instruments or noise.

SAMPLE ANALYSIS

The MMSV¹ (Mathematical Model of Singers’ Vibrato, The Mathworks, Natick, MA) software, which was developed by one of the authors, was selected for sample analysis because it provided the time varying rate and extent in Hertz of the frequency vibrato wave. This software generated sample spectrograms (see Figure 1), from which the vibrato samples were extracted. Figure 2 shows the frequency vibrato wave extracted from the sixth harmonic in Figure 1.

The software used the LPC method to generate the time varying rate and extent waves of the frequency vibrato, which represent how the rate and extent of the frequency vibrato wave vary along the time axis. These two waves were created by taking a small segment of the vibrato signal (170 ms), applying the LPC method, and obtaining the rate and extent in Hertz of the segment from the resulting LPC parameters. Then, the software selected a new segment of the same length (170 ms), shifted 12 ms to the right, and analyzed it. This process was repeated until the entire signal was analyzed. The length of the vibrato segment was chosen to be 170 ms because a longer segment would show only the average rate and extent of the segment, and a shorter segment would not contain enough data for a reliable result. Then, all rate and extent values were plotted in separate windows (see Figures 3 and 4). These two waves represent the time varying rate and extent of the frequency vibrato wave seen in Figure 2. Notice that the time scale in Figures 3 and 4 is shorter than that of Figures 1 and 2. This is because the LPC method requires a segment of the vibrato wave to calculate one value of the rate and extent. It is important to note that the time varying extent wave represents the extent of the wave in Figure 2, and it does not represent the amplitude vibrato wave, which contains the amplitude variations associated with the frequency vibrato.

The main advantage in using this software compared to other methods was the time resolution (number of samples per second), which is 80.58 samples per second in Figures 3 and 4. This allowed the variation patterns in rate and extent of the frequency vibrato to be observed and compared. If the frequency vibrato wave were a sinusoidal wave, the waves in Figures 3 and 4 would be straight lines because the rate and extent of a pure sine wave are constant over time. Therefore, as the waves in Figures 3 and 4 approach a straight line, the frequency vibrato becomes more periodic, and the departure from a straight line shape indicates the degree of aperiodicity in the frequency vibrato wave. As a result, in order to measure the periodicity of a vibrato sample, four parameters were selected that measure the deviations of the waves in Figures 3 and 4 from a straight line. These parameters are as follows:

1. Total time varying rate deviation (maximum rate minus minimum rate) in Hertz.
2. Variability (standard deviation) of the time varying rate in Hertz.
3. Total time varying extent deviation (maximum extent minus minimum extent) in Hertz.
4. Variability (standard deviation) of the time varying extent in Hertz.

Parameters 1 and 2 were extracted from the time varying rate wave and parameters 3 and 4 were extracted from the time varying extent wave. Parameters 2 and 4 (variability) represent the spread of the time varying rate and extent in Hertz. In order to make a meaningful comparison among samples and singers, the four parameters mentioned above were converted into percentages of the average value of the wave from which the parameter was extracted. This was necessary because the MMSV software can extract the vibrato wave from different harmonics; for higher harmonics, the variability in Hertz is higher, so it should not be compared directly to that of a time varying wave extracted from a lower harmonic. By converting each parameter into a percentage of the average wave value, this error was canceled.
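The windowed analysis described in this section (a 170 ms segment, shifted by 12 ms until the signal is exhausted) can be sketched in Python. This is only an illustration under stated assumptions, not the MMSV implementation: the function name `rate_extent_tracks`, the contour sampling rate `fs`, and the synthetic test contour are hypothetical. The rate is estimated with a constrained second-order linear predictor applied to the differenced segment, as a simplified stand-in for the LPC analysis, and the extent is taken as the peak deviation of a least-squares sinusoid fitted at that rate, which may differ from the definition of extent used by the authors.

```python
import numpy as np

def rate_extent_tracks(f0, fs, win_s=0.170, hop_s=0.012):
    """Return per-window vibrato rate (Hz) and extent (Hz) of a contour.

    f0 : frequency-vibrato contour in Hz (hypothetical input)
    fs : sampling rate of the contour in samples/s (assumed value)
    """
    f0 = np.asarray(f0, dtype=float)
    win = int(round(win_s * fs))   # samples per 170 ms segment
    hop = int(round(hop_s * fs))   # samples per 12 ms shift
    n = np.arange(win)
    rates, extents = [], []
    for start in range(0, len(f0) - win + 1, hop):
        seg = f0[start:start + win]
        # Differencing removes the constant (mean pitch) term exactly.
        d = np.diff(seg)
        # A sinusoid obeys d[k+1] + d[k-1] = 2*cos(w) * d[k]; solve for
        # the single predictor coefficient a1 = 2*cos(w) by least squares.
        num = np.dot(d[1:-1], d[2:] + d[:-2])
        den = np.dot(d[1:-1], d[1:-1])
        a1 = num / den if den > 0 else 2.0
        omega = np.arccos(np.clip(a1 / 2.0, -1.0, 1.0))  # rad/sample
        rates.append(omega * fs / (2.0 * np.pi))
        # Fit offset + sinusoid at the estimated rate; the sinusoid's
        # amplitude is used here as the extent (an assumption).
        basis = np.column_stack(
            [np.ones(win), np.cos(omega * n), np.sin(omega * n)]
        )
        coef, *_ = np.linalg.lstsq(basis, seg, rcond=None)
        extents.append(np.hypot(coef[1], coef[2]))
    return np.array(rates), np.array(extents)

# Synthetic, perfectly periodic contour: both tracks should be flat.
fs_demo = 1000.0
t_demo = np.arange(2000) / fs_demo
f0_demo = 440.0 + 10.0 * np.sin(2.0 * np.pi * 5.5 * t_demo)
rate_demo, extent_demo = rate_extent_tracks(f0_demo, fs_demo)
```

For a purely sinusoidal contour, both returned tracks are constant, which matches the straight-line argument made in the text; any departure from a flat track reflects aperiodicity in the input.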


FIGURE 3. Time varying rate of frequency vibrato.

FIGURE 4. Time varying extent of frequency vibrato.

The 80 samples chosen for analysis were analyzed using the MMSV software in order to generate the time varying rate and extent waves for each of them. Then, the four parameters mentioned above were calculated by a Matlab-based program (The Mathworks, Natick, MA), which also saved the parameters in an Excel-compatible format for later analysis.
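The four periodicity parameters and their conversion into percentages of the average wave value can be illustrated with a short Python sketch. It is not the authors' Matlab program: the function name `periodicity_parameters` and the example track are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not state which form was used.

```python
import statistics

def periodicity_parameters(track):
    """Total deviation and variability of a time varying rate or
    extent track, each expressed as a percentage of the track mean."""
    mean = statistics.fmean(track)
    total_dev = max(track) - min(track)      # maximum minus minimum, in Hz
    variability = statistics.stdev(track)    # sample std (assumption), in Hz
    return {
        "total_dev_pct": 100.0 * total_dev / mean,
        "variability_pct": 100.0 * variability / mean,
    }

# Hypothetical time varying rate track in Hz, for illustration only.
rate_track = [5.0, 5.5, 6.0, 5.5, 5.0]
rate_params = periodicity_parameters(rate_track)

# A perfectly flat track (pure sinusoidal vibrato) gives zero for both.
flat_params = periodicity_parameters([5.5, 5.5, 5.5, 5.5])
```

Applying the same function to the rate track and to the extent track of one sample yields the four values that are compared in the tables.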

RESULTS

Tables 1 through 8 show the results of the four parameters under analysis for the eight singers after they were converted into percentages as explained above. The first column indicates the parameter name. Columns 2 and 3 show the average values obtained for the good and poor samples, respectively. Single-factor analyses of variance¹⁰ (ANOVAs) with five samples per group and two groups (good and poor samples) were applied to the four parameters of the eight singers (critical value of F = 5.32 for α = 0.05, df1 = 1, and df2 = 8). Columns 4 and 5 show the F and p values, respectively. No arcsine transform was performed on the data. Column 6 indicates whether the obtained results were significant. Significant results were found in four of the eight singers, which indicates that the mean values of the variables under study were higher for the poor samples in four singers of eight. No significant results were found for singers numbered 1, 4, 6, and 8. These results support the hypothesis that the best samples were the most symmetrical ones and serve to validate the listeners’ classification of the samples.

TABLE 1. Results for Singer Number 1

Parameter                        Avg good samp  Avg poor samp  F value    p value    Significant
Time varying rate variability %  4.1672         5.388          4.403696   0.069099   No
Time varying rate total dev %    19.48266       25.12942       3.369403   0.103744   No
Time varying ext variability %   6.7132         7.9686         0.74747    0.41245    No
Time varying ext total dev %     49.85025       56.05383       0.341805   0.574902   No

TABLE 2. Results for Singer Number 2

Parameter                        Avg good samp  Avg poor samp  F value    p value    Significant
Time varying rate variability %  5.2718         5.438          0.046982   0.833826   No
Time varying rate total dev %    23.07027       26.39093       0.652133   0.442698   No
Time varying ext variability %   5.4446         13.8324        13.61278   0.006134   Yes
Time varying ext total dev %     38.4382        100.0234       16.93521   0.003366   Yes

TABLE 3. Results for Singer Number 3

Parameter                        Avg good samp  Avg poor samp  F value    p value    Significant
Time varying rate variability %  6.6054         10.4638        2.911387   0.126346   No
Time varying rate total dev %    30.66759       48.50149       3.11204    0.115722   No
Time varying ext variability %   7.555          13.9394        8.675875   0.018553   Yes
Time varying ext total dev %     56.5322        103.4366       9.830369   0.013903   Yes

TABLE 4. Results for Singer Number 4

Parameter                        Avg good samp  Avg poor samp  F value    p value    Significant
Time varying rate variability %  4.5234         5.2498         0.372504   0.558578   No
Time varying rate total dev %    21.00638       23.78328       0.22438    0.64838    No
Time varying ext variability %   9.1682         11.8154        2.435761   0.157218   No
Time varying ext total dev %     68.10582       89.01863       2.424016   0.158102   No

TABLE 5. Results for Singer Number 5

Parameter                        Avg good samp  Avg poor samp  F value    p value    Significant
Time varying rate variability %  3.3768         4.3642         3.630769   0.093181   No
Time varying rate total dev %    15.50761       21.98027       3.753822   0.088691   No
Time varying ext variability %   4.5994         6.3954         7.808637   0.023399   Yes
Time varying ext total dev %     37.0789        47.01511       2.875563   0.128376   No


TABLE 6. Results for Singer Number 6

Parameter                        Avg good samp  Avg poor samp  F value    p value    Significant
Time varying rate variability %  3.1004         4.3288         1.997513   0.195265   No
Time varying rate total dev %    14.52671       19.16041       1.05888    0.333575   No
Time varying ext variability %   6.2266         10.4394        5.171822   0.052548   No
Time varying ext total dev %     48.69182       71.93773       3.480643   0.099067   No

TABLE 7. Results for Singer Number 7

Parameter                        Avg good samp  Avg poor samp  F value    p value    Significant
Time varying rate variability %  7.1548         7.471          0.032922   0.860532   No
Time varying rate total dev %    33.52693       32.18296       0.029144   0.868687   No
Time varying ext variability %   6.5218         10.9056        6.52134    0.033979   Yes
Time varying ext total dev %     46.75865       78.79748       6.500981   0.034191   Yes

TABLE 8. Results for Singer Number 8

Parameter                        Avg good samp  Avg poor samp  F value    p value    Significant
Time varying rate variability %  4.1346         4.6404         0.148622   0.709904   No
Time varying rate total dev %    17.21066       29.39715       1.006298   0.345165   No
Time varying ext variability %   6.4406         7.2862         0.617938   0.454466   No
Time varying ext total dev %     44.56048       55.2096        1.728126   0.225085   No

Analysis of the vibrato rate for the individual singers

We can see from Tables 1 through 8 that the variability in rate for the poor samples was always higher than that of the good samples. The total deviation of the time varying rate was higher for the poor samples in seven of the eight singers and was smaller only for singer 7. The higher total deviation of the time varying rate for the good samples in singer 7 indicates large oscillations of the time varying rate, which were not perceived as poor vibrato. The statistical analysis of the variability of the time varying rate did not show significant results, nor did the total deviation of the time varying rate.

Analysis of the vibrato extent for the individual singers

We can see from Tables 1 through 8 that the variability and the total deviation of the time varying extent for the poor samples were always higher than those of the good samples. The statistical analysis of the variability of the time varying extent yielded significant results in four singers of eight, whereas the total deviation of the time varying extent yielded significant results in three cases of eight.

Comparison between good and poor samples for the entire group

Table 9 provides a summary of the results after comparing the good and poor samples of the entire group. The first column shows the parameters under analysis. Columns 2 and 3 show the average results for good and poor samples. Single-factor ANOVAs with eight samples per group (8 singers) and two groups (good and poor samples) were applied to the four parameters under analysis (critical value of F = 4.60 for α = 0.05, df1 = 1, and df2 = 14). Columns 4 and 5 show the F and p values, respectively. No arcsine transform was performed on the data. Column 6 indicates whether the obtained results were significant.


TABLE 9. Comparison Between Good and Poor Samples

Parameter                        Avg good samp  Avg poor samp  F value    p value    Significant
Time varying rate variability %  4.7918         5.918          1.562169   0.231838   No
Time varying rate total dev %    21.87485       28.31574       2.527292   0.134214   No
Time varying ext variability %   6.583675       10.3228        10.99749   0.005095   Yes
Time varying ext total dev %     48.75204       75.18655       10.06189   0.006788   Yes
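The group-level comparison in Table 9 can be checked with a small Python sketch of a single-factor ANOVA. It assumes that the group analysis was run on the per-singer averages listed in Tables 1 through 8; that is an inference rather than something the text states, although the F values computed this way closely match those in Table 9 for the two variability parameters. The function name `one_way_f` is hypothetical, and the critical value 4.60 is copied from the text rather than computed.

```python
from statistics import fmean

def one_way_f(*groups):
    """Single-factor ANOVA: return (F, df1, df2) for the given groups."""
    values = [v for g in groups for v in g]
    grand_mean = fmean(values)
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - fmean(g)) ** 2 for g in groups for v in g)
    df1 = len(groups) - 1
    df2 = len(values) - len(groups)
    return (ss_between / df1) / (ss_within / df2), df1, df2

# Per-singer averages (%) taken from Tables 1 through 8, singers 1 to 8.
rate_var_good = [4.1672, 5.2718, 6.6054, 4.5234, 3.3768, 3.1004, 7.1548, 4.1346]
rate_var_poor = [5.388, 5.438, 10.4638, 5.2498, 4.3642, 4.3288, 7.471, 4.6404]
ext_var_good = [6.7132, 5.4446, 7.555, 9.1682, 4.5994, 6.2266, 6.5218, 6.4406]
ext_var_poor = [7.9686, 13.8324, 13.9394, 11.8154, 6.3954, 10.4394, 10.9056, 7.2862]

F_CRITICAL = 4.60  # from the text: alpha = 0.05, df1 = 1, df2 = 14

f_rate, df1, df2 = one_way_f(rate_var_good, rate_var_poor)
f_ext, _, _ = one_way_f(ext_var_good, ext_var_poor)
rate_significant = f_rate > F_CRITICAL
ext_significant = f_ext > F_CRITICAL
```

Under that assumption, the rate variability falls below the critical value and the extent variability exceeds it, in agreement with the Significant column of Table 9.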

Analysis of the vibrato rate for the entire group

Table 9 indicates that the mean values of the four parameters were always smaller for the good samples than for the poor samples. In Table 9, the variability and the total deviation of the time varying rate do not show significant results. However, it is important to note that the poor samples used in this study showed larger oscillations in rate than the good samples, although no significant results were found.

Analysis of the vibrato extent for the entire group

In Table 9, the variability and the total deviation of the time varying extent show significant results. Therefore, it appeared that the oscillations observed in the rate and extent of the frequency vibrato wave were mainly due to variations in extent.

CONCLUSIONS

The results of this research support the hypothesis that the most symmetrical samples were judged as good samples of vibrato. Also, it appeared that the largest number of significant results was found in the variability of the time varying extent (4 cases of 8), which suggests that the variability of the time varying extent provided a measure of vibrato quality that was more accurate and meaningful than the other three parameters under analysis.


It could also be seen that the oscillations in the rate and extent of the frequency vibrato wave were found mainly in the time varying extent wave, rather than in the time varying rate wave, which suggests that it was more difficult for singers to control the extent of the vibrato pulse than the rate at which the vibrato pulse occurred. Taken together, these results show a direct relationship between the periodicity of the vibrato wave and its perceived quality.

REFERENCES

1. Diaz JA. Frequency characterization of singers’ vibrato. Gainesville, FL: University of Florida; 1995. Master’s thesis.
2. Sundberg J. The Science of the Singing Voice. DeKalb, IL: Northern Illinois University Press; 1987.
3. DeJonckere PH, Minoru H, Sundberg J. Vibrato. San Diego, CA: Singular Publishing Group; 1995.
4. Prame E. Measurement of the vibrato rate of ten singers. J Acoust Soc Am. 1994;96:1979–1984.
5. Horii Y. Frequency modulation characteristics of sustained /a/ sung in vocal vibrato. J Speech Hear Res. 1989;32:1–8.
6. Maher R, Beauchamp J. An investigation of vocal vibrato for synthesis. Appl Acoustics. 1990;30:219–245.
7. Sundberg J. Acoustic and psychoacoustic aspects of vocal vibrato. KTH Speech Transmission Lab Quart Progr Status Report. 1994;2–3:45–48.
8. Titze IR. Synthesis of sung vowels using a time-domain approach. Transcripts of the Eleventh Symposium: Care of the Professional Voice. New York: The Voice Foundation; 1983:90–98.
9. Mellody M, Wakefield G. The time-frequency characteristics of violin vibrato: modal distribution analysis and synthesis. J Acoust Soc Am. 2000;107:598–611.
10. Ott RL. An Introduction to Statistical Methods and Data Analysis. Belmont, CA: Duxbury Press; 1993.