Objective vocal fold vibration assessment from videokymographic images


Biomedical Signal Processing and Control 1 (2006) 129–136 www.elsevier.com/locate/bspc

C. Manfredi a,*, L. Bocchi a, S. Bianchi a, N. Migali b, G. Cantarella b

a Department of Electronics and Telecommunications, Università degli Studi di Firenze, Via S. Marta 3, 50139 Firenze, Italy
b Otolaryngology Department, University of Milan, Ospedale Maggiore Policlinico Mangiagalli e Regina Elena, Fondazione IRCCS, Via F. Sforza 35, 20122 Milano, Italy

Received 15 February 2006; received in revised form 29 May 2006; accepted 14 June 2006
Available online 1 September 2006

Abstract

Vocal fold oscillation crucially influences all the basic qualities of voice, such as pitch and loudness, as well as the spectrum. Stroboscopy provides the standard view of the larynx. However, two-dimensional high-speed imaging systems currently cannot provide enough image resolution to evaluate irregular vocal fold vibrations, owing to limits on transmission speed and storage volume. Videokymography is a new diagnostic tool developed to overcome specific limitations of stroboscopy in severely dysphonic patients with an aperiodic signal. It registers the movements of the vocal folds with high time resolution along a line perpendicular to the glottis. Being independent of the periodicity and intensity of the vocal signal, the technique allows an objective evaluation of vocal fold function. However, due to its novelty, no established clinical evaluation protocol and no validity or reliability data are available for videokymography. Moreover, few results concerning objective parameter estimation from videokymographic images are available. This paper focuses on measuring and tracking quantitative parameters for objective vocal fold functional assessment from videokymographic (VKG) examinations of subjects with normal and pathological laryngeal function, based on an active contour search implemented with a properly adjusted robust snake algorithm. The method is designed to deliver the essential objective information of VKG to the clinician in an effective way, as an aid to the clinician's subjective and intuitive judgement. A set of VKG images from both healthy and dysphonic subjects, recorded at the Otolaryngology Department, University of Milan, Italy, has been analysed, showing the robustness and reliability of the proposed technique.

© 2006 Elsevier Ltd. All rights reserved.

Keywords: Videokymography; Snake active contour; Objective parameters; Dysphonia

* Corresponding author. Tel.: +39 055 479 6410; fax: +39 055 494569. E-mail address: [email protected] (C. Manfredi).
1746-8094/$ – see front matter © 2006 Elsevier Ltd. All rights reserved. doi:10.1016/j.bspc.2006.06.001

1. Introduction

Vibration of the vocal folds is a highly relevant aspect of voice production, both in normal and in pathological voices. Its periodicity, or lack of periodicity, critically determines the quality of voice. It is typically described in terms of jitter and/or shimmer, period-to-period correlation, or spectral characteristics. Analysis of such characteristics is generally applied to acoustic signals recorded from microphones, but can also be used on signals derived from electroglottography, or on airflow recorded from flow masks. Another method for acquiring physiological data is direct visual inspection of vocal fold vibration by means of stroboscopy or videolaryngostroboscopy (VLS).

Currently, VLS is considered a first-choice exam for the diagnosis of a large number of functional pathologies of the larynx. However, intrinsic limitations restrict its usefulness in some clinical applications. VLS exams suffer from high inter-subject variability, largely due to the specific experience gained by each specialist. It is also difficult to quantify parameters related to vocal fold vibration and glottis closure, and to compare results from the same population or among those collected by different institutions. Finally, because of the way the technique works, irregular vibrations cannot be observed precisely. In fact, VLS is based on an optical illusion caused by persistence of visual images. A brief sequence of images, presented at intervals of less than 0.2 s, appears to the human eye as a continuously moving picture. Rapid periodic movements, such as vibration, can be made visible with very short light flashes, synchronous with the motion of the object under observation. Exact coincidence of the frequency of light flashes with the position of the moving object provides a sharp and clear picture (steady-phase mode), because points at the same phase angle of the object's motion are repeatedly illuminated. Slight differences between the frequency of the flashing light source and that of the object's motion produce the well-known slow-motion effect. Discontinuities in the slow-motion mode and blurred pictures in the steady-phase mode indicate irregularities quite clearly, but they cannot be quantified (only relative measures are possible). When very rough voices cause de-synchronization of the stroboscopic light source, irregularities may not even be verifiable.

Efforts to improve the sensitivity of VLS, so that variations of the wave characteristics across the glottis can also be visualized in aperiodic patterns of vibration, have yielded new techniques. Applied research has evolved in two directions: high-speed indirect laryngoscopy and videokymography (VKG). At present, full high-speed digital imaging systems [5] are perhaps the most technologically advanced, but also the most expensive, method for visualizing vocal fold oscillation. The advantage of high-speed digital kymographic imaging is that the measurement position can be selected anywhere on the vocal folds after the recording has been made. The line resolution and the image rate, however, are about three times lower than in the VKG method. Such systems are usually also cumbersome and unwieldy.

The kymographic concept introduced in [1,2] seems to be a promising solution, since each vibratory cycle is documented as a sequence of several images, which can be acquired directly from a single-line camera or extracted from high-speed image sequences [3,4]. VKG is based on a special camera, which can operate in two different modes: standard and high-speed [1]. In the standard mode, the camera provides standard images displaying the whole vocal folds at the standard video frame rate (30/25 frames/s, depending on the video coding standard, with a resolution of 720 × 486/768 × 576 pixels).
In the high-speed mode, the video camera delivers images from only a single line selected from the whole image, at a rate of approximately 7875/7812.5 line-images/s and with a resolution of 720 × 1/768 × 1 pixels. Fig. 1a shows the standard-mode VLS image on the left. The line on the image corresponds to the measuring position for the high-speed VKG image, shown in Fig. 1a on the right. This image is called a "videokymogram", and displays the vibratory pattern of the selected part of the vocal folds. The parameters measured on the VKG image are schematically reported in Fig. 1b, and will be described later. A kymographic recording is divided into video frames, i.e. segments of approximately 15/18 ms duration. Images are not in colour, and continuous high-intensity light is needed [2–4]. The vibratory pattern displayed in kymographic images depends on the measuring position, which must be selected prior to examination. A position in the middle of the vocal folds usually provides the vibratory pattern of the whole vocal folds. However, in the case of functional vocal fold pathologies, the vibration characteristics may differ along the glottal axis. In such cases, the clinician has to select the most representative line for analysis. The measuring line is, as a standard, adjusted to be perpendicular to the glottal axis.
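
As a worked illustration of the timing figures above, the following sketch converts a vibratory period measured in scanlines into a time and a frequency. Only the line rate of 7812.5 line-images/s comes from the text; the function names and the 50-scanline example are assumptions made for illustration.

```python
# Illustrative sketch: names and example values are assumptions,
# not part of the original method.
LINE_RATE = 7812.5  # one of the two line rates (line-images/s) quoted in the text


def line_period_ms(line_rate=LINE_RATE):
    """Time between two consecutive scanlines, in milliseconds."""
    return 1000.0 / line_rate


def cycle_frequency_hz(period_in_lines, line_rate=LINE_RATE):
    """Frequency of a vibratory cycle that spans `period_in_lines` scanlines."""
    if period_in_lines <= 0:
        raise ValueError("period_in_lines must be positive")
    return line_rate / period_in_lines


if __name__ == "__main__":
    print(line_period_ms())        # time resolution of a single scanline
    print(cycle_frequency_hz(50))  # a cycle spanning 50 scanlines
```

Under these assumptions, a period read off the videokymogram as a number of scanlines maps directly to a vibration frequency, which is why ratios of scanline distances are enough for a parameter such as Rper.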

Patients are asked to phonate at comfortable pitch and loudness. Due to difficulties in keeping the endoscope fixed on the same line, VKG recordings commonly last only a few seconds [12]. The advantage of VKG over stroboscopy alone is the ability to evaluate one portion of the glottis in comparison to another, while avoiding the limitation of vibratory periodicity. This technique also allows accurate objective measurements of the glottal gap without the inherent error produced by stroboscopic averaging of many individual glottal cycles. However, at present, there is no commercial tool for objective parameter extraction from VKG images.

This study aims at offering an automatic quantitative method to obtain vibration properties of human vocal folds via videokymography, by developing a digital image processing algorithm optimized for the analysis of VKG recordings, including steps such as intensity adjustment, noise removal and glottis identification. The presented method extends our previous work [6–9] and combines an active contour model with a parameter extraction algorithm that can accurately track the vibrational wave in videokymograms and automatically quantify its properties in terms of a few parameters useful for clinicians. Tracking the parameters, rather than simply measuring their mean value and standard deviation (std), is in fact considered of utmost importance by clinicians, as irregular patterns can be located at the instant at which they occur during phonation and related to images and acoustic signal analysis. Specifically, the amplitude and period ratios between right and left vocal folds, as well as the ratio between the opening and closing phase, are considered [10]. When required, more parameters could be added, by analogy with [11]. This point is under consideration. Examples concerning pathological subjects are given that show the robustness of the contour detection and parameter extraction algorithm.

2. Materials and methods

As previously said, VKG allows specific portions of the glottic image (taken at up to 7812 images/s) to be isolated and analysed. Such kymographic images give a good view of the movements of the vocal folds, periodic or non-periodic, but only for part of the image, i.e. the single line. In VKG the measuring position is selected prior to examination, and once the recording has been made it cannot be changed. The kymographic concept is the best way to represent visually a large amount of dynamic content in a two-dimensional image.

Despite its usefulness, to our knowledge no quantitative analysis of VKG images is commercially available so far, and only little work has been done in this direction [6–11]. At present, physicians perform only qualitative evaluation of VKG parameters, basically by visual inspection of successive frames, or by manual measurements on printed images. Obviously, such analysis prevents reliable comparison among wide sets of data, and hence hinders finding and defining standard reference values for classification and assessment of treatment effectiveness. Based on such requirements, this work aims at providing first results towards filling this gap.


Fig. 1. Main parameters for objective videokymographic image analysis. (a) Left: standard mode; right: high-speed VKG mode. The measuring position for the kymographic image is marked as the line on the image. The videokymogram displays the vibratory pattern of the selected part of the vocal folds. (b) High-speed videokymographic image. Main parameters.

Parameter extraction is obtained here by means of two subsequent stages: image analysis, for vocal folds contours detection, followed by signal analysis, for parameter evaluation from data sets representing vocal folds edges. Specifically, and with reference to Fig. 1b, the parameters to be measured and tracked are:

• Ramp, the ratio between the right and left vocal fold amplitudes, related to possible asymmetries between the two folds;
• Rper, the ratio between the right and the left vibratory periods, inversely related to possible frequency variations due to pathology;
• Roc, the ratio between opening and closing phase (open and closed, respectively), basically related to glottal incompetence, i.e. incomplete vocal folds closure during phonatory adduction.

For healthy voices, such parameters should be equal or close to one, and almost constant during all phonatory cycles. Any asymmetry due to pathology can thus be quantified by evaluating and tracking the above-mentioned parameters.
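
The three parameters are all per-cycle ratios that are then summarized over a frame, so their tracking can be sketched with a single helper. This is a minimal sketch under stated assumptions: the function name `track_ratio`, the example measurements, and the use of the population standard deviation are all hypothetical, and the per-cycle amplitudes, periods and phase lengths are assumed to have been measured already.

```python
from statistics import mean, pstdev


def track_ratio(numerators, denominators):
    """Return the per-cycle ratios with their mean and standard deviation."""
    if len(numerators) != len(denominators) or not numerators:
        raise ValueError("need two non-empty sequences of equal length")
    ratios = [n / d for n, d in zip(numerators, denominators)]
    return ratios, mean(ratios), pstdev(ratios)


if __name__ == "__main__":
    # Hypothetical per-cycle measurements from one frame
    # (amplitudes in pixels, periods and phases in scanlines).
    right_amp, left_amp = [10.0, 12.0, 9.0], [10.0, 8.0, 12.0]
    right_per, left_per = [52, 51, 53], [52, 51, 53]
    opening, closing = [20, 22, 21], [25, 24, 26]

    for name, num, den in (("Ramp", right_amp, left_amp),
                           ("Rper", right_per, left_per),
                           ("Roc", opening, closing)):
        ratios, avg, std = track_ratio(num, den)
        print(f"{name}: mean={avg:.3f} std={std:.3f}")
```

Because only ratios are used, the measurements need no calibration to physical units; a symmetric, regular vibration gives a mean close to one and a small standard deviation.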

2.1. First stage: edge detection

The detection of the vocal fold contours is carried out in a two-step process: the first step finds an initial contour, to which active snakes are applied in the second step. The VKG images under study are in fact mostly embedded in strong noise, as light intensity was sometimes too low and images were often unstable, possibly due to the difficulty of keeping the VKG device in a steady position. This has required applying robust techniques for edge detection.

2.1.1. First step: initial contour

The following routine, which handles some basic settings, is executed to provide initial data for the snake. It normalizes the grey levels, initializes the contour, and identifies black lines, which are to be disregarded by the algorithm, thus finding the significant rows in the image. Then, it sets to 0 the level of the image outside the right and left edges defined by the user. This avoids noisy fluctuations of the grey levels in regions far from the vocal folds. The routine scans the significant rows and, for each of them, determines the largest interval of pixels with a grey level lower than a pre-specified threshold. A first contour is thus obtained by storing the interval co-ordinates for each line in two separate arrays.

2.1.2. Second step: snake active contour

Snakes are planar deformable contours that are useful in several image analysis tasks. They are often used to approximate the location and shape of object boundaries, on the basis of the reasonable assumption that boundaries are piecewise continuous or smooth [13]. Representing the position of a snake parametrically by $v(s) = (x(s), y(s))$, $s \in [0, 1]$, its energy $E_{\mathrm{snake}}$ can be written as:

$$E_{\mathrm{snake}} = \int E_{\mathrm{int}}[v(s)]\,ds + \int E_{\mathrm{ext}}[v(s)]\,ds \qquad (1)$$

where $E_{\mathrm{int}}$ represents the internal energy of the snake due to bending, associated with a priori constraints, and $E_{\mathrm{ext}}$ is an external potential energy, which depends on the image and accounts for a posteriori information. The final shape of the contour corresponds to the minimum of $E_{\mathrm{snake}}$. In the original technique [14], the internal energy is defined as:

$$E_{\mathrm{int}}[v(s)] = \frac{1}{2}\left(\alpha(s)\left|\frac{\partial v(s)}{\partial s}\right|^{2} + \beta(s)\left|\frac{\partial^{2} v(s)}{\partial s^{2}}\right|^{2}\right) \qquad (2)$$

This energy is composed of a first-order term controlled by $\alpha(s)$ (tension of the contour) and a second-order term controlled by $\beta(s)$ (rigidity of the contour). The external energy couples the snake to the image. It is defined as a scalar potential function whose local minima coincide with intensity extremes, edges, and other image features of interest:

$$E_{\mathrm{ext}}[v(s)] = -c\,\left|\nabla G_{\sigma} * I(x, y)\right|^{2} \qquad (3)$$

where $I(x, y)$ is the image intensity, $G_{\sigma}$ the Gaussian of standard deviation $\sigma$, $\nabla$ the gradient operator, and $c$ a weight associated with image energies [13–15].

As concerns the energy minimization, the original model employs variational calculus to find the minimum iteratively. However, a number of problems may be associated with this approach, such as algorithm initialization, existence of local minima, and selection of model parameters. Among existing methods, the greedy algorithm [15] exhibits a low computational cost, provided the initial position of the snake is relatively close to the desired contour. In our application, the technique described in step 1 provides a fairly good approximation of the real contour, therefore allowing the snake algorithm to be used for fine tuning of the contour on the image. Each vocal fold has been modelled as an independent snake, with its end points constrained to lie on the first and the last scan line of the image, respectively. Unlike [11], the snake is applied to the whole contour and not to sequential rows. This makes the search particularly efficient and robust.
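
The two-step edge detection described in Section 2.1 can be sketched as follows. This is a simplified illustration, not the authors' C++ implementation: the function names, the weights `alpha`, `beta` and `c`, the search window of one pixel on each side, and the use of backward differences without Gaussian smoothing in the external energy are all assumptions. The contour is stored as one column position per scan line, so its end points stay on the first and last scan lines by construction.

```python
def initial_contour(image, threshold):
    """Step 1: per row, the longest run of pixels darker than `threshold`.

    Returns one (first_column, last_column) pair per row, or None if the
    row has no pixel below the threshold.
    """
    contour = []
    for row in image:
        best, start = None, None
        # A sentinel equal to the threshold closes a run that reaches the border.
        for x, value in enumerate(list(row) + [threshold]):
            if value < threshold:
                if start is None:
                    start = x
            elif start is not None:
                if best is None or x - start > best[1] - best[0] + 1:
                    best = (start, x - 1)
                start = None
        contour.append(best)
    return contour


def external_energy(image, c):
    """E_ext = -c * |grad I|^2, using backward differences (no smoothing)."""
    energy = []
    for i, row in enumerate(image):
        line = []
        for x, value in enumerate(row):
            gx = value - row[x - 1] if x > 0 else 0
            gy = value - image[i - 1][x] if i > 0 else 0
            line.append(-c * (gx * gx + gy * gy))
        energy.append(line)
    return energy


def greedy_snake(xs, e_ext, alpha=1.0, beta=1.0, max_sweeps=50):
    """Step 2: move each point to the lowest-energy column among its current
    column and its two horizontal neighbours, until no point moves."""
    xs = list(xs)
    width = len(e_ext[0])

    def energy(i, x):
        e_int = 0.0
        if i > 0:                    # first-order (tension) term
            e_int += alpha * (x - xs[i - 1]) ** 2
        if 0 < i < len(xs) - 1:      # second-order (rigidity) term
            e_int += beta * (xs[i - 1] - 2 * x + xs[i + 1]) ** 2
        return 0.5 * e_int + e_ext[i][x]

    for _ in range(max_sweeps):
        moved = False
        for i in range(len(xs)):
            best_x, best_e = xs[i], energy(i, xs[i])
            for x in (xs[i] - 1, xs[i] + 1):
                if 0 <= x < width and energy(i, x) < best_e:
                    best_x, best_e = x, energy(i, x)
            if best_x != xs[i]:
                xs[i], moved = best_x, True
        if not moved:
            break
    return xs


if __name__ == "__main__":
    row = [100, 100, 100, 100, 0, 0, 0, 100, 100, 100]
    image = [row[:] for _ in range(5)]
    left = [run[0] for run in initial_contour(image, threshold=50)]
    left[2] = 2  # simulate a thresholding artefact on one scan line
    print(greedy_snake(left, external_energy(image, c=1e-3)))
```

In this toy example the displaced point is pulled back first by the internal energy and then locked onto the edge by the external energy, which mirrors the role of the snake as a fine-tuning step after thresholding. Such a greedy search only works when the initial contour is already close to the true edge.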

2.2. Second stage: parameter extraction and tracking

In this stage, the data consist of (time, edge value) pairs for each fold, obtained as described in the previous steps. As already said, at present three clinically relevant parameters are extracted from the data, specifically Ramp, Rper and Roc. As we deal with pathological voices, mostly characterized by irregular vocal fold vibration, the numerical evaluation of these parameters has required setting up an operating procedure that takes this aspect into account. Here, we briefly describe the steps involved in the estimation of Ramp, Rper and Roc.

Ramp: in order to evaluate the vibration amplitude, a reference point has been defined, corresponding to the rest position of the folds. It coincides with the mid-point between the left and right folds at their minimum distance, and has been called the closed-fold point. Amplitude has been defined as the maximum distance of a fold from the closed-fold point during a vibration period. The value of Ramp on a frame has been evaluated as the average of the ratios between amplitudes, as evaluated in each period.

Rper: the vibration period has been assumed to correspond to the distance between two consecutive maxima of the amplitude, defined as above. The value of Rper on a frame has been evaluated as the ratio of distances, averaged over all periods that are fully visible in the frame.

Roc: the opening phase and the closing phase have been determined by means of a selective filtering of the data, aimed at removing noise-generated artifacts in a neighborhood of the opening or closing point. A noise-generated opening or closing is defined as an opening (or closing) phase that lasts less than a fixed number (three) of scanlines. The filter processes the data sequence and removes noise-generated phases. The estimate of Roc has been obtained by averaging the ratio between the length of the opening phase and the length of the following closing phase, over all pairs of phases that are fully visible in the frame.

Following [11], as well as clinicians' suggestions, other parameters are under study. With the proposed contour detection algorithm, they could be easily extracted from VKG images, to set up an exhaustive clinical picture.

3. Experimental results

The proposed approach was applied to a set of VKG recordings (Kay Elemetrics VKG Camera, Model 89001) made at the ENT Dept., Ospedale Maggiore, Milan, Italy, from one normal subject and a set of dysphonic patients. Specifically, 11 patients (6 males, 5 females, age 24–81, mean 52 years) were analysed, affected by different pathologies: leukoplakia (thickened white patches on the vocal folds), granuloma, polyp, functional dysphonia, and hypomotility of the left vocal fold (possibly due to idiopathic left laryngeal hemiplegia). One healthy male subject was considered for reference values. All subjects were asked to emit the Italian sustained vowel /a/ at comfortable pitch.

In order to have a more complete description of the pathologies, acoustic analysis was also performed for all subjects. It comprises spectrogram, fundamental frequency F0, noise and formant tracking, by means of a new robust tool for voice analysis [16]. Specifically, noise is measured by means of an adaptive version of the Normalized Noise Energy (NNE) index, named ANNE. The ratio (in dB) between the estimated spectral noise energy and the energy of the whole signal is evaluated and tracked over the whole recording. The result is a negative number that gives a measure of the noise level. Large negative values correspond to low noise.
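
The dB ratio behind the noise index can be illustrated with a short sketch. It shows only the final ratio between noise energy and total signal energy; the adaptive estimation of the spectral noise energy used by ANNE [16] is not reproduced, and the noise estimate is assumed to be already available as a sample sequence. Function names and example values are hypothetical.

```python
import math


def energy(samples):
    """Sum of squared sample values."""
    return sum(v * v for v in samples)


def noise_level_db(noise_estimate, signal):
    """Noise energy over whole-signal energy, in dB.

    The value is negative whenever the noise energy is smaller than the
    signal energy; larger negative values mean lower noise.
    """
    e_noise, e_signal = energy(noise_estimate), energy(signal)
    if e_noise <= 0 or e_signal <= 0:
        raise ValueError("both energies must be positive")
    return 10.0 * math.log10(e_noise / e_signal)


if __name__ == "__main__":
    signal = [10.0, -10.0, 10.0, -10.0]   # hypothetical voiced segment
    noise = [1.0, -1.0, 1.0, -1.0]        # hypothetical noise estimate
    print(noise_level_db(noise, signal))
```

With these example values the noise energy is one hundredth of the signal energy, so the index is about -20 dB; a noisier voice moves the value towards 0 dB.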
The VKG analysis was carried out in a C++ development environment, while the acoustic analysis for F0, noise and formant tracking was implemented in the Matlab 6.0 environment.

For each subject, each VKG image has been processed and visually inspected, to qualitatively assess the contour identification. Both the results of the first step (before the application of the snake algorithm) and those of the second step, optimized through the active contour procedure described in Section 2, were considered. The first step was found to work reasonably well in about 80% of the test cases, although there is a considerable amount of noise, which reduces the reliability of the measured parameters. In the remaining 20% of cases, the images present dark zones, which cause the detection of artefacts appearing as anomalous contours. Indeed, as the thresholding method performs only a local search, it is rather sensitive to local irregularities in brightness and contrast. In about 80% of cases, such irregularities caused only local alteration of the contour, but they made the thresholding method inadequate in the remaining 20% of images. The application of the active contour method, however, greatly reduced the presence of both noise and artefacts, allowing accurate contour detection.

Three examples are reported, showing the effectiveness of the proposed approach when applied to different pathologies. Results obtained with the reference subject confirm symmetric and regular vibration, both by visual inspection and by parameter tracking, Ramp and Rper being almost stable around a mean value equal to 1. As expected, Roc is found to oscillate a little below 1. These results are not reported here.

The first case concerns a male patient, 38 years old, a light music singer, with granuloma on the left vocal fold. As expected, acoustic analysis showed that, over the whole recording, F0 varied almost irregularly in the range 140–160 Hz, with the noise level heavily oscillating between 0 and −20 dB. Almost unstable formants were found, embedded in strong noise, especially in the harmonics spectral region (below 4 kHz). Fig. 2 shows one VKG frame, both before (left) and after (right, red line) edge contour detection. The upper side of the figures corresponds to the left vocal fold motion.
Visual inspection clearly shows a strong asymmetry due to pathology, which has been objectively quantified by the proposed technique. Quantitative results are reported in Fig. 3, relative to a sequence of 15 frames, corresponding to about 0.6 s. Strong Ramp and Roc fluctuations are observed, though around a mean value ≅ 1. Rper is almost stable here, and only slightly lower than 1.

Fig. 2. VKG images for case 1. Left: before edge contour detection; right: after edge contour detection. Notice asymmetric vibration of the vocal folds.

Fig. 3. Case 1: tracking of VKG parameters, Ramp, Rper, and Roc, on a set of 15 VKG images, along with their standard deviation.

The second case concerns another male patient, 75 years old, affected by leukoplakia on the left vocal fold. Over the whole recording, F0 was found to vary irregularly around 200 Hz, with abrupt jumps. The noise level oscillates between −5 and −25 dB, with a mean value of about −17 dB. From acoustic analysis, it was found that spectrogram and formants were characterized by strong noise and low formant energy, due to pathology. Fig. 4 shows a single VKG frame out of about 150, for about 6 s total duration of the whole recording. The upper side of the figure is relative to the left vocal fold motion. Despite strong noise and irregular vibration of the vocal folds, as clearly shown on the left-side image of Fig. 4, the proposed contour detection procedure has correctly detected the edges. Fig. 5 shows tracking of the three parameters Roc, Ramp and Rper. Notice the large fluctuations of Roc and Ramp, both around mean values <1. These parameters confirm and quantify asymmetric and irregular vocal fold vibration, linked to pathology. Mean Rper is close to 1, and almost stable, in agreement with Fig. 4.

The last example refers to a female patient, 80 years old, affected by hypomotility of the left vocal fold. For this patient, over the whole recording, F0 varied irregularly in the range 180–300 Hz, frequently showing period doubling, typical of unstable oscillation. The noise level varies around −15 dB, with an increasing trend towards 0 dB in the final part of the emission, due to increasing effort made by the patient. Almost unstable formants were found, embedded in moderate noise.

Fig. 5. Roc, Ramp, Rper tracking along about 150 VKG frames for case 2. Roc and Ramp confirm irregular vibration of the left vocal fold.

Fig. 6 shows the vocal fold oscillation as observed in one VKG image out of about 100 (4 s). The image shows that the oscillation is noticeably reduced on the left fold, thus giving a closing phase longer than the opening one. The upper side of the figure corresponds to the left vocal fold motion. On the right, the contour, successfully detected by the snake algorithm, is shown (bold line). In Fig. 7, the asymmetric and irregular behaviour of the vocal folds is objectively evaluated and tracked. Irregularity is evident, especially in the Ramp and Roc indices. Specifically, strong Roc variation is observed. This is mainly due to pathology, but perhaps to some extent also to the difficulty of keeping the endoscope fixed on the same line throughout the whole analysis, this being one of the main drawbacks of VKG. Notice also the large fluctuations of Ramp around its mean value, while Rper is almost stable and close to 1.

The results obtained with the other subjects, concerning different pathologies, were all in agreement with visual inspection of the VKG recordings. Both visual inspection of contours and objective parameter tracking have thus provided clinicians with useful details and information, also in cases not clearly distinguishable with stroboscopy alone.

Fig. 4. Case 2: edge contour detection. Left: original VKG image; right: same frame with contour detection superimposed (bold line). Vocal folds oscillate irregularly due to pathology. The VKG image is embedded in strong noise that partially masks contours.


Fig. 6. VKG image, before (left) and after (right) edge detection for case 3. Notice asymmetry of vibration and long closing phase due to pathology.

Fig. 7. Tracking of VKG parameters for case 3. A sequence of about 100 VKG frames is analysed. Notice irregular and highly varying Roc.

The examples presented here concern three cases of irregular vibration, all with vocal fold closure. The condition of incomplete vocal fold closure is under study from the VKG point of view. This in fact implies defining specific parameters for analysis, such as mean distance between folds and phase displacement of vibratory patterns, as well as finding new techniques for measuring them. Moreover, larger sets of data must be analysed, in order to optimize the procedure and set up reference values, as a valid support to diagnosis and to the evaluation of surgical effectiveness.

4. Final remarks

Kymographic imaging provides valuable information on the dynamic behaviour of the laryngeal tissues that is not so clearly distinguishable in classical stroboscopic viewing, especially in the case of early or non-specific lesions, irregular closure patterns, vocal fold weakness, and paralysis. It may also allow differentiation of weakness due to overuse, aging, paresis, or early stages of neurological conditions. The information can be used in basic research and vocal fold modelling, as well as in clinical practice, for instance in evaluating the results achieved by phonosurgery.

However, until now, no quantitative analysis of videokymographic images has been commercially available. Parameter evaluation is based on qualitative inspection of asymmetry, glottal insufficiency and vibration amplitude. The physician, taking distances (in mm) on printed images, can make only rough quantitative measurements. This implies the necessity of using only ratios between measurements, to avoid calibration of images. Besides being time consuming, this method prevents comparison among different subjects and pathologies. Hence, automatic evaluation and tracking of objective parameters extracted from VKG images is of great interest.

This paper focused on providing a basic set of such parameters, by means of a two-step active contour technique for edge detection. Specific adjustments of the general-purpose methods were also required, in order to deal with the highly varying and noisy images under study. Current research focuses on refinements of the proposed technique, as well as on the estimation of a wider set of parameters, such as asymmetric vibration and phase displacement of the vibratory pattern of the vocal folds. A user-friendly interface is also under construction, with the aim of making the analysis fully automatic and allowing easy storage and retrieval of patient data.

References

[1] J.G. Švec, H.K. Schutte, Videokymography: high-speed line scanning of vocal fold vibration, J. Voice 10 (1996) 201–205.
[2] H.K. Schutte, J.G. Švec, F. Šram, Videokymography: research and clinical issues, Log. Phon. Vocol. 22 (4) (1997) 152–156.
[3] M. Tigges, P. Mergell, H. Herzel, T. Wittenberg, U. Eysholdt, Observation and modelling glottal biphonation, Acustica/Acta Acustica 83 (1997) 707–714.
[4] H. Larsson, S. Hertegard, P.-A. Lindestad, B. Hammarberg, Vocal fold vibrations: high-speed imaging, kymography and acoustic analysis, Laryngoscope 110 (2000) 2117–2122.
[5] D. Deliyski, P. Petrushev, Methods for objective assessment of high-speed videoendoscopy, in: Proceedings of the 6th International Conference on Advances in Quantitative Laryngology, Voice and Speech Research AQL2003, Hamburg, Germany, 2003, CD-ROM.
[6] C. Manfredi, L. Bocchi, G. Peretti, First results on quantitative analysis of videokymographic images, in: MEDICON'04 Conference, Ischia Island, Italy, 2004, CD-ROM.
[7] C. Manfredi, L. Bocchi, N. Migali, G. Cantarella, Quantitative analysis of videokymographic images and audio signals in dysphonia, in: XVIII IFOS World Congress, Rome, Italy, 2005, CD-ROM.
[8] C. Manfredi, G. Cantarella, L. Bocchi, B. Maraschi, Hoarse voice analysis: comparing videokymographic and acoustic data, in: 6th Pan European Voice Conference, PEVOC6, London, UK, 2005, p. 89.
[9] S. Bianchi, L. Bocchi, C. Manfredi, G. Cantarella, N. Migali, Objective vocal fold vibration assessment from videokymographic images, in: Proceedings of the 4th International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, Firenze, Italy, 2005, pp. 121–124.
[10] G. Peretti, C. Piazza, M. Giudice, C. Balzanelli, C. Mensi, M. Rossini, Videokymography, Acta Phon. Lat. 24 (2001) 71–77.
[11] Q. Qiu, H.K. Schutte, L. Gu, Q. Yu, An automatic method to quantify the vibration properties of human vocal folds via videokymography, Folia Phoniatrica et Logopaedica 55 (2003) 128–136.
[12] J.G. Švec, F. Šram, Kymographic imaging of the vocal folds oscillations, in: Proceedings of ICSLP-2002, vol. 2, Denver, CO, USA, 2002, pp. 957–960.
[13] K.-W. Cheung, D.-Y. Yeung, R.T. Chin, On deformable models for visual pattern recognition, Pattern Recog. 35 (2002) 1507–1526.
[14] M. Kass, A. Witkin, D. Terzopoulos, Snakes: active contour models, Int. J. Comput. Vision 1 (1988) 321–331.
[15] T. McInerney, D. Terzopoulos, Deformable models in medical image analysis: a survey, Med. Image Anal. 1 (1996) 91–108.
[16] C. Manfredi, G. Peretti, A new insight into post-surgical objective voice quality evaluation: application to thyroplastic medialisation, IEEE Trans. Biomed. Eng. 53 (3) (2006) 442–451.