Computerized Medical Imaging and Graphics PERGAMON
Computerized Medical Imaging and Graphics 23 (1999) 323–330 www.elsevier.com/locate/compmedimag
Imaging of vocal fold vibration by digital multi-plane kymography
M. Tigges*, T. Wittenberg, P. Mergell, U. Eysholdt
Department of Phoniatrics and Pediatric Audiology, Friedrich-Alexander University of Erlangen-Nürnberg, Bohlenplatz 21, D-91054 Erlangen, Germany
Received 6 April 1999; accepted 16 August 1999
Abstract
Digital multi-plane kymography is presented as a new method to demonstrate vocal fold vibration from digital high-speed recordings. Single lines from digital high-speed sequences of laryngoscopic examinations are concatenated to images, which are called kymograms. In order to reveal anterior–posterior (AP) modes of vibration, several kymograms from different locations of the glottis can be obtained from a single recording. Problems due to rotation of the endoscope or relative movements of patient or examiner can be solved by image processing algorithms specifically designed for this application. Different types of phonation onset and examples of voice disorders are given. © 1999 Elsevier Science Ltd. All rights reserved.
Keywords: Voice; Vocal cords; Physiology of voice; High-speed recording; Kymography; Image processing
* Corresponding author. Tel.: +49-9131-853-2603; fax: +49-9131-8539272. E-mail address: [email protected] (M. Tigges)
1. Introduction
Direct observation of vocal fold vibration is essential for the diagnosis of voice disorders. During phonation the vocal folds vibrate at a frequency of about 100 Hz in men and 200 Hz in women. Such rapid oscillation can be observed only by means of technical devices, as the temporal resolution of human visual perception is limited to frequencies of about 20 Hz. In clinical practice, stroboscopy is generally used for the examination of vocal fold vibration [1,2]: phase-coupled flashlamp illumination, adapted from technical to medical application, provides a virtual slow motion of vocal fold vibration. For methodological reasons the application of stroboscopy is restricted to periodic processes. However, hoarseness is frequently related to aperiodicity of vocal fold vibration [3,4]. According to Shannon's sampling theorem, observing an oscillation without aliasing effects requires a sampling rate of more than twice the highest frequency that is to be captured [5]. This condition is fulfilled by high-speed recordings that allow frame rates of up to 10 000 fps (frames per second).
Another method to demonstrate the fast movements of the vocal folds by photographic means, called photokymography, was developed as early as 1971 [6]. The principle is to line up strip-shaped selections of an image derived from a camera with a focal-plane shutter that moves in front of the film during exposure. The resulting picture represents the movements of a certain part of an object over a period of time. Since the first application of the photokymographic method to vocal fold vibration many improvements have been reported: on the focal-plane shutter [7,8], on strip kymography [9], microscopic kymography [10] and quantitative evaluation [11,12]. With strip kymography only a selected segment of the glottis is depicted while the film is moved behind the focal-plane shutter [9]. Microkymography allows depiction of a maximum of 10 cycles (about 50 ms for 200 Hz as fundamental frequency of phonation) [10]. Nevertheless, the methods were used in research laboratories only, because the photographic procedure, the delayed development process and the time-consuming quantitative evaluation by hand were too cumbersome for clinical application.
Substantial progress was achieved by videokymography [13,14]. By reading out only one single line of a conventional video camera image, a cheap and efficient one-dimensional high-speed recording system was designed, which delivers up to 8000 lines/s in black and white. A monitor shows the lines recorded successively during examination. At a recording rate of 7812.5 fps each video frame contains the information of a single line over a time period of 18.4 ms. Each line has the full resolution of a video recording (approximately 720 pixels/line). With this technique only one dimension (the horizontal recording line) is recorded at high speed, instead of a full-frame recording at ordinary slow speed.
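The sampling condition quoted from Shannon's theorem in the Introduction can be checked numerically. The following is a minimal sketch, not part of the original method; the function names and the 25 fps comparison value are assumptions chosen for illustration only.

```python
def min_frame_rate(highest_frequency_hz: float) -> float:
    """Lower bound that the frame rate must exceed to avoid aliasing."""
    return 2.0 * highest_frequency_hz


def is_alias_free(frame_rate_fps: float, highest_frequency_hz: float) -> bool:
    """True if the frame rate is more than twice the highest frequency."""
    return frame_rate_fps > min_frame_rate(highest_frequency_hz)


def apparent_frequency(signal_hz: float, frame_rate_fps: float) -> float:
    """Frequency at which a sampled sinusoid appears to oscillate
    (folded into the range 0 .. frame_rate_fps / 2)."""
    return abs(signal_hz - frame_rate_fps * round(signal_hz / frame_rate_fps))


if __name__ == "__main__":
    # 200 Hz phonation recorded at 10 000 fps: condition fulfilled.
    print(is_alias_free(10_000, 200.0))
    # Hypothetical 25 fps recording of a 101 Hz vibration: aliased,
    # the motion appears as a slow 1 Hz oscillation.
    print(is_alias_free(25, 101.0), apparent_frequency(101.0, 25.0))
```

The second case only illustrates why a frame rate far below the vibration frequency cannot represent aperiodic motion faithfully; it is not a model of the stroboscopic equipment described in the text.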
0895-6111/99/$ - see front matter © 1999 Elsevier Science Ltd. All rights reserved. PII: S0895-6111(99)00030-0
However, all of the kymographical methods mentioned have restrictions that limit their practical use in the examination of vocal fold vibration:
• Anterior–posterior modes of vibration cannot be demonstrated.
• Distortion of the dorso-ventral axis of the glottis, which appears when the endoscope cannot be introduced exactly in the sagittal direction, leads to a bias of the spatial parameters (in videokymography this problem can only be compensated by manual rotation of the camera during the recording [15]).
• Quantitative evaluation is difficult and time-consuming since the data available are analog.
• Documentation on videotape or photograph is tedious and inconvenient for patient files.
In this paper a combination of the digital high-speed recording technique and the kymographical principle is introduced. The method is intended to overcome the problems mentioned and is applied to selected examples of normal subjects and patients.
2. Method
2.1. High-speed glottography
Fig. 1. Subsequent single frames of a high-speed recording of the vocal folds during a glottal cycle (healthy subject, male, 26 years, recording rate: 1922 fps, resolution: 128 × 64 pixels). The first frame shows the maximum opening. During the following frames, the glottal aperture decreases progressively. The glottis is closed from the second frame in the second row until the first frame of the third row. In the last three frames, the glottis opens again up to the maximum.
A digital high-speed camera is used which, in its current version, delivers a maximum spatial resolution of 256 × 256 pixels at 10 000 fps with 8 bit gray values (the system costs US$20 000 at the current configuration level) [15]. The video frames of the camera are continuously written into a circular memory of 256 MB, which is equivalent to about 8 s of recording time. The recording is stopped by a manual trigger signal. For further evaluation the digital image sequences are transferred from the memory to a commercial PC. A selection of subsequent single frames taken from a glottal cycle of a healthy male subject is given in Fig. 1.
2.2. Digital multi-plane kymography
As the high-speed sequences are already available in digital format, image processing procedures can be applied to the recordings [16,17]. In order to generate digital kymograms a special algorithm was designed. With the aid of this algorithm single lines from each frame of the high-speed recording are concatenated to a digital kymogram (Fig. 2).
Fig. 2. Principle of kymography from digital high-speed recordings. Single scan lines from subsequent recordings are concatenated to an image by a line scanning algorithm.
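The concatenation principle of Fig. 2 can be illustrated with a short array operation. This is a minimal sketch under stated assumptions, not the authors' program: the recording is assumed to be already loaded as a NumPy array of shape (number of frames, lines, columns), the function name `digital_kymogram` is hypothetical, and random data stand in for a real recording.

```python
import numpy as np


def digital_kymogram(frames: np.ndarray, y: int) -> np.ndarray:
    """Concatenate scan line y of every frame into one image.

    frames: array of shape (num_frames, m, n) with 8 bit gray values
            (m lines, n columns per frame).
    Returns an array of shape (num_frames, n): row k is line y of frame k.
    """
    if frames.ndim != 3:
        raise ValueError("expected an array of shape (num_frames, m, n)")
    num_lines = frames.shape[1]
    if not 0 <= y < num_lines:
        raise IndexError(f"scan line {y} outside 0..{num_lines - 1}")
    return frames[:, y, :].copy()


# Synthetic stand-in for a recording: 200 frames of 64 lines x 128 columns.
rng = np.random.default_rng(0)
frames = rng.integers(0, 256, size=(200, 64, 128), dtype=np.uint8)
kymogram = digital_kymogram(frames, y=32)
print(kymogram.shape)  # (200, 128)
```

Because the whole sequence is kept in memory, any other scan line can be extracted afterwards by calling the function again with a different `y`.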
To obtain descriptive kymograms the following properties of laryngeal endoscopy have to be considered:
• Torsion of the main glottal axis (connecting line between the anterior and posterior commissures) (Fig. 3). During phonation the vocal folds vibrate primarily in the medio-lateral direction. However, the examination conditions (high tongue base, gag reflex) do not always allow the endoscope to be introduced exactly in the dorso-ventral direction. Therefore, in real recordings the upright image axis and the main glottal axis often differ from each other (Fig. 3). Thus simple concatenation of horizontal lines would place non-corresponding locations of the right and left vocal fold opposite each other in the kymogram lines.
• Glottal shift: during examination, relative movements between endoscope and larynx are inevitable. Consequently, the glottal area often shifts within the image. A horizontal shift of the glottis in the picture appears in the kymogram as a wavelike movement of the glottal midline (Figs. 4–6). This is not disadvantageous for the visual assessment as the reason is easily recognized. However, a shift in other directions can lead to erroneous evaluations: in this case the concatenated lines lie next to each other in the kymogram although they do not come from identical positions of the glottis. This is important above all for the judgment of amplitudes of vocal fold vibration, since these differ at different glottal positions.
Considering these problems, the digital high-speed recordings need further processing. A program that renders reliable kymograms needs two parameters:
• the index $y$ of the scan line in each single frame of the image sequence;
• the rotation angle $\varphi$ between the camera and the main glottal axis.
A digital gray-scale image of the larynx is an $n \times m$ matrix of 8 bit values ($n$ columns, $m$ lines).
Fig. 3. Single frame obtained from a high-speed recording. The solid line indicates the position of the kymographic evaluation. It is perpendicular to the principal axis of the glottis (dotted line), which was calculated after segmentation of the glottal area by means of an image processing program.
The case of $0^\circ < \varphi \le 15^\circ$ can be regarded as equivalent to $\varphi = 0$, because under these conditions the line perpendicular to the main glottal axis lies within the height of one line, especially at resolutions in the range of 128 pixels. In this case the digital kymogram is constructed by concatenation of horizontal lines from successive frames. The line number $y$ can be chosen arbitrarily, $0 < y < m$, but is held constant for each image of the video sequence. If $\varphi > 15^\circ$, the kymogram is constructed from lines that run perpendicular to the main glottal axis (Fig. 3). Depending on the parameters $y$ and $\varphi$, these lines are computed automatically from each image frame, using a digital line scanning algorithm [18]. Consecutive perpendicular lines are then concatenated to angle-corrected digital kymogram images [19]. In the program the scan line $y$ and the angle $\varphi$ have to be selected manually in the first image. Since each line of pixels can be considered as an $m$-vector of 8 bit gray values, a kymogram is a discrete time series of such vectors. The time difference between one vector and its neighbor is $\Delta t = 1/F_s$, where $F_s$ denotes the recording rate. If the relevant phases of phonation (adduction of the vocal folds, prephonatory standstill, phonation onset and steady state of vibration) are included in the recording, a typical duration for the whole recording is 1 s (e.g. about 4000 frames at a recording rate of 4000 fps, which is used in females because of the higher pitch). The final kymogram shows corresponding positions of the vocal folds directly opposite each other, independent of rotation or transversal shift of the glottis. A digital kymogram is a two-dimensional, 8 bit gray-scale image which shows the movements of a small part of the object over a certain length of time. For better readability the kymograms are rotated 90° counterclockwise (Fig. 4, kymogram corresponding to Fig. 1). The maximum duration of a digital kymogram is limited by the memory capacity of the high-speed camera (a digital kymogram can contain up to 130 000 lines, which corresponds to up to 8 s of recording time). This is sufficient to capture the relevant phases of phonation onset and even to register several syllables in a single kymogram. Each line of a frame can be selected to be depicted in a kymogram. From an individual recording theoretically $n$ kymograms can be derived immediately after the recording ($n$ = number of lines in a single frame). Computation time is approximately 30 s and depends mainly on the speed of the processor and the storage medium involved. The kymograms can be printed on a laser printer and stored for documentation in a patient's file.
Fig. 4. Example of a kymogram with ($y$, $\varphi = 26.9^\circ$): vocal fold vibration during normal initiation in a normal subject (corresponding to Figs. 1 and 2); total time displayed is 500 ms. The left vocal fold is on top, the right vocal fold below. On the left side the adduction movement of the vocal folds from respiration position to phonation position is depicted, followed by a period of prephonatory standstill. The vocal folds are closed. During this phase the false vocal folds show a discrete adduction movement, too. However, the view of the true vocal folds is never obstructed. After the beginning of vibration, it takes approximately 6 cycles to reach the maximum amplitude (voice onset) and steady state of vibration.
Fig. 5. Kymogram of a hard initiation by a normal subject (26-year-old male, recording rate: 1922 fps, resolution: 128 pixels/line, total time: 500 ms). After adduction of the vocal folds, a longer period of prephonatory standstill than in normal initiation can be observed. In addition, there is an adduction movement of the false vocal folds, which temporarily cover the vocal folds.
2.3. Multi-plane concatenation
If every part of the glottis moves in the same direction, computation of a single line is sufficient to demonstrate vocal fold vibration [19,22]. Extraction of digital high-speed kymograms from digital high-speed sequences reduces the spatial information from two dimensions to one dimension. The advantage of full-frame high-speed examination is that it provides information about the whole length of the glottis simultaneously. For example, female voices with hyperfunctional dysphonia often exhibit a lack of glottal closure in the posterior third, whereas incomplete closure in the anterior third is typical for male voices with hypofunctional dysphonia [20]. While medio-lateral vibration is adequately represented by conventional kymography, anterior–posterior (AP) modes, which are important for certain voice disorders [21], cannot be detected by this procedure. Therefore, additional scanning of several positions of the glottis (anterior, median, posterior third) is desirable.
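The angle correction described in Section 2.2 can be sketched as follows. This is an illustrative approximation, not the authors' implementation: it samples the rotated scan line at equidistant points with nearest-neighbour rounding instead of the digital line scanning algorithm cited in the text [18], and the names `scan_line_coords`, `angle_corrected_kymogram` and `x_center`, as well as the sign convention of the rotation, are assumptions.

```python
import numpy as np


def scan_line_coords(n_cols, y, phi_deg, x_center=None):
    """Pixel coordinates of a scan line through (x_center, y) that is
    perpendicular to a glottal axis rotated by phi_deg against the
    vertical image axis (sign convention is an assumption)."""
    if x_center is None:
        x_center = (n_cols - 1) / 2.0
    phi = np.deg2rad(phi_deg)
    # Signed distance of each sample from the centre of the scan line.
    s = np.arange(n_cols) - (n_cols - 1) / 2.0
    xs = x_center + s * np.cos(phi)
    ys = y - s * np.sin(phi)
    return np.rint(xs).astype(int), np.rint(ys).astype(int)


def angle_corrected_kymogram(frames, y, phi_deg, threshold_deg=15.0):
    """Kymogram of shape (num_frames, n) from frames of shape
    (num_frames, m, n); small angles are treated as phi = 0."""
    _, m, n = frames.shape
    if abs(phi_deg) <= threshold_deg:
        return frames[:, y, :].copy()  # plain horizontal lines
    xs, ys = scan_line_coords(n, y, phi_deg)
    # Samples that would fall outside the frame reuse the border pixel.
    xs = np.clip(xs, 0, n - 1)
    ys = np.clip(ys, 0, m - 1)
    return frames[:, ys, xs]


rng = np.random.default_rng(0)
frames = rng.integers(0, 256, size=(50, 65, 129), dtype=np.uint8)
flat = angle_corrected_kymogram(frames, y=32, phi_deg=10.0)
tilted = angle_corrected_kymogram(frames, y=32, phi_deg=30.0)
print(flat.shape, tilted.shape)  # (50, 129) (50, 129)
```

Keeping the number of samples per line equal to the image width gives every kymogram the same size regardless of the angle; a production version would more likely determine the axis from the segmented glottal area, as indicated in Fig. 3.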
3. Results
3.1. Phonation onset
Phonation onset (initiation) is an important characteristic of a voice and is used for the classification of functional voice disorders [20]. At the beginning of vibration a high degree of aperiodicity is observed, as the strength of the expiratory air stream and the laryngeal muscle tension are not yet matched at this stage of phonation. Quantitative evaluation of phonation onset is possible from kymograms by defining parameters for the description of amplitude growth [22]. Fig. 4 shows a typical kymogram of a normal phonation onset in a healthy male subject. In the time interval between 0 and 100 ms the prephonatory adduction movement is visible. The dark triangle corresponds to the temporal change of the glottal opening; the brighter stripes in the image, above and below the dark area in the center (glottis), are the vocal folds. They are bordered laterally on the right and left by the Morgagni sinus (smooth dark stripe) and the false vocal folds (brighter stripe). From 0 to 100 ms the vocal folds close the glottis; from 100 to 300 ms the glottis remains closed. From 300 to 370 ms there is the onset of vibration, and from 370 ms onwards a steady state of vibration. The false vocal folds do not vibrate. They change their position only during the phase of prephonatory standstill. Then they approach the midline and, as a result, almost completely cover the vocal folds (around 200 ms). At the end of this stage they move apart and return to the initial position.
In comparison to normal initiation, two other types of initiation have been examined. A normal subject was asked to phonate the vowel “i” using normal, hard and breathy onsets. Perceptual rating confirmed that the types of initiation could be performed correctly by untrained normal subjects after a single demonstration [22,23]. At the beginning of a hard initiation (Fig. 5) the vocal folds move from respiratory position to phonatory position. However, the phase of prephonatory standstill is longer than in normal initiation. In this period a very bright strip is visible on the right vocal fold, which corresponds to light reflection due to superficial mucus. The false vocal folds show a distinct movement toward the center of the glottis, temporarily covering the vocal folds completely. After the beginning of vibration the steady-state amplitude is reached earlier than in normal initiation. In a breathy initiation (Fig. 6) the vocal folds move from the lateral position to the center of the glottis, but do not touch each other. The glottal closure remains incomplete during the phase of prephonatory rest. Vibration begins after a standstill of 100 ms and needs approximately 5 periods to reach the maximum amplitude. An irregular vibration can be seen at 270 ms.
Fig. 6. Kymogram of a breathy initiation by a normal subject (26-year-old male, recording rate: 1922 fps, spatial resolution: 128 pixels/line, total time: 500 ms). In contrast to normal and hard initiation there is no glottal closure after adduction of the vocal folds. The period of prephonatory standstill is shorter than in normal initiation. There is no adduction of the false vocal folds.
Fig. 7. Kymogram of a patient with paralysis of the right vocal fold (female, 47 years, recording rate: 3707 fps, spatial resolution: 128 pixels/line, total time displayed: 90 ms). At about 10 ms the vocal folds show no glottal closure: as long as the vocal folds touch each other, their frequencies are identical; as soon as no glottal closure can be achieved, the left vocal fold vibrates twice while the right one vibrates only once (marked by white arrows). The number of periods of vibration between the irregularities varies between 7 and 8.
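The time intervals quoted for the phonation onset in Section 3.1 can be related to kymogram line indices through the time step between neighboring lines, which is the reciprocal of the recording rate. The sketch below is only a worked example of that arithmetic; the helper names are hypothetical, and the 1922 fps value is the recording rate stated in the figure captions.

```python
def line_index_to_time_ms(index: int, frame_rate_fps: float) -> float:
    """Time of kymogram line `index` in milliseconds (index * 1/Fs)."""
    return 1000.0 * index / frame_rate_fps


def time_to_line_index(t_ms: float, frame_rate_fps: float) -> int:
    """Nearest kymogram line for a time given in milliseconds."""
    return int(round(t_ms * frame_rate_fps / 1000.0))


FRAME_RATE = 1922.0  # fps, as stated for Figs. 1 and 4-6

# Phases of the normal onset described for Fig. 4 (boundaries in ms).
phases = {
    "adduction": (0, 100),
    "prephonatory standstill": (100, 300),
    "onset of vibration": (300, 370),
    "steady state": (370, 500),
}
for name, (start_ms, end_ms) in phases.items():
    first = time_to_line_index(start_ms, FRAME_RATE)
    last = time_to_line_index(end_ms, FRAME_RATE)
    print(f"{name}: lines {first}..{last}")
```

At this recording rate the 500 ms shown in the kymograms correspond to 961 lines, and one line covers about 0.52 ms.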
3.2. Aperiodic vibration
A specific application for the high-speed technique is the observation of aperiodic vibrations, which can occur in asymmetric vocal folds combined with incomplete glottal closure. Asymmetry is characterized by differences of either mass or tension of the vocal folds. A clinical example is given in Fig. 7. A female patient suffered from roughness of voice due to paralysis of the right vocal fold. In vocal fold paralysis the vocal fold involved loses tension and mass due to muscle atrophy. Although both vocal folds in this example have identical fundamental frequencies, there is a second, long-term periodicity: every 7–8 cycles a variation of frequency and amplitude appears, which results in a temporarily incomplete glottal closure. The coupling force between the vocal folds is able to synchronize both vibrations for some cycles, but not for more than 7–8 periods of phonation. The perceptual equivalent of this phenomenon is a rough voice.
3.3. Anterior–posterior mode of vibration
An example of an AP mode of vibration is demonstrated in Figs. 8 and 9. A 26-year-old female patient presented with hoarseness. Single frames of a vibration cycle show a strong AP mode of the left vocal fold (Fig. 8). A single-plane kymogram would not be able to detect this mode. An attempt to demonstrate the AP movement is made in Fig. 9, where kymograms of different locations are aligned with each other. All of the kymograms show marked variation of amplitude and frequency as well as a clear right–left asymmetry. The frequency of the left (upper) vocal fold differs irregularly from that of the right vocal fold. The AP mode shown in Fig. 8 cannot be recognized from one single kymogram only, no matter which line is chosen for kymographic evaluation. Exact comparison of the kymograms shows different positions of the vocal folds at a given time (for example the period between the white lines in Fig. 9).
Fig. 8. Subsequent single frames from a period of vibration in a 26-year-old female patient who complained of hoarseness (recording rate: 3703 fps, spatial resolution: 128 × 64 pixel/line). The left vocal fold exhibits an AP mode of vibration besides the common medio-lateral mode: in some frames the anterior and posterior thirds are lateral while the middle one is still medial (white arrows). The right vocal fold moves in the medio-lateral direction only.
4. Discussion
The kymographical principle, implemented by various techniques of line scanning, is appropriate for the demonstration of laryngeal movement. Vocal fold vibration can be judged from kymograms in terms of its symmetry and periodicity [7,8]. Kymographic imaging allows immediate orientation about vocal fold vibration during phonation. This way of depicting vocal fold vibration is easily recognized by the medical observer. In addition, supraglottal structures such as the false vocal folds are included in the images. Conventional methods of photokymography either scan the whole length of the glottis at different times [9,10] or demonstrate a fixed line [6,8,14]. Digital multi-plane kymograms are easily obtained by variation of an algorithm designed for image processing in high-speed examination of the larynx. The compilation of kymograms from digital high-speed recordings is the only method that allows an arbitrary scan line to be chosen from the picture after the recording. With digital kymography a large number of cycles can be demonstrated, as the length of a kymogram is limited only by the duration of a high-speed recording (e.g. 800 cycles for 10 000 fps with $F_0 = 100$ Hz; recording time 8 s). Therefore, even irregularities of vibration that occur less often than every 5 or 6 cycles can be detected. An image processing algorithm corrects the laryngeal rotation in the picture due to distorted positioning of the endoscope during examination. Signal processing techniques allow quantitative evaluation for the extraction of temporal and spatial parameters of vocal cord vibration. To visualize AP modes several positions of the glottis have to be depicted at the same time. Up to now, repeated examinations or a shift of the inspection angle during the endoscopic examination have been tried in an effort to overcome these problems to some extent [15]. None of the former methods was able to deliver synchronous kymograms of different locations. This is only possible with digital multi-plane kymography derived from high-speed recordings. Kymograms of arbitrary locations can be arranged in columns to feature different modes of vibration (Fig. 9). A digital multi-plane kymogram is available immediately after examination. Its similarity to the familiar laryngeal picture makes it very convenient for quick orientation about vocal fold vibration in the outpatient clinic, because physicians are generally better trained in imaging procedures than in numerical evaluation. Printouts of the kymograms are cheap and easily obtainable for documentation in the patient's file.
5. Conclusions
Digital kymography requires a high-speed camera recording system and a special image processing program. The following properties of digital kymography from high-speed recordings make it an appropriate tool for the examination of vocal fold vibration in clinical practice:
• The position of the kymographic line is optional and the choice can be made after the recording.
• Several kymograms of different locations of the vocal folds can be generated to depict AP modes.
• Kymograms, which depict vocal fold vibration of up to 8 s duration, can be computed to demonstrate even infrequent aperiodicities.
• Correction of a rotated view of the larynx is performed by an image processing algorithm.
• Immediate availability.
• Printing for documentation in patients' files on an office printer instead of video print or photograph.
6. Summary
Digital multi-plane kymography is presented as a method to demonstrate the movements of vocal fold vibrations from high-speed recordings. Application of the kymographic principle allows depiction of the course of the medio-lateral movement of the vocal folds. Single lines from images of digital high-speed sequences are concatenated to kymograms with correction of the principal axis of the glottis. Arbitrary lines can be chosen for computation to depict asymmetries and irregularities of vibration at any location of the vocal fold. Anterior–posterior modes can be visualized easily by digital multi-plane kymography, whereas other methods of kymography are unsuitable for registering AP modes of vibration simultaneously. Therefore an immediate visual judgment of vocal fold vibration is possible. Similarity to the endoscopic view and quick availability make kymography a convenient tool for the outpatient clinic. Prints can be made for documentation in patients' files.
Fig. 9. Kymograms of different locations (line numbers: 54, 64, 74, 90) from the same patient as in Fig. 8. The image on top shows a single frame obtained from the recording in Fig. 8. The horizontal lines that have been chosen for kymographic evaluation are marked. The corresponding kymograms are given below (total time displayed is 55 ms). All of the kymograms show marked variation of amplitude and frequency as well as a clear right–left asymmetry. The frequency of the left (upper) vocal fold differs irregularly from that of the right vocal fold, e.g. phases with period doubling from 30 ms onwards. The AP mode shown in Fig. 8 cannot be recognized from one single kymogram only, no matter which location is chosen for the kymographic line. Exact comparison of the kymograms shows different positions of the vocal folds at a given time (for example, the periods between the marked vertical lines).
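The multi-plane arrangement of Fig. 9 can be sketched by extracting several scan lines from the same recording and stacking the resulting kymograms on a common time axis. This is an illustration under assumptions rather than the authors' code: the function names are hypothetical, random data replace the patient recording, and the frame geometry (128 lines of 64 pixels, 204 frames for roughly 55 ms at 3703 fps) is chosen only so that the line numbers of Fig. 9 exist in the synthetic frames.

```python
import numpy as np


def multi_plane_kymograms(frames, line_numbers):
    """Return one kymogram per scan line.

    frames: array (num_frames, m, n); line_numbers: sequence of line indices.
    Result has shape (len(line_numbers), num_frames, n).
    """
    frames = np.asarray(frames)
    if frames.ndim != 3:
        raise ValueError("expected an array of shape (num_frames, m, n)")
    lines = np.asarray(line_numbers, dtype=int)
    if lines.size == 0 or lines.min() < 0 or lines.max() >= frames.shape[1]:
        raise IndexError("scan line outside the frame")
    # Select all requested lines at once, then put the plane axis first.
    return np.transpose(frames[:, lines, :], (1, 0, 2))


def align_for_display(kymograms):
    """Rotate each kymogram 90 degrees counterclockwise (time runs left to
    right) and stack the planes vertically on a shared time axis."""
    return np.vstack([np.rot90(k) for k in kymograms])


rng = np.random.default_rng(1)
frames = rng.integers(0, 256, size=(204, 128, 64), dtype=np.uint8)
planes = multi_plane_kymograms(frames, [54, 64, 74, 90])
panel = align_for_display(planes)
print(planes.shape, panel.shape)  # (4, 204, 64) (256, 204)
```

Since all planes come from the same frames, each column of the stacked panel refers to the same instant in every kymogram, which is what makes a comparison of vocal fold positions at a given time possible.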
References
[1] Schönhärl E. Die Stroboskopie in der praktischen Laryngologie. Stuttgart: Thieme, 1960.
[2] Faure MA, Nuller A. Stroboscopy. J Voice 1992;6:139–48.
[3] Hirano M. Objective evaluation of the human voice: clinical aspects. Folia Phoniatr 1989;41:89–144.
[4] Isshiki N, Takeuchi S. Factor analysis of hoarseness. Stud Phonol 1970;5:37.
[5] Shannon CE, Weaver W. The mathematical theory of communication. 10th ed. Urbana, Ill.: University of Illinois Press, 1964.
[6] Gall V, Gall D, Hanson J. Larynx-Fotokymographie. Arch Ohr-, Nas- und Kehlk Heilk 1971;20:34–41.
[7] Gross M. Larynxfotokymographie. Sprache-Stimme-Gehör 1985;9:112–3.
[8] Gross M. Endoskopische Larynxfotokymographie. Bingen: Renate Gross Verlag, 1988.
[9] Gall V. Strip kymography of the glottis. Arch Otorhinolaryngol 1984;240(3):287–93.
[10] Schultz-Coulon JH. Mikrofotokymographie des Kehlkopfes. Sprache-Stimme-Gehör 1990;14:4–10.
[11] Gall V, Hanson J. Bestimmung physikalischer Parameter der Stimmlippenschwingungen mit Hilfe der Larynxkymographie. Folia Phoniatr 1973;25:450–9.
[12] Gall V. Fotokymografische Befunde bei funktionellen Dysphonien, Kehlkopflähmungen und Stimmlippentumoren. Folia Phoniatr 1978;30:28–35.
[13] Švec JG, Schutte HK. Videokymography: high-speed line scanning of vocal fold vibration. J Voice 1996;10(2):201–5.
[14] Schutte HK, Švec JG, Šram F. First results of clinical application of videokymography. Laryngoscope 1998;108(8 Pt 1):1206–10.
[15] Eysholdt U, Pröschel U, Tigges M. Direct evaluation of highspeed recordings of vocal fold vibrations. Proceedings of The Third International Symposium on Phonosurgery, Kyoto, Japan, 26–28 June 1994.
[16] Wittenberg T, Moser M, Tigges M, Eysholdt U. Recording, processing, and analysis of digital high-speed sequences in glottography. Machine Vision Appl 1995;(8):399–404.
[17] Eysholdt U, Tigges M, Wittenberg T, Pröschel U. Direct evaluation of highspeed recordings of vocal fold vibrations. Folia Phoniatr 1996;48:163–70.
[18] Bresenham JE. Algorithm for computer control of a digital plotter. IBM Syst J 1965;4:24–30.
[19] Wittenberg T. Automatic motion extraction from laryngeal kymograms. In: Wittenberg T, Mergell P, Tigges M, Eysholdt U, editors. Quantitative Laryngoscopy, 2nd Round Table, Advances in Quantitative Laryngoscopy using Motion-, Image- and Signal Analysis, Erlangen, 1997. pp. 21–8.
[20] Fawcus M, editor. Voice disorders and their management. San Diego, CA: Singular Pub Group, 1991.
[21] Hess MM, Herzel H, Koster O, Scheurich F, Gross M. Endoscopic imaging of vocal cord vibrations. Digital high speed recording with various systems. HNO 1996;44(12):685–93.
[22] Mergell P, Herzel HP, Wittenberg T, Tigges M, Eysholdt U. Phonation onset: vocal fold modelling and high-speed glottography. J Acoust Soc Am 1998;103(6):000 in press.
[23] Tigges M, Wittenberg T, Rosanowski F, Eysholdt U. High-speed imaging and image processing in voice disorders. In: Foth H-J, Marchesini R, Podbielska H, editors. Optical and Imaging Techniques in Biomonitoring II, Proc SPIE, Washington, 2927, 1996. pp. 209–16.
Monika Tigges studied medicine at the University of Bonn, Germany, and received her MD there in 1986. She worked as a resident at the ENT-Hospital of Bonn University. Since 1992 she has worked in the Department of Phoniatrics and Pediatric Audiology at the University of Erlangen-Nuremberg, Germany. She finished her specialization in ENT in 1990 and in Phoniatrics and Pediatric Audiology in 1995. In 1999 she received her PhD in Phoniatrics and Pediatric Audiology. Her current research interests concentrate on voice disorders, endoscopic high-speed recording, and biomechanical modelling.
Thomas Wittenberg started his studies of Computer Science at Christopher Newport College, Newport News, Virginia, and received his Diploma in 1992 from the University of Erlangen. From 1992 to 1993 he worked at the Fraunhofer Research Company in the field of industrial image processing. In 1993 he became a research associate in the Department of Phoniatrics and Pediatric Audiology at the University of Erlangen-Nuremberg. He finished his doctoral thesis on motion analysis of vocal cord vibration and high-speed imaging in 1998. His research interests include medical pattern recognition and analysis, motion detection and neural networks.
Patrick Mergell was born in Neustadt an der Weinstraße, Germany, in 1970. He studied physics at Mainz (Germany) and at Marseille (France). He received his diploma in 1995 from the Johannes Gutenberg University of Mainz. From 1995 to 1998 he was a research associate at the Department of Phoniatrics and Pediatric Audiology, University of Erlangen-Nuremberg, at the Institute of Theoretical Physics, Technical University of Berlin, and at the Institute of Theoretical Biology, Humboldt University at Berlin. He received the PhD degree in physics in 1998. Since 1999 he has worked in the research and development department of Siemens Audiological Technology, Erlangen, Germany.
Ulrich Eysholdt was born in 1949 in Göttingen. He finished his studies in Applied Physics and Medicine and was a PhD student of Manfred R. Schroeder. He received degrees of both MD and PhD in Göttingen and specialized in the fields of ENT, Phoniatrics and Medical Physics. At present, he is the head of the Department of Phoniatrics and Pediatric Audiology at the University of Erlangen-Nuremberg, Germany. He is a member of the Standing Committee of Phoniatrics and Voice Care of IFOS, UEP and the Applied Audiology of IALP.