JID:IRBM AID:520 /FLA
[m5G; v1.246; Prn:17/10/2018; 12:46] P.1 (1-6)
IRBM ••• (••••) •••–•••
IRBM www.elsevier.com/locate/irbm
JETSAN
Swallowing Sound Recognition at Home Using GMM

H. Khlaifi c,a,∗, D. Istrate a, J. Demongeot b, D. Malouche c

a Sorbonne University, University of Technology of Compiegne, CNRS, UMR 7338 Biomechanics and Bioengineering, 60200 Compiegne, France
b Université Grenoble Alpes (UGA), AGEIS, Faculté de Médecine, 38700 La Tronche, France
c National School of Engineers of Tunis, Signals and Systems Research Unit, University of Tunis El Manar, Tunisia
Highlights

• Detection of useful signals from a continuous sound recording, followed by their classification.
• Automatic detection is efficient, with a global good-detection rate of 87.31%.
• GMM-based classification of swallowing sounds shows a good recognition rate of 84.57%.
• The algorithm was evaluated on data from different sensors and recording conditions.
• Isolation of swallowing sounds was good (95.94%), but performance on other sounds deteriorated.
Article info

Article history:
Received 21 October 2017
Received in revised form 5 October 2018
Accepted 6 October 2018
Available online xxxx

Keywords: Automatic detection; Classification; Sound recognition; GMM
Abstract

Background: Autonomous living for people after a stroke is a major challenge today, especially in the presence of swallowing disorders (dysphagia), whose most common cause is stroke. In France, a stroke occurs every 4 minutes, which amounts to about 130,000 hospitalizations per year. Currently, continuous medical monitoring of patients at home is not available; the patient must be hospitalized or visit a medical facility for any follow-up. It is in this context that the E-SwallHome project (Swallowing & Breathing: Modelling and e-Health at Home) proposes to develop tools, covering hospital care through the patient's return home, able to monitor the swallowing process in real time.
Method: This paper addresses a relevant health problem affecting patients recovering from stroke. We propose a frequency-domain acoustic analysis for automatic detection of the swallowing process and a non-invasive acoustic method to differentiate swallowing sounds from other sounds in a normal ambient environment during food intake.
Result: The proposed event-detection algorithm gives a global good-detection rate of 87.31%. Classification of swallowing sounds versus other sounds based on Gaussian Mixture Models (GMM), using a leave-one-out approach owing to the small amount of data in our database, gives a good recognition rate for swallowing sounds of 84.57%.
Conclusion: The proposed method has great potential to assist clinical evaluation using only swallowing sounds, a non-invasive technique for swallowing studies.
© 2018 AGBM. Published by Elsevier Masson SAS. All rights reserved.
Corresponding author. E-mail address: hajer_khlaifi@yahoo.fr (H. Khlaifi).
https://doi.org/10.1016/j.irbm.2018.10.009 1959-0318/© 2018 AGBM. Published by Elsevier Masson SAS. All rights reserved.
1. Introduction

Increased life expectancy increases the risk of stroke, whose incidence in the elderly is high: about 130,000 French people are affected each year. In France, according to the French National Authority for Health (HAS) [1], stroke is the leading cause of acquired disability in adults, the second cause of dementia and the third cause of death. The increase in survival (e.g., 79.3 years for French men and 85.5 years for French women in 2014) leads to an increase in care needs and social costs [2], because of the high frequency of medical complications, estimated at between 40% and 96% [3].

Among these complications, swallowing disorders, or dysphagia, are frequent: 42 to 76% of patients have these problems after acute stroke [4]. Their prevalence increases with age, from 9% for subjects aged 65–74 living at home to 28% after 85 years [5]. Numerous studies have attempted to establish the incidence of dysphagia after a stroke, with figures ranging from 23% to 50% [6]. In institutions, it reaches 30 to 60% [7]. As for patients living at home, a questionnaire study found that 10 to 16% of them have symptoms of dysphagia [5]. These disorders cause severe complications such as aspiration, pneumonia and undernutrition, and are responsible for about 4000 deaths per year in France [5]. The incidence of pneumonia is estimated at around one third of patients [8], and pneumonia is associated with a quarter of deaths in the first month after stroke [4]. Some studies show that approximately 4% of elderly people living at home are malnourished [7]. Swallowing disorders also degrade the patient's quality of life, causing meal-related anxiety, loss of eating pleasure, and in turn depression and social isolation. They require multidisciplinary care, demanding good organization and coordination of the health and social care system.
In this context, the E-SwallHome project contributes to home monitoring of the swallowing process by proposing automatic methods for monitoring and evaluating functional rehabilitation. The aim of our work is, starting from different signals acquired in an ambulatory way, first to automatically detect swallowing events and then to identify swallowing events with and without disorders. Toward recognizing swallowing events, the aim of this paper is to automatically detect them and distinguish them from other sounds that can be present during food intake, using a GMM trained with the Expectation-Maximization (EM) algorithm and Maximum a Posteriori (MAP) adaptation in the validation step.

2. State of the art

Swallowing disorders are frequent after the onset of neurological conditions such as stroke, which causes lesions in certain parts of the brain that may underlie dysphagia by disrupting the mechanisms involved in the swallowing function. The clinical evaluation of swallowing disorders includes a number of tools. The Videofluoroscopic Swallowing Study (VFSS) is considered the preferred instrument by most practicing swallowing clinicians thanks to its ability to visualize bolus flow relative to structural motion in the upper aerodigestive tract in real time. The VFSS helps identify the physiologic causes of aspiration, clearly identify the symptoms of the patient's swallowing disorder [9,10], and estimate the degree of swallowing impairment, allowing clinicians to make an accurate judgment about sensory or motor impairment [11]. Ultrasound imaging also offers an excellent non-invasive means to study the dynamics of the oropharyngeal system and the muscles and other soft tissues of the oropharynx during swallowing [12]. It has been used to examine movements of the lateral pharyngeal wall [13], to visualize and examine tongue movement during swallowing [14,15], and to evaluate normal and abnormal swallowing by
tracking the motion of the hyoid bone [16]. Ultrasound imaging has also been used to assess laryngeal phonation function, by identifying the morphology of the vocal folds and quantifying the horizontal tissue displacement velocity in their vibrating portion [17], and to quantify vocal fold motion during breathing using B-mode ultrasound [18]. Based on the recommendation that the head be leaned forward during meals, Imtiaz et al. [19] presented a system comprising an Inertial Measurement Unit (IMU) and an electromyography (EMG) sensor, able to measure head-neck posture and submental muscle activity during swallowing. Another study [20] evaluated the influence of swallowing maneuvers (effortful swallow) on anterior suprahyoid surface electromyographic activity and pharyngeal manometric pressure; it found significant changes in both measures during effortful swallowing compared with normal swallowing. Other researchers [21] showed that tongue-to-palate emphasis during effortful swallowing increases the amplitudes of submental surface electromyography, orolingual pressure and upper pharyngeal pressure to a greater degree than a strategy of inhibiting tongue-to-palate emphasis.

Swallowing assessment is also approached from an acoustic point of view. In 1990, Vice et al. [22] described the detailed pattern of cervical sounds in newborn infants during suckle feeding. As the normal swallowing process can be divided into three stages (oral, pharyngeal and oesophageal phases), the swallowing sound can be divided into three segments [22]: Initial Discrete Sound (IDS), Bolus Transit Sound (BTS) and Final Discrete Sound (FDS). The opening of the cricopharynx in the pharyngeal phase contributes to the IDS, bolus transit into the oesophagus in the pharyngeal phase contributes to the BTS, and the oesophageal phase contributes to the FDS.
Moussavi [23] showed that swallowing sounds have three stages by applying a Hidden Markov Model (HMM). An automated method [24] was set up to separate swallowing sounds from breathing sounds using multilayer feedforward neural networks; the algorithm detected 91.7% of swallows correctly. Lazareck et al. [25,26] proposed a non-invasive, acoustic-based method to differentiate between individuals with and without dysphagia. A wearable swallow-monitoring system was developed [27] to assess overall food and drink intake habits as well as other swallowing-related disorders; for example, a piezo-respiratory belt able to identify non-intake swallows, solid-intake swallows and drinking swallows is in development.

2.1. Swallowing phases

Swallowing transports food from the mouth to the stomach while protecting the airways. Classically, it is described in three phases: oral, pharyngeal and oesophageal. Swallowing is initiated by the introduction of food into the oral cavity and its preparation through mastication and insalivation. Once the bolus is formed and ready to be expelled to the pharynx, the tip of the tongue is raised and pressed against the alveolar ridge of the upper incisors, and the tongue takes the form of a spoon into which the bolus slips and forms a single mass. This mass moves backwards, carried by the tongue, which is progressively applied to the palate from front to back. At this moment, the soft palate closes the oropharynx and prevents the bolus from penetrating the pharynx, while the larynx is still open. By the time the bolus passes the throat isthmus, the oral (voluntary) phase is over. The back of the tongue moves forward and forms an incline allowing the bolus to move towards the oropharyngeal cavity. The pharyngeal phase is then triggered by the contact of the bolus with the sensory receptors of the throat isthmus and the oropharynx.
The pharyngeal phase is a continuous phenomenon in time, considered a reflex. It is accompanied simultaneously by velopharyngeal closure by the velum, laryngeal occlusion ensured by elevation of the larynx, retraction of the base of the tongue, downward movement of the epiglottis and pharyngeal peristalsis; finally, the opening of the upper oesophageal sphincter allows the passage of the bolus into the oesophagus. This phase lasts less than one second. The opening of the upper oesophageal sphincter is initiated by the arrival of the pharyngeal peristaltic wave, and passage through the oesophagus is ensured by the continuity between pharyngeal and oesophageal peristalsis. The pharyngeal stage, triggered by the end of the oral stage, is involuntary and is called the swallowing reflex. The velum is raised to close the nasal cavities, preventing any passage of food into the nose and facilitating the passage of the bolus downwards through the pharynx towards the oesophagus. At precisely this moment, passage of food into the trachea is avoided: the larynx, open during chewing to allow breathing, closes as soon as the bolus arrives on the base of the tongue. At the same time, the vocal folds close, ensuring airway occlusion, and the mobile cartilages of the larynx (arytenoids) swing forward into the laryngeal vestibule, covered by the tilting of the epiglottis. The larynx is pulled upwards by the hyoid muscles, which places it under the protection of the base of the tongue. As a result, breathing is interrupted, and at the same time the last stage of swallowing begins with the introduction of the bolus into the oesophagus, through which it progresses to the stomach via oesophageal peristaltic waves (muscle contractions).
The oesophagus is indeed a muscular tube whose two extremities are sphincteric zones, the upper and lower oesophageal sphincters, the latter acting as a valve preventing reflux of food or acidity from the stomach. The pharyngeal and oesophageal phases constitute the swallowing reflex, without voluntary control [28,29,12]. The swallowing reflex results from laryngopharyngeal activity triggered by intrapharyngeal and sometimes laryngeal stimuli; it appears in humans during the fourth month of intrauterine life and is observed by ultrasound from the 6th month [29], protecting the airways from any passage of food.

Given the large number of dysphagic patients in the stroke population, up to 76% [4], this thesis work within the E-SwallHome project (an ANR project) contributes to the monitoring of patients who have suffered a stroke by proposing automatic methods for monitoring and evaluating the functional rehabilitation of swallowing disorders. Using the different sensors detailed below in Section 3, we must be able to detect in real time both swallowing events and situations of distress during food intake, such as aspiration or asphyxiation, accompanied or not by falls. In this paper, we propose an automatic technique for detecting events in a sound signal recorded during food intake, and a GMM trained with the EM algorithm for classifying the detected sound events.

3. Materials and methods

In this project, non-invasive equipment is used to simultaneously record sounds associated with swallowing together with ambient environmental sound, respiratory activity and the electrical activity of neck muscles during food intake. Sounds were recorded using a miniature omni-directional microphone capsule (Sennheiser ME 102) with an IMG Stageline MPR1 microphone preamplifier placed over the suprasternal notch, able to record swallowing sound and ambient environmental sound simultaneously. Sounds were recorded at a 44.1 kHz sampling rate.
Sounds were also recorded using the microphone of the electroglottograph, a device that provides an image of the closing and opening of the vocal folds by measuring the electrical impedance between two electrodes placed on each side of the larynx. These data were recorded at a 10 kHz sampling rate.

Fig. 1. Proposed system.

Respiratory signals were recorded using a respiratory inductive plethysmographic (RPI) Visuresp system (RBI, France), which measures respiratory volume both in spontaneously ventilated subjects and in patients receiving ventilatory assistance (at home or in a hospital setting). Visuresp takes the form of a vest connected to an oscillator, both connected to a signal-processing box, itself connected to a PC through an analog-to-digital conversion card. Respiratory signals were recorded at a 40 Hz sampling rate. The electrical activity of the neck muscles was recorded using a grid of 64 monopolar surface EMG electrodes, evenly spaced and glued over the front of the neck, covering the infrahyoid muscles; conductive gel was applied to the electrodes before they were glued to the neck. EMG signals were recorded at a 2000 Hz sampling rate. All signals were recorded simultaneously. In this paper, only the recorded sounds are used.

Two databases were recorded. The first was recorded in Compiegne in the Biomechanics and Bioengineering laboratory of the University of Technology of Compiegne (UTC); the second was recorded in Grenoble in the TIMC-IMAG laboratory by the PRETA team. This paper presents the results obtained from processing the sounds recorded in the Grenoble laboratory. Signals were recorded from healthy subjects with no history of swallowing dysfunction. Participants were seated comfortably on a chair and were told that the equipment would record swallowing sounds, respiration signals and vocal fold activity. The Grenoble database was recorded from 14 persons aged between 25 and 68 years.
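Since the microphones run at different sampling rates (44.1 kHz and 10 kHz), the recordings must be brought to a common rate before joint processing. The sketch below is a minimal linear-interpolation resampler for illustration only (the function name is ours, and a production pipeline would use a proper anti-aliased polyphase resampler such as `scipy.signal.resample_poly`):

```python
import numpy as np

def resample_linear(x, fs_in, fs_out):
    """Resample signal x from fs_in to fs_out by linear interpolation.

    Note: for downsampling (e.g. 44100 Hz -> 10000 Hz) an anti-aliasing
    low-pass filter should be applied first; it is omitted in this sketch.
    """
    n_out = int(round(len(x) * fs_out / fs_in))
    t_in = np.arange(len(x)) / fs_in          # original sample times
    t_out = np.arange(n_out) / fs_out         # target sample times
    return np.interp(t_out, t_in, x)
```

For example, `resample_linear(x, 44100, 10000)` would bring the chest-microphone stream down to the electroglottograph rate.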
Participants were asked to swallow their own saliva and different volumes of water (5 ml, 10 ml, 15 ml and 100 ml), from a glass and with a straw, in random order. They were also fed compote with a teaspoon and a pom'pote (90 g) to study the effect of different textures on the sound. Additionally, they were asked to read aloud with intonation, twice each:
– three phonetically balanced sentences,
– a succession of ten phonetically balanced sentences,
– a paragraph extracted from "J'irai cracher sur vos tombes" by Boris Vian, with short and long sentences.
Then, subjects were asked to perform voluntary apneas of different durations (3 and 8 seconds) three times and to cough voluntarily three times. Recordings of voluntary apnea and cough are valuable for analyzing the various events when developing the automatic detection and classification algorithms.

The proposed system (Fig. 1) automatically detects useful sounds, among which are swallowing sounds, in a first step, and differentiates swallowing sounds from other sounds in a second step. The swallowing database acquired in the TIMC-IMAG laboratory and part of the database used by Sehili [30] during his thesis (described in Table 1) were used in this paper. The TIMC-IMAG database is described in Table 2. Table 1 presents non-physiological data, but it includes sounds that can occur during food intake, which serves our goal of recognizing swallowing sounds in an environment likely to contain other types of sounds.

Table 1
Sehili's database.

Sounds          Number of samples   Total duration (s)
Breathing       50                  106.44
Cough           62                  181.69
Yawn            20                  95.87
Dishes          98                  03.77
Glass breaking  101                 99.52
Keys            36                  166.33
Laugh           49                  272.65
Sneeze          32                  51.67
Water           54                  84.72

Table 2
Grenoble's database (acquired as part of the E-SwallHome project by Pascale Calabrese [31]).

Sounds        Number of samples   Total duration (s)
Swallowing    923                 1538.6
Speech        351                 2047.6
Other sound   126                 166.2

3.1. Automatic detection algorithm

As pre-treatment, the signals undergo a wavelet decomposition using symlet wavelets at level 5. After comparing residues and details, we retain detail 3 for the subsequent processing. The signal is then segmented into slices of 100 milliseconds via a sliding window running along the signal with an overlap of 50%. At each position of the sliding window, the energy of the current window and the average energy of the ten previous windows are calculated, as shown in Fig. 2. A threshold is then computed as a function of this average energy.
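The sliding-window detector described above can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the exact threshold function appears only in Fig. 2, so the factor `k` scaling the average energy of the previous windows is our assumption, and the 450 ms padding and sub-second fusion steps are omitted:

```python
import numpy as np

def detect_events(detail3, fs, win_ms=100, overlap=0.5, n_prev=10, k=2.0):
    """Energy-based event detection on a wavelet detail signal.

    k is an assumed threshold factor: the paper computes the threshold
    from the average energy of the ten previous windows, but the exact
    function is given only in its Fig. 2.
    Returns a list of (start, end) sample indices.
    """
    win = int(fs * win_ms / 1000)
    hop = int(win * (1 - overlap))
    # Energy of each 100 ms window, 50% overlap.
    energies = np.array([np.sum(detail3[i:i + win] ** 2)
                         for i in range(0, len(detail3) - win + 1, hop)])
    events, start = [], None
    for i, e in enumerate(energies):
        # APW_i: average energy of the (up to) ten previous windows.
        apw = np.mean(energies[max(0, i - n_prev):i]) if i else e
        above = e > k * apw
        if above and start is None:
            start = i * hop                      # event onset
        elif not above and start is not None:
            events.append((start, i * hop + win))  # event offset
            start = None
    if start is not None:
        events.append((start, len(detail3)))
    return events
```

The paper additionally pads each detection by 450 ms on each side and merges detections separated by less than one second; those post-processing steps are left out of this sketch.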
Fig. 2. Threshold calculation.

An event is detected as soon as the window energy exceeds the threshold, where APW_i, the average energy of the ten previous windows at position i, enters the threshold computation. A start and an end are thus detected for each event. An offset of 450 milliseconds is then added on each side of each detection, and detected events separated by less than one second are merged. Any entire or partial detection of a reference event is validated as a good detection.

3.2. Principle of the GMM algorithm

Classification has become an important part of exploratory and decision-making analysis across many domains. The exploratory objective typically corresponds to unsupervised (automatic) classification, while the decision-making objective corresponds to supervised classification, which aims to recover a hypothesized partition in a set of observations. Accordingly, methods are needed that adapt to the diversity of the data to be analyzed, which in our case are the sounds that can be present during food intake. Thanks to their flexibility, finite mixture models (Gaussian Mixture Models) meet these requirements.

A Gaussian mixture model is a parametric probability density function represented as a weighted sum of M component Gaussian densities:

p(x | Θ) = Σ_{i=1}^{M} π_i f(x | μ_i, Σ_i)

where x is a d-dimensional continuous-valued data vector (in our case an acoustic feature vector), π_i, i = 1, ..., M, are the mixture weights, and f(x | μ_i, Σ_i), i = 1, ..., M, are the component Gaussian densities. The mixture weights satisfy the constraint Σ_{i=1}^{M} π_i = 1. Each component density is a d-variate Gaussian:

f(x | μ_i, Σ_i) = 1 / ((2π)^{d/2} |Σ_i|^{1/2}) exp(−(1/2)(x − μ_i)^T Σ_i^{−1}(x − μ_i))

where μ_i is the mean vector and Σ_i the covariance matrix. The Gaussian mixture model is thus parametrized by the mixture weights, mean vectors and covariance matrices:

Θ = (π_i, μ_i, Σ_i), i = 1, ..., M.

3.3. Expectation-Maximization (EM) algorithm

Given training acoustic vectors and an initial GMM configuration Θ, we aim to estimate the parameters of the GMM that best fits the distribution of the training feature vectors. Among the several techniques for estimating the model parameters, the best established is Expectation-Maximization (EM), an iterative algorithm whose basic idea is to start from the current model Θ and estimate a new model Θ̂ of higher likelihood (p(x | Θ̂) ≥ p(x | Θ)). At each iteration the new model becomes the current model, and the step is repeated until a convergence threshold ε is reached:

L(Θ̂; x) − L(Θ; x) < ε

where L is the log-likelihood of the model. On each EM iteration, the parameters are re-estimated as follows, where P(i | x_t, Θ) is the posterior probability of component i for feature vector x_t and T is the number of training vectors.

Mixture weights:

π̂_i = (1/T) Σ_{t=1}^{T} P(i | x_t, Θ)

Means:

μ̂_i = Σ_{t=1}^{T} P(i | x_t, Θ) x_t / Σ_{t=1}^{T} P(i | x_t, Θ)

Covariances:

Σ̂_i = Σ_{t=1}^{T} P(i | x_t, Θ)(x_t − μ̂_i)(x_t − μ̂_i)^T / Σ_{t=1}^{T} P(i | x_t, Θ)

The EM algorithm thus iteratively refines the GMM parameters so as to monotonically increase the likelihood of the estimated model for the observed feature vectors. In the validation step, we derive the hypothesized sound model by adapting the parameters of the trained model to the test vectors using a form of Bayesian adaptation, Maximum a Posteriori (MAP) adaptation. Like the EM algorithm, the adaptation comprises two steps, the first of which is identical to the expectation step of EM. Unlike the second step of EM, in adaptation the newly estimated sufficient statistics are combined with the old sufficient statistics of the training model.
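The re-estimation formulas above can be sketched in a compact diagonal-covariance implementation (a simplification: the paper does not state the covariance structure it uses); the function names and the small variance floor are ours:

```python
import numpy as np

def log_gauss(X, mu, var):
    # Log of a diagonal-covariance Gaussian density, per sample.
    return -0.5 * (np.sum(np.log(2 * np.pi * var))
                   + np.sum((X - mu) ** 2 / var, axis=1))

def em_fit(X, M=2, n_iter=50, seed=0):
    """Fit an M-component diagonal GMM to X (T x d) with EM."""
    rng = np.random.default_rng(seed)
    T, d = X.shape
    pi = np.full(M, 1.0 / M)                       # mixture weights
    mu = X[rng.choice(T, M, replace=False)]        # means from random points
    var = np.tile(X.var(axis=0), (M, 1)) + 1e-6    # diagonal covariances
    lls = []
    for _ in range(n_iter):
        # E-step: posteriors P(i | x_t, Theta).
        log_p = np.stack([np.log(pi[i]) + log_gauss(X, mu[i], var[i])
                          for i in range(M)])          # M x T
        log_norm = np.logaddexp.reduce(log_p, axis=0)  # log p(x_t | Theta)
        lls.append(log_norm.sum())                     # total log-likelihood
        post = np.exp(log_p - log_norm)                # M x T
        # M-step: the re-estimation formulas of Section 3.3.
        Nk = post.sum(axis=1)                          # sum_t P(i | x_t)
        pi = Nk / T
        mu = (post @ X) / Nk[:, None]
        var = np.stack([(post[i, :, None] * (X - mu[i]) ** 2).sum(0) / Nk[i]
                        for i in range(M)]) + 1e-6     # floor avoids collapse
    return pi, mu, var, lls
```

As in the paper, a classifier then fits one such GMM per class (swallowing, speech, other sounds) on the acoustic feature vectors and assigns a test segment to the class whose model gives the highest log-likelihood; MAP adaptation would combine these statistics with those of a previously trained model rather than replacing them.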
3.4. Acoustical features

Sounds were all resampled to 10 kHz and were not fed directly into the GMM; an intermediate step extracts acoustical features, calculated on a 16-millisecond window with an overlap of 50%. In this study we use Mel Frequency Cepstral Coefficients (MFCC). MFCCs are very often used in automatic speech recognition, and it has been shown [32] that they are among the most robust features in this domain.

4. Results

Automatic detection is the hinge on which the classification method rests in practical applications: good results from the automatic detection algorithm are a prerequisite for good results in the subsequent GMM classification step. The automatic detection algorithm achieves a global good-detection rate of 87.31%, broken down by event type (swallowing, speech, other sounds) in Table 3, with a global false-detection rate of 36.49%. To show the impact of partial detections on the classification system, we also report the detections containing more than 50% of the reference event, which represent 82.9% of the good automatic detections and are distributed as shown in Table 3.

Table 3
Automatic detection results.

                                                        Swallowing   Speech   Sounds
Good detections (global rate: 87.31%)                   90.11%       80%      100%
Detections containing more than 50% of reference event  76.83%       80%      66.67%

Table 4 shows the confusion matrix for the UTC database alone, using a GMM trained with the Expectation-Maximization (EM) algorithm and a leave-one-out approach. It shows a good recognition rate of 84.57% for swallowing events, and 75% and 70.57% for speech and sound events respectively. On the other hand, 4.26% and 11.17% of swallowing events are classified as speech and sounds respectively, meaning that some swallowing events will be missed; conversely, some sound events are misclassified as swallowing events, producing false alarms. These results were obtained on manually annotated and extracted segments.

Table 4
Confusion matrix of the UTC database (rows: actual class; columns: predicted class).

             Swallowing   Speech   Sounds
Swallowing   84.57%       4.26%    11.17%
Speech       0%           75%      25%
Sounds       14.96%       14.46%   70.57%

The same test was applied to the Grenoble database. Table 5 shows good recognition rates for the three classes Swallowing, Speech and Sounds of 95.94%, 98.92% and 67.46% respectively. We then trained models on UTC data and tested on Grenoble data. As expected, the results deteriorate, because the signals were not recorded under the same environmental conditions or with the same sensors. To remedy this degradation, we plan in future work to apply a GMM with Maximum a Posteriori (MAP) adaptation.

Table 5
Confusion matrix of the Grenoble database (rows: actual class; columns: predicted class).

             Swallowing   Speech   Sounds
Swallowing   95.94%       1.22%    2.85%
Speech       1.08%        98.92%   0%
Sounds       11.11%       21.43%   67.46%

5. Discussion

We see encouraging results for both automatic detection from the continuous signal and classification, but two challenges remain. First, for the detection algorithm, we obtain a detection for the majority of reference events; counting partial detections keeps this rate high compared with counting only total detections of reference events. Even though an offset is added to each side of each detection, this is not enough to guarantee that a detection contains more than 50% of the reference event. Reducing missed events while limiting false positives remains an objective to be achieved. For classification, a GMM trained with the Expectation-Maximization (EM) algorithm using a leave-one-out approach gives good recognition rates for the three classes (swallowing, speech and sounds) on both the UTC and Grenoble databases.

6. Conclusion

The challenge of detecting swallowing events at home can be formulated as follows: sensors are worn continuously by the person, or operate continuously during food intake, to detect swallowing and non-swallowing events in real time. The proposed system aims to isolate swallowing events for the second, classification step. Once detections are recognized as swallowing, we will identify the food texture as well as the manner of swallowing; in our case, spontaneously or voluntarily swallowed saliva, water and apple sauce. In an analogous manner, an attempt is also made to analyze the quality of speech. This paper proposes an automated method to detect events from a sound signal and to distinguish swallowing sounds from other sounds in healthy subjects. Both the automatic detection algorithm and the GMM classification trained by EM show promising results. However, when signals acquired with other systems are introduced, the results deteriorate; a future challenge is therefore to adapt our models to foreign signals, such as non-physiological signals, in order to ensure proper functioning of the system in a real environment.

Human and animal rights

The authors declare that the work described has been carried out in accordance with the Declaration of Helsinki of the World Medical Association, revised in 2013, for experiments involving humans, as well as in accordance with EU Directive 2010/63/EU for animal experiments.

Informed consent and patient details

The authors declare that this report does not contain any personal information that could lead to the identification of the patient(s). The authors declare that they obtained written informed consent from the patients and/or volunteers included in the article, and confirm that the personal details of the patients and/or volunteers have been removed.

Disclosure of interest

The authors declare that they have no known competing financial or personal relationships that could be viewed as influencing the work reported in this paper.

Funding

This work has been supported by an ANR research allowance.

Acknowledgements

This work is part of the E-SwallHome project and was also funded within the framework of the EBIOMED Chair of the BMBI laboratory. It is also part of the STIC AMSUD-EMONITOR project. We would also like to thank the PRETA team of the TIMC-IMAG laboratory for the collaboration and the database.

References

[1] Haute Autorité de Santé. Accident vasculaire cérébral : méthodes de rééducation de la fonction motrice chez l'adulte. Recommandations pour la pratique clinique; 2012.
[2] Donnan GA, Fisher M, Macleod M, Davis SM. Stroke. Lancet 2008;371:1612–23.
[3] Langhorne P, Stott D, Robertson L, MacDonald J, Jones L, McAlpine C, et al. Medical complications after stroke: a multicenter study. Stroke 2000;31:1223–9.
[4] Zhou Z. Accidents Vasculaires Cérébraux (AVC) : conséquences fonctionnelles et dysphagie associée. PhD thesis, Université de Limoges; 2009.
[5] Rouah AG. La prise en charge des troubles de la déglutition en EHPAD : étude descriptive des pratiques professionnelles de médecins coordonnateurs dans 27 EHPAD d'un groupe privé associatif; 2013.
[6] Singh S, Hamdy S. Dysphagia in stroke patients: a review. Postgrad Med J 2006;82:383–91.
[7] Haute Autorité de Santé. Stratégie de prise en charge en cas de dénutrition protéino-énergétique chez la personne âgée; 2007.
[8] Hilker R, Poetter C, Findeisen N, Sobesky J, Jacobs A, Neveling M, et al. Nosocomial pneumonia after acute stroke: implications for neurological intensive care medicine. Stroke 2003;34:975–81.
[9] Dodds WJ, Logemann JA, Stewart ET. Radiologic assessment of abnormal oral and pharyngeal phases of swallowing. AJR Am J Roentgenol 1990;154:965–74.
[10] Logemann JA. Behavioral management for oropharyngeal dysphagia. Folia Phoniatr Logop 1999;51:199–212.
[11] Martin-Harris B, Jones B. The videofluorographic swallowing study. Phys Med Rehabil Clin N Am 2008;19:769–85.
[12] Jones B. Normal and abnormal swallowing: imaging in diagnosis and therapy. 2nd ed. Springer; 2012.
[13] Miller JL, Watkin KL. Lateral pharyngeal wall motion during swallowing using real time ultrasound. Dysphagia 1997;12:125–32.
[14] Shawker TH, Sonies B, Stone M, Baum BJ. Real-time ultrasound visualization of tongue movement during swallowing. J Clin Ultrasound 1983;11:485–90.
[15] Stone M, Shawker TH. An ultrasound examination of tongue movement during swallowing. Dysphagia 1986;1:78–83.
[16] Sonies BC, Wang C, Sapper DJ. Evaluation of normal and abnormal hyoid bone movement during swallowing by use of ultrasound duplex-Doppler imaging. Ultrasound Med Biol 1996;22:1169–75.
[17] Hsiao T-Y, Wang C-L, Chen C-N, Hsieh F-J, Shau Y-W. Noninvasive assessment of laryngeal phonation function using color Doppler ultrasound imaging. Ultrasound Med Biol 2001;27:1035–40.
[18] Cohen M-E, Lefort M, Bergeret-Cassagne H, Hachi S, Li A, Russ G, et al. Detection of recurrent nerve paralysis: development of a computer aided diagnosis system. IRBM 2015;36(6):367–74.
[19] Imtiaz U, Yamamura K, Kong W, Sessa S, Lin Z, Bartolomeo L, et al. Application of wireless inertial measurement units and EMG sensors for studying deglutition: preliminary results. IEEE; 2014.
[20] Huckabee M-L, Butler SG, Barclay M, Jit S. Submental surface electromyographic measurement and pharyngeal pressures during normal and effortful swallowing. Arch Phys Med Rehabil 2005;86:2144–9.
[21] Huckabee M-L, Steele CM. Electromyographic measures and pharyngeal pressure during effortful swallow. Arch Phys Med Rehabil 2006;87:1067–72.
[22] Vice FL, Heinz JM, Giuriati G, Hood M, Bosma JF. Cervical auscultation of suckle feeding in newborn infants. Dev Med Child Neurol 1990;32:760–8.
[23] Moussavi Z. Assessment of swallowing sounds stages with hidden Markov model. IEEE; 2005.
[24] Aboofazeli M, Moussavi Z. Automated classification of swallowing and breath sounds. IEEE; 2004.
[25] Lazareck LJ, Moussavi Z. Swallowing sound characteristics in healthy and dysphagic individuals. IEEE; 2004.
[26] Lazareck LJ, Moussavi Z. Classification of normal and dysphagic swallows by acoustical means. IEEE Trans Biomed Eng 2004;51.
[27] Dong B, Biswas S. Swallow monitoring through apnea detection in breathing signal. IEEE. p. 6341–4.
[28] VWB, et al. La réhabilitation de la déglutition chez l'adulte : le point sur la prise en charge fonctionnelle. Solal, Collection Le Monde du verbe, 2e édition revue et augmentée; 2011.
[29] Guatterie M, Lozano V. Déglutition-respiration : couple fondamental et paradoxal. Kinera 2009;42:1–9.
[30] Sehili MEA. Reconnaissance des sons de l'environnement dans un contexte domotique. PhD thesis, Institut National des Télécommunications; 2013.
[31] Calabrese P, Pham Dinh T, Eberhard A, Bachy J-P, Benchetrit G. Effects of resistive loading on the pattern of breathing. Respir Physiol 1998;113:167–79.
[32] Davis SB, Mermelstein P. Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Trans Acoust Speech Signal Process 1980;28:357–66.