Detection of Parkinson’s disease based on voice patterns ranking and optimized support vector machine


Biomedical Signal Processing and Control 49 (2019) 427–433


Salim Lahmiri a,∗, Amir Shmuel a,b

a The Montreal Neurological Institute, Department of Neurology and Neurosurgery, McGill University, Montreal, Canada
b Department of Physiology, Department of Biomedical Engineering, McGill University, Montreal, Canada
∗ Corresponding author. E-mail address: [email protected] (S. Lahmiri).

Article history: Received 28 August 2017; Received in revised form 6 August 2018; Accepted 27 August 2018

Keywords: Parkinson’s disease; voice disorder; features ranking; support vector machine; radial basis function; Bayesian optimization; classification

Abstract

Parkinson’s disease (PD) is a neurodegenerative disorder that causes severe motor and cognitive dysfunctions. Several types of physiological signals can be analyzed to accurately detect PD by using machine learning methods. This work considers the diagnosis of PD based on voice patterns. In particular, we focus on assessing the performance of eight different pattern ranking techniques (also termed feature selection methods) when coupled with a nonlinear support vector machine (SVM) to distinguish between PD patients and healthy control subjects. The parameters of the radial basis function kernel of the SVM classifier were optimized by using a Bayesian optimization technique. Our results show that the receiver operating characteristic and the Wilcoxon-based ranking techniques provide the highest sensitivity and specificity.

1. Introduction

Parkinson’s disease (PD) is a neurodegenerative disorder linked to loss of dopamine-producing neurons in the basal ganglia [1]. The major symptoms of PD can be classified into motor and nonmotor symptoms [1]. The first category includes tremor, muscular rigidity, bradykinesia, and postural instability, whereas the second category includes depression, executive dysfunctions, sleep disturbances, and autonomic impairments. Over recent years, the efforts to understand and characterize PD have intensified. Recent studies have focused on detection of PD by relying on several measurements, including the dynamics of electromyographic (EMG) signals [2], gait analysis [3–7], spontaneous cardiovascular oscillations [8], compound force signals [9], and steadiness of syllable repetition [10]. In order to assist clinicians in distinguishing between PD patients and normal subjects, several computer-aided diagnosis (CAD) systems have been proposed. For instance, the authors of [11] used the fractional amplitude of low-frequency resting-state functional magnetic resonance imaging (RS-fMRI) and a support vector machine (SVM) for classification. The results from 51 patients with PD and 50 healthy controls based on the leave-one-out cross-validation method (LOOM) showed that the proposed system


distinguished PD from healthy control subjects with 92% sensitivity and 87% specificity. A system based on principal components analysis (PCA) for feature extraction from whole-brain structural magnetic resonance images (MRI) and SVM was proposed in [12]. Experimental results from 28 PD and 28 healthy control subjects indicated a mean accuracy above 92.7% following LOOM. A system that uses multilevel regions of interest (ROI) features from structural brain MRI, filtering, wrapper feature selection methods, and a multi-kernel SVM was proposed in [13]. Experimental results from 69 PD patients and 103 normal controls showed that the proposed PD detection system achieved an accuracy of 85.78%, specificity of 87.79%, and sensitivity of 87.64% based on a 10-fold cross-validation scheme. In another study, a PD detection system relied on radial basis function (RBF) neural networks trained with gait characteristics represented by gait dynamics [14]. The proposed PD detection system was tested on gait patterns from 93 PD patients and 73 healthy controls using 5-fold cross-validation. The RBF neural networks achieved 96.39% accuracy, 96.77% sensitivity, and 95.89% specificity. Other studies used patterns from emotional information [15], handwriting [16], articulation disorders [17], and dysphonia measurements [18].

The main objective of this study is to distinguish PD patients from healthy control subjects by using SVM trained with voice disorder patterns. Compared to alternative physiological measurements, speech-signal-based patterns obtained noninvasively are informative characteristics that can discriminate PD patients from healthy control subjects [17,18]. Vocal impairment is a PD


symptom that can be revealed up to five years prior to clinical diagnosis [19], and a clear majority of PD patients classically show some form of vocal disorder [20].

We present a detailed comparison of feature ranking techniques coupled with SVM in the task of PD detection. Voice patterns influence the performance of the classifier since some of them may be redundant or irrelevant. Pattern ranking makes it possible to assess the relevance of each feature to the class variable and to select the most distinctive features. Therefore, pattern ranking methods can identify the most informative features to be used along with a classifier in the design of a CAD system. In this study, the performance of the SVM classifier under eight feature/pattern ranking techniques is examined in terms of accuracy, sensitivity, and specificity statistics. The eight feature ranking techniques included are the t-test, entropy, Bhattacharyya statistic, receiver operating characteristic (ROC), Wilcoxon statistic, fuzzy mutual information (FMI), genetic algorithms (GA), and SVM recursive feature elimination with correlation bias reduction (RFE-CBR). We chose these statistical feature selection techniques because they are fast and effective [21]. In addition, genetic algorithms are inductive, adaptive random search techniques capable of exploiting accumulated information about an unknown search space to direct subsequent search toward new promising subspaces. They are also fundamentally domain-independent search techniques, suitable when domain knowledge and theory are difficult or impossible to provide [22]. Finally, the SVM-based RFE-CBR is an embedded feature selection algorithm (or a wrapper-based technique) that uses criteria derived from the coefficients of the original SVM models to assess features. It recursively removes features that are not informative. Compared to other wrapper-based techniques, the SVM-based RFE-CBR does not use the cross-validation accuracy on the training data as the selection criterion. Therefore, it is less prone to overfitting and is fast even if the original feature set is large [23]. In summary, in the current work we focus only on fast feature selection techniques chosen from statistical filters, evolutionary algorithms, and SVM-based embedded feature selection techniques.

We rely on SVM [24] as the main classifier for the following reasons. Based on structural risk minimization, the SVM classifier can evade local minima and has excellent generalization ability [17]. It is robust to limited data [24], and it performs better than linear discriminant analysis, k nearest-neighbors, naïve Bayes, regression trees, and radial basis function networks in identifying PD patients based on dysphonia measurements [25]. In general, it is effective in biomedical data classification problems [11,14,26–28].
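To make the filter-based ranking concrete, the sketch below (not taken from the paper, which publishes no code) shows how two of the statistical filters named above can order features by their discriminative power; the helper name rank_features and the variables X and y are hypothetical.

```python
# Illustrative sketch, not the authors' implementation: ranking features with two
# of the statistical filters named above. Assumes a feature matrix X of shape
# (n_samples, n_features) and binary labels y (1 = PD, 0 = healthy control).
import numpy as np
from scipy import stats

def rank_features(X, y, method="wilcoxon"):
    """Return feature indices sorted from most to least discriminative."""
    p_values = []
    for j in range(X.shape[1]):
        pd_vals, hc_vals = X[y == 1, j], X[y == 0, j]
        if method == "wilcoxon":
            # Wilcoxon rank-sum test; a smaller p-value means a more discriminative feature.
            _, p = stats.ranksums(pd_vals, hc_vals)
        elif method == "ttest":
            # Two-sample t-test with Welch's correction for unequal variances.
            _, p = stats.ttest_ind(pd_vals, hc_vals, equal_var=False)
        else:
            raise ValueError(f"unknown method: {method}")
        p_values.append(p)
    return np.argsort(p_values)  # ascending p-value = descending relevance

# Example usage (X, y assumed available): ranking = rank_features(X, y, method="wilcoxon")
```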

2. Data and methods

In this work, a reduced dataset from [18] is used to examine the performance of the SVM classifier under each feature ranking technique. The dataset in [18] contains 132 voice patterns, whilst the current dataset contains 22. The current dataset contains 195 vowel phonations: 147 from PD patients and 48 from healthy control (HC) subjects. For each vowel phonation, a set of 22 voice patterns is measured, including the average vocal fundamental frequency (Fo), maximum vocal fundamental frequency (Fhi), minimum vocal fundamental frequency (Flo), jitter (%), jitter absolute value (Abs), relative amplitude perturbation (RAP), period perturbation quotient (PPQ), difference of differences between cycles divided by average period (DDP), local shimmer, shimmer in decibels (dB), three-point amplitude perturbation quotient (APQ3), five-point amplitude perturbation quotient (APQ5), amplitude perturbation quotient (APQ), average absolute difference between consecutive differences between amplitudes of consecutive periods (DDA), noise-to-harmonics ratio (NHR), harmonics-to-noise ratio (HNR),

Fig. 1. Plots of the cumulative distribution function (CDF). Blue and red lines present data from healthy control subjects and Parkinson’s patients, respectively. For each panel, the value of the voice pattern runs along the horizontal axis and its corresponding CDF value along the vertical axis. See the methods section for the various measures extracted from the voice.

recurrence period density entropy (RPDE), a fractal measure based on detrended fluctuation analysis (DFA), two measures of spread (Spread 1, Spread 2), correlation dimension (D2), and pitch period entropy (PPE). The full description of these patterns can be found in [18]. Fig. 1 presents the cumulative distribution function (CDF) of each vocal pattern, separately for HC subjects and PD patients. As indicated previously, eight feature ranking techniques were employed, namely the t-test [29], entropy [30], Bhattacharyya statistic [31], ROC [32], Wilcoxon statistic [33], FMI [34], GA [35], and RFE-CBR [36]. Voice patterns are ranked from the most to the least informative according to each pattern ranking technique. Then, we investigate SVM classification accuracy as a function of the k best ranked voice patterns for each ranking technique. Accordingly, for each ranking technique, the SVM is trained with the first ranked voice pattern, the first and second ranked voice patterns, and so on, until all patterns are used. That is, training and testing of the SVM are carried out for the k highest ranked voice patterns, with k ranging from one (first ranked pattern) to twenty-two, the total number of voice patterns. In this framework, each set of k top-ranked voice patterns represents a subset of features.
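As an illustration of this top-k protocol, the following sketch ranks features by a per-feature ROC criterion (area under the curve, standing in for the paper's ROC-based ranking) and evaluates an RBF-SVM on each nested subset. The function names, the use of scikit-learn, and the feature standardization step are assumptions, not the authors' implementation.

```python
# Sketch of the top-k evaluation protocol described above; X (n_samples, n_features)
# and y (binary labels) are assumed to hold the 22 voice patterns and class labels.
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def roc_ranking(X, y):
    # Score each single feature by how well it separates the classes (AUC),
    # folding AUC < 0.5 back so the direction of the effect does not matter.
    auc = np.array([roc_auc_score(y, X[:, j]) for j in range(X.shape[1])])
    auc = np.maximum(auc, 1.0 - auc)
    return np.argsort(auc)[::-1]          # most to least informative

def accuracy_vs_k(X, y, ranking, cv=10):
    # Cross-validate an RBF-SVM on the top-1, top-2, ..., top-d ranked features.
    accuracies = []
    for k in range(1, X.shape[1] + 1):
        cols = ranking[:k]
        clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
        accuracies.append(cross_val_score(clf, X[:, cols], y, cv=cv).mean())
    return accuracies

# Example usage: ranking = roc_ranking(X, y); curve = accuracy_vs_k(X, y, ranking)
```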


Fig. 2. SVM accuracy as a function of the number of patterns, shown for the Bhattacharyya, GA, ROC, RFE-CBR, Wilcoxon, entropy, t-test, and FMI ranking techniques.

SVM [24] is a powerful nonlinear classifier that employs a hyper-plane based on structural risk minimization to separate classes. The margin between the classes and the constructed hyper-plane is maximized to distinguish between classes. In particular, the linear SVM is given by:

$$y = f(x) = w^{T}x - b \qquad (1)$$

where x is the data, y is the class label, w is the weight vector orthogonal to the decision hyper-plane, b is the offset of the hyper-plane, and T is the transpose operator. The solution to the linear SVM is found by maximizing the margin used to separate the classes. This is equivalent to solving the following minimization problem:

$$\min_{w,\,b,\,\xi}\;\; \frac{1}{2}\, w^{T}w + C \sum_{i=1}^{n} \xi_{i} \qquad (2)$$

subject to

$$y_{i}\left(w^{T}x_{i} - b\right) \geq 1 - \xi_{i} \qquad (3)$$

where ξ (ξᵢ ≥ 0, i = 1, 2, ..., n) is a slack variable used to indicate the allowed degree of classification error, C > 0 is a penalty parameter that provides an upper bound on the error, and n is the number of instances. The nonlinear SVM classifier employs a kernel function K to separate the data nonlinearly. It is expressed as follows:

$$f(x) = \operatorname{sign}\left(\sum_{i=1}^{n} y_{i}\,\alpha_{i}\, K(x, x_{i}) + b\right) \qquad (4)$$

where αᵢ is the Lagrange multiplier, K is a kernel function, and b is a constant coefficient. In our work, we adopt the radial basis function (RBF) as the nonlinear kernel. It is given by:

$$K(x, x_{i}) = \exp\left(-\delta\, \lVert x - x_{i} \rVert^{2}\right) \qquad (5)$$

where δ > 0 is a scale parameter. In our study, the value of the slack variable ξ is set to 0.001. For each ranking technique and each run, we optimize the cross-validated SVM classifier penalty parameter C and scale parameter δ by using Bayesian optimization [37]. For robust validation of the SVM classifier, our training-testing stages apply ten-fold cross-validation. Then, the average values of common performance measures, including accuracy, specificity, and sensitivity, are computed to evaluate the performance of the SVM under each feature ranking technique. Recall that for each feature ranking approach, the features are sorted to consider those with the highest discriminative power first. Afterward, the SVM classifier is trained with the first ranked voice pattern, the first and second ranked voice patterns, and so on, until all twenty-two voice patterns of the entire dataset are used. Accordingly, the performance of the SVM classifier is calculated for each subset of the k highest ranked voice patterns, where k ranges from one to twenty-two.
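A minimal sketch of the tuning and validation loop just described, assuming scikit-optimize as one possible Bayesian optimization backend (the paper does not name an implementation). Here C and gamma play the roles of the penalty parameter C and the RBF scale parameter δ of Eq. (5); the search ranges and iteration count are illustrative assumptions.

```python
# Illustrative sketch: Bayesian optimization of an RBF-SVM with ten-fold
# cross-validation, followed by accuracy/sensitivity/specificity computation.
from skopt import BayesSearchCV
from skopt.space import Real
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import cross_val_predict
from sklearn.svm import SVC

def tune_and_validate(X, y, cv=10):
    # Bayesian search over the penalty and RBF scale parameters (assumed ranges).
    search = BayesSearchCV(
        SVC(kernel="rbf"),
        {"C": Real(1e-3, 1e3, prior="log-uniform"),
         "gamma": Real(1e-4, 1e2, prior="log-uniform")},
        n_iter=30, cv=cv, random_state=0)
    search.fit(X, y)

    # Cross-validated predictions with the tuned model to derive the three measures.
    y_hat = cross_val_predict(search.best_estimator_, X, y, cv=cv)
    tn, fp, fn, tp = confusion_matrix(y, y_hat).ravel()
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)   # true positive rate (PD correctly detected)
    specificity = tn / (tn + fp)   # true negative rate (HC correctly detected)
    return accuracy, sensitivity, specificity
```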

3. Results

For each feature ranking technique, voice patterns were ranked from the most to the least informative. Then, the ten-fold cross-validation method was applied for training and testing the SVM classifier with each subset of ranked voice patterns. Following the ten-fold cross-validation, the average and standard deviation of each performance measure were calculated. Figs. 2–4 present the average accuracy, specificity, and sensitivity of the SVM classifier as a function of the number of highest ranked patterns used for training and classification.

According to Fig. 2, the highest SVM classification accuracy (92.21%) in distinguishing between PD and HC was achieved when using 14 patterns selected by the Wilcoxon-based pattern ranking technique. Conversely, the lowest SVM classification accuracy (71.20%) was achieved with a single pattern selected by the entropy-based pattern ranking technique. According to Fig. 3, the highest SVM specificity (82.79%) was obtained with 13 patterns selected by the ROC-based pattern ranking technique. The lowest SVM specificity (0%) was obtained with a single pattern selected by the Bhattacharyya-, genetic-algorithm-, and t-test-based pattern ranking techniques. When using up to seven patterns, the RFE-CBR based ranking method outperformed all other ranking techniques in terms of specificity. Fig. 4 shows that the highest SVM sensitivity (99.63%) was obtained with only a single pattern selected by the ROC-based pattern ranking technique. The lowest SVM sensitivity (85.58%) was obtained with 17 patterns using the entropy-based pattern ranking technique.

Figs. 5–7 present boxplots of the accuracy, specificity, and sensitivity distributions for all pattern ranking techniques under study. According to the distributions of accuracy shown in Fig. 5, the ROC and Wilcoxon ranking techniques are better suited for coupling with SVM than the other ranking techniques. In addition, the Bhattacharyya-based and the entropy-based ranking techniques yield lower accuracy in comparison with the other techniques. Inspecting the distributions of specificity in Fig. 6, the ROC, RFE-CBR, and Wilcoxon based pattern ranking techniques perform better than the other techniques. The entropy-based and the fuzzy mutual


Fig. 3. SVM specificity as a function of the number of patterns, shown for the same eight ranking techniques.

Fig. 4. SVM sensitivity as a function of the number of patterns, shown for the same eight ranking techniques.

Fig. 5. Boxplot of the SVM accuracy of each of the pattern ranking techniques. The symbol ‘+’ indicates an outlier.


Fig. 6. Boxplot of the SVM specificity of each of the pattern ranking techniques. The symbol ‘+’ indicates an outlier.

Fig. 7. Boxplot of the SVM sensitivity of each of the pattern ranking techniques. The symbol ‘+’ indicates an outlier.

information ranking techniques yield lower specificity than the other ranking methods. Finally, by examining the distributions of sensitivity in Fig. 7, one can observe that the ROC-based and the fuzzy mutual information pattern ranking techniques yield high sensitivity results. In contrast, the RFE-CBR method provides lower sensitivity values.

4. Discussion and conclusion

PD is a common neurodegenerative disorder causing motor and cognitive dysfunction. Several studies have been conducted to understand its physiological aspects [2–10] and to design CAD systems to accurately detect PD [15–18]. CAD systems are commonly

based on machine learning, and in many cases they incorporate a feature selection scheme. We examined the effectiveness of SVM coupled with several pattern ranking techniques in distinguishing PD patients from healthy control subjects using voice characteristics. Pattern ranking evaluates the importance of patterns with respect to the class labels, so that the most informative features can be fed to the SVM classifier. We selected SVM for classification due to its ability to map the original patterns into a high-dimensional space and construct an optimal boundary hyper-plane in that space by using nonlinear kernel functions. In addition, it achieves the global optimum and is robust even when the original data sample is small. Moreover, the SVM classifier was also selected based on its success in various biomedical science and engineering applications


Table 1
Comparison with other studies.

| Work | Basis | Features | Classifier | Accuracy / Sensitivity / Specificity |
| --- | --- | --- | --- | --- |
| [11] | RS-fMRI | Fractional amplitude of low-frequency fluctuations | SVM | NA / 92% / 87% |
| [12] | MRI | Principal component analysis | SVM | 92.7% / NA / NA |
| [13] | MRI | Multilevel regions of interest | SVM | 85.78% / 87.79% / 87.64% |
| [14] | Gait | Dynamics of vertical ground reaction force | RBFNN | 96.39% / 96.77% / 95.89% |
| [15] | Emotions | Higher order spectral features | SVM | 85.85% ± 4.88 / 93.16% ± 3.19 / NA |
| [16] | Handwriting | Statistics of horizontal and vertical directions | NB | 91% / 88% / 95% |
| [17] | Articulatory | Standard articulatory features | SVM | 88% / NA / NA |
| [18] | Speech | 132 phonation features | SVM | 97.7% ± 2.8 / NA / NA |
| [18] | Speech | Relief applied to 132 features | SVM | 98.6% ± 2.1 / 99.2% ± 1.8 / 95.1% ± 8.4 |
| Current | Speech | Ranking 22 features using ROC | SVM + BO | 92.13% / 82.79% / 95.27% |
| Current | Speech | All 22 phonation features | SVM + BO | 91.82% / 80.72% / 95.02% |

In the current work, the performance obtained by ranking the 22 phonation features using the ROC is based on 13 features trained by an SVM classifier optimized by Bayesian optimization. RS-fMRI: resting-state functional magnetic resonance imaging. RBFNN: radial basis function neural network. NB: naïve Bayes classifier. BO: Bayesian optimization. NA: information not available.

[11,14,26–28,38–40]. There are several kernel functions that could be used with SVM, including the linear, quadratic, polynomial, and multilayer perceptron kernels. However, we focused on the radial basis function, which is a common, local, and flexible kernel function. We did not consider a linear kernel because it does not perform well in separating data with a non-linear boundary. The SVM parameters were optimized by using Bayesian optimization [37], which is a fast and effective optimization technique.

In this work, we focused on preprocessing techniques that rank patterns based on their capacity to discriminate between instances of the classes before induction takes place. These techniques are simple to implement and interpret, and they form an appealing alternative to wrapper techniques, which are computationally more demanding. In fact, wrapper techniques are based on an induction algorithm that is assessed over each considered pattern set. For each pattern ranking technique considered in this study, the computations of the ten-fold cross-validation protocol were completed in no more than a few seconds. Only one wrapper-based technique was considered in this study for comparison purposes, namely the SVM-RFE-CBR model, in which a recursive feature elimination (RFE) procedure is employed to shrink the effect of correlation bias in the patterns. Consequently, the computational cost of the SVM-RFE-CBR was significantly higher than those of the other techniques.

The obtained results show that the SVM classifier achieved the highest classification accuracy (92.21%) with the first fourteen voice patterns identified by the Wilcoxon-based pattern ranking technique. The SVM achieved the highest specificity (82.79%) when trained with the thirteen highest-ranked voice patterns identified by the ROC-based pattern ranking technique. The SVM yielded the highest sensitivity, 99.63%, with only one voice pattern under the ROC-based pattern ranking technique. These results are of interest, since they can guide the design of CAD systems for PD. Overall, the SVM classifier achieved 91.82% accuracy, 80.72% sensitivity, and 95.02% specificity when trained with all 22 phonation-based features. Using the ROC-based ranked features with 13 patterns yielded 92.13% accuracy, 82.79% sensitivity, and 95.27% specificity. Therefore, decreasing the number of phonation features leads to small improvements in sensitivity and specificity.

In summary, our study shows that the ROC-based and the Wilcoxon-based pattern ranking techniques combined with the SVM classifier perform well relative to the other techniques considered in this work. ROC achieved the highest sensitivity (99.63%) with only one voice pattern, the highest specificity (82.79%) with thirteen voice patterns, and the second-best accuracy with 13 patterns (92.13% against 92.21% with 14 patterns obtained by the Wilcoxon-based pattern ranking). The high performance of the ROC ranking can be explained by the fact that the ROC approach seeks an effective compromise between sensitivity and specificity.
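Relating to the wrapper-based technique discussed above, plain SVM-based recursive feature elimination can be expressed as in the sketch below; it omits the correlation-bias-reduction step that distinguishes the RFE-CBR variant used in this study, and the scikit-learn usage is an assumption rather than the authors' code.

```python
# Sketch of plain SVM-RFE (without the correlation bias reduction of RFE-CBR).
from sklearn.feature_selection import RFE
from sklearn.svm import SVC

def svm_rfe_ranking(X, y):
    # A linear SVM supplies the weight-based ranking criterion; the feature with the
    # smallest |w| is removed at each step until one feature remains.
    rfe = RFE(estimator=SVC(kernel="linear"), n_features_to_select=1, step=1)
    rfe.fit(X, y)
    return rfe.ranking_   # rank 1 = kept longest, i.e., most informative
```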

Finally, for illustration purposes, Table 1 compares the results of previous studies using different modalities and methods for PD detection. The work of [18] yielded higher performance than that obtained in the current study. This is mainly because the dataset in [18] contained a larger number of phonation features than the one used in the current study: 132 compared to 22. Therefore, informative and discriminative phonation features available in the dataset used in [18] could not be used in our study. Unfortunately, we do not have access to the comprehensive dataset of [18]. According to the results presented in Table 1, PD detection systems based on speech yield better results than those based on MRI, emotions, and handwriting characteristics. In addition, PD detection systems based on gait patterns yield high performance measures. Lastly, there is no specific theory on the choice of features used to identify PD subjects. Therefore, multimodal feature-based systems for PD diagnosis need to be explored; such systems may further improve detection accuracy.

Acknowledgement

Supported by the Natural Sciences and Engineering Research Council of Canada (RGPIN 2015-05103).

References

[1] R. Yuvaraj, M. Murugappan, N. Mohamed Ibrahim, K. Sundaraj, M.I. Omar, K. Mohamad, R. Palaniappan, Optimal set of EEG features for emotional state classification and trajectory visualization in Parkinson’s disease, Int. J. Psychophysiol. 94 (2014) 482–495.
[2] G. De Michele, S. Sello, M. Chiara Carboncini, B. Rossi, S.-K. Strambi, Cross-correlation time-frequency analysis for multiple EMG signals in Parkinson’s disease: a wavelet approach, Med. Eng. Phys. 25 (2003) 361–369.
[3] M.R. Daliri, Chi-square distance kernel of the gaits for the diagnosis of Parkinson’s disease, Biomed. Signal Process. Control 8 (2013) 66–70.
[4] Y. Xia, Q. Gao, Q. Ye, Classification of gait rhythm signals between patients with neuro-degenerative diseases and normal subjects: experiments with statistical features and different classification models, Biomed. Signal Process. Control 18 (2015) 254–262.
[5] B.L. Su, R. Song, L.Y. Guo, C.W. Yen, Characterizing gait asymmetry via frequency sub-band components of the ground reaction force, Biomed. Signal Process. Control 18 (2015) 56–60.
[6] Y. Wu, P. Chen, X. Luo, M. Wu, L. Liao, S. Yang, R.M. Rangayyan, Measuring signal fluctuations in gait rhythm time series of patients with Parkinson’s disease using entropy parameters, Biomed. Signal Process. Control 31 (2017) 265–271.
[7] M. Eltoukhy, C. Kuenze, J. Oh, M. Jacopetti, S. Wooten, J. Signorile, Microsoft Kinect can distinguish differences in over-ground gait between older persons with and without Parkinson’s disease, Med. Eng. Phys. 44 (2017) 1–7.
[8] G. Valenza, S. Orsolini, S. Diciotti, L. Citi, E.P. Scilingo, M. Guerrisi, S. Danti, C. Lucetti, C. Tessa, R. Barbieri, N. Toschi, Assessment of spontaneous cardiovascular oscillations in Parkinson’s disease, Biomed. Signal Process. Control 26 (2016) 80–89.
[9] S. Bilgin, The impact of feature extraction for the classification of amyotrophic lateral sclerosis among neurodegenerative diseases and healthy subjects, Biomed. Signal Process. Control 31 (2017) 288–294.

[10] S. Skodda, Steadiness of syllable repetition in early motor stages of Parkinson’s disease, Biomed. Signal Process. Control 17 (2015) 55–59.
[11] Y. Tang, L. Meng, C.-M. Wan, Z.-H. Liu, W.-H. Liao, X.-X. Yan, X.-Y. Wang, B.-S. Tang, J.-F. Guo, Identifying the presence of Parkinson’s disease using low-frequency fluctuations in BOLD signals, Neurosci. Lett. 645 (2017) 1–6.
[12] C. Salvatore, A. Cerasa, I. Castiglioni, F. Gallivanone, A. Augimeri, M. Lopez, G. Arabia, M. Morelli, M.C. Gilardi, A. Quattrone, Machine learning on brain MRI data for differential diagnosis of Parkinson’s disease and Progressive Supranuclear Palsy, J. Neurosci. Methods 222 (2014) 230–237.
[13] B. Peng, S. Wang, Z. Zhou, Y. Liu, B. Tong, T. Zhang, Y. Dai, A multilevel-ROI-features-based machine learning method for detection of morphometric biomarkers in Parkinson’s disease, Neurosci. Lett. 651 (2017) 88–94.
[14] W. Zeng, F. Liu, Q. Wang, Y. Wang, L. Ma, Y. Zhang, Parkinson’s disease classification using gait analysis via deterministic learning, Neurosci. Lett. 633 (2016) 268–278.
[15] R. Yuvaraj, M. Murugappan, N.M. Ibrahim, K. Sundaraj, M.I. Omar, K. Mohamad, R. Palaniappan, Detection of emotions in Parkinson’s disease using higher order spectral features from brain’s electrical activity, Biomed. Signal Process. Control 14 (2014) 108–116.
[16] C. Kotsavasiloglou, N. Kostikis, D. Hristu-Varsakelis, M. Arnaoutoglou, Machine learning-based classification of simple drawing movements in Parkinson’s disease, Biomed. Signal Process. Control 31 (2017) 174–180.
[17] M. Novotný, J. Rusz, R. Čmejla, E. Růžička, Automatic evaluation of articulatory disorders in Parkinson’s disease, IEEE/ACM Trans. Audio Speech Lang. Process. 22 (2014) 1366–1378.
[18] A. Tsanas, M.A. Little, P.E. McSharry, J. Spielman, L.O. Ramig, Novel speech signal processing algorithms for high-accuracy classification of Parkinson’s disease, IEEE Trans. Biomed. Eng. 59 (2012) 1264–1271.
[19] B. Harel, M. Cannizzaro, P.J. Snyder, Variability in fundamental frequency during speech in prodromal and incipient Parkinson’s disease: a longitudinal case study, Brain Cogn. 56 (2004) 24–29.
[20] A. Ho, R. Iansek, C. Marigliani, J. Bradshaw, S. Gates, Speech impairment in a large sample of patients with Parkinson’s disease, Behav. Neurol. 11 (1998) 131–137.
[21] S. Lahmiri, C. Gargour, M. Gabrea, Statistical features selection and pathologies detection in retina digital images, in: Proceedings of the 38th Annual Conference of the IEEE Industrial Electronics Society, 2012, pp. 1585–1590.
[22] K. De Jong, Learning with genetic algorithms: an overview, Machine Learning, Vol. 3, Kluwer Academic Publishers, 1988.
[23] K. Yan, D. Zhang, Feature selection and analysis on correlated gas sensor data with recursive feature elimination, Sens. Actuators B: Chem. 212 (2015) 353–363.
[24] V.N. Vapnik, The Nature of Statistical Learning Theory, Springer, New York, 1995.
[25] S. Lahmiri, D.A. Dawson, A. Shmuel, Performance of machine learning methods in diagnosing Parkinson’s disease based on dysphonia measures, Biomed. Eng. Lett. 8 (2018) 29–39.


[26] R. Khelifi, M. Adel, S. Bourennane, Segmentation of multispectral images based on band selection by including texture and mutual information, Biomed. Signal Process. Control 20 (2015) 16–23.
[27] S. Lahmiri, Image characterization by fractal descriptors in variational mode decomposition domain: application to brain magnetic resonance, Physica A 456 (2016) 235–243.
[28] S. Lahmiri, Glioma detection based on multi-fractal features of segmented brain MRI by particle swarm optimization techniques, Biomed. Signal Process. Control 31 (2017) 148–155.
[29] U. Fayyad, K. Irani, Multi-interval discretization of continuous valued attributes for classification learning, in: Proceedings of the International Joint Conference on Artificial Intelligence, 1993, pp. 1022–1029.
[30] S. Kullback, R.A. Leibler, On information and sufficiency, Ann. Math. Stat. 22 (1951) 79–86.
[31] Y. Chen, L. Zhang, J. Li, Y. Shi, Domain driven two-phase feature selection method based on Bhattacharyya distance and kernel distance measurements, in: Proceedings of the IEEE International Conferences on Web Intelligence and Intelligent Agent Technology, 2011, pp. 217–220.
[32] H. Mamitsuka, Selecting features in microarray classification using ROC curves, Pattern Recognit. 39 (2006) 2393–2404.
[33] J.L. Myers, A. Well, Research Design and Statistical Analysis, Lawrence Erlbaum Associates, Mahwah, New Jersey, USA, 2003.
[34] N. Hoque, H.A. Ahmed, D.K. Bhattacharyya, J.K. Kalita, A fuzzy mutual information-based feature selection method for classification, Fuzzy Inf. Eng. 8 (2016) 355–384.
[35] I. Beheshti, H. Demirel, H. Matsuda, for the Alzheimer’s Disease Neuroimaging Initiative, Classification of Alzheimer’s disease and prediction of mild cognitive impairment-to-Alzheimer’s conversion from structural magnetic resource imaging using feature ranking and a genetic algorithm, Comput. Biol. Med. 83 (2017) 109–119.
[36] I. Guyon, J. Weston, S. Barnhill, V. Vapnik, Gene selection for cancer classification using support vector machines, Mach. Learn. 46 (2002) 389–422.
[37] T. Hastie, R. Tibshirani, J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, second edition, Springer-Verlag, New York, 2009.
[38] S. Lahmiri, M. Boukadoum, New approach for automatic classification of Alzheimer’s disease, mild cognitive impairment and healthy brain magnetic resonance images, IET Healthcare Technol. Lett. 1 (2014) 32–36.
[39] S. Lahmiri, M. Boukadoum, Hybrid discrete wavelet transform and Gabor filter banks processing for features extraction from biomedical images, J. Med. Eng. (2013) 104684, http://dx.doi.org/10.1155/2013/104684.
[40] S. Lahmiri, An accurate system to distinguish between normal and abnormal electroencephalogram records with epileptic seizure free intervals, Biomed. Signal Process. Control 40 (2018) 312–317.