Application of Empirical Mode Decomposition (EMD) on DaTSCAN SPECT images to explore Parkinson Disease

Expert Systems with Applications 40 (2013) 2756–2766

A. Rojas a, J.M. Górriz a,*, J. Ramírez a, I.A. Illán a, F.J. Martínez-Murcia a, A. Ortiz b, M. Gómez Río c, M. Moreno-Caballero c

a Dept. of Signal Theory, Networking and Communications, University of Granada, 18071 Granada, Spain
b Dept. of Ingeniería de Comunicaciones, University of Malaga, 29071 Malaga, Spain
c Virgen de las Nieves Hospital, Granada, Spain


Keywords: Parkinsonian Syndrome (PS), Parkinson Disease (PD), Computer Aided Diagnosis (CAD), Empirical Mode Decomposition (EMD), Principal Component Analysis (PCA), Independent Component Analysis (ICA), Support Vector Machines (SVM), DaTSCAN

Abstract: Parkinsonism is the second most common neurodegenerative disorder. It comprises several pathologies with similar symptoms, which makes the diagnosis genuinely difficult. I-ioflupane allows in vivo images of the brain to be obtained that can be used to assist the PS diagnosis and provides a way to improve its accuracy. In this paper a new method for brain SPECT image feature extraction is presented. This novel Computer Aided Diagnosis (CAD) system is based on the Empirical Mode Decomposition (EMD), which decomposes any non-linear and non-stationary time series into a small number of oscillatory Intrinsic Mode Functions (IMFs) and a monotonic residue. An 80-image DaTSCAN database from the "Virgen de las Nieves" Hospital in Granada (Spain) was used to evaluate this method, yielding up to 95% accuracy, which greatly improves the baseline Voxel-As-Feature (VAF) approach. © 2012 Elsevier Ltd. All rights reserved.

1. Introduction

Parkinsonian Syndrome (PS) or Parkinsonism is one of the most common neurodegenerative disorders, characterized by the presence of hypokinesia associated with rest tremor, rigidity or postural instability. Parkinsonism is commonly caused by Parkinson's Disease (PD), a neurodegenerative disease which originates from the progressive loss of dopaminergic neurons of the nigrostriatal pathway. This cell loss involves a substantial decrease in the dopamine content of the striatum and a loss of dopamine transporters (Booij et al., 1997). Up to 2% of the population over 65 years suffers from PD, and about 20–25% of all cases are not correctly diagnosed. I-ioflupane (better known as DaTSCAN) is a neuroimaging radiopharmaceutical used to assist the diagnosis of Parkinsonism. This drug binds to the dopamine transporters in the striatum. Dopamine transporters (DAT) are proteins situated at the presynaptic terminal of dopaminergic neurons which are responsible for the re-uptake of dopamine. In order to obtain the distribution of the radiopharmaceutical in the brain and visualize the loss of dopamine, a Single-Photon Emission Computed Tomography (SPECT) camera is used (Booij et al., 1997; Winogrodzka et al., 2003). The SPECT maps obtained are normally examined by experts

* Corresponding author. Fax: +34 958243271. E-mail address: [email protected] (J.M. Górriz). © 2012 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.eswa.2012.11.017

in PS diagnosis (Lozano, del Valle Torres, García, Cardo, & Arillo, 2007). These experts often use proprietary software to delimit Regions Of Interest (ROIs) and quantify the radiopharmaceutical uptake (Górriz, Segovia, Ramírez, Lassl, & Salas-González, 2011; Jennings et al., 2004; Ortega Lozano et al., 2010; Pirker et al., 2000). This procedure is subjective and prone to errors, since it relies on gross changes in transporter density throughout the ROI (Salas-Gonzalez et al., 2010). As such, it may not be sensitive to changes in the pattern of distribution that can characterize the progression of the disease (Scherfler & Nocker, 2009). Recently, methods based on the machine learning paradigm have been applied to image analysis procedures, leading to CAD systems for several neurodegenerative diseases (Seppi et al., 2006; Zubal et al., 2007), such as Alzheimer's Disease or Parkinson's Disease (Illán et al., 2012; Segovia et al., 2012). Such a CAD system extracts complex high-dimensional features from the images, which are used to train an automatic SVM classifier to discriminate between normal and pathological subjects (Illán et al., 2010; Ramírez et al., 2009), thus performing an automatic diagnosis (Chaves, Ramírez, Górriz, & Illán, 2012; Chaves et al., 2011; Chaves, Ramírez, Górriz, & Puntonet, 2012; Chaves, Ramírez, Górriz, Salas-Gonzalez, & López, 2011; Martínez-Murcia, Górriz, Ramírez, Puntonet, & Salas-Gonzalez, 2012; Ortiz, Górriz, Ramírez, & Salas-Gonzalez, 2011). The aim of this work is to continue the work undertaken by Gallix, Górriz, Ramírez, Illán, and Lang (2012) and evaluate the


performance of an adaptive image decomposition technique based on EMD (Zeiler et al., 2010). For this purpose we make use of several methods. Firstly, we apply a preprocessing step consisting of an intensity normalization and a Gaussian filtering of the images, to remove the noise added by the image reconstruction system at the hospital. Once the noise is removed, the images are decomposed with EMD and filtered again. Finally, we make use of Principal Component Analysis (PCA) (Maćkiewicz & Ratajczak, 1993; Moore, 1981) and Independent Component Analysis (ICA) (Hyvärinen, 1999b) to extract their features, and classify them with a Support Vector Machine (SVM) (Martínez et al., 2010; Segovia, Górriz, Ramírez, Salas-González, & Álvarez, 2012; Vapnik, 1998), as shown in Fig. 1.

This article is organized as follows. In Section 2 the image acquisition, preprocessing, adaptive image decomposition technique (EMD), feature extraction and classification methods used in this work are presented. In Section 3 we describe the proposed experiments, comparing the results obtained with the baseline (VAF) results; the results are also discussed there. Finally, conclusions are drawn in Section 4.


2. Materials and methods

2.1. Database

The images were acquired three hours after the intravenous injection of 111–185 MBq of I-123-ioflupane, with prior thyroid blocking with Lugol's solution. The tomographic study (SPECT) with ioflupane/FP-CIT-I-123 was performed using a triple-head gamma camera (Picker 3000), equipped with collimators for a low-energy, ultra-high-resolution neuro fan beam (LEUHR-NeuroFan). A 360° circular orbit was made around the cranium, with a rotation radius of 12.9–13.2 cm, in step-and-shoot mode: 3° intervals, 45 s per step, 128 × 128 matrix. Image reconstruction was carried out using a Metz filter (75% max) and iterative OS-MLEM (4 iterations); the images were then corrected for attenuation by the Chang method (attenuation coefficient = 0.11 cm⁻¹). Final images were reoriented according to the transoblique OM line and reformatted in the three standardized planes.

All the images were spatially normalized using the SPM software (Friston, Ashburner, Kiebel, Nichols, & Penny, 2007), yielding a 45 × 73 × 73 three-dimensional functional activity map for each patient. This software performs the normalization in two steps. The first one assumes a general affine model with 12 parameters and a cost function f (Woods, 2000, chap. 29) that measures the difference between a given image and a template:

f = \sum_k \left( i(M x_k) - t(x_k) \right)^2,    (1)

where i is the initial image, Mx_k is the affine transformation of each voxel x_k = (x_1, x_2, x_3), and t is the template (Kas et al., 2007). The second step consists of several non-linear deformations defined by a linear combination of three-dimensional discrete cosine transform (DCT) basis functions. This spatial normalization ensures that the same position in the volume coordinate system corresponds to the same anatomical position in different images (Ramírez et al., 2008).

Once the images had been properly normalized, they were interpreted by three Nuclear Medicine specialists. Visual assessment was established by exclusively considering the normal/abnormal criterion, and the images were labeled after reaching a consensus report among the three specialists. A study was considered normal when bilateral, symmetrical uptake appeared in the caudate and putamen nuclei, and abnormal when there were areas of qualitatively reduced uptake in any of the striatal structures (Lozano et al., 2007).

A total of 80 subjects (41 patients and 39 controls), randomly selected from the studies performed in our center and referred to it because of a movement disorder, were included in the study. The mean age was 73.8 years (men, 74.4 years; women, 73.2 years). The clinical diagnosis, used as the "gold standard" to establish the existence of PS, was made using the diagnostic criteria established previously, with a minimum follow-up period of 18 months. Patients receiving treatment with drugs that had a known or suspected effect on the level of the dopaminergic transporters through a direct competitive mechanism were excluded.

Although PD is the most representative pathology of PS, there are other clinical pictures which, though they differ clinically from PD, are also expressed by this set of symptoms. Worth noting are multisystem atrophy (MSA), progressive supranuclear palsy (PSP) and corticobasal degeneration (CBD), in which, unlike PD, as well as involvement of the presynaptic terminal, there is involvement at the post-synaptic level of the nigrostriatal pathway (Winogrodzka et al., 2003).

2.2. Normalization

Fig. 1. Flowchart of the CAD system.

Spatial normalization is performed just after image acquisition, so we focus here on intensity normalization. It is


necessary in order to establish comparisons between the uptake value in areas of specific activity (binding to dopaminergic transporters) and areas of non-specific activity (vascular activity) between subjects (Illán et al., 2011). A successful approach is to estimate the non-specific activity in Regions of Interest (ROIs), fixed manually or with the help of MRI coregistration, and then to reparametrize the counts of the whole image to this value. Common choices are the occipital cortex or the cerebellum (Messa et al., 1998). In the context of CAD systems, manual interventions in the preprocessing of the images can be considered a drawback, as they require a degree of experience, and fully automatic assistance is desired (Kas et al., 2007; Woods, 2000, chap. 29). Moreover, complementary MRI images are not always available. Therefore, the problem of intensity normalization is tackled here by obtaining an intrinsic parameter I_p from the image and estimating the binding activity as:

t' = t / I_p    (2)

where t denotes the spatially normalized image and t' the spatially and intensity normalized image. I_p is determined by one of two alternative methods (Illán et al., 2010):

• Integral normalization: all the intensity values of the image are summed as an approximation of the expression I_p = \int t, which gives an integral value of the intensity.
• Normalization to a maximum: I_p is obtained from a 100-bin histogram of intensities. The spectrum of the images can be distorted by outliers if the maximum intensity value of the image is chosen, and small values may cause image saturation. Therefore, I_p is obtained by averaging the range of intensities covered by the 95th bin.

2.3. Gaussian filter

In order to remove noise from the original images, we use a Gaussian filter, represented by

G(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left( -\frac{x^2}{2\sigma^2} \right)    (3)

where σ² is the variance, and the size of the filter kernel l (l ≤ x) is normally determined by omitting values lower than 5% of the maximum value of the kernel. For noise suppression a large filter variance is effective, as it smooths out the noise, but at the same time it distorts those parts of the image where there are abrupt changes in pixel brightness (Deng & Cahill, 2003). For this work we use a Gaussian filter with a variance σ² of 0.9 and a 9 × 9 kernel.
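As an illustration, this smoothing step can be reproduced with a standard image-processing library. The following is a minimal sketch assuming SciPy (the paper does not name an implementation), using the stated variance of 0.9 and a 9 × 9 kernel:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def smooth_slice(img, variance=0.9, kernel_size=9):
    """2-D Gaussian smoothing with the parameters quoted in the text."""
    sigma = np.sqrt(variance)
    radius = (kernel_size - 1) / 2            # 4 pixels for a 9 x 9 kernel
    # scipy truncates the kernel at `truncate * sigma` pixels on each side
    return gaussian_filter(img, sigma=sigma, truncate=radius / sigma)
```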

An intrinsic mode function (IMF) satisfies the following two conditions:

• In the whole data set, the number of zero-crossings and the number of extrema must be equal or differ at most by one.
• At any point, the envelopes defined by the local maxima and by the local minima have a mean value of zero.

IMFs are obtained using a sifting process in which only the local extrema take part. The process is divided into several steps:

1. Identify all the local extrema and estimate the upper and lower envelopes with a cubic spline interpolation.
2. Obtain the first component h by taking the difference between the data and the local mean of the two envelopes.
3. Treat h as the data and repeat steps (1) and (2) until the envelopes are symmetric with respect to zero mean under a certain criterion. The final h is designated as c_j (a code sketch of this sifting loop is given after Eq. (4)).

The complete sifting process stops when the residue, r_n, becomes a monotonic function, or a function with only one extremum, from which no more IMFs can be extracted. Summing up, EMD is an adaptive method that decomposes the data x(t) into IMFs c_j and a residual component r_n,

x(t) = \sum_{j=1}^{n} c_j + r_n,    (4)

where r_n can be a constant, a monotonic function, or a function containing only a single extremum, from which no more IMFs can be extracted.

The excellent performance of EMD has stimulated the development of multi-dimensional EMD. Mostly, these efforts have succeeded only for spatial two-dimensional image analysis. The earliest such attempt treats a two-dimensional image as a collection of one-dimensional slices and then decomposes each slice using one-dimensional EMD (Huang, 2009; Zeiler et al., 2010). This is a pseudo-two-dimensional EMD. There are some inconveniences when using this method, but the main shortcoming is mode mixing or, more generally, scale mixing. In one-dimensional EMD, scale mixing is defined as a single IMF either consisting of signals of widely disparate scales, or a signal of a similar scale residing in different IMF components, caused by unevenly distributed extrema along the time axis (Linderhed, 2009; Zeiler et al., 2011). It is a consequence of signal intermittency and can not only cause serious aliasing in the time–frequency distribution, but also make the individual IMFs lose their clear physical meaning. Another side effect of mode mixing is the lack of physical uniqueness (Huang, 2009). To overcome the scale mixing problem, a new method was developed, the ensemble EMD (EEMD), which defines the IMF components as the mean of an ensemble, each member consisting of the signal plus white noise of finite amplitude (Wu & Huang, 2009).
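For reference, a minimal sketch of the one-dimensional sifting procedure described above is given below. It is illustrative only: it uses a fixed number of sifting passes as a simplified stopping criterion, whereas the exact criterion used by the authors is not specified.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def _envelope_mean(x):
    """Mean of the upper and lower cubic-spline envelopes of x (step 1)."""
    t = np.arange(len(x))
    maxima = np.where((x[1:-1] > x[:-2]) & (x[1:-1] > x[2:]))[0] + 1
    minima = np.where((x[1:-1] < x[:-2]) & (x[1:-1] < x[2:]))[0] + 1
    if len(maxima) < 2 or len(minima) < 2:
        return None                          # too few extrema: x is a residue
    upper = CubicSpline(np.r_[0, maxima, len(x) - 1],
                        np.r_[x[0], x[maxima], x[-1]])(t)
    lower = CubicSpline(np.r_[0, minima, len(x) - 1],
                        np.r_[x[0], x[minima], x[-1]])(t)
    return (upper + lower) / 2.0

def emd(x, n_imfs=3, n_sift=10):
    """Decompose x into up to n_imfs IMFs plus a residue, as in Eq. (4)."""
    imfs, residue = [], np.asarray(x, dtype=float).copy()
    for _ in range(n_imfs):
        if _envelope_mean(residue) is None:  # monotonic: nothing left to extract
            break
        h = residue.copy()
        for _ in range(n_sift):              # steps 2-3, simplified stopping rule
            m = _envelope_mean(h)
            if m is None:
                break
            h = h - m
        imfs.append(h)
        residue = residue - h
    return imfs, residue
```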

2.4. Multidimensional Ensemble EMD (MEEMD)

We make use of an adaptive image decomposition method, Empirical Mode Decomposition (EMD), proposed by Huang et al. (1998), and its improved version, Ensemble Empirical Mode Decomposition (EEMD) (Wu & Huang, 2009). Contrary to previous decomposition methods, EMD is empirical, intuitive, direct, and adaptive, without pre-determined basis functions. EMD was developed for the analysis of non-linear and non-stationary time series. Any time series consists of different simple intrinsic modes of oscillation. The aim is to identify the intrinsic oscillatory modes by their characteristic time scales in the data empirically (Huang, 2009). A simple oscillatory mode is called an intrinsic mode function (IMF) and satisfies the two conditions listed above.

Fig. 2. Schematic of the IMF summing strategy of the MEEMD algorithm applied to data in two orthogonal directions.


A battery of tests permitted us to choose the following parameters:

• Signal-to-noise ratio: 1.6.
• Number of iterations: 50.
• Number of IMFs: 3, plus a residue.

The Ensemble EMD method consists of the following steps:

1. Add a white noise series to the targeted data.
2. Decompose the data with the added white noise into IMFs (using the EMD described previously).
3. Repeat steps (1) and (2) a certain number of times.
4. Obtain the (ensemble) means of the corresponding IMFs of the decompositions as the final result.
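A sketch of these ensemble steps, reusing the emd() sketch above and the parameters just listed, is shown below. How the quoted signal-to-noise ratio of 1.6 translates into the amplitude of the added noise is an assumption of this sketch (here, noise standard deviation = signal standard deviation / 1.6).

```python
import numpy as np

def eemd(x, n_imfs=3, n_trials=50, snr=1.6, seed=0):
    """Ensemble EMD: average the IMFs of n_trials noise-perturbed copies of x."""
    rng = np.random.default_rng(seed)
    noise_std = np.std(x) / snr                               # assumed reading of "SNR: 1.6"
    imf_sum = np.zeros((n_imfs, len(x)))
    res_sum = np.zeros(len(x))
    for _ in range(n_trials):
        noisy = x + rng.normal(0.0, noise_std, len(x))        # step 1
        imfs, residue = emd(noisy, n_imfs=n_imfs)             # step 2
        for j, c in enumerate(imfs):                          # step 4: accumulate
            imf_sum[j] += c
        res_sum += residue
    return imf_sum / n_trials, res_sum / n_trials             # ensemble means
```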

2.5. Feature selection

White noise populates the whole time–frequency space uniformly with constituting components at different scales. Although each individual trial may produce very noisy results, the noise is cancelled out in the ensemble mean of all the trials. Finally, Wu and Huang developed the Multidimensional EEMD (Wu, Huang, & Chen, 2009). The peculiarity of this method is that it decomposes data of any dimension using only 1D-EEMD. For a 2D image, the data are first decomposed row by row, stopping the sifting process after n IMFs, which results in n + 1 intermediate images (including the residue), h_1 to h_{n+1}. Each of these images is then decomposed again, this time column by column, with each intermediate image h_y splitting into n + 1 components, h_{1,y} to h_{n+1,y}. The final IMFs are obtained by adding the intermediate IMFs of the same scale, as illustrated in Fig. 2. This method can easily be extended to higher-dimensional data sets by successively decomposing lines of data along every dimension and adding sub-images of the same scale (Huang, 2009; Wu et al., 2009). Unfortunately, the computational cost increases exponentially with the dimension of the data. MEEMD has two parameters to set: the power of the added white noise and the number of iterations of the noise-addition process (called the ensemble number in the literature) (Wu et al., 2009). By setting these parameters properly, this method provides a more robust decomposition and prevents mode mixing, as can be observed in Fig. 4.
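The two-dimensional strategy of Fig. 2 can then be sketched as follows, reusing the eemd() sketch above. The rule used here for combining components of the same scale (grouping by the minimum of the row and column indices) is one common reading of the summing strategy and is an assumption of this sketch.

```python
import numpy as np

def meemd_2d(img, n_imfs=3, n_trials=50, snr=1.6):
    """Pseudo-2-D MEEMD: row-wise EEMD, then column-wise EEMD, then scale merging."""
    rows, cols = img.shape
    # First pass: decompose every row, giving n_imfs + 1 intermediate images.
    h_row = np.zeros((n_imfs + 1, rows, cols))
    for r in range(rows):
        imfs, residue = eemd(img[r, :], n_imfs, n_trials, snr)
        h_row[:n_imfs, r, :] = imfs
        h_row[n_imfs, r, :] = residue
    # Second pass: decompose every intermediate image column by column.
    h = np.zeros((n_imfs + 1, n_imfs + 1, rows, cols))
    for i in range(n_imfs + 1):
        for c in range(cols):
            imfs, residue = eemd(h_row[i, :, c], n_imfs, n_trials, snr)
            h[i, :n_imfs, :, c] = imfs
            h[i, n_imfs, :, c] = residue
    # Combine components of the same scale (minimal-scale grouping).
    out = np.zeros((n_imfs + 1, rows, cols))
    for i in range(n_imfs + 1):
        for j in range(n_imfs + 1):
            out[min(i, j)] += h[i, j]
    return out        # out[0..n_imfs-1]: 2-D IMFs, out[n_imfs]: residue
```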

2.5.1. Mask

A binary mask is applied to each volume in order to select only the high-intensity voxels of the striatum area. This mask is computed as follows:

m_i = \begin{cases} 1 & \text{if } c_i \geq th \\ 0 & \text{otherwise} \end{cases}    (5)

where m_i, i = 1, ..., n are the n voxels of the mask, c_i, i = 1, ..., n are the n voxels of an intermediate image c, and th is the threshold. The image c is computed by squaring the average of all the controls in the database and making it symmetric with respect to the axial axis. Applying this mask selects the voxels whose intensity is high (compared with the maximum intensity) in healthy subjects. In practice, this is equivalent to selecting the voxels of the striatum.
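A minimal sketch of this mask construction is shown below; the threshold th, and the assumption that the axial symmetry is taken along the first array dimension, are free choices of this sketch.

```python
import numpy as np

def striatal_mask(control_imgs, th):
    """Binary mask of Eq. (5) from the spatially normalized control images.

    control_imgs: array of shape (n_controls, X, Y, Z).
    """
    c = np.mean(control_imgs, axis=0) ** 2   # squared average of the controls
    c = 0.5 * (c + c[::-1, :, :])            # symmetrise about the axial axis (assumed axis 0)
    return c >= th                           # m_i = 1 where c_i >= th
```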

Table 1. Results of the VAF system.

        Accuracy (%)   Sensitivity (%)   Specificity (%)
VAF     87.5           90.24             84.62

Fig. 3. The 21 masked slices used from a PD patient.

Fig. 4. Experiment VI: MEEMD decomposition of a PD patient brain slice (from left to right: original slice, IMF1, IMF2, IMF3 and residue).


Table 2. Average accuracy, sensitivity and specificity values (%) obtained by the application of ICA, for each of the 14 feature vectors (IMF1, IMF2, IMF3, Residue, and their two- and three-way combinations) and for each of Experiments I–VI (columns Acc., Sens. and Spec. per experiment).

2.6. Feature extraction

2.6.1. Principal Component Analysis (PCA)

The aim is a reduction of the dimensionality of the observation space. The reduction is obtained by creating new linear combinations of the variables characterizing the objects (López et al., 2011; Maćkiewicz & Ratajczak, 1993). These combinations, termed principal components, must satisfy certain mathematical and statistical conditions and are written in the form:

V = A'X    (6)

where V is a matrix of the new variables, A is a matrix of orthonormal eigenvectors of matrix S and X is the observation matrix. Transformation (6) is possible after solving Eq. (7).

|S - \lambda I| = 0    (7)

where S is the variance–covariance matrix of order (p × p), λ is a characteristic root of the determinantal equation, and I is the identity matrix of order (p × p). Eq. (7) is a polynomial of degree p in the unknown λ; hence it has p roots, which can be ordered such that λ_1 ≥ λ_2 ≥ λ_3 ≥ ... ≥ λ_p ≥ 0. Since there is an orthonormal column eigenvector A_i corresponding to each root λ_i, the variable V_1 attains the maximum value λ_1 (maximum variance) and is termed the first principal component. The quantity λ_2 corresponds to the variable V_2, which is therefore termed the second principal component, and so on (Maćkiewicz & Ratajczak, 1993; Moore, 1981). The resulting vectors A_1, A_2, ..., A_p are orthonormal and their basic property is the lack of correlation among them. The variance of the ith component is λ_i, that is

\mathrm{Var}(A_i X) = \lambda_i    (8)
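In practice, the principal components can be obtained with any standard PCA routine applied to the vectorised, masked images (one row per subject); scikit-learn is assumed here for illustration.

```python
from sklearn.decomposition import PCA

def pca_features(X, n_components=10):
    """Project the observation matrix X (subjects x voxels) onto the first PCs."""
    pca = PCA(n_components=n_components)
    V = pca.fit_transform(X)           # scores: the new variables of Eq. (6)
    return V, pca.components_          # components_ plays the role of A'
```

With 80 subjects, at most 79 non-trivial components can be obtained, which is consistent with the 79 components mentioned in Section 3.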

2.6.2. Independent Component Analysis (ICA)

ICA (Hyvärinen, 1999b) is a statistical technique that represents a multidimensional random vector as a linear combination of non-Gaussian random variables (called independent components) that are as independent as possible (Hyvärinen & Oja, 2000; Hyvärinen, 1999b; Segovia et al., 2012). Assume that we observe n linear mixtures x_1, x_2, ..., x_n of length N that can be modeled as an expression of K independent components (ICs). These independent components are defined as S = (s_1, s_2, ..., s_K), where each vector s_k has length N. Thus, each random vector x_n can be described as a linear combination of the K independent components:

x_n = a_{1n} s_1 + a_{2n} s_2 + \cdots + a_{Kn} s_K    (9)

We denote by X the matrix whose elements (rows) are x_1, x_2, ..., x_n, and by A the mixing matrix, containing all the elements a_{kn}, that projects each image onto the space defined by the ICs. Thus the mixing model is as follows:

X = AS    (10)

It is assumed that all the components s_k are statistically independent. This is enforced by measuring the degree of non-Gaussianity of all the independent components (Illán et al., 2010). After estimating A, we can compute its inverse, W, and obtain the projection S of the images of the dataset onto the IC space (Hyvärinen, 1999a; Hyvärinen & Oja, 1997) with:

S = WX    (11)
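Similarly, the independent components can be estimated with FastICA (Hyvärinen, 1999a); the scikit-learn implementation and the number of components are assumptions of this sketch.

```python
from sklearn.decomposition import FastICA

def ica_features(X, n_components=10):
    """Estimate S = WX for the observation matrix X (subjects x voxels), Eq. (11)."""
    ica = FastICA(n_components=n_components, max_iter=1000)
    S = ica.fit_transform(X)           # projections onto the independent components
    return S, ica.mixing_              # estimate of the mixing matrix A of Eq. (10)
```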

2.7. Classification

2.7.1. Support Vector Machines (SVM)

The aim of an SVM (Vapnik, 1998) is to separate a set of binary-labeled training data by means of a hyperplane that is maximally


Fig. 5. Experiment III: accuracy, sensitivity and specificity values (%) of the individual IMFs and the residue versus the number of selected components, obtained by the application of ICA (VAF accuracy shown for reference).

Fig. 6. Experiment III: combinations of IMFs and residue with the highest accuracy, sensitivity and specificity values (%) versus the number of selected components, obtained by the application of ICA (VAF accuracy shown for reference).

distant from the two classes. Denote the two classes by +1 and −1. The linear hyperplane classifier c(x) predicts the class according to

c(x) = \begin{cases} +1 & \text{if } w^T x + w_0 \geq 0 \\ -1 & \text{if } w^T x + w_0 < 0 \end{cases}    (12)

where w is the weight vector, orthogonal to the decision hyperplane, and w_0 is the bias. The optimization task consists of finding the unknown parameters w_i, i = 1, ..., M, defining the decision hyperplane. The margin of a sample x of class y ∈ {−1, 1} is defined as y(w^T x + w_0). The margin of correctly classified samples is positive and that of misclassified samples is negative. The margin of the classifier is defined as the smallest margin over all the training samples (López et al., 2009; Ramírez, Yélamos, Górriz, & Segura, 2006). When no linear separation of the training data is possible, SVMs can work effectively in combination with kernel techniques, so that the hyperplane defining the SVM corresponds to a non-linear decision boundary in the input space (Górriz et al., 2009; Górriz et al., 2008; López et al., 2013). A kernel function is defined as:

K(x_i, x_j) = \varphi(x_i) \cdot \varphi(x_j)    (13)

The use of kernel functions avoids working directly in the high-dimensional feature space, so that the training algorithm only depends on the data through dot products in Euclidean space (Álvarez et al., 2009).
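A minimal sketch of the resulting classifier, assuming scikit-learn's SVC (the kernel of Eq. (13) is selected through the `kernel` argument; a linear kernel is shown):

```python
from sklearn.svm import SVC

def train_svm(features, labels, kernel="linear"):
    """Fit an SVM on the extracted features; labels are +1 (PS) / -1 (control)."""
    clf = SVC(kernel=kernel)           # e.g. "linear", "rbf", "poly"
    return clf.fit(features, labels)
```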

3. Experiments and results

For all the experiments we used part of the database previously described, focusing on axial slices. Thus, from the beginning we have 80 patients with 45 slices each. Of those 45 slices, only the 21 central slices were selected (Fig. 3), because the slices at the extremes have deficiencies and a significant lack of information. Six experiments were performed, considering different preprocessing and post-processing methods:

• Experiment I:
  – Preprocess: no intensity normalization, Gaussian filtering.
  – Process: MEEMD.
  – Post-process: no Gaussian filtering, masking.
• Experiment II:
  – Preprocess: no intensity normalization, Gaussian filtering.
  – Process: MEEMD.
  – Post-process: Gaussian filtering, masking.
• Experiment III:
  – Preprocess: integral intensity normalization, Gaussian filtering.
  – Process: MEEMD.
  – Post-process: no Gaussian filtering, masking.

Table 3. Average accuracy, sensitivity and specificity values (%) obtained by the application of PCA, for each of the 14 feature vectors (IMF1, IMF2, IMF3, Residue, and their two- and three-way combinations) and for each of Experiments I–VI (columns Acc., Sens. and Spec. per experiment).

• Experiment IV:
  – Preprocess: integral intensity normalization, Gaussian filtering.
  – Process: MEEMD.
  – Post-process: Gaussian filtering, masking.
• Experiment V:
  – Preprocess: intensity normalization to the maximum, Gaussian filtering.
  – Process: MEEMD.
  – Post-process: no Gaussian filtering, masking.
• Experiment VI:
  – Preprocess: intensity normalization to the maximum, Gaussian filtering.
  – Process: MEEMD.
  – Post-process: Gaussian filtering, masking.

Once the whole process is finished, the results are vectorized (feature vectors). That is, the images (IMFs and residue) are considered as vectors belonging to a high-dimensional vector space, the "brain image space", by rearranging each masked image into a vector of dimension M. Fourteen feature vectors are obtained for each patient, 4 of them directly from the decomposition and 10 from combinations of 2 or 3 of them:

– IMF1, IMF2, IMF3 and Residue;
– IMF1 + IMF2, IMF1 + IMF3, IMF1 + Residue, IMF2 + IMF3, IMF2 + Residue, IMF3 + Residue;
– IMF1 + IMF2 + IMF3, IMF1 + IMF2 + Residue, IMF1 + IMF3 + Residue, IMF2 + IMF3 + Residue.

According to MEEMD theory, the first component (IMF1) will be noisier than the following ones: IMF2 will be less noisy than IMF1, and IMF3 (Fig. 9) will be the most suitable option for classification. Since the sifting process is stopped after the third IMF instead of being run to completion, the residue may still contain a good deal of information.

To compare results we use the evaluation parameters known as accuracy, sensitivity and specificity. The baseline of this work is the Voxels-as-Features (VAF) approach shown in Table 1, which uses all the voxels of each raw image as a feature vector and feeds it to the SVM classifier. The original feature vector has 39,338 features.

By applying the SVM classifier to the six experiments, we observed that the stronger the normalization of the images, the better the results. Mainly, though, it is shown that the Gaussian filtering performed after the decomposition of the images improves the results considerably: the accuracy increases from 91.25% on IMF3 in Experiment I to 93.75% in Experiment VI, and from 93.75% on the combination IMF2 + IMF3 in Experiment I to 96.25% in Experiment VI.

The second tested system is a feature extraction based on Independent Component Analysis (ICA) followed by SVM classification. Via this method we obtain a very strong feature reduction, from the original feature vector of 39,338 characteristics to a length of between 2 and 10. The results are shown in Table 2. As shown in Table 2 and Figs. 5 and 6, although some of the vector combinations give good results, there is no stability across the various tests carried out; i.e., the results vary in some tests between 80% and 96.25% accuracy (combination IMF1 + IMF2 + IMF3), and between 84% and 92.25% accuracy (IMF3).

The third tested system is a feature extraction based on Principal Component Analysis (PCA) followed by SVM classification. Via this method we also obtain a very strong feature reduction, firstly from the original feature vector of 39,338 characteristics to 79 components, although the best results are obtained with the first 10 components, as shown in the figures below.

As shown in Table 3 and Figs. 7 and 8, the main characteristic of the results of this method is stability. The values obtained tend to


Fig. 7. Experiment VI: accuracy, sensitivity and specificity values (%) of the individual IMFs and the residue versus the number of selected components, obtained by the application of PCA (VAF accuracy shown for reference).

Fig. 8. Experiment VI: combinations of IMFs and residue with the highest accuracy, sensitivity and specificity values (%) versus the number of selected components, obtained by the application of PCA (VAF accuracy shown for reference).

Table 4. Experiment VI: IMF3 average values versus the VAF system results.

        Accuracy (%)   Sensitivity (%)   Specificity (%)
VAF     87.5           90.24             84.62
IMF3    95.00          95.122            94.8718

be constant throughout all the tests performed. We highlight the results of IMF3 in Experiment VI, which is our choice for an optimal classification, with an average accuracy of 95% (constant throughout the whole experiment). These results clearly improve on the VAF system results, as can be seen in Table 4.
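The accuracy, sensitivity and specificity values quoted throughout (e.g. in Table 4) follow the usual confusion-matrix definitions; a minimal sketch, assuming the convention that +1 labels a pathological (PS) subject and −1 a control:

```python
import numpy as np

def evaluate(y_true, y_pred):
    """Accuracy, sensitivity and specificity of a binary prediction."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))     # patients detected
    tn = np.sum((y_true == -1) & (y_pred == -1))   # controls correctly rejected
    fp = np.sum((y_true == -1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == -1))
    accuracy = (tp + tn) / len(y_true)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return accuracy, sensitivity, specificity
```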

Fig. 9. Original image (left) and IMF3 decomposition (right). IMF3 maximizes the ROIs of the original image, that is, the striatal area.


Fig. 10. Projection of the image data coefficients and the decision surfaces designed by the SVM kernel. The left panel shows the classification over the entire set of PCA-extracted features, while the right panel shows the IMF3 PCA classification.

Fig. 11. ROC curves obtained with the system proposed in Experiment VI (MEEMD + PCA, with confidence intervals), compared with the VV2 and PPMI1 systems (entropy, t-test and Wilcoxon feature selection) and a random classifier.

Classification results are obtained with feature extraction and an SVM following a leave-one-out cross-validation (LOOCV) scheme: each subject is successively held out for testing and treated as unknown, the classifier is trained on all the remaining samples, and finally an average accuracy rate is computed over all subjects. The SVM separates the binary-labeled training data by means of a hyperplane (the maximal margin hyperplane) that is maximally distant from the two classes, as can be seen in Fig. 10.

Finally, in Fig. 11 we compare the ROC curves obtained by Experiment VI (with PCA feature reduction), which is the one that yields the best results, with three systems tested in Martínez-Murcia, Górriz, Ramírez, Illán, and The Parkinson's Progression Markers Initiative (2010). All four methods, as can be seen above, behave in a very similar way, although a slight improvement is noted for the MEEMD with PCA method, which obtains an area under the curve (AUC) of 0.9491.
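A sketch of this leave-one-out protocol, assuming scikit-learn and using PCA plus a linear SVM as the feature-extraction/classification pair (any of the extractors above could be substituted). Note that the extractor is refitted inside the loop so that the held-out subject never influences training.

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut
from sklearn.decomposition import PCA
from sklearn.svm import SVC

def loocv_accuracy(X, y, n_components=10):
    """Leave-one-out accuracy of a PCA + linear-SVM pipeline on X (subjects x voxels)."""
    X, y = np.asarray(X), np.asarray(y)
    correct = 0
    for train_idx, test_idx in LeaveOneOut().split(X):
        pca = PCA(n_components=n_components).fit(X[train_idx])
        clf = SVC(kernel="linear").fit(pca.transform(X[train_idx]), y[train_idx])
        correct += int(clf.predict(pca.transform(X[test_idx]))[0] == y[test_idx][0])
    return correct / len(y)
```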

4. Conclusion

In this paper, a new approach to Computer Aided Diagnosis (CAD) of Parkinsonian Syndrome (PS) is proposed. After reviewing and analysing the results of all the available options, the best performance is obtained by applying an intensity normalization to the maximum and a Gaussian filter before MEEMD; after the decomposition, the IMFs and the residue are filtered again with a Gaussian

filter, and then feature selection and extraction are performed prior to classification, which corresponds to Experiment VI. As we have shown, the combination of Experiment VI (explained above) with a feature extraction based on Principal Component Analysis (PCA) provides very good results. The average accuracy remains stable and very high in some cases, as in our final choice, IMF3, with 95% accuracy, 95.122% sensitivity and 94.8718% specificity. This indicates that the system is robust to slight changes in the images, and that it will be more precise and stable without compromising its reliability.

It is remarkable that the results obtained with Gaussian filtering after the decomposition of the images are much more reliable than those obtained without it. This is quite significant, as we originally assumed that the images were reasonably clean of noise after the decomposition, but this fact shows that this was not entirely true: by filtering them we obtain a much more robust classification system.

Among all the tested combinations, it is noteworthy that IMF3 is the one that provides the best accuracy, sensitivity and specificity results, as mentioned earlier, mainly because of its stability throughout all the different tests. Feature selection and extraction from IMF3, isolated from the other IMFs, leaves as the only area to analyse the region of interest (ROI) observed in the original image, in this case maximized, as can be seen in Fig. 12. It is therefore easier to make a correct classification of the analysed images.

We should also point out the results obtained from the residue. Despite not reaching a very high accuracy (87.36% accuracy, 89.97% sensitivity and 84.76% specificity), it remains at the same level as


Fig. 12. Isolated IMF3 eigenimages obtained from PCA feature extraction.

those obtained by the VAF approach. This suggests that it still contains a good deal of information; thus perhaps more IMFs could have been obtained from the decomposition for subsequent combination and analysis. In combination with IMF3 it might provide an even better result.

The system vastly improves on the VAF-based systems, which are considered to reproduce the actual procedure performed by the physicians on the database. Additionally, the proposed system demonstrates its robustness in PS pattern detection, as it provides high accuracy, sensitivity and specificity over a wide range of operation.

Acknowledgments

This work was partly supported by the MICINN under the TEC2008-02113 project and by the Consejería de Innovación, Ciencia y Empresa (Junta de Andalucía, Spain) under the Excellence Projects P07-TIC-02566, P09-TIC-4530, P11-TIC-7103 and TEC2012-34306. We are grateful to M. Gómez-Río and coworkers from the Virgen de las Nieves hospital in Granada (Spain) for providing and labeling the SPECT images used in this work.

References

Álvarez, I., Górriz, J. M., Ramírez, J., Salas-González, D., López, M., Puntonet, C. G., Segovia, F., & Prieto, B. (2009). Alzheimer's diagnosis using eigenbrains and support vector machines. In IWANN 2009 proceedings, Lecture Notes in Computer Science. Salamanca, Spain.
Booij, J., Tissingh, G., Boer, G. J., Speelman, J. D., Stoof, J. C., Janssen, A. G., et al. (1997). [123I]FP-CIT SPECT shows a pronounced decline of striatal dopamine transporter labelling in early and advanced Parkinson's disease. Journal of Neurology, Neurosurgery & Psychiatry, 62, 133–140.
Chaves, R., Ramírez, J., Górriz, J. M., Illán, I. Á., Segovia, F., & Olivares, A. (2011). Effective diagnosis of Alzheimer's disease by means of distance metric learning and random forest. In IWINAC (Vol. 2, pp. 59–67).
Chaves, R., Ramírez, J., Górriz, J. M., Salas-Gonzalez, D., & López, M. (2011). Distance metric learning as feature reduction technique for the Alzheimer's disease diagnosis. In IWINAC (Vol. 2, pp. 68–76).
Chaves, R., Ramírez, J., Górriz, J. M., & Illán, I. Á. (2012). Functional brain image classification using association rules defined over discriminant regions. Pattern Recognition Letters, 33(12), 1666–1672.
Chaves, R., Ramírez, J., Górriz, J. M., & Puntonet, C. G. (2012). Association rule-based feature selection method for Alzheimer's disease diagnosis. Expert Systems with Applications, 39(14), 11766–11774.
Deng, G., & Cahill, L. W. (2003). IEEE nuclear science symposium and medical imaging conference record, 1993 (Vol. 3, pp. 1615–1619).


Friston, K., Ashburner, J., Kiebel, S., Nichols, T., & Penny, W. (2007). Statistical parametric mapping: The analysis of functional brain images. Academic Press.
Gallix, A., Górriz, J. M., Ramírez, J., Illán, I. Á., & Lang, E. W. (2012). On the empirical mode decomposition applied to the analysis of brain SPECT images. Expert Systems with Applications, 39(18), 13451–13461.
Górriz, J. M., Ramírez, J., Lassl, A., Álvarez, I., Segovia, F., Salas, D., & López, M. (2009). Classification of SPECT images using clustering techniques revisited. In IWINAC 2009 proceedings, Lecture Notes in Computer Science (pp. 168–178). Santiago de Compostela, Spain.
Górriz, J. M., Ramírez, J., Lassl, A., Salas-González, D., Lang, E. W., Puntonet, C. G., Álvarez, I., López, M., & Gómez-Río, M. (2008). Automatic computer aided diagnosis tool using component-based SVM. In IEEE nuclear science symposium conference record, medical imaging conference (pp. 4392–4395). Dresden, Germany: IEEE-NSS.
Górriz, J., Segovia, F., Ramírez, J., Lassl, A., & Salas-González, D. (2011). GMM based SPECT image classification for the diagnosis of Alzheimer's disease. Applied Soft Computing, 11(2), 2313–2325.
Huang, N. E. (2009). Computer implemented empirical mode decomposition apparatus, method and article of manufacture for two-dimensional signals. US Patent 6,311,130 B1, granted Oct. 30, 2001.
Huang, N., Shen, Z., Long, S., Wu, M., Shih, H., Zheng, Q., et al. (1998). The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proceedings of the Royal Society of London A, 454, 903–995.
Hyvärinen, A. (1999a). Fast and robust fixed-point algorithms for independent component analysis. IEEE Transactions on Neural Networks, 10(3), 626–634.
Hyvärinen, A. (1999b). Survey on independent component analysis. Neural Computing Surveys, 2, 94–128.
Hyvärinen, A., & Oja, E. (1997). A fast fixed-point algorithm for independent component analysis. Neural Computation, 9, 1483–1492.
Hyvärinen, A., & Oja, E. (2000). Independent component analysis: Algorithms and applications. Neural Networks, 13(4–5), 411–430. http://dx.doi.org/10.1016/S0893-6080(00)00026-5.
Illán, I. A., Górriz, J. M., Ramírez, J., Salas-González, D., López, M., Segovia, F., et al. (2010). Projecting independent components of SPECT images for computer aided diagnosis of Alzheimer's disease. Pattern Recognition Letters, 31(11), 1342–1347. http://dx.doi.org/10.1016/j.patrec.2010.03.004.
Illán, I., Górriz, J., Ramírez, J., Salas-González, D., López, M., Segovia, F., et al. (2011). 18F-FDG PET imaging analysis for computer aided Alzheimer's diagnosis. Information Sciences, 181(4), 903–916.
Illán, I., Górriz, J. M., Ramírez, J., Segovia, F., Jiménez-Hoyuela, J. M., & Ortega, S. J. (2012). Automatic assistance to Parkinson's disease diagnosis in DaTSCAN SPECT imaging. Medical Physics, 39, 5971.
Jennings, D. L., Seibyl, J. P., Oakes, D., Eberly, S., Murphy, J., & Marek, K. (2004). (123I)beta-CIT and single-photon emission computed tomographic imaging vs clinical evaluation in Parkinsonian syndrome: Unmasking an early diagnosis. Archives of Neurology, 61(8), 1224–1229.
Kas, A., Payoux, P., Habert, M., Malek, Z., Cointepas, Y., El Fakhri, G., et al. (2007). Validation of a standardized normalization template for statistical parametric mapping analysis of 123I-FP-CIT images. Journal of Nuclear Medicine, 48(9), 1459–1467.
Linderhed, A. (2009). Image empirical mode decomposition: A new tool for image processing. Advances in Adaptive Data Analysis, 1, 265–294.
López, M. M., Górriz, J. M., Ramírez, J., Gómez-Río, M., Verdejo, J., & Vas, J. (2013). Component-based technique for determining the effects of acupuncture for fighting migraine using SPECT images. Expert Systems with Applications, 40(1), 44–51.
López, M., Ramírez, J., Górriz, J., Álvarez, I., Salas-Gonzalez, D., Segovia, F., et al. (2009). SVM-based CAD system for early detection of the Alzheimer's disease using kernel PCA and LDA. Neuroscience, 464, 233–238.
López, M., Ramírez, J., Górriz, J. M., Álvarez, I., Salas-Gonzalez, D., Segovia, F., et al. (2011). Principal component analysis-based techniques and supervised classification schemes for the early detection of Alzheimer's disease. Neurocomputing, 74(8), 1260–1271.
Lozano, S. J. O., del Valle Torres, M. D. M., García, J. M. J.-H., Cardo, A. L. G., & Arillo, V. C. (2007). Diagnostic accuracy of FP-CIT SPECT in patients with Parkinsonism. Revista Española de Medicina Nuclear, 26, 277–285 (English edition).
Maćkiewicz, A., & Ratajczak, W. (1993). Principal components analysis (PCA). Computers & Geosciences, 19(3), 303–342.
Martínez-Murcia, F. J., Górriz, J. M., Ramírez, J., Illán, I. A., & The Parkinson's Progression Markers Initiative (2010). Automatic detection of Parkinsonism using significance measures and component analysis in DaTSCAN imaging. Neurocomputing.
Martínez-Murcia, F. J., Górriz, J. M., Ramírez, J., Puntonet, C. G., & Salas-Gonzalez, D. (2012). Computer aided diagnosis tool for Alzheimer's disease based on Mann–Whitney–Wilcoxon U-test. Expert Systems with Applications, 39(10), 9676–9685.
Martínez, F. J., Salas-Gonzalez, D., Górriz, J. M., Ramírez, J., Puntonet, C. G., & Gómez-Río, M. (2010). Analysis of SPECT brain images for the diagnosis of Alzheimer's disease using factor analysis and quadratic discriminant functions. Lecture Notes in Computer Science, 66–87.
Messa, C., Volonté, M. A., Fazio, F., Zito, F., Carpinelli, A., d'Amico, A., et al. (1998). Differential distribution of striatal [123I]beta-CIT in Parkinson's disease and progressive supranuclear palsy, evaluated with single-photon emission tomography. European Journal of Nuclear Medicine, 25(9), 1270–1276.
Moore, B. C. (1981). Principal component analysis in linear systems: Controllability, observability, and model reduction. IEEE Transactions on Automatic Control, AC-26(1).


Ortega Lozano, S., Martínez del Valle Torres, M., Ramos Moreno, E., Sanz Viedma, S., Amrani Raissouni, T., & Jiménez-Hoyuela, J. (2010). Quantitative evaluation of SPECT with FP-CIT: importance of the reference area. Revista Española de Medicina Nuclear, 29(5), 246–250 (English edition).
Ortiz, A., Górriz, J. M., Ramírez, J., & Salas-Gonzalez, D. (2011). MRI brain image segmentation with supervised SOM and probability-based clustering method. In IWINAC (Vol. 2, pp. 49–58).
Pirker, W., Asenbaum, S., Bencsits, G., Prayer, D., Gerschlager, W., Deecke, L., et al. (2000). [123I]beta-CIT SPECT in multiple system atrophy, progressive supranuclear palsy, and corticobasal degeneration. Movement Disorders: Official Journal of the Movement Disorder Society, 15(6), 1158–1167.
Ramírez, J., Chaves, R., Górriz, J. M., López, M., Salas-González, D., Álvarez, I., & Segovia, F. (2009). SPECT image classification techniques for computer aided diagnosis of the Alzheimer disease. In IWANN 2009 proceedings, Lecture Notes in Computer Science. Salamanca, Spain.
Ramírez, J., Górriz, J. M., Gómez-Río, M., Romero, A., Chaves, R., Lassl, A., et al. (2008). Effective emission tomography image reconstruction algorithms for SPECT data. In ICCS 2008, Part I. LNCS (Vol. 5101, pp. 741–748). Berlin, Heidelberg: Springer-Verlag.
Ramírez, J., Yélamos, P., Górriz, J. M., & Segura, J. C. (2006). SVM-based speech endpoint detection using contextual speech features. Electronics Letters, 42(7), 877–879.
Salas-Gonzalez, D., Górriz, J. M., Ramírez, J., López, M., Álvarez, I., Segovia, F., et al. (2010). Computer-aided diagnosis of Alzheimer's disease using support vector machines and classification trees. Physics in Medicine and Biology, 55(10), 2807–2817.
Scherfler, C., & Nocker, M. (2009). Dopamine transporter SPECT: How to remove subjectivity? Movement Disorders: Official Journal of the Movement Disorder Society, 24(Suppl. 2), S721–724.
Segovia, F., Górriz, J., Ramírez, J., Salas-González, D., & Álvarez, I. (2012). Early diagnosis of Alzheimer's disease based on partial least squares and support vector machine. Expert Systems with Applications.
Segovia, F., Górriz, J. M., Ramírez, J., Álvarez, I., Jiménez-Hoyuela, J. M., & Ortega, S. J. (2012). Improved Parkinsonism diagnosis using a partial least squares based approach. Medical Physics, 39.
Segovia, F., Górriz, J. M., Ramírez, J., Salas-Gonzalez, D., Álvarez, I., López, M., et al. (2012). A comparative study of feature extraction methods for the diagnosis of Alzheimer's disease using the ADNI database. Neurocomputing, 75(1), 64–71.
Seppi, K., Scherfler, C., Donnemiller, E., Virgolini, I., Schocke, M. F. H., Goebel, G., et al. (2006). Topography of dopamine transporter availability in progressive supranuclear palsy: A voxelwise [123I]beta-CIT SPECT analysis. Archives of Neurology, 63(8), 1154–1160.
Vapnik, V. N. (1998). Statistical learning theory. New York: John Wiley and Sons, Inc.
Winogrodzka, A., Bergmans, P., Booij, J., van Royen, E. A., Stoof, J. C., & Wolters, E. C. (2003). [123I]beta-CIT SPECT is a useful method for monitoring dopaminergic degeneration in early stage Parkinson's disease. Journal of Neurology, Neurosurgery & Psychiatry, 74, 294–298.
Woods, R. P. (2000). Spatial transformation models. In I. N. Bankman (Ed.), Handbook of medical imaging (pp. 465–490). San Diego: Academic Press.
Wu, Z., & Huang, N. (2009). Ensemble empirical mode decomposition: A noise-assisted data analysis method. Advances in Adaptive Data Analysis, 1(1), 1–41.
Wu, Z., Huang, N. E., & Chen, X. (2009). The multi-dimensional ensemble empirical mode decomposition method. Advances in Adaptive Data Analysis, 1(3), 339–372. World Scientific.
Zeiler, A., Faltermeier, R., Böhm, M., Keck, I. R., Tomé, A. M., & Puntonet, C. G. (2010). Empirical mode decomposition techniques for biomedical time series analysis (Vol. 4, pp. 60–81). Bentham Science Publishers.
Zeiler, A., Faltermeier, R., Brawanski, A., Tomé, A. M., Puntonet, C. G., Górriz, J. M., et al. (2011). Brain status data analysis by sliding EMD. IWINAC, 2, 77–86.
Zubal, I. G., Early, M., Yuan, O., Jennings, D., Marek, K., Seibyl, J. P., et al. (2007). Optimized, automated striatal uptake analysis applied to SPECT brain scans of Parkinson's disease patients. Journal of Nuclear Medicine, 48(6), 857–864.

Segovia, F., Górriz, J. M., Ramírez, J., Álvarez, I., Jiménez-Hoyuela, J. M., & Ortega, S. J. (2012). Improved Parkinsonism diagnosis using a partial least squares based approach. Medical Physics, 39. Segovia, F., Górriz, J. M., Ramírez, J., Salas-Gonzalez, D., Álvarez, I., López, M., et al. (2012). A comparative study of feature extraction methods for the diagnosis of Alzheimer’s disease using the ADNI database. Neurocomputing, 75(1), 64–71. Seppi, K., Scherfler, C., Donnemiller, E., Virgolini, I., Schocke, M. F. H., Goebel, G., et al. (2006). Topography of dopamine transporter availability in progressive supranuclear palsy: A voxelwise [123I]beta-CIT SPECT analysis. Archives of Neurology, 63(8), 1154–1160. Vapnik, V. N. (1998). Statistical learning theory. New York: John Wiley and Sons, Inc. Winogrodzka, A., Bergmans, P., Booij, J., van Royen, E. A., Stoof, J. C., & Wolters, E. C. (2003). [123I]beta-CIT SPECT is a useful method for monitoring dopaminergic degeneration in early stage Parkinson’s disease. Journal of Neurology, Neurosurgery & Psychiatry, 74, 294–298. Woods, R. P. (2000). Spatial transformation models. In I. N. Bankman (Ed.), Handbook of medical imaging (pp. 465–490). San Diego: Academic Press. WU, Z., & Huang, N. (2009). Ensemble empirical mode decomposition: A noiseassisted data analysis method. Advances in Adaptive Data Analysis, 1(1), 1–41. Wu, Z., Huang, N. E., & Chen, X. (2009). The multi-dimensional ensemble empirical mode decomposition method. World Scientific. Advances in Adaptive Data Analysis, 1(3), 339–372. Zeiler, A., Faltermeier, R., Böhm, M., Keck, I. R., Tom, A. M., & Puntonet, C. G. (2010). Empirical mode decomposition techniques for biomedical time series analysis (vol. 4). Bentham Science Publishers, pp. 60–81. Zeiler, A., Faltermeier, R., Brawanski, A., Tom, A. M., Puntonet, C. G., Górriz, J. M., et al. (2011). Brain status data analysis by sliding EMD. IWINAC, 2, 77–86. Zubal, I. G., Early, M., Yuan, O., Jennings, D., Marek, K., Seibyl, J. P., et al. (2007). Optimized, automated striatal uptake analysis applied to SPECT brain scans of Parkinson’s disease patients. Journal of Nuclear Medicine, 48(6), 857–864.