An efficient machine learning approach for diagnosis of paraquat-poisoned patients




Computers in Biology and Medicine 59 (2015) 116–124



Lufeng Hu a,1, Guangliang Hong b,1, Jianshe Ma c, Xianqin Wang c, Huiling Chen d,*

a The First Affiliated Hospital of Wenzhou Medical University, Wenzhou 325000, China
b Department of Emergency, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou 325000, China
c Function Experiment Teaching Center, Wenzhou Medical University, Wenzhou 325035, China
d College of Physics and Electronic Information Engineering, Wenzhou University, Wenzhou 325035, China


Abstract

Article history: Received 9 September 2014; received in revised form 2 February 2015; accepted 8 February 2015.

Numerous people die of paraquat (PQ) poisoning because they are not diagnosed and treated promptly at an early stage. To date, determination of PQ levels in blood or urine is still the only way to confirm PQ poisoning. In order to develop a new diagnostic method, the potential of machine learning techniques was explored in this study. A recently developed classification technique, the extreme learning machine (ELM), was used to discriminate PQ-poisoned patients from healthy controls. Fifteen PQ-poisoned patients recruited from The First Affiliated Hospital of Wenzhou Medical University, all with a history of direct contact with PQ, and 16 healthy volunteers were involved in the study. The ELM method was evaluated on the blood-sample metabolites determined by gas chromatography coupled with mass spectrometry, in terms of classification accuracy, sensitivity, specificity and AUC (area under the receiver operating characteristic (ROC) curve). Additionally, feature selection was investigated to further boost the performance of ELM, and the most influential feature was identified. The experimental results demonstrate that the proposed approach can be regarded as a success, with excellent classification accuracy, AUC, sensitivity and specificity of 91.64%, 0.9156, 91.33% and 91.78%, respectively. Promisingly, the proposed method might serve as a new candidate among powerful tools for the diagnosis of PQ-poisoned patients.

Keywords: Paraquat; Poison; Extreme learning machine; Medical diagnosis

1. Introduction

Paraquat (1,1′-dimethyl-4,4′-bipyridinium dichloride, PQ), one of the most widely used herbicides in the world, is regarded as one of the most highly toxic pesticides for humans [1]. Its mortality rate is highly correlated with the plasma PQ concentration [2]. Acute ingestion of 7–8 mL of PQ can cause serious symptoms such as liver, lung, kidney, and heart failure, which lead directly to death without prompt treatment [3]. Although cases of PQ poisoning are rare in developed countries, thousands of people die of PQ poisoning every year in developing countries [4]. For example, PQ accounts for most fatal poisonings in Korea, with 500 or more deaths per year [5]. PQ intoxication is associated with reactive oxygen species and free radicals that cause early multiorgan failure and late pulmonary fibrosis with respiratory failure [5,6]. The current treatment strategies for PQ poisoning are increasing the elimination of PQ from the body, administration of antioxidants and maintenance of vital functions, which differ entirely from the interventions for other types of intoxication [1,7].


* Corresponding author. E-mail address: [email protected] (H. Chen).
1 These authors contributed equally to this work.
http://dx.doi.org/10.1016/j.compbiomed.2015.02.003

The earlier treatment is initiated, the more effective it is in reducing mortality, particularly hemoperfusion (HP) within 2–4 h after intoxication [4]. Therefore, early diagnosis is very important in the treatment of PQ-poisoned patients. To date, the diagnosis of PQ poisoning has relied mainly on the PQ concentration in blood. However, PQ is absorbed poorly from the stomach and small intestine (<5%) and is distributed into all organs of the body within 5 h, which means it is difficult to detect PQ in blood more than 5 h after poisoning [1,8]. Moreover, in some cases the patients cannot provide a clear history of contact with PQ, for example because of disturbances of consciousness or language, which poses an even more serious problem for diagnosis. How to develop a new diagnostic method for PQ poisoning is therefore becoming an important topic in medicine.

In this study, blood samples from patients with acute PQ intoxication were analyzed by gas chromatography coupled with mass spectrometry (GC–MS). Based on their plasma metabolomics, a rapid diagnostic method was developed using the extreme learning machine (ELM) technique [9], a learning algorithm for single-hidden-layer feedforward neural networks (SLFNs). Unlike the common parameter tuning strategy of neural networks, ELM chooses the input weights and hidden biases randomly, and the output weights are analytically determined using the Moore–Penrose (MP) generalized inverse. It not only learns much faster with higher generalization performance, but also leaves very few parameters to tune.


Thanks to these good properties, ELM has found applications in a wide range of fields such as cancer diagnosis [10], image quality assessment [11], face recognition [12], land cover classification [13] and hyperspectral image classification [14]. To the best of our knowledge, no previous work has dealt with the problem of PQ poisoning from a machine learning perspective. Therefore, an attempt was made in this study to explore the potential of ELM in discriminating PQ-poisoned patients from healthy controls. For comparison, the support vector machine (SVM) [15] was also applied to the diagnosis of PQ poisoning.

In addition, the effectiveness of feature selection was investigated. An efficient and commonly used feature selection method, maximum relevance minimum redundancy (mRMR) [16], was employed for pre-processing before the classification models were constructed. mRMR is a filter-type feature selection method that seeks to choose features that are relevant to the target class (maximum relevance) while keeping the selected feature subset as non-redundant as possible (minimum redundancy); a greedy sketch of this criterion is given at the end of this section.

The effectiveness of the proposed approach was examined in terms of classification accuracy, AUC, sensitivity and specificity on the diagnosis of PQ-poisoned cases whose samples were collected from The First Affiliated Hospital of Wenzhou Medical University. Promisingly, the developed ELM-based approach achieved high diagnostic accuracy, AUC, sensitivity and specificity of 91.64%, 0.9156, 91.33% and 91.78%, respectively.

The remainder of this paper is organized as follows. Section 2 offers brief background knowledge on ELM. The proposed ELM-based diagnosis model is described in Section 3. Section 4 presents the detailed experimental design. The experimental results and discussion are presented in Section 5. Finally, conclusions and recommendations for future work are summarized in Section 6.
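As promised above, the following Python sketch illustrates the greedy mRMR idea in its mutual-information-difference (MID) form. It relies on scikit-learn's nearest-neighbor mutual-information estimators rather than the discretized implementation released with [16], so it should be read as an approximation of that tool, not as the code used in our experiments.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif, mutual_info_regression

def mrmr_rank(X, y, n_select):
    """Greedy mRMR (MID form): maximize relevance I(f; y) minus mean redundancy with selected features."""
    relevance = mutual_info_classif(X, y, random_state=0)   # relevance of each feature to the class label
    selected = [int(np.argmax(relevance))]                  # start from the single most relevant feature
    while len(selected) < n_select:
        remaining = [f for f in range(X.shape[1]) if f not in selected]
        redundancy = np.array([
            np.mean([mutual_info_regression(X[:, [f]], X[:, s], random_state=0)[0] for s in selected])
            for f in remaining
        ])
        scores = relevance[remaining] - redundancy           # MID criterion: relevance minus redundancy
        selected.append(remaining[int(np.argmax(scores))])
    return selected

# Toy usage: rank 10 features of a random 31 x 119 matrix standing in for the metabolite data
X_toy = np.random.randn(31, 119)
y_toy = np.random.randint(0, 2, size=31)
print(mrmr_rank(X_toy, y_toy, n_select=10))
```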

2. Extreme learning machine (ELM)

A brief description of ELM is given in this section; for more details, one can refer to [9,17]. Given a training set $\aleph = \{(\mathbf{x}_i, \mathbf{t}_i) \mid \mathbf{x}_i \in \mathbb{R}^n, \mathbf{t}_i \in \mathbb{R}^m, i = 1, 2, \ldots, N\}$, where $\mathbf{x}_i$ is an $n \times 1$ input feature vector and $\mathbf{t}_i$ is an $m \times 1$ target vector, a standard SLFN with activation function $g(x)$ and $\tilde{N}$ hidden neurons can be mathematically modeled as follows [9]:

$$\sum_{i=1}^{\tilde{N}} \boldsymbol{\beta}_i\, g(\mathbf{w}_i \cdot \mathbf{x}_j + b_i) = \mathbf{o}_j, \quad j = 1, 2, \ldots, N \tag{1}$$

where $\mathbf{w}_i$ is the weight vector between the $i$th hidden neuron and the input layer, $b_i$ is the bias of the $i$th hidden neuron, $\boldsymbol{\beta}_i$ is the weight vector between the $i$th hidden neuron and the output layer, and $\mathbf{o}_j$ is the output vector for the $j$th input sample. Here, $\mathbf{w}_i \cdot \mathbf{x}_j$ denotes the inner product of $\mathbf{w}_i$ and $\mathbf{x}_j$. If the SLFN can approximate these $N$ samples with zero error, i.e. $\sum_{j=1}^{N} \lVert \mathbf{o}_j - \mathbf{t}_j \rVert = 0$, then there exist $\boldsymbol{\beta}_i$, $\mathbf{w}_i$ and $b_i$ such that

$$\sum_{i=1}^{\tilde{N}} \boldsymbol{\beta}_i\, g(\mathbf{w}_i \cdot \mathbf{x}_j + b_i) = \mathbf{t}_j, \quad j = 1, 2, \ldots, N.$$

The above equations can be written compactly as

$$\mathbf{H}\boldsymbol{\beta} = \mathbf{T} \tag{2}$$

where

$$\mathbf{H}(\mathbf{w}_1, \ldots, \mathbf{w}_{\tilde{N}}, b_1, \ldots, b_{\tilde{N}}, \mathbf{x}_1, \ldots, \mathbf{x}_N) =
\begin{pmatrix}
g(\mathbf{w}_1 \cdot \mathbf{x}_1 + b_1) & \cdots & g(\mathbf{w}_{\tilde{N}} \cdot \mathbf{x}_1 + b_{\tilde{N}}) \\
\vdots & \ddots & \vdots \\
g(\mathbf{w}_1 \cdot \mathbf{x}_N + b_1) & \cdots & g(\mathbf{w}_{\tilde{N}} \cdot \mathbf{x}_N + b_{\tilde{N}})
\end{pmatrix}_{N \times \tilde{N}} \tag{3}$$

$$\boldsymbol{\beta} = \begin{pmatrix} \boldsymbol{\beta}_1^{T} \\ \vdots \\ \boldsymbol{\beta}_{\tilde{N}}^{T} \end{pmatrix}_{\tilde{N} \times m}
\quad \text{and} \quad
\mathbf{T} = \begin{pmatrix} \mathbf{t}_1^{T} \\ \vdots \\ \mathbf{t}_N^{T} \end{pmatrix}_{N \times m} \tag{4}$$

As named by Huang et al. [18], $\mathbf{H}$ is called the hidden layer output matrix of the neural network, with the $i$th column of $\mathbf{H}$ being the output of the $i$th hidden neuron with respect to the inputs $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_N$. Huang et al. [19,20] have shown that the input weights and the hidden layer biases of SLFNs need not be adjusted at all and can be assigned arbitrarily. Under this assumption, the output weights can be analytically determined by finding the least-squares solution $\hat{\boldsymbol{\beta}}$ of the linear system $\mathbf{H}\boldsymbol{\beta} = \mathbf{T}$:

$$\lVert \mathbf{H}(\mathbf{w}_1, \ldots, \mathbf{w}_{\tilde{N}}, b_1, \ldots, b_{\tilde{N}})\hat{\boldsymbol{\beta}} - \mathbf{T} \rVert = \min_{\boldsymbol{\beta}} \lVert \mathbf{H}(\mathbf{w}_1, \ldots, \mathbf{w}_{\tilde{N}}, b_1, \ldots, b_{\tilde{N}})\boldsymbol{\beta} - \mathbf{T} \rVert \tag{5}$$

Eq. (5) can be solved by a linear method such as the Moore–Penrose (MP) generalized inverse of $\mathbf{H}$, as shown in Eq. (6):

$$\hat{\boldsymbol{\beta}} = \mathbf{H}^{\dagger}\mathbf{T} \tag{6}$$

where $\mathbf{H}^{\dagger}$ is the MP generalized inverse of the matrix $\mathbf{H}$. The use of the MP generalized inverse leads to the minimum norm least-squares (LS) solution, which is unique and has the smallest norm among all LS solutions. As analyzed by Huang et al. [17], by using the MP inverse method, ELM tends to obtain good generalization performance at a dramatically increased learning speed.
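To make Eqs. (1)–(6) concrete, the following Python/NumPy sketch trains a basic ELM by assigning the input weights and biases at random and solving for the output weights with the MP pseudoinverse. It is an illustrative re-implementation under assumed function names, not the reference code by Huang that is used in the experiments below.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def elm_train(X, T, n_hidden, activation=sigmoid, seed=0):
    """Train a basic ELM: random input weights/biases, output weights via the MP pseudoinverse (Eq. (6))."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(-1.0, 1.0, size=(n_hidden, X.shape[1]))   # random input weights w_i
    b = rng.uniform(-1.0, 1.0, size=n_hidden)                 # random hidden biases b_i
    H = activation(X @ W.T + b)                               # hidden layer output matrix H (Eq. (3))
    beta = np.linalg.pinv(H) @ T                              # beta_hat = H^dagger T (Eq. (6))
    return W, b, beta

def elm_predict(X, W, b, beta, activation=sigmoid):
    """Raw network outputs; threshold at 0.5 for the binary PQ-poisoned / healthy decision."""
    return activation(X @ W.T + b) @ beta

# Toy usage with random data standing in for the 31 x 119 metabolite matrix
X_toy = np.random.randn(31, 119)
y_toy = np.random.randint(0, 2, size=31).astype(float)
W, b, beta = elm_train(X_toy, y_toy.reshape(-1, 1), n_hidden=1000)
labels = (elm_predict(X_toy, W, b, beta) >= 0.5).astype(int).ravel()
```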

Table 1. The main plasma metabolites detected by GC–MS in PQ-poisoned patients and healthy volunteers.

No.  RT (min)  Metabolite                    MD (%)
1    6.044     Butanoic acid                 66
2    7.007     L-Alanine                     74
3    7.449     Piperazine                    50
4    9.356     L-Valine                      74
5    10.375    L-Leucine                     78
6    10.461    Glycerol                      80
7    10.564    Phosphate                     80
8    12.22     L-Threonine                   90
9    13.98     L-Proline                     83
10   15.206    Glutamine                     90
11   15.278    L-Phenylalanine               78
12   16.036    Xylitol                       64
13   16.515    Arabitol                      64
14   17.262    9H-Purine                     72
15   18.679    L-Tyrosine                    50
16   18.787    Glucitol                      50
17   19.01     Inositol                      64
18   19.205    D-Altrose                     50
19   20.314    Uric acid                     90
20   21.11     9,12-Octadecadienoic acid     90

RT (retention time) is the time it takes an analyte to pass through the gas chromatograph column. MD (match quality) represents the degree to which detected peaks match reference peaks.

3. Proposed ELM model

In this section, we briefly describe the proposed ELM method for distinguishing PQ-poisoned patients from healthy controls. As depicted in Fig. 1, the diagnosis model is created by ELM. The input data consist of the metabolite measurements from the blood samples. The optimal number of hidden nodes in the hidden layer of ELM is determined by ten repetitions of 3-fold cross-validation (CV). When the trained model is used for prediction, the output of ELM has two states: '1' denotes a PQ-poisoned patient, while '0' denotes a healthy person.

The flowchart of the proposed method is shown in Fig. 2. In the proposed model, ELM is evaluated with varying numbers of hidden neurons and different activation functions to perform the classification task. Finally, the optimal number of hidden neurons and the best type of activation function are obtained based on the performance analysis.

Fig. 1. The diagnosis model of PQ poisoning based on ELM.

Fig. 2. The flowchart of the proposed diagnosis model.
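As a rough illustration of the selection procedure in Figs. 1 and 2, the sketch below scores each candidate configuration (activation function and hidden-layer size) by ten repetitions of stratified 3-fold CV, reusing the hypothetical elm_train/elm_predict helpers from the earlier sketch. The candidate grids shown here are assumptions for illustration rather than the exact settings of the original experiments.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
# elm_train / elm_predict as defined in the sketch after Section 2

# Candidate settings; the hidden-layer sizes mirror Table 2, the activation list is abbreviated here.
ACTIVATIONS = {"sig": lambda z: 1.0 / (1.0 + np.exp(-z)), "sin": np.sin}
HIDDEN_SIZES = [10, 50, 100, 250, 500, 750, 1000, 1250]

def cv_accuracy(X, y, activation, n_hidden, n_repeats=10, n_splits=3, seed=0):
    """Mean accuracy of one ELM configuration over repeated stratified 3-fold CV."""
    accs = []
    for rep in range(n_repeats):
        skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed + rep)
        for tr, te in skf.split(X, y):
            W, b, beta = elm_train(X[tr], y[tr].reshape(-1, 1), n_hidden, activation)
            pred = (elm_predict(X[te], W, b, beta, activation) >= 0.5).ravel()
            accs.append(np.mean(pred == y[te]))
    return float(np.mean(accs))

def select_configuration(X, y):
    """Return the (activation name, hidden size) pair with the highest CV accuracy."""
    scores = {(name, h): cv_accuracy(X, y, fn, h)
              for name, fn in ACTIVATIONS.items() for h in HIDDEN_SIZES}
    return max(scores, key=scores.get)
```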

4. Experimental design

4.1. Data description

This study was approved by the Medical Ethics Committee of The First Affiliated Hospital of Wenzhou Medical University and conducted in accordance with the Declaration of Helsinki. All individual information of the PQ-poisoned patients was securely protected and only available to the investigators. All data were analyzed anonymously.

Fifteen patients (aged 18–64 years, 9 male/6 female) with a history of direct contact with PQ were involved in this study. The time since PQ poisoning ranged from 5 h to 12 h, and the plasma PQ concentration ranged from 0.1 to 1.5 mg/mL. None of these 15 patients had received any drug treatment, hemoperfusion (HP) or hemodialysis (HD) treatment.

Fig. 3. The results of ELM with different activation functions as the number of hidden neurons increases.


A total of 16 healthy volunteers with no history of direct contact with PQ were chosen as controls. Blood samples from all 31 subjects were collected and analyzed by GC–MS in the same way.

Table 2. Average classification performance over 10 runs of 3-fold CV for different numbers of hidden neurons.

Hidden neurons   ACC      AUC      Sensitivity   Specificity
10               0.6427   0.6461   0.7067        0.5856
50               0.7155   0.7150   0.7600        0.6700
100              0.7848   0.7872   0.8467        0.7278
250              0.8267   0.8283   0.8333        0.8233
500              0.8785   0.8800   0.8867        0.8733
750              0.8979   0.8967   0.9067        0.8867
1000             0.9000   0.9006   0.8867        0.9144
1250             0.8603   0.8656   0.9267        0.8044


GC–MS is a mature method in metabolomics because many peaks can be reliably obtained from blood and identified according to retention time (RT) and mass spectral data (more details have been published in our previous work [21]). Here, 119 peaks were detected in the blood samples, and 20 of them were identified: butanoic acid, L-alanine, piperazine, L-valine, L-leucine, glycerol, phosphate, L-threonine, L-proline, glutamine, L-phenylalanine, xylitol, arabitol, 9H-purine, L-tyrosine, glucitol, inositol, D-altrose, uric acid and 9,12-octadecadienoic acid; the match quality of all of them exceeded 50%, as shown in Table 1. In the samples, each patient was represented as a 119-dimensional feature vector, and the corresponding outcome was coded as either 0 (healthy control) or 1 (PQ-poisoned patient).

4.2. Experimental setup

Both ELM and SVM were implemented in the experiments. For ELM, the implementation by Huang, available from http://www3.ntu.edu.sg/home/egbhuang, was used.

Fig. 4. Validation accuracy versus different number of hidden neurons for ELM in 10 runs of 3-fold CV.

Fig. 5. Mean accuracy and standard deviation versus the different number of hidden neurons for ELM in 10 runs of LOOCV.



Fig. 6. ACC, AUC, sensitivity and specificity obtained for each run of 3-fold CV by ELM.

Fig. 7. ACC, AUC, sensitivity and specificity obtained for each training-test set by ELM.

For SVM, the LIBSVM implementation originally developed by Chang and Lin [22] was utilized. The mRMR feature selection code can be obtained from http://penglab.janelia.org/proj/mRMR/index.htm.

The empirical experiments were conducted on an AMD Athlon 64 X2 Dual-Core Processor 5000+ (2.6 GHz) machine with 4 GB of RAM running Windows 7. To guarantee valid results, k-fold CV [23] was employed to evaluate the classification accuracy. Since there were only 31 samples at hand, the value of k was set to 3 in this study.



Fig. 8. ACC, AUC, sensitivity and specificity obtained by SVM and ELM.

Each time, two of the three subsets were put together to form the training set and the remaining subset was used as the test set; the average result across the three trials was then computed. The advantage of this method is that all test sets are independent, so the reliability of the results can be assessed. In order to keep the same proportion of PQ-poisoned patients and healthy controls in each subset as in the entire data set, a stratified 3-fold CV was employed for analysis.

Fig. 9. The standard deviation of ELM and SVM within 10 runs of 3-fold CV.

Table 3. The optimal confusion matrix obtained by SVM and ELM via 10 runs of 3-fold CV.

                         Predicted PQ patients   Predicted healthy controls   PPV     FPR
SVM   PQ patients        14                      1                            0.875   0.125
      Healthy controls   2                       14
ELM   PQ patients        14                      1                            1.000   0.000
      Healthy controls   0                       16

Considering the bias produced within the cross-validation process [24,25], we also conducted a leave-one-out cross-validation (LOOCV) analysis. For comparison purposes, the hold-out approach was also adopted in our experiments. The data were split into three training-test partitions, namely 80–20%, 70–30% and 50–50%: the model was constructed on the 80%, 70% or 50% training set and evaluated on the remaining 20%, 30% or 50% test set, respectively. After the data were split, normalization was applied based on the training set to avoid feature values in greater numerical ranges dominating those in smaller numerical ranges. The metabolites were normalized by z-scores according to Eq. (7):

$$x' = \frac{x - \mu}{\sigma} \tag{7}$$

where $x$ is the original value, $x'$ is the scaled value, $\mu$ is the mean of the feature, and $\sigma$ is the standard deviation of the feature.
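A minimal sketch of this evaluation protocol, assuming the feature matrix X (31 x 119) and label vector y are already available, is shown below: stratification preserves the class proportions in each fold, and the z-score parameters of Eq. (7) are estimated on the training portion only before being applied to the held-out portion. scikit-learn utilities are used purely for illustration; whether the original hold-out splits were stratified is not stated, so the stratify argument is an assumption.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, train_test_split

def zscore_fit(X_train):
    """Per-feature mean and standard deviation estimated on the training set (Eq. (7))."""
    mu, sigma = X_train.mean(axis=0), X_train.std(axis=0)
    sigma[sigma == 0.0] = 1.0                      # guard against constant features
    return mu, sigma

def zscore_apply(X, mu, sigma):
    return (X - mu) / sigma

def stratified_3fold_splits(X, y, seed=0):
    """Yield normalized (X_train, y_train, X_test, y_test) tuples for stratified 3-fold CV."""
    skf = StratifiedKFold(n_splits=3, shuffle=True, random_state=seed)
    for tr, te in skf.split(X, y):
        mu, sigma = zscore_fit(X[tr])
        yield zscore_apply(X[tr], mu, sigma), y[tr], zscore_apply(X[te], mu, sigma), y[te]

def holdout_split(X, y, test_size=0.2, seed=0):
    """One normalized hold-out partition, e.g. 80-20% (use 0.3 or 0.5 for the other partitions)."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=test_size,
                                              stratify=y, random_state=seed)
    mu, sigma = zscore_fit(X_tr)
    return zscore_apply(X_tr, mu, sigma), y_tr, zscore_apply(X_te, mu, sigma), y_te
```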



Fig. 10. The relationship between the classification performance and different reduced feature space.

Table 4. Performance of ELM using different feature subsets obtained by mRMR.

Feature subset   AUC               ACC               Sensitivity       Specificity
1                0.8744 [0.0047]   0.8709 [0.0026]   1.0000 [0.0000]   0.7489 [0.0094]
5                0.7889 [0.0258]   0.7909 [0.0259]   0.7867 [0.0422]   0.7911 [0.0513]
10               0.8072 [0.0513]   0.8052 [0.0524]   0.8467 [0.0549]   0.7678 [0.0971]
20               0.8033 [0.0702]   0.8018 [0.0708]   0.8733 [0.0492]   0.7333 [0.1162]
40               0.9083 [0.0554]   0.9082 [0.0561]   0.9200 [0.0526]   0.8967 [0.0873]
60               0.9156 [0.0344]   0.9164 [0.0349]   0.9133 [0.0322]   0.9178 [0.0598]
80               0.8889 [0.0493]   0.8906 [0.0483]   0.8800 [0.0613]   0.8978 [0.0601]
100              0.8961 [0.0396]   0.8961 [0.0407]   0.9133 [0.0450]   0.8789 [0.0709]
All features     0.9006 [0.0345]   0.9000 [0.0352]   0.8867 [0.0632]   0.9144 [0.0875]

Values in square brackets are the standard deviations over 10 runs of 3-fold CV.

4.3. Measures for performance evaluation

The classification accuracy (ACC), the area under the receiver operating characteristic curve (AUC) [26], sensitivity and specificity were used to evaluate the performance of the proposed model. They are defined as follows:

$$ACC = \frac{TP + TN}{TP + FP + FN + TN} \times 100\% \tag{8}$$

$$Sensitivity = \frac{TP}{TP + FN} \times 100\% \tag{9}$$

$$Specificity = \frac{TN}{FP + TN} \times 100\% \tag{10}$$

where TP is the number of true positives (cases of the 'PQ-poisoned' class correctly classified as PQ-poisoned patients); FN is the number of false negatives (cases of the 'PQ-poisoned' class classified as healthy controls); TN is the number of true negatives (cases of the 'healthy controls' class correctly classified as healthy controls); and FP is the number of false positives (cases of the 'healthy controls' class classified as PQ-poisoned patients).


AUC is one of the most popular measures for evaluating the performance of a binary classifier; a perfect classifier yields an AUC equal to 1. In this study, we implemented the AUC algorithm developed in [27].
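For reference, the sketch below computes Eqs. (8)–(10) directly from the counts of a confusion matrix and, when continuous decision scores are supplied, obtains the AUC with scikit-learn's ROC routine as a stand-in for the algorithm of [27].

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def binary_metrics(y_true, y_pred, y_score=None):
    """ACC, sensitivity and specificity per Eqs. (8)-(10); AUC from continuous scores if given."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    metrics = {
        "ACC": 100.0 * (tp + tn) / (tp + fp + fn + tn),
        "Sensitivity": 100.0 * tp / (tp + fn),
        "Specificity": 100.0 * tn / (fp + tn),
    }
    if y_score is not None:
        metrics["AUC"] = roc_auc_score(y_true, y_score)
    return metrics

# Example: the optimal ELM confusion matrix of Table 3 (TP=14, FN=1, FP=0, TN=16)
y_true = np.array([1] * 15 + [0] * 16)
y_pred = np.array([1] * 14 + [0] + [0] * 16)
print(binary_metrics(y_true, y_pred))  # ACC ~96.8%, sensitivity ~93.3%, specificity 100%
```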

5. Experimental results and discussion

Different types of activation functions may have a different impact on the performance of ELM. Therefore, a first experiment was conducted to evaluate the influence of the activation function on the ELM model. Five types of activation function were evaluated: the sigmoid function (sig), sine function (sin), hard-limit function (hardlim), triangular basis function (tribas) and radial basis function (radbas). The relationship between the number of hidden neurons and the classification accuracy of ELM with the different activation functions is shown in Fig. 3. From the figure, we can clearly see that ELM with the sigmoid function outperforms ELM with the other activation functions. It is also interesting to find that the standard deviation obtained by ELM with the sigmoid function is much smaller than that of ELM with the other activation functions, which indicates that ELM with the sigmoid function is much more stable than the other ELM models on the data used in this experiment. Therefore, the sigmoid function was adopted in the subsequent analysis.

Besides the activation function, the number of hidden neurons also plays an important role in the performance of ELM. In order to find the most suitable number of hidden neurons, models were built with 10, 50, 100, 250, 500, 750, 1000 and 1250 hidden neurons. The average classification performance over 10 runs of 3-fold CV for the different numbers of hidden neurons is presented in Table 2, and the detailed classification accuracies are shown in Fig. 4. As can be seen from Table 2, the classification performance of the ELM models varied with the number of hidden neurons. When the number of hidden neurons was increased from 10 to 1000, the performance of ELM improved accordingly, while at 1250 hidden neurons the performance began to decrease. The highest validation accuracy was achieved with 1000 hidden neurons. In addition to the 3-fold CV analysis, we also performed an LOOCV analysis on the PQ data; the mean results and standard deviations of the 10 different runs are recorded in Fig. 5. From the figure, we can see that the best classification accuracy is again obtained with 1000 hidden neurons, and the corresponding standard deviation is the smallest among the 10 runs. Therefore, 1000 hidden neurons were chosen to create the training model in the following experiments.

After the activation function and the number of hidden neurons were determined, the model was trained for diagnosis. Fig. 6 shows the ACC, AUC, sensitivity and specificity for each run of 3-fold CV obtained by ELM with 1000 hidden neurons. It can be observed from the figure that an average classification accuracy of 90% and an AUC of 0.9006 were achieved by the ELM model, while the average sensitivity and specificity reached 88.67% and 91.44%, respectively.

In addition to the cross-validation analysis, we also conducted the experiment in a hold-out manner. For simplicity, 1000 hidden neurons were again adopted. The PQ data were first divided into three training-test partitions, namely 80–20%, 70–30% and 50–50%, and the ELM model was constructed on the 80%, 70% and 50% training sets.
Finally, the constructed models were validated on the remaining 20%, 30% and 50% test sets for prediction. The detailed results of ACC, AUC, sensitivity and specificity obtained for each training-test set by the ELM model are shown in Fig. 7.


The average classification accuracies achieved on the three training-test partitions are 90%, 88.89% and 93.33%, respectively, as shown in Fig. 7(a); the AUC values are 0.9093, 0.8958 and 0.9333, as shown in Fig. 7(b); the sensitivities are 94.17%, 90.17% and 94.17%, as shown in Fig. 7(c); and the specificities are 87.69%, 89.00% and 92.50%, as shown in Fig. 7(d). It can be observed that the ELM model achieved its best classification performance on the 80–20% training-test partition. Compared with the cross-validation method, the hold-out approach yields slightly higher classification performance.

For comparison purposes, we also implemented SVM and compared its results with those of ELM. For simplicity, the same 10 runs of 3-fold CV were adopted for a fair comparison. For SVM, we considered the nonlinear SVM based on the popular Gaussian (RBF) kernel, and a grid-search technique [28] with 3-fold CV was employed to find the optimal parameter values of the RBF kernel function. The related parameters C and γ were varied over C ∈ {2^-5, 2^-3, …, 2^15} and γ ∈ {2^-15, 2^-13, …, 2^1}. In total, 11 × 10 = 110 parameter combinations of (C, γ) were tried, and the combination with the best CV accuracy was chosen as the parameter values of the RBF kernel; this best pair (C, γ) was then used to create the model for training.

The comparison results of ELM and SVM in terms of classification accuracy, AUC, sensitivity and specificity are shown in Fig. 8. As shown in the figure, ELM dominated SVM in most runs of 3-fold CV, that is, ELM achieved classification accuracy, AUC, sensitivity and specificity equal to or better than those obtained by SVM. The average classification accuracy, AUC, sensitivity and specificity of ELM are 90%, 0.9006, 88.67% and 91.44%, while those of SVM are 86.55%, 0.8678, 88.00% and 85.56%. Compared with SVM, the average classification performance of ELM improved by 3.45%, 3.28%, 0.67% and 5.88% in terms of ACC, AUC, sensitivity and specificity, respectively. It can also be observed from Fig. 9 that the standard deviations of the ACC and AUC acquired by ELM are smaller than those of SVM, which indicates the robustness and stability of the ELM model on the PQ diagnosis problem. In addition, the optimal confusion matrices obtained by SVM and ELM via 10 runs of 3-fold CV were recorded. As shown in Table 3, the positive predictive value (PPV) achieved by ELM is 100%, while the PPV of SVM is 87.5%; the false positive rate (FPR) of ELM is 0%, while the FPR of SVM is 12.5%. This indicates the superiority of ELM over SVM in discriminating PQ patients from healthy controls.

In order to evaluate whether feature selection can further boost the performance of the proposed method for the diagnosis of PQ poisoning, we conducted experiments in the reduced feature space. The mRMR feature selection method was used to rank the features before classification was performed. Fig. 10 shows the comprehensive results obtained by ELM and SVM in terms of ACC, AUC, sensitivity and specificity in one run of 3-fold CV over incremental feature subsets, where the number of features ranges from 1 to 119 with a step size of 1.
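A sketch of the grid search described above is given below, using scikit-learn's RBF-kernel SVC in place of LIBSVM for brevity; the parameter ranges follow the values stated in the text, while the use of accuracy as the scoring function and the stratified folds are assumptions.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# C in {2^-5, 2^-3, ..., 2^15} and gamma in {2^-15, 2^-13, ..., 2^1}, as stated in the text
param_grid = {
    "C": [2.0 ** e for e in range(-5, 16, 2)],
    "gamma": [2.0 ** e for e in range(-15, 2, 2)],
}

def tune_rbf_svm(X_train, y_train, seed=0):
    """Pick (C, gamma) by 3-fold CV accuracy, then refit the SVM on the whole training set."""
    cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=seed)
    search = GridSearchCV(SVC(kernel="rbf"), param_grid, scoring="accuracy", cv=cv)
    search.fit(X_train, y_train)
    return search.best_estimator_, search.best_params_
```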
It can be observed that, with the aid of feature selection, only a few features are needed to discriminate the PQ patients from the healthy controls while keeping the same or slightly better performance than that obtained with the full feature set. In addition, ELM achieves better results than SVM in terms of ACC, AUC, sensitivity and specificity on the reduced feature space in most cases. Table 4 lists the detailed results of ELM constructed on different feature subsets in terms of AUC, ACC, sensitivity and specificity. From Table 4 we can observe the following facts: 1) The performance of the ELM models built with feature subsets of size 40 and 60 is better than that of the model built with all features. The best performance of ELM is obtained on the feature subset of size 60, with an average AUC of 0.9156, ACC of 91.64%, sensitivity of 91.33% and specificity of 91.78%.



2) Among the six feature subset sizes, the results show that a subset of size 40 is already sufficient to build the classification model: the ELM model with a feature subset of size 40 achieves an average AUC of 0.9083, ACC of 90.82%, sensitivity of 92.00% and specificity of 89.67%, which is nearly the same as the performance obtained using all features. 3) It is interesting to find that ELM can achieve a sensitivity of 100% using only one feature, which is better than the models built on any other feature subset. This indicates that the first feature selected by the mRMR filter, namely uric acid, is an informative feature. Using only this feature, PQ-poisoned patients can be detected with perfect sensitivity, i.e. no poisoned patient is missed.

Generally, in the absence of a clear history of PQ contact, PQ poisoning can only be suspected in clinical practice. It is hard to make a final decision based only on the clinical symptoms, because the clinical manifestations of PQ poisoning are not specific. The blood samples analyzed by GC–MS provide plenty of mass spectral data, and since each compound has a different retention time (RT) and molecular mass, the analysis results based on these data are credible. The developed ELM model possesses high classification accuracy, AUC, sensitivity and specificity, and can therefore be applied in clinical practice to distinguish PQ-poisoned patients from healthy persons. It would be a valuable decision-making tool for the early diagnosis of PQ poisoning, especially given that, at present, the determination of PQ levels in urine or serum is the only way to confirm PQ poisoning. Our research will thus be a significant supplement to the diagnosis of PQ poisoning.

6. Conclusions and future work

This paper presents a novel method for the diagnosis of PQ poisoning from the machine learning perspective. The empirical experiments on the data collected from The First Affiliated Hospital of Wenzhou Medical University have demonstrated the superiority of the proposed ELM-based approach in terms of classification accuracy, AUC, sensitivity and specificity. With the aid of feature selection, we have identified the most crucial feature for PQ poisoning diagnosis. Based on these findings, we foresee the potential use of ELM methods for PQ poisoning detection in biomedical applications; the approach provides a viable alternative to traditional PQ poisoning diagnosis tools by offering excellent predictive ability. It should be noted that only 15 PQ-poisoned patients were involved in this study and that the blood samples were analyzed by the GC–MS method only. In the future, we plan to recruit more patients and apply additional detection methods, such as liquid chromatography-mass spectrometry and nuclear magnetic resonance spectroscopy, to obtain more data to validate the developed diagnostic method.

Acknowledgments

This research is supported by the National Natural Science Foundation of China (NSFC) (61303113, 81401558 and 61402337). This work is also supported by the Science and Technology Committee of Shanghai Municipality of China (KF1405), the Zhejiang Provincial Natural Science Foundation of China (LY14H230001, LQ13G010007, LQ13F020011, LY14F020035), and the key construction academic subject (medical innovation) of Zhejiang Province (11-CX26).

References

[1] S.C. Yoon, Clinical outcome of paraquat poisoning, Korean J. Intern. Med. 24 (2) (2009) 93–94.
[2] J.L. Lin, et al., A prospective clinical trial of pulse therapy with glucocorticoid and cyclophosphamide in moderate to severe paraquat-poisoned patients, Am. J. Respir. Crit. Care Med. 159 (2) (1999) 357–360.
[3] M.J. Rio, C. Velez-Pardo, Paraquat induces apoptosis in human lymphocytes: protective and rescue effects of glucose, cannabinoids and insulin-like growth factor-1, Growth Factors 26 (1) (2008) 49–60.
[4] C.W. Hsu, et al., Early hemoperfusion may improve survival of severely paraquat-poisoned patients, PLoS One 7 (10) (2012) e48397.
[5] J.R. Koo, et al., Failure of continuous venovenous hemofiltration to prevent death in paraquat poisoning, Am. J. Kidney Dis. 39 (1) (2002) 55–59.
[6] Y. Zou, et al., An improved approach for extraction and high-performance liquid chromatography analysis of paraquat in human plasma, J. Chromatogr. B Anal. Technol. Biomed. Life Sci. 879 (20) (2011) 1809–1812.
[7] W.P. Wu, et al., Addition of immunosuppressive treatment to hemoperfusion is associated with improved survival after paraquat poisoning: a nationwide study, PLoS One 9 (1) (2014) e87568.
[8] P. Houze, et al., Toxicokinetics of paraquat in humans, Hum. Exp. Toxicol. 9 (1) (1990) 5–12.
[9] G.-B. Huang, Q.-Y. Zhu, C.-K. Siew, Extreme learning machine: theory and applications, Neurocomputing 70 (1–3) (2006) 489–501.
[10] R. Zhang, et al., Multicategory classification using an extreme learning machine for microarray gene expression cancer diagnosis, IEEE/ACM Trans. Comput. Biol. Bioinf. 4 (3) (2007) 485–494.
[11] S. Suresh, R. Venkatesh Babu, H.J. Kim, No-reference image quality assessment using modified extreme learning machine classifier, Appl. Soft Comput. 9 (2) (2009) 541–552.
[12] A.A. Mohammed, et al., Human face recognition based on multidimensional PCA and extreme learning machine, Pattern Recognit. 44 (10–11) (2011) 2588–2597.
[13] M. Pal, Extreme-learning-machine-based land cover classification, Int. J. Remote Sens. 30 (14) (2009) 3835–3841.
[14] R. Moreno, et al., Extreme learning machines for soybean classification in remote sensing hyperspectral images, Neurocomputing 128 (2014) 207–216.
[15] C. Cortes, V. Vapnik, Support-vector networks, Mach. Learn. 20 (3) (1995) 273–297.
[16] H. Peng, F. Long, C. Ding, Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy, IEEE Trans. Pattern Anal. Mach. Intell. 27 (8) (2005) 1226–1238.
[17] G.-B. Huang, Q.-Y. Zhu, C.-K. Siew, Extreme learning machine: a new learning scheme of feedforward neural networks, IEEE Int. Jt. Conf. Neural Netw. (2004) 985–990.
[18] G.B. Huang, H.A. Babri, Upper bounds on the number of hidden neurons in feedforward networks with arbitrary bounded nonlinear activation functions, IEEE Trans. Neural Netw. 9 (1) (1998) 224–229.
[19] G.B. Huang, Learning capability and storage capacity of two-hidden-layer feedforward networks, IEEE Trans. Neural Netw. 14 (2) (2003) 274–281.
[20] G.B. Huang, L. Chen, C.K. Siew, Universal approximation using incremental constructive feedforward networks with random hidden nodes, IEEE Trans. Neural Netw. 17 (4) (2006) 879–892.
[21] M. Zhang, et al., An evaluation of acute hydrogen sulfide poisoning in rats through serum metabolomics based on gas chromatography–mass spectrometry, Chem. Pharm. Bull. (Tokyo) 62 (6) (2014) 505–507.
[22] C.C. Chang, C.J. Lin, LIBSVM: a library for support vector machines, 2001. Software available at <http://www.csie.ntu.edu.tw/cjlin/libsvm>.
[23] S.L. Salzberg, On comparing classifiers: pitfalls to avoid and a recommended approach, Data Min. Knowl. Discov. 1 (3) (1997) 317–328.
[24] R.J. Tibshirani, R. Tibshirani, A bias correction for the minimum error rate in cross-validation, Ann. Appl. Stat. (2009) 822–829.
[25] Y. Ding, et al., Bias correction for selecting the minimal-error classifier from many machine learning models, Bioinformatics 30 (22) (2014) 3152–3158.
[26] T. Fawcett, An introduction to ROC analysis, Pattern Recognit. Lett. 27 (8) (2006) 861–874.
[27] T. Fawcett, ROC graphs: notes and practical considerations for researchers, Mach. Learn. 31 (2004) 1–38.
[28] C.W. Hsu, C.C. Chang, C.J. Lin, A practical guide to support vector classification, Technical report, Department of Computer Science and Information Engineering, National Taiwan University, Taipei, 2003. Available at <http://www.csie.ntu.edu.tw/cjlin/libsvm/>.