A survival classification method for hepatocellular carcinoma patients with chaotic Darcy optimization method based feature selection


Medical Hypotheses 139 (2020) 109626

Contents lists available at ScienceDirect

Medical Hypotheses journal homepage: www.elsevier.com/locate/mehy


Fahrettin Burak Demir (a), Turker Tuncer (b,*), Adnan Fatih Kocamaz (c), Fatih Ertam (b)

(a) Department of Computer Sciences, Vahap Kucuk Vocational School, Malatya Turgut Ozal University, Malatya, Turkey
(b) Department of Digital Forensics Engineering, Technology Faculty, Firat University, Elazig, Turkey
(c) Department of Computer Engineering, Engineering Faculty, Inonu University, Malatya, Turkey

ARTICLE INFO

ABSTRACT

Keywords: Chaotic Darcy optimization; Feature selection; HCC survival classification; Missing feature completion

Surveys are one of the crucial data retrieval methods in the literature. However, surveys often contain missing data and redundant features; therefore, missing feature completion and feature selection have been widely used for knowledge extraction from surveys. We have a hypothesis to solve these two problems, and to implement it a classification method is presented. Our proposed method consists of missing feature completion with a statistical moment (the average) and feature selection using a novel swarm optimization method. Firstly, an average based supervised feature completion method is applied to the Hepatocellular Carcinoma (HCC) survey. The used HCC survey consists of 49 features. To select meaningful features, a chaotic Darcy optimization based feature selection method is presented; it selects the 31 most discriminative features of the completed HCC dataset. An accuracy rate of 0.9879 was obtained by using the proposed chaotic Darcy optimization based HCC survival classification method.

Introduction

Cancer is one of the most common diseases today and carries a high risk of loss of life. Most primary liver tumors are hepatocellular carcinoma (HCC) [1]. HCC is a common tumor of the liver and is mostly seen in people with chronic liver disease or cirrhosis [2]. HCC is first among the five most common cancer diseases [3]. In general, liver cancer patients do not show any symptoms before diagnosis; therefore, treatment is more difficult because few symptoms can be seen [4]. Early detection and diagnosis are important factors in the recovery of liver cancer patients. As a basis for early diagnosis, serum marker detection and medical imaging are used [5]. Liver cancer is a common cause of death worldwide. Cancerous tissue can be accurately identified using computed tomography images, and in image processing approaches, computer-assisted diagnosis can be used to classify liver cancer and assist the clinician in the decision-making process [6,7]. HCC is a major factor in liver cancer and causes the death of hundreds of thousands of people [8-11]. Recently, many studies have been presented for HCC-related diseases [2,4,5,7,9-13]. Książek et al. [12] used widely known machine

learning methods such as naive Bayes, linear discriminant, random forest, SVM, kNN, logistic regression and multilayer perceptron to classify the HCC dataset. In their study, they achieved an accuracy of approximately 88.5% using a genetic algorithm for feature selection, and they used the mean value for missing feature completion. Sawhney et al. [13] used the random forest algorithm together with the firefly optimization method for HCC diagnosis. Optimization algorithms are used to select the best parameters, for instance features or classifier settings [14]. Feature selection is the technique of selecting a small subset from a specific set of features by removing inappropriate and unnecessary features. Santos et al. [11] used a new cluster-based sampling algorithm; in their study, they developed a new technique to balance the dataset and used logistic regression and neural network methods. Diagnosis from pathology images remains the most important step in the diagnosis of many diseases, including most cancers. Li et al. [15] proposed a new extreme learning machine based model for rough segmentation of pathology images for the correct segmentation of the HCC nucleus. Das et al. [7] presented a watershed and density-based segmentation to effectively identify the cancer lesion in liver computed tomography images, and suggested a computer-aided diagnosis model called

⁎ Corresponding author.
E-mail addresses: [email protected] (F.B. Demir), [email protected] (T. Tuncer), [email protected] (A.F. Kocamaz), [email protected] (F. Ertam).

https://doi.org/10.1016/j.mehy.2020.109626 Received 11 January 2020; Received in revised form 10 February 2020; Accepted 12 February 2020 0306-9877/ © 2020 Elsevier Ltd. All rights reserved.


the Gauss-based deep learning technique. Statistical, textural and geometric features were classified using a deep neural network classifier to distinguish the hemangioma and metastatic carcinoma types of liver tumors besides HCC. Fa et al. [16] defined disease-specific features in HCC patients based on the Pathifier methodology. Chaudhary et al. [17] used a deep learning computational framework on HCC datasets; they selected an autoencoder framework for the application of deep learning. These algorithms have proven to be highly effective approaches for producing properties associated with clinical outcomes [18,19]. Their contribution to the HCC field arises not only from their thorough and integrative computational rigor but also from the unification of solid subtypes based on the testing of various cohorts, even if the incompatible molecular subtypes are in different omics forms. Palazzo et al. [20] used a data analysis pipeline to classify virus-associated and alcohol-associated HCC tumor subtypes using simple genetic mutations per gene in tumor samples of patients; they used Lasso for feature selection and Support Vector Machines for classification. Dogan and Turkoglu used Support Vector Machines to analyze big data in the diagnosis of hyperlipidemia [21]. Likewise, Ozyurt used the ELM classifier to detect white blood cells [22]. Aydemir et al. proposed a multilevel machine learning method to diagnose epilepsy [23]. Tuncer et al. presented a machine learning method with a signal processing algorithm for automated diagnosis of cardiac health using electrocardiography (ECG) signals [24]. Tuncer and Dogan suggested a machine learning method for recognizing Parkinson's disease automatically [25]. The motivation of the proposed method is as follows: in the literature, various methods have been proposed for survival recognition using this HCC survey.
These studies employed meta-heuristic and ensemble-based classifiers to achieve high performance rates. To address this problem, we propose a Darcy optimization algorithm based feature selection. The contributions and novelties of this work are:

As is known, many plants in nature exhibit root movements towards locations where water and nutrients are concentrated and, in the same way, move away from barren soil, harmful substances and high salt density. In this study, by calculating the flow of groundwater with Darcy's law, the movement of plants towards the flow density of the water is modeled. Darcy's law has two different mathematical equations, depending on whether the aquifer (the permeable rock structure in which water can accumulate) is free-surface or pressured. In this study, Darcy's theorem for the free-surface aquifer was used because plants are generally fed from surface waters. The mathematical form of the Darcy theorem for free-surface aquifers is shown in Eq. (1).

Q = K × i × A,  i = Δh / L    (1)

where Q is the volumetric flow (m³/s), K is the hydraulic conductivity (m/s), i is the hydraulic slope, A is the cross-section area (m²), Δh is the head difference (m) and L is the flow path length (m). In a free-surface aquifer well, the radius r is calculated using the Dupuit hypothesis, and the mathematical notation of the Darcy law is shown in Eq. (2).

Q = π × K × (h2² − h1²) / ln(r2 / r1)    (2)

where h1 and h2 are the water heights measured at radii r1 and r2.
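As a quick numeric illustration of Eqs. (1) and (2), both laws are one-liners; the function and variable names below are ours, not the paper's:

```python
import math

def darcy_flow(K, i, A):
    """Darcy's law for laminar flow, Eq. (1): Q = K * i * A."""
    return K * i * A

def darcy_flow_unconfined_well(K, h1, h2, r1, r2):
    """Free-surface (unconfined) aquifer well form, Eq. (2):
    Q = pi * K * (h2^2 - h1^2) / ln(r2 / r1)."""
    return math.pi * K * (h2 ** 2 - h1 ** 2) / math.log(r2 / r1)

# Example: K = 1e-4 m/s, hydraulic slope i = 0.01, cross-section A = 2 m^2
q = darcy_flow(1e-4, 0.01, 2.0)  # volumetric flow in m^3/s
```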

Logistic-Sine chaotic map

Chaotic maps have been widely used in image encryption, optimization, substitution box (SBOX) generation and random number generation. We used a hybrid, one-dimensional chaotic map to provide a uniform distribution and to directly use the good statistical attributes of chaos in the DOA. The Logistic-Sine chaotic map (LSCM) has a wide chaotic interval and contains both the logistic and sine maps. The equation of the LSCM is given below [27,28].

• A novel swarm optimization algorithm, called the Darcy Optimization Algorithm, is presented. The Darcy Optimization Algorithm is a nature-inspired method, and we used it as a feature selector in this work.
• Several conventional classifiers were used to demonstrate the effectiveness of the Darcy Optimization based feature selector.
• The proposed Darcy Optimization based HCC survival method achieved a 98.79% classification accuracy. This result illustrates that the proposed Darcy optimization and supervised missing feature completion methods are effective for HCC survival classification, and that the proposed classification method outperforms existing methods.

x_{i+1} = ( r·x_i·(1 − x_i) + (4 − r)·sin(π·x_i) / 4 ) mod 1    (3)

where x is the generated number sequence and r is the chaos multiplier. To show the chaotic interval of the LSCM, bifurcation diagrams of the logistic map, the sine map and the LSCM are shown in Fig. 1. Fig. 1 shows that the LSCM has a wider chaotic range than the logistic and sine maps; therefore, we selected the LSCM as the chaotic map.

The proposed chaotic DOA
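For concreteness, the LSCM of Eq. (3) can be sketched in a few lines of Python (the function name, seed and multiplier values are illustrative, not from the paper):

```python
import math

def lscm_sequence(x0, r, n):
    """Logistic-Sine chaotic map, Eq. (3):
    x_{i+1} = (r*x_i*(1 - x_i) + (4 - r)*sin(pi*x_i)/4) mod 1."""
    xs = [x0]
    for _ in range(n - 1):
        x = xs[-1]
        xs.append((r * x * (1.0 - x) + (4.0 - r) * math.sin(math.pi * x) / 4.0) % 1.0)
    return xs

seq = lscm_sequence(0.7, 3.99, 1000)  # values remain in [0, 1)
```

The modulo keeps every value in [0, 1), which is the range the DOA uses for its T vector.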

Darcy optimization algorithm

In this article, a novel root-development inspired chaotic optimization method is presented; it is called the Darcy Optimization Algorithm (DOA). The DOA uses the Darcy theorem, and a hybrid chaotic map is used to provide a uniform distribution in the DOA. The backgrounds of the DOA are explained in the subsections. The steps of the chaotic DOA method are given below.

Step 1: Generate initial particles using Eq. (4).

P_i = rand[0, 1] × (UB − LB) + LB    (4)

Step 2: Find the best personal particle using Eq. (5).

P_best = min(act(P_i)), i = {1, 2, …, pn}    (5)

Darcy theorem

Henry Darcy was a French scientist who lived between 1803 and 1858. As a hydrology engineer, he observed the movements of water in sand columns and worked on access to the water supply. He proved that the hydraulic conductivity of the soil is directly related to the cross-sectional area and the flow path length. The Darcy theorem is used for the laminar (regular) flow movements of water under the soil in a porous structure [26]. That is to say, it is not used for quantitative analysis of water with turbulent and variable flow movements.

Step 3: Calculate the T vector using any chaotic map; here Eq. (3) is used.
Step 4: Update particles using the Darcy formulation and the T vector, by using Eq. (6).

P_i^(t+1) = P_i^t + ((T_best)² − (T_(i+1))²) · (P_i^t − T_i) · (UB − LB) / pn    (6)


Fig. 1. Bifurcation diagrams (a) Logistic map, (b) Sine map, (c) Logistic-sine map (x = (0,1) and r = (0,4)).

The procedure of the proposed DOA is also shown in Algorithm 1.

Algorithm 1: Chaotic DOA procedure.
Input: Upper bound (UB), lower bound (LB), seed values, particle number (pn), fitness function.
Output: Best value
1: for i = 1 to pn do
2:   Use Eq. (3) to generate particles.
3: end for i
4: Find pbest value.
5: for k = 1 to max_iter do
6:   Define seed value.
7:   for j = 1 to pn do
8:     Update particles using Eq. (7).
9:   end for j
10:  Update best value.
11: end for k
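Eq. (6) is difficult to recover exactly from the typeset version, so the following is only a structural sketch of Steps 1-6 for a one-dimensional search space: it keeps the chaotic T sequence, the bound handling of Eq. (7) and the (UB − LB)/pn scaling, but the particle-update term is our assumption, not the authors' exact formula.

```python
import math
import random

def lscm(x, r=3.99):
    """Logistic-Sine chaotic map of Eq. (3)."""
    return (r * x * (1 - x) + (4 - r) * math.sin(math.pi * x) / 4) % 1.0

def chaotic_doa(fitness, lb, ub, pn=30, max_iter=200, seed=0.7):
    """Structural sketch of the chaotic DOA loop (Steps 1-6).
    The particle-update term is an assumption standing in for Eq. (6)."""
    rng = random.Random(42)
    particles = [rng.random() * (ub - lb) + lb for _ in range(pn)]      # Step 1, Eq. (4)
    best = min(particles, key=fitness)                                  # Step 2, Eq. (5)
    t = seed
    for _ in range(max_iter):                                           # Step 6
        for i in range(pn):
            t = lscm(t)                                                 # Step 3: T vector entry
            step = (t - 0.5) * (ub - lb) / pn                           # assumed scaled move
            cand = particles[i] + step + t * (best - particles[i])      # assumed Eq. (6)
            if not (lb <= cand <= ub):                                  # Step 5, Eq. (7)
                cand = best * t
            particles[i] = cand
        cur = min(particles, key=fitness)
        if fitness(cur) < fitness(best):
            best = cur
    return best

best = chaotic_doa(lambda x: x * x, -100.0, 100.0)  # minimize the sphere function
```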

Fig. 2. Schematic explanation of the proposed chaotic DOA based HCC survival classification method.

Step 5: If an updated particle exceeds the lower or upper bound of the search space, update it using Eq. (7).

P_i = P_best × T_i    (7)

Proposed HCC survival classification method


We present a novel method for survival classification using an HCC dataset. The proposed HCC classification method consists of three main phases: missing feature completion using the supervised mean method, feature selection using the chaotic DOA, and classification.

Step 6: Repeat Steps 2-5 until the maximum number of iterations or the desired error is reached.


The main objective of the chaotic DOA is to select the optimum features with minimum error. To implement the chaotic DOA based feature selection, three functions are used; these are the main, fitness and feature selection functions. An overview of the proposed chaotic DOA based feature selection method is shown in Fig. 3. The steps of the proposed chaotic DOA based feature selector are given below.

Step 0: Load the completed HCC dataset.
Step 1: Set the initial parameters.

UB = 2^49 − 1    (9)
LB = 1    (10)
pnum = 250    (11)
niter = 1000    (12)

Fig. 3. Abstract of the proposed chaotic DOA based feature selector.

In Eqs. (9)-(12), the parameters of the chaotic DOA are given, where pnum is the particle number and niter is the maximum iteration number. The first value of the T vector is assigned randomly.

The diagrammatic explanation of the proposed chaotic DOA based HCC classification method is shown in Fig. 2. Three phases of the proposed chaotic DOA based HCC survival method are explained in subsections.

Step 2: Generate initial particles using Eq. (4).
Step 3: Convert each particle to a binary value. In this step, each particle is coded as 49 bits.
Step 4: Select features by using each particle. The feature selection procedure is given in Algorithm 2.

Missing feature completion method

Here, we used the supervised missing feature completion method presented by Tuncer and Ertam [29]. In this completion method, the observations are divided into groups according to the target value. Then the average of the non-missing observations in each group is calculated, and this value is assigned to the missing values of that group. The used HCC dataset has two classes, alive and dead; therefore, the data are divided into two parts and two average values are calculated. These values are assigned to the missing features of the alive and dead classes respectively.
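A minimal sketch of this class-wise (supervised) mean completion, with None marking a missing answer (the function name is ours):

```python
import copy

def supervised_mean_impute(X, y, missing=None):
    """Class-wise mean completion: each missing entry is replaced by the mean
    of the observed values of the same feature within the same class.
    Assumes every (class, feature) pair has at least one observed value."""
    X = copy.deepcopy(X)  # keep the caller's data intact
    n_feat = len(X[0])
    for cls in set(y):
        rows = [r for r, lab in enumerate(y) if lab == cls]
        for j in range(n_feat):
            observed = [X[r][j] for r in rows if X[r][j] is not missing]
            mean = sum(observed) / len(observed)
            for r in rows:
                if X[r][j] is missing:
                    X[r][j] = mean
    return X

X = [[1.0, None], [3.0, 4.0], [None, 8.0], [5.0, 6.0]]
y = ["alive", "alive", "dead", "dead"]
Xc = supervised_mean_impute(X, y)  # fills X[0][1] from "alive" rows, X[2][0] from "dead" rows
```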

Algorithm 2. Feature selection procedure using the binary value of the particle.

Procedure: FeatSelect(X, particle_i)
Input: Completed HCC dataset features (X) with size 165 × 49; binary particle (particle_i) with size 49.
Output: Selected features (X^S_particle_i) with size 165 × k, where k is the number of selected features.

1: for i = 1 to 165 do
2:   k = 1;
3:   for j = 1 to 49 do
4:     if particle_i(j) = 1 then

Chaotic DOA based feature selection method

Metaheuristic optimization methods have been widely used for data mining, numerical function optimization, creating ensemble classifiers and feature selection. In this phase, we used the chaotic DOA for feature selection. The classification error is used as the fitness function; Eq. (8) defines it:

Err = 1 − (TP + TN) / (TP + FN + FP + TN)    (8)

where Err is the error rate, TP is the number of true positives, FN false negatives, FP false positives and TN true negatives.

5:       X^S_particle_i(i, k) = X(i, j);
6:       k = k + 1;
7:     end if
8:   end for j
9: end for i
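Algorithm 2 together with the 49-bit particle coding of Step 3 amounts to column selection by bitmask; a small Python sketch (our naming):

```python
def to_bitmask(particle, n_bits=49):
    """Step 3: encode an integer particle from [1, 2**49 - 1] as a 49-bit vector."""
    return [(particle >> b) & 1 for b in range(n_bits)]

def feat_select(X, bits):
    """Algorithm 2: keep only the columns whose bit equals 1."""
    keep = [j for j, b in enumerate(bits) if b == 1]
    return [[row[j] for j in keep] for row in X]

X = [[10, 20, 30, 40], [50, 60, 70, 80]]
selected = feat_select(X, [1, 0, 0, 1])  # keeps columns 0 and 3
```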

Step 5: Calculate the error value of each particle by using our fitness function:

predict = kNN(X^S_particle_i, y)    (13)

where predict is the vector of predicted labels for X^S_particle_i, kNN is the k-nearest neighbor classifier (k = 1, city-block distance, 10-fold cross-validation) and y is the target vector. By using predict and y, the TP, FP, FN and TN values are calculated.
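The fitness of Eqs. (8) and (13) is the cross-validated error of a 1-NN city-block classifier. A self-contained sketch in pure Python, using simple deterministic interleaved folds rather than MATLAB's randomized 10-fold split:

```python
def cityblock(a, b):
    """City-block (L1) distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def one_nn_predict(train_X, train_y, x):
    """1-nearest-neighbor prediction with city-block distance (k = 1)."""
    nearest = min(range(len(train_X)), key=lambda i: cityblock(train_X[i], x))
    return train_y[nearest]

def cv_error(X, y, folds=10):
    """Eq. (8)-style fitness: 1 - accuracy of the 1-NN classifier under
    k-fold cross-validation (deterministic interleaved folds)."""
    n = len(X)
    wrong = 0
    for f in range(folds):
        train_idx = [i for i in range(n) if i % folds != f]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        for i in range(f, n, folds):  # held-out fold
            wrong += one_nn_predict(train_X, train_y, X[i]) != y[i]
    return wrong / n
```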

Table 1
Explanation of the used nine machine learning algorithms.

No  Classifier    Attributes of classifier
1   DT [30]       Gini's criterion is selected; the maximum number of splits is 100.
2   LD [31]       Covariance structure is full.
3   QD [32]       Covariance structure is full.
4   LR [33]       This classifier has no advanced option.
5   SVM [34]      Linear kernel function is used; the box constraint level is 1.
6   KNN [34,35]   Non-parametric; k = 1 and the city-block distance metric are selected.
7   BT [36]       Number of learners is 30 and number of splits is 164.
8   SD [37]       Number of learners is 30, subspace dimension is 16, learning rate is 0.1.
9   SNN [38]      Number of learners is 30, subspace dimension is 16, learning rate is 0.1.


Fig. 4. Details of the used HCC dataset.


Table 2
The numerical benchmark functions used to evaluate the optimization methods.

No  Function                 Equation
1   Sphere                   f1(x) = Σ_{i=1..n} x_i^2,  x ∈ [−100, 100]^n    (14)
2   Schwefel 2.22            f2(x) = Σ_{i=1..n} |x_i| + Π_{i=1..n} |x_i|,  x ∈ [−10, 10]^n    (15)
3   Schwefel 1.2             f3(x) = Σ_{i=1..n} (Σ_{j=1..i} x_j)^2,  x ∈ [−100, 100]^n    (16)
4   Schwefel 2.21            f4(x) = max{|x_i|, 1 ≤ i ≤ n},  x ∈ [−100, 100]^n    (17)
5   Rosenbrock's             f5(x) = Σ_{i=1..n−1} [100(x_{i+1} − x_i^2)^2 + (x_i − 1)^2],  x ∈ [−30, 30]^n    (18)
6   Step                     f6(x) = Σ_{i=1..n} |x_i + 0.5|^2,  x ∈ [−100, 100]^n    (19)
7   Noise                    f7(x) = Σ_{i=1..n} i·x_i^4 + rand(0, 1),  x ∈ [−1.28, 1.28]^n    (20)
8   Rastrigin's              f8(x) = Σ_{i=1..n} (x_i^2 − 10·cos(2πx_i) + 10),  x ∈ [−5.12, 5.12]^n    (21)
9   Ackley's                 f9(x) = −20·exp(−0.2·sqrt((1/n)·Σ_{i=1..n} x_i^2)) − exp((1/n)·Σ_{i=1..n} cos(2πx_i)) + 20 + e,  x ∈ [−32, 32]^n    (22)
10  Griewank's               f10(x) = (1/4000)·Σ_{i=1..n} x_i^2 − Π_{i=1..n} cos(x_i/√i) + 1,  x ∈ [−600, 600]^n    (23)
11  Generalized Penalized 1  f11(x) = (π/n)·{10·sin^2(πy_1) + Σ_{i=1..n−1} (y_i − 1)^2·[1 + 10·sin^2(πy_{i+1})] + (y_n − 1)^2} + Σ_{i=1..n} u(x_i, 10, 100, 4),  x ∈ [−50, 50]^n    (24)
                             where y_i = 1 + (x_i + 1)/4 and u(x_i, a, k, m) = k(x_i − a)^m if x_i > a; 0 if −a ≤ x_i ≤ a; k(−x_i − a)^m if x_i < −a    (25)
12  Generalized Penalized 2  f12(x) = 0.1·{sin^2(3πx_1) + Σ_{i=1..n−1} (x_i − 1)^2·[1 + sin^2(3πx_{i+1})] + (x_n − 1)^2·[1 + sin^2(2πx_n)]} + Σ_{i=1..n} u(x_i, 5, 100, 4),  x ∈ [−50, 50]^n    (26)-(27)
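A few of these benchmarks can be written directly (a sketch in our naming; all three have their global minimum 0 at the origin):

```python
import math

def sphere(x):
    """f1, Eq. (14)."""
    return sum(v * v for v in x)

def rastrigin(x):
    """f8, Eq. (21)."""
    return sum(v * v - 10 * math.cos(2 * math.pi * v) + 10 for v in x)

def ackley(x):
    """f9, Eq. (22)."""
    n = len(x)
    s1 = sum(v * v for v in x) / n
    s2 = sum(math.cos(2 * math.pi * v) for v in x) / n
    return -20 * math.exp(-0.2 * math.sqrt(s1)) - math.exp(s2) + 20 + math.e
```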

The calculated TP, FP, FN and TN values are then substituted into the fitness function of Eq. (8) to obtain the error.

Overview of the proposed chaotic DOA based method

The procedure of the proposed chaotic DOA feature selector based HCC classification method is shown in Algorithm 3.

Step 6: Apply Steps 2-6 of the proposed chaotic DOA.

Algorithm 3. Proposed chaotic DOA feature selector based HCC classification method.

Classification

Input: HCC dataset.
Output: Results.

To illustrate the superiority of the features selected with the chaotic DOA, we selected nine widely used classifiers in machine learning: Decision Tree (DT) [30], Linear Discriminant (LD) [31], Quadratic Discriminant (QD) [32], Logistic Regression (LR) [33], Support Vector Machine (SVM) [34], K-Nearest Neighbors (KNN) [34,35], Bagged Tree (BT) [36], Subspace Discriminant (SD) [37] and Subspace Nearest Neighbor (SNN) [38]. The first six are conventional classifiers and the last three are ensemble classifiers. The attributes of these classifiers are listed in Table 1. The MATLAB Classification Learner toolbox was used to implement all nine classifiers.

0: Load the HCC dataset.
1: Complete missing features by using the supervised mean feature completion method.
2: Set the initial parameters of the chaotic DOA.
3: Generate particles by using the chaotic DOA in the range from 1 to 2^49 − 1.
4: Convert the generated particles to binary form.
5: Implement the feature selection procedure by using Algorithm 2 and the binary particles.
6: Select the personal best particle.
7: Update particles by using the logistic-sine map and the DOA formulation.
8: Select the best particle.
9: Select features by using the best particle and Algorithm 2.
10: Forward the selected features to the nine classifiers.
11: Obtain results from the classifiers.
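Under the same assumptions as the sketches above (class-wise mean imputation, bitmask feature selection, 1-NN city-block fitness), the flow of Algorithm 3 can be wired together on a toy dataset. All names here are illustrative, not the authors' MATLAB code, and leave-one-out stands in for the 10-fold split:

```python
import copy

def impute(X, y):
    """Phase 1: class-wise mean completion (None marks missing)."""
    X = copy.deepcopy(X)
    for cls in set(y):
        rows = [r for r, lab in enumerate(y) if lab == cls]
        for j in range(len(X[0])):
            vals = [X[r][j] for r in rows if X[r][j] is not None]
            m = sum(vals) / len(vals)
            for r in rows:
                if X[r][j] is None:
                    X[r][j] = m
    return X

def select(X, bits):
    """Phase 2 (Algorithm 2): keep columns whose bit is 1."""
    keep = [j for j, b in enumerate(bits) if b]
    return [[row[j] for j in keep] for row in X]

def loo_error(X, y):
    """Phase 3 fitness: leave-one-out 1-NN (city block) error."""
    wrong = 0
    for i in range(len(X)):
        others = [j for j in range(len(X)) if j != i]
        nn = min(others, key=lambda j: sum(abs(a - b) for a, b in zip(X[i], X[j])))
        wrong += y[nn] != y[i]
    return wrong / len(X)

X = [[0.0, None, 9.0], [0.2, 1.0, 8.0], [5.0, 6.0, None], [5.2, 6.2, 1.0]]
y = [0, 0, 1, 1]
err = loo_error(select(impute(X, y), [1, 1, 0]), y)  # error of one candidate bitmask
```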


Table 3
Performance (mean and standard deviation over functions F1-F12) of the proposed DOA (DARCY-COA) and other widely used optimization algorithms: SCA [42], WOA [44], GOA [45], PSO [39], GWO [46], MVO [47] and CDW-PSO [49]. The two strongest columns are reproduced below; the proposed DARCY-COA attained the best result on 11 of the 12 functions.

F     CDW-PSO [49] (Mean / S.D.)        DARCY-COA (Mean / S.D.)
F1    0 / 0                             0 / 0
F2    0 / 3.79 × 10^−36                 0 / 0
F3    0 / 0                             0 / 0
F4    0 / 5.26 × 10^−43                 0 / 0
F5    2.81 × 10^1 / 4.82 × 10^1         0 / 0
F6    0 / 0                             0 / 0
F7    0 / 0                             0 / 0
F8    0 / 0                             0 / 0
F9    8.88 × 10^−16 / 0                 8.88 × 10^−16 / 0
F10   0 / 0                             5.26 × 10^−3 / 5.00 × 10^−3
F11   2.93 × 10^−1 / 8.29 × 10^−1       4.71 × 10^−31 / 8.90 × 10^−47
F12   2.31 × 10^0 / 4.29 × 10^0         1.34 × 10^−32 / 5.56 × 10^−48

Results and discussions

Dataset

In this study, a survey is utilized as the dataset; it includes 49 questions. The survey was applied to 165 subjects, and the acquired answers are used as features. The main purpose is to detect surviving or deceased subjects from these features. However, there are missing features (unanswered questions) in this dataset. The dataset can be downloaded from https://archive.ics.uci.edu/ml/datasets/HCC+Survival. According to this dataset, 102 subjects are alive and 63 subjects are dead. Details of the used dataset are shown in Fig. 4.

Results

To show the success of the proposed chaotic DOA, we first used 12 numerical benchmark functions; these functions are described in Table 2 [39-41]. As seen in Table 2, widely used numerical fitness functions are employed. The global minimum of each function is 0, so the aim of the optimization methods is to reach 0.

Table 4
Results (%) of the used nine classifiers.

Classifier   Accuracy   Recall   Precision   F1      Error
DT           95.15      94.86    95.47       94.86   4.85
LD           93.94      92.37    95.00       93.66   6.06
QD           90.91      90.64    90.28       90.40   9.09
LR           90.91      90.24    90.47       90.35   9.09
SVM          95.15      93.65    96.36       94.99   4.85
KNN          95.15      94.86    95.47       94.86   4.85
BT           97.58      96.83    98.11       97.47   2.42
SD           92.73      90.48    94.74       92.56   7.27
SNN          98.79      98.41    99.04       98.72   1.21


precision = TP / (FP + TP)    (28)

recall = TP / (FN + TP)    (29)



By using these functions, the proposed chaotic DOA was compared to the Sine Cosine Algorithm (SCA) [42,43], Whale Optimization Algorithm (WOA) [44], Grasshopper Optimization Algorithm (GOA) [45], Particle Swarm Optimization (PSO) [39], Grey Wolf Optimization (GWO) [46], Multi-Verse Optimization (MVO) [47,48] and chaotic dynamic weight particle swarm optimization (CDW-PSO) [31]. The comparisons are given in Table 3, which clearly shows the success of the proposed chaotic DOA: our optimization method achieved the best results on 11 of the 12 numerical benchmark functions. Therefore, we used this method for feature selection. The proposed chaotic DOA method was then applied to the completed HCC dataset, and the obtained results are presented in this section. After applying the proposed chaotic feature selection method, 31 features are selected. These features are selected by using the binary values of the global optimum, which are "1101111001100010110001101011010110111110100111111". According to Algorithm 2, features coded with 0 are defined as redundant and features coded with 1 are selected. The features selected by the optimum binary values are utilized as the input of the nine classifiers. The proposed method was implemented in the MATLAB 2018a programming environment. Precision, recall, F1 score, accuracy and error metrics are used; their mathematical definitions are given below.

Results of the proposed chaotic DOA feature selector based classification method are given and discussed in this section. Dataset definition, results and discussions are subsections of this section. These subsections are given below.



Table 5
Accuracies and F1 scores (%) of the proposed chaotic DOA method and other HCC survival classification methods.

Method                                                  Accuracy       F1
Santos et al.'s method [11]                             75.19 ± 1.05   66.50 ± 1.82
Sawhney et al.'s method [13]                            83.50          –
Książek et al.'s method [12]                            88.49          87.62
Tuncer and Ertam's method [27]                          92.12          91.61
Proposed chaotic DOA based method (SNN based method)    98.79          98.72


F1 = (2 × precision × recall) / (precision + recall)    (30)

accuracy = 1 − error    (31)
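Eqs. (28)-(31) computed from a confusion matrix (a sketch; the function name is ours):

```python
def metrics(tp, fp, fn, tn):
    """Eqs. (28)-(31): precision, recall, F1 and accuracy from confusion counts."""
    precision = tp / (fp + tp)
    recall = tp / (fn + tp)
    f1 = 2 * precision * recall / (precision + recall)
    error = 1 - (tp + tn) / (tp + fn + fp + tn)
    return precision, recall, f1, 1 - error
```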

The disadvantages of the proposed chaotic DOA method are:

• A small dataset is used (165 observations).
• The proposed chaotic DOA based feature selector has a high execution time.

To obtain general results for the nine classifiers, the test of each classifier was repeated 100 times. The obtained results are listed in Table 4. Table 4 shows that the best classifier for HCC survival classification is SNN, with 98.79% classification accuracy, and that all classifiers achieved classification rates higher than 90%. The worst classifiers are QD and LR.

Conclusion

In this work, we presented a novel chaotic optimization method. Optimization methods have been widely used in many areas, one of which is feature selection. The proposed chaotic DOA achieved better results for numerical function optimization, and to exploit this success for feature selection, a novel feature selection method was presented using the chaotic DOA. Survival was predicted with high classification accuracy by using a supervised missing feature completion method and the chaotic DOA based feature selector on the HCC dataset. We implemented our hypothesis and achieved 98.79% classification accuracy (only 2 errors; see Fig. 5). The proposed chaotic DOA based method was also compared to other HCC survival classification methods. According to the comparative results, the proposed method achieved very high classification accuracy, precision, recall and F1 score (see Table 5). These results clearly illustrate that our proposed method is successful and that it can be applied to other datasets for feature selection.

Discussions

Table 4 clearly indicates that the proposed method achieved high results. To illustrate its success, the method was compared to others, and the comparative results are listed in Table 5. As seen in Table 5, the proposed chaotic DOA based method is the best among them: it achieved 6.67% higher accuracy and 7.11% higher F1 rate than the best of the other methods. Moreover, the proposed chaotic DOA based method used nine classifiers, and all of them achieved classification accuracies higher than 90%. All of the classifiers achieved higher accuracy than Santos et al.'s method [11], Sawhney et al.'s method [13] and Książek et al.'s method [12], and DT, SVM, KNN, BT, SD and SNN achieved higher accuracy than all of the compared methods. Only 2 observations were falsely predicted by the SNN based method; the confusion matrix of the best result is shown in Fig. 5. The advantages of the proposed chaotic DOA based method are:

Declaration of Competing Interest The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

• A novel chaotic nature-inspired (root growth) optimization algorithm is presented, and it is used for feature selection.
• The proposed chaotic DOA based method achieved 98.79% classification accuracy (see Table 4). This result clearly shows that the method is successful.
• A robust method is proposed, because the tests were implemented using 10-fold cross-validation.
• The implementation of this method is easy.
• Nine classifiers are used to generalize the results.

References

[1] Maucort-Boulch D, de Martel C, Franceschi S, Plummer M. Fraction and incidence of liver cancer attributable to hepatitis B and C viruses worldwide. Int J Cancer 2018;142:2471–7.
[2] Davis GL, Dempster J, Meler JD, Orr DW, Walberg MW, Brown B, et al. Hepatocellular carcinoma: management of an increasingly common problem. Baylor University Medical Center Proceedings. Taylor & Francis; 2008. p. 266–80.
[3] Gadiparthi C, Yoo ER, Are VS, Charilaou P, Kim D, Cholankeril G, et al. Hepatocellular carcinoma is leading in cancer-related disease burden among hospitalized baby boomers. Ann Hepatol 2019;18:679–84.
[4] Li S, Jiang H, Yao Y-d, Pang W, Sun Q, Kuang L. Structure convolutional extreme learning machine and case-based shape template for HCC nucleus segmentation. Neurocomputing 2018;312:9–26.
[5] Attwa MH, El-Etreby SA. Guide for diagnosis and treatment of hepatocellular carcinoma. World J Hepatol 2015;7:1632.
[6] Kononenko I. Machine learning for medical diagnosis: history, state of the art and perspective. Artif Intell Med 2001;23:89–109.
[7] Das A, Acharya UR, Panda SS, Sabut S. Deep learning based liver cancer detection using watershed transform and Gaussian mixture model techniques. Cognit Syst Res 2019;54:165–75.
[8] Iavarone M, Colombo M. HBV infection and hepatocellular carcinoma. Clin Liver Dis 2013;17:375–97.
[9] DeWaal D, Nogueira V, Terry AR, Patra KC, Jeon S-M, Guzman G, et al. Hexokinase-2 depletion inhibits glycolysis and induces oxidative phosphorylation in hepatocellular carcinoma and sensitizes to metformin. Nat Commun 2018;9:1–14.
[10] Singh S, Singh PP, Roberts LR, Sanchez W. Chemopreventive strategies in hepatocellular carcinoma. Nat Rev Gastroenterol Hepatol 2014;11:45–54.
[11] Santos MS, Abreu PH, García-Laencina PJ, Simão A, Carvalho A. A new cluster-based oversampling method for improving survival prediction of hepatocellular carcinoma patients. J Biomed Inform 2015;58:49–59.
[12] Książek W, Abdar M, Acharya UR, Pławiak P. A novel machine learning approach for early detection of hepatocellular carcinoma patients. Cognit Syst Res 2019;54:116–27.
[13] Sawhney R, Mathur P, Shankar R. A firefly algorithm based wrapper-penalty feature selection method for cancer diagnosis. International Conference on Computational Science and Its Applications. Springer; 2018. p. 438–49.
[14] Mitra S, Saha S, Acharya S. Fusion of stability and multi-objective optimization for solving cancer tissue classification problem. Expert Syst Appl 2018;113:377–96.
[15] Liu S, Liu S, Cai W, Pujol S, Kikinis R, Feng D. Early diagnosis of Alzheimer's disease with deep learning. 2014 IEEE 11th International Symposium on Biomedical Imaging (ISBI). IEEE; 2014. p. 1015–8.
[16] Fa B, Luo C, Tang Z, Yan Y, Zhang Y, Yu Z. Pathway-based biomarker identification

Fig. 5. Confusion matrix of the chaotic DOA based method and SNN results.

9

Medical Hypotheses 139 (2020) 109626

F.B. Demir, et al.

[17] [18]

[19] [20] [21] [22] [23] [24] [25] [26] [27] [28] [29] [30] [31] [32] [33]

with crosstalk analysis for robust prognosis prediction in hepatocellular carcinoma. EBioMedicine. 2019;44:250–60. Chaudhary K, Poirion OB, Lu L, Garmire LX. Deep learning–based multi-omics integration robustly predicts survival in liver cancer. Clin Cancer Res 2018;24:1248–59. Tan J, Ung M, Cheng C, Greene CS. Unsupervised feature construction and knowledge extraction from genome-wide assays of breast cancer with denoising autoencoders. Pacific Symposium on Biocomputing Co-Chairs. World Scientific; 2014. p. 132–43. Chen L, Cai C, Chen V, Lu X. Learning a hierarchical representation of the yeast transcriptomic machinery using an autoencoder model. BMC bioinformatics: BioMed Central; 2016. p. S9. Palazzo M, Beauseroy P, Yankilevich P. Hepatocellular Carcinoma tumor stage classification and gene selection using machine learning models. Electronic J SADIO (EJS). 2019;18:26–42. Dogan S, Turkoglu I. Diagnosing hyperlipidemia using association rules. Math Comput Appl 2008;13:193–202. Özyurt F. A fused CNN model for WBC detection with MRMR feature selection and extreme learning machine. Soft Comput 2019;1–10. Aydemir E, Tuncer T, Dogan S. A Tunable-Q wavelet transform and quadruple symmetric pattern based EEG signal classification method. Med Hypotheses 2020;134:109519. Tuncer T, Dogan S, Pławiak P, Acharya UR. Automated arrhythmia detection using novel hexadecimal local pattern and multilevel wavelet transform with ECG signals. Knowl-Based Syst 2019;186:104923. Tuncer T, Dogan S. A novel octopus based Parkinson’s disease and gender recognition method using vowels. Appl Acoust 2019;155:75–83. Mala GM, Li D. Flow characteristics of water in microtubes. Int J Heat Fluid Flow 1999;20:142–8. Hua Z, Zhou Y, Pun C-M, Chen CP. Image encryption using 2D Logistic-Sine chaotic map. 2014 IEEE International Conference on Systems, Man, and Cybernetics (SMC). IEEE; 2014. p. 3229–34. Hua Z, Jin F, Xu B, Huang H. 2D Logistic-Sine-coupling map for image encryption. 
Signal Process 2018;149:148–61. Tuncer T, Ertam F. Neighborhood component analysis and reliefF based survival recognition methods for Hepatocellular carcinoma. Physica A 2020;540:123143. Shaikhina T, Lowe D, Daga S, Briggs D, Higgins R, Khovanova N. Decision tree and random forest models for outcome prediction in antibody incompatible kidney transplantation. Biomed Signal Process Control 2019;52:456–62. Ibrahim W, Abadeh MS. Protein fold recognition using Deep Kernelized Extreme Learning Machine and linear discriminant analysis. Neural Comput Appl 2019;31:4201–14. Gyamfi KS, Brusey J, Hunt A, Gaura E. A dynamic linear model for heteroscedastic LDA under class imbalance. Neurocomputing. 2019;343:65–75. Desai SD, Giraddi S, Narayankar P, Pudakalakatti NR, Sulegaon S. Back-propagation

[34]

[35] [36] [37]

[38] [39] [40] [41] [42] [43] [44] [45] [46] [47] [48] [49]

10

neural network versus logistic regression in heart disease classification. Advanced Computing and Communication Technologies. Springer; 2019. p. 133–44. Alshamlan H, Badr G, Alohali Y. Gene selection and cancer classification method using artificial bee colony and SVM algorithms (ABC-SVM). Proceedings of the International Conference on Data Engineering 2015 (DaEng-2015). Springer; 2019. p. 575–84. Ayyad SM, Saleh AI, Labib LM. Gene expression cancer classification using modified K-Nearest Neighbors technique. BioSystems. 2019;176:41–51. De'Ath G. Boosted trees for ecological modeling and prediction. Ecology 2007;88:243–51. Allegretta I, Marangoni B, Manzari P, Porfido C, Terzano R, De Pascale O, et al. Macro-classification of meteorites by portable energy dispersive X-ray fluorescence spectroscopy (pED-XRF), principal component analysis (PCA) and machine learning algorithms. Talanta 2020;120785. Ho TK. Nearest neighbors in random subspaces. Joint IAPR. International Workshops on Statistical Techniques in Pattern Recognition (SPR) and Structural and Syntactic Pattern Recognition (SSPR). Springer; 1998. p. 640–8. Mirjalili S, Wang G-G. Coelho LdS. Binary optimization using hybrid particle swarm optimization and gravitational search algorithm. Neural Comput Appl 2014;25:1423–35. Tejani GG, Savsani VJ, Patel VK, Mirjalili S. An improved heat transfer search algorithm for unconstrained optimization problems. J Comput Des Eng 2019;6:13–32. Mirjalili S. Moth-flame optimization algorithm: A novel nature-inspired heuristic paradigm. Knowl-Based Syst 2015;89:228–49. Mirjalili S. SCA: a sine cosine algorithm for solving optimization problems. KnowlBased Syst 2016;96:120–33. Mirjalili SM, Mirjalili SZ, Saremi S, Mirjalili S. Sine cosine algorithm: Theory, literature review, and application in designing bend photonic crystal waveguides. Nature-inspired optimizers. Springer; 2020. p. 201–17. Mirjalili S, Lewis A. The whale optimization algorithm. Adv Eng Softw 2016;95:51–67. 
Saremi S, Mirjalili S, Lewis A. Grasshopper optimisation algorithm: theory and application. Adv Eng Softw 2017;105:30–47. Mirjalili S, Mirjalili SM, Lewis A. Grey wolf optimizer. Adv Eng Softw 2014;69:46–61. Al-Madi N, Faris H, Mirjalili S. Binary multi-verse optimization algorithm for global optimization and discrete problems. Int J Mach Learn Cybern 2019;10:3445–65. Aljarah I, Mafarja M, Heidari AA, Faris H, Mirjalili S. Multi-verse optimizer: theory, literature review, and application in data clustering. Nature-Inspired Optimizers. Springer; 2020. p. 123–41. Hong H, Liu J, Bui DT, Pradhan B, Acharya TD, Pham BT, et al. Landslide susceptibility mapping using J48 Decision Tree with AdaBoost, Bagging and Rotation Forest ensembles in the Guangchang area (China). Catena. 2018;163:399–413.