Boosting Salp Swarm Algorithm by Sine Cosine algorithm and Disrupt Operator for Feature Selection
Journal Pre-proof

Boosting Salp Swarm Algorithm by Sine Cosine algorithm and Disrupt Operator for Feature Selection
Nabil Neggaz, Ahmed A. Ewees, Mohamed Abd Elaziz, Majdi Mafarja

PII: S0957-4174(19)30820-6
DOI: https://doi.org/10.1016/j.eswa.2019.113103
Reference: ESWA 113103
To appear in: Expert Systems With Applications
Received date: 2 May 2019
Revised date: 25 November 2019
Accepted date: 25 November 2019

Please cite this article as: Nabil Neggaz, Ahmed A. Ewees, Mohamed Abd Elaziz, Majdi Mafarja, Boosting Salp Swarm Algorithm by Sine Cosine algorithm and Disrupt Operator for Feature Selection, Expert Systems With Applications (2019), doi: https://doi.org/10.1016/j.eswa.2019.113103.
Highlights
• Propose a novel FS method, called ISSAFD, which improves the Salp Swarm Algorithm (SSA).
• ISSAFD enhances the followers in SSA using SCA and the Disruption Operator (DO).
• Evaluating the influence of the operators of SCA on the behavior of leaders in SSA.
• Comparing the performance of ISSAFD with swarm intelligence (SI) methods.
• The proposed ISSAFD provided better results in terms of performance measures.
Boosting Salp Swarm Algorithm by Sine Cosine algorithm and Disrupt Operator for Feature Selection

Nabil Neggaz (a), Ahmed A. Ewees (b), Mohamed Abd Elaziz (c,e,*), Majdi Mafarja (d)

(a) Université des Sciences et de la Technologie d'Oran Mohamed Boudiaf, USTO-MB, BP 1505, EL M'naouer, 31000 Oran, Algérie; Laboratoire Signal Image PArole (SIMPA), Département d'Informatique, Faculté des Mathématiques et Informatique ([email protected], [email protected])
(b) Department of Computer, Damietta University, Egypt ([email protected], [email protected])
(c) Department of Mathematics, Faculty of Science, Zagazig University, Zagazig, Egypt ([email protected])
(d) Department of Computer Science, Birzeit University, Birzeit, Palestine ([email protected])
(e) School of Computer Science & Technology, Huazhong University of Science and Technology, Wuhan 430074, China

(*) Corresponding author: Mohamed Abd Elaziz ([email protected])
Abstract

Feature Selection (FS) plays an important role in enhancing the performance of machine learning techniques in terms of accuracy and response time. As FS is known to be an NP-hard problem, the aim of this paper is to introduce a new variant of the Salp Swarm Algorithm (SSA) for FS, called ISSAFD (Improved Followers of Salp swarm Algorithm using Sine Cosine algorithm and Disruption operator), which updates the position of the followers (F) in SSA using sinusoidal mathematical functions inspired from the Sine Cosine Algorithm (SCA). This enhancement helps to improve the exploration phase and to avoid stagnation in a local area. Moreover, the Disruption Operator (Dop) is applied to all solutions in order to enhance the population diversity and to maintain the balance between the exploration and exploitation processes. Two other variants of SSA are developed based on SCA, called ISSALD (Improved Leaders of Salp swarm Algorithm using Sine Cosine algorithm and Disruption Operator) and ISSAF (Improved Followers of Salp swarm Algorithm using Sine Cosine algorithm). The updating process in ISSALD consists of updating the leaders' (L) position by SCA and applying the Dop, whereas in ISSAF, the Dop is omitted and the position of the followers is updated by SCA. Experimental results are evaluated on twenty datasets, four of which are high-dimensional with a small number of instances. The obtained results show a good performance of ISSAFD in terms of accuracy, sensitivity, specificity, and the number of selected features in comparison with other metaheuristics (MH).

Keywords: Feature Selection (FS); Disruption Operator (Dop); Salp Swarm Algorithm (SSA); Sine Cosine Algorithm (SCA); Metaheuristics (MH).
1. Introduction

Feature Selection (FS) is a preprocessing step that has proved its efficiency in improving the performance of different learning techniques, in terms of enhancing their quality and reducing the required computational time for learning (Liu & Motoda, 2012). The importance of FS methods is due to the presence of redundant and/or irrelevant features in the datasets, which negatively influence the performance of the learning algorithms. The FS problem can be classified as a search problem, since it aims to find the minimum number of features that represent the original feature set without information loss (Liu & Motoda, 2012). With the advancement of data collection tools, a huge number of features has become available in the datasets of most real-world fields such as medicine, biology, and the telecommunications industry. Thus, analyzing such amounts of data has become impractical. Consequently, searching for the representative features is a time-consuming and complicated process, since an exhaustive search strategy would have to generate all possible feature subsets in order to select only one subset (Talbi, 2009). In recent years, metaheuristic (MH) algorithms have been widely used to tackle different optimization problems, including the FS problem (Silva et al., 2018; Guyon & Elisseeff, 2003). According to (Yang, 2013), metaheuristic algorithms, especially the nature-inspired ones, have proved their ability to outperform traditional and deterministic methods in tackling different optimization problems in science, engineering, and industry. Good examples of those algorithms are the ones categorized as Swarm Intelligence (SI), such as Particle Swarm Optimization (PSO) (Eberhart & Kennedy, 1995), Ant Colony Optimization (ACO) (Dorigo et al., 1996), Salp Swarm Algorithm (SSA) (Mirjalili, 2015), and Sine Cosine Algorithm (SCA) (Mirjalili, 2016).
Besides the search strategies that can be employed to determine the representative features, evaluating the selected feature subsets is another aspect of the FS process. Filter, wrapper, and embedded methods are the three categories of FS methods based on the subset evaluation criteria; more details about the three models can be found in (Liu & Motoda, 2012). Recently, wrapper approaches have attracted the attention of many researchers in the literature, due to the involvement of the learning algorithm in the selection process. Hence, the selection of a feature is based on the resulting performance of the learning algorithm (e.g., the classification accuracy of a specific classifier) (Kohavi & John, 1997). Different classification techniques have been widely used in FS methods, for example, K-Nearest Neighbor (KNN), Decision Tree (DT), and Artificial Neural Networks (ANN).

In this paper, an alternative FS approach, called ISSAFD, is proposed, which depends on improving the performance of the salp swarm algorithm using the sine cosine algorithm and the disruption operator (Dop). In ISSAFD, the process of updating the population incorporates two strategies. The first one aims to improve the leaders by employing the standard operators of the SSA, while the second strategy updates the position of the followers using the sine/cosine operators taken from the SCA. This cooperation enhances the convergence ability. Meanwhile, the Dop produces several diversified solutions in the search space, which is necessary for such an algorithm. Then, the best solution, which has the smallest fitness value, is retained. This process is repeated for a certain number of iterations in order to reach the global optimum or a near-optimal solution. The major contributions of this paper are as follows:

• Introducing a novel algorithm for FS, called ISSAFD, that combines the merits of SSA, SCA, and the DO, with the purpose of enhancing the behavior of the followers in SSA.

• Evaluating the influence of the operators of SCA on the behavior of the leaders in SSA.

• Comparing the performance of ISSAFD with swarm intelligence (SI) algorithms such as SCA, SSA, Grey Wolf Optimizer (GWO), Ant Lion Optimization (ALO), Particle Swarm Optimization (PSO), and a bioinspired method known as the Genetic Algorithm (GA).
Furthermore, a fair comparison is conducted with some works from the literature in terms of accuracy and the number of selected features.
The rest of the paper is organized as follows: the recent FS approaches in the literature are presented in Section 2, followed by a description of the used algorithms in Section 3. Section 4 describes the details of the proposed approach (ISSAFD). The setup of all experiments conducted in this paper, along with the obtained results, is discussed in Section 5. Finally, the conclusion and future directions are drawn in Section 6.
2. Related Works
Recently, various SI algorithms have been utilized as search strategies in different wrapper FS methods (Faris et al., 2018; Hafez et al., 2016; Mafarja & Mirjalili, 2018; Emary et al., 2016a; Ibrahim et al., 2017; Zhang et al., 2018; Ghimatgar et al., 2018). PSO, as a primary SI algorithm, has been widely used in FS methods; (Mafarja et al.) and (Mafarja & Sabar) are two recent approaches that employ variants of the PSO algorithm as search strategies in wrapper FS approaches. Recently, a hybrid approach between PSO and the Shuffled Frog Leaping Algorithm (SFLA) was proposed in (Rajamohana & Umamaheswari, 2018) to improve the accuracy of fake review identification. Moreover, different evolutionary algorithms (EAs) (e.g., Genetic Algorithms (GA) and Differential Evolution (DE)) have been widely employed as search strategies in different FS approaches in order to determine the best subset of features (Dong et al., 2018; Hancer et al., 2018; Lensen et al., 2018; Elaziz et al., 2019). In recent years, interest in cooperative metaheuristics (CMH) has risen, and many approaches have been proposed in the literature; their results have been competitive in solving many optimization problems, including FS (Silva et al., 2018; Elaziz et al., 2017b). (Tawhid & Dsouza, 2018) proposed a novel synergy between the Bat Algorithm (BA) and PSO, called the Hybrid Binary Bat Enhanced Particle Swarm Optimization Algorithm (HBBEPSO), for FS. In HBBEPSO, the exploration capabilities of BA were combined with those of PSO, producing a new approach that is able to converge to the best global solution in the search space. In (Chen et al., 2018), an enhanced PSO approach with two crossover operators was proposed to tackle the FS problem. The ACO algorithm has also been applied in many FS methods. For instance, (Shunmugapriya & Kanmani, 2017) proposed an FS approach that combines the characteristics of ACO with the Artificial Bee Colony (ABC), called AC-ABC, to enhance the search process.
A Binary Butterfly Optimization Algorithm (BOA) based FS approach has been proposed by (Arora & Anand, 2019). In (Mafarja et al., 2019), three variants of the Binary Grasshopper Optimization Algorithm (BGOA) were proposed, known as BGOA using a sigmoid function (BGOA-S), BGOA using a V-shaped function (BGOA-V), and BGOA based on mutation (BGOA-M). In these approaches, the crossover and mutation operators from the GA were employed to enhance the performance of the GOA algorithms, and the results were promising and better than the basic approaches. Another GOA based FS approach was proposed in (Zakeri & Hokmabadi, 2019). The SSA is a recent metaheuristic algorithm that mimics the behavior of salps in the deep oceans. It has been used as a search strategy in many FS approaches (Aljarah et al., 2018; Faris et al., 2018), and the experimental results in both works proved the ability of the SSA to outperform other optimizers. Moreover, another SSA based FS method was proposed in (Ahmed et al., 2018), where a set of chaotic maps was used to control the balance between exploration and exploitation in the SSA algorithm. (Sayed et al., 2018) proposed a chaotic SSA for global optimization and feature selection. A wrapper FS method based on a mutation operator and SSA, called SSAMUT, was created in (Khamees et al., 2018). In addition, (Ibrahim et al., 2019) proposed a new hybridization between SSA and PSO, called SSAPSO, which enhances the efficiency of the exploration and exploitation steps. Furthermore, (Baliarsingh et al., 2019) proposed a weighted chaotic SSA, named WCSSA, for genomic high-dimensional data; this algorithm seeks simultaneously the optimal gene selection and the kernel parameters of the extreme learning machine (ELM). Other inspirations based on the sine and cosine functions have also been developed for FS. For example, (Sindhu et al., 2017) proposed a novel FS method based on an improved SCA variant called ICSA. In ICSA, an elitism strategy is used to select the global solution, and a new updating mechanism for the new solution was proposed. Like other global optimization algorithms, the SCA suffers from stagnation in local optima. To overcome this drawback, (Elaziz et al., 2017b) proposed a hybrid model between the SCA and the differential evolution operators, which served as local search methods; this model helps the SCA algorithm avoid local optima. As can be concluded from the previous studies, both the SSA and SCA algorithms suffer from stagnation in local optima and low convergence rates.
3. Background

In this section, the general concepts of the sine cosine algorithm (SCA), the salp swarm algorithm (SSA), and the Dop are described.

3.1. Sine cosine algorithm

The SCA is a recent method that belongs to the class of population-based optimization techniques, introduced by (Mirjalili, 2016). The particularity of this algorithm lies in the movement of the search agents, which uses two mathematical operators based on the sine and cosine functions as in Eqs. (1) and (2), respectively:

$$X_{ij}^{t+1} = X_{ij}^{t} + r_1 \times \sin(r_2) \times \left| r_3 XBest_j^{t} - X_{ij}^{t} \right| \quad \text{if } r_4 < 0.5 \tag{1}$$

$$X_{ij}^{t+1} = X_{ij}^{t} + r_1 \times \cos(r_2) \times \left| r_3 XBest_j^{t} - X_{ij}^{t} \right| \quad \text{if } r_4 \geq 0.5 \tag{2}$$

where $XBest_j^{t}$ is the target solution in the $j$th dimension at iteration $t$, $X_{ij}^{t}$ is the current solution in the $j$th dimension at iteration $t$, and $|\cdot|$ indicates the absolute value. $r_1$, $r_2$, $r_3$, and $r_4$ are random numbers. The parameter $r_1$ controls the balance between exploration and exploitation; it is modified during the iterations using the following formula:

$$r_1 = a - t \frac{a}{T} \tag{3}$$

where $t$ is the current iteration, $T$ is the maximum number of iterations, and $a$ is a constant. $r_2$ determines the direction of the movement of the next solution, whether towards or away from the target. $r_3$ is a weight for the best solution that stochastically emphasizes ($r_3 > 1$) or de-emphasizes ($r_3 < 1$) the effect of the destination in defining the distance (Elaziz et al., 2017a). The parameter $r_4$ switches between the sine and cosine operators of Eqs. (1) and (2). The general framework of the SCA is depicted in Algorithm 1.
Algorithm 1: Sine cosine algorithm (SCA)
1: Initialize N solutions.
2: Set the initial iteration number t := 0.
3: repeat
4:   Evaluate each solution and determine the best solution.
5:   Update the random parameters r1, r2, r3, and r4.
6:   Update the position of the search agents using Eqs. (1) and (2).
7:   Set t = t + 1.
8: until (t >= T)
9: Return the best solution.
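As an illustration, the following minimal Python sketch implements one iteration of this update for a real-valued population; it is not taken from the paper. The sampling ranges r2 in [0, 2π] and r3 in [0, 2] follow the original SCA paper and are assumptions here, since the text above only describes them as random numbers.

```python
import numpy as np

def sca_update(X, X_best, t, T, a=2.0):
    """One SCA iteration over population X (N x dim), per Eqs. (1)-(3)."""
    N, dim = X.shape
    r1 = a - t * (a / T)                       # Eq. (3): decays linearly to 0
    for i in range(N):
        for j in range(dim):
            r2 = 2 * np.pi * np.random.rand()  # movement direction
            r3 = 2 * np.random.rand()          # weight of the best solution
            if np.random.rand() < 0.5:         # r4 switches sine/cosine, Eq. (1)
                X[i, j] += r1 * np.sin(r2) * abs(r3 * X_best[j] - X[i, j])
            else:                              # Eq. (2)
                X[i, j] += r1 * np.cos(r2) * abs(r3 * X_best[j] - X[i, j])
    return X
```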
3.2. Salp Swarm Algorithm

The SSA is a recent swarm intelligence (SI) algorithm that was developed by (Mirjalili et al., 2017). The principal idea behind the operators of the SSA is that they imitate the swarming behavior of salps in deep oceans. Salps belong to the family Salpidae and have a transparent, barrel-shaped body. They are similar to jellyfish in their tissues and movement; furthermore, they move by pumping water through their body as propulsion to move forward (Anderson & Bone, 1980). Together, salps form a swarm known as the salp chain when navigating in the oceans, as shown in Figure 1 (image source: www.alimirjalili.com/SSA.html).
Figure 1: Individual salp.
The salp chain behavior has been modeled mathematically by dividing the population into two groups: the leader and the followers. The salp at the front of the chain is considered the leader, while the remaining salps are known as followers, as shown in Figure 2.
Figure 2: The behavior of a natural salp swarm.
The role of the leader is to guide the swarm of salps, and each follower follows the preceding one. Similar to other SI algorithms, the process of SSA starts by initializing a random population of salps and then evaluating the fitness of each salp. The salp with the best fitness value is denoted as the leader salp, while the other salps are denoted as followers. The best performing salp is also denoted as the food source to be chased by the salp chain. To update the position of the salp chain, two main phases are distinguished: the leader phase and the followers phase.

3.2.1. Leader phase

The position of the leader is updated using Eq. (4) as follows:

$$X_j^{1} = \begin{cases} XBest_j + c_1\left((ub_j - lb_j)c_2 + lb_j\right) & \text{if } c_3 \geq 0.5 \\ XBest_j - c_1\left((ub_j - lb_j)c_2 + lb_j\right) & \text{otherwise} \end{cases} \tag{4}$$

where $X_j^{1}$ and $XBest_j$ represent the new position of the leader and the food source in the $j$th dimension, and $ub_j$ and $lb_j$ represent the upper and lower bounds of the $j$th dimension, respectively. $c_2$ and $c_3$ are randomly generated numbers in the interval [0, 1]. The parameter $c_1$ is a significant factor in SSA, which controls the balance between exploration and exploitation.
Furthermore, $c_1$ decreases gradually over the course of iterations, as shown in Eq. (5):

$$c_1 = 2e^{-\left(\frac{4t}{T}\right)^2} \tag{5}$$
where $t$ indicates the current iteration and $T$ is the maximum number of iterations.

3.2.2. Followers phase

To update the position of the followers, a new concept is introduced, based on Newton's law of motion, as in Eq. (6):

$$X_j^{i} = \frac{1}{2}gt^2 + \omega_0 t, \quad i \geq 2 \tag{6}$$

where $X_j^{i}$ represents the position of the $i$th follower salp in the $j$th dimension. In the optimization process, the time $t$ corresponds to the current iteration, and $g$ and $\omega_0$ indicate the acceleration and the initial velocity, respectively. In Eq. (6), the initial speed $\omega_0$ is fixed to 0 and the time increment between iterations is fixed to 1 ($\Delta t = 1$), so the updating process of the followers can be expressed as in Eq. (7):

$$X_j^{i} = \frac{1}{2}\left(X_j^{i} + X_j^{i-1}\right) \tag{7}$$

The pseudo-code of the SSA is shown in Algorithm 2.
Algorithm 2: Salp swarm algorithm (SSA)
1: Initialize the population size N and the maximum number of iterations T.
2: Set the initial iteration number t := 0.
3: Generate the initial population X, which contains N solutions.
4: Evaluate the fitness function of all individuals in X.
5: Denote the best solution in the population as XBest.
6: repeat
7:   Update c1 according to Eq. (5).
8:   for i = 1 to N do
9:     if (Xi is a leader) then
10:      Update the position of the leader salp as in Eq. (4).
11:    else
12:      Update the position of the follower salp as in Eq. (7).
13:    end if
14:  end for
15:  Set t = t + 1.
16: until (t >= T)
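The following Python sketch mirrors one iteration of Algorithm 2; it is an illustration rather than the authors' code. Treating the first half of the population as leaders follows the convention stated later in Section 4.3, and clipping positions back into [lb, ub] is an implementation assumption.

```python
import numpy as np

def ssa_update(X, X_best, t, T, lb, ub):
    """One SSA iteration: leaders via Eq. (4), followers via Eq. (7)."""
    N, dim = X.shape
    c1 = 2 * np.exp(-(4.0 * t / T) ** 2)       # Eq. (5)
    for i in range(N // 2):                    # leader phase, Eq. (4)
        c2 = np.random.rand(dim)
        c3 = np.random.rand(dim)
        step = c1 * ((ub - lb) * c2 + lb)
        X[i] = np.where(c3 >= 0.5, X_best + step, X_best - step)
    for i in range(N // 2, N):                 # follower phase, Eq. (7)
        X[i] = 0.5 * (X[i] + X[i - 1])
    return np.clip(X, lb, ub)
```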
4. Proposed ISSAFD method

The framework of the proposed feature selection method is given in Figure 3. In general, the proposed ISSAFD method depends on improving the behavior of the SSA by using the operators of the SCA and the Dop operator, where each operator has its own task. The aim of using the SCA is to improve the exploration ability of the followers, instead of the updating mechanism used in the traditional SSA, since the followers have the largest effect on the convergence of solutions toward the global solution. Meanwhile, the Dop is used to enhance the diversification of the whole population after updating it using the SSA or SCA operators, which leads to improving the exploration ability of the ISSAFD and its convergence to the optimal solution. The ISSAFD consists of four stages, which are given in detail in the following sections.
Figure 3: The Framework of the proposed ISSAFD method.
4.1. Initial stage

The ISSAFD begins by generating a population that contains a set of N individuals, where each individual represents a solution for the given optimization problem. The population X is generated using the following equation:

$$X_i = lb_i + \alpha_i \times (ub_i - lb_i), \quad i = 1, 2, \ldots, N \tag{9}$$
where $X_i$ is the $i$th solution belonging to X and $\alpha_i \in [0, 1]$ is a random number. $lb_i$ and $ub_i$ represent the lower and upper boundaries of the given problem; in this study, $lb_i = 0$ and $ub_i = 1$. In addition, each solution $X_i$ must be converted into a binary solution $BX_i$ using Eq. (10):

$$BX_{ij} = \begin{cases} 1 & \text{if } X_{ij} > 0.5 \\ 0 & \text{otherwise} \end{cases} \tag{10}$$

To make the definition of BX clearer, consider a solution $X_i$ with five elements given as $X_i = [0.81, 0.23, 0.12, 0.53, 0.91]$; then $BX_i = [1, 0, 0, 1, 1]$. This means that when $BX_i$ is applied to the features of the given dataset, the second and third features are considered irrelevant, while the others are relevant and must be selected.
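As a short sketch, Eqs. (9) and (10) and the worked example above can be reproduced in a few lines of Python (NumPy assumed):

```python
import numpy as np

def init_population(N, dim, lb=0.0, ub=1.0):
    """Eq. (9): continuous population sampled uniformly in [lb, ub]."""
    return lb + np.random.rand(N, dim) * (ub - lb)

def binarize(X):
    """Eq. (10): a feature is selected when its component exceeds 0.5."""
    return (X > 0.5).astype(int)

Xi = np.array([0.81, 0.23, 0.12, 0.53, 0.91])
print(binarize(Xi))  # -> [1 0 0 1 1], i.e., features 2 and 3 are dropped
```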
4.2. Evaluating stage

This stage starts by evaluating the quality of each solution $X_i$ by computing the objective function. The objective value of the $i$th solution is computed using Eq. (11):

$$Fit_i = \lambda \times \gamma_i + \mu \times \left(\frac{|BX_i|}{Dim}\right) \tag{11}$$

where $\lambda \in [0, 1]$ and $\mu = 1 - \lambda$. $\lambda$ represents an equalization factor used to balance the classification error rate $\gamma_i$ against the number of selected features $|BX_i|$, and $Dim$ is the total number of features. In this study, the KNN classifier is used as an evaluator during the FS process; it was selected due to its simplicity, its ease of implementation, and because it requires few parameters. In addition, $\gamma_i$ represents the error rate on the testing set, where the dataset is divided randomly into two parts: a training set containing 80% of the total dataset and a testing set containing the remaining 20%.
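A minimal sketch of Eq. (11) is given below, using scikit-learn's KNN (K = 5, Euclidean distance) and the 80/20 hold-out split described above. Returning the worst possible fitness for an empty feature subset is a guard added here, not something stated in the paper.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

def fitness(Xi, data, labels, lam=0.99):
    """Eq. (11): Fit_i = lam * gamma_i + (1 - lam) * |BX_i| / Dim."""
    mask = Xi > 0.5                              # Eq. (10) binarization
    if not mask.any():
        return 1.0                               # guard: no features selected
    X_tr, X_te, y_tr, y_te = train_test_split(
        data[:, mask], labels, test_size=0.2)    # 80% train / 20% test
    knn = KNeighborsClassifier(n_neighbors=5, metric="euclidean")
    knn.fit(X_tr, y_tr)
    gamma = 1.0 - knn.score(X_te, y_te)          # classification error rate
    return lam * gamma + (1.0 - lam) * mask.sum() / data.shape[1]
```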
4.3. Updating stage

In this stage, the solution with the best (i.e., smallest) objective value among all solutions is denoted as the best solution $XBest$. Then the population X is split into two halves, as in the traditional SSA. The first half of the population represents the leaders, which are updated using the operators of the SSA as in Eq. (4), whereas the second half of the population, which represents the followers, is updated using the operators of the SCA as defined in Eqs. (1) and (2). Thereafter, the Dop is used to update the whole population X in order to maintain its diversity. However, in order to decrease the computational time of this stage, the Dop is applied as in Eq. (12):

$$X = \begin{cases} X \times D_{op} & \text{if } \alpha_o > 0.5 \\ X & \text{otherwise} \end{cases} \tag{12}$$
Eq. (12) states that the Dop is applied to X only when the random number $\alpha_o \in [0, 1]$ is greater than 0.5; otherwise, it is not used.
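One ISSAFD updating stage can be sketched as below: SSA leaders (Eq. (4)), SCA followers (Eqs. (1)-(2)), and the probabilistic disruption step of Eq. (12). The disruption operator is left as a placeholder callable dop, since its definition is not reproduced in this excerpt; the per-dimension random draws and the final clipping are implementation assumptions.

```python
import numpy as np

def issafd_update(X, X_best, t, T, lb, ub, dop):
    """One ISSAFD iteration: SSA leaders, SCA followers, optional disruption."""
    N, dim = X.shape
    c1 = 2 * np.exp(-(4.0 * t / T) ** 2)            # Eq. (5)
    r1 = 2.0 - t * (2.0 / T)                        # Eq. (3) with a = 2
    for i in range(N // 2):                         # leaders: SSA, Eq. (4)
        c2, c3 = np.random.rand(dim), np.random.rand(dim)
        step = c1 * ((ub - lb) * c2 + lb)
        X[i] = np.where(c3 >= 0.5, X_best + step, X_best - step)
    for i in range(N // 2, N):                      # followers: SCA, Eqs. (1)-(2)
        r2 = 2 * np.pi * np.random.rand(dim)
        r3 = 2 * np.random.rand(dim)
        trig = np.where(np.random.rand(dim) < 0.5, np.sin(r2), np.cos(r2))
        X[i] = X[i] + r1 * trig * np.abs(r3 * X_best - X[i])
    if np.random.rand() > 0.5:                      # Eq. (12): apply Dop
        X = X * dop(X)                              # dop returns disruption factors
    return np.clip(X, lb, ub)
```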
4.4. Terminal stage

The steps of the evaluating and updating stages are repeated until the termination condition is met. In this study, this condition is reaching the maximum number of iterations, which is used to assess the ability of the proposed method to find the optimal subset of features within the specified number of iterations.

4.5. Computational complexity

The computational complexity of ISSAFD depends on the complexity of the SSA, the SCA, and the disruption operator (DO). Therefore, the complexity of the proposed method is given as:

$$O(ISSAFD) = K_s O(SSA) + (N - K_s) O(SCA) + O(DO) \tag{13}$$

where

$$O(SSA) = O\left(t\left(Dim \times N + C \times N + N \log N\right)\right)$$
$$O(SCA) = O\left(t\left(Dim \times N + C \times N\right)\right)$$
$$O(DO) = O(t \times N)$$

where $t$ represents the number of iterations, $Dim$ indicates the number of variables, $C$ is the cost of the objective function, $N$ represents the number of solutions, and $K_s$ represents the number of solutions updated using the SSA.
5. Experimental evaluation and discussion

In this section, the performance of the proposed approach is compared with other feature selection methods. In addition, the proposed ISSAFD is compared with two other variants of SSA that depend on the SCA and the Disruption Operator. The first variant uses the SCA to update the leaders instead of the followers and applies the Dop; this method is called ISSALD. The second variant, called ISSAF, uses only the SCA to update the followers, without applying the Dop.

5.1. Datasets and parameter setup

In order to validate the efficiency of the proposed ISSAFD algorithm, twenty datasets with varying dimensionality (i.e., two categories, defined as low and high dimensionality) were utilized. The first category is available online at the UCI repository (Frank, 2010), whereas the second category is taken from (Mafarja & Mirjalili, 2018). Table 1 describes the used datasets in terms of the number of features, the number of instances, and the number of classes. These datasets belong to several fields (i.e., biology, games, physics, and biomedicine) and cover different sizes and dimensions. In order to validate the efficiency of ISSAFD, some parameters and strategies are required. We define the classification strategy and the type of classifier. As a classification strategy, we used the hold-out strategy, which consists of randomly dividing the dataset into two parts: 80% for the training set and 20% for the testing set. All experiments were repeated for 30 independent runs to obtain statistically meaningful results. Moreover, we consider KNN as a classifier with a Euclidean distance metric (K = 5). For a robust comparison, other evolutionary feature selection algorithms, such as GA, PSO, ALO, SCA, SSA, and GWO, were tested using the same parameter settings: for all algorithms, the population size is set to 10 and the maximum number of iterations is fixed to 100. The dimension of all algorithms equals the number of features in the original dataset. Table 2 describes the parameter settings of all algorithms.

5.2. Performance measures

In order to evaluate the performance of the proposed method (ISSAFD), some measures should be defined. Table 3 shows the confusion matrix (CM), which helps to evaluate the performance of the classifier in terms of Accuracy, Sensitivity, and Specificity.
Table 1: Low and high dimensionality datasets description

| Datasets | Number of features | Number of instances | Number of classes | Data category |
|---|---|---|---|---|
| Exactly | 13 | 1000 | 2 | Low dimensionality |
| Exactly2 | 13 | 1000 | 2 | Low dimensionality |
| HeartEW | 13 | 270 | 2 | Low dimensionality |
| Lymphography | 18 | 148 | 2 | Low dimensionality |
| M-of-n | 13 | 1000 | 2 | Low dimensionality |
| PenglungEW | 325 | 73 | 2 | Low dimensionality |
| SonarEW | 60 | 208 | 2 | Low dimensionality |
| SpectEW | 22 | 267 | 2 | Low dimensionality |
| CongressEW | 16 | 435 | 2 | Low dimensionality |
| IonosphereEW | 34 | 351 | 2 | Low dimensionality |
| KrvskpEW | 36 | 3196 | 2 | Low dimensionality |
| Vote | 16 | 300 | 2 | Low dimensionality |
| WaveformEW | 40 | 5000 | 3 | Low dimensionality |
| WineEW | 13 | 178 | 3 | Low dimensionality |
| Zoo | 16 | 101 | 6 | Low dimensionality |
| BreastEW | 30 | 569 | 2 | Low dimensionality |
| Brain_Tumors 2 | 10367 | 50 | 4 | High dimensionality |
| 9_Tumors | 5726 | 60 | 9 | High dimensionality |
| Leukemia 2 | 11225 | 72 | 3 | High dimensionality |
| Prostrate Tumors | 10509 | 102 | 2 | High dimensionality |
Table 2: Parameter settings

| Parameter | Value |
|---|---|
| Size of the population | N = 10 |
| Maximum number of iterations | T = 100 |
| Dimension | Number of features |
| Number of runs | Nr = 30 |
| λ in the fitness function | 0.99 |
| µ in the fitness function | 0.01 |
| a in GWO and SCA | Decreases linearly from 2 to 0 |
| c1 and c2 in PSO | c1 = c2 = 2 |
| wmax and wmin in PSO | wmax = 0.9 and wmin = 0.2 |
| Crossover probability in GA | Pc = 0.7 |
| Mutation probability in GA | Pm = 0.2 |
| Elitism selection in GA | Rate = 0.8 |
Table 3: Confusion matrix

| Actual class | Predicted positive | Predicted negative |
|---|---|---|
| Positive | True Positive (TP) | False Negative (FN) |
| Negative | False Positive (FP) | True Negative (TN) |
• Average accuracy (AVG_Acc): The accuracy metric represents the rate of correctly classified data (see Eq. (14)):

$$Accuracy = \frac{TP + TN}{TP + FN + FP + TN} \tag{14}$$

In this study, each algorithm is executed 30 times ($N_r = 30$), so the AVG_Acc metric is calculated as in Eq. (15):

$$AVG_{Acc} = \frac{1}{N_r}\sum_{k=1}^{N_r} Acc_{Best}^{k} \tag{15}$$

• Average sensitivity (AVG_Sens): The sensitivity measure, also called the true positive rate (TPR), indicates the percentage of correctly predicted positive patterns (see Eq. (16)):

$$Sensitivity = \frac{TP}{TP + FN} \tag{16}$$

The AVG_Sens metric is computed from the selected features of the best solution as in Eq. (17):

$$AVG_{Sens} = \frac{1}{N_r}\sum_{k=1}^{N_r} Sens_{Best}^{k} \tag{17}$$

• Average specificity (AVG_Spec): The specificity metric, also known as the true negative rate (TNR), represents the percentage of correctly predicted negative patterns. It is computed by Eq. (18):

$$Specificity = \frac{TN}{FP + TN} \tag{18}$$

The AVG_Spec measure is computed as in the following equation:

$$AVG_{Spec} = \frac{1}{N_r}\sum_{k=1}^{N_r} Spec_{Best}^{k} \tag{19}$$

• Average fitness value (AVG_Fit): The fitness value evaluates the performance of the algorithms, relating the minimization of the classification error rate to the reduction of the selection ratio, as in Eq. (11); its average is expressed as in Eq. (20):

$$AVG_{Fit} = \frac{1}{N_r}\sum_{k=1}^{N_r} Fit_{Best}^{k} \tag{20}$$

• Average number of selected features (AVG_|BX_Best|): This measure evaluates the ability of an algorithm to reduce the number of features of a given dataset over all independent runs. It is calculated as in Eq. (21):

$$AVG_{|BX_{Best}|} = \frac{1}{N_r}\sum_{k=1}^{N_r} \left|BX_{Best}^{k}\right| \tag{21}$$

where $|BX_{Best}^{k}|$ is the cardinality of the selected features of the best solution of the $k$th run.

• Average computation time (AVG_Time): It calculates the average CPU time (in seconds) of each algorithm, as in the following equation:

$$AVG_{Time} = \frac{1}{N_r}\sum_{k=1}^{N_r} Time_{Best}^{k} \tag{22}$$

• Standard deviation (STD): It is used for evaluating the quality of each algorithm and analyzing the obtained results over the different executions. Eq. (23) is used to apply this measure:

$$STD_Y = \sqrt{\frac{1}{N_r}\sum_{k=1}^{N_r}\left(Y_{Best}^{k} - AVG_Y\right)^2} \tag{23}$$

Note: $STD_Y$ is calculated for all measures: Accuracy, Fitness, Time, Number of selected features, Sensitivity, and Specificity.
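For reference, the point metrics of Eqs. (14), (16), and (18) and the averaging of Eqs. (15)-(23) reduce to a few lines of Python:

```python
import numpy as np

def classification_metrics(tp, tn, fp, fn):
    """Eqs. (14), (16), (18) from the confusion matrix of Table 3."""
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (fp + tn)   # true negative rate
    return accuracy, sensitivity, specificity

def avg_and_std(per_run_values):
    """AVG and STD over the Nr = 30 runs, as in Eqs. (15) and (23)."""
    v = np.asarray(per_run_values, dtype=float)
    return v.mean(), v.std()       # np.std divides by Nr, matching Eq. (23)
```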
5.3. The effect of λ and µ on the fitness function

The main objective in the FS problem is to find a trade-off between minimizing the selection ratio and maximizing the classification accuracy; thus, a good balance between these two contradictory objectives should be maintained. Here we are interested in studying the influence of the λ and µ parameters in Eq. (11), which control this balance. In this study, the Leukemia2 dataset was used, since it represents a real challenge in the field of feature selection (Moayedikia et al., 2017). This dataset contains a small number of instances with a large dimensionality (72 instances with 11225 attributes). The Leukemia2 dataset thus has a certain peculiarity for FS, and it is used in these experiments for two reasons: it contains a large number of features (11225), and it is sensitive in comparison with the other datasets. Different values for λ and µ were used to measure the classification accuracy, the number of selected features, and the fitness value. Table 4 presents the averages of the classification accuracy, the number of selected features, and the fitness values for different values of λ and µ. It can be seen that when the values of λ and µ change, the accuracy rate, the number of selected features, and the fitness values change as well. As the value of λ increases, the accuracy also increases, while the fitness and the number of selected features decrease, which is the purpose of this study. By analyzing Table 4, the best values of λ and µ are set to 0.99 and 0.01, respectively, which confirms the results obtained in the works of (Mafarja et al., 2018; Faris et al., 2018; Emary et al., 2018).

Table 4: The influence of λ and µ for the Leukemia2 dataset

| λ | µ | AVG_Acc | AVG number of selected features | AVG_Fit |
|---|---|---|---|---|
| 0.5 | 0.5 | 0.95 | 9462.675 | 0.254 |
| 0.7 | 0.3 | 0.97 | 5343.1 | 0.1709 |
| 0.9 | 0.1 | 0.99 | 5331.875 | 0.0048 |
| 0.99 | 0.01 | 1 | 5320.65 | 0.0047 |
5.4. Comparison of ISSAFD with ISSAF and ISSALD

This subsection compares the performance of the proposed ISSAFD method with the two other variants, ISSAF and ISSALD, where the ISSALD method modifies the position of the leaders in SSA by the sine/cosine equations, while ISSAF employs the same process used by ISSAFD but without applying the disruption operator.

• In terms of accuracy: According to Table 5, the ISSAFD method achieved the highest accuracy in 15 datasets out of 20, and it performed similarly to ISSAF and ISSALD in 2 and 3 datasets, respectively. The ISSAF achieved the best values in 2 datasets and ranked second among the proposed approaches, whereas the ISSALD took the last rank. Moreover, the ISSAFD also achieved the smallest STD value in 14 datasets, which proves the robustness of the algorithm and its ability to search the promising regions of the search space.

• In terms of fitness: As can be seen in Table 6, the ISSAFD method showed good behavior in reaching the minimum fitness values in 13 datasets out of 20, followed by the ISSAF with 4 datasets, whereas the ISSALD obtained the best fitness value in only 3 datasets. Also, the ISSAFD was the most stable method in terms of the STD values.

• In terms of the number of selected attributes: Inspecting the results in Table 7, the ISSAFD method was able to select the most significant features in 11 datasets out of 20, and it performed like the ISSALD method in one dataset (namely, the Exactly2 dataset). The ISSALD method comes in second place by achieving the best results in 6 datasets, whereas the ISSAF comes in last place, achieving the best results in 4 datasets.

• In terms of sensitivity and specificity: The ISSAFD method ranked first, with the best sensitivity in 7 datasets, and it performed similarly to the ISSAF and ISSALD methods in 2 and 4 datasets, respectively. The ISSAF ranked second with 6 datasets, followed by ISSALD with 5 datasets. From Table 8, the ISSAFD was the most stable method in terms of the STD values.
Table 5: Comparison of ISSAFD with the other variants ISSAF and ISSALD in terms of accuracy; cells report AVG (STD), and the last row reports wins/ties/losses (W/T/L)

| Datasets | ISSAFD | ISSAF | ISSALD |
|---|---|---|---|
| Exactly | 0.9803 (0.0440) | 0.9932 (0.0182) | 0.7977 (0.1180) |
| Exactly2 | 0.8100 (0.0000) | 0.7553 (0.0079) | 0.7800 (0.0000) |
| HeartEW | 0.9056 (0.0164) | 0.8667 (0.0171) | 0.8272 (0.0190) |
| Lymphography | 0.9717 (0.0156) | 0.8870 (0.0241) | 0.8778 (0.0331) |
| M-of-n | 0.9875 (0.0277) | 0.9980 (0.0048) | 0.8973 (0.0786) |
| PenglungEW | 1.0000 (0.0000) | 0.9356 (0.0122) | 1.0000 (0.0000) |
| SonarEW | 0.9968 (0.0082) | 0.9540 (0.0187) | 0.9675 (0.0171) |
| SpectEW | 0.9389 (0.0121) | 0.8586 (0.0150) | 0.8815 (0.0173) |
| CongressEW | 1.0000 (0.0000) | 0.9870 (0.0078) | 0.9628 (0.0120) |
| IonosphereEW | 0.9850 (0.0073) | 0.9540 (0.0148) | 0.9620 (0.0149) |
| KrvskpEW | 0.9742 (0.0057) | 0.9730 (0.0115) | 0.9442 (0.0147) |
| Vote | 0.9806 (0.0063) | 0.9694 (0.0088) | 0.9561 (0.0148) |
| WaveformEW | 0.7636 (0.0132) | 0.7505 (0.0118) | 0.7249 (0.0130) |
| WineEW | 1.0000 (0.0000) | 0.9944 (0.0113) | 0.9806 (0.0208) |
| Zoo | 1.0000 (0.0000) | 1.0000 (0.0000) | 1.0000 (0.0000) |
| BreastEW | 0.9851 (0.0070) | 0.9675 (0.0124) | 0.9675 (0.0066) |
| Brain_Tumor2 | 1.0000 (0.0000) | 0.5933 (0.0583) | 0.9900 (0.0305) |
| 9_Tumors | 1.0000 (0.0000) | 0.8495 (0.0636) | 0.8883 (0.0826) |
| Leukemia2 | 1.0000 (0.0000) | 1.0000 (0.0000) | 1.0000 (0.0000) |
| Prostrate Tumors | 0.9857 (0.0222) | 0.9805 (0.0194) | 0.9635 (0.0271) |
| Ranking (W/T/L) | 18/2/2 | 4/2/18 | 3/3/17 |
Table 6: Comparison of ISSAFD with the other variants ISSAF and ISSALD in terms of fitness; cells report AVG (STD), and the last row reports wins/ties/losses (W/T/L)

| Datasets | ISSAFD | ISSAF | ISSALD |
|---|---|---|---|
| Exactly | 0.0251 (0.0444) | 0.0120 (0.0185) | 0.2036 (0.1153) |
| Exactly2 | 0.1889 (0.0000) | 0.2452 (0.0078) | 0.2186 (0.0000) |
| HeartEW | 0.0983 (0.0157) | 0.1369 (0.0170) | 0.1738 (0.0185) |
| Lymphography | 0.0324 (0.0154) | 0.1158 (0.0238) | 0.1227 (0.0326) |
| M-of-n | 0.0181 (0.0281) | 0.0073 (0.0052) | 0.1064 (0.0773) |
| PenglungEW | 0.0036 (0.0002) | 0.0675 (0.0119) | 0.0002 (0.0001) |
| SonarEW | 0.0071 (0.0081) | 0.0498 (0.0185) | 0.0344 (0.0167) |
| SpectEW | 0.0644 (0.0117) | 0.1450 (0.0145) | 0.1203 (0.0169) |
| CongressEW | 0.0022 (0.0004) | 0.0163 (0.0080) | 0.0387 (0.0115) |
| IonosphereEW | 0.0188 (0.0072) | 0.0491 (0.0149) | 0.0395 (0.0144) |
| KrvskpEW | 0.0316 (0.0055) | 0.0263 (0.0114) | 0.0581 (0.0141) |
| Vote | 0.0234 (0.0057) | 0.0332 (0.0080) | 0.0460 (0.0144) |
| WaveformEW | 0.2394 (0.0133) | 0.2524 (0.0117) | 0.2752 (0.0126) |
| WineEW | 0.0033 (0.0007) | 0.0105 (0.0106) | 0.0216 (0.0204) |
| Zoo | 0.0025 (0.0005) | 0.0030 (0.0004) | 0.0019 (0.0002) |
| BreastEW | 0.0193 (0.0071) | 0.0363 (0.0125) | 0.0339 (0.0060) |
| Brain_Tumor2 | 0.0047 (0.0000) | 0.4074 (0.0577) | 0.0100 (0.0302) |
| 9_Tumors | 0.0047 (0.0001) | 0.1538 (0.0629) | 0.1114 (0.0818) |
| Leukemia2 | 0.0047 (0.0000) | 0.0047 (0.0000) | 0.0002 (0.0000) |
| Prostrate Tumors | 0.0190 (0.0220) | 0.0143 (0.0192) | 0.0364 (0.0269) |
| Ranking (W/T/L) | 13/0/7 | 4/0/16 | 3/0/17 |
Table 7: Comparison of ISSAFD with the other variants ISSAF and ISSALD in terms of selected attributes; cells report AVG (STD), and the last row reports wins/ties/losses (W/T/L)

| Datasets | ISSAFD | ISSAF | ISSALD |
|---|---|---|---|
| Exactly | 5.3333 (1.2685) | 4.3333 (0.9248) | 6.8000 (2.7834) |
| Exactly2 | 1.0000 (0.0000) | 3.8667 (3.2561) | 1.0000 (0.0000) |
| HeartEW | 6.2000 (1.4479) | 6.3333 (1.3476) | 6.4667 (0.8193) |
| Lymphography | 7.8333 (1.7036) | 8.0000 (1.4856) | 8.1333 (1.3060) |
| M-of-n | 7.4333 (1.1943) | 6.8667 (0.7303) | 8.2000 (1.2429) |
| PenglungEW | 118.1667 (7.2829) | 129.5333 (9.8216) | 108.0667 (2.7283) |
| SonarEW | 23.7667 (3.3701) | 28.2333 (2.7628) | 27.3333 (5.9558) |
| SpectEW | 8.6667 (2.5235) | 11.0667 (2.1324) | 6.4667 (1.6761) |
| CongressEW | 3.2333 (0.6814) | 5.4000 (1.4527) | 3.2667 (0.9948) |
| IonosphereEW | 13.4667 (3.5790) | 13.5333 (2.1129) | 13.5000 (2.2834) |
| KrvskpEW | 21.7000 (1.9325) | 20.0333 (1.8659) | 22.3000 (4.9211) |
| Vote | 4.2333 (1.8998) | 4.6667 (2.4117) | 4.2667 (0.9444) |
| WaveformEW | 12.2667 (2.5452) | 12.5667 (3.3598) | 12.6000 (4.0735) |
| WineEW | 2.1213 (0.9444) | 3.1000 (1.2521) | 6.5333 (0.7589) |
| Zoo | 4.3333 (0.7303) | 4.7333 (0.6915) | 3.0667 (0.2537) |
| BreastEW | 5.4000 (2.3413) | 12.4667 (2.3154) | 11.4000 (3.2013) |
| Brain_Tumor2 | 4913.5000 (35.1399) | 5014.5667 (60.2842) | 4979.6000 (85.1581) |
| 9_Tumors | 2681.5000 (32.3758) | 2773.3000 (36.9642) | 2794.1000 (326.2576) |
| Leukemia2 | 5323.0333 (42.5258) | 5326.1333 (48.6803) | 102.6333 (18.4979) |
| Prostrate Tumors | 5085.8667 (58.8187) | 5063.2333 (46.1838) | 5168.3333 (420.6817) |
| Ranking (W/T/L) | 12/1/8 | 4/0/16 | 6/1/14 |
Table 9 shows that the ISSAFD method achieved the best specificity values in 13 datasets, and it achieved 100% specificity on the Leukemia2 dataset, like the ISSAF and ISSALD methods; it is followed by the ISSAF method with 4 datasets, whereas the ISSALD ranked last with only two datasets.

Table 8: Comparison of ISSAFD with the other variants ISSAF and ISSALD in terms of sensitivity; cells report AVG (STD), and the last row reports wins/ties/losses (W/T/L)

| Datasets | ISSAFD | ISSAF | ISSALD |
|---|---|---|---|
| Exactly | 0.9880 (0.0266) | 0.9869 (0.0109) | 0.9514 (0.0615) |
| Exactly2 | 1.0000 (0.0000) | 0.9658 (0.0479) | 1.0000 (0.0000) |
| HeartEW | 0.9237 (0.0354) | 0.9449 (0.0512) | 0.8903 (0.0553) |
| Lymphography | 0.9278 (0.0362) | 0.9222 (0.0417) | 1.0000 (0.0000) |
| M-of-n | 0.9815 (0.0420) | 0.9986 (0.0054) | 0.8544 (0.1274) |
| PenglungEW | 1.0000 (0.0000) | 1.0000 (0.0000) | 1.0000 (0.0000) |
| SonarEW | 0.9961 (0.0149) | 0.9379 (0.0348) | 0.9519 (0.0431) |
| SpectEW | 0.7667 (0.0595) | 0.7615 (0.0310) | 0.7056 (0.0811) |
| CongressEW | 1.0000 (0.0000) | 0.9914 (0.0094) | 0.9509 (0.0117) |
| IonosphereEW | 0.9986 (0.0053) | 0.9928 (0.0109) | 0.9935 (0.0130) |
| KrvskpEW | 0.9703 (0.0076) | 0.9702 (0.0099) | 0.9472 (0.0195) |
| Vote | 0.9970 (0.0115) | 1.0000 (0.0000) | 0.9737 (0.0359) |
| WaveformEW | 0.7370 (0.0205) | 0.7369 (0.0221) | 0.6997 (0.0234) |
| WineEW | 1.0000 (0.0000) | 0.9958 (0.0228) | 1.0000 (0.0000) |
| Zoo | 0.9722 (0.0987) | 1.0000 (0.0000) | 0.9667 (0.0562) |
| BreastEW | 0.9818 (0.0128) | 0.9815 (0.0140) | 0.9958 (0.0074) |
| Brain_Tumor2 | 0.5750 (0.2470) | 0.1667 (0.3790) | 0.9917 (0.0456) |
| 9_Tumors | 0.5667 (0.3051) | 0.7000 (0.4661) | 0.4222 (0.2466) |
| Leukemia2 | 1.0000 (0.0000) | 1.0000 (0.0000) | 1.0000 (0.0000) |
| Prostrate Tumors | 0.9667 (0.0615) | 0.9889 (0.0288) | 0.9455 (0.0453) |
| Ranking (W/T/L) | 11/4/9 | 8/2/12 | 7/4/13 |
5.5. Comparison of ISSAFD with standard metaheuristics

In this section, the results of ISSAFD and the other methods, namely SSA, SCA, GWO, ALO, GA, and PSO, are analyzed and presented in Tables 10-16 and Figures 4-7, in terms of the performance measures mentioned in Section 5.2.
Table 9: Comparison of ISSAFD with the other variants ISSAF and ISSALD in terms of specificity; cells report AVG (STD), and the last row reports wins/ties/losses (W/T/L)

| Datasets | ISSAFD | ISSAF | ISSALD |
|---|---|---|---|
| Exactly | 0.9641 (0.0872) | 0.9839 (0.0384) | 0.4213 (0.3889) |
| Exactly2 | 0.3333 (0.0000) | 0.1240 (0.1468) | 0.2200 (0.0000) |
| HeartEW | 0.8812 (0.0497) | 0.7940 (0.0321) | 0.7420 (0.0948) |
| Lymphography | 0.9833 (0.0339) | 0.8133 (0.0367) | 1.0000 (0.0000) |
| M-of-n | 0.9909 (0.0206) | 0.9906 (0.0056) | 0.9253 (0.0619) |
| PenglungEW | 0.9030 (0.0231) | 1.0000 (0.0000) | 0.8310 (0.1036) |
| SonarEW | 0.9973 (0.0101) | 0.9717 (0.0284) | 0.9792 (0.0212) |
| SpectEW | 0.9881 (0.0150) | 0.8894 (0.0229) | 0.9317 (0.0291) |
| CongressEW | 1.0000 (0.0000) | 0.9798 (0.0145) | 0.9856 (0.0168) |
| IonosphereEW | 0.9565 (0.0255) | 0.8550 (0.0547) | 0.9040 (0.0387) |
| KrvskpEW | 0.9785 (0.0129) | 0.9754 (0.0159) | 0.9411 (0.0347) |
| Vote | 0.9711 (0.0080) | 0.9563 (0.0126) | 0.9480 (0.0210) |
| WaveformEW | 0.8801 (0.0102) | 0.8591 (0.0119) | 0.8467 (0.0118) |
| WineEW | 1.0000 (0.0000) | 0.9964 (0.0109) | 0.9760 (0.0290) |
| Zoo | 0.8556 (0.0974) | 1.0000 (0.0000) | 0.9872 (0.0574) |
| BreastEW | 0.9896 (0.0106) | 0.9417 (0.0211) | 0.9190 (0.0194) |
| Brain_Tumor2 | 0.5944 (0.1431) | 0.8889 (0.0000) | 0.9944 (0.0304) |
| 9_Tumors | 0.6444 (0.1182) | 0.9424 (0.0506) | 0.6481 (0.1432) |
| Leukemia2 | 1.0000 (0.0000) | 1.0000 (0.0000) | 1.0000 (0.0000) |
| Prostrate Tumors | 0.9952 (0.0181) | 0.9926 (0.0282) | 0.9833 (0.0379) |
| Ranking (W/T/L) | 14/1/6 | 5/1/15 | 3/1/17 |
• In terms of accuracy: By observing the results in Table 10, it can be clearly seen that ISSAFD achieved the highest accuracy in 90% of the datasets, followed by ALO and PSO, which obtained the best results in two datasets each. The rest of the methods are ranked as follows: SSA, followed by SCA, GA, and GWO. The power of the proposed ISSAFD algorithm shows clearly on the large datasets presented in Table 10: it ranked first with an average accuracy of 99.6% on these datasets, whereas the SCA ranked second with only 76%, followed by PSO and ALO with 70.8% and 70.3%, respectively. In addition, the proposed method showed stable behavior throughout all experiments; this can be verified by inspecting the STD values, since it achieved the smallest average value among all methods (i.e., 0.0093), followed by ALO and PSO with 0.036 and 0.037, respectively. Figures 4-7 show the boxplots of the classification accuracy measurements for ISSAFD compared with several optimizers, including GWO, SCA, SSA, ALO, GA, and PSO, which were implemented and tested on the twenty datasets in the same environment.

• In terms of sensitivity and specificity: According to Table 11, the proposed ISSAFD method achieved the highest sensitivity value in 15 datasets out of 20, followed by PSO and ALO, which obtained the best results in three datasets each, whereas the GWO obtained the best results in two datasets. Although the SSA did not achieve the best sensitivity on any dataset, it ranked fourth based on the average over all datasets, followed by GA, GWO, and SCA. Moreover, the proposed approach showed stable behavior according to the STD values, since it obtained the smallest STD average among all methods (i.e., 0.049), followed by ALO, SSA, PSO, and GA with 0.075, 0.090, 0.093, and 0.099, respectively. By inspecting the results in Table 12, the ISSAFD reached the best specificity value in 11 datasets out of 20, followed by PSO, ALO, GWO, and SSA, which obtained the best values in 7, 6, 3, and 2 datasets, respectively; some of these methods obtained the best specificity values on the same datasets. Over all datasets, the ISSAFD ranked first, followed by ALO, SSA, SCA, PSO, GA, and GWO. On the large datasets, the ISSAFD also achieved the best value, followed by GA and ALO.
Table 10: Comparison of ISSAFD versus other optimizers in terms of accuracy; cells report AVG (STD), and the last row reports wins/ties/losses (W/T/L)

| Datasets | ISSAFD | SCA | SSA | GWO | ALO | GA | PSO |
|---|---|---|---|---|---|---|---|
| Exactly | 0.9803 (0.0440) | 0.8510 (0.1531) | 0.9815 (0.0647) | 0.7860 (0.1240) | 1.0000 (0.0000) | 0.8052 (0.1261) | 1.0000 (0.0000) |
| Exactly2 | 0.8100 (0.0000) | 0.4842 (0.1584) | 0.6403 (0.0625) | 0.6290 (0.0978) | 0.6555 (0.0409) | 0.6513 (0.0510) | 0.6718 (0.0242) |
| HeartEW | 0.9056 (0.0164) | 0.8068 (0.0488) | 0.8000 (0.0349) | 0.7802 (0.0550) | 0.7957 (0.0286) | 0.7926 (0.0520) | 0.7932 (0.0223) |
| Lymphography | 0.9717 (0.0156) | 0.8122 (0.0536) | 0.8078 (0.0623) | 0.8044 (0.1113) | 0.8411 (0.0572) | 0.8211 (0.0719) | 0.8222 (0.0576) |
| M-of-n | 0.9875 (0.0277) | 0.9563 (0.0799) | 0.9895 (0.0347) | 0.9203 (0.0667) | 1.0000 (0.0000) | 0.9110 (0.0778) | 1.0000 (0.0000) |
| PenglungEW | 1.0000 (0.0000) | 0.7378 (0.0801) | 0.7089 (0.0792) | 0.6644 (0.0811) | 0.7200 (0.0537) | 0.6600 (0.0590) | 0.7133 (0.0704) |
| SonarEW | 0.9968 (0.0082) | 0.8405 (0.0473) | 0.8683 (0.0389) | 0.8381 (0.0421) | 0.8595 (0.0469) | 0.8571 (0.0447) | 0.8802 (0.0417) |
| SpectEW | 0.9389 (0.0121) | 0.7617 (0.0418) | 0.7395 (0.0404) | 0.7525 (0.0375) | 0.7512 (0.0448) | 0.7488 (0.0417) | 0.7395 (0.0476) |
| CongressEW | 1.0000 (0.0000) | 0.9456 (0.0145) | 0.9307 (0.0199) | 0.9180 (0.0270) | 0.9291 (0.0174) | 0.9261 (0.0254) | 0.9291 (0.0189) |
| IonosphereEW | 0.9850 (0.0073) | 0.9160 (0.0380) | 0.9047 (0.0342) | 0.8920 (0.0493) | 0.9258 (0.0261) | 0.9047 (0.0268) | 0.9150 (0.0290) |
| KrvskpEW | 0.9742 (0.0057) | 0.9364 (0.0200) | 0.9565 (0.0100) | 0.9279 (0.0298) | 0.9656 (0.0073) | 0.9377 (0.0170) | 0.9616 (0.0093) |
| Vote | 0.9806 (0.0063) | 0.9494 (0.0102) | 0.9489 (0.0190) | 0.9489 (0.0169) | 0.9511 (0.0138) | 0.9389 (0.0216) | 0.9494 (0.0208) |
| WaveformEW | 0.7636 (0.0132) | 0.7258 (0.0151) | 0.7374 (0.0166) | 0.7208 (0.0143) | 0.7491 (0.0142) | 0.7279 (0.0185) | 0.7486 (0.0116) |
| WineEW | 1.0000 (0.0000) | 0.9731 (0.0523) | 0.9639 (0.0414) | 0.9676 (0.0438) | 0.9954 (0.0105) | 0.9657 (0.0315) | 0.9806 (0.0522) |
| Zoo | 1.0000 (0.0000) | 0.9714 (0.0826) | 0.9825 (0.0318) | 0.9222 (0.0880) | 0.9937 (0.0207) | 0.9460 (0.0496) | 0.9937 (0.0165) |
| BreastEW | 0.9851 (0.0070) | 0.9491 (0.0287) | 0.9433 (0.0161) | 0.9415 (0.0137) | 0.9468 (0.0172) | 0.9433 (0.0191) | 0.9447 (0.0153) |
| Brain_Tumor2 | 1.0000 (0.0000) | 0.7197 (0.1554) | 0.6796 (0.0494) | 0.6681 (0.0824) | 0.6644 (0.0797) | 0.6933 (0.0563) | 0.6907 (0.0778) |
| 9_Tumors | 1.0000 (0.0000) | 0.6257 (0.2572) | 0.6468 (0.1195) | 0.6346 (0.1306) | 0.6236 (0.1095) | 0.6218 (0.1307) | 0.6355 (0.1148) |
| Leukemia2 | 1.0000 (0.0000) | 0.9044 (0.0736) | 0.8667 (0.0581) | 0.8489 (0.0630) | 0.8800 (0.0565) | 0.8711 (0.0605) | 0.8689 (0.0593) |
| Prostrate Tumors | 0.9857 (0.0222) | 0.8016 (0.1091) | 0.6063 (0.0718) | 0.6111 (0.0993) | 0.6444 (0.0827) | 0.6111 (0.0730) | 0.6381 (0.0657) |
| Ranking (W/T/L) | 18/0/2 | 0/0/20 | 0/0/20 | 0/0/20 | 2/2/18 | 0/0/20 | 2/2/18 |
Moreover, the proposed approach also showed the smallest STD average among all methods (i.e., 0.034), followed by ALO and SSA with 0.037 and 0.050, respectively.

• In terms of the average number of selected features: This measure shows how well the methods can reduce the number of features of the given datasets. All methods were examined, as illustrated in Table 13. The results show that the SCA had the greatest ability to reduce the number of features in all datasets, with an average reduction of 79%, whereas the proposed ISSAFD approach ranked second with 63%, followed by ALO, PSO, SSA, GWO, and GA with 59%, 58%, 55%, 53%, and 51%, respectively. We note that the SCA was able to reduce the large feature sets the most among all methods, since it removed 98% of the original features. However, although the SCA showed a high ability to reduce the number of features on the large datasets, the ISSAFD obtained the highest accuracy on these datasets (0.99), whereas the SCA achieved 0.76 and ranked second. Figure 8 illustrates the relationship between the average classification accuracy and the number of selected features. From this figure we can see that the SCA ranks first in reducing the number of features, whereas its accuracy comes in fifth rank. So, although the ISSAFD ranks second in reducing the number of features, it showed the best accuracy over all datasets; therefore, it can be considered more accurate than the SCA. From the above observations, the ranking is as follows: ISSAFD, followed by SCA, ALO, PSO, SSA, and GWO, while GA comes in the last rank.

• In terms of the average fitness value: This measure shows the ability of the methods to minimize the fitness value. Based on the results of Table 14, the PSO ranked first, since it obtained the minimum values in 10 datasets out of 20, followed by the ISSAFD with 8 datasets. The remaining methods ranked as follows: ALO ranked third, followed by SCA, SSA, GA, and GWO. Meanwhile, the ISSAFD achieved the best (minimum) STD value (i.e., 0.009), followed by ALO and SSA with 0.0120 and 0.0172, respectively. Moreover, to evaluate the convergence of the proposed method, it is compared with the traditional SCA and SSA, as depicted in Figure 9. It can be seen that
Table 11: Comparison of ISSAFD versus other optimizers in terms of sensitivity; cells report AVG (STD), and the last row reports wins/ties/losses (W/T/L)

| Datasets | ISSAFD | SCA | SSA | GWO | ALO | GA | PSO |
|---|---|---|---|---|---|---|---|
| Exactly | 0.9880 (0.0266) | 0.9148 (0.0909) | 0.9883 (0.0471) | 0.9124 (0.0903) | 1.0000 (0.0000) | 0.8744 (0.0864) | 1.0000 (0.0000) |
| Exactly2 | 1.0000 (0.0000) | 0.4625 (0.2853) | 0.7327 (0.1021) | 0.7033 (0.1676) | 0.7566 (0.0620) | 0.7497 (0.0632) | 0.7793 (0.0247) |
| HeartEW | 0.9237 (0.0354) | 0.8460 (0.0809) | 0.8080 (0.0541) | 0.7828 (0.0811) | 0.8253 (0.0434) | 0.8057 (0.0681) | 0.8287 (0.0389) |
| Lymphography | 0.9278 (0.0362) | 0.0667 (0.2537) | 0.0667 (0.2537) | 0.1000 (0.3051) | 0.0333 (0.1826) | 0.1667 (0.3790) | 0.7333 (0.4498) |
| M-of-n | 0.9815 (0.0420) | 0.9510 (0.0919) | 0.9863 (0.0463) | 0.8932 (0.0900) | 1.0000 (0.0000) | 0.8815 (0.1039) | 1.0000 (0.0000) |
| PenglungEW | 1.0000 (0.0000) | 0.7667 (0.4302) | 0.9333 (0.2537) | 0.8333 (0.3790) | 0.9333 (0.2537) | 0.9667 (0.1826) | 0.8667 (0.3457) |
| SonarEW | 0.9961 (0.0149) | 0.8479 (0.0815) | 0.8750 (0.0544) | 0.8792 (0.0613) | 0.8646 (0.0736) | 0.8625 (0.0601) | 0.9021 (0.0629) |
| SpectEW | 0.7667 (0.0595) | 0.0000 (0.0000) | 0.0533 (0.0629) | 0.0567 (0.0626) | 0.0133 (0.0346) | 0.0633 (0.0718) | 0.0633 (0.0556) |
| CongressEW | 1.0000 (0.0000) | 0.9553 (0.0086) | 0.9520 (0.0214) | 0.9433 (0.0263) | 0.9567 (0.0183) | 0.9393 (0.0213) | 0.9533 (0.0177) |
| IonosphereEW | 0.9986 (0.0053) | 0.9404 (0.0528) | 0.9695 (0.0260) | 0.9518 (0.0739) | 0.9667 (0.0278) | 0.9759 (0.0207) | 0.9688 (0.0260) |
| KrvskpEW | 0.9703 (0.0076) | 0.9438 (0.0171) | 0.9579 (0.0114) | 0.9419 (0.0208) | 0.9638 (0.0124) | 0.9443 (0.0213) | 0.9628 (0.0102) |
| Vote | 0.9970 (0.0115) | 0.9449 (0.0330) | 0.9474 (0.0311) | 0.9564 (0.0331) | 0.9641 (0.0200) | 0.9410 (0.0388) | 0.9641 (0.0363) |
| WaveformEW | 0.7370 (0.0205) | 0.6936 (0.0273) | 0.7127 (0.0253) | 0.6867 (0.0241) | 0.7208 (0.0192) | 0.6970 (0.0230) | 0.7183 (0.0162) |
| WineEW | 1.0000 (0.0000) | 0.9923 (0.0235) | 0.9846 (0.0313) | 0.9821 (0.0482) | 0.9949 (0.0195) | 0.9974 (0.0140) | 0.9974 (0.0140) |
| Zoo | 0.9722 (0.0987) | 0.9375 (0.1760) | 0.9792 (0.0576) | 0.9000 (0.1687) | 0.9875 (0.0381) | 0.9292 (0.1169) | 1.0000 (0.0000) |
| BreastEW | 0.9818 (0.0128) | 0.9481 (0.0377) | 0.9472 (0.0238) | 0.9435 (0.0239) | 0.9509 (0.0265) | 0.9495 (0.0264) | 0.9472 (0.0254) |
| Brain_Tumor2 | 0.5750 (0.2470) | 0.6167 (0.3869) | 0.9500 (0.1526) | 0.9667 (0.1269) | 0.9667 (0.1269) | 0.9500 (0.1526) | 0.9333 (0.1729) |
| 9_Tumors | 0.5667 (0.3051) | 0.3333 (0.4795) | 0.8000 (0.4068) | 0.8667 (0.3457) | 0.8333 (0.3790) | 0.8333 (0.3790) | 0.7667 (0.4302) |
| Leukemia2 | 1.0000 (0.0000) | 0.9542 (0.0695) | 0.8500 (0.0509) | 0.8542 (0.0474) | 0.8500 (0.0509) | 0.8583 (0.0432) | 0.8542 (0.0474) |
| Prostrate Tumors | 0.9667 (0.0615) | 0.7861 (0.1325) | 0.6056 (0.0927) | 0.5778 (0.1217) | 0.6278 (0.1132) | 0.5972 (0.1160) | 0.6444 (0.0901) |
| Ranking (W/T/L) | 15/0/5 | 0/0/20 | 1/0/19 | 2/1/18 | 3/3/17 | 0/0/20 | 3/2/17 |
Table 12: Comparison of ISSAFD versus other optimizers in terms of specificity; cells report AVG (STD), and the last row reports wins/ties/losses (W/T/L)

| Datasets | ISSAFD | SCA | SSA | GWO | ALO | GA | PSO |
|---|---|---|---|---|---|---|---|
| Exactly | 0.9641 (0.0872) | 0.6948 (0.3163) | 0.9649 (0.1096) | 0.4764 (0.3638) | 1.0000 (0.0000) | 0.6356 (0.2578) | 1.0000 (0.0000) |
| Exactly2 | 0.3333 (0.0000) | 0.5546 (0.2676) | 0.3397 (0.1098) | 0.3872 (0.1455) | 0.3262 (0.0485) | 0.3312 (0.0811) | 0.7793 (0.0247) |
| HeartEW | 0.8812 (0.0497) | 0.7613 (0.0925) | 0.7907 (0.0784) | 0.7773 (0.0710) | 0.7613 (0.0782) | 0.7773 (0.0805) | 0.8287 (0.0389) |
| Lymphography | 0.9833 (0.0339) | 1.0000 (0.0000) | 0.9989 (0.0063) | 1.0000 (0.0000) | 1.0000 (0.0000) | 0.9989 (0.0063) | 1.0000 (0.0000) |
| M-of-n | 0.9909 (0.0206) | 0.9601 (0.0779) | 0.9917 (0.0274) | 0.9396 (0.0559) | 1.0000 (0.0000) | 0.9319 (0.0639) | 1.0000 (0.0000) |
| PenglungEW | 0.9030 (0.0231) | 0.9810 (0.0321) | 0.9929 (0.0218) | 1.0000 (0.0000) | 0.9952 (0.0181) | 0.9976 (0.0130) | 0.8667 (0.3457) |
| SonarEW | 0.9973 (0.0101) | 0.8359 (0.0571) | 0.8641 (0.0532) | 0.8128 (0.0569) | 0.8564 (0.0562) | 0.8538 (0.0548) | 0.9021 (0.0629) |
| SpectEW | 0.9881 (0.0150) | 0.9348 (0.0513) | 0.8955 (0.0476) | 0.9106 (0.0442) | 0.9189 (0.0581) | 0.9045 (0.0522) | 0.0633 (0.0556) |
| CongressEW | 1.0000 (0.0000) | 0.9324 (0.0323) | 0.9018 (0.0446) | 0.8838 (0.0555) | 0.8919 (0.0487) | 0.9081 (0.0548) | 0.9533 (0.0177) |
| IonosphereEW | 0.9565 (0.0255) | 0.8681 (0.0701) | 0.7778 (0.0884) | 0.7750 (0.0615) | 0.8458 (0.0560) | 0.7653 (0.0722) | 0.9688 (0.0260) |
| KrvskpEW | 0.9785 (0.0129) | 0.9294 (0.0465) | 0.9550 (0.0163) | 0.9146 (0.0415) | 0.9674 (0.0129) | 0.9313 (0.0243) | 0.9628 (0.0102) |
| Vote | 0.9711 (0.0080) | 0.9529 (0.0166) | 0.9500 (0.0234) | 0.9431 (0.0203) | 0.9412 (0.0204) | 0.9373 (0.0296) | 0.9641 (0.0363) |
| WaveformEW | 0.8801 (0.0102) | 0.8583 (0.0123) | 0.8592 (0.0150) | 0.8497 (0.0124) | 0.8656 (0.0129) | 0.8555 (0.0138) | 0.7183 (0.0162) |
| WineEW | 1.0000 (0.0000) | 0.9841 (0.0333) | 0.9696 (0.0414) | 0.9826 (0.0335) | 1.0000 (0.0000) | 0.9812 (0.0295) | 0.9974 (0.0140) |
| Zoo | 0.8556 (0.0974) | 0.9949 (0.0281) | 1.0000 (0.0000) | 0.9923 (0.0310) | 1.0000 (0.0000) | 0.9949 (0.0195) | 1.0000 (0.0000) |
| BreastEW | 0.9896 (0.0106) | 0.9508 (0.0293) | 0.9365 (0.0201) | 0.9381 (0.0262) | 0.9397 (0.0278) | 0.9325 (0.0235) | 0.9472 (0.0254) |
| Brain_Tumor2 | 0.5944 (0.1431) | 0.9958 (0.0228) | 1.0000 (0.0000) | 1.0000 (0.0000) | 1.0000 (0.0000) | 1.0000 (0.0000) | 0.9333 (0.1729) |
| 9_Tumors | 0.6444 (0.1182) | 0.3152 (0.1447) | 0.5273 (0.1420) | 0.4848 (0.1459) | 0.5576 (0.1579) | 0.6273 (0.1293) | 0.7667 (0.4302) |
| Leukemia2 | 1.0000 (0.0000) | 0.9571 (0.0764) | 0.9667 (0.0615) | 0.9524 (0.0685) | 0.9857 (0.0436) | 0.9619 (0.0643) | 0.8542 (0.0474) |
| Prostrate Tumors | 0.9952 (0.0181) | 0.8222 (0.1258) | 0.6074 (0.1000) | 0.6556 (0.1498) | 0.6667 (0.1092) | 0.6296 (0.1143) | 0.6444 (0.0901) |
| Ranking (W/T/L) | 11/0/9 | 1/3/19 | 1/2/19 | 3/5/17 | 6/9/14 | 1/2/19 | 7/7/13 |
Figure 4: Boxplots of ISSAFD versus other optimizers based on the accuracy metric: Exactly to SpectEW. (Panels: (a) Exactly, (b) Exactly2, (c) HeartEW, (d) Lymphography, (e) M-of-n, (f) PenglungEW; each panel plots Accuracy per algorithm: GA, GWO, ALO, PSO, SCA, SSA, ISSAFD.)
Figure 5: Boxplots of ISSAFD versus other optimizers based on the accuracy metric: SonarEW to Vote. (Panels: (a) SonarEW, (b) SpectEW, (c) CongressEW, (d) IonosphereEW, (e) KrvskpEW, (f) Vote; each panel plots Accuracy per algorithm: GA, GWO, ALO, PSO, SCA, SSA, ISSAFD.)
Figure 6: Boxplots of ISSAFD versus other optimizers based on the accuracy metric: WaveformEW to 9_Tumors. (Panels: (a) WaveformEW, (b) WineEW, (c) Zoo, (d) BreastEW, (e) Brain_Tumor2, (f) 9_Tumors; each panel plots Accuracy per algorithm: GA, GWO, ALO, PSO, SCA, SSA, ISSAFD.)
Figure 7: Boxplots of ISSAFD versus other optimizers based on the accuracy metric for Leukemia2 and Prostate Tumors. (Panels: (a) Leukemia2, (b) Prostate Tumors; each panel plots Accuracy per algorithm: GA, GWO, ALO, PSO, SCA, SSA, ISSAFD.)
Figure 8: The average accuracy and the feature selection ratio of all methods.
33
Table 13: Comparison between ISSAFD and other metaheuristics based on the minimum number of selected features (AVG ± STD)

Datasets | ISSAFD | SCA | SSA | GWO | ALO | GA | PSO
Exactly | 5.3333±1.2685 | 5.7667±0.8172 | 6.9333±1.0483 | 5.7667±3.1588 | 6.0000±0.0000 | 8.2000±1.8828 | 6.0000±0.0000
Exactly2 | 1.0000±0.0000 | 2.9667±1.9205 | 5.9667±1.6709 | 6.0333±2.3265 | 6.5000±1.1371 | 6.9333±1.4840 | 6.3333±0.8023
HeartEW | 6.2000±1.4479 | 4.9333±1.0807 | 6.7333±1.3374 | 7.8000±2.0069 | 6.1000±1.2134 | 7.5333±1.7953 | 5.9667±1.0300
Lymphography | 7.8333±1.7036 | 4.6000±1.1919 | 6.8667±1.1366 | 6.9000±2.1066 | 6.3333±1.3730 | 7.6000±1.5222 | 7.0667±1.2576
M-of-n | 7.4333±1.1943 | 6.0333±0.8899 | 6.7667±1.0726 | 9.1000±1.7090 | 6.0000±0.0000 | 8.5000±1.3326 | 6.0000±0.0000
PenglungEW | 118.1667±7.2829 | 13.7000±4.9421 | 143.5333±10.4806 | 110.1667±38.9085 | 128.6667±9.8132 | 144.8333±11.0018 | 141.0333±10.4436
SonarEW | 23.7667±3.3701 | 11.9667±4.0725 | 27.6667±2.6042 | 30.0000±8.1875 | 23.5333±3.8393 | 27.0000±3.0626 | 26.6333±3.2956
SpectEW | 8.6667±2.5235 | 5.1000±0.3051 | 7.7667±1.7157 | 10.0667±2.1961 | 5.8667±1.5025 | 9.0000±1.6400 | 8.3667±2.4280
CongressEW | 3.2333±0.6814 | 3.2667±1.1725 | 6.0000±1.9827 | 7.0333±3.0792 | 5.5000±1.5920 | 7.3000±1.8965 | 5.1667±1.2617
IonosphereEW | 13.4667±3.5790 | 4.3000±1.0875 | 12.2333±1.9241 | 10.7667±3.2237 | 9.7333±1.9989 | 14.1667±2.3057 | 10.9667±1.9384
KrvskpEW | 21.7000±1.9325 | 9.6667±2.6566 | 19.5667±2.4870 | 27.4667±4.6441 | 19.4000±3.5681 | 19.9000±2.9402 | 20.6667±2.8567
Vote | 4.2333±1.8998 | 4.2667±1.9640 | 6.5000±2.3305 | 6.8333±3.0181 | 6.6000±1.4288 | 6.5333±2.1292 | 6.5667±1.3817
WaveformEW | 12.2667±2.5452 | 12.7000±3.3646 | 22.7667±3.1479 | 27.5667±6.0211 | 22.7000±3.0643 | 22.8000±3.4180 | 21.5333±2.6876
WineEW | 2.1213±0.9444 | 2.6667±0.8841 | 3.6333±1.1592 | 5.4667±2.0126 | 2.4667±0.7303 | 6.0333±1.4000 | 2.3333±0.7100
Zoo | 4.3333±0.7303 | 4.5333±0.6288 | 5.8667±1.1958 | 7.9333±1.9640 | 4.6000±0.8944 | 6.5000±1.7370 | 4.6333±0.6149
BreastEW | 5.4000±2.3413 | 5.6000±2.2682 | 12.7333±2.4766 | 15.3000±3.3441 | 9.1333±1.9954 | 12.5000±3.1596 | 11.8333±2.5742
Brain_Tumor2 | 4913.5000±35.1399 | 144.8333±53.8991 | 4999.2333±51.1675 | 3678.0667±1080.4006 | 4880.8000±35.3040 | 5068.4333±64.0499 | 4987.4667±18.5728
9_Tumors | 2681.5000±32.3758 | 144.5667±205.0720 | 2798.0333±38.2834 | 2353.3667±792.8315 | 2771.2000±51.0472 | 2821.6000±47.2883 | 2821.5667±33.9103
Leukemia2 | 5323.0333±42.5258 | 182.5333±67.1040 | 5439.0000±42.0689 | 3988.6667±1123.2541 | 5304.7000±39.5406 | 5459.9333±43.8917 | 5401.0333±28.5868
Prostate Tumors | 5085.8667±58.8187 | 362.6333±423.2828 | 5141.7333±52.0404 | 3327.9000±903.9182 | 5101.7667±46.7522 | 5163.3000±47.7213 | 5182.2000±43.2079
Ranking (W|T|L) | 8|0|12 | 11|0|9 | 0|0|20 | 0|0|20 | 1|1|19 | 0|0|20 | 1|1|19
[Figure 9: Convergence curves of SSA, SCA, and ISSAFD on six datasets: (a) Exactly2, (b) CongressEW, (c) Zoo, (d) IonosphereEW, (e) Sonar, (f) Leukemia2.]
the proposed method ISSAFD presents fast convergence on most datasets, while SCA converges slowly. In addition, SSA presents an intermediate convergence behavior on most datasets.

• In terms of average computation time: This measure indicates the speed of an algorithm in selecting features from a given dataset. According to the results in Table 15, the average time over all algorithms is 6.1 seconds. On this measure, GA is the fastest algorithm, since it obtained the lowest time on 9 datasets out of 20 with an average of 4.9 seconds, followed by ISSAFD and SSA with 7 and 4 datasets, respectively. In addition, the STD measure confirms the stability of ISSAFD, which ranked third with 0.15, after PSO and ALO with 0.0752 and 0.0753, respectively.

5.6. Wilcoxon's rank test

In this subsection, the Wilcoxon test is applied to check whether there is a significant difference between the proposed approach and the other methods. It is applied to the accuracy measure at a significance level of 0.05; a p-value below 0.05 indicates that the proposed approach differs significantly. Table 16 shows that ISSAFD achieved significant differences versus SCA and ALO on all datasets except M-of-n and Zoo, respectively, and it also outperformed SSA on all datasets except Exactly and M-of-n. Note that the bold values in Table 16 indicate p-values greater than 0.05. ISSAFD showed significant differences on all datasets versus PSO, GWO, and GA. In general, the proposed approach showed a positive improvement over all methods, inheriting the strengths of both the SSA and SCA algorithms.

5.7. Comparison with the state-of-the-art FS methods

The performance of the proposed approach in classifying the twenty benchmark datasets is compared with thirteen state-of-the-art methods, namely the binary dragonfly algorithm (BDA) (Mafarja et al., 2018), the hybrid whale optimization algorithm with simulated annealing and a tangent transfer function (WOASAT) (Mafarja & Mirjalili, 2017), the binary salp swarm algorithm (BSSA) (Faris et al., 2018), binary grey wolf optimization - approach 2 (bGWO2) (Emary et al., 2016b), binary grey wolf optimization - approach 1 (bGWO1) (Emary et al., 2016b), SSAPSO (Ibrahim et al., 2019), ISCA (Sindhu et al., 2017), the improved salp swarm algorithm (ISSA) (Hegazy et al., 2018), the grasshopper optimization algorithm for feature selection (GOFS) (Zakeri & Hokmabadi, 2019), the chaotic salp swarm algorithm (CSSA) (Sayed et al., 2018), the sigmoid binary butterfly optimization algorithm (S-bBOA) (Arora & Anand, 2019), the multi-strategy ensemble grey wolf optimizer (MEGWO) (Tu et al., 2019), and the binary grasshopper optimization algorithm based on mutation (BGOA-M) (Mafarja et al., 2019). Not all of the previous studies cover the same datasets; therefore, the missing results are marked with "-".
Table 14: Comparison of ISSAFD versus other optimizers based on the best fitness values (AVG ± STD)

Datasets | ISSAFD | SCA | SSA | GWO | ALO | GA | PSO
Exactly | 0.0251±0.0444 | 0.1222±0.1205 | 0.0278±0.0576 | 0.2229±0.0799 | 0.0046±0.0046 | 0.2066±0.0769 | 0.0046±0.0000
Exactly2 | 0.1889±0.0000 | 0.2325±0.0025 | 0.2310±0.0086 | 0.2442±0.0122 | 0.2213±0.0069 | 0.2457±0.0150 | 0.2133±0.2308
HeartEW | 0.0983±0.0157 | 0.1077±0.0214 | 0.1017±0.0152 | 0.1239±0.0173 | 0.0915±0.0129 | 0.1219±0.0196 | 0.0772±0.0978
Lymphography | 0.0324±0.0154 | 0.0807±0.0180 | 0.0654±0.0207 | 0.1149±0.0377 | 0.0519±0.0162 | 0.0999±0.0292 | 0.0363±0.0688
M-of-n | 0.0181±0.0281 | 0.0357±0.0532 | 0.0126±0.0242 | 0.0869±0.0420 | 0.0046±0.0046 | 0.0861±0.0467 | 0.0046±0.0000
PenglungEW | 0.0036±0.0002 | 0.1258±0.0400 | 0.1914±0.0249 | 0.2498±0.0418 | 0.1690±0.0376 | 0.2355±0.0414 | 0.1360±0.2024
SonarEW | 0.0071±0.0081 | 0.0554±0.0221 | 0.0541±0.0168 | 0.0946±0.0224 | 0.0456±0.0172 | 0.0634±0.0237 | 0.0277±0.0749
SpectEW | 0.0644±0.0117 | 0.0836±0.0115 | 0.0750±0.0098 | 0.0981±0.0147 | 0.0729±0.0064 | 0.0878±0.0105 | 0.0586±0.0761
CongressEW | 0.0022±0.0004 | 0.0294±0.0053 | 0.0318±0.0070 | 0.0431±0.0086 | 0.0262±0.0040 | 0.0391±0.0077 | 0.0145±0.0278
IonosphereEW | 0.0188±0.0072 | 0.0389±0.0121 | 0.0557±0.0121 | 0.0757±0.0146 | 0.0396±0.0079 | 0.0706±0.0116 | 0.0308±0.0593
KrvskpEW | 0.0316±0.0055 | 0.0504±0.0112 | 0.0368±0.0065 | 0.0482±0.0075 | 0.0255±0.0039 | 0.0510±0.0117 | 0.0228±0.0367
Vote | 0.0234±0.0057 | 0.0313±0.0065 | 0.0283±0.0072 | 0.0373±0.0077 | 0.0217±0.0037 | 0.0360±0.0070 | 0.0196±0.0349
WaveformEW | 0.2394±0.0133 | 0.2071±0.0097 | 0.1946±0.0078 | 0.2075±0.0105 | 0.1837±0.0068 | 0.2038±0.0125 | 0.1748±0.1948
WineEW | 0.0033±0.0133 | 0.0021±0.0097 | 0.0028±0.0078 | 0.0088±0.0105 | 0.0019±0.0068 | 0.0056±0.0125 | 0.0015±0.1948
Zoo | 0.0025±0.0005 | 0.0107±0.0178 | 0.0037±0.0007 | 0.0238±0.0381 | 0.0029±0.0006 | 0.0339±0.0499 | 0.0025±0.0038
BreastEW | 0.0193±0.0071 | 0.0172±0.0051 | 0.0202±0.0046 | 0.0277±0.0066 | 0.0149±0.0042 | 0.0256±0.0068 | 0.0110±0.0217
Brain_Tumor2 | 0.0047±0.0000 | 0.0001±0.0001 | 0.0048±0.0000 | 0.0464±0.0498 | 0.0047±0.0000 | 0.0214±0.0375 | 0.0048±0.0048
9_Tumors | 0.0047±0.0001 | 0.0039±0.0204 | 0.1206±0.0727 | 0.2288±0.1014 | 0.0654±0.0681 | 0.2022±0.0843 | 0.0048±0.1463
Leukemia2 | 0.0047±0.0000 | 0.0002±0.0001 | 0.0708±0.0000 | 0.0784±0.0224 | 0.0707±0.0000 | 0.0709±0.0000 | 0.0048±0.0709
Prostate Tumors | 0.0190±0.0220 | 0.0208±0.0269 | 0.1620±0.0469 | 0.2232±0.0451 | 0.0850±0.0331 | 0.1841±0.0624 | 0.0521±0.1464
Ranking (W|T|L) | 8|1|12 | 3|0|17 | 0|0|20 | 0|0|20 | 2|2|18 | 0|0|20 | 10|2|10
Table 15: Comparison between ISSAFD and other metaheuristics based on computation time in seconds (AVG ± STD)

Datasets | ISSAFD | SCA | SSA | GWO | ALO | GA | PSO
Exactly | 5.2474±0.2345 | 5.2675±0.2401 | 6.1712±0.3838 | 5.8158±0.7531 | 6.0738±0.0411 | 5.3047±0.2111 | 6.0253±0.0604
Exactly2 | 4.6817±0.3949 | 4.6848±0.5941 | 6.0642±0.4893 | 6.4241±0.6478 | 6.2144±0.0563 | 5.0456±0.1888 | 6.1010±0.0899
HeartEW | 3.0765±0.1641 | 3.6705±0.0646 | 3.8701±0.0490 | 3.8027±0.0872 | 3.8757±0.0332 | 3.0971±0.0220 | 3.8228±0.0207
Lymphography | 4.0305±0.0218 | 3.5469±0.0523 | 3.6485±0.0428 | 3.6406±0.0407 | 3.6802±0.0346 | 2.9251±0.0326 | 3.6211±0.0297
M-of-n | 5.0924±0.2566 | 5.0928±0.1319 | 5.7137±0.3158 | 5.5713±1.1317 | 5.7925±0.0344 | 5.1035±0.1202 | 5.7482±0.0319
PenglungEW | 8.9222±0.0981 | 3.8012±0.0414 | 3.9324±0.0325 | 4.0957±0.0361 | 3.9515±0.0347 | 3.1252±0.0319 | 3.9185±0.0283
SonarEW | 8.7573±0.0130 | 3.6515±0.0517 | 3.6813±0.0403 | 3.7135±0.0388 | 3.7133±0.0328 | 2.9429±0.0202 | 3.6493±0.0284
SpectEW | 3.8668±0.0194 | 3.6496±0.0410 | 3.7769±0.0603 | 3.6278±0.0518 | 3.7698±0.0345 | 3.0199±0.0640 | 3.7236±0.0489
CongressEW | 4.9958±0.0801 | 3.7073±0.1621 | 4.0702±0.1368 | 4.0139±0.1735 | 4.1382±0.0378 | 3.2910±0.0463 | 4.0713±0.0343
IonosphereEW | 4.9477±0.0349 | 3.7844±0.1256 | 3.8080±0.2128 | 3.7675±0.1250 | 3.8042±0.0316 | 2.9807±0.0572 | 3.7562±0.0270
KrvskpEW | 13.3236±0.2122 | 16.9805±2.7574 | 8.6549±0.4485 | 10.2884±0.9030 | 8.6481±0.1821 | 7.0108±0.5941 | 8.5729±0.1403
Vote | 3.4026±0.0360 | 3.4110±0.1210 | 3.6087±0.1058 | 3.5897±0.0965 | 3.6620±0.0378 | 3.8761±0.0524 | 3.5907±0.0341
WaveformEW | 20.3592±0.5800 | 36.2139±9.3794 | 18.8454±1.5606 | 23.8825±2.6485 | 18.3341±0.4581 | 15.0997±1.5280 | 18.3388±0.5309
WineEW | 2.6622±0.1533 | 3.1859±0.1481 | 3.4195±0.0485 | 3.4294±0.0395 | 3.4390±0.0258 | 2.7383±0.0277 | 3.3916±0.0276
Zoo | 3.4129±0.0845 | 3.4212±0.0563 | 3.4761±0.0464 | 3.5188±0.0569 | 3.4999±0.0386 | 3.7786±0.0267 | 3.4567±0.0391
BreastEW | 5.2273±0.0633 | 4.1948±0.1808 | 3.8757±0.2941 | 3.7588±0.1129 | 3.9326±0.0340 | 3.1492±0.1048 | 3.8247±0.0279
Brain_Tumor2 | 9.9118±0.0679 | 5.2046±0.0715 | 6.9822±0.0954 | 13.2539±0.3628 | 7.9923±0.0645 | 5.4345±0.1127 | 7.5435±0.0742
9_Tumors | 8.7458±0.0965 | 4.9592±0.0865 | 5.8747±0.1431 | 9.3682±0.5131 | 6.3548±0.0864 | 4.9919±0.1852 | 6.1487±0.0517
Leukemia2 | 10.3236±0.0894 | 5.8585±0.0850 | 8.9088±0.1204 | 14.7234±0.5314 | 10.0450±0.1089 | 6.9539±0.1723 | 9.5933±0.0911
Prostate Tumors | 11.8059±0.2989 | 5.9487±0.2993 | 11.1463±0.1576 | 14.8553±1.1016 | 12.2234±0.0979 | 8.6859±0.3981 | 11.7038±0.0871
Ranking (W|T|L) | 7|0|13 | 4|0|16 | 0|0|20 | 0|0|20 | 0|0|20 | 9|0|11 | 0|0|20
Table 16: p-values of the Wilcoxon test for the classification accuracy results of ISSAFD versus the other algorithms

Datasets | SCA | GWO | ALO | GA | PSO | SSA
Exactly | 1.35E-02 | 5.08E-09 | 6.61E-04 | 4.18E-08 | 6.61E-04 | 2.39E-01
Exactly2 | 1.08E-12 | 1.20E-12 | 9.91E-13 | 1.20E-12 | 8.31E-13 | 1.16E-12
HeartEW | 3.86E-09 | 2.08E-11 | 4.77E-11 | 1.63E-11 | 1.14E-11 | 1.59E-11
Lymphography | 1.80E-11 | 1.19E-10 | 2.23E-11 | 2.25E-11 | 1.80E-11 | 2.12E-11
M-of-n | 7.45E-01 | 5.91E-06 | 1.37E-03 | 2.53E-06 | 1.37E-03 | 9.45E-02
PenglungEW | 9.53E-13 | 9.11E-13 | 7.58E-13 | 7.88E-13 | 9.23E-13 | 9.78E-13
SonarEW | 3.73E-12 | 3.66E-12 | 3.61E-12 | 3.75E-12 | 3.40E-12 | 3.57E-12
SpectEW | 1.28E-11 | 1.23E-11 | 1.21E-11 | 1.16E-11 | 1.28E-11 | 1.28E-11
CongressEW | 4.49E-13 | 1.07E-12 | 1.03E-12 | 1.06E-12 | 8.19E-13 | 1.04E-12
IonosphereEW | 4.96E-11 | 8.54E-12 | 8.32E-12 | 8.13E-12 | 8.30E-12 | 1.12E-11
KrvskpEW | 7.65E-11 | 3.93E-11 | 1.35E-05 | 2.76E-10 | 4.18E-07 | 6.32E-10
Vote | 7.57E-12 | 2.86E-10 | 2.58E-11 | 4.35E-11 | 7.56E-11 | 6.75E-10
WaveformEW | 3.13E-10 | 6.31E-11 | 4.29E-04 | 1.52E-09 | 3.55E-05 | 2.09E-07
WineEW | 6.55E-04 | 1.25E-05 | 2.14E-02 | 1.38E-08 | 4.19E-02 | 8.22E-07
Zoo | 5.56E-03 | 1.24E-07 | 8.15E-02 | 1.14E-07 | 4.18E-02 | 1.32E-03
BreastEW | 2.40E-08 | 1.64E-11 | 2.26E-10 | 2.17E-11 | 2.02E-11 | 1.75E-11
Brain_Tumor2 | 4.45E-12 | 7.21E-13 | 7.20E-13 | 5.20E-13 | 4.53E-13 | 3.17E-13
9_Tumors | 6.08E-10 | 4.14E-12 | 1.15E-12 | 1.17E-12 | 1.13E-12 | 1.16E-12
Leukemia2 | 4.99E-09 | 7.77E-13 | 5.68E-13 | 7.11E-13 | 7.67E-13 | 8.22E-13
Prostate Tumors | 9.03E-11 | 9.34E-12 | 8.86E-12 | 8.78E-12 | 8.39E-12 | 8.90E-12
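For reproducibility, the following is a minimal sketch of how such a pairwise test can be computed; the run count, the synthetic accuracy samples, and the choice of the rank-sum variant are illustrative assumptions, since the paper does not list the per-run samples or the exact test variant.

```python
# Minimal sketch of a pairwise Wilcoxon-type test at alpha = 0.05,
# assuming two samples of per-run accuracies (e.g., 30 independent runs)
# for ISSAFD and one competitor on a single dataset. The values below
# are synthetic placeholders, not the authors' experimental data.
import numpy as np
from scipy.stats import ranksums

rng = np.random.default_rng(42)
acc_issafd = rng.normal(loc=0.97, scale=0.01, size=30)  # hypothetical runs
acc_sca = rng.normal(loc=0.93, scale=0.02, size=30)     # hypothetical runs

# Wilcoxon rank-sum test on the two independent accuracy samples
stat, p_value = ranksums(acc_issafd, acc_sca)
print(f"p-value = {p_value:.2e}; significant: {p_value < 0.05}")
```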
• In terms of accuracy
Table 17 reports the classification accuracy of ISSAFD on all datasets along with the different methods. From this table, we can conclude that, on the small datasets (i.e., numbers 1 to 16 in Table 17), ISSAFD outperformed all methods on 6 datasets out of 16, which equals about 38% of them. Moreover, both ISSAFD and BDA achieved 100% accuracy on two datasets, namely PenglungEW and WineEW. In addition, ISSAFD obtained 100% accuracy on the Zoo dataset, tying with BDA and BSSA. BDA comes in second rank because it obtained the best accuracy on 3 datasets out of 16 and achieved 100% accuracy on the M-of-n and Exactly datasets, tying with the third-ranked method (i.e., WOASAT (Mafarja & Mirjalili, 2017)). BGOA-M is ranked fourth with 3 datasets; it behaved similarly to WOASAT, but in general WOASAT obtains better results than BGOA-M. The BSSA method is ranked fifth, followed by bGWO2. Although the results of the remaining methods are good, they are lower than those of the other methods on all matching datasets. On the large datasets, ISSAFD outperformed all the other methods reported there, namely BDA, BSSA, ISCA, and GOFS. These results indicate that the proposed approach achieves more promising results than the other methods on both small and large datasets.
Table 17: Comparison of ISSAFD versus the state-of-the-art in terms of classification accuracy. The datasets are, in order: Exactly, Exactly2, HeartEW, Lymphography, M-of-n, PenglungEW, SonarEW, SpectEW, CongressEW, IonosphereEW, KrvskpEW, Vote, WaveformEW, WineEW, Zoo, BreastEW, Brain_Tumor2, 9_Tumors, Leukemia2, Prostate Tumors. Methods evaluated on only a subset of the datasets report fewer values (missing entries are marked "-" in the original table); each method's accuracies are listed in dataset order, followed by its W|T|L ranking.

ISSAFD (13|3|7): 0.980, 0.810, 0.906, 0.972, 0.988, 1.000, 0.997, 0.939, 1.000, 0.985, 0.974, 0.981, 0.764, 1.000, 1.000, 0.985, 1.000, 1.000, 1.000, 0.986
BDA (8|5|10): 1.000, 0.773, 0.876, 0.992, 1.000, 1.000, 0.980, 0.850, 0.987, 0.991, 0.979, 0.989, 0.758, 1.000, 1.000, 0.979, 0.710, 0.561
WOASAT (3|3|13): 1.000, 0.750, 0.850, 0.890, 1.000, 0.940, 0.970, 0.880, 0.980, 0.960, 0.980, 0.970, 0.760, 0.990, 0.970, 0.980
BSSA (1|1|15): 0.980, 0.758, 0.860, 0.890, 0.991, 0.877, 0.937, 0.836, 0.963, 0.918, 0.964, 0.951, 0.733, 0.993, 1.000, 0.948, 0.989
bGWO2 (1|0|15): 0.776, 0.750, 0.776, 0.700, 0.963, 0.584, 0.729, 0.822, 0.938, 0.834, 0.956, 0.920, 0.789, 0.920, 0.879, 0.935
bGWO1 (0|0|16): 0.708, 0.745, 0.776, 0.744, 0.908, 0.600, 0.731, 0.820, 0.935, 0.807, 0.944, 0.912, 0.786, 0.930, 0.879, 0.924
SSAPSO (0|0|5): 0.823, 0.847, 0.944, 0.939, 0.785
ISCA (0|0|5): 0.842, 0.837, 0.913, 0.987, 0.966
ISSA (0|0|6): 0.734, 0.853, 0.766, 0.978, 0.872, 0.957
GOFS (0|0|6): 0.881, 0.986, 0.942, 0.949, 0.987, 0.991
CSSA (0|0|5): 0.989, 0.928, 0.846, 0.766, 0.882
S-bBOA (0|0|16): 0.972, 0.760, 0.824, 0.868, 0.972, 0.878, 0.936, 0.846, 0.959, 0.907, 0.966, 0.965, 0.743, 0.984, 0.978, 0.971
MEGWO (0|0|7): 0.853, 0.948, 0.959, 0.825, 0.788, 0.984, 0.992
BGOA-M (3|3|13): 1.000, 0.736, 0.837, 0.919, 1.000, 0.973, 0.933, 0.843, 0.977, 0.966, 0.980, 0.967, 0.760, 0.989, 0.961, 0.970
• In terms of the number of selected features

Table 18 compares the proposed approach with ten previous studies. As can be seen in this table, ISSAFD obtained the smallest number of features on 8 datasets out of 20, whereas BGOA-M came in second rank with 6 datasets. Although BGOA-M showed a good ability to reduce the number of features, its accuracy is, in general, lower than that of ISSAFD. BDA is ranked third, followed by bGWOA2 and WOASAT, respectively. The remaining studies showed similar results on most datasets, except for the bGWOA1 method, which selected the largest number of features on all datasets. Overall, ISSAFD produced small feature subsets as well as good accuracy on most datasets.

From all the previous results, it can be noticed that the performance of ISSAFD is better than that of the other algorithms in most comparisons. The superiority of ISSAFD can be attributed to two factors. The first is the use of SCA to enhance the behavior of the SSA followers, which maintains the population diversity and improves its ability to search for the best solution. The second is the use of the disruption operator (DO), which helps avoid getting trapped in local optima, improves the exploration ability, and guarantees the diversity of the population. These factors clearly improve the behavior of SSA, and the advantages of both SCA and Dop carry over to it. Moreover, the simplicity, few predefined parameters, and low computational requirements of SSA help produce stable and good results after combining it with the SCA algorithm and the Dop operator. In addition, the DO has a large effect on the performance of the proposed method: ISSAFD outperformed ISSAF in terms of accuracy on 70% of the datasets. Moreover, ISSAFD achieved the highest accuracy on 65% of the datasets when compared with the state-of-the-art, followed by BDA with 40%. In general, the proposed ISSAFD is easy to implement and shows good performance; however, it still needs further enhancement, especially regarding its time complexity, which could be achieved by applying the DO to only a small part of the population or only under a specific condition, as sketched below.
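To make these two factors concrete, the sketch below illustrates, under stated assumptions, an SCA-style sinusoidal update applied to the follower half of the population, followed by a simplified disruption-style perturbation. The parameter settings (a, t_max, population size) and the distance-based disruption rule are illustrative choices, not the exact ISSAFD update equations.

```python
# Illustrative sketch (not the exact ISSAFD rules): SCA-style sinusoidal
# moves for the followers relative to the best solution, then a simplified
# disruption-style perturbation to diversify the population.
import numpy as np

rng = np.random.default_rng(0)
n_pop, dim, a, t, t_max = 20, 10, 2.0, 50, 100   # assumed settings
pop = rng.random((n_pop, dim))
best = pop[0].copy()                             # assume pop[0] is the best

r1 = a - t * (a / t_max)                         # shrinks over iterations
for i in range(n_pop // 2, n_pop):               # followers: second half
    r2 = 2 * np.pi * rng.random(dim)
    r3 = 2 * rng.random(dim)
    r4 = rng.random(dim)
    step = r1 * np.abs(r3 * best - pop[i])
    # switch between the sine and cosine moves, as in SCA
    pop[i] = np.where(r4 < 0.5,
                      pop[i] + np.sin(r2) * step,
                      pop[i] + np.cos(r2) * step)

# Simplified disruption-style operator: solutions far from the best are
# scattered strongly, close ones get only a tiny perturbation. It could
# also be applied to a random subset of the population to save time.
dist = np.linalg.norm(pop - best, axis=1)
for i in range(n_pop):
    if dist[i] >= 1.0:
        pop[i] = pop[i] * rng.uniform(-0.5, 0.5)
    else:
        pop[i] = pop[i] * (1.0 + dist[i] * rng.uniform(-5e-5, 5e-5))
```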
Table 18: Comparison of ISSAFD versus the state-of-the-art in terms of the number of selected features. Datasets are in the same order as in Table 17; methods evaluated on only a subset of the datasets report fewer values (missing entries are marked "-" in the original table); each method's feature counts are listed in dataset order, followed by its W|T|L ranking.

ISSAFD (9|2|11): 5.3, 1.0, 6.2, 7.8, 7.4, 118.2, 23.8, 8.7, 3.2, 13.5, 21.7, 4.2, 12.3, 2.1, 4.3, 5.4, 4913.5, 2681.5, 5323.0, 5085.9
BDA (2|0|17): 6.0, 7.1, 5.8, 8.2, 6.1, 121.2, 25.6, 6.8, 5.5, 11.5, 20.7, 3.4, 23.0, 3.6, 4.4, 11.5, 5121.4, 2758.0, 5287.0
WOASA-1 (3|2|13): 6.0, 1.0, 6.2, 6.8, 6.0, 138.0, 26.6, 9.6, 4.4, 11.4, 19.4, 3.8, 21.6, 6.8, 5.8, 13.6
BSSA (0|0|16): 7.4, 2.0, 8.0, 10.2, 7.5, 172.7, 32.4, 13.3, 5.4, 17.3, 21.9, 7.1, 23.3, 8.0, 7.6, 16.0
bGWOA2 (2|2|14): 5.3, 4.4, 6.2, 8.8, 6.0, 124.5, 16.0, 8.7, 6.6, 9.2, 12.8, 4.8, 14.6, 6.6, 7.4, 15.3
bGWOA1 (0|0|16): 8.6, 8.4, 9.2, 9.6, 10.6, 160.6, 37.2, 12.8, 7.0, 19.6, 26.6, 8.6, 30.0, 8.8, 10.6, 21.0
S-bBOA (0|0|16): 7.6, 4.8, 5.8, 8.4, 6.8, 172.0, 32.8, 10.8, 6.4, 16.2, 17.6, 5.2, 25.0, 6.2, 5.2, 16.8
BGOA-M (5|2|11): 6.0, 7.0, 5.0, 7.0, 6.0, 36.0, 16.0, 7.0, 3.0, 7.0, 11.0, 3.0, 14.0, 3.0, 5.0, 10.0
CSSA (1|0|3): 13.2, 12.6, 19.0, 7.5
ISSA (0|0|7): 6.2, 18.9, 11.1, 5.4, 19.8, 4.5, 12.9
MEGWO (1|0|6): 4.0, 25.6, 10.6, 15.2, 16.0, 4.0, 5.4
QCSI-FS (0|0|5): 6.0, 10.0, 7.0, 8.0, 5.0
We can observe that ISSAFD provides high accuracy; however, the number of selected features remains large on 60% of the datasets in our study. Moreover, ISSAFD requires more computational time than some other algorithms, such as GA. Also, when handling high-dimensional datasets with a large number of attributes, ISSAFD needs more time due to the integration of the disruption operator. Another limitation is the non-determinism of the results: the final feature subset changes at each run because the exploration/exploitation process is based on random rules, which can be confusing for the user. In addition, FS based on ISSAFD integrates a KNN classifier, a simple classifier chosen to accelerate the execution, whereas the machine learning community often prefers more robust classifiers such as SVM, RF, or MLP, which provide better performance. Applying the disruption operator to the whole population diversifies the solutions but increases the computational time; we will address all these points in future work. A sketch of the KNN-based wrapper evaluation follows.
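As a concrete illustration of the wrapper evaluation discussed above, the sketch below scores a feature subset with a KNN classifier. The weight alpha = 0.99, k = 5 neighbors, and the 0.5 binarization threshold are common choices in wrapper-based FS and are assumptions here rather than the exact ISSAFD settings.

```python
# Hedged sketch of a KNN-based wrapper fitness for feature selection:
# fitness = alpha * error + (1 - alpha) * (#selected / #total features).
# Lower is better; alpha, k, and the 0.5 threshold are assumed values.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

def fitness(position, X, y, alpha=0.99, k=5):
    mask = position > 0.5                  # binarize a continuous position
    if not mask.any():                     # penalize an empty subset
        return 1.0
    knn = KNeighborsClassifier(n_neighbors=k)
    error = 1.0 - cross_val_score(knn, X[:, mask], y, cv=5).mean()
    return alpha * error + (1 - alpha) * mask.sum() / X.shape[1]
```

A lower fitness rewards both a low classification error and a small subset, which is consistent with the accuracy and feature-count criteria reported above.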
6. Conclusion and future works
This paper introduced an alternative feature selection approach based on improving the performance of the salp swarm algorithm (SSA), which emulates the behavior of salps. The modified version of SSA is called ISSAFD, since it uses the sine cosine algorithm (SCA) to enhance the behavior of the followers. In addition, the disruption operator (DO) is used to improve the exploration ability and ensure the diversity of the population. To assess the quality of the proposed ISSAFD, a series of experiments was performed on twenty datasets, four of which are high-dimensional with a small number of instances. The performance of ISSAFD was compared with numerous methods, including SSA, SCA, the binary grey wolf optimizer (bGWO), particle swarm optimization (PSO), ant lion optimization (ALO), and the genetic algorithm (GA), and a comparison with the state-of-the-art was also provided. The obtained results showed that ISSAFD performed better than the other FS methods in terms of performance measures including accuracy, sensitivity, specificity, and the number of selected features. Based on these encouraging results, the proposed ISSAFD can be applied in the future to different applications such as image segmentation, task scheduling, and cloud computing. Moreover, the FS problem can be formulated as a multi-objective optimization problem (MOP) based on ISSAFD.
The process aims to find a compromise between two objectives: maximizing the accuracy and reducing the number of features. The advantage of the multi-objective approach is that it generates a set of trade-off solutions rather than a single one, which makes it easier to determine the best feature subset.
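A minimal sketch of this bi-objective view: each candidate subset is scored by the pair (classification error, ratio of selected features), and Pareto dominance keeps the set of trade-off solutions; the example objective vectors are hypothetical.

```python
# Bi-objective FS sketch: minimize (error, feature_ratio). A solution "a"
# dominates "b" when it is no worse in both objectives and strictly better
# in at least one; the non-dominated set forms the Pareto front.
def dominates(a, b):
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]

# Hypothetical (error, feature ratio) pairs for four candidate subsets
cands = [(0.05, 0.40), (0.08, 0.20), (0.05, 0.55), (0.12, 0.10)]
print(pareto_front(cands))  # [(0.05, 0.40), (0.08, 0.20), (0.12, 0.10)]
```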
Acknowledgments
This work is supported by the China Postdoctoral Science Foundation under Grant No. 2019M652647.
References
Ahmed, S., Mafarja, M., Faris, H., & Aljarah, I. (2018). Feature selection using salp swarm algorithm with chaos. In Proceedings of the 2nd International Conference on Intelligent Systems, Metaheuristics & Swarm Intelligence (pp. 65-69).
Aljarah, I., Mafarja, M., Heidari, A. A., Faris, H., Zhang, Y., & Mirjalili, S. (2018). Asynchronous accelerating multi-leader salp chains for feature selection. Applied Soft Computing, 71, 964-979.
Anderson, P. A., & Bone, Q. (1980). Communication between individuals in salp chains. II. Physiology. Proc. R. Soc. Lond. B, 210, 559-574.
Arora, S., & Anand, P. (2019). Binary butterfly optimization approaches for feature selection. Expert Systems with Applications, 116, 147-160.
Baliarsingh, S. K., Vipsita, S., Muhammad, K., Dash, B., & Bakshi, S. (2019). Analysis of high-dimensional genomic data employing a novel bio-inspired algorithm. Applied Soft Computing, 77, 520-532.
Chen, Y., Li, L., Xiao, J., Yang, Y., Liang, J., & Li, T. (2018). Particle swarm optimizer with crossover operation. Engineering Applications of Artificial Intelligence, 70, 159-169.
Dong, H., Li, T., Ding, R., & Sun, J. (2018). A novel hybrid genetic algorithm with granular information for feature selection and optimization. Applied Soft Computing, 65, 33-46.
Dorigo, M., Maniezzo, V., & Colorni, A. (1996). Ant system: optimization by a colony of cooperating agents. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 26, 29-41.
Eberhart, R., & Kennedy, J. (1995). A new optimizer using particle swarm theory. In Proceedings of the Sixth International Symposium on Micro Machine and Human Science (MHS'95) (pp. 39-43). IEEE.
Elaziz, M. A., Ewees, A. A., Ibrahim, R. A., & Lu, S. (2019). Opposition-based moth-flame optimization improved by differential evolution for feature selection. Mathematics and Computers in Simulation.
Elaziz, M. A., Oliva, D., & Xiong, S. (2017a). An improved opposition-based sine cosine algorithm for global optimization. Expert Systems with Applications, 90, 484-500.
Elaziz, M. E. A., Ewees, A. A., Oliva, D., Duan, P., & Xiong, S. (2017b). A hybrid method of sine cosine algorithm and differential evolution for feature selection. In International Conference on Neural Information Processing (pp. 145-155). Springer.
Emary, E., Zawbaa, H. M., & Grosan, C. (2018). Experienced gray wolf optimization through reinforcement learning and neural networks. IEEE Transactions on Neural Networks and Learning Systems, 29, 681-694.
Emary, E., Zawbaa, H. M., & Hassanien, A. E. (2016a). Binary ant lion approaches for feature selection. Neurocomputing, 213, 54-65.
Emary, E., Zawbaa, H. M., & Hassanien, A. E. (2016b). Binary grey wolf optimization approaches for feature selection. Neurocomputing, 172, 371-381.
Faris, H., Mafarja, M. M., Heidari, A. A., Aljarah, I., Ala'M, A.-Z., Mirjalili, S., & Fujita, H. (2018). An efficient binary salp swarm algorithm with crossover scheme for feature selection problems. Knowledge-Based Systems, 154, 43-67.
Frank, A. (2010). UCI machine learning repository. http://archive.ics.uci.edu/ml
Ghimatgar, H., Kazemi, K., Helfroush, M. S., & Aarabi, A. (2018). An improved feature selection algorithm based on graph clustering and ant colony optimization. Knowledge-Based Systems, 159, 270-285.
Guyon, I., & Elisseeff, A. (2003). An introduction to variable and feature selection. Journal of Machine Learning Research, 3, 1157-1182.
Hafez, A. I., Zawbaa, H. M., Emary, E., & Hassanien, A. E. (2016). Sine cosine optimization algorithm for feature selection. In 2016 International Symposium on INnovations in Intelligent SysTems and Applications (INISTA) (pp. 1-5). IEEE.
Hancer, E., Xue, B., & Zhang, M. (2018). Differential evolution for filter feature selection based on information theory and feature ranking. Knowledge-Based Systems, 140, 103-119.
Harwit, M. (2006). Astrophysical Concepts. Springer Science & Business Media.
Hegazy, A. E., Makhlouf, M., & El-Tawel, G. S. (2018). Improved salp swarm algorithm for feature selection. Journal of King Saud University - Computer and Information Sciences.
Ibrahim, R. A., Ewees, A. A., Oliva, D., Elaziz, M. A., & Lu, S. (2019). Improved salp swarm algorithm based on particle swarm optimization for feature selection. Journal of Ambient Intelligence and Humanized Computing, 10, 3155-3169.
Ibrahim, R. A., Oliva, D., Ewees, A. A., & Lu, S. (2017). Feature selection based on improved runner-root algorithm using chaotic singer map and opposition-based learning. In International Conference on Neural Information Processing (pp. 156-166). Springer.
Khamees, M., Albakr, A. Y., & Shakher, K. (2018). A new approach for features selection based on binary slap swarm algorithm. Journal of Theoretical & Applied Information Technology, 96.
Kohavi, R., & John, G. H. (1997). Wrappers for feature subset selection. Artificial Intelligence, 97, 273-324.
Lensen, A., Xue, B., & Zhang, M. (2018). Automatically evolving difficult benchmark feature selection datasets with genetic programming. In Proceedings of the Genetic and Evolutionary Computation Conference (pp. 458-465). ACM.
Liu, H., Ding, G., & Wang, B. (2014). Bare-bones particle swarm optimization with disruption operator. Applied Mathematics and Computation, 238, 106-122.
Liu, H., & Motoda, H. (2012). Feature Selection for Knowledge Discovery and Data Mining (Vol. 454). Springer Science & Business Media.
Mafarja, M., Aljarah, I., Faris, H., Hammouri, A. I., Ala'M, A.-Z., & Mirjalili, S. (2019). Binary grasshopper optimisation algorithm approaches for feature selection problems. Expert Systems with Applications, 117, 267-286.
Mafarja, M., Aljarah, I., Heidari, A. A., Faris, H., Fournier-Viger, P., Li, X., & Mirjalili, S. (2018). Binary dragonfly optimization for feature selection using time-varying transfer functions. Knowledge-Based Systems, 161, 185-204.
Mafarja, M., Jarrar, R., Ahmad, S., & Abusnaina, A. A. (). Feature selection using binary particle swarm optimization with time varying inertia weight strategies.
Mafarja, M., & Mirjalili, S. (2018). Whale optimization approaches for wrapper feature selection. Applied Soft Computing, 62, 441-453.
Mafarja, M., & Sabar, N. R. (). Rank based binary particle swarm optimisation for feature selection in classification.
Mafarja, M. M., & Mirjalili, S. (2017). Hybrid whale optimization algorithm with simulated annealing for feature selection. Neurocomputing, 260, 302-312.
Mirjalili, S. (2015). The ant lion optimizer. Advances in Engineering Software, 83, 80-98.
Mirjalili, S. (2016). SCA: a sine cosine algorithm for solving optimization problems. Knowledge-Based Systems, 96, 120-133.
Mirjalili, S., Gandomi, A. H., Mirjalili, S. Z., Saremi, S., Faris, H., & Mirjalili, S. M. (2017). Salp swarm algorithm: A bio-inspired optimizer for engineering design problems. Advances in Engineering Software, 114, 163-191.
Moayedikia, A., Ong, K.-L., Boo, Y. L., Yeoh, W. G., & Jensen, R. (2017). Feature selection for high dimensional imbalanced class data using harmony search. Engineering Applications of Artificial Intelligence, 57, 38-49.
Rajamohana, S., & Umamaheswari, K. (2018). Hybrid approach of improved binary particle swarm optimization and shuffled frog leaping for feature selection. Computers & Electrical Engineering, 67, 497-508.
Sayed, G. I., Khoriba, G., & Haggag, M. H. (2018). A novel chaotic salp swarm algorithm for global optimization and feature selection. Applied Intelligence, 1-20.
Shunmugapriya, P., & Kanmani, S. (2017). A hybrid algorithm using ant and bee colony optimization for feature selection and classification (AC-ABC hybrid). Swarm and Evolutionary Computation, 36, 27-36.
Silva, M. A. L., de Souza, S. R., Souza, M. J. F., & de Franca Filho, M. F. (2018). Hybrid metaheuristics and multi-agent systems for solving optimization problems: A review of frameworks and a comparative analysis. Applied Soft Computing, 71, 433-459.
Sindhu, R., Ngadiran, R., Yacob, Y. M., Zahri, & Hanin, N. A. (2017). Sine-cosine algorithm for feature selection with elitism strategy and new updating mechanism. Neural Computing and Applications, 28, 2947-2958.
Talbi, E.-G. (2009). Metaheuristics: From Design to Implementation (Vol. 74). John Wiley & Sons.
Tawhid, M. A., & Dsouza, K. B. (2018). Hybrid binary bat enhanced particle swarm optimization algorithm for solving feature selection problems. Applied Computing and Informatics.
Tu, Q., Chen, X., & Liu, X. (2019). Multi-strategy ensemble grey wolf optimizer and its application to feature selection. Applied Soft Computing, 76, 16-30.
Yang, X.-S. (2013). Metaheuristic optimization: Nature-inspired algorithms and applications. In Artificial Intelligence, Evolutionary Computing and Metaheuristics (pp. 405-420). Springer.
Zakeri, A., & Hokmabadi, A. (2019). Efficient feature selection method using real-valued grasshopper optimization algorithm. Expert Systems with Applications, 119, 61-72.
Zhang, L., Mistry, K., Lim, C. P., & Neoh, S. C. (2018). Feature selection using firefly optimization for classification and regression models. Decision Support Systems, 106, 64-85.
Credit Author Statement

Author Contributions: All authors contributed equally to this work. Nabil Neggaz proposed the idea of solving the feature selection problem; he developed the code of the objective function, collected the datasets, and wrote the background. Majdi Mafarja wrote the introduction and related work. Mohamed Abd Elaziz developed part of the experiments, described the proposed approach, and wrote part of the implementation. Ahmed A. Ewees wrote the analysis of the results.

Declaration of Competing Interest

The authors declare no competing financial interests.