Comparisons of metaheuristic algorithms and fitness functions on software test data generation


Omur Sahin, Bahriye Akay*
Erciyes University, Department of Computer Engineering, 38039 Melikgazi, Kayseri, Turkey
* Corresponding author. E-mail addresses: [email protected] (O. Sahin), [email protected] (B. Akay).
Applied Soft Computing (2016), http://dx.doi.org/10.1016/j.asoc.2016.09.045


Article history: Received 29 December 2015; Received in revised form 21 July 2016; Accepted 26 September 2016; Available online xxx.


Keywords: Software testing; Test data generation; Artificial Bee Colony; Particle Swarm Optimization; Differential Evolution; Firefly algorithm; Approximation level; Branch distance; Path-based coverage; Similarity-based coverage


Abstract


The cost of testing activities is a major portion of the total cost of software. In testing, generating test data is very important because the efficiency of testing depends highly on the data used in this phase. In search-based software testing, soft computing algorithms explore test data in order to maximize a coverage metric, which can be considered an optimization problem. In this paper, we employed several meta-heuristics (Artificial Bee Colony, Particle Swarm Optimization, Differential Evolution and Firefly algorithms) and a Random Search algorithm to solve this optimization problem. First, the dependency of the algorithms on the values of their control parameters was analyzed and suitable values for the control parameters were recommended. The algorithms were then compared using various fitness functions (path-based, dissimilarity-based and approximation level + branch distance), because the fitness function affects the behaviour of the algorithms in the search space. The results showed that meta-heuristics can be used effectively for hard problems and when the search space is large. Moreover, the approximation level + branch distance based fitness function is generally a good fitness function that guides the algorithms accurately.

1. Introduction


Testing is an important phase in software production and the software life cycle: testing typically accounts for about 50% of total development costs, and releasing a reliable product depends largely on the testing activities. Of all the testing activities, test case generation involves a major portion of the labour, since it affects the efficiency of the whole software testing process and hence the software produced [1]. As software becomes more complicated, software testing becomes more challenging [2]. In order to generate test data/case artifacts that make testing more efficient and robust, a large number of tools have been proposed and used in the literature for various programming languages, frameworks and platforms [1]. Anand et al. [1] categorized the most prominent techniques into five groups:

1. symbolic execution and program structural coverage testing
2. model-based test case generation
3. combinatorial testing


4. adaptive random testing
5. search-based testing

In search-based software testing (SBST), the main goal is to explore effective test data that maximizes a coverage metric of a software structure. Random testing [3] is a widely-used and low-cost approach in which test inputs are randomly picked from valid ranges; however, its performance is poor when the inputs are subject to hard constraints. In SBST, meta-heuristic search techniques have been used for test case generation by formulating the problem as a combinatorial search problem. Harman and Jones [4] claim that search-based software testing is an emerging field and that meta-heuristics are ideally suited to software engineering when the classical software engineering problems are reformulated as search problems. While reformulating the problem, a fitness function is defined to evaluate the quality of a solution in terms of a coverage metric. Meta-heuristics make no assumptions about the problem characteristics and produce reasonable results for difficult problems that cannot be solved by analytical approaches because of high dimensionality or a problem surface with discontinuities and noise. Search-based software test data generation methods are reviewed by Harman and Jones [4], McMinn [5], Afzal et al., Räihä [6], McMinn [7], Harman et al. [8], and Harman et al. [9]. In all these reviews, it is stated that there remain plenty of open problems related to


search-based software engineering and many interesting research challenges ahead.

Various meta-heuristic algorithms have been proposed based on different natural phenomena. The Particle Swarm Optimization (PSO) algorithm [10], Differential Evolution (DE) [11], Artificial Bee Colony (ABC) [12] and the Firefly algorithm (FA) [13] are among the most popular meta-heuristic algorithms. The PSO algorithm [10], introduced by Eberhart and Kennedy in 1995, is a swarm-based meta-heuristic that models the social behaviour of bird flocking or fish schooling. The DE algorithm [11], proposed by Storn and Price for numerical optimization problems, is a population-based algorithm using the evolutionary operators crossover, mutation and selection. The Artificial Bee Colony (ABC) algorithm [12], developed in 2005 by Karaboga, mimics the foraging behaviour of honey bees and has been applied to many problems encountered in different research areas [14–16]. FA [13], presented by Yang, is based on the flashing and communication behaviour of fireflies.

In recent years, several studies on PSO, DE, ABC and FA for test data/case generation have been reported in the literature. Windisch et al. [17] combined PSO and a branch coverage criterion to generate test data and conducted a comparison between PSO and the Genetic Algorithm (GA) on a benchmark set; their results showed that PSO outperformed GA in terms of effectiveness and efficiency. Tiwari et al. [18] applied a variant of PSO to the creation of new test data for modified code in regression testing and reported that the proposed algorithm achieved better code coverage than other existing PSO algorithms on five well-known benchmark test functions. Zhu et al. [19] proposed an improved PSO algorithm that uses an adaptive inertia weight; the results showed that the proposed PSO algorithm performed better than immune genetic and basic PSO algorithms. Dahiya et al. [20] proposed a hybrid pseudo-dynamic testing approach based on PSO to generate test data for C programs using the all-path testing criterion; experiments were carried out on structural testing problems such as dynamic variables, input-dependent array indices, abstract function calls, infeasible paths and loop handling, and the technique was claimed to be robust and to produce test inputs that are not redundant. Singla et al. [21] combined GA and PSO to generate test data automatically for data flow coverage, using the dominance concept between two nodes, on a number of programs of different size and complexity; it was stated that the performance of the new technique is superior to both GA and PSO. Latiu et al. [22] performed a comparison between GA, PSO and Simulated Annealing algorithms integrated with the approximation level and branch distance metrics; the results indicated that evolutionary testing strategies are suitable for generating test data with high coverage.

Landa Becerra et al. [23] built a test data generator which employed some DE variants and a branch coverage metric for the automated test data generation problem, which can be formulated as a constrained optimization problem; DE was compared to the Breeder Genetic Algorithm and it was concluded that DE is a promising solution technique for this real-world problem. Jianfeng et al. [24] presented a DE-based combinatorial test data generator in which a selection and substitution strategy based on the degree of unfinished interactions is employed to further optimize the selected test cases. They compared the proposed approach with the Automatic Efficient Test Generator, Simulated Annealing, GA, Ant Colony Optimization, Cross-Entropy and PSO algorithms, and the results showed the competitiveness of the proposed approach in test suite size and running time. Liang et al. [25] proposed a DE algorithm based on the one-test-at-a-time strategy for test case suite generation; in the experiments, the effect of different mutations and the influence of the control parameter values were analyzed, and it was concluded that the approach was effective and improved the solutions.

Mala et al. employed the ABC algorithm for software test suite optimization and compared ABC and GA in [26], and ABC and ACO in [27]. Their fitness function was based on coverage-based test adequacy criteria, and they reported that the proposed approach required less computation time and was more scalable and effective. In [28], they parallelized the approach in order to reduce the time overhead and compared it to sequential ABC, GA and Random Testing; the results indicated that the proposed ABC-based approach converged within a smaller number of test runs. Dahiya et al. [29] employed ABC with a branch distance-based objective function for automatic test data generation in structural software testing and performed experiments on ten real-world problems with large-range input variables. It was reported that the new technique is a reasonable alternative for test data generation, but its performance deteriorated when the inputs have large ranges and the problem has many equality constraints. Lam et al. [30] presented a parallel ABC approach for the automated generation of feasible independent test paths based on the priority of the all-edge coverage criterion, achieving full test coverage with a smaller number of test runs; it was stated that the proposed approach did not get stuck in local optima and that path sequence comparison performed better than many fitness functions in the literature. Malhotra and Khari [2] proposed variants of the ABC algorithm that utilize the mutation operator of GA in the onlooker and scout bee phases in order to further improve the global search capability of the basic ABC algorithm; the proposed approach was evaluated on 10 C++ programs, and it was shown that ABC with mutation produced better results in less time. Malhotra et al. [31] applied ABC, Ant Colony Optimization and GA based on a path coverage metric, with experiments validated on 9 C++ programs; it was concluded that ABC was the most efficient due to the incorporation of parallelism and its neighbourhood production mechanism. Suri and Kaur [32] applied the ABC algorithm as a regression test data generator to find the affected portions in a program and to achieve maximum path coverage; new test cases are generated until 100% coverage is achieved. Experiments were repeated for eight examples, and the proposed approach was shown to be able to detect the paths affected by changes and to achieve 100% path coverage. Srivatsava et al. [33] optimized test case paths using FA based on an appropriate objective function and a guidance matrix for traversing the graph; their objective function employed cyclomatic complexity and a random function, and graph reduction and state-based transformation ensured the right code coverage for testing. It was shown that FA produced the optimal paths and could minimize the test effort.

One purpose of this paper is to investigate the search abilities of the PSO, DE, ABC, FA and Random Search algorithms on software test data generation benchmark problems including the triangle classifier, quadratic equation, even-odd, largest number, remainder, leap year and division of mark problems. The PSO, DE, ABC and FA algorithms are simple and practical to implement compared to many other search heuristics; in particular, the ABC algorithm has few control parameters to be tuned. In most of the studies given above, there is no experiment on control parameter sensitivity, although the algorithm-dependent parameters have a critical impact on the local and global search abilities of the algorithms and are directly related to their efficiency and efficacy. In this study, we conducted a comprehensive parametric analysis by setting different values for the control parameters of PSO, DE, ABC and FA, and we suggest values for the control parameters that yield generally good performance. Besides, designing a fitness function is another key issue that should be decided in the experiments [4]: tailoring a good fitness function helps the algorithm track the optima in the search space more accurately and quickly. In the SBST tools based


on the meta-heuristics summarized above, code/statement coverage [18], path-based criteria [26–28,20,2,31–33], edge coverage [30], data flow coverage [21], branch coverage [17,23], branch distance [19,29] and approximation level + branch distance [22] have been used individually as the objective/fitness function of the algorithms. Because the fitness function guides the solutions during the search, the performance of the algorithms should be investigated in the fitness landscapes induced by different fitness functions. The approximation level enables the comparison of individuals that miss the target in different branching nodes and hence take different paths through the program [34]. In this study, the PSO, DE, ABC and FA algorithms were analyzed with path-based, dissimilarity-based and approximation level + branch distance based fitness functions to find out the effect of the fitness function. In addition to success rates, a runtime analysis is also provided in the experiments.

The remainder of the paper is organized as follows. The second section formulates search-based software test data generation. The next section provides a brief overview of the PSO, DE, ABC and FA algorithms. Section 4 reports the experimental results of the meta-heuristics on SBST. Finally, Section 5 is dedicated to the discussion and conclusion.


2. Search-based software test data generation


Software testing is an essential task in software production in order to satisfy all of the client's requirements, and about 50% of the total cost is devoted to software testing. A test plan is provided to give information about all testing activities in the different stages. In software testing, the main goals are to decide the issues given below [8]:

1. the smallest set of test cases that provides maximum coverage
2. the architectural structure of the system
3. the requirements that both minimize software development cost and maximize customer satisfaction
4. the optimum allocation of resources to be used in software production
5. the sequence of refactoring steps to apply to the system

Selecting test data that provides high coverage can be done manually or automatically. For large software, producing test data manually to validate the program is very difficult, not scalable, impractical, time consuming and inefficient. Therefore, automated test data generator tools are needed to reduce the cost and improve the quality of testing. In the literature, different test data generation architectures have been used [35]. In most of the studies, the source code is transformed into a control flow graph (CFG) (Fig. 1), which is a graphical representation of the program sequence. A CFG is defined as a directed graph G = (N, E, s, e) in which N is the set of nodes, each node in N corresponds to a statement, and each edge in E corresponds to the flow of control between nodes and is labelled with a predicate. A CFG has a unique entry node (s), at which the program starts to execute, and an exit node (e), at which it terminates [5].

Fig. 1. An example of a CFG graph.

Finding an input that drives the software through the branch predicates in a CFG can be considered an optimization problem in which the aim is to maximize a coverage criterion. For this purpose, SBST techniques, which use search-based optimization algorithms guided by a fitness function, have received the attention of researchers in recent years [5,7,1]. A generic search-based test generation scheme constructed by Harman is given in Fig. 2. In all SBST techniques, first an appropriate representation for the problem is adopted to encode the solutions that correspond to test data. The representation depends on the type of the test data, such as real, binary, integer, string or symbolic values. The

representation affects the type of the search operators used in the search-based algorithm. A more important part of the SBST approach is defining a fitness function that assesses the quality of the solutions produced by the meta-heuristic. The fitness function can be structural (covering branches, paths, statements), functional (covering scenarios), temporal (finding worst/best case execution times), and so on. Once a fitness function is defined, the approach can be used for a wide range of different problems.

Fig. 2. A generic search-based test generation scheme [36].

3. Brief explanations of the meta-heuristic algorithms in the study

Meta-heuristic algorithms are global optimization tools that generally model a natural phenomenon. Due to their stochastic nature, they produce good approximate solutions in reasonably limited computation times. Especially when the problems are incomplete, noisy or non-continuous, meta-heuristics are preferred to analytical approaches, which make many assumptions about the problem surface. Particle Swarm Optimization (PSO) [10], Differential Evolution (DE) [11], Artificial Bee Colony (ABC) [12] and the Firefly Algorithm (FA) [13] are some examples of recent and efficient meta-heuristic algorithms.

Meta-heuristic algorithms initially generate a population of solutions. Each solution is a test input to the software under test: if the number of variables in the software under test is n, then the solution (test input) is an n-dimensional vector. Each element in the vector corresponds to a software variable, which can be integer, binary, categorical, string, etc., so a meta-heuristic algorithm should use an appropriate mechanism to handle different types of variables. The initial population is generated within an acceptable range that can be different for each variable, and the solutions generated may be required to remain in that range. Each solution (test input) is evaluated in the cost function, which substitutes the test data into the software or CFG and measures the efficiency of the test data based on a criterion such as coverage of the CFG. For example, for a program with three integer parameters a solution is a three-dimensional integer vector; if the acceptable range for each parameter is [−100,100], an algorithm initially generates random 3-dimensional solutions within the range [−100,100], and each 3-dimensional solution is evaluated in the cost function to calculate how much coverage is obtained using the solution (test input). Brief descriptions of the PSO, DE, ABC and FA algorithms are given in the following sections.
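As an illustration of this encoding, the following minimal Python sketch scores a one-dimensional integer test input by the fraction of CFG nodes it leaves uncovered. The instrumented program, the node labels and this particular fitness definition are illustrative assumptions, not the implementation used in the paper.

    import random

    NODES = {"s", "n_even", "n_odd", "e"}

    def even_odd(x, trace):
        # Toy instrumented program under test; trace records the CFG nodes executed.
        trace.add("s")
        if x % 2 == 0:
            trace.add("n_even")
        else:
            trace.add("n_odd")
        trace.add("e")

    def fitness(solution):
        # Cost to be minimized: fraction of CFG nodes NOT covered by this test input.
        trace = set()
        even_odd(solution[0], trace)
        return 1.0 - len(trace) / len(NODES)

    # Each solution is an n-dimensional integer vector within the variable range.
    population = [[random.randint(-100, 100)] for _ in range(10)]
    best = min(population, key=fitness)
    print(best, fitness(best))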

3.1. Particle Swarm Optimization

In the PSO algorithm [10], which models the collective behaviour of bird flocking or fish schooling, a swarm of particles change their positions in the search space depending on their previous experience and the swarm's experience in order to find the global optimum. The main steps of the algorithm are given in Algorithm 1.

Algorithm 1. Main steps of the PSO algorithm
1: Initialize the population
2: repeat
3:   Evaluate the fitness of each particle
4:   Update the best experience of each particle, p(t)
5:   Choose the best particle, g(t)
6:   Calculate the velocities of the particles
7:   Update the positions of the particles
8: until requirements are met

In the initialization step, the locations of all particles are generated randomly by Eq. (1):

x_{ij} = x_j^{min} + rand(0, 1) (x_j^{max} − x_j^{min})    (1)

where x_i is the position of the ith particle, i = 1...SS, SS is the swarm size, j = 1...D and D is the dimension of the problem. In the loop of the PSO algorithm, the cost function is evaluated for each particle's position. The position of each particle is updated using Eq. (2):

x(t + 1) = x(t) + v(t + 1)    (2)

where v(t) is the velocity at time t. Velocities are calculated by Eq. (3) using the best-so-far position of the population (global best, g(t)) and the best-so-far position of the particle (p(t)):

v(t + 1) = w v(t) + c1 rand(0, 1) (p(t) − x(t)) + c2 rand(0, 1) (g(t) − x(t))    (3)

where c1 is the cognitive component that weights the difference between the current particle x(t) and p(t), c2 is the social component that weights the difference between the current particle x(t) and g(t), and w is the inertia weight that controls the magnitude of the old velocity v(t).
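A minimal sketch of these update rules for a single particle, assuming a cost function is being minimized; the default parameter values follow the recommendations later derived in Table 5, and the list-based vector handling is illustrative rather than the paper's exact implementation.

    import random

    def pso_step(x, v, pbest, gbest, w=0.8, c1=1.4, c2=1.8, xmin=-100, xmax=100):
        # One velocity and position update for a single particle (Eqs. (2)-(3)).
        for j in range(len(x)):
            v[j] = (w * v[j]
                    + c1 * random.random() * (pbest[j] - x[j])
                    + c2 * random.random() * (gbest[j] - x[j]))
            x[j] = min(max(x[j] + v[j], xmin), xmax)  # keep the particle inside the range
        return x, v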

3.2. Differential evolution algorithm

The DE algorithm [11] is a global optimization algorithm using the evolutionary operators crossover, mutation and selection. The main steps of the algorithm are given in Algorithm 2.

Algorithm 2. Main steps of the DE algorithm
1: Initialize the population
2: Evaluation
3: repeat
4:   Mutation
5:   Crossover
6:   Evaluation
7:   Selection
8: until requirements are met

In the initialization step, a population is formed by generating solutions randomly using Eq. (1). For each solution in the population, a mutant solution x̂_i is produced by the mutation operator defined by Eq. (4):

x̂_i = x_{r1} + F (x_{r3} − x_{r2})    (4)

where F is the scaling factor within the range [0,1], and the solution vectors x_{r1}, x_{r2} and x_{r3} are randomly chosen and must satisfy Eq. (5):

x_{r1}, x_{r2}, x_{r3} | r1 ≠ r2 ≠ r3 ≠ i    (5)

where i is the index of the current solution. A trial vector is constructed by shuffling the information contained in the mutant vector and the current solution using the crossover operation defined by Eq. (6):

y_i^j = x̂_i^j if R_j ≤ CR,  y_i^j = x_i^j if R_j > CR    (6)

where CR is the crossover rate, R_j is a random real number within the range [0,1] and j denotes the jth parameter. In the selection phase, each trial vector y_i competes with its parent x_i and the better one is retained in the population.
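A minimal sketch of one mutation/crossover/selection cycle for a single solution, assuming a cost function to be minimized; F and CR default to the values recommended in Table 5, and the list-based representation (with a population of at least four solutions) is an illustrative assumption.

    import random

    def de_trial(pop, i, cost, F=1.0, CR=0.9):
        # Mutation (Eq. (4)), crossover (Eq. (6)) and greedy selection for solution i.
        D = len(pop[i])
        r1, r2, r3 = random.sample([k for k in range(len(pop)) if k != i], 3)
        mutant = [pop[r1][j] + F * (pop[r3][j] - pop[r2][j]) for j in range(D)]
        trial = [mutant[j] if random.random() <= CR else pop[i][j] for j in range(D)]
        return trial if cost(trial) <= cost(pop[i]) else pop[i]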



3.3. Artificial Bee Colony algorithm

The Artificial Bee Colony algorithm [12] is an optimization tool that simulates the foraging behaviour of honey bees. In nature, three groups of bees play a role in the foraging task: employed bees, onlooker bees and scout bees. Employed bees are assigned to food sources; they search the food source site and bring the nectar of the source they are currently exploiting to the hive. They also share information with the other bees in the hive about the food source location and quality. Onlookers are not assigned to a certain food source but wait in the hive and decide which source to fly to using the information obtained from the employed bees. Scout bees search for undiscovered sources in the environment based on internal instinct or external clues. All these tasks are performed collectively but without supervision. The main phases of the algorithm are given step-by-step in Algorithm 3.

Algorithm 3. Main steps of the ABC algorithm
1: Initialization
2: Evaluation
3: repeat
4:   Employed bee phase
5:   Onlooker bee phase
6:   Scout bee phase
7:   Memorize the best solution achieved so far
8: until a termination criterion is satisfied

In the ABC algorithm, each food source position corresponds to a solution, and the fitness of the solution is the nectar of the source. In the initialization phase, a food source population is generated randomly by Eq. (1). The nectar amount of each source is evaluated in the cost function and a fitness value is assigned to each. In the employed bee phase, a local search is performed around each food source by Eq. (7):

x'_{ij} = x_{ij} + φ_{ij} (x_{ij} − x_{kj})    (7)

where k ∈ [1, CS] is a uniform random index, CS is the number of food sources, j ∈ [1, D] is a uniform random index and D is the dimension of the problem. If the solution produced (x'_i) is better than the solution in the bee's memory (x_i), the old solution is replaced with the new one by a greedy approach. The employed bees share their experiences about the source quality and locations, and onlookers are recruited to potentially high quality solutions. In the ABC algorithm, this is achieved by assigning each source a probability proportional to the fitness of the solution (Eq. (8)):

p_i = fitness_i / Σ_{n=1}^{CS} fitness_n    (8)

Each onlooker bee flies to a source chosen by a roulette-wheel-like selection. A local search is carried out by Eq. (7), and a greedy selection is performed to retain the better solution in the population, as in the employed bee phase. In nature, a source is abandoned by its bee when it is exhausted. In the ABC algorithm, if a solution cannot be improved over a number of cycles ("limit"), that source is assumed to be exhausted and its bee starts to work as a scout bee. A scout bee finds a new random solution, produced by Eq. (1), to replace the exhausted source.
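A minimal sketch of the neighbour search of Eq. (7) and the probabilistic onlooker selection of Eq. (8), assuming a cost function to be minimized; the choice of φ_ij as a uniform random number in [−1, 1] is an assumption, since the range is not stated explicitly here.

    import random

    def abc_neighbour(foods, i, cost, xmin=-100, xmax=100):
        # Local search around food source i (Eq. (7)) followed by greedy replacement.
        D = len(foods[i])
        k = random.choice([s for s in range(len(foods)) if s != i])
        j = random.randrange(D)
        phi = random.uniform(-1, 1)  # assumed range for phi_ij
        candidate = foods[i][:]
        candidate[j] = min(max(foods[i][j] + phi * (foods[i][j] - foods[k][j]), xmin), xmax)
        return candidate if cost(candidate) <= cost(foods[i]) else foods[i]

    def onlooker_choice(fitnesses):
        # Roulette-wheel-like selection with the probabilities of Eq. (8).
        return random.choices(range(len(fitnesses)), weights=fitnesses, k=1)[0]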

3.4. Firefly algorithm

FA [13] simulates the flashing patterns and bioluminescence of fireflies. A firefly attracts other fireflies in proportion to its brightness and the distance between them; a less bright firefly moves towards a brighter one. The main steps of FA are given in Algorithm 4.

Algorithm 4. Main steps of FA
1: Initialization
2: Evaluation
3: repeat
4:   Calculate the brightness of all fireflies
5:   Attraction with brighter fireflies
6:   Move in the search space
7:   Evaluate new fireflies and update light intensity
8: until a termination criterion is satisfied

In FA, each solution corresponds to the position of a firefly. In the initialization phase, all fireflies are randomly located using Eq. (1). The brightness of a firefly is determined by evaluating the objective function. A firefly's attractiveness (β) is determined by the distance r through Eq. (9):

β = β_0 e^{−γ r^2}    (9)

where β_0 is the attractiveness at r = 0, the distance r is the Cartesian distance between any two fireflies i and j, and γ characterizes the attractiveness variation and affects the speed of convergence. The movement of a firefly i towards a more attractive firefly j in the search space is defined by Eq. (10):

x_i^{t+1} = x_i^t + β_0 e^{−γ r_{ij}^2} (x_j^t − x_i^t) + α_t ε_i^t    (10)

where α_t is a randomization parameter and ε_i^t is a vector of random numbers.
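A minimal sketch of the movement rule in Eqs. (9)-(10); the default parameter values follow the recommendations in Table 5, and the uniform perturbation in [−0.5, 0.5] is an assumption, since the distribution of the random vector is not specified here.

    import math
    import random

    def fa_move(xi, xj, beta0=0.2, gamma=7.0, alpha=0.3, xmin=-100, xmax=100):
        # Move firefly i towards a brighter firefly j (Eqs. (9)-(10)).
        r2 = sum((a - b) ** 2 for a, b in zip(xi, xj))   # squared Cartesian distance
        beta = beta0 * math.exp(-gamma * r2)             # attractiveness, Eq. (9)
        new_position = []
        for a, b in zip(xi, xj):
            step = beta * (b - a) + alpha * random.uniform(-0.5, 0.5)  # Eq. (10)
            new_position.append(min(max(a + step, xmin), xmax))
        return new_position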

4. Experiments and results

In this study, four different experiments were performed using the DE, PSO, ABC, FA and Random Search algorithms on seven benchmark software problems. The first experiment aimed to point out the sensitivities of the algorithms to the control parameter values and to find appropriate values for software test problems. In the second experiment, a comparison among the meta-heuristics considered in the paper was presented using the control parameter values determined in the first experiment. In addition to the performance analysis in terms of the coverage metric, a runtime comparison of the algorithms was provided, because the time efficiency of the algorithms is another important metric. Since the fitness function guides the algorithms in the search space, the efficiency of fitness functions defined on different coverage criteria was analyzed in the third experiment. In the fourth experiment, the results of the Random Search algorithm were presented and compared to the results obtained by the meta-heuristic algorithms.

4.1. Experimental setup

In the experiments, an architecture based on concrete/actual value execution (Fig. 3), consisting of program analyzer, path selector and test data generator parts, was employed. The program analyzer is given the source code, and an appropriate representation (control flow graph, data dependence graph, program dependence graph, etc.) is generated to be used in the subsequent parts. The path selector produces test paths for the test data generator, and the test data generator outputs path information for the given test paths. The path selector uses this path information to regenerate paths until a coverage criterion is satisfied. Seven benchmark problems were employed, including the triangle, even-odd, quadratic equation, largest number, remainder, leap year and division of mark problems.



Table 2
Tracey's branch distance function for relational predicates [38].

Relational predicate    Objective function
Boolean                 if TRUE then 0, else K
a = b                   if abs(a − b) = 0 then 0, else abs(a − b) + K
a ≠ b                   if abs(a − b) ≠ 0 then 0, else K
a < b                   if a < b then 0, else (a − b) + K
a ≤ b                   if a ≤ b then 0, else (a − b) + K
a > b                   if a > b then 0, else (b − a) + K
a ≥ b                   if a ≥ b then 0, else (b − a) + K

Fig. 3. Architecture of test data generation used in the experiments.


466 467 468 469 470 471 472 473 474 475

• Triangle: The program checks whether a triangle is formed using the sides given. If a triangle is formed, then the program classifies the type of triangle as isosceles, equilateral or scalene. • Quadratic Equation: The program checks whether three inputs can form a quadratic equation or not. If a quadratic equation is formed, then the roots of equation are found. • Even-Odd: The program checks whether a number input is even or odd. • Largest Number: The program finds the largest number among three input numbers. • Remainder: The program checks if the remainder is zero or nonzero after dividing the dividend and divisor inputs. • Leap Year: The program checks whether a given year is a leap year or not. • Mark: The program calculates the average of the marks of a student and assigns a letter from five possible categories: A, B, C, D, E.

Fig. 4. An example CFG for path-based fitness.

The characteristics of each problem are presented in Table 1. In all of the problems, solutions were encoded using an integer representation. Experiments were carried out on a platform with an Intel Xeon X5660 2.80 GHz processor and 16 GB of RAM. All code fragments of the problems in Table 1 were implemented in the Python programming language, and the meta-heuristics were implemented in the MATLAB environment. In this study, the fitness function was considered as the objective function to be minimized, and we analyzed the performance of the algorithms designed using various fitness functions: approximation level + branch distance, a path-based fitness function and a dissimilarity-based fitness function [37].

Approximation level + branch distance (Eq. (11)) combines the normalized branch distance (Eq. (12)) and the approximation level:

fitness_ALBD = approximation level + normalize(branch distance)    (11)

normalize(branch distance) = 1 − 1.001^{−branch distance}    (12)

The approximation level is the number of branching nodes of the target branch that were not included in the path constructed using the supplied inputs. The branch distance is calculated from the values at the predicate using Tracey's table [38], given in Table 2, and indicates how close the relational predicate is to the desired branch. In Table 2, K is a positive constant which is added when the if clause is false.
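A minimal Python sketch of how Table 2 and Eqs. (11)-(12) can be combined; the predicate encoding and the value of K are illustrative assumptions.

    K = 1.0

    def branch_distance(op, a, b):
        # Tracey's branch distance for a relational predicate "a op b" (Table 2).
        if op == "==":
            return 0.0 if a == b else abs(a - b) + K
        if op == "!=":
            return 0.0 if a != b else K
        if op == "<":
            return 0.0 if a < b else (a - b) + K
        if op == "<=":
            return 0.0 if a <= b else (a - b) + K
        if op == ">":
            return 0.0 if a > b else (b - a) + K
        if op == ">=":
            return 0.0 if a >= b else (b - a) + K
        raise ValueError(op)

    def normalize(dist):
        # Eq. (12): maps the branch distance into [0, 1).
        return 1.0 - 1.001 ** (-dist)

    def fitness_albd(approximation_level, op, a, b):
        # Eq. (11): approximation level plus the normalized branch distance.
        return approximation_level + normalize(branch_distance(op, a, b))

    print(fitness_albd(2, "==", 7, 10))  # two branching nodes away, distance abs(7-10)+K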

The path-based fitness function is calculated using Eq. (13):

fitness_path-based = 1 − |α ∧ β| / |α ∪ β|    (13)

where α and β are the sets of nodes in the target path and the executed path, and |α ∧ β| is the number of matched nodes in correct order between α and β. The path-based fitness for the CFG given in Fig. 4 is 3/6 = 0.5, because the target path set (α) contains the nodes {s, 1, 3, 5, 8, e} and the executed path set (β) contains the nodes {s, 1, 3, 6, 7, e}: the fitness value is based on the ratio between the number of nodes matched in correct order, {s, 1, 3}, and the number of all nodes in the target path, {s, 1, 3, 5, 8, e}.

The dissimilarity-based fitness function is calculated by Eq. (14):

fitness_dissimilarity-based = |α ⊕ β| / |α ∪ β|    (14)

where |α ⊕ β| denotes the cardinality of the symmetric difference between the sets α and β, and this cardinality is normalized by the cardinality of α ∪ β [37].
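A minimal sketch of the two path-comparison fitness functions. Note that it normalizes the matched-in-order count by the length of the target path, which reproduces the worked example above (3/6 = 0.5), whereas Eq. (13) as printed divides by |α ∪ β|; the exact denominator is therefore an interpretation.

    def fitness_path_based(target, executed):
        # Eq. (13): one minus the ratio of nodes matched in correct order,
        # normalized here by the number of nodes in the target path (Fig. 4 example).
        matched = 0
        for t, e in zip(target, executed):
            if t != e:
                break
            matched += 1
        return 1.0 - matched / len(target)

    def fitness_dissimilarity(target, executed):
        # Eq. (14): symmetric difference of the node sets, normalized by their union.
        a, b = set(target), set(executed)
        return len(a ^ b) / len(a | b)

    target   = ["s", "1", "3", "5", "8", "e"]
    executed = ["s", "1", "3", "6", "7", "e"]
    print(fitness_path_based(target, executed))     # 0.5
    print(fitness_dissimilarity(target, executed))  # 0.5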

Table 1
Properties of the benchmark problems used in the experiments.

    Program title         Line count   Cyclomatic complexity   CFG node size   No. of variables
 1  Triangle              23           9                       23              3
 2  Even Odd              6            2                       5               1
 3  Largest Number        11           4                       8               3
 4  Leap Year             7            4                       8               1
 5  Quadratic Equation    15           4                       12              3
 6  Remainder             7            3                       6               2
 7  Mark                  19           11                      22              3




Table 3
Values tested for the control parameters of the algorithms; NP is the number of food sources and D is the dimension of the problem.

ABC:  Population size: 10, 20, 30;  limit = {NP * D * 0.2, NP * D * 0.4, NP * D * 0.6, NP * D * 0.8, NP * D}
PSO:  w = {0.4, 0.5, 0.6, ..., 1};  c1 = {1, 1.2, 1.4, ..., 2};  c2 = {1, 1.2, 1.4, ..., 2}
DE:   F = {0.5, 0.6, 0.7, ..., 1};  CR = {0.3, 0.4, 0.5, ..., 1}
FA:   α = {0.1, 0.2, 0.3, ..., 1};  βmin = {0.1, 0.2, 0.3, ..., 1};  γ = {0.5, 1, 1.5, ..., 10}

Table 4
Statistical results of the coverage produced by the algorithms run with different configurations. Mean ± standard deviation are reported; median values are given in parentheses.

Program title   ABC                           PSO                           DE                            FA
Triangle        99.2323 ± 2.5547 (100.0000)   98.2134 ± 3.4501 (100.0000)   99.4395 ± 2.2358 (100.0000)   98.9559 ± 2.6462 (100.0000)
EvenOdd         100.0000 ± 0.0000 (100.0000)  99.6638 ± 2.8796 (100.0000)   100.0000 ± 0.0000 (100.0000)  99.9817 ± 0.6768 (100.0000)
Largest         100.0000 ± 0.0000 (100.0000)  99.9259 ± 0.9911 (100.0000)   99.9660 ± 0.6720 (100.0000)   99.9759 ± 0.5721 (100.0000)
Leap            100.0000 ± 0.0000 (100.0000)  94.4036 ± 8.3944 (100.0000)   88.6406 ± 8.8043 (81.8182)    88.1762 ± 9.0554 (81.8182)
Mark            100.0000 ± 0.0000 (100.0000)  95.4925 ± 7.8380 (100.0000)   98.5537 ± 3.6959 (100.0000)   98.8232 ± 3.6508 (100.0000)
Quadratic       96.5556 ± 4.7572 (100.0000)   97.6459 ± 4.2499 (100.0000)   90.0949 ± 6.8162 (90.0000)    81.0588 ± 3.1234 (80.0000)
Remainder       100.0000 ± 0.0000 (100.0000)  100.0000 ± 0.0000 (100.0000)  99.9558 ± 1.0971 (100.0000)   89.9018 ± 13.1694 (100.0000)

4.2. Experiment 1: parameter tuning of the meta-heuristics

In the first part of the experiments, the parameter sensitivity of the meta-heuristic algorithms was analyzed, because the performance of the algorithms is influenced by the values of the control parameters. Common control parameters of all the algorithms are the population size and the maximum number of generations. All the meta-heuristic algorithms used the same maximum number of evaluations (5000) as the termination criterion, and the search space was bounded by the range [−100,100]. We analyzed the values 10, 20 and 30 for the population size of all algorithms. For the ABC algorithm, the experiments were repeated for five different values of the parameter limit, so 15 different configurations were tested. For the PSO algorithm, the control parameters to be tuned are the population size, the inertia weight (w), the cognitive component (c1) and the social component (c2); the experiments were repeated with 756 different parameter configurations. For the DE algorithm, the crossover rate (CR) and the scaling factor (F) are the control parameters, and 144 different parameter settings were tested. For FA, there are three control parameters, α, β and γ, and 6000 different settings were tried. The values tested for the control parameters are given in Table 3. Each algorithm was run 30 times for each configuration of the parameters. Statistics of all coverage results are reported in Table 4, and the recommended values for the control parameters are presented in Table 5. In Table 4, each cell reports the mean and standard deviation of the coverage, with the median given in parentheses.

Table 5
The recommended values for the control parameters of each algorithm.

ABC:  limit: NP * D;  Max. Cyc.: 500;  NP: 10
PSO:  c1: 1.4;  c2: 1.8;  w: 0.8;  Max. Cyc.: 167;  NP: 30
DE:   CR: 0.9;  F: 1;  Max. Cyc.: 250;  NP: 20
FA:   α: 0.3;  β: 0.2;  γ: 7;  Max. Cyc.: 167;  NP: 30

When the results in Table 4 are investigated, it can be seen that the ABC algorithm had the lowest standard deviations and produced stable results for different control parameter values. It should be noted that the ABC algorithm has fewer control parameters and continues to produce stable results for different values of the control parameter limit, so designers do not have to fine-tune ABC. Based on the median values, the ABC and PSO algorithms produced the best results. All the algorithms except ABC were sensitive to the values of the control parameters on the software test data generation problems. Based on all the results, the values in Table 5 can be recommended for the control parameters of each algorithm.

4.3. Experiment 2: comparison of meta-heuristics

In order to observe the behaviour of the algorithms in a small and a large search space, the experiments were repeated for search spaces bounded by the ranges [−100,100] and [−1000,1000]. For each algorithm and problem we report the mean and standard deviation, the median and the rank of the algorithm, together with the number of evaluations and the CPU time (in seconds) needed to converge to the maximum coverage. The results produced by the algorithms are presented in Tables 6-8 for the path-based, dissimilarity-based and approximation level + branch distance based fitness functions, respectively.




Table 6 The results of the algorithms designed using path-based fitness function. Mean ∓ standard deviation are reported. Median values are given in parenthesis and ranks are presented next to the median. Program title

Range [−100,100]

EVENODD ABC

LARGEST LEAP MARK QUADRATIC REMAINDER TRIANGLE EVENODD

PSO

LARGEST LEAP MARK QUADRATIC REMAINDER TRIANGLE EVENODD

DE

LARGEST LEAP MARK QUADRATIC REMAINDER TRIANGLE EVENODD

FA

LARGEST LEAP MARK QUADRATIC REMAINDER TRIANGLE

554 555 556 557 558 559 560 561 562 563

[−1000,1000]

Coverage

Evaluation

Time (s)

Coverage

Evaluation

Time (s)

100.0000 ∓ 0.0000 1 (100.0000) 100.0000 ∓ 0.0000 1 (100.0000) 100.0000 ∓ 0.0000 1 (100.0000) 100.0000 ∓ 0.0000 1 (100.0000) 98.0000 ∓ 4.0684 (100.0000) 2 100.0000 ∓ 0.0000 1 (100.0000) 98.9899 ∓ 2.2973 1 (100.0000)

2.0000 ∓ 0.0000 (2.0000) 6.3333 ∓ 3.7905 (5.0000) 37.1667 ∓ 29.5087 (29.5000) 51.5000 ∓ 24.8662 (51.0000) 308.3333 ∓ 168.3505 (304.5000) 3.4000 ∓ 2.4155 (2.0000) 378.8667 ∓ 205.9164 (354.5000)

0.0242 ∓ 0.0055 (0.0216) 0.0586 ∓ 0.0110 (0.0554) 0.1272 ∓ 0.0805 (0.1059) 0.2281 ∓ 0.0762 (0.2231) 0.9380 ∓ 0.4855 (0.9304) 0.0284 ∓ 0.0066 (0.0255) 1.2005 ∓ 0.6246 (1.1221)

100.0000 ∓ 0.0000 (100.0000) 1 100.0000 ∓ 0.0000 (100.0000) 1 100.0000 ∓ 0.0000 (100.0000) 1 97.6000 ∓ 3.7287 (100.0000) 3 87.6667 ∓ 5.0401 (90.0000) 2 100.0000 ∓ 0.0000 (100.0000) 1 97.3737 ∓ 3.0546 (100.0000) 1

2.0000 ∓ 0.0000 (2.0000) 7.1000 ∓ 4.9713 (5.0000) 95.2333 ∓ 90.4394 (63.5000) 727.4333 ∓ 194.9519 (743.5000) 817.2000 ∓ 160.1194 (835.0000) 4.9000 ∓ 2.7082 (4.5000) 579.6000 ∓ 237.9782 (602.0000)

0.0246 ∓ 0.0052 (0.0232) 0.0592 ∓ 0.0136 (0.0544) 0.2841 ∓ 0.2429 (0.1966) 2.2659 ∓ 0.5912 (2.3202) 2.4193 ∓ 0.4690 (2.4649) 0.0342 ∓ 0.0105 (0.0312) 1.8155 ∓ 0.7297 (1.8957)

100.0000 ∓ 0.0000 1 (100.0000) 100.0000 ∓ 0.0000 1 (100.0000) 98.1818 ∓ 5.5478 2 (100.0000) 100.0000 ∓ 0.0000 1 (100.0000) 99.0000 ∓ 3.0513 1 (100.0000) 100.0000 ∓ 0.0000 (100.0000) 1 90.3030 ∓ 7.8940 (93.9394) 3

2.0000 ∓ 0.0000 (2.0000) 4.0000 ∓ 0.0000 (4.0000) 23.0333 ∓ 49.6515 (5.0000) 7.7667 ∓ 1.0400 (8.0000) 69.3667 ∓ 49.8235 (55.0000) 2.0333 ∓ 0.1826 (2.0000) 238.9667 ∓ 88.8848 (250.0000)

0.0275 ∓ 0.0028 (0.0267) 0.0629 ∓ 0.0081 (0.0605) 0.1925 ∓ 0.3731 (0.0581) 0.1310 ∓ 0.0123 (0.1288) 0.6023 ∓ 0.4031 (0.4865) 0.0319 ∓ 0.0047 (0.0303) 2.0965 ∓ 0.7661 (2.1979)

100.0000 ∓ 0.0000 (100.0000) 1 100.0000 ∓ 0.0000 (100.0000) 1 99.3939 ∓ 3.3195 (100.0000) 2 99.7333 ∓ 1.4606 (100.0000) 1 97.3333 ∓ 4.4978 (100.0000) 1 100.0000 ∓ 0.0000 (100.0000) 1 85.8586 ∓ 8.3211 (87.8788) 3

2.0000 ∓ 0.0000 (2.0000) 4.0000 ∓ 0.0000 (4.0000) 16.4333 ∓ 30.3505 (8.5000) 263.2333 ∓ 57.8045 (257.5000) 102.8333 ∓ 61.2553 (97.5000) 3.3000 ∓ 1.8782 (3.0000) 293.8000 ∓ 67.2250 (336.0000)

0.0278 ∓ 0.0010 (0.0281) 0.0606 ∓ 0.0021 (0.0602) 0.1446 ∓ 0.2285 (0.0856) 2.2954 ∓ 0.4912 (2.2511) 0.8754 ∓ 0.4956 (0.8274) 0.0394 ∓ 0.0132 (0.0366) 2.5954 ∓ 0.5854 (2.9297)

100.0000 ∓ 0.0000 1 (100.0000) 100.0000 ∓ 0.0000 (100.0000) 1 89.0909 ∓ 9.0595 (81.8182) 4 100.0000 ∓ 0.0000 1 (100.0000) 94.3333 ∓ 5.0401 3 (90.0000) 100.0000 ∓ 0.0000 1 (100.0000) 97.7778 ∓ 2.9705 2 (100.0000)

2.0000 ∓ 0.0000 (2.0000) 4.0000 ∓ 0.0000 (4.0000) 153.3000 ∓ 122.9598 (252.0000) 7.9000 ∓ 1.5166 (8.0000) 227.6667 ∓ 70.9231 (253.5000) 2.1000 ∓ 0.5477 (2.0000) 210.5333 ∓ 136.7330 (260.0000)

0.0166 ∓ 0.0040 (0.0154) 0.0412 ∓ 0.0149 (0.0369) 0.8502 ∓ 0.6726 (1.3818) 0.0975 ∓ 0.0124 (0.0972) 1.3792 ∓ 0.4176 (1.5332) 0.0207 ∓ 0.0055 (0.0184) 1.3523 ∓ 0.8594 (1.6736)

100.0000 ∓ 0.0000 (100.0000) 1 100.0000 ∓ 0.0000 (100.0000) 1 98.7879 ∓ 4.6129 (100.0000) 3 98.4000 ∓ 3.2547 (100.0000) 2 86.3333 ∓ 4.9013 (90.0000) 3 100.0000 ∓ 0.0000 (100.0000) 1 92.8283 ∓ 6.0858 (93.9394) 2

2.0000 ∓ 0.0000 (2.0000) 4.0333 ∓ 0.1826 (4.0000) 30.2667 ∓ 61.2929 (13.0000) 239.9000 ∓ 103.5066 (197.5000) 423.3667 ∓ 84.7981 (445.5000) 8.9333 ∓ 9.8679 (5.0000) 359.0667 ∓ 108.1359 (338.5000)

0.0167 ∓ 0.0011 (0.0169) 0.0385 ∓ 0.0051 (0.0364) 0.1787 ∓ 0.3370 (0.0827) 1.5207 ∓ 0.6633 (1.2433) 2.5285 ∓ 0.4966 (2.6609) 0.0594 ∓ 0.0522 (0.0393) 2.2677 ∓ 0.6744 (2.1487)

100.0000 ∓ 0.0000 1 (100.0000) 100.0000 ∓ 0.0000 1 (100.0000) 95.7576 ∓ 7.8215 3 (100.0000) 99.2000 ∓ 2.4410 2 (100.0000) 81.0000 ∓ 3.0513 4 (80.0000) 93.6364 ∓ 11.7323 2 (100.0000) 81.7172 ∓ 3.5126 4 (83.3333)

2.0000 ∓ 0.0000 (2.0000) 4.0000 ∓ 0.0000 (4.0000) 43.5667 ∓ 70.4237 (5.0000) 31.7667 ∓ 49.3396 (16.0000) 319.4333 ∓ 50.5499 (336.0000) 40.7333 ∓ 71.4104 (2.0000) 341.6000 ∓ 30.2958 (336.0000)

0.0199 ∓ 0.0017 (0.0194) 0.0441 ∓ 0.0027 (0.0435) 0.3547 ∓ 0.5610 (0.0474) 0.3643 ∓ 0.5413 (0.1937) 2.9212 ∓ 0.4602 (3.0675) 0.3370 ∓ 0.5762 (0.0231) 3.3594 ∓ 0.2828 (3.3094)

100.0000 ∓ 0.0000 (100.0000) 1 100.0000 ∓ 0.0000 (100.0000) 1 92.7273 ∓ 9.0595 (100.0000) 4 45.6000 ∓ 22.5688 (48.0000) 4 80.0000 ∓ 0.0000 (80.0000) 4 76.3636 ∓ 9.4294 (72.7273) 2 80.7071 ∓ 4.0173 (78.7879) 4

2.0000 ∓ 0.0000 (2.0000) 4.0000 ∓ 0.0000 (4.0000) 75.4000 ∓ 78.0184 (23.5000) 713.6333 ∓ 126.5474 (697.5000) 336.0000 ∓ 0.0000 (336.0000) 145.8667 ∓ 57.3938 (168.0000) 336.1333 ∓ 43.5959 (336.0000)

0.0212 ∓ 0.0012 (0.0208) 0.0467 ∓ 0.0036 (0.0455) 0.6217 ∓ 0.6276 (0.2069) 7.0849 ∓ 1.2398 (6.9408) 3.1178 ∓ 0.0123 (3.1161) 1.1961 ∓ 0.4660 (1.3746) 3.3463 ∓ 0.4255 (3.3463)

4.3.1. Comparison of meta-heuristics based on path-based fitness function From the results in Table 6, when the search space range was within [−100,100], ABC, DE and PSO algorithms showed similar performances based on mean and standard deviation for the evenodd, largest and remainder problems. ABC algorithm was the best on the triangle, leap and mark problems while PSO was the best on the quadratic problem and DE was the best on mark problem. When we considered the median results given in parenthesis, ABC algorithm found % 100 coverage for all problems and PSO found

% 100 coverage for all problems except for the triangle problem. According to the median values, DE found the best coverage for all problems except for the leap and quadratic problems and FA found the best coverage for all problems except for the triangle and quadratic problems. According to the number of evaluations and CPU times, algorithms produced the results in a short time if they succeed to converge to the optimum. Otherwise, the CPU times were proportional to the maximum number of evaluations. When the algorithms were compared on the even odd problem on which they all converged the optimum in the same number of evaluation

Please cite this article in press as: O. Sahin, B. Akay, Comparisons of metaheuristic algorithms and fitness functions on software test data generation, Appl. Soft Comput. J. (2016), http://dx.doi.org/10.1016/j.asoc.2016.09.045

564 565 566 567 568 569 570 571 572 573

G Model

ARTICLE IN PRESS

ASOC 3844 1–13

O. Sahin, B. Akay / Applied Soft Computing xxx (2016) xxx–xxx

9

Table 7 The results of the algorithms designed using dissimilarity-based fitness function. Mean ∓ standard deviation are reported. Median values are given in parenthesis and ranks are presented next to the median. Program title

Range [−100,100]

EVENODD ABC

LARGEST LEAP MARK QUADRATIC REMAINDER TRIANGLE EVENODD

PSO

LARGEST LEAP MARK QUADRATIC REMAINDER TRIANGLE EVENODD

DE

LARGEST LEAP MARK QUADRATIC REMAINDER TRIANGLE EVENODD

FA

LARGEST LEAP MARK QUADRATIC REMAINDER TRIANGLE

574 575 576 577 578 579 580 581 582 583

[−1000,1000]

Coverage

Evaluation

Time (s)

Coverage

Evaluation

Time (s)

100.0000 ∓ 0.0000 1 (100.0000) 100.0000 ∓ 0.0000 1 (100.0000) 100.0000 ∓ 0.0000 1 (100.0000) 100.0000 ∓ 0.0000 1 (100.0000) 97.6667 ∓ 4.3018 (100.0000) 2 100.0000 ∓ 0.0000 1 (100.0000) 99.1919 ∓ 2.6313 1 (100.0000)

2.0000 ∓ 0.0000 (2.0000) 9.2333 ∓ 7.3844 (5.0000) 42.4333 ∓ 41.1613 (30.0000) 53.0333 ∓ 33.2685 (43.0000) 299.7667 ∓ 172.8507 (279.5000) 3.1000 ∓ 2.1870 (2.0000) 426.1000 ∓ 211.0291 (395.5000)

0.0238 ∓ 0.0049 (0.0219) 0.0677 ∓ 0.0222 (0.0551) 0.1559 ∓ 0.1206 (0.1230) 0.2477 ∓ 0.1102 (0.2189) 0.9736 ∓ 0.5332 (0.9135) 0.0292 ∓ 0.0078 (0.0253) 1.4816 ∓ 0.7102 (1.3830)

100.0000 ∓ 0.0000 (100.0000) 1 100.0000 ∓ 0.0000 (100.0000) 1 100.0000 ∓ 0.0000 (100.0000) 1 97.3333 ∓ 3.8357 (100.0000) 3 89.3333 ∓ 3.6515 (90.0000) 2 100.0000 ∓ 0.0000 (100.0000) 1 98.1818 ∓ 3.4321 (100.0000) 1

2.0000 ∓ 0.0000 (2.0000) 7.2667 ∓ 6.1640 (4.0000) 88.0000 ∓ 81.0419 (53.0000) 668.2667 ∓ 287.9903 (617.0000) 722.0667 ∓ 159.8803 (692.5000) 3.7333 ∓ 3.5129 (3.0000) 709.3000 ∓ 215.2401 (631.5000)

0.0222 ∓ 0.0010 (0.0219) 0.0593 ∓ 0.0168 (0.0510) 0.2894 ∓ 0.2354 (0.1957) 2.2628 ∓ 0.9486 (2.0732) 2.2706 ∓ 0.4933 (2.1903) 0.0287 ∓ 0.0101 (0.0259) 2.4219 ∓ 0.7187 (2.1536)

100.0000 ∓ 0.0000 1 (100.0000) 100.0000 ∓ 0.0000 1 (100.0000) 97.5758 ∓ 6.2863 2 (100.0000) 100.0000 ∓ 0.0000 1 (100.0000) 99.0000 ∓ 3.0513 1 (100.0000) 100.0000 ∓ 0.0000 (100.0000) 1 94.3434 ∓ 5.4402 (93.9394) 3

2.0000 ∓ 0.0000 (2.0000) 4.0000 ∓ 0.0000 (4.0000) 31.6333 ∓ 55.4719 (8.5000) 7.6000 ∓ 0.7701 (7.5000) 71.0667 ∓ 57.2062 (47.5000) 2.1000 ∓ 0.3051 (2.0000) 238.3333 ∓ 106.9084 (253.0000)

0.0267 ∓ 0.0013 (0.0266) 0.0647 ∓ 0.0076 (0.0610) 0.2758 ∓ 0.4484 (0.0880) 0.1365 ∓ 0.0093 (0.1372) 0.6559 ∓ 0.4898 (0.4525) 0.0313 ∓ 0.0036 (0.0302) 2.3211 ∓ 1.0219 (2.4512)

100.0000 ∓ 0.0000 (100.0000) 1 100.0000 ∓ 0.0000 (100.0000) 1 100.0000 ∓ 0.0000 (100.0000) 1 99.4667 ∓ 2.0297 (100.0000) 1 96.6667 ∓ 4.7946 (100.0000) 1 100.0000 ∓ 0.0000 (100.0000) 1 89.1919 ∓ 7.5363 (90.9091) 3

2.0000 ∓ 0.0000 (2.0000) 4.0667 ∓ 0.3651 (4.0000) 9.3333 ∓ 6.9497 (8.0000) 246.4667 ∓ 78.2325 (250.0000) 114.9667 ∓ 58.4746 (127.5000) 3.4333 ∓ 1.5241 (3.5000) 299.9000 ∓ 65.1843 (336.0000)

0.0282 ∓ 0.0030 (0.0274) 0.0642 ∓ 0.0074 (0.0613) 0.0954 ∓ 0.0557 (0.0844) 2.4177 ∓ 0.7668 (2.3853) 1.0487 ∓ 0.5215 (1.1467) 0.0422 ∓ 0.0114 (0.0446) 2.8933 ∓ 0.6166 (3.2267)

100.0000 ∓ 0.0000 1 (100.0000) 100.0000 ∓ 0.0000 (100.0000) 1 88.4848 ∓ 8.9115 (81.8182) 4 100.0000 ∓ 0.0000 1 (100.0000) 96.3333 ∓ 4.9013 3 (100.0000) 100.0000 ∓ 0.0000 1 (100.0000) 98.4848 ∓ 2.8416 2 (100.0000)

2.0000 ∓ 0.0000 (2.0000) 4.0000 ∓ 0.0000 (4.0000) 161.3000 ∓ 121.2499 (252.0000) 7.9333 ∓ 2.1804 (7.0000) 192.2333 ∓ 87.7971 (224.5000) 2.1000 ∓ 0.4026 (2.0000) 169.4333 ∓ 126.6401 (147.5000)

0.0157 ∓ 0.0004 (0.0157) 0.0382 ∓ 0.0039 (0.0369) 0.9481 ∓ 0.7036 (1.4694) 0.1017 ∓ 0.0192 (0.0989) 1.2375 ∓ 0.5508 (1.4373) 0.0206 ∓ 0.0042 (0.0185) 1.1760 ∓ 0.8563 (1.0313)

100.0000 ∓ 0.0000 (100.0000) 1 100.0000 ∓ 0.0000 (100.0000) 1 96.9697 ∓ 6.8918 (100.0000) 2 98.6667 ∓ 3.6891 (100.0000) 2 86.0000 ∓ 4.9827 (90.0000) 3 100.0000 ∓ 0.0000 (100.0000) 1 96.6667 ∓ 5.0550 (100.0000) 2

2.0000 ∓ 0.0000 (2.0000) 4.0000 ∓ 0.0000 (4.0000) 51.8667 ∓ 91.3688 (11.0000) 270.6000 ∓ 111.6486 (234.0000) 421.4000 ∓ 89.1854 (448.0000) 8.2667 ∓ 7.4368 (5.5000) 296.3000 ∓ 107.8924 (292.0000)

0.0171 ∓ 0.0023 (0.0169) 0.0389 ∓ 0.0051 (0.0374) 0.3140 ∓ 0.5287 (0.0781) 1.8360 ∓ 0.7330 (1.5951) 2.6702 ∓ 0.5557 (2.8383) 0.0591 ∓ 0.0430 (0.0466) 2.0317 ∓ 0.7279 (2.0095)

100.0000 ∓ 0.0000 1 (100.0000) 100.0000 ∓ 0.0000 1 (100.0000) 89.6970 ∓ 9.1638 3 (81.8182) 96.8000 ∓ 3.9862 2 (100.0000) 83.6667 ∓ 4.9013 4 (80.0000) 96.3636 ∓ 9.4294 2 (100.0000) 82.4242 ∓ 6.0815 4 (80.3030)

2.0000 ∓ 0.0000 (2.0000) 4.0000 ∓ 0.0000 (4.0000) 97.9333 ∓ 82.6676 (169.0000) 80.3000 ∓ 81.0156 (18.5000) 275.6333 ∓ 80.6955 (336.0000) 24.1333 ∓ 57.3938 (2.0000) 315.4000 ∓ 53.8277 (336.0000)

0.0200 ∓ 0.0030 (0.0192) 0.0460 ∓ 0.0040 (0.0447) 0.8275 ∓ 0.6919 (1.4039) 0.9577 ∓ 0.9492 (0.2386) 2.6812 ∓ 0.7750 (3.2517) 0.2146 ∓ 0.4950 (0.0225) 3.3353 ∓ 0.5599 (3.5418)

100.0000 ∓ 0.0000 (100.0000) 1 100.0000 ∓ 0.0000 (100.0000) 1 92.7273 ∓ 9.0595 (100.0000) 3 43.7333 ∓ 24.9896 (44.0000) 4 80.0000 ∓ 0.0000 (80.0000) 4 77.2727 ∓ 10.3377 (72.7273) 2 81.2121 ∓ 3.2227 (78.7879) 4

2.0000 ∓ 0.0000 (2.0000) 4.0000 ∓ 0.0000 (4.0000) 75.3333 ∓ 78.2658 (24.0000) 714.0333 ∓ 143.1615 (708.0000) 336.0000 ∓ 0.0000 (336.0000) 140.3333 ∓ 62.9221 (168.0000) 336.0667 ∓ 0.2537 (336.0000)

0.0224 ∓ 0.0029 (0.0219) 0.0480 ∓ 0.0038 (0.0468) 0.6600 ∓ 0.6733 (0.2219) 7.6524 ∓ 1.5099 (7.6174) 3.3072 ∓ 0.0173 (3.3039) 1.2385 ∓ 0.5527 (1.4776) 3.5969 ∓ 0.0175 (3.5962)

numbers, DE executed the same number of evaluations in a shorter time. This means that steps of DE consumed less time. When the search space was extended to the range [−1000,1000], the results deteriorated because the problems become harder apart from leap problem. For the leap problem, the number of optimal solutions increases by the extension of the search space. On the even odd, largest and remainder problems, ABC, DE and PSO algorithms showed similar performances. ABC produced the best results on the triangle problem while PSO produced the best on quadratic and mark problems. Because an algorithm iterated more

when it could not converge to the optimum before termination criteria, the number of evaluations and CPU times were high in this case. As in the case that when the range was [−100,100], DE consumed less time when it found the optimum.

4.3.2. Comparison of meta-heuristics based on dissimilarity-based fitness function Depending on the results in Table 7, based on mean values, when the search space was within the range [−100,100], ABC

Please cite this article in press as: O. Sahin, B. Akay, Comparisons of metaheuristic algorithms and fitness functions on software test data generation, Appl. Soft Comput. J. (2016), http://dx.doi.org/10.1016/j.asoc.2016.09.045

584 585 586 587

588 589 590 591

G Model

ARTICLE IN PRESS

ASOC 3844 1–13

O. Sahin, B. Akay / Applied Soft Computing xxx (2016) xxx–xxx

10

Table 8
The results of the algorithms designed using the approximation level + branch distance-based fitness function. Mean ∓ standard deviation is reported; median values are given in parentheses and ranks are presented next to the median. In each row below, the seven entries correspond to the test programs EVENODD, LARGEST, LEAP, MARK, QUADRATIC, REMAINDER and TRIANGLE, in that order.

ABC, range [−100,100]
  Coverage:    100.0000 ∓ 0.0000 (100.0000) 1 | 100.0000 ∓ 0.0000 (100.0000) 1 | 100.0000 ∓ 0.0000 (100.0000) 1 | 100.0000 ∓ 0.0000 (100.0000) 1 | 97.6667 ∓ 4.3018 (100.0000) 2 | 100.0000 ∓ 0.0000 (100.0000) 1 | 100.0000 ∓ 0.0000 (100.0000) 1
  Evaluation:  2.0000 ∓ 0.0000 (2.0000) | 5.7667 ∓ 2.3735 (4.0000) | 36.9000 ∓ 31.5413 (29.0000) | 44.3333 ∓ 22.7025 (40.5000) | 291.0333 ∓ 177.2860 (242.0000) | 3.3333 ∓ 3.1220 (2.0000) | 83.0000 ∓ 59.5356 (73.5000)
  Time (s):    0.1096 ∓ 0.4492 (0.0229) | 0.0620 ∓ 0.0199 (0.0577) | 0.1350 ∓ 0.0856 (0.1185) | 0.2127 ∓ 0.0812 (0.1932) | 0.9093 ∓ 0.5245 (0.7698) | 0.0308 ∓ 0.0149 (0.0247) | 0.3028 ∓ 0.1856 (0.2774)
ABC, range [−1000,1000]
  Coverage:    100.0000 ∓ 0.0000 (100.0000) 1 | 100.0000 ∓ 0.0000 (100.0000) 1 | 100.0000 ∓ 0.0000 (100.0000) 1 | 80.8000 ∓ 9.7641 (84.0000) 1 | 89.6667 ∓ 4.1384 (90.0000) 2 | 100.0000 ∓ 0.0000 (100.0000) 1 | 99.1919 ∓ 3.0753 (100.0000) 1
  Evaluation:  2.0000 ∓ 0.0000 (2.0000) | 5.5000 ∓ 2.0968 (5.0000) | 63.2667 ∓ 45.1296 (57.0000) | 1858.3667 ∓ 289.0570 (1843.5000) | 718.5000 ∓ 159.9407 (698.0000) | 5.1333 ∓ 4.6292 (4.0000) | 130.8333 ∓ 66.0168 (108.5000)
  Time (s):    0.0246 ∓ 0.0052 (0.0229) | 0.0591 ∓ 0.0104 (0.0545) | 0.2037 ∓ 0.1242 (0.1806) | 5.7189 ∓ 0.8780 (5.6866) | 2.1622 ∓ 0.4653 (2.1043) | 0.0324 ∓ 0.0125 (0.0292) | 0.4451 ∓ 0.2041 (0.3774)

PSO, range [−100,100]
  Coverage:    100.0000 ∓ 0.0000 (100.0000) 1 | 100.0000 ∓ 0.0000 (100.0000) 1 | 98.1818 ∓ 5.5478 (100.0000) 2 | 100.0000 ∓ 0.0000 (100.0000) 1 | 99.3333 ∓ 2.5371 (100.0000) 1 | 100.0000 ∓ 0.0000 (100.0000) 1 | 99.3939 ∓ 1.8493 (100.0000) 2
  Evaluation:  2.0000 ∓ 0.0000 (2.0000) | 4.0000 ∓ 0.0000 (4.0000) | 24.4000 ∓ 49.3038 (7.0000) | 7.5000 ∓ 1.0422 (7.0000) | 70.1000 ∓ 53.8813 (58.0000) | 2.0333 ∓ 0.1826 (2.0000) | 54.0333 ∓ 48.2747 (40.5000)
  Time (s):    0.0286 ∓ 0.0048 (0.0272) | 0.0629 ∓ 0.0045 (0.0614) | 0.2071 ∓ 0.3773 (0.0730) | 0.1312 ∓ 0.0122 (0.1305) | 0.6211 ∓ 0.4467 (0.5252) | 0.0320 ∓ 0.0066 (0.0296) | 0.5115 ∓ 0.4309 (0.3918)
PSO, range [−1000,1000]
  Coverage:    100.0000 ∓ 0.0000 (100.0000) 1 | 100.0000 ∓ 0.0000 (100.0000) 1 | 99.3939 ∓ 3.3195 (100.0000) 2 | 40.5333 ∓ 21.9022 (32.0000) 2 | 95.0000 ∓ 5.0855 (95.0000) 1 | 100.0000 ∓ 0.0000 (100.0000) 1 | 97.1717 ∓ 3.0753 (100.0000) 3
  Evaluation:  2.0000 ∓ 0.0000 (2.0000) | 4.0000 ∓ 0.0000 (4.0000) | 13.8333 ∓ 29.6777 (7.0000) | 768.0333 ∓ 84.5407 (805.5000) | 127.3667 ∓ 56.5536 (153.5000) | 3.2333 ∓ 1.6121 (2.0000) | 135.9333 ∓ 97.4891 (153.0000)
  Time (s):    0.0302 ∓ 0.0108 (0.0282) | 0.0631 ∓ 0.0056 (0.0611) | 0.1261 ∓ 0.2246 (0.0745) | 6.7370 ∓ 0.7341 (7.0436) | 1.0971 ∓ 0.4705 (1.3085) | 0.0400 ∓ 0.0125 (0.0321) | 1.2398 ∓ 0.8577 (1.3909)

DE, range [−100,100]
  Coverage:    100.0000 ∓ 0.0000 (100.0000) 1 | 100.0000 ∓ 0.0000 (100.0000) 1 | 89.0909 ∓ 9.0595 (81.8182) 4 | 99.7333 ∓ 1.4606 (100.0000) 2 | 95.6667 ∓ 5.0401 (100.0000) 3 | 100.0000 ∓ 0.0000 (100.0000) 1 | 98.9899 ∓ 2.2973 (100.0000) 3
  Evaluation:  2.0000 ∓ 0.0000 (2.0000) | 4.0000 ∓ 0.0000 (4.0000) | 153.0667 ∓ 123.2455 (252.0000) | 16.2667 ∓ 45.3020 (8.0000) | 210.4333 ∓ 87.3032 (245.0000) | 2.0667 ∓ 0.3651 (2.0000) | 61.0667 ∓ 93.6865 (19.5000)
  Time (s):    0.0205 ∓ 0.0164 (0.0156) | 0.0400 ∓ 0.0056 (0.0378) | 0.8661 ∓ 0.6865 (1.4042) | 0.1593 ∓ 0.3172 (0.0999) | 1.2996 ∓ 0.5236 (1.5056) | 0.0199 ∓ 0.0036 (0.0184) | 0.4197 ∓ 0.6061 (0.1552)
DE, range [−1000,1000]
  Coverage:    100.0000 ∓ 0.0000 (100.0000) 1 | 100.0000 ∓ 0.0000 (100.0000) 1 | 96.3636 ∓ 7.3971 (100.0000) 3 | 30.4000 ∓ 19.3865 (20.0000) 4 | 87.0000 ∓ 5.3498 (90.0000) 3 | 100.0000 ∓ 0.0000 (100.0000) 1 | 97.9798 ∓ 2.9058 (100.0000) 2
  Evaluation:  2.0000 ∓ 0.0000 (2.0000) | 4.0000 ∓ 0.0000 (4.0000) | 60.4000 ∓ 98.1479 (11.0000) | 1221.2000 ∓ 77.3057 (1251.0000) | 393.7333 ∓ 98.0605 (382.5000) | 11.8667 ∓ 11.3646 (8.5000) | 134.0667 ∓ 128.6031 (62.5000)
  Time (s):    0.0172 ∓ 0.0037 (0.0163) | 0.0370 ∓ 0.0031 (0.0360) | 0.3491 ∓ 0.5475 (0.0732) | 7.6615 ∓ 0.4878 (7.8152) | 2.3890 ∓ 0.5831 (2.3267) | 0.0761 ∓ 0.0604 (0.0569) | 0.8759 ∓ 0.8255 (0.4130)

FA, range [−100,100]
  Coverage:    100.0000 ∓ 0.0000 (100.0000) 1 | 100.0000 ∓ 0.0000 (100.0000) 1 | 93.3333 ∓ 8.9115 (100.0000) 3 | 100.0000 ∓ 0.0000 (100.0000) 1 | 81.3333 ∓ 3.4575 (80.0000) 4 | 94.5455 ∓ 11.0956 (100.0000) 2 | 98.7879 ∓ 2.4657 (100.0000) 4
  Evaluation:  2.0000 ∓ 0.0000 (2.0000) | 4.0000 ∓ 0.0000 (4.0000) | 65.4000 ∓ 80.2009 (7.0000) | 19.4333 ∓ 5.5874 (18.5000) | 313.8667 ∓ 57.3938 (336.0000) | 35.2000 ∓ 67.5351 (2.0000) | 96.9333 ∓ 49.6185 (80.0000)
  Time (s):    0.0204 ∓ 0.0036 (0.0196) | 0.0485 ∓ 0.0141 (0.0444) | 0.5343 ∓ 0.6453 (0.0643) | 0.2412 ∓ 0.0674 (0.2343) | 2.9165 ∓ 0.5245 (3.1184) | 0.2936 ∓ 0.5513 (0.0226) | 1.0880 ∓ 0.5369 (0.9129)
FA, range [−1000,1000]
  Coverage:    100.0000 ∓ 0.0000 (100.0000) 1 | 100.0000 ∓ 0.0000 (100.0000) 1 | 91.5152 ∓ 9.2258 (100.0000) 4 | 36.0000 ∓ 18.3152 (20.0000) 3 | 80.0000 ∓ 0.0000 (80.0000) 4 | 76.3636 ∓ 9.4294 (72.7273) 2 | 94.3434 ∓ 1.5376 (93.9394) 4
  Evaluation:  2.0000 ∓ 0.0000 (2.0000) | 4.0000 ∓ 0.0000 (4.0000) | 84.9667 ∓ 80.1286 (24.0000) | 818.0667 ∓ 51.4681 (836.0000) | 336.0000 ∓ 0.0000 (336.0000) | 145.8667 ∓ 57.3938 (168.0000) | 207.2667 ∓ 21.3573 (212.0000)
  Time (s):    0.0216 ∓ 0.0043 (0.0210) | 0.0464 ∓ 0.0064 (0.0444) | 0.7041 ∓ 0.6551 (0.2074) | 8.8657 ∓ 0.5962 (9.0101) | 3.1174 ∓ 0.0085 (3.1174) | 1.2053 ∓ 0.4722 (1.3768) | 2.3598 ∓ 0.2450 (2.3907)

Depending on the results in Table 7 and based on mean values, when the search space was within the range [−100,100], the ABC algorithm produced better results than the other algorithms on the triangle and leap problems, while PSO produced better results on the quadratic problem. On the even odd, largest and remainder problems, ABC, DE and PSO showed similar performances. When the search space was enlarged to [−1000,1000], all algorithms showed similar performances on the even odd and largest problems. On the remainder problem, all algorithms except FA produced the best result. ABC was better than the others on the leap and triangle problems, and PSO was better on the quadratic and mark problems.
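Dissimilarity-based fitness functions generally reward test data whose execution differs from what has already been covered. The fragment below is a generic sketch of one common formulation (a Hamming-style distance between recorded branch traces); the trace representation and helper names are assumptions for illustration and not necessarily the exact definition used in this study.

    def hamming_dissimilarity(trace_a, trace_b):
        """Fraction of positions at which two recorded branch traces differ."""
        length = max(len(trace_a), len(trace_b))
        if length == 0:
            return 0.0
        padded_a = list(trace_a) + [None] * (length - len(trace_a))
        padded_b = list(trace_b) + [None] * (length - len(trace_b))
        return sum(x != y for x, y in zip(padded_a, padded_b)) / length

    def dissimilarity_fitness(candidate_trace, covered_traces):
        """Higher is better: distance from the closest already-covered trace."""
        if not covered_traces:
            return 1.0
        return min(hamming_dissimilarity(candidate_trace, t) for t in covered_traces)

    # Example: a candidate whose branch sequence is unlike anything covered so far
    # receives a high fitness and is therefore favoured by the search.
    print(dissimilarity_fitness(["b1T", "b2F"], [["b1T", "b2T"], ["b1F", "b2T"]]))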

4.3.3. Comparison of meta-heuristics based on approximation level + branch distance fitness function

Based on the results in Table 8, when the algorithms used a fitness function based on approximation level + branch distance, the ABC algorithm was the best on all problems except the quadratic problem within the range [−100,100]. When the search space was in the range [−1000,1000], all algorithms produced the optimum results on the even odd and largest problems. On the remainder problem, FA was worse than the other algorithms. ABC was better than the other algorithms on the leap, triangle and mark problems, while PSO was better on the quadratic problem. In terms of evaluation time, DE consumed less time when it converged to the optimum. As with the other fitness functions, FA had a poor success rate compared to the other algorithms.


Table 9
The ranks of the fitness functions. For each algorithm (ABC, PSO, DE and FA) and each test program (Triangle, Even Odd, Largest, Leap Year, Quadratic Equation, Remainder and Mark), the approximation level + branch distance (BD + AL), dissimilarity-based and path-based fitness functions are ranked for the search ranges [−100,100] and [−1000,1000]; per-algorithm rank sums and an overall total are also reported for each fitness function and range.

4.4. Experiment 3: fitness tailoring


In addition to tuning the values of the control parameters, choosing the fitness function is another important design issue, because a good fitness function guides the solutions towards the optima in the search space more accurately and quickly. The ranks of the fitness functions are compared in Table 9 in order to show which fitness function was more efficient for a given algorithm. When the search range was [−100,100], the fitness function based on branch distance + approximation level was more efficient for all algorithms. When the search space was within [−1000,1000], the branch distance + approximation level function was more efficient for the ABC algorithm, while the dissimilarity fitness function was more efficient for PSO and FA. When all algorithms are considered, the dissimilarity fitness function can be said to be an efficient fitness function in general within the range [−1000,1000].
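To make the comparison concrete, the fragment below sketches how an approximation level + branch distance fitness can be computed for one target branch of a triangle-classification routine. It is a minimal illustration rather than the instrumentation used in this study: the Tracey-style distance rules, the penalty constant K, the d/(d + 1) normalization and the chosen "equilateral" target branch are assumptions made only for the example.

    # Illustrative sketch of an "approximation level + branch distance" fitness
    # for one target branch of a simple triangle classifier (not the authors' code).

    K = 1.0  # penalty added by the branch distance rules when a predicate is false

    def bd_greater(lhs, rhs):
        """Branch distance for the predicate lhs > rhs (0 when already true)."""
        return 0.0 if lhs > rhs else (rhs - lhs) + K

    def bd_equal(lhs, rhs):
        """Branch distance for the predicate lhs == rhs (0 when already true)."""
        return 0.0 if lhs == rhs else abs(lhs - rhs) + K

    def normalize(d):
        """Map a raw branch distance into the interval [0, 1)."""
        return d / (d + 1.0)

    def fitness_equilateral(a, b, c):
        """Fitness of test data (a, b, c) for covering the 'equilateral' branch.

        The approximation level counts the critical branching nodes between the
        node where execution diverges from the target and the target itself; the
        normalized branch distance of the diverging predicate is added so that
        the search can rank inputs that share the same approximation level.
        Lower is better; 0.0 means the target branch was covered.
        """
        # Critical node 1: the inputs must form a valid triangle.  The predicate
        # is a conjunction, so the clause distances are summed.
        valid_dist = bd_greater(a + b, c) + bd_greater(a + c, b) + bd_greater(b + c, a)
        if valid_dist > 0.0:
            return 1 + normalize(valid_dist)   # diverged one level above the target

        # Target node: all three sides must be equal.
        equal_dist = bd_equal(a, b) + bd_equal(b, c)
        return normalize(equal_dist)

    if __name__ == "__main__":
        print(fitness_equilateral(3, 4, 5))   # valid but not equilateral -> 0 < fitness < 1
        print(fitness_equilateral(5, 5, 5))   # target covered -> 0.0

Because the fitness decreases smoothly as the predicates get closer to being satisfied, it gives the search a gradient to follow rather than a flat covered/not-covered signal, which is the property discussed above.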


4.5. Experiment 4: comparison with random search


Random search (RS) generates random values for the variables to be tested and calculates the coverage obtained with these values. In the experiments, we terminated RS when the algorithm reached the maximum coverage or the maximum iteration number (5000, as for the meta-heuristics). The results of RS given in Table 10 were compared to those of the meta-heuristics based on approximation level + branch distance presented in Table 8. From the tables, on the even odd and largest problems, RS produced the same results as the meta-heuristics when the search space was within [−100,100]. On the quadratic problem, RS was better than FA but worse than ABC, DE and PSO. On the triangle problem, RS was the worst compared to the meta-heuristics. On the leap problem, ABC and RS were better than the other algorithms. When the search space was within [−1000,1000], ABC, DE, PSO and RS produced the same results on the even odd, largest and remainder problems. RS was the worst algorithm on the leap and triangle problems. On the mark problem, ABC was the best and RS was the second best. On the quadratic problem, PSO was the best and ABC was the second best algorithm. Generally, the RS algorithm is easy to apply and is efficient for easy problems, while it is inefficient for hard problems. In order to see the overall performance of the algorithms as well as random search, the averages of the coverage and the time in seconds of all algorithms over all problems are given in Table 11. Overall, the ABC algorithm was the best based on the coverage mean and standard deviation. In terms of convergence time, ABC and PSO showed similar performances.
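As a rough sketch of this procedure (assuming a generic, hypothetical measure_coverage callable that runs the instrumented program under test and returns a coverage percentage, which is not the instrumentation used in the paper), random search can be outlined as follows.

    import random

    def random_search(measure_coverage, n_vars, low, high, max_iter=5000):
        """Illustrative random search for test data generation (not the paper's code).

        measure_coverage: callable that maps a candidate input vector to the
        coverage percentage achieved on the program under test (assumed helper).
        The loop stops when full coverage is reached or after max_iter iterations,
        mirroring the termination rule described above.
        """
        best_input, best_coverage = None, -1.0
        for _ in range(max_iter):
            candidate = [random.uniform(low, high) for _ in range(n_vars)]
            coverage = measure_coverage(candidate)
            if coverage > best_coverage:
                best_input, best_coverage = candidate, coverage
            if best_coverage >= 100.0:
                break
        return best_input, best_coverage

Since each candidate is drawn independently, RS carries no information between iterations, which is why it remains competitive only on the easy problems discussed above.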


Table 10
The results of the Random Search algorithm. Mean ∓ standard deviation is reported; median values are given in parentheses.

Program title       | Coverage, [−100,100]         | Time (s), [−100,100]     | Coverage, [−1000,1000]       | Time (s), [−1000,1000]
Triangle            | 94.0404 ∓ 1.2541 (93.9394)   | 2.3245 ∓ 0.6504 (2.2034) | 88.1818 ∓ 4.4558 (84.8485)   | 3.3958 ∓ 0.5430 (3.6459)
Even Odd            | 100.0000 ∓ 0.0000 (100.0000) | 0.0135 ∓ 0.0010 (0.0135) | 100.0000 ∓ 0.0000 (100.0000) | 0.0115 ∓ 0.0059 (0.0098)
Largest             | 100.0000 ∓ 0.0000 (100.0000) | 0.0327 ∓ 0.0591 (0.0215) | 100.0000 ∓ 0.0000 (100.0000) | 0.0249 ∓ 0.0041 (0.0237)
Leap Year           | 100.0000 ∓ 0.0000 (100.0000) | 0.0787 ∓ 0.0816 (0.0547) | 70.6667 ∓ 8.7175 (76.7273)   | 3.1564 ∓ 0.2946 (3.2430)
Quadratic Equation  | 94.6667 ∓ 5.0742 (90.0000)   | 1.3206 ∓ 0.5124 (1.5932) | 89.3333 ∓ 2.5371 (90.0000)   | 2.4085 ∓ 0.5194 (2.1617)
Remainder           | 100.0000 ∓ 0.0000 (100.0000) | 0.0300 ∓ 0.0437 (0.0197) | 100.0000 ∓ 0.0000 (100.0000) | 0.0450 ∓ 0.0331 (0.0315)
Mark                | 100.0000 ∓ 0.0000 (100.0000) | 0.4760 ∓ 0.3765 (0.3104) | 60.8000 ∓ 16.3209 (68.0000)  | 8.6096 ∓ 0.9578 (9.0612)

Table 11
Overall average of the coverage and time (in seconds) of all algorithms. Mean ∓ standard deviation is reported; median values are given in parentheses.

          | ABC                          | PSO                          | DE                            | FA                            | Random
Coverage  | 98.3015 ∓ 4.0135 (100.0000)  | 97.0252 ∓ 9.4676 (100.0000)  | 95.5805 ∓ 11.1680 (99.3616)   | 88.0197 ∓ 15.4965 (93.4848)   | 92.6921 ∓ 12.3001 (100)
Time (s)  | 0.7202 ∓ 1.1377 (0.1798)     | 0.7193 ∓ 1.2687 (0.1311)     | 0.8358 ∓ 1.3577 (0.1690)      | 1.6406 ∓ 2.1550 (0.6820)      | 1.5663 ∓ 2.3933 (0.2773)

5. Discussion and conclusion

In this study, several soft computing meta-heuristics were applied to a software engineering problem: test data generation. The Differential Evolution, Particle Swarm Optimization, Artificial Bee Colony and Firefly algorithms were chosen due to their popularity and success in various fields. In addition to these meta-heuristics, the Random Search algorithm was also employed in order to see when these algorithms are necessary or superior to Random Search.

Because the predetermined control parameters of meta-heuristics affect their performance, an analysis of the control parameters of the algorithms was performed. Based on this experiment, it can be concluded that, apart from the control parameters common to all meta-heuristics (population size and maximum generation number), the ABC algorithm has the fewest control parameters and, judging by the low standard deviation values, does not need fine tuning. All the other algorithms show a dependence on the values of their control parameters; PSO and DE produce good results when optimal control parameters are chosen, which is a nontrivial task. At the end of the parameter analysis experiment, we recommended values that are generally good for the control parameters on the type of problems considered in this paper.

In the second part of the experiments, we compared the coverage performance of the meta-heuristics on software test data generation. In the comparisons, different fitness functions were employed: approximation level + branch distance, a path-based fitness function and a dissimilarity-based fitness function. The algorithms were initialized and allowed to run in the search spaces [−100,100] and [−1000,1000]. Considering all algorithms and all fitness functions, randomness is essential to produce test data for program fragments that include branches conditioned on equalities (quadratic). For this reason, algorithms with a more random nature (PSO, Random Search, and ABC run with low limit values, which means more scout bees) are more successful on these problems. On easy problems (even odd, largest, remainder), the meta-heuristics produce similarly satisfactory results. The ABC algorithm is superior on problems that exhibit multimodality (leap) or a number of constraints (triangle, mark). For problems in which the probability of falling into certain branches decreases as the search space is enlarged (such as mark), a balanced exploration and exploitation should be performed by the algorithms to find the global optimum. Because the ABC algorithm performs exploitation in the employed and onlooker bee phases and exploration in the scout bee phase in a balanced manner, its performance in a large search space is superior to the other meta-heuristics on the mark problem. The DE algorithm (especially with a high crossover rate) shows similar performance on the triangle problem, which requires much more neighbourhood information during the search. Supportively, the performance of the RS algorithm, which does not use any information from previous iterations or from other individuals, is worse than that of the meta-heuristics on the mark and triangle problems.

In the third part of the experiments, the path-based, dissimilarity-based and approximation level + branch distance fitness functions were compared. Choosing a fitness function that fits the local search of the meta-heuristic is very important because the fitness function guides the algorithms in the search space towards local or global optima. If a fitness function changes gradually with respect to changes in the branch conditions, as the approximation level + branch distance function does, the algorithms that use it produce more sensitive results, though the convergence speed is lower. It can be said that the approximation level + branch distance fitness function is generally more appropriate for the meta-heuristics considered in this paper within a small search range, and the dissimilarity fitness function is an efficient fitness function within a large search range.

In the fourth part of the experiments, the results of RS were presented. It can be concluded that RS is very effective on easy problems in a small search space, while it is not satisfactory on hard problems or when the search space is large.

In future work, this study will be extended to code fragments implemented in an object-oriented programming language, including class definitions, using fitness functions appropriate for object-oriented programming. A multi-objective approach that considers more than one fitness function is another direction, in order to exploit the advantages of different fitness functions. Scalability through parallelism of software test data generation is an open area that will speed up the process.
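The exploration–exploitation balance attributed to ABC above comes from its scout mechanism. As a rough illustration (a generic sketch with assumed helper code, not the configuration used in the experiments), the scout step abandons a solution whose trial counter exceeds the limit parameter and replaces it with a random one, re-injecting exploration while the employed and onlooker phases keep exploiting promising regions.

    import random

    def scout_phase(population, trials, limit, lower, upper):
        """Generic sketch of the ABC scout step (illustrative, not the paper's code).

        population: list of candidate solutions (lists of floats).
        trials[i]:  number of consecutive failed improvement attempts for solution i.
        limit:      abandonment threshold; lower values mean more scouts, hence more exploration.
        """
        for i in range(len(population)):
            if trials[i] > limit:
                # Abandon the stagnant solution and explore a fresh random point.
                population[i] = [random.uniform(lo, hi) for lo, hi in zip(lower, upper)]
                trials[i] = 0
        return population, trials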



Acknowledgement


This work is supported by Erciyes University, the Department of Research Projects under Contract FYL-2016-6454.
