Inspired grey wolf optimizer for solving large-scale function optimization problems


Wen Long, Jianjun Jiao, Ximing Liang, Mingzhu Tang

PII: S0307-904X(18)30121-5
DOI: 10.1016/j.apm.2018.03.005
Reference: APM 12196

To appear in: Applied Mathematical Modelling

Received date: 25 April 2017
Revised date: 7 March 2018
Accepted date: 13 March 2018

Please cite this article as: Wen Long, Jianjun Jiao, Ximing Liang, Mingzhu Tang, Inspired grey wolf optimizer for solving large-scale function optimization problems, Applied Mathematical Modelling (2018), doi: 10.1016/j.apm.2018.03.005


Highlights

· A nonlinear adjustment strategy for the control parameter and a modified position-updating equation are presented.
· The proposed algorithm is used to solve large-scale global optimization problems.
· Only 15,000 fitness function evaluations are required to solve the functions at every tested dimension.
· It is a low computational cost optimization technique.
· It converges faster than other population-based optimization algorithms.


Inspired grey wolf optimizer for solving large-scale function optimization problems

Wen Long a,b, Jianjun Jiao b, Ximing Liang c,*, Mingzhu Tang d

a Key Laboratory of Economics System Simulation, Guizhou University of Finance & Economics, Guiyang 550025, PR China
b School of Mathematics and Statistics, Guizhou University of Finance & Economics, Guiyang 550025, PR China
c School of Science, Beijing University of Civil Engineering and Architecture, Beijing 100044, PR China
d School of Energy and Power Engineering, Changsha University of Science & Technology, Changsha 410114, PR China

ABSTRACT

The grey wolf optimizer algorithm was recently presented as a new heuristic search algorithm, categorized among swarm intelligence optimization techniques, with satisfactory results on real-valued and binary encoded optimization problems. This algorithm is more effective than some conventional population-based algorithms, such as particle swarm optimization, differential evolution, and the gravitational search algorithm. Several grey wolf optimizer variants have been developed by researchers to improve the performance of the basic algorithm. Inspired by the particle swarm optimization algorithm, this study investigates the performance of a new algorithm called the inspired grey wolf optimizer, which extends the original grey wolf optimizer by adding two features, namely, a nonlinear adjustment strategy of the control parameter and a modified position-updating equation based on the personal historical best position and the global best position. Experiments are performed on four classical high-dimensional benchmark functions, four test functions proposed in the IEEE Congress on Evolutionary Computation 2005 special session, three well-known engineering design problems, and one real-world problem. The results show that the proposed algorithm finds more accurate solutions, converges faster, and requires fewer fitness function evaluations than the other compared techniques.

Keywords: Grey wolf optimizer; Large-scale global optimization; Engineering design optimization; Electricity load forecasting

* Corresponding author. E-mail: [email protected].

1. Introduction

Grey wolf optimization (GWO) is a recently developed, powerful population-based stochastic algorithm proposed by Mirjalili et al. [1]. It can converge to a high-quality near-optimal solution and possesses better convergence characteristics than other prevailing population-based techniques, such as the genetic algorithm (GA), particle swarm optimization (PSO), the gravitational search algorithm (GSA), and differential evolution (DE). The GWO algorithm mimics the social leadership and hunting behavior of grey wolves in nature. Like other population-based stochastic algorithms, the GWO algorithm does not require gradient information of the objective function during its optimization process. The most important advantages of the GWO algorithm over other optimization methods are its easy implementation and the few parameters it requires. Owing to its simplicity and ease of implementation, GWO has gained significant attention and has been applied to many practical optimization problems since its invention.

Like other population-based stochastic algorithms, the GWO algorithm also confronts some challenging problems. For example, when solving complex multimodal tasks, the standard GWO algorithm can easily become trapped in local optima, and its convergence rate decreases considerably in the later period of evolution. This phenomenon can be attributed to the need for both exploration and exploitation in population-based stochastic algorithms; to achieve satisfactory optimization performance, the two must be well balanced. In standard GWO, exploration and exploitation are governed by the adaptive values of the control parameter $\vec{a}$, which are linearly decreased from 2 to 0 over the course of iterations. However, given that the search process of the GWO algorithm is nonlinear and highly complicated, the linear decrease of the control parameter $\vec{a}$ does not reflect the actual search process. Moreover, the conventional GWO algorithm uses only the global best solution (i.e., the α wolf) of all agents from previous iterations when updating the agents' positions. The algorithm can become trapped in local optima because the best position found by each individual agent during the optimization process is not utilized. In this sense, the GWO algorithm is a memory-less population-based stochastic optimization technique. This drawback reduces the performance of the GWO algorithm when dealing with complicated optimization problems.

Inspired by PSO, this paper presents an improved variant of the GWO algorithm called inspired GWO (IGWO) for solving global optimization problems. To achieve a better balance between the explorative and exploitative behaviors of the GWO algorithm, we incorporate a nonlinear adjustment strategy of the control parameter $\vec{a}$. Furthermore, the IGWO algorithm uses a novel learning strategy in which each agent's historical best information is used to update its position. The proposed algorithm is tested on four classical large-scale benchmark problems (D = 100, 500, and 1000), four test functions from the Congress on Evolutionary Computation 2005, three well-known engineering design problems, and one real-world problem. The experimental results show that the proposed algorithm is effective, efficient, and robust, and that it outperforms the compared algorithms.


2. Background

2.1 Classical grey wolf optimizer


Grey wolf optimizer (GWO) is a newly designed population-based optimization technique inspired by Canis lupus and presented by Mirjalili et al. [1]. It mimics the social leadership and hunting behavior of grey wolves. In the GWO algorithm, the fittest solution in the population is named alpha (α). The second and third best solutions are called beta (β) and delta (δ), respectively. The rest of the individuals in the population are assumed to be omega (ω). To model the encircling mechanism mathematically, Eq. (1) is used as follows [1]:

$$\vec{X}(t+1) = \vec{X}_p(t) - \vec{A} \cdot \left| \vec{C} \cdot \vec{X}_p(t) - \vec{X}(t) \right| \qquad (1)$$

where $\vec{X}$ denotes the position vector of a grey wolf; $t$ denotes the current iteration; $\vec{X}_p$ indicates the position vector of the prey; and $\vec{A} = 2\vec{a} \cdot \vec{r}_1 - \vec{a}$ and $\vec{C} = 2 \cdot \vec{r}_2$ denote the coefficient vectors, where $\vec{r}_1$ and $\vec{r}_2$ are random vectors in [0,1]. $\vec{a}$ is linearly decreased from 2 to 0 over the course of iterations as follows:

$$\vec{a}(t) = 2 - \frac{2t}{MaxIter} \qquad (2)$$

where $t$ indicates the current iteration and $MaxIter$ indicates the total number of iterations. The other wolves update their positions according to the positions of α, β, and δ as follows [1]:

$$\vec{X}_1 = \vec{X}_\alpha - \vec{A}_1 \cdot \left| \vec{C}_1 \cdot \vec{X}_\alpha - \vec{X} \right| \qquad (3)$$

$$\vec{X}_2 = \vec{X}_\beta - \vec{A}_2 \cdot \left| \vec{C}_2 \cdot \vec{X}_\beta - \vec{X} \right| \qquad (4)$$

$$\vec{X}_3 = \vec{X}_\delta - \vec{A}_3 \cdot \left| \vec{C}_3 \cdot \vec{X}_\delta - \vec{X} \right| \qquad (5)$$

$$\vec{X}(t+1) = \frac{\vec{X}_1(t) + \vec{X}_2(t) + \vec{X}_3(t)}{3} \qquad (6)$$

where $\vec{A}_1$, $\vec{A}_2$, and $\vec{A}_3$ are similar to $\vec{A}$, and $\vec{C}_1$, $\vec{C}_2$, and $\vec{C}_3$ are similar to $\vec{C}$.

The pseudo code of the GWO algorithm is presented in Algorithm 1.
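Since Algorithm 1 itself is not reproduced in this version, the following minimal Python sketch shows how one GWO iteration might implement Eqs. (1)-(6); the function and variable names are our own illustrative choices, not the authors' code.

```python
import numpy as np

def gwo_step(X, fitness, a):
    """One GWO iteration following Eqs. (3)-(6).

    X       : (N, D) array of wolf positions
    fitness : callable mapping a D-vector to a scalar (minimization)
    a       : current value of the control parameter a(t), Eq. (2)
    """
    # Rank the pack: alpha, beta, delta are the three best solutions.
    order = np.argsort([fitness(x) for x in X])
    leaders = X[order[:3]]

    N, D = X.shape
    X_new = np.empty_like(X)
    for i in range(N):
        estimates = []
        for leader in leaders:
            A = 2.0 * a * np.random.rand(D) - a        # A = 2a*r1 - a
            C = 2.0 * np.random.rand(D)                # C = 2*r2
            dist = np.abs(C * leader - X[i])           # |C*X_p - X|
            estimates.append(leader - A * dist)        # Eqs. (3)-(5)
        X_new[i] = np.mean(estimates, axis=0)          # Eq. (6)
    return X_new
```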


2.2 Related studies


Recently, many researchers have applied GWO to optimization problems, and a number of GWO modifications have been developed to improve its performance. Saremi et al. [2] introduced the evolutionary population dynamics technique to remove the poor search agents of GWO and reposition them to enhance its exploitation; however, the removal requires a random positioning mechanism to avoid local optima. Kamboj [3] proposed a hybrid algorithm that combines GWO and PSO, but the hybrid algorithm suffers from premature convergence. Zhu et al. [4] provided a hybridization of GWO and DE that uses the search ability of DE to escape from stagnation and accelerate convergence; however, it fails to converge on high-dimensional problems. Zhang and Zhou [5] presented an improved version of GWO based on the Powell local search technique, which utilizes the convergence property of Powell's local method to enhance the exploitation ability. Long [6] developed a novel nonlinear adjustment strategy for the control parameter in GWO and utilized the opposition-based learning technique to initialize the population; however, it fails to converge on problems of high dimensionality. Mittal et al. [7] utilized an exponential decay function to tune the coefficient parameter and proposed a modified GWO (mGWO) algorithm to improve the performance of the GWO algorithm; this method also suffers from premature convergence. Emary et al. [8] combined reinforcement learning with neural networks to improve the performance of GWO and developed a novel GWO variant, but the technique employs a complex methodology. Rodríguez et al. [9] presented several GWO variants that utilize a fuzzy hierarchical operator to update the new position of the omega wolves based on the three best wolves (alpha, beta, and delta); the main drawback of this approach is that it suffers from premature convergence. Kumar et al. [10] designed a novel GWO variant for numerical optimization and engineering design problems that utilizes prey weight and an astrophysics-based learning concept to enhance the exploration ability; however, this method fails to converge on high-dimensional problems. Heidari and Pahlavani [11] used the Lévy flight concept to solve the stagnation problem, enhance exploration, and provide a modified version of GWO; it too suffers from premature convergence. Although the studies mentioned above aimed to strike a balance between exploration and exploitation, they did not provide an optimal solution. Therefore, new concepts and/or strategies that may help provide a better balance between exploration and exploitation in GWO are necessary [10].

This necessity motivated us to develop an improved GWO algorithm based on newly proposed concepts and strategies.

3. Inspired grey wolf optimization (IGWO)

3.1 Nonlinear adjustment strategy of $\vec{a}$


All population-based stochastic algorithms aim to achieve a balance between exploration and exploitation in the process of finding the global optimum. This balance is extremely important for the successful performance of an optimization algorithm. In population-based stochastic optimization algorithms, exploration refers to the ability to investigate various unknown regions of the solution space to identify the global optimum, while exploitation refers to the ability to apply the knowledge of previous satisfactory solutions to find better solutions [12]. In the basic GWO algorithm, exploration and exploitation are governed by adapting the values of the control parameter $\vec{a}$, which are linearly decreased from 2 to 0 over the course of iterations. Given that the search process of the GWO algorithm is nonlinear and highly complicated, the linear decrease of the control parameter $\vec{a}$ does not reflect the actual search process.

PSO is a population-based stochastic algorithm developed by Kennedy and Eberhart in 1995 [13], inspired by the social behavior of bird flocking and fish schooling. The PSO algorithm is easy to understand and implement and has been proven to perform well on many optimization problems [14]. Empirical studies on PSO with inertia weight have shown that a relatively large w yields better global search ability (exploration), while a relatively small w results in faster convergence (exploitation). In the original PSO, the values of the inertia weight are linearly decreased over the course of iterations. However, Chatterjee and Siarry pointed out that better performance can be obtained if the inertia weight is chosen as a time-varying, nonlinearly decreasing quantity instead of a constant or linearly decreasing one [15].

Given that the role of the control parameter $\vec{a}$ is similar to that of the inertia weight in the PSO algorithm, a similar adjustment readily yields a new balance between the exploration and exploitation characteristics of the GWO algorithm. Inspired by the setting of the inertia weight in PSO, this paper presents a novel nonlinear adjustment strategy for $\vec{a}$ (a logarithmic decay function), where the adaptive values of the control parameter $\vec{a}$ are adjusted as follows:

$$\vec{a}(t) = a_{initial} - (a_{initial} - a_{final}) \cdot \log\left(1 + (e-1)\frac{t}{max\_iter}\right) \qquad (7)$$

where $t$ denotes the current iteration, $max\_iter$ denotes the maximum number of iterations, and $a_{initial}$ and $a_{final}$ are the initial and final values of the control parameter $\vec{a}$, respectively. Based on Eq. (7), the values of the control parameter $\vec{a}$ are nonlinearly decreased over the course of iterations.

In the original GWO algorithm [1], the transition between exploration and exploitation is generated by the adaptive values of $\vec{a}$ and $\vec{A}$. Half of the iterations are devoted to exploration ($|\vec{A}| > 1$), and the other half are used for exploitation ($|\vec{A}| < 1$), as shown in Fig. 1. Excessive exploration is similar to excessive randomness and will probably not provide satisfactory optimization results, whereas excessive exploitation is related to inadequate randomness [7]. Higher exploitation of the search space results in a higher probability of finding better solutions and accelerates the convergence speed. The exploitation rate can be enhanced, as shown in Fig. 1, by using the logarithmic function of Eq. (7) instead of the linear function of Eq. (2) to decrease the value of $\vec{a}$ over the course of iterations. With the logarithmic decay function, the proportions of iterations used for exploration and exploitation become 38% and 62%, respectively.

[Fig. 1 plots the value of $\vec{a}$ over 100 iterations for GWO (linear decrease, Eq. (2)) and IGWO (logarithmic decay, Eq. (7)), with the exploration and exploitation phases marked.]

Fig. 1 Updating the value of $\vec{a}$ over the course of iterations for IGWO and GWO.
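To make the two schedules concrete, here is a small Python sketch of the linear rule of Eq. (2) and the logarithmic rule of Eq. (7); this is a minimal illustration with our own function names.

```python
import math

def a_linear(t, max_iter, a_init=2.0, a_final=0.0):
    # Eq. (2): linear decrease of the control parameter a.
    return a_init - (a_init - a_final) * t / max_iter

def a_log(t, max_iter, a_init=2.0, a_final=0.0):
    # Eq. (7): logarithmic decay; log(1 + (e - 1)) = 1, so a(max_iter) = a_final.
    return a_init - (a_init - a_final) * math.log(1.0 + (math.e - 1.0) * t / max_iter)

# With a_init = 2 and a_final = 0, a_log crosses the |A| = 1 boundary at
# t/max_iter = (e**0.5 - 1) / (e - 1) ~ 0.38, which matches the 38%/62%
# exploration/exploitation split described for Fig. 1.
```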


3.2 Modified position-updating equation


In the standard GWO algorithm, analysis of Eq. (6) shows that only the information of the α wolf (global best solution), the β wolf, the δ wolf, and the current position of the agent is passed to the next-generation agents. As a result, the algorithm may easily become trapped in a local optimum because of the lack of diversity among agents and over-learning from the best agent found so far. The personal historical best information (pbest) has not been exploited in the GWO algorithm; therefore, GWO is a memory-less population-based stochastic optimization technique. In the original PSO algorithm, each particle learns from both its personal historical best position (pbest) and the global best position (gbest) to escape from local optima. Inspired by the original PSO algorithm, we make each individual learn simultaneously from the global best (α wolf) position and its personal historical best position (pbest), and present a novel modified position-updating rule:

$$\vec{X}(t+1) = w \cdot \frac{\vec{X}_1 + \vec{X}_2 + \vec{X}_3}{3} + c_1 \cdot \vec{r}_3 \cdot (\vec{X}_{pbest} - \vec{X}) + c_2 \cdot \vec{r}_4 \cdot (\vec{X}_1 - \vec{X}) \qquad (8)$$

where $t$ denotes the current iteration, $\vec{r}_3$ and $\vec{r}_4$ are random vectors in [0,1], $c_1 \in [0,1]$ denotes the individual memory coefficient, $c_2 \in [0,1]$ denotes the population communication coefficient, $\vec{X}_{pbest}$ denotes the personal historical best position, and $w$ denotes the inertia weight. Similar to the PSO algorithm, the value of $w$ is linearly decreased from an initial value ($w_{initial}$) to a final value ($w_{final}$) according to the following equation:

$$w(t) = \frac{max\_iter - t}{max\_iter} \cdot (w_{initial} - w_{final}) + w_{final} \qquad (9)$$

The first term on the right-hand side of Eq. (8) represents the average of the estimated positions of the three best search agents, which provides the momentum needed for search agents to roam across the search space. The second term of Eq. (8) is the "cognitive" component, similar to that of PSO, and represents the personal thinking of each search agent; it encourages the search agents to move toward their own best positions found so far. The third term of Eq. (8) is the "social" component, which represents the collaborative effect of the search agents in finding the global optimal solution; it always pulls the search agents toward the global best search agent found so far.
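A minimal Python sketch of the update of Eqs. (8) and (9) could read as follows; the naming is our own, and X1, X2, X3 are assumed to come from Eqs. (3)-(5).

```python
import numpy as np

def igwo_update(X, X_pbest, X1, X2, X3, t, max_iter,
                c1=0.5, c2=0.5, w_init=0.9, w_final=0.1):
    """Modified position update, Eq. (8), using the inertia weight of Eq. (9)."""
    N, D = X.shape
    w = (max_iter - t) / max_iter * (w_init - w_final) + w_final   # Eq. (9)
    r3, r4 = np.random.rand(N, D), np.random.rand(N, D)
    return (w * (X1 + X2 + X3) / 3.0          # momentum: mean of the leaders
            + c1 * r3 * (X_pbest - X)         # cognitive term (pbest)
            + c2 * r4 * (X1 - X))             # social term (alpha-guided X1)
```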

Compared with the position-updating equation (Eq. (6)) of the conventional GWO, the position-updating equation (Eq. (8)) has four distinguishing characteristics: 1) the inertia weight w is introduced to further accelerate the convergence speed; 2) the values of r3 and r4 vary stochastically with the agents and iterations; 3) the parameters c1 and c2 are introduced to balance the search performance of GWO; and 4) both the global best information (α wolf) and the personal historical best information (pbest) influence the candidate individual. Specifically, each individual learns from its own experience and from the experience of the other individuals in the swarm.

3.3 IGWO algorithm flowchart


The main difference between the proposed IGWO and the original GWO algorithm is the position-updating rule of Eq. (8). Additionally, balancing the exploration and exploitation of GWO through the control parameter $\vec{a}$ requires modifying the control parameter values according to Eq. (7). The flowchart of the proposed IGWO algorithm is shown in Fig. 2.

Fig. 2 The flowchart of the proposed IGWO algorithm.

3.4 Computational complexity

The time complexities of GWO and IGWO are derived as follows: 1) initialization of GWO and IGWO requires O(N×D) time, where N denotes the population size and D the dimension of the function; 2) calculating the control parameters of GWO and IGWO requires O(N×D) time; 3) the position-update step of the search agents in GWO and IGWO requires O(N×D) time; and 4) the fitness evaluation of all search agents requires O(N×D) time. In summary, the total time complexity is O(N×D) per generation. Therefore, the total time complexity of GWO and IGWO over a whole run is O(N×D×max_iter), where max_iter is the maximum number of iterations.
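Putting the pieces together, one possible self-contained IGWO loop is sketched below (using the parameter settings of Section 4.1; this is our reading, not the authors' code). Each commented step visits all N agents and D coordinates once, which makes the O(N×D) per-generation cost explicit.

```python
import numpy as np

def igwo(fitness, lb, ub, dim, n=30, max_iter=500,
         c1=0.5, c2=0.5, w_init=0.9, w_final=0.1):
    X = lb + (ub - lb) * np.random.rand(n, dim)              # init: O(N*D)
    fit = np.apply_along_axis(fitness, 1, X)                 # evals: O(N*D)
    pbest, pbest_fit = X.copy(), fit.copy()                  # personal memory
    for t in range(max_iter):
        a = 1.0 - np.log(1.0 + (np.e - 1.0) * t / max_iter)  # Eq. (7), a: 1 -> 0
        X123 = []
        for leader in X[np.argsort(fit)[:3]]:                # alpha, beta, delta
            A = 2.0 * a * np.random.rand(n, dim) - a
            C = 2.0 * np.random.rand(n, dim)
            X123.append(leader - A * np.abs(C * leader - X)) # Eqs. (3)-(5)
        w = (max_iter - t) / max_iter * (w_init - w_final) + w_final  # Eq. (9)
        r3, r4 = np.random.rand(n, dim), np.random.rand(n, dim)
        X = (w * sum(X123) / 3.0 + c1 * r3 * (pbest - X)
             + c2 * r4 * (X123[0] - X))                      # Eq. (8): O(N*D)
        X = np.clip(X, lb, ub)
        fit = np.apply_along_axis(fitness, 1, X)             # evals: O(N*D)
        better = fit < pbest_fit                             # update pbest
        pbest[better], pbest_fit[better] = X[better], fit[better]
    k = np.argmin(pbest_fit)
    return pbest[k], pbest_fit[k]
```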

4. Experiments and comparison

4.1 Benchmark problems and parameter settings

In this section, we select four typical high-dimensional benchmark test problems from [16] and list them in Table 1. These test problems cover different kinds of problems, such as unimodal, multimodal, regular, irregular, separable, and non-separable problems. All four test functions are minimization problems, and the number of their local minima increases exponentially with the problem dimension. The names, formulas, search ranges, and optimal values (fmin) of these problems are presented in Table 1.

Table 1 Test functions used in the experiments.

Function name | Function formula | Search range | fmin
Quartic | $f_1(x) = \sum_{i=1}^{n} i x_i^4 + random[0,1)$ | [-1.28, 1.28] | 0
Rastrigin | $f_2(x) = \sum_{i=1}^{n} [x_i^2 - 10\cos(2\pi x_i) + 10]$ | [-5.12, 5.12] | 0
Salmon | $f_3(x) = 1 - \cos\left(2\pi\sqrt{\sum_{i=1}^{n} x_i^2}\right) + 0.1\sqrt{\sum_{i=1}^{n} x_i^2}$ | [-100, 100] | 0
Schaffer | $f_4(x) = 0.5 + \dfrac{\sin^2\left(\sqrt{\sum_{i=1}^{n} x_i^2}\right) - 0.5}{\left(1 + 0.001\sum_{i=1}^{n} x_i^2\right)^2}$ | [-100, 100] | 0
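For reference, the four benchmarks of Table 1 are straightforward to implement; a minimal Python version (our own naming, following the formulas above) is:

```python
import numpy as np

def quartic(x):    # f1: noisy unimodal, fmin = 0 at x = 0
    return np.sum(np.arange(1, len(x) + 1) * x**4) + np.random.rand()

def rastrigin(x):  # f2: highly multimodal, fmin = 0 at x = 0
    return np.sum(x**2 - 10.0 * np.cos(2.0 * np.pi * x) + 10.0)

def salmon(x):     # f3 ("Salmon" in the paper, often called Salomon), fmin = 0
    r = np.sqrt(np.sum(x**2))
    return 1.0 - np.cos(2.0 * np.pi * r) + 0.1 * r

def schaffer(x):   # f4: fmin = 0 at x = 0
    s = np.sum(x**2)
    return 0.5 + (np.sin(np.sqrt(s))**2 - 0.5) / (1.0 + 0.001 * s)**2
```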


In all experiments, the values of the common parameters, such as the maximum number of iterations, the dimension of the functions (D), and the population size, were chosen to be the same. For all test problems, we focus on investigating the optimization performance of the proposed method on problems with D = 100, 500, and 1000. The maximum number of iterations is set to 500, and the population size is set to 30 (i.e., the maximum number of fitness evaluations is 15,000) for the four test problems at every dimension. For each experiment of an algorithm on a test function, 20 independent runs are executed with the same set of random seeds to obtain a fair comparison among the different algorithms. The other control parameters of our algorithm are set as follows: in Eq. (7), the initial and final values of the control parameter $\vec{a}$ are set to 1 and 0, respectively; in Eq. (8), the individual memory coefficient c1 and the population communication coefficient c2 are both set to 0.5 for all test problems; and the initial and final values of w are set to 0.9 and 0.1, respectively. All programs were coded in MATLAB 7.11.0 (Win32) and executed on a Lenovo computer with an Intel(R) Core(TM)2 Quad CPU Q8300 @ 2.50 GHz under the Windows 7 operating system.

4.2 Experimental results

To test the performance of IGWO for global optimization of continuous functions, we tested it on the four typical benchmark problems listed in Table 1. The proposed IGWO was compared with four other algorithms: the original GWO algorithm [1], the GWO algorithm with a nonlinear control parameter (NGWO) [6], the modified GWO algorithm (mGWO) [7], and the GWO algorithm with average weight (AWGWO) [9]. To obtain a fair comparison, the five algorithms were forced to use the same number of fitness evaluations (i.e., the population size is set to 30 and the maximum number of iterations to 500), and the results of all experiments were averaged over 30 independent runs. For all experiments, the reported results are the best, the average minimum value (mean), the worst, and the standard deviation (st.dev).

In Experiment 1, the best, mean, worst, and st.dev values found by the five algorithms on the four test functions with D=100 are presented in Table 2. From Table 2, it is evident that, overall, IGWO is the best among the five compared algorithms in a statistically significant fashion. Compared with the original GWO, NGWO, mGWO, and AWGWO algorithms, IGWO finds better results on all four test functions. The convergence curves of the function values of the five algorithms on the four test functions with D=100 are shown in Fig. 3. As seen from Fig. 3, IGWO has higher precision and a faster convergence speed than the other algorithms.

Table 2 Experimental results of GWO, NGWO, mGWO, AWGWO, and IGWO on four test functions with D=100.

Function | Statistical result | GWO | NGWO | mGWO | AWGWO | IGWO
f1(x) | Best   | 3.59×10-3 | 4.82×10-4 | 1.91×10-3 | 3.09×10-3 | 4.18×10-6
f1(x) | Mean   | 6.52×10-3 | 2.83×10-3 | 2.99×10-3 | 8.12×10-3 | 5.55×10-5
f1(x) | Worst  | 1.23×10-2 | 4.92×10-3 | 4.50×10-3 | 1.13×10-2 | 1.22×10-4
f1(x) | St.dev | 3.45×10-3 | 1.68×10-3 | 1.01×10-3 | 3.16×10-3 | 4.90×10-5
f2(x) | Best   | 1.77×10-7 | 4.12×10-10 | 9.40×10-10 | 7.67×10-9 | 0
f2(x) | Mean   | 9.24×100 | 2.72×10-5 | 7.56×10-3 | 7.50×100 | 0
f2(x) | Worst  | 2.12×101 | 1.52×10-4 | 3.43×10-2 | 1.67×101 | 0
f2(x) | St.dev | 8.36×100 | 6.93×10-5 | 8.37×10-3 | 6.31×100 | 0
f3(x) | Best   | 3.00×10-1 | 2.00×10-1 | 2.00×10-1 | 3.00×10-1 | 0
f3(x) | Mean   | 3.40×10-1 | 2.00×10-1 | 2.00×10-1 | 3.40×10-1 | 0
f3(x) | Worst  | 4.00×10-1 | 2.00×10-1 | 2.00×10-1 | 4.00×10-1 | 0
f3(x) | St.dev | 5.48×10-2 | 0 | 0 | 5.48×10-2 | 0
f4(x) | Best   | 9.72×10-3 | 9.72×10-3 | 9.72×10-3 | 3.72×10-3 | 0
f4(x) | Mean   | 9.72×10-3 | 9.72×10-3 | 9.72×10-3 | 3.72×10-3 | 0
f4(x) | Worst  | 9.72×10-3 | 9.72×10-3 | 9.72×10-3 | 3.72×10-3 | 0
f4(x) | St.dev | 0 | 0 | 0 | 0 | 0

[Fig. 3 shows, for each of the four functions (Quartic, Rastrigin, Salmon, and Schaffer, D=100), the average fitness value (log scale) versus iteration (0-500) for GWO, NGWO, mGWO, AWGWO, and IGWO.]

Fig. 3 Convergence curves of the five algorithms for four test functions with D=100.

In Experiment 2, the performance of IGWO was compared with that of GWO, NGWO, mGWO, and AWGWO on the four test functions with D=500 to analyze their scalability. Table 3 demonstrates that, with increasing dimensionality, IGWO continues to provide the best solution. Therefore, we can conclude that IGWO is insensitive to growing dimensionality and has superior scalability. Compared with the GWO, NGWO, mGWO, and AWGWO algorithms, the proposed IGWO obtained better results on all test functions. The convergence curves of the five algorithms on the four test functions with D=500 are shown in Fig. 4.

Table 3 Experimental results of GWO, NGWO, mGWO, AWGWO, and IGWO on four test functions with D=500.

Function | Statistical result | GWO | NGWO | mGWO | AWGWO | IGWO
f1(x) | Best   | 3.57×10-2 | 4.67×10-3 | 8.64×10-3 | 2.81×10-2 | 1.77×10-5
f1(x) | Mean   | 4.61×10-2 | 1.15×10-2 | 1.90×10-2 | 4.33×10-2 | 7.10×10-5
f1(x) | Worst  | 7.03×10-2 | 1.98×10-2 | 3.61×10-2 | 5.57×10-2 | 1.24×10-4
f1(x) | St.dev | 1.39×10-2 | 5.89×10-3 | 1.15×10-2 | 1.03×10-2 | 3.85×10-5
f2(x) | Best   | 5.37×101 | 1.66×10-8 | 1.32×10-6 | 2.27×101 | 0
f2(x) | Mean   | 8.81×101 | 1.09×10-1 | 3.43×100 | 5.03×101 | 0
f2(x) | Worst  | 1.28×102 | 5.50×10-1 | 1.49×101 | 6.74×101 | 0
f2(x) | St.dev | 2.89×101 | 2.46×10-1 | 6.68×100 | 1.73×101 | 0
f3(x) | Best   | 1.00×100 | 2.00×10-1 | 4.00×10-1 | 9.00×10-1 | 0
f3(x) | Mean   | 1.06×100 | 3.00×10-1 | 4.00×10-1 | 9.00×10-1 | 0
f3(x) | Worst  | 1.10×100 | 4.00×10-1 | 4.00×10-1 | 9.00×10-1 | 0
f3(x) | St.dev | 5.48×10-2 | 1.00×10-1 | 0 | 0 | 0
f4(x) | Best   | 3.46×10-1 | 7.82×10-2 | 7.82×10-2 | 2.73×10-1 | 0
f4(x) | Mean   | 3.96×10-1 | 1.27×10-1 | 1.27×10-1 | 3.11×10-1 | 0
f4(x) | Worst  | 4.42×10-1 | 3.99×10-1 | 3.99×10-1 | 3.46×10-1 | 0
f4(x) | St.dev | 3.35×10-2 | 2.67×10-2 | 2.67×10-2 | 2.58×10-2 | 0

[Fig. 4 shows, for each of the four functions with D=500, the average fitness value (log scale) versus iteration (0-500) for GWO, NGWO, mGWO, AWGWO, and IGWO.]

Fig. 4 Convergence curves of the five algorithms for four test functions with D=500.

In Experiment 3, to further verify the scalability of the five algorithms, the four test functions with 1000 dimensions were used. The best, mean, worst, and st.dev results are listed in Table 4. When examining the values given in Table 4 for dimension 1000, IGWO still continues to obtain the best results on the four test functions and significantly outperforms the GWO, NGWO, mGWO, and AWGWO algorithms. We also provide the convergence curves of the five algorithms on the four test functions in Fig. 5. As seen from Fig. 5, IGWO has higher precision and a faster convergence speed than the other algorithms.

Table 4 Experimental results of GWO, NGWO, mGWO, AWGWO, and IGWO on four test functions with D=1000.

Function | Statistical result | GWO | NGWO | mGWO | AWGWO | IGWO
f1(x) | Best   | … | … | … | … | 3.73×10-5
f1(x) | Mean   | … | … | … | … | 7.92×10-5
f1(x) | Worst  | … | … | … | … | 1.45×10-4
f1(x) | St.dev | … | … | … | … | 4.35×10-5
f2(x) | Best   | 1.17×102 | … | … | 8.41×101 | 0
f2(x) | Mean   | 1.83×102 | 1.24×100 | 8.78×100 | 1.49×102 | 0
f2(x) | Worst  | 2.65×102 | 6.28×100 | 2.22×101 | 2.17×102 | 0
f2(x) | St.dev | 6.45×101 | 7.56×10-1 | 9.73×100 | 4.76×101 | 0
f3(x) | Best   | 1.50×100 | 4.00×10-1 | 6.00×10-1 | 1.70×100 | 0
f3(x) | Mean   | 1.59×100 | 4.00×10-1 | 6.00×10-1 | 1.80×100 | 0
f3(x) | Worst  | 1.70×100 | 4.00×10-1 | 6.00×10-1 | 1.90×100 | 0
f3(x) | St.dev | 1.00×10-1 | 0 | 0 | 1.00×10-1 | 0
f4(x) | Best   | 4.76×10-1 | 7.82×10-2 | 2.28×10-1 | 4.42×10-1 | 0
f4(x) | Mean   | 4.80×10-1 | 1.27×10-1 | 2.28×10-1 | 4.60×10-1 | 0
f4(x) | Worst  | 4.83×10-1 | 3.99×10-1 | 2.28×10-1 | 4.72×10-1 | 0
f4(x) | St.dev | 2.95×10-3 | 2.67×10-2 | 0 | 1.33×10-2 | 0

[Fig. 5 shows, for each of the four functions with D=1000, the average fitness value (log scale) versus iteration (0-500) for GWO, NGWO, mGWO, AWGWO, and IGWO.]

Fig. 5 Convergence curves of the five algorithms for four test functions with D=1000.

4.3 Comparison with state-of-the-art algorithms


In this section, IGWO was also compared with four state-of-the-art algorithms: CLPSO [17], GL-25 [18], CMA-ES [19], and CoDE [20]. Four test functions proposed in CEC 2005 were used to investigate the performance of IGWO and these four algorithms; a detailed description of the functions can be found in [21]. CLPSO, proposed by Liang et al., is an improved version of PSO with a novel learning strategy in which all other particles' historical best information is used to update a particle's velocity. GL-25, proposed by García-Martínez et al., is an improved version of GA based on parent-centric crossover operators. CMA-ES, proposed by Hansen and Ostermeier, is an improved evolution strategy (ES) based on the concept of covariance matrix adaptation (CMA). CoDE, proposed by Wang et al., is an improved version of DE based on composite trial vector generation strategies and control parameters. These four algorithms were selected for comparison for two reasons: (1) CLPSO, GL-25, CMA-ES, and CoDE represent the state of the art in PSO, GA, ES, and DE, respectively; and (2) their numerical performances are very competitive. Table 5 presents the experimental results of IGWO and the four algorithms mentioned above. To obtain a fair comparison, the parameter settings for the four compared algorithms follow their original papers, and their experimental results were taken directly from [20]. From Table 5, compared with CLPSO, GL-25, CMA-ES, and CoDE, IGWO finds better results on all four test functions.

Table 5 Comparison among IGWO, CLPSO, GL-25, CMA-ES, and CoDE on optimizing four benchmark test functions (D=30).

Function | CLPSO (Mean Error ± St.dev) | GL-25 (Mean Error ± St.dev) | CMA-ES (Mean Error ± St.dev) | CoDE (Mean Error ± St.dev) | IGWO (Mean Error ± St.dev)
Uni-modal function F1 | 6.99×103 ± 1.73×103 | 9.07×102 ± 4.25×102 | 9.15×105 ± 2.16×106 | 5.81×10-3 ± 1.38×10-2 | 5.13×10-3 ± 1.14×10-2
Basic multi-modal function F2 | 1.04×102 ± 1.53×101 | 1.42×102 ± 6.45×101 | 4.63×101 ± 1.16×101 | 4.15×101 ± 1.16×101 | 3.80×101 ± 9.22×100
Expanded multi-modal function F3 | 2.06×100 ± 2.15×10-1 | 6.23×100 ± 4.88×100 | 3.43×100 ± 7.60×10-1 | 1.57×100 ± 3.27×10-1 | 1.36×100 ± 3.28×10-1
Hybrid composition function F4 | 2.46×102 ± 4.81×101 | 1.61×102 ± 6.80×101 | 4.43×102 ± 3.34×102 | 6.67×101 ± 2.12×101 | 6.60×101 ± 1.57×101

4.4 The effectiveness of the two components of IGWO


As mentioned previously, IGWO includes two main components: the nonlinear control parameter adjustment strategy (Subsection 3.1) and the modified position-updating equation (Subsection 3.2). The goal of this subsection is to verify the effectiveness of these two components. To accomplish this, two additional experiments were performed on the four test functions with D = 100, 500, and 1000. In the first experiment, IGWO used only the nonlinear control parameter adjustment strategy of Subsection 3.1; the modified position-updating equation of Subsection 3.2 was not used, and, as in [1], the original position-updating equation (Eq. (6)) was used instead (this variant is denoted IGWO-1). In the second experiment, IGWO used only the modified position-updating equation of Subsection 3.2; the nonlinear control parameter adjustment strategy of Subsection 3.1 was ignored, and, as in [1], the original control parameter adjustment (Eq. (2)) was used instead (this variant is denoted IGWO-2). The parameter settings for IGWO-1 and IGWO-2 were the same as those in Subsection 4.1. Table 6 lists the experimental results of the three algorithms on the four functions with different dimensions.

Table 6 Comparison results of IGWO-1, IGWO-2, and IGWO on four test functions with D=100, 500, and 1000.

Function | Dimension | IGWO-1 Mean | IGWO-1 St.dev | IGWO-2 Mean | IGWO-2 St.dev | IGWO Mean | IGWO St.dev
f1(x) | D=100 | 1.02×10-1 | 1.52×10-2 | 7.87×10-5 | 6.52×10-5 | 5.55×10-5 | 4.90×10-5
f1(x) | D=500 | 1.94×10-1 | … | … | … | 7.10×10-5 | 3.85×10-5
f1(x) | D=1000 | 2.63×10-1 | … | 9.02×10-5 | 6.63×10-5 | 7.92×10-5 | 4.35×10-5
f2(x) | D=100 | 2.11×101 | … | 0 | 0 | 0 | 0
f2(x) | D=500 | 2.47×103 | 1.69×102 | 0 | 0 | 0 | 0
f2(x) | D=1000 | 6.52×103 | 2.41×102 | 0 | 0 | 0 | 0
f3(x) | D=100 | 2.50×100 | 7.07×10-2 | 0 | 0 | 0 | 0
f3(x) | D=500 | 1.30×101 | 6.53×10-1 | 0 | 0 | 0 | 0
f3(x) | D=1000 | 2.55×101 | 1.43×100 | 2.00×10-2 | 4.46×10-2 | 0 | 0
f4(x) | D=100 | 4.79×10-1 | 4.10×10-3 | 0 | 0 | 0 | 0
f4(x) | D=500 | 5.00×10-1 | … | … | … | 0 | 0
f4(x) | D=1000 | 5.00×10-1 | … | … | … | 0 | 0

From Table 6, IGWO surpassed IGWO-1 on all four test functions at all tested dimensions. We attribute this to the fact that, in IGWO, the nonlinear control parameter adjustment strategy (Eq. (7)) does not act alone during the search process; combined with the modified position-updating equation, it can adapt the search to different landscapes. Additionally, IGWO and IGWO-2 obtained similar results on the Quartic and Rastrigin functions at all dimensions, on the Salmon function with D=100 and D=500, and on the Schaffer function with D=100. On the Salmon function with D=1000 and the Schaffer function with D=500 and D=1000, IGWO obtained better results. This is not difficult to understand: the modified position-updating equation (Eq. (8)) is more effective at balancing exploration and convergence speed during the evolution process.


4.5 Application to practical engineering design problems


In this subsection, the performance of the proposed IGWO algorithm is further tested on practical engineering design problems. Three well-known practical engineering design problems, namely pressure vessel design [22], tension/compression coil spring design [23], and welded beam design [24], have been widely used in the literature to show the validity and effectiveness of meta-heuristic algorithms. Detailed information on these design problems can be found in their original papers. Deb's feasibility-based rules [25] are used to handle the constraints (sketched below). The parameters of IGWO are set as follows: the number of search agents is 10, and the maximum number of iterations is set to 5000. Tables 7-9 compare the optimization results of IGWO with those of other optimization algorithms reported in the literature for the pressure vessel design, tension/compression coil spring design, and welded beam design problems, respectively. Note that "NA" denotes the unavailability of a result in Tables 7-9. The parameter settings for the compared algorithms are the same as in their original papers, and their experimental results were taken directly from those papers as well.
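Deb's rules reduce to a simple pairwise comparison; a minimal Python sketch of our reading of the rules in [25] is:

```python
def deb_better(f_a, viol_a, f_b, viol_b):
    """Return True if solution a is preferred over solution b under Deb's
    feasibility rules [25]. viol_* is the total constraint violation
    (0 when feasible); f_* is the objective value (minimization)."""
    if viol_a == 0 and viol_b == 0:
        return f_a < f_b           # both feasible: better objective wins
    if viol_a == 0 or viol_b == 0:
        return viol_a == 0         # a feasible solution beats an infeasible one
    return viol_a < viol_b         # both infeasible: smaller violation wins
```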


Table 7 Comparison results of IGWO and other algorithms for the pressure vessel design problem.

Algorithm | x1 (best) | x2 (best) | x3 (best) | x4 (best) | f (best) | f (mean) | f (worst) | St.dev | Max_NFEs
GA [26] | 0.8125 | 0.4375 | 40.3239 | 200.0000 | 6288.7445 | 6293.8432 | 6308.4970 | 7.4133 | 900,000
SMES [27] | 0.8125 | 0.4375 | 42.0981 | 176.6405 | 6059.7500 | 6850.0000 | 7332.8800 | 426.00 | NA
CDE [28] | 0.8125 | 0.4375 | 42.0984 | 176.6377 | 6059.7340 | 6085.2303 | 6371.0455 | 43.013 | 204,800
CPSO [29] | 0.8125 | 0.4375 | 42.0913 | 176.7465 | 6061.0777 | 6147.1332 | 6363.8041 | 86.45 | 240,000
G-QPSO [30] | 0.8125 | 0.4375 | 42.0984 | 176.6372 | 6059.7208 | 6440.3786 | 7544.4925 | 448.4711 | 8,000
GDA [31] | 0.8125 | 0.4375 | 42.0975 | 176.6484 | 6059.8391 | 6149.7276 | 6823.6024 | 210.77 | 20,000
BA [32] | 0.8125 | 0.4375 | 42.0984 | 176.6366 | 6059.7143 | 6179.1300 | 6318.9500 | 137.223 | 20,000
AFA [33] | 0.8125 | 0.4375 | 42.0984 | 176.6366 | 6059.7143 | 6064.3361 | 6090.5261 | 11.2878 | 50,000
CSA [34] | 0.8125 | 0.4375 | 42.0984 | 176.6366 | 6059.7144 | 6342.4991 | 7332.8416 | 384.9454 | 250,000
TEO [35] | 0.8125 | 0.4325 | 42.0984 | 173.6366 | 6059.71 | 6138.61 | 6410.19 | 129.9033 | 20,000
IGWO | 12.85317 | 6.980472 | 42.09806 | 176.6416 | 6059.7659 | 6059.9066 | 6060.1246 | 0.139521 | 50,000

Table 8 Comparison results of IGWO and other algorithms for the tension/compression coil spring design problem.

Algorithm | x1 (best) | x2 (best) | x3 (best) | f (best) | f (mean) | f (worst) | St.dev | Max_NFEs
DGA [36] | 10.890522 | 0.363965 | 0.051989 | 0.012681 | 0.012742 | 0.012973 | 5.90×10-5 | 80,000
SC [37] | 10.6484423 | 0.3681587 | 0.0521602 | 0.012669 | 0.012923 | 0.016717 | 5.92×10-4 | 30,000
CPSO [29] | 11.244543 | 0.357644 | 0.051728 | 0.0126747 | 0.012730 | 0.012924 | 5.20×10-5 | 200,000
DEDS [38] | 11.288965 | 0.356717 | 0.051689 | 0.012665 | 0.012669 | 0.012738 | 1.30×10-5 | 24,000
AATM [39] | 11.119253 | 0.051813 | 0.359690 | 0.0126683 | 0.0127081 | 0.0128614 | 4.50×10-5 | 25,000
GSA [40] | 14.22867 | 0.05000 | 0.317312 | 0.0128739 | 0.0134389 | 0.0142117 | 1.34×10-2 | 30,000
GWO [1] | 12.04249 | 0.344541 | 0.051178 | 0.0126723 | 0.0126971 | 0.0127208 | 2.10×10-5 | 30,000
CSA [34] | 11.289012 | 0.356717 | 0.051689 | 0.0126652 | 0.012666 | 0.0126702 | 1.36×10-6 | 50,000
TEO [35] | 11.168390 | 0.358792 | 0.051775 | 0.012665 | 0.012685 | 0.012715 | 4.41×10-6 | 300,000
MGWO [10] | 11.80809 | 0.348197 | 0.051334 | 0.0126696 | 0.0126799 | 0.0127057 | 1.10×10-5 | 30,000
IGWO | 11.2756 | 0.356983 | 0.051701 | 0.012667 | 0.012691 | 0.012718 | 1.97×10-5 | 50,000

Table 9 Comparison results of IGWO and other algorithms for the welded beam design problem.

Algorithm | x1 (best) | x2 (best) | x3 (best) | x4 (best) | f (best) | f (mean) | f (worst) | St.dev | Max_NFEs
DGA [36] | 0.205986 | 3.471328 | 9.020224 | 0.206480 | 1.728226 | 1.792654 | 1.993408 | 7.47×10-2 | 80,000
ESs [41] | 0.199742 | 3.612060 | 9.037500 | 0.206082 | 1.7373 | 1.81329 | 1.994651 | 7.05×10-2 | NA
CDE [28] | 0.203137 | 3.542998 | 9.033498 | 0.206179 | 1.733462 | 1.768158 | 1.824105 | 2.22×10-2 | 204,800
CPSO [29] | 0.202369 | 3.544214 | 9.048210 | 0.205723 | 1.728024 | 1.748831 | 1.782143 | 1.29×10-2 | 200,000
BA [32] | 0.2015 | 3.562 | 9.0414 | 0.2057 | 1.7312 | 1.878656 | 2.345579 | 2.68×10-1 | 20,000
GWO [1] | 0.205678 | 3.471403 | 9.036964 | 0.205729 | 1.724995 | 1.725228 | 1.725664 | 1.87×10-4 | 30,000
TEO [35] | 0.205681 | 3.472305 | 9.035133 | 0.205796 | 1.725284 | 1.768040 | 1.931161 | 5.82×10-2 | 20,000
MGWO [10] | 0.205667 | 3.471899 | 9.036679 | 0.205733 | 1.724984 | 1.725156 | 1.725420 | 1.37×10-4 | 30,000
IGWO | 0.20571 | 3.4714 | 9.0369 | 0.20573 | 1.7250 | 1.7255 | 1.7263 | 4.72×10-4 | 50,000

For the pressure vessel design problem, the best index in Table 7 shows that the result found by TEO was superior to those obtained by the remaining algorithms. However, the mean, worst, and st.dev results obtained by IGWO outperformed the reported results of the compared algorithms. Moreover, G-QPSO used the minimum number of FEs (8,000) for the pressure vessel design problem, while GA used a considerable number of FEs (900,000); the number of FEs used by IGWO (50,000) was moderate among the compared algorithms. For the tension/compression coil spring design problem, the best index in Table 8 shows that the result obtained by IGWO was inferior to those of the DEDS, TEO, and CSA algorithms. For the mean, worst, and st.dev indices, the results obtained by IGWO were better than those of the other algorithms except for CSA, TEO, and MGWO. In addition, IGWO used a moderate number of FEs (50,000) for the tension/compression coil spring design problem. For the welded beam design problem, Table 9 shows that, compared with DGA, ESs, CDE, CPSO,

BA, and TEO, IGWO found better best, mean, worst, and st.dev results. However, IGWO provided worse results than the GWO and MGWO algorithms.

4.6 IGWO optimizes the parameters of LSSVM for electricity load forecasting

In this section, IGWO is used to optimize the parameters of the least squares support vector machine (LSSVM) model for electricity load forecasting in Longhui, Hunan, China. Recently, an increasing number of researchers have focused on the electricity load forecasting problem and constructed models to improve the forecasting accuracy of electricity load [42]. The LSSVM model was developed by Suykens and Vandewalle [43]; it solves a set of linear equations instead of the convex quadratic programming (QP) problem of the standard SVM. Given a training set $\{x_i, y_i\}_{i=1}^{N}$ with input data $x_i$ and corresponding output data $y_i$, where $N$ is the size of the sample set, the samples are mapped into a high-dimensional space for nonlinear regression using the following equation [43]:

$$y(x) = w^T \varphi(x) + b \qquad (10)$$

where $\varphi(x)$ denotes the nonlinear mapping function, $w$ denotes the weight vector, and $b$ denotes the bias. The $w$ and $b$ can be obtained from the optimization problem [43]:

$$\min_{w,b,e} J(w,e) = \frac{1}{2} w^T w + \frac{\gamma}{2} \sum_{i=1}^{N} e_i^2 \quad \text{s.t.} \quad y_i = w^T \varphi(x_i) + b + e_i, \quad i = 1, 2, \ldots, N \qquad (11)$$

where $\gamma$ is the regularization parameter and $e_i$ is the error between the actual and predicted output. Based on Eq. (11), the Lagrangian function is constructed [43]:

$$L(w, b, e, \alpha) = \frac{1}{2} w^T w + \frac{\gamma}{2} \sum_{i=1}^{N} e_i^2 - \sum_{i=1}^{N} \alpha_i \left\{ w^T \varphi(x_i) + b + e_i - y_i \right\} \qquad (12)$$

where $\alpha_i$ are the Lagrange multipliers. The conditions for optimality are as follows:

$$\frac{\partial L}{\partial w} = 0 \Rightarrow w = \sum_{i=1}^{N} \alpha_i \varphi(x_i); \quad \frac{\partial L}{\partial b} = 0 \Rightarrow \sum_{i=1}^{N} \alpha_i = 0; \quad \frac{\partial L}{\partial e_i} = 0 \Rightarrow \alpha_i = \gamma e_i; \quad \frac{\partial L}{\partial \alpha_i} = 0 \Rightarrow w^T \varphi(x_i) + b + e_i - y_i = 0 \qquad (13)$$

The elimination of $w$ and $e$ yields the following linear problem:

$$\begin{bmatrix} 0 & Q^T \\ Q & \Omega + \gamma^{-1} I \end{bmatrix} \begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ Y \end{bmatrix} \qquad (14)$$

where $Q = [1, 1, \ldots, 1]^T$, $\alpha = [\alpha_1, \alpha_2, \ldots, \alpha_N]^T$, and $Y = [y_1, y_2, \ldots, y_N]^T$. Here, $I$ is an $N \times N$ identity matrix and $\Omega$ is the kernel matrix defined by $\Omega_{ij} = \varphi(x_i)^T \varphi(x_j) = K(x_i, x_j)$. The resulting LSSVM model for regression in Eq. (10) becomes [43]:

$$y(x) = \sum_{i=1}^{N} \alpha_i K(x, x_i) + b \qquad (15)$$

In this work, the radial basis function (RBF) kernel is used because of its suitability for nonlinear cases [44] and its satisfactory performance in many prediction applications [45]:

$$K(x, x_i) = \exp\left(-\|x - x_i\|^2 / \sigma^2\right) \qquad (16)$$

where $\sigma$ is a tuning parameter. As mentioned above, the generalization performance of the LSSVM model is highly dependent on the values of its two parameters, namely, the regularization parameter $\gamma$ and the kernel parameter $\sigma$ [43]. Therefore, this study utilizes IGWO to optimize these two parameters of the LSSVM model. The objective function is guided by the mean absolute percentage error (MAPE):

$$MAPE = \frac{1}{N} \sum_{i=1}^{N} \left| \frac{y_i - y(x_i)}{y_i} \right| \qquad (17)$$

where $y_i$ indicates the actual value and $y(x_i)$ is the $i$-th predicted value obtained by Eq. (15). The goal is to find the combination of the two parameters that minimizes the MAPE:

$$\min f(\gamma, \sigma) = \frac{1}{N} \sum_{i=1}^{N} \left| \frac{y_i - y(x_i)}{y_i} \right| \quad \text{s.t.} \quad \gamma \in [\gamma_{min}, \gamma_{max}], \ \sigma \in [\sigma_{min}, \sigma_{max}] \qquad (18)$$
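Concretely, training the LSSVM amounts to one linear solve of Eq. (14); a minimal Python sketch of this step (our own code, assuming the formulation above) is:

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    # Eq. (16): K(x, x') = exp(-||x - x'||^2 / sigma^2)
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / sigma**2)

def lssvm_fit(X, y, gamma, sigma):
    """Solve the (N+1)x(N+1) linear system of Eq. (14) for b and alpha."""
    N = len(y)
    M = np.zeros((N + 1, N + 1))
    M[0, 1:] = 1.0                                  # Q^T
    M[1:, 0] = 1.0                                  # Q
    M[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(N) / gamma
    sol = np.linalg.solve(M, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]                          # b, alpha

def lssvm_predict(X_train, b, alpha, sigma, X_new):
    # Eq. (15): y(x) = sum_i alpha_i * K(x, x_i) + b
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b
```

IGWO then searches the (γ, σ) box of Eq. (18), scoring each candidate pair by the MAPE of Eq. (17).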

In this study, as an example, the actual hourly load data (24 whole-point values in MW per day) of Longhui, Hunan, China from August 1 to August 29 were used as the original load data, and the 24 whole-point load values on August 30 were used to test the forecasting accuracy. To evaluate the performance of the proposed model, this study utilizes the following three measures: mean absolute error (MAE), root mean square error (RMSE), and mean absolute percentage error (MAPE). The parameters of IGWO in IGWO-LSSVM are set as follows: the population size is set to 30, and the maximum number of iterations is set to 100. The boundaries of $\gamma$ and $\sigma$ are set to the ranges [1, 10000] and [0.1, 100], respectively. Table 10 shows the MAE, RMSE, and MAPE results of the three models (LSSVM, GWO-LSSVM, and IGWO-LSSVM). Fig. 6 illustrates the load series forecasting results given by the LSSVM, GWO-LSSVM, and IGWO-LSSVM models.
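The three accuracy measures are standard; for completeness, a one-line Python version of each (our own naming) is:

```python
import numpy as np

def mae(y, y_hat):   # mean absolute error
    return np.mean(np.abs(y - y_hat))

def rmse(y, y_hat):  # root mean square error
    return np.sqrt(np.mean((y - y_hat) ** 2))

def mape(y, y_hat):  # mean absolute percentage error, Eq. (17)
    return np.mean(np.abs((y - y_hat) / y))
```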

Table 10 Three statistical measures of the three forecasting models.

Model | MAE | RMSE | MAPE
LSSVM | 1.491167 | 1.604940 | 0.056858
GWO-LSSVM | 0.691667 | 0.851469 | 0.026212
IGWO-LSSVM | 0.358333 | 0.425245 | 0.013377

[Fig. 6 plots the actual electricity load (MW) over the 24 hourly time points against the values predicted by the LSSVM, GWO-LSSVM, and IGWO-LSSVM models.]

Fig. 6 The load forecasting results from LSSVM, GWO-LSSVM, and IGWO-LSSVM models.

From Table 10, compared with the LSSVM and GWO-LSSVM models, the IGWO-LSSVM model obtains smaller MAE, RMSE, and MAPE values. As shown in Fig. 6, the IGWO-LSSVM model has higher accuracy than the LSSVM and GWO-LSSVM models.

5. Conclusion


This paper presents a new GWO algorithm called IGWO to obtain a better tradeoff between exploration and exploitation. In IGWO, inspired by PSO, a novel nonlinear strategy for the control parameter and a modified position-updating equation are developed to improve the performance of the GWO algorithm. Eight benchmark test functions, three well-known engineering design problems, and one real-world problem were used to test the performance of IGWO in terms of exploration, exploitation, and convergence. First, the exploitation ability of IGWO was confirmed by the results on the unimodal functions. Second, the results on the multimodal functions demonstrated the superior exploration of IGWO. The simulations showed that IGWO provides very competitive results compared with recent and state-of-the-art heuristic algorithms, such as GWO, NGWO, mGWO, AWGWO, CLPSO, GL-25, CMA-ES, and CoDE.

Although IGWO is efficient, it has some weaknesses that should be addressed in future studies. IGWO has more parameters than the original version, and the current study does not tune these control parameters automatically. Another limitation of IGWO is that finding an optimal solution is not 100% guaranteed. In future work, we will investigate how to extend the IGWO algorithm to handle constrained optimization, multi-objective optimization, and combinatorial optimization problems. Applying IGWO to more complex real-world problems is also desirable.

Acknowledgments

The authors sincerely thank the anonymous associate editor and the four anonymous reviewers for providing detailed and valuable comments and suggestions that greatly helped us improve the quality of this paper. They also gratefully acknowledge Dr. Tiebin Wu and Dr. Jianguo Fan for improving the presentation of this paper. This work was supported by the National Natural Science Foundation of China under Grant No. 61463009, the Science and Technology Foundation of Guizhou Province under Grant No. [2016]1022, the Program for the Science and Technology Top Talents of Higher Learning Institutions of Guizhou under Grant No. KY[2017]070, the Joint Foundation of Guizhou University of Finance and Economics and the Ministry of Commerce under Grant No. 2016SWBZD13, the Education Department of Guizhou Province Projects under Grant No. KY[2017]004, the Central Support Local Projects under Grant No. PXM 2013-014210-000173, the Project of High Level Creative Talents in Guizhou Province under Grant No. 20164035, and the Natural Science Foundation of Hunan Province under Grant No. 2016JJ3079.

References

[1] S. Mirjalili, S.M. Mirjalili, A. Lewis, Grey wolf optimizer, Adv. Eng. Softw. 69 (2014) 46-61.
[2] S. Saremi, S.Z. Mirjalili, S.M. Mirjalili, Evolutionary population dynamics and grey wolf optimizer, Neural Comput. Appl. 26 (5) (2015) 1257-1263.
[3] V.K. Kamboj, A novel hybrid PSO-GWO approach for unit commitment problem, Neural Comput. Appl. 27 (6) (2016) 1643-1655.
[4] A. Zhu, C. Xu, Z. Li, J. Wu, Z. Liu, Hybridizing grey wolf optimization with differential evolution for global optimization and test scheduling for 3D stacked SoC, J. Syst. Eng. Electron. 26 (2) (2015) 317-328.
[5] S. Zhang, Y. Zhou, Grey wolf optimizer based on Powell local optimization method for clustering analysis, Discrete Dyn. Nat. Soc. (2015) 1-7.
[6] W. Long, Grey wolf optimizer based on nonlinear adjustment control parameter, in: International Conference on Sensors, Mechatronics and Automation, 2016, pp. 643-648.
[7] N. Mittal, U. Singh, B.S. Sohi, Modified grey wolf optimizer for global engineering optimization, Appl. Comput. Intell. Soft Comput. (2016) Article ID 7950348, 1-16.
[8] E. Emary, H.M. Zawbaa, C. Grosan, Experienced grey wolf optimization through reinforcement learning and neural networks, IEEE Trans. Neural Netw. Learn. Syst. PP (99) (2017) 1-14.
[9] L. Rodríguez, O. Castillo, J. Soria, P. Melin, F. Valdez, C.I. Gonzalez, G.E. Martinez, J. Soto, A fuzzy hierarchical operator in the grey wolf optimizer algorithm, Appl. Soft Comput. 57 (2017) 315-328.
[10] V. Kumar, D. Kumar, An astrophysics-inspired grey wolf algorithm for numerical optimization and its application to engineering design problems, Adv. Eng. Softw. 112 (2017) 231-254.


[11] A.A. Heidari, P. Pahlavani, An efficient modified grey wolf optimizer with Lévy flight for optimization tasks, Appl. Soft Comput. 60 (2017) 115-134.
[12] W. Gao, S. Liu, A modified artificial bee colony algorithm, Comput. Oper. Res. 39 (3) (2012) 687-697.
[13] J. Kennedy, R. Eberhart, Particle swarm optimization, in: Proceedings of the IEEE International Conference on Neural Networks, vol. 4, 1995, pp. 1942-1948.
[14] Y. Peng, B. Lu, A hierarchical particle swarm optimizer with latin sampling based memetic algorithm for numerical optimization, Appl. Soft Comput. 13 (5) (2013) 2823-2836.
[15] A. Chatterjee, P. Siarry, Nonlinear inertia weight variation for dynamic adaptation in particle swarm optimization, Comput. Oper. Res. 33 (3) (2006) 859-871.
[16] R. Jensi, G. Wiselin Jiji, An enhanced particle swarm optimization with levy flight for global optimization, Appl. Soft Comput. 43 (2016) 248-261.
[17] J.J. Liang, A.K. Qin, P.N. Suganthan, S. Baskar, Comprehensive learning particle swarm optimizer for global optimization of multimodal functions, IEEE Trans. Evol. Comput. 10 (2006) 281-295.
[18] C. García-Martínez, M. Lozano, F. Herrera, D. Molina, A.M. Sánchez, Global and local real-coded genetic algorithms based on parent-centric crossover operators, Eur. J. Oper. Res. 185 (3) (2008) 1088-1113.
[19] N. Hansen, A. Ostermeier, Completely derandomized self-adaptation in evolution strategies, Evol. Comput. 9 (2) (2001) 159-195.
[20] Y. Wang, Z.X. Cai, Q.F. Zhang, Differential evolution with composite trial vector generation strategies and control parameters, IEEE Trans. Evol. Comput. 15 (1) (2011) 55-66.
[21] P.N. Suganthan, N. Hansen, J.J. Liang, K. Deb, Y.-P. Chen, A. Auger, S. Tiwari, Problem definitions and evaluation criteria for the CEC 2005 special session on real-parameter optimization, Technical Report, Nanyang Technol. Univ., Singapore, and IIT Kanpur, 2005.
[22] E. Sandgren, Nonlinear integer and discrete programming in mechanical design optimization, J. Mech. Design 112 (2) (1990) 223-229.
[23] A.D. Belegundu, A study of mathematical programming methods for structural optimization, PhD Thesis, Department of Civil and Environmental Engineering, University of Iowa, Iowa, 1982.
[24] A.D. Belegundu, J.S. Arora, A study of mathematical programming methods for structural optimization, Int. J. Numer. Methods Eng. 21 (1985) 1583-1599.
[25] K. Deb, An efficient constraint handling method for genetic algorithms, Comput. Methods Appl. Mech. Eng. 186 (2000) 311-338.
[26] C.A.C. Coello, Use of a self-adaptive penalty approach for engineering optimization problems, Comput. Ind. 41 (2000) 113-127.
[27] E.M. Montes, C.A.C. Coello, A simple multimembered evolution strategy to solve constrained optimization problems, IEEE Trans. Evol. Comput. 9 (1) (2005) 1-17.
[28] R. Becerra, C.A.C. Coello, Cultured differential evolution for constrained optimization, Comput. Methods Appl. Mech. Eng. 195 (2006) 4303-4322.
[29] Q. He, L. Wang, An effective co-evolutionary particle swarm optimization for constrained engineering design problems, Eng. Appl. Artif. Intell. 20 (1) (2007) 89-99.
[30] L.D.S. Coelho, Gaussian quantum-behaved particle swarm optimization approaches for constrained engineering design problems, Expert Syst. Appl. 37 (2010) 1676-1683.
[31] A. Baykasoğlu, Design optimization with chaos embedded great deluge algorithm, Appl. Soft Comput. 12 (2012) 1055-1067.
[32] A.H. Gandomi, X. Yang, A.H. Alavi, S. Talatahari, Bat algorithm for constrained optimization tasks, Neural Comput. Appl. 22 (6) (2013) 1239-1255.
[33] A. Baykasoğlu, F.B. Ozsoydan, Adaptive firefly algorithm with chaos for mechanical design optimization problems, Appl. Soft Comput. 36 (2015) 152-164.
[34] A. Askarzadeh, A novel meta-heuristic method for solving constrained engineering optimization problems: Crow search algorithm, Comput. Struct. 169 (2016) 1-12.
[35] A. Kaveh, A. Dadras, A novel meta-heuristic optimization algorithm: Thermal exchange optimization, Adv. Eng. Softw. 110 (2017) 69-84.
[36] M.E. Montes, C.A.C. Coello, Constraint-handling in genetic algorithms through the use of dominance-based tournament selection, Adv. Eng. Inf. 16 (2002) 193-203.
[37] T. Ray, K.M. Liew, Society and civilization: An optimization algorithm based on the simulation of social behavior, IEEE Trans. Evol. Comput. 7 (4) (2003) 386-396.
[38] M. Zhang, W. Luo, X. Wang, Differential evolution with dynamic stochastic selection for constrained optimization, Inform. Sci. 178 (2008) 233-243.
[39] Y. Wang, Z. Cai, Y. Zhou, Accelerating adaptive trade-off model using shrinking space technique for constrained evolutionary optimization, Int. J. Numer. Methods Eng. 77 (11) (2009) 1501-1534.
[40] E. Rashedi, H. Nezamabadi-pour, S. Saryazdi, GSA: a gravitational search algorithm, Inform. Sci. 179 (13) (2009) 2232-2248.
[41] E.M. Montes, C.A.C. Coello, An empirical study about the usefulness of evolution strategies to solve constrained optimization problems, Int. J. Gen. Syst. 37 (4) (2008) 443-473.
[42] Y. Chen, Y. Yang, C. Liu, C. Li, L. Li, A hybrid application algorithm based on the support vector machine and artificial intelligence: An example of electric load forecasting, Appl. Math. Modelling 39 (2015) 2617-2632.
[43] J.A.K. Suykens, J. Vandewalle, Least squares support vector machine classifiers, Neural Process. Lett. 9 (3) (1999) 293-300.
[44] G.B. Huang, H. Zhou, X. Ding, R. Zhang, Extreme learning machine for regression and multiclass classification, IEEE Trans. Syst. Man Cybern. B 42 (2) (2012) 513-529.
[45] R. Liao, H. Zheng, Particle swarm optimization-least squares support vector regression based forecasting model on dissolved gases in oil-filled power transformers, Electr. Power Syst. Res. 81 (12) (2011) 2074-2080.