Hybridization of the imperialist competitive algorithm and local search with application to ship design optimization


Computers & Industrial Engineering 137 (2019) 106069



Daniele Peri Istituto per le Applicazioni del Calcolo, Consiglio Nazionale delle Ricerche, Via dei Taurini 19, 00185 Rome, Italy

ARTICLE INFO

Keywords: Global optimization; Non-linear programming; Evolutionary optimization; Hybrid optimization methods; Imperialist competitive algorithm; Ship design optimization

ABSTRACT

In the wide scenario of optimization techniques, a large number of algorithms are inspired by natural processes, in many different ways. One of the latest is the Imperialist Competitive Algorithm (ICA) (Atashpaz-Gargari & Lucas, 2007), judged by its authors to be very efficient and competitive with other popular optimization algorithms. However, its diffusion is still limited, so that it has not yet been adequately studied. In this paper, we investigate the convergence properties of the ICA algorithm, observing the effect of the various coefficients and their role in the global convergence. Some modifications, including the coupling with a local search method, are suggested and then tested on a suite of standard algebraic test functions, verifying the improvements in the speed of convergence over the original algorithm. An application to naval design is also included, in order to check the ability to solve realistic problems.

1. Introduction

The number of optimization algorithms inspired by an evolutionary system is growing constantly. Starting from the Genetic Algorithms (GA), probably the first ever proposed, in 1975, by Holland (1975), the number of optimization algorithms somehow based on the evolution of a natural system is nowadays very large. The differences between them are sometimes quite small, and the different terminology adopted for the description of the base elements often hides a great similarity. Recently, a new evolutionary algorithm has been proposed by Atashpaz-Gargari and Lucas (2007): the Imperialist Competitive Algorithm (ICA). This algorithm, to the author's knowledge, is the first in which human behavior, and not a natural system, is used as a source of inspiration: in fact, the core of the algorithm is a replication of the mechanism of imperialistic competition. The base iteration is very simple, and the structure is easy to implement (and also to parallelize), like most evolutionary algorithms. For this reason, it is very attractive for application to real industrial problems. In this paper, we analyze the structure of the algorithm, deriving some conditions for global convergence. Several modifications have been included, in order to possibly improve the performance of the algorithm, with the purpose of reducing the total number of objective function evaluations required for the identification of the minimum/maximum of the objective function. A suite of algebraic multidimensional functions has been utilized in order to check the

properties of the algorithm. As a last test, the algorithm has also been applied to a real industrial problem.

2. The Imperialist Competitive Algorithm (ICA)

In its original formulation (Atashpaz-Gargari & Lucas, 2007), the ICA is described as an evolutionary algorithm. A number of trial vectors of parameters, each defining a different configuration of the system (county), are distributed over the design space and assigned to different groups: each group is called an empire. The county presenting the most convenient value of the objective function inside an empire is called the imperialist, and each county is placed under the control of an imperialist. From now on, we refer to a minimization problem, so that the most convenient value is represented by the lower value of the objective function. More details about the algorithm can be found in Atashpaz-Gargari and Lucas (2007). Here the initialization of the counties is performed randomly, and the counties are assigned to the imperialists on the basis of their relative power, so that, at the beginning, the most powerful imperialists have control of the larger empires. At each iteration, three main operations are performed in order to let the counties evolve:

• Shifting the counties: each county is moved toward the imperialist according to the equation

X_k^{t+1} = X_k^t + β r (X_i − X_k^t)   (1)

E-mail address: [email protected]. https://doi.org/10.1016/j.cie.2019.106069. Received 21 March 2019; Received in revised form 13 September 2019; Accepted 16 September 2019; Available online 24 September 2019. 0360-8352/ © 2019 Published by Elsevier Ltd.


where X is the generic vector of the coordinates of a point in the design variable space, X_i is the position of the imperialist controlling the moving k-th county X_k, t is the current iteration, k identifies the county, β is the so-called assimilation coefficient, and r is a random number between 0 and 1. If β is sufficiently large, so that the product βr is greater than one, the county will overpass the imperialist, changing the side from which the county observes the imperialist. The displacement vector (X_k^{t+1} − X_k^t) is further deviated from the indicated direction by a random angle between −θ and +θ, with θ to be fixed.
• Change of the imperialist: if in the new position of the county the value of the objective function is smaller than the value owned by the referenced imperialist, the positions of the county and the imperialist are swapped.
• Imperialistic competition: the power of each empire is computed as the power of the imperialist plus a fraction of the sum of the powers of the single counties of the empire. The worst county of the worst empire is re-assigned to the best empire. In a minimization process, the average value of the power of the counties is summed to the power of the imperialist: a lower value means a higher power.
• Empire elimination: if an empire has been emptied, and no counties are under the control of its imperialist, the empire is canceled.
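The operations above can be sketched in a few lines of Python. This is an illustrative sketch, not the author's code: the function and variable names, the test function, and the power weighting coefficient (0.1) are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(42)

def sphere(x):
    return float(np.sum(x ** 2))

def ica_iteration(counties, empire_of, imperialists, f, beta=2.0, xi=0.1):
    """One base ICA iteration (sketch): assimilation, imperialist swap,
    imperialistic competition and empire elimination."""
    # 1. Assimilation: X_k^{t+1} = X_k^t + beta * r * (X_i - X_k^t)
    for k in counties:
        imp = imperialists[empire_of[k]]
        r = rng.random()
        counties[k] = counties[k] + beta * r * (imp - counties[k])
        # 2. Imperialist swap: the county replaces its imperialist if better
        if f(counties[k]) < f(imp):
            imperialists[empire_of[k]] = counties[k].copy()
    # 3. Competition: empire power = imperialist cost plus a fraction xi of
    #    the mean cost of its counties (lower cost = higher power); the worst
    #    county of the weakest empire moves to the strongest empire.
    power = {}
    for e, imp in imperialists.items():
        members = [k for k in counties if empire_of[k] == e]
        power[e] = f(imp) + xi * np.mean([f(counties[k]) for k in members])
    weakest, strongest = max(power, key=power.get), min(power, key=power.get)
    if weakest != strongest:
        losers = [k for k in counties if empire_of[k] == weakest]
        worst_k = max(losers, key=lambda k: f(counties[k]))
        empire_of[worst_k] = strongest
        # 4. Empire elimination: an emptied empire is canceled
        if len(losers) == 1:
            del imperialists[weakest]

# small demo: 12 counties, 3 empires, 2 design variables
counties = {k: rng.uniform(-5.0, 5.0, 2) for k in range(12)}
empire_of = {k: k % 3 for k in range(12)}
imperialists = {e: min((counties[k] for k in counties if k % 3 == e),
                       key=sphere).copy() for e in range(3)}
best_start = min(sphere(v) for v in imperialists.values())
for _ in range(50):
    ica_iteration(counties, empire_of, imperialists, sphere)
best_end = min(sphere(v) for v in imperialists.values())
```

Note that the best imperialist can only improve across iterations, since the swap is accepted only on improvement and the competition never removes the strongest empire.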

A flowchart of the original algorithm is reported in Fig. 1. Starting from this initial definition, several modifications of the algorithm have been proposed in the literature: for instance, in Lin, Tsai, Yu, and Li (2012), Mollinetti, Almeida, Pereira, and Teixeira (2013) and Hosseini and Khaled (2014) the revolution option has been suggested, so that a random number of counties is relocated randomly, and the swap of position is confirmed if an improvement is obtained, while Kaveh and Talatahari (2010) explore four different possible selections for the random components of the algorithm. In this paper, the following modifications have been adopted and compared with the original algorithm in a match-race competition (details are provided in the following sections):

1. Deterministic selection of the initial population: instead of a random choice of the initial distribution, a Uniformly Distributed Sequence (UDS) is adopted in order to cover the design space uniformly. In particular, a Pτ-network (Statnikov & Matusov, 1996) is utilized to this aim.
2. Imperialist transfer: once an empire is emptied, its imperialist is transferred to the weakest remaining empire in order to increase the competition.
3. Elimination: if two counties are too close to each other (in the design variable space), the one presenting the higher value of the objective function is eliminated.
4. De-clustering: if two counties are too close to each other, the one presenting the higher value of the objective function is shifted far away.
5. Empire re-initialization: if a single empire survives, the initialization procedure of the empires is repeated as at the beginning.
6. Revolution: at each iteration, the worst county of each empire is randomly relocated.
7. Modulation: the parameter β is gradually reduced through the iterations.

As already indicated, option number 6 is not original. The effect of each option is evaluated, and an indicator is also defined in order to quantify the improvements.

3. Global convergence analysis

As a first step, the governing equation of the motion of the counties has been observed and analyzed in order to check the existence of some theoretical properties of the original algorithm. If we focus on Eq. (1), we can observe the full equivalence with the one-dimensional, first-order, autonomous, linear difference equation that governs the evolution of a state variable

y_{t+1} = a y_t + b   (2)

In fact, Eq. (2) is fully equivalent to Eq. (1) once the latter is rewritten in the reference frame of the imperialist. Since the value of the state variable is assigned at the beginning, that is, the relative position of the county with respect to the imperialist, at the first step we have

y_1 = a y_0 + b   (3)

Applying Eq. (2), at the second step the position of the county is

y_2 = a y_1 + b = a (a y_0 + b) + b = a^2 y_0 + a b + b   (4)

and at the third step

y_3 = a y_2 + b = a (a^2 y_0 + a b + b) + b = a^3 y_0 + a^2 b + a b + b   (5)

Generalizing,

y_t = a^t y_0 + b Σ_{i=0}^{t−1} a^i   (6)
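The recurrence (2) and the closed form (6) can be checked numerically with a minimal sketch (the constants below are arbitrary):

```python
def iterate(a, b, y0, t):
    """Apply y_{t+1} = a*y_t + b for t steps, as in Eq. (2)."""
    y = y0
    for _ in range(t):
        y = a * y + b
    return y

def closed_form(a, b, y0, t):
    """Closed form of Eq. (6): y_t = a^t y_0 + b * sum_{i=0}^{t-1} a^i."""
    return a ** t * y0 + b * sum(a ** i for i in range(t))

# The iteration matches the closed form, and for b = 0 and 0 < a < 1 the
# sequence decays geometrically toward the origin (the imperialist).
match = abs(iterate(0.6, 0.3, 5.0, 20) - closed_form(0.6, 0.3, 5.0, 20))
decayed = iterate(0.6, 0.0, 5.0, 100)
```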

If b = 0, we can demonstrate that, if 0 < a < 1, the series converges to zero (the origin of the reference frame) (Galor, 2007). Since the imperialist is located at the origin of the reference frame, each county converges toward the imperialist. To be more explicit, we can simplify Eq. (1) by rewriting it in the reference frame of the imperialist: the term X_i disappears, and the equation now reads

Fig. 1. Flowchart of the original ICA algorithm.


X^{t+1} = X^t − β r X^t = (1 − β r) X^t

We are clearly in the case of Eq. (2) with b = 0 and a = (1 − β r): the motion develops along the direction connecting the initial position of the county and the origin of the reference frame (that is, the imperialist), and the county converges on the corresponding imperialist; the convergence is monotone or not depending on the value of β. Since the coefficient a needs to be positive and smaller than unity in order to have convergence (Galor, 2007), we have convergence if

0 < (1 − β r) < 1   ⟹   0 < β < 1/r

Since r is a random variable defined in the interval between 0 and 1, the second condition is equivalent to requiring β to be strictly positive. The first condition requires β, on average, to be smaller than the inverse of the expected value of r, which is 0.5 by definition, so that β is limited to the interval 0 < β < 2.

In the original formulation, a further perturbation is induced by shifting the direction of motion of the county by a random angle whose value varies between −θ and +θ: this can be treated as the application of a damping on the assimilation coefficient by a factor cos(θ), plus the superimposition of a motion on a plane orthogonal to the base direction, whose intensity is proportional to sin(θ). Due to the randomness of the perturbation angle, the effect of the orthogonal perturbation is zero on average, since the expected value of sin(θ), in the previously described conditions, is zero: consequently, its effect on the asymptotic position of the county can be neglected. Regarding the attenuation of the assimilation coefficient, we can evaluate its average value experimentally (hereafter, R_θ). Under this perspective, we can rewrite

a = (1 − 0.5 β R_θ)

Recalling the previous demonstration, stating that the convergence of the series to zero is ensured if 0 < a < 1, the convergence is also ensured, on average, if 0 < β < 2/R_θ. In Fig. 2, the limit value of β for a convergent sequence is reported as a function of the deviation angle θ (R_θ has been estimated numerically). Following these considerations, β can approximately assume values in the interval [0, 3.2], depending on the upper limit of the deviation angle θ: consequently, the value of β = 2 adopted in Atashpaz-Gargari and Lucas (2007), in conjunction with θ = 45 degrees, is well inside the limits of convergence.

The different terminology adopted for the description of the algorithm hides a substantial similarity between ICA and the multi-swarm Particle Swarm Optimization (PSO) formulation (Hendtlass, 2005). With respect to the original formulation of PSO, ICA has no personal memory, so that the new position of a county is not influenced by the positions it has previously visited, while PSO uses this information. Conversely, the limited interaction with the other elements of the empire is shared with PSO: in fact, the imperialist is the equivalent of the best element of PSO, that is, the best visit of the whole swarm/empire. The great difference with respect to PSO is the aforementioned proof of convergence of ICA, while for PSO only an incomplete proof of convergence can be obtained (Clerc & Kennedy, 2002). A deeper comparison of the interactions between the empires/sub-swarms could be the subject of a further study.

Fig. 2. Limit value for β as a function of the maximum deviation angle θ (in degrees).

4. Selection of parameters: number of Imperialists and Counties

In the numerical experiments reported in Atashpaz-Gargari and Lucas (2007), the authors do not provide precise indications or criteria for the selection of the number of imperialists NI and counties NC to be utilized by the algorithm: in the cited paper, different numbers are adopted for different problems, without an indication of the criteria driving the selection. On the contrary, the values of the parameters driving the motion of the counties are suggested. To give some guidelines, a systematic variation of the parameters of the algorithm has been produced, in order to identify their most convenient values. The investigated parameters are β, θ, ξ, NI and NC. The range of variation of the parameters is in some cases deducible from the previous convergence analysis: in particular, β is limited between 0 and the limit value reported in Fig. 2, depending on θ. In this test, NI ranges between 2 and 20, NC between 4 and 16 (as a multiple of NI), β between 0.5 and 2.5, θ between 5 and 45, and ξ between 0.1 and 2. Eight different objective functions, reported in the appendix, have been selected for the evaluation of the performance of the algorithm: seven of them are defined for an arbitrary number of design variables, while one of them is defined only for a number of design variables that is a multiple of four. For this reason, five different options have been investigated for the number of design variables: 4, 8, 12, 16 and 20. Ultimately, we obtain a set of 40 algebraic test functions. The use of algebraic test functions may appear obsolete, but it surely represents a simple and clear strategy to analyze the effect of the number of design variables on the properties of the algorithm.

The optimizer has been run 250 times in order to obtain statistics (average value and variance) on the detected minimum, due to the stochastic nature of the algorithm: it has been observed that 250 repetitions are sufficient to produce substantially stable values of the statistical quantities. The different sets of parameters are compared with a reference set selected from Atashpaz-Gargari and Lucas (2007): β = 2, θ = 45, ξ = 0.1, NI = 15, NC = 10 × NI. Since the application of a full factorial design would be too expensive, a reduced test matrix has been produced by applying the theory of Orthogonal Arrays (OA) (Hedayat, Sloane, & Stufken, 2012). In this approach, if we fix the number of subdivisions (levels) that we want to apply to our variables, and we represent our test matrix by reporting the corresponding levels, we have that, in the case of four variables and two levels (namely 0 and
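The damping factor R_θ and the resulting limit on β can be estimated with a short Monte Carlo experiment reproducing the reasoning above (a sketch; the sample size is arbitrary). Analytically, E[cos θ] = sin(θ_max)/θ_max for θ uniform in [−θ_max, +θ_max], which gives a limit of about 2.22 at θ_max = 45 degrees and about π as θ_max approaches 90 degrees, consistent with the interval [0, 3.2] and with β = 2 being admissible.

```python
import numpy as np

rng = np.random.default_rng(1)

def beta_limit(theta_max_deg, n=200_000):
    """Monte Carlo estimate of R_theta = E[cos(theta)] for theta uniform in
    [-theta_max, +theta_max], and of the convergence limit 2 / R_theta."""
    theta_max = np.radians(theta_max_deg)
    theta = rng.uniform(-theta_max, theta_max, n)
    return 2.0 / np.cos(theta).mean()

limit_45 = beta_limit(45)   # ~2.22 > 2: beta = 2 with theta = 45 deg is safe
limit_90 = beta_limit(90)   # ~pi, the upper end of the [0, 3.2] interval
```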


1), the full factorial design is expressed by the following matrix of 16 (2^4) rows:

0 0 0 0
0 0 0 1
0 0 1 0
0 0 1 1
0 1 0 0
0 1 0 1
0 1 1 0
0 1 1 1
1 0 0 0
1 0 0 1
1 0 1 0
1 0 1 1
1 1 0 0
1 1 0 1
1 1 1 0
1 1 1 1

If we extract a couple of columns and put them side by side, we have four possible combinations for a generic row: (0 0), (0 1), (1 0), (1 1). An OA is a matrix for which, for every couple of columns picked from the matrix, all the possible combinations occur, and they are all present the same number of times. An example of an OA is

0 0 0 0
1 1 1 1
0 1 0 1
1 0 1 0
0 0 1 1
1 1 0 0
0 1 1 0
1 0 0 1

By basing the experiment on an OA, we ensure that all the possible combinations of the variables occur together equally often: this represents a good characteristic for a design of experiments. In this test, five levels have been investigated: the total number of experiments has been reduced from 3125 (the number required by a full factorial design) to 125 samples only. Once the best combination has been identified among the selected samples, a local exploration has been performed around it, refining the density of samples in the proximity of the promising region: in the end, a total number of 450 different configurations have been evaluated.

A clear convergence criterion has not been adopted in Atashpaz-Gargari and Lucas (2007). The maximum number of evaluations of the objective function has been assumed here as a reasonable stopping criterion: this is in the spirit of the application of the algorithm to realistic design problems, where the allocated time for the design activities is limited, so that the maximum number of objective function evaluations is also fixed. Two different values have been considered, taking into account the complexity of the optimization problem: 100 × NDV and 1000 × NDV (where NDV indicates the number of design variables) are two values that can be judged as representative of a small and a medium/high effort in the solution of the optimization problem. To give a simplistic measure of the corresponding effort, recalling that a gradient-based optimization algorithm requires at least 2 × NDV evaluations of the objective function for the determination of its gradient, the first limit is roughly equivalent to 5 iterations of a gradient-based local optimization algorithm, while for the second limit we can assume ten different starting points for a gradient-based (local) optimization algorithm performing 5 steps each.

The exploration of the parameters has been performed for the two different limits of the stopping criterion, obtaining two different sets. Values are reported in Table 1.

Table 1
Optimal values of the parameters for the optimization algorithm: different values for different depths of the exploration. Reference values from the paper Atashpaz-Gargari and Lucas (2007) are also reported.

       100 × NDV   1000 × NDV   Original
NI     6           12           15
NC     7 × NI      11 × NI      10 × NI
β      0.595       2.601        2.000
θ      14.170      5.000        45.000
ξ      1.945       0.100        0.100

In Fig. 3, the averages of the optimal values identified by the algorithm using the three different sets of parameters (see Table 1) are compared. Also in this case, the optimization procedure has been repeated 250 times for each of the 40 objective functions, and the value reported in the graph is the average over the 250 trials. Due to the great difference in the range of the various objective functions, a normalization of the results is necessary, although the minimum value of all the test functions is the same (zero). Some of the algebraic functions are very steep at the borders, so that the maximum value is not indicative of the real range of the objective function, representing only a peak value. As a consequence, the objective function value is normalized by the average value of the objective function, obtained by sampling the objective function at 10000 random points and averaging the results: this gives a coherent reference value for all the functions. The normalized value is also multiplied by 100, so that the value in the graph indicates the reduction of the objective function, in percentage points, with respect to the average value. Four different graphs are reported, one couple for each of the adopted stopping criteria. Three curves are reported in each sub-picture: one reports the values obtained by the best set of parameters taken from the original paper, and the other two curves represent the results provided by the two new sets of parameters. As expected, the best performances are obtained by using the proper set of parameters for each of the two stopping criteria, and an improvement with respect to the original set is evident. Due to the very large range of variation of the last three objective functions, their relative improvement is too large to be represented together with the others, so that a separate graph with a different scale range is necessary.¹
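The defining property of an OA (every pair of columns contains each level combination equally often) is easy to verify programmatically. The following sketch checks the 8-run two-level array shown above (rows reassembled from the columns in the text) and, for comparison, the 16-run full factorial:

```python
import itertools

# the 8-run, 4-column, two-level orthogonal array from the text
OA = [
    (0, 0, 0, 0), (1, 1, 1, 1), (0, 1, 0, 1), (1, 0, 1, 0),
    (0, 0, 1, 1), (1, 1, 0, 0), (0, 1, 1, 0), (1, 0, 0, 1),
]

def is_oa_strength2(rows):
    """True if every pair of columns contains each of the combinations
    (0,0), (0,1), (1,0), (1,1) the same number of times."""
    ncols = len(rows[0])
    for i, j in itertools.combinations(range(ncols), 2):
        pairs = [(r[i], r[j]) for r in rows]
        counts = [pairs.count(c) for c in itertools.product((0, 1), repeat=2)]
        if len(set(counts)) != 1:
            return False
    return True

full = list(itertools.product((0, 1), repeat=4))  # 2^4 = 16-run full factorial
```

The OA achieves the balanced-pairs property with half the runs of the full factorial, which is exactly the saving exploited in the text (125 samples instead of 3125 for five variables at five levels).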

5. Effect of a deterministic initialization

The original formulation of the algorithm applies random variables extensively. This is often justified by the necessity of a great variety of exploration directions and a variable length of the exploration step, but it is not always the best choice. In particular, if a complete exploration of a portion of R^N is the task, an initial uniform exploration of the design space is probably better than a random one. In order to check the effect of this potential improvement, the counties have been initialized by using a UDS, and the results have been compared with the ones obtained by using a fully random distribution. The selected UDS is the so-called Pτ-network described in Statnikov and Matusov (1996). This UDS has the advantageous property of being sequential, that is, the Nth sample does not depend on the previous ones, so that the points can be generated one by one, which is beneficial if further additions are needed. Since the first generated sample of this UDS is located at the origin of the axes, the qualities of the algorithm would be hidden whenever the minimum of the objective function is also located at the origin of the axes: the optimal value would be systematically included in the initial population. In order to overcome this situation without losing the regularity of the sequence, a shift of the samples is applied. The amount of the shift, divided by the length of the boundaries of the investigated field, is π/314, that is, an irrational value close to 1/100. Otherwise, we could simply suppress the first generated point of the sequence: this would cause a variation in the density of the points around the origin of the axes, so it is probably not the best option.

¹ We remember that each algebraic function is tested by using five different values for the number of design variables.
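The shifted deterministic initialization can be sketched as follows. Since no implementation of the Pτ-network is given here, a Halton-type sequence stands in for it (both are sequential uniformly distributed sequences); the shift by the irrational amount π/314 follows the text, while the bounds and dimensions are arbitrary.

```python
import numpy as np

def van_der_corput(n, base):
    """First n terms of the van der Corput sequence in the given base."""
    seq = np.zeros(n)
    for i in range(n):
        q, denom, x = i, 1.0, 0.0
        while q > 0:
            denom *= base
            q, rem = divmod(q, base)
            x += rem / denom
        seq[i] = x
    return seq

def shifted_uds(n, lower, upper, shift=np.pi / 314):
    """Deterministic initialization sketch: a Halton sequence (stand-in for
    the P-tau network) shifted, coordinate-wise and modulo 1, by the
    irrational fraction pi/314 ~ 1/100 of the range, so that the first
    sample no longer falls exactly at the origin."""
    lower = np.asarray(lower, float)
    upper = np.asarray(upper, float)
    primes = [2, 3, 5, 7, 11, 13][: len(lower)]
    u = np.column_stack([van_der_corput(n, p) for p in primes])
    u = (u + shift) % 1.0
    return lower + u * (upper - lower)

pts = shifted_uds(16, [-5, -5, -5], [5, 5, 5])
```

Without the shift, the first Halton sample is exactly the lower corner mapped from the origin of [0, 1)^d; with it, every point is displaced by about 1% of the range while the low-discrepancy structure of the sequence is preserved.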



Fig. 3. Comparison of the results obtained with different sets of parameters and different stopping criteria. The averaged best value is normalized by using the average value of the objective function.

The results are provided in Fig. 4. It is evident how the application of a UDS drastically improves the performance of the algorithm. In the following, all the further variants of the algorithm will be compared with the results presented here.

6. Effect of fixed/random parameters

Most of the base parameters of the algorithm are coupled with a random variable: the real necessity of this randomness has been investigated here. In Fig. 5, the effect of suppressing the randomness of the different parameters of the algorithm is presented. As partly expected, only the randomness of β clearly influences the exploration ability of the algorithm in a positive way, while the effect on the other parameters is vague and unclear. For this reason, the randomness of the parameters is preserved in the following.

7. Re-initialization of empires

The original algorithm is not designed to preserve a minimum number of empires: each imperialist fights for survival together with its empire, and the system naturally evolves up to the survival of a single empire. This mechanism is harmful from the standpoint of the diversity of the population, since all the counties approach their reference imperialist, reducing their diversity with respect to it: when a single imperialist survives, all the counties gradually become more and more similar to a single individual, and the fact that each county approaches the imperialist from a different direction is a sort of guarantee that an exploration in the vicinity of the current best solution is performed. On the other hand, the existence of a multiplicity of empires slows down the deterioration of the diversity, since different groups of counties converge toward different positions (their own imperialists). Since the preservation of diversity during the search is commonly considered a positive quality, preventing a premature convergence toward a single point, the current population can be re-initialized once a single empire survives, recreating a multiplicity of empires. The same number of initial empires is created again: this operation does not raise the overall computational cost, since we are using the current population without adding new elements.

In Fig. 6, an indicator of the amount of the variation obtained after the modification of the algorithm is reported. The difference between the averaged minimum/variance obtained using the modified and unmodified versions of the algorithm has been divided by the average value of the objective function, and then multiplied by 100, obtaining a percentage variation with respect to the average value of the objective function. A negative value stands for an improvement of the detected minimum. Also in this case, in order to produce a statistically significant sample, the optimization process has been repeated 250 times for each test function. Observing the results reported in Fig. 6, substantially no deterioration of the performance is observed for any of the test functions, or at most it is marginal. The only relevant variations are improvements of the detected minimum. The variance is sometimes increased, but this is an expected result, since we are increasing the diversity of the search. For these reasons, the modification can be considered positive.
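The percentage indicator used in Fig. 6 can be written down directly from its description (a sketch; the input data below are hypothetical, not the paper's results):

```python
import numpy as np

def percentage_variation(stat_modified, stat_original, f_average):
    """Indicator of Fig. 6, from its description in the text: difference of
    the averaged minima (or variances) between the modified and unmodified
    algorithm, divided by the average objective value over the design space
    and multiplied by 100. Negative values indicate an improvement."""
    return 100.0 * (np.mean(stat_modified) - np.mean(stat_original)) / f_average

# hypothetical repeated-run minima (the paper uses 250 runs per function)
mins_original = [2.0, 2.2, 1.9]
mins_modified = [1.5, 1.6, 1.4]
delta = percentage_variation(mins_modified, mins_original, f_average=10.0)
```

Normalizing by the design-space average of the objective, rather than by its peak value, keeps the indicator comparable across functions with very different ranges, as discussed for Fig. 3.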


Fig. 4. Optimal value identified by the ICA algorithm for the assigned objective functions: comparison between the results obtained by a random initial distribution and a uniform initial distribution. On the horizontal axis, the test function number; on the vertical axis, the minimum value of the objective function detected (logarithmic scale).

8. Effect of some further modifications of the algorithm

• The revolution operator: a modification of the algorithm proposed in Lin et al. (2012) and Mollinetti et al. (2013) is the so-called revolution operator. The aim of this modification is again to increase the diversity in the population of an empire, also increasing the exploration ability of the algorithm. A random number of counties is relocated randomly, and the swap of position is confirmed if an improvement is obtained. In the current implementation, the worst element of each empire is relocated to a random position: using this option, the number of relocated counties is equal to the number of imperialists, and the parallelism of the algorithm is preserved.
• Clustering and decimation: since there is no inertia term affecting the movement of the counties, two counties that are close to each other are bound to evolve in the same way if they are under the same imperialist, apart from the effect of randomness. For this reason, one of the two counties can be eliminated or relocated. Observing this situation, two possible options are available: 1. the county is suppressed, and no further action is taken (in the following, decimation); 2. the worst county is randomly shifted away (in the following, de-clustering).
• Progressive reduction of β: an option already applied to the PSO algorithm is the progressive reduction of the assimilation coefficient. The base idea is that a large value of β is useful in the initial phase of the exploration, in order to cover a large portion of the full design space, while in a second phase a more localized exploration is needed, and a small value of β is much more appropriate. The progressive reduction of the assimilation coefficient through the iterations produces this effect, if we start from a high value (close to the maximum admissible value). At the end of each iteration, β is reduced by 0.1%: this means that, after about one hundred iterations, β is reduced to roughly 90% of its initial value, and it takes about 700 iterations to halve the assimilation coefficient.

In Table 2, some global results are reported for a clearer observation of the effects of the proposed modifications of the algorithm. Since the effect of a modification is not univocal across all the test functions, the average over all the test functions of the ratio between the minimum value detected by the original algorithm and the minimum produced by the modified algorithm is reported. In the second part of Table 2, the variance of the 250 optimizations is also reported using the same scheme, in order to check the variations in terms of variability (or stability) of the obtained solutions. The effect produced by the re-initialization of the empires, previously analyzed in a different way, is also reported in order to provide a reference value for the averaged improvements obtained with the other modifications. Two different stopping criteria, with two different maximum numbers of allowed objective function evaluations, have been tested also in this case. All the proposed modifications give marginal results when the maximum number of iterations is low. A possible explanation is that the small number of iterations is not enough to materialize the positive effects of the diversification of the search, so that a further randomization of the movement of the counties is not helpful. On the contrary, if the
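The 0.1% reduction schedule can be verified with a few lines (a sketch; the starting value of β is arbitrary):

```python
import math

beta0 = 2.2                  # arbitrary starting value, near the admissible maximum
beta = beta0
for _ in range(100):
    beta *= 0.999            # reduce beta by 0.1% at the end of each iteration

ratio_100 = beta / beta0     # ~0.905: about 90% after ~100 iterations
# iterations needed to halve the coefficient: log(0.5)/log(0.999) ~ 693,
# i.e. "about 700 iterations" as stated in the text
halving = math.log(0.5) / math.log(0.999)
```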


Fig. 5. Effect of the deterministic choice of the parameters on the exploration ability of the algorithm: comparison between the results obtained by using random variables and deterministic variables. On the horizontal axis, the test function number; on the vertical axis, the minimum value of the objective function detected (logarithmic scale).

maximum number of iterations is higher, the effect of the modifications is clearer, and it is possible to draw some conclusions about the benefits of the different options. Re-initialization of the empires shows the best averaged improvement, and the reduction of the assimilation coefficient also appears to be clearly helpful. All the other options produce an improvement smaller than 5% on average, and their application does not substantially change the results provided by the reference version of the algorithm. Observing the averaged variance, we can draw similar conclusions, although the order of preference of the modifications is slightly different: also in this case, re-initialization and β reduction are the only modifications producing real differences.

Fig. 6. Effect of the re-initialization of the empires when a single empire survives. On the horizontal axis, the test function number; on the vertical axis, the percentage improvement with respect to the original version of the algorithm, computed with respect to the average value of the objective function (a negative value stands for a reduction).


Among the possible choices, the use of the simplex method Nelder and Mead (1965) appears to be a good option. The initial simplex can be built from the currently available countries: this is not the best choice in terms of efficiency for the simplex algorithm, due to the uneven spacing of the elements of the initial simplex, but it does not require the evaluation of the objective function at new points. Unfortunately, the simplex method is purely serial, so that the parallelism of the algorithm is lost in this phase unless several simplexes are initialized in parallel. It is intuitive to think that the simplex method should be initialized around one or more imperialists: the best choices for the number of imperialists, the number of iterations of the simplex method and the frequency of application are to be identified from considerations regarding the balance between the obtained improvement and the computational cost. To this aim, three different options have been considered:

Table 2
Optimal values of the parameters for the optimization algorithm: different values for different depths of the exploration.

                      Re-init.   Revol.   De-clust.   Decim.   red.
Normalized average
  100 × NDV           1.003      1.037    1.044       1.013    1.030
  1000 × NDV          0.726      0.989    0.955       0.982    0.863
Normalized variance
  100 × NDV           1.019      1.008    1.006       1.064    1.036
  1000 × NDV          0.951      1.013    0.994       1.021    0.812

9. Hybridization of the algorithm

One of the main deficiencies of the original formulation of the algorithm is the complete absence of any reference to the local behavior of the objective function. The exploration is based uniquely on the relative position of the imperialist and the countries, apart from an artificially introduced random deviation, without any element based on the local descent directions: the fact that the imperialist is located in a region where the objective function value is lower than in the area of the countries does not avoid the possibility of losing some basin of attraction located in between, or nearby the countries. On the other hand, the computation of the gradient of the objective function is an expensive task, in particular when the number of design variables is not small, and the exploration of the design space without the computation of the local gradient is exactly the goal of the algorithm. As a consequence, if we want to introduce a local search into this framework, producing a hybrid version of the ICA algorithm (from now on hICA), it appears much more coherent to link the ICA with a derivative-free optimizer. The introduction of a local optimizer is a theoretical guarantee that the final best point is a stationary point of the objective function. On the other hand, the use of a second search strategy diverts resources from the main algorithm, so that if the efficiency of the local search is not sufficient, the final result of the hybrid algorithm would be worse than the one of the original algorithm, since the number of iterations of ICA is smaller. The following experiments are focused on verifying that the mix of the two algorithms produces a positive effect.
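The exploration move just described — each country shifted toward its imperialist by a random fraction of the distance, plus an artificial random deviation — can be sketched as follows. This is a simplified variant for illustration only: the coefficient names and default values are assumptions, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

def assimilation_step(colonies, imperialist, beta=2.0, gamma=0.1):
    """Move each colony toward its imperialist (simplified ICA-style move).

    beta  : assimilation coefficient, scaling the step toward the imperialist;
    gamma : amplitude of the random lateral deviation that keeps the search
            from collapsing onto the colony-imperialist line.
    """
    d = imperialist - colonies                       # vectors toward the imperialist
    step = beta * rng.random((len(colonies), 1)) * d
    deviation = gamma * np.linalg.norm(d, axis=1, keepdims=True) \
        * rng.standard_normal(colonies.shape)
    return colonies + step + deviation

colonies = rng.random((5, 3))        # 5 colonies in a 3-variable space
imperialist = np.zeros(3)            # current best point of the empire
moved = assimilation_step(colonies, imperialist)
```

Note that no objective function value enters the move: the update is purely geometric, which is exactly the lack of local information the hybridization addresses.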

1. a simplex is initialized around every imperialist: parallelism is preserved;
2. a single simplex is initialized around the best imperialist: parallelism is lost;
3. a single simplex is initialized around the worst imperialist: parallelism is lost.

Furthermore, different values of the maximum number of iterations for the simplex algorithm have also been investigated. The frequency of application of the simplex has not been investigated here: a local search is started at the end of each ICA iteration. The results reported in Fig. 7 are encouraging, and they differ between the cases of a small and a large number of overall iterations allowed. As already observed, when the overall number of objective function computations is small (100 × NDV) the hybridization of the algorithm is not particularly effective from the standpoint of the bare minimization: the detected minimum is substantially comparable with the value identified by the reference version of the algorithm, but we can observe a strong reduction of the variance. This is a sign that the optimization process ends up in the same position every time, and the effect of the local search is, at least, a stabilization of the final result without a numerical improvement. The use of a simplex search starting from every imperialist produces

Fig. 7. Effect of the hybridization of the algorithm. On the horizontal axis, the number of simplex iterations; on the vertical axis, the ratio between the values for ICA and hICA. On the left, the averaged minimum; on the right, the averaged variance (over all the test functions). On top, 100 × NDV iterations; on bottom, 1000 × NDV iterations.


the largest deterioration of the results, while the use of a single search at the end of each iteration appears to be more efficient, with a small preference for the best imperialist as starting point. The smallest number of iterations for the local search algorithm is the best choice if we observe the effects on the detected minimum, while 25 iterations produce the best results on the variance. When the maximum number of iterations is higher (1000 × NDV), the effect of the local search comes to light. The use of a single local search starting from the worst imperialist produces minimal variations of the detected minimum. On the contrary, a strong improvement of the capabilities of the algorithm is observed when the local search starts from the best imperialist, and this effect appears to increase when the maximum number of iterations dedicated to the local search is also increased. Some improvements are also obtained when many local searches are started, one for every imperialist, but the effect disappears as soon as the depth of the local search grows (more than 40 iterations): in this situation, the algorithm is probably wasting a large amount of resources exploring regions already visited during previous local searches or, more generally, the effort that the algorithm puts into the local search phase rapidly becomes too large, reducing the total number of iterations of the ICA part. Similar considerations can also be drawn observing the variance, although the results are not as clear as in the previous test. Summarizing, the hICA outperforms ICA when the maximum number of objective function evaluations is not small. The best situation is when a single local search is started from the current best imperialist, reserving at least 20 iterations for the local search phase.
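The winning recipe — one short Nelder-Mead run per global iteration, started from the current best imperialist — can be sketched as follows. SciPy's Nelder-Mead is used for the local phase, and the ICA step is reduced to a simple stub, so this illustrates the hybridization pattern rather than the paper's implementation:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)

def rastrigin(x):
    return 10 * len(x) + np.sum(x**2 - 10 * np.cos(2 * np.pi * x))

def ica_step(population, f):
    """Stub for one ICA iteration: a random perturbation kept only when it
    improves; a real implementation would apply assimilation, revolution
    and imperialistic competition."""
    trial = population + 0.1 * rng.standard_normal(population.shape)
    better = np.array([f(t) < f(p) for t, p in zip(trial, population)])
    population[better] = trial[better]
    return population

def hICA(f, dim=4, pop=20, outer=50, local_iters=20):
    population = rng.uniform(-5.12, 5.12, (pop, dim))
    for _ in range(outer):
        population = ica_step(population, f)
        best = min(population, key=f)          # current best imperialist
        res = minimize(f, best, method="Nelder-Mead",
                       options={"maxiter": local_iters})
        # Re-inject the locally refined point in place of the worst individual.
        worst = max(range(pop), key=lambda i: f(population[i]))
        if res.fun < f(population[worst]):
            population[worst] = res.x
    return min(population, key=f)

best = hICA(rastrigin)
```

The `local_iters=20` budget mirrors the "at least 20 iterations" recommendation; increasing `outer` plays the role of moving from the 100 × NDV to the 1000 × NDV regime.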

10. Practical test case: shape optimization of a motoryacht

As a practical application, in order to demonstrate the applicability of the hICA to real design problems, a shape optimization problem for a surface ship has been set up and solved. The final goal is the minimization of the effective power required for the propulsion of a 40-meter motoryacht at a speed of 16 knots. The numerical tool for the evaluation of the effective power in the indicated conditions is a computational fluid dynamics (CFD) solver for steady flows with free surface Gadd (1976): although the physical model is largely simplified, this tool has proved to be reliable for this class of problems. The hull modification is performed by shifting the control points of the network describing the hull shape by a spline surface (NURBS): the new hull shape is obtained by mimicking the activity of a ship designer working with a CAD system, so that the fairing of the hull surface is preserved. Only the lateral movement of six control points, located in the lower front part of the ship, is allowed. A variation of the full displacement of 2% is the only active constraint: the small portion of the surface subject to modification represents a strong constraint in itself. The computational grid for the CFD solver is produced directly from the CAD file, exploiting the features of the NURBS describing the hull surface, without the use of a complex grid generation tool. The distribution of panels, with stretching at the bow edge, is also set at this stage.

In Fig. 8 the convergence of the algorithm is presented. In the left part of Fig. 8, the full set of evaluations is reported, while in the right part the current best value only is plotted: a value close to the final minimum is detected after about 1000 objective function evaluations; from then on, only minor improvements are obtained. In the left part of Fig. 8, we can observe the regular sequence of global search and local search: during the global search, the values of the objective function are largely dispersed, while during the local search phase they are much more concentrated around the current best value. In Fig. 8, an alternation of 14 of these phases is observed, and a total of 6000 (1000 × NDV) objective function evaluations are produced. In Fig. 9 the original and the optimal hull shape are represented. A bulbous bow is now included in the modified design, and this represents the most evident result of the hull modification. Since the full displacement is constrained, the volume of the bulbous bow has been taken from other parts of the hull, closer to the free surface, reducing the impact of the hull on the water surface and obtaining a reduction of the wave pattern, with a reduction of the energy radiated (and then dissipated) into the water field. In Fig. 10, we can observe the different distribution of pressure on the hull surface, and the deformation of the bow region is even more evident. The wave elevation is also reported for the original and the optimal hull: a clear reduction of the bow wave is observed. This is obtained by a shift of the bow volume from the top to the bottom part, clearly visible in this picture: the hull is more slender at the waterplane, and the volume in the lower part is utilized for the definition of a bow bulb, further reducing the bow wave.

A final verification has been performed by applying a high-fidelity CFD solver: a RANSE solver, with a more accurate mathematical model for the treatment of the viscous effects OpenFOAM (2018). The tracking of the free surface is here obtained by using a level-set approach. The same computational grid, in terms of number of cells and topology, has been produced for the original and the optimal hull shape. A gain of about 3.6% has been verified by using this solver, very similar to the 3.2% improvement predicted by the simpler mathematical model adopted during the optimization, as reported in Fig. 8. A 3D snapshot of the free surface elevation is reported in Fig. 11.

In Fig. 12, in order to give a measure of the efficiency of the algorithm, a comparison with a standard Genetic Algorithm implementation Deb, Agrawal, Pratap, and Meyarivan (2000) is presented. Two different optimization problems have been solved, varying the maximum displacement of the control points describing the hull surface: in the first case, the maximum movement of the control points is 0.5 meters, while in the second case the amplitude of the movement has been doubled (1.0 meters). The hICA outperforms NSGA-II in both test cases, also in the case where the induced modification of the hull is small and only marginal improvements can be obtained. In the case where larger modifications are allowed, the improvement of the best solution found by hICA with respect to the one obtained by NSGA-II is of about 0.15%, which is about 5% in relative terms.

Fig. 8. Convergence of the optimization algorithm to the detected minimum. On the right, the course of the current minimum value; on the left, the complete set of the computed values. In the left part, the local search phases are clearly identified in between the global search phases. On the horizontal axis, the number of objective function evaluations; on the vertical axis, the percentage improvement of the effective power with respect to the original value.

11. Managerial implications

The improvement obtained by using hICA instead of ICA has some


Fig. 9. Deformation of the hull shape after the shift of some selected control points of the NURBS surface describing the hull. On the left, the original hull, on the right the optimal one.

Fig. 10. Bottom view of the original and optimized hulls with the corresponding wave patterns. Red is for a wave crest, blue for hollows. Hulls are colored by the corresponding local pressure (purged of the hydrostatic component): red is for high pressure, blue for low pressure.

excellent managerial implications. From a managerial point of view, the improved algorithm aids decision-makers in detecting more useful configurations of the system in a smaller computational time: the overall time allocated for the optimization task is now shorter, freeing resources for other activities to be carried out within the same project. The hICA can be a practical tool for design managers to determine the optimal system configuration, leading to higher performance and typically also reducing the operating costs. Case by case, a deeper investigation of the Return On Investment (ROI) can also highlight the amortization time of the higher production costs possibly caused by a greater complexity of the system configuration: sometimes, the production costs of the optimized configuration are higher than those of the original design.

12. Conclusions

In this paper, the theoretical properties of the ICA algorithm have been investigated. Some improvements over the original version of the algorithm have been obtained by small modifications. The integration with a local search algorithm, producing a hybrid version of the ICA (hICA), led to a probability of success remarkably higher than that of the original ICA, representing one of the main achievements of this paper: it confirms that efficient optimization algorithms can be obtained by combining different optimization algorithms and techniques, as largely demonstrated in the literature (dos Santos Coelho & Mariani, 2006; Ghodrati & Lotfi, 2012; Peri, 2015). In order to better assess the hICA capabilities, a more in-depth comparison with other optimization algorithms should also be produced.

Fig. 11. Top side view of the wave elevation of the original (on the left) and optimized (on the right) hulls. Red for wave crests, blue for hollows. Computations have been performed by a RANSE solver as a verification of the results obtained by the potential flow solver.


Fig. 12. Comparison between the results provided by hICA and a classical Genetic Algorithm (NSGA-II). On the left, a small variation of the design variables is allowed (0.5 m); on the right, the maximum movement is doubled (1.0 m).
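The way such a constrained design problem is typically fed to a black-box optimizer can be sketched as follows. The hydrodynamic and hydrostatic routines below are toy stand-ins (their names and models are assumptions, not the paper's solver); only the box bounds on the six control points and the 2% displacement constraint mirror the setup described above:

```python
import numpy as np

MAX_SHIFT = 0.5        # lateral bound for each control point, in meters
DISP_TOL = 0.02        # admissible relative variation of the displacement

def displacement(shifts):
    """Stand-in for the hydrostatic computation on the deformed hull:
    a toy linear model of how control-point shifts change the volume."""
    return 1.0 + 0.01 * np.sum(shifts)

def effective_power(shifts):
    """Stand-in for the CFD estimate of the effective power."""
    return 100.0 * (1.0 + np.sum(shifts**2))

def objective(shifts):
    shifts = np.clip(shifts, -MAX_SHIFT, MAX_SHIFT)   # box bounds on the 6 points
    rel_dv = abs(displacement(shifts) - 1.0)
    if rel_dv > DISP_TOL:                             # 2% displacement constraint
        return np.inf                                 # reject infeasible hulls
    return effective_power(shifts)

x = np.zeros(6)          # six control points, lateral shifts only
baseline = objective(x)
```

Doubling `MAX_SHIFT` to 1.0 reproduces the second, less constrained test case of the comparison.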

Furthermore, a more rigorous approach to the recourse to the local search phase should also be investigated, in order to increase the speed of convergence of the algorithm.

Declaration of Competing Interest

No potential conflict of interest was reported by the authors.

Acknowledgements

The authors would like to express their gratitude to the associate editor and the three anonymous referees for their valuable comments and suggestions, which contributed to a significant improvement of the original version of this paper.

Funding

The authors would like to thank the Italian Minister of Instruction, University and Research (MIUR) for supporting this research with funds coming from PRIN Project 2017 (No. 2017KKJP4X, entitled "Innovative numerical methods for evolutionary partial differential equations and applications").

References

Atashpaz-Gargari, E., & Lucas, C. (2007). Imperialist competitive algorithm: An algorithm for optimization inspired by imperialistic competition. IEEE Congress on Evolutionary Computation (pp. 4661–4667). IEEE.
Clerc, M., & Kennedy, J. (2002). The particle swarm – explosion, stability, and convergence in a multidimensional complex space. IEEE Transactions on Evolutionary Computation, 6(1), 58–73.
Deb, K., Agrawal, S., Pratap, A., & Meyarivan, T. (2000). A fast elitist non-dominated sorting genetic algorithm for multi-objective optimization: NSGA-II. In M. Schoenauer, K. Deb, G. Rudolph, X. Yao, E. Lutton, & J. J. Merelo (Eds.), Parallel Problem Solving from Nature PPSN VI (pp. 849–858). Berlin, Heidelberg: Springer.
dos Santos Coelho, L., & Mariani, V. C. (2006). Particle swarm optimization with quasi-Newton local search for solving economic dispatch problem. 2006 IEEE International Conference on Systems, Man and Cybernetics: Vol. 4 (pp. 3109–3113).
Gadd, G. (1976). A method of computing the flow and surface wave pattern around full forms. Transactions of the Royal Institute of Naval Architects, 118, 207–215.
Galor, O. (2007). Discrete dynamical systems. Springer Verlag.
Ghodrati, A., & Lotfi, S. (2012). A hybrid CS/PSO algorithm for global optimization. In J.-S. Pan, S.-M. Chen, & N. T. Nguyen (Eds.), Intelligent Information and Database Systems (pp. 89–98). Berlin, Heidelberg: Springer.
Hedayat, A., Sloane, N., & Stufken, J. (2012). Orthogonal arrays: Theory and applications. Springer Series in Statistics. New York: Springer.
Hendtlass, T. (2005). WoSP: A multi-optima particle swarm algorithm. Proceedings of the 2005 IEEE Congress on Evolutionary Computation (pp. 727–734). IEEE.
Holland, J. H. (1975). Adaptation in natural and artificial systems. The University of Michigan Press.
Hosseini, S., & Khaled, A. A. (2014). A survey on the imperialist competitive algorithm metaheuristic: Implementation in engineering domain and directions for future research. Applied Soft Computing, 24, 1078–1094.
Kaveh, A., & Talatahari, S. (2010). Imperialist competitive algorithm for engineering design problems. Asian Journal of Civil Engineering (Building and Housing), 11(6), 675–697.
Lin, J.-L., Tsai, Y.-H., Yu, C.-Y., & Li, M.-S. (2012). Interaction enhanced imperialist competitive algorithms. Algorithms, 5, 433–448.
Mollinetti, M. A. F., Almeida, J. N. M., Pereira, R. L., & Teixeira, O. N. (2013). Performance analysis of the imperialist competitive algorithm using benchmark functions. SoCPaR (pp. 349–353). IEEE.
Nelder, J. A., & Mead, R. (1965). A simplex method for function minimization. Computer Journal, 7, 308–313.
OpenFOAM (2018). OpenFOAM, The OpenFOAM Foundation. https://openfoam.org/.
Peri, D. (2015). An inner-point modification of PSO for constrained optimization. Engineering Computations, 32(7), 2005–2019.
Statnikov, R. B., & Matusov, J. B. (1996). Use of P nets for the approximation of the Edgeworth-Pareto set in multicriteria optimization. Journal of Optimization Theory and Applications, 91(3), 543–560.

Appendix A

1. Sphere: $f(x) = \sum_{i=1}^{n} x_i^2$
2. Rosenbrock: $f(x) = \sum_{i=1}^{n-1} \left[ 100 (x_{i+1} - x_i^2)^2 + (1 - x_i)^2 \right]$
3. Powell: $f(x) = \sum_{i=1}^{n/4} \left[ (x_{4i-3} + 10 x_{4i-2})^2 + 5 (x_{4i-1} - x_{4i})^2 + (x_{4i-2} - 2 x_{4i-1})^4 + 10 (x_{4i-3} - x_{4i})^4 \right]$
4. Griewank: $f(x) = 1 + \sum_{i=1}^{n} \frac{x_i^2}{4000} - \prod_{i=1}^{n} \cos\left( \frac{x_i}{\sqrt{i}} \right)$
5. Levy: $f(x) = \sin^2(\pi w_1) + \sum_{i=1}^{n-1} (w_i - 1)^2 \left[ 1 + 10 \sin^2(\pi w_i + 1) \right] + (w_n - 1)^2 \left[ 1 + \sin^2(2 \pi w_n) \right]$, where $w_i = 1 + (x_i - 1)/4$
6. Rastrigin: $f(x) = 10 n + \sum_{i=1}^{n} \left( x_i^2 - 10 \cos(2 \pi x_i) \right)$
7. Schwefel: $f(x) = 418.9829 \, n - \sum_{i=1}^{n} x_i \sin\left( \sqrt{|x_i|} \right)$
8. Perm: $f(x) = \sum_{i=1}^{n} \left( \sum_{j=1}^{n} (j + 10) \left( x_j^i - \frac{1}{j^i} \right) \right)^2$
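For reference, a few of the benchmark functions of Appendix A written out directly from the formulas; the comments state the well-known global minima, which serve as a quick self-check:

```python
import numpy as np

def sphere(x):
    # Global minimum 0 at x = 0.
    return np.sum(x**2)

def rosenbrock(x):
    # Global minimum 0 at x = (1, ..., 1).
    return np.sum(100 * (x[1:] - x[:-1]**2)**2 + (1 - x[:-1])**2)

def griewank(x):
    # Global minimum 0 at x = 0.
    i = np.arange(1, len(x) + 1)
    return 1 + np.sum(x**2) / 4000 - np.prod(np.cos(x / np.sqrt(i)))

def rastrigin(x):
    # Global minimum 0 at x = 0.
    return 10 * len(x) + np.sum(x**2 - 10 * np.cos(2 * np.pi * x))

def schwefel(x):
    # Global minimum approximately 0 at x = (420.9687, ..., 420.9687).
    return 418.9829 * len(x) - np.sum(x * np.sin(np.sqrt(np.abs(x))))
```

These implementations accept a point of any dimension n, matching the NDV-dependent budgets (100 × NDV, 1000 × NDV) used in the experiments.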