AdaBoost-inspired Multi-operator Ensemble Strategy for Multi-objective Evolutionary Algorithms
Communicated by Dr. Zhan Zhi-Hui

Journal Pre-proof

PII: S0925-2312(19)31757-6
DOI: https://doi.org/10.1016/j.neucom.2019.12.048
Reference: NEUCOM 21677
To appear in: Neurocomputing
Received date: 9 September 2019
Revised date: 8 November 2019
Accepted date: 10 December 2019

Please cite this article as: Chao Wang, Ran Xu, Jianfeng Qiu, Xingyi Zhang, AdaBoost-inspired Multi-operator Ensemble Strategy for Multi-objective Evolutionary Algorithms, Neurocomputing (2019), doi: https://doi.org/10.1016/j.neucom.2019.12.048

This is a PDF file of an article that has undergone enhancements after acceptance but is not yet the definitive version of record. © 2019 Published by Elsevier B.V.
AdaBoost-inspired Multi-operator Ensemble Strategy for Multi-objective Evolutionary Algorithms

Chao Wang (a), Ran Xu (a), Jianfeng Qiu (a), Xingyi Zhang (a,b,*)

(a) Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, School of Computer Science and Technology, Anhui University, Hefei 230601, China.
(b) State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University, Shenyang 110000, China.
Abstract

Evolutionary algorithms have shown prominent performance in solving various kinds of multi-objective optimization problems (MOPs), but most of them use only a single operator, which is often sensitive to the characteristics of the problem at hand. Since different operators have different search patterns, a proper combination of multiple operators can be more efficient and robust than any single operator in solving complex problems. However, how to ensemble multiple operators based on their performance during the optimization process is a challenging task. In the machine learning field, it is well known that AdaBoost can effectively ensemble multiple classifiers by assigning them weights based on their classification errors. Inspired by this ensemble mechanism, we propose a multi-operator ensemble (MOE) strategy based on multiple subpopulations for evolutionary multi-objective optimization. In the proposed strategy, the survival rate of each subpopulation after environmental selection is used to evaluate the performance of the corresponding operator, and a credit assignment scheme is developed from the weight update method of AdaBoost. Based on these credits, an emigration-immigration mechanism is designed to update the subpopulations, which adaptively rewards or penalizes the computational resources allocated to each operator. Experimental results on three complex test suites demonstrate that the proposed MOE can significantly improve the performance of multi-objective evolutionary algorithms on different types of MOPs.

* Corresponding author.
Email addresses: [email protected] (Chao Wang), [email protected] (Ran Xu), [email protected] (Jianfeng Qiu), [email protected] (Xingyi Zhang)
Preprint submitted to Neurocomputing, December 18, 2019
Keywords: Multi-objective optimization, AdaBoost, Multi-operator ensemble, Credit assignment, Subpopulation update

1. Introduction

Multi-objective evolutionary algorithms (MOEAs) have attracted great attention over the past few decades and shown prominent performance in tackling various kinds of multi-objective optimization problems (MOPs) [1–3]. Mathematically, a general MOP can be defined as follows:

    min   F(x) = (f_1(x), f_2(x), ..., f_M(x))^T
    s.t.  g_j(x) ≥ 0,  j = 1, 2, ..., J;                    (1)
          h_k(x) = 0,  k = 1, 2, ..., K,

where x = (x_1, x_2, ..., x_n)^T is an n-dimensional decision variable vector from the decision space Ω ⊆ R^n, and M is the number of objectives. The functions g_j(x) and h_k(x) define the J inequality constraints and the K equality constraints, respectively. Due to the conflicts among the multiple objectives, there is no single solution that is optimal for all the objectives; instead, there is a set of trade-off solutions called the Pareto-optimal set (PS). The image of the PS in the objective space is called the Pareto front (PF), PF = {F(x) ∈ R^M | x ∈ PS}.
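As an illustration of definition (1), the following minimal sketch (our own illustrative code, a hypothetical unconstrained bi-objective problem in the style of the ZDT test suite; the function and variable names are not from the paper) evaluates F(x) for a candidate solution:

```python
import math

def evaluate(x):
    """Evaluate a hypothetical bi-objective MOP: minimize (f1, f2) over x in [0, 1]^n.

    For this problem the Pareto-optimal set is {x : x[0] in [0, 1], x[1:] = 0},
    which makes dominated and non-dominated points easy to distinguish.
    """
    f1 = x[0]
    g = 1.0 + 9.0 * sum(x[1:]) / (len(x) - 1)
    f2 = g * (1.0 - math.sqrt(f1 / g))
    return (f1, f2)

# A Pareto-optimal solution (x[1:] all zero) and a dominated one:
print(evaluate([0.25, 0.0, 0.0, 0.0]))  # on the Pareto front: f2 = 1 - sqrt(0.25) = 0.5
print(evaluate([0.25, 0.5, 0.5, 0.5]))  # dominated: same f1, strictly larger f2
```

Both points share f_1 = 0.25, but the second has a larger f_2, so the first Pareto-dominates it; no single scalar comparison captures this for conflicting objectives in general.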
Generally speaking, there are two major components in an MOEA for solving MOPs: the evolutionary operator and the environmental selection. The former determines how to produce high-quality offspring from the parents, while the latter determines how to select the elite individuals for the next generation. Many works have been devoted to the design of the selection strategy. They can be roughly divided into three categories [4–7]: decomposition-, indicator-, and Pareto-based approaches. Decomposition-based MOEAs employ reference vectors (or weight vectors) to decompose an MOP into a set of single-objective subproblems and use a scalarizing function to select solutions. Indicator-based algorithms construct a single performance indicator to evaluate the quality of a solution and prefer solutions with better indicator values. Pareto-based MOEAs adopt the Pareto dominance relation to rank solutions into different nondomination levels and select nondominated solutions with good diversity. The non-Pareto-based selection strategies (decomposition- and indicator-based) can provide high convergence pressure, while Pareto-based selection strategies can maintain diversity [8]. In order to achieve better overall performance on a diverse range of MOPs, ensembles of different selection strategies have been widely studied [8–11]. For example, Two-Arch2 [9] combines the advantages of indicator- and Pareto-based selection strategies by using two archives. In BCE [8], two populations work in a collaborative manner on Pareto-criterion and non-Pareto-criterion evolution, respectively. In EAG [10], the working population evolves using a decomposition-based strategy, while the external archive is updated using a Pareto-based sorting method.

By contrast, few works have focused on ensembles of evolutionary operators to improve the performance of MOEAs. Most existing MOEAs adopt a single operator, which is often sensitive to the characteristics of the problem [12]. Since different evolutionary operators have different abilities in exploitation and exploration, an ensemble of multiple operators can combine the advantages of different search patterns and be more robust than any single operator in solving complex problems [13]. Ensemble methods have been studied intensively in the machine learning community for classification problems. Their core idea is that a group of "unstable and diverse" learning models can be combined to obtain better overall performance. This idea has been widely used in the evolutionary multi-objective optimization community, with ensembles of mating selection strategies [14], neighborhood sizes [15], constraint handling techniques [16], selection metrics [13], and even different MOEAs [11]. Moreover, as one of the most effective ensemble methods, the boosting algorithm has been applied in MOEAs for solving MOPs. For example, Phan et al. [17] combined existing indicator-based selection metrics by using the PdiBoosting method. In our study, AdaBoost [18], a representative method of this kind, is applied to the ensemble of evolutionary operators. The motivation is that we can regard an operator as a classifier, so the way AdaBoost ensembles multiple classifiers can be adopted for ensembling multiple operators. We therefore propose a multi-operator ensemble (MOE) strategy inspired by AdaBoost for evolutionary multi-objective optimization, which can efficiently ensemble multiple operators at different evolutionary stages. The main
contributions of this paper are as follows:

1) Inspired by the ensemble mechanism of AdaBoost, a simple and efficient MOE strategy based on multiple subpopulations is proposed for multi-objective optimization. It uses the survival rate of each subpopulation to evaluate the performance of the corresponding operator, and the weight update method of AdaBoost to assign an appropriate credit for the next generation.

2) Like AdaBoost focusing on the misclassified samples, a subpopulation update method is designed that allows inferior individuals to migrate into other subpopulations for producing promising offspring. This method can not only reallocate the computational resources but also facilitate the exchange of evolutionary information among subpopulations.

3) The effectiveness of the proposed MOE is validated by embedding it into six existing MOEAs for solving MOPs with complicated PSs, many objectives, and constraints. Experimental results demonstrate that the proposed MOE can significantly improve the performance of MOEAs on complex MOPs.

The rest of this paper is organized as follows. Section 2 introduces the related work and motivation. Section 3 presents the general framework of an MOEA embedded with MOE and describes each part of MOE in detail. Experimental settings and comprehensive experiments are presented and analyzed in Section 4. Finally, Section 5 concludes the paper and points out some future research issues.
2. Related work and motivation

In this section, we first give a brief introduction to AdaBoost. Afterwards, we briefly review the multi-operator search strategies proposed in the literature. Lastly, the motivation of this work is given.

2.1. AdaBoost

The boosting strategy is to learn many weak classifiers and combine them in some way, instead of learning a single strong classifier. The idea of building ensembles of classifiers has gained considerable interest in the last decade [19]. AdaBoost, short for Adaptive Boosting, is an ensemble learning algorithm proposed by Yoav Freund and Robert Schapire [18]. It uses one training data set with different sample weights to construct diverse weak base classifiers. Each data sample is given a weight representing its importance when selected for training, and all samples share an equal weight in the first iteration. In subsequent iterations, the weights of misclassified samples are increased (or, equivalently, the weights of correctly classified samples are decreased), so that the new classifier focuses more and more on the misclassified samples, which may be located near the classification margin, and finally the error rate is reduced. After T iterations, the weak base classifiers are combined into a strong classifier using a weighted majority vote: each classifier is weighted (by α_t) according to its error rate on the weighted training set it was trained on, and the final classification result is output. The steps of the AdaBoost algorithm are shown in Algorithm 1.

2.2. Multi-operator search strategy

As mentioned above, different operators are suitable for different problems or different evolutionary stages, and a search strategy based on multiple operators can be more efficient than using a single operator. The representative multi-operator
search strategies can be divided into the following two categories.

The first category is adaptive operator selection (AOS), which selects an operator from an operator pool at each generation based on previous performance [20]. Since the performance of an operator can be quantified by a single fitness value, AOS has been widely employed in single-objective optimization, such as genetic algorithms (GAs) with different crossover operators [21], differential evolution (DE) with self-adaptive mutation strategies and control parameters [22, 23], and particle swarm optimization (PSO) with adaptive selection of different learning strategies [24–26]. For multi-objective optimization, since decomposition-based approaches decompose an MOP into a set of single-objective subproblems and use a scalar value to select solutions, the AOS strategies used for single-objective optimization can easily be introduced into such MOEAs. For example, Li et al. [27] developed a method of randomly combining multiple DE mutation operators to improve the reproduction procedure under the MOEA/D framework. Khan et al. [28] studied the effect of using two crossover operators in MOEA/D-DRA, where the selection probability of each op-
Algorithm 1 AdaBoost
Input: N labeled samples D = {(x_1, y_1), (x_2, y_2), ..., (x_N, y_N)}, y ∈ {−1, +1}; base classifier G_t(x); number of iterations T;
1: Initialize the weight vector: w_i^(1) = 1/N, i = 1, 2, ..., N;
2: For t = 1, ..., T:
   (a) Train the base classifier on the weighted training set:
       G_t(x) = argmin_{G(x)} Σ_{i=1}^{N} w_i^(t) I(y_i ≠ G(x_i));
   (b) Calculate the error rate of G_t(x) with respect to the weight distribution w_i^(t):
       ε_t = Σ_{i=1}^{N} w_i^(t) I(y_i ≠ G_t(x_i)) / Σ_{i=1}^{N} w_i^(t);
   (c) Calculate the weighting coefficient of G_t(x):
       α_t = (1/2) ln((1 − ε_t) / ε_t);
   (d) Update the weight vector:
       w_i^(t+1) = w_i^(t) exp(−y_i α_t G_t(x_i)) / Z^(t), i = 1, 2, ..., N,
       where Z^(t) = Σ_{i=1}^{N} w_i^(t) exp(−y_i α_t G_t(x_i)) is a normalization factor.
3: Final classifier: G(x) = sign(Σ_{t=1}^{T} α_t G_t(x)).
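As a concrete illustration of Algorithm 1, the following minimal sketch (our own illustrative code, not from the paper) runs AdaBoost with decision stumps as base classifiers on a toy one-dimensional data set, implementing the weight update of step 2(d) and the weighted majority vote of step 3:

```python
import math

def stump_train(X, y, w):
    """Step 2(a): pick the threshold/polarity decision stump with minimum weighted error."""
    best = None
    for thr in sorted(set(X)):
        for pol in (+1, -1):
            pred = [pol if x >= thr else -pol for x in X]
            err = sum(wi for wi, yi, pi in zip(w, y, pred) if yi != pi)
            if best is None or err < best[0]:
                best = (err, thr, pol)
    return best

def adaboost(X, y, T):
    n = len(X)
    w = [1.0 / n] * n                                  # step 1: uniform weights
    ensemble = []                                      # list of (alpha, thr, pol)
    for _ in range(T):
        err, thr, pol = stump_train(X, y, w)
        err = max(err, 1e-10)                          # guard against log(0)
        alpha = 0.5 * math.log((1.0 - err) / err)      # step 2(c)
        ensemble.append((alpha, thr, pol))
        pred = [pol if x >= thr else -pol for x in X]
        # step 2(d): raise weights of misclassified samples, then normalize by Z
        w = [wi * math.exp(-yi * alpha * pi) for wi, yi, pi in zip(w, y, pred)]
        z = sum(w)
        w = [wi / z for wi in w]
    return ensemble

def predict(ensemble, x):
    """Step 3: sign of the weighted majority vote."""
    s = sum(alpha * (pol if x >= thr else -pol) for alpha, thr, pol in ensemble)
    return 1 if s >= 0 else -1

# An interval concept that no single stump can separate:
X = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1, 1, -1, -1, -1, -1, 1, 1]
model = adaboost(X, y, T=5)
print([predict(model, x) for x in X])  # → [1, 1, -1, -1, -1, -1, 1, 1]
```

No single stump classifies this interval concept perfectly, but after a few rounds the reweighting drives later stumps toward the previously misclassified points and the weighted vote labels all samples correctly.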
erator at each generation was updated in an adaptive way. Li et al. [29] proposed a bandit-based adaptive operator selection method that simultaneously considers the fitness improvement rates and the number of times each operator has been used. By contrast, not much work has been done on applying AOS to Pareto-based MOEAs, due to the difficulty of quantifying the quality difference between solutions under the Pareto dominance relation. In the Borg MOEA [30], Hadka and Reed quantified the performance of each operator by counting the number of solutions it produced in the ε-box dominance archive. Yuan et al. [31] simply used a random manner to select one operator among SBX, the DE operator, and polynomial mutation. Since this kind of method only selects one operator from the operator pool
based on the previous performance at each generation, it is more suitable for the steady-state selection scheme adopted by decomposition-based approaches.

The other category is the multi-operator ensemble (MOE), which employs multiple populations to make full use of the combined advantages of multiple operators. Since multiple operators coexist at each generation, the evolutionary process can be more stable than with AOS. In recent years, this type of method has been widely used in single-objective optimization [32]. For example, Wu et al. [33] proposed a multi-population approach to realize an ensemble of multiple DE variants, where an extra reward subpopulation was allocated
to the best-performing operator. Zhou et al. [34] developed a self-adaptive differential artificial bee colony algorithm, where multiple subpopulations evolve with different mutation strategies and the size of each subpopulation is dynamically adjusted during the search process. Ali et al. [35] proposed a differential evolution with different mutation strategies that divides the population into equally sized subpopulations, where individuals are randomly selected based on a certain neighborhood topology to migrate between subpopulations. Mallipeddi et al. [36] proposed a DE algorithm with an ensemble of parallel populations, in which the number of function evaluations allocated to each population is self-adapted by learning from their previous experiences in generating superior solutions. To the best of our knowledge, few studies have been conducted on MOE for multi-objective optimization, because it is hard to evaluate the performance of an operator during the evolutionary process. Recently, Mashwani et al. [37] used the survival numbers of offspring to assess the performance of operators and developed a novel multi-operator evolutionary algorithm based on NSGA-II [38]. Wang et al. [13] realized an ensemble of SBX and DE based on two populations, but the performance evaluation of the operators used a decomposition method to calculate the fitness improvement, which requires considerable computation. Since the MOE strategy needs to run on multiple populations, it is more suitable for the generational selection scheme adopted by indicator- and Pareto-based approaches, because these create a whole population of new individuals from the parent population, which can easily be divided into several subpopulations. In contrast, decomposition-based approaches adopt a steady-state selection scheme that creates only one new individual at a time to update the parent population, making it difficult to divide this dynamic population.
2.3. Motivation of this work

From the literature review, we can find that AOS has been widely applied in decomposition-based approaches for solving MOPs, while MOE is rarely embedded into MOEAs for multi-objective optimization. The major reason is that MOE needs to run on multiple populations, which only suits the generational selection scheme; under this selection scheme, it is hard to quantify the quality difference between solutions. Hence, how to evaluate the performance of each operator is critical for applying MOE to evolutionary multi-objective optimization. In the machine learning field, AdaBoost can effectively ensemble multiple classifiers based on their error rates. Although it is designed for classification problems, its ensemble mechanism can be borrowed for solving MOPs. Firstly, AdaBoost uses one training set with different sample weights, which can be regarded as multiple training sets whose distributions are adjusted according to the performance of the classifiers, each of which is used to construct one weak base classifier. Correspondingly, MOE partitions the whole population into multiple subpopulations, where each operator owns its own subpopulation; we can therefore dynamically adjust the size of each subpopulation according to the performance of its operator. Secondly, AdaBoost uses the error rate to evaluate the performance of a classifier; accordingly, we use the survival rate of the offspring produced by each operator to evaluate the performance of that operator. Thirdly, AdaBoost uses the error rate to calculate the weighting coefficient of a classifier, so MOE can use the survival rate to calculate the weight of an operator. Then, AdaBoost uses the weighting coefficient to update the weights of the training samples, while MOE adopts the weight of the operator to update the subpopulation ratio. Moreover, AdaBoost gives higher weights to misclassified samples, which may be located near the classification margin; in our proposed MOE, the inferior individuals of one subpopulation are allowed to migrate into the other subpopulations, where they may be improved by other operators. The corresponding relation between MOE and AdaBoost is shown in Table 1.
Table 1: Corresponding relation between MOE and AdaBoost.

    MOE                                                  | AdaBoost
    -----------------------------------------------------|-----------------------------------------------------
    Multiple subpopulations with different ratios        | Training samples with different weights
    Operator                                             | Classifier
    Survival rate                                        | Error rate
    Calculate the weight of the operator                 | Calculate the weighting coefficient of the classifier
    Update ratios of subpopulations                      | Update weights of training samples
    Inferior individuals improved by the other operators | Misclassified samples given higher weights
3. The proposed MOE for MOEA

The basic procedure of the proposed algorithm is similar to that of most Pareto-based MOEAs; the main difference is that MOE is used to create the offspring at each generation. Algorithm 2 gives the general framework of an MOEA embedded with MOE, termed MOEA/MOE. Firstly, an initial parent population P of size N is randomly generated, and P is randomly divided into K subpopulations of equal size (lines 1-4). Secondly, for each subpopulation P_k^t, an offspring population Q_k^t is generated by using the operator Op_k, and all the Q_k^t are merged into the final offspring population Q (lines 6-10). Using the environmental selection of any Pareto-based algorithm, the N elite individuals survive and are added into P′ (line 12), and the order in which individuals are selected for the next population is recorded. Thirdly, the survival rate after environmental selection is employed to calculate the weight of each operator (line 13), and the ratio of each subpopulation is updated (line 14). Finally, the subpopulations {P_1^t, P_2^t, ..., P_K^t} are redivided according to the updated subpopulation ratios, and individuals are exchanged among subpopulations at the same time (line 15). This procedure repeats until a termination condition is met. For a better understanding of the framework, the general flow chart of MOEA/MOE is also given in Fig. 1.

Overall, the proposed MOE consists of three steps: the performance evaluation of operators, the credit assignment, and the subpopulation update. The first step is how to evaluate the performance of each operator at the current generation; the second is how to assign an appropriate credit for the next generation based on that performance; and the last is how to reward or penalize the use of the operators according to the credits. Note that the proposed MOE is highly scalable. First of all, it can easily be scaled to use any number
Algorithm 2 Framework of MOEA/MOE
Input: tmax (maximum number of iterations); N (population size); Op (operator pool with K operators)
Output: P^{tmax} (final population)
 1: P ← Population_initialization(N);
 2: t ← 0;
 3: Initialize the ratio of each subpopulation: λ_i^(0) = 1/K, i = 1, 2, ..., K;
 4: Randomly divide P into {P_1^t, P_2^t, ..., P_K^t} with equal size;
 5: while t < tmax do
 6:     for k = 1 to K do
 7:         P_k^t ← Mating_selection(P_k^t);
 8:         Q_k^t ← Offspring_creation(P_k^t);  /* use the k-th operator to create the offspring */
 9:     end for
10:     Q ← ∪_{k=1}^{K} Q_k^t;
11:     S ← P ∪ Q;
12:     [P′, Order] ← Environmental_selection(S);
13:     α^t ← Performance_evaluation(P′);
14:     λ^{t+1} ← Credit_assignment(α^{t−1}, α^t, λ^t);
15:     {P_1^t, P_2^t, ..., P_K^t} ← Subpopulation_update(λ^{t+1}, P′, Order);
16:     t ← t + 1;
17: end while
of populations to perform the ensemble of multiple operators. Secondly, it can easily be embedded into both indicator- and Pareto-based approaches. In this study, we validate the effectiveness of MOE within the Pareto-based framework.

3.1. Performance evaluation of operators

The most commonly used indicator for evaluating the performance of an operator is based on the fitness improvement between the parent and the offspring. However, this method is only suitable for decomposition-based approaches. For Pareto-based approaches, such as NSGA-II [38], SPEA2 [39], PESA-II [40], and NSGA-III [41], which use the Pareto-dominance relation to compare individuals, it is very difficult to quantify the quality difference between two solu-
[Figure 1: The general flow chart of MOEA/MOE (initialization; per-subpopulation operators Op_1, ..., Op_K producing offspring Q_1, ..., Q_K; environmental selection; performance evaluation, credit assignment, and subpopulation update; loop until the termination condition is met).]
tions and the improvement caused by using an operator. Therefore, how to evaluate operator performance in Pareto-based approaches is an important issue. In fact, an MOE method does not need the absolute fitness improvement, but only the relative relation among the performances of the operators. Running on multiple subpopulations, this relation can easily be quantified by the numbers of individuals from the different subpopulations that survive environmental selection. To be specific, the offspring individuals generated by the different operators are combined with the parent subpopulations. Then, environmental selection is performed on this union population just as in most Pareto-based approaches; the only difference is that the individuals are labeled by the operators that produced them. Finally, we count the survivors from each subpopulation: if one operator is more suitable for the current problem, the individuals generated by this operator naturally survive in greater numbers. Thus, the performance of an operator at the current generation can be measured by the survival rate:

    ε_k^t = |P_k^t| / N,                                   (2)

where |P_k^t| is the size of the k-th subpopulation in the t-th generation and N is the size of the whole population.
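The survival-rate computation of Eq. (2) amounts to counting operator labels after environmental selection. A minimal sketch (our own illustrative code; the label list is a hypothetical input, one entry per surviving individual):

```python
from collections import Counter

def survival_rates(selected_labels, n_ops):
    """Eq. (2): epsilon_k = |P_k| / N, where selected_labels[i] is the index of
    the operator whose subpopulation produced the i-th surviving individual."""
    n = len(selected_labels)
    counts = Counter(selected_labels)
    return [counts.get(k, 0) / n for k in range(n_ops)]

# After selecting N = 10 individuals produced by 3 operators:
labels = [0, 0, 0, 0, 1, 1, 1, 2, 2, 0]
print(survival_rates(labels, 3))  # → [0.5, 0.3, 0.2]
```

Because every survivor carries exactly one label, the rates always sum to 1, which is the property exploited when they are reused directly as operator weights in Section 3.2.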
Note that this evaluation method considers the convergence and the diversity of an operator's search simultaneously, because an individual that survives into the next generation must have both good convergence and good diversity, whereas the fitness improvement used in AOS only considers convergence. Moreover, it does not require computing the fitness improvement between the last and the current generation; it only observes the survival situation of each subpopulation after the environmental selection of the current generation. If one or more operators are more suitable for the current problem, the sizes of the subpopulations associated with them naturally grow; otherwise, the inferior individuals generated by the unsuitable operators are gradually eliminated in the competition.

According to the survival rate, the weight of each operator can be determined. Here, we do not employ the nonlinear update formula for the weighting coefficients in AdaBoost, because α_t = (1/2) ln((1 − ε_t)/ε_t) can be negative, which does not fit the definition of a weight. Since the survival rates have the same properties as operator weights, namely ε_k^t ∈ [0, 1] and Σ_{k=1}^{K} ε_k^t = 1, we simply set the weight of the operator to α_k^t = ε_k^t. More importantly, this linear relation makes the weights change gently, which helps exert the advantages of different operators at different stages.

3.2. Credit assignment

Making use of the advantages of multiple populations, the credit assignment can be implemented by enlarging or reducing the sizes of the subpopulations, i.e., by adaptively allocating the computational resources. If one operator has a higher weight in the current generation, it should be given more credit for the next generation by enlarging its subpopulation ratio. The remaining difficulty is how to assign an appropriate credit value to an operator based on its
weight. In this paper, inspired by the weight vector update method for the training samples in AdaBoost [18], the subpopulation ratio is updated as follows:

    λ_k^{t+1} = λ_k^t · exp(α_k^t − α_k^{t−1}) / Z^{(t)},          (3)

where Z^{(t)} = Σ_{k=1}^{K} λ_k^t · exp(α_k^t − α_k^{t−1}) is a normalization factor, and λ_k^{t+1} is the ratio of the
k-th subpopulation for the next generation.

From Formula (3), we can find that it differs slightly from the original weight vector update method in Algorithm 1. After all, an optimization problem is different from a classification problem: there are no class labels for individuals, so we cannot determine the update direction from labels as in AdaBoost. Instead, we compare the previous weight of the operator with the current weight, which provides the update direction: if the weight increases, the subpopulation ratio is enlarged; otherwise, it is reduced. Moreover, this incremental update has a decaying effect. That is, an operator that performs well only for a short time is prevented from rapidly grabbing excessive resources, so its subpopulation is less likely to be trapped in locally optimal regions, while a truly suitable operator that performs worse at a very early stage is given a chance to survive and rise at a later stage. For example, suppose the operator pool includes four operators and the initial subpopulation ratios are equal, that is, λ^0 = [0.25, 0.25, 0.25, 0.25]. After the first round of environmental selection, the survival rates of the subpopulations are {0.4, 0.3, 0.2, 0.1}, so the current operator weights are α^0 = [0.4, 0.3, 0.2, 0.1]. The subpopulation ratios are then updated according to (3) as λ^1 = [0.29, 0.26, 0.23, 0.21]. If we instead gave an immediate reward to each operator according to the current weight α^0, excessive resources would be allocated to an operator that only suits the early stage, leaving no room for an operator that suits the late search stage. The update method in (3) has a decaying effect on these changes, avoiding excessive growth or decline of the subpopulation sizes. This guarantees a smooth transition among operators and makes full use of the advantages of different kinds of operators at different search stages.

It is worth noting that an operator may do badly at an earlier stage of the evolutionary process and perform very well at a later stage, or vice versa. Considering this fact in the design of the MOE strategy, we set a lower bound of 5% · N on the subpopulation size, so as to keep each operator working on some individuals in each generation. This gives an operator that may work well at the late search stage a chance to survive.

3.3. Subpopulation update

Based on the updated subpopulation ratios, we adjust the subpopulation sizes to reward or penalize the computational resources for each operator. In detail, the
size of the subpopulation associated with a well-performing operator should be increased, while the size of the subpopulation associated with a poorly performing operator should be reduced. Moreover, because the subpopulation sizes change, individuals need to be removed from some subpopulations and added into others. Like AdaBoost focusing on the misclassified samples, in MOE the inferior individuals of a subpopulation are allowed to migrate into other subpopulations. These inferior individuals are unlikely to produce high-quality offspring in their current subpopulation over time and should therefore be removed. In this way, evolutionary information is exchanged among the subpopulations through the migration of individuals, which achieves the complementary advantages of different operators.

Unlike the misclassified samples in AdaBoost, which can easily be identified by comparing the predicted label with the true label, how to define the inferior individuals in each subpopulation needs consideration. A direct method would run environmental selection again on each subpopulation to rank its individuals, but this process is very time-consuming. Here, we utilize the order in which individuals are selected for the next generation during environmental selection to define the inferior individuals, because in most Pareto-based approaches, the environmental selection always selects the N elite individuals from the combined population of size 2N according to a sequential order constructed by the selection strategy. We can record these orders to evaluate the quality of the individuals. For example, in NSGA-II [38], the population is divided into different non-domination levels, i.e., F_1, F_2, and so on. As illustrated in Fig. 2, we can label all the individuals in F_1 with Order_1 because all of them are selected first for the next generation. The individuals in F_2 are labeled with Order_2, and so on. For the last accepted non-domination level F_l, diversity maintenance is used: the individuals with larger crowding distances are succes-
[Figure 2: Illustration of making an order for each individual (non-dominated sorting into levels F_1, F_2, ..., F_l, with diversity maintenance applied to the last level F_l).]
sively selected for next generation, which can be labeled with the sequential order values, such as Orderl , Orderl+1 . Like the survive rate, we only need to label the selected order for each individual in the process of environmental selection, which does not need extra computation. The set of inferior individuals Pk− is constructed by selecting the ones with higher orders in the subpopulation Pk . These individuals would be removed and
310
temporarily stored in P oorArchive, where the number of inferior individuals in − P is calculated by: k − P = µk · |Pk | (4) k and
(1 − λk ) µk = PK k=1 (1 − λk )
(5)
A larger λk indicates that the operator Opk performs better, so the elimination rate µk is smaller and fewer individuals are removed from the subpopulation Pk; otherwise, the elimination rate µk is larger and more individuals are removed. The individuals in PoorArchive are then randomly assigned to the subpopulations according to the subpopulation ratios λk. Suppose that Pk+ represents the set of individuals added into Pk, where the number of individuals |Pk+| is
calculated by:

|Pk+| = λk · (|P1−| + · · · + |PK−|)    (6)

Figure 3: Illustration of subpopulation update.
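Putting Eqs. (4)-(6) together, the emigration-immigration step can be sketched as below. This is a rough illustration, not the authors' implementation; the function name and data layout are ours, and we assume each subpopulation is a list sorted by the recorded selection order (best first) with the ratios λk supplied by the credit assignment scheme.

```python
import random

def update_subpopulations(subpops, lambdas, rng=None):
    """Sketch of the emigration-immigration step (Eqs. 4-6).

    subpops : list of lists, one per operator, each sorted by the
              recorded selection order (best-ordered first).
    lambdas : subpopulation ratios, one per operator, summing to 1.
    """
    rng = rng or random.Random(0)
    K = len(subpops)
    denom = sum(1.0 - l for l in lambdas)
    mus = [(1.0 - l) / denom for l in lambdas]          # Eq. (5)

    poor_archive = []
    for k in range(K):
        n_remove = int(mus[k] * len(subpops[k]))        # Eq. (4)
        keep = len(subpops[k]) - n_remove
        poor_archive.extend(subpops[k][keep:])          # worst-ordered emigrate
        subpops[k] = subpops[k][:keep]

    rng.shuffle(poor_archive)                           # random reassignment
    total, start = len(poor_archive), 0
    for k in range(K):
        # Eq. (6); the last subpopulation absorbs rounding leftovers
        n_add = total - start if k == K - 1 else int(lambdas[k] * total)
        subpops[k].extend(poor_archive[start:start + n_add])
        start += n_add
    return subpops
```

Note that individuals are only moved, never created or destroyed, so the total population size is preserved across the update.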
If λk is larger, more individuals selected from PoorArchive are added into the subpopulation Pk. This emigration-immigration mechanism not only reallocates the computational resources but also facilitates the exchange of evolutionary information among subpopulations. Moreover, the inferior individuals may well produce high-quality offspring under a different operator. The process of the subpopulation update is illustrated in Fig. 3.

4. Empirical analysis

In this section, we first describe the experimental settings, including the benchmark suites, the quality indicator, the compared algorithms and the operator settings.
Then, the effectiveness of the proposed MOE strategy is verified by embedding it into different Pareto-based MOEAs. Afterwards, we verify the competitiveness of the proposed MOE strategy by comparing NSGA-III/MOE with MOEA/D-FRRMAB, which uses the AOS strategy. Lastly, to understand the dynamics of MOE, the trajectory of the subpopulation ratio during the evolutionary process is given.

4.1. Experimental settings

4.1.1. Test problems

Three different kinds of test suites are chosen in this study: UF [42], WFG [43] and CDTLZ [44]. The UF test instances are characterized by
Table 2: Maximum number of fitness evaluations for different test problems.

Problem              | M=3    | M=5     | M=8     | M=10    | M=15
WFG1-9               | /      | 265,000 | /       | 552,000 | 405,000
C1-DTLZ1             | 46,000 | 127,200 | 124,800 | 276,000 | 204,000
C1-DTLZ3             | 92,000 | 318,000 | 390,000 | 966,000 | 680,000
C2-DTLZ2*            | 69,000 | 265,000 | 312,000 | 828,000 | 544,000
C2-DTLZ2             | 23,000 | 74,200  | 78,000  | 207,000 | 136,000
C3-DTLZ1, C3-DTLZ4   | 69,000 | 265,000 | 312,000 | 828,000 | 544,000
having nonlinear PSs with arbitrary shapes in the decision space. The number of decision variables of the UF instances is set to 30, where UF1 to UF7 have two objectives and UF8 to UF10 have three objectives. As suggested in [29], the population size is N = 600 for the two-objective test instances and N = 1000 for the three-objective test instances. The maximum number of function evaluations is fixed at 300,000. The WFG test problems have various characteristics (e.g., linear, convex, concave, mixed, multi-modal) and become more difficult to solve as the number of objectives increases. As suggested in [43], for the WFG test instances the number of decision variables is set to D = M + 9, with the position-related variable K = M − 1 and the distance-related variable L = D − K. The population sizes N = 212, 276 and 136 are set for the five-, ten- and fifteen-objective test instances, respectively. The CDTLZ test suite includes three types of constrained problems, in which several infeasible regions are introduced into the decision space. As suggested in [44], the population sizes N = 92, 212, 156, 276 and 136 are set for the three-, five-, eight-, ten- and fifteen-objective test instances, respectively. The maximum number of function evaluations for the WFG and CDTLZ test instances is summarized in Table 2, where C2-DTLZ2* denotes C2-convex-DTLZ2. Note that all the parameters are set the same as in the original studies for a fair comparison.

4.1.2. Quality indicator

In our empirical studies, the inverted generational distance (IGD) [45] is employed to measure the convergence and diversity of the solutions simultaneously. A smaller IGD value indicates better performance. To calculate IGD, roughly 10,000 reference points on the PF of each test instance are sampled, using Das and Dennis's approach [46] for M ∈ {2, 3, 5} and the two-layer reference point generation method [41] for M ≥ 8.
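For reference, IGD is simply the mean distance from each sampled PF reference point to its nearest obtained solution in objective space. A minimal sketch (the function name is ours):

```python
import math

def igd(reference_points, solutions):
    """Inverted generational distance: mean distance from each PF
    reference point to its nearest solution in objective space."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return sum(min(dist(r, s) for s in solutions)
               for r in reference_points) / len(reference_points)
```

Because the average runs over the reference points, a solution set must both approach the PF and cover it to obtain a small IGD value.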
4.1.3. Algorithms and parameters

The effectiveness of the proposed MOE is verified on six state-of-the-art Pareto-based MOEAs, namely NSGA-II [38], SPEA2 [39], PESA-II [40], NSGA-III [41], VaEA [47] and KnEA [48]. For KnEA, the rate T of knee points in the population is set to 0.5. For VaEA, the angle threshold σ is set to (π/2)/(N + 1). In PESA-II, the number of divisions in each objective is set to 6. In addition, to reduce the time cost, the non-dominated sorting procedure of the above algorithms employs the efficient non-dominated sorting approach ENS-SS presented in [49]. In the experimental study, all test results are obtained with the open-source software PlatEMO [50].
4.1.4. Operator settings

In this paper, we select four kinds of operators with distinct search characteristics, including three DE operators [29, 51] and the GA operator [52, 53]. The operator pool is set as follows:

1) DE/rand/1: vi = xi + F × (xr1 − xr2);
2) DE/rand/2: vi = xi + F × (xr1 − xr2) + F × (xr3 − xr4);
3) DE/current-to-rand: vi = xi + K × (xi − xr1) + F × (xr2 − xr3);
4) GA: the simulated binary crossover (SBX) [52] followed by the polynomial mutation (PM) [53].

Here xi is called the target individual and vi is the mutant individual. The individuals xr1, xr2, xr3 and xr4 are randomly selected from P; they are different from each other and also from xi. The scaling factor F > 0 controls the impact of the individual differences on the mutant individual, and K ∈ [0, 1] plays a role similar to F; we set F = 0.5 and K = 0.5. Binomial crossover [54] is applied after the mutation, with crossover probability CR = 1.0. In SBX, the distribution index is ηc = 20 and the crossover probability is pc = 1.0. In PM, the mutation probability is pm = 1/V and the distribution index is ηm = 20, where V denotes the number of decision variables. In the following experimental analysis, DE1, DE2 and DE3 represent DE/rand/1, DE/rand/2 and DE/current-to-rand, respectively. The GA operator favors exploitation, while the DE operators focus on exploration to different degrees: DE2 has the strongest exploration ability, as it includes two difference terms built from four randomly selected individuals; DE3 has the second strongest, because its first difference term includes the target individual xi, which reduces the randomness; and DE1 has the weakest exploration ability, because it has only one difference term.

4.2. Effectiveness of the MOE strategy

In order to validate the effectiveness of the proposed MOE, we select three different kinds of test suites, covering MOPs with complex PSs, many objectives and constraints, respectively. All of them pose difficulties
for the search ability of the operator. For MOPs with complex PSs, the search of the operator is not smooth in the decision space due to the irregular shape of the PS; for MOPs with many objectives, creating solutions with good convergence and diversity is important for MOEAs; and for MOPs with constraints, infeasible regions in the decision space hinder the search of the operator. In this study, the average and standard deviation of the IGD value over 30 independent runs are reported for each MOP, with the best result highlighted in bold. In addition, to test the differences for statistical significance, the Wilcoxon rank-sum test at the 5% significance level is performed between each original algorithm and its modified version using MOE.
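As a rough illustration (not the PlatEMO implementation), the three DE mutation variants of Section 4.1.4 with F = K = 0.5 can be sketched as follows; `de_mutant` and its arguments are hypothetical names:

```python
import random

F = 0.5   # scaling factor (Section 4.1.4)
K = 0.5   # current-to-rand coefficient

def de_mutant(variant, x, pop, rng):
    """Mutant vector for the three DE variants in the operator pool.
    x is the target individual; r1..r4 are distinct others from pop."""
    others = [p for p in pop if p is not x]
    r1, r2, r3, r4 = rng.sample(others, 4)
    if variant == "DE1":      # DE/rand/1
        return [xi + F * (a - b) for xi, a, b in zip(x, r1, r2)]
    if variant == "DE2":      # DE/rand/2: two difference terms
        return [xi + F * (a - b) + F * (c - d)
                for xi, a, b, c, d in zip(x, r1, r2, r3, r4)]
    if variant == "DE3":      # DE/current-to-rand
        return [xi + K * (xi - a) + F * (b - c)
                for xi, a, b, c in zip(x, r1, r2, r3)]
    raise ValueError(variant)
```

Binomial crossover with CR = 1.0 would then copy every mutant component into the trial vector, so it is omitted here; the GA operator (SBX plus PM) is likewise left out for brevity.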
4.2.1. Performance on MOPs with complex PSs

To validate the effectiveness of the proposed MOE on MOPs with complex PSs, we integrate it into three classical Pareto-based algorithms that are good at solving problems with two or three objectives, NSGA-II [38], PESA-II [40] and SPEA2 [39], which results in three new MOEAs, denoted NSGA-II/MOE, PESA-II/MOE and SPEA2/MOE, respectively. We compare each of the three new algorithms with its original version, and then compare all of them together to investigate the reasons for their behavior on MOPs with complex PSs. Table 3 shows the average IGD values of the six algorithms on UF1 to UF10. From the experimental statistics, the overall performance of the modified versions using MOE is better than that of the original algorithms. Moreover, for the UF5 and UF8-10
Table 3: Average and standard deviation of the IGD values obtained by the six algorithms on UF test problems. Best result is highlighted in bold.

Problem | NSGA-II | NSGA-II/MOE | PESA-II | PESA-II/MOE | SPEA2 | SPEA2/MOE
UF1  | 2.9244e-2 (3.47e-3)† | 3.1216e-2 (2.75e-3) | 2.8723e-2 (4.03e-3) | 2.5300e-2 (3.96e-3)† | 2.4672e-2 (1.63e-3)† | 2.8203e-2 (1.88e-3)
UF2  | 1.4840e-2 (2.04e-3) | 1.0818e-2 (7.37e-4)† | 1.5712e-2 (2.89e-3) | 1.4868e-2 (1.32e-3)† | 1.3097e-2 (1.66e-3) | 1.1963e-2 (8.11e-4)†
UF3  | 6.0421e-2 (1.22e-2) | 4.9618e-2 (2.04e-2)† | 1.1967e-1 (4.40e-2) | 1.0548e-1 (4.36e-2)† | 4.2713e-2 (1.75e-2)† | 6.3146e-2 (1.23e-2)
UF4  | 6.5947e-2 (2.47e-3) | 4.0345e-2 (7.21e-4)† | 6.7333e-2 (3.88e-3) | 5.9930e-2 (4.98e-3)† | 6.4433e-2 (3.66e-3) | 4.3527e-2 (1.14e-3)†
UF5  | 5.7788e-1 (1.78e-1) | 1.8037e-1 (1.17e-1)† | 5.6372e-1 (1.42e-1) | 3.7733e-1 (1.72e-1)† | 4.3165e-1 (8.65e-2) | 1.3169e-1 (1.69e-2)†
UF6  | 2.3774e-1 (8.96e-2) | 1.0801e-1 (7.39e-3)† | 2.0320e-1 (1.24e-1) | 2.1959e-1 (1.23e-1) | 1.4228e-1 (2.51e-2) | 1.0791e-1 (6.32e-3)†
UF7  | 1.4490e-2 (1.06e-3) | 1.2707e-2 (7.44e-4)† | 1.8370e-2 (3.24e-3) | 1.4259e-2 (7.96e-2)† | 1.2249e-2 (8.54e-4) | 1.1804e-2 (6.63e-4)
UF8  | 1.0524e-1 (1.08e-2) | 9.2408e-2 (6.76e-3)† | 1.8743e-1 (4.89e-2) | 1.6789e-1 (8.02e-2)† | 8.8716e-2 (5.64e-3) | 6.5762e-2 (2.82e-3)†
UF9  | 1.1522e-1 (6.77e-2) | 7.9624e-2 (1.39e-3)† | 3.8269e-1 (9.71e-2) | 3.6101e-1 (9.93e-2)† | 1.1353e-1 (7.39e-2) | 7.9890e-2 (5.17e-2)†
UF10 | 2.2520e+0 (3.03e-1) | 3.6591e-1 (9.53e-2)† | 1.3256e+0 (2.43e-1) | 9.1942e-1 (2.38e-1)† | 2.2325e+0 (2.93e-1) | 3.9631e-1 (1.17e-1)†

"†" indicates that the two results are significantly different at a level of 0.05 by the Wilcoxon's rank sum test.
test instances, NSGA-II/MOE and SPEA2/MOE have a clear advantage over the original algorithms. This is because different operators have different search characteristics, and the ensemble strategy can combine the advantages of these operators efficiently. It is worth noting that the MOE strategy does not always improve the performance of the algorithm: for NSGA-II/MOE on UF1, PESA-II/MOE on UF6 and SPEA2/MOE on UF3, using the single operator performs better than using MOE. The reason may be that the performance of an operator is related not only to the characteristics of the problem but also to the environmental selection strategy. For a specific problem and environmental selection, using a single operator may keep the evolutionary process stable and sustain the evolutionary ability.

4.2.2. Performance on MOPs with many objectives

Recently, MOPs with more than three objectives, known as many-objective optimization problems (MaOPs), have attracted great attention in the evolutionary multi-objective optimization community [55–57]. Many many-objective evolutionary algorithms (MaOEAs) have been proposed to deal with MaOPs. However, most of them pay more attention to the selection of solutions for improving the performance of the algorithm, and few works focus on the quality of the offspring produced by the operators. Recent studies [31, 58, 59] have shown that a multi-operator strategy can also improve the performance of the algorithm by producing promising offspring, so the proposed MOE strategy may be efficient for handling MaOPs. Note that the main purpose of this paper is not to solve
MaOPs but to investigate whether MOE can improve the abilities of MaOEAs, such as NSGA-III [41], VaEA [47] and KnEA [48], in dealing with MaOPs. Table 4 shows the average IGD values of the six algorithms on WFG1 to WFG9. As shown in Table 4, the modified versions using MOE are better than the original algorithms on most test instances. To be specific, NSGA-III/MOE has a clear advantage over NSGA-III on the WFG1-2 and WFG9 test instances. For the rest of the WFG problems (WFG3-6 and WFG8), NSGA-III/MOE achieves the best results on the five-objective instances, while for the ten- and fifteen-objective instances it only obtains values comparable to NSGA-III. The reason may be that, in a high-dimensional objective space, solutions are likely to be widely distant from each other, and the single GA operator used in NSGA-III focuses on exploitation, which helps improve convergence. However, this improvement holds only for many-objective problems with regular PFs. For complex problems, such as the mixed WFG1, the disconnected WFG2 and the multi-modal, non-separable WFG9 test instances, MOE can combine the advantages of various kinds of operators and produce offspring with different characteristics. Meanwhile, VaEA/MOE and KnEA/MOE also perform better than their original algorithms on most test instances, especially on WFG1. Different from NSGA-III/MOE, these two algorithms have a clear advantage on WFG8. This difference indicates that the efficiency of MOE may be related to the type of environmental selection strategy.
4.2.3. Performance on MOPs with constraints

Due to the existence of constraints, the shape of the fitness landscape in constrained optimization is more complex than in unconstrained problems, which presents a greater challenge for the search ability of the operator [60]. The CDTLZ test suite [44] imposes constraints in the objective space that make the feasible regions discrete and irregular in the decision space. It includes three types of constrained problems: in the first type, the original PF is still optimal, but an infeasible barrier blocks the approach to the PF, which makes it harder for the algorithm to converge; in the second type, only the region located inside each of the M + 1 hyper-spheres with radius r is feasible; the third type introduces multiple constraints, and the PF is composed of several constraint
Table 4: Average and standard deviation of the IGD values obtained by the six algorithms on WFG test problems with different numbers of objectives. Best result is highlighted in bold.

Problem | M | NSGA-III | NSGA-III/MOE | VaEA | VaEA/MOE | KnEA | KnEA/MOE
WFG1 | 5  | 2.2980e+0 (6.13e-2) | 4.5006e-1 (4.94e-2)† | 2.2391e+0 (5.74e-2) | 2.2137e+0 (2.12e-2) | 2.1430e+0 (7.36e-2) | 4.1807e-1 (1.05e-2)†
WFG1 | 10 | 3.1630e+0 (7.08e-2) | 1.1802e+0 (1.93e-1)† | 3.1839e+0 (2.75e-2) | 3.1296e+0 (2.72e-2) | 3.0804e+0 (3.89e-2) | 1.1729e+0 (5.50e-2)†
WFG1 | 15 | 3.9313e+0 (3.10e-2) | 2.1442e+0 (2.56e-1)† | 3.9039e+0 (8.97e-2) | 3.7773e+0 (1.19e-1)† | 3.8821e+0 (1.72e-1) | 2.0597e+0 (1.83e-1)†
WFG2 | 5  | 6.1972e-1 (1.11e-1) | 5.7471e-1 (9.03e-2)† | 7.2516e-1 (2.88e-2) | 7.2882e-1 (3.88e-2) | 5.5858e-1 (3.28e-2) | 5.5762e-1 (5.05e-2)
WFG2 | 10 | 2.8070e+0 (8.57e-1) | 2.2400e+0 (5.00e-1)† | 2.6939e+0 (3.18e-1) | 2.6430e+0 (2.52e-1) | 2.1230e+0 (2.55e-1) | 2.2184e+0 (4.44e-1)
WFG2 | 15 | 7.5133e+0 (2.92e+0) | 9.7569e+0 (1.62e+0)† | 2.0948e+0 (9.26e-1) | 1.1770e+0 (4.86e-1)† | 3.4594e+0 (8.93e-1) | 3.4885e+0 (1.37e+0)
WFG3 | 5  | 6.4474e-1 (6.44e-2) | 4.4472e-1 (4.79e-2)† | 6.4862e-1 (6.14e-2) | 6.6311e-1 (6.95e-2) | 5.7573e-1 (4.52e-2) | 5.7407e-1 (2.17e-2)
WFG3 | 10 | 1.0339e+0 (1.80e-1)† | 1.7036e+0 (2.17e-1) | 1.3120e+0 (9.53e-2) | 1.4261e+0 (1.11e-1) | 1.4814e+0 (2.25e-1) | 1.4716e+0 (1.05e-1)
WFG3 | 15 | 2.0699e+0 (6.42e-1)† | 4.1832e+0 (1.31e+0) | 2.2558e+0 (4.60e-1)† | 2.4642e+0 (4.66e-1) | 4.5699e+0 (1.57e-1) | 2.8346e+0 (1.37e-1)†
WFG4 | 5  | 8.1146e-1 (1.11e-1) | 7.4736e-1 (1.31e-1)† | 1.0541e+0 (1.44e-2) | 1.0533e+0 (7.91e-3) | 9.7181e-1 (2.24e-3) | 9.6060e-1 (1.35e-3)
WFG4 | 10 | 4.3574e+0 (8.84e-2)† | 4.5428e+0 (1.73e-2) | 4.2824e+0 (3.82e-2) | 4.2877e+0 (3.26e-2) | 4.0269e+0 (4.34e-2)† | 4.5503e+0 (2.05e-2)
WFG4 | 15 | 8.6998e+0 (1.03e-1)† | 9.3860e+0 (1.35e-2) | 8.9065e+0 (8.91e-2) | 8.7345e+0 (9.50e-2)† | 8.1682e+0 (1.37e-1) | 8.0899e+0 (6.06e-2)†
WFG5 | 5  | 8.1943e-1 (8.63e-2) | 7.1871e-1 (1.41e-1)† | 1.0032e+0 (2.66e-2) | 1.0094e+0 (2.30e-2) | 9.5672e-1 (8.05e-3) | 9.5035e-1 (7.46e-4)
WFG5 | 10 | 4.3298e+0 (1.13e-1) | 4.5151e+0 (5.63e-3) | 4.2042e+0 (4.03e-2) | 4.1731e+0 (2.62e-2)† | 3.9818e+0 (3.15e-2)† | 4.2669e+0 (3.15e-2)
WFG5 | 15 | 8.6629e+0 (1.12e-1)† | 9.2359e+0 (1.12e-1) | 8.7594e+0 (1.52e-1) | 8.6900e+0 (9.00e-2)† | 8.1904e+0 (5.69e-2)† | 9.0907e+0 (2.68e-2)
WFG6 | 5  | 8.5599e-1 (1.48e-1) | 7.0963e-1 (1.28e-1)† | 1.0666e+0 (5.26e-2) | 1.0452e+0 (2.32e-2) | 9.7741e-1 (8.36e-3) | 9.6081e-1 (6.53e-4)
WFG6 | 10 | 4.7396e+0 (4.09e-1) | 4.5107e+0 (4.17e-2)† | 4.0602e+0 (3.79e-2) | 4.0773e+0 (3.99e-2) | 4.2286e+0 (3.78e-2)† | 4.5415e+0 (5.03e-2)
WFG6 | 15 | 8.6218e+0 (1.55e-1)† | 9.2219e+0 (1.04e-1) | 8.3406e+0 (9.36e-2) | 8.3514e+0 (9.38e-2) | 9.3152e+0 (8.54e-2)† | 9.6238e+0 (2.61e-1)
WFG7 | 5  | 8.6909e-1 (6.87e-2) | 7.4572e-1 (1.09e-1)† | 1.1339e+0 (1.45e-2) | 1.1205e+0 (1.70e-2) | 9.8189e-1 (5.13e-3) | 9.6705e-1 (1.77e-3)†
WFG7 | 10 | 4.3733e+0 (7.41e-2)† | 4.5635e+0 (1.17e-2) | 4.2088e+0 (3.43e-2) | 4.1822e+0 (3.97e-2) | 4.5538e+0 (1.35e-2) | 4.5790e+0 (1.34e-2)
WFG7 | 15 | 8.6544e+0 (2.06e-1)† | 9.3876e+0 (1.37e-2) | 8.5708e+0 (7.12e-2) | 8.4963e+0 (9.76e-2) | 9.2179e+0 (1.04e-1) | 8.6271e+0 (4.94e-2)†
WFG8 | 5  | 9.6789e-1 (4.09e-2) | 7.8874e-1 (9.05e-2)† | 1.2605e+0 (2.96e-2) | 1.2541e+0 (3.30e-2) | 1.0279e+0 (1.51e-2) | 9.7915e-1 (4.38e-3)†
WFG8 | 10 | 4.4764e+0 (1.96e-1) | 4.5922e+0 (2.11e-2) | 4.2955e+0 (3.56e-2) | 4.0053e+0 (3.90e-2)† | 4.4720e+0 (2.62e-2) | 4.1820e+0 (1.98e-2)†
WFG8 | 15 | 8.6242e+0 (9.90e-2)† | 9.3804e+0 (2.01e-2) | 8.5474e+0 (7.96e-2) | 8.5344e+0 (1.27e-1) | 1.0205e+1 (2.53e-1) | 9.4194e+0 (3.94e-2)†
WFG9 | 5  | 8.2013e-1 (5.59e-2) | 7.4212e-1 (9.42e-2)† | 1.0804e+0 (2.06e-2) | 1.0541e+0 (1.55e-2) | 9.4047e-1 (2.14e-3) | 9.3134e-1 (1.61e-3)†
WFG9 | 10 | 4.4029e+0 (1.08e-1) | 4.3635e+0 (7.07e-2)† | 4.3787e+0 (4.60e-2) | 4.2778e+0 (2.20e-2)† | 4.3246e+0 (2.47e-2) | 4.2759e+0 (5.01e-2)†
WFG9 | 15 | 8.6435e+0 (8.74e-2) | 8.4491e+0 (6.37e-2)† | 8.7384e+0 (8.07e-2) | 8.5463e+0 (8.05e-2)† | 8.8695e+0 (9.05e-2) | 8.2855e+0 (6.58e-2)†

"†" indicates that the two results are significantly different at a level of 0.05 by the Wilcoxon's rank sum test.
surfaces. Faced with these different constraint characteristics, the MOE strategy may be able to combine the search advantages of different operators and provide stronger search ability for finding the feasible regions. Note that the constrained NSGA-III for constrained many-objective optimization problems has been suggested in [44]. Here, the same feasibility rule proposed in [38] is embedded into the mating and environmental selection of VaEA and KnEA, and all of them use the MOE strategy to produce offspring in place of the original GA operator.

As shown in Table 5, the performance of NSGA-III/MOE, VaEA/MOE and KnEA/MOE clearly improves when MOE is applied to the original algorithms. To be specific, NSGA-III/MOE has a clear advantage over NSGA-III on most test instances and achieves competitive results on the rest. VaEA/MOE also improves the performance of VaEA on most test instances. Note that VaEA/MOE can obtain feasible solutions in the final population on the eight- and ten-objective C1-DTLZ1 and the ten-objective C3-DTLZ4
Table 5: Average and standard deviation of the IGD values obtained by the six algorithms on CDTLZ test problems with different numbers of objectives. Best result is highlighted in bold.

Problem | M | NSGA-III | NSGA-III/MOE | VaEA | VaEA/MOE | KnEA | KnEA/MOE
C1-DTLZ1  | 3  | 2.0241e-2 (1.81e-4) | 1.8440e-2 (5.06e-5)† | 3.7051e-2 (1.90e-2) | 3.1149e-2 (8.96e-3)† | 5.7091e-2 (3.55e-2) | 5.6632e-2 (1.92e-2)†
C1-DTLZ1  | 5  | 5.1866e-2 (4.49e-4) | 5.1735e-2 (1.06e-4) | 1.0207e-1 (3.11e-2) | 1.0006e-1 (2.97e-2) | 2.1343e-1 (8.51e-2)† | NaN (NaN)
C1-DTLZ1  | 8  | 9.6358e-2 (4.81e-4) | 9.5064e-2 (3.02e-4)† | NaN (NaN) | 3.2472e-1 (1.19e-1)† | NaN (NaN) | NaN (NaN)
C1-DTLZ1  | 10 | 1.1714e-1 (3.89e-3) | 1.1535e-1 (3.72e-3) | NaN (NaN) | 3.5934e-1 (1.31e-1)† | NaN (NaN) | NaN (NaN)
C1-DTLZ1  | 15 | 1.9089e-1 (2.53e-3) | 1.9066e-1 (1.96e-3) | NaN (NaN) | NaN (NaN) | NaN (NaN) | NaN (NaN)
C1-DTLZ3  | 3  | 8.0063e+0 (1.62e-3) | 7.3905e+0 (1.60e-3)† | 5.9964e-2 (5.79e-3) | 5.7842e-2 (4.91e-3)† | 1.2123e-1 (3.00e-2)† | 1.3277e-1 (5.25e-2)
C1-DTLZ3  | 5  | 1.1070e+1 (2.32e+0) | 9.7914e+0 (4.11e+0)† | 3.3747e-1 (6.88e-2) | 3.3177e-1 (8.14e-2) | 4.2701e-1 (1.69e-1) | 3.4172e-1 (6.43e-2)†
C1-DTLZ3  | 8  | 1.1653e+1 (8.54e-2) | 1.1228e+1 (2.30e+0) | 5.4890e-1 (4.20e-2) | 5.1919e-1 (6.21e-2)† | 3.1632e+1 (2.30e+0) | 3.0557e+1 (1.30e+0)†
C1-DTLZ3  | 10 | 1.3634e+1 (2.82e+0) | 1.3683e+1 (2.81e+0) | 5.2552e-1 (4.50e-2) | 5.2596e-1 (3.64e-2) | 2.9712e+2 (9.80e+1) | 3.8216e+1 (8.14e+1)†
C1-DTLZ3  | 15 | 1.4362e+1 (1.01e-1) | 1.3857e+1 (2.83e+0)† | 1.0733e+0 (2.84e-1) | 1.000e+0 (3.64e-1)† | 3.6694e+2 (1.15e+2) | 3.0057e+2 (1.49e+2)†
C2-DTLZ2  | 3  | 4.8303e-2 (4.25e-4) | 4.7200e-2 (4.41e-4)† | 6.0421e-2 (1.94e-3) | 5.9828e-2 (1.39e-3)† | 8.6299e-2 (1.58e-2) | 6.9931e-2 (4.08e-3)†
C2-DTLZ2  | 5  | 1.3890e-1 (5.93e-4) | 1.3664e-1 (5.68e-4) | 1.8194e-1 (3.16e-3) | 1.7322e-1 (3.39e-3)† | 2.1249e-1 (1.40e-2) | 1.9366e-1 (1.25e-2)†
C2-DTLZ2  | 8  | 2.9945e-1 (1.48e-1) | 2.5557e-1 (1.18e-1)† | 3.6605e-1 (2.07e-3) | 3.6515e-1 (2.71e-3) | 3.4443e-1 (1.40e-3) | 3.3485e-1 (1.27e-3)†
C2-DTLZ2  | 10 | 2.6725e-1 (3.97e-2) | 2.3228e-1 (2.23e-2)† | 3.5811e-1 (7.55e-4) | 3.5737e-1 (6.46e-4) | 2.9794e-1 (4.45e-4)† | 3.4930e-1 (4.38e-4)
C2-DTLZ2  | 15 | 4.0870e-1 (2.22e-1)† | 4.8041e-1 (2.69e-1) | 3.0252e-1 (3.57e-3) | 3.0296e-1 (2.56e-3) | 2.9497e-1 (1.22e-1)† | 3.5938e-1 (1.04e-1)
C2-DTLZ2* | 3  | 3.4382e-2 (7.87e-4) | 3.3192e-2 (5.40e-4)† | 5.5100e-2 (2.60e-3) | 5.4602e-2 (1.72e-3)† | 4.7567e-2 (8.02e-4) | 4.0533e-2 (7.78e-4)†
C2-DTLZ2* | 5  | 6.5150e-2 (8.53e-4) | 6.4395e-2 (1.33e-3)† | 1.0725e-1 (5.22e-3) | 1.0588e-1 (4.83e-3) | 8.0538e-2 (6.43e-3)† | 1.8222e-1 (1.28e-2)
C2-DTLZ2* | 8  | 1.1190e-1 (1.64e-2) | 1.0842e-1 (1.72e-2)† | 3.1740e-1 (3.60e-2) | 3.1277e-1 (3.08e-2) | 6.4050e-1 (3.31e-1) | 4.5320e-1 (1.25e-1)†
C2-DTLZ2* | 10 | 1.0643e-1 (1.81e-3) | 1.0004e-1 (1.28e-3) | 3.8458e-1 (7.59e-2) | 3.7103e-1 (7.72e-2)† | 1.7400e+0 (1.33e+0) | 5.3258e-1 (24.74e-1)†
C2-DTLZ2* | 15 | 1.7316e-1 (3.54e-2) | 1.5607e-1 (3.42e-2)† | 6.9259e-1 (3.51e-3) | 6.5145e-1 (9.44e-2)† | 1.9513e+0 (2.06e+0) | 1.3539e+0 (6.26e-1)†
C3-DTLZ1  | 3  | 4.9842e-2 (1.04e-2) | 4.8371e-2 (1.00e-2)† | NaN (NaN) | NaN (NaN) | NaN (NaN) | NaN (NaN)
C3-DTLZ1  | 5  | 1.0699e-1 (2.63e-4)† | 1.1189e-1 (3.45e-3) | NaN (NaN) | NaN (NaN) | 1.6176e+1 (2.26e+1) | 3.9910e-1 (2.02e-1)†
C3-DTLZ1  | 8  | 2.2439e-1 (3.45e-2) | 1.9740e-1 (1.46e-2)† | 1.5287e+0 (1.69e+0) | 1.3762e+0 (1.60e+0)† | 8.0682e+0 (9.26e-1) | 2.6020e+0 (8.95e-1)†
C3-DTLZ1  | 10 | 2.4749e-1 (2.87e-2) | 2.3784e-1 (9.07e-3)† | 8.0564e-1 (3.49e-1) | 7.5373e-1 (2.04e-1)† | 4.4055e+0 (3.63e+0) | 3.6288e+0 (2.20e+0)†
C3-DTLZ1  | 15 | 3.8577e-1 (8.61e-3) | 3.6158e-1 (2.68e-2)† | 6.4921e-1 (8.80e-2) | 6.3932e-1 (7.19e-2)† | 8.7493e+0 (7.09e+0) | 6.4719e+0 (3.38e+0)†
C3-DTLZ4  | 3  | 1.2408e-1 (1.53e-1) | 9.6913e-2 (2.93e-3)† | NaN (NaN) | NaN (NaN) | NaN (NaN) | NaN (NaN)
C3-DTLZ4  | 5  | 2.4784e-1 (1.62e-3) | 2.2385e-1 (1.57e-3)† | NaN (NaN) | NaN (NaN) | NaN (NaN) | 6.1924e-1 (8.63e-2)†
C3-DTLZ4  | 8  | 4.8649e-1 (8.80e-2) | 4.6285e-1 (2.35e-3)† | NaN (NaN) | NaN (NaN) | NaN (NaN) | 7.8249e-1 (2.47e-2)†
C3-DTLZ4  | 10 | 5.8418e-1 (5.30e-2) | 5.6791e-1 (8.81e-4)† | NaN (NaN) | 1.0560e+0 (1.23e-1)† | NaN (NaN) | 7.7776e-1 (1.28e-2)†
C3-DTLZ4  | 15 | 8.2720e-1 (5.23e-2) | 7.7219e-1 (1.00e-3)† | NaN (NaN) | NaN (NaN) | NaN (NaN) | 9.9919e-1 (4.42e-2)†

"†" indicates that the two results are significantly different at a level of 0.05 by the Wilcoxon's rank sum test.
test instances. Similarly, KnEA/MOE can converge to the feasible region and produce feasible solutions on the C3-DTLZ4 test problem, while the original KnEA cannot. The reason is that MOE provides enough offspring diversity by integrating the DE operators to find the promising feasible region, whereas the single GA operator focuses only on exploitation and easily misses it.

4.3. Competitiveness of the MOE strategy

In order to verify the competitiveness of the proposed MOE strategy, we select one of the most representative AOS methods, the fitness-rate-rank-based multi-armed bandit (FRRMAB) proposed in [29], for comparison; it selects one operator among four DE variants using a bandit-based method under the MOEA/D framework. For the sake of fairness, we replace the four DE variants with the operators selected in Section 4.1.4. In addition, because the penalty-based boundary intersection (PBI) is robust for any number of objectives [41], it is used to replace the Tchebycheff approach in the original algorithm. Moreover, we embed the proposed MOE into NSGA-III [41] for the comparison with MOEA/D-FRRMAB; NSGA-III is selected because it also uses uniformly distributed reference vectors to guide the population evolution, as in MOEA/D. The parameters of MOEA/D-FRRMAB are set as recommended in its original reference [29]: the neighborhood size is T = 20, the maximum number of solutions replaced by each new solution is nr = 2, the selection probability is σ = 0.9, the scaling factor is C = 5.0, the size of the sliding window is W = 0.5 × N and the decaying factor is D = 1.0.
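For context, the selection step of FRRMAB couples the fitness-rate-rank (FRR) credit values with a UCB-style exploration bonus scaled by C. A minimal sketch following the description in [29] (the function and variable names are ours; the maintenance of the sliding window and FRR values is omitted):

```python
import math

def select_operator(frr, counts, C=5.0):
    """UCB-style operator selection as in FRRMAB: balance the
    fitness-rate-rank credit (frr) of each operator against how
    rarely it has been applied recently (counts)."""
    # any operator that has not been tried yet is selected first
    for i, n in enumerate(counts):
        if n == 0:
            return i
    total = sum(counts)

    def ucb(i):
        return frr[i] + C * math.sqrt(2.0 * math.log(total) / counts[i])

    return max(range(len(frr)), key=ucb)
```

With a large C (5.0 here), rarely used operators are re-tried aggressively, which mirrors the exploration-exploitation trade-off that MOE instead resolves through subpopulation ratios.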
From the experimental results in Tables 6 and 7, we can see that NSGA-III/MOE has a significant advantage on WFG and competitive performance on UF. For the UF test problems, characterized by complicated PSs, the neighborhood structure employed by MOEA/D has been shown to suit the landscapes of this type of problem. Nevertheless, NSGA-III/MOE achieves better results on the UF5-7 and UF9-10 test instances. Moreover, the corresponding standard deviations obtained by NSGA-III/MOE are one to two orders of magnitude smaller than those of MOEA/D-FRRMAB. Even though MOEA/D-FRRMAB performs better than NSGA-III/MOE on the UF3-4 test instances in terms of the average value, its standard deviations are still larger than those of NSGA-III/MOE. This indicates that the proposed MOE can be more robust than the AOS strategy. For the WFG test problems with many objectives, the combined advantages of the operators brought by the MOE strategy are more significant. In a high-dimensional space, solutions are likely to be widely distant from each other; a single operator can hardly produce high-quality offspring, while multiple operators coexisting in each generation can strengthen the search ability.
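The †-marks throughout the result tables come from the Wilcoxon rank-sum test at the 5% level. A minimal normal-approximation version (with average ranks for ties; a sketch, not the statistics package used in the study) can be written as:

```python
import math

def rank_sum_z(a, b):
    """z statistic of the Wilcoxon rank-sum (Mann-Whitney) test via
    the normal approximation; |z| > 1.96 marks a difference at
    roughly the 5% level. Tied values receive average ranks."""
    pooled = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg = (i + 1 + j) / 2.0          # mean of 1-based ranks i+1..j
        for k in range(i, j):
            ranks[k] = avg
        i = j
    n1, n2 = len(a), len(b)
    w = sum(r for r, (_, src) in zip(ranks, pooled) if src == 0)
    mu = n1 * (n1 + n2 + 1) / 2.0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    return (w - mu) / sigma
```

In the experiments, each comparison would feed the 30 IGD values of the original algorithm and of its MOE variant into such a test.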
4.4. Dynamics of MOE

To gain a deeper understanding of the behavior of MOE, we investigate how the subpopulation ratios change during the whole search process as a result of the operators' characteristics. Note that the four adopted operators have different exploitation and exploration abilities: the GA operator focuses on
Table 6: Average and standard deviation of the IGD values obtained by NSGA-III/MOE versus MOEA/D-FRRMAB on UF test problems. Best result is highlighted in bold.

Problem | MOEA/D-FRRMAB | NSGA-III/MOE
UF1  | 6.8021e-3 (1.54e-3)† | 4.1735e-2 (5.28e-3)
UF2  | 4.7082e-3 (1.11e-3)† | 1.5566e-2 (1.99e-3)
UF3  | 2.7711e-2 (2.39e-2)† | 8.5857e-2 (1.70e-2)
UF4  | 4.7587e-2 (3.95e-3)† | 4.8686e-2 (1.44e-3)
UF5  | 4.1380e-1 (2.45e-1) | 1.7145e-1 (4.16e-3)†
UF6  | 2.7770e-1 (1.60e-1) | 1.1864e-1 (2.12e-2)†
UF7  | 4.1861e-2 (1.41e-1) | 1.7204e-2 (1.29e-3)†
UF8  | 5.6960e-2 (5.36e-3)† | 8.8857e-2 (8.55e-3)
UF9  | 1.3756e-1 (6.66e-2) | 8.6521e-2 (7.59e-3)†
UF10 | 5.9465e-1 (1.03e-1) | 4.7720e-1 (1.27e-1)†

"†" indicates that the two results are significantly different at a level of 0.05 by the Wilcoxon's rank sum test.
exploitation whereas the three DE operators favor exploration to different degrees. The trajectory of the subpopulation ratio during the evolutionary process on some typical test problems is shown in Fig. 4. Each plot corresponds to the run with the median IGD value, where the results on the UF test problems are obtained by NSGA-II/MOE, and the results on the WFG and CDTLZ test problems are obtained by NSGA-III/MOE and the constrained NSGA-III/MOE with five objectives, respectively. We find that no single operator dominates over the whole search process on all the test instances; however, at a certain search stage there always exists an operator that plays a dominant role. For UF1, the GA operator plays the leading role from the 1st to the 480th generation, which accelerates the convergence of the population at the early stage; at the late stage, DE3 plays the dominant role because the population has almost converged to the PF and the diversity of the population needs to be promoted. It is worth noting that the proposed credit assignment gives the poor-performing operators at the early stage, such as DE1 and DE3, a chance, leaving room for them to exhibit their advantages at the late stage. For UF5, the GA operator plays the leading role only in the initial stage, after which the subpopulation ratio associated with DE1 quickly climbs while DE3 gradually becomes important. For UF9, the ratio of the GA operator gradually climbs and then stays at a high level of 0.85; the other subpopulations are preserved during the whole evolutionary process because we set a lower
Table 7: Average and standard deviation of the IGD values obtained by NSGA-III/MOE versus MOEA/D-FRRMAB on WFG test problems with different numbers of objectives. Best result is highlighted in bold.

Problem | M | MOEA/D-FRRMAB | NSGA-III/MOE
WFG1 | 5  | 1.8732e+0 (2.96e-1) | 1.4675e+0 (4.94e-1)†
WFG1 | 10 | 2.6066e+0 (7.90e-2) | 1.1802e+0 (1.93e-1)†
WFG1 | 15 | 3.1598e+0 (5.84e-2) | 2.1442e+0 (2.56e-1)†
WFG2 | 5  | 1.4111e+0 (9.77e-1) | 5.7471e-1 (9.03e-2)†
WFG2 | 10 | 1.1063e+1 (5.83e+0) | 2.4400e+0 (9.00e-1)†
WFG2 | 15 | 2.6984e+1 (4.64e-1) | 2.2020e+0 (1.37e+0)†
WFG3 | 5  | 5.8410e-1 (4.38e-1) | 5.5072e-1 (4.79e-2)†
WFG3 | 10 | 4.0408e+0 (9.80e-1) | 1.4036e+0 (2.17e-1)†
WFG3 | 15 | 6.1435e+0 (2.83e+0) | 4.1832e+0 (1.31e+0)†
WFG4 | 5  | 1.8587e+0 (2.77e-1) | 7.4736e-1 (1.31e-1)†
WFG4 | 10 | 6.8977e+0 (1.13e+0) | 4.5428e+0 (1.73e-2)†
WFG4 | 15 | 1.3041e+1 (1.73e+0) | 9.3860e+0 (1.35e-2)†
WFG5 | 5  | 1.7354e+0 (3.32e-1) | 7.1871e-1 (1.41e-1)†
WFG5 | 10 | 7.3310e+0 (7.82e-1) | 4.5151e+0 (5.63e-3)†
WFG5 | 15 | 1.3335e+1 (1.28e+0) | 9.2359e+0 (1.12e-1)†
WFG6 | 5  | 1.6424e+0 (2.07e-1) | 7.0963e-1 (1.28e-1)†
WFG6 | 10 | 7.3365e+0 (1.06e+0) | 4.5107e+0 (4.17e-2)†
WFG6 | 15 | 1.3270e+1 (1.60e+0) | 9.2219e+0 (1.04e-1)†
WFG7 | 5  | 1.7614e+0 (1.66e-1) | 7.4572e-1 (1.09e-1)†
WFG7 | 10 | 7.3478e+0 (1.12e+0) | 4.5635e+0 (1.17e-2)†
WFG7 | 15 | 1.3784e+1 (1.17e+0) | 9.3876e+0 (1.37e-2)†
WFG8 | 5  | 2.2543e+0 (4.34e-1) | 7.8874e-1 (9.05e-2)†
WFG8 | 10 | 8.2767e+0 (5.22e-1) | 4.5922e+0 (2.11e-2)†
WFG8 | 15 | 1.3787e+1 (1.29e+0) | 9.3804e+0 (2.01e-2)†
WFG9 | 5  | 1.7473e+0 (4.05e-1) | 7.4212e-1 (9.42e-2)†
WFG9 | 10 | 7.5523e+0 (5.74e-1) | 4.3635e+0 (7.07e-2)†
WFG9 | 15 | 1.3982e+1 (1.11e+0) | 8.4491e+0 (6.37e-2)†

"†" indicates that the two results are significantly different at a level of 0.05 by the Wilcoxon's rank sum test.
bound of 5% · N for each operator. For WFG3, both GA and DE3 perform well at the initial stage, but the former plays the leading role after the 200th generation. For WFG7, the GA operator plays the leading role before the 400th generation and then gradually falls, while DE3 quickly climbs. For WFG8, DE3 plays the leading role during the whole evolutionary process while DE1 plays an auxiliary role at the early stage. For C1-DTLZ1, the performance of GA is better than that of DE3 in the first half of the process, but this situation reverses in the second half. For C2-DTLZ2, the search process can be divided into three stages: from the 1st to the 90th generation, the GA operator plays the leading role; from the 90th to the 220th generation, DE1 plays the dominant role; and in the final stage, DE3 becomes the most important operator. For C2-DTLZ2*, the ratio of the DE1 operator gradually climbs and it plays the leading role with a ratio of 0.85. Overall, it is verified that no single operator is always good at dealing with various MOPs. The proposed MOE strategy can effectively combine the advantages of different operators by assigning different ratios at different search stages. It smoothly switches from one leading operator to another and makes full use of their advantages in an adaptive manner.
[Figure 4 appeared here: nine panels plotting the subpopulation ratio of each operator (GA, DE1, DE2, DE3) against generation, for (a) UF1, (b) UF5, (c) UF9, (d) WFG3, (e) WFG7, (f) WFG8, (g) C1-DTLZ1, (h) C2-DTLZ2, and (i) C2-DTLZ2*.]

Figure 4: The trajectory of the subpopulation ratio during the evolutionary process on some typical test problems.
5. Conclusions and future work

In this paper, we have presented an MOE strategy inspired by AdaBoost for multi-objective evolutionary algorithms. In MOE, the survival rate of each subpopulation after environmental selection is used to evaluate the performance of the corresponding operator. A credit assignment scheme with a decaying effect is developed to update the subpopulation ratios for the next generation. According to these ratios, an emigration-immigration mechanism is designed to adjust the allocation of computational resources and to exchange evolutionary information among subpopulations. To validate the strategy, we have conducted an experimental comparison on three test suites with complex PSs, many objectives, and constraints, respectively. The results show that MOE can significantly improve the performance of MOEAs on various kinds of MOPs.

In future research, we will study the ensemble of different evolutionary operators and selection strategies at the same time, because the experimental results suggest that the performance of MOE is affected by the selection strategy used. It would be interesting to design a unified ensemble framework that considers the operator and the selection strategy simultaneously.

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grants 51779050, 61672033, 61822301, U1804262, and 61603073, the State Key Laboratory of Synthetical Automation for Process Industries under Grant
PAL-N201805, the Anhui Provincial Natural Science Foundation for Distinguished Young Scholars under Grant 1808085J06, and the Natural Science Foundation of Anhui Province under Grant 1908085MF219.

References
[1] K. C. Tan, S. C. Chiam, A. Mamun, C. K. Goh, Balancing exploration and exploitation with adaptive variation for evolutionary multi-objective optimization, European Journal of Operational Research 197 (2) (2009) 701–713.
[2] Y. Tian, X. Zhang, C. Wang, Y. Jin, An evolutionary algorithm for large-scale sparse multi-objective optimization problems, IEEE Transactions on Evolutionary Computation (2019) 1–1.
[3] X. Zhang, K. Zhou, H. Pan, L. Zhang, X. Zeng, Y. Jin, A network reduction-based multiobjective evolutionary algorithm for community detection in large-scale complex networks, IEEE Transactions on Cybernetics (2018) 1–1.
[4] Y. Liang, W. He, W. Zhong, F. Qian, Objective reduction particle swarm optimizer based on maximal information coefficient for many-objective problems, Neurocomputing 281 (2018) 1–11.
[5] Y. Tian, R. Cheng, X. Zhang, F. Cheng, Y. Jin, An indicator-based multiobjective evolutionary algorithm with reference point adaptation for better versatility, IEEE Transactions on Evolutionary Computation 22 (4) (2018) 609–622.
[6] D. Gong, F. Sun, J. Sun, X. Sun, Set-based many-objective optimization guided by a preferred region, Neurocomputing 228 (2017) 241–255.
[7] H. Wang, M. Olhofer, Y. Jin, A mini-review on preference modeling and articulation in multi-objective optimization: current status and challenges, Complex & Intelligent Systems 3 (4) (2017) 233–245.
[8] M. Li, S. Yang, X. Liu, Pareto or Non-Pareto: Bi-criterion evolution in multiobjective optimization, IEEE Transactions on Evolutionary Computation 20 (5) (2016) 645–665.
[9] H. Wang, L. Jiao, X. Yao, Two_Arch2: An improved two-archive algorithm for many-objective optimization, IEEE Transactions on Evolutionary Computation 19 (4) (2015) 524–541.
[10] X. Cai, Y. Li, Z. Fan, Q. Zhang, An external archive guided multiobjective evolutionary algorithm based on decomposition for combinatorial optimization, IEEE Transactions on Evolutionary Computation 19 (4) (2015) 508–523.
[11] Y. Zhou, J. Wang, J. Chen, S. Gao, L. Teng, Ensemble of many-objective evolutionary algorithms for many-objective problems, Soft Computing 21 (9) (2017) 2407–2419.
[12] W. Gong, A. Zhou, Z. Cai, A multioperator search strategy based on cheap surrogate models for evolutionary optimization, IEEE Transactions on Evolutionary Computation 19 (5) (2015) 746–758.
[13] W. Wang, S. Yang, Q. Lin, Q. Zhang, K. Wong, C. A. C. Coello, J. Chen, An effective ensemble framework for multi-objective optimization, IEEE Transactions on Evolutionary Computation (2018) 1–1.
[14] Y.-H. Zhang, Y.-J. Gong, T.-L. Gu, J. Zhang, Ensemble mating selection in evolutionary many-objective search, Applied Soft Computing 76 (2019) 294–312.
[15] S.-Z. Zhao, P. N. Suganthan, Q. Zhang, Decomposition-based multiobjective evolutionary algorithm with an ensemble of neighborhood sizes, IEEE Transactions on Evolutionary Computation 16 (3) (2012) 442–446.
[16] R. Mallipeddi, P. N. Suganthan, Ensemble of constraint handling techniques, IEEE Transactions on Evolutionary Computation 14 (4) (2010) 561–579.
[17] D. H. Phan, J. Suzuki, I. Hayashi, Leveraging indicator-based ensemble selection in evolutionary multiobjective optimization algorithms, in: Proceedings of the 14th Annual Conference on Genetic and Evolutionary Computation, ACM, 2012, pp. 497–504.
[18] Y. Freund, R. E. Schapire, A decision-theoretic generalization of on-line learning and an application to boosting, Journal of Computer and System Sciences 55 (1) (1997) 119–139.
[19] M. M. Baig, M. Awais, E. S. M. El-Alfy, AdaBoost-based artificial neural network learning, Neurocomputing 248 (2017) 120–126.
[20] Q. Lin, Z. Liu, Q. Yan, Z. Du, C. A. C. Coello, Z. Liang, W. Wang, J. Chen, Adaptive composite operator selection and parameter control for multiobjective evolutionary algorithm, Information Sciences 339 (2016) 332–352.
[21] H. S. Yoon, B. R. Moon, An empirical study on the synergy of multiple crossover operators, IEEE Transactions on Evolutionary Computation 6 (2) (2002) 212–223.
[22] J. Zhang, A. C. Sanderson, JADE: Adaptive differential evolution with optional external archive, IEEE Transactions on Evolutionary Computation 13 (5) (2009) 945–958.
[23] Q. Fan, X. Yan, Y. Zhang, Auto-selection mechanism of differential evolution algorithm variants and its application, European Journal of Operational Research 270 (2) (2018) 636–653.
[24] C. Li, S. Yang, T. T. Nguyen, A self-learning particle swarm optimizer for global optimization problems, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) 42 (3) (2011) 627–646.
[25] N. Lynn, P. N. Suganthan, Heterogeneous comprehensive learning particle swarm optimization with enhanced exploration and exploitation, Swarm and Evolutionary Computation 24 (2015) 11–24.
[26] N. Lynn, P. N. Suganthan, Ensemble particle swarm optimizer, Applied Soft Computing 55 (2017) 533–548.
[27] Y. Li, A. Zhou, G. Zhang, An MOEA/D with multiple differential evolution mutation operators, in: 2014 IEEE Congress on Evolutionary Computation, 2014, pp. 397–404.
[28] W. Khan, Q. Zhang, MOEA/D-DRA with two crossover operators, in: 2010 UK Workshop on Computational Intelligence, 2010, pp. 1–6.
[29] K. Li, A. Fialho, S. Kwong, Q. Zhang, Adaptive operator selection with bandits for a multiobjective evolutionary algorithm based on decomposition, IEEE Transactions on Evolutionary Computation 18 (1) (2014) 114–130.
[30] D. Hadka, P. Reed, Borg: An auto-adaptive many-objective evolutionary computing framework, Evolutionary Computation 21 (2) (2013) 231–259.
[31] Y. Yuan, H. Xu, B. Wang, An experimental investigation of variation operators in reference-point based many-objective optimization, in: Proceedings of the 2015 Annual Conference on Genetic and Evolutionary Computation, ACM, 2015, pp. 775–782.
[32] G. Wu, R. Mallipeddi, P. N. Suganthan, Ensemble strategies for population-based optimization algorithms - A survey, Swarm and Evolutionary Computation 44 (2019) 695–711.
[33] G. Wu, R. Mallipeddi, P. Suganthan, R. Wang, H. Chen, Differential evolution with multi-population based ensemble of mutation strategies, Information Sciences 329 (2016) 329–345.
[34] J. Zhou, X. Yao, Multi-population parallel self-adaptive differential artificial bee colony algorithm with application in large-scale service composition for cloud manufacturing, Applied Soft Computing 56 (2017) 379–397.
[35] M. Z. Ali, N. H. Awad, P. N. Suganthan, Multi-population differential evolution with balanced ensemble of mutation strategies for large-scale global optimization, Applied Soft Computing 33 (2015) 304–327.
[36] R. Mallipeddi, P. Suganthan, Differential evolution algorithm with ensemble of populations for global numerical optimization, Opsearch 46 (2) (2009) 184–213.
[37] W. K. Mashwani, A. Salhi, O. Yeniay, H. Hussian, M. Jan, Hybrid non-dominated sorting genetic algorithm with adaptive operators selection, Applied Soft Computing 56 (2017) 1–18.
[38] K. Deb, A. Pratap, S. Agarwal, T. Meyarivan, A fast and elitist multiobjective genetic algorithm: NSGA-II, IEEE Transactions on Evolutionary Computation 6 (2) (2002) 182–197.
[39] E. Zitzler, M. Laumanns, L. Thiele, SPEA2: Improving the strength Pareto evolutionary algorithm for multiobjective optimization, in: Proceedings of the Fifth Conference on Evolutionary Methods for Design, Optimization and Control with Applications to Industrial Problems, 2001, pp. 95–100.
[40] D. W. Corne, N. R. Jerram, J. D. Knowles, M. J. Oates, PESA-II: Region-based selection in evolutionary multiobjective optimization, in: Proceedings of the 3rd Annual Conference on Genetic and Evolutionary Computation, Morgan Kaufmann Publishers Inc., 2001, pp. 283–290.
[41] K. Deb, H. Jain, An evolutionary many-objective optimization algorithm using reference-point-based nondominated sorting approach, part I: Solving problems with box constraints, IEEE Transactions on Evolutionary Computation 18 (4) (2014) 577–601.
[42] Q. Zhang, W. Liu, H. Li, The performance of a new version of MOEA/D on CEC09 unconstrained MOP test instances, in: 2009 IEEE Congress on Evolutionary Computation, 2009, pp. 203–208.
[43] S. Huband, P. Hingston, L. Barone, L. While, A review of multiobjective test problems and a scalable test problem toolkit, IEEE Transactions on Evolutionary Computation 10 (5) (2006) 477–506.
[44] H. Jain, K. Deb, An evolutionary many-objective optimization algorithm using reference-point based nondominated sorting approach, part II: Handling constraints and extending to an adaptive approach, IEEE Transactions on Evolutionary Computation 18 (4) (2014) 602–622.
[45] P. A. N. Bosman, D. Thierens, The balance between proximity and diversity in multiobjective evolutionary algorithms, IEEE Transactions on Evolutionary Computation 7 (2) (2003) 174–188.
[46] I. Das, J. E. Dennis, Normal-boundary intersection: A new method for generating the Pareto surface in nonlinear multicriteria optimization problems, SIAM Journal on Optimization 8 (3) (1998) 631–657.
[47] Y. Xiang, Y. Zhou, M. Li, Z. Chen, A vector angle-based evolutionary algorithm for unconstrained many-objective optimization, IEEE Transactions on Evolutionary Computation 21 (1) (2017) 131–152.
[48] X. Zhang, Y. Tian, Y. Jin, A knee point-driven evolutionary algorithm for many-objective optimization, IEEE Transactions on Evolutionary Computation 19 (6) (2015) 761–776.
[49] Y. Tian, H. Wang, X. Zhang, Y. Jin, Effectiveness and efficiency of non-dominated sorting for evolutionary multi- and many-objective optimization, Complex & Intelligent Systems 3 (4) (2017) 247–263.
[50] Y. Tian, R. Cheng, X. Zhang, Y. Jin, PlatEMO: A MATLAB platform for evolutionary multi-objective optimization, IEEE Computational Intelligence Magazine 12 (4) (2017) 73–87.
[51] S. Das, P. N. Suganthan, Differential evolution: A survey of the state-of-the-art, IEEE Transactions on Evolutionary Computation 15 (1) (2010) 4–31.
[52] K. Deb, R. B. Agrawal, Simulated binary crossover for continuous search space, Complex Systems 9 (2) (1995) 115–148.
[53] K. Deb, M. Goyal, A combined genetic adaptive search (GeneAS) for engineering design, Computer Science and Informatics 26 (1996) 30–45.
[54] R. Storn, K. Price, Differential evolution: A simple and efficient heuristic for global optimization over continuous spaces, Journal of Global Optimization 11 (4) (1997) 341–359.
[55] C. von Lücken, B. Barán, C. Brizuela, A survey on multi-objective evolutionary algorithms for many-objective problems, Computational Optimization and Applications 58 (3) (2014) 707–756.
[56] B. Li, J. Li, K. Tang, X. Yao, Many-objective evolutionary algorithms: A survey, ACM Computing Surveys 48 (1) (2015) 1–35.
[57] K. Li, R. Wang, T. Zhang, H. Ishibuchi, Evolutionary many-objective optimization: A comparative study of the state-of-the-art, IEEE Access 6 (2018) 26194–26214.
[58] J. N. Kuk, S. M. Venske, M. R. Delgado, Adaptive operator selection for many-objective optimization with NSGA-III, in: Evolutionary Multi-Criterion Optimization: 9th International Conference, EMO 2017, Münster, Germany, March 19-22, 2017, Proceedings, Vol. 10173, Springer, 2017, p. 267.
[59] H. Wang, J. Wang, X. Zhen, F. Zeng, X. Tu, Oriented multi-mutation strategy in a many-objective evolutionary algorithm, Information Sciences 478 (2019) 391–407.
[60] S. M. Elsayed, R. A. Sarker, D. L. Essam, Multi-operator based evolutionary algorithms for solving constrained optimization problems, Computers & Operations Research 38 (12) (2011) 1877–1896.
Declaration of interests

☒ The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Biographies of the Authors
Chao Wang received the B.Sc. degree from Suihua College, Suihua, China, in 2012, and the M.Sc. and Ph.D. degrees from Harbin Engineering University, Harbin, China, in 2015 and 2018, respectively. He is currently a Lecturer with the School of Computer Science and Technology, Anhui University, Hefei, China. His main research interests include multi-objective optimization methods and their application.
Ran Xu received the B.Sc. degree from Hefei University, Hefei, China, in 2016, and the M.Sc. degree from the School of Computer Science and Technology, Anhui University, China, in 2019. His current research interests include evolutionary multi-objective optimization and constrained optimization.
Jianfeng Qiu received the B.Sc. degree from Anqing Normal University, in 2003, and the M.Sc. and Ph.D. degrees from Anhui University, China, in 2006 and 2014, respectively. Currently, he is a Lecturer with the School of Computer Science and
Technology, Anhui University, China. His main research interests include machine learning, imbalanced classification and multi-objective optimization.
Xingyi Zhang (SM’18) received the B.Sc. degree from Fuyang Normal College, Fuyang, China, in 2003, and the M.Sc. and Ph.D. degrees from Huazhong University of Science and Technology, Wuhan, China, in 2006 and 2009, respectively. He is currently a Professor with the School of Computer Science and Technology, Anhui University, Hefei, China. His current research interests include unconventional models and algorithms of computation, evolutionary multi-objective optimization, and complex network analysis. He is the recipient of the 2018 IEEE Transactions on Evolutionary Computation Outstanding Paper Award and the 2020 IEEE Computational Intelligence Magazine Outstanding Paper Award.