An evolutionary algorithm for multi-criteria inverse optimal value problems using a bilevel optimization model

Applied Soft Computing 23 (2014) 308–318
journal homepage: www.elsevier.com/locate/asoc

Hecheng Li a,b,∗

a Department of Mathematics, Key Laboratory of Tibetan Information Processing of Ministry of Education, Qinghai Normal University, Xining 810008, China
b Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China

Article info

Article history: Received 24 March 2012; Received in revised form 28 January 2014; Accepted 22 June 2014; Available online 2 July 2014
Keywords: Inverse optimal value problem; Bilevel program; Evolutionary algorithm; Solutions

Abstract

Given a linear program, a desired optimal objective value, and a set of feasible cost vectors, one needs to determine a cost vector of the linear program such that the corresponding optimal objective value is closest to the desired value. This problem is known as a standard inverse optimal value problem. When multiple criteria are adopted to determine cost vectors, a multi-criteria inverse optimal value problem arises, which is more general than the standard case. This paper focuses on algorithmic approaches to this class of problems and develops an evolutionary algorithm based on a dynamic weighted aggregation method. First, the original problem is converted into a bilevel program with multiple upper level objectives, in which the lower level problem is a linear program for each fixed cost vector. In addition, the potential bases of the lower level program are encoded as chromosomes, and the weighted sum of the upper level objectives is taken as a new optimization function, by which some potential nondominated solutions can be generated. In the design of the evolutionary algorithm, some specific characteristics of the problem, such as the optimality conditions, are well utilized. Some preliminary computational experiments are reported, which demonstrate that the proposed algorithm is efficient and robust. © 2014 Elsevier B.V. All rights reserved.

1. Introduction

General optimization problems determine the values of variables so that the objectives achieve their optima, on the condition that all coefficients in the objectives and constraints are given. Inverse optimization executes an 'inverse' procedure: it infers some parameters in the objective function or in the constraints such that the optimal solutions satisfy a prespecified standard. A standard inverse optimization problem can be described as follows: given an optimization problem P: min_{x∈X} c^T x and

a desired optimal solution x̄ ∈ X, determine a cost vector c such that x̄ is optimal to problem P. In addition, some additional conditions on c are often taken into account. For example, c should be as close as possible to a predetermined vector ĉ under some ℓ_p-norm; that is, the deviation ||c − ĉ||_p should be minimized. This problem was first introduced by Burton and Toint [1,2] for a shortest path problem, in which all edge weights need to be determined such that some prespecified edges form the shortest path. In this class of problems, the objective function is also decided by the selected measure

∗ Correspondence to: Department of Mathematics, Key Laboratory of Tibetan Information Processing of Ministry of Education, Qinghai Normal University, Xining 810008, China. Tel.: +86 971 6318436. E-mail address: [email protected] http://dx.doi.org/10.1016/j.asoc.2014.06.044 1568-4946/© 2014 Elsevier B.V. All rights reserved.

norm. For instance, when the deviation is evaluated by the ℓ_2-norm, the objective function of inverse optimization can be transformed into a convex quadratic function [1]. If P is a linear program and the ℓ_1 or ℓ_∞ norm is adopted, linear programming can be utilized to deal with this class of problems [3,4]. For a comprehensive survey of the literature on inverse optimization, please refer to [5]. If the problem is instead to find a vector c such that the optimal objective value is equal or close to a prespecified value, it is known as an inverse optimal value problem. Obviously, the inverse optimal value problem generalizes the standard inverse optimization problem, because for continuous objective functions the optimal objective values must be close to each other whenever the two optimal solutions approach each other. This kind of problem can be stated as follows: given an optimization problem P, a desired optimal objective value z*, and a set C of feasible cost vectors, determine a cost vector c* ∈ C such that the corresponding optimal objective value of P is closest to z* [6,7]. The inverse optimal value problem has received little attention in the literature. Berman [8] discussed a minimal location model, which determines edge lengths in a graph such that the induced minimal distance from a given vertex to all other vertices is within prescribed bounds, and showed that the problem is NP-complete for general graphs. This paper is motivated by an application in telecommunication bandwidth pricing proposed by Paleologo and Takriti [9]. In this research, the problem the authors consider is to infer the correct


prices of the city-to-city links in a bandwidth network, such that the total price of the links on the cheapest path between any two cities is close to a prespecified one. The problem is modeled as a directed graph with m nodes and n arcs, in which the length associated with each arc is the unknown price. Mathematically, the problem is to determine the length c_j of each arc j, j = 1, . . ., n, so that the length of the shortest path between a node pair equals a given value. We denote by K the number of origin-destination pairs for which the desired shortest distance p_k is known. The problem becomes that of finding a nonnegative c such that

min_{x_k} c^T x_k = p_k,   s.t. A x_k = b_k, x_k ≥ 0,   k = 1, . . ., K.     (1)

Alternatively, we may formulate the problem as a multi-criteria optimization problem

min_{c≥0} (f_1(c), . . ., f_K(c))     (2)

Here, f_i(c) = |min_{x_i ∈ X_i} c^T x_i − p_i|, X_i = {x_i | Ax_i = b_i, x_i ≥ 0}, i = 1, . . ., K, C is a feasible region, and |·| denotes the absolute value. In fact, the bandwidth pricing problem is essentially a multi-criteria inverse optimal value problem [7,9]. The case K = 1 is a trivial one; Ahmed and Guan [7] showed that the inverse optimal value problem is NP-hard in general, and for the case in which the set of feasible cost vectors is polyhedral, they developed an algorithm based on solving linear and bilinear programming problems. Lv [6] proposed a bilevel programming model and applied the duality principle of linear programming to transform the problem into a single-level nonlinear program; a penalty function method was then used to solve the equivalent problem. Although these procedures are efficient for the trivial case (the single-objective inverse optimal value problem), they cannot be extended directly to multi-criteria cases. From the multi-objective point of view, Paleologo and Takriti [9] suggested a mixed-integer programming formulation based on the weighted sum of the objectives for the bandwidth pricing problem. However, it gives only one optimal solution per run and requires solving a mixed-integer program, which is computationally intractable. In fact, problem (2) is a special multi-criteria bilevel programming problem (MCBLPP). An MCBLPP describes a hierarchical structure in which the constraint region of the first level problem (the leader's/upper level problem) is implicitly determined by the second level optimization problem (the follower's/lower level problem), and the upper and/or lower levels have more than one objective. At present, some algorithmic approaches and theoretical results have been developed for this kind of problem [10–13].

In optimization involving multiple objectives, population-based algorithms such as the evolutionary algorithm (EA) and its variations have been widely adopted, since they can provide a Pareto-optimal front for various preferences. Deb [13] proposed an efficient EA for solving MCBLPPs that places almost no requirements, such as convexity or differentiability, on the functions involved. But for problem (2), in which the lower level is single-objective and linear for any fixed cost vector c, such procedures are computationally expensive. The purpose of this paper is to discuss a special MCBLPP that generalizes problem (2) and to provide an efficient approach for solving most inverse optimization problems with multiple criteria. Based on the proposed MCBLPP model, we develop an efficient EA by taking advantage of a dynamic weighted aggregation method as well as the optimality conditions of the optimization problem. In our approach, the potential bases of the lower level program are encoded as chromosomes, and the weighted sum of the upper level objectives is taken as a new optimization function, by which some potential nondominated solutions can be generated. It is worth noting that the coding scheme makes the algorithm evidently different from other EAs, such as Deb's method using NSGA-II, mainly because the coding technique makes the search space finite even for a continuous MCBLPP.

This paper is organized as follows. The multi-criteria bilevel programming model and some basic notation are presented in Section 2, Section 3 gives the evolutionary operators and describes our algorithm, some computational examples are given and solved in Section 4, and we conclude in Section 5.

2. Multi-criteria bilevel programming model

Problem (2) can be rewritten as

min_{c≥0} (|c^T x_1 − p_1|, . . ., |c^T x_K − p_K|)
  min c^T x_1   s.t. Ax_1 = b_1, x_1 ≥ 0;
  min c^T x_2   s.t. Ax_2 = b_2, x_2 ≥ 0;     (3)
  . . .
  min c^T x_K   s.t. Ax_K = b_K, x_K ≥ 0.

Obviously, the problem is a multi-criteria bilevel programming problem with multiple lower level problems. Since no variables are shared between any two lower level problems, it is equivalent to

min_{c≥0} (|c^T x_1 − p_1|, . . ., |c^T x_K − p_K|)
  min Σ_{i=1}^{K} c^T x_i     (4)
  s.t. Ax_i = b_i, x_i ≥ 0, i = 1, . . ., K.
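Because each lower level problem in (4) is a linear program, its optimum is attained at a vertex of the feasible region, so the objectives of (2) can be evaluated by enumerating vertices. A toy sketch under that assumption (the function name, the vertex lists V1, V2, and the data are illustrative, not from the paper):

```python
def f_i(c, vertices, p):
    """f_i(c) = |min_x c^T x - p| of problem (2), with the lower level
    minimum taken over an explicit list of vertices of X_i."""
    z = min(sum(cj * xj for cj, xj in zip(c, x)) for x in vertices)
    return abs(z - p)

# Hypothetical vertex lists for two lower level feasible regions X_1, X_2.
V1 = [(0.0, 2.0), (1.0, 0.0), (2.0, 1.0)]
V2 = [(0.0, 1.0), (3.0, 0.0)]
c = (1.0, 1.0)
F = (f_i(c, V1, 1.0), f_i(c, V2, 2.0))  # multi-criteria objective of (2)
print(F)  # -> (0.0, 1.0)
```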

Further, set

c^1 = (c^T, 0, . . ., 0), . . ., c^K = (0, . . ., 0, c^T),

each consisting of K blocks, and

x = (x_1^T, . . ., x_K^T)^T,   C = c^1 + · · · + c^K.

Then (4) can be reformulated as

min_{c≥0} (|c^1 x − p_1|, . . ., |c^K x − p_K|)
  min Cx     (5)
  s.t. diag(A)x = b, x ≥ 0

Here, b = (b_1^T, . . ., b_K^T)^T. In order to obtain a uniform representation, let

w_1 = diag(I_n, 0, . . ., 0), . . ., w_K = diag(0, . . ., 0, I_n),

where w_i is the block-diagonal matrix whose ith diagonal block is the n × n identity I_n and whose other blocks are zero. Then we have c^i = Cw_i, i = 1, . . ., K. We consider the following nonlinear model:

min_{c≥0} ((Cw_1 x − p_1)^2, . . ., (Cw_K x − p_K)^2)
  min Cx     (6)
  s.t. diag(A)x = b, x ≥ 0.
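The step from absolute to squared deviations is harmless because t ↦ t^2 is strictly increasing for t ≥ 0, so both objectives rank any two candidate solutions identically. A quick check (the candidate values are illustrative):

```python
p = 1.0
zs = [-1.5, 0.0, 0.7, 1.0, 2.3]          # candidate objective values c^i x

by_abs = sorted(zs, key=lambda z: abs(z - p))     # objective of (5)
by_sq = sorted(zs, key=lambda z: (z - p) ** 2)    # objective of (6)
print(by_abs == by_sq)  # -> True: both orderings agree
```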

Evidently, (6) and (5) have the same optimal solutions. For general inverse optimization problems, when P is linear and multiple criteria are simultaneously used to measure the cost vector c, multi-criteria cases arise. As a result, we give a uniform model for this kind of MCBLPP as follows:

min_x (F_1(x, y), . . ., F_K(x, y))
s.t. G(x, y) ≤ 0
  min_y x^T y     (7)
  s.t. Ay = b, y ≥ 0.

Here, x, y ∈ Rn , and Fi (i = 1, 2, . . ., K), G are nonlinear. In this problem



min_x (F_1(x, y), . . ., F_K(x, y))
s.t. G(x, y) ≤ 0     (8)

and



min_y x^T y
s.t. Ay = b, y ≥ 0.     (9)

are called the upper level problem and the lower level problem, respectively. The variables of problem (7) are divided into the upper level variables x = (x_1, . . ., x_n)^T and the lower level variables y = (y_1, . . ., y_n)^T. Similarly, F : R^n × R^n → R^K is called the upper level objective function, and x^T y is called the lower level objective function. G(x, y) ≤ 0 and Ay = b (∈ R^q), y ≥ 0, are called the upper level and the lower level constraints, respectively. The decision-making process of (7) can be stated as follows: the upper level decision-maker first selects a strategy x to optimize his/her objective, then the lower level decision-maker observes the upper level's selection and finds a strategy y to optimize his/her own objective. If such a pair (x, y) satisfies the upper level constraints, it is called feasible in the bilevel decision-making procedure [14]. The purpose of solving (7) is to find a Pareto-optimal solution set among all feasible points. Over the past 20 years or so, the field of bilevel optimization has received a lot of attention in developing efficient algorithms and in applying bilevel models and optimization approaches to hard real-world problems [15–21]. For bilevel programming problems (BLPPs) with multiple objectives, there exist only a few papers on this class of problems [11,13,22–24]. Deb and Sinha [13] suggested a hybrid evolutionary algorithm for solving bilevel problems with multiple objectives at both levels. In the proposed procedure, two interacting populations are kept, and both the upper and lower levels are searched by EAs. The proposed algorithmic method is universally applicable since it places no requirements, such as convexity and differentiability, on the functions involved. But Deb's method is computationally expensive for solving (7) due to the special structure of the problem. When all

objective functions are linear and the constraints at both levels define polyhedra, Calvete [23] proved that the set of efficient solutions is nonempty and, based on weighted-sum techniques, presented some methods for computing efficient solutions. However, the method cannot be extended to nonlinear cases. Ye [11] presented some optimality conditions by replacing the lower level problem with its KKT conditions. Also, for the general multiobjective bilevel problem, some necessary optimality conditions are derived by considering the combined problem [11]. However, most theoretical results cannot be applied directly to develop efficient algorithmic approaches, in view of the rigorous requirements on the functions involved. In addition, for most existing evolutionary algorithms, computational performance worsens as the scale of the problem and the search space grow. In view of these facts, an efficient algorithm for problem (7) should be:

• population-based, so that it can provide a Pareto front instead of a single solution in one run;
• problem-specific, meaning that some specific characteristics of the problem are used in designing the algorithm;
• less dependent on the size of the feasible region; otherwise, it will be time-consuming for real-world problems with a large search space.

In the design of our EA, the first item is satisfied automatically. Also, the proposed algorithm explores a discrete space instead of a continuous one, which differs from classical EAs, and the optimality conditions are taken advantage of, as well as the characteristics of the functions; this means the second and third items are also taken into account. Now we introduce some related definitions and notations as follows [15].

(1) Constraint region: S = {(x, y) | G(x, y) ≤ 0, Ay = b, y ≥ 0}.
(2) Feasible region of the lower level problem for fixed x: S′ = {y | Ay = b, y ≥ 0}.
(3) Projection of S onto the upper level decision space: S(X) = {x | ∃y, (x, y) ∈ S}.
(4) Rational reaction set of the lower level for each x ∈ S(X): M(x) = {y | y ∈ argmin{x^T v | v ∈ S′}}.
(5) Inducible region: IR = {(x, y) ∈ S | y ∈ M(x)}.

In terms of the aforementioned definitions, problem (7) can also be written as

min{(F_1(x, y), . . ., F_K(x, y)) | (x, y) ∈ IR}.

In order to ensure that problem (7) is well posed, in the remainder we always assume that:

A1. S is nonempty and compact.
A2. For all decisions taken by the upper level, the lower level has some room to react, that is, S′ ≠ ∅.
A3. The lower level problem has a unique optimal solution for each fixed x.
A4. The rank of matrix A is q.

Also, in order to solve the upper level program easily, we further assume the functions F_i (i = 1, . . ., K) and G are convex and differentiable with respect to x and y.

Definition 1 (Domination). For (x^1, y^1), (x^2, y^2) ∈ IR, (x^1, y^1) is said to dominate (x^2, y^2), i.e., (x^1, y^1) ≺ (x^2, y^2), if and only if F_i(x^1, y^1) ≤ F_i(x^2, y^2), i = 1, . . ., K, and there exists i_0 ∈ {1, . . ., K} such that F_{i_0}(x^1, y^1) < F_{i_0}(x^2, y^2).
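The domination relation of Definition 1 acts only on objective vectors, so it can be sketched directly on tuples of objective values (the function names and sample points are ours, for illustration):

```python
def dominates(f1, f2):
    """Definition 1 on objective vectors: f1 is no worse than f2 in every
    objective and strictly better in at least one (minimization)."""
    return (all(a <= b for a, b in zip(f1, f2))
            and any(a < b for a, b in zip(f1, f2)))

def nondominated(points):
    """Keep only the points not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

pts = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0), (2.5, 2.5)]
print(nondominated(pts))  # (2.5, 2.5) is dominated by (2.0, 2.0)
```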


Definition 2 (Pareto-optimal solution). (x̂, ŷ) ∈ IR is a Pareto-optimal solution if and only if there exists no (x, y) ∈ IR such that (x, y) ≺ (x̂, ŷ).

Definition 3 (Pareto front). Given a Pareto-optimal solution set, the set of the corresponding objective function values in the objective space is called a Pareto front.

3. Solution method

Recall that for any (x^0, y^0) ∈ IR, y^0 must be optimal to the follower program when x is fixed to x^0, and is therefore a vertex of the feasible region defined by Ay = b, y ≥ 0. In addition, this feasible region has only finitely many vertices. We take advantage of these characteristics of linear programming to present an evolutionary algorithm for solving MCBLPP (7). First, we search the potential bases of the follower linear program. For each fixed basis B, one obtains a basic solution y^0 to the follower problem by setting the basic variables to B^{-1}b and the nonbasic variables to 0. Then, if y^0 ≥ 0, given a weight vector w^0, we take the weighted sum of all objectives of (8) as a new objective function and replace y with y^0. As a result, a new nonlinear program is generated in which only x is involved. Finally, we optimize the generated nonlinear program to obtain x^0; then (x^0, y^0) is a potential Pareto-optimal solution. The main idea of the paper is to search all these points as different weight combinations and bases are chosen, and to pick out the Pareto-optimal solutions from among them.

3.1. Chromosome encoding and initial population

Since the EA is used to search all bases of the follower's program, we encode each basis of (9) as an individual of the population. Let V = {1, 2, . . ., n} be the set of all column indices of A. q elements are selected from V and denoted by l = {i_1, i_2, . . ., i_q}. Furthermore, if these columns are linearly independent, then l is taken as an individual. An initial population of size Np can be generated by lexicographically selecting Np individuals. But it is computationally expensive to use the determinant method on each l to judge whether the selected columns are linearly independent. We present a simplified approach via the following example. Let

A = ( a11 a12 a13 a14
      a21 a22 a23 a24 )     (10)

Here, q = 2 and n = 4. For convenience, we denote by A{i, j} the matrix consisting of the ith and jth columns of A. Without loss of generality, let

A{1, 2} = ( a11 a12
            a21 a22 )     (11)

be nonsingular; then we get an individual l = {1, 2}. Further, we let

A{1, 2}^{-1} A = ( 1 0 ā13 ā14
                   0 1 ā23 ā24 )     (12)

If ā23 ≠ 0, then A{1, 3} is nonsingular, that is, l = {1, 3} is also an individual; otherwise, l is ignored. The same procedure can be applied to judge whether l = {1, 4} is an individual. Moreover, if A{1, 3} is nonsingular, then, based on A{1, 2}^{-1}, we can utilize the pivoting algorithm to obtain A{1, 3}^{-1}, as done in the simplex method.

3.2. How to find potential Pareto-optimal points based on the encoding?

For any individual l, the basis matrix associated with l is denoted by B and consists of the q columns of A indexed by l (without loss of generality, rearranged to be the first q columns, so that A = (B, N)). If

B^{-1} b ≥ 0     (13)

we then provide a weight vector set Λ_l = {λ_i | λ_i = (λ_{i1}, . . ., λ_{iK}), Σ_{j=1}^{K} λ_{ij} = 1, λ_{ij} ∈ [0, 1], i = 1, . . ., s}. Here, s is a predetermined positive integer. In fact, one can easily generate the set by the following procedure: first, take K numbers at random from the interval [0, 1] (at least one of them positive), and denote these numbers by λ̄_j, j = 1, . . ., K; then take λ_i = (λ̄_1/Σ_{v=1}^{K} λ̄_v, . . ., λ̄_K/Σ_{v=1}^{K} λ̄_v), i = 1, . . ., s. After Λ_l is determined, we solve a group of nonlinear programs as follows:

min_x Σ_{j=1}^{K} λ_{ij} F_j(x, y)
s.t. G(x, y) ≤ 0,     (14)
     x_N − x_B B^{-1}N ≥ 0,     i = 1, . . ., s.

Here, A = (B, N), x_N is the objective coefficient vector associated with the nonbasic variables, x_B is the one associated with the basic variables, and y = (B^{-1}b, 0)^T is the optimal solution to the follower's program. In fact, inequality (13) and the constraint x_N − x_B B^{-1}N ≥ 0 guarantee that B is feasible and optimal in the follower's problem for some values of x. Recalling the previous assumptions, the programs involved in (14) are convex, and dozens of optimization approaches exist for problems of this kind, such as cutting-plane and active-set algorithms, so they can easily be solved. In particular, when the F_i and G are linear, the simplex method can be employed. If (14) has an optimal solution x̄ for some λ_i, then (x̄, y) is a potential Pareto solution. Note that (14) involves s mathematical programs, so an individual l can provide s potential Pareto solutions with different x̄'s, which differs from classical EAs. It should be noted that all s problems in (14) have the same constraints; if one of them is solved, the obtained solution can be taken as a starting point for solving the other problems, which reduces the computational cost. Also, we have

Theorem 1. The existence of solutions to (14) does not depend on the given weight coefficients; i.e., if one of the s problems has an optimal solution, then the other optimization problems also have optimal solutions; otherwise, none of the s optimization problems has a solution.

Proof. Since S is compact, it follows that the feasible region of (14) is bounded. If one of the s problems has an optimal solution, it can be inferred that the feasible region of (14) is nonempty; hence, all s problems have optimal solutions. If one of the s problems has no solution, it means the shared feasible region is empty. This completes the proof.

When an individual is evaluated, (13) is first checked; if the inequality holds, then (14) is solved. Furthermore, from Theorem 1, if (14) has a solution for some λ_i, then s potential Pareto-optimal solutions can be obtained by solving the s optimization problems in (14); otherwise, the F_i (i = 1, . . ., K) are taken as infinity. In addition, it should be noted that the weighted aggregation method has a drawback: it is difficult for the search
Furthermore, from Theorem 1, if (14) has one solution for some i , then s potential Pareto optimal solutions can be obtained by solving s optimization problems in (14); otherwise, Fi (i = 1, . . ., K) are taken as infinity. In addition, it should be noted that the weighted aggregation method has a drawback, i.e., it is difficult for the search


algorithm to find the points located in concave regions of the Pareto front, whatever weight combinations are taken [25]. In order to overcome this shortcoming, we execute the following procedure. First, keep all intermediate points produced in the optimization of (14) as well as the optimal solutions for each individual, and denote the set of these points by ϒ. Since these points come from one individual, the y associated with them is unique, and (x_i, y) ∈ IR for all x_i ∈ ϒ. Then, randomly choose two points from ϒ and execute an arithmetic crossover; the generated points are put into ϒ. The process is repeated until a predetermined number of times r is reached. After doing so, a question arises: is y still optimal to the follower problem for these generated points? We have

Theorem 2. If (x^i, y) ∈ IR, i = 1, 2, come from the same individual, and ẍ = αx^1 + (1 − α)x^2, α ∈ [0, 1], then (ẍ, y) ∈ IR.

Proof. Since (x^i, y) ∈ IR, i = 1, 2, it follows that (13) holds. Also, since (x^i, y), i = 1, 2, come from the same individual, the inequalities

x_N^i − x_B^i B^{-1}N ≥ 0,   i = 1, 2

hold. Multiplying the inequalities by α and (1 − α), respectively, and adding them, we get

(αx_N^1 + (1 − α)x_N^2) − (αx_B^1 + (1 − α)x_B^2) B^{-1}N ≥ 0,

i.e.,

ẍ_N − ẍ_B B^{-1}N ≥ 0.

Recalling that G is convex, it follows that both feasibility and optimality are satisfied. This completes the proof.

Finally, we retain the s optimal solutions in ϒ but delete some of the other points located in dense parts of the set using the clustering technique [26], keeping the number of elements at a predetermined integer (> s). All potential Pareto-optimal solutions generated by the EA are evaluated by the dominating procedure, and both the Pareto solutions and the front of (7) are finally determined.

3.3. Crossover operator

We choose two crossover parents l = {l_1, l_2, . . ., l_q} and l′ = {l′_1, l′_2, . . ., l′_q}. First, a cross-position is determined at random in l′, as done in the one-point crossover operator. Then the components on the left-hand side of the cross-position are, one by one, taken as entering elements for l and, correspondingly, some components in l are removed as leaving elements, following almost the same procedure as the simplex method except for the selection of entering-basis variables. The procedure also keeps the total number of elements in l unchanged. For any entering element that is already in l, the replacement is ignored. When all entering procedures are finished, a crossover offspring is generated. When a cross-position is given in l, some elements in l′ are replaced and the other offspring is generated. It should be noted that when the replacements are executed one by one, the inverse matrices associated with the individuals can be obtained by the pivot algorithm. Also, based on the minimum-ratio principle for selecting leaving-basis variables, inequality (13) always holds for the crossover offspring.

3.4. Mutation operator

Let l = {l_1, l_2, . . ., l_q} be a selected mutation parent, and B̃ = A{l_1, l_2, . . ., l_q}. First, choose at random an integer i_0 ∈ V∖l, and

A_{i_0} is taken as the entering-basis column. The pivot operation is executed and a leaving-basis variable is determined. If the pivot operation cannot be executed, we choose another A_{i_0} until the operation can proceed. When the pivot process is finished, a new individual is obtained as a mutation offspring. As discussed for the crossover operation, (13) holds for each mutation offspring.

3.5. Proposed algorithm

In this section, based on a weighted aggregation technique, we present an evolutionary algorithm (EA/WA). In the proposed algorithm, the weight vector set is first given and then changed periodically.

Step 0. Give the parameters: population size Np, crossover probability pc, mutation probability pm, and positive integers s, t, r, NnonD. Let the nondominated set Ω = ∅.
Step 1. Generate Np individuals to form an initial population denoted by pop(0). Let g = 0.
Step 2. Generate a set Λ consisting of s weight vectors.
Step 3. For each individual, solve (14) to obtain all points of ϒ. Compute F_i, i = 1, . . ., K, for each point in ϒ and put all obtained nondominated solutions into Ω. If some points in Ω become dominated, delete them; if the number of elements of Ω is larger than NnonD, delete points located in dense fractions of the Pareto front by the clustering method [26] until there are NnonD points in Ω.
Step 4. Choose crossover parents from pop(g) and execute the crossover according to the crossover probability pc. The crossover offspring set is denoted by Oc.
Step 5. Execute mutation on each individual according to the mutation probability pm; the mutation offspring set is denoted by Om.
Step 6. If g ≥ t and g mod t = 0, then update Λ; otherwise, go to the next step.
Step 7. Evaluate the offspring generated by the crossover and mutation operators and update the nondominated set Ω. Randomly choose Np individuals from pop(g) ∪ Oc ∪ Om as the next population pop(g + 1).
Step 8. If the termination criterion is satisfied, then output Ω; otherwise, let g = g + 1 and return to Step 4.

The advantage of the algorithm is that the search space is the set of potential bases and is finite, which means it suffices for the algorithm to search only the vertices of the follower's feasible region. Also, solving (14) is a local search process for x; hence, it greatly helps EA/WA exploit the space S(X). In fact, it is not necessary to assume that the leader's functions F and G are convex: if the nonlinear program (14) can be solved, the assumption can be removed.

4. Computational experiment

In this section, two classes of problems are given and then solved. One is an inverse optimization problem with two criteria, whereas the other is a set of inverse optimal value problems with two criteria at different scales. We perform simulations of EA/WA in Matlab 7.0 on a personal computer (Microsoft Windows XP SP2, Intel® Core™ 2 CPU, 2.99 GHz, 2.0 GB RAM). In order to compare the computational results with the real optimal solutions of the problems or with those provided by other algorithms, some measures must be adopted. In fact, to evaluate the quality of the existing

H. Li / Applied Soft Computing 23 (2014) 308–318

algorithms, several quality measures have been used or proposed in the literature:

10

313

E

D C

8

In our computational experiment, we adopt CPU time, the hypervolume indicator, IC-measure and U-measure to evaluate the proposed algorithm and keep at most 100 Pareto-optimal solutions outputted when the computation is finished. First, we construct an inverse optimization problem (IOP) as follows: find cost vector c ∈ C = {c|0 ≤ ci ≤ 10, i = 1, 2} such that c − (1/(10 − i), 1)  1 (i = 3, 7) are as small as possible and the linear program

4 B C2

2

0 0

A

O 2

4

6

8

10

Fig. 1. Feasible region of the linear program with 4 constraints and the desired region for objective vector.

In order to solve problem (IOP), we transform the problem into the following MCBLPP:

  3   7   ⎧ ⎪ min c − , 1 1 , c − , 1 1 ⎪ ⎪ 10 − 3 10 − 7 0≤c≤10 ⎪ ⎪ ⎪ ⎪ ⎪ 3 c 7 ⎪ s.t. 1 ⎪ ≤ ≤ , ⎪ ⎪ 7 c2 3 ⎪ ⎪ ⎨ min − cy

(16)

y≥0

⎪ ⎪ s.t. y2 ≤ 10, ⎪ ⎪ ⎪ ⎪  i  ⎪ i2 ⎪ ⎪ , i = 3, 7, y1 + y2 ≤ 10 + ⎪ ⎪ 10 − i 10 − i ⎪ ⎪ ⎪ ⎩ y1 ≤ 10.

2 PF TPF

1.5

1

0.5

0

0

0.5

1

1.5

2

F1

(a) Pareto front (PF) and true Pareto front (TPF) 2 PS TPS

(15)

y1 ≤ 10,

can achieve its optimal solution at point C(7.9, 7.9), refer to Fig. 1. In fact, these two reference points, (i/(10 − i), 1), i = 3, 7, are the gradients of the second group of constraint functions (straight lines cc1 and cc2 in Fig. 1). Also, the shaded part shows the desired region of objective vector c in the inverse optimization problem (IOP). As a result, it is easy for one to obtain the optimal solution set {(a, 1) | a ∈ [3/7, 7/3]} to problem (IOP). These true optimal solutions and the corresponding Pareto front are shown in Fig. 2.

1.5

c2

⎧ maxcy ⎪ ⎪ y≥0 ⎪ ⎪ ⎪ ⎪ ⎨ s.t. y2 ≤ 10,  i  i2 ⎪ ⎪ 10 − i y1 + y2 ≤ 10 + 10 − i , i = 3, 7, ⎪ ⎪ ⎪ ⎪ ⎩

C1

Desired optimal solution

6

F2

I. Number of function evaluations or CPU time required [27,28]: it measures the time complexity of a multi-objective optimization algorithm; II. Number of solutions found [27,28]: it measures the number of solutions that can be found by a multi-objective optimization algorithm; III. Hypervolume indicator [29–31]: this metric calculates the volume (in the objective space) of the union of some hypercubes, in which each hypercube is constructed with a referent point (rp) and a solution as the diagonal corners of the hypercube. For minimized problems, the larger the indicator is, the better the obtained solutions are. IV. C/IC-measure [30,31,25]: C(IC)-measure was proposed to compare the relative quality of two sets of solutions. It measures the percentage of solutions in one set such that these solutions are at least as good as (i.e., not dominated by) those in the other set. We can describe the IC-measure as follows: Let A be a set of nondominated solutions by Algorithm A and B be a set of nondominated solutions by Algorithm B. We denote the nondominated solution set of A ∪ B by D, then IC(A) = (|D ∩ A|)/(|D|) and IC(B) = (|D ∩ B|/|D|), here, | . | represents the element number of a set. If IC(A) > IC(B), it means that Algorithm A finds out more high-quality solutions than Algorithm B. V. U-measure [32]: U-measure is given to measure the uniformity of a given set of nondominated solutions over the Pareto front, which can be finished in the following three steps: (1) determine the domains of the Pareto front over which uniformity is measured, (2) determine the nearest neighbors of each solution in the objective space, and (3) compute the discrepancy among the distances between nearest neighbors. The U-measure is equal to this discrepancy where a smaller discrepancy indicates a better uniformity.

Fig. 2. Fitting of the Pareto-optimal solutions (Pareto front); panel (b): Pareto-optimal solutions (PS) and true Pareto-optimal solutions (TPS).

314

H. Li / Applied Soft Computing 23 (2014) 308–318

where the leader's constraints guarantee that c lies in a reasonable bounded region and that the optimal solution is achieved at vertex C. Further, the problem is equivalent to

\[
\begin{aligned}
\min_{0 \le c \le 10}\ & \bigl(u_1+v_1+u_2+v_2,\ u_3+v_3+u_4+v_4\bigr)\\
\text{s.t. }\ & \tfrac{3}{7}\,c_2 \le c_1 \le \tfrac{7}{3}\,c_2,\\
& c_1 - \tfrac{3}{10-3} + u_1 - v_1 = 0,\\
& c_2 - 1 + u_2 - v_2 = 0,\\
& c_1 - \tfrac{7}{10-7} + u_3 - v_3 = 0,\\
& c_2 - 1 + u_4 - v_4 = 0,\\
& u_i \ge 0,\ v_i \ge 0,\quad i = 1,2,3,4,\\
& \min_{y \ge 0}\ -c\,y\\
& \text{s.t. }\ y_2 \le 10,\\
& \qquad \frac{i}{10-i}\,y_1 + y_2 \le 10 + \frac{i^2}{10-i},\quad i = 3,\,7,\\
& \qquad y_1 \le 10,
\end{aligned}
\tag{17}
\]
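The pairs (u_i, v_i) in (17) are the standard device for linearizing an absolute deviation: minimizing u + v subject to x − a + u − v = 0 and u, v ≥ 0 yields exactly |x − a|, with at most one of u, v positive at the optimum. A quick sanity check of this identity (pure Python, illustrative only):

```python
def abs_linearization(x, a):
    """Closed-form optimum of: min u + v  s.t.  x - a + u - v = 0, u, v >= 0,
    i.e. the split used in problem (17) to linearize |x - a|."""
    u = max(a - x, 0.0)  # positive part of (a - x)
    v = max(x - a, 0.0)  # positive part of (x - a)
    return u, v

for x, a in [(0.2, 3/7), (1.5, 3/7), (3/7, 3/7), (1.0, 1.0)]:
    u, v = abs_linearization(x, a)
    assert abs((x - a) + u - v) < 1e-12       # the equality constraint holds
    assert abs((u + v) - abs(x - a)) < 1e-12  # the objective equals |x - a|
print("u + v reproduces |x - a|")
```

This is why the two upper-level objectives in (17) are sums of u_i + v_i: each sum measures the distance of c from one of the two reference cost vectors.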

Table 1
Scale of the tested problems and the maximum number of objective evaluations (Num-Obj-Eval).

Dim            5        10        15        20        25        30
Num-Obj-Eval   20,000   200,000   200,000   400,000   400,000   800,000

Over all 10 runs, the mean CPU time is 0.9 s and the mean of the U-measure values is 0.08. For inverse optimal value problems with multiple evaluation objectives, Paleologo and Takriti presented a real-world application [9], but no usable data were provided. In order to test the performance of the proposed algorithm on this kind of problem, we construct the following type of test problem (IOVP) at 6 different scales, from 5 to 30 dimensions (Dim); these scales are shown in Table 1.

\[
\begin{aligned}
\min_{x \ge 0}\ & \Bigl(\bigl(x^{\mathrm T} y - p_1\bigr)^2,\ \bigl(x^{\mathrm T} y - p_2\bigr)^2\Bigr)\\
\text{s.t. }\ & A_1 x + B_1 y \le b_1,\\
& \min_{y \ge 0}\ x^{\mathrm T} y\\
& \text{s.t. }\ B_2 y = b_2,
\end{aligned}
\tag{18}
\]
Problem (17) is a linear multi-objective bilevel program, so the resulting problems (14) are linear programs, and the simplex method is used to solve them. We take the parameters of EA/WA as follows:

• Population size Np = 5;
• Crossover probability pc = 0.8;
• Mutation probability pm = 0.01;
• Maximum number of generations MaxGen = 50;
• Number of nondominated solutions NnonD = 100;
• Number of weight vectors s = 10, and r = 10,  = 20;
• Number of interval generations t = 5.

We execute EA/WA in 10 independent runs on problem (IOP), and record the CPU time, the optimal solutions and the objective values. All Pareto-optimal solutions found belong to the true Pareto-optimal set, which means that the Pareto front is fitted very well. From the 10 groups of results we randomly choose one and plot the Pareto front and the Pareto solutions, shown in Fig. 2.

which has the same type as problem (6) or (7), where x has the same dimension as y. In these constructed problems, all elements of A1, Bi, bi, i = 1, 2, are generated randomly in the interval [−50, 50], and the vector p = (p1, p2) is taken at random in [0, 50]. In order to guarantee that the constraint region is bounded, we restrict all variables to [0, 100]. The number of the follower's constraints is ⌊Dim × 0.6⌋, whereas the number of the leader's constraints is taken directly as Dim. For the problems (14) generated in these test examples, the active-set method is used to solve the quadratic programs. In order to test the stability and performance of the proposed algorithm, we construct two computational examples (denoted by P1 and P2) for each scale by using the above procedure; as a result, there are 12 examples in total. For nonlinear multi-objective bilevel programming problems, there are few efficient algorithms. Deb and Sinha proposed NSGA-II for multi-objective optimization problems and extended the method to multi-objective bilevel programs [13,24]. But for the case of problem (6) or (7), in which the follower has only one objective, the method presented in [13] is computationally expensive, since it randomly searches for Pareto solutions of the lower level in the feasible region.
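The random construction of the (IOVP) test instances described above can be sketched as follows. All names here are illustrative (this is not the author's code); only the stated ranges and constraint counts are taken from the text.

```python
import random

def make_iovp_instance(dim, seed=0):
    """Sketch of a random (IOVP) instance of the form of problem (18):
    leader constraints A1 x + B1 y <= b1 (dim of them), follower
    constraints B2 y = b2 (floor(dim * 0.6) of them), targets p in [0, 50]."""
    rng = random.Random(seed)
    m_leader = dim                # number of leader constraints
    m_follower = int(dim * 0.6)   # number of follower constraints

    def rand_mat(m, n):
        return [[rng.uniform(-50, 50) for _ in range(n)] for _ in range(m)]

    A1 = rand_mat(m_leader, dim)
    B1 = rand_mat(m_leader, dim)
    b1 = [rng.uniform(-50, 50) for _ in range(m_leader)]
    B2 = rand_mat(m_follower, dim)
    b2 = [rng.uniform(-50, 50) for _ in range(m_follower)]
    p = (rng.uniform(0, 50), rng.uniform(0, 50))
    # In addition, all variables are restricted to [0, 100] when the
    # instance is solved, to keep the constraint region bounded.
    return A1, B1, b1, B2, b2, p

A1, B1, b1, B2, b2, p = make_iovp_instance(5)
print(len(A1), len(B2))  # 5 leader constraints, 3 follower constraints
```

For Dim = 5 this yields 5 leader constraints and ⌊5 × 0.6⌋ = 3 follower constraints, matching the construction used for the 12 examples P1 and P2.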

Fig. 3. Comparison of the Pareto-optimal solutions when NSGA-II fails.
We revise the method as follows: for each variable value taken by the upper level, the lower level is solved with the simplex method, whereas the upper-level variables are evolved via NSGA-II. The revised version is more efficient than the original one, since the solutions of the lower level can be obtained fast and accurately. In the remainder, for simplicity, we also denote the revised NSGA-II by NSGA-II. In order to compare EA/WA with NSGA-II fairly, we set a maximum number of objective evaluations (Num-Obj-Eval) for each problem; see Table 1. We run EA/WA and NSGA-II on each problem until Num-Obj-Eval is reached, and record the CPU time, the Pareto-optimal solutions and the corresponding objective values (Pareto front). Furthermore, in order to test the stability of the algorithms, we execute EA/WA and NSGA-II in 10 independent runs, and compute the mean CPU time and the success rate over the 10 runs for each computational example. Also, to compare the quality

of the solutions, we adopt the hypervolume indicator, the IC-measure and the U-measure. We set Np = 100 except for the case Dim = 5, and keep the other parameters unchanged. MaxGen is omitted, since the algorithm stops when the maximum number of objective evaluations is reached for these 12 computational examples. First, we define the term "success rate": if an algorithm finds at least 5 Pareto-optimal solutions in a run, then the run is successful; the success rate is the proportion of successful runs among all runs. For each class of problems with the same scale, we record the success rate over all 20 runs. In addition, the mean CPU time for each class of problems is also calculated. The success rates and the mean CPU times are shown in Table 2. From Table 2, one can see that the success rate of NSGA-II decreases as the dimension increases, whereas EA/WA keeps a success rate of 100%. Also, when the algorithms stop after the same number of objective evaluations, the less CPU time an algorithm occupies, the

Fig. 4. Pareto fronts obtained by the proposed algorithm and NSGA-II for problems with different dimensions: (a) Dim = 5; (b) Dim = 10; (c) Dim = 15; (d) Dim = 20; (e) Dim = 25; (f) Dim = 30.


Fig. 5. Comparison of the hypervolume indicators as the generation increases: (a) Dim = 5, rp = (10, 10); (b) Dim = 10, rp = (25, 25); (c) Dim = 15, rp = (25, 25); (d) Dim = 20, rp = (60, 60); (e) Dim = 25, rp = (10, 10); (f) Dim = 30, rp = (60, 60).

Table 2
Mean CPU time and the success rates.

       Success-rate (%)       Mean CPU-time (s)
Dim    EA/WA    NSGA-II       EA/WA    NSGA-II
5      100      100           36       192
10     100      80            297      1755
15     100      50            320      2554
20     100      50            610      4576
25     100      20            807      6545
30     100      10            1011     13,307

Table 3
Mean of the hypervolume indicators.

       EA/WA                  NSGA-II
Dim    P1         P2          P1         P2
5      13.28      241.21      13.26      241.09
10     212.35     19.78       212.24     19.51
15     67.20      11.58       67.14      11.32
20     1991.50    135.95      1989.10    134.11
25     13.28      56.93       13.25      56.35
30     1075.30    121.62      1074.00    121.22

more efficient the algorithm is. EA/WA needs much less CPU time than NSGA-II. In fact, in our experiments on the 12 examples, when NSGA-II fails it can only provide nondominated solutions with unsatisfactory objective values. We randomly show one failure case of NSGA-II on an example (Dim = 30, P1); see Fig. 3. It should be noted that for this example EA/WA obtains a better Pareto-optimal front. Considering that a failed run of NSGA-II yields too large objective values and too few Pareto-optimal solutions, and that comparisons of the quality and/or uniformity of the obtained solutions should be made at the same order of magnitude, we only compare the results provided by successful runs. These comparisons take three indicators into account: the hypervolume indicator, the IC-measure and the U-measure. Table 3 shows the comparison of the means of the hypervolume indicator values for each problem, in which the reference point is taken as follows: each of its components is the maximal value among the corresponding objective values of all solutions found by the two algorithms. From the table, one can easily see that EA/WA finds better solutions than NSGA-II with regard to this indicator. In fact, NSGA-II would yield even worse indicator values if all results, including the failure cases, were considered. Table 4 shows the means of the IC-measure and U-measure values of the Pareto-optimal fronts provided by EA/WA and NSGA-II, respectively. The mean IC-measure values corresponding to EA/WA are larger than or equal to those provided by NSGA-II, while the mean U-measure values generated by EA/WA are less than those given by NSGA-II. It follows that EA/WA found more

Table 4
Mean of IC-measure and U-measure values.

                  Mean IC-measure        Mean U-measure
No. (Dim-Prob.)   EA/WA    NSGA-II       EA/WA    NSGA-II
5-P1              0.53     0.51          0.02     0.04
5-P2              0.57     0.51          0.05     0.08
10-P1             0.55     0.50          0.08     0.13
10-P2             0.52     0.51          0.09     0.12
15-P1             0.54     0.51          0.04     0.09
15-P2             0.55     0.54          0.07     0.08
20-P1             0.51     0.50          0.22     0.51
20-P2             0.56     0.52          0.23     0.53
25-P1             0.52     0.50          0.03     0.05
25-P2             0.50     0.50          0.02     0.04
30-P1             0.51     0.50          0.21     0.36
30-P2             0.55     0.53          0.17     0.32


high-quality Pareto-optimal solutions than NSGA-II, and the Pareto fronts provided by EA/WA are more uniform than those given by NSGA-II for all 12 problems. In order to show the distribution of the solutions obtained by the two algorithms, we randomly select one of the successful runs on each problem P1 and plot the Pareto fronts (Fig. 4). In these scatter diagrams we randomly choose a segment with different distributions and magnify it for comparison. From these magnified parts, one can see that the Pareto-optimal solutions found by EA/WA are located more uniformly in the objective space than those found by NSGA-II. In order to illustrate the search process of the two algorithms, we take the hypervolume indicator as a measure and show how its value changes as the number of generations increases. When calculating the hypervolume, we select a dominated point as the reference point for each problem; if a worse point is found in the evolving process, we take the volume of the corresponding hypercube as 0. When the same genetic parameter values are taken (population size, crossover and mutation probabilities, etc.), the hypervolume indicator is affected only by the search spaces and the search schemes, including the individual encoding. We execute EA/WA and NSGA-II in 5 runs for each problem P1, and calculate the mean hypervolume values at each generation, with the maximum number of generations taken as 100. These hypervolume values are plotted as the number of generations increases from 1 to 100; see Fig. 5. It can easily be seen that the performance of NSGA-II becomes increasingly worse as the dimension increases, since the success rate is lower. A simple analysis explains this result: for a problem with Dim = 30 there are 18 lower-level constraints, so the search space of EA/WA is composed of at most $C_{30}^{18}$ points, whereas that of NSGA-II is [0, 100]³⁰. The small, finite search space makes EA/WA more efficient than NSGA-II.
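For the two-objective problems used here, the hypervolume against a reference point rp can be computed with a simple sweep over the sorted front, which also implements the convention above that points worse than rp contribute nothing. A minimal sketch (minimization, illustrative names):

```python
def hypervolume_2d(points, rp):
    """Area dominated by a set of 2-objective minimization solutions,
    measured against reference point rp; points not strictly better than
    rp in both objectives contribute zero volume."""
    # Keep only points strictly better than the reference point.
    pts = sorted(p for p in points if p[0] < rp[0] and p[1] < rp[1])
    hv, prev_f2 = 0.0, rp[1]
    for f1, f2 in pts:          # sweep in increasing f1
        if f2 < prev_f2:        # only nondominated staircase steps add area
            hv += (rp[0] - f1) * (prev_f2 - f2)
            prev_f2 = f2
    return hv

front = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
print(hypervolume_2d(front, rp=(4.0, 4.0)))  # -> 6.0
```

Sorting by the first objective makes the front a staircase, so each point adds the rectangle between itself, the previous step and rp; dominated points fail the `f2 < prev_f2` test and are skipped, matching the "volume 0" convention used for Fig. 5.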
When the variables are restricted to the interval [0, 2], that is, when the constraint region is reduced, the success rate of NSGA-II becomes 100% for all problems. It follows that one must have a priori knowledge of the Pareto-optimal solutions when NSGA-II is applied to larger-scale problems, whereas EA/WA shows good performance on all computational examples.

5. Conclusion

We begin with a real-world application and discuss inverse optimal value problems based on a uniform multi-objective bilevel programming model. In fact, it is very difficult to design an efficient, globally convergent algorithm for nonlinear multi-objective BLPPs, especially when the scale of the problem is very large. As a consequence, theoretical optimality results for the specific problems at hand should be exploited in the design of approaches. In this paper we present an efficient evolutionary algorithm that makes use of the optimality results of linear programming, which makes the search space finite and provides an efficient local search. In this algorithmic approach there are no restrictions on the leader, provided the problem is solvable in the form of (7).

Acknowledgments

The research work was supported by the National Natural Science Foundation of China under Grant No. 61065009 and the Natural Science Foundation of Qinghai Province under Grant No. 2013-z-937Q.

References

[1] D. Burton, P.L. Toint, On an instance of the inverse shortest paths problem, Math. Program. 53 (1–3) (1992) 45–61.


[2] D. Burton, P.L. Toint, On the use of an inverse shortest paths algorithm for recovering linearly correlated costs, Math. Program. 63 (1–3) (1994) 1–22.
[3] J. Zhang, Z. Liu, Calculating some inverse linear programming problems, J. Comput. Appl. Math. 72 (2) (1996) 261–273.
[4] J. Zhang, Z. Liu, A further study on inverse linear programming problems, J. Comput. Appl. Math. 106 (2) (1999) 345–359.
[5] C. Heuberger, Inverse combinatorial optimization: a survey on problems, methods, and results, J. Comb. Optim. 8 (3) (2004) 329–361.
[6] Y. Lv, T. Hu, Z. Wan, A penalty function method for solving inverse optimal value problem, J. Comput. Appl. Math. 220 (1–2) (2008) 175–180.
[7] S. Ahmed, Y. Guan, The inverse optimal value problem, Math. Program. 102 (1) (2005) 91–110.
[8] O. Berman, Z. Ganz, J.M. Wagner, A stochastic optimization model for planning capacity expansion in a service industry under uncertain demand, Naval Res. Logist. 41 (4) (1994) 545–564.
[9] G. Paleologo, S. Takriti, Bandwidth trading: a new market looking for help from the OR community, AIROnews 3 (2001) 1–4.
[10] M.S. Osman, W.F. Abd El-Wahed, M.M. El Shafei, H.B. Abd El Wahab, An approach for solving multiobjective bilevel linear programming based on genetic algorithm, J. Appl. Sci. Res. 6 (4) (2010) 336–344.
[11] J.J. Ye, Necessary optimality conditions for multiobjective bilevel programs, Math. Oper. Res. 36 (1) (2011) 165–184.
[12] S.R. Arora, R. Arora, Indefinite quadratic bilevel programming problem with multiple objectives at both levels, Int. J. Optim.: Theory Meth. Appl. 1 (3) (2009) 318–327.
[13] K. Deb, A. Sinha, An efficient and accurate solution methodology for bilevel multi-objective programming problems using a hybrid evolutionary-local-search algorithm, Evol. Comput. 18 (3) (2010) 403–449.
[14] S. Dempe, Annotated bibliography on bilevel programming and mathematical programs with equilibrium constraints, Optimization 52 (3) (2003) 333–359.
[15] J.F. Bard, Practical Bilevel Optimization, Kluwer Academic Publishers, The Netherlands, 1998.
[16] Y. Wang, Y.-C. Jiao, H. Li, An evolutionary algorithm for solving nonlinear bilevel programming based on a new constraint-handling scheme, IEEE Trans. Syst. Man Cyber. C 35 (2) (2005) 221–232.
[17] H.C. Li, Y.P. Wang, An interpolation-based genetic algorithm for solving nonlinear bilevel programming problems, Chin. J. Comput. 31 (6) (2008) 910–918 (in Chinese).
[18] H.C. Li, Y.P. Wang, A real-binary coded genetic algorithm for solving nonlinear bilevel programming with nonconvex objective functions, in: Proceedings of the 2011 IEEE Congress on Evolutionary Computation (CEC), June 5–8, New Orleans, USA, 2011, pp. 2496–2500.
[19] B. Colson, P. Marcotte, G. Savard, Bilevel programming: a survey, Q. J. Oper. Res. (4OR) 3 (2) (2005) 87–107.
[20] J.B.E. Etoa, Solving convex quadratic bilevel programming problems using an enumeration sequential quadratic programming, J. Glob. Optim. 47 (4) (2010) 615–637.
[21] G. Wang, K. Zhu, Z. Wan, An approximate programming method based on the simplex method for bilevel programming problem, Comput. Math. Appl. 59 (10) (2010) 3355–3360.
[22] H.I. Calvete, C. Gale, On linear bilevel problems with multiple objectives at the lower level, Omega 39 (1) (2011) 33–40.
[23] H.I. Calvete, C. Gale, Linear bilevel programs with multiple objectives at the upper level, J. Comput. Appl. Math. 234 (4) (2010) 950–959.
[24] K. Deb, A. Sinha, An evolutionary approach for bilevel multi-objective problems, Commun. Comput. Inform. Sci. 35 (2009) 17–24.
[25] Y. Wang, Theory, Methodology of Evolutionary Computation, Science Press, Beijing, 2011 (in Chinese).
[26] Y. Wang, C.Y. Dang, H. Li, L. Han, J. Wei, A clustering multi-objective evolutionary algorithm based on orthogonal and uniform design, in: Proceedings of the 2009 IEEE Congress on Evolutionary Computation (CEC), May 18–21, Trondheim, Norway, 2009, pp. 2927–2933.
[27] Y.W. Leung, Y.P. Wang, Multiobjective programming using uniform design and genetic algorithm, IEEE Trans. Syst. Man Cyber. C 30 (3) (2000) 293–304.
[28] H. Ishibuchi, T. Murata, A multi-objective genetic local search algorithm and its application to flowshop scheduling, IEEE Trans. Syst. Man Cyber. C 28 (3) (1998) 392–403.
[29] K. Deb, Multi-Objective Optimization Using Evolutionary Algorithms, John Wiley & Sons Ltd., Chichester, 2002.
[30] E. Zitzler, L. Thiele, Multiobjective evolutionary algorithms: a comparative case study and the strength Pareto approach, IEEE Trans. Evol. Comput. 3 (4) (1999) 257–271.
[31] E. Zitzler, Evolutionary Algorithms for Multiobjective Optimization: Methods and Applications (Ph.D. thesis), Swiss Federal Institute of Technology, Zurich, Switzerland, 1999.
[32] Y.W. Leung, Y.P. Wang, U-measure: a quality measure for multiobjective programming, IEEE Trans. Syst. Man Cyber. A 33 (3) (2003) 337–343.