Author's Accepted Manuscript

A high performing metaheuristic for multi-objective flowshop scheduling problem
N. Karimi, H. Davoudpour

To appear in: Computers & Operations Research
PII: S0305-0548(14)00007-0
DOI: http://dx.doi.org/10.1016/j.cor.2014.01.006
Reference: CAOR3492
www.elsevier.com/locate/caor

Cite this article as: N. Karimi, H. Davoudpour, A high performing metaheuristic for multi-objective flowshop scheduling problem, Computers & Operations Research, http://dx.doi.org/10.1016/j.cor.2014.01.006
A high performing metaheuristic for multi-objective flowshop scheduling problem
N. Karimi, H. Davoudpour
Abstract

The genetic algorithm is a powerful procedure for finding optimal or near-optimal solutions to the flowshop scheduling problem. It is a simple and efficient algorithm, used for both single- and multi-objective problems, and it can easily be applied to real-life problems. The proposed algorithm makes use of the principle of Pareto solutions: it mines the Pareto archive to extract the most repetitive sequences and builds an artificial chromosome for generating the next population. To guide the search direction, this approach is coupled with variable neighborhood search. The algorithm is applied to the flowshop scheduling problem to minimize makespan and total weighted tardiness. To assess the algorithm, its performance is compared with that of MOGLS (Arroyo and Armentano (2005)). The experimental results allow us to claim that the proposed algorithm performs considerably well on this problem.
Keywords: Genetic algorithm; Data mining; Flowshop scheduling problem; Multi-objective optimization; Pareto archive; Variable Neighborhood Search (VNS)
1. Introduction

Scheduling is a field of study that concerns the allocation of limited resources to jobs to be processed, together with their sequencing and timing, in order to optimize performance measures. A flowshop is a production system in which jobs are processed in the same order through a series of machines. Commonly used performance measures for this problem are makespan, mean flow time, tardiness, and the number of tardy jobs. Great attention has been paid to flowshop problems in recent decades, and different approaches have been presented for each problem size: for example, mixed integer programming and branch and bound algorithms for small problems, and heuristics and metaheuristics for medium and large problems. An investigation of the flowshop scheduling problem and its optimal and approximate solution approaches over fifty years was presented by Gupta and Stafford (2006). The majority of the flowshop scheduling literature concentrates on a single optimization criterion; some examples are Pan et al. (2002), Fink and Voß (2003), Bulfin and M'Hallah (2003), Blazewicz et al. (2005a), Choi et al. (2005), Grabowski and Pempera (2005), and Whang et al. (2006). But a single objective is impractical and inadequate for real-world problems (Murata et al. (1996); Toktas et al. (2004)), so much research has concentrated on taking more than one objective into account at a time. MOGA was presented by Murata et al. (1996a), which
utilized the weighting approach to tackle the multi-objective flowshop scheduling problem. They assigned random weights to the objectives to obtain different search directions. Hybridization of the genetic algorithm with simulated annealing (SA) and also with local search was proposed by Murata et al. (1996b), which yielded better solutions than their previous work. The study of Murata et al. (1996a) was further developed by Ishibuchi and Murata (1998), who implemented a local search procedure that tries to maximize the weighted sum of the objective functions with variable weights. Minimizing makespan and maximum earliness simultaneously was considered by Toktas et al. (2004) for two-machine flowshop scheduling; they proposed a branch and bound procedure and a heuristic for finding near-optimal solutions in reasonable time. TSPGA, a multi-objective algorithm, was introduced by Ponnambalam et al. (2004) for minimizing the weighted sum of makespan, mean flow time, and machine idle time in the flowshop scheduling problem. Arroyo and Armentano (2005) proposed a genetic local search algorithm (MOGLS) with elitism and a parallel multi-objective local search, so as to intensify the search in distinct regions, for the flowshop scheduling problem; they implemented the Pareto dominance concept, and the algorithm was applied to the problem with two sets of objectives: makespan and maximum tardiness, and makespan and total tardiness. Noorul Haq and Radha Ramanan (2006) developed an artificial neural network (ANN) approach for bi-objective flowshop scheduling. Rahimi-Vahed and Mirghorbani (2007) developed a particle swarm algorithm for multi-objective optimization of the flowshop scheduling problem. A bi-criteria m-machine flowshop scheduling problem with sequence-dependent setup times was investigated by Eren (2010), who proposed several heuristics for it. Yagmahan and Yenisey (2010) combined an ant colony optimization approach and a local search strategy to solve the multi-objective flowshop scheduling problem.
A metaheuristic hybridizing two recent local search principles, Variable Neighborhood Search (VNS) and Iterated Local Search, has also been proposed and tested on the permutation flowshop scheduling problem. Data mining refers to data analysis tools that assist in finding previously unknown and useful patterns and relationships in large data sets; its tools include statistical models, mathematical algorithms, and machine learning methods. Few works in the literature have applied data mining techniques to scheduling problems. Koonce and Tsai (2000) showed that data mining can be used to learn from job shop schedules produced by a GA. Dudas et al. (2010) applied multi-objective optimization techniques to the flexible flowshop scheduling problem, using data mining to find the parameters that affect the performance measures. Khademi Zare and Fakhrzad (2011) presented a hybrid algorithm of GA, data mining, and a fuzzy approach for the flexible flowshop scheduling problem; they treated all feasible solutions of the flexible flowshop problem as a database, and data mining helped them find effective parameters for sequencing the jobs. In this study, we apply the Genetic Algorithm (GA), a stochastic search technique with a high ability to solve diverse and difficult problems. The performance of the proposed GA is enriched by hybridizing it with data mining and VNS. VNS, proposed by Mladenovic and Hansen (1997), is one of the most recent metaheuristics and is a simple but effective local search: in contrast to the usual local search approaches, which explore a single neighborhood structure, it is based on systematically changing the neighborhood during the search process. Most of the mining algorithms that have been applied to scheduling analyze the final solutions and draw conclusions from them.
But in our proposed approach, the solutions are mined at the end of each iteration of the algorithm. Our main motivation in utilizing data mining is to find the relationships among the sequences of the solutions in the Pareto archive and to use this knowledge for creating the next generation; this improves the performance of the algorithm and accelerates its convergence. For comparison, we adopt one of the best performing algorithms for this problem according to the review of Minella et al. (2008): Multi-Objective Genetic Local Search (MOGLS). This paper is organized as follows: a definition of the problem is presented in Section 2; Section 3 contains the proposed algorithm; Section 4 gives the main results of the algorithm and a comparison with MOGLS; finally, conclusions are given in Section 5.

2. Problem definition
There is a set of n jobs J = {j1, j2, ..., jn} to be processed on a set of machines M = {m1, m2, ..., mm}. Each job j ∈ J is processed consecutively on the machines, from machine m1 to machine mm. Each job is processed on only one machine at any time, and each machine processes only one job at a time. All jobs are available at time 0. Machines are always available, with no breakdowns or scheduled or unscheduled maintenance. Infinite buffers exist between machines, before the first machine, and after the last; jobs are available for processing at a machine immediately after completing processing at the previous machine. A sequence π = (π(1), π(2), ..., π(n)) is a permutation of jobs representing the order in which the jobs are processed on all machines.
The notations used in this paper are defined as follows.

Notations:
n: number of jobs
m: number of machines
J = {j1, j2, ..., jn}: set of all jobs
M = {m1, m2, ..., mm}: set of all machines
Pij: processing time of job j on machine i
Cj: completion time of job j
Tj: tardiness of job j
dj: due date of job j
The goal is to determine a feasible sequence that minimizes the performance criteria of the problem. Many performance criteria are relevant in studying the flowshop scheduling problem; here we evaluate and optimize two of them, makespan and total weighted tardiness.
Makespan can be defined as C_max(π) = max{C_j}, where C_j is the completion time of job j.

Total tardiness is TT = Σ_{j=1}^{n} T_j, where T_j = max{0, C_j − d_j}, and d_j can be calculated as follows (Ruiz and Stützle (2006)):

d_j = P_j × (1 + r × 3)   ∀j

where P_j = Σ_{i=1}^{m} P_ij is the total processing time of job j over all machines and r is a random
number uniformly distributed in [0, 1].

2.1. Multi-objective concept

Multi-objective optimization problems consider multiple conflicting objectives instead of a single objective. In such problems, it is essential to compromise between the objectives, because focusing on optimizing one objective deteriorates the values of the others. A general multi-objective optimization (minimization) problem can be defined as follows: find a vector x* = (x1*, x2*, ..., xn*) that minimizes the vector of functions
f(x) = [f1(x), f2(x), ..., fO(x)], where x = (x1, x2, ..., xn) is the vector of decision variables and O is the number of objective functions to be minimized. Among the different approaches to multi-objective problems, such as the weighting approach, the hierarchical approach, goal programming, and the Pareto approach, this study implements the Pareto approach, which considers all objectives concurrently. Studies on multi-objective optimization related to our problem were reviewed above.

3. Proposed algorithm

We present a hybrid algorithm based on the Genetic Algorithm (GA). Introduced by Holland (1975), the GA is an optimization technique based on the evolutionary ideas of natural selection and the genetic processes of biological organisms. Each solution in the search space is represented as a chromosome. The GA starts with an initial population of these chromosomes (solutions). Over generations, the population evolves according to genetic operators (selection, crossover, and mutation) with respect to a problem-specific objective function. In each generation, chromosomes are selected from the population (parents) and combined by an operator to generate new chromosomes (offspring). This procedure continues until one of the stopping criteria is met. As mentioned before, to consider all objectives of this multi-objective optimization problem simultaneously, the concept of Pareto-optimal solutions is used here. Features of the solutions in the Pareto archive are also mined using a data mining technique. A detailed description of the algorithm is presented below:
3.1. Representation, initialization and evaluation

The most commonly used representation of a solution in scheduling problems is an array showing the processing order of the jobs; in other words, a permutation of jobs is used for solution representation. The GA operates on a population of individuals, and a specified number of these individuals are generated as random permutations of the jobs. To evaluate each solution, both objective functions (makespan and total weighted tardiness) are taken into account.

3.2. Adaptive Pareto archive set

Instead of a single solution, the Pareto approach yields a set of solutions for multi-objective problems, providing the decision maker with sufficient insight into the problem to make the final decision. A Pareto-optimal (non-dominated) solution is a solution that is not dominated by any other solution in the set. Non-dominated solutions are not worse than one another with respect to every objective function value and are better for at least one objective. Dominance can be defined as follows: a vector u = (u1, u2, ..., uO) is said to dominate v = (v1, v2, ..., vO) if and only if u is partially less than v, i.e., ∀ o ∈ {1, 2, ..., O}, uo ≤ vo ∧ ∃ o ∈ {1, 2, ..., O}, uo < vo (where O is the number of objective functions to be minimized). The non-dominated solutions are stored in a Pareto archive of pre-specified size. Different approaches can be applied to update the archive when a new non-dominated solution is to be added. In this study, the new solution is added to the archive if:
- the archive size is less than its maximum value; or
- the archive size equals its maximum value and the solution has a density value higher than at least one of the solutions in the archive.
A solution's density value is its distance to the nearest non-dominated solution in the archive.
In other words, a solution in a less crowded region has a higher density value than others. The distance between two solutions x and y is:
d_xy = sqrt( Σ_{o=1}^{O} (f_o(x) − f_o(y))² )
f_o(x) is the o-th objective value of solution x and O is the number of objectives. The density value of solution x is density_x = min{d_xy | y ∈ PA}, where PA is the Pareto archive. When a new solution is added to the archive, it is checked whether it dominates any members of the archive; the dominated members are deleted from the archive.
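The archive-update rule can be sketched as follows. The dominance test and the density measure follow the definitions above; which member leaves a full archive is not fully specified in the text, so dropping the most crowded member is an assumption of this sketch, and all names are illustrative:

```python
import math

def dominates(u, v):
    """u dominates v: no worse in every objective, strictly better in one."""
    return all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))

def density(x, others):
    """Distance to the nearest other solution (higher = less crowded)."""
    return min(math.dist(x, y) for y in others if y != x)

def update_archive(archive, new, max_size):
    """Insert `new` into the Pareto archive under the rules of Section 3.2."""
    if any(dominates(y, new) for y in archive):
        return archive                              # new solution is dominated
    archive = [y for y in archive if not dominates(new, y)]
    if len(archive) < max_size:
        return archive + [new]
    # full archive: the paper admits `new` if it is less crowded than some
    # member; here (assumption) the most crowded member is the one replaced
    pool = archive + [new]
    worst = min(archive, key=lambda y: density(y, pool))
    if density(new, pool) > density(worst, pool):
        return [y for y in archive if y != worst] + [new]
    return archive
```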
3.3. Data mining

To transfer the characteristics of good solutions to the next generation, we try to find patterns among them by incorporating data mining into the proposed algorithm. Thus, we mine the Pareto archive, which contains the best solutions found by the algorithm. Since the scheduling problem considered here aims at finding the best sequence of jobs, sequential mining is a suitable technique. This method discovers frequent sequences using the Apriori algorithm (Han (2006)). The Apriori algorithm works on a database together with a minimum support value.

3.3.1. Adaptive Apriori algorithm

This algorithm investigates and mines itemsets by finding the maximal frequent itemsets, given a minimum support threshold. It scans the database (in this study, the Pareto archive) for the set of frequent itemsets of each size and accumulates the count of each item. Items that satisfy the minimum support constitute L1. In the next step, candidate itemsets are generated from L1 and scanned against the database to find L2, and this continues until no more frequent l-itemsets can be found. An item A can be appended to an itemset of Li only if the result satisfies the minimum support value; an itemset derived from Li has a frequency value no higher than that of Li. From the set Li, the candidate itemsets of the next size are generated and denoted Ci+1; the Apriori algorithm checks each member of Ci+1 against the predefined minimum support value and accumulates its count, and each candidate that meets the support criterion is added to Li+1. The following example illustrates the procedure in more detail. As mentioned previously, the algorithm is applied to a flowshop scheduling problem and is initialized with a database; a small database is presented in Table 1 for ease of exposition. Suppose the minimum support value equals 2 for this problem.
Table 1: Pareto archive
Number   Solution
1        5 3 4 2 1
2        3 5 4 2 1
3        5 3 2 1 4
4        5 3 1 2 4
5        5 2 1 3 4
For the set of candidate 1-itemsets, the number of occurrences of each item in the database is counted (since we implement this algorithm on a scheduling problem, each permutation contains every 1-itemset exactly once).
Table 2: 1-itemset candidates C1
Itemset   Support count
{1}       5
{2}       5
{3}       5
{4}       5
{5}       5
The Apriori algorithm checks each member of C1 against the minimum support value and deletes the itemsets whose support count is below it; the resulting set is denoted L1. L1 is then used to find C2, the set of candidate 2-item sequences: the join L1 × L1 generates the candidates, and those meeting the minimum support constitute L2. This generation of itemsets continues until all the frequent itemsets have been found.

Table 3: 1-itemsets L1
Itemset   Support count
{1}       5
{2}       5
{3}       5
{4}       5
{5}       5
Table 4: 2-itemset candidates C2
Itemset   Support count
{1,2}     1
{1,3}     1
{1,4}     1
{2,1}     4
{2,4}     1
{3,1}     1
{3,2}     1
{3,4}     2
{3,5}     1
{4,2}     2
{5,2}     1
{5,3}     3
Table 5: 2-itemsets L2
Itemset   Support count
{2,1}     4
{3,4}     2
{4,2}     2
{5,3}     3
Table 6: 3-itemset candidates C3
Itemset   Support count
{4,2,1}   2
{3,4,2}   1
{5,3,4}   0

Table 7: 3-itemsets L3
Itemset   Support count
{4,2,1}   2
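The level-wise search that produces the tables above can be sketched as follows; support here counts the archive solutions in which a candidate occurs as a consecutive run of jobs, and the function name is illustrative:

```python
def frequent_sequences(db, min_support):
    """Level-wise (Apriori-style) mining of frequent consecutive job
    subsequences; returns {sequence tuple: support count}."""
    frequent = {}
    candidates = [(j,) for j in sorted({j for s in db for j in s})]
    while candidates:
        level = {}
        for cand in candidates:
            k = len(cand)
            # a solution supports a candidate if it contains it as a run
            support = sum(any(tuple(s[i:i + k]) == cand
                              for i in range(len(s) - k + 1)) for s in db)
            if support >= min_support:
                level[cand] = support
        if not level:
            break
        frequent.update(level)
        # join step: overlap-extend frequent sequences of the current length
        candidates = [a + (b[-1],) for a in level for b in level
                      if a[1:] == b[:-1]]
    return frequent

# Pareto archive of Table 1, minimum support 2
archive = [[5, 3, 4, 2, 1], [3, 5, 4, 2, 1], [5, 3, 2, 1, 4],
           [5, 3, 1, 2, 4], [5, 2, 1, 3, 4]]
freq = frequent_sequences(archive, 2)
```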
3.4. Artificial chromosome

In this stage, we use the information obtained by data mining and generate an artificial chromosome as follows: choose the itemset with the largest frequency value from the set of I-itemsets (I: the maximum size of the found itemsets). Then find the item that has the largest frequency value with the first
or last item of the chosen itemset (respecting the feasibility of the sequence). Continue in this manner until all jobs are assigned to the sequence. For the example above, the generation of the artificial chromosome proceeds as follows:

{4,2,1} is the most frequent 3-itemset (3: the maximum size of the found itemsets):
    4 2 1
{3,4} has the largest frequency value with the first or last item of {4,2,1}:
    3 4 2 1
{5,3} has the largest frequency value with the first or last item of {3,4,2,1}:
    5 3 4 2 1
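The construction just described can be sketched as a greedy extension of the longest mined sequence. `freq` maps frequent consecutive subsequences to their support counts (as produced by the mining step); the tie-breaking rule is an illustrative assumption:

```python
def artificial_chromosome(freq, jobs):
    """Build the artificial chromosome from mined frequent sequences:
    start from the longest, most frequent sequence and repeatedly attach
    the job whose pair with the current head or tail is most frequent."""
    seed = max(freq, key=lambda s: (len(s), freq[s]))
    chrom = list(seed)
    remaining = [j for j in jobs if j not in chrom]
    while remaining:
        options = []
        for j in remaining:
            options.append((freq.get((j, chrom[0]), 0), 'front', j))
            options.append((freq.get((chrom[-1], j), 0), 'back', j))
        _, where, j = max(options)          # ties broken deterministically
        if where == 'front':
            chrom.insert(0, j)
        else:
            chrom.append(j)
        remaining.remove(j)
    return chrom

# the mined counts of Tables 5 and 7 reproduce the worked example
freq = {(4, 2, 1): 2, (2, 1): 4, (3, 4): 2, (4, 2): 2, (5, 3): 3}
```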
3.5. Variable Neighborhood Search

To find a good-quality solution, we apply the variable neighborhood search structure to the artificial chromosome. Because it utilizes several neighborhood structures and is easy to implement and adaptable to different kinds of problems, VNS has attracted the attention of many researchers in recent years. VNS is based on two major facts: 1) a local optimum with respect to one type of neighborhood is not necessarily a local optimum with respect to another, and 2) a global optimum is a local optimum with respect to all types of neighborhoods. VNS works with two nested loops. Shake and local search are the two main functions of the inner loop, which explores the search space. Diversification is achieved by the shake function, which switches to another neighborhood, while the neighborhood of the improved solution is explored via the local search. The inner loop continues as long as it keeps improving, and the outer loop continues until the termination criteria are met. The stopping criterion may be the maximum CPU time allowed, the maximum number of iterations, or the maximum number of iterations between two improvements (Mladenovic and Hansen (2001)).
Basic VNS algorithm
Design a set of neighborhood structures N_k, k = 1, ..., K_max
Select an initial solution x at random
for t = 1 to maximum number of iterations do
    set k = 1
    while k ≤ K_max do
        Shake: generate a random solution x' from the k-th neighborhood of x (x' ∈ N_k(x))
        Local search: apply some local search method on N_k(x') to find a new solution x''
        if fitness(x'') ≤ fitness(x) then
            x = x''
            set k = 1
        else
            k = k + 1
        end if
    end while
end for
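A compact Python rendering of this loop with the two neighborhood moves used in this study (insertion and swap, described in the text). The sampled local search and the strict-improvement acceptance (the pseudo-code accepts ties) are assumptions of this sketch, made so the inner loop is guaranteed to terminate:

```python
import random

def insertion(seq):
    """Move a randomly chosen gene directly before another random position."""
    s = seq[:]
    i, j = random.sample(range(len(s)), 2)
    s.insert(j, s.pop(i))
    return s

def swap(seq):
    """Exchange two randomly chosen genes."""
    s = seq[:]
    i, j = random.sample(range(len(s)), 2)
    s[i], s[j] = s[j], s[i]
    return s

def vns(x, fitness, iters=100, tries=30):
    """Basic VNS over the two neighbourhood structures (K_max = 2)."""
    neighbourhoods = [insertion, swap]
    for _ in range(iters):
        k = 0
        while k < len(neighbourhoods):
            move = neighbourhoods[k]
            x1 = move(x)                     # shake: random k-th neighbour
            # local search: best of a sample around the shaken solution
            x2 = min((move(x1) for _ in range(tries)), key=fitness)
            if fitness(x2) < fitness(x):
                x, k = x2, 0                 # improvement: back to first NS
            else:
                k += 1
    return x
```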
As is clear from the pseudo-code, an initial solution x is first generated randomly; then, using the first neighborhood structure in the shake procedure, a neighboring solution x' is randomly created. A local search is carried out around x' to find the local optimum x''. If the local optimum x'' found has a better value than the current solution x, it becomes the current solution and the algorithm restarts with the first neighborhood structure; otherwise, the algorithm switches to the next neighborhood structure. Since improvement in VNS relies on different neighborhood search structures, the proposed VNS should utilize several of them; to save time, however, we fix the number of Neighborhood Structures (NS) to two. The two structures employed in our algorithm are insertion and swap. (1) Insertion: randomly identify two different genes in the chromosome and place one gene in the position that directly precedes the other. (2) Swap: randomly identify two different genes and place each gene in the position previously occupied by the other.

3.6. Elitist strategy and selection

The elitist strategy selects the best solution according to each objective function and transfers them to the next generation. It also copies the artificial chromosome generated in each generation to the next generation. Selection is the operator that picks two parent strings for generating new offspring. We apply a tournament strategy that randomly selects two chromosomes and chooses the one with the better rank value. The rank value is calculated by non-dominated sorting of the solutions: the lower the rank, the better the solution.
For this purpose, we utilize the fast non-dominated sorting approach (Deb et al. 2002). This procedure categorizes solutions into different non-domination levels. Each solution is compared with every other solution in the population, according to all objective functions, to check whether it is dominated. The non-dominated solutions at this step are assigned to the first non-domination level, and the procedure continues on the remaining solutions until all solutions are classified. To do this, the following two quantities are calculated for each individual: np, the number of solutions that dominate solution p, and Sp, the set of solutions that solution p dominates.
Solutions with np = 0 are allocated to the first non-dominated front. The value of np is then revised for the solutions in the Sp set of each of these non-dominated solutions, being decreased by one. Among the remaining solutions, those with np = 0 then constitute the next non-dominated front. This procedure continues until every solution is assigned to a front.

3.7. Crossover and mutation

Crossover is an operator that creates children from two parent strings by exchanging parts of their corresponding chromosomes, which retain their characteristics in some form. We employ the SJOX operator, an extension of one-point crossover, which can be described as follows (Ruiz et al. (2006b)):
• Choose two chromosomes as parents using the selection operator.
• Find the jobs common to a parent chromosome and the artificial chromosome (a job is common to two chromosomes if it occupies the identical position in both).
• Transfer the jobs common between the father and the artificial chromosome to the son, and the jobs common between the mother and the artificial chromosome to the daughter.
• Generate a crossover point (a random integer in [1, n]).
• Transfer the unassigned genes from the father to the son and from the mother to the daughter, from the first gene up to the crossover point.
• Transfer the unassigned genes from the mother to the son and from the father to the daughter, from the crossover point to the end.
Diversification of the population in each generation is provided by the mutation operator, which produces random changes in chromosomes. Here the swap method is used as the mutation operator: it swaps two genes of a single chromosome. The mutation operator is applied with a predefined probability (the mutation probability) to generate new chromosomes; the remaining chromosomes of the population are generated by the crossover operator.
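The SJOX steps above can be sketched for one child (the other is obtained symmetrically by swapping the parents); the cut point is passed in here for determinism, and the function names are illustrative:

```python
import random

def sjox_child(p1, p2, artificial, cut=None):
    """One SJOX child: inherit jobs at positions where p1 agrees with the
    artificial chromosome, copy p1 up to the cut point, then fill the
    remaining positions with the unused jobs in p2's order."""
    n = len(p1)
    if cut is None:
        cut = random.randint(1, n - 1)
    child = [None] * n
    for i in range(n):                      # jobs common with the artificial
        if p1[i] == artificial[i]:          # chromosome (identical position)
            child[i] = p1[i]
    for i in range(cut):                    # first block comes from p1
        child[i] = p1[i]
    rest = [j for j in p2 if j not in child]
    for i in range(n):                      # remaining genes in p2's order
        if child[i] is None:
            child[i] = rest.pop(0)
    return child

def sjox(father, mother, artificial):
    """Son and daughter share the same random cut point."""
    cut = random.randint(1, len(father) - 1)
    return (sjox_child(father, mother, artificial, cut),
            sjox_child(mother, father, artificial, cut))
```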
3.8. Termination criteria

Both computation time and the number of generations are used as termination criteria.

4. Experimental analysis

A prototype of the proposed GA is coded in Matlab and executed on a PC with a 2.33 GHz Intel Core 2 Duo and 2 GB of RAM. To illustrate the approach, as mentioned earlier, we consider a flowshop scheduling problem with two objectives: makespan and total weighted tardiness. MOGLS (Arroyo and Armentano (2005)), one of the best performing algorithms in this field according to the review of Minella et al. (2008), is selected for comparison with our algorithm. The instance set is composed of different combinations of the number of jobs (n) and the number of machines (m); the n × m combinations take values in {20, 50, 100} × {5, 10, 20}. For each combination, 10 instances are created, so the total number of instances is 90. The weights of the jobs are considered equal. The processing times (Pij) are generated from a uniform distribution in the range [1, 99].

4.1. Performance measures

The conflicting nature of the Pareto archive's solutions makes it necessary to use several performance measures for a better assessment of the proposed algorithm. We consider the following performance metrics:
Number of Pareto Solutions (NPS): the number of Pareto-optimal solutions obtained by each algorithm.

Mean Ideal Distance (MID): the closeness between the Pareto solutions and the ideal point (which is (0, 0), since both functions are minimized):
MID = ( Σ_{i=1}^{nd} cl_i ) / nd

where nd is the number of non-dominated solutions, cl_i = sqrt( (f_{1i} − 0)² + (f_{2i} − 0)² ), and f_{1i}, f_{2i} are the values of the first and second objective functions for the i-th non-dominated solution. The lower the MID value, the better the solution quality.

Spread of Non-dominated Solutions (SNS): the diversity measure of the Pareto archive solutions:

SNS = sqrt( Σ_{i=1}^{nd} (MID − cl_i)² / (nd − 1) )
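The three metrics can be computed directly from the archived objective vectors; a bi-objective sketch with the ideal point at (0, 0) as in the text (the function name is illustrative):

```python
import math

def metrics(front):
    """NPS, MID and SNS for a bi-objective non-dominated front of
    (f1, f2) points; at least two points are assumed for SNS."""
    nd = len(front)
    cl = [math.hypot(f1, f2) for f1, f2 in front]   # distance to ideal (0, 0)
    mid = sum(cl) / nd
    sns = math.sqrt(sum((mid - c) ** 2 for c in cl) / (nd - 1))
    return nd, mid, sns
```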
4.2. Parameter setting

Since parameter values have a significant effect on the performance of the algorithm, it should be calibrated before being compared with one of the best performing algorithms in this field; this clarifies the importance and interplay of the algorithm's parameters. Calibration is usually done through statistical experiments. The most commonly used approach is full factorial design (Cochran (1992)), which loses its efficiency as the number of parameters increases. Here we apply the Taguchi approach (Ross (1989)), in which a large number of decision variables can be tuned through a small number of experiments. This method divides factors into two groups, controllable and noise factors, to minimize the effect of noise and to determine the optimal levels of the important controllable factors based on the concept of robustness. Taguchi's method applies two major tools: the orthogonal array and the signal-to-noise ratio. An orthogonal array is a fractional factorial matrix that enables a comparison between the levels of each factor or interaction of factors; the array is called orthogonal because all columns can be evaluated independently. Controllable factors are placed in the inner orthogonal array, and noise factors in the outer orthogonal array. The values measured in the experiments are transformed into a signal-to-noise ratio, in which desirable values are called signal and undesirable values noise; this ratio expresses the amount of variation in the response variables. Signal-to-noise ratios fall into different categories according to the response characteristics: continuous or discrete; nominal-is-best, smaller-the-better, or larger-the-better. Based on the features of our scheduling problem, we apply the smaller-the-better form:

S/N ratio = −10 log10 (objective function)²
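As a worked instance: over n replicated responses y_i, the standard Taguchi smaller-the-better form is S/N = −10 log10((1/n) Σ y_i²), which reduces to the expression above for a single observation. A one-line sketch (the function name is illustrative):

```python
import math

def sn_smaller_the_better(responses):
    """Smaller-the-better S/N ratio over a list of replicated responses."""
    return -10 * math.log10(sum(y * y for y in responses) / len(responses))
```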
For the proposed algorithm, the following parameters are considered as controllable factors: mutation probability, population size, number of iterations, Pareto archive size, and minimum support. Candidate values for these parameters are presented in Table 8. The best value for each parameter is found through the Taguchi tuning approach.

Table 8: Values of the algorithm's parameters
Parameter              Symbol   Values
Mutation probability   A        0.02, 0.2
Population size        B        100, 300, 500
Pareto archive size    C        10, 20, 40
Minimum support        D        2, 3, 5
Number of iterations   E        50, 150, 300
The Taguchi approach results in 18 different treatments, each of which is run 10 times. The results are shown in Fig. 1, which presents the performance of the algorithm at the different parameter levels.
Figure 1: The parameter tuning results using the Taguchi approach
Based on the average values, a mutation probability of 0.2 is slightly better than the other value, and a population size of 100 is preferred. The best Pareto archive size is 10, and a minimum support of 2 yields statistically better results than 3 or 5. Evidently, smaller numbers of iterations are more desirable than larger ones. The same procedure is applied to tune the parameters of MOGLS.
4.3. Computational evaluation
As mentioned before, each n × m combination is examined on 10 different instances. The proposed algorithm and the aforementioned MOGLS are therefore run 10 times for each n × m combination, and the average values of the performance measures are presented in Table 9.

Table 9: Comparison of the proposed algorithm and MOGLS based on the performance measures
                 Proposed algorithm           MOGLS
Problem n × m    NPS     MID      SNS        NPS     MID      SNS
20 × 5           20      1495     7710       18      2008.8   6755
20 × 10          26      1750     2170       18      5027.77  2095
20 × 20          27      806.22   5570       18      1338.88  4125
50 × 5           20      3054     1590       18.5    3254.1   1405
50 × 10          21      3047     2010       19.5    4379.48  1910
50 × 20          33      1457.57  8140       19      2547.56  2010
100 × 5          20      3068     2750       18.5    3724.32  2450
100 × 10         21      3272     3255       17.5    8228.57  2360
100 × 20         20      4618     4845       18.5    11675.5  4140
Average          23.11   2507.53  4226.66    18.39   4687.22  3027.78
As can be seen, for most problem instances and also in the average value of each measure, the outcome of the proposed algorithm is better than that of MOGLS. It is worth mentioning that the cooperative use of data mining and VNS on the Pareto archive has a considerable effect on the performance of the algorithm, especially for larger problem sizes. This is because data mining searches for relationships among the elite solutions while VNS guides the search procedure of the algorithm in a good direction, so that the algorithm outperforms the leading algorithm (MOGLS) in this field of study. We now investigate the interaction between the algorithms and different problem factors, such as the number of jobs and the number of machines. In Figs. 2, 3, and 4, the influence of n on the performance measures is shown. The outperformance of the proposed algorithm is evident, especially for larger numbers of jobs.
Figure 2: Means plot and 95.0 percent Tukey intervals for the interaction between number of jobs and algorithms based on the MID metric.
Figure 3: Means plot and 95.0 percent Tukey intervals for the interaction between number of jobs and algorithms based on the SNS metric.
Figure 4: Means plot and 95.0 percent Tukey intervals for the interaction between number of jobs and algorithms based on the NPS metric.
Another evaluation of the significant factors affecting the performance of the algorithms is to investigate the interaction between the performance measures and m. As observed in Figures 5-7, with an increasing number of machines the proposed algorithm performs better in terms of MID and NPS. Considering SNS, however, no significant influence of the number of machines can be seen.
Figure 5: Means plot and 95.0 percent Tukey intervals for the interaction between number of machines and algorithms based on the MID metric.
Figure 6: Means plot and 95.0 percent Tukey intervals for the interaction between number of machines and algorithms based on the SNS metric.
Figure 7: Means plot and 95.0 percent Tukey intervals for the interaction between number of machines and algorithms based on the NPS metric.
5. Conclusion
In this article, we have presented a new algorithm that investigates the effect of using data mining along with VNS within a multi-objective evolutionary algorithm. It is devoted to flowshop problems tackled under two of the most widely used and relevant objective functions: the minimization of makespan and of total weighted tardiness. The algorithm implements a sequential mining procedure on the best solutions found, which are stored in the Pareto archive, and builds an artificial chromosome for the operators that create the new generation. Mining the best solutions and applying VNS to the created chromosome thus promotes good convergence of the algorithm and enhances its performance. MOGLS is adapted in order to provide a comparison of the algorithm's performance, and the comparisons demonstrate the better performance of the proposed algorithm.
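The archive-mining step summarized above can be illustrated as follows. This is a hypothetical sketch of the idea only: it mines adjacent job pairs that recur across the Pareto-archive sequences (discarding pairs below a `min_support` threshold, cf. the tuned value of 2) and greedily chains them into one artificial permutation. The paper's actual mining and assembly rules may differ in detail, and the function name is invented for illustration.

```python
from collections import Counter


def artificial_chromosome(archive, n_jobs, min_support=2):
    """Build an artificial job permutation from frequent adjacent
    pairs mined out of the elite (Pareto-archive) sequences."""
    pair_counts = Counter()
    for seq in archive:
        for a, b in zip(seq, seq[1:]):       # adjacent job pairs in each elite sequence
            pair_counts[(a, b)] += 1
    # frequent successor preferences, most repetitive first
    ranked = [p for p, c in pair_counts.most_common() if c >= min_support]

    chromosome, used = [], set()
    current = ranked[0][0] if ranked else 0  # seed with head of the most frequent pair
    chromosome.append(current)
    used.add(current)
    while len(chromosome) < n_jobs:
        # follow the strongest mined successor; otherwise take any unused job
        nxt = next((b for a, b in ranked if a == current and b not in used), None)
        if nxt is None:
            nxt = next(j for j in range(n_jobs) if j not in used)
        chromosome.append(nxt)
        used.add(nxt)
        current = nxt
    return chromosome
```

In the full algorithm, a chromosome assembled this way would then be refined by VNS before feeding the operators that create the next generation.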
In future research, it would be worthwhile to implement other data mining techniques on scheduling problems. We also intend to apply this algorithm to other multi-objective problems.
References:
Arroyo JEC, Armentano VA. Genetic local search for multi-objective flowshop scheduling problems. European Journal of Operational Research 2005; 167(3): 717-38.
Blazewicz J, Pesch E, Sterna M, Werner F. A comparison of solution procedures for two-machine flow shop scheduling with late work criterion. Computers and Industrial Engineering 2005; 49(4): 611-24.
Bulfin RL, M'Hallah R. Minimizing the weighted number of tardy jobs on a two-machine flow shop. Computers and Operations Research 2003; 30(12): 1887-900.
Choi B-C, Yoon S-H, Chung S-J. Minimizing maximum completion time in a proportionate flow shop with one machine of different speed. European Journal of Operational Research 2007; 176(2): 964-74.
Cochran WG, Cox GM. Experimental Designs, 2nd ed. Wiley, USA; 1992.
Eren T. A bicriteria m-machine flowshop scheduling with sequence-dependent setup times. Applied Mathematical Modelling 2010; 34(2): 284-93.
Fink A, Voß S. Solving the continuous flow shop scheduling problem by metaheuristics. European Journal of Operational Research 2003; 151(2): 400-14.
Geiger MJ. Decision support for multi-objective flow shop scheduling by the Pareto Iterated Local Search methodology. Computers & Industrial Engineering 2011; 61(3): 805-12.
Grabowski J, Pempera J. Some local search algorithms for no-wait flow shop problem with makespan criterion. Computers and Operations Research 2005; 32(8): 2197-212.
Gupta JND, Stafford Jr. EF. Flow shop scheduling research after five decades. European Journal of Operational Research 2006; 169(3): 699-711.
Ishibuchi H, Murata T. A multi-objective genetic local search algorithm and its application to flowshop scheduling. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews 1998; 28(3): 392-403.
Minella G, Ruiz R, Ciavotta M. A review and evaluation of multi-objective algorithms for the flowshop scheduling problem. INFORMS Journal on Computing 2008; 20(3): 451-71.
Murata T, Ishibuchi H, Tanaka H. Multi-objective genetic algorithm and its applications to flowshop scheduling. Computers and Industrial Engineering 1996a; 30(4): 957-68.
Murata T, Ishibuchi H, Tanaka H. Genetic algorithms for flowshop scheduling problems. Computers and Industrial Engineering 1996b; 30(4): 1061-71.
Noorul Haq A, Radha Ramanan T. A bicriterian flow shop scheduling using artificial neural network. The International Journal of Advanced Manufacturing Technology 2006; 30(11-12): 1132-8.
Pan JC-H, Chen J-S, Chao C-M. Minimizing tardiness in a two-machine flow shop. Computers and Operations Research 2002; 29(7): 869-85.
Ponnambalam SG, Jagannathan H, Kataria M, Gadicherla A. A TSP-GA multi-objective algorithm for flow shop scheduling. International Journal of Advanced Manufacturing Technology 2004; 23(11-12): 909-15.
Rahimi-Vahed R, Mirghorbani SM. A multi-objective particle swarm for a flow shop scheduling problem. Journal of Combinatorial Optimization 2007; 13(1): 79-102.
Ross RJ. Taguchi Techniques for Quality Engineering. McGraw-Hill, USA; 1989.
Toktas B, Azizoglu M, Koksalan SK. Two-machine flow shop scheduling with two criteria: maximum earliness and makespan. European Journal of Operational Research 2004; 157(2): 286-95.
Wang J-B, Ng CTD, Cheng TCE, Liu L-L. Minimizing total completion time in a two-machine flow shop with deteriorating jobs. Applied Mathematics and Computation 2006; 180(1): 185-93.
Yagmahan B, Yenisey MM. A multi-objective ant colony system algorithm for flow shop scheduling problem. Expert Systems with Applications 2010; 37(2): 1361-8.
Highlights
• In this paper a novel Evolutionary Algorithm (EA) for multi-objective flow shop scheduling is proposed.
• A data mining approach is used to reinforce the performance of the EA.
• An artificial chromosome is utilized to generate better solutions.
• A variable neighborhood search is employed to enhance the quality of the artificial chromosome.