An improved NEH-based heuristic for the permutation flowshop problem


Computers & Operations Research 35 (2008) 3962 – 3968 www.elsevier.com/locate/cor

An improved NEH-based heuristic for the permutation flowshop problem

Xingye Dong*, Houkuan Huang, Ping Chen
School of Computer and IT, Beijing Jiaotong University, 100044 Beijing, China
Available online 2 June 2007

Abstract

NEH is an effective heuristic for solving the permutation flowshop problem with the makespan objective. It consists of two phases: generating an initial job sequence and then constructing a solution from it. This paper studies the initial sequence and proposes a strategy for breaking the job-insertion ties that may arise during the construction phase. An initial sequence generated by combining the average processing times of jobs with their standard deviations shows better performance. The proposed tie-breaking strategy is based on the idea of balancing the utilization among all machines, and experiments show that it improves the performance of NEH significantly. Based on these ideas, a heuristic NEH-D (NEH based on Deviation) is proposed, whose time complexity is O(mn²), the same as that of NEH. Computational results on benchmarks show that NEH-D is significantly better than the original NEH.
© 2007 Elsevier Ltd. All rights reserved.

Keywords: Heuristic; Flowshop; Scheduling; Makespan

1. Introduction

The permutation flowshop sequencing problem (PFSP) is one of the best-known production scheduling problems and has a strong engineering background. Although the two-machine case can be solved in O(n log n) time by the famous Johnson's rule [1], the general problem has been proved strongly NP-hard [2]. Among the objectives studied, makespan minimization has attracted the most attention.

Many approximate algorithms have been developed to find good solutions in a short time. They fall into two categories: improvement methods and constructive methods. Improvement methods are mainly metaheuristics, such as genetic algorithms (GA), simulated annealing (SA) and tabu search (TS). As for constructive methods, several heuristics were developed in relatively early decades, e.g., those of Page [3], Palmer [4], Campbell et al. [5], Gupta [6,7], Dannenbring [8], Nawaz et al. (denoted NEH) [9] and Hundal et al. [10], and more recently those of Koulamas [11], Li et al. [12] and Kalczynski et al. [13]. All of these heuristics are designed to minimize makespan. Recently, heuristics for minimizing flowtime, or both flowtime and makespan, have also been developed, e.g., those of Woo et al. [14] and Framinan et al. [15,16]; all three use the search strategy of the NEH heuristic.

Early comparisons among constructive heuristics for makespan minimization concluded that NEH is the most effective one [17]. Moreover, Koulamas claimed that his heuristic performs as well as NEH, and Li et al. claimed that theirs performs better than NEH. Recently, Ruiz et al. [18] showed that NEH was still the most effective heuristic and

∗ Corresponding author.

E-mail addresses: [email protected] (X. Dong), [email protected] (H. Huang), [email protected] (P. Chen).
0305-0548/$ - see front matter © 2007 Elsevier Ltd. All rights reserved. doi:10.1016/j.cor.2007.05.005


the heuristic proposed by Koulamas showed only average performance. Although several heuristics, such as those in [19,20], were claimed to be better than NEH, Kalczynski et al. [13] showed that these claims were not justified. Kalczynski et al. [13] also proposed a heuristic NEHKK, which was claimed to be better than NEH.

Generally speaking, improvement methods are quite time-consuming, so they are not well suited to large-scale problems. Constructive methods are mainly simple heuristics: they build a solution very quickly, but the solution is usually not as good as desired. Improvements to constructive methods usually rely on certain strategies or priority rules; intuitively, improvement can also be expected when such strategies or rules are used within metaheuristics. Such strategies and priority rules are therefore worth studying in depth.

As mentioned above, NEH is the best among the earlier heuristics. It consists of two phases: first, sort the jobs in non-increasing order of their total processing times; second, insert the jobs one by one into a partial schedule to construct a complete schedule. Clearly, many priority rules could be used in phase I, and many ties for job insertion can arise in phase II. A question then arises naturally: can NEH be improved by using a more effective priority rule in phase I and a more reasonable tie-breaking strategy in phase II?

The remainder of this paper is organized as follows. In Section 2, several initial sequences are analyzed. In Section 3, a tie-breaking strategy is suggested. In Section 4, the improved heuristic NEH-D (NEH based on Deviation) is presented. Computational results are reported in Section 5, and the paper is concluded in Section 6.

2. Priority rule

First, the NEH algorithm is as follows:

(1) order the n jobs by non-increasing sums of processing times on the machines;
(2) take the first two jobs and schedule them so as to minimize the partial makespan, as if there were only these two jobs;
(3) for the kth job, k = 3, ..., n, insert it into the position, among the k possible ones, that minimizes the partial makespan.

Thus the NEH heuristic consists of two phases: first, the jobs are sorted by non-increasing sums of their processing times; second, a job sequence is constructed by evaluating the partial schedules originating from the initial order of phase I. In our implementation, heap sort is used in phase I; for the job insertion in phase II, a job is always inserted into the first position achieving the minimum makespan.

Li et al. [12] improved NEH by using a priority rule in phase I, based on the following hypothesis: the larger the deviation of a job's processing times over the machines, the higher priority it should have. They sort the jobs in descending order of αAVG_j + (1 − α)DEV_j, where α ∈ [0, 1] and AVG_j and DEV_j are defined as follows, with p_ij denoting the processing time of job j on machine i:

AVG_j = (1/m) ∑_{i=1}^{m} p_ij,   (1)

DEV_j = ( ∑_{i=1}^{m} (p_ij − AVG_j)² )^{1/2}.   (2)

AVG_j is the average processing time of job j, and DEV_j reflects the deviation of its processing times. In their algorithm, Li et al. choose L + 1 different values of α, construct one solution for each, and output the best. They claimed that their heuristic is better than NEH, but this comparison is questionable: NEH is a single-pass heuristic while theirs is multi-pass, and moreover their heuristic coincides with NEH when α = 1. Nevertheless, the priority rule itself may be useful. To verify this, an experiment was designed to compare the effect of three priority rules in phase I on all the benchmarks of Taillard [21], with phase II identical to that of NEH. First, define AVG_j as in Eq. (1) and the standard deviation STD_j of the processing times as follows:

STD_j = ( (1/(m − 1)) ∑_{i=1}^{m} (p_ij − AVG_j)² )^{1/2}.   (3)
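The quantities in Eqs. (1) and (3), and the resulting phase-I orderings, can be sketched in a few lines of Python. This is a sketch under the paper's definitions; the function and rule names are our own, and m ≥ 2 machines are assumed so that the sample standard deviation in Eq. (3) is defined.

```python
import math

def priority_key(job_times, rule="AvgDev"):
    """Priority of one job from its processing times across the m machines.

    Avg    : AVG_j of Eq. (1)
    Dev    : STD_j of Eq. (3) (sample standard deviation, needs m >= 2)
    AvgDev : AVG_j + STD_j
    """
    m = len(job_times)
    avg = sum(job_times) / m
    std = math.sqrt(sum((t - avg) ** 2 for t in job_times) / (m - 1))
    return {"Avg": avg, "Dev": std, "AvgDev": avg + std}[rule]

def initial_order(p, rule="AvgDev"):
    """Phase-I order: job indices sorted by non-increasing priority.

    p[i][j] is the processing time of job j on machine i.
    """
    m, n = len(p), len(p[0])
    return sorted(range(n),
                  key=lambda j: -priority_key([p[i][j] for i in range(m)], rule))
```

For example, with p = [[4, 6], [2, 6]] (two machines, two jobs), the Avg rule ranks job 1 first while the Dev rule ranks job 0 first, which is exactly the kind of disagreement the experiment below measures.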


Table 1
Effect of priority rules in phase I on Taillard's benchmarks (the heuristic in the Avg column is the same as NEH)

n|m      Avg    Dev    AvgDev
20|5     3.091  3.107  2.662
20|10    5.025  4.576  4.084
20|20    3.668  4.571  3.816
50|5     0.478  1.228  0.878
50|10    4.226  4.656  3.971
50|20    5.219  5.691  5.154
100|5    0.379  0.592  0.378
100|10   2.281  2.437  1.887
100|20   3.675  3.841  3.889
200|10   1.078  1.372  1.052
200|20   2.514  2.660  2.649
500|20   1.257  1.439  1.279
All      2.741  3.014  2.642

The three priority rules are Avg, Dev and AvgDev, where Avg orders the jobs by AVG_j (as in NEH), Dev orders them by STD_j, and AvgDev orders them by AVG_j + STD_j. The results are shown in Table 1 in terms of the average relative percentage deviation (RPD), calculated as (C_max^H − UB)/UB × 100%, where H ∈ {Avg, Dev, AvgDev} and UB is the upper bound provided by Taillard [21].

From the table, the NEH variant using AvgDev performs best and the variant using Dev performs worst. Three one-sided paired-samples t-tests were carried out on the 120 benchmark instances, i.e., with 119 degrees of freedom. At a significance level of 0.05, the variant using Dev performs significantly worse than the other two (the larger p-value being 0.006), while there is no statistically significant difference between the variant using AvgDev and NEH (the Avg column), even though the overall performance of the former is better (p-value 0.139).

3. New strategy

During the job insertion of NEH in phase II there may be ties, i.e., several partial sequences may have the same partial makespan. For example, as shown in Fig. 1, the makespans of the three partial sequences generated by inserting job 3 into three different positions are equal, but the original NEH algorithm provides no strategy for resolving such ties. Given the complexity of the problem, no single strategy can resolve these ties optimally in every case, but better performance can be expected from an appropriate tie-breaking strategy. With this hypothesis, the following strategy is proposed. First, some notation is defined.
For a permutation π, let π(x) denote the job in the xth position, p_{i,π(x)} the processing time of π(x) on machine i, C_{i,π(x)} the earliest possible completion time of job π(x) on machine i, and S_{i,π(x)} the latest possible start time of job π(x) on machine i. Then the following two measures can be computed for job π(x):

E_(x) = (1/m) ∑_{i=1}^{m} p_{i,π(x)} / (S_{i,π(x+1)} − C_{i,π(x−1)}),   x = 1, ..., n,   (4)

D_(x) = ∑_{i=1}^{m} ( p_{i,π(x)} / (S_{i,π(x+1)} − C_{i,π(x−1)}) − E_(x) )²,   x = 1, ..., n.   (5)

Note the boundary cases: if π(x) is the last job of the partial sequence, let S_{i,π(x+1)} equal the latest possible completion time of job π(x) on machine i; let C_{i,π(0)} equal the earliest possible start time of job π(1) on machine i; and if S_{i,π(x+1)} = C_{i,π(x−1)}, then p_{i,π(x)} must be zero, so let p_{i,π(x)}/(S_{i,π(x+1)} − C_{i,π(x−1)}) equal zero in this case. All C_{i,π(x)} and S_{i,π(x)} can be computed in O(nm) time by the method of Taillard [17]. Then, when inserting a job into the partial sequence, the position minimizing the makespan is chosen first; if there are ties, the position x that


Fig. 1. Ties for job insertion: (a) insert job 3 into the first position; (b) insert job 3 into the second position; (c) insert job 3 into the third position.


Fig. 2. Illustration of E(x) and D(x) : (a) E(1) ≈ 0.5397, D(1) ≈ 0.3190; (b) E(2) ≈ 0.6667, D(2) ≈ 0.1867; (c) E(3) ≈ 0.6667, D(3) ≈ 0.1667.

minimizes D_(x) is chosen; if ties still remain, any of the tied positions is acceptable. The idea is to choose the position most likely to balance the utilization of the machines.

For the example of Fig. 1, the three cases are redrawn in Fig. 2 in order to compute the two measures. In Fig. 2, the jobs up to the insertion position (including the inserted job itself if it is not the last job of the partial sequence) are scheduled as early as possible, and the jobs after the insertion position are scheduled as late as possible. From Fig. 2(a), E_(1) = (1/3)(3/(3 − 0) + 2/(10 − 3) + 3/(14 − 5)) ≈ 0.5397 and D_(1) = (3/(3 − 0) − E_(1))² + (2/(10 − 3) − E_(1))² + (3/(14 − 5) − E_(1))² ≈ 0.3190. The measures for Figs. 2(b) and (c) are computed similarly. Since D_(3) in Fig. 2(c) is the smallest, inserting job 3 into position 3 is preferred.

4. Improved heuristic

Following the above discussion, better performance can be expected if NEH is modified by using the priority rule AvgDev of Section 2 in phase I and the new strategy of Section 3 in phase II. The improved algorithm, called NEH-D (NEH improved by using Deviation), is as follows.

Algorithm NEH-D:

(1) compute the average processing time AVG_j and the standard deviation STD_j of the processing times for every job j, as defined in Section 2; sort the jobs in non-increasing order of AVG_j + STD_j;


(2) take the first job as the partial sequence, as if there were only one job;
(3) for the kth job, k = 2, ..., n, insert it into the position, among the k possible ones, that minimizes the partial makespan; if there are ties, the position x with minimal D_(x), computed according to Eq. (5), is chosen; if ties still remain, any of the tied positions is acceptable.

The time complexity of step 1 is the same as in NEH. The difference between NEH-D and NEH in step 3 is that NEH-D must compute both the partial makespan and the corresponding D_(x) when inserting a job, whereas NEH computes only the partial makespan. Using the speed-up method of Taillard [17], both quantities can be computed in O(m) per position, so the time complexity of NEH-D is still O(mn²).

5. Computational results

All heuristics are implemented in C++ and run on a Pentium IV 2.4 GHz PC with 256 MB of main memory. The test benchmarks are taken from Taillard [21], which contains 120 particularly hard instances of 12 different sizes, with 10 instances per size; the problem scale ranges from 20 jobs and 5 machines to 500 jobs and 20 machines.

To check the influence of the new strategy under different priority rules, the following experiments are designed. The RPDs of six cases, i.e., three priority rules combined with two search strategies, are computed. The priority rules are denoted Avg, Dev and AvgDev, with the same meanings as in Section 2. The two search strategies are denoted Cmax and NewS, meaning the original search strategy of NEH and the strategy described in Section 3, respectively. The results are shown in Table 2 in terms of average RPD from the upper bounds provided by Taillard [21].
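As a concrete illustration, the whole of NEH-D can be sketched in Python. This is a didactic sketch under our own naming: it recomputes schedules from scratch at every insertion (so it is slower than the O(mn²) bound obtained with Taillard's speed-up in the paper), the boundary cases of Section 3 are handled as described there, and m ≥ 2 machines are assumed for the standard deviation.

```python
def makespan(p, seq):
    """Makespan of the (partial) permutation seq; p[i][j] = time of job j on machine i."""
    c = [0.0] * len(p)
    for j in seq:
        c[0] += p[0][j]
        for i in range(1, len(p)):
            c[i] = max(c[i], c[i - 1]) + p[i][j]
    return c[-1]

def d_measure(p, seq, x):
    """D_(x) of Eq. (5) for the job at 0-based position x of seq.

    Earliest completions C come from a forward pass, latest starts S from a
    backward (tail) pass; boundary cases follow the notes in Section 3.
    """
    m, k = len(p), len(seq)
    C = [[0.0] * m for _ in range(k)]          # earliest completion times
    for pos, j in enumerate(seq):
        for i in range(m):
            prev_job = C[pos - 1][i] if pos else 0.0
            prev_mch = C[pos][i - 1] if i else 0.0
            C[pos][i] = max(prev_job, prev_mch) + p[i][j]
    cmax = C[k - 1][m - 1]
    tail = [[0.0] * m for _ in range(k)]       # critical-path tails
    for pos in range(k - 1, -1, -1):
        for i in range(m - 1, -1, -1):
            nxt_job = tail[pos + 1][i] if pos < k - 1 else 0.0
            nxt_mch = tail[pos][i + 1] if i < m - 1 else 0.0
            tail[pos][i] = max(nxt_job, nxt_mch) + p[i][seq[pos]]
    S = [[cmax - tail[pos][i] for i in range(m)] for pos in range(k)]
    j = seq[x]
    ratios = []
    for i in range(m):
        lo = C[x - 1][i] if x > 0 else C[x][i] - p[i][j]      # earliest start of first job
        hi = S[x + 1][i] if x < k - 1 else S[x][i] + p[i][j]  # latest completion of last job
        gap = hi - lo
        ratios.append(p[i][j] / gap if gap else 0.0)
    e = sum(ratios) / m                         # E_(x) of Eq. (4)
    return sum((r - e) ** 2 for r in ratios)

def neh_d(p):
    """Phase I: sort by AVG_j + STD_j; phase II: insert with D_(x) tie-breaking."""
    m, n = len(p), len(p[0])
    def key(j):
        col = [p[i][j] for i in range(m)]
        avg = sum(col) / m
        return -(avg + (sum((t - avg) ** 2 for t in col) / (m - 1)) ** 0.5)
    seq = []
    for j in sorted(range(n), key=key):
        cands = [seq[:x] + [j] + seq[x:] for x in range(len(seq) + 1)]
        spans = [makespan(p, c) for c in cands]
        best = min(spans)
        tied = [x for x, s in enumerate(spans) if s == best]
        pick = min(tied, key=lambda x: d_measure(p, cands[x], x))
        seq = cands[pick]
    return seq, makespan(p, seq)
```

On the toy two-machine instance p = [[3, 1, 2], [1, 3, 2]], neh_d returns the sequence [1, 2, 0] with makespan 7, which coincides with the optimum given by Johnson's rule.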
From the table, it can be seen that, for a given initial job sequence, better solutions are constructed overall by using the new search strategy. The new strategy does not perform well on all instances, however: it performs slightly worse on small instances, e.g., those with 20 jobs and 5 machines, while on relatively large instances it performs quite well. NEH-D (the AvgDev + NewS case) is better than NEH (the Avg + Cmax case) for all sizes except 50 jobs and 5 machines, and overall NEH-D improves the average RPD by 0.417.

Next, seven one-sided paired-samples t-tests are carried out. The results are listed in Table 3, where each p-value is the minimum significance level at which H0 is rejected, i.e., H0 must be rejected at every significance level α ≥ p. The hypotheses read as follows: "x = y" means there is no statistically significant difference between x and y; "x < y" means that x performs worse than y. From Table 3, even at α = 0.001 there are statistically significant differences between the algorithms with and without the new

Table 2
Influence of the new strategy on Taillard's benchmarks (the Avg + Cmax case is the original NEH)

         Avg            Dev            AvgDev
n|m      Cmax   NewS    Cmax   NewS    Cmax   NewS
20|5     3.091  2.442   3.107  3.314   2.662  2.772
20|10    5.025  4.519   4.576  4.152   4.084  3.752
20|20    3.668  3.703   4.571  4.588   3.816  3.644
50|5     0.478  0.595   1.228  1.668   0.878  0.716
50|10    4.226  3.636   4.656  3.899   3.971  3.732
50|20    5.219  4.752   5.691  5.214   5.154  4.848
100|5    0.379  0.421   0.592  0.520   0.378  0.369
100|10   2.281  1.660   2.437  1.993   1.887  1.431
100|20   3.675  2.997   3.841  3.687   3.889  3.228
200|10   1.078  0.843   1.372  0.982   1.052  0.738
200|20   2.514  1.824   2.660  2.120   2.649  1.848
500|20   1.257  0.938   1.439  1.015   1.279  0.806
All      2.741  2.361   3.014  2.763   2.642  2.324


Table 3
One-sided paired-samples t-test results for the influence of the new strategy

H0                                H1                                p
Avg + Cmax = Avg + NewS           Avg + Cmax < Avg + NewS           0.000
Dev + Cmax = Dev + NewS           Dev + Cmax < Dev + NewS           0.001
AvgDev + Cmax = AvgDev + NewS     AvgDev + Cmax < AvgDev + NewS     0.000
Dev + NewS = Avg + NewS           Dev + NewS < Avg + NewS           0.000
Avg + NewS = AvgDev + NewS        Avg + NewS < AvgDev + NewS        0.334
Dev + NewS = AvgDev + NewS        Dev + NewS < AvgDev + NewS        0.000
Avg + Cmax = AvgDev + NewS        Avg + Cmax < AvgDev + NewS        0.000
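The paired t-tests above can be reproduced along the following lines. This is a sketch with hypothetical data; for 119 degrees of freedom the t distribution is very close to normal, so the standard library's NormalDist is used here in place of an exact t CDF (which would require an external package such as scipy).

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def one_sided_paired_p(rpd_a, rpd_b):
    """Approximate p-value for H0: A = B versus H1: A < B, i.e. A performs
    worse (has higher RPD) than B, from paired per-instance RPDs.

    Uses the normal approximation to the t distribution, adequate for
    large degrees of freedom such as the paper's 119.
    """
    d = [a - b for a, b in zip(rpd_a, rpd_b)]
    t = mean(d) / (stdev(d) / sqrt(len(d)))
    return 1.0 - NormalDist().cdf(t)
```

With 120 paired RPDs per comparison (119 degrees of freedom), a p-value below the chosen significance level rejects H0 in favour of "A performs worse than B".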

Table 4
CPU time comparison on Taillard's benchmarks (total seconds over 100 runs)

n|m      NEH     NEH-D
20|5     0.17    0.20
20|10    0.30    0.36
20|20    0.55    0.67
50|5     0.42    0.59
50|10    0.74    1.10
50|20    1.38    2.07
100|5    0.85    1.54
100|10   1.52    2.90
100|20   2.88    5.60
200|10   3.26    8.71
200|20   6.23    17.06
500|20   18.77   86.47

strategy. Since the Avg + Cmax case is the original NEH, both the Avg + NewS case and the AvgDev + NewS case are better than NEH. There is no statistically significant difference between the Avg + NewS case and the AvgDev + NewS case (i.e., NEH-D), although the latter appears slightly better overall in Table 2.

Recently, Kalczynski et al. [13] proposed the heuristic NEHKK and claimed that it performed better than NEH. We implemented their heuristic and ran it on Taillard's benchmarks [21]. Its average RPD over the upper bounds provided by Taillard [21] is 2.549, which supports their claim. Two one-sided paired-samples t-tests were carried out, comparing NEHKK with NEH and with NEH-D, respectively. The results show that NEHKK is better than NEH but worse than NEH-D, with p-values of 0.005 and 0.009, respectively. On this evidence, NEH-D performs better than NEHKK.

To compare run times, we ran NEH and NEH-D 100 times each, independently, collected the total CPU time in seconds and averaged it by problem size. The results are reported in Table 4. NEH-D is quite efficient, although it is more time-consuming than NEH: it can solve a 500-job, 20-machine instance in about 0.86 s.

6. Conclusions

The NEH heuristic is so effective that no more effective constructive algorithm was proposed for more than 20 years after its publication. In this paper, several priority rules are studied, and a new strategy is proposed for resolving the job-insertion ties that may arise in the original NEH heuristic. Experimental results show that the priority rule combining the average processing times of jobs with their standard deviations is not statistically significantly better than the rule used in NEH, though it yields slightly better performance; the new tie-breaking strategy, however, does improve the performance of NEH significantly.
Based on these results, the more effective heuristic NEH-D is proposed, and experimental results show that NEH-D is indeed more effective than the original NEH.


Acknowledgments

The authors would like to thank the two anonymous reviewers for their valuable suggestions and comments, and Dr. Pawel J. Kalczynski for his help with the NEHKK heuristic. This work is partially funded by the National Basic Research Program of China (Project Ref. 2006CB705500).

References

[1] Johnson SM. Optimal two- and three-stage production schedules with setup times included. Naval Research Logistics Quarterly 1954;1(1):61–8.
[2] Garey MR, Johnson DS, Sethi R. The complexity of flowshop and jobshop scheduling. Mathematics of Operations Research 1976;1:117–29.
[3] Page ES. An approach to the scheduling of jobs on machines. Journal of the Royal Statistical Society, Series B 1961;3(2):484–92.
[4] Palmer DS. Sequencing jobs through a multi-stage process in the minimum total time: a quick method of obtaining a near-optimum. Operational Research Quarterly 1965;16:101–7.
[5] Campbell HG, Dudek RA, Smith ML. A heuristic algorithm for the n job, m machine sequencing problem. Management Science 1970;16:B630–7.
[6] Gupta JND. A functional heuristic algorithm for the flowshop scheduling problem. Operational Research Quarterly 1971;22(1):39–47.
[7] Gupta JND. Heuristic algorithms for multistage flowshop scheduling problem. AIIE Transactions 1972;4:11–8.
[8] Dannenbring DG. An evaluation of flowshop sequencing heuristics. Management Science 1977;23:1174–82.
[9] Nawaz M, Enscore EE, Ham I. A heuristic algorithm for the m-machine, n-job flowshop sequencing problem. OMEGA 1983;11:91–5.
[10] Hundal TS, Rajgopal J. An extension of Palmer's heuristic for the flow shop scheduling problem. International Journal of Production Research 1988;26(6):1119–24.
[11] Koulamas C. A new constructive heuristic for the flowshop scheduling problem. European Journal of Operational Research 1998;105:66–71.
[12] Li XP, Wang YX, Wu C. Heuristic algorithms for large flowshop scheduling problems. In: Proceedings of the 5th world congress on intelligent control and automation, Hangzhou, China; 2004. p. 2999–3003.
[13] Kalczynski PJ, Kamburowski J. On the NEH heuristic for minimizing the makespan in permutation flow shops. OMEGA 2007;35:53–60.
[14] Woo HS, Yim DS. A heuristic algorithm for mean flowtime objective in flowshop scheduling. Computers and Operations Research 1998;25(3):175–82.
[15] Framinan JM, Leisten R, Ruiz-Usano R. Efficient heuristics for flowshop sequencing with the objectives of makespan and flowtime minimisation. European Journal of Operational Research 2002;141:559–69.
[16] Framinan JM, Leisten R. An efficient constructive heuristic for flowtime minimisation in permutation flow shops. OMEGA 2003;31:311–7.
[17] Taillard E. Some efficient heuristic methods for the flow shop sequencing problem. European Journal of Operational Research 1990;47:65–74.
[18] Ruiz R, Maroto C. A comprehensive review and evaluation of permutation flowshop heuristics. European Journal of Operational Research 2005;165:479–94.
[19] Pour HD. A new heuristic for the n-job, m-machine flowshop problem. Production Planning and Control 2001;12:648–53.
[20] Nagano MS, Moccellin JV. A high quality solution constructive heuristic for flow shop sequencing. Journal of the Operational Research Society 2002;53:1374–9.
[21] Taillard E. Benchmarks for basic scheduling problems. European Journal of Operational Research 1993;64:278–85.