Exact and heuristic algorithms for parallel-machine scheduling with DeJong’s learning effect


Computers & Industrial Engineering 59 (2010) 272–279



Computers & Industrial Engineering journal homepage: www.elsevier.com/locate/caie

Dariusz Okołowski, Stanisław Gawiejnowicz *

Adam Mickiewicz University, Faculty of Mathematics and Computer Science, Umultowska 87, 61-614 Poznań, Poland

Article info

Article history: Received 22 September 2008; received in revised form 19 March 2010; accepted 20 April 2010; available online 24 April 2010.

Keywords: Parallel-machine scheduling; Learning effect; Branch-and-bound algorithms; Heuristic scheduling algorithms

Abstract

We consider a parallel-machine scheduling problem with a learning effect and the makespan objective. The impact of the learning effect on job processing times is modelled by the general DeJong’s learning curve. For this NP-hard problem we propose two exact algorithms: a sequential branch-and-bound algorithm and a parallel branch-and-bound algorithm. We also present the results of experimental evaluation of these algorithms on a computational cluster. Finally, we use the exact algorithms to estimate the performance of two greedy heuristic scheduling algorithms for the problem. © 2010 Elsevier Ltd. All rights reserved.

This manuscript was processed by Area Editor Maged M. Dessouky.
* Corresponding author. Tel.: +48 61 829 5334; fax: +48 61 829 5315. E-mail address: [email protected] (S. Gawiejnowicz).
0360-8352/$ - see front matter © 2010 Elsevier Ltd. All rights reserved. doi:10.1016/j.cie.2010.04.008

1. Introduction

Scheduling problems concerning multi-machine production environments are encountered in many modern manufacturing processes (Leung, 2004; Pinedo, 2008). Since the classical scheduling theory (Conway, Maxwell, & Miller, 1967) turned out to be too rigid for some of these environments, the theory started to evolve in the early 1990s. This led to the rise of new scheduling models such as scheduling with controllable job processing times (Shabtay & Steiner, 2007), scheduling jobs with time-dependent processing times (Gawiejnowicz, 2008) and multiprocessor task scheduling (Drozdowski, 2009). Scheduling models with the so-called learning effect (Biskup, 2008), which we consider in this paper, constitute an important group of such new models.

The impact of the learning effect on production issues was first discussed in 1936 by Wright (Biskup, 2008), who observed that learning may decrease the processing times of production tasks in the aircraft industry. The observation was later confirmed by many empirical studies showing that knowledge of a learning curve during the planning process may result in cost savings in manufacturing, industrial production and software engineering (Badiru, 1992; Globerson & Seidmann, 1988; Raccoon, 1995).

A general learning effect is modelled in scheduling theory by the assumption that the processing time of a job is a function of the job position in a schedule. Different models of the learning effect are known in the literature, and they lead to distinct forms of the function (see, e.g., Bachman & Janiak, 2004; Biskup, 2008; Dondeti & Mohanty, 1998; Gawiejnowicz, 1996, for details). Thus, since the learning effect allows human factors to be taken into account in scheduling, problems similar to those mentioned above belong to intensively studied topics in scheduling research (Lodree, Geiger, & Jiang, 2009).

Throughout the paper, we will consider the following scheduling problem with a learning effect. We are given a set of jobs $J_1, J_2, \ldots, J_n$ which have to be processed on machines $M_1, M_2, \ldots, M_m$. All jobs are available for processing at time 0 and job preemption is not allowed. The processing time of job $J_j$ is described by DeJong’s learning curve, $p_{j,r} = p_j(M + (1 - M)r^a)$, where $p_j$ is the initial job processing time, $a \le 0$ is the learning index, $M$ is the incompressibility factor, $r$ is the current position of a job in a given schedule and $1 \le j, r \le n$. To the best of our knowledge, no scheduling problems with this form of a learning effect have been considered earlier.

Function $p_{j,r}$, introduced by DeJong (Badiru, 1992), mirrors the impact of a learning effect on job processing times and is a generalization of both the log-linear learning curve, $p_{j,r} = p_j r^a$, and the case of fixed job processing times, $p_{j,r} = p_j$. (The first case is obtained for $M = 0$, the second one for $M = 1$.) The main advantage of DeJong’s model follows from the fact that parameter $M$ represents the part of job processing time that is limited by some conditions and cannot be shortened. Different values of $M$ are recommended in the literature. For example, DeJong suggests $M = 0.25$ for labour-intensive jobs and $M = 0.5$ for machine-intensive jobs (Raccoon, 1995). Throughout the paper, we assume that $M \in [0, 1]$.

The criterion of schedule optimality in our problem is to minimize the makespan, $C_{\max} := \max\{C_j : 1 \le j \le n\}$, where $C_j$ is the completion time of job $J_j$. Extending the three-field notation (Graham,


Lawler, Lenstra, & Rinnooy Kan, 1979), we will denote the problem as $Pm \mid p_{j,r} = p_j(M + (1 - M)r^a) \mid C_{\max}$.

The contribution of the paper is threefold. First, we prove some basic properties of the problem. Next, on the basis of these properties, we propose two exact algorithms for the problem: a sequential branch-and-bound algorithm and a parallel branch-and-bound algorithm. To the best of our knowledge, no branch-and-bound algorithms for parallel-machine scheduling problems with a learning effect have been proposed earlier. We also present the results of experiments conducted on a computational cluster in order to evaluate the quality of schedules generated by our exact algorithms. Finally, we discuss the application of two greedy heuristic scheduling algorithms to the problem, comparing schedules generated by the heuristics with schedules generated by the branch-and-bound algorithms.

The remainder of the paper is organized as follows. In Section 2, we present a brief review of recent research on multi-machine scheduling with a learning effect. In Section 3, we prove basic properties of the considered problem. In Sections 4 and 5, we describe the sequential and the parallel branch-and-bound algorithm for the problem, respectively. In Section 6, we discuss two greedy heuristic scheduling algorithms for the problem. In Section 7, we present the results of computational experiments conducted on a computational cluster for both the exact algorithms and the greedy scheduling algorithms. We conclude the paper in Section 8 with remarks about future research.
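As an illustration of DeJong’s learning curve introduced above, the following minimal Python sketch evaluates the actual processing time of a job placed in position r. The function name `dejong_time` and the argument checks are assumptions made for this illustration only; they do not come from the authors’ implementation.

```python
def dejong_time(p, r, a, M):
    """Actual processing time p * (M + (1 - M) * r**a) of a job in position r.

    p: initial processing time, r: position (1-based),
    a: learning index (a <= 0), M: incompressibility factor (0 <= M <= 1).
    """
    if r < 1:
        raise ValueError("positions are numbered from 1")
    if a > 0 or not 0.0 <= M <= 1.0:
        raise ValueError("expected a <= 0 and 0 <= M <= 1")
    return p * (M + (1.0 - M) * r ** a)


# The two special cases mentioned in the text:
log_linear = dejong_time(10.0, 4, -0.5, 0.0)  # M = 0 gives p * r**a
fixed = dejong_time(10.0, 4, -0.5, 1.0)       # M = 1 gives p, no learning
```

For M strictly between 0 and 1, the value lies between these two extremes and never drops below M times the initial processing time.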


2. Previous research

In this section, we briefly review the most recent literature concerning multi-machine scheduling problems with a learning effect and branch-and-bound algorithms proposed for problems of this type.

2.1. Multi-machine scheduling with a learning effect

Below we describe the main results concerning scheduling with a learning effect on parallel and dedicated machines. Since some of them are discussed in Biskup (2008), we mention only those that were published after that review.

We begin with parallel-machine scheduling problems. Mosheiov (2008) has shown that the problem of minimizing the total absolute deviation of job completion times, $\sum_i \sum_j |C_i - C_j|$, subject to a general learning effect can be solved in polynomial time. Toksarı and Güner (2008) considered a parallel-machine earliness/tardiness scheduling problem with simultaneous effects of learning and linear deterioration, sequence-dependent setups and a common due date for all jobs, and gave a mixed non-linear integer programming formulation of the problem. Eren (2009a, 2009b) and Toksarı and Güner (2009a, 2009b) proposed a mathematical programming formulation of a parallel-machine scheduling problem with a learning effect of setup times and removal times, and the objective to minimize the weighted sum of total completion time and total tardiness, $\alpha \sum C_j + \beta \sum T_j$, where $T_j := \max\{0, C_j - d_j\}$, $\alpha > 0$ and $\beta > 0$. All these papers concern job processing times with the log-linear learning effect, $p_{j,r} = p_j r^a$ for $1 \le j \le n$.

Now we pass to dedicated-machine scheduling problems with a learning effect. Cheng, Wu, and Lee (2008a, 2008b) proposed polynomial-time optimal solutions for some special cases of multi-machine flow shop problems with the $C_{\max}$ and $\sum C_j$ criteria, in which the actual processing time of each job is a function of the total normal processing times of the jobs already processed and of the position of the job in a schedule. Eren and Güner (2008) considered a two-machine flow shop problem with the log-linear learning effect and with the objective to minimize the weighted sum of total completion time, $\sum w_j C_j$, and the $C_{\max}$. In order to solve this problem, the authors proposed an integer programming formulation. Wang (2008) has shown by a counter-example that some of the results presented by Cheng, Sun, and Yu (2007) for a multi-machine flow shop with the log-linear learning effect and the $C_{\max}$ criterion are not correct. Xu, Sun, and Gong (2008) analyzed the worst-case behaviour of single-machine optimal job sequences applied to multi-machine flow shop scheduling with a general learning effect and with the following three criteria: the $\sum w_j C_j$, the discounted total weighted completion time, $\sum w_j (1 - e^{-rC_j})$, and the sum of the quadratic job completion times, $\sum C_j^2$. Lee and Wu (2009a, 2009b), Lee, Lai, and Wu (2010) and Yin, Xu, Sun, and Li (2009) have shown polynomial-time solvability of some multi-machine flow shop problems with the $C_{\max}$, the maximum lateness, $L_{\max} := \max\{C_j - d_j : 1 \le j \le n\}$, the $\sum C_j$ and the $\sum w_j C_j$ criteria and with a learning effect that depends not only on the position of a job in a schedule but also on the processing times of the jobs already processed. Zhang and Yan (2010) obtained similar results for some flow shop scheduling problems with another form of a learning effect and with the $C_{\max}$, $\sum C_j$ and $L_{\max}$ criteria. Kuo and Yang (2010) have shown by a counter-example that some of the latter results are incorrect.

2.2. Branch-and-bound algorithms

In this section, we briefly review branch-and-bound algorithms proposed for multi-machine scheduling problems with a learning effect. All algorithms from this group concern two-machine flow shop problems, since no branch-and-bound algorithms for other multi-machine scheduling problems with a learning effect have been proposed.

Lee and Wu (2004) proposed a branch-and-bound algorithm for the $\sum C_j$ criterion. The algorithm was coded in Fortran 77, run on a Pentium 4 PC and tested on 300 instances with $n \le 35$ jobs. Chen, Wu, and Lee (2006) proposed a branch-and-bound algorithm for the criterion $\lambda \sum C_j + (1 - \lambda) T_{\max}$, where $\lambda \in (0, 1)$, and tested it on 7200 instances with $n \le 15$ jobs. Wu, Lee, and Wang (2007) proposed a branch-and-bound algorithm for the $T_{\max}$ criterion. The algorithm was tested on 6000 instances with $n \le 14$ jobs. Wu et al. (2007) and Wu and Lee (2009) proposed branch-and-bound algorithms for the $\sum C_j$ criterion. These algorithms have been tested on instances with $n \le 16$ jobs and two, three or five machines. All the algorithms were coded in Fortran 90 and run on a Pentium 4 PC.

3. Problem properties

In this section, we prove some properties of the considered problem, which will be used in subsequent sections of the paper.

Problem $Pm \mid p_{j,r} = p_j(M + (1 - M)r^a) \mid C_{\max}$ is a generalization of the ordinary NP-hard problem $Pm \| C_{\max}$.

Property 1. Problem $Pm \mid p_{j,r} = p_j(M + (1 - M)r^a) \mid C_{\max}$ is ordinary NP-hard.

A general version of the problem, $P \mid p_{j,r} = p_j(M + (1 - M)r^a) \mid C_{\max}$, is a generalization of the strongly NP-hard problem $P \| C_{\max}$.

Property 2. Problem $P \mid p_{j,r} = p_j(M + (1 - M)r^a) \mid C_{\max}$ is strongly NP-hard.

Since the processing time of a particular job $J_j$ depends only on its position in the sequence and its initial processing time, we obtain the next two properties.


Property 3. In any schedule for problem $Pm \mid p_{j,r} = p_j(M + (1 - M)r^a) \mid C_{\max}$, a pairwise job interchange among positions $q, \ldots, q + k$, where $1 \le q \ne k \le n$ and $2 \le q + k \le n$, does not affect the actual processing times of jobs in positions $1, \ldots, q - 1, q + k + 1, \ldots, n$.

Property 4. In any schedule for problem $Pm \mid p_{j,r} = p_j(M + (1 - M)r^a) \mid C_{\max}$, the processing times of jobs scheduled on machine $M_k$, $1 \le k \le m$, do not depend on job sequences on other machines.

The next property concerns artificial idle times, i.e. time intervals of non-zero length that have been inserted after the completion of some jobs.

Property 5. For problem $Pm \mid p_{j,r} = p_j(M + (1 - M)r^a) \mid C_{\max}$ there exist optimal schedules that do not include artificial idle times.

Proof. Let $\sigma$ be an optimal schedule for problem $Pm \mid p_{j,r} = p_j(M + (1 - M)r^a) \mid C_{\max}$ with an artificial idle time interval $[t_1, t_2]$, where $0 \le t_1 < t_2$. Move all jobs scheduled in $\sigma$ after $t_2$ earlier by $t_2 - t_1 > 0$ units of time and denote the new schedule by $\sigma'$. Notice that all job processing times in $\sigma'$ remain the same as in $\sigma$, since no job changes its position. Therefore, $C_j(\sigma') = C_j(\sigma)$ for $j \in \{i : J_i \text{ ends before } t_1\}$ and $C_j(\sigma') \le C_j(\sigma)$ for $j \in \{i : J_i \text{ ends at or after } t_1\}$. Hence, $C_j(\sigma') \le C_j(\sigma)$ for $1 \le j \le n$, which implies that $C_{\max}(\sigma') \le C_{\max}(\sigma)$. Thus, schedule $\sigma'$ is not worse than schedule $\sigma$. Repeating the above procedure for other artificial idle times in $\sigma'$, we obtain a schedule $\sigma^*$ that does not contain artificial idle times and which is not worse than schedule $\sigma$. □

Before we formulate the next property, we illustrate a specific feature of scheduling problems with DeJong’s learning effect.

Example 1. Let an instance of problem $Pm \mid p_{j,r} = p_j(M + (1 - M)r^a) \mid C_{\max}$ be defined as follows: $m = 2$, $n = 21$, $p_j = 1$ for $1 \le j \le 20$, $p_{21} = 50$, $a = -0.322$ and $M = 0$. Let $\sigma' = ((J_1, \ldots, J_{20}), (J_{21}))$ denote the schedule in which jobs $J_1, \ldots, J_{20}$ are scheduled in the given order on machine $M_1$ and job $J_{21}$ is scheduled on machine $M_2$. Then $C_{\max}(\sigma') = 50$. However, if by $\sigma'' = ((J_1, \ldots, J_{21}), (\emptyset))$ we denote the schedule in which all jobs are scheduled on machine $M_1$ and machine $M_2$ is idle, then $C_{\max}(\sigma'') \approx 29 < C_{\max}(\sigma')$.

Example 1 illustrates the next property of our problem: schedules with artificially idle machines (i.e. machines that are idle although there exist jobs available for processing) do not necessarily have larger objective values than schedules in which all machines are busy.

Property 6. For problem $Pm \mid p_{j,r} = p_j(M + (1 - M)r^a) \mid C_{\max}$ there exist schedules with artificially idle machines that have a smaller makespan than schedules without artificially idle machines.

In view of Property 6, problem $Pm \mid p_{j,r} = p_j(M + (1 - M)r^a) \mid C_{\max}$ is more difficult to solve than its counterpart $Pm \| C_{\max}$ with fixed job processing times. However, if all jobs have unit basic processing times, $p_j = 1$ for $1 \le j \le n$, then the first problem is similar to the second one.

Property 7. For problem $Pm \mid p_{j,r} = M + (1 - M)r^a \mid C_{\max}$ there exist optimal schedules in which no machine is artificially idle.

Proof. Let $\sigma$ be a schedule such that $h$ jobs are assigned to machine $M_{k_1}$, while $h + 2$ jobs are assigned to machine $M_{k_2}$, where $1 \le k_1 \ne k_2 \le m$ and $1 < h < n$. Assume also that machine $M_{k_1}$ is artificially idle after the completion of the $h$th job. Denote by $J_f$ the last job processed on machine $M_{k_2}$. Then

$$C_f(\sigma) = \sum_{j=1}^{h+2} (M + (1 - M)j^a).$$

Let $\sigma'$ be the modified schedule $\sigma$ in which job $J_f$ is scheduled in the $(h + 1)$st position on $M_{k_1}$. Then

$$C_f(\sigma') = \sum_{j=1}^{h+1} (M + (1 - M)j^a) \le C_f(\sigma).$$

Since, by Properties 3 and 4, $C_j(\sigma') = C_j(\sigma)$ for all $1 \le j \ne f \le n$, we have $C_{\max}(\sigma') \le C_{\max}(\sigma)$. Repeating the above procedure for other artificially idle machines in $\sigma'$, we obtain a schedule $\sigma^*$ in which no machine is artificially idle and which is not worse than schedule $\sigma$. □

Before we formulate the next property, we recall an auxiliary result. Let $x_j \nearrow$ and $x_j \searrow$ denote sequence $x_j$ arranged in non-decreasing and non-increasing order, respectively.

Lemma 1 (Hardy, Littlewood, and Pólya, 1934, p. 261). The sum of products $\sum_{j=1}^{n} x_j y_j$ of elements of number sequences $x_1, x_2, \ldots, x_n$ and $y_1, y_2, \ldots, y_n$ is minimized if the first sequence is arranged in the $x_j \nearrow$ order and the second sequence is arranged in the $y_j \searrow$ order or vice versa, and it is maximized if the sequences are ordered in the same way.

Property 8. If $m = 1$, then in any optimal schedule for problem $Pm \mid p_{j,r} = p_j(M + (1 - M)r^a) \mid C_{\max}$ jobs are arranged in the $p_j \nearrow$ order.

Proof. Let $\sigma$ be an optimal schedule for the considered problem and let $p_{[j]}$ denote the processing time of the $j$th job in the schedule. Since

$$C_{\max}(\sigma) = \sum_{j=1}^{n} p_{[j]}(M + (1 - M)j^a)$$

and sequence $(M + (1 - M)j^a)$ is non-increasing for $1 \le j \le n$, by Lemma 1 for $x_j \equiv p_{[j]}$ and $y_j \equiv M + (1 - M)j^a$, the $C_{\max}$ is minimal if sequence $p_{[j]}$ is arranged in the $p_j \nearrow$ order. □

From Property 8 we immediately obtain the next property.

Property 9. If $m \ge 2$, then in any optimal schedule for problem $Pm \mid p_{j,r} = p_j(M + (1 - M)r^a) \mid C_{\max}$ jobs assigned to machine $M_k$, $1 \le k \le m$, are arranged in the $p_j \nearrow$ order.

We complete the section with two properties concerning lower and upper bounds on the value $C^*_{\max}$ of the optimal makespan for the considered problem.

Property 10. For problem $Pm \mid p_{j,r} = p_j(M + (1 - M)r^a) \mid C_{\max}$ the following inequality holds:

$$C^*_{\max} \ge \max\left\{ \max_{1 \le j \le n}\{p_j\}\,(M + (1 - M)n^a),\ \frac{1}{m}\sum_{j=1}^{n} p_{(j)}(M + (1 - M)j^a) \right\},$$

where $p_{(j)}$ denotes the $j$th smallest initial processing time, $1 \le j \le n$.

Proof. The inequality follows from the fact that $C^*_{\max}$ is at least as large as the processing time of the largest job scheduled in the last position in a schedule and as the average machine load of a schedule in which all jobs are arranged in the $p_j \nearrow$ order. □

Property 11. For problem $Pm \mid p_{j,r} = p_j(M + (1 - M)r^a) \mid C_{\max}$ the following inequality holds:

$$C^*_{\max} \le \sum_{j=1}^{n} p_{[j]}(M + (1 - M)j^a).$$

Proof. The inequality follows from the fact that, by Property 6, the $C^*_{\max}$ for a multi-machine problem cannot be larger than the makespan of a single-machine problem in which all jobs are arranged in the $p_j \searrow$ order. □
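The makespan computations of Example 1 and the bounds of Properties 10 and 11 can be checked numerically. The Python sketch below is only an illustration under stated assumptions: a schedule is represented as a list of per-machine lists of initial processing times, and the helper names (`machine_time`, `makespan`, `lower_bound`, `upper_bound`) are hypothetical, not taken from the authors’ code.

```python
def machine_time(times, a, M):
    """Completion time of one machine: the job in position r takes p*(M+(1-M)*r**a)."""
    return sum(p * (M + (1.0 - M) * r ** a) for r, p in enumerate(times, start=1))


def makespan(schedule, a, M):
    """Makespan of a schedule given as a list of per-machine job sequences."""
    return max(machine_time(seq, a, M) for seq in schedule)


def lower_bound(p, m, a, M):
    """Bound of Property 10: largest job in the last position vs. average SPT load."""
    n = len(p)
    last = max(p) * (M + (1.0 - M) * n ** a)
    average = machine_time(sorted(p), a, M) / m
    return max(last, average)


def upper_bound(p, a, M):
    """Bound of Property 11: all jobs on one machine in non-increasing order."""
    return machine_time(sorted(p, reverse=True), a, M)


# Instance of Example 1: twenty unit jobs and one job with p = 50.
p = [1.0] * 20 + [50.0]
a, M = -0.322, 0.0
two_machines = makespan([p[:20], p[20:]], a, M)  # long job alone on the second machine
one_machine = makespan([p, []], a, M)            # all jobs on the first machine
```

Here `two_machines` reproduces the value 50 from Example 1, while `one_machine` is the smaller makespan obtained when the second machine is left idle; both bounds enclose it.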

4. Sequential branch-and-bound algorithm

In this section, we present the first exact algorithm that we propose for the considered problem. This sequential algorithm is the basis for the parallel algorithm presented in Section 5.

4.1. Branching and coding

Branching, i.e. the decomposition of a problem into smaller and smaller subproblems in order to reach a level of granularity that leads to easily solvable cases, is the first basic component of any branch-and-bound algorithm (Gendron & Crainic, 1994; Ibaraki, 1987; Lawler & Wood, 1966). The main issue here is the right coding of subproblems and partial solutions. In the case of problem $Pm \mid p_{j,r} = p_j(M + (1 - M)r^a) \mid C_{\max}$, by Property 5, jobs assigned to each machine are processed without artificial idle times. Hence, by Properties 3, 4 and 9, any schedule can be coded by a permutation with repetitions of the set $\{1, 2, \ldots, m\}$. Notice, however, that those permutations with repetitions that lead to schedules with artificially idle machines cannot, in view of Property 6, be omitted, and that jobs assigned to any machine must be in the $p_j \nearrow$ order.

Example 2. Permutation with repetitions (1, 2, 2, 1, 1, 3) codes a schedule in which jobs $J_1$, $J_4$ and $J_5$ are assigned to machine $M_1$, jobs $J_2$ and $J_3$ are assigned to machine $M_2$ and job $J_6$ is assigned to machine $M_3$, preserving the $p_j \nearrow$ order on each machine.

4.2. Bounding

Bounding, i.e. restricting the given set of partial solutions to a small set which includes an optimal solution, is the second basic component of any branch-and-bound algorithm. The first element of the bounding is a lower bound that determines which partial solutions should be considered further. In the case of problem $Pm \mid p_{j,r} = p_j(M + (1 - M)r^a) \mid C_{\max}$, to the best of our knowledge, no lower bounds are known. Thus, in our algorithm we applied the lower bound from Property 10.

The second element of the bounding is the strategy of searching the tree of all solutions. We have tested two search strategies. The first was the Depth-First Search (DFS) strategy, which is often used in branch-and-bound algorithms since it is easy to implement. The DFS strategy starts at the root of a given tree and explores the tree as far as possible before backtracking. The second tested strategy was the Best-First Search (BFS). This strategy uses a lower bound to determine which partial solutions should be considered first. Preliminary experiments have shown that in the case of problem $Pm \mid p_{j,r} = p_j(M + (1 - M)r^a) \mid C_{\max}$ the DFS strategy is better than BFS. Therefore, in our algorithm we applied the DFS strategy.

4.3. General scheme

Below (see Algorithms 4.1 and 4.2) we present the sequential branch-and-bound algorithm in pseudocode. Symbol Cmax(c_sol, inst_desc) denotes the function which returns the $C_{\max}$ for schedule c_sol and instance inst_desc of our problem. Function InitialSolution(inst_desc) returns an initial schedule for instance inst_desc. Function CreateExtensionsTable(c_sol, inst_desc) creates a table of extensions of the solution c_sol, used by the search strategy. Symbol ∘ denotes the operator of concatenation of two partial solutions. The initial value of the best current solution, which is an upper bound on the optimal makespan, is equal to Cmax(InitialSolution(inst_desc), inst_desc).

The main part of the sequential branch-and-bound algorithm is procedure RecursiveSearch (see Algorithm 4.1) with the following parameters: the description of an instance (inst_desc, which includes n, m, $p_1, \ldots, p_n$, a and M), the present level of the tree of all solutions (depth), a partial solution (c_sol), the best current value of the $C_{\max}$ (bcs) and the best current solution (bcv).

Algorithm 4.1. RecursiveSearch(inst_desc, depth, c_sol, bcs, bcv)
    if (depth = n)
        then cbs ← Cmax(c_sol, inst_desc);
             if (cbs < bcs)
                 then bcv ← c_sol;
                      bcs ← cbs;
        else toe ← CreateExtensionsTable(c_sol, inst_desc);
             for i ← 1 to Length(toe)
                 do n_sol ← c_sol ∘ toe[i];
                    RecursiveSearch(inst_desc, depth + 1, n_sol, bcs, bcv);

Notice that procedure RecursiveSearch has to be executed with some initial parameters, as shown by the following procedure SequentialBB.

Algorithm 4.2. SequentialBB(inst_desc)
    bcv ← InitialSolution(inst_desc);
    bcs ← Cmax(bcv, inst_desc);
    RecursiveSearch(inst_desc, 0, NULL, bcs, bcv);
    return (bcv);
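To make the coding by permutations with repetitions and the depth-first search concrete, the following Python sketch enumerates machine assignments of jobs taken in the SPT order. It is a simplified illustration, not the authors’ C++ implementation: the names `weight` and `branch_and_bound` are hypothetical, machines are numbered from 0, and pruning uses only the makespan of the partial schedule (which can never decrease when further jobs are appended) instead of the lower bound from Property 10.

```python
def weight(r, a, M):
    """Position-dependent factor M + (1 - M) * r**a of DeJong's curve."""
    return M + (1.0 - M) * r ** a


def branch_and_bound(p, m, a, M):
    """Return (optimal makespan, machine code) for jobs taken in the SPT order."""
    jobs = sorted(p)  # Property 9: SPT order on every machine
    n = len(jobs)
    # Initial incumbent: all jobs on machine 0 in the SPT order.
    best = {"value": sum(q * weight(r, a, M) for r, q in enumerate(jobs, 1)),
            "code": [0] * n}
    loads, counts, code = [0.0] * m, [0] * m, []

    def search(depth):
        if depth == n:
            value = max(loads)
            if value < best["value"]:
                best["value"], best["code"] = value, code.copy()
            return
        for k in range(m):
            old = loads[k]
            counts[k] += 1
            loads[k] = old + jobs[depth] * weight(counts[k], a, M)
            code.append(k)
            # Explore the subtree only if it can still beat the incumbent.
            if max(loads) < best["value"]:
                search(depth + 1)
            code.pop()
            loads[k] = old
            counts[k] -= 1

    search(0)
    return best["value"], best["code"]
```

The returned code lists, for each job in the sorted order, the index of its machine, so it plays the role of the permutation with repetitions from Example 2. Because identical machines are interchangeable, a practical implementation would also break this symmetry; the sketch omits that for brevity.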

5. Parallel branch-and-bound algorithm

In this section, we describe a parallel implementation of the algorithm presented in Section 4.3.

5.1. Parallelization strategy

The main problem in the parallelization of a sequential algorithm is to find a proper way to divide calculations among available processors. Depending on the nature of the calculations, there are two main strategies of parallelization of branch-and-bound algorithms (Crainic, Cun, & Roucairol, 2006):

• node-based strategy, used when nodes of a given solution tree are computationally intensive and can be processed by parallel processes;
• tree-based strategy, in which the branching and bounding of a given tree of all solutions are divided into units that can be processed in parallel.

In our case, since procedure RecursiveSearch of the sequential algorithm does not perform complex computations and the tree generated by this algorithm is relatively large, we have applied the tree-based strategy.

The second important topic related to any parallel branch-and-bound algorithm is to determine the partition of the tree of all solutions. Our algorithm tries to divide the tree into parts that are as equal as possible and starts with the calculation of the tree level at which the partition will be made. This level depends on the number m of machines in the instance we solve, the number procs_num of parallel computational processes and the control parameter div_par of the partition precision. The deeper the level is, the more balanced the partition becomes, but it also generates more sequential calculations. It should be noticed that branching and bounding in the parallel algorithm do not start from the root, but from the calculated level of parallelization. In some particular instances we can omit some cutting nodes. However, if the control parameter div_par is selected properly, the calculated parallelization level is low. Hence, we did not observe any redundant computations.

After finding the partition, each part of the tree of all solutions can be processed asynchronously, since all the parts are pairwise independent. Fig. 1 shows the partition of a tree on the third level. Each active processor gets a set of solutions from the calculated tree level. (In Fig. 1 we have three such sets, denoted as C1, C2 and C3.) In order to decrease communication, each such set is determined by the solutions with the lowest and the greatest $C_{\max}$ value in this set. Processors go through this data range by calculating successors of permutations with repetitions in the lexicographic order. This saves us from redundant communication, which could lead to weak performance. The synchronization is performed after covering the partial tree on each node. In some particular cases the approach generates some workload imbalance, since the given parts are not always equally computationally intensive. However, in our case it did not have a considerable effect on computation time.

Fig. 1. The idea of tree-based parallelization strategy.

5.2. General scheme

Below, in Algorithm 5.1, we present the parallel branch-and-bound algorithm in pseudocode. Procedure Broadcast(inst_desc, root_id) sends instance description inst_desc from processor root_id to all other processes. Function GetId() returns the network identifier of the process invoking this function. Function CalculateParallelismLevel(m, procs_num, div_par), for the number of machines m, the number of parallel processes procs_num and the parameter div_par, calculates the level at which the tree of all solutions should be divided. Function CreatePartialSolutions(cpu_no, par_lev), for processor id cpu_no and parallelism level par_lev, calculates the partial solutions that will be processed in parallel. Function GiveFirstSchedule(inst_desc) calculates for instance inst_desc the first current best schedule. Function EstablishOptimalSchedule(bcv) changes the locally optimal schedule bcv to the globally optimal schedule by comparing the local solutions of each process.

Algorithm 5.1. SymmetricParallelBB(inst_desc)
    Broadcast(inst_desc, root_id);
    cpu_id ← GetId();
    par_lev ← CalculateParallelismLevel(m, procs_num, div_par);
    pns ← CreatePartialSolutions(cpu_id, par_lev);
    bcv ← GiveFirstSchedule(inst_desc);
    RecursiveSearch(inst_desc, par_lev, pns, bcs, bcv);
    EstablishOptimalSchedule(bcv);
    return (bcv)

Notice also that since Algorithm 5.1 is performed using the SPMD (Single Program, Multiple Data) approach, the same program is run simultaneously on multiple processors with different inputs.

6. Greedy algorithms

By Property 1, we know that $Pm \mid p_{j,r} = p_j(M + (1 - M)r^a) \mid C_{\max}$ is an ordinary NP-hard problem for $m \ge 2$. This means that, unless P = NP, no exact polynomial-time algorithm exists for the problem. Therefore, in order to find a solution to the problem in polynomial time, we should use polynomial-time heuristic algorithms.

Greedy scheduling algorithms are simple but quite effective heuristics for many scheduling problems. One of the most popular greedy heuristics is the Largest Processing Time first (LPT) algorithm, in which jobs are first arranged in the $p_j \searrow$ order and then, as long as there exist unscheduled jobs, the first available job is scheduled on the first available machine. Graham (1969) proved that in the worst case $\frac{C_{\max}(\sigma^{LPT})}{C_{\max}(\sigma^*)} \le \frac{4}{3} - \frac{1}{3m}$, where $\sigma^{LPT}$ and $\sigma^*$ denote the schedule generated by the LPT algorithm and an optimal schedule, respectively. The example given below (Okołowski, 2008) shows that this bound does not hold for problem $Pm \mid p_{j,r} = p_j(M + (1 - M)r^a) \mid C_{\max}$.

Example 3. Let an instance of problem $Pm \mid p_{j,r} = p_j(M + (1 - M)r^a) \mid C_{\max}$ be defined as follows: $m = 2$, $n = 6$, $p_j = 1$ for $1 \le j \le 5$, $p_6 = 20$, learning index $a = -0.5$ and incompressibility factor $M = 0$. Consider for this instance two schedules, $\sigma = ((J_1, \ldots, J_5), (J_6))$ and $\sigma^* = ((J_1, \ldots, J_6), (\emptyset))$. Notice that $\sigma$ is generated by the LPT algorithm and $\sigma^*$ is an optimal schedule. Since $C_{\max}(\sigma) = 20$ and $C_{\max}(\sigma^*) \approx 11.5$, we have $\frac{C_{\max}(\sigma)}{C_{\max}(\sigma^*)} \approx \frac{20}{11.5} \approx 1.74 > \frac{4}{3} - \frac{1}{6} \approx 1.17$.

Schedules with the reversed job order, compared to the one from the LPT algorithm, are generated by the SPT (Shortest Processing Time first) algorithm, which works exactly as the LPT algorithm, with the difference that the $p_j \searrow$ order is replaced by the $p_j \nearrow$ order. Since both algorithms are fast (both run in $O(n \log n)$ time) and, by Property 8, the SPT algorithm is optimal for $m = 1$, a natural question is how good the algorithms are on average.
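The two greedy rules described above can be sketched in Python as plain list scheduling with position-dependent processing times. This is an illustrative sketch under stated assumptions, not the authors’ implementation: the names `list_schedule`, `lpt` and `spt` are hypothetical, and ties between machines are broken in favour of the lowest machine index.

```python
def list_schedule(order, m, a, M):
    """Assign jobs, in the given order, to the machine that becomes free first."""
    loads = [0.0] * m    # current completion time of each machine
    counts = [0] * m     # number of jobs already placed on each machine
    schedule = [[] for _ in range(m)]
    for p in order:
        k = min(range(m), key=lambda i: loads[i])  # earliest free machine
        counts[k] += 1
        loads[k] += p * (M + (1.0 - M) * counts[k] ** a)  # DeJong's curve
        schedule[k].append(p)
    return max(loads), schedule


def lpt(p, m, a, M):
    """Largest Processing Time first: jobs in non-increasing order."""
    return list_schedule(sorted(p, reverse=True), m, a, M)


def spt(p, m, a, M):
    """Shortest Processing Time first: jobs in non-decreasing order."""
    return list_schedule(sorted(p), m, a, M)


# Instance of Example 3.
p = [1.0, 1.0, 1.0, 1.0, 1.0, 20.0]
lpt_value, lpt_schedule = lpt(p, 2, -0.5, 0.0)
spt_value, spt_schedule = spt(p, 2, -0.5, 0.0)
single_value, _ = spt(p, 1, -0.5, 0.0)  # all jobs on one machine, SPT order
```

On this instance LPT places the long job alone on one machine, so `lpt_value` equals 20, whereas leaving the second machine idle (`single_value`) gives a smaller makespan, and the ratio of the two exceeds Graham’s bound for m = 2.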
Section 7.4 presents the results of an experiment conducted to assess the average performance of the SPT and LPT algorithms.

7. Computational experiments

In this section, we present the results of four experiments conducted in order to test the performance of the proposed branch-and-bound algorithms and to evaluate the quality of schedules generated by the greedy algorithms.

7.1. Software and hardware environment

The proposed algorithms were implemented in C++ using the Message Passing Interface (MPI) library (Snir, Otto, Huss-Lederman, Walker, & Dongarra, 1996). The open-source LAM environment (Burns, Daoud, & Vaigl, 1994) was used as the implementation of MPI. This software is fully compatible with the first version of MPI and includes many features from the second version of the standard.

All experiments with the algorithms were conducted on a small computational cluster, which consisted of five integrated IBM HS20 Blade Servers. Each Blade Server was equipped with two Intel® Xeon™ 3.06 GHz processors and ca. 2.5 GB of DDR2 RAM. The units were connected to a shared disk array and linked by a 1 Gbit Ethernet interface. The hardware was controlled by the Linux operating system with the 2.6.9-42 ELsmp kernel.

Table 1
Performance of the sequential branch-and-bound algorithm.

m    tb     a        M      v_SPT (%)   v_RND (%)   v_BFS (%)
2    20     -0.1     0.5    57          55          57
2    20     -0.1     0      54          54          55
2    20     -0.322   0.5    52          54          53
2    20     -0.322   0      52          52          51
3    20     -0.1     0.5    20          21          21
3    20     -0.1     0      20          20          20
3    20     -0.322   0.5    20          21          20
3    20     -0.322   0      19          20          19
2    100    -0.1     0.5    54          56          54
2    100    -0.1     0      54          56          54
2    100    -0.322   0.5    52          55          53
2    100    -0.322   0      52          56          52
3    100    -0.1     0.5    20          19          20
3    100    -0.1     0      19          20          19
3    100    -0.322   0.5    18          19          19
3    100    -0.322   0      18          19          18

Table 2
Average and maximal performance of greedy algorithms (m = 2, n = 18).

tb     a        M      r_SPT^avg   r_SPT^max   r_LPT^avg   r_LPT^max
20     -0.1     0      0.053       0.077       0.092       0.096
20     -0.1     0.5    0.053       0.089       0.039       0.048
20     -0.322   0      0.054       0.077       0.271       0.342
20     -0.322   0.5    0.051       0.103       0.093       0.141
100    -0.1     0      0.056       0.094       0.078       0.102
100    -0.1     0.5    0.049       0.077       0.39        0.52
100    -0.322   0      0.058       0.084       0.291       0.371
100    -0.322   0.5    0.060       0.120       0.104       0.142

7.2. Performance of the sequential branch-and-bound

The first experiment concerned the behaviour of different variants of the sequential branch-and-bound algorithm. The results of the experiment are summarized in Table 1.

Symbol $v_{ISOL}$ denotes the average percentage of visited nodes, where $ISOL \in \{SPT, RND, BFS\}$. Symbol SPT means that jobs in the initial schedule were in the $p_j \nearrow$ order, and RND means that the order was random. In both these cases the DFS searching strategy was used. Symbol BFS means that jobs in the initial schedule were in the $p_j \nearrow$ order and the BFS searching strategy was used. Parameters m, a and M denote the number of machines, the learning index and the incompressibility factor, respectively. Symbol tb means that the values of the initial processing times were drawn randomly from 1 to tb. For each combination of the parameters m, tb, a, M and ISOL, we tested 100 instances with n = 18 jobs. In total, 4800 instances were tested in the experiment.

Table 1 shows that the percentage of visited nodes does not depend on the learning parameters, but it strongly depends on the number of machines. Since most of the $v_{ISOL}$ values are slightly smaller for the lower tb parameter, we conjecture that these values are correlated. We also observed that SPT with DFS and SPT with BFS behave very similarly, but some differences can be noticed in favour of SPT with DFS when the learning effect is stronger. Notice also that SPT with BFS requires sorting, which increases the amount of time needed for calculations.

7.3. Performance of the parallel branch-and-bound

The aim of the second experiment was to calculate the average speedup of the parallel branch-and-bound algorithm. For each number $l_{MPI}$ of MPI processes, where $1 \le l_{MPI} \le 20$, we tested 100 instances of problem $Pm \mid p_{j,r} = p_j(M + (1 - M) r^a) \mid C_{\max}$ with $m \in \{2, 3\}$ machines and n = 18 jobs. In total, 2000 instances were tested in the experiment. The calculated average speedup is depicted in Fig. 2. We also calculated the efficiency of the parallel exact algorithm, i.e. the speedup divided by the number of cores used. The smallest efficiency (about 0.5) was observed for 10 and 12 cores, and the largest (from 0.725 to 0.95) for a small number of cores. Since the overall average efficiency was about 2/3, the parallelization can be considered acceptable.

7.4. Performance of greedy algorithms

In the third experiment we compared the quality of schedules generated by the greedy algorithms with the quality of optimal schedules found by the branch-and-bound algorithm. As a measure of quality we used the ratio $r_H = \frac{C_{\max}(\sigma^H) - C_{\max}(\sigma^\star)}{C_{\max}(\sigma^H)}$, where $\sigma^H$ and $\sigma^\star$ denote the schedule constructed by heuristic H and the optimal schedule, respectively. In the experiment we used two commonly recommended values of the incompressibility factor: M = 0, which corresponds to the log-linear learning effect (Biskup, 1999), and M = 0.5, which is suggested for machine-intensive processes (Raccoon, 1995). For the learning index we also chose two values: $a = -0.322$, which corresponds to an 80% learning rate and is commonly used in the literature (Biskup, 1999; Mosheiov, 2001), and $a = -0.1$, which makes problem $Pm \mid p_{j,r} = p_j(M + (1 - M) r^a) \mid C_{\max}$ close to the classical one. The results of the experiment are presented in Tables 2 and 3, where $r_H^{avg}$ and $r_H^{max}$ denote the average and the maximal ratio $r_H$

Fig. 2. Speedup of the parallel branch-and-bound algorithm.
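The speedup shown in Fig. 2, and the efficiency derived from it as the speedup divided by the number of cores, can be computed as in the following minimal sketch. The timing values are made-up placeholders, not measurements from the paper, and the function names are illustrative.

```python
def speedup(t_sequential, t_parallel):
    """Ratio of the sequential running time to the parallel running time."""
    return t_sequential / t_parallel


def efficiency(t_sequential, t_parallel, cores):
    """Speedup divided by the number of cores used."""
    return speedup(t_sequential, t_parallel) / cores


if __name__ == "__main__":
    # Hypothetical wall-clock times in seconds, for illustration only.
    t1 = 120.0
    timings = {2: 63.0, 10: 24.0, 20: 9.0}
    for cores, tp in timings.items():
        print(cores, round(speedup(t1, tp), 2), round(efficiency(t1, tp, cores), 2))
```

An efficiency of 1 would mean perfectly linear speedup, so an average of about 2/3 indicates that roughly one third of the added computing power is lost to communication and idle time.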


Table 3
Average and maximal performance of greedy algorithms (m = 3, n = 18).

tb    a       M    r^avg_SPT  r^max_SPT  r^avg_LPT  r^max_LPT
20    -0.1    0    0.097      0.178      0.072      0.101
20    -0.1    0.5  0.097      0.157      0.039      0.068
20    -0.322  0    0.104      0.181      0.237      0.360
20    -0.322  0.5  0.106      0.182      0.099      0.133
100   -0.1    0    0.109      0.187      0.076      0.103
100   -0.1    0.5  0.112      0.191      0.043      0.062
100   -0.322  0    0.101      0.176      0.242      0.364
100   -0.322  0.5  0.210      0.210      0.103      0.139
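The way such ratios arise can be sketched in Python for tiny instances. The sketch rests on several assumptions that are not confirmed by this section: the greedy heuristics are modelled as list scheduling (jobs taken in SPT or LPT order, each placed on the machine where it would finish earliest), the optimum is found by exhaustive enumeration instead of branch-and-bound, and the ratio is normalised by the heuristic makespan. All function names are hypothetical.

```python
from itertools import product


def dejong_time(p, r, a, M):
    """Processing time of a job with basic time p in position r (a <= 0, 0 <= M <= 1)."""
    return p * (M + (1.0 - M) * r ** a)


def machine_load(jobs, a, M):
    """Total time of the jobs processed in the given order on one machine."""
    return sum(dejong_time(p, r, a, M) for r, p in enumerate(jobs, start=1))


def greedy_makespan(p, m, a, M, longest_first=False):
    """Assumed greedy rule: SPT (or LPT) order, earliest-finishing machine first."""
    counts = [0] * m      # number of jobs already placed on each machine
    loads = [0.0] * m     # current completion time of each machine
    for pj in sorted(p, reverse=longest_first):
        finish = [loads[i] + dejong_time(pj, counts[i] + 1, a, M) for i in range(m)]
        i = min(range(m), key=finish.__getitem__)
        counts[i] += 1
        loads[i] = finish[i]
    return max(loads)


def optimal_makespan(p, m, a, M):
    """Exhaustive search over all m**n job-to-machine assignments (tiny n only)."""
    best = float("inf")
    for assignment in product(range(m), repeat=len(p)):
        groups = [[] for _ in range(m)]
        for pj, i in zip(p, assignment):
            groups[i].append(pj)
        # The position multipliers do not increase with r, so by the rearrangement
        # inequality the non-decreasing order minimises the load of each machine.
        best = min(best, max(machine_load(sorted(g), a, M) for g in groups))
    return best


def ratio(heuristic_value, optimal_value):
    """Relative error, normalised by the heuristic makespan (an assumption)."""
    return (heuristic_value - optimal_value) / heuristic_value


if __name__ == "__main__":
    p = [7, 3, 9, 4, 6, 2]
    opt = optimal_makespan(p, m=2, a=-0.322, M=0.5)
    for name, flag in (("SPT", False), ("LPT", True)):
        h = greedy_makespan(p, m=2, a=-0.322, M=0.5, longest_first=flag)
        print(name, round(h, 3), round(ratio(h, opt), 3))
```

The exhaustive search is exponential in n and serves only to make the ratio concrete; the instances with n = 18 reported in the tables require the branch-and-bound algorithms.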

Table 4
Average performance of greedy algorithms (m = 2, 10 ≤ n ≤ 34).

tb    n    r^avg_SPT  r^avg_LPT
20    10   0.077      0.076
20    14   0.067      0.095
20    18   0.053      0.097
20    22   0.042      0.100
20    26   0.035      0.101
20    30   0.032      0.104
20    34   0.027      0.102
100   10   0.122      0.120
100   14   0.071      0.087
100   18   0.054      0.102
100   22   0.047      0.107
100   26   0.037      0.104
100   30   0.035      0.105
100   34   0.029      0.104
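The learning index fixed in this experiment can be related to the learning rate by a short worked calculation. It assumes the usual convention that the learning index is non-positive, $a \le 0$, and that an 80% learning rate means the processing time drops to 80% whenever the position doubles in the log-linear case M = 0.

```latex
% Log-linear case (M = 0): doubling the position scales the time by 2^a.
\frac{p_{j,2r}}{p_{j,r}} = \frac{p_j (2r)^a}{p_j r^a} = 2^a
% An 80% learning rate therefore gives
2^a = 0.8 \quad\Longrightarrow\quad a = \log_2 0.8 \approx -0.322
% With M = 0.5, only half of the basic time is subject to learning:
p_{j,2} = p_j \left( 0.5 + 0.5 \cdot 2^{-0.322} \right) \approx 0.9\, p_j
```

With M = 0.5 a job in the second position is thus shortened by about 10% instead of 20%, which illustrates why this parameter choice lies between the classical and the log-linear case.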

Table 5
Average performance of greedy algorithms (m = 3, 10 ≤ n ≤ 22).

ness problem with a log-linear learning effect. Notice also that the ratio $r_H$ for LPT is worse when the tb parameter is lower; when tb = 100, the ratio is smaller but still perceptible. The experiment has also shown that the SPT algorithm is much better than the LPT algorithm for m = 2 machines. The advantage of SPT can also be seen for m = 3 machines, but it is less clear, possibly because of the smaller size of the instances considered.

8. Conclusions

In this paper, we considered a parallel-machine scheduling problem with DeJong's learning effect and the objective of minimizing the makespan. We proved basic properties of the problem and gave some lower and upper bounds on the optimal value of the objective. To solve the problem, we proposed two exact algorithms, a sequential branch-and-bound algorithm and a parallel branch-and-bound algorithm, and two greedy heuristic algorithms. Finally, we reported the results of computational experiments conducted in order to evaluate the performance of the proposed branch-and-bound algorithms and the greedy heuristics. In future research, tighter lower and upper bounds on the optimal makespan value should be determined. This would allow larger instances to be solved by the branch-and-bound algorithms in a reasonable time.

Acknowledgment

tb    n    r^avg_SPT  r^avg_LPT
20    10   0.103      0.080
20    14   0.122      0.095
20    18   0.105      0.099
20    22   0.082      0.101
100   10   0.099      0.090
100   14   0.130      0.100
100   18   0.109      0.100
100   22   0.078      0.107

for heuristic $H \in \{SPT, LPT\}$, respectively. To obtain a single row in each of the tables, we tested 100 instances with n = 18 jobs. In total, 1600 instances were tested in the experiment. Tables 2 and 3 show that the $p_j \nearrow$ order is better when the learning effect is stronger, i.e. when a and M are small. This can be explained by the fact that the $p_j \nearrow$ order then yields a greater shortening of the processing times. If we consider instances with a weaker learning effect, i.e. when $a = -0.1$ and M = 0.5, the $p_j \searrow$ order generates better solutions, because this case is closer to the classical case with a = 0 or M = 1. As was easy to predict, the errors increase with the increase of instance sizes.

In the last experiment we calculated the average ratio $r_H$ of the greedy algorithms for different values of n. We fixed $a = -0.322$ and M = 0.5, since these values make problem $Pm \mid p_{j,r} = p_j(M + (1 - M) r^a) \mid C_{\max}$ equally far from both the classical and the log-linear case. This experiment was much more time-consuming than the previous ones. For example, processing the instances with m = 2 and n = 34 took about 9 days of computation, while processing the instances with m = 3 and n = 22 took about 11 days. This was caused by the fact that the tree of all solutions was much bigger for m = 3 than for m = 2. Thus, the three-machine instances tested in the experiment were smaller in size, with $n \le 22$ for m = 3. The results of this experiment are presented in Tables 4 and 5. For each value of n and m, the average ratio $r_H$ was calculated using 10 instances and compared to the values calculated by the branch-and-bound algorithm. The experiment has shown that an increase in the number of jobs causes a significant decrease in the average $r_H$ ratio for SPT. This is another phenomenon that is a consequence of the learning effect: the bigger the instance, the more significant the shortening that occurs at the end of the sequence.
A similar phenomenon was noticed by Wu et al. (2007) for a two-machine flowshop maximum tardi-

The research of the second author was partially supported by a grant of the Ministry of Science and Higher Education of Poland.

References

Bachman, A., & Janiak, A. (2004). Scheduling jobs with position dependent processing times. Journal of the Operational Research Society, 55, 257–264.
Badiru, A. B. (1992). Computational survey of univariate and multivariate learning curve models. IEEE Transactions on Engineering Management, 39, 176–188.
Biskup, D. (1999). Single-machine scheduling with learning consideration. European Journal of Operational Research, 115, 173–178.
Biskup, D. (2008). A state-of-the-art review on scheduling with learning effects. European Journal of Operational Research, 188, 315–329.
Burns, G., Daoud, R., & Vaigl, J. (1994). LAM: An open cluster environment for MPI. In Proceedings of supercomputing symposium (pp. 379–386).
Cheng, M. B., Sun, S. J., & Yu, Y. (2007). A note on flow shop scheduling problems with a learning effect on no-idle dominant machines. Applied Mathematics and Computation, 184, 945–949.
Cheng, T. C. E., Wu, C. C., & Lee, W. C. (2008a). Some scheduling problems with deteriorating jobs and learning effects. Computers and Industrial Engineering, 54, 972–982.
Cheng, T. C. E., Wu, C. C., & Lee, W. C. (2008b). Some scheduling problems with sum-of-processing-times-based and job-position-based learning effects. Information Sciences, 178, 2476–2487.
Chen, P., Wu, C. C., & Lee, W. C. (2006). A bi-criteria two-machine flowshop scheduling problem with a learning effect. Journal of the Operational Research Society, 57, 1113–1125.
Conway, R. W., Maxwell, W. L., & Miller, L. W. (1967). Theory of scheduling. Reading: Addison-Wesley.
Crainic, T. G., Cun, B. L., & Roucairol, C. (2006). Parallel branch-and-bound algorithms. In E. G. Talbi (Ed.), Parallel combinatorial optimization (pp. 1–28). John Wiley & Sons.
Dondeti, V. R., & Mohanty, B. B. (1998). Impact of learning and fatigue factors on single machine scheduling with penalties for tardy jobs. European Journal of Operational Research, 105, 509–524.
Drozdowski, M. (2009). Scheduling for parallel processing. Berlin, London: Springer.
Eren, T. (2009a). A bicriteria parallel machine scheduling with a learning effect of setup and removal times. Applied Mathematical Modelling, 33, 1141–1150.
Eren, T. (2009b). A note on minimizing maximum lateness in an m-machine scheduling problem with a learning effect. Applied Mathematics and Computation, 209, 186–190.
Eren, T., & Güner, E. (2008). A bicriteria flowshop scheduling with a learning effect. Applied Mathematical Modelling, 32, 1719–1733.
Gawiejnowicz, S. (1996). A note on scheduling on a single processor with speed dependent on a number of executed jobs. Information Processing Letters, 57, 297–300.
Gawiejnowicz, S. (2008). Time-dependent scheduling. Berlin, New York: Springer.
Gendron, B., & Crainic, T. G. (1994). Parallel branch-and-bound algorithms: Survey and synthesis. Operations Research, 42, 1042–1066.
Globerson, S., & Seidmann, A. (1988). The effects of imposed learning curves on performance improvements. IIE Transactions, 20, 317–324.
Graham, R. L. (1969). Bounds on multiprocessing timing anomalies. SIAM Journal of Applied Mathematics, 17, 416–429.
Graham, R. L., Lawler, E. L., Lenstra, J. K., & Rinnooy Kan, A. H. G. (1979). Optimization and approximation in deterministic sequencing and scheduling: A survey. Annals of Discrete Mathematics, 5, 287–326.
Hardy, G. H., Littlewood, J. E., & Pólya, G. (1934). Inequalities. Cambridge: Cambridge University Press.
Ibaraki, T. (1987). Enumerative approaches to combinatorial optimization. Annals of Operations Research, 10.
Kuo, W. H., & Yang, D. L. (2010). Erratum to: "Machine scheduling with a general learning effect" [Math. Comput. Modelling 51 (1–2) (2010) 84–90]. Mathematical and Computer Modelling, 51, 847–849.
Lawler, E. L., & Wood, D. E. (1966). Branch-and-bound methods: A survey. Operations Research, 14, 699–719.
Lee, W. C., Lai, P. J., & Wu, C. C. (2010). Erratum to: "Some single-machine and m-machine flowshop scheduling problems with learning considerations" [Inform. Sci. 179 (2009) 3885–3892]. Information Sciences, 180, 1073.
Lee, W. C., & Wu, C. C. (2004). Minimizing total completion time in a two-machine flowshop with learning effect. International Journal of Production Economics, 88, 85–93.
Lee, W. C., & Wu, C. C. (2009a). Single-machine and flowshop scheduling with a general learning effect model. Computers and Industrial Engineering, 56, 1553–1558.
Lee, W. C., & Wu, C. C. (2009b). Some single-machine and m-machine flowshop scheduling problems with learning considerations. Information Sciences, 179, 3885–3892.
Leung, J. Y. T. (Ed.). (2004). Handbook of scheduling: Algorithms, models, and performance analysis. Boca Raton: CRC Press.
Lodree, E. J., Geiger, C. D., & Jiang, X. C. (2009). Taxonomy for integrating scheduling theory and human factors: Review and research opportunities. International Journal of Industrial Ergonomics, 39, 39–51.
Mosheiov, G. (2001). Scheduling problems with a learning effect. European Journal of Operational Research, 132, 687–693.
Mosheiov, G. (2008). Minimizing total absolute deviation of job completion times: Extensions to position-dependent processing times and parallel identical machines. Journal of the Operational Research Society, 59, 1422–1424.
Okołowski, D. (2008). A note on scheduling with learning effect. In Şerifoğlu, F. S., Bilge, Ü. (Eds.), Proceedings of the eleventh international workshop on project management and scheduling (pp. 206–209).
Pinedo, M. (2008). Scheduling: Theory, algorithms and systems. 3rd ed. Berlin, New York: Springer.
Raccoon, L. B. (1995). A learning curve primer for software engineers. Software Engineering Notes, 21, 77–86.
Shabtay, D., & Steiner, G. (2007). A survey of scheduling with controllable processing times. Discrete Applied Mathematics, 155, 1643–1666.
Snir, M., Otto, S., Huss-Lederman, S., Walker, D., & Dongarra, J. (1996). MPI: The complete reference. The MIT Press.
Toksarı, M. D., & Güner, E. (2008). Minimizing the earliness/tardiness costs on parallel machine with learning effects and deteriorating jobs: A mixed nonlinear integer programming approach. International Journal of Advanced Manufacturing Technology, 38, 801–808.
Toksarı, M. D., & Güner, E. (2009a). A bicriteria parallel machine scheduling with a learning effect. International Journal of Advanced Manufacturing Technology, 40, 1202–1205.
Toksarı, M. D., & Güner, E. (2009b). Parallel machine earliness/tardiness scheduling problem under the effects of position based learning and linear/nonlinear deterioration. Computers and Operations Research, 36, 2394–2417.
Wang, J. B. (2008). Erratum to: A note on flow shop scheduling problems with a learning effect on no-idle dominant machines [Appl. Math. Comput. 184 (2007) 945–949]. Applied Mathematics and Computation, 202, 897–898.
Wu, C. C., & Lee, W. C. (2009). A note on the total completion time problem in a permutation flowshop with a learning effect. European Journal of Operational Research, 192, 343–347.
Wu, C. C., Lee, W. C., & Wang, W. C. (2007). A two-machine flowshop maximum tardiness scheduling problem with a learning effect. The International Journal of Advanced Manufacturing Technology, 31, 743–750.
Xu, Z., Sun, L., & Gong, J. (2008). Worst-case analysis for flow shop scheduling with a learning effect. International Journal of Production Economics, 113, 748–753.
Yin, Y., Xu, D., Sun, K., & Li, H. (2009). Some scheduling problems with general position-dependent and time-dependent learning effects. Information Sciences, 179, 2416–2425.
Zhang, X., & Yan, G. (2010). Machine scheduling problems with a general learning effect. Mathematical and Computer Modelling, 51, 84–90.