Int. J. Production Economics 100 (2006) 44–58 www.elsevier.com/locate/ijpe
Sequencing precedence-related jobs on two machines to minimize the weighted completion time

Girish Ramachandra, Salah E. Elmaghraby

North Carolina State University, Raleigh, NC 27695, USA

Received 1 January 2004; accepted 1 October 2004. Available online 2 February 2005.
Abstract

We address the problem $P2 \mid prec \mid \sum w_j C_j$. The problem is known to be NP-hard. We offer a binary integer program (BIP) and a dynamic program (DP); the latter is based on the concept of "initial subsets" of jobs and the optic of "weighted earliness–tardiness". Although the DP approach expands the size of problems that can be solved to optimality to almost twice that obtained by the BIP, it reaches its computational limit around 25 jobs with mean job processing time of 10. We then introduce a genetic algorithm (GA) procedure that is capable of solving any problem size, and further extends the domain of applicability to more than two machines in parallel (problem $Pm \mid prec \mid \sum w_j C_j$). The BIP is also used to establish a good lower bound against which the performance of the GA procedure is measured for larger problems. Experimental investigation of the GA procedure demonstrates that it is capable of achieving the optimum in very few iterations (fewer than 20), thanks to the manner in which the initial population is generated, and that early abortion still yields an excellent approximation to the optimum as judged by its proximity to the lower bound.

© 2004 Published by Elsevier B.V.

Keywords: Scheduling; Parallel machines; Dynamic programming; Genetic algorithm
Corresponding author: S.E. Elmaghraby. Tel.: +1 919 515 7077; fax: +1 919 515 5281. E-mail: [email protected]. 0925-5273/$ - see front matter © 2004 Published by Elsevier B.V. doi:10.1016/j.ijpe.2004.10.014

1. Introduction

The problem treated in this paper may be described as follows. We are given a set $N$ of $n$ jobs that are ready "now", and two identical machines (m/c's) in parallel. Only in the application of the genetic algorithm (GA) do we extend the number of machines to three. Extension to more than three machines is straightforward, but not attempted in this research. The jobs are related by arbitrary precedence in the job-on-node (JoN) mode of representation. Each job has a processing time $p_i$ and a "weight" denoted by $w_i$, which may represent the cost of keeping it as "work in process" during its residence in the shop before completion. Denote the completion time of job $i$ by $C_i$. The cost incurred at its completion is evaluated as $w_i C_i$. It is desired to sequence the jobs
on the two machines non-preemptively, while respecting the precedence relations, so that the cost of completing all the jobs is minimized. The convention is to state this problem as $P2 \mid prec \mid \sum w_i C_i$.

Parallel machine scheduling problems are being addressed by many researchers due to their wide and prevalent applications. The problem is important from both the theoretical and the practical points of view. The criterion of minimizing the weighted completion times is appealing to many because it induces stable utilization of resources, high "throughput" in the shop, and minimum in-process inventory. Scheduling problems involving precedence constraints are among the most difficult problems in the area of machine scheduling. Let $m$ denote the number of (identical) machines in parallel. Garey and Johnson (1979) use the transformation from PARTITION to demonstrate that the decision problem without precedence constraints is NP-complete for $m = 2$, and is NP-complete in the strong sense for arbitrary $m > 2$. For arbitrary precedence constraints, the problem is NP-complete in the strong sense even if all weights are equal and the partial order is an "in-tree" or an "out-tree". The computational difficulty of the problem shall be manifested in our mathematical formulations below, and explains our focus on achieving solutions via compu-search procedures in general and GA in particular.

1.1. Literature review

The ability to understand the structure of the problem and to develop near-optimal solutions remains limited, and progress on the weighted completion time objective on a single machine or identical parallel machines is quite recent. For the one-m/c scheduling problem, minimizing the weighted completion time is NP-hard for arbitrary precedence constraints (Garey and Johnson, 1979; Horn, 1973). A polynomial-time algorithm is known when the precedence is a forest (Horn, 1973), or a series–parallel graph (s/p) (Adolphson and Hu, 1973; Lawler, 1973).
The case of arbitrary precedence was treated by Sidney (1969) who introduced the concept of initial sets of jobs, and subsequently by Sidney and Steiner
(1986). Elmaghraby (2001), in a recent report, has proposed an approach to solve the case of arbitrary precedence by first converting the non-s/p graph into an s/p one by adding artificial precedence relations; he then employs a branch-and-bound (BaB) procedure to arrive at the optimal sequence. Chekuri and Motwani (1999), Chudak and Hochbaum (1999) and Hall et al. (1997) provide 2-approximation algorithms for this problem. Chekuri's algorithm is based on solving the minimum cut, while the latter two contributions use modified linear programming relaxations. For an arbitrary number of machines in parallel, the problem is NP-hard even without precedence constraints, unless all weights are equal. The special case $Pm \mid\mid \sum w_j C_j$ with unit weights is solvable in polynomial time by sorting (Conway et al., 1967). Alidaee (1993) modifies the dynamic programming algorithm of Hall and Posner (1991) for solving the single-m/c weighted earliness–tardiness (WET) problem to obtain a pseudo-polynomial algorithm for the 2-machine weighted mean flow time minimization problem. The problem is strongly NP-hard even when all the weights are identical and the graph is a union of chains. Dror et al. (1997) provide two results: $P2 \mid strong\ chain \mid \sum C_j$ is solvable in polynomial time, and $P2 \mid weak\ chain \mid \sum C_j$ is NP-hard in the strong sense. "List scheduling" algorithms, first analyzed by Graham (1966), are among the simplest and most commonly used approximation methods for parallel machine scheduling problems. These algorithms use priority rules, or job rankings, which are often derived from solutions to relaxed versions of the problems.
In his seminal paper, Graham (1966) showed that a simple list scheduling rule is a $(2 - 1/m)$-approximation algorithm for the problem $P \mid prec \mid C_{\max}$. In contrast, simple counterexamples demonstrate that this may lead to an arbitrarily poor performance ratio for a weighted sum of completion times objective $\sum_{j \in N} w_j C_j$. Thus, to obtain a bounded performance for the weighted sum of completion times objective, the no-idleness property needs to be relaxed. An important contribution to this area of research is the paper by Rinnooy Kan et al. (1975), in which they propose a BaB algorithm for
$1 \mid\mid \sum w_i C_i$, for a general nondecreasing cost function. They use four main dominance criteria to define the order in which jobs appear in the optimal schedule. They use the assignment formulation at the root node and then compute lower bounds (l.b.) throughout the search tree using Dorhout's (1975) dual solution. The necessity of having to deal with idleness adds a significant amount of complexity to the problem. Hall et al. (1997) partition the jobs into groups that are individually scheduled according to the general list scheduling algorithm (GLSA), and these schedules are then concatenated to obtain a solution to the original problem. Chekuri et al. (1997) present a different variant of the GLSA by artificially introducing idle time whenever it seems that a delay of the next available job in the list (if it is not the first) can be afforded. Unlike the above algorithms, which are machine based, the approximation algorithm of Munier et al. (1998) (MQS) was the first within its context that was job based. They obtain the list from an LP relaxation which uses the job completion times $C_j$ as the decision variables. Reference to this formulation is made in the next section. The extreme complexity of the multiprocessor scheduling problem has triggered a great deal of effort in developing compu-search techniques such as tabu search and genetic algorithms (GA). Esquivel et al. (2001) address the parallel processor scheduling problem for the makespan minimization criterion using GA and show that their method is superior to the GLSA. Hou et al. (1994) propose a scheduling algorithm using genetic search in which each chromosome is a collection of lists, each of which represents the schedule on a distinct processor. Thus each chromosome is a two-dimensional structure of (processor index, task list). Although this solution representation facilitates easy crossover operations, it restricts the number of schedules that can be represented, and therefore reduces the probability of finding optimal solutions.
In this paper we present two mathematical formulations. The first is a binary integer program (BIP), discussed in Section 2, which proved useful in optimally solving problems with no more than about 10 jobs. In the same section we also present a l.b. based on a linear programming relaxation, which is later used to evaluate the performance of the GA approach. The second is a dynamic program (DP), presented in Section 3, which extends the capability of the optimization models beyond the limits of the BIP model, although it remains unwieldy for problems with more than about 25 jobs. In Section 4, we present a GA formulation to find optimal or near-optimal solutions in a fast and consistent way. The solution representation in this formulation is a "valid" scheduling list, and we alter this list using genetic operators to find a list which, if used as an input to the "scheduler", generates an optimal schedule. The l.b. developed in Section 2 is used to compare results for problems solved using the GA procedure. Section 5 presents our experimental results. Section 6 discusses the conclusions and delineates the areas of our current research.
2. A BIP model

Define $x_{jkt}$ as 0-1 variables for $1 \le j \le n$, $k \in \{1, 2\}$, and $t$ discrete with $0 \le t \le T$, where $T$ denotes some upper bound on the starting time of the "last job(s)" (these are the jobs with no successors in the graph):

$x_{jkt} = 1$ if job $j$ is started on machine $k$ at time $t$, and $x_{jkt} = 0$ otherwise.

The BIP formulation for the problem is as follows:

$\min z = \sum_{j=1}^{n} w_j C_j$   (1)

subject to the following constraints:

$\sum_{k=1}^{2} \sum_{t=0}^{T} x_{jkt} = 1 \quad \forall j,$   (2)

$\sum_{j=1}^{n} x_{jkt} \le 1 \quad \forall k, t,$   (3)

$\sum_{k=1}^{2} \sum_{t=0}^{T} t\, x_{jkt} + p_j \le \sum_{k=1}^{2} \sum_{t=0}^{T} t\, x_{j'kt} \quad \forall (j, j') \in A,$   (4)

$t\, x_{jkt} + M(1 - x_{jkt}) \ge \sum_{r=0}^{t-1} (p_i + r)\, x_{ikr} \quad \forall k, t,\ \forall (i, j) \in N \times N,$   (5)

$C_j = \sum_{k=1}^{2} \sum_{t=0}^{T} t\, x_{jkt} + p_j \quad \forall j,$   (6)

$x_{jkt} \in \{0, 1\} \quad \forall j, k, t.$   (7)
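As a sanity check on the objective and the scheduling logic the formulation encodes, a tiny instance can be solved by exhaustive search over precedence-feasible lists and machine assignments. This is only a sketch, not the BIP itself; the 4-job instance and all helper names below are our own illustration, not data from the paper.

```python
from itertools import permutations, product

# Hypothetical 4-job instance (illustration only, not the paper's Fig. 1).
p = {1: 2, 2: 3, 3: 1, 4: 2}          # processing times p_j
w = {1: 3, 2: 1, 3: 4, 4: 2}          # weights w_j
arcs = {(1, 3), (1, 4)}               # precedence arcs A: i must finish before j starts
preds = {j: {i for (i, k) in arcs if k == j} for j in p}

def is_topological(order):
    """True if the list respects every precedence arc."""
    pos = {j: i for i, j in enumerate(order)}
    return all(pos[i] < pos[j] for (i, j) in arcs)

def cost(order, machine):
    """Schedule jobs in list order on their assigned machines; return sum of w_j * C_j."""
    free = [0, 0]                     # next free time of each of the two machines
    C = {}
    for j in order:
        start = max(free[machine[j]], max((C[i] for i in preds[j]), default=0))
        C[j] = start + p[j]
        free[machine[j]] = C[j]
    return sum(w[j] * C[j] for j in C)

# Enumerate every topological order combined with every job-to-machine assignment.
best = min(cost(order, dict(zip(order, a)))
           for order in permutations(p) if is_topological(order)
           for a in product((0, 1), repeat=len(p)))
print(best)                           # minimal weighted completion time of the toy instance
```

Such exhaustive checks are feasible only for a handful of jobs, which is precisely why the BIP, DP and GA machinery developed in this paper is needed.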
The objective function (1) seeks to minimize the sum of weighted completion times. Constraint (2) ensures that each job is started at some time on either m/c. Constraint (3) ensures that at any time at most one job may start processing on any machine. Constraint (4) ensures that if job $j$ precedes job $j'$, the start time of $j'$ is at least $p_j$ time units after the start of $j$; here $A$ denotes the set of arcs in the graph. Constraint (5) ensures that a job can start processing on an m/c only after all jobs previously assigned to that m/c are completed, where $M$ is a large number. Constraint (6) defines the completion times of all the jobs. Lastly, constraint (7) imposes the binary integrality of the variables. To illustrate the proposed BIP model and its computational requirements, an example of 10 jobs is shown in Fig. 1. (The figure also shows two "initial subsets", which are discussed below in Section 3.) The legend $[w_j, p_j]$ next to a job
Fig. 1. A sample problem, some initial subsets, and the optimal schedule.
indicates its weight and processing time, respectively. The BIP was modeled in LINGO, and it took 31 seconds to reach the optimal solution; the model required 17,814 variables and 2,392 constraints. The optimal sequence is as follows:

machine 1: jobs 1, 2, 6, 4, 8
machine 2: jobs 3, 5, 9, 7, 10

for a value of 111. The size of this BIP model (as measured by the number of variables and the number of constraints) is strongly affected by the fact that one of the indices is a function of the processing times. We are particularly interested in the effect of the number of jobs $n$ and the sum of processing times $\sum_j p_j$ on the number of constraints generated by the model, because the LINGO version we used had a limitation of 32,000 variables and 16,000 constraints. Assuming the ratio of the number of arcs (representing precedence) to the number of jobs to be 1.5 (recall that we have adopted the job-on-node mode of representation, in which the nodes represent the jobs and the arcs represent the precedence among them), one arrives at the following expression:

Number of constraints $= 3.5n + 2T(n^2 + 1)$,

where $T$ is the scheduling time horizon. Since there are two identical processors available, we can limit $T$ to between $0.5\sum_j p_j$ and $\sum_j p_j$. Thus we can say that the number of constraints lies between $3.5n + \sum_j p_j (n^2 + 1)$ and $3.5n + 2\sum_j p_j (n^2 + 1)$. The table shown in Fig. 2 gives an idea of the numbers discussed. It is therefore inefficient and computationally expensive to attempt to solve the problem by the above integer programming formulation for $n > 10$ jobs with average processing time larger than 5. A total of 50 problems were solved to optimality utilizing this model, ranging in size from 10 to 15 jobs with total processing time $\sum_j p_j$ between 20 and 40. Fig. 5 in Section 5 tabulates a sample of 11 instances. The table also shows the results of applying the GA procedure described below to these problems. It can be seen that: (i) the LP relaxation is a reasonably good l.b. on the optimal value; (ii) the optimum could be achieved
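The constraint-count expression above is easy to tabulate. The following snippet (our own illustration, with the example figures chosen arbitrarily) evaluates the two bounds for a given number of jobs and total processing time:

```python
def constraint_bounds(n, total_p):
    """Bounds on 3.5n + 2T(n^2 + 1), with 0.5 * sum(p_j) <= T <= sum(p_j)."""
    lower = 3.5 * n + total_p * (n ** 2 + 1)      # horizon T = 0.5 * total_p
    upper = 3.5 * n + 2 * total_p * (n ** 2 + 1)  # horizon T = total_p
    return lower, upper

# e.g. 10 jobs with total processing time 40
lo, hi = constraint_bounds(10, 40)
print(lo, hi)
```

Even this modest instance generates thousands of constraints, consistent with the solver limits discussed above.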
for problem size up to 15 jobs, but the larger number of jobs imposed stricter constraints on the individual job processing times in order not to exceed the software capability; (iii) although not reported in the figure, the BIP solution time ranged between a few seconds and 6 hours on a Pentium III processor. Evidently larger problems are beyond the software capability available to us, though other software can expand the size of solvable problems to 50 jobs of average duration 20.

Fig. 2. Growth in the number of constraints.

2.1. Lower bounds

Lower bounds are often obtained by relaxing one or more constraints of the main problem. Some commonly used methods are: relax the precedence among the jobs, permit job preemption, permit processing jobs on different machines (job splitting), or solve the problem as a (continuous) LP. A notable exception to this common practice is the work of Rinnooy Kan et al. (1975) mentioned above. For our problem, we propose to secure the l.b. by simply relaxing the binary integrality constraints in the BIP. For the 10-job problem presented in Fig. 1, the LP relaxation had an optimal value of 100.83, which is a reasonably strong l.b. on the optimum (of value 111). The bound secured from the LP relaxation is compared to the l.b. obtained by using the LP formulation provided by MQS (1998), which uses the job completion times $C_j$ as the decision variables. Although our formulation does not explicitly account for the assignment of jobs to machines, such assignment can be deduced from the result of the BIP. For the sake of completeness we briefly state the LP of MQS.

MQS-LP:

$\min B = \sum_{j=1}^{n} w_j C_j$   (8)

subject to

$\sum_{j \in F} p_j C_j \ge \frac{1}{2m} \Big( \sum_{j \in F} p_j \Big)^2 + \frac{1}{2} \sum_{j \in F} p_j^2 \quad \forall F \subseteq N,$   (9)

$C_j \ge p_j \quad \forall j,$   (10)

$C_j \ge C_i + p_j \quad \forall i \prec j.$   (11)

Constraint (9) is a relatively weak (because machine assignments are ignored) way of expressing the requirement that each machine can process at most one job at a time. Constraints (10) and (11) take care of completion times and precedence relations, respectively. Observe that here we are using the stronger form of Schulz (1996) rather than the weaker form

$\frac{1}{2m} \Big[ \Big( \sum_{j \in F} p_j \Big)^2 + \sum_{j \in F} p_j^2 \Big] \quad \forall F \subseteq N,$

used by Hall et al. (1997). The fundamental idea behind the bound of (9) is that these inequalities define the convex hull of the polytope of the original problem, as demonstrated by Queyranne and Schulz (1994). Since none of the inequalities of (9) is redundant, all must be enumerated, which mars the applicability of this approach to large-scale problems. In fact we could not exceed $n = 13$ jobs in our routine experimentation, from which we concluded that our l.b. based on the solution of the BIP as a regular LP is as good as, if not actually superior to, that of MQS; see the tables in Figs. 5 and 6 in Section 5. No further use of MQS-LP was made in the sequel.
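The cost of enumerating (9) can be made concrete: a checker must visit all $2^n - 1$ subsets $F$. The sketch below (our own helper, stdlib only) tests a given completion-time vector against every subset and reports the first violation, if any.

```python
from itertools import combinations

def first_violated_subset(p, C, m=2, eps=1e-9):
    """Return the first F with sum_{j in F} p_j*C_j < (sum p_j)^2/(2m) + (sum p_j^2)/2,
    or None if all the machine-capacity inequalities (9) hold."""
    jobs = sorted(p)
    for r in range(1, len(jobs) + 1):
        for F in combinations(jobs, r):
            lhs = sum(p[j] * C[j] for j in F)
            sp = sum(p[j] for j in F)
            rhs = sp * sp / (2 * m) + sum(p[j] ** 2 for j in F) / 2
            if lhs < rhs - eps:
                return F
    return None

# Two jobs on two machines: C = (2, 3) is feasible, C = (1, 1) is not.
print(first_violated_subset({1: 2, 2: 3}, {1: 2, 2: 3}))  # no violation
print(first_violated_subset({1: 2, 2: 3}, {1: 1, 2: 1}))  # a violating subset
```

The exponential loop over subsets mirrors why the MQS bound could not be pushed past about 13 jobs in the experiments reported above.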
3. A DP model

The main motivation to pursue the DP approach stems from the following two contributions. (For other contributions to the single machine scheduling problem with various criteria, see Ibaraki and Nakamura (1994) and the references cited therein. See also Mondal and Sen (2001) for a treatment of the WET problem with a restrictive common due date.)

1. The work of Hall and Posner (1991), in which they developed a DP algorithm of pseudo-polynomial order to solve the WET problem involving a single m/c with unrelated jobs.
2. The extension of the above formulation by Alidaee (1993), again using DP, to solve the problem of minimizing the sum of weighted completion times of unrelated jobs on two identical parallel machines.

Two keys underlie the success of these DP formulations. The first is that the order in which the jobs are selected in the iterative procedure of either model is known (in particular, it is wspt, which stands for weighted shortest processing time). The second is that subsets of jobs, together with the "occupancy" of the "early" jobs, determine uniquely the completion time of the latest "tardy" job. Can the order in which the jobs are considered in the presence of precedence relations be defined in advance of the procedure for sequencing on two machines? A seemingly logical way to proceed is to seek the optimal sequence on one machine subject to the precedence constraints, very much in the same way that both contributions started with the order given by the wspt ranking, and then impose the WET logic on that order. Presumably, the one-processor schedule, taken as a list, provides some priority information on how the jobs are to be considered. In particular, the case of a single machine in which the jobs possess series–parallel (s/p) precedence, under the objective of minimizing the weighted completion times, has been solved by an elegant procedure due to Lawler (1973). Surely, one may argue, such an optimal sequence on one machine should give a clue to the order in which
the jobs are to be considered when scheduled on two machines. Unfortunately this is not true! The order given by the one-machine optimal sequence secured from Lawler's procedure proved to be useless for our problem; one can easily construct counter-examples in which the optimal sequence on two machines violates the order established by the one-machine optimal solution. The reason is as follows. Unlike with the makespan criterion, the completion time of every job is important for the weighted completion time criterion. When trying to convert the one-machine schedule into a parallel-machines one, precedence constraints prevent complete parallelization (of the jobs). Thus, we may need to execute jobs out of order from the list to benefit from parallelization, which impairs the utility of the one-machine schedule as a guide to the order in which jobs are considered in the multiple-machines case. If all the $p_i$'s are identical, say 1, we can afford to use the list scheduling algorithm. But if the $p_i$'s are different, that is no longer the case, since a job could keep a processor busy while delaying more profitable jobs that become available. Therefore there seems to be no escape from devising a new approach to the determination of the allocation cum sequencing to the machines. However, we retain the WET optic of Hall and Posner (1991) and Alidaee (1993). The Achilles' heel of the DP procedure proposed next is the need to enumerate all initial subsets of jobs (a subset is "initial" if it contains all the predecessors of each of its members) and to maintain a complete list of alternate optima at all intermediate stages, since subsequent expansion of a subset of jobs may favor one alternative over the others, despite their being of equal value at a particular stage of the iteration. The construction of the DP model proceeds as follows. Let $I_k$ denote an unordered set of initial jobs of cardinality $k$, such as $I_5 = \{1, 2, 3, 4, 7\}$ in Fig. 1. It defines the state in the DP model.
The stages of the DP model are defined by the cardinality of the set $I_k$, $k = 1, \ldots, n$, where $n$ is the number of jobs. The precedence relations may restrict the set of jobs eligible to be terminal jobs in the set $I_k$. Denote the set of eligible terminal jobs in $I_k$ by $L(I_k)$. For instance, in the set $I_5 = \{1, 2, 3, 4, 7\}$, it is clear that $L(I_5) = \{2, 3, 7\}$, since any one of these three jobs may be the last job in the sequence. However, in the same graph the set $I_6 = \{1, 2, 3, 5, 6, 9\}$ has $L(I_6) = \{9\}$. Let $l_k$ denote the last job in the sequence of $I_k$. In the set $I_5$, the terminal job $l_5$ may be any job of the subset $L(I_5)$, but in the set $I_6$ only job 9 is eligible to be $l_6$. Let $d$ be a fictitious "due date", with $d \ge \sum_j p_j$.

Since we are concerned with earliness and tardiness, the measure of time proceeds in both directions away from the due date $d$. Jobs that complete processing before $d$ are termed "early" and denoted by the set $E$, and those that complete processing after $d$ are termed "tardy" and denoted by the set $T$. Therefore, $l_k \in L(I_k)$ would appear first among the "early" jobs, or last among the "tardy" jobs; see Fig. 3(B). The total processing time of any subset $I$ of jobs is denoted by $P(I) = \sum_{j \in I} p_j$, which is divided in some fashion between the two m/c's. Let $C_j$ denote the completion time of job $j$ measured from $d$. Let $I_{k-1} = I_k \setminus \{l_k\}$ be the subset of jobs in $I_k$ exclusive of job $l_k$. Suppose we know the optimal sequence of the jobs in $I_{k-1}$, and denote the subset of "early" jobs by

$I^E_{k-1} = \{j \in I_{k-1} \mid C_j \le d\}$

and the subset of "tardy" jobs by

$I^T_{k-1} = \{j \in I_{k-1} \mid C_j > d\}.$

Let the set of jobs immediately preceding job $l_k$ in the precedence graph be denoted by $G^{-1}(l_k)$. Then job $l_k$, being the last job in the sequence, may be either the "first early" or the "last tardy". If it is the "first early", then its completion time is

$C^E_{l_k} = \max\Big\{ \max_{j \in G^{-1}(l_k)} C_j,\ P(I^E_{k-1}) \Big\} + p_{l_k},$

with cost equal to $w_{l_k} C^E_{l_k}$. On the other hand, if job $l_k$ is the "last tardy", then its completion time is

$C^T_{l_k} = \max\Big\{ \max_{j \in G^{-1}(l_k)} C_j,\ P(I^T_{k-1}) \Big\} + p_{l_k},$

with cost given by $w_{l_k} C^T_{l_k}$. The reason for defining $C^E_{l_k}$ ($C^T_{l_k}$) as given in the above two expressions is the possibility of a job $j \in G^{-1}(l_k)$ being in the set $T$ (set $E$) in the optimal sequence of $I_{k-1}$. We are now poised to formulate the DP extremal equation. Let $f(I_k, l_k)$ denote the minimal cost of the jobs in $I_k$ when job $l_k$ is last. Then one may write the extremal equation as

$f(I_k, l_k) = \min_{l_k \in L(I_k)} \Big\{ \min\big\{ w_{l_k} C^E_{l_k},\ w_{l_k} C^T_{l_k} \big\} + f(I_k \setminus \{l_k\}, l_{k-1}) \Big\}, \quad k = 1, \ldots, n.$   (12)

Fig. 3. Illustration of earliness and tardiness. (Panel (A) shows the early set $E$ ending at the due date $d$ and the tardy set $T$ beginning at $d$; panel (B) shows job $l_k$ placed either first among the early jobs or last among the tardy jobs.)
Observe that there are as many values of the optimum $f(I_k \setminus \{l_k\}, l_{k-1})$ as there are elements in the set $L(I_k \setminus \{l_k\})$. To gain an idea of the performance of this DP approach, we solved the 10-job problem given in Fig. 1. We had to enumerate 44 initial subsets, and secured the same optimal sequence depicted in Fig. 1 and the same optimal value of 111. The validity of the extremal equation (12) depends crucially on the separability of the objective function. We assert that the addition of job $l_k$ to the set $I_{k-1}$ does not "disturb" the optimal sequence of the latter, in the sense of either switching the order of any two jobs in it or inserting itself between two jobs in $I_{k-1}$. This can be seen from the following three observations:

1. Job $l_k$ cannot insert itself in the sequence of the jobs that precede it.
2. If job $l_k$ should occupy an earlier position in the sequence, thus inserting itself somewhere in the set $I_{k-1}$, then some other job, say $l'_k$, would be last in the set $I_k$, which implies that $l_k$ has been previously considered in the subset $I_{k-1}$. Using the same logic with $I_{k-1}$, we are led to subsets $I_{k-2}, \ldots, I_1$, in which $l_k$ could have been in a prior position in any of them. Therefore, considering job $l_k$ at stage $k$ does not interfere with its consideration in prior stages, if precedence permits it.

3. The presence of job $l_k$ last cannot cause two jobs, say $i$ and $j$, in the set $I_{k-1}$ to exchange positions. Here we must consider two eventualities: (a) the exchange of positions is between two "internal" jobs, i.e., neither $i$ nor $j$ is a "last" job in the sequence of the set $I_{k-1}$. This eventuality is impossible, since such an interchange must improve the cost of the set $I_{k-1}$ taken by itself by permitting some job to finish earlier, which contradicts the optimality of its current sequence; (b) the exchange of positions is between a "last" job in the set $I_{k-1}$, suppose it is job $j$, and an "internal" job, suppose it is job $i$. But then the same argument presented relative to job $l_k$ can be repeated verbatim relative to job $j$, which would eventually lead to the situation described in arguments 1, 2, or 3(a).

Therefore the superiority or inferiority of job $l_k$ being last shall become evident without disturbing the sequence of any job in the set $I_{k-1}$, as asserted. This establishes the independence of the decision on $l_k$ from prior decisions, and the additivity of the cost of appending $l_k$ to the sequence of set $I_{k-1}$. The only caveat in applying the extremal equation (12) is that at each stage of the iterations one must retain all alternate optima at each state with a specified "last" job. This is because alternate optima represent different positions of the jobs in the earliness–tardiness dichotomy, which may have an impact on the placement (and cost) of subsequent jobs that are appended to the partial sequence. This fact was amply evident in the
example cited above. And it is the cause behind the computational difficulty in large-scale problems.
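The enumeration of initial subsets and their eligible terminal jobs $L(I)$ is mechanical, and a small sketch makes the definitions concrete. The diamond-shaped dag below is a hypothetical example of our own, not the paper's Fig. 1, and the helper names are ours:

```python
from itertools import combinations

arcs = {(1, 2), (1, 3), (2, 4), (3, 4)}   # hypothetical diamond dag: (i, j) means i precedes j
jobs = {1, 2, 3, 4}
preds = {j: {i for (i, k) in arcs if k == j} for j in jobs}
succs = {j: {k for (i, k) in arcs if i == j} for j in jobs}

def is_initial(S):
    """An initial set contains every predecessor of each of its members."""
    return all(preds[j] <= S for j in S)

# All initial subsets, ordered by cardinality (the DP's stages).
initial_sets = [frozenset(S) for r in range(len(jobs) + 1)
                for S in combinations(sorted(jobs), r)
                if is_initial(frozenset(S))]

def terminal_jobs(S):
    """L(S): members of S with no successor inside S, hence eligible to be sequenced last."""
    return {j for j in S if not (succs[j] & set(S))}

print(len(initial_sets))          # number of initial subsets of the diamond dag
print(terminal_jobs({1, 2, 3}))   # eligible terminal jobs of an initial set
```

The paper reports 44 initial subsets for its 10-job example; the count grows quickly with $n$ and with the sparsity of the precedence graph, which is the enumeration burden described above.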
4. A GA approach

The fact that our problem is intractable by polynomial-time algorithms suggests the use of compu-search approaches (alternatively known as meta-heuristics) such as GA, simulated annealing, and tabu search, among many others. We propose to solve our problem using GA, and this section is devoted to a detailed description of its construction and implementation. The distinctive features of our algorithm which set it apart from other contributions using GA in scheduling problems lie in (i) the structure of the chromosome representation, and (ii) the manner in which the initial population is generated. Each chromosome represents a list of task priorities. Since task priorities depend on the input, which is a directed acyclic graph (dag), different sets of task priorities represent different problem instances. We use different priority rules to derive the first few chromosomes and then alter them slightly to construct the rest of the initial population. As shall be seen presently, this method ensures faster convergence compared to starting with a set of randomly ordered chromosomes. The fitness of a chromosome is determined from the value of the schedule secured from its list using the first available machine (FAM) rule.

4.1. The structure of the GA

GAs, introduced by Holland in the 1970s, are search techniques based on the concept of evolution (Davis, 1991; Goldberg, 1989). Given a well-defined search space in which each point is represented by a bit string, called a chromosome, a GA is applied with its three search operators (selection, crossover and mutation) to transform a population of chromosomes, with the objective of improving their "quality". Before the search starts, a set of chromosomes is chosen from the search space to form the initial population. The genetic search operators are then applied one after another to systematically obtain a new generation
of chromosomes with a better overall quality. This process is repeated until the stopping criterion is met and the best solution of the last generation is reported as the final solution. For an efficient GA search, in addition to a proper solution structure, it is necessary that the initial population of schedules be a diverse representative of the search space. Furthermore, the solution encoding should permit: a large diversity in a small population, easy crossover and mutation operations, and an easy computation of the objective function. All three desired characteristics are realized in our approach. 4.2. The proposed GA Classical optimal scheduling algorithms, like Hu’s (1961) and Coffman’s (1976) algorithms, are based on the list scheduling approach in which the nodes of the dag are first arranged as a list such that the ordering of the list preserves the precedence constraints. It is proposed that a similar approach be used to calculate the objective function value or fitness of the chromosome in the GA; i.e., beginning from the first precedence-feasible job in the list, each job is removed and assigned to the machine that allows the earliest start time. It is evident that an optimal ordering of jobs in the list is needed to generate an optimal schedule using this approach. While optimal scheduling lists can be easily constructed for certain restricted cases (e.g., unit-weight out-tree as in Hu’s algorithm, where an out-tree is a connected, acyclic graph in which each node has at most one parent), such lists cannot be determined for arbitrary dags. Also, there are an exponential number of candidate lists that can be used for scheduling, and an exhaustive search for an optimal list can be very time consuming indeed, if not impossible.
The likelihood of the existence of lists leading to optimal schedules under the FAM rule is very high, although there is no proof that such a list always exists. Since an optimal schedule is not unique, the list which can lead to the optimal schedule is also not unique. A solution neighborhood can then be defined for the genetic search. For instance, we can start from an initial list from which we obtain an initial schedule. We then systematically modify the ordering within the list such that precedence is maintained. From the new list we construct a new schedule. If the schedule is better, we adopt it; else we test another modified list.

4.3. Chromosome representation

From the perspective of representation, two broad classifications exist, namely direct and indirect representations. In the former case a complete and feasible schedule is an individual of the evolving population. In the latter scenario, the population consists of encoded schedules. Because the representation does not directly provide a schedule, a schedule builder or decoder is necessary to transform the chromosome into a schedule and evaluate it. The decoder could be in the form of the naive list scheduling algorithm or similar priority rules. Consider the dag shown in Fig. 1. A direct representation of a chromosome could be the following four-tuple: <task_id, proc_id, init_time, end_time>, where task_id identifies the task to be allocated, proc_id identifies the processor on which the task is to be processed, init_time is the start time of task_id on proc_id, and end_time denotes the completion time of the task. The chromosome representation for the Gantt chart of Fig. 1 is shown in (13). An example of an indirect representation for the same schedule is shown in (14). As seen, this representation just indicates the assignments of the jobs to the processors.

<1,1,0,2> <2,1,2,5> <3,2,2,3> <4,2,3,5> <5,1,5,8> <6,2,5,8> <7,1,8,9> <8,1,9,10>   (13)
ARTICLE IN PRESS G. Ramachandra, S.E. Elmaghraby / Int. J. Production Economics 100 (2006) 44–58
Processor:
1
1
2
2
1 2
1
1
2
2
Task:
1
2
3
4
5 6
7
8
9 10 (14)
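The list-scheduling evaluation just described, which scans the list and gives each job to the first available machine (the FAM rule), can be sketched in Python as follows. This is illustrative code of ours, not the authors' implementation; in particular, the start-time rule (a job starts no earlier than the completion of its predecessors and the chosen machine's free time) is our reading of the FAM rule, and all names and data structures are assumptions.

```python
def fam_fitness(order, p, w, preds, m=2):
    """Evaluate a precedence-feasible list by list scheduling.

    Each job, taken in list order, is assigned to the machine that
    allows the earliest start time (FAM rule).  The fitness is the
    weighted sum of completion times, sum over j of w_j * C_j.
    order: list of job ids; p, w: dicts of processing times/weights;
    preds: dict mapping a job to the iterable of its predecessors.
    """
    free = [0.0] * m          # next free time of each machine
    C = {}                    # completion times
    total = 0.0
    for j in order:
        # a job cannot start before all its predecessors complete
        ready = max((C[i] for i in preds.get(j, ())), default=0.0)
        # pick the machine allowing the earliest start
        k = min(range(m), key=lambda i: max(free[i], ready))
        C[j] = max(free[k], ready) + p[j]
        free[k] = C[j]
        total += w[j] * C[j]
    return total
```

On a small hypothetical instance (jobs 1, 2, 3 with job 3 preceded by job 1, unit weights) the routine returns the weighted completion time of the FAM schedule directly, which is what the GA needs to evaluate a chromosome.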
In our case, we use a variant of the indirect representation: a permutation which respects all precedence constraints is a chromosome. The fitness of a chromosome is defined as the sum of wjCj, where wj is the weight of job j and Cj is its completion time in the schedule constructed by the FAM rule.

4.3.1. Representation of the precedence constraints

The precedence constraints among the jobs determine the size of the search space and add complexity to the problem. In our approach, precedence constraints are represented by the adjacency matrix A. It is a node–node adjacency matrix of 1's and 0's, where a 1 in location (i, j) implies that i precedes j, written i ≺ j. Using the concept of the transitive closure of graphs, we also define the earliest position (EP) and latest position (LP) of any given node. These positions determine the earliest and the latest feasible positions of a given job in a list. The EPs and LPs of all the jobs in Fig. 1 are shown in (15). This information is used for feasibility checks of chromosomes during mutation.
Job:  1   2   3   4   5   6   7   8   9  10
EP:   1   2   2   2   4   4   7   8   6  10
LP:   1   5   4   5   6   6   7   8   9  10     (15)
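The paper does not spell out how EP and LP are computed from the transitive closure; a plausible reading is EP(j) = 1 + (number of ancestors of j) and LP(j) = n − (number of descendants of j), since no list can place a job before all its ancestors or after all its descendants. The sketch below (our code, under that assumption) builds the closure with Warshall's algorithm and derives both tables.

```python
def ep_lp(n, edges):
    """Earliest/latest feasible list positions from the transitive
    closure of a dag on jobs 1..n.

    Assumed reading of the text: EP(j) = 1 + number of ancestors
    of j, and LP(j) = n - number of descendants of j.
    edges: iterable of (i, j) pairs meaning i precedes j.
    """
    reach = [[False] * (n + 1) for _ in range(n + 1)]
    for i, j in edges:
        reach[i][j] = True
    for k in range(1, n + 1):          # Warshall's algorithm
        for i in range(1, n + 1):
            if reach[i][k]:
                for j in range(1, n + 1):
                    if reach[k][j]:
                        reach[i][j] = True
    ep, lp = [], []
    for j in range(1, n + 1):
        anc = sum(reach[i][j] for i in range(1, n + 1))
        desc = sum(reach[j][i] for i in range(1, n + 1))
        ep.append(anc + 1)
        lp.append(n - desc)
    return ep, lp
```

The same closure matrix also answers the interchangeability test used during mutation: nodes i and j lie on a common path, and hence cannot be swapped, exactly when `reach[i][j]` or `reach[j][i]` holds; once the closure is stored, this is a constant-time lookup.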
4.3.2. Generation of the initial population

The initial population is generated from a set of scheduling lists that are constructed using a set of priority rules, as described next.

1. Weighted subgraph rule: Ready jobs are assigned to the machines in topological order as and when a machine becomes available. In case of contention, i.e., there are more
ready jobs than machines, the priority is decided as follows:
(a) Case 1—one machine available: For each job in contention, calculate the ratio (sum of wj)/(sum of pj), in which the summation in both numerator and denominator is over the job under consideration and the jobs in the subgraph of its successors. The job with the highest value is scheduled first, with ties being broken lexicographically (which essentially means arbitrarily).
(b) Case 2—both machines available: The same procedure is followed and the jobs are sorted in nonincreasing order of their weighted-subgraph values, with ties being broken lexicographically. The first two jobs in this list are scheduled.
2. WSPT rule: This method is similar to the previous one except that the priority rule is the wspt rule.
3. Lawler's one-machine optimal schedule: For the given dag, we first transform it into a series/parallel graph by the addition of auxiliary precedence relations following the procedure of Elmaghraby (2001), then apply Lawler's (1973) algorithm to obtain the list. This list is then scheduled on the two machines using the FAM rule.
4. LP-relaxation list: This method is due to MQS (1998). The LP-relaxation of Eqs. (8)–(11) is solved to obtain the completion time vectors of the jobs. Let the LP completion time of job j be Cj^LP. Accordingly, define the LP-mid-point mj^LP as mj^LP = Cj^LP − pj/2. The LP-mid-point list is now defined by sorting the jobs in nondecreasing order of their mid-points mj^LP.

These different ordering schemes not only provide the necessary diversity but also represent a population of better fitness than a set of totally random topological orderings. A whole population is then generated by random valid swapping of jobs in the lists.

4.3.3. The genetic operators

Since we are dealing with the modification of ordered lists, standard crossover and mutation operators may violate the precedence constraints, thereby generating invalid lists. So, we need to use
other, well-defined genetic operators. Three kinds of crossover operators, namely the order crossover (OCX), the partially mapped crossover (PMX), and the cycle crossover (CCX) operators, are viable candidates. By using simple counter-examples, one can show that the PMX and CCX operators may produce invalid lists. Therefore, in our algorithm, only the OCX operator is used. We also describe a mutation operator based on node swapping.

The OCX operator. We consider a single-point order crossover operator. Given two parents, select a cutoff point (the same for both parents) randomly in the two chromosomes. We first pass the left segment of the first parent to the child. We then construct the right segment of the child from the original genes of the other parent in the same order as they appear, deleting the genes that are already in the left segment and keeping the ones that are missing. An example of the crossover operator is shown in Fig. 4(a). The chromosomes shown in the figure are valid topological orderings of the dag in Fig. 1. The cutoff point is randomly selected and falls after the gene in position 4. The left segment {1, 3, 2, 4} of parent 1 is passed directly to the child. The original chromosome of parent 2 is {1, 2, 3, 5, 6, 9, 4, 7, 8, 10}, of which {1, 3, 2, 4} have already been accounted for in the left segment. This leaves the right segment containing the genes {5, 6, 7, 8, 9, 10} of parent 2 to be appended to the child in the order of their appearance in parent 2. This operator permits fast processing and is easy to implement. More importantly, it always produces valid offspring.

The mutation operator. A valid topological order can be transformed into another topological order by swapping some nodes. However, not every pair of nodes can be swapped without violating
parent 1: {1, 3, 2, 4, 5, 6, 7, 8, 9, 10}
parent 2: {1, 2, 3, 5, 6, 9, 4, 7, 8, 10}
child:    {1, 3, 2, 4, 5, 6, 9, 7, 8, 10}     (a)

before mutation: {1, 3, 2, 4, 5, 6, 7, 8, 9, 10}
after mutation:  {1, 3, 2, 6, 5, 4, 7, 8, 9, 10}     (b)

Fig. 4. Illustration of (a) crossover and (b) mutation operators.
precedence constraints. Two nodes are interchangeable if they do not lie on the same path in the dag. Using the concept of the transitive closure of graphs, we can check in constant time whether or not two nodes are interchangeable during the search. This implies that we can efficiently test whether two randomly selected nodes are interchangeable and, if so, swap them to check the new objective function value. Such swapping defines the random search neighborhood, whose size is O(n²) since there is a maximum of n(n − 1)/2 pairs of interchangeable nodes. It should be noted that such swapping may result in infeasibility if proper post-processing is not done. For example, in Fig. 4(b), jobs 2 and 8 are interchangeable since they do not lie on the same path in the dag. But if we consider the list {1, 3, 2, 4, 5, 6, 7, 8, 9, 10}, then simply swapping the nodes produces the list {1, 3, 8, 4, 5, 6, 7, 2, 9, 10}, which is clearly infeasible because job 8 cannot be processed before job 4. In order to overcome this occurrence, we use the earliest–latest positions table (15) to determine whether the swapped jobs' positions are indeed permissible. Satisfaction of both the EP and LP conditions is sufficient to ensure feasible interchangeability. We now define the swap mutation operator, which swaps two interchangeable nodes in a given chromosome. An example of this operator is shown in Fig. 4(b).

The mutation probability for smaller problem sizes is derived empirically, as explained in the next section. For problem sizes of 40 jobs and more, it is calculated by the formula

mm = b (f′ − fmin)/(favg − fmin),    (16)

where f′ is the fitness of the chromosome, fmin the current best fitness, favg the mean fitness of the population, and b a number between 0 and 1.

4.3.4. Parameter setting

As discussed above, the genetic search method is guided by the 'tuning' of three parameters, namely: the population size, the crossover rate mc, and the mutation rate mm. We chose these parameters empirically, within the ranges shown below:
Population size: 25–75 chromosomes (step size = 5),
crossover rate: 0–100% (step size = 10%), and mutation rate: 0–100% (step size = 10%) and adaptive.
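The adaptive mutation rate of Eq. (16) is straightforward to implement. A minimal Python sketch follows (the function name is ours; clamping the rate to [0, 1] and the handling of a population with no fitness spread are our additions, not stated in the paper):

```python
def adaptive_mutation_rate(f, f_min, f_avg, beta=0.6):
    """Adaptive mutation probability, Eq. (16):
        mm = beta * (f - f_min) / (f_avg - f_min),
    where f is the chromosome's fitness, f_min the current best
    (smallest) fitness, and f_avg the population mean fitness.
    Worse-than-average chromosomes thus mutate more often.
    """
    if f_avg == f_min:
        # degenerate population with no spread: fall back to beta
        return beta
    rate = beta * (f - f_min) / (f_avg - f_min)
    return min(1.0, max(0.0, rate))   # keep it a valid probability
```

With beta = 0.6 as chosen in the tuning described below, a chromosome halfway between the best and the average fitness gets a mutation rate of 0.3.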
The entire process was done in three rounds. Initially we fixed all parameters at specific values within the ranges above; we also fixed b of Eq. (16) at 0.5. We then varied the population size in steps and kept the best one, defined to be the smallest size that resulted in the lowest objective function value appearing in at least 5 generations. Next we varied the crossover rate while keeping the other two parameters fixed, and we finished the first round by varying the mutation rate. The next two rounds were carried out by starting with the crossover rate and the mutation rate, respectively. This procedure was carried out for each of the problem sizes of 10, 20, 40, and 80 jobs. Results were very similar for the crossover and mutation rates, independently of the population size. We found that for smaller problem sizes, a population size of 30 and a crossover rate between 0.4 and 0.6 gave the best results, and mutation rates between 0.1 and 0.3 gave better results than the adaptive mutation operator. After a few experiments with b ranging from 0.1 to 0.8, we decided to fix it at 0.6 for all problem sizes. With problem sizes of 40 and 80 jobs, we achieved better results when the population size was 50, the crossover rate between 0.5 and 0.7, and the mutation operator adaptive. The stopping criterion used in all cases was the number of generations: 20 for problem sizes up to 20 jobs, and 30 for problem sizes greater than 20 jobs.
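To make the OCX operator of Section 4.3.3 concrete, here is a minimal Python sketch (our code, not the authors' implementation); the cut point is passed explicitly for reproducibility, whereas the GA selects it at random:

```python
def ocx(parent1, parent2, cut):
    """Single-point order crossover (OCX).

    Copy parent1[:cut] to the child, then append the remaining
    genes in the order they appear in parent2.  Because every gene
    appears exactly once and relative orders are preserved, the
    child is always a valid permutation.
    """
    left = parent1[:cut]
    taken = set(left)
    return left + [g for g in parent2 if g not in taken]
```

Applied to the Fig. 4(a) example, with parents {1, 3, 2, 4, 5, 6, 7, 8, 9, 10} and {1, 2, 3, 5, 6, 9, 4, 7, 8, 10} and the cut after position 4, the sketch reproduces the child {1, 3, 2, 4, 5, 6, 9, 7, 8, 10}.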
5. Computational results

In order to get a feel for the performance of the GA, 50 problems were constructed that were within LINGO's solving capability. The experiments were conducted on a Sun Ultra workstation, and all programs were written in MATLAB. The tables in Figs. 5 and 6 give the results for the two- and three-machine cases, respectively. The proposed GA reached the optimal value for all problems in 15 generations or less. Problem
Fig. 5. Computation results—two-machine scenario.
Fig. 6. Computation results—three-machine scenario.
instances for 20, 40, and 80 jobs were also generated. Since these are well beyond the computing capability of the LINGO solver available to us, the performance of the GA was evaluated against the l.b. generated from the LP solution (written as l.b. (LP)). Fig. 7 summarizes the performance in terms of percentage deviation from the l.b. (LP). The average computation times for all the problem instances are presented in Fig. 9. Since the computation overhead in MATLAB is significantly higher than in C/C++, these times do not serve as a performance index of our GA in the true sense. Additional problems were also generated using the DAG generator developed by Elmaghraby et
Fig. 7. Randomly generated instances—deviation of GA from LP.
Fig. 8. Randomly generated instances using DAGEN.
al. (1996), which uses the node reduction index (Bein et al., 1992) as one of the metrics in randomizing the topology of the precedence graph (the n.r.i. is the number of nodes to "reduce", i.e., eliminate, in the job-on-arc mode of representation of the precedence relations in order to render the graph series/parallel; in some sense, the n.r.i. measures the degree of deviation of the given graph from being series/parallel). In all, 12 instances were generated, of which 8 had an n.r.i. of 6, and 4 an n.r.i. of 8. The weights and processing times were generated from a uniform distribution as shown in the table of Fig. 8. As evident from the
table, the GA solution is no more than 1% from the l.b. (LP). This demonstrates the effectiveness of the GA as a solution methodology and of the LP relaxation as a lower bound (Fig. 9).
6. Summary and conclusions

We have investigated the problem Pm|prec|sum wjCj under the assumption of m = 2 identical machines in the case of the BIP and DP models, and m = 2 or 3 in the case of the GA procedure. The precedence relations are arbitrary.
Fig. 9. Average computation times.
We have developed two mathematical programming models: a binary integer program (BIP) that can accommodate up to 10 jobs with an average job processing time of 5, and a dynamic program (DP) that extends the scope of solution to 25 jobs with an average job processing time of 10. For larger problems we have developed a rather efficient GA procedure. In this study we have implemented it on two and three processors; it is easily extendable to more than three processors.

This research may be extended in at least two directions. The first is to permit fractional allocation of a "machine" to a job, if more gainful. This would emulate, for instance, the case of dividing a person's time equally between two jobs in the same interval of time. The second is to have different resources that are available in multiple units each. This would be the case of jobs demanding two (or more) different resources simultaneously, say a truck and a bulldozer, with several units of each resource available. The resolution of these extensions would go a long way towards resolving the well-known resource-constrained project scheduling problem (RCPSP) in project planning and control.
References Adolphson, D., Hu, T.C., 1973. Optimal linear ordering. SIAM Journal on Applied Mathematics 25, 403–423.
Alidaee, B., 1993. Schedule of n jobs on two identical machines to minimize weighted mean flow time. Computers & Industrial Engineering 24, 53–55. Bein, W.W., Kamburowski, J., Stallmann, M.F.M., 1992. Optimal reduction of two-terminal directed acyclic graphs. SIAM Journal on Computing 21, 1112–1129. Chekuri, C., Motwani, R., 1999. Precedence constrained scheduling to minimize sum of weighted completion times on a single machine. Discrete Applied Mathematics 98, 29–38. Chekuri, C., Motwani, R., Natarajan, B., Stein, C., 1997. Approximation techniques for average completion time scheduling. Proceedings of the Eighth ACM-SIAM Symposium of Discrete Algorithms, pp. 609–618. Chudak, F.A., Hochbaum, D.S., 1999. A half-integral linear programming relaxation for scheduling precedence-constrained jobs on a single machine. Operations Research Letters 25, 199–204. Coffman, E.G., 1976. The Computer and Job-Shop Scheduling Theory. Wiley, New York. Conway, R.W., Maxwell, W.L., Miller, L.W., 1967. Theory of Scheduling. Addison-Wesley, Reading, MA. Davis, L.D., 1991. The Handbook of Genetic Algorithms. Van Nostrand Reinhold, New York. Dorhout, B., 1975. Experiments with some algorithms for the linear assignment problem. Report BW30, Math. Centrum, Amsterdam, The Netherlands. Dror, M., Kubiak, W., Dell’Olmo, P., 1997. Scheduling chains to minimize mean flow time. Information Processing Letters 61, 297–301. Elmaghraby, S.E., 2001. On scheduling jobs related by precedence on one machine to minimize the weighted completion times. Paper presented at the IETM Conference, Quebec City, Canada, August 21–24, 2001. Elmaghraby, S.E., Agrawal, M., Herroelen, W., 1996. DAGEN: A network generating algorithm. European Journal on Operational Research 90, 376–382. Esquivel, S.C., Gatica, C.R., Gallard, R.H., 2001. Conventional and multirecombinative evolutionary algorithms for the parallel task scheduling problem. In: EvoWorkshops, Lecture Notes in Computer Science, vol. 2037. Springer, Berlin, pp. 
223–232. Garey, M.R., Johnson, D.S., 1979. Computers and Intractability: A Guide to the Theory of NP-Completeness. W.H. Freeman, New York. Goldberg, D.E., 1989. Genetic Algorithms in Search, Optimization and Machine Learning. Addison-Wesley, Reading, MA. Graham, R.L., 1966. Bounds for certain multiprocessing anomalies. Bell System Technical Journal 45, 1563–1581. Hall, N.G., Posner, M.E., 1991. Earliness–tardiness scheduling problems I: Weighted deviation of completion times about a common due date. Operations Research 39, 836–846. Hall, L.A., Schulz, A.S., Shmoys, D.B., Wein, J., 1997. Scheduling to minimize average completion time: Off-line and on-line approximation algorithms. Mathematics of Operations Research 22, 513–544.
Horn, W., 1973. Minimizing average flow time with parallel machines. Operations Research 21, 846–847. Hou, E.S.H., Ansari, N., Ren, H., 1994. A genetic algorithm for multiprocessor scheduling. IEEE Transactions on Parallel and Distributed Systems 5, 113–120. Hu, T.C., 1961. Parallel sequencing and assembly line problems. Operations Research 9, 841–848. Ibaraki, T., Nakamura, Y., 1994. A dynamic programming method for single machine scheduling. European Journal of Operational Research 76, 72–82. Lawler, E.L., 1973. Optimal sequencing of a single machine subject to precedence constraints. Management Science 19, 544–546. Mondal, S.A., Sen, A.K., 2001. Single machine weighted earliness–tardiness penalty problem with a restrictive due date. Computers & Operations Research 28, 649–669. Munier, A., Queyranne, M., Schulz, A.S., 1998. Approximation bounds for a general class of precedence constrained parallel
machine scheduling problems. In: Proceedings of Integer Programming and Combinatorial Optimization (IPCO), Lecture Notes in Computer Science, vol. 1412. Springer, Berlin, pp. 367–382. Queyranne, M., Schulz, A.S., 1994. Polyhedral approaches to machine scheduling. Technical Report 408/1994, Technical University of Berlin. Rinnooy Kan, A.H.C., Legeweg, B.J., Lenstra, J.K., 1975. Minimizing total cost in one machine scheduling. Operations Research 23, 908–927. Schulz, A.S., 1996. Polytopes and scheduling. Ph.D. Thesis, Technical University of Berlin, Germany. Sidney, J.B., 1969. Decomposition algorithms for singlemachine sequencing with precedence relations and deferral costs. Operations Research 22, 283–298. Sidney, J.B., Steiner, G., 1986. Optimal sequencing by modular decomposition: polynomial algorithms. Operations Research 34, 606–612.