Computers & Industrial Engineering 53 (2007) 420–432 www.elsevier.com/locate/dsw
A single resource scheduling problem with job-selection flexibility, tardiness costs and controllable processing times

Bibo Yang, Joseph Geunes*

Department of Industrial and Systems Engineering, University of Florida, Gainesville, FL 32611, USA

Received 7 October 2004; received in revised form 23 November 2006; accepted 22 February 2007. Available online 21 May 2007.
Abstract

We consider single-resource scheduling when candidate jobs may be accepted (producing job-specific profit) or rejected. Our solution approaches seek to maximize the profitability of the resulting schedule under job-specific tardiness costs and reducible processing times (at a cost). We present an algorithm that maximizes schedule profit for a given sequence of jobs, along with two heuristic approaches for generating good job sequences. A set of computational tests on randomly generated problem instances demonstrates the relative effectiveness of our proposed heuristic approaches. © 2007 Elsevier Ltd. All rights reserved.

Keywords: Scheduling; Heuristics; Job-selection
1. Introduction

Recent trends in manufacturing and operations management have led to the incorporation of demand management approaches in operations planning (e.g., Chopra & Meindl, 2003; Lee, 2001). These trends recognize that suppliers generally have some degree of flexibility in determining the set of downstream demands that provides the best match for their resource capabilities in order to enhance profitability. Only a small amount of past research exists, however, that considers demand management decisions in an operations scheduling context. This paper partially fills this gap by considering resource scheduling problems in which the firm exercises some discretion over the number of jobs it accepts, with a goal of maximizing profit. In particular, we consider a situation in which the jobs have associated due dates; if the firm agrees to accept a job and delivers the job later than the due date, a tardiness cost is incurred. The firm must therefore determine which jobs to accept and the schedule for the accepted jobs in order to maximize profit.
This manuscript was processed by Area Editor Subhash Sarin. This work was partially supported by NSF Grants #CMS-0122193 and #DMI-0322715.
* Corresponding author. Tel.: +1 352 392 1464x2012; fax: +1 352 392 3537. E-mail address: [email protected]fl.edu (J. Geunes).
¹ Address: Department of Logistics, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong.
0360-8352/$ - see front matter © 2007 Elsevier Ltd. All rights reserved. doi:10.1016/j.cie.2007.02.005
We consider a set of available jobs J = {1, 2, . . ., n}, from which we must choose some subset for processing. Those jobs that are selected must be processed sequentially on a single resource without preemption. Job j has an associated revenue wj ≥ 0, release date rj, due date dj, and processing time pj. We denote the starting time of job j as sj (a decision variable) and its execution time interval as (sj, sj + pj) ⊆ [rj, T), where T is an overall deadline for the processing of all jobs.² We allow jobs to violate their due dates at some cost per unit tardy. If job j is completed later than its due date, a penalty cost lj is incurred for each unit of tardiness; letting Cj denote the completion time of job j, the tardiness cost for job j equals lj(Cj − dj)⁺. We assume that jobs also have controllable processing times. That is, the firm can choose to allocate a greater number of resources than under normal operating conditions in order to complete a job more quickly. The actual processing time of job j can be reduced to (pj − xj) at a compression cost cxj, where xj (with 0 ≤ xj ≤ uj ≤ pj) is the time by which the "normal" processing time pj is compressed, uj is the maximum possible amount of compression for job j, and c is the cost per unit of processing time reduction on the resource. The profit of a job therefore equals the job's revenue less its tardiness and compression costs. The goal is to select a subset of jobs to be processed in the time interval (0, T], such that the sum of the individual profits of the scheduled jobs is maximized. We refer to this problem as the job selection with controllable processing times and tardiness (JSCT) scheduling problem. Because the JSCT scheduling problem is NP-hard (it generalizes the minimum sum of weighted tardiness costs scheduling problem), we propose a two-part heuristic solution approach for generating close-to-optimal solutions.
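To make the problem data concrete, the following sketch (an illustrative data model of our own, not taken from the paper) encodes a job and its profit contribution wj − lj(Cj − dj)⁺ − cxj for a given completion time and compression level:

```python
from dataclasses import dataclass

# Illustrative data model for a JSCT job; the field names are our own.
@dataclass
class Job:
    revenue: float       # w_j >= 0
    release: float       # r_j
    due: float           # d_j
    ptime: float         # normal processing time p_j
    max_compress: float  # u_j, with 0 <= u_j <= p_j
    tard_rate: float     # l_j, tardiness cost per unit tardy

def job_profit(job: Job, completion: float, x: float, c: float) -> float:
    """Profit of an accepted job finishing at `completion` with
    compression x: w_j - l_j * (C_j - d_j)^+ - c * x."""
    assert 0 <= x <= job.max_compress
    tardiness = max(0.0, completion - job.due)
    return job.revenue - job.tard_rate * tardiness - c * x
```

For example, a job with revenue 10, due date 5, and tardiness rate 2 that finishes at time 7 using one unit of compression at unit cost 1 earns 10 − 2·2 − 1 = 5.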
We note that our model does not explicitly consider job rejection costs. If job rejection costs exist, then a job may either be accepted, thus generating a profit, or rejected at a certain rejection penalty cost. If, however, we add the rejection cost to the profit of each job and set the rejection penalty cost equal to zero for all jobs, the problem is transformed into an equivalent problem without rejection costs. Note that in this case, our model's objective function will contain a component equal to the sum of the rejection costs of the jobs we select; we must therefore subtract this term in order to obtain the correct value of overall profit.

We next discuss work from the prior literature that is related to this paper. For selective demand problems with a single resource constraint, Kleywegt and Papastavrou (2001) studied a dynamic and stochastic knapsack problem, where a deterministic resource quantity is available, each demand requires some amount of the resource, and a reward (the value of which is unknown prior to job arrival) is generated upon acceptance. The problem is to select the demands that maximize expected profit. They analyze both infinite-horizon and finite-horizon cases and provide an optimal acceptance policy. Engels et al. (1998) consider a scheduling problem with a penalty cost for each rejected job; the objective minimizes the sum of the weighted completion times of the scheduled jobs plus the sum of the penalties of the rejected jobs. Bartal, Leonardi, Marchetti-Spaccamela, Sgall, and Stougie (2000) studied a multiprocessor scheduling problem with job rejections, which minimizes the makespan of the schedule for accepted jobs plus the sum of the penalty costs of rejected jobs. Epstein, Noga, and Woeginger (2002) considered on-line versions of job selection and rejection scheduling problems, where jobs arrive on-line, one at a time.
They provide a greedy algorithm to minimize the total completion time of the accepted jobs plus the job rejection penalties. Another related problem in the literature is the throughput maximization problem (TMP). The TMP is a scheduling problem with an objective of maximizing the total profit associated with jobs delivered on time or, equivalently, minimizing a weighted number of late jobs. The TMP has no finite time horizon constraint and assumes that a job delivered late yields no profit (in contrast, the problem we consider allows jobs to be delivered late at some tardiness cost). The TMP is NP-hard even when all jobs are released simultaneously (Sahni, 1976). The preemptive version of the TMP was studied by Lawler (1990), who provided a pseudo-polynomial time algorithm. Bar-Noy, Guha, Naor, and Schieber (2001) provided constant-factor approximation algorithms for the non-preemptive version of the TMP with release date constraints. On-line versions of the problem for the preemptive and non-preemptive cases were considered in Baruah et al. (1992), Koren and Shasha (1995), and Lipton and Tomkins (1994).
² The constraint on the time horizon T can be viewed as a rolling planning horizon used by the firm or some customer-imposed deadline. Note that if we let T = maxj{dj + ⌈wj/lj⌉}, so that no job has positive profit after T, this problem becomes a "normal" scheduling problem, effectively without a finite time horizon constraint.
Berman and Dasgupta (2000) studied the case when all job parameters are positive integers, and provided a pseudo-polynomial algorithm. Scheduling problems with controllable processing times are the subject of a considerable number of papers in the recent literature (see, e.g., the survey by Nowicki & Zdrzalka, 1990, which summarizes results on controllable processing times within the following problem classes: 1| |Tmax, 1|rj|Cmax, 1|rj|Lmax, and 1| |fmax). Cheng, Chen, Li, and Lin (1998) studied the problem with the objective of minimizing the sum of time compression costs and the costs associated with the number of late jobs. Cheng, Janiak, and Kovalyov (2001) consider the problem when both the job processing times and setup times can be compressed, and they provide an algorithm to minimize maximum job lateness subject to an upper bound on total weighted resource consumption. In each of these papers, all of the jobs in the job set must be processed, whereas we consider job acceptance and rejection decisions in combination with controllable processing times and tardiness costs associated with late deliveries. To our knowledge, the JSCT scheduling problem, despite its relevance to practical scheduling settings, has not yet received attention in the scheduling literature.

The remainder of this paper is organized as follows. Sections 2 and 3 present two components of a heuristic approach for the JSCT scheduling problem. Section 2 first discusses an algorithm for determining the optimal start and finish times (and therefore, compression times) for a fixed sequence of jobs. Section 3 then presents two heuristic methods for sequencing decisions. Section 4 presents a set of computational test results comparing the performance of our heuristic approaches, while Section 5 presents conclusions and future research directions.
2. Optimizing a fixed job sequence

We next present an algorithm, which we call the compress and relax algorithm, for determining the optimal start times, finish times, and compression times for a given job sequence. The algorithm provides a solution to the JSCT scheduling problem (note that any jobs whose finish times exceed the fixed scheduling horizon length T will be discarded or rejected). The main idea of the compress and relax algorithm is to begin by applying the maximum amount of compression time possible to each job, and then iteratively reduce the amount of compression time applied. A similar idea appears in Yang, Geunes, and O'Brien (2004), who study a different problem involving the tradeoff between tardiness and overtime costs in single-resource scheduling. The problem they considered differs in that it schedules jobs with fixed processing times during successive periods of regular and overtime, where overtime and regular time involve different processing costs. Moreover, they do not allow for job selection as we do in this paper. Our compress and relax algorithm has two phases. In the first phase (called the compression phase), given a fixed sequence of jobs, we set each job's processing time equal to the shortest possible processing time, i.e., we let p′j = pj − uj, and consider the resulting schedule for the fixed sequence; note that this produces the largest possible compression cost and generates the greatest possible revenue (since as many jobs as possible are completed prior to the end of the scheduling horizon T). We refer to the resulting schedule as the compressed schedule.
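A minimal sketch of this compression phase follows (assuming jobs are given as (rj, pj, uj) triples; the helper name is our own):

```python
def compressed_schedule(jobs, sequence):
    """Phase 1 of compress-and-relax: schedule the fixed `sequence`
    with every job fully compressed (processing time p_j - u_j),
    starting each job at the later of its release date and the
    previous job's completion time.  jobs[j] = (r_j, p_j, u_j);
    returns a list of (start, finish) pairs in sequence order."""
    t = 0.0
    sched = []
    for j in sequence:
        r, p, u = jobs[j]
        start = max(t, r)
        finish = start + (p - u)
        sched.append((start, finish))
        t = finish
    return sched
```

With two jobs of normal lengths 5 and 4 and maximum compressions 2 and 1, both released at time 0, the compressed schedule occupies (0, 3) and (3, 6).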
In the second phase (called the relax phase), beginning with the job scheduled last, we sequentially reduce the amount of job time compression; consequently, some jobs may incur increased tardiness costs, and some jobs initially scheduled within the (0, T) scheduling horizon may start and/or finish after the time limit T (such jobs must then be excluded). As a result of the relax phase, the net revenue from jobs may be reduced (due to tardiness costs and excluded jobs that finish after time T), while the total compression cost decreases with respect to the first-phase solution. We therefore face a tradeoff, since we may simultaneously reduce net revenue and compression costs. Our goal is to find the best amount of time compression xj for each job j such that the net revenue minus cost (net profit) is maximized for the fixed sequence. Before presenting the algorithm, we first require the following definition and notation.

Definition (Independent Subset). For any schedule of jobs, an independent subset of jobs satisfies the following properties: (i) the release date of the first job in the subset is strictly greater than the completion date of the job's immediate predecessor; (ii) the completion date of the last job in the subset is strictly less than the release date of its immediate successor;
and (iii) no unscheduled (idle) time exists between the start of the first job in the subset and the completion of the last job in the subset.

Notation:

γ: The amount of total reduction in compression time after implementing the relax phase, i.e., γ = Σⁿⱼ₌₁ (uj − xj), where xj is the final amount of compression time applied to job j, and 0 ≤ xj ≤ uj. Note that γ is a decision variable whose final value is determined in the relax phase of the algorithm.

δj: The maximum value of γ such that, in the relax phase, job j is not delayed beyond its due date with respect to the initial compressed schedule. If job j is already delayed beyond its due date after the compression procedure, we set δj = 0.

σj: The maximum value of γ such that job j will be completed within the original scheduling horizon (0, T); for jobs that finish beyond T after the compression procedure, we set σj = 0.

We first assume that there is only one independent subset in the schedule (which occurs when all release times equal zero, for example), and present the following proposition, which allows us to show the optimality of the compress and relax algorithm for a fixed sequence of jobs:

Proposition 1.
(i) Within an independent subset, compressing the same amount of processing time for jobs scheduled earlier generates at least as great a benefit as for jobs scheduled later.
(ii) The quantities δj and σj for job j are given by the following equations, where L(j) denotes the set of successors of job j and C′j is the completion time of job j after the compression phase:

δj = Σ_{i∈L(j)} ui + (dj − C′j)⁺;

σj = Σ_{i∈L(j)} ui + (T − C′j), if C′j ≤ T;  σj = 0, if C′j > T.
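Under the single-independent-subset assumption, Proposition 1(ii) translates directly into code (a list-based sketch with our own naming):

```python
def delta_sigma(u, C0, d, T):
    """Compute delta_j and sigma_j from Proposition 1(ii) for jobs
    listed in sequence order within one independent subset:
    u[j] = maximum compression, C0[j] = completion time after the
    compression phase, d[j] = due date.  The successor set L(j) is
    simply the jobs after position j in the sequence."""
    n = len(u)
    delta, sigma = [], []
    for j in range(n):
        succ_u = sum(u[i] for i in range(j + 1, n))  # sum over i in L(j)
        delta.append(succ_u + max(0.0, d[j] - C0[j]))
        sigma.append(succ_u + (T - C0[j]) if C0[j] <= T else 0.0)
    return delta, sigma
```

For three jobs with unit maximum compressions, compressed completions (2, 4, 6), due dates (3, 4, 5), and T = 10, this yields δ = (3, 1, 0) and σ = (10, 7, 4).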
(iii) Total profit as a function of γ is a piecewise linear curve, where the peak points are the points where γ equals some δj or σj. The maximum possible profit for a given sequence of jobs must occur at one of these peak points.

Proof. Part (i) follows since compressing the processing times of jobs scheduled earlier decreases the completion times of a greater number of jobs and, as a result, less total tardiness cost is incurred for the same amount of processing time compression. Part (ii) reflects the fact that the values of δj and σj equal the sum of the compression times of job j's successor jobs (which are relaxed before job j) plus the difference between its completion time after the compression phase and its due date (for δj) or the horizon limit T (for σj). This follows directly from part (i), since we should reduce the compression time of successor jobs before we reduce job j's compression time. Also, if a job i is lost during the relax phase by finishing beyond time T, we effectively lose all of its compression time used in the initial schedule. Therefore, even if a successor job i is dropped from the schedule before relaxing all of its compression time, we can still claim that we have relaxed all of the compression time associated with job i. Part (iii) is illustrated through the profit curve in Fig. 1. The profit equals the total revenue less the sum of total tardiness and compression costs as a function of γ. Note that the revenue curve is a non-increasing step function, since only when γ equals some σj will job j exceed the time limit T, and as a result the total revenue decreases by wj (the revenue of job j). The tardiness cost curve is a piecewise-linear, discontinuous function of γ. Only when γ equals some δj for some job j will that job begin to incur tardiness at a rate of lj per unit time tardy.
If the reduction in compression time is larger than σj, job j's completion time will exceed the time horizon T and we no longer include its revenue or tardiness cost; i.e., for γ > σj, the tardiness cost of job j
[Figure: revenue, tardiness cost, compression cost, and profit plotted against the compression time reduction γ, with peak points at values such as δ1, σ3, δ2, σ1.]
Fig. 1. Profit as a function of reduction in compression time, γ.
will be subtracted from the total tardiness cost. Thus we see that the profit curve changes slope at each value of δj and has a downward step of discontinuity at each value of σj. The compression cost, on the other hand, is simply linear in the total compression time with slope c. As the combination of the three, the profit curve is a piecewise linear, discontinuous curve. The curve's "peaks" occur when the compression time is reduced by some σj or δj, and the maximum profit must occur at one of these peak points. □

If C′j denotes the completion time of job j in the compressed schedule, we can characterize the profit function in terms of the total compression reduction time γ using the following functional representation:

π(γ) = Σ_{j∈N} [wj − ((γ − δj)⁺ + (C′j − dj)⁺) lj] fj(γ) + c(γ − Σ_{j∈N} uj)
     = Σ_{j∈N} [w′j − (γ − δj)⁺ lj] fj(γ) − c(Σ_{j∈N} uj − γ),

where

fj(γ) = 1 if γ < σj, and 0 otherwise; and w′j = wj − (C′j − dj)⁺ lj.
To illustrate the derivation of this function, note that the first part of the function is the revenue less the total tardiness cost. Here w′j is the profit contribution of job j, which accounts for the original tardiness cost resulting from the compression phase. The quantity (γ − δj)⁺ lj is the additional tardiness cost incurred for job j if we reduce the total compression time in the relax phase by γ. If γ > σj, job j is scheduled after period T (thus zero profit is obtained for job j); the function fj(γ) ensures that such jobs contribute no revenue or tardiness costs. The last part of the function is the total compression cost incurred. The peak points occur at the boundary points γ = 0 and γ = Σⁿⱼ₌₁ uj, and at points of the form γ = δj or γ = σj, j = 1, . . ., n. We can thus evaluate the profit at all of these points to determine an optimal solution for a fixed sequence.
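The peak-point evaluation can be sketched as follows (single independent subset, with δj and σj computed inline; the function name and the strict-inequality convention for fj are our own reading of the paper's definitions):

```python
def best_gamma(w, l, u, C0, d, T, c):
    """Evaluate pi(gamma) at the candidate peak points gamma in
    {0, sum(u)} and every delta_j, sigma_j within range, and return
    the maximizing (gamma, profit) pair for one independent subset.
    Uses the closed form with f_j(gamma) = 1 iff gamma < sigma_j and
    w'_j = w_j - (C0_j - d_j)^+ l_j."""
    n = len(w)
    delta, sigma = [], []
    for j in range(n):
        succ_u = sum(u[i] for i in range(j + 1, n))
        delta.append(succ_u + max(0.0, d[j] - C0[j]))
        sigma.append(succ_u + (T - C0[j]) if C0[j] <= T else 0.0)
    total_u = sum(u)
    cands = {0.0, float(total_u)}
    cands |= {float(g) for g in delta + sigma if 0 <= g <= total_u}
    def profit(g):
        p = -c * (total_u - g)                   # compression cost term
        for j in range(n):
            if g < sigma[j]:                     # job j still finishes by T
                w_prime = w[j] - max(0.0, C0[j] - d[j]) * l[j]
                p += w_prime - max(0.0, g - delta[j]) * l[j]
        return p
    return max(((g, profit(g)) for g in sorted(cands)), key=lambda gp: gp[1])
```

For a single job with w = 10, l = 1, u = 4, C′ = 2, d = 3, T = 8 and c = 2, the candidates are γ ∈ {0, 1, 4}, and the best choice is γ = 4 (no compression) with profit 7: the job then finishes at time 6, incurring tardiness cost 3 but no compression cost.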
2.1. Non-zero release times and independent subsets

We next consider the case in which more than one independent subset can exist in the initial compressed schedule, which may occur when release times are non-zero. In this case, we begin by applying the relax phase of the algorithm to the independent subset that finishes last in the compressed schedule. We then apply the relax phase to the second-to-last independent subset. It may happen, however, that when relaxing the compression time in the second-to-last subset, this subset becomes blocked from further relaxation by the first job in the last subset. When this happens, based on Proposition 1 (part (i)), we must reconsider relaxing additional compression time in the last subset, if compression time still remains in that subset, before further relaxing compression time in the second-to-last subset. More generally, suppose that after the compression phase we have m scheduled independent subsets denoted by S1, S2, . . ., Sm. We index independent subsets in increasing order of the start of the first job in the subset, and we say that Sl > Sk for any subsets Sk and Sl if the start of the first job in subset Sl is later than the start of the first job in subset Sk. If, during the relax procedure, a subset's compression time is "relaxed" enough that the finish time of the last job in the subset reaches the start time of the first job in the succeeding subset, then these two subsets merge into a new subset and, based on Proposition 1, we restart the relax procedure on the newly formed independent subset. Note that when subsets merge, we need to revise the values of δj and σj for all jobs j in the merged subsets except for those in the earliest (lowest-indexed) subset in the merge. We denote the idle time between subsets k and k + 1 in the initial compressed schedule as Sk,k+1, which equals the starting time of the first job in subset k + 1 minus the completion time of the last job in subset k.
We first relax the compression time in subset k + 1, which occurs after subset k and therefore does not affect subset k. After considering subset k + 1, we then consider subset k. Clearly we can relax the compression time in subset k by an amount equal to Sk,k+1 before any of the jobs in subset k + 1 are affected. If we relax the compression time in subset k by Sk,k+1, then at this point we must consider the merged subset (which contains the union of subsets k and k + 1). Therefore, immediately before we consider subset k (and immediately after considering subset k + 1), the time Sk,k+1 must be added to the values of δj and σj for each j in subset k + 1, to reflect the additional compression time reduction that occurs in the algorithm before these jobs are affected. There are at most O(n) possible independent subsets, and each job's compression time reduction will be evaluated at most O(n) times. Computing δj and σj at each step requires O(n) time. Thus, the total complexity of the compress and relax procedure is O(n³). This leads to the following theorem:

Theorem 1. The compress and relax algorithm optimally solves the problem with identical compression costs in O(n³) time for a given fixed sequence of jobs.

We next summarize the compress and relax algorithm.

Compress and Relax algorithm
Step 1: Set each job's compression time to its maximum value uj, and schedule the jobs using the sequence determined by the priority rule. Keep track of each independent subset that results.
Step 2: Begin reducing the compression time starting with the last job in the last subset and working backward in time. Calculate the maximum profit for each subset by checking the possible values of γ in the set, i.e., γ = δj and γ = σj for all j in the subset, as well as γ = 0 and γ = Σⁿⱼ₌₁ uj. Merge subsets as necessary and restart the time compression relaxation process at the end of the new subset after merging. Continue this procedure until reaching the first subset.
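The inter-subset gaps Sk,k+1 drive both the δ/σ offsets and the merge test described above. A small helper computes them (our own representation: each subset is a time-ordered list of (start, finish) pairs):

```python
def subset_gaps(subsets):
    """Return the idle gaps S_{k,k+1} between consecutive independent
    subsets of a compressed schedule: the start time of the first job
    in subset k+1 minus the completion time of the last job in
    subset k, for k = 0, ..., m-2."""
    return [subsets[k + 1][0][0] - subsets[k][-1][1]
            for k in range(len(subsets) - 1)]
```

Immediately after the relax phase finishes with subset k + 1, Sk,k+1 is added to every δj and σj in subset k + 1; when a subset's relaxation consumes the whole gap, the two subsets merge and the procedure restarts on the merged subset.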
3. Methods for determining a good job sequence

Next we require a method to determine the best job sequence, which represents a difficult combinatorial optimization problem. This section discusses two heuristic approaches for generating good job sequences. The first method uses a greedy randomized adaptive search procedure (GRASP), while the second is a modified two-phase algorithm (2PA) based on a previous approach used for the throughput maximization problem (TMP). We discuss these approaches in Sections 3.1 and 3.2, respectively.
3.1. GRASP approach

The GRASP approach is a multi-start, iterative procedure that has been successfully applied to a number of combinatorial optimization problems in the past literature. Each GRASP iteration consists of two phases: in the first (construction) phase, a feasible solution is produced, and in the second (local search) phase, a locally optimal solution in the neighborhood of the constructed solution is sought. The best overall solution is retained across iterations. In the construction phase, a feasible solution is built one element (here, one job) at a time. The next element to be added at each step is determined by ordering all candidate elements in a candidate list with respect to a priority rule. The probabilistic component of a GRASP comes from randomly choosing among the candidates in the priority list, rather than always choosing the best candidate. We use the construction phase of GRASP to solve the JSCT scheduling problem: using a priority rule, we compute a priority value for each available job whenever the machine finishes a selected job. We next define notation for the parameters used in our priority rule:

tk: The time at which we schedule the start of the job in the kth position in the sequence; initially, for the first scheduled job, t1 = 0. After scheduling job k − 1, tk equals the maximum of the completion time of the (k − 1)st job and the minimum release time among all remaining unscheduled jobs.

φAVG: The average job revenue weighted by tardiness cost per time unit tardy, i.e., φAVG = Σⁿⱼ₌₁ (wj lj)/n. Note that when either the revenue or the tardiness cost of a job is large relative to the compression cost, we desire greater compression time for the job.

xkj: An estimate of the amount of compression time that will be applied to job j in the final schedule.
That is, we set xkj = λuj, where λ = min{αφAVG/c, 1} and α is a scalar between 0 and 1. Observe that if the compression cost is very large relative to the weighted average revenue, λ is near 0, meaning little or no compression will be applied; conversely, when λ = 1, the weighted average job revenue far exceeds the compression cost, and the maximum amount of compression time is used.

pkj: An estimate of the processing time that will be used for job j, i.e., pkj = pj − λuj.

The priority rule we use for evaluating jobs at time tk is given by the following formula:

πj(tk) = [wj − lj(tk + pkj − dj)⁺ − cxkj] / pkj, if rj ≤ tk;  πj(tk) = −∞, otherwise.

This priority rule is motivated by the following ideas. The numerator of πj(tk), i.e., the quantity wj − lj(tk + pkj − dj)⁺ − cxkj, provides an estimate of the net revenue of job j when scheduled as the kth job in the sequence (i.e., the next job in the sequence). A job with high net revenue naturally has a high selection priority. The denominator of πj(tk) ensures that a job with a long processing time receives lower priority and is scheduled later, so as not to delay too many other jobs in the schedule. In terms of job selection, those jobs that are scheduled with completion times before time T have been selected, while those jobs that remain unscheduled at time T are not selected. Given the sequence of jobs determined by the GRASP algorithm, we can then apply the compress and relax algorithm to this sequence to determine the optimal job start and completion times.

3.2. A modified two-phase algorithm

In the introduction we discussed a problem related to the JSCT scheduling problem called the throughput maximization problem (TMP). Berman and Dasgupta (2000) developed a two-phase algorithm (2PA) for the TMP with a worst-case performance ratio of 2.
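The GRASP construction phase of Section 3.1 might be sketched as follows (the restricted-candidate-list size, the tie handling, and the tuple encoding of jobs are our own assumptions, not the paper's):

```python
import math
import random

def priority(job, t_k, lam, c):
    """pi_j(t_k): estimated net revenue of scheduling job j next at
    time t_k, divided by its estimated processing time; -inf if the
    job has not yet been released.  job = (w, r, d, p, u, l)."""
    w, r, d, p, u, l = job
    if r > t_k:
        return -math.inf
    x_est = lam * u                       # estimated compression x^k_j
    p_est = p - x_est                     # estimated processing time p^k_j
    return (w - l * max(0.0, t_k + p_est - d) - c * x_est) / p_est

def grasp_construct(jobs, c, alpha, rcl_size=3, seed=0):
    """One construction pass: repeatedly rank the released, unscheduled
    jobs by priority and pick randomly among the top rcl_size."""
    rng = random.Random(seed)
    n = len(jobs)
    phi_avg = sum(w * l for (w, _, _, _, _, l) in jobs) / n
    lam = min(alpha * phi_avg / c, 1.0)   # lambda = min{alpha*phi_AVG/c, 1}
    t, seq, remaining = 0.0, [], set(range(n))
    while remaining:
        # t_k: max of previous completion and earliest remaining release
        t_k = max(t, min(jobs[j][1] for j in remaining))
        ranked = sorted((j for j in remaining
                         if priority(jobs[j], t_k, lam, c) > -math.inf),
                        key=lambda j: priority(jobs[j], t_k, lam, c),
                        reverse=True)
        pick = rng.choice(ranked[:rcl_size])
        seq.append(pick)
        remaining.remove(pick)
        t = t_k + jobs[pick][3] - lam * jobs[pick][4]  # estimated completion
    return seq
```

With rcl_size = 1 the pass is deterministic and always takes the highest-priority job; the resulting sequence is then handed to the compress and relax algorithm of Section 2.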
Using a modification of their approach, we can develop an alternative method for selecting and sequencing jobs for the JSCT scheduling problem. Their two-phase algorithm is derived by solving a problem called the interval selection problem (ISP). For each job j ∈ {1, 2, . . ., n}, we are given a family of integer intervals Sj, and each interval is described
by the four parameters (j, wj, s, C), where wj is the profit³ associated with the interval, s is the interval start time, and C is the interval finish time. If an integer interval (j, wj, s, C) from Sj is ultimately selected, this implies that job j starts at time s, finishes at time C, and a profit of wj is realized. The ISP requires selecting at most one interval from each interval family (where one "family" is associated with each job), so that the selected intervals are disjoint and the sum of the individual profits is maximized. The TMP is easily formulated as an equivalent ISP in the following way: for each job j, we associate a family of integer intervals Sj, where each interval is denoted by (j, wj, sj, Cj), wj is the job's profit, sj is the starting time, and Cj is the finishing time of the job, with sj ≥ rj and Cj ≤ dj. The basic idea of the 2PA algorithm is as follows. In the first (evaluation) phase, we initially create a list L of all possible (j, wj, s, C) quadruples associated with all jobs, sorted in non-decreasing order of finish time. Then a stack Ŝ is created, into which we will later heuristically insert potentially desirable elements of L (i.e., desirable quadruples). The stack Ŝ is simply a list of potentially good assignments of jobs to time intervals; note that we may have more than one stack element associated with a given job j (where different elements associated with job j have different start and completion times and profits). Associated with each element of the stack is a value v that characterizes the attractiveness of the assignment. The stack is initially empty and is built by analyzing the list L, one element at a time. If v > 0, we insert the quadruple (j, wj, s, C) into the stack.
In order to compute the value v of an element of the stack Ŝ, define TOTAL(s) as the sum of the values of those jobs in the stack whose completion times are greater than s; for each job j, let total(j, s) denote the sum of the values of all elements of Ŝ that correspond to job j and have completion times less than or equal to s. Define the value of an element of the list as v = wj − total(j, s) − TOTAL(s); observe that v equals the profit of job j, less the values associated with all prior occurrences of job j in the stack whose completion time does not exceed s, and less the values associated with jobs in the stack whose completion time exceeds s, i.e., whose scheduled time would conflict with that of job j under quadruple (j, wj, s, C). In the second (selection) phase, beginning at the top of the stack, we consider each quadruple (j, wj, s, C) in the stack. If the job j associated with the quadruple under consideration has not been scheduled, and the interval (s, C) does not overlap any previously scheduled job, we schedule job j with start time s and completion time C; otherwise, the quadruple is discarded. A feasible solution is generated at termination. Berman and Dasgupta (2000) show that the 2PA algorithm solves the TMP in O(ns(1 + log log s)) time with an approximation ratio of 2 (where s is the maximum due date among all jobs). For a detailed discussion of the algorithm and the complexity and performance results, see Berman and Dasgupta (2000). For the JSCT scheduling problem, we must modify this approach because we schedule jobs within a finite time horizon T, and we also allow job tardiness and processing time reduction. The nature of the required modifications is as follows. For a given job j and completion time Cj, we can now have a number of associated intervals, depending on the amount of time compression utilized. Given a job j and completion time Cj, we can have up to uj + 1 starting times sj, equal to Cj − pj, . . ., Cj − pj + uj. We therefore define wj and sj as functions of the completion time Cj and compression time xj, and have a set of intervals (j, wj(Cj, xj), sj(Cj, xj), Cj) for every possible (j, Cj) pair. The interval (j, wj(Cj, xj), sj(Cj, xj), Cj) has net profit wj(Cj, xj) = wj − lj(Cj − dj)⁺ − cxj and starting time sj(Cj, xj) = Cj − pj + xj. Since xj is an integer decision variable, we must consider all possible values of xj such that 0 ≤ xj ≤ uj. To decrease the complexity of this approach, we use two-dimensional data arrays to determine the value of each interval. Moreover, in the evaluation phase, for each possible time period, instead of putting all intervals with positive value into the stack, we put only the interval with the largest value into the stack (among all intervals whose corresponding job is completed in the associated time period), thus reducing the overall worst-case complexity of the algorithm. The modified 2PA algorithm (Mod-2PA) is described as follows.
³ In the version of the problem considered by Berman and Dasgupta (2000), since there are no costs defined, revenue is equivalent to profit.
Mod-2PA algorithm

We begin by generating all intervals (j, wj(Cj, xj), sj(Cj, xj), Cj) for every job j and completion time Cj, such that sj(Cj, xj) = Cj − (pj − xj) and wj(Cj, xj) = wj − lj(sj + pj − xj − dj)+ − cxj, with s ∈ (rj, T − pj + uj), for 0 ≤ xj ≤ uj and integer, and j ∈ [1, n]. Let Lt denote the set of the intervals (j, wj(Cj, xj), sj(Cj, xj), Cj) such that Cj = t, for t = 2, . . . , T and wj(Cj, xj) > 0. We maintain a two-dimensional data array A, where entry ajk records the value of the selected interval for job j when ending = k; initially we let each entry ajk = 0, for j = 1, . . . , n and k = 2, . . . , T. The evaluation phase that we next describe begins at time t = 2 and works in increasing time period index order, and we assume that all jobs take at least two periods.

Evaluation Phase
If t = 2 and L2 ≠ ∅, select j′ such that interval (j′, wj′, 1, 2) contains the largest wj in L2, and insert this interval into the stack Ŝ; set aj′2 equal to wj′ (otherwise let t = 3 and begin the following loop).
for (each t > 2 in increasing time index order)
  If Lt = ∅, set t = t + 1, and repeat this step. Otherwise, continue.
  {
    for (each interval (j, wj(Cj, xj), sj(Cj, xj), Cj) of Lt)
      Set vj := wj(Cj, xj) − Σ_{k=1}^{s} ajk − Σ_{k=s+1}^{t−1} Σ_{i=1}^{n} aik;
    Select j′ such that interval (j′, wj′(Cj, xj), sj′(Cj, xj), Cj) contains the largest vj in Lt and insert this interval into the stack Ŝ;
    Set aj′t := vj′;
  }
The selection phase is the same as that for the original 2PA.

Based on a prior result for the 2PA algorithm (see Berman & Dasgupta, 2000), we have the following theorem.

Theorem 2. The Mod-2PA algorithm solves the JSCT scheduling problem in O(umaxnT) time with an approximation ratio of 2, where umax = max_{j∈[1,n]} uj.

Proof.
In the evaluation phase, there are O(T) interval sets Lt, and each set is composed of O(umaxn) intervals, since each job that finishes at time t has O(umax) possible starting times, sj = t − pj + xj, 0 ≤ xj ≤ uj. Thus the total complexity of the algorithm is O(umaxnT). To show that the algorithm has the same approximation ratio 2 as the original 2PA, we need to consider the differences between the two algorithms. The first difference is in the calculation of the value of the interval. In the Mod-2PA algorithm, since each entry ajk of A records the value of the selected interval of job j with ending = k, the quantity Σ_{k=1}^{s} ajk is the sum of the values of those intervals associated with job j in the stack Ŝ that have ending less than or equal to s, i.e., Σ_{k=1}^{s} ajk = total(j, s). Since aik records the value of the selected interval with ending = k, and no interval with ending larger than t has been selected, Σ_{k=s+1}^{t−1} Σ_{i=1}^{n} aik is the sum of the values of those intervals in Ŝ that have ending greater than s, i.e., Σ_{k=s+1}^{t−1} Σ_{i=1}^{n} aik = TOTAL(s). Thus, for job j, the computed value vj satisfies vj := wj(Cj, xj) − Σ_{k=1}^{s} ajk − Σ_{k=s+1}^{t−1} Σ_{i=1}^{n} aik = wj(Cj, xj) − total(j, s) − TOTAL(s), and the calculation of the value v is the same for both algorithms. The second difference between the two algorithms is the number of intervals added to the stack. In the original 2PA, all of the intervals are sequentially evaluated and added to the stack if they have positive values, while in the Mod-2PA, we first sort intervals in non-increasing order of value and only add the largest-valued interval in each set Lt to the stack. To illustrate this, let sjt denote the start time of job j if this job finishes at time t, i.e., sjt = t − pj. When considering the set of intervals in Lt, for each job j corresponding to an interval in Lt we have TOTAL(s), which is more precisely defined as TOTAL(sjt).
If some arbitrary job j′ is the first job that we consider adding to the stack among those jobs corresponding to intervals in Lt, then we will compute a value for job j′ equal to vj′ := wj′(Cj, xj) − total(j′, sj′t) − TOTAL(sj′t), and we will add the interval (j′, wj′, t − pj′, t) to the stack if vj′ > 0. Suppose, however, that job k provides the largest value of vj := wj(Cj, xj) − total(j, sjt) − TOTAL(sjt) prior to inserting any intervals in Lt into the stack (and clearly we must have vk ≥ vj′ by definition). If we had added job k to the stack prior to adding job j′, then we would compute the value of job j′ as vj′ := wj′(Cj, xj) − total(j′, sj′t) − TOTAL(sj′t) − vk (since clearly job k would complete processing after period sj′t). But we know by definition of the indices j′ and k that vk ≥ wj′(Cj, xj) − total(j′, sj′t) − TOTAL(sj′t), which implies that our new computation of the value of job j′ produces vj′ ≤ 0. So, sorting in non-increasing order of vj and choosing the largest one for each Lt allows us to avoid adding jobs to the stack whose value would have been negative when considered in a different order in the 2PA algorithm. Moreover, if the largest vj in the set Lt is negative, the values of all intervals in the set are also negative, and none of these should be further evaluated or selected. Therefore, for each set Lt, we only need to consider the interval with the largest value. Our algorithm allows intervals associated with the same job to have different profit values wj and different interval lengths. This does not change the lower bound on optimal profit relative to that used in the worst-case performance bound of Berman and Dasgupta (2000). Note that in determining a lower bound on an optimal solution in the original 2PA algorithm (see Berman & Dasgupta, 2000), they only consider the profit wj of an interval for job j in a feasible solution and the associated value v, and they do not require that all intervals associated with a job have the same profit wj and the same interval length. In the 2PA algorithm, they set the time horizon as the maximum due date s, since beyond s the jobs cannot finish on time and thus provide no profit; in our modified 2PA, all intervals ending after T are dropped, because they are not permitted. Therefore the proof for the algorithm's performance bound still holds, and Mod-2PA solves our problem with an approximation ratio of 2.
□

Note that we can use the solution from the Mod-2PA algorithm directly to generate a candidate solution, or we can simply use the resulting job sequence generated by this algorithm as input to the compress and relax algorithm. If we use the Mod-2PA to generate a job sequence for input to the compress and relax algorithm, the overall complexity of the entire procedure is O(umaxnT + n³) for the JSCT scheduling problem.

4. Computational tests

This section discusses a broad set of computational tests designed to assess the effectiveness of the heuristic solution approaches we have proposed for the JSCT scheduling problem. As the discussion in the previous section indicates, there are several potential heuristic approaches we can use to solve the JSCT scheduling problem. Because the Mod-2PA algorithm has a proven worst-case performance ratio of 2, we use the solution from direct application of this algorithm as a benchmark against which we can compare the other heuristic methods. We note that, as with many complex scheduling problems, the extremely high dimension of the problem class we are considering makes it very difficult to empirically obtain strong upper bounds on optimal profit through, for example, evaluating the linear programming relaxation of the mixed integer programming formulation (the linear programming relaxation itself has extremely high dimension and thus becomes difficult to solve for even medium-size problems). As an alternative, many researchers have used theoretical worst-case performance bounds to characterize the performance of heuristic approximation algorithms (see, for example, Nowicki & Smutnicki, 1996). For small problem instances, we are able to determine an optimal solution value using enumeration, as we later discuss. However, for larger problem instances, we use the Mod-2PA algorithm results as our benchmark, because of its known worst-case performance ratio of 2.
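As a concrete illustration of the Mod-2PA evaluation and selection phases described above, the following sketch implements the bookkeeping in simplified form; the function name, the interval data, and the use of a dictionary in place of the array A are our own illustrative choices, not the authors' implementation:

```python
# Simplified sketch of Mod-2PA. Intervals are (job, value, start, end) tuples,
# grouped by their ending period; a[j][k] records the value of the selected
# interval for job j with ending k (a dict stands in for the 2D array A).
from collections import defaultdict

def mod_2pa(intervals_by_end, n, T):
    a = defaultdict(lambda: defaultdict(float))
    stack = []
    # Evaluation phase: for each period t, push only the largest-valued
    # positive interval among those ending at t.
    for t in range(2, T + 1):
        best = None
        for (j, w, s, _) in intervals_by_end.get(t, []):
            total_j = sum(a[j][k] for k in range(1, s + 1))       # total(j, s)
            TOTAL = sum(a[i][k] for k in range(s + 1, t)          # TOTAL(s)
                        for i in range(1, n + 1))
            v = w - total_j - TOTAL
            if best is None or v > best[1]:
                best = ((j, w, s, t), v)
        if best is not None and best[1] > 0:
            interval, v = best
            stack.append(interval)
            a[interval[0]][t] = v
    # Selection phase (same as the original 2PA): pop from the top of the
    # stack; keep an interval if its job is unscheduled and it fits.
    scheduled, busy = {}, set()
    for (j, w, s, e) in reversed(stack):
        slots = set(range(s + 1, e + 1))
        if j not in scheduled and not (slots & busy):
            scheduled[j] = (s, e)
            busy |= slots
    return scheduled
```

On a small hypothetical instance with intervals {2: [(1, 10, 0, 2)], 3: [(2, 8, 1, 3)], 4: [(1, 6, 2, 4)]}, the first interval for job 1 is pushed with value 10; the later intervals receive non-positive values and are discarded, so the selection phase schedules only job 1, occupying periods 1–2.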
In our computational tests we evaluated the following two solution methods, denoted by H1 and H2.

H1. Directly uses the Mod-2PA algorithm, which solves the problem in O(umaxnT) time with an approximation ratio of 2.
H2. Applies the GRASP-based heuristic in Section 3.1, which uses a priority rule to determine the sequence of jobs, and then uses the compress and relax algorithm to determine the final solution.

We generated problem instances using random and uniformly distributed data according to the rules in Table 1. The parameter a in Table 1 is a scalar multiplier, whose value we vary to create a set of four different problem classes, corresponding to a = 0.5, 0.2, 0.1, and 0.05. As Table 1 indicates, we tested problem instances where jobs are reasonably similar to each other, with relatively
430
B. Yang, J. Geunes / Computers & Industrial Engineering 53 (2007) 420–432
Table 1
Rules used for randomly generating test problem parameters

Parameter                    Rule
Release times                rj = UNIF(0, T − 20)
Processing times             pj = UNIF(12, 24) · T/200
Due dates                    dj = rj + pj + UNIF(1, 2)
Maximum compression times    uj = ⌈pj/UNIF(1, 50)⌉
Job revenues                 wj = ⌈UNIF(3, 80)⌉
Tardiness cost               lj = wj/UNIF(3, 5)
Compression cost             c = a · (Σ_{j=1}^{n} lj)/n
tight due dates, which tends to increase the problem's difficulty. Note that our method for generating processing times (shown in Table 1) results in a mean processing time that increases with T. If we increase T in our experiments without accordingly increasing the job processing times, then for a given number of jobs the problem actually becomes easier, since more jobs may be selected for processing within the time horizon. Thus we scale the processing time distribution linearly in T in order to avoid reducing problem difficulty as the time horizon increases. Our goal is to determine how each of the proposed heuristic solution approaches works when the relative values of profit, tardiness penalty cost, and compression cost differ (as determined by the parameter a). We used seven different problem sets, each corresponding to a different combination of the time horizon value T and the number of jobs n. The values of (T, n) considered were (100, 10), (200, 10), (300, 10), (200, 30), (200, 50), (100, 50), and (300, 50). We use the first three problem sets to compare the performance of the heuristics to the optimal solution. Comparison among problem sets 2, 4, and 5 allows us to investigate algorithm performance as the number of jobs changes, while comparison among the last three sets allows us to determine the impact of the time horizon length. Within each problem set we considered 30 different randomly generated problem instances, which provided a total of 840 test cases (four values of a × seven problem sets × 30 problem instances per combination). Tables 2a–g present the relative performance results of the algorithms. In Tables 2a, b, and c, because of the reasonably small problem sizes (n = 10), we are able to compare the performance of the heuristics to the optimal solution value, which we denote by Z*.
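The instance-generation rules of Table 1 can be sketched as follows; because parts of the table had to be reconstructed, the exact distributions (in particular the ceiling operations) should be treated as an approximation rather than the authors' exact generator:

```python
# Sketch of random instance generation following the Table 1 rules.
# The distributions are reconstructed from the paper's table, so treat
# the details (especially the ceilings) as illustrative assumptions.
import math
import random

def generate_instance(n, T, alpha, seed=0):
    rng = random.Random(seed)
    unif = rng.uniform
    jobs = []
    for _ in range(n):
        r = unif(0, T - 20)                        # release time
        p = unif(12, 24) * T / 200                 # processing time, scaled with T
        d = r + p + unif(1, 2)                     # relatively tight due date
        u = math.ceil(p / unif(1, 50))             # maximum compression time
        w = math.ceil(unif(3, 80))                 # job revenue
        l = w / unif(3, 5)                         # tardiness cost per period
        jobs.append({"r": r, "p": p, "d": d, "u": u, "w": w, "l": l})
    c = alpha * sum(job["l"] for job in jobs) / n  # compression cost
    return jobs, c

jobs, c = generate_instance(n=10, T=200, alpha=0.1)
```

A seeded generator of this form makes each of the 30 instances per problem set reproducible while varying only the multiplier a across problem classes.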
In the remaining tables, which consider larger problem sizes, we compare the performance of the two heuristics, noting that the Mod-2PA heuristic (H1) carries a worst-case performance bound of 2. Tables 2a, b, and c show that heuristic H2 (which uses GRASP and our compress and relax algorithm) performs quite well relative to the optimal solution value, while heuristic H1 does reasonably well. These results indicate that heuristic performance is robust to changes in the time horizon length. In the remaining tables, the ratios in the last column show the performance of heuristic H2 relative to the base heuristic H1, which has a worst-case performance ratio of 2. Thus, a higher value in this column implies better relative performance of H2 (note that these ratios should not, therefore, be interpreted as optimality gaps). We make several observations from the results in Tables 2a–g and from our analysis of the corresponding solutions. Heuristic H2 provides the best average performance in all cases and appears to be quite robust to the changing cost of compression time, although the performance of the two algorithms is reasonably comparable. Note that in most cases the performance of the Mod-2PA algorithm (H1) degrades somewhat when compression costs are high. The reason for this is the following. When compression costs are high, the jobs that have high value in the evaluation phase are those with longer processing times. In the evaluation phase, the Mod-2PA algorithm can repeatedly pick the same high-valued job with a long processing time as the most valuable job for successive intervals. Since at most one interval from each job can be selected in the selection phase, this leads to many time slots without attractive alternative jobs to pick from, creating a sparse schedule.
Heuristic H2 generates a complete sequence of jobs (using the GRASP heuristic), and then uses the compress and relax algorithm to determine the best amount of compression time for all jobs. Since the compress and relax algorithm generates an optimal solution for a given sequence of jobs, the sequences generated determine the quality of Heuristic
Table 2
Summary of computational test results for randomly generated test cases*

(a) T = 100, n = 10
Problem class    a       Z*      H1 profit   H2 profit   H1/Z*   H2/Z*   H2/H1
1                0.50    466.4   442.1       466.1       0.943   0.999   1.066
2                0.20    525.5   507.1       522.0       0.963   0.993   1.033
3                0.10    569.6   552.1       563.4       0.968   0.990   1.023
4                0.05    594.5   576.4       584.9       0.969   0.984   1.016
Overall average                                          0.961   0.992   1.035

(b) T = 200, n = 10
Problem class    a       Z*      H1 profit   H2 profit   H1/Z*   H2/Z*   H2/H1
1                0.50    421.4   393.4       412.4       0.934   0.977   1.053
2                0.20    464.7   456.9       463.8       0.983   0.998   1.016
3                0.10    520.8   506.3       520.8       0.973   1.000   1.029
4                0.05    559.0   544.0       559.0       0.973   1.000   1.029
Overall average                                          0.966   0.994   1.032

(c) T = 300, n = 10
Problem class    a       Z*      H1 profit   H2 profit   H1/Z*   H2/Z*   H2/H1
1                0.50    408.6   401.0       408.6       0.977   1.000   1.029
2                0.20    433.8   425.7       433.8       0.977   1.000   1.029
3                0.10    490.7   478.6       486.5       0.975   0.990   1.016
4                0.05    552.0   531.7       540.3       0.964   0.979   1.015
Overall average                                          0.973   0.992   1.022

(d) T = 200, n = 30
Problem class    a       H1 profit   H2 profit   H2/H1
1                0.50    666.4       676.2       1.016
2                0.20    765.7       770.2       1.006
3                0.10    983.8       996.3       1.013
4                0.05    1156.2      1167.4      1.010
Overall average                                  1.011

(e) T = 200, n = 50
Problem class    a       H1 profit   H2 profit   H2/H1
1                0.50    783.9       786.1       1.003
2                0.20    929.1       932.7       1.004
3                0.10    1284.4      1302.5      1.014
4                0.05    1529.3      1545.5      1.011
Overall average                                  1.008

(f) T = 100, n = 50
Problem class    a       H1 profit   H2 profit   H2/H1
1                0.50    851.9       866.0       1.017
2                0.20    1160.3      1204.2      1.038
3                0.10    1416.4      1427.3      1.008
4                0.05    1559.5      1570.0      1.007
Overall average                                  1.017

(g) T = 300, n = 50
Problem class    a       H1 profit   H2 profit   H2/H1
1                0.50    743.9       754.3       1.015
2                0.20    827.4       827.7       1.003
3                0.10    1076.7      1086.1      1.009
4                0.05    1406.0      1418.3      1.009
Overall average                                  1.009

* Figures in the H1 profit and H2 profit columns represent the average heuristic objective function value among 30 problem instances, while all performance ratios represent an average among the 30 problem instances tested. The column a gives the compression cost multiplier.
H2. As either the number of jobs or the time horizon increases, finding a good sequence becomes more difficult for the GRASP heuristic; in such cases the performance of the two algorithms becomes quite similar as a percentage of the solution value. For large problem sizes, a larger neighborhood search can be easily added to Heuristic H2 to find an improved sequence, and the greater the amount of neighborhood search we perform, the closer the solution will likely be to optimality. Therefore, we see that the method we have developed in this
paper compares very favorably to the Mod-2PA algorithm, which has a known worst-case performance bound, and provides a viable solution method for this problem class.

5. Conclusions and future research directions

This paper focused on a single-resource scheduling problem with job-selection flexibility, which we called the job selection with controllable processing times and tardiness (JSCT) scheduling problem. We first provided the compress and relax algorithm, which optimizes the tradeoff between tardiness and compression costs (and the lost revenues from rejecting jobs that would complete after the deadline T) for a predetermined sequence of jobs. We then considered heuristic methods for job sequencing. Our computational test results demonstrated the effectiveness of these heuristic approaches in providing good solution values. This work suggests several directions for future research. One interesting direction is the on-line version of this problem, in which jobs arrive randomly without warning; greedy and approximation algorithms are typically reasonable approaches for such on-line scheduling problems. We might further consider the more general parallel machine scheduling version, which might make use of the heuristic solution methods we have provided here for the single-machine version of the problem. Finally, future research might continue to explore whether alternative algorithmic approaches can provide better worst-case performance guarantees.

References

Bar-Noy, A., Guha, S., Naor, J., & Schieber, B. (2001). Approximating the throughput of multiple machines in real-time scheduling. SIAM Journal on Computing, 31(2), 331–352.
Bartal, Y., Leonardi, S., Marchetti-Spaccamela, A., Sgall, J., & Stougie, L. (2000). Multi-processor scheduling with rejection. SIAM Journal on Discrete Mathematics, 13(1), 64–78.
Baruah, S., Koren, G., Mao, D., Mishra, B., Raghunathan, A., Rosier, L., et al. (1992).
On the competitiveness of on-line real-time scheduling. Real-Time Systems, 4, 125–144.
Berman, P., & Dasgupta, B. (2000). Multi-phase algorithms for throughput maximization for real-time scheduling. Journal of Combinatorial Optimization, 4, 307–323.
Cheng, T., Chen, Z., Li, C., & Lin, B. (1998). Scheduling to minimize the total compression and late costs. Naval Research Logistics, 45, 68–82.
Cheng, T. C. E., Janiak, A., & Kovalyov, M. Y. (2001). Single machine batch scheduling with resource dependent setup and processing times. European Journal of Operational Research, 135, 177–183.
Chopra, S., & Meindl, P. (2003). Supply chain management: Strategy, planning, and operations (2nd ed.). Upper Saddle River, NJ: Prentice Hall.
Engels, D., Karger, D., Kolliopoulos, S., Sengupta, S., Uma, R., & Wein, J. (1998). Techniques for scheduling with rejection. In Proceedings of the sixth European Symposium on Algorithms (ESA'1998), Lecture Notes in Computer Science (Vol. 1461, pp. 490–501).
Epstein, L., Noga, J., & Woeginger, G. (2002). On-line scheduling of unit time jobs with rejection: Minimizing the total completion time. Operations Research Letters, 30, 415–420.
Kleywegt, A., & Papastavrou, J. (2001). The dynamic and stochastic knapsack problem with random sized items. Operations Research, 49(1), 26–41.
Koren, G., & Shasha, D. (1995). An optimal on-line scheduling algorithm for overloaded real-time systems. SIAM Journal on Computing, 24, 318–339.
Lawler, E. L. (1990). A dynamic programming approach for preemptive scheduling to minimize the number of late jobs. Annals of Operations Research, 26, 125–133.
Lee, H. L. (2001). Ultimate enterprise value creation using demand-based management. Stanford Global Supply Chain Management Forum, Report #SGSCMF-W1-2001.
Lipton, R. J., & Tomkins, A. (1994). On-line interval scheduling. In Proceedings of the 5th Annual ACM-SIAM Symposium on Discrete Algorithms (pp. 302–311).
Nowicki, E., & Smutnicki, C. (1996).
A fast taboo search algorithm for the job shop problem. Management Science, 42(6), 797–813.
Nowicki, E., & Zdrzalka, S. (1990). A survey of results for sequencing problems with controllable processing times. Discrete Applied Mathematics, 26, 271–287.
Sahni, S. (1976). Algorithms for scheduling independent tasks. Journal of the ACM, 23, 116–127.
Yang, B., Geunes, J., & O'Brien, W. J. (2004). A heuristic approach for minimizing weighted tardiness and overtime costs in single resource scheduling. Computers & Operations Research, 31, 1273–1301.