Computers Ops Res. Vol. 19, No. 6, pp. 535-543, 1992. Printed in Great Britain. All rights reserved.
0305-0548/92 $5.00 + 0.00. Copyright © 1992 Pergamon Press Ltd
SCHEDULING STOCHASTIC JOBS WITH INCREASING HAZARD RATE ON IDENTICAL PARALLEL MACHINES
SUSAN H. XU (1), SRIKANTA P. R. KUMAR (2) and PITU B. MIRCHANDANI (3)
(1) Department of Management Science and Information Systems, The Pennsylvania State University, University Park, PA 16802, (2) Department of Electrical Engineering and Computer Science, Northwestern University, Evanston, IL 60208 and (3) Systems and Industrial Engineering Department, The University of Arizona, Tucson, AZ 85721, U.S.A.
(Received July 1989; in revised form May 1991)
Scope and Purpose-A frequently occurring production scenario may be described as follows: a set of parallel identical machines (or servers or processors) is available to process a collection of jobs; the jobs are not necessarily identical but each machine can fully process any job. In such scenarios, a common decision problem is: "When a machine becomes available, which job should be dispatched next?". Also, if preemption of a job is allowed, a related decision problem is: "At any given instant of time, do we preempt a job that is being processed and dispatch another job instead?". Although such production scenarios occur frequently in manufacturing, service, and computer systems, a prototypical application is as follows: identical printers are available for outputting computer jobs where each printer can fully provide the printing requirements of any job; when a printer becomes free, what job should it print next? Since the machines are identical, the major factors in the scheduling decisions are (1) the expected time involved in completing each of the jobs, and (2) the reward (or profit) received from completing each of the jobs. When the processing times are uncertain and probabilistic, a reasonable assumption for a large class of problem scenarios is that the longer a job has been processed, the smaller its expected remaining processing time. This may be modeled by a probability distribution for processing time that is referred to as an increasing hazard rate (IHR) distribution. Now, with regard to the reward for a job completion, since no specified priorities are assigned in our problem scenario, it is reasonable to assume that the reward should be greater if the job is completed sooner. A simple model for this reward is $\beta^t$, $0 < \beta < 1$, where $t$ is the time that the job is completed and $\beta$ is referred to as the discount factor. A more general model is to assume that the reward is a convex non-increasing function of time $t$.
The main result of this paper is that, for our problem scenario and assumptions, there is a simple optimal dispatching strategy referred to as SEPT (Shortest Expected Processing Time) that assigns a job with the shortest expected remaining processing time to any machine that becomes available, whether or not preemption is allowed. Moreover, because of our assumption of an IHR distribution, this strategy turns out to be a non-preemptive one. The SEPT strategy is particularly simple to implement and, therefore, should prove valuable in applications where our problem scenario is appropriate.
Abstract-We consider a discrete time model of $m$ identical machines operating in parallel to complete a collection of jobs. The processing times of the jobs are independent, identically distributed discrete random variables having increasing hazard rate. The jobs may have received different amounts of prior processing. A reward $\beta^t$, $0 < \beta < 1$, is acquired when a job is finished at time $t$. We show that a non-preemptive SEPT (Shortest Expected Processing Time) strategy maximizes the expected total reward among all possible policies. We also show that the SEPT strategy remains optimal for several generalizations of this problem scenario.
† Susan H. Xu is an Assistant Professor in the Department of Management Science and Information Systems at the Pennsylvania State University. She obtained her M.S. and Ph.D. degrees in Operations Research and Statistics from Rensselaer Polytechnic Institute, Troy, New York. Her research is focused on stochastic scheduling and optimal control of queues.
‡ Srikanta P. R. Kumar is an Associate Professor in the Electrical and Computer Science Department at Northwestern University. Prior to that he was on the faculties of SUNY at Buffalo and Rensselaer Polytechnic Institute, Troy, New York. He received his Ph.D. degree from Yale University, and his B.E. and M.E. degrees from the Indian Institute of Science at Bangalore, India. His publications span various aspects of communication networks, and his current research interests include stochastic scheduling, network protocols, security, and cellular and wireless networks.
§ Pitu B. Mirchandani is a Professor and the Head of the Systems and Industrial Engineering Department at the University of Arizona. Prior to that he was at Rensselaer Polytechnic Institute, Troy, New York. He holds the B.Sc. (Highest Honors) and M.S. degrees in Engineering from the University of California at Los Angeles, and the S.M. degree in Aeronautics and Astronautics and the Sc.D. degree in Operations Research from the Massachusetts Institute of Technology. Professor Mirchandani's research interests and publications are in optimization, applied probability, and computational algorithms with applications in transportation, telecommunication, and manufacturing.
1. INTRODUCTION AND BACKGROUND
The scheduling problem addressed in this paper may be loosely described as follows. A service station comprises a set of identical machines (servers or processors) operating in parallel. Initially, there is a collection of jobs that are waiting to be served, and furthermore, as time progresses new jobs may arrive. Each job can be processed on any of the machines. The processing time of a job (that is, the time required to complete a job) is a discrete random variable. A unit reward is obtained each time a job is completed. The problem is to find the optimal scheduling policy (i.e. a rule which specifies, at each time, the jobs that may be processed or assigned to the machines) which maximizes the total expected discounted reward.
In principle, the above problem may be formulated as a dynamic programming problem, and the optimal policy may be obtained as a straightforward solution (using, for example, backward induction or fixed point techniques) to the dynamic program. However, such an approach is combinatorially complex, unless the number of jobs and the number of machines are both small. In addition, the approach offers little insight into the generic structure of the scheduling policy. Thus, it is of interest to characterize conditions under which optimal scheduling strategies take the form of simple rules.
In this paper, we show that, for the above problem, the SEPT (Shortest Expected Processing Time) policy is the optimal scheduling strategy among all preemptive policies, if the processing times of all jobs are integer-valued, independent and identically distributed random variables with increasing hazard rate, whether or not the jobs have received different amounts of prior processing. The SEPT strategy is also optimal even when fresh (new) jobs arrive at arbitrary times, provided that the arriving jobs have received less prior processing than the jobs currently being processed by the SEPT strategy.
The SEPT strategy, at each instant of time, assigns to the available machines those uncompleted jobs that have the shortest expected remaining processing times. A standard limiting argument can extend our result for the discrete-time model to its continuous-time counterpart.
A literature review reveals that optimal policies for these problems are dominated by one of two policies, namely, the "Shortest Expected Processing Time First" (SEPT) and "Longest Expected Processing Time First" (LEPT) strategies. In the case of deterministic processing times, Conway et al. [1] showed that the SPT (Shortest Processing Time) strategy minimizes the flowtime. The simplest setting of a stochastic parallel machine problem is to process $n$ jobs on two identical machines. The processing times of the jobs are assumed to be exponentially distributed with different means. This problem has been extensively studied. Bruno [2], Bruno and Downey [3], and Pinedo and Weiss [4] showed that SEPT and LEPT are the optimal policies to minimize the expected flowtime and makespan, respectively. Glazebrook [5, 6] generalized the results to the $m$ parallel machine problem and proved that the SEPT strategy minimizes expected flowtime when the jobs have geometrically distributed processing times with different means. He further extended this result to its continuous counterpart, where the jobs have exponential processing times. A similar result has been obtained by Weiss and Pinedo [7], who, in addition, showed that LEPT minimizes the expected makespan if the jobs have exponential processing times. Pinedo et al. also showed that SEPT and LEPT are expected value optimal when the processing times for the $n$ jobs are hyperexponential. Additional results on identical machine problems are given by Weber [8, 9]. Weber is the first to consider non-exponential processing times.
His models assume that the jobs have identically distributed processing time $X$ ($X$ may be either discrete or continuous), but each job may have already received a different amount of processing prior to the start, and his objective is to find the optimal preemptive strategies so that the expected values of common measures, such as makespan and flowtime, are minimized. Specifically, Weber was concerned with two classes of processing time distributions:
(1) The class of processing time distributions which have a monotone hazard rate (MHR); that is, the hazard rate $h(x) = f(x)/[1 - F(x)]$ is either non-increasing (DHR) or non-decreasing (IHR), where $f(x)$ and $F(x)$ are the density and distribution functions of the processing time $X$, respectively.
(2) The class of processing time distributions which have a monotone likelihood ratio (MLR); that is, $\log f(x)$ is either concave (DLR) or convex (ILR).
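As a small illustration of the IHR property defined in (1), the sketch below computes a hazard rate for a discrete processing time and checks whether it is non-decreasing. It assumes the usual discrete convention $h(x) = P(X = x)/P(X \ge x)$ on the support $1, 2, \ldots$, which the text does not spell out; the function names `hazard_rates` and `is_ihr` are hypothetical and introduced only for this example.

```python
from fractions import Fraction

def hazard_rates(pmf):
    """Discrete hazard rates h(x) = P(X = x) / P(X >= x) for x = 1, 2, ...

    Assumed convention: pmf[k] is P(X = k + 1) and the entries sum to 1.
    """
    rates = []
    tail = Fraction(1)  # P(X >= x), starting from P(X >= 1) = 1
    for p in pmf:
        if tail == 0:
            break  # no probability mass left
        rates.append(Fraction(p) / tail)
        tail -= Fraction(p)
    return rates

def is_ihr(pmf):
    """True if the hazard rate is non-decreasing (IHR in the paper's sense)."""
    rates = hazard_rates(pmf)
    return all(a <= b for a, b in zip(rates, rates[1:]))
```

Exact fractions are used so that the monotonicity comparison is not affected by rounding.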
Weber shows that if the distribution of $X$ has an MHR, SEPT is the optimal policy to minimize the expected flowtime whereas LEPT is the optimal policy to minimize the expected makespan. In some cases, these policies also optimize the measures in distribution; see Weber [8, 9] and Weiss [10].
Weber et al. [11] also showed that if the jobs have processing times that may be stochastically ordered and the reward function $r(t)$ acquired when a job is completed at time $t$ is decreasing and convex in $t$, then the SEPT strategy maximizes the expected reward within the class of non-preemptive scheduling strategies. In particular, for $r(t) = -t$, this strategy minimizes the expected flowtime. It should be noted that the stochastically ordered distributions contain MHR distributions, and, therefore, the stochastic ordering assumption may be the weakest among such assumptions in the above results.
The problem considered here differs from the ones discussed above in several respects. First, the expected discounted reward criterion is considered. Second, the optimality of the SEPT rule is extended to systems with any non-increasing convex reward function $r(t)$, provided that $r(t)$ approaches zero as $t$ approaches infinity. Note that this class of reward functions was also considered by Weber et al. [11], under the assumptions that jobs can be ordered stochastically and that preemptions are forbidden. However, in their system scenario the SEPT policy is not optimal among the class of preemptive policies. In this paper, we show that by restricting attention to the class of IHR distributions, the SEPT strategy is optimal among preemptive policies. Third, arbitrary arrivals of new jobs are allowed, provided that the new jobs have received less prior processing than the jobs currently under execution by the SEPT strategy. Note that while arrivals of new jobs were considered by Weber [8, 9] for the makespan criterion, they have not been dealt with in the context of the flowtime criterion. Finally, we provide a counterexample to show that SEPT is not optimal for an arbitrary arrival process.
In Section 2, we formulate the problem with no job arrivals. The model along with the main result is formally stated. Section 3 contains the proof of the optimality of the SEPT strategy.
Section 4 discusses possible generalizations of our result and summarizes the paper.
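The SEPT rule described in this section ranks unfinished jobs by expected remaining processing time. A minimal sketch of that ranking is given below, assuming a common discrete distribution on $1, 2, \ldots$ and that each job's age (prior processing received) is known; the names `expected_remaining` and `sept_order` are hypothetical helpers, not part of the paper.

```python
from fractions import Fraction

def expected_remaining(pmf, age):
    """E[X - age | X > age], with the assumed convention pmf[k] = P(X = k + 1)."""
    tail = sum(pmf[age:])  # P(X > age)
    if tail == 0:
        raise ValueError("job of this age would already be finished")
    total = sum((k + 1 - age) * p for k, p in enumerate(pmf) if k + 1 > age)
    return total / tail

def sept_order(pmf, ages):
    """Job indices sorted by shortest expected remaining processing time."""
    return sorted(range(len(ages)), key=lambda i: expected_remaining(pmf, ages[i]))
```

For an IHR distribution the ranking coincides with sorting jobs by decreasing age, which is why the SEPT list never needs to be recomputed from the distribution itself.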
2. STATEMENT OF THE PROBLEM AND MAIN RESULT
We consider a discrete time model of $m$ identical machines operating in parallel to complete a collection of jobs. Each job can be done on any of the machines. In this section, we assume that there are no job arrivals. (The case with arrivals is considered later.) The processing requirements of the jobs may vary, but it is assumed that the processing times (original processing times or remaining processing times after being partially processed) can be represented by conditional discrete random variables from a common distribution $F(t)$. More specifically, we assume that the remaining processing time distribution function of job $i$ is $F_i(t) = [F(t + t_i) - F(t_i)]/[1 - F(t_i)]$, $i = 1, 2, \ldots, n$, and $t_1 \ge t_2 \ge \cdots \ge t_n$, where $t_i$ is the processing time that job $i$ has already received. The objective is to find a policy $\pi^*$ that maximizes the expected total discounted reward, that is,
$$R(\pi^*; \tau) = \max_{\pi} E\left[\sum_{t=1}^{\infty} \beta^t Z^{\pi}(t; \tau)\right], \qquad (1)$$
where $Z^{\pi}(t; \tau)$ is the (random) number of jobs finished at time $t$, given that policy $\pi$ is employed and the machine availability times are $\tau$. We define a list scheduling strategy as follows: for a given list (or permutation) $(i_1, i_2, \ldots, i_n)$ of the jobs $(1, 2, \ldots, n)$, if we schedule jobs on machines in the order of the list then we have a list scheduling strategy. Let $L$ denote a given list scheduling strategy and let $\tau = (\tau_1, \tau_2, \ldots, \tau_m)$ be
arbitrary machine availability times. Then we let $R(L; \tau)$ denote the expected total reward under the list scheduling strategy $L$ with machine availability times $\tau$. Let $L_s$ be the SEPT list $(1, 2, \ldots, n)$. Our main result is the following theorem.
Theorem. Suppose $n$ jobs have to be completed on $m$ identical machines. The processing time of job $i$ has a distribution function $F_i(t) = [F(t + t_i) - F(t_i)]/[1 - F(t_i)]$, where $t_1 \ge t_2 \ge \cdots \ge t_n$, and $F(t)$ is a distribution with non-decreasing hazard rate. Then the non-preemptive SEPT list strategy maximizes the total expected reward among all possible strategies. In other words,
$$R(L_s; \tau) = \max_{\pi} E\left[\sum_{t=1}^{\infty} \beta^t Z^{\pi}(t; \tau)\right].$$
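To make the list scheduling strategy and the reward in the theorem concrete, the following sketch evaluates the discounted reward of a list for one fixed realization of the processing times. It is only an illustration under stated assumptions: the quantity $R(L; \tau)$ in the text is the expectation of this value over the random processing times, and the function name `list_schedule_reward` is hypothetical.

```python
import heapq

def list_schedule_reward(durations, avail, beta):
    """Discounted reward of non-preemptive list scheduling for ONE realization.

    durations: realized processing times, in list order
    avail:     machine availability times (tau_1, ..., tau_m)
    beta:      discount factor, 0 < beta < 1
    Each job goes to the machine that becomes free earliest; a job finishing
    at time t contributes beta ** t.
    """
    free = list(avail)
    heapq.heapify(free)  # min-heap of the times at which machines become free
    reward = 0.0
    for d in durations:
        start = heapq.heappop(free)
        finish = start + d
        reward += beta ** finish
        heapq.heappush(free, finish)
    return reward
```

With two machines available at times 0 and 1 and realized processing times 1 and 2, serving the shorter job first yields the larger reward, in line with the SEPT ordering.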
The theorem will be proven in Section 3.
3. THE OPTIMALITY OF NON-PREEMPTIVE SEPT
To facilitate the proof of the above theorem, we establish three lemmas. The notation $X \le_{st} Y$ will mean that random variable $X$ is stochastically less than random variable $Y$. In the developments below, for convenience, we will refer to our processing time distribution as IHR (increasing hazard rate) even though the mathematical developments pertain to non-decreasing hazard rate. In a similar vein, we will use the term decreasing to mean non-increasing.
Lemma 1. Let $X$ be an integer-valued random variable with hazard rate $h(x)$. Let $X_a := (X - a \mid X > a)$ be the conditional random variable defined by the distribution
$$P(X_a > t) = P(X - a > t \mid X > a).$$
Then for any two integers $a \ge b$, $X_a \le_{st} X_b$ if and only if $X$ has an IHR distribution.
Ross [12, p. 253] gives a proof of Lemma 1. With regard to our problem, Lemma 1 states that if $X$ has an IHR distribution, then the remaining processing times of the unfinished jobs can always be ordered stochastically. Furthermore, a job that has received a larger amount of processing (or has a higher age) has a stochastically smaller remaining processing time and is likely to be finished sooner than a job that has received a lesser amount of processing.
Lemma 2. Let $h(x_1, x_2)$ be any real valued function which is decreasing in $x_1$ and increasing in $x_2$, for $x_1 \le x_2$. Also let the function satisfy $h(x_1, x_2) = -h(x_2, x_1)$. Then, if $X_1 \le_{st} X_2$,
$$E[h(X_1, X_2)] \ge 0.$$
Proof. We first remark that if a function satisfies the above conditions, then it is a decreasing function of $x_1$ and an increasing function of $x_2$ even when the assumption $x_1 \le x_2$ is relaxed. To verify this, note that if $x_2 < x_1$, then by our conditions, $h(x_2, x_1)$ is decreasing in $x_2$ and increasing in $x_1$, and $h(x_1, x_2) = -h(x_2, x_1)$. Thus $-h(x_2, x_1)$, and hence $h(x_1, x_2)$, is decreasing in $x_1$ and increasing in $x_2$.
Note that if $Y_1 \le_{st} Y_2$, then $E(f(Y_1)) \le E(f(Y_2))$ for any increasing function $f(\cdot)$ (see Ross [13], Barlow and Proschan [14]). Now, let $\tilde{X}$ be a random variable which is independent of, and identically distributed as, $X_1$. Then $\tilde{X} \le_{st} X_2$. Since $h(x_1, x_2)$ is an increasing function of its second argument, we have
$$E[h(X_1, X_2)] \ge E[h(X_1, \tilde{X})]. \qquad (2)$$
Keeping in mind that $h(x_1, x_1) = 0$, the right hand side of (2) can be written as
$$E[h(X_1, \tilde{X})] = \sum_{x_1=1}^{\infty} \sum_{\tilde{x}=1}^{\infty} h(x_1, \tilde{x}) P(X_1 = x_1) P(\tilde{X} = \tilde{x})$$
$$= \sum_{x_1=1}^{\infty} \sum_{\tilde{x}=x_1+1}^{\infty} h(x_1, \tilde{x}) P(X_1 = x_1) P(\tilde{X} = \tilde{x}) + \sum_{\tilde{x}=1}^{\infty} \sum_{x_1=\tilde{x}+1}^{\infty} h(x_1, \tilde{x}) P(X_1 = x_1) P(\tilde{X} = \tilde{x})$$
$$= \sum_{x_1=1}^{\infty} \sum_{\tilde{x}=x_1+1}^{\infty} h(x_1, \tilde{x}) \left[ P(X_1 = x_1) P(\tilde{X} = \tilde{x}) - P(X_1 = \tilde{x}) P(\tilde{X} = x_1) \right].$$
Since $X_1$ and $\tilde{X}$ are identically distributed, the above equation yields
$$E[h(X_1, \tilde{X})] = 0. \qquad (3)$$
Equations (2) and (3) imply
$$E[h(X_1, X_2)] \ge 0. \qquad \Box$$
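Lemma 1 can be checked numerically on small examples. The sketch below, which assumes a finite support $1, 2, \ldots$ and uses hypothetical helper names, computes the survival function of the remaining processing time at a given age and compares two ages pointwise; it illustrates the statement and is not a substitute for the proof in Ross [12].

```python
from fractions import Fraction

def residual_survival(pmf, age):
    """[P(X - age > t | X > age) for t = 0, 1, ...], with pmf[k] = P(X = k + 1)."""
    tail = sum(pmf[age:])  # P(X > age), assumed positive
    return [sum(pmf[age + t:]) / tail for t in range(len(pmf) - age + 1)]

def stochastically_smaller(s_a, s_b):
    """True if survival function s_a lies below s_b pointwise (missing entries are 0)."""
    n = max(len(s_a), len(s_b))
    pad = lambda s: s + [0] * (n - len(s))
    return all(x <= y for x, y in zip(pad(s_a), pad(s_b)))
```

For an IHR example the older job is stochastically smaller, while for a distribution whose hazard rate first drops the ordering can fail, as the lemma's "if and only if" suggests.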
The following lemma shows that if any two machines are available at times $x$ and $y - 1$ respectively, with $x < y$, then the expected reward will be higher if these two machines can be made available at times $x - 1$ and $y$, instead. Informally, this means that it is better to have an early machine (available at time $x$) one unit of time earlier than a late machine (available at time $y$) one unit of time earlier.
Lemma 3. Suppose $L_s$ is the SEPT list $(1, 2, \ldots, n)$. Let $(x, y, \tau_{m-2})$ denote the times at which the $m$ machines become available, where $\tau_{m-2} = (\tau_1, \tau_2, \ldots, \tau_{m-2})$ and, without loss of generality, $\tau_1 \le \tau_2 \le \cdots \le \tau_{m-2}$. Define
$$\Delta^n(x, y, \tau_{m-2}) = R(L_s; x - 1, y, \tau_{m-2}) - R(L_s; x, y - 1, \tau_{m-2}). \qquad (4)$$
Then:
(a) $\Delta^n(x, y, \tau_{m-2}) = -\Delta^n(y, x, \tau_{m-2})$.
(b) $\Delta^n(x, y, \tau_{m-2}) \ge 0$, and is decreasing in $x$ and increasing in $y$ for any $x \le y$.
Proof:
(a) To prove the first part, note that
$$\Delta^n(y, x, \tau_{m-2}) = R(L_s; y - 1, x, \tau_{m-2}) - R(L_s; y, x - 1, \tau_{m-2}) = -R(L_s; x - 1, y, \tau_{m-2}) + R(L_s; x, y - 1, \tau_{m-2}) = -\Delta^n(x, y, \tau_{m-2}).$$
Clearly, if $x = y$, $\Delta^n(x, y, \tau_{m-2}) = 0$.
(b) The proof for the second part is by induction on $n$, the number of jobs to be processed. Let $n = 1$ and let the processing time of this job be $X_1$. If $\tau_1 \le x - 1$, the job will be scheduled on the machine available at time $\tau_1$, regardless of the machines becoming available later, hence $\Delta^1(x, y, \tau_{m-2}) = 0$. If $\tau_1 \ge x$, the job will be assigned at time $x - 1$ and $x$ under the machine availability times $(x - 1, y, \tau_{m-2})$ and $(x, y - 1, \tau_{m-2})$, respectively. Since $\beta^{x-1} E(\beta^{X_1})$ and $\beta^{x} E(\beta^{X_1})$ are the rewards for completing the job with these two machine availability time vectors,
$$\Delta^1(x, y, \tau_{m-2}) = (1 - \beta)\beta^{x-1} E(\beta^{X_1}) \ge 0.$$
In summary,
$$\Delta^1(x, y, \tau_{m-2}) = \begin{cases} 0 & \text{if } \tau_1 + 1 \le x \text{ or } x = y, \\ (1 - \beta)\beta^{x-1} E(\beta^{X_1}) & \text{if } \tau_1 \ge x,\ x < y. \end{cases}$$
It can be seen easily that $\Delta^1(x, y, \tau_{m-2})$ is non-negative and is decreasing in $x$ and increasing in $y$. Now, suppose (b) is true for fewer than $n$ jobs. We shall show that it is true for exactly $n$ jobs. Let $\tau_1 + 1 \le x$, and let $X_1$ be the processing time of the first job in the SEPT list. In this case, this job will be scheduled at time $\tau_1$. Let $L_{n-1} = (2, 3, \ldots, n)$ and $\tau'_{m-2} = (\tau_1 + X_1, \tau_2, \ldots, \tau_{m-2})$. Since $E[\beta^{\tau_1 + X_1}]$ is the reward from completing the first job, we have
$$\Delta^n(x, y, \tau_{m-2}) = E[\beta^{\tau_1 + X_1} + R(L_{n-1}; x - 1, y, \tau'_{m-2})] - E[\beta^{\tau_1 + X_1} + R(L_{n-1}; x, y - 1, \tau'_{m-2})] = \Delta^{n-1}(x, y, \tau'_{m-2}). \qquad (5)$$
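The result [that (b) is true for $n$ jobs] follows from our inductive hypothesis. If $\tau_1 \ge x$, since $x < y$,
$$\Delta^n(x, y, \tau_{m-2}) = E[\beta^{x - 1 + X_1} + R(L_{n-1}; x - 1 + X_1, y, \tau_{m-2})] - E[\beta^{x + X_1} + R(L_{n-1}; x + X_1, y - 1, \tau_{m-2})]$$
$$= (1 - \beta)\beta^{x-1} E(\beta^{X_1}) + \Delta^{n-1}(x + X_1, y, \tau_{m-2}). \qquad (6)$$
Conditioning on $X_1 = c$, $\Delta^{n-1}(x + X_1, y, \tau_{m-2})$ can be written as
$$\Delta^{n-1}(x + X_1, y, \tau_{m-2}) = \sum_{c=1}^{\infty} \Delta^{n-1}(x + c, y, \tau_{m-2}) P(X_1 = c).$$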
From (a) and the inductive hypothesis, $\Delta^{n-1}(x, y, \tau_{m-2})$ satisfies the conditions of Lemma 2. Recalling our first remark in the proof of Lemma 2, we note that $\Delta^{n-1}(x + c, y, \tau_{m-2})$ is decreasing in $x$ and increasing in $y$, regardless of whether $x + c \le y$ or not. Since a linear combination, with non-negative weights, of increasing (decreasing) functions is also increasing (decreasing), $\Delta^{n-1}(x + X_1, y, \tau_{m-2})$ possesses the monotone properties stated in (b). Note that $(1 - \beta)\beta^{x-1} E(\beta^{X_1})$ is also decreasing in $x$ and increasing in $y$, and therefore we conclude from (6) that $\Delta^n(x, y, \tau_{m-2})$ is decreasing in $x$ and increasing in $y$ for $\tau_1 \ge x$ and $x < y$. Using (5) and (6), $\Delta^n(x, y, \tau_{m-2})$ is given by
$$\Delta^n(x, y, \tau_{m-2}) = \begin{cases} 0 & x = y, \\ \Delta^{n-1}(x, y, \tau'_{m-2}) & \tau_1 + 1 \le x < y, \\ (1 - \beta)\beta^{x-1} E(\beta^{X_1}) + \Delta^{n-1}(x + X_1, y, \tau_{m-2}) & x \le \tau_1,\ x < y. \end{cases} \qquad (7)$$
We have shown that $\Delta^n(x, y, \tau_{m-2})$ is decreasing in $x$ and increasing in $y$ within each of its subdomains. It remains to be proven that the $\Delta^n$ function is monotone when $x$ and $y$ pass through the "connecting" points of the subdomains. First, note that by (7) and our inductive hypothesis,
$$\Delta^n(x, y, \tau_{m-2}) = \Delta^{n-1}(x, y, \tau'_{m-2}) \ge 0, \quad \text{for } \tau_1 + 1 \le x < y.$$
Hence $\Delta^n(x, y, \tau_{m-2})$ possesses all the properties stated in (b) for $\tau_1 + 1 \le x < y$. Now, to prove that $\Delta^n(x, y, \tau_{m-2}) \ge 0$ and is decreasing in $x$ for all $x \le y$, it is sufficient to demonstrate that
$$\Delta^n(\tau_1, y, \tau_{m-2}) - \Delta^n(\tau_1 + 1, y, \tau_{m-2}) \ge 0. \qquad (8)$$
And to prove that $\Delta^n(x, y, \tau_{m-2})$ is increasing in $y$ for any fixed $x \le \tau_1$, one needs to prove that
$$\Delta^n(x, x + 1, \tau_{m-2}) \ge 0. \qquad (9)$$
Consider (8) first. Suppose $\tau_1 = \tau_2 = \cdots = \tau_{k-1} < \tau_k \le \cdots \le \tau_{m-2}$. If $k > n$, job 1 in the SEPT list $L_s$ will be assigned at time $\tau_1 - 1$ under the machine availability times $(\tau_1 - 1, y, \tau_{m-2})$ and it will be assigned at time $\tau_1$ under $(\tau_1, y - 1, \tau_{m-2})$; all the other $n - 1$ jobs are processed on the machines available at time $\tau_1$ under both these time vectors. Then it is clear that
$$\Delta^n(\tau_1, y, \tau_{m-2}) = (1 - \beta)\beta^{\tau_1 - 1} E(\beta^{X_1}) \ge 0. \qquad (10)$$
On the other hand, according to SEPT, all the $n$ jobs should be processed on the machines available at time $\tau_1$, under the two vectors $(\tau_1, y, \tau_{m-2})$ and $(\tau_1 + 1, y - 1, \tau_{m-2})$. Therefore
$$\Delta^n(\tau_1 + 1, y, \tau_{m-2}) = 0. \qquad (11)$$
Hence from (10) and (11),
$$\Delta^n(\tau_1, y, \tau_{m-2}) - \Delta^n(\tau_1 + 1, y, \tau_{m-2}) = (1 - \beta)\beta^{\tau_1 - 1} E(\beta^{X_1}) \ge 0.$$
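Now consider the case $k \le n$. Let $L^*$ be a list similar to $L_s$ except that now the first job and the $k$th job are omitted, $k \ge 2$, and let $\Delta^*$ denote the corresponding difference for the list $L^*$. Following similar arguments as above, we have
$$\Delta^n(\tau_1, y, \tau_{m-2}) - \Delta^n(\tau_1 + 1, y, \tau_{m-2}) = (1 - \beta)\beta^{\tau_1 - 1} E(\beta^{X_1}) - (1 - \beta)\beta^{\tau_1} E(\beta^{X_k}) + \Delta^*(\tau_1 + X_1, y, \tau_{m-2}) - \Delta^*(\tau_1 + 1 + X_k, y, \tau_{m-2}). \qquad (12)$$
Since $\beta^t$ is decreasing in $t$ and $X_1 \le_{st} X_k$, $E(\beta^{X_1}) \ge E(\beta^{X_k})$. Note also, from the inductive hypothesis, that $\Delta^*(\tau_1 + 1 + X_k, y, \tau_{m-2})$ is decreasing in its first argument for $n - 2$ jobs. After some simple manipulations, (12) reduces to
$$\Delta^n(\tau_1, y, \tau_{m-2}) - \Delta^n(\tau_1 + 1, y, \tau_{m-2}) \ge \Delta^*(\tau_1 + X_1, y, \tau_{m-2}) - \Delta^*(\tau_1 + X_k, y, \tau_{m-2}) = \Delta^*(\tau_1 + X_1, \tau_1 + X_k, \tau'_{m-2}),$$
where $\tau'_{m-2} = (y, \tau_2, \tau_3, \ldots, \tau_{m-2})$. From Lemma 2 we can show that the last expression is non-negative.
To prove (9), let $L_{n-2} = (3, 4, \ldots, n)$. Quite trivially, we can see that for $x \le \tau_1$,
$$\Delta^n(x, x + 1, \tau_{m-2}) = (1 - \beta)\beta^{x-1} E(\beta^{X_1}) + \Delta^{n-1}(x + X_1, x + 1, \tau_{m-2})$$
$$\ge (1 - \beta)\beta^{x-1} E(\beta^{X_1}) + \Delta^{n-1}(x + X_1, x, \tau_{m-2})$$
$$= (1 - \beta)\beta^{x-1} [E(\beta^{X_1}) - E(\beta^{X_2})] + \Delta^{n-2}(x + X_1, x + X_2, \tau_{m-2}) \ge 0.$$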
This completes the proof of Lemma 3. $\Box$
The three lemmas proven above can now be used to establish our main result.
Proof of theorem. The proof is by backward induction on time $t$. Suppose $\pi$ is an optimal strategy and $\pi^*$ is the non-preemptive SEPT strategy. Since $\pi$ is optimal and the reward is discounted over time, for any given $\varepsilon > 0$, there must exist an integer $T(\varepsilon)$, such that from time $T(\varepsilon)$ onwards, the following inequality holds:
$$R_{T(\varepsilon)} = E\left[\sum_{t=T(\varepsilon)}^{\infty} \beta^t \left( Z^{\pi}(t; \tau) - Z^{\pi^*}(t; \tau) \right)\right] < \varepsilon.$$
In fact, since at most $m$ jobs can be finished at any time,
$$R_{T(\varepsilon)} \le \sum_{t=T(\varepsilon)}^{\infty} m\beta^t = m\beta^{T(\varepsilon)}/(1 - \beta).$$
When $T(\varepsilon)$ is large enough, $R_{T(\varepsilon)} < \varepsilon$ holds, implying that the remaining expected reward from time $T(\varepsilon)$ onward is negligible, no matter which policy is applied. Hence, without loss of generality, we may assume that $\pi$ is non-preemptive SEPT after time $T(\varepsilon)$. Now suppose $\pi$ is SEPT from some integer time $t$ onward, where $t \le T(\varepsilon)$. We can then show that $\pi$ is SEPT at time $t - 1$. Suppose, to the contrary, that $\pi$ is not SEPT at time $t - 1$. Let the number of unfinished jobs and the number of available machines at $t - 1$ be $p$ and $q$, respectively, where $p > q$. (If $p \le q$, all policies will generate the same expected reward, provided that no idle period is allowed for any machine if the job list is non-empty.) By Lemma 1, the remaining processing times of unfinished jobs can always be stochastically ordered. Without loss of generality, let the SEPT list at $t - 1$ be $L_s = (1, 2, \ldots, p)$, that is, $X_1(t - 1) \le_{st} X_2(t - 1) \le_{st} \cdots \le_{st} X_p(t - 1)$, where $X_k(t - 1)$ is the remaining processing time of job $k$ at time $t - 1$. Since $\pi$ is not SEPT at $t - 1$, there exists a job, say job $j$, which is scheduled on one of the $q$ machines at time $t - 1$, instead of job $i$, where $i \le q < j$, and $X_i(t - 1) \le_{st} X_j(t - 1)$. We need to consider two cases for the proof.
Case 1: job $j$ is not assigned at time $t$ in $\pi$
Recall that from time $t$ onwards SEPT is optimal, so $\pi = \pi^*$ after time $t$, and therefore job $i$ must be assigned to a machine available at time $t$. That is, under policy $\pi$, job $j$ is processed at time $t - 1$ for one unit of time and job $i$ is processed at time $t$. The following is the interchange argument used in the proof of the single machine problem (see Ross [13]). Consider another policy $\tilde{\pi}$ which schedules job $i$ at time $t - 1$ for one unit of time and job $j$ at time $t$; $\tilde{\pi} = \pi$ otherwise. Since the remaining processing times of jobs $i$ and $j$ are altered in the same manner under both policies $\pi$ and $\tilde{\pi}$, and since both policies are the same otherwise, by (1), we have
$$E\left[\sum_{s=1}^{\infty} \beta^s Z^{\tilde{\pi}}(s; \tau) - \sum_{s=1}^{\infty} \beta^s Z^{\pi}(s; \tau)\right] = (\beta^t - \beta^{t+1})\left(P(X_i(t - 1) = 1) - P(X_j(t - 1) = 1)\right)$$
$$= (\beta^t - \beta^{t+1})\left(P(X_j(t - 1) > 1) - P(X_i(t - 1) > 1)\right) \ge 0,$$
where the last inequality follows from the fact that $X_i(t - 1) \le_{st} X_j(t - 1)$. Thus $\pi$ cannot be optimal. This completes the proof of Case 1.
Case 2: job $j$ is assigned at time $t$ in $\pi$, due to either some job completions or newly available machines
Since job $i$ must be assigned at time $t$ under policy $\pi$ (since it uses SEPT after time $t$), job $j$ and job $i$ will be scheduled at time $t - 1$ and $t$, respectively. Furthermore, the two jobs will be processed non-preemptively. Let $\tilde{\pi}$ be another policy which schedules job $i$ at time $t - 1$ and job $j$ at time $t$ non-preemptively; $\tilde{\pi} = \pi$ otherwise. (Note that by our hypothesis, $\tilde{\pi} = \pi = \pi^*$ from time $t$ onwards.) Further, denote the list obtained by omitting jobs $i$ and $j$ from $L_s$ by $L_{p-2}$. Let the availability time vector at $t - 1$ for the other $m - 2$ machines (which are not used by jobs $i$ and $j$) be $\theta_{m-2}$, $m \ge 2$. Using (1), the difference between the expected rewards under policies $\tilde{\pi}$ and $\pi$ can be expressed as
$$E[\beta^{X_i(t-1) + t - 1} + \beta^{X_j(t-1) + t}] + R(L_{p-2}; X_i(t - 1) + t - 1, X_j(t - 1) + t, \theta_{m-2})$$
$$- E[\beta^{X_j(t-1) + t - 1} + \beta^{X_i(t-1) + t}] - R(L_{p-2}; X_i(t - 1) + t, X_j(t - 1) + t - 1, \theta_{m-2})$$
$$= (1 - \beta)\beta^{t-1} E[\beta^{X_i(t-1)} - \beta^{X_j(t-1)}] + \Delta^{p-2}(X_i(t - 1) + t, X_j(t - 1) + t, \theta_{m-2}). \qquad (13)$$
It is trivial to see that $E[\beta^{X_i(t-1)}] \ge E[\beta^{X_j(t-1)}]$, and by Lemma 2, we have
$$\Delta^{p-2}(X_i(t - 1) + t, X_j(t - 1) + t, \theta_{m-2}) \ge 0.$$
Therefore, the left hand side of (13) is non-negative and $\pi$ cannot be optimal. Thus we have proven that for policy $\pi$ to be optimal it must be the SEPT strategy at times $t = 0, 1, \ldots$. Finally, recall that $F(t)$ has an increasing hazard rate, so if the SEPT strategy is employed, the stochastic order, and the corresponding SEPT list, of the unfinished jobs will never change at any time. Thus the non-preemptive SEPT list strategy gives the maximum expected total reward. $\Box$
4. GENERALIZATIONS AND CONCLUDING REMARKS
The non-preemptive SEPT strategy is particularly simple to implement. At any decision point (for example, when a machine becomes available), all that the dispatching system has to do is to sort the list of expected processing times of the jobs waiting to be processed and dispatch the job with the shortest expected processing time. Although the results are developed for scenarios where there are no new arrivals and the reward function is $\beta^t$, the result may be extended to scenarios with some restricted arrivals and any convex non-increasing reward function (these extensions are discussed below). This makes the applicability of the model even wider, especially in manufacturing and computer systems.
The results of the previous section can be generalized along the following lines.
(A) It should be clear from the proof of the previous section that in order to carry out the inductive procedure in the theorem, we only need to be sure that jobs which have been assigned previously will not be preempted by the new arrivals at time $t$ under the SEPT policy. This is the case if the new jobs have stochastically larger remaining processing times. Thus, the SEPT strategy is still optimal when new jobs are released later, under the restriction that no later arrivals have received more prior processing than any job currently under execution.
We remark here that non-preemptive SEPT is no longer optimal if the jobs released later have received an arbitrary amount of partial processing. To see this, consider the following counterexample. Let a system consist of two machines available at times 0 and 1, respectively, and four jobs with deterministic processing time of, say, $m$ time-units. Note that the deterministic processing time satisfies the IHR property. Assume that two jobs with remaining processing times 1 and 2 are initially available to be processed. The other two jobs, with remaining processing times $m$ and 1, will be released sequentially whenever a machine finishes a job. It can be easily computed that the expected discounted reward under non-preemptive SEPT is $\beta + \beta^3 + \beta^{m+1} + \beta^4$. Now consider the policy $\pi_1$, which first processes the job with remaining processing time 2 at time 0, and then follows the SEPT strategy. The expected discounted reward under $\pi_1$ is $\beta^2 + \beta^2 + \beta^{m+2} + \beta^3$. Simple calculations show that when $\beta$ is close to 1 and $m$ is sufficiently large, policy $\pi_1$ is better than SEPT.
(B) Instead of the discounted reward $\beta^t$, the reward obtained by a finished job could be generalized to any convex, decreasing function $r(t)$, provided that $\lim_{t \to \infty} r(t) = 0$. To prove this, we need to show that
(a) $E[r(x - 1 + X) - r(x + X)]$ is decreasing in $x$, and
(b) $E[r(x - 1 + X_i) - r(x + X_i)] \ge E[r(x - 1 + X_j) - r(x + X_j)]$ if $X_i \le_{st} X_j$.
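Conditions (a) and (b) both rest on the fact that, for a convex decreasing reward, the gain from finishing one time unit earlier shrinks as the completion time grows. The sketch below checks this numerically for a few reward functions chosen purely for illustration; the helper names `increments` and `is_nonincreasing` are hypothetical.

```python
def increments(r, horizon):
    """r(s - 1) - r(s) for s = 1..horizon: the gain from finishing one unit earlier."""
    return [r(s - 1) - r(s) for s in range(1, horizon + 1)]

def is_nonincreasing(seq, tol=1e-12):
    """True if each entry is no larger than its predecessor, up to rounding."""
    return all(b <= a + tol for a, b in zip(seq, seq[1:]))

# Illustrative reward functions (assumed for this example, not from the paper):
discounted = lambda t: 0.9 ** t        # beta^t with beta = 0.9, convex and decreasing
hyperbolic = lambda t: 1.0 / (1 + t)   # convex, decreasing, tends to 0
concave = lambda t: -float(t * t)      # decreasing but concave
```

The first two functions pass the check while the concave one fails it, which is consistent with convexity being the property that the argument relies on.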
To prove (a), conditioning on $X = c$ we have
$$E[r(x - 1 + X) - r(x + X)] = \sum_{c=1}^{\infty} [r(x - 1 + c) - r(x + c)] P(X = c).$$
Since $r(x)$ is decreasing and convex in $x$, $r(x - 1 + c) - r(x + c)$ is decreasing in $x$; thus the linear combination of decreasing functions, taken with non-negative weights, is also decreasing. Now (b) can be trivially proven once (a) is given.
While this paper focuses on the optimality of a simple scheduling rule, viz., the SEPT policy, it is worthwhile to investigate the conditions which yield optimality of other similar simple rules, especially given the complex nature of general stochastic scheduling problems.
Acknowledgements-The authors thank the two reviewers whose comments and suggestions improved the presentation of the results. This work was partially supported by NSF Grant ECS-8307232.
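The arithmetic behind the counterexample in item (A) of Section 4 can be reproduced with a few lines. The sketch below assumes one reading of that example: under non-preemptive SEPT the four jobs complete at times 1, 3, $m + 1$ and 4, and under the alternative policy at times 2, 2, $m + 2$ and 3. These completion times, and the function names, are assumptions made for illustration.

```python
def sept_reward(beta, m):
    """Discounted reward under non-preemptive SEPT (assumed completions 1, 3, m + 1, 4)."""
    return beta + beta ** 3 + beta ** (m + 1) + beta ** 4

def alternative_reward(beta, m):
    """Discounted reward when the job with remaining time 2 is started first
    (assumed completions 2, 2, m + 2, 3)."""
    return 2 * beta ** 2 + beta ** (m + 2) + beta ** 3

def alternative_beats_sept(beta, m):
    return alternative_reward(beta, m) > sept_reward(beta, m)
```

For instance, with $\beta = 0.9$ and $m = 20$ the alternative policy earns more than SEPT, whereas with $\beta = 0.5$ and $m = 3$ SEPT is still better, matching the remark that $\beta$ must be close to 1 and $m$ sufficiently large.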
REFERENCES
1. R. W. Conway, W. L. Maxwell and L. W. Miller, Theory of Scheduling. Addison-Wesley, Reading, MA (1967).
2. J. Bruno, Sequencing tasks with exponential service times on parallel machines. Tech. Rep., Computer Science Department, Pennsylvania State University (1976).
3. J. Bruno and P. Downey, Sequencing tasks with exponential service times on two machines. Tech. Rep., Dept. of Electrical Engr. and Computer Science, University of California, Santa Barbara (1977).
4. M. Pinedo and G. Weiss, Scheduling stochastic tasks on two parallel processors. Naval Res. Log. Q. 27, 528-536 (1980).
5. K. D. Glazebrook, Stochastic scheduling. Ph.D. Thesis, University of Cambridge (1976).
6. K. D. Glazebrook, Scheduling tasks with exponential service times on parallel processors. J. Appl. Prob. 16, 685-689 (1979).
7. G. Weiss and M. Pinedo, Scheduling tasks with exponential service times on non-identical processors to minimize various cost functions. J. Appl. Prob. 17, 187-202 (1980).
8. R. R. Weber, Optimal organization of multi-server systems. Ph.D. Thesis, University of Cambridge (1979).
9. R. R. Weber, Scheduling jobs with stochastic processing requirements on parallel machines to minimize makespan or flowtime. J. Appl. Prob. 19, 167-182 (1982).
10. G. Weiss, Multiserver stochastic scheduling. In Deterministic and Stochastic Scheduling (Edited by M. A. H. Dempster, J. K. Lenstra and A. H. G. Rinnooy Kan), pp. 157-180. Reidel, Dordrecht (1981).
11. R. R. Weber, P. Varaiya and J. Walrand, Scheduling jobs with stochastically ordered processing times on parallel machines to minimize expected flowtime. J. Appl. Prob. 23, 841-847 (1986).
12. S. Ross, Stochastic Processes. Wiley, New York (1983).
13. S. Ross, Introduction to Stochastic Dynamic Programming. Academic Press, New York (1983).
14. R. E. Barlow and F. Proschan, Statistical Theory of Reliability and Life Testing. Holt, Rinehart & Winston, New York (1975).