Algorithms for Flow-Shop Scheduling to Meet Deadlines


Copyright © IFAC Real Time Programming Georgia, USA, 1991

R. Bettati and J. W. S. Liu
Department of Computer Science, University of Illinois, Urbana, IL 61801, USA

Abstract. In a multiprocessor or distributed system, jobs may need to be executed on more than one processor. When all the jobs execute on different processors in turn in the same order, the problem of end-to-end scheduling on the processors is known as the flow-shop problem. This paper describes two optimal polynomial-time algorithms for scheduling jobs in flow shops to meet deadlines, for two special cases where the scheduling problem is tractable. For the general case, where an optimal polynomial-time algorithm is unlikely to be found, a heuristic algorithm is presented. The traditional flow-shop model is generalized to model jobs that are serviced more than once by some processors. We present an algorithm for scheduling jobs to meet deadlines in this more general type of flow shop. We also discuss the scheduling of periodic flow shops. Keywords. Real-time computer systems; parallel processing; flow shops; scheduling to meet deadlines; heuristic programming.

1. Introduction

A flow shop [1-6] models a multiprocessor or distributed system in which processors and devices (also modeled as processors) are functionally dedicated. Each task executes on the processors in turn, following the same order. For example, a real-time control system containing an input processor, a computation processor, and an output processor can be modeled as a flow shop if each task in the system executes first on the input processor, then on the computation processor, and finally on the output processor. Many hard real-time systems can be modeled as flow shops in which every task has a deadline, the point in time by which its execution must be completed. The primary objective of scheduling in such systems is to find schedules in which all tasks meet their deadlines whenever such schedules, called feasible schedules, exist. A scheduling algorithm is said to be optimal if it always finds a feasible schedule whenever feasible schedules exist.

The problem of scheduling tasks in flow shops to meet deadlines is NP-hard, except for a few special cases [1,2]. We describe here a heuristic algorithm for scheduling tasks with arbitrary processing times and discuss its performance. We also consider two variations of the traditional flow-shop model, called flow shop with recurrence and periodic flow shop. In a flow shop with recurrence, each task executes more than once on one or more processors. A system that does not have a dedicated processor for every function can often be modeled as a flow shop with recurrence. As an example, suppose that the three processors in the control system mentioned earlier are connected by a bus. We can model the bus as a processor and the system as a flow shop with recurrence. Each task executes first on the input processor, then on the bus, on the computation processor, on the bus again, and finally on the output processor. In a periodic flow shop, each task is a periodic sequence of requests for the same computation.

In the following, we begin by describing the flow-shop models in Section 2. Section 3 describes a heuristic algorithm for scheduling tasks with arbitrary processing times and presents simulation results on its performance. Section 4 describes an optimal algorithm for scheduling in flow shops with recurrence where tasks have identical processing times and ready times. Section 5 describes a way to schedule periodic flow shops. Section 6 is a summary.

2. Flow-Shop Models

In (traditional) flow shops, there are m different processors P_1, P_2, ..., P_m. We are given a task set T of n tasks T_1, T_2, ..., T_n. Each task T_i consists of m subtasks T_i1, T_i2, ..., T_im. These subtasks have to be executed in order: the subtask T_ij executes on processor P_j after the subtask T_i(j-1) completes on processor P_(j-1), for j = 2, 3, ..., m. Task T_i is ready for execution at or after its release time r_i and must be completed by its deadline d_i. Let τ_ij denote the processing time of T_ij, the time required for the subtask T_ij to complete its execution. We occasionally refer to the totality of release times, deadlines, and processing times as the task parameters. The task parameters are rational numbers unless it is stated otherwise.
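To fix notation, the following is a minimal sketch of this task model in Python; the type name FlowShopTask and its field names are our own illustrative choices, not terminology from the paper.

from dataclasses import dataclass
from typing import List

@dataclass
class FlowShopTask:
    """One task T_i of a traditional m-processor flow shop."""
    name: str
    release: float            # release time r_i
    deadline: float           # deadline d_i
    proc_times: List[float]   # tau_i1, ..., tau_im: processing time of subtask T_ij on P_j

# A small task set T for an m = 3 processor flow shop (made-up numbers, for illustration only).
tasks = [
    FlowShopTask("T1", release=0.0, deadline=13.0, proc_times=[2.0, 3.0, 4.0]),
    FlowShopTask("T2", release=1.0, deadline=16.0, proc_times=[3.0, 2.0, 3.0]),
]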

In the more general flow-shop-with-recurrence model, each task T_i in T has k subtasks, with k > m. The subtasks are executed in the order T_i1, T_i2, ..., T_ik. We characterize the order in which the subtasks execute on the processors by a sequence V = (v_1, v_2, ..., v_k) of integers, where each v_j is one of the integers in the set {1, 2, ..., m}; v_j being l means that the subtask T_ij executes on processor P_l. For example, suppose that each task in the given set T has 5 subtasks to be executed on 4 processors. The sequence V = (1, 4, 2, 3, 4) means that all tasks first execute on P_1, then on P_4, P_2, P_3, and P_4, in this order. We call this sequence the visit sequence of the tasks. If an integer l appears more than once in the visit sequence, the corresponding processor P_l is a reused processor. In this example P_4 is reused, and each task visits it twice.

The periodic flow-shop model is a generalization of both the traditional flow-shop model and the traditional periodic-job model [7-9]. The periodic job system J to be scheduled in a flow shop consists of n independent periodic jobs; each job consists of a periodic sequence of requests for the same computation. In our previous terms, each request is a task. The period p_i of a job J_i in J is the time interval between the ready times of two consecutive tasks in the job. The deadline of the task in each period is some time Δ after the ready time of the task. In an m-processor flow shop, each task consists of m subtasks that are to be executed on the m processors in turn, following the same order.

Algorithm H
Input: Task parameters r_i, d_i, and τ_ij of T.
Output: A feasible schedule of T, or "the algorithm fails".
Step 1: Determine the effective release times r_ij and effective deadlines d_ij of all subtasks.
Step 2: Determine the bottleneck processor P_b, on which there is a subtask T_lb with the longest processing time τ_lb. In other words, for some l, τ_lb ≥ τ_ij for all i and j. Let τ_max = τ_lb.
Step 3: Inflate all the subtasks in {T_ib} by making their processing times equal to τ_max.
Step 4: Schedule the inflated subtasks in {T_ib} on P_b using the EEDF algorithm. The resultant schedule is S_b.
Step 5: Let t_ib be the start time of T_ib in S_b on P_b. Propagate the schedule S_b onto the remaining processors as follows: for any task T_i, schedule T_i(b+1) to start at time t_ib + τ_max on processor P_(b+1), T_i(b+2) at time t_ib + 2τ_max on P_(b+2), and so on, until T_im is scheduled at time t_ib + (m-b)τ_max on P_m. T_i(b-1) is scheduled to start at time t_ib - τ_i(b-1) on processor P_(b-1), T_i(b-2) at time t_ib - τ_max - τ_i(b-2) on P_(b-2), and so on, until T_i1 is scheduled to start at time t_ib - (b-2)τ_max - τ_i1 on P_1.
Step 6: Compact the schedule by eliminating as much as possible the idle periods that were introduced in Step 3 and Step 5. Stop.

Fig. 1. Algorithm H for scheduling arbitrary tasks in flow shops.
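To make the steps concrete, here is a minimal Python sketch of Steps 1-5 of Algorithm H, using the FlowShopTask type from the sketch above. It is a simplification rather than the algorithm as specified: the inflated bottleneck subtasks are ordered by a plain nonpreemptive earliest-effective-deadline pass instead of the forbidden-region EEDF algorithm of [11], and the compaction of Step 6 is omitted. All function names are our own.

def effective_params(task):
    """Step 1: effective release times r_ij and effective deadlines d_ij of one task."""
    m = len(task.proc_times)
    rel = [task.release + sum(task.proc_times[:j]) for j in range(m)]
    dl = [task.deadline - sum(task.proc_times[j + 1:]) for j in range(m)]
    return rel, dl

def schedule_h(tasks):
    m = len(tasks[0].proc_times)
    eff = [effective_params(t) for t in tasks]              # (r_ij, d_ij) for every task
    # Step 2: the bottleneck processor P_b carries the longest subtask of all.
    b, tau_max = max(((j, max(t.proc_times[j] for t in tasks)) for j in range(m)),
                     key=lambda pair: pair[1])
    # Steps 3-4: inflate the subtasks on P_b to tau_max and sequence them
    # nonpreemptively in order of nondecreasing effective deadline.
    order = sorted(range(len(tasks)), key=lambda i: eff[i][1][b])
    start_b, clock = {}, 0.0
    for i in order:
        clock = max(clock, eff[i][0][b])      # wait for the effective release time
        start_b[i] = clock
        clock += tau_max                      # one inflated slot per subtask
    # Step 5: propagate the bottleneck schedule, one tau_max slot per processor.
    starts = {}
    for i, t in enumerate(tasks):
        starts[(i, b)] = start_b[i]
        for j in range(b + 1, m):                          # downstream of P_b
            starts[(i, j)] = start_b[i] + (j - b) * tau_max
        for j in range(b - 1, -1, -1):                     # upstream: finish at the slot boundary
            slot_end = start_b[i] - (b - 1 - j) * tau_max
            starts[(i, j)] = slot_end - t.proc_times[j]
    # Check the resulting (uncompacted) schedule against release times and deadlines.
    feasible = all(starts[(i, 0)] >= t.release and
                   starts[(i, m - 1)] + t.proc_times[m - 1] <= t.deadline
                   for i, t in enumerate(tasks))
    return starts, feasible

Because every subtask on P_b occupies a slot of exactly τ_max time units, shifting those slots by whole multiples of τ_max keeps the subtasks on the other processors from overlapping; this is what makes the permutation schedule of Step 5 valid even before compaction.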

3. Flow-Shop Scheduling

Unfortunately, with arbitrary task parameters, the problem of scheduling in a flow shop to meet deadlines is NP-hard [10]. For this reason, we focus on a heuristic algorithm designed for scheduling tasks with arbitrary task parameters. This algorithm, called Algorithm H, is described in Fig. 1. It makes use of the effective release times and deadlines of subtasks. The effective deadline d_ij = d_i - Σ_{k=j+1}^{m} τ_ik of the subtask T_ij is the point in time by which the execution of the subtask T_ij must be completed to allow the later subtasks, and the task T_i, to complete by the deadline d_i. Similarly, the effective release time r_ij = r_i + Σ_{k=1}^{j-1} τ_ik of a subtask T_ij is the earliest point in time at which the subtask can be scheduled.

The effective release times and deadlines of subtasks are computed in Step 1 of Algorithm H. In Step 2, a processor P_b, called the bottleneck processor, is identified; a subtask on this processor has the longest processing time τ_max among all the subtasks on all the processors. Step 3 inflates all the subtasks T_ib on P_b by making their processing times equal to τ_max. Each inflated subtask T_ib consists of a busy segment with processing time τ_ib and an idle segment with processing time τ_max - τ_ib. Now, all inflated subtasks on P_b have identical processing times. We then use the classical earliest-effective-deadline-first (EEDF) algorithm [11] in Step 4 to schedule the inflated subtasks T_ib nonpreemptively on P_b. The resultant schedule S_b is used as a starting point for the construction of the overall schedule S on all the processors. In Step 5, subtasks on the other processors are scheduled in the same order as the subtasks T_ib in S_b; hence the resultant schedule is a permutation schedule. In this step, every subtask is assigned τ_max units of time on its processor. Both Step 3 and Step 5 add idle times on all the processors and therefore generate schedules that can be improved. The compaction in Step 6 is a way to improve the performance of Algorithm H (that is, the chance for it to find a feasible schedule when such a schedule exists). In this step, idle times on all processors are reduced as much as possible.

The classical EEDF algorithm used in Step 4 is priority driven. It assigns priorities to the inflated subtasks T_ib according to their effective deadlines: the earlier the deadline, the higher the priority. In particular, we use the more complicated version designed by Garey et al. and described in [11]; it is optimal for nonpreemptive scheduling to meet deadlines of tasks with identical processing times, and with release times and deadlines that are arbitrary rational numbers. This algorithm first identifies forbidden regions in time where tasks are not allowed to start execution and postpones the release times of selected tasks to avoid the forbidden regions. In Fig. 1, by release times of T_ib we mean the postponed release times produced by this initial step of the EEDF algorithm from the parameters r_ib and d_ib as computed in Step 1.

The example given by Table 1 and Fig. 2 illustrates this algorithm. Since τ_43 is the longest among the processing times of all subtasks, τ_max = τ_43, and P_3 is the bottleneck processor. Fig. 2a shows the schedule produced in the first five steps of Algorithm H. After compaction in Step 6, the final schedule is shown in Fig. 2b. We note that two tasks, namely T_1 and T_5, miss their deadlines in the schedule of Fig. 2a. Moreover, T_1 and T_4 are forced to start before their release times. All tasks meet their release times and deadlines in the compacted schedule of Fig. 2b.

Algorithm H is relatively simple, with complexity O(n log n + nm). In the special case where the subtasks T_ij have identical processing times, that is, τ_ij = τ_j for all j = 1, 2, ..., m, Step 3 introduces no additional idle time. Furthermore, Step 4 is optimal. If the schedule S_b found by the classical EEDF algorithm is not feasible, then {T_ib} cannot be feasibly scheduled. Steps 5 and 6 always produce a feasible schedule S of all subtasks on all the processors if the schedule S_b on P_b is feasible. We therefore have the following theorem, whose proof follows directly from these arguments [12].

Theorem 1. For nonpreemptive flow-shop scheduling of tasks that have arbitrary release times and deadlines and whose subtasks on each processor P_j have the same processing times τ_j, Algorithm H is optimal.

Table 1. Task set with arbitrary processing times. [The table lists the release time r_i, deadline d_i, and processing times τ_i1, ..., τ_i4 of the five tasks T_1, ..., T_5; its entries are not fully recoverable from this copy.]

For the general case of arbitrary processing times τ_ij, however, Algorithm H is not optimal. This is caused by the suboptimal scheduling algorithm on a single processor in Step 4 and by the restriction to permutation schedules. Fig. 3 shows the results of a simulation experiment to investigate the probability for Algorithm H to succeed in producing a feasible schedule when feasible schedules exist. In this experiment, Algorithm H was used to schedule task sets which have randomly generated task parameters but are known to have feasible schedules. The fraction of the time Algorithm H succeeded in finding a feasible schedule is plotted as a function of m, the number of processors in the flow shop, for different standard deviations (stdv) in the processing times of the subtasks on each processor. We see that when the difference between processing times of subtasks T_ij on each processor P_j is small (for example, when the standard deviation is 0.05 or 0.15), Algorithm H performs well. Its performance decreases as the difference in processing times increases.

Fig. 2. Schedule generated by Algorithm H with the task set of Table 1: (a) before compaction; (b) after compaction. [Gantt charts over a time axis from -5 to 30; shaded segments indicate idle time due to inflation. The charts are not reproducible from this copy.]

Fig. 3. Performance of Algorithm H. [Success rate (%) with 95% confidence intervals, plotted against the number of processors (4 to 9) for stdv = 0.05, 0.15, and 0.30; the plotted success rates lie between 92% and 100%. The plot is not reproducible from this copy.]

4. Flow-Shop With Recurrence

The order in which tasks execute on processors in a flow shop with recurrence can also be represented by a visit graph, whose set of nodes {P_i} represents the processors in the system. There is a directed edge e_ij from P_i to P_j with label a if and only if, in the visit sequence V = (v_1, v_2, ..., v_a, v_(a+1), ..., v_k), v_a = i and v_(a+1) = j. A visit sequence can therefore be represented as a path with increasing edge labels in the visit graph. An example of a visit graph for the visit sequence V = (1, 2, 3, 4, 2, 3, 5) is shown in Fig. 4. We confine our attention here to a class of visit sequences that contain simple recurrence patterns, called loops: some subsequence of the visit sequence containing reused processors appears more than once. In the example shown in Fig. 4, the labeled path that represents the visit sequence contains a loop. The subsequence (2, 3) occurs twice and makes the sequence (4, 2, 3) following the first occurrence of (2, 3) into a loop. In particular, the visit sequence in Fig. 4 contains a simple loop, which contains no subloop. The length of a loop is the number of nodes in the visit graph that are on the cycle. The loop in Fig. 4 therefore has length 3.
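As a concrete illustration of the visit-sequence notation, the short sketch below lists the reused processors in a visit sequence and, for a sequence containing a single simple loop, returns the position of the loop's first processor and the loop length q. The function names are ours, and the detection is deliberately naive.

from collections import Counter

def reused_processors(visit_seq):
    """Processors that appear more than once in the visit sequence."""
    return {p for p, count in Counter(visit_seq).items() if count > 1}

def single_loop(visit_seq):
    """For a visit sequence containing a single simple loop, return (pos, q):
    pos = 0-based index of the first visit to the first reused processor,
    q   = loop length, i.e. the number of nodes on the cycle of the visit graph.
    Returns None if no processor is reused."""
    first_seen = {}
    for pos, p in enumerate(visit_seq):
        if p in first_seen:
            return first_seen[p], pos - first_seen[p]
        first_seen[p] = pos
    return None

V = (1, 2, 3, 4, 2, 3, 5)        # the visit sequence of Fig. 4
print(reused_processors(V))      # {2, 3}
print(single_loop(V))            # (1, 3): loop of length q = 3, matching the loop (4, 2, 3)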

The problem of scheduling to meet deadlines in a flow shop with recurrence, being a generalization of the flow-shop scheduling problem, is NP-hard when task parameters are arbitrary. However, in the special case where (1) all tasks have identical release times, arbitrary deadlines, and the same processing time τ on all processors and (2) the visit sequence contains a single simple loop, there is an optimal, polynomial-time algorithm for scheduling tasks to meet deadlines. This algorithm, called Algorithm R, is shown in Fig. 5.

Fig. 4. Visit graph for the visit sequence V = (1, 2, 3, 4, 2, 3, 5).

Algorithm R
Input: Task parameters r_i, d_i, τ, and the visit graph G. P_vl is the first processor in the single loop of length q in G.
Output: A feasible schedule S, or the conclusion that the tasks in {T_i} cannot be feasibly scheduled.
Step 1: Schedule the subtasks in {T_il} ∪ {T_i(l+q)} on the processor P_vl using the modified EEDF algorithm described below; the following actions are taken whenever P_vl becomes idle:
(i) If no subtask is ready, leave the processor idle.
(ii) If one or more subtasks are ready for execution, start to execute the one with the earliest effective deadline. When a subtask T_il (that is, the first visit of T_i to the processor) is scheduled to start its execution at time t_il, set the effective release time of its second visit T_i(l+q) to t_il + qτ.
Step 2: Let t_il and t_i(l+q) be the start times of T_il and T_i(l+q) in the partial schedule S_R produced in Step 1. Propagate the schedule to the rest of the processors according to the following rules:
(i) If j < l, schedule T_ij at time t_il - (l-j)τ.
(ii) If l < j ≤ l+q, schedule T_ij at time t_il + (j-l)τ.
(iii) If l+q < j ≤ k, schedule T_ij at time t_i(l+q) + (j-l-q)τ.

Fig. 5. Algorithm R to schedule flow shops with single simple loops.

Algorithm R is essentially a modified version of the classical EEDF algorithm. We observe that if a loop in the visit graph has length q, the second visit of every task to a reused processor cannot be scheduled before (q-1)τ time units after the termination of its first visit to the processor. The key strategy used in Algorithm R is based on this observation. Let T_il be the subtask at the first visit of T_i to the processor P_vl in the loop of length q, and T_i(l+q) be the subtask at the second visit of T_i to the processor. T_i(l+q) is dependent on T_il. While T_il is ready for execution after its release time, T_i(l+q) is ready after its release time and after the completion of T_il. Step 1 of Algorithm R differs from the classical EEDF algorithm in that the effective release times of the second visits are postponed whenever necessary as the first visits are scheduled. Therefore, the optimality of Algorithm R no longer follows in a straightforward way from the optimality of the EEDF algorithm. Algorithm R is optimal, however, as stated by the following theorem, the proof of which is given in [10].

Theorem 2. For nonpreemptive scheduling of tasks in a flow shop with recurrence, Algorithm R is optimal when the tasks have identical release times, arbitrary deadlines, and identical processing times, and the visit graph contains a single, simple loop.
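The following is a minimal Python sketch of Algorithm R under the stated assumptions: identical release times taken as 0, identical subtask processing time τ, and a single simple loop whose first processor is visited as subtask l and again as subtask l+q (1-based). Step 1 is rendered as an event-driven earliest-effective-deadline loop and Step 2 as the propagation rules of Fig. 5, except that the second visit keeps the start time assigned to it in Step 1. The function name and data layout are our own.

def algorithm_r(deadlines, tau, l, q, k):
    """deadlines[i] is d_i of task T_i; every task has k subtasks of length tau."""
    n = len(deadlines)
    eff_dl = lambda i, j: deadlines[i] - (k - j) * tau     # effective deadline of T_ij

    # Step 1: modified EEDF on the loop processor P_vl. A first visit T_il is ready
    # at its effective release time (l-1)*tau; the second visit T_i(l+q) becomes
    # ready q*tau after the first visit starts.
    jobs = [{"task": i, "visit": l, "release": (l - 1) * tau, "dl": eff_dl(i, l)}
            for i in range(n)]
    start, clock = {}, 0.0
    while jobs:
        ready = [job for job in jobs if job["release"] <= clock]
        if not ready:                                      # (i) leave the processor idle
            clock = min(job["release"] for job in jobs)
            continue
        job = min(ready, key=lambda j: j["dl"])            # (ii) earliest effective deadline
        jobs.remove(job)
        i, v = job["task"], job["visit"]
        start[(i, v)] = clock
        if v == l:                                         # postpone the second visit's release
            jobs.append({"task": i, "visit": l + q,
                         "release": clock + q * tau, "dl": eff_dl(i, l + q)})
        clock += tau

    # Step 2: propagate the schedule on P_vl to the remaining subtasks of every task.
    for i in range(n):
        t_l, t_lq = start[(i, l)], start[(i, l + q)]
        for j in range(1, k + 1):
            if j < l:
                start[(i, j)] = t_l - (l - j) * tau
            elif l < j < l + q:
                start[(i, j)] = t_l + (j - l) * tau
            elif j > l + q:
                start[(i, j)] = t_lq + (j - l - q) * tau

    feasible = all(start[(i, k)] + tau <= deadlines[i] for i in range(n))
    return start, feasible

For example, a call such as algorithm_r([12.0, 14.0], tau=1.0, l=2, q=3, k=7) would correspond to two tasks following the visit sequence of Fig. 4; the deadlines and τ here are made up for illustration.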

5. Periodic Flow Shops

To explain a simple method that can be used to schedule jobs in periodic flow shops, we note that each job J_i in a periodic flow-shop job set can be logically divided into m subjobs J_ij. The period of J_ij is p_i. The subtasks in all periods of J_ij are executed on processor P_j and have processing times τ_ij. The set of n periodic subjobs J_j = {J_ij} is scheduled on the processor P_j. The total utilization factor of all the subjobs in J_j is U_j = Σ_{i=1}^{n} τ_ij / p_i. We call the subtask in the kth period of subjob J_ij T_ij(k). For a given j, the subjobs J_ij of different jobs J_i are independent, since the jobs are independent. On the other hand, for a given i, the subjobs J_ij of J_i on the different processors are not independent, since T_ij(k) cannot begin until T_i(j-1)(k) is completed. Unfortunately, there are no known polynomial-time optimal algorithms that can be used to schedule dependent periodic jobs to meet deadlines, and there are no heuristic algorithms with known schedulability criteria. Consequently, it is not fruitful to view the subjobs of each job J_i on different processors as dependent subjobs. A more practical approach is to consider all subjobs to be scheduled on all processors as independent periodic subjobs and to schedule the subjobs on each processor independently from the subjobs on the other processors. We effectively take into account the dependencies between subjobs of each job in the manner described below.

For the sake of concreteness, let us for the moment give the periodic jobs in our model the following specific interpretation: the consecutive tasks in each job are dependent; that is, the subtask T_i1(k) cannot begin until the subtask T_im(k-1) is completed. Let γ_i denote the time at which the first task T_i1(1) becomes ready. γ_i is also called the phase of J_i; it is also the phase γ_i1 of the subjob J_i1 of J_i on the first processor. Hence the kth period of the subjob J_i1 begins at γ_i1 + (k-1)p_i. Without loss of generality, suppose that the set J_1 is scheduled on the first processor P_1 according to the well-known rate-monotone algorithm [7]. Suppose that the total utilization factor U_1 of all the subjobs on P_1 is such that we can be sure that every subtask T_i1(k) is completed by the time δ_1 p_i units after its ready time γ_i1 + (k-1)p_i, for some δ_1 < 1. Now, we let the phase of every subjob J_i2 of J_i on processor P_2 be γ_i2 = γ_i1 + δ_1 p_i. By postponing the ready time of every subtask T_i2(k) in every subjob J_i2 on processor P_2 until its predecessor subtask T_i1(k) is surely completed on processor P_1, we can ignore the precedence constraints between subjobs on the two processors. Any schedule produced by scheduling the subjobs on P_1 and P_2 independently in this manner is a schedule that satisfies the precedence constraints between the subjobs J_i1 and J_i2. Similarly, if the total utilization factor U_2 of all subjobs on P_2 is such that every task in J_i2 is guaranteed to complete by the time instant δ_2 p_i units after its ready time, for some δ_2 < 1 - δ_1, we postpone the phase γ_i3 of J_i3 by δ_2 p_i, and so on.

Suppose that the total utilization factors U_j for all j = 1, 2, ..., m are such that, when the subjobs on each of the m processors are scheduled independently from the subjobs on the other processors according to the rate-monotone algorithm, all subtasks in J_ij complete by δ_j p_i units of time after their respective ready times, for all i and j. Moreover, suppose that each δ_j is a positive fraction, and Σ_{j=1}^{m} δ_j ≤ 1. We can postpone the phase of each subjob J_ij on P_j by δ_j p_i units. The resultant schedule is a feasible schedule in which all deadlines are met. Given the parameters of J, we can compute the set {U_j} and use the existing schedulability bounds given in [8,9] to determine whether there is a set {δ_j} with δ_j > 0 and Σ_{j=1}^{m} δ_j ≤ 1. The job system J can be feasibly scheduled in the manner described above to meet all deadlines if such a set of δ_j exists. It is easy to see that, with a small modification of the required values of δ_j, this method of scheduling in periodic flow shops can also handle the case where the deadline Δ of each task in a job with period p_i is equal to or less than m p_i units from its ready time. Similarly, this method can be used when the subjobs are scheduled according to other algorithms, or even different algorithms on different processors, so long as schedulability criteria of the algorithms are known.
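A minimal sketch of this phase-postponement scheme follows. The per-processor schedulability test is left as a callback, because the method relies on bounds such as those in [8,9] to map a utilization U_j to a completion-time fraction δ_j; the callback and all names here are our own placeholders, not an API from any of the cited works.

def periodic_flow_shop_phases(periods, proc_times, delta_for_utilization):
    """Phase-postponement method of Section 5 (our own simplified rendering).
    periods[i]               : period p_i of job J_i
    proc_times[i][j]         : tau_ij, processing time of J_i's subtasks on P_j
    delta_for_utilization(u) : returns a delta such that, on a processor with total
                               utilization u scheduled by the rate-monotone algorithm,
                               every subtask completes within delta * p_i of its ready
                               time (e.g. derived from the bounds in [8,9]), or None
                               if no such delta can be guaranteed.
    Returns the phases gamma_ij of the subjobs J_ij (taking gamma_i = 0), or None
    if no set {delta_j} with sum(delta_j) <= 1 is found."""
    n, m = len(periods), len(proc_times[0])
    deltas = []
    for j in range(m):
        u_j = sum(proc_times[i][j] / periods[i] for i in range(n))   # U_j = sum_i tau_ij / p_i
        d_j = delta_for_utilization(u_j)
        if d_j is None:
            return None
        deltas.append(d_j)
    if sum(deltas) > 1.0:        # end-to-end deadline of one period requires the sum to be <= 1
        return None
    # Postpone the phase of J_ij by (delta_1 + ... + delta_(j-1)) * p_i so that the
    # predecessor subtask on P_(j-1) is surely completed before T_ij(k) is released.
    return [[sum(deltas[:j]) * periods[i] for j in range(m)] for i in range(n)]

Each processor P_j then schedules its subjobs J_ij rate-monotonically as independent periodic tasks with these phases, and the precedence constraints between consecutive subtasks of the same job are satisfied by construction.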

6. Summary

We considered here the problem of end-to-end scheduling of tasks with hard deadlines in flow shops. The objective of scheduling is to find a feasible schedule in which all tasks complete by their deadlines whenever feasible schedules exist. The flow-shop model can be used to characterize hard real-time applications on a system containing functionally dedicated processors and devices. In addition, it also models other types of workloads and systems, from data transmissions through a series of links and nodes, to executions of instructions in a pipeline processor, to sequencing of operations in a logic circuit. We have described: (1) a heuristic algorithm for scheduling tasks with arbitrary processing times in flow shops, (2) a polynomial-time optimal algorithm for scheduling tasks with identical processing times and release times in simple flow shops with recurrence, and (3) an effective method for preemptive scheduling of jobs in periodic flow shops.

Acknowledgement This work was partially supported by the Navy ONR Contract No. NVY N00014 89-J-1181.

References

[1] Garey, M. R. and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, W. H. Freeman and Company, New York, 1979.

[2] Lawler, E. L., J. K. Lenstra, A. H. G. Rinnooy Kan, and D. B. Shmoys, "Sequencing and Scheduling: Algorithms and Complexity," Centre for Mathematics and Computer Science, Amsterdam, 1989.

[3] Garey, M. R. and D. S. Johnson, "Scheduling Tasks with Nonuniform Deadlines on Two Processors," J. Assoc. Comput. Mach., vol. 23, pp. 461-467, 1976.

[4] Garey, M. R., D. S. Johnson, and R. Sethi, "The Complexity of Flowshop and Jobshop Scheduling," Math. Oper. Res., vol. 1, pp. 117-129, 1976.

[5] Woodside, C. M. and D. W. Craig, "Local Non-Preemptive Scheduling Policies for Hard Real-Time Distributed Systems," Proceedings of the Real-Time Systems Symposium, Dec. 1987.

[6] Sha, L., J. P. Lehoczky, and R. Rajkumar, "Solutions for Some Practical Problems in Prioritized Preemptive Scheduling," Proceedings of the Real-Time Systems Symposium, Dec. 1986.

[7] Liu, C. L. and J. W. Layland, "Scheduling Algorithms for Multiprogramming in a Hard Real-Time Environment," J. Assoc. Comput. Mach., vol. 20, pp. 46-61, 1973.

[8] Lehoczky, J. P. and L. Sha, "Performance of Real-Time Bus Scheduling Algorithms," ACM Performance Evaluation Review, vol. 14, 1986.

[9] Peng, D. T. and K. G. Shin, "A New Performance Measure for Scheduling Independent Real-Time Tasks," Technical Report, Department of Electrical Engineering and Computer Science, University of Michigan, 1989.

[10] Bettati, R. and J. W.-S. Liu, "Algorithms for End-to-End Scheduling to Meet Deadlines," Technical Report No. UIUCDCS-R-1594, Department of Computer Science, University of Illinois, 1990.

[11] Garey, M. R., D. S. Johnson, B. B. Simons, and R. E. Tarjan, "Scheduling Unit-Time Tasks with Arbitrary Release Times and Deadlines," SIAM J. Comput., vol. 10, no. 2, pp. 256-269, 1981.

[12] Bettati, R. and J. W.-S. Liu, "Algorithms for End-to-End Scheduling to Meet Deadlines," Proceedings of the Second IEEE Symposium on Parallel and Distributed Processing, Dec. 1990.
