Information and Software Technology 38 (1996) 569-580
An efficient process migration algorithm for homogeneous clusters Khaled H. Al-Saqabi, Kassem A. Saleh Kuwait University, Department of Electrical and Computer Engineering, P.O. Box 5969, Safat 13060. Kuwait Received 21 March 1995; accepted 4 December 1995
Abstract In this paper, we present an algorithm for migrating processes in a general purpose workstation environment. In such an environment, parallel applications are allowed to co-exist with other applications, using workstations when released by their owners, and off loading from workstations when they are reclaimed. The applications can dynamically create and terminate processes during their execution. Our objectives are minimizing the average Turn Around Time (TAT) of all scheduled applications, and maintaining fairness among the competing applications. A workstation-application model together with a migration algorithm and its proofs of correctness are presented to facilitate the implementation of our objectives on the resources in an environment consisting of homogeneous workstations. Keywords: Process migration; Distributed systems; Homogeneous clusters; Parallel processing
0950-5849/96/$15.00 © 1996 Elsevier Science B.V. All rights reserved. SSDI 0950-5849(95)01095-5
1. Introduction
High-powered workstations connected via fast communication networks are increasingly considered an alternative to dedicated high-performance parallel computers. These workstation networks are not only cheaper, but also provide a general purpose computing environment that is typically shared by both parallel and non-parallel applications. General purpose workstations have characteristics that must be considered when they are to be used for parallel processing. First, the collective resources of the network are often shared by a large number of users. Second, the concept of workstation ownership is present: individual workstations, while available across the network, are usually associated with a specific user, the “owner”. A workstation owner is often willing to allow other users to borrow the workstation when it is idle. The collection of idle workstations in a network can be combined and used as a huge computing resource. As a result, unobtrusive access to the idle cycles of workstations is the key to harnessing the full power of such an environment. Allocating workstations to applications and offloading from already allocated ones, together with the dynamic behavior of parallel applications during their execution, mandates an efficient migration algorithm. Such an algorithm must invoke migration between
workstations in a network at points when it seems to be necessary and worthwhile. This in turn motivates us to consider migration cost, and the returned benefits of migration. As far as the migration benefits, it allows parallel applications, or jobs, to run dynamically without declaring their execution plan ahead of time. That is, parallel applications can spawn and terminate processes dynamically. In addition, migration enables the implementation of unobtrusive access to workstations when they are released/reclaimed by their owners. Migration cost is an important factor in the proposed environment. Neglecting the costs and incentives of migration in distributed systems leads to unrealistic solutions. In general, the following steps are necessary in order to invoke, migrate, and restart process(s) successfully in a distributed environment: (i) detection of the migration stimulus, i.e. workstation reclaimed by its owner and therefore foreign processes must be “evicted”, or workstation released by its owner and therefore an idle workstation is “available”, (ii) suspension of the concerned processes, foreign processes in case of “eviction”, or some selected processes in the case “available”, (iii) capturing the state of the suspended processes. The state involves creating a checkpoint file from the process’s virtual memory, capturing process’s recorded state regarding open files, devices, cached blocks, etc., capturing process’s recorded state in the kernel regarding process’s identifiers, user’s identifier, present
working directory, signal masks and handlers, etc., (iv) flushing in-transit messages addressed to the migrating process, (v) transferring the process by transferring its state to the selected destination, and (vi) restarting the process by rebuilding its state in the new location, then scheduling it for execution. A detailed description of the migration mechanisms involved and their costs can be found in Refs. [1-3]. The rest of the paper is organized as follows. In Section 2, we review related work. In Section 3, we present a model that helps to illustrate and state our objectives. Section 4 presents the algorithms that satisfy the stated objectives. In Section 5, we use the algorithms of Section 4 to build the algorithms for the migration activations. Finally, we conclude in Section 6.
2. Related work
The existing allocation and scheduling policies are tailored to the scheduling of application tasks (or processes) in either shared memory multiprocessors [4,5] or distributed systems [6,7]. For each environment, there are two types of scheduling: static and dynamic. Static scheduling maps an application’s tasks onto processors based on the application and processor characteristics defined at application submission time, whereas dynamic scheduling does not require a priori knowledge of the system and the submitted application characteristics. The scheduler adapts to the current system characteristics by continually re-mapping the application tasks onto processors as needed. The goal of the mapping algorithm is to balance the workload among processors. For distributed systems, dynamic scheduling offers flexibility by not requiring foreknowledge of application characteristics; however, process migration overhead is incurred. Scheduling algorithms in distributed systems trade off two conflicting factors: (i) the accuracy of the knowledge needed for making load balancing decisions, where this knowledge indicates the load of the processor (or node) in terms of its queue length [8], and (ii) the amount of overhead incurred by the balancing process. Zhou’s algorithm [9] balances load by periodically requiring each processor to inform other processors of load changes. The scheduler is invoked whenever a new process is submitted. If the local load is below a threshold value Th1, the process is executed locally. Otherwise the least loaded node in the system is examined. If its load is less than the local load by at least a threshold Th2, then the process is scheduled on that processor. Otherwise, it is executed locally. Four heuristics, presented by Xu [10], reduce the overhead of this scheme at the expense of accuracy, by allowing either neighboring processors or all processors in the system to contribute to the setting and adjustment of the threshold values for a given node. Kremien [11] improves the scalability and stability of Zhou’s algorithm by subdividing the system into domains and only exchanging load information among processors in the same domain. Willebeek-LeMair [12] presented four scheduling policies based on the degree of accuracy and overhead incurred by the information exchange. The policies are: (i) sender/receiver initiated distribution, which performs balancing based on information from neighboring nodes, (ii) the hierarchical balancing method, which organizes the system into a hierarchy of subsystems within which balancing is performed, (iii) the Gradient model, which guides migration between overloaded and underloaded nodes through a proximity gradient that eventually transfers load from heavily to lightly loaded regions, and (iv) the Dimension Exchange method, which first requires a synchronization phase, after which balancing is performed iteratively. Performance comparisons between the different strategies are also given. Suen [13] proposed a balancing algorithm and an efficient communication protocol which improve the average response time and reduce the communication overhead. Each processor communicates its load directly to only $N^{1/2}$ of the $N$ processors in the system. Each processor has two sets of processors, its sending set and its receiving set, that it sends to and receives from, respectively. The sets are constructed such that load balancing information is propagated either directly or indirectly to all processors in the system. Casas [1,3] presents the mechanisms of the migration oriented parallel virtual machine (MPVM). The MPVM is a software system, under development at the Oregon Graduate Institute, based on PVM 3.3.4 [14]. The MPVM targets dynamic migration [15] for parallel applications in a general purpose workstation environment.
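The threshold rule attributed above to Zhou’s algorithm [9] can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the published implementation: loads are taken to be queue lengths, and the names `place_process`, `th1` and `th2` are hypothetical.

```python
def place_process(local_node, loads, th1, th2):
    """Pick the node that should run a newly submitted process.

    loads maps node name -> current load (assumed here to be queue length).
    th1 and th2 play the roles of the thresholds Th1 and Th2 in the text.
    """
    local_load = loads[local_node]
    # Lightly loaded locally: run the process where it was submitted.
    if local_load < th1:
        return local_node
    # Otherwise examine the least loaded node in the system.
    candidate = min(loads, key=loads.get)
    # Transfer only if the candidate is lighter by at least th2.
    if local_load - loads[candidate] >= th2:
        return candidate
    return local_node
```

The two thresholds separate the cheap local decision from the more expensive remote one, so a transfer is only attempted when the expected gain is at least `th2`.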
In this paper, we present a migration algorithm, which can be applied to homogeneous general purpose workstation clusters. In such an environment, workstations are available only when they are not being used by their owners. We dynamically invoke the migration algorithm in order to migrate tasks, or processes, that were evicted by the workstations owner. Similarly, newly available workstations cause the migration of the executing processes if it seems to be worthwhile. Our migration algorithm also supports dynamic process creation. That is, all parallel applications can dynamically introduce and terminate processes any time during their execution. The migration policy in a workstation environment can substantially reduce the overall cost/performance factor for each workstation in the cluster. In the current version of the algorithm, migration decisions are made on a single processor. Further work to distribute this function is underway, but is outside the scope of this paper.
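[Figure: the Bin board. Rows are workstations, columns are round-robin time slices; a dash marks an idle cell.]
        t0   t1   t2   t3
WS 3    J0   J4   -    -
WS 2    J0   J2   J3   -
WS 1    J0   J1   J3   J5
WS 0    J0   J1   J3   J5
Fig. 1. Allocating jobs on the Bin model.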
3. Model and objectives
In this section, we present a model for allocating processes of multiple jobs onto a set of available workstations. We will use this model to state our objectives and requirements. Interaction properties among the processes of a job are factors that influence the allocation strategy. Jobs that exhibit close cooperation among their processes have been shown to perform well if executed by gangs [16]. In gang allocation, all processes run together on the available processors. Multiple jobs are ganged by time sharing the processors. In our algorithm, we consider allocating jobs whose processes exhibit close interactions among themselves. We also choose round robin to implement time sharing among multiple jobs. In our model, the system performs migration decisions only if activated by one of the following events, i.e. migration activations: (i) an available workstation receives an eviction notice from its owner: the system must migrate all foreign processes, (ii) a non-available workstation is released by its owner: the system must decide which jobs can benefit from the availability of that workstation, (iii) a process or a new job is spawned: the system must decide where to run the process or the job, or (iv) a scheduled job or a process completes execution and issues an exit signal. Fig. 1 represents a model of allocating jobs onto a collection of available workstations. We call the model shown in Fig. 1 the Bin model. The vertical direction in the figure shows the available workstations. The horizontal direction shows the time slices assigned by the system to implement round robin on multiple jobs. The contents of each cell in the Bin board indicate a process. In the figure, the processes of Job 0 (J0) are allocated time slice t0 on the whole set of available workstations, whereas J1, J2 and J4 share the available workstations in time slice t1.
The number of processes of each job and the number of available workstations determine how many workstations are allocated to a job. Fig. 1 shows that J1 requires two workstations for its processes, while J3 achieves maximum concurrency by acquiring three workstations. Note that Workstations 2 and 3 are idle during time slice t3. Within the framework of the migration activations
stated in this section, our objectives are: (i) minimizing the average Turn Around Time (TAT) of all scheduled jobs, and (ii) maintaining fairness among competing jobs by granting, when available, each job a number of workstations with a cumulative processing power proportional to the number of its processes. We refer to these two objectives as timeliness and fairness, respectively. We require our solutions to be responsive and scalable. This means that we must keep migrations to a minimum. As the number of workstations grows, the cluster will see a greater number of activations arising from workstation dynamics. Unrestricted migrations might compromise the responsiveness and scalability of the system. In our algorithm, we restrict our solutions to those that satisfy our requirements.
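One way to picture the Bin model is as a small grid indexed by workstation and time slice. The sketch below is illustrative only: the grid contents are read off the description of Fig. 1 in this section (cells not mentioned there are assumed to be vacant), and the names `BIN_BOARD`, `idle_workstations` and `workstations_of` are hypothetical.

```python
# Rows: workstations WS 0..WS 3; list positions: time slices t0..t3.
# None marks a vacant cell (a gap). Layout assumed from the Fig. 1 description.
BIN_BOARD = {
    3: ["J0", "J4", None, None],
    2: ["J0", "J2", "J3", None],
    1: ["J0", "J1", "J3", "J5"],
    0: ["J0", "J1", "J3", "J5"],
}

def idle_workstations(board, time_slice):
    """Workstations with no process scheduled in the given time slice."""
    return sorted(ws for ws, row in board.items() if row[time_slice] is None)

def workstations_of(board, job):
    """Workstations on which the given job has a process in some time slice."""
    return sorted(ws for ws, row in board.items() if job in row)
```

With this layout, `idle_workstations(BIN_BOARD, 3)` reports Workstations 2 and 3, and `workstations_of(BIN_BOARD, "J3")` reports the three workstations acquired by J3.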
4. Objectives related algorithms
In this section, we present a family of algorithms that are used during the migration activations. These are the Assign and Select (A&S) and the Sort and Fit (S&F) algorithms. The A&S algorithm selects the necessary number of workstations such that the processing resources are shared fairly (the fairness objective), then assigns the processes of the job to the selected workstations. The S&F algorithm allocates the jobs to the idle workstations such that the Bin board is compacted (the timeliness objective). In Section 5, we present the algorithms for the migration activations: evict, available, spawn, and exit, which make use of the A&S and the S&F algorithms.
4.1. The A&S algorithm
In the A&S algorithm, we perform two things: determine the number of workstations needed to achieve the fairness objective, and calculate the number of processes that must be assigned to each of those workstations. For this, we introduce a design-oriented parameter, the objective depth figure $\delta_{obj}$. For any job, $\delta_{obj}$ indicates the number of processes of that job that must be assigned to each selected workstation. To illustrate, assume that we choose $\delta_{obj} = 5$ to be the number of processes assigned to any workstation. Let $X$ be the total number of processes submitted by a job. Then $\lceil X/5 \rceil$ indicates the number of workstations that must be allocated in order to assign five processes per workstation. For a job $J_l$ with $X_l$ processes, we define $\mathcal{R}(J_l) = \lceil X_l/\delta_{obj} \rceil$ to be the number of workstations required to implement the $\delta_{obj}$ that achieves the fairness objective. The $\delta_{obj}$ implements fairness among competing jobs by fixing the number of processes assigned to each workstation while varying the number of workstations $\mathcal{R}(J_l)$. However, there is a limitation. When, for a job, the number of workstations required to implement $\delta_{obj}$ exceeds the number of workstations currently
available, then the assignment yields values of the depth figure greater than $\delta_{obj}$. In that case, the application is forced to a depth figure greater than the designated $\delta_{obj}$. We refer to the new figure as the effective depth figure $\delta_{eff}$ of the job. Values of $\delta_{eff}$ greater than $\delta_{obj}$ result in fewer resources being allocated to the job compared with the case of equal values of $\delta_{eff}$ and $\delta_{obj}$. When a sufficient number of owners release their workstations, the difference between the two values is eliminated. Once the algorithm determines the proper number of workstations, it must then select the workstations. Here, we minimize the average TAT of all jobs by first checking suitable vacant cells (gaps) within the Bin board. If sufficient gaps within the Bin board can be found such that $\delta_{obj}$ is satisfied, then this is the best solution for the fairness and timeliness objectives. Otherwise, a new column comprising all the available workstations, as indicated by the Bin board, is granted to the job. The last part of the A&S algorithm assigns the processes to the selected workstations. The $\delta_{obj}$ is a figure programmed based on the expected size of the submitted jobs and the expected number of available workstations. If the submitted jobs are expected to be large, i.e. large $X$, and the number of available workstations is expected to be small, then $\delta_{obj}$ should be large. As the number of workstations increases, $\delta_{obj}$ can be decreased. The $\delta_{obj}$ should be minimal for large numbers of available workstations and many small jobs. By programming $\delta_{obj}$ we avoid reaching the situation $\delta_{eff} > \delta_{obj}$ mentioned earlier. The situation $\delta_{eff} < \delta_{obj}$ is also possible. When $\delta_{obj}$ is applied to determine the necessary number of workstations that achieves the fairness objective, gaps, i.e. available and free workstations in the same column, might be left unused. Those free workstations can, for the time being, be borrowed by any job residing in the same column to improve its effective depth figure $\delta_{eff}$. Improving the $\delta_{eff}$ will ultimately improve the timeliness objective. Therefore, a job is granted those free workstations until the next migration activation. At that point, resolving the activation according to our objectives may force the job to relinquish those borrowed workstations, and the job must then adjust its $\delta_{eff}$ to match $\delta_{obj}$. Now, we present the A&S algorithm.
The Assign and Select (A&S) algorithm
For the job $J_l$ with $X_l$ processes, $\mathcal{R}(J_l) = \lceil X_l/\delta_{obj} \rceil$. Let $\chi_i$ and $\chi_n$ indicate column $i$ and a new column of the Bin model, respectively. Let $\sigma_{\chi_i}$ and $\sigma_{\chi_n}$ indicate the gaps located in column $i$ and in a new column, respectively. Also, let $\rho(\sigma_{\chi_i})$ and $\rho(\sigma_{\chi_n})$ indicate the number of gaps (number of free workstations) in column $i$ and the new column, respectively. For simplicity, we will use $\sigma_i$ and $\sigma_n$ instead of $\sigma_{\chi_i}$ and $\sigma_{\chi_n}$ everywhere. Let $\Phi_i$ and $\Phi_n$ indicate the list of free workstations in column $i$ and a new column, respectively.
[Figure: the Bin board with columns 1 to 3 across the 11 available workstations.]
Fig. 2. The Bin model for Example 1.
The subscripts $i$, $j$, $k$, and $l$ are used as indices for the parameters. Finally, $J_{ik}$ indicates that job $i$ is located in column $k$ of the Bin model.
(1) If $\mathcal{R}(J_l) > \rho(\sigma_n)$ then open a new column. Define the set $\Phi = \Phi_n$.
(2) If $\mathcal{R}(J_l) \le \rho(\sigma_n)$ then search all $\chi_i$ for a $\sigma_i$ such that $\mathcal{R}(J_l) \le \rho(\sigma_i)$. If found then $\Phi = \Phi_i$, else $\Phi = \Phi_n$.
(3) Let $x_l = \lfloor X_l/\mathcal{R}(J_l) \rfloor$ be the number of processes assigned to the workstations in $\Phi$.
(4) While (($Bal = X_l - x_l$) > 0) { (i) Assign $x_l$ processes from $J_l$ to a workstation in $\Phi$. (ii) Get the next workstation in $\Phi$. (iii) $X_l = Bal$ }
(5) If ($Bal < 0$) then assign $X_l$ to any of the remaining members of $\Phi$, /* if $Bal = 0$ do nothing */.
End of algorithm
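The selection part of the A&S algorithm can be sketched as follows. This is a hedged illustration, not the authors’ implementation: all function names are hypothetical, gaps are represented simply by their sizes, and the helper `spread` is an interpretation that distributes processes as evenly as possible rather than a literal transcription of steps (3) to (5).

```python
def required_workstations(num_processes, delta_obj):
    """R(J) = ceil(X / delta_obj), computed with integer arithmetic."""
    return (num_processes + delta_obj - 1) // delta_obj

def select_workstations(required, column_gaps, total_available):
    """Return (column index, workstations granted); index None means a new column.

    column_gaps[i] is the number of free workstations in existing column i.
    total_available is the size of a fresh column.
    """
    if required <= total_available:
        # Step (2): reuse the first existing column whose gap is large enough.
        for i, gap in enumerate(column_gaps):
            if required <= gap:
                return i, required
        return None, required
    # Step (1): demand exceeds a whole column, so the job is capped by it.
    return None, total_available

def effective_depth(num_processes, granted):
    """Largest number of processes that ends up on one workstation."""
    return (num_processes + granted - 1) // granted

def spread(num_processes, granted):
    """Assumed even distribution of processes over the granted workstations."""
    base, extra = divmod(num_processes, granted)
    return [base + 1 if w < extra else base for w in range(granted)]
```

For instance, `required_workstations(50, 3)` gives 17, and when only 11 workstations are available `effective_depth(50, 11)` gives 5, which exceeds an objective depth of 3.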
Example 1
Let the objective depth figure $\delta_{obj} = 3$. Let the total number of available workstations $\rho(\sigma_n) = 11$. Also, let the jobs to be scheduled start with the following numbers of processes: $X_1 = 1$, $X_2 = 6$, $X_3 = 50$, $X_4 = 26$, and $X_5 = 7$. Then the A&S algorithm assigns a number of workstations to each job according to $\mathcal{R}(J_l) = \lceil X_l/\delta_{obj} \rceil$. The number of workstations required by $J_1$, $J_2$, $J_3$, $J_4$, and $J_5$ is 1, 2, 17, 9, and 3, respectively. But, because we are limited by the 11 available workstations, $J_3$ is allocated all the resources of a new column. In this case $J_3$ has an effective depth figure $\delta_{eff} = 5 > \delta_{obj} = 3$. The resulting Bin model is shown in Fig. 2. To further illustrate the relation between the A&S and the S&F algorithms and the migration activations, let the next event in Example 1 be the spawn of a new job ($J_6$) with $X_6 = 18$ processes. Then, the A&S algorithm requires six workstations for $J_6$ in order to satisfy the fairness objective. In the Bin model shown in Fig. 2, there is no gap of size 6 within the Bin board. Therefore, according to the A&S algorithm, a new column is allocated for the new job. But this violates the timeliness objective, because $J_1$ can be migrated to another workstation and execute in
column 3, and $J_6$ can be accommodated in the gap of size 5 of column 1 together with the space vacated by migrating $J_1$. This approach does not add a new column to the Bin, and hence satisfies the timeliness objective. The S&F algorithm is applied after the A&S algorithm and performs compaction on the Bin board in order to satisfy the timeliness objective. At the end, the stated objectives are satisfied. The migration activations (evict, available, new job spawn or process spawn, and exit) use the two algorithms in certain ways in order to produce an efficient migration algorithm.
4.2. The Sort and Fit (S&F) algorithm
The algorithm compacts the jobs scheduled on the Bin board. The Bin board is compacted when the scheduled jobs cannot be scheduled in fewer columns. Compacting the Bin model achieves the timeliness objective, i.e. minimizing the average TAT. We perform compaction by reallocating all the jobs that are currently allocated to partially filled columns such that the number of columns in the Bin model is reduced. The algorithm starts by extracting the scheduled jobs and the gaps from every partially filled column. It then creates three sorted lists based on the extracted information: (i) the list of gaps, sorted in ascending order according to the number of workstations in each gap, (ii) the list of Jgaps, in which each element contains a field indicating the job’s size, i.e. its demand for resources, and another field indicating the size of the gap located in the same column; the list is sorted in ascending order based on the first field of each element, and finally (iii) the list of all the co-located sets, or “cosets”. A coset is the set containing all the jobs residing in the same column. The cosets are also sorted from the smallest to the largest based on the total size of the jobs in a coset. We reallocate jobs by mapping the jobs in each coset, one coset at a time, to gaps in the list of gaps using the first fit policy.
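The three sorted lists and the first fit lookup described above can be sketched as follows. The data layout is an assumption made for illustration: each partially filled column is given as a pair of a job-size dictionary and a gap size, and the names `build_lists` and `first_fit` are hypothetical.

```python
def build_lists(columns):
    """Build the list of gaps, the list of Jgaps and the list of cosets.

    columns maps column id -> (jobs, gap), where jobs maps job name -> size
    and gap is the number of free workstations in that column.
    Only columns that contain a gap take part.
    """
    partial = {c: (jobs, gap) for c, (jobs, gap) in columns.items() if gap > 0}
    # (i) gaps, ascending by size.
    gaps = sorted((gap, c) for c, (jobs, gap) in partial.items())
    # (ii) Jgaps: (job size, gap in the job's column, job, column), by job size.
    jgaps = sorted(
        (size, gap, job, c)
        for c, (jobs, gap) in partial.items()
        for job, size in jobs.items()
    )
    # (iii) cosets, ascending by total size; jobs inside a coset descending.
    cosets = sorted(
        (sum(jobs.values()), c, sorted(jobs, key=jobs.get, reverse=True))
        for c, (jobs, gap) in partial.items()
    )
    return gaps, jgaps, cosets

def first_fit(size, gaps, own_column):
    """First gap in the sorted list that fits, skipping the job's own column."""
    for gap, c in gaps:
        if c != own_column and gap >= size:
            return c
    return None
```

Because the list of gaps is sorted in ascending order, the first gap that fits is also the smallest gap that fits, which is what keeps larger gaps free for larger jobs.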
Later, we will show that the first fit in sorted lists results in the best possible fit between jobs and gaps. The idea behind the algorithm starts by initially assuming that each coset consists of only one job. Hence, the list of cosets becomes the list of jobs. Each job in the list of jobs has a corresponding gap in the sorted list of gaps. A job and its corresponding gap are positioned at opposite ends each in its list. That is, a small job is associated with a large gap, when both are sorted in their lists, the small job occurs at the beginning of the job list, and the large gap occurs at the end of the gap list. Reallocating a job from the sorted list of jobs to the first fitting gap in the sorted list of gaps accomplishes two goals: (i) a corresponding gap at the extreme end of the sorted list gaps is removed with each job reallocation considered at the beginning of the list of jobs, resulting in the removal of a column with the largest gap, (ii) small jobs are checked before large jobs leading to minimum
process migrations. However, our Bin model has columns with multiple jobs per column. Therefore, we must adjust our initial assumption and consider the reallocation of cosets, rather than jobs, to gaps. Two possibilities exist when we try to reallocate the jobs of a given coset: (i) the search in the list of gaps results in a successful mapping for each job in the coset, or (ii) there is at least one job in the current coset that cannot be matched to any gap. When all the jobs in the current coset are reallocated into matching gaps, using the first fit policy, the coset and its corresponding column are removed from the Bin board. On the other hand, if a matching gap cannot be found for some job in the current coset (the second possibility), then we perform a transitive allocation check for that job on the list of Jgaps. In the transitive allocation, another job (“the replaced”) in the list of Jgaps is swapped with our job (“the replacing”) in the list of cosets. The replaced job and the gap in its column, i.e. the sum of the two fields of an element in the Jgaps list, must occupy a combined space that can accommodate the replacing job. To get the best solution, as will be shown later, we select the first fitting replaced job from the list of Jgaps. Moreover, the replaced job’s size must be strictly less than the size of the replacing job. This guarantees that the algorithm terminates. More importantly, if a smaller replacing job cannot be mapped successfully to a gap, then an equal or larger replaced job certainly cannot be mapped either. If the second possibility cannot be resolved, then the coset at hand cannot be reallocated, and all actions related to relocating the coset are invalidated. The next coset in the list of cosets is then examined for a possible relocation. The algorithm terminates when the sum of the remaining gaps in the Bin board becomes less than the size of the current coset.
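The transitive allocation check can be illustrated with a short sketch. It assumes, for illustration only, that the list of Jgaps is a list of tuples `(job_size, gap_size, job, column)` sorted in ascending order of job size; the name `find_replaced` is hypothetical.

```python
def find_replaced(replacing_size, replacing_column, jgaps):
    """Find the first job in Jgaps that the replacing job can swap with.

    A candidate must sit in a different column, be strictly smaller than the
    replacing job, and together with the gap in its column offer enough room.
    Returns (replaced job, its column, gap left in that column after the swap),
    or None when no candidate exists.
    """
    for job_size, gap_size, job, column in jgaps:
        if (column != replacing_column
                and job_size < replacing_size
                and job_size + gap_size >= replacing_size):
            return job, column, job_size + gap_size - replacing_size
    return None
```

The strict inequality on sizes is what prevents two jobs of equal size from being swapped back and forth, in line with the termination argument above.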
That is because the cosets are sorted and none of the subsequent cosets can reallocate all of its jobs. Now, we present the S&F algorithm.
The S&F algorithm
(1) From the Bin model, form the set $\chi$ containing all the columns $\chi_i$ which contain gaps.
(2) Sort the set $\chi$ in ascending order in terms of the number of gaps in each column. That is, $\chi_i$ precedes $\chi_j$ if $\rho(\sigma_i)$ is less than or equal to $\rho(\sigma_j)$. We refer to the resulting sorted list as the list of gaps.
(3) Define the set of jobs residing in a column as the co-located set. For each column $\chi_k$ in $\chi$, form the co-located set $\kappa_k$, and let $K$ represent all the different co-located sets of $\chi$.
(4) Sort $K$ in ascending order. That is, $\kappa_i$ precedes $\kappa_j$ if $\sum_{\forall J_l \in \kappa_i} \mathcal{R}(J_l) \le \sum_{\forall J_l \in \kappa_j} \mathcal{R}(J_l)$. We refer to the resulting sorted list as the list of cosets.
(5) Sort in ascending order all jobs $J_l$ present in the set $\chi$ in terms of their $\mathcal{R}(J_l)$. That is, $J_i$ precedes $J_j$ in the sorted list if $\mathcal{R}(J_i) \le \mathcal{R}(J_j)$. Each element in the sorted
list has two components. The first component indicates the job’s $\mathcal{R}(J_i)$. The second component indicates the size of the gap in the column where the job is located, i.e. the second component for $J_{ik}$ is $\rho(\sigma_k)$. We refer to the resulting sorted list as the list of Jgaps.
(6) For each $\kappa_k$ in $K$, sort the elements of $\kappa_k$ in descending order. That is, $J_{jk}$ follows $J_{ik}$ if $\mathcal{R}(J_{jk}) \le \mathcal{R}(J_{ik})$.
(7) For every $\kappa_k$ in the list of cosets, and while $(\sum_{\forall i} \rho(\sigma_i)) - \rho(\sigma_k) \ge \sum_{\forall J_l \in \kappa_k} \mathcal{R}(J_l)$ {
  For every $J_{ik} \in \kappa_k$ and while $\kappa_k$ exists {
    If ($\mathcal{R}(J_{ik}) \le \rho(\sigma_j)$ for any $j$ and $j \ne k$) then {
      (i) Allocate $J_{ik}$ to the first fitting gap $\sigma_j$ from the list of gaps where $\rho(\sigma_j) \ge \mathcal{R}(J_{ik})$ and $j \ne k$.
      (ii) Remove $J_{ik}$ from the set $\kappa_k$.
      (iii) Sort and adjust $\sigma_j$ in the list of gaps, and adjust the references to $\sigma_j$, accordingly, in the list of Jgaps.
      (iv) If ($\kappa_k$ = NULL), then { (a) remove the column $\chi_k$ from the Bin model, (b) remove all $J_{ik}$ co-located jobs from the list of Jgaps, and $\sigma_k$ from the list of gaps, (c) remove $\kappa_k$ from $K$, (d) get the next co-located set from $K$ } else { get the next job in $\kappa_k$ }
    } else { /* Here we start the transitive allocation procedure */
      (i) Search for the first element $J_{pq}$ in the list of Jgaps such that $\mathcal{R}(J_{pq}) + \rho(\sigma_q) \ge \mathcal{R}(J_{ik})$, i.e. the sum of the first and second components of an element in the list of Jgaps is greater than or equal to $\mathcal{R}(J_{ik})$, provided that $\mathcal{R}(J_{pq}) < \mathcal{R}(J_{ik})$ and $q \ne k$.
      (ii) If ($\mathcal{R}(J_{pq}) + \rho(\sigma_q) \ge \mathcal{R}(J_{ik})$ and $\mathcal{R}(J_{pq}) < \mathcal{R}(J_{ik})$ and $q \ne k$) then { (a) deallocate the space acquired by $J_{pq}$ from the Bin model, (b) insert $J_{pq}$ in $\kappa_k$, now $J_{pq}$ becomes $J_{pk}$, (c) adjust $\rho(\sigma_q)$ to $\mathcal{R}(J_{pq}) + \rho(\sigma_q) - \mathcal{R}(J_{ik})$ in the list of gaps, then sort; update the references to the newly created gap in the list of Jgaps, (d) allocate $J_{ik}$ to the newly acquired workstations from $J_{pq}$ and $\sigma_q$ in column $\chi_q$. Note that $\kappa_k$ cannot be equal to NULL at this point }
      else { /* cannot remove $\chi_k$ */ (a) invalidate all the actions performed for $\kappa_k$; get the next coset in $K$ }
    } /* end of second else */
  } /* end of second for */
} /* end of first for */
(8) Now the remaining elements in the list of cosets correspond to all the columns that cannot be removed from the Bin board. The removed $\kappa_k$’s in Step (7)-(iv)-(c) correspond to the columns that are removed from the Bin board. Perform the migrations necessary to implement the reallocations of the S&F algorithm.
End of algorithm
To prove the correctness of the S&F algorithm, we proceed in two steps. First we prove in Theorem 1 the correctness of a restricted version of the S&F algorithm. In the restricted version, we assume that only a single job can reside in any column. In the second step, we remove the restriction and prove the correctness of the algorithm in Theorem 2. Altogether, Theorems 2, 3, and 4 prove that the S&F algorithm satisfies our objectives. Afterwards, we present an example.
Lemma 1: Given a list of gaps $G$ and a list of objects (or jobs) $J$, both sorted in ascending order based on their sizes, the first fit policy of an object from $J$ to a gap in $G$ results in the best allocation of the objects onto the available gaps.
Proof: Let us assume that we are given two sorted lists: a list of gaps $G$ and a list of objects $J$. For a job in $J$, we show that selecting a fitting gap past the first fitting gap in $G$ cannot produce a greater number of job allocations than that produced by selecting the first fitting gap. We show this by comparing two solutions for the allocation: the next fit and the first fit. In the next fit solution, we assume that deferring the selection of a gap beyond the first fit allocates more jobs into gaps, thus causing greater job allocation. For example, if $J_i$, $J_{i+1}$, $J_{i+2}$ are three jobs and $J_i$’s first fit is gap $\sigma_l$, then, in the next fit solution, we allocate $J_i$ to another gap $\sigma_{l+k}$. Since $\sigma_{l+k} > \sigma_l$ in $G$, we choose the best case for the next fit solution and assume enough space in gap $\sigma_{l+k}$ to accommodate the next two jobs $J_{i+1}$ and $J_{i+2}$. In the first fit, we allocate a job from $J$ to the first fitting gap in $G$. In our example, we allocate $J_i$ into $\sigma_l$. To produce a worst case scenario for the first fit solution, we assume that there is no room left in this gap for the next jobs $J_{i+1}$, $J_{i+2}$. Hence, $J_{i+1}$ goes into $\sigma_{l+m}$, where $m < k$. The last job $J_{i+2}$ can be fitted into $\sigma_{l+k}$. In the next fit solution, the three jobs used one gap, $\sigma_{l+k}$. In the first fit, the three jobs used three gaps, $\sigma_l$, $\sigma_{l+m}$, and $\sigma_{l+k}$, which may suggest that next fit is better than first fit. This is not true because, in the next fit, all the unused gaps from gap $\sigma_l$ until $\sigma_{l+k-1}$ cannot be utilized by any subsequent job $J_{i+z}$ in $J$ where $z \ge 3$. The reason is that $J_{i+2}$’s first fit, as indicated, is $\sigma_{l+k}$. Since $J$ is sorted, $J_{i+3}$ and all the subsequent jobs in $J$ cannot fit any gap $\sigma_j < \sigma_{l+k}$. Therefore, the best case of the next fit consumes at least $1 + k$ gaps, gaps $\sigma_l$ until $\sigma_{l+k}$, while the worst case first fit also consumes $1 + k$ gaps. In general terms, the first fit solution produces better results than the next fit solution. Thus, the first fit policy results in the best allocation of objects onto gaps in sorted lists.
Theorem 1: Given the list J scheduled in the Bin model, where the $J_i \in J$ are the jobs that partially fill a column. If we assume that only a single job is currently allocated in any column, then the following algorithm, when called, results in a reallocation that minimizes the average turn around time of all scheduled jobs. Here is the algorithm:
(1) Sort the scheduled $J_i$'s in ascending order according to their required processing resources $\mathcal{R}(J_i)$; the resulting sorted list is J.
(2) Sort the gaps of the Bin model in ascending order according to their processing capacity $\rho(\sigma)$; the resulting sorted list is G.
(3) For every $J_i \in J$, and while $i < j$ do {
(i) Allocate $J_i$ to the first gap $\sigma_j$ in G such that $\mathcal{R}(J_i) \le \rho(\sigma_j)$.
(ii) Remove $\sigma_j$ from G. }
End of algorithm.
Proof: In our proof of the theorem, we show that: (i) the algorithm produces maximum column removal from the Bin board, (ii) it is not necessary to check all the jobs in the list of jobs once a job's first fitting gap comes from its own column, and (iii) maximum column removal from the Bin model yields the least possible average TAT for all scheduled jobs.
(i) For a job $J_i$ from the list of jobs, the allocation of $J_i$ to a $\sigma_j$ from the list of gaps where $\rho(\sigma_j) < \mathcal{R}(J_i)$ is not possible. This is also true for all $J_k$ in the list of jobs where $k \ge i$. Thus the first possible allocation for $J_i$ in a $\sigma_j$ occurs when $\rho(\sigma_j) \ge \mathcal{R}(J_i)$. As a matter of fact, $J_i$ can be allocated to any larger gap $\sigma_p$ where $\sigma_p > \sigma_j$ in the list, but this is not the best solution, as presented in Lemma 1. A larger gap $\sigma_p$ can be allocated to another $J_k$ for $k > i$; however, an appropriate first fit $\sigma_j$ for $J_i$ might not accommodate a $J_k$, since $\mathcal{R}(J_k) > \mathcal{R}(J_i)$. This approach leaves $\sigma_j$ unallocated because it was not matched with the first fitting $J_i$. We conclude that the best solution is the first fitting gap in which $\rho(\sigma_j) \ge \mathcal{R}(J_i)$. Moreover, the algorithm operates on sorted lists of jobs and gaps. Small jobs at one end of the sorted list are allocated in the same column with jobs located at the other end of the same list. That is because the first fit policy starts with the smallest possible fitting gap, which comes from a column containing a large job located at the other end of the sorted list. As the list of jobs is scanned and jobs are relocated to gaps, relocated jobs release their columns to move in with other jobs. Since the first fit policy maximizes the allocation of jobs onto gaps, a maximal number of jobs is reallocated, which results in maximal column removal.
(ii) The algorithm terminates when a given job $J_i$ scans the list of gaps and its first fit is its own column $X_i$, i.e. the scan of the list of gaps reaches $\sigma_i$.
When such an event occurs, all subsequent jobs $J_j$ in the list of jobs do not need reallocation checks, because all will result in the same allocation. To prove this fact, we show that there cannot exist a $\sigma_k$ that can fit the $J_j$ where $\rho(\sigma_i) \le \rho(\sigma_k) < \rho(\sigma_j)$. Let us assume that there exists a $\sigma_k$ between
$\sigma_i$ and $\sigma_j$. Since we are scanning a sorted list of gaps, the gaps are ordered as $(\sigma_i, \sigma_k, \sigma_j)$ in the list. The gap $\sigma_k$ must come from a column with a job $J_k$ where $J_k < J_i$. Since $J_k$, $J_i$, and $J_j$ are ordered as $J_k < J_i < J_j$ in the list of jobs, $J_k$ must have already been considered for reallocation before $J_i$ and $J_j$. When $J_k$ is allocated, its gap $\sigma_k$ is removed from the list of gaps, according to the algorithm. As a result, $\sigma_k$ should not exist in the list. Therefore, checking the allocations of all $J_j > J_i$ is not necessary.
(iii) Since the Bin model implements round robin, all job completion times depend on two parameters: the number of columns in the Bin model, and the number of workstations allocated to the jobs. For the latter, all the available workstations are utilized in line with the fairness objective. The average completion time of all jobs therefore depends on the number of columns of the Bin model. Maximal column removal results in the minimal average completion time and hence the minimum possible average TAT of all scheduled jobs.
Theorem 2: Applying the S&F algorithm (the general algorithm) on the Bin board achieves the minimum average TAT of all scheduled jobs.
Proof: The S&F algorithm is an upgraded version of the algorithm in Theorem 1, where the former starts with multiple jobs scheduled in any column. For the two algorithms to produce the same effects on the Bin, the following points must be resolved: (i) the relocation of a job according to the S&F algorithm must proceed towards eliminating columns, (ii) the mapping of sorted cosets onto sorted gaps in the S&F algorithm must ultimately resemble the mapping of sorted jobs onto sorted gaps in the algorithm of Theorem 1, and (iii) the first fit policy must remain intact in the S&F algorithm.
(i) In the S&F algorithm, reallocating a job is not valid until all the other jobs in the same co-located set are also reallocated.
In Step (7)-(iv), a co-located set $K_k$ is removed from the Bin board only when it becomes NULL. The set $K_k$ becomes NULL when all the jobs in $K_k$, together with any other job kicked off by the reallocation procedure, are reallocated. Otherwise, the reallocations and all the related actions performed on $K_k$ are invalidated. Therefore, the reallocation procedure in the S&F algorithm starts with the
contents with respect to each other. For example, if $K_j$ precedes $K_k$ in the sorted list of cosets in the S&F algorithm, then it is not necessarily true that $(\mathcal{R}(J_i)\ \forall i \in K_j) < (\mathcal{R}(J_l)\ \forall l \in K_k)$. Therefore, the allocation of sorted cosets onto sorted gaps can violate the first fit law, unless we allow the transitive allocation. In a transitive allocation, a job is allowed to replace both a scheduled job and the gap in its column if a fitting gap for our job cannot be found. The replaced job is inserted back into the list of jobs of the current coset, which must eventually be reallocated before the next coset is considered. Previously allocated jobs from previous cosets are rescheduled for allocation by future cosets if this is necessary for the removal of the coset at hand. The transitive allocation technique leads to the implementation of the first fit law on all cosets in the list of cosets. Therefore, allocating sorted cosets onto sorted gaps in the S&F algorithm resembles the procedure of allocating jobs onto gaps in the algorithm of Theorem 1.
(iii) Because of the transitive allocation procedure of the S&F algorithm, Step (7)-2nd else-(ii), the first fit law remains intact. Overall, the S&F algorithm produces the same effects as the algorithm of Theorem 1 on the Bin board. We conclude that the S&F algorithm achieves the minimum average TAT of all scheduled jobs.
Lemma 2: The process of repeatedly swapping the allocation between a job in a coset and a scheduled job in the Bin board, Step (7)-2nd else-(ii), in the S&F algorithm is bounded.
Proof omitted in this paper.
Lemma 3: The S&F algorithm guarantees that no job is allocated to a gap corresponding to a column that will subsequently be removed.
Proof omitted in this paper.
Lemma 4: The S&F algorithm guarantees that freed gaps generated during the execution of the algorithm contribute to the minimization of the average TAT.
Proof: Freed gaps generated while reallocating the jobs in a coset cannot be used by any job in the same coset, as indicated by Lemma 3. When all the jobs of a coset are reallocated, i.e. the coset becomes NULL, the column and its coset are removed from the Bin board. Therefore, gaps vacated while executing the S&F algorithm contribute to the minimization of the average TAT.
Lemma 5: Similar to the algorithm of Theorem 1, the S&F algorithm must check the allocation prospects of all the cosets in the list of cosets until the sum of the remaining gaps in the Bin board falls below the size of the next coset in the list of cosets.
Proof: Since the list of cosets is sorted, all we have to show is that if the size of the currently selected coset exceeds the total size of all the remaining gaps in the Bin board, then all subsequent cosets in the list cannot be reallocated elsewhere in the Bin board. We need to
Fig. 3. The Bin board used in Example 2 (axis label: 11 workstations).
show that for a coset $K_k$, if
(1) $\left(\sum_{\forall i} \rho(\sigma_i)\right) - \rho(\sigma_k) < \sum_{\forall J_l \in K_k} \mathcal{R}(J_l)$ for any $k$, then
(2) $\left(\sum_{\forall i} \rho(\sigma_i)\right) - \rho(\sigma_j) < \sum_{\forall J_l \in K_j} \mathcal{R}(J_l)$ for any remaining $j$ and $j \ne k$.
Since $\sum_{\forall J_l \in K_k} \mathcal{R}(J_l) = \rho(\sigma_n) - \rho(\sigma_k)$, (1) becomes $\sum_{\forall i} \rho(\sigma_i) < \rho(\sigma_n)$. Subtracting $\rho(\sigma_j)$ from both sides, and replacing $\rho(\sigma_n) - \rho(\sigma_j)$ on the right side with $\sum_{\forall J_l \in K_j} \mathcal{R}(J_l)$, results in Equation (2).
Theorem 3: The S&F algorithm results in the minimum number of migrations.
Proof: We sort the lists of cosets and gaps in ascending order. Smaller gaps belong to columns which are holding the largest cosets. Because both cosets and gaps are sorted in ascending order, we start by considering the migration of the smallest cosets, to be merged with the largest cosets. That is, we migrate small jobs residing in columns with large gaps to small gaps in columns with large cosets. This transfers jobs from lightly populated columns towards the heavily populated ones. Also, as shown in Lemma 5, the coset reallocation procedure terminates at the first instance when the cumulative size of all the gaps cannot support further reallocations. Therefore, our algorithm achieves our objectives and the number of migrations is minimal.
Theorem 4: The S&F algorithm satisfies our objectives and requirements.
Proof: We have shown that the algorithm of Theorem 1 yields minimum average TAT. The S&F algorithm is the generalized version of the algorithm of Theorem 1. The S&F algorithm allows the scheduling of multiple jobs in a single column of the Bin model. In Theorem 2, we show that the S&F algorithm maintains the results of the algorithm of Theorem 1, i.e. minimum average TAT of all scheduled jobs. Lemma 2 shows that the S&F algorithm is bounded. Lemmas 3, 4 and 5 show that the S&F algorithm is correct. Finally, in Theorem 3, we show that the S&F algorithm achieves our objectives with the minimum number of migrations possible. Overall, the S&F algorithm satisfies our objectives and requirements.
Example 2
This example illustrates the S&F algorithm.
The algorithm is presented with a Bin board for a
possible compaction. Let us assume that the presented Bin is as shown in Fig. 3. In the figure, the shaded areas are gaps. The other regions are space allocated to jobs. The numbers within a region indicate $\mathcal{R}(J_i)$ for jobs, and $\rho(\sigma)$ for gaps. From the figure, the total number of available workstations in a column is $\rho(\sigma_n) = 11$. We extract three sorted lists from every partially filled column of the Bin (all the columns, in our example): the list of gaps, the list of Jgaps, and the list of cosets. We use the notation $(a)_{X_i}$ in the list of gaps to indicate a gap of size $\rho(\sigma_i) = a$ located in the column $X_i$ of the Bin. The notation $(b, c)_{X_i}$, used in the Jgaps list, indicates that the element is a job located in $X_i$ with workstation requirements $\mathcal{R}(J_i) = b$ and an associated gap of size $\rho(\sigma_i) = c$. The notation $(d)_{X_i}$, used in the list of cosets, indicates that the sum of the job sizes in the coset of $X_i$ is equal to $d$. The lists when sorted are as follows:
(i) List of gaps: $(1)_{X_1}$, $(1)_{X_2}$, $(4)_{X_0}$, $(5)_{X_3}$.
(ii) List of Jgaps: $(1, 1)_{X_1}$, (1, 1), $(2, 4)_{X_0}$, (2, 1), (2, 1), (3, 1), (3, 1), (4, 1), (4, 1), $(5, 4)_{X_0}$, $(6, 5)_{X_3}$. Every other element with an associated gap of size 1 is located in $X_1$ or $X_2$.
(iii) List of cosets: $(6)_{X_3}$, $(7)_{X_0}$, $(10)_{X_2}$, $(10)_{X_1}$.
The steps to be followed in order to reallocate the cosets onto gaps are:
(1) Try to reallocate the first coset $K_0$ onto the first fitting gap by taking the individual jobs making up the coset and trying to reallocate them. There is only one job, of size = 6, in $K_0$. Scanning for the first fitting gap in the list of gaps yields none. Thus $K_0$ cannot be reallocated, and its job stays where it currently is, in column number 3.
(2) The jobs of the second coset $K_1$ (sizes 2 + 5 = 7) are checked for reallocation. Note that "the second coset $K_1$" means the second element in the list of cosets; it does not mean the coset of the second column. The largest job in the coset is checked first; this is the job of size = 5. The first fitting gap is located in column 3. Thus this job can tentatively be migrated from column 0 to column 3. This migration decision is final only if all the other jobs in the same coset can be migrated. Next comes the job of size = 2. Checking the list of gaps results in no fit. Note that $(4)_{X_0}$ does not count as a fitting gap, because it is located in the same column as the job of size = 2.
(3) Now we apply the transitive allocation procedure. We check the Jgaps list for a possible candidate job such that its allocated size plus the size of the associated gap can fit the job at hand, of size 2. The procedure is to find the first element in the Jgaps list in which: (i) the sum of the two fields is at least equal to the size of the job at hand, and (ii) the first field of the candidate element in the Jgaps list is strictly less than the size of the job at hand. Checking the Jgaps list yields the first element in
Fig. 4. The results of the Bin of Example 2: (a) the migrations involved, (b) the final Bin.
the list. Thus, the job of size 2 in column 0 migrates to the gap and the vacated space of the candidate job in column 1. The candidate (vacated) job of size = 1 is placed in the current coset $K_1$ for a possible reallocation.
(4) Now we must find a fitting gap for the remaining job in $K_1$ which resulted from step 3. Searching the list of gaps results in a reallocation of the remaining job to the gap in column 2.
(5) Since $K_1$ is now empty, the corresponding column $X_0$, as read from the list of cosets, is removed. At this point, all migrations from $X_0$ are made final.
(6) Finally, since the total size of the remaining gaps is equal to zero, which is less than the size of the next coset $K_2$, the S&F algorithm terminates. The Bin is now in compacted form. Fig. 4 shows the migrations involved and the resulting Bin.
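The three lists of Example 2 and the termination test of step (6) can be rebuilt mechanically from a description of the Bin. The following Python sketch is illustrative only: the names `bin_board`, `build_lists` and `compaction_can_continue` are hypothetical, and the split of the eight small jobs between columns X1 and X2 is an assumption, since the text only fixes their coset sums of 10. The order among elements of equal size is arbitrary.

```python
N_WORKSTATIONS = 11  # rho(sigma_n) in Example 2

# Column contents as job sizes. X0 and X3 follow the text of Example 2;
# the contents of X1 and X2 are an ASSUMPTION (only their sums are given).
bin_board = {
    "X0": [2, 5],
    "X1": [1, 2, 3, 4],
    "X2": [1, 2, 3, 4],
    "X3": [6],
}


def gap(column_jobs, n=N_WORKSTATIONS):
    """Size of the gap left in a column of n workstations."""
    return n - sum(column_jobs)


def build_lists(board, n=N_WORKSTATIONS):
    """Return the sorted lists of gaps, Jgaps and cosets.

    Only partially filled columns contribute, as in the example.
    """
    partial = {col: jobs for col, jobs in board.items() if 0 < sum(jobs) < n}
    gaps = sorted((gap(jobs, n), col) for col, jobs in partial.items())
    jgaps = sorted(
        ((size, gap(jobs, n), col) for col, jobs in partial.items() for size in jobs),
        key=lambda element: element[0],  # sort by job size only
    )
    cosets = sorted((sum(jobs), col) for col, jobs in partial.items())
    return gaps, jgaps, cosets


def compaction_can_continue(gaps, coset_size, coset_col):
    """Loop condition of step (7): the gaps outside the coset's own
    column must be large enough, in total, to hold the whole coset."""
    return sum(size for size, col in gaps if col != coset_col) >= coset_size


gaps, jgaps, cosets = build_lists(bin_board)
print(gaps)
print(cosets)
```

Applying `build_lists` to the compacted Bin of Fig. 4 yields an empty list of gaps, so `compaction_can_continue` is false for the next coset, which corresponds to the termination described in step (6).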
5. The migration activations
In this section, we present algorithms for the migration activations: evict, available, spawn, and exit. Collectively, these algorithms form the migration algorithm. The activation algorithms are responses either to user actions, such as workstation releases or claims, or to job actions, such as the spawn of a new job, the spawn of new processes by executing jobs, process exits, or job exits. In all the activation algorithms, the stated objectives and requirements are achieved through the proper application of the A&S and the S&F algorithms. The migration decisions are made by a single processor. This processor contains the Bin model, which reflects the job allocations in the workstation cluster. While there are no activations, there are no migrations in the cluster. Consequently, the Bin should conform to our objectives until the next activation. The event that alters the Bin can be any of the four activations. In such an event, the type of the activation together with any relevant information, i.e. the number of processes spawned or exiting, or the ids of the affected workstations,
are sent to the central processor holding the Bin (load information distribution policy). In response, the central processor applies the correct activation algorithm and issues the appropriate migrations (placement policy). When the migrations are completed, the state of the workstation cluster will conform to the stated objectives.
5.1. A workstation eviction
When a workstation is evicted, all foreign processes are migrated to the remaining available workstations. Each affected job must then reschedule its evicted processes such that the migration objectives and requirements are met. If the evicted processes belong to a job $J_i$ which is to remain running on all the available resources, i.e. $\mathcal{R}(J_i) \ge \rho(\sigma_n)$, then the only solution is to distribute the evicted processes on the allocated workstations. A subset of the A&S algorithm is used to calculate the increments per workstation. If the evicted processes belong to a job which was granted free resources after satisfying the objectives, i.e. a job with $\delta_{eff} - \delta_{obj} \ge 1$, then the evicted processes are migrated and distributed on the allocated workstations, according to the A&S algorithm. For the remaining jobs, the S&F algorithm can be used to reallocate them on the available gaps in the Bin model. Note that satisfying our second objective mandates the relocation of all the remaining jobs in the list of cosets, either to gaps within the board, or to newly added columns as necessary. This is because the S&F algorithm is used here not for compaction purposes, but rather for servicing the eviction efficiently. The list of gaps used by the S&F algorithm consists of all the currently available gaps together with those gaps which were granted beyond the requirements of $\delta_{obj}$. After the allocation is completed, jobs are allowed to expand again if the expansion of a job results in $\delta_{eff} - \delta_{obj} \ge 1$. Next, the eviction algorithm. Let the total number of available workstations, after deducting the eviction, be indicated by $\rho(\sigma_n)$.
(1) For each $J_i$ where $\mathcal{R}(J_i) \ge \rho(\sigma_n)$ { Migrate the processes scheduled on the evicted workstation to the other workstations allocated to that job. Adjust $\delta_{eff}$ of the job to reflect the new state. The number of processes allocated to each workstation is calculated according to Steps (3)-(5) of the A&S algorithm. }
(2) For each $J_i$ where $\mathcal{R}(J_i) < \rho(\sigma_n)$ and $\delta_{eff} - \delta_{obj} \ge 1$ { Migrate the processes scheduled on the evicted workstation to the workstations allocated to the job. Now $\delta_{eff} = \delta_{obj}$. }
(3) Form the list of the remaining jobs. Those are the scheduled jobs with $\mathcal{R}(J_i) < \rho(\sigma_n)$ and $\delta_{eff} - \delta_{obj} < 1$.
(4) Form the list of available gaps in the Bin model. The list includes the extra gaps allocated to jobs in which $\delta_{eff} < \delta_{obj}$.
(5) Call the S&F algorithm, and pass the above two lists.
(6) If not all the cosets are reallocated, then open new columns in the Bin model as needed and assign the remaining cosets.
(7) Expand the jobs that can benefit from the remaining available gaps. That is, expand $\delta_{eff}$ for the jobs which reside in columns with left over gaps if the expansion results in $\delta_{eff} - \delta_{obj} \ge 1$.
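Steps (1)-(3) of the eviction algorithm amount to partitioning the scheduled jobs into three groups before the S&F algorithm is called. The Python sketch below illustrates only that partitioning. The `Job` record and the function name are hypothetical, and `delta_eff` and `delta_obj` are treated as opaque numbers compared exactly as in the step conditions; no assumption is made here about how the A&S algorithm computes them.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    required: int    # R(J_i), the workstation requirement of the job
    delta_eff: int   # effective value currently granted to the job
    delta_obj: int   # value dictated by the objectives


def classify_for_eviction(jobs, available):
    """Split jobs according to steps (1)-(3) of the eviction algorithm.

    `available` is rho(sigma_n), the number of workstations left after
    the eviction. Returns (full_width, expanded, remaining).
    """
    full_width, expanded, remaining = [], [], []
    for job in jobs:
        if job.required >= available:
            # Step (1): the job keeps running on all available workstations.
            full_width.append(job)
        elif job.delta_eff - job.delta_obj >= 1:
            # Step (2): the job holds extra resources and can absorb the loss.
            expanded.append(job)
        else:
            # Step (3): the job is handed to the S&F algorithm as a coset member.
            remaining.append(job)
    return full_width, expanded, remaining
```

Only the third group, together with the list of gaps formed in step (4), is passed on to the S&F algorithm in step (5).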
5.2. A workstation available
When a workstation becomes available, a new row is added to the Bin board. The new Bin may no longer satisfy our objectives. Jobs with $\mathcal{R}(J_i) \ge \rho(\sigma_n)$ must be allowed to expand and use the newly available resource. The remaining jobs, with the remaining columns of the Bin model, are passed to the S&F algorithm for a possible compaction. When the workstation available algorithm is completed, the Bin board satisfies our objectives. Here is the algorithm:
(1) For each job $J_i$ where $\mathcal{R}(J_i) \ge \rho(\sigma_n)$ { Allocate the available workstation to the job, migrate the job's processes from the allocated workstations to the new one, then adjust $\delta_{eff}$ accordingly. For the job, the number of processes allocated to each workstation is calculated according to Steps (1), (3)-(5) of the A&S algorithm. }
(2) Form the list of the remaining jobs. Those are the scheduled jobs with $\mathcal{R}(J_i) < \rho(\sigma_n)$.
(3) Form the list of available gaps in the Bin model. Note that extra gaps allocated to jobs with $\delta_{eff} < \delta_{obj}$ must be inserted in the above list.
(4) Call the S&F algorithm, and pass the above two lists.
(5) Expand the jobs that can benefit from the remaining available gaps, i.e. expand $\delta_{eff}$ for the jobs which reside in columns with left over gaps and for which $\delta_{eff} - \delta_{obj} \ge 1$.
5.3. A job or a process spawn
There are two cases to be handled by this activation: the spawn of a new job, and the spawn of a new process by a scheduled job. If applying the second objective to the newly spawned job yields the allocation of all the available workstations, then a new column in the Bin model is granted to the job. On the other hand, if the outcome of the A&S algorithm specifies fewer workstations than are currently available, then the Bin board is searched for a matching gap. Allocating the gaps within the Bin board according to the outcome of the A&S algorithm results in minimum average TAT, the first objective, together with achieving the second objective.
Searching for a gap may take one of two directions. If a matching gap is found, the job is assigned to the first fitting gap from the candidate gaps. Alternatively, if
there is no gap that matches the requirements of the job as dictated by the second objective, then the second direction is sought. In the second direction, the S&F algorithm is applied to the Bin board for possible compaction. The new job is inserted as a coset in the list of cosets which will be used by the S&F algorithm. If the outcome of the S&F algorithm does not increase the number of columns in the Bin board, then the allocation of the S&F algorithm is chosen and implemented. Otherwise, the new job is allocated in a new column. In the second case, when a scheduled job spawns a new process, we check the requirements of the job. If the job is currently allocated all the available workstations, or the job was expanded beyond its requirements ($\delta_{eff} < \delta_{obj}$), then the new process is sent to the least loaded workstation allocated to the job. Otherwise, the process and its job are treated as a new job, which results in the new job spawn activation being called for that job. Here is the spawn algorithm.
If (new $J_l$) then {
If ($\mathcal{R}(J_l) \ge \rho(\sigma_n)$) then { allocate a new column, and call Steps (1), (3)-(5) of the A&S algorithm }
else {
(1) For all the columns, get a $\sigma_i$ where $\mathcal{R}(J_l) \le \rho(\sigma_i)$. Gaps allocated to jobs in which $\delta_{eff} < \delta_{obj}$ are included in the list of gaps.
(2) Assign $J_l$ to the first fitting gap, perform the migration, and expand $\delta_{eff}$ of that job if possible.
(3) If $\mathcal{R}(J_l) > \rho(\sigma_i)$ for every gap in the Bin model then {
(i) Invoke the S&F algorithm on the Bin model, and check the resulting number of columns. The newly spawned job is included in the list of cosets as a coset.
(ii) If the number of columns in the Bin board does not increase then { reallocate the jobs (including the new job) to the workstations according to the S&F algorithm. Call Steps (2)-(5) of the A&S algorithm to perform the assignment of processes on the allocated workstations. Expand $\delta_{eff}$ for each job when possible. }
else { open a new column for the new job, assign the processes of that job to the new column by calling Steps (1), (3)-(5) of the A&S algorithm, and relax $\delta_{eff}$ when possible. }}} /* end if new job */
else { /* The spawned process belongs to a scheduled job */
If the process belongs to a job with $\mathcal{R}(J_i) \ge \rho(\sigma_n)$ then { start the process in the least loaded workstation allocated to that job. }
else if (starting the new process does not cause the job to become $\delta_{eff} > \delta_{obj}$) then { start the process in the least loaded workstation allocated to that job. }
else { treat the spawning job as a new job, call the new spawn activation for that job. }} /* end else the spawned process */
5.4. A job or a process exit
Our policy in handling a process exit is to wait until all other processes of the same job exit. If the resulting vacated gap is a complete column, the column is removed; if the vacated gap can fit other jobs located in other columns, those jobs are migrated to the vacated gap. Otherwise, we face two choices. The first is to invoke the S&F algorithm to check compaction prospects. The other is to wait until the next evict, spawn or available activation, in which the S&F algorithm is invoked automatically. We designate a variable LIMIT, which can be used to force the invocation of the S&F algorithm when several successive exit activations occur. The variable LIMIT can be set according to the trade off between enforcing the first objective and inducing process migrations. For example, when the first objective is more important than curbing migrations, we set LIMIT = 1. The exit algorithm follows.
If (the job exit frees a column) then remove the column from the Bin model.
else { Let the vacated space of the exiting job be $\sigma_v$.
If ($\rho(\sigma_v) \ge \mathcal{R}(J_l)$ for any $J_l$ and $J_l$ is the only job in its co-located set) then { migrate $J_l$ to the vacated gap, and release the column allocated to $J_l$. }
else { if (the number of successive exits exceeds the LIMIT) then call the S&F algorithm on the Bin model else do nothing to the vacated gap. }}
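The exit algorithm above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the Bin is modelled as a dictionary from column name to a list of job sizes, `compact` is a hypothetical callback standing in for the S&F algorithm, and the counter of successive exits is assumed to be reset once S&F has been invoked. When several single-job columns fit the vacated space, the sketch simply takes the first one found.

```python
def handle_job_exit(board, column, job_index, state, limit, compact):
    """Apply the exit policy after the job at board[column][job_index] exits.

    board   -- dict: column name -> list of job sizes (modified in place)
    state   -- dict holding the counter "successive_exits"
    limit   -- the LIMIT variable of the text
    compact -- hypothetical stand-in for the S&F algorithm, called as compact(board)
    """
    vacated = board[column].pop(job_index)  # rho(sigma_v)

    # Case 1: the exit frees a whole column.
    if not board[column]:
        del board[column]
        return "column removed"

    # Case 2: a job that is alone in another column fits the vacated space.
    for other, jobs in list(board.items()):
        if other != column and len(jobs) == 1 and jobs[0] <= vacated:
            board[column].append(jobs[0])
            del board[other]  # release the column of the migrated job
            return "job migrated from " + other

    # Case 3: count the exit and force compaction once LIMIT is exceeded.
    state["successive_exits"] += 1
    if state["successive_exits"] > limit:
        state["successive_exits"] = 0  # assumption: reset after invoking S&F
        compact(board)
        return "S&F invoked"
    return "gap left as is"
```

A larger `limit` postpones compaction and therefore favours fewer migrations, while a small value favours the first objective, as discussed above.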
6. Conclusion We presented a migration algorithm that conforms to a set of stated objectives. The presented algorithm enables us to implement migration in an environment of general purpose homogeneous workstations. In such an environment, workstations are added and removed from the pool of available resources to the general users based on the release and reclaim of the workstation by its owner, respectively. Moreover, scheduled jobs are allowed to change their behaviour dynamically during execution. Implementing migration means trading off migration cost with some desired objectives. Our objectives consist of: (i) minimizing the average TAT of all scheduled
jobs (timeliness), and (ii) maintaining fairness among the jobs by allocating to them, when possible, a number of resources proportional to their demands (fairness). We introduced the Bin model, which helps us visualize the implementation of our objectives. We also presented the "Assign and Select" (A&S) and the "Sort and Fit" (S&F) algorithms, which facilitate the implementation of our objectives on the Bin. The A&S algorithm: (i) selects the number of workstations required to satisfy the fairness objective, and (ii) assigns the processes of a job to a set of available workstations. The size of the set of workstations is determined by step (i), and the location of the set on the Bin is selected to promote the timeliness objective. The S&F algorithm compacts the columns of the Bin board to ensure that the timeliness objective is met at the level of all scheduled jobs. We have shown that the S&F algorithm operates on the Bin and produces a Bin with the least number of columns. We have also shown that the S&F algorithm is bounded, and results in the minimum number of migrations possible. Finally, we integrated the A&S and the S&F algorithms in the algorithms of each of the four types of migration activations. Collectively, the migration activation algorithms form the migration algorithm. Our migration algorithm translates all possible actions emanating from the environment into responses that map to one of the activations. When the appropriate activation algorithm is completed, the Bin and the environment it represents, i.e. the cluster, conform to the stated objectives. Our future work plans are to implement the above migration algorithm on an MPVM environment [3].
The first step towards implementing this work is to generate realistic workload traces such as the availability times of the workstations in a cluster, the number and durations between successive spawns, and the durations of the processes spawned by the parallel applications, in an MPVM environment.
Acknowledgments The first author acknowledges the comments and suggestions made by Jon Walpole, and Steve Otto during the sabbatical year spent at the Oregon Graduate Institute of Science and Technology. The authors would also like to thank the anonymous referees for their
constructive comments that helped improve the earlier version of the paper.
References
[1] J. Casas, R. Konuru, S. Otto, R. Prouty and J. Walpole, Adaptive load distribution systems for PVM, Supercomputing '94 Proceedings, Washington DC, 14-18 Nov 1994, pp. 390-399.
[2] J.M. Smith, A survey of process migration mechanisms, ACM Oper. Syst. Review, 22 (July 1988) 28-40.
[3] J. Casas, D. Clark, R. Konuru, S. Otto, R. Prouty and J. Walpole, MPVM: a migration transparent version of PVM, mpvmTR.ps.gz, OGI of Science and Technology (February 1995).
[4] C. McCann, R. Vaswani and J. Zahorjan, A dynamic processor allocation policy for multiprogramming shared memory multiprocessors, ACM Trans. on Computer Systems, 11 (May 1993) 146-178.
[5] A. Tucker and A. Gupta, Process control and scheduling issues in multiprogrammed shared memory multiprocessors, Proc. 12th Symposium on Operating System Principles, Dec. 1989, pp. 159-166.
[6] T.L. Casavant and J.G. Kuhl, A taxonomy of scheduling in general-purpose distributed computing systems, IEEE Trans. Soft. Eng., 14 (Feb. 1988) 141-154.
[7] D.L. Eager, E.D. Lazowska and J. Zahorjan, Adaptive load sharing in homogeneous distributed systems, IEEE Trans. Soft. Eng., 12 (May 1986) 662-675.
[8] O. Kremien and J. Kramer, Methodical analysis of adaptive load sharing algorithms, IEEE Trans. on Parallel and Distributed Systems, 3 (November 1992) 747-760.
[9] S. Zhou, A trace driven simulation study of dynamic load balancing, IEEE Trans. Soft. Eng., 14 (September 1988) 1327-1341.
[10] J. Xu and K. Hwang, Dynamic load balancing for parallel program execution on a message passing multicomputer, Proc. Second IEEE Symposium on Parallel and Distributed Processing, 1990, pp. 402-406.
[11] O. Kremien, J. Kramer and J. Magee, Scalable, adaptive load sharing for distributed systems, IEEE Parallel and Distributed Technology, 1 (August 1993) 62-70.
[12] M.H. Willebeek-LeMair and A.P. Reeves, Strategies for dynamic load balancing on highly parallel computers, IEEE Trans. on Parallel and Distributed Systems, 4 (September 1993) 979-993.
[13] T.T. Suen and J.S. Wong, Efficient task migration algorithm for distributed systems, IEEE Trans. on Parallel and Distributed Systems, 3 (July 1992) 488-499.
[14] G.A. Geist and V.S. Sunderam, Network based concurrent computing on the PVM system, Concurrency: Practice & Experience, 4 (June 1992) 293-311.
[15] K.H. Al-Saqabi, S.W. Otto and J. Walpole, Gang scheduling in heterogeneous distributed systems, Tech. Report CSE-94-023, OGI of Science and Technology, August 1994.
[16] D.L. Black, Scheduling support for concurrency and parallelism in the Mach operating system, IEEE Computer, 23 (May 1990) 35-43.