Computers & Operations Research, Vol. 25, No. 10, pp. 839-856, 1998.
PROBABILISTIC LOAD SCHEDULING WITH PRIORITIES IN DISTRIBUTED COMPUTING SYSTEMS

L. Anand, D. Ghose and V. Mani
Department of Aerospace Engineering, Indian Institute of Science, Bangalore, 560012 India

(Received September 1996; in revised form November 1997)

Scope and purpose: Scheduling of processing loads plays an important role in the performance of a distributed computing system/network. This is all the more important when the distributed computing system consists of many heterogeneous processors with different processing capabilities. The scheduling strategies should also take into account the priorities given to the loads when more than one type of job utilizes the system. A methodology is presented to obtain optimal load scheduling strategies when there are two types of loads in a heterogeneous distributed computing system.

Abstract: In this paper a distributed computing system/network in which two types of loads (or jobs), namely local loads and global loads, arrive is considered. The local loads originate at the individual processors/nodes in the network and have to be processed at those same processors. The global loads originate at the central scheduler and have to be distributed among the processors in the system. The individual processors in the system assign different priorities to local and global loads. A priority queuing optimization model of this situation is formulated as a nonlinear programming problem and a solution methodology is presented. The arrival rates of the global and local loads are assumed to follow Poisson distributions, and the service times of the global and local loads at the processors follow a general or exponential service time distribution. Both situations are analyzed. The objective of this study is to minimize the response time of the global loads in the network.
1. INTRODUCTION
A distributed computing system consists of many heterogeneous processors, with different performance capabilities, connected by two-way communication links and, in general, having their own local buffers/memories. The random arrival of loads in such a distributed system results in an uneven build-up of loads at the processors. To improve the performance of such systems, it is necessary to balance/schedule the loads (or jobs) among the processors in the system. Load scheduling has generated considerable interest among researchers in recent times. Various classifications of load scheduling algorithms have been discussed in detail in Refs. [1-3]. Broadly, load scheduling algorithms are classified as static or dynamic. Static algorithms are further classified as deterministic [4, 5] or probabilistic [6, 7]. In static algorithms, the decision to transfer a job is taken without taking the state of the system into consideration. Because of this, wrong decisions that transfer jobs from lightly loaded nodes to heavily loaded nodes can sometimes occur. However, static scheduling algorithms derive their strength from the fact that they are easy to implement. In dynamic or adaptive scheduling algorithms, on the other hand, the decision to transfer is taken based on the state of the system at that instant [8-11]. This ensures that the decisions taken are always correct to the extent of the accuracy of the status information. However, such algorithms have the major drawback that they incur significant overheads in terms of implementation complexity, status exchange strategies, and communication cost. These overheads quite often become prohibitive. In this paper we deal with static algorithms only. Apart from distributed computing systems, static scheduling algorithms find applications in manufacturing systems where large numbers of jobs are processed and the plants operate for long time periods [5].
Load balancing can be done either with the aid of a central scheduler [8, 13] or in a decentralized fashion [3, 10, 12]. A static load balancing algorithm with a central scheduler is discussed in Ref. [14], where a tradeoff study between time and cost in data assignment among several nodes in a distributed computing system is presented. The optimal assignment of data among several processing nodes, including the communication delay, is presented in Ref. [15]. Since in such problems there can be more than one objective, a multicriteria design method is proposed in Ref. [16]; the three objectives considered are minimizing the operating costs, minimizing the system time, and maximizing the job availability. A similar problem, in which prioritized customers are to be distributed in the system, is discussed in Ref. [17]. A related problem is considered in Ref. [18], but in a decentralized setup. The analogues of local loads and global loads are dedicated jobs and generic jobs, respectively, in Ref. [18]. In the model considered in Ref. [18], both kinds of jobs arrive at the processors in the system. The dedicated jobs are those that can be processed only on the processor at which they arrive. The other class of jobs, on the other hand, is more generic in nature and can be processed on any processor in the system, hence its name. In Ref. [18] the generic jobs are re-scheduled suitably to achieve better performance by the system in processing the generic jobs, subject to given constraints on the response time for the dedicated jobs.

In the present paper, we consider a centralized model. In the literature, this model has been considered as consisting of an arrival stream of loads to the central scheduler, which schedules the loads amongst the various processors. These loads, referred to as global loads henceforth, can be processed on any of the processors in the system. Apart from the global loads, there could be some loads arriving locally at each processor and demanding service only at that processor. These loads are termed local loads. This paper deals with the static probabilistic scheduling of the global loads among the processors in the system in the presence of such local loads. We do not place any direct constraint on the response time of the local loads, except that it should be finite. In certain cases, it becomes necessary to settle for a sub-optimal solution in terms of the response time of the global loads in order to maintain the response time of the local loads at a finite value.

When processing the two classes of loads it is possible to process them with different priorities. Here, we consider separately the cases when low priority and high priority, respectively, are given to the global loads. It could also happen that, in the same system, one processor gives high priority to the global loads while another prefers to give them a lower priority. We do not treat such mixed priority systems explicitly in this paper, though a combination of the algorithms presented here can be applied to such systems too.

Load balancing could be preemptive or non-preemptive in nature. In preemptive scheduling, a higher priority load is taken up for processing immediately upon its arrival, suspending the processing of any lower priority load, even if that processing is only partially complete. After the higher priority load is serviced, processing of the suspended load is resumed.
On the other hand, in non-preemptive scheduling, a higher priority load is taken up for processing only when the processor becomes free. In this paper we discuss only non-preemptive scheduling. The problem of non-preemptive scheduling of global loads (in the presence of local loads at the processors), when the processors choose to give different priorities, is formulated as a nonlinear programming problem that minimizes the mean response time of the system for the global loads.

2. THE SYSTEM MODEL
A queuing model of the distributed computing system considered in this paper is shown in Fig. 1.

Fig. 1. Model of a distributed computing system.

In this system, the load arriving at a node is of two types:
(i) Local loads: These loads arrive at node j at a Poisson rate a_j, demanding service only at that node. The service time is a random variable that follows a general distribution with mean 1/b_j and variance σ²_lj, where j denotes the node index and 'l' denotes the class of local loads.
(ii) Global loads: This class of loads arrives at the central scheduler and can be processed by any of the nodes in the system. The loads are assigned to the nodes by the central scheduler using some pre-defined assignment policy. The global loads are assumed to arrive at
node j at a Poisson rate λ_j. The service times of these loads follow a general distribution with mean 1/μ_j and variance σ²_gj, where j denotes the node index and 'g' denotes the class of global loads.

The objective of the central scheduler is to schedule the global loads such that the mean response time of the global loads is a minimum. The central scheduler has control over the response time of the global loads in the sense that it can suitably schedule these loads so as to minimize the average response time of this class of loads. On the other hand, the local loads arrive and are serviced at the same node, and hence their response times cannot be controlled through scheduling. Therefore, it is more meaningful to talk of optimal scheduling of the global loads so as to minimize their average response time.

Each node has to process two classes of loads. The nodes are at liberty to give high priority to either of the classes. Assuming that all the nodes in the system assign the same priority to the global loads, we have the following two cases:
1. Local loads are given higher priority over global loads.
2. Global loads are given higher priority over local loads.

If W_j denotes the mean response time of the global loads at the j-th node, then the mean system response time W for the global loads can be defined as

W = \sum_{j=1}^{M} p_j W_j    (1)

where p_j is the probability with which a load is allocated to node j. It follows that

p_j = \frac{\lambda_j}{\lambda}    (2)

and since \sum_{j=1}^{M} \lambda_j = \lambda, we have \sum_{j=1}^{M} p_j = 1. To obtain a feasible solution to the load scheduling problem, we require the following conditions to be satisfied:

a_j < b_j, \quad j = 1, \ldots, M    (3)
\sum_{j=1}^{M} \left( \mu_j - \frac{\mu_j a_j}{b_j} \right) > \lambda.    (4)
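Conditions (3) and (4) are cheap to verify before any optimization is attempted. The following minimal sketch (Python, with hypothetical parameter arrays a, b, mu for the per-node rates and lam for the global arrival rate) is only an illustration of the check, not part of the original formulation.

```python
# Feasibility check for conditions (3) and (4).
# a[j], b[j]: local arrival and local service rates at node j
# mu[j]     : global service rate at node j;  lam: global arrival rate

def is_feasible(a, b, mu, lam):
    # Condition (3): every local queue must be stable on its own, a_j < b_j.
    if any(aj >= bj for aj, bj in zip(a, b)):
        return False
    # Condition (4): the capacity left over by the local loads must exceed
    # the global arrival rate, sum_j mu_j (1 - a_j / b_j) > lambda.
    spare = sum(mj * (1.0 - aj / bj) for aj, bj, mj in zip(a, b, mu))
    return spare > lam

# Example usage (the 2-node parameters of Example 1 in Section 5):
print(is_feasible(a=[1.0, 1.0], b=[10.0, 20.0], mu=[8.0, 5.0], lam=11.7))  # True
```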
Violation of equation (3) results in the queue length, and hence the response time, of the local loads becoming unbounded, while equation (4) ensures that the arrival rate of the global loads is less than the total available service rate in the system. Violation of equation (4) results in the queue for the global loads becoming unbounded in at least one of the nodes. In the following sections, we discuss separately the cases when high priority is given to the local and to the global loads, respectively.

3. HIGH PRIORITY FOR LOCAL LOADS
In this section, we discuss the case when the nodes in the system uniformly give high priority to the local loads. From [19], we obtain the expression for W_j as

W_j = \frac{b_j^2 \mu_j^2 \left[ p_j\lambda\left(\sigma_{gj}^2 + 1/\mu_j^2\right) + a_j\left(\sigma_{lj}^2 + 1/b_j^2\right) \right] + 2(b_j - a_j)\left(\mu_j b_j - a_j\mu_j - p_j b_j\lambda\right)}{2\mu_j (b_j - a_j)\left(\mu_j b_j - a_j\mu_j - p_j b_j\lambda\right)}    (5)

where p_j is as defined in equation (2). Equation (5) can be rewritten as

W_j = \frac{A_j + B_j p_j\lambda}{C_j\left(D_j - E_j p_j\lambda\right)}    (6)

where

A_j = a_j b_j^2 \mu_j^2\left(\sigma_{lj}^2 + 1/b_j^2\right) + 2\mu_j(b_j - a_j)^2,
B_j = b_j^2\mu_j^2\left(\sigma_{gj}^2 + 1/\mu_j^2\right) - 2 b_j (b_j - a_j),
C_j = 2\mu_j(b_j - a_j), \quad D_j = \mu_j(b_j - a_j), \quad E_j = b_j.    (7)
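Since the passage from (5) to (6)-(7) is purely algebraic, it is easy to sanity-check numerically. The sketch below (Python; function and parameter names are illustrative placeholders) evaluates both forms for one node; they should agree to floating point precision.

```python
# Mean response time of the global loads at node j when local loads have
# high priority: equation (5) directly, and via the coefficients of (6)-(7).

def wj_direct(pj, lam, a, b, mu, var_l, var_g):
    """Equation (5): non-preemptive M/G/1 priority queue, global = low priority."""
    num = (b**2 * mu**2 * (pj*lam*(var_g + 1.0/mu**2) + a*(var_l + 1.0/b**2))
           + 2.0*(b - a)*(mu*b - a*mu - pj*b*lam))
    den = 2.0*mu*(b - a)*(mu*b - a*mu - pj*b*lam)
    return num / den

def wj_coeffs(pj, lam, a, b, mu, var_l, var_g):
    """Equations (6)-(7): W_j = (A + B p_j lam) / (C (D - E p_j lam))."""
    A = a*b**2*mu**2*(var_l + 1.0/b**2) + 2.0*mu*(b - a)**2
    B = b**2*mu**2*(var_g + 1.0/mu**2) - 2.0*b*(b - a)
    C = 2.0*mu*(b - a)
    D = mu*(b - a)
    E = b
    return (A + B*pj*lam) / (C*(D - E*pj*lam))

# Both forms should coincide up to floating point error:
args = dict(pj=0.4, lam=11.7, a=1.0, b=10.0, mu=8.0, var_l=1.0/10.0**2, var_g=1.0/8.0**2)
print(wj_direct(**args), wj_coeffs(**args))
```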
Our objective is to determine p_j (j = 1, ..., M) such that W = \sum_{j=1}^{M} p_j W_j is a minimum. We can formulate the optimization problem as

Minimize \sum_{j=1}^{M} p_j W_j    (8)

subject to

\sum_{j=1}^{M} p_j = 1,    (9)

p_j \ge 0, \quad j = 1, \ldots, M,    (10)

p_j < \frac{D_j}{E_j\lambda}.    (11)

Note that a_j, b_j, λ and μ_j (j = 1, ..., M) are so chosen as to satisfy Equations (3) and (4). This is necessary for the optimization problem defined in Equations (8)-(11) to have a meaningful solution. For the time being, we shall ignore equation (11). It will be shown later that the solution to Equations (8)-(10) automatically satisfies equation (11), i.e., equation (11) is a passive constraint. The augmented function for the optimization problem defined in Equations (8)-(10) can be written as [20]

L = \sum_{j=1}^{M} p_j W_j - K\left(\sum_{j=1}^{M} p_j - 1\right) + \sum_{j=1}^{M} L_j p_j    (12)
where K and L_j (j = 1, ..., M) are Lagrange multipliers that satisfy

L_j = 0 \ \text{if } p_j > 0, \qquad L_j \ge 0 \ \text{if } p_j = 0.    (13)

Case 1. p_j > 0 (j = 1, ..., M). Under this condition, equation (12) becomes

L = \sum_{j=1}^{M} p_j W_j - K\left(\sum_{j=1}^{M} p_j - 1\right).    (14)

At the minimum point, according to the Karush-Kuhn-Tucker conditions [20], \partial L/\partial p_j = 0, applying which we get

K = \frac{1}{C_j} \left\{ \frac{A_j D_j + 2 B_j D_j p_j\lambda - B_j E_j p_j^2\lambda^2}{\left(D_j - E_j p_j\lambda\right)^2} \right\}    (15)

where A_j, B_j, C_j, D_j and E_j are as defined in equation (7). If K is known, then equation (15) leads to a quadratic equation in p_j, which can be solved to get

p_j = \frac{K C_j D_j E_j\lambda + B_j D_j\lambda}{K C_j E_j^2\lambda^2 + B_j E_j\lambda^2} \pm \sqrt{ \left( \frac{K C_j D_j E_j\lambda + B_j D_j\lambda}{K C_j E_j^2\lambda^2 + B_j E_j\lambda^2} \right)^2 - \frac{K C_j D_j^2 - A_j D_j}{K C_j E_j^2\lambda^2 + B_j E_j\lambda^2} }.    (16)

Of the two values of p_j given in equation (16), only one satisfies equation (11), and that is given by

p_j = \frac{K C_j D_j E_j\lambda + B_j D_j\lambda}{K C_j E_j^2\lambda^2 + B_j E_j\lambda^2} - \sqrt{ \left( \frac{K C_j D_j E_j\lambda + B_j D_j\lambda}{K C_j E_j^2\lambda^2 + B_j E_j\lambda^2} \right)^2 - \frac{K C_j D_j^2 - A_j D_j}{K C_j E_j^2\lambda^2 + B_j E_j\lambda^2} }.    (17)

From Equations (9) and (17), it is difficult to solve for K and p_j analytically. We present an algorithm to determine the value of K that gives the optimal load fractions p_j (j = 1, ..., M). For this, we study the function

K_j(p_j) = \frac{1}{C_j} \left\{ \frac{A_j D_j + 2 B_j D_j p_j\lambda - B_j E_j p_j^2\lambda^2}{\left(D_j - E_j p_j\lambda\right)^2} \right\}.    (18)

From equation (18), it follows that

\frac{dK_j(p_j)}{dp_j} = \frac{2\lambda D_j\left(B_j D_j + A_j E_j\right)}{C_j\left(D_j - E_j p_j\lambda\right)^3}.    (19)

For A_j, B_j, C_j, D_j and E_j defined in equation (7), we have B_j D_j + A_j E_j > 0. So

\frac{dK_j(p_j)}{dp_j} > 0 \ \text{if } p_j < \frac{D_j}{E_j\lambda}, \qquad \frac{dK_j(p_j)}{dp_j} < 0 \ \text{if } p_j > \frac{D_j}{E_j\lambda}.    (20)

At p_j = D_j/(E_j\lambda), K_j(p_j) goes to infinity and so does dK_j(p_j)/dp_j. Also,

K_j^0 \equiv K_j(p_j = 0) = \frac{A_j}{C_j D_j} > 0.    (21)
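For use in the search procedure described below, equations (17), (18) and (21) can be coded directly. The helper below is a sketch (illustrative names): it maps a trial multiplier K to the load fraction p_j of equation (17), returning zero whenever K ≤ K_j^0 so that a node below its threshold receives no global load.

```python
import math

def K_of_p(p, lam, A, B, C, D, E):
    """Equation (18): K_j as a function of p_j, for node coefficients of equation (7)."""
    return (A*D + 2.0*B*D*p*lam - B*E*(p*lam)**2) / (C*(D - E*p*lam)**2)

def p_of_K(K, lam, A, B, C, D, E):
    """Equation (17): the root of the quadratic in p_j that respects the bound (11)."""
    K0 = A / (C * D)                       # K_j^0 of equation (21)
    if K <= K0:
        return 0.0                          # node j is allocated no global load yet
    denom = K*C*E**2*lam**2 + B*E*lam**2
    mid = (K*C*D*E*lam + B*D*lam) / denom
    rad = max(mid**2 - (K*C*D**2 - A*D) / denom, 0.0)
    return mid - math.sqrt(rad)             # the '-' root stays below D/(E*lam)
```

Because K_j is increasing on [0, D_j/(E_jλ)), this inverse is nondecreasing in K; the shape of K_j over the rest of the p_j axis is characterized next.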
K_j(p_j) = 0 \;\Rightarrow\; B_j E_j p_j^2\lambda^2 - 2 B_j D_j p_j\lambda - A_j D_j = 0    (22)

or

p_j = \frac{D_j}{E_j\lambda}\left(1 + \sqrt{1 + \frac{A_j E_j}{B_j D_j}}\right) \ \text{if } B_j > 0, \qquad p_j = \frac{D_j}{E_j\lambda}\left(1 - \sqrt{1 + \frac{A_j E_j}{B_j D_j}}\right) \ \text{if } B_j < 0.    (23)

The other root of equation (22) results in p_j being negative and hence can be neglected. Asymptotically, as p_j \to \infty, we get

\lim_{p_j\to\infty} K_j(p_j) = \lim_{p_j\to\infty} \frac{1}{C_j}\left\{ \frac{A_j D_j + 2B_j D_j p_j\lambda - B_j E_j p_j^2\lambda^2}{\left(D_j - E_j p_j\lambda\right)^2} \right\} = -\frac{B_j}{C_j E_j}.    (24)

Based on the above, a qualitative representation of the function K_j(p_j) is shown in Fig. 2. It is clear that, for K ≥ 0 and for the solution to be feasible, equation (11) serves as an upper bound on p_j. Any algorithm to obtain p_j (j = 1, ..., M) should be such that equation (11) is satisfied for all j = 1, ..., M. If we can prove that the algorithm that requires equation (11) to be satisfied does terminate with a solution, then we would have ensured that equation (11) is satisfied, thereby justifying our earlier relaxation of this condition. It is clear that for the algorithm to terminate the condition

\sum_{j=1}^{M} \frac{D_j}{E_j\lambda} > 1    (25)

is required. We prove the validity of this condition in Lemma 1 later.

Fig. 2. Graph of K_j vs p_j.

Case 2. p_j = 0 for at least one j ∈ {1, ..., M}.
Under this condition, equation (12) becomes

L = \sum_{j=1}^{M} p_j W_j - K\left(\sum_{j=1}^{M} p_j - 1\right) + \sum_{j=1}^{M} L_j p_j    (26)

where the L_j are as defined in equation (13). We define a partition {I_1, I_2} on I = {1, ..., M} such that

I_1 = \{ j : j \in I, \ p_j = 0 \}, \qquad I_2 = \{ j : j \in I, \ p_j > 0 \}.

Applying the Karush-Kuhn-Tucker conditions [20] and simplifying, we get

K = \frac{1}{C_j}\left\{ \frac{A_j D_j + 2 B_j D_j p_j\lambda - B_j E_j p_j^2\lambda^2}{\left(D_j - E_j p_j\lambda\right)^2} \right\} + L_j,    (27)

where L_j is as defined in equation (13). Consider the function

K_j(p) = \frac{1}{C_j}\left\{ \frac{A_j D_j + 2 B_j D_j p\lambda - B_j E_j p^2\lambda^2}{\left(D_j - E_j p\lambda\right)^2} \right\}.    (28)

This function K_j(p) in equation (28) is treated as in Case 1. In the algorithm to obtain the optimal load fractions p_j (j = 1, ..., M), the partition on I is obtained before the load fractions are computed. As in Case 1, for termination of the algorithm, equation (25) needs to be satisfied. Before we describe the algorithm, we shall discuss the condition mentioned in equation (25).

Lemma 1. In a stable distributed computing system, \sum_{j=1}^{M} D_j/(E_j\lambda) > 1.

Proof. For the queue length at node j to be stable,

\frac{a_j}{b_j} + \frac{p_j\lambda}{\mu_j} < 1,    (29)

that is,

b_j(\mu_j - p_j\lambda) - \mu_j a_j > 0.    (30)

Dividing both sides by b_j and re-arranging, we get

\frac{\mu_j(b_j - a_j)}{b_j} - p_j\lambda > 0.    (31)

Summing both sides over all the nodes in the system and simplifying, we get

\sum_{j=1}^{M} \frac{\mu_j(b_j - a_j)}{b_j\lambda} > 1, \quad \text{that is,} \quad \sum_{j=1}^{M}\frac{D_j}{E_j\lambda} > 1.    (32)

∎
Now we shall describe the algorithm to solve the optimization problem defined in Equations (8)-(11).

Step 1. Compute K_j^0 (j = 1, ..., M) using equation (21) and order the nodes such that K_1^0 ≤ K_2^0 ≤ ... ≤ K_M^0. Set ĵ = 1.
Step 2. Set K = K_ĵ^0.
Step 3. Compute p_j (j = 1, ..., ĵ) using equation (17).
  If \sum_{j=1}^{ĵ} p_j < 1 then set ĵ = ĵ + 1.
    If ĵ ≤ M then go to Step 2;
    else determine an integer n > 0 such that \sum_{j=1}^{M} p_j(K = (n-1)K_M^0) < 1 and \sum_{j=1}^{M} p_j(K = n K_M^0) > 1. Set K_max = n K_M^0 and K_min = (n-1)K_M^0, set I_1 = ∅ and I_2 = {1, ..., M}, and go to Step 4.
  Else if \sum_{j=1}^{ĵ} p_j = 1 then set I_1 = {ĵ, ..., M} and I_2 = {1, ..., ĵ-1}, and stop.
  Else set K_max = K_ĵ^0 and K_min = K_{ĵ-1}^0, set I_1 = {ĵ, ..., M} and I_2 = {1, ..., ĵ-1}, and go to Step 4.
Step 4. Set K = (K_min + K_max)/2.
Step 5. Compute p_j (j ∈ I_2) from equation (17) such that equation (11) is satisfied.
  If \sum_{j∈I_2} p_j < 1 then set K_min = K and go to Step 4;
  else if \sum_{j∈I_2} p_j = 1 then stop;
  else set K_max = K and go to Step 4.

Lemma 1 assures termination of this algorithm. (A compact code sketch of this search appears after equation (33) below.) For the local loads, the response time W_lj at node j is given by [19]

W_{lj} = \frac{p_j\lambda b_j^2\left(\sigma_{gj}^2 + 1/\mu_j^2\right) + a_j b_j^2\left(\sigma_{lj}^2 + 1/b_j^2\right) + 2(b_j - a_j)}{2 b_j (b_j - a_j)}.    (33)
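As promised above, here is a compact sketch of Steps 1-5 in code. Instead of stepping through the ordered K_j^0 values, it brackets K by doubling and then bisects, which is an equivalent way of locating the K at which the allocations of equation (17) sum to one; all names are illustrative, and the sketch is a simplified rendering rather than a verbatim transcription of the steps.

```python
import math

def coefficients(a, b, mu, var_l, var_g):
    """Node coefficients of equation (7)."""
    A = a*b**2*mu**2*(var_l + 1.0/b**2) + 2.0*mu*(b - a)**2
    B = b**2*mu**2*(var_g + 1.0/mu**2) - 2.0*b*(b - a)
    C = 2.0*mu*(b - a)
    D = mu*(b - a)
    E = b
    return A, B, C, D, E

def p_of_K(K, lam, A, B, C, D, E):
    """Equation (17), with p_j = 0 whenever K <= K_j^0 of equation (21)."""
    if K <= A/(C*D):
        return 0.0
    denom = K*C*E**2*lam**2 + B*E*lam**2
    mid = (K*C*D*E*lam + B*D*lam) / denom
    rad = max(mid**2 - (K*C*D**2 - A*D)/denom, 0.0)
    return mid - math.sqrt(rad)

def schedule_local_priority(a, b, mu, var_l, var_g, lam, tol=1e-10):
    """Load fractions p_j for problem (8)-(11), found by bisection on K."""
    coef = [coefficients(*z) for z in zip(a, b, mu, var_l, var_g)]
    total = lambda K: sum(p_of_K(K, lam, *c) for c in coef)
    k_lo = min(c[0]/(c[2]*c[3]) for c in coef)   # smallest K_j^0: total is 0 here
    k_hi = 2.0*k_lo
    while total(k_hi) < 1.0:                     # Lemma 1 guarantees this terminates
        k_hi *= 2.0
    while k_hi - k_lo > tol:                     # Steps 4-5: bisect on K
        k_mid = 0.5*(k_lo + k_hi)
        if total(k_mid) < 1.0:
            k_lo = k_mid
        else:
            k_hi = k_mid
    return [p_of_K(k_hi, lam, *c) for c in coef]
```

The returned fractions sum to one up to the bisection tolerance, and each stays strictly below D_j/(E_jλ), so constraint (11) is respected.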
W_lj in equation (33) is finite so long as b_j > a_j (which is true from equation (3)). Thus, the p_j (j = 1, ..., M) that solve the optimization problem defined in Equations (8)-(11) maintain the queue lengths and the response times for both classes at a finite level. Hence, the solution that we obtain is feasible from the local loads' point of view too.

3.1. Exponential service times

As a particular case, when the service times at the nodes follow exponential distributions (with means 1/μ_j and 1/b_j for the global and local loads, respectively, j = 1, ..., M), the expression for W_j simplifies to

W_j = \frac{(b_j - a_j)\left(\mu_j b_j - \mu_j a_j - b_j p_j\lambda\right) + a_j\mu_j^2 + p_j\lambda b_j^2}{\mu_j (b_j - a_j)\left(\mu_j b_j - \mu_j a_j - b_j p_j\lambda\right)}.    (34)
The formulation and the algorithm for obtaining the solution follow the same steps as mentioned above. The expression for the response time of the local loads is given by [19]

W_{lj} = \frac{b_j + p_j\lambda\left(b_j/\mu_j\right)^2}{b_j (b_j - a_j)}.    (35)
From the above, it is clear that W_lj is finite since p_j is bounded. Hence the response times of jobs in both queues remain finite for the solution obtained.

4. HIGH PRIORITY FOR GLOBAL LOADS
In this section we discuss the case when all the nodes in the distributed computing system give high priority to the global loads over the local loads. From [19] we have

W_j = \frac{p_j\lambda\mu_j^2\left(\sigma_{gj}^2 + 1/\mu_j^2\right) + a_j\mu_j^2\left(\sigma_{lj}^2 + 1/b_j^2\right) + 2\mu_j - 2 p_j\lambda}{2\mu_j\left(\mu_j - p_j\lambda\right)}    (36)

which can be simplified and rewritten as

W_j = \frac{U_j + V_j p_j\lambda}{2\mu_j\left(\mu_j - p_j\lambda\right)}    (37)

where

U_j = a_j\mu_j^2\left(\sigma_{lj}^2 + 1/b_j^2\right) + 2\mu_j, \qquad V_j = \mu_j^2\sigma_{gj}^2 - 1.    (38)

Hence, we can write the mean response time of the system as

W = \sum_{j=1}^{M} p_j \, \frac{U_j + V_j p_j\lambda}{2\mu_j\left(\mu_j - p_j\lambda\right)}.    (39)
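Equations (37)-(39) translate directly into code. The sketch below (Python; names are placeholders) evaluates the system-wide mean response time of the global loads for a given allocation vector.

```python
def uj_vj(a, b, mu, var_l, var_g):
    """Equation (38)."""
    U = a*mu**2*(var_l + 1.0/b**2) + 2.0*mu
    V = mu**2*var_g - 1.0
    return U, V

def system_response_time(p, a, b, mu, var_l, var_g, lam):
    """Equation (39): W = sum_j p_j (U_j + V_j p_j lam) / (2 mu_j (mu_j - p_j lam))."""
    W = 0.0
    for pj, aj, bj, mj, vl, vg in zip(p, a, b, mu, var_l, var_g):
        U, V = uj_vj(aj, bj, mj, vl, vg)
        W += pj*(U + V*pj*lam) / (2.0*mj*(mj - pj*lam))
    return W
```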
Apart from the constraints given earlier in the optimization problem defined in Equations (8)-(11), we have an additional constraint. The constraints mentioned earlier are enough to ensure stability for the global loads. In the present case, since the local loads are given low priority, even if the global loads' queue is bounded, it is possible for the local loads' queue to become unstable. To ensure that the total load on a node is less than the total capacity that the node can handle, and thereby ensure the stability of the local loads' queue, we require

p_j < \frac{\mu_j}{\lambda}\left(1 - \frac{a_j}{b_j}\right).    (40)

This condition can be inferred from the mean response time expression for the local loads W_lj at node j, given by [19]

W_{lj} = \frac{b_j^2\mu_j^2\left[p_j\lambda\left(\sigma_{gj}^2 + 1/\mu_j^2\right) + a_j\left(\sigma_{lj}^2 + 1/b_j^2\right)\right] + 2\left(\mu_j - p_j\lambda\right)\left(\mu_j b_j - a_j\mu_j - p_j\lambda b_j\right)}{2 b_j\left(\mu_j - p_j\lambda\right)\left(\mu_j b_j - a_j\mu_j - p_j\lambda b_j\right)}.    (41)
In the last section we observed that the stability of the low priority global loads' queue automatically ensures the stability of the high priority local loads' queue, so a constraint to ensure the stability of the local loads' queue was not necessary there. The strict inequality in equation (40) is modified to read

p_j \le \frac{\mu_j}{\lambda}\left(1 - \frac{a_j}{b_j} - \Delta_j\right)    (42)

with Δ_j > 0. Hence, the optimization problem for the present case is formulated as

Minimize \sum_{j=1}^{M} p_j \, \frac{U_j + V_j p_j\lambda}{2\mu_j\left(\mu_j - p_j\lambda\right)}    (43)

subject to

\sum_{j=1}^{M} p_j = 1,    (44)

p_j \ge 0, \quad j = 1, \ldots, M,    (45)

p_j \le \frac{\mu_j}{\lambda}\left(1 - \frac{a_j}{b_j} - \Delta_j\right), \quad j = 1, \ldots, M.    (46)

For this optimization problem, we can write the augmented function as

L = \sum_{j=1}^{M} p_j \, \frac{U_j + V_j p_j\lambda}{2\mu_j\left(\mu_j - p_j\lambda\right)} - K\left(\sum_{j=1}^{M} p_j - 1\right) + \sum_{j=1}^{M} L_j p_j - \sum_{j=1}^{M} G_j\left[p_j - \frac{\mu_j}{\lambda}\left(1 - \frac{a_j}{b_j} - \Delta_j\right)\right].    (47)
From this,

\frac{\partial L}{\partial p_j} = \frac{U_j\mu_j - V_j p_j^2\lambda^2 + 2 V_j\mu_j p_j\lambda}{2\mu_j\left(\mu_j - p_j\lambda\right)^2} - K + L_j - G_j = 0.    (48)

Hence,

K = \frac{U_j\mu_j - V_j p_j^2\lambda^2 + 2 V_j\mu_j p_j\lambda}{2\mu_j\left(\mu_j - p_j\lambda\right)^2} + L_j - G_j.    (49)

We define

K_j(p) = \frac{U_j\mu_j - V_j p^2\lambda^2 + 2 V_j\mu_j p\lambda}{2\mu_j\left(\mu_j - p\lambda\right)^2}.    (50)
Lemma 2. The function K_j defined in equation (50) is a monotonically increasing function of p.

Proof. It is enough to show that \partial K_j/\partial p > 0. We have

\frac{\partial K_j}{\partial p} = \frac{\left(\mu_j - p\lambda\right)^2\left(2V_j\mu_j\lambda - 2V_j p\lambda^2\right) + 2\lambda\left(\mu_j - p\lambda\right)\left(U_j\mu_j + 2V_j\mu_j p\lambda - V_j p^2\lambda^2\right)}{2\mu_j\left(\mu_j - p\lambda\right)^4} = \frac{\lambda\left(U_j + V_j\mu_j\right)}{\left(\mu_j - p\lambda\right)^3} > 0,

since U_j + V_j\mu_j > 0 and p < \mu_j/\lambda. ∎

Let us define

p_{jb} \triangleq \frac{\mu_j}{\lambda}\left(1 - \frac{a_j}{b_j} - \Delta_j\right).    (51)
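Equation (51) fixes the per-node upper bounds once the margins Δ_j are chosen. A small sketch (illustrative names) computes them and checks whether they still leave room for a full allocation, which is the condition formalized in Lemma 3 below.

```python
def upper_bounds(a, b, mu, lam, delta):
    """p_jb of equation (51) for every node."""
    return [(mj/lam)*(1.0 - aj/bj - dj) for aj, bj, mj, dj in zip(a, b, mu, delta)]

def margins_admissible(a, b, mu, lam, delta):
    """True if the bounds are nonnegative and sum to at least one (cf. Lemma 3)."""
    p_b = upper_bounds(a, b, mu, lam, delta)
    return all(pb >= 0.0 for pb in p_b) and sum(p_b) >= 1.0
```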
Once the Δ_j (j = 1, ..., M) are fixed, p_jb also gets fixed. We thus have 0 ≤ p_j ≤ p_jb. The choice of Δ_j decides the extent to which we can approach the optimal solution when that optimum lies either on the boundary (defined by p_jb with Δ_j = 0) or beyond it. Hence, the choice of Δ_j (j = 1, ..., M) is crucial in determining the efficiency of load sharing. We now describe an algorithm to obtain the best possible solution for the load scheduling problem defined by Equations (43)-(46) while controlling the low priority queue length at a finite value. Before we describe the algorithm, we discuss the condition that needs to be satisfied in order to ensure the existence of a solution to the problem defined by Equations (43)-(46).

Lemma 3. For a given choice of Δ_j (j = 1, ..., M), the low priority queues can be controlled at finite lengths if and only if

\sum_{j=1}^{M} p_{jb} \ge 1.    (52)

Proof. Suppose equation (52) is true. Under this condition it is possible to determine p_j (j = 1, ..., M) such that p_j ≤ p_jb and \sum_{j=1}^{M} p_j = 1, which corresponds to the existence of a solution that controls the low priority queues at finite lengths. This proves the sufficiency part of the Lemma. To show that the condition mentioned in equation (52) is indeed necessary, let us assume that equation (52) does not hold. Hence,

\sum_{j=1}^{M} p_{jb} < 1.    (53)

For stability of the low priority queues we require

p_j \le p_{jb}, \quad j = 1, \ldots, M,    (54)

which implies that

\sum_{j=1}^{M} p_j \le \sum_{j=1}^{M} p_{jb} < 1.    (55)

Hence, it is not possible to select p_j (j = 1, ..., M) such that \sum_{j=1}^{M} p_j = 1. In other words, for any solution p_j (j = 1, ..., M) with \sum_{j=1}^{M} p_j = 1, there exists at least one j that violates the boundary constraint defined by equation (54). So at least some of the low priority queues are unstable. Thus, equation (52) is also a necessary condition for the stability of all the low priority queues. ∎

Lemma 3 defines a constraint on the choice of Δ_1, ..., Δ_M. The choice of Δ_1, ..., Δ_M will become clear when we discuss an example in the next section. Here, we shall discuss an algorithm for determining the required solution to Equations (43)-(46). As before, we partition I = {1, ..., M}, but this time into I_1, I_2 and I_3 defined as

I_1 = \{ j : p_j = 0 \}, \qquad I_2 = \{ j : 0 < p_j < p_{jb} \}, \qquad I_3 = \{ j : p_j = p_{jb} \}.    (56)

The Lagrange multipliers in equation (47) are such that L_j ≥ 0 for j ∈ I_1, L_j = 0 for j ∈ I_2 ∪ I_3, G_j = 0 for j ∈ I_1 ∪ I_2 and G_j ≥ 0 for j ∈ I_3. The algorithm described below constructs the partition on I. For j ∈ I_2, from equation (49) on simplification, we get

p_j^2\lambda^2\left(2K\mu_j + V_j\right) - 2 p_j\lambda\mu_j\left(2K + V_j\right) + 2K\mu_j^3 - U_j\mu_j = 0.    (57)

Solving the quadratic equation, we get

p_j = \frac{\mu_j\left(2K + V_j\right) \pm \sqrt{\mu_j^2\left(2K + V_j\right)^2 - \left(2K\mu_j + V_j\right)\left(2K\mu_j^3 - U_j\mu_j\right)}}{\lambda\left(2K\mu_j + V_j\right)}.    (58)
Of the two values obtained for p_j, the one that falls in the feasible region (namely, 0 ≤ p_j ≤ p_jb) is the value used in the algorithm described below.

Step 1. Compute K_j^0 (j = 1, ..., M) and K_j(p_jb) using equation (21) and equation (51), respectively, and order the nodes such that K_1^0 ≤ K_2^0 ≤ ... ≤ K_M^0. Set ĵ = 1 and I_1 = I_3 = ∅.
Step 2. Set K = K_ĵ^0.
Step 3. For j = 1, ..., ĵ: compute p_j using equation (58); if p_j ≥ p_jb then set p_j = p_jb and I_3 = I_3 ∪ {j}.
  If \sum_{j=1}^{ĵ} p_j < 1 then set ĵ = ĵ + 1.
    If ĵ ≤ M then go to Step 2;
    else determine an integer n > 0 such that \sum_{j=1}^{M} p_j(K = (n-1)K_M^0) < 1 and \sum_{j=1}^{M} p_j(K = n K_M^0) > 1. Set K_max = n K_M^0 and K_min = (n-1)K_M^0, set I_1 = ∅ and I_2 = {1, ..., M}\I_3, and go to Step 4.
  Else if \sum_{j=1}^{ĵ} p_j = 1 then set I_1 = {ĵ, ..., M} and I_2 = {1, ..., ĵ-1}\I_3, and stop.
  Else set K_max = K_ĵ^0 and K_min = K_{ĵ-1}^0, set I_1 = {ĵ, ..., M} and I_2 = {1, ..., ĵ-1}\I_3, and go to Step 4.
Step 4. Set K = (K_min + K_max)/2.
Step 5. Compute p_j (j ∈ I_2) from equation (58).
  If \sum_{j∈I_2} p_j + \sum_{j∈I_3} p_jb < 1 then set K_min = K and go to Step 4;
  else if \sum_{j∈I_2} p_j + \sum_{j∈I_3} p_jb = 1 then stop;
  else set K_max = K and go to Step 4.

So long as the condition given in Lemma 3 holds, the algorithm terminates. As in the algorithm discussed in Section 3, the partition {I_1, I_2, I_3} on I is obtained within the algorithm itself. This algorithm also ensures that the low priority queues of local loads do not become unbounded. (A code sketch of this procedure, specialized to exponential service times, follows equation (60) below.)

4.1. Exponential service times

When the service times follow an exponential distribution, the expression for the mean response time of the global loads simplifies to [19]
W_j = \frac{\mu_j + a_j\left(\mu_j/b_j\right)^2}{\mu_j\left(\mu_j - p_j\lambda\right)}    (59)

and the expression for the response time of the low priority local loads becomes

W_{lj} = \frac{\left(\mu_j - p_j\lambda\right)\left(\mu_j b_j - b_j p_j\lambda - \mu_j a_j\right) + p_j\lambda b_j^2 + a_j\mu_j^2}{b_j\left(\mu_j - p_j\lambda\right)\left(\mu_j b_j - b_j p_j\lambda - \mu_j a_j\right)}.    (60)
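For the exponential case just described, V_j = 0 and equation (58) collapses to a single square root, so the procedure above reduces to a bisection on K with each p_j clipped at its bound p_jb. The sketch below is a simplified rendering of that procedure (not a verbatim transcription of the steps); all names are illustrative.

```python
import math

def schedule_global_priority_exp(a, b, mu, lam, delta, tol=1e-10):
    """Allocation for Section 4 with exponential service times (so V_j = 0)."""
    U   = [2.0*aj*mj**2/bj**2 + 2.0*mj for aj, bj, mj in zip(a, b, mu)]            # eq. (38), var_l = 1/b^2
    p_b = [(mj/lam)*(1.0 - aj/bj - dj) for aj, bj, mj, dj in zip(a, b, mu, delta)]  # eq. (51)
    if sum(p_b) < 1.0:
        raise ValueError("Lemma 3 violated: the chosen Delta_j admit no stable allocation")
    if abs(sum(p_b) - 1.0) < 1e-12:
        return p_b                         # boundary case: every node sits at its bound

    def p_of_K(K, Uj, mj, pb):
        # Inverse of K_j(p) = U_j / (2 (mu_j - p*lam)^2), clipped to [0, p_jb].
        if K <= Uj/(2.0*mj**2):            # below K_j^0: node j gets nothing yet
            return 0.0
        return min(pb, (mj - math.sqrt(Uj/(2.0*K)))/lam)

    total = lambda K: sum(p_of_K(K, Uj, mj, pb) for Uj, mj, pb in zip(U, mu, p_b))
    k_lo = min(Uj/(2.0*mj**2) for Uj, mj in zip(U, mu))
    k_hi = 2.0*k_lo
    while total(k_hi) < 1.0:               # terminates since sum_j p_jb > 1
        k_hi *= 2.0
    while k_hi - k_lo > tol:               # bisection on K, as in Steps 4-5
        k_mid = 0.5*(k_lo + k_hi)
        if total(k_mid) < 1.0:
            k_lo = k_mid
        else:
            k_hi = k_mid
    return [p_of_K(k_hi, Uj, mj, pb) for Uj, mj, pb in zip(U, mu, p_b)]
```

Nodes returned with p_j = 0 form I_1, those held at p_jb form I_3, and the rest form I_2, so the partition of equation (56) falls out of the same computation.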
The rest of the theory remains the same as in the case of a general distribution. In Ref. [18] the authors have discussed a generalized version of the problem presented here, with the service times following exponential distributions. They consider more than one kind of local load and, using the algorithm given in Ref. [7], obtain the optimal probability distribution. The authors prove in Ref. [18] that the optimal scheduling policy at each node requires that the global loads are given high priority over other local loads, or at least the same priority as some or all of the local loads. In Ref. [18], a constraint in the form of an upper limit on the response time of the local loads is assumed. In this paper, we consider a single class of local loads and a constraint on the allocation probabilities of the global loads (i.e., in the probability space) as a function of the local loads' parameters. When talking of the optimal solution for the global loads, it is more meaningful to arrive at a compromise between the extent of sub-optimality of the solution that can be tolerated and the response time of the local loads, which is what we do in the
discussion presented here. An upper limit on the response time of the local loads would help us in the choice of Δ_1, ..., Δ_M, as discussed below. We have considered the constraint on p_j as 0 ≤ p_j ≤ p_jb. Let us define x_j = W_lj(p_jb). Hence,

W_{lj} \le x_j, \quad j = 1, \ldots, M.    (61)

We have

x_j = \frac{\mu_j\left(b_j^2 + a_j\mu_j - a_j b_j\right) + \mu_j\Delta_j\left(a_j\mu_j - b_j^2\right) + b_j\mu_j^2\Delta_j^2}{\mu_j^2 b_j\Delta_j\left(a_j + b_j\Delta_j\right)}.    (62)

So, if x_j is given, then Δ_j can be obtained as the solution of a quadratic equation. Equation (61) is the constraint discussed in Ref. [18].
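Cross-multiplying (62) gives that quadratic explicitly, so a target x_j can be turned into a margin numerically. A minimal sketch for a single node (illustrative names):

```python
import math

def delta_for_target(x, a, b, mu):
    """Solve equation (62) for Delta_j given a target x_j on the local response time."""
    # alpha*Delta^2 + beta*Delta + gamma = 0, obtained by cross-multiplying (62)
    alpha = mu**2 * b * (x*b - 1.0)
    beta  = mu * (x*mu*a*b - a*mu + b**2)
    gamma = -mu * (b**2 + a*mu - a*b)
    disc  = beta**2 - 4.0*alpha*gamma
    roots = [(-beta + s*math.sqrt(disc))/(2.0*alpha) for s in (1.0, -1.0)]
    return max(roots)    # gamma < 0 and alpha > 0, so exactly one root is positive
```

As a round-trip check, evaluating x_1 from (62) with the node-1 parameters of Example 1 and Δ_1 = 0.01, and feeding the result back into this function, should recover Δ_1 ≈ 0.01.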
5. NUMERICAL EXAMPLES AND DISCUSSIONS
In this section we look at two examples that illustrate how the algorithms given above work. For both examples, we consider that the service times at the nodes follow exponential distributions. The first example, which considers a small 2-node system, provides insight into the selection of Δ_j discussed in Section 4. In the second example, a larger system consisting of 6 nodes, which provides a more realistic problem, is solved and the solution presented.

Example 1. Let us consider a 2-node system with parameters a_1 = 1.0, b_1 = 10.0, μ_1 = 8.0, a_2 = 1.0, b_2 = 20.0, μ_2 = 5.0 and λ = 11.7.

Case 1: High priority for local loads
With high priority given to the local loads, it turns out that the optimal load fractions that minimize the mean response time of the global loads, {p_1*, p_2*}, are {0.6034, 0.3966}. The mean response time for the global loads is 8.3263, and those for the local loads in the two nodes are 5.7151 and 2.4794, respectively.

Case 2: High priority for global loads
For this case, it turns out that the optimal probability distribution that minimizes the mean response time of the global loads, {p_1*, p_2*}, is {0.6208, 0.3792}. But, for the system described in this problem, p*_1b = 0.6154 and p*_2b = 0.4060 (it can be seen that p_1* > p*_1b, thereby leading to instability of the local loads' queue in node 1). Hence, we need to accept a sub-optimal solution for the probabilities so as to maintain the mean response times for the local loads in both nodes within reasonable bounds.
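A quick numerical companion to Example 1 (exponential service times) is sketched below; it only re-evaluates quantities quoted in the text, it is not the optimization itself.

```python
a, b, mu, lam = [1.0, 1.0], [10.0, 20.0], [8.0, 5.0], 11.7

# Stability bounds of equation (40) with Delta_j = 0 (quoted as p_1b*, p_2b* above).
p_star_b = [(mj/lam)*(1.0 - aj/bj) for aj, bj, mj in zip(a, b, mu)]
print(p_star_b)       # approximately [0.6154, 0.4060]

# Case 1: mean response time of the global loads, equation (34), evaluated at
# the allocation {0.6034, 0.3966} reported in the text.
def wj_case1(pj, aj, bj, mj):
    slack = mj*bj - mj*aj - bj*pj*lam
    return ((bj - aj)*slack + aj*mj**2 + pj*lam*bj**2) / (mj*(bj - aj)*slack)

p = [0.6034, 0.3966]
W = sum(pj*wj_case1(pj, aj, bj, mj) for pj, aj, bj, mj in zip(p, a, b, mu))
print(W)              # close to the value 8.3263 quoted for Case 1
```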
Fig. 3. Choice of Δ_1 and Δ_2.
Table 1. Δ_2, ε and p for different values of Δ_1

Δ_1     Range for Δ_2     ε         p
0.01    0-0.0341          0.0508    0.6085
0.02    0-0.0181          0.1578    0.6017
Let W(p) denote the mean response time of the global loads when a load is allocated to node 1 with probability p, and let W* = W(p = p*_1b). Since the unconstrained minimizer p_1* = 0.6208 exceeds p*_1b, W* is the infimum of W(p),

W^{*} = \inf_{p} W(p),

the infimum being taken over the allocations p < p*_1b for which the local queue at node 1 remains stable. When p = p*_1b, i.e., when W(p) = W*, the low priority queue becomes unbounded in node 1. Hence p < p*_1b, and we choose

p = p_{1b} = p^{*}_{1b} - \frac{\mu_1\Delta_1}{\lambda},    (63)

as given in equation (51). The value of Δ_1 is governed by how far from the optimal solution we are prepared to go. Let us define

\epsilon = \frac{W(p = p_{1b}) - W^{*}}{W^{*}}.

Thus,

W(p) = W^{*}(1 + \epsilon).    (64)

But

W(p) = \sum_{j=1}^{2} p_j \, \frac{\mu_j + a_j\left(\mu_j/b_j\right)^2}{\mu_j\left(\mu_j - p_j\lambda\right)},

where p_1 = p and p_2 = 1 - p. For the values chosen,

W(p) = \frac{24.4823\,p^2 - 27.1823\,p + 8.1}{-136.90\,p^2 + 171.99\,p - 53.6}.    (65)

From Equations (64) and (65), on simplification we obtain

\left(24.4823 + 220.3261\,\eta\right)p^2 - \left(27.1823 + 276.82\,\eta\right)p + \left(8.1 + 86.2698\,\eta\right) = 0,    (66)

where η = 1 + ε. Let p* be the feasible solution of equation (66) (i.e., the solution of equation (66) that satisfies 0 ≤ p* < p*_1b). We have

p^{*} = \frac{\mu_1}{\lambda}\left(1 - \frac{a_1}{b_1}\right) - \frac{\mu_1}{\lambda}\,\Delta_1,

from which

\Delta_1 = 0.9 - \frac{11.7}{8}\,p^{*}.    (67)
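The Δ_1 selection described in the next paragraph, i.e., solving (66) for the feasible root and substituting into (67), can be sketched as follows (the coefficients are the Example-1 values quoted above; everything else is illustrative):

```python
import math

def delta1_for_epsilon(eps, p1b_star=0.6154):
    """Solve equation (66) for the feasible root p* < p_1b*, then apply (67)."""
    eta = 1.0 + eps
    A = 24.4823 + 220.3261*eta
    B = -(27.1823 + 276.82*eta)
    C = 8.1 + 86.2698*eta
    disc = B*B - 4.0*A*C
    roots = [(-B + s*math.sqrt(disc))/(2.0*A) for s in (1.0, -1.0)]
    p_star = min(r for r in roots if 0.0 <= r < p1b_star)   # feasible root of (66)
    return 0.9 - (11.7/8.0)*p_star                          # equation (67)

print(delta1_for_epsilon(0.0508))   # roughly 0.01, consistent with Table 1
```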
From Equations (66) and (67), Δ_1 can be determined, knowing the value of ε and hence η. For different values of ε we get different values of Δ_1 and a range of values for Δ_2, as shown in Fig. 3. Once the values of Δ_1 and Δ_2 are fixed, the corresponding choice of the probability, and hence the mean response times for the global and local loads, are known. In Table 1 we give ε, the range of Δ_2, and the sub-optimal value of p for two specific values of Δ_1 chosen for illustration. In this example, we considered a small 2-node system, and so the effect of the arrival rate on the optimal scheduling probabilities in terms of the drop-out rule could not be demonstrated. In the next example, we consider a larger system.

Example 2. In this example we consider a 6-node system which illustrates the drop-out condition and the effect of the bounds (when high priority is given to the global loads). To do this, we vary the arrival rate and study the probability variations graphically. The values of the other parameters considered for the system are given in Table 2.

Table 2. System parameters for Example 2

Node number    a_j    b_j    μ_j    Δ_j chosen
1              1      5      0.5    0
2              3      5      2      0.005
3              2      4      3      0
4              3      4.5    4      0.01
5              4      5      4.5    0.006
6              2      4      5      0

Fig. 4. Example 2: Probabilities with high priority for local loads.

Case 1: High priority for local loads
The scheduling probabilities for the global loads when high priority is given to the local loads are presented graphically in Fig. 4. As can be seen from the figure, not all nodes are used in processing the load when the arrival rate is small. Node 5, for example, is dropped out for arrival rates less than about 2.5. Another interesting point is that, although node 3 has a lower service rate than node 5 in processing global loads, node 3 is still allocated some global load at very low values of λ, whereas node 5 is dropped out. This happens due to the presence of the local loads.

Fig. 5. Example 2: Probabilities with high priority for global loads.

Fig. 6. Example 2: Effect of bounds on the scheduling probabilities.

Case 2: High priority for global loads
The scheduling probabilities for the global loads when high priority is given to the global loads are presented in Fig. 5. Due to the presence of the upper bounds on the scheduling probabilities given by equation (42), the curves are no longer smooth. The curves follow the exponential path that they are expected to take, until they hit their bounds given by equation (42). As a result of the restriction placed by the bound on the respective node, namely node 5, the other curves also get distorted. In Fig. 6 we present the effect of the bounds on the probability curves. In the figure, the curves corresponding to the scheduling probabilities, without the bounds given by equation (42), are labelled with the prefix P followed by the corresponding node number. Similarly, the curves corresponding to the bounds are labelled with the prefix B followed by the node number. The curve for node 5 in Fig. 5 follows the probability curve in Fig. 6 until the
point where the first curve, namely that corresponding to node 5, meets the curve B5. From that point onwards, the curve lies below the curve corresponding to the bound by μ_5Δ_5/λ. The drop-out condition is observed for this case too. But it should be noted that the node that was dropped out in the low priority case is different from the one that gets dropped out in this case. So the relative priorities given to the loads do have an effect on the drop-out rule.

6. CONCLUSIONS
This paper presents a methodology for the optimal distribution of global customers in a distributed computing system in which local customers arrive at the individual processors. These local customers have to be processed by the individual processors at which they arrive. The presence of local customers introduces a constraint on the load distribution of the global customers. A priority queuing model is presented and a solution methodology is discussed. Some of the interesting results are proved analytically. Other applications of this methodology are in manufacturing systems, distributed data processing via local area networks, multimedia systems, and data communication. The paper focuses mainly on static load balancing algorithms, and no comparison of their performance with dynamic load balancing algorithms is given. While this would be an interesting exercise, it is beyond the scope of this paper.

Acknowledgements: The authors are extremely grateful to J. K. Anand for his help in generating some of the figures during the final preparation of the manuscript.
REFERENCES

1. Casavant, T. L. and Kuhl, J. G., A taxonomy of scheduling in general purpose distributed computing systems. IEEE Trans. Software Engineering, 1988, 14(2), 141-154.
2. Thomasian, A., A performance study of dynamic load balancing in distributed systems. Proceedings of the IEEE Seventh International Conference on Distributed Computing Systems, 1987, 178-184.
3. Wang, Y.-T. and Morris, R. J. T., Load sharing in distributed systems. IEEE Trans. Comput., 1985, 34(3), 204-217.
4. Lawler, E. L., Lenstra, J. K., Rinnooy Kan, A. H. G. and Shmoys, D. B., Sequencing and scheduling: Algorithms and complexity. In Handbook of Operations Research and Management, Vol. 4, eds S. C. Graves, A. H. G. Rinnooy Kan and P. Zipkin. North-Holland, 1990.
5. Tzafestas, S. and Triantafyllakis, A., Deterministic scheduling in computing and manufacturing systems: A survey of models and algorithms. Mathematics and Computers in Simulation, 1993, 35, 397-434.
6. Ni, L. M. and Hwang, K., Optimal load balancing in a multiple processor system with many job classes. IEEE Trans. Software Engineering, 1985, 11(10), 1141-1152.
7. Tantawi, A. N. and Towsley, D., Optimal static load balancing in distributed computer systems. Journal of the Association for Computing Machinery, 1985, 32(2), 445-465.
8. Bonomi, F. and Kumar, A., Adaptive optimal load balancing in a heterogeneous multiserver system with a central job scheduler. Proceedings of the IEEE Eighth International Conference on Distributed Computing Systems, 1988, 500-508.
9. Eager, D. L., Lazowska, E. D. and Zahorjan, J., Adaptive load sharing in homogeneous distributed systems. IEEE Trans. Software Engineering, 1986, 12(5), 662-675.
10. Hsu, C.-Y. H. and Liu, J. W.-S., Dynamic load balancing algorithms in homogeneous distributed systems. Proceedings of the IEEE Sixth International Conference on Distributed Computing Systems, 1986, 216-223.
11. Ramamritham, K. and Stankovic, J. A., Dynamic task scheduling in distributed hard real-time systems. Proceedings of the IEEE Fourth International Conference on Distributed Computing Systems, 1984, 96-107.
12. Shin, K. G. and Hou, C.-J., Design and evaluation of effective load sharing in distributed real-time systems. IEEE Trans. Parallel and Distributed Systems, 1994, 5(7), 704-719.
13. Stankovic, J. A., An application of Bayesian decision theory to decentralized control of job scheduling. IEEE Trans. Comput., 1985, 34(2), 117-130.
14. Lee, H., Time and cost tradeoff for distributed data processing. Comput. Ind. Engng, 1989, 16, 553-558.
15. Lee, H., Modelling and optimization of data assignment in a distributed information system. Int. J. Systems Sci., 1993, 24, 173-181.
16. Lee, H., Assignment of a job load in a distributed system: A multicriteria design method. Eur. J. Ops Res., 1995, 87, 274-283.
17. Lee, H., Optimal static distribution of prioritized customers to heterogeneous parallel servers. Comput. Ops Res., 1995, 22, 995-1003.
18. Ross, K. W. and Yao, D. D., Optimal load balancing and scheduling in a distributed computing system. Journal of the Association for Computing Machinery, 1991, 38(3), 676-690.
19. Gross, D. and Harris, C. M., Fundamentals of Queueing Theory, 2nd edn. John Wiley and Sons, Inc., U.S.A., 1985.
20. Luenberger, D. G., Linear and Non-linear Programming, 2nd edn. Addison-Wesley, 1984.

AUTHORS' BIOGRAPHIES

L. Anand obtained his Ph.D. degree in Engineering from the Indian Institute of Science, Bangalore, India in 1997. He is
currently working at Motorola India Electronics Ltd., Bangalore. His areas of research interest are parallel and distributed computing, communication issues in distributed systems, and networking.

D. Ghose is an Associate Professor in the Department of Aerospace Engineering, Indian Institute of Science, Bangalore, India. His fields of research interest are differential games, parallel and distributed computing, multimedia systems, and guidance of flight vehicles. He has published papers in many IEEE Transactions, the Journal of Parallel and Distributed Computing, and the Journal of Optimization Theory and Applications. He is one of the authors of the book "Scheduling Divisible Loads in Parallel and Distributed Systems", published by the IEEE Computer Society Press.

V. Mani is an Associate Professor in the Department of Aerospace Engineering, Indian Institute of Science, Bangalore, India. His fields of research interest are mathematical modelling, parallel and distributed computing, neural networks and evolutionary computing. His papers have appeared in IEEE Trans. on Aerospace and Electronic Systems, IEEE Trans. on Parallel and Distributed Systems, IEEE Trans. on Reliability and the Journal of Parallel and Distributed Computing. He is one of the authors of the book "Scheduling Divisible Loads in Parallel and Distributed Systems", published by the IEEE Computer Society Press.