

A Cooperative Scheduling Method based on the Device Load Feedback for Multiple Tasks Scheduling

Yu Xin a,∗, Ya-Di Wang a, Zhi-Qiang Xie a, Jing Yang b

a College of Computer Science and Technology, Harbin University of Science and Technology, Heilongjiang, 150001, China
b College of Computer Science and Technology, Harbin Engineering University, Heilongjiang, 150001, China

∗ Corresponding author. Email addresses: [email protected] (Yu Xin), [email protected] (Ya-Di Wang), [email protected] (Zhi-Qiang Xie), [email protected] (Jing Yang)

Abstract

With the development of cloud computing, the traditional star scheduling system with a solo scheduler cannot meet the requirements of distributed systems. We therefore designed a scheduling method that can be applied in multi-scheduler systems, denoted MTDR (Multi-Task Dynamic-Rank scheduling method). To address the issue of device conflicts in a multi-scheduler system, we adopt the following scheduling principle: the devices feed their load state back to the schedulers to control the subsequent scheduling process. For the device load modeling, we use a time-window method to predict the device load state, denoted LSF (Load State Feedback model). When the schedulers deal with the task slices, the device load state is taken into account. We performed experiments on arrival time, device dependence, task structure, CCR (communication computation ratio) and device sets. The experimental comparison verifies the effectiveness and rationality of the proposed method.

Keywords: load balancing, task scheduling, multi-scheduler, dynamic rank

1. Introduction

With the development of communication technology, the traditional single-host system has been evolving into distributed systems, and many distributed systems, such as MapReduce [1, 2] and IaaS [3], are emerging. All of these distributed systems rely on service devices to respond to individual service requests. In order to improve the efficiency of resource allocation and data processing, the scheduling issue has recently become a focus of distributed system research. These studies mainly comprise task allocation scheduling and data transmission scheduling.

At the task allocation scheduling framework level, Refs. [4, 5] suggested six compatible strategies oriented to the elastic computing cloud and proposed a two-level scheduling model comprising a cloud schedule and sub-schedules, which gives both a macroscopic and a microscopic view of the scheduling. At the data transmission scheduling level, the aim is to design scheduling policies that minimize the telecommunication cost according to the distribution features. In this direction, Nan [6] proposed a cost model and a queuing model, and designed a QoS-oriented distributed resource scheduling algorithm to minimize the distributed data transmission cost. Kllapi [7] suggested a splitting, computing and merging method for data stream processing and designed a corresponding data stream transmission scheduling method. Lin [8] considered the processing and transmission capacity of each device and allocated devices with similar capacities into the same cluster, by which the scheduling problem in a distributed system can be converted into a DAG (Directed Acyclic Graph) scheduling problem.

Fig. 1 shows the structures of the single-scheduler system and the multi-scheduler system. The single-scheduler system is used in the traditional solo-scheduler environment, while the multi-scheduler system can be used in the distributed environment. In Fig. 1(a), the single scheduler controls the entire scheduling process and allocates the task slices to the separate devices; therefore, the whole task scheduling load is concentrated on the single scheduler, and if the number of tasks is very large, the single scheduler is at risk of overload. Fig. 1(b) shows the topological structure of the multi-scheduler system, where the schedulers cooperatively receive and allocate the tasks, so the task scheduling load is balanced. By the comparison in Fig. 1, the multi-scheduler system is the future trend in distributed environments.

[Figure 1: The topological structure of the single-scheduler and multi-scheduler scheduling systems. (a) The single-scheduler system; (b) the multi-scheduler system.]

2. Related Work

The existing DAG scheduling methods can be summarized into the following five categories:

1) Weight Priority Methods. These methods, such as HEFT [9] and DCP (Dynamic Critical-Path) [10], determine the scheduling sequence of the task slices by a rank or weight that can be obtained directly from the task structure, and then apply an EFT (Earliest Finish Time) or EST (Earliest Start Time) policy to obtain the scheduling result. Their advantage is the low computational complexity of O(n^2).

2) Level Priority Methods. These methods, such as HDS (Hierarchical DAG Scheduling) [11], PSDS (Parallel Sparse Direct Solver) [12] and CPDS (Critical Path based dynamical DAG Scheduling) [13], treat the level of a task slice in the DAG as its priority, which guarantees a parallel scheduling result; by maximizing the parallelization of the task slices, the objective of minimizing the total execution time is realized indirectly. Their advantage is a high resource occupancy ratio, which is beneficial for load balancing, but they perform poorly when the task structure is complex and the load on each level is imbalanced.

3) Roll Back Methods. These methods, such as HEFT-lookahead [14], use a roll-back policy to reschedule previous results and adjust the global scheduling result. They obtain better results but have a higher computational complexity of O(n^3). Moreover, they are static scheduling methods and cannot be used for real-time or dynamic scheduling.

4) Path Clustering Methods. These methods, such as PCH (Path Clustering Heuristic) [15], HHDS (Hybrid Heuristic DAG Scheduling) [16], BCHCS (Budget Constrained Hybrid Cloud Scheduler) [17] and DCP [10], give the task slices on the critical path a higher priority in order to reduce the total cost. They perform well when the task has many levels and the number of task slices is small, but perform worse when the number of task slices on each level is large.

5) Heuristic Methods. These methods, such as genetic methods [18] and particle swarm optimization methods [19, 20], optimize the global scheduling result according to a cost function: they iteratively adjust the scheduling result and adaptively preserve locally optimal results to approach the global optimum. Their advantage is good performance and the ability to incorporate various optimization policies; the disadvantage is that they easily fall into local optima and yield unstable results.

In the distributed environment, the schedulers send their scheduling results to the devices in parallel, as shown in Fig. 1(b), which causes conflicts on the devices [21]. The scheduling policies of the traditional methods, such as HEFT and DCP, cannot adjust the scheduling result according to the device load and device conflicts when dealing with multiple tasks; therefore, they do not fit the multi-scheduler system. To address these problems, we treat the device load as a measurement of the device conflict, and we design a feedback mechanism that uses it to adjust the subsequent scheduling result in each scheduler. Based on this feedback mechanism, cooperative scheduling in the distributed system can be realized.

3. Problem Description and Notations

The distributed tasks are composed of task slices, which are the units of task scheduling. The scheduling tasks are heterogeneous computing tasks: task slices in the same level do not belong to the same task category, so task slices in the same level do not have the same execution times. The task slices in a DAG task have the following constraints:

(1) The DAG task is composed of task slices, which are the scheduling units. Each DAG task has a single start slice (entrance slice) and a single end slice (exit slice).

(2) The device that can execute a given task slice is not unique, and the execution times of the task slices on each device are known in advance.

(3) At any moment, one device can execute only one task slice.

(4) If v_j is a successor task slice of v_i, then v_j needs to receive all the data transmitted from v_i; this is the precedence constraint. If v_i and v_j are executed on the same device, the communication cost is 0.

(5) A task slice has to receive all the data from its predecessor slices before its execution.

The DAG scheduling can be described by the following symbols:

(1) The DAG is denoted by G = (V, E), where G represents the task and V = {v_1, v_2, ..., v_n} represents the task slices in G, with |V| = n the number of task slices.

(2) D = {d_1, d_2, ..., d_m} represents the device set, where |D| is the number of devices.

(3) c_{i,j} is the communication cost from device d_i to d_j.

HEFT uses the rank of the task slices as the sorting criterion, and the task slice with the maximal rank has the highest priority to be scheduled. The rank is defined as follows:

rank_i = W_i + \max_{v_j \in succ(v_i)} ( c_{i,j} + rank_j )     (1)

where succ(v_i) is the set of successors of v_i and W_i is the average execution time of v_i over the devices; when v_i is the end slice, rank_i = W_i.

Fig. 2 shows the structure of a sample DAG task, where the nodes represent the task slices and v_1 and v_8 are the start and end slices. The directed edges represent the precedence relationships between the task slices, and the weights of the edges represent the communication costs. The execution times of all the task slices are listed in Table 1, and the ranks calculated by Eq. (1) are listed in Table 2. By the rank list in Table 2, the priority sequence of the task slices is: v_1, v_2, v_4, v_3, v_5, v_6, v_7, v_8. The scheduling result of the task in Fig. 2 obtained by HEFT is shown in Fig. 3(a), and the total execution time is 111.

[Figure 2: The structure of the DAG.]

Table 1: The execution times of the task slices in Fig. 2 on each device

device   v1   v2   v3   v4   v5   v6   v7   v8
d1       17   22   15   20   13   49   17   13
d2       19   27   25    9   27   49   16   10
d3       21   17    9   22   18   46   15    9

Table 2: The rank and rank′ values of the task slices in Fig. 2 (rank′ is obtained when the delay rate 1.8 of d3 is taken into account)

         v1       v2       v3      v4       v5      v6      v7      v8
rank     130.67   99.67    80      97.67    79      64.67   52.67   10.67
rank′    156.6    118.87   88.8    120      90.2    79.33   59.07   13.07
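As an illustration of Eq. (1), the following Python sketch computes the upward rank recursively for a small DAG. The execution-time table follows Table 1, but the successor lists and edge (communication) costs are illustrative placeholders, since the exact edge weights of Fig. 2 are not reproduced here; the printed values therefore will not match Table 2 exactly.

```python
from functools import lru_cache

# Execution times of each slice on d1, d2, d3 (the rows of Table 1).
exec_times = {
    "v1": [17, 19, 21], "v2": [22, 27, 17], "v3": [15, 25, 9], "v4": [20, 9, 22],
    "v5": [13, 27, 18], "v6": [49, 49, 46], "v7": [17, 16, 15], "v8": [13, 10, 9],
}
# Successor lists with edge (communication) costs. These edges are illustrative only;
# the exact weights of Fig. 2 are not reproduced here.
succ = {
    "v1": {"v2": 18, "v3": 12, "v4": 9},
    "v2": {"v5": 11, "v6": 14},
    "v3": {"v6": 19},
    "v4": {"v7": 23},
    "v5": {"v7": 13},
    "v6": {"v8": 8},
    "v7": {"v8": 16},
    "v8": {},
}

@lru_cache(maxsize=None)
def rank(v):
    """Eq. (1): rank_i = W_i + max over successors of (c_ij + rank_j), W_i = mean execution time."""
    W = sum(exec_times[v]) / len(exec_times[v])
    return W + max((c + rank(u) for u, c in succ[v].items()), default=0.0)

ranks = {v: round(rank(v), 2) for v in exec_times}
priority = sorted(ranks, key=ranks.get, reverse=True)   # HEFT scheduling order
print(ranks)
print(priority)
```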

The traditional rank-based DAG methods, such as HEFT, PCH, Lookahead and HHDS, are single-scheduler methods with the topological structure shown in Fig. 1(a). The advantage of the single scheduler is that it can centrally manage the multiple tasks and allocate the devices, so the beginning and ending time of each task slice can be determined without device conflicts. The disadvantage is the star topological structure: the centralized scheduler undertakes the whole task scheduling load and is therefore at risk of overload. For the multi-scheduler system shown in Fig. 1(b), however, the parallel task scheduling causes device conflicts among the task slices: if a task slice is scheduled on an occupied device, the task slice is suspended because of the device conflict, which decreases the execution efficiency.

[Figure 3: The HEFT scheduling results under the three conditions. (a) The original HEFT scheduling result; (b) the HEFT scheduling result when d3 is delayed; (c) the HEFT scheduling result when the d3 delay is taken into account.]

Consider the DAG task in Fig. 2 and suppose d3 is the critical device with a delay rate of 1.8, meaning that the task slices executed on d3 are delayed by a factor of 1.8; the execution times of the task slices on d3 then become v1 (37.8), v2 (30.6), v3 (16.2), v4 (39.6), v5 (32.4), v6 (82.8), v7 (27.0), v8 (16.2). In other words, the task slices allocated on d3 have to prolong their execution times because of the conflict on d3. Fig. 3(b) shows the scheduling result in the case of the d3 delay, where the finish time of v7 is 91, which delays the beginning time of v8; consequently, the total execution time is extended to 123. The reason is that the rank in Eq. (1) does not consider the device conflict under multiple-task scheduling. If, however, the delay rate of d3 is added into the scheduling, namely d3 ← d3 × 1.8, a new rank′ can be obtained; Table 2 lists this rank′. The priority sequence of the task slices under rank′ is v1, v4, v2, v5, v3, v6, v7, v8, where the task slices v4, v2, v5, v3 have changed their order. The scheduling result is shown in Fig. 3(c), where the total execution time is 115, which is better than that of Fig. 3(b).
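The delay-rate adjustment described above amounts to scaling the d3 row of Table 1 by 1.8 before the ranks are recomputed; the small sketch below reproduces the delayed execution times quoted in the text.

```python
# The d3 row of Table 1, scaled by the delay rate 1.8 (d3 <- d3 x 1.8).
d3_times = {"v1": 21, "v2": 17, "v3": 9, "v4": 22, "v5": 18, "v6": 46, "v7": 15, "v8": 9}
delayed = {v: round(t * 1.8, 1) for v, t in d3_times.items()}
print(delayed)  # v1: 37.8, v2: 30.6, v3: 16.2, v4: 39.6, v5: 32.4, v6: 82.8, v7: 27.0, v8: 16.2
```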

Essentially, the objective of the rank is to predict the execution time of each task slice as the task begins. However, device conflicts exist among the tasks, and the number of competing task slices affects the overall efficiency: a delayed task slice influences its successor slices and, iteratively, extends the total execution time. To resolve this issue, we designed the MTDR (Multi-Task Dynamic-Rank) scheduling method, whose contributions are the following:

(1) We establish a multiple-task parallel scheduling method for the multi-scheduler system, by which each task can be scheduled separately.

(2) The feedback mechanism controls the subsequent task allocation to reduce device conflicts.

(3) We provide a dynamic load measurement, denoted LSF, to evaluate the conflict state of the devices.

4. Multi-Scheduler Scheduling Method

4.1. LSF (Load State Feedback) model

The rank represents the ideal priority of a task slice. According to Eq. (1), the rank_i of v_i mainly depends on the devices, and Eq. (1) can be rewritten as:

rank_i = \sum_{j=1}^{|d|} \frac{r(d_j, v_i)}{|d|} + \max_{v_j \in succ(v_i)} ( c_{i,j} + rank_j )     (2)

where |d| is the number of devices in the distributed environment and r(d_j, v_i) is the time cost of v_i executed on d_j. When device d_j has a higher occupancy rate, the conflict among the task slices on d_j is more intensive, so the execution of task slice v_i on d_j will be delayed, implying that its actual execution time on d_j will be greater than r(d_j, v_i). Therefore, the task slices that will be executed on d_j need to be given a higher priority, to reflect the delay of r(d_j, v_i). The ranks of all the successor task slices are therefore adjusted by r(d_j, v_i) ← (1 + B_j) r(d_j, v_i), where B_j is the Busyness of d_j and B_j ≥ 0. The Busyness represents the conflict state of d_j, and Eq. (2) can then be expressed in the following form:

rank_i = \sum_{j=1}^{|d|} (1 + B_j) \frac{r(d_j, v_i)}{|d|} + \max_{v_j \in succ(v_i)} ( c_{i,j} + rank_j )     (3)

When d_j is idle, B_j = 0 and the rank_i in Eq. (3) is equal to that in Eq. (1). If the task slices executed on d_j are numerous and have long execution times, d_j contributes more to the rank. Thus, if a task slice v_i has a long execution time on d_j, it receives a large increment in its rank through r(d_j, v_i); a larger priority means that v_i is scheduled earlier, and v_i then tends to be scheduled on other devices, so the load of d_j is reduced and the conflict on d_j is relaxed. According to Eq. (3), B_j controls the value of the rank; therefore, each device can feed its Busyness B back to the scheduler to control the subsequent scheduling result.

For the Busyness modeling, we use a time window to calculate the Busyness. Fig. 4 shows 3 time windows {(T_0, T_0 + T), (T_0', T_0' + T), (T_0'', T_0'' + T)} on a device d, where the length of each window is T. The Busyness of d is obtained at T_0 + T, T_0' + T and T_0'' + T, according to the occupancy state in the windows (T_0, T_0 + T), (T_0', T_0' + T) and (T_0'', T_0'' + T). In Fig. 4, at T_0 + T there are 3 occupancy segments τ_{1-3} in (T_0, T_0 + T), with beginning times t_{1-3}; at T_0' + T there are 5 occupancy segments τ'_{1-5} in (T_0', T_0' + T), where τ'_4 and τ'_5 are the new segments in (T_0 + T, T_0' + T); at T_0'' + T there are 5 occupancy segments τ''_{1-5} in (T_0'', T_0'' + T), where τ''_5 is the new segment in (T_0' + T, T_0'' + T).

[Figure 4: The illustration of the time windows on device d.]

We utilize the frequency and the beginning times of the occupation segments as the input of the Busyness calculation, and design the following equation:

B = \frac{ \sum_{i=1}^{f} w_i \tau_i }{ T }     (4)

where f is the number of task slices in the window (i.e., the frequency), T is the length of the window, τ = {τ_1, τ_2, ..., τ_f} are the execution times of the task slices in the window, and w_i is the weight of τ_i. The principle of Eq. (4) is the following: 1) w_i is influenced by t_i; assuming T^* is the ending time of the window, a smaller T^* − t_i means that t_i is closer to the end of the window, so the impact of τ_i on B is greater and w_i is greater. 2) The more task slices there are in the window, the busier the device, and the larger B becomes.

For the modeling of w and t, let t^* be the normalized t, defined by t^* = [T − (T^* − t)]/T with 0 ≤ t^* ≤ 1, and impose the following conditions to construct the function w = function(t^*):

(1) The closer the beginning time t of a time slice is to the ending time of the window, the greater its w. Thus function(t^*) is an increasing function on the interval [0, 1]; for t^* = 1, w takes its maximal value w = 1, so w lies in the interval 0 ≤ w ≤ 1.

(2) If the weight of a time slice is w at the normalized beginning time t^*, the distance between t^* and T^* is 1 − w. Therefore the increment of w at t^*, namely w(t^* + Δt^*) − w(t^*), is proportional to w and inversely proportional to 1 − w.

According to the two conditions above, the differential equation of w and t^* can be constructed as follows:

w(t^* + \Delta t^*) - w(t^*) = k \frac{w}{1-w} \Delta t^*     (5)

where k is a regulation parameter. Solving the differential equation (separating the variables and integrating gives \ln w − w = k t^* + c) yields:

w e^{-w} = k_0 e^{k t^*}     (6)

For t^* = 1 and w = 1, k_0 = e^{-1-k}, and the expression of w = function(t^*) can be obtained as:

w = -\mathrm{lambertw}(0, -e^{k(t^*-1)-1})     (7)

where lambertw(0, ·) denotes branch 0 of the Lambert W function. In Fig. 4, T is 35; for k = 1.5, the Busyness of d at T_0 + T, T_0' + T and T_0'' + T is 0.13, 0.15 and 0.16 respectively, according to Eqs. (4) and (7). Integrating Eqs. (1), (3), (4) and (7), the rank_i of v_i can be obtained as follows:

rank_i = \sum_{j=1}^{|d|} \left( 1 + \frac{1}{T} \sum_{l=1}^{f(j)} -\mathrm{lambertw}\left(0, -e^{k \frac{t_l^{(j)} - T^*}{T} - 1}\right) \tau_l^{(j)} \right) \frac{r(d_j, v_i)}{|d|} + \max_{v_j \in succ(v_i)} ( c_{i,j} + rank_j )     (8)

where f(j) is the number of occupancy segments in the window of device d_j, and t_l^{(j)} and τ_l^{(j)} are the beginning time and the execution time of the l-th segment on d_j.
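A minimal sketch of the weight and Busyness computation in Eqs. (4) and (7), using the Lambert W implementation from SciPy; the occupancy segments and the Busyness values of the other devices below are illustrative, not those of Fig. 4.

```python
import numpy as np
from scipy.special import lambertw

def weight(t_start, T_end, T, k=1.5):
    """Eq. (7): w = -W_0(-exp(k*(t*-1)-1)), with t* = [T - (T_end - t_start)] / T."""
    t_star = (T - (T_end - t_start)) / T          # normalized beginning time, 0 <= t* <= 1
    return float(-lambertw(-np.exp(k * (t_star - 1.0) - 1.0), 0).real)

def busyness(segments, T_end, T, k=1.5):
    """Eq. (4): B = (1/T) * sum_i w_i * tau_i over the occupancy segments in (T_end - T, T_end).
    Each segment is a pair (beginning time t_i, execution time tau_i)."""
    return sum(weight(t, T_end, T, k) * tau for t, tau in segments) / T

# Illustrative occupancy segments of one device inside the window (T_end - T, T_end).
segments = [(5.0, 4.0), (11.0, 3.0), (16.0, 5.0)]
B = busyness(segments, T_end=20.0, T=20.0)

# Eq. (8) then inflates the device term of the rank by (1 + B_j):
# rank_i = sum_j (1 + B_j) * r(d_j, v_i) / |d| + max over successors of (c_ij + rank_j).
r_i = [17.0, 19.0, 21.0]                          # r(d_j, v_i) of v_i on each device (illustrative)
B_all = [B, 0.0, 0.11]                            # Busyness fed back by each device (illustrative)
device_term = sum((1 + b) * r for b, r in zip(B_all, r_i)) / len(r_i)
print(round(B, 3), round(device_term, 2))
```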

4.2. MTDR (Multi-Task Dynamic-Rank scheduling method)

We designed MTDR according to the multi-scheduler topological structure in Fig. 1(b). The Busyness accounts for the conflicts caused by the simultaneous scheduling of the multiple schedulers, by minimizing the conflict loss. Thus, if tasks arrive at different schedulers at the same time, the schedulers can assign the tasks to the devices simultaneously, and the devices process these tasks in arrival-time order. Each scheduler follows the policy below to schedule a task G at T^*:

(1) The Busyness of each device is calculated over the window (T^* − T, T^*), where T is the length of the window.

(2) Calculate the ranks of all the task slices in G according to Eq. (8), then sort the task slices by rank; the sorted sequence is Q.

(3) Let the first element of Q be v_i; select the device d_j on which v_i can finish earliest as the execution device, then remove the first element from Q.

(4) Repeat step (3) until Q is empty.

A sketch of this per-scheduler loop is given below.
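The outline below is a sketch of the control flow only, not the authors' implementation. The helpers busyness_of, dynamic_rank and earliest_finish_time are assumed to exist and stand for Eqs. (4)/(7), Eq. (8) and the usual earliest-finish-time insertion, respectively.

```python
def schedule_task(task, devices, T, k, now):
    """One scheduler handling a task G that arrives at time 'now' (T* in the text)."""
    # (1) Each device feeds back its Busyness over the window (now - T, now), Eqs. (4)/(7).
    B = {d: busyness_of(d, window=(now - T, now), k=k) for d in devices}

    # (2) Dynamic ranks of all task slices under the fed-back Busyness, Eq. (8),
    #     sorted into the queue Q by decreasing rank.
    ranks = {v: dynamic_rank(v, task, devices, B) for v in task.slices}
    queue = sorted(task.slices, key=ranks.get, reverse=True)

    # (3)-(4) Pop the highest-rank slice and place it on the device where it
    #         finishes earliest, until the queue is empty.
    placement = {}
    while queue:
        v = queue.pop(0)
        placement[v] = min(devices, key=lambda d: earliest_finish_time(v, d, placement))
    return placement
```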

Fig. 5 shows the DAG structures of tasks A, B and C, with the device set D = {d1, d2, d3}. The execution time of each task slice on the devices is listed in Table 3, and the arrival times of the 3 tasks are 20, 40 and 60. Fig. 6 shows the scheduling process of the 3 tasks on the 3 schedulers. In Fig. 6(a)(b), when task A arrives at Scheduler 1, devices d1, d2 and d3 feed their Busyness B1, B2 and B3 back to Scheduler 1, and Scheduler 1 then allocates the task slices to d1, d2 and d3 according to the Busyness. Correspondingly, Fig. 6(c)(d) and Fig. 6(e)(f) show the scheduling processes of B and C on Scheduler 2 and Scheduler 3.

Fig. 7 shows the scheduling result of the 3 tasks with T = 20 and k = 1.5. Fig. 7(a) shows the occupancy state as task A arrives at T^* = 20. At T^* = 20, the Busyness of the 3 devices is B_{1-3} = {0.09, 0.00, 0.11}; according to Eq. (8), the ranks of A_{1-6} are 34.48, 23.21, 24.93, 13.43, 16.28 and 5.05, and Fig. 7(b) shows the scheduling result of task A. As task B arrives (T^* = 40), the Busyness of the 3 devices is B_{1-3} = {0.15, 0.11, 0.10}, the ranks of B_{1-6} are 34.89, 21.57, 27.44, 14.81, 14.66 and 3.02, and Fig. 7(c) shows the scheduling result of task B. As task C arrives (T^* = 60), the Busyness of the 3 devices is B_{1-3} = {0.12, 0.13, 0.04}, the ranks of C_{1-6} are 39.21, 25.71, 26.23, 13.99, 13.21 and 5.97, and Fig. 7(d) shows the scheduling result of task C.

[Figure 5: The structures of tasks A, B and C.]

Table 3: The execution times of the task slices of the 3 tasks

      A1  A2  A3  A4  A5  A6   B1  B2  B3  B4  B5  B6   C1  C2  C3  C4  C5  C6
d1     5   4   1   8   3   9    2   6   8   5   3   3    9   5   4   4   1   9
d2     9   3   7   3   8   2    2   5   6   4   2   4    9   4   2   3   2   6
d3     2   9   8   4   4   3    8   2   4   1   2   1    5   9   8   4   9   1

5. The Analysis of the Parameters of MTDR

According to Eq. (7), the weight w of a segment τ during (T^* − T, T^*) is influenced by the parameter k. Fig. 8 shows the functional relationship of k against w (w plotted against t^* for several values of k). It can be seen from Fig. 8 that if the beginning time t of τ is close to T^*, namely t^* approaches 1, the weight w is greater. When k is large, w is close to 0 at t^* = 0; in this case the ratio of w_i to w_j becomes small, where w_i is the weight of a segment τ_i far from the end of the window and w_j is the weight of a segment τ_j near the end of the window. This implies that, for large k, the influence of τ_i on the Busyness is much weaker than that of τ_j. k and T are the key parameters of MTDR, and their optimal values vary for different experimental data. To analyze the optimal ranges of the parameters, we designed the following two experimental analyses.

[Figure 6: The scheduling process of tasks A, B and C. (a) At T^* = 20 the devices feed back their Busyness to Scheduler 1; (b) at T^* = 20 Scheduler 1 allocates the task slices of A; (c) at T^* = 40 the devices feed back their Busyness to Scheduler 2; (d) at T^* = 40 Scheduler 2 allocates the task slices of B; (e) at T^* = 60 the devices feed back their Busyness to Scheduler 3; (f) at T^* = 60 Scheduler 3 allocates the task slices of C.]

[Figure 7: The Gantt charts of the 3 tasks. (a) The device occupancy state as task A arrives at T^* = 20; (b) the scheduling result of task A at T^* = 20; (c) the scheduling result of task B at T^* = 40; (d) the scheduling result of task C at T^* = 60.]

[Figure 8: The functional relationship of k against w (w plotted against t^* for several values of k).]

5.1. The analysis of k

We use an experimental study to test the effective range of k. The designed experimental scheme is as follows:

Experimental Data. We use the Workflow Generator [22] to generate 3 groups of tasks, where each group comprises 100 tasks. The 3 groups are generated with the following parameters: Group1 (|V| = 100, average size of the task slices 10 M), Group2 (|V| = 500, average size 10 M), Group3 (|V| = 1000, average size 10 M). Devices: |D| = 9, processing speed 6 Mbps, data transmission speed 8 Mbps.

Contrast Methods. We select HEFT [9], PCH [15] and HHDS [16] as the contrast methods. These methods all use the rank as the scheduling criterion, and they are all single-scheduler methods.

Experimental Process. To simulate multiple tasks, a different arrival time is set randomly for each task in a group, with the arrival interval being one time unit; if the time unit is small, device conflicts will exist between successive tasks. For the 3 groups, the time units are {10 s, 20 s, 30 s}; for instance, the 100 tasks in Group1 arrive at 1 s, 11 s, 21 s, and so on. We run MTDR with the parameters k = {0.5, 1, 1.5, 2, ..., 5} and T = 16 s to schedule the 3 groups of tasks, scheduling 50 times for each k.

Fig. 9 shows the distribution of the total execution time of MTDR against k, together with the scheduling results of HEFT, PCH and HHDS, where the boxes represent the distributions. It can be seen from Fig. 9 that for 0.5 < k < 3 the distribution of MTDR performs best, while for k > 3 the scheduling results of HEFT, PCH and HHDS are better than MTDR. Therefore, the optimal range of k in this experiment is 0.5 < k < 3. We have carried out many tests with this experimental method on various data to analyze the optimal value of k; the conclusion is that the optimal value of k concentrates in the range 0.5 < k < 3 with a probability of 0.95, so 0.5 < k < 3 can be regarded as the empirical range of k.

[Figure 9: The distribution of the total execution time against k.]

5.2. The analysis of T

Experimental Process. 1) Calculate the average execution time of all the task slices of each task in Group1-3, denoted T_0, and set T = N × T_0.

2) Carry out MTDR with k = 1.5 on each group 50 times and record the results.

Fig. 10 shows the distribution of the total execution time of MTDR against N, together with the scheduling results of HEFT, PCH and HHDS, where the boxes represent the distributions. It can be seen from Fig. 10 that MTDR performs best for 4 < N < 16, while for N > 16 the scheduling results of HEFT, PCH and HHDS are better than MTDR. Therefore, the effective range of N is 4 < N < 16, namely 4T_0 < T < 16T_0. We have carried out many tests with this experimental method on various data to analyze the optimal value of T; the conclusion is that the optimal value of T concentrates in the range 4T_0 < T < 16T_0 with a probability of 0.87, so 4T_0 < T < 16T_0 can be regarded as the empirical range of T.

By the analysis above, the empirical parameter ranges of MTDR are 0.5 < k < 3 and 4T_0 < T < 16T_0.

[Figure 10: The distribution of the total execution time against N.]

6. Experiments

In this section, we again use the Workflow Generator to generate the experimental data. The data set D200 is generated with the following parameters: the number of tasks is 200, the number of task slices is 300 for each task, and the size of the task slices is set randomly in the interval [10 M, 15 M]. The device set comprises 6 devices with a processing speed of 6 Mbps, and the data transmission speed is set randomly in the interval [5 Mbps, 10 Mbps]. HEFT [9], HHDS [16] and PCH [15] are selected as the contrast methods; they are representative of the Weight Priority, Level Priority and Path Clustering methods, respectively. All 3 methods schedule the tasks based on the rank. The proposed MTDR is also based on the rank, and its improvement is to control the influence of the rank on the scheduling priority, so the selected methods can effectively contrast the effect of MTDR.

6.1. Arrival time test

To analyze the influence of the arrival times of the 200 tasks in D200 on the scheduling result, we set the arrival time as 5×n (s), implying that the arrival time of the n-th task is 5n (s). A smaller n means the tasks arrive more densely and the device conflict is fiercer. Fig. 11 compares the 4 methods MTDR (k = 1.5, T = 20), HEFT, PCH and HHDS on the average execution time of the 200 tasks against n. Because conflicts exist among the multiple tasks, a method that sufficiently considers the device loads tends to obtain a lower average execution time. In Fig. 11, MTDR has the lowest average execution time for n < 10, implying that MTDR performs better than the other methods under the device conflict condition. For n > 15, successive arrival times are far apart and the conflict decreases; the average execution times of the 4 methods then get close to each other, implying that MTDR performs similarly to the classic DAG methods under the condition of less device conflict.

[Figure 11: The comparison of the average execution time of the 200 tasks.]

To analyze the performance of the 4 methods under random arrival times, each of the 200 tasks in D200 is given a random n in the range (1, 20). Table 4 shows the number of times each method obtains the optimal scheduling result. For example, for k = 1.0 and T = 15, MTDR obtains 64 optimal scheduling results out of the 200 task schedulings, while HEFT, HHDS and PCH obtain 34, 61 and 41 optimal results respectively. It can be seen from Table 4 that MTDR performs better than the other 3 methods for various k and T.

Table 4: The number of times the 4 methods obtain the optimal scheduling result

Parameter        MTDR   HEFT   HHDS   PCH
k=1.0, T=15      64     34     61     41
k=1.0, T=20      72     24     61     43
k=1.0, T=25      59     50     43     48
k=1.0, T=30      79     42     53     26
k=1.5, T=15      62     42     49     47
k=1.5, T=20      63     46     33     58
k=1.5, T=25      74     38     48     40
k=1.5, T=30      75     40     42     43
k=2.0, T=15      79     47     51     23
k=2.0, T=20      69     43     39     49
k=2.0, T=25      66     50     42     42
k=2.0, T=30      74     49     60     17
k=2.5, T=15      77     33     54     36
k=2.5, T=20      81     34     32     53
k=2.5, T=25      62     52     43     43
k=2.5, T=30      75     40     47     38

6.2. Device dependence test

The device dependence is the device selection tendency of a scheduling method. To analyze the device dependence of each method, we use the device occupancy rate as the measurement, expressed as:

occupancy\ rate_i = \frac{ o(d_i) }{ \sum_{j=1}^{|D|} o(d_j) }     (9)

where o(d_j) represents the occupied time on device d_j and |D| is the number of devices. If a method has a higher occupancy rate on a device d, the method has a higher device selection tendency on d; thus, methods that consider load balancing will have a uniform distribution of the occupancy rate over the devices.
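A small sketch of the occupancy-rate measure in Eq. (9); the occupied times o(d_j) below are illustrative values, not measured results.

```python
# Occupied time o(d_j) of each device under one schedule (illustrative values).
occupied = {"Device1": 120.0, "Device2": 95.0, "Device3": 88.0, "Device4": 76.0, "Device5": 60.0}
total = sum(occupied.values())
occupancy_rate = {d: round(o / total, 3) for d, o in occupied.items()}   # Eq. (9)
print(occupancy_rate)
```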

Fig. 12 shows the occupancy rates of the 4 methods, MTDR (k = 1.5, T = 20), HEFT, PCH and HHDS, on the data D200, where the interval between successive arrival times is 20 s. We designed 5 devices, Device1-5, with processing speeds of 10 Mbps, 9 Mbps, 8 Mbps, 7 Mbps and 6 Mbps respectively. The total execution times of the 4 methods are 158 s (MTDR), 202 s (HEFT), 178 s (PCH) and 183 s (HHDS). As the comparison in Fig. 12 shows, HEFT, PCH and HHDS have an obviously higher occupancy rate on Device1 than on Device2-5, and the occupancy rate has a downward trend from Device1 to Device5, implying that the device dependence of HEFT, PCH and HHDS on Device1 is higher than on Device2-5. MTDR, by contrast, considers the device conflicts and tends to schedule task slices onto the other, less efficient devices, so the utilization of Device2-5 is much higher; the other methods mainly tend to allocate the task slices onto the most efficient device, so their utilization decreases gradually from Device1 to Device5. Because of the device conflicts, the task execution times of the other methods are longer, and their ratio of occupancy to task execution time is lower than that of MTDR. The occupancy-rate distribution of MTDR is relatively uniform, implying that MTDR can fully utilize the less efficient devices to reduce the total execution time and has better load-balancing performance.

[Figure 12: The occupancy rates of the 4 methods on Device1-5.]

6.3. Task structure test

In order to analyze the influence of the DAG structure on the scheduling result, we designed the data set D150. The structure of D150 comprises 8 levels, where the first and the eighth level each contain only 1 task slice and the numbers of task slices in the other levels are generated randomly with the total number kept constant at 100. The size of the task slices is set randomly in the interval [10 M, 15 M]. We use 4 devices, with a processing speed of 5 Mbps and a data transmission speed of 8 Mbps. The distribution of task slices over the levels characterizes the DAG structure; thus, we use the information entropy to evaluate the structure of D150, expressed as:

entropy = -\sum_{i=1}^{n} p_i \log(p_i)     (10)

where p_i = L_i / 100 and L_i is the number of task slices in the i-th level. The more uniform the distribution of task slices over the levels, the larger the entropy. For instance, for L_{1-8} = {1, 16, 16, 16, 16, 17, 17, 1} the entropy reaches the maximal value of 1.87, whereas for L_{1-8} = {1, 93, 1, 1, 1, 1, 1, 1} the entropy takes the minimal value of 0.39. We therefore separate D150 into 6 groups by entropy = {(0.39, 0.65], (0.65, 0.90], (0.90, 1.15], (1.15, 1.40], (1.40, 1.65], (1.65, 1.87]}. Each group contains 150 tasks, and the interval between successive arrival times is 10 s.
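A short check of the entropy measure in Eq. (10); the two boundary examples in the text are reproduced when the natural logarithm is used.

```python
import math

def level_entropy(levels, total=100):
    """Eq. (10): entropy = -sum_i p_i * ln(p_i), with p_i = L_i / total."""
    return -sum((L / total) * math.log(L / total) for L in levels if L > 0)

print(round(level_entropy([1, 16, 16, 16, 16, 17, 17, 1]), 2))  # ~1.87, the near-uniform case
print(round(level_entropy([1, 93, 1, 1, 1, 1, 1, 1]), 2))       # ~0.39, the skewed case
```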

Fig. 13 shows the distribution of the total execution time for the 6 groups. MTDR performs best when entropy < 1.40, i.e., when the distribution of task slices is non-uniform. The reason is that, when the distribution of task slices over the levels is non-uniform, the number of task slices in some level is large; many task slices in the same level then have similar ranks, which leads to intensive device conflicts. In this case, MTDR is more efficient than the other methods.

[Figure 13: The total execution time of the 4 methods in the 6 entropy intervals.]

6.4. CCR (communication computation ratio) test

The objective of the CCR test is to verify the stability of the methods when the transmission speed varies. We use D200 as the experimental data and design the following process: 1) Scale the transmission speed of D200 by 1, 0.8, 0.6, 0.4 and 0.2, forming 5 groups of DAG tasks; the random intervals of the transmission speed of the 5 groups are thus [5 Mbps, 10 Mbps], [4 Mbps, 8 Mbps], [3 Mbps, 6 Mbps], [2 Mbps, 4 Mbps] and [1 Mbps, 2 Mbps]. 2) The interval between successive arrival times is 20 s. 3) Use the 4 methods, MTDR (k = 1.5, T = 20), HEFT, PCH and HHDS, to schedule the 5 groups of tasks 100 times and record the number of times each method obtains the best result on each group. Fig. 14 shows the statistical results: MTDR performs best at enlarge ratio = 0.2, and as the enlarge ratio decreases the superiority of MTDR becomes more obvious. The reason is that, as the transmission speed decreases, the device conflicts become more intensive; in this case MTDR is more efficient than the other methods.

[Figure 14: The histogram of the number of times each of the 4 methods obtains the best result on each group.]

6.5. Device set test

The objective of this test is to analyze the efficiency of MTDR (k = 1.5, T = 20). We designed the following experimental process: 1) Generate the data set D500: it contains 500 tasks, each with 300 task slices whose sizes are set randomly in the interval [10 M, 15 M]. 2) The interval between successive arrival times is 20 s. 3) There are 3 types of devices: high-efficiency devices with a processing speed of 10 Mbps, middle-efficiency devices with 6 Mbps, and low-efficiency devices with 3 Mbps. The 6 designed device sets Set1-6 are listed in Table 5, where each set contains 9 devices; for instance, Set1 contains 2 high-efficiency devices, 4 middle-efficiency devices and 3 low-efficiency devices. 4) Schedule D500 on each device set using MTDR with k = 1.5 and record the execution time of each task in D500.

Fig. 15 is the scatter diagram of the 500 tasks on the 6 sets, where the efficiency of the sets decreases from Set1 to Set6 and the execution times of the 4 methods increase accordingly. In Fig. 15, MTDR has an advantage over the other methods on Set1 and Set2, while its performance is close to the others on Set5 and Set6. The reason is that on Set1 and Set2 the distribution of the 3 device types is relatively uniform, so MTDR, which considers load balancing, performs better; on Set5 and Set6 the devices tend to be of the same type, so MTDR performs close to the other methods.

Table 5: The 6 device sets Set1-6

Device type                 Set1  Set2  Set3  Set4  Set5  Set6
high-efficiency device       2     2     1     1     0     0
middle-efficiency device     4     3     3     2     2     0
low-efficiency device        3     4     5     6     7     9

[Figure 15: The scatter diagram of the 500 tasks on the 6 sets.]

7. Conclusion

We designed a multi-scheduler scheduling method, MTDR, oriented to the distributed environment. The innovative idea of this paper is to use a feedback mechanism, denoted LSF, that controls the subsequent scheduling results so as to balance the device load. The proposed device Busyness model takes the frequency and the beginning times of the occupancy segments as its input and treats the Busyness as a control parameter of the rank; in this way, task slices tend to be allocated to relatively idle devices, realizing load balance.

We used an experimental study to determine the optimal intervals of the parameters and verified that MTDR performs better than other methods such as HEFT, PCH and HHDS; the experimental procedure can also be used as a general method for analyzing the parameters. In the experiments, the efficiency of MTDR was tested in 5 aspects: arrival time, device dependence, task structure, CCR and device set. The experiments show that, because it considers the device load, MTDR performs best in multiple-task scheduling with device conflicts; when the conflict is small, the performance of MTDR is close to the other classic DAG scheduling methods, so MTDR can also meet the requirements of a single-scheduler system. The MTDR method can provide a basis for big data scheduling and cloud computing scheduling, and has practical significance for the study of parallel scheduling. A drawback of MTDR is that it cannot determine the optimal values of k and T; further work will therefore investigate the optimal values of k and T.

Acknowledgement

We acknowledge the support of the National Natural Science Foundation of China under Grant Nos. 61602133, 61672179, 61370083, 61370086; the Heilongjiang Postdoctoral Science Foundation (LBH-Z15096); and the China Postdoctoral Science Foundation Funded Project (2016M591541).

References

[1] J. Dean, S. Ghemawat, MapReduce: Simplified data processing on large clusters, in: Proceedings of Operating Systems Design and Implementation (OSDI), 2004, pp. 107-113.
[2] J. Dean, MapReduce: simplified data processing on large clusters, Communications of the ACM 51 (1) (2008) 147-152.
[3] S. Bhardwaj, L. Jain, S. Jain, Cloud computing: A study of infrastructure as a service (IaaS), International Journal of Engineering and Information Technology 2 (1) (2010) 60-63.
[4] M. Malawski, G. Juve, E. Deelman, J. Nabrzyski, Algorithms for cost- and deadline-constrained provisioning for scientific workflow ensembles in IaaS clouds, Future Generation Computer Systems 48 (1) (2015) 1-18.
[5] H. Xu, B. Yang, W. Qi, E. Ahene, A multi-objective optimization approach to workflow scheduling in clouds considering fault recovery, KSII Transactions on Internet & Information Systems (2016) 1-18.
[6] X. Nan, Y. He, L. Guan, Optimal resource allocation for multimedia cloud based on queuing model, in: IEEE 13th International Workshop on Multimedia Signal Processing (MMSP), 2011, pp. 1-6.
[7] H. Kllapi, E. Sitaridi, M. M. Tsangaris, Y. Ioannidis, Schedule optimization for data processing flows on the cloud, in: Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data, 2011, pp. 289-300.
[8] C. Lin, S. Lu, Scheduling scientific workflows elastically for cloud computing, in: IEEE International Conference on Cloud Computing, 2011, pp. 746-747.
[9] H. Topcuoglu, S. Hariri, M. Y. Wu, Performance-effective and low-complexity task scheduling for heterogeneous computing, IEEE Transactions on Parallel & Distributed Systems 13 (3) (2002) 260-274.
[10] Y. K. Kwok, I. Ahmad, Dynamic critical-path scheduling: An effective technique for allocating task graphs to multiprocessors, IEEE Transactions on Parallel & Distributed Systems 7 (5) (1996) 506-521.
[11] W. Wu, A. Bouteiller, G. Bosilca, M. Faverge, J. Dongarra, Hierarchical DAG scheduling for hybrid distributed systems, in: IEEE International Parallel & Distributed Processing Symposium, 2015, pp. 1-11.
[12] K. Kim, V. Eijkhout, A parallel sparse direct solver via hierarchical DAG scheduling, ACM Transactions on Mathematical Software 41 (1).
[13] Y. Ma, L. Wang, A. Y. Zomaya, D. Chen, R. Ranjan, Task-tree based large-scale mosaicking for massive remote sensed imageries with dynamic DAG scheduling, IEEE Transactions on Parallel & Distributed Systems 25 (8) (2014) 2126-2137.
[14] L. F. Bittencourt, R. Sakellariou, E. R. M. Madeira, DAG scheduling using a lookahead variant of the heterogeneous earliest finish time algorithm, in: 2010 18th Euromicro Conference on Parallel, Distributed and Network-based Processing, 2010, pp. 27-34.
[15] L. F. Bittencourt, E. R. M. Madeira, Towards the scheduling of multiple workflows on computational grids, Journal of Grid Computing 8 (3) (2010) 419-441.
[16] R. Sakellariou, H. Zhao, A hybrid heuristic for DAG scheduling on heterogeneous systems, in: Parallel and Distributed Processing Symposium, 2004. Proceedings. 18th International, 2004, p. 111b.
[17] A. Rezaeian, H. Abrishami, S. Abrishami, M. Naghibzadeh, A budget constrained scheduling algorithm for hybrid cloud computing systems under data privacy, in: IEEE International Conference on Cloud Engineering, 2016, pp. 230-231.
[18] S. Tayal, Tasks scheduling optimization for the cloud computing systems, International Journal of Advanced Engineering Sciences and Technologies (IJAEST) 5 (2) (2011) 111-115.
[19] S. Pandey, L. Wu, S. M. Guru, R. Buyya, A particle swarm optimization-based heuristic for scheduling workflow applications in cloud computing environments, in: 2010 24th IEEE International Conference on Advanced Information Networking and Applications, 2010, pp. 400-407.
[20] Z. Wu, Z. Ni, L. Gu, X. Liu, A revised discrete particle swarm optimization for cloud workflow scheduling, in: 2010 International Conference on Computational Intelligence and Security, 2010, pp. 184-188.
[21] Y. Xin, Z. Q. Xie, J. Yang, A load balance oriented cost efficient scheduling method for parallel tasks, Journal of Network and Computer Applications (2016) 1-15.
[22] Workflow Generator, https://confluence.pegasus.isi.edu/display/pegasus/WorkflowGenerator (2014).