The Impact of Workload Variability on the Energy Efficiency of Large-Scale Heterogeneous Distributed Systems

Accepted Manuscript

Georgios L. Stavrinides, Helen D. Karatza

PII: S1569-190X(18)30140-0
DOI: https://doi.org/10.1016/j.simpat.2018.09.013
Reference: SIMPAT 1860

To appear in: Simulation Modelling Practice and Theory

Received date: 18 August 2018
Revised date: 21 September 2018
Accepted date: 24 September 2018

Please cite this article as: Georgios L. Stavrinides, Helen D. Karatza, The Impact of Workload Variability on the Energy Efficiency of Large-Scale Heterogeneous Distributed Systems, Simulation Modelling Practice and Theory (2018), doi: https://doi.org/10.1016/j.simpat.2018.09.013

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Georgios L. Stavrinides*, Helen D. Karatza

Department of Informatics, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece

Abstract

Previous studies have shown that workload variability has a serious impact on the performance of large-scale distributed architectures, since it may cause significant fluctuations in service demands. Energy efficiency is an aspect of such platforms that is of paramount importance, and it is therefore imperative to investigate how it may also be affected by this factor. To this end, in this paper we investigate via simulation the impact of workload variability, in terms of computational volume and interarrival times, on the energy consumption of a large-scale heterogeneous distributed system. The workload consists of real-time bag-of-tasks jobs that arrive dynamically at the system. The execution rate and power consumption characteristics of the processors are modeled after real-world processors, according to the Standard Performance Evaluation Corporation (SPEC) Power benchmark. Four heuristics are employed for the scheduling of the workload: two commonly used baseline policies and two proposed energy-aware heuristics. The simulation results reveal that workload variability has a significant impact on the energy consumption of the system and that the severity of the impact depends on the employed scheduling technique.

Keywords: Energy Efficiency, Workload Variability, Large-Scale Distributed Systems, Heterogeneity, Real-Time Jobs

* Corresponding author. Tel.: +30 2310 997974. Email addresses: [email protected] (Georgios L. Stavrinides), [email protected] (Helen D. Karatza)

Preprint submitted to Simulation Modelling Practice and Theory, September 24, 2018

1. Introduction

In recent years, energy consumption has become one of the main challenges of large-scale distributed architectures. The ever-growing demand for effective and highly parallel processing, along with rapid technological advancements, has driven distributed systems to unprecedented scales and computational capabilities. Furthermore, such infrastructures gradually grow more heterogeneous due to budget-constrained upgrade and replacement cycles, encompassing processors of different generations with various energy consumption characteristics. Inevitably, these systems continue to show a dramatic increase in their carbon footprint. Due to the heterogeneity and inherent complexity of such platforms, their performance is usually evaluated via simulation, rather than by analytical methods. Analytical modeling in such cases would require many simplifying assumptions that could yield misleading conclusions.

The jobs processed on such platforms typically feature a high degree of parallelism, consisting of independent tasks that can be executed in any order, without any intertask communication among them. Such jobs are commonly referred to as bags-of-tasks. Prominent examples are image processing, chromosome mapping, massive searches and data mining jobs [1]. Quality of Service (QoS), often expressed in terms of timeliness, is another important aspect of such systems. That is, jobs are often real-time, having a deadline within which they must complete their execution. Since a late result would be useless, in case a job fails to meet its deadline, it is aborted from the system [2, 3, 4, 5].

1.1. Motivation

Previous studies have shown that workload variability has a serious impact on the performance of large-scale distributed architectures [6, 7, 8]. It may cause significant fluctuations in service demands, as their low, average and peak values can differ by several orders of magnitude. However, previous studies ignored the impact of these fluctuations on the energy consumption of the system. Since energy efficiency is one of the aspects of such platforms that are of paramount importance, it is imperative to investigate how it may also be affected by this factor.

1.2. Contribution

To this end, in this paper we investigate via simulation the impact of workload variability, in terms of computational volume and interarrival times, on the energy consumption of a large-scale heterogeneous distributed system. The workload consists of real-time bag-of-tasks jobs that arrive dynamically at the system. The execution rate and power consumption characteristics of the processors are modeled after real-world Intel processors, according to the SPECpower_ssj2008 benchmark of the Standard Performance Evaluation Corporation (SPEC), which is the first industry-standard benchmark that evaluates the power and performance characteristics of servers [9]. Four heuristics are employed for the scheduling of the workload: two commonly used baseline policies and two proposed energy-aware heuristics.

2. Background and Related Work

In this paper, the interarrival time and computational volume variability are expressed in terms of the coefficient of variation of the corresponding random variables. The coefficient of variation CV of a random variable X is the ratio of its standard deviation σ to its mean X̄ and determines the variability of its values in relation to the mean (i.e. CV = σ/X̄). When CV = 1, X is considered here to be exponentially distributed with mean X̄ and exhibits moderate variability in its values. On the one hand, when CV = 0.5, X features low variability compared to the exponential distribution. In this case, X follows a hypoexponential distribution with mean X̄ and its values are closer to the mean X̄ than in the exponential case. On the other hand, when CV = 1.5, X features high variability in its values compared to the exponential distribution. In this case, X follows a hyperexponential distribution with mean X̄ and takes a large number of very small values and a small number of very large values, compared to the mean X̄.

Many research efforts have focused on the investigation of the effects of workload variability in distributed architectures [10, 6, 7]. However, none of these works investigates the impact of workload variability on the energy efficiency of the computing resources. The work presented in this paper differs from our previous work in [6, 7], since in this case the energy consumption of the system is taken into account, in contrast to our previous efforts that only considered the impact of workload variability on the provided QoS. Furthermore, in this study the workload variability is expressed in terms of both computational volume and interarrival times, whereas our previous studies only considered the variability in the computational volume of the workload. Moreover, this study differs from our previous work in [11], since the main goal of this study is to investigate the impact of workload variability on the energy efficiency of the system, whereas the workload variability was not modeled nor examined in [11]. Furthermore, the system model and the proposed energy-aware heuristics employed in this paper are different from the ones used in all of our previous research efforts.

Two of the most widely used scheduling heuristics for bags-of-tasks are the Min-Min and Max-Min policies [12]. Both heuristics are based on a two-step iterative process. In the first step, the minimum completion time of each unassigned task is calculated over all of the processors in the system. In the second step, the task with the minimum (in the case of Min-Min) or maximum (in the case of Max-Min) minimum completion time is assigned to the corresponding processor, hence the name of each algorithm. At each iteration, the current load of the processors is taken into account for the next scheduling decision. Several energy-aware variants of the above algorithms have been proposed in the literature. However, the vast majority of them compromise the system performance in order to achieve energy savings [13, 14, 15, 16]. The energy-aware heuristics proposed in this paper improve both the energy consumption of the system and the provided QoS.

ACCEPTED MANUSCRIPT

Figure 1: The queueing model of the target distributed system. [Figure: jobs arrive at a central scheduler with a global queue, which dispatches tasks to the local queues of processors P1, ..., Pq.]

3. Problem Definition

3.1. System Model

The target distributed system consists of a set Q = {p1, p2, ..., pq} of q fully connected heterogeneous processors. Each processor pi has its own memory and serves its own local queue of tasks, with an execution rate µi. There is a central scheduler with a global waiting queue, running on a dedicated processor, responsible for scheduling the component tasks of the submitted jobs to the other processors of the system. A centralized scheduler is utilized in this study, as it is simpler and easier to implement than a distributed scheduler, which would require additional communication and synchronization overhead. Furthermore, with a centralized scheduler that has global information on the state of the whole system, better scheduling decisions can be made, ultimately providing better QoS [17]. The scheduler is invoked upon the arrival of each job. Figure 1 shows the queueing model of the system under study.

3.2. Workload Model

Real-time jobs arrive dynamically at the central scheduler, with interarrival time a. The interarrival time of the jobs follows a distribution with a coefficient of variation CVa and a mean ā. Each job is a bag-of-tasks, consisting of a set J = {t1, t2, ..., tn} of n independent, parallel tasks, without intertask communication or precedence constraints among them. n is uniformly distributed in the range [nmin, nmax]. There are no dependencies between the submitted jobs. The component tasks of the jobs are not preemptible, as preemption of tasks with time constraints may eventually lead to performance degradation [18, 19, 20].

Each component task ti of a job has a weight wi that denotes its computational volume, i.e. the amount of computational operations required by the particular task. This information can be obtained by profiling or from historical data of previous executions of the particular job [21]. The computational volume of the tasks follows a distribution with a coefficient of variation CVw and a mean w̄.

The computational cost of a task ti on a processor pj of the system is defined as:

    Comp(ti, pj) = wi / µj    (1)

where µj is the execution rate of processor pj. The length L of a job is equal to the largest average computational cost of its component tasks, over all of the processors in the system.

Each job has an end-to-end deadline D:

    D = AT + RD    (2)

where AT is the arrival time of the job and RD is its relative deadline. RD is uniformly distributed in the range [L, 2L]. When a job fails to complete its execution within its deadline, it is aborted and thus considered lost.

3.3. Energy Consumption Model

In this paper we focus on the energy consumption of the processors, since, compared to other system components, they typically consume the largest amount of energy [22, 23]. The power P^i consumed by a processor pi in the system is given by:

    P^i(u^i(τ)) = P^i_idle + (P^i_max − P^i_idle) · u^i(τ)    (3)

where P^i_idle and P^i_max are the power consumed by the processor when idle and at 100% utilization, respectively, whereas u^i(τ) is the utilization of the processor and is a function of time. Consequently, the energy consumption E^i of a processor pi over a period of time [τ0, τ1] can be defined as the integral of its power consumption P^i [22, 11]:

    E^i = ∫_{τ0}^{τ1} P^i(u^i(τ)) dτ    (4)
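In a discrete-event setting the utilization u^i(τ) is piecewise constant between scheduling events, so the integral in Eq. (4) reduces to a sum over segments. A minimal illustrative helper (a hypothetical function, not the authors' code):

```python
def energy_kwh(p_idle, p_max, segments):
    """Energy of one processor under the linear model P(u) = P_idle + (P_max - P_idle) * u,
    integrated over piecewise-constant utilization segments (duration_in_seconds, u)."""
    joules = sum(dt * (p_idle + (p_max - p_idle) * u) for dt, u in segments)
    return joules / 3.6e6  # 1 kWh = 3.6e6 J
```

For example, a medium processor from Table 1 (23.26 W idle, 133.62 W at 100% utilization) held at 50% utilization for one hour consumes about 0.078 kWh.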

3.4. Scheduling Heuristics

Four bag-of-tasks scheduling heuristics are employed: two baseline non-energy-aware policies, Min-Min and Max-Min, and two proposed energy-aware heuristics, Min-EAMin and Max-EAMin. The first two policies were selected due to their widespread use, whereas the last two were proposed in order to investigate the impact of workload variability on the energy consumption of the system also in the case where energy-aware scheduling heuristics are employed.

3.4.1. Min-Min & Max-Min

The baseline Min-Min and Max-Min scheduling policies were adapted in order to take into account the deadlines of the jobs during the scheduling process. Specifically, in the first step of each iteration of the scheduling algorithm, the estimated completion time CT of each component task ti of a newly arrived job is calculated on each processor pj of the target system:

    CT(ti, pj) = τidle(ti, pj) + Comp(ti, pj)    (5)

where τidle(ti, pj) is the time at which processor pj will be able to execute task ti. It depends on the position that task ti would be placed in pj's queue, if it were actually assigned to that particular processor. This position is determined according to the deadline D of ti's job, so that task ti would precede other tasks in the queue with later deadlines.

Subsequently, the minimum estimated completion time MCT of each task ti over all of the processors in the system is calculated, as follows:

    MCT(ti) = min_{pj ∈ Q} {CT(ti, pj)}    (6)

In the second step of each iteration of the scheduling process, the task with the minimum (in the Min-Min case) or maximum (in the Max-Min case) MCT among all

of the tasks of the job is selected and assigned to the corresponding processor. Each iteration of the scheduling process takes into account the current load of the processors, as determined by the scheduling decision of the previous iteration. The iterations continue until all of the component tasks of the newly arrived job are scheduled.

3.4.2. Min-EAMin & Max-EAMin

The Min-EAMin and Max-EAMin scheduling heuristics are modified versions of the Min-Min and Max-Min policies described above, which take into account the energy consumption of the processors during the scheduling process (the EA initials in the name of the algorithms stand for the term energy-aware). The energy-awareness of the algorithms is achieved by using an enhanced version of the heuristic used in our previous work in [11]. During the first step, along with the estimated completion time CT of each component task ti on each processor pj, the estimated energy consumption E^j of each processor is also determined, calculated based on the amount of energy required to finish the execution of all of pj's currently assigned tasks, including task ti. It is given by:

    E^j = P^j_max · (CT(ti, pj) − τ)    (7)

where P^j_max is the power consumption of processor pj at 100% utilization, whereas τ is the current time.

Subsequently, the minimum estimated completion time MCT of task ti is calculated, as in the case of Min-Min and Max-Min. Thereafter, we determine the set of processors that can provide task ti with an estimated completion time that satisfies the following condition:

    CT(ti, pj) ≤ MCT(ti) + φ · σCT(ti)    (8)

where φ is the selection margin coefficient, which is constant for all tasks. σCT(ti) is the standard deviation of the estimated completion time of task ti over the processors in the system. It is given by:

    σCT(ti) = √( Σ_{pj ∈ Q} (CT(ti, pj) − CT̄)² / q )    (9)

where CT̄ is the mean estimated completion time of task ti on the processors of the system, whereas q is the number of processors in the system. Condition (8) constitutes the main enhancement of this heuristic over the one we proposed in [11], where the same fixed margin was always used for the determination of the set of processors.


From this set, the processor pk that has the minimum estimated energy consumption is selected for task ti. Consequently, the minimum estimated completion time of the task is recalculated as MCT′(ti) = CT(ti, pk). The second step of each iteration is performed in the same manner as in the case of Min-Min and Max-Min, respectively. That is, the task with the minimum (in the Min-EAMin case) or maximum (in the Max-EAMin case) MCT′ is selected and assigned to the corresponding processor.
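The selection step of the proposed heuristics (condition (8) followed by the minimum-energy pick) can be sketched as follows; the per-processor completion-time and energy estimates are assumed to have been computed already, as in Eqs. (5) and (7), and the function itself is an illustrative reconstruction, not the authors' implementation:

```python
from math import sqrt

def ea_select(ct, energy, phi=0.25):
    """Pick a processor for one task: among those whose estimated completion time
    lies within phi standard deviations of the minimum (condition (8)), choose
    the one with the minimum estimated energy consumption.

    ct[j]:     estimated completion time CT(ti, pj)
    energy[j]: estimated energy consumption E^j of processor pj (Eq. (7))
    Returns (chosen processor k, recalculated MCT'(ti) = CT(ti, pk)).
    """
    q = len(ct)
    mct = min(ct)
    mean = sum(ct) / q
    sigma = sqrt(sum((c - mean) ** 2 for c in ct) / q)  # Eq. (9)
    candidates = [j for j in range(q) if ct[j] <= mct + phi * sigma]
    k = min(candidates, key=lambda j: energy[j])
    return k, ct[k]
```

With φ = 0 the set degenerates to the fastest processor alone and the heuristic falls back to the baseline choice; a larger φ widens the set and gives the energy criterion more room, which is the trade-off the selection margin coefficient controls.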

4. Experimental Evaluation

In order to investigate the impact of workload variability on the energy efficiency of the target system, the coefficient of variation of the interarrival time of the jobs was set to CVa = {0.5, 1, 1.5}. Similarly, the coefficient of variation of the computational volume of the component tasks was set to CVw = {0.5, 1, 1.5}. In the first set of experiments, where CVa = {0.5, 1, 1.5}, the computational volume of the component tasks (the secondary parameter in this case) was considered to be exponential, i.e. CVw = 1. Correspondingly, in the second set of experiments, where CVw = {0.5, 1, 1.5}, the interarrival time of the jobs (the secondary parameter in this case) was considered to be exponential, i.e. CVa = 1. The exponential distribution was chosen for the secondary parameter in each set of experiments, since in a large-scale distributed system with no workload fluctuations the service and interarrival times are typically exponentially distributed [24].
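One standard way to realize these coefficients of variation is an Erlang-k (hypoexponential) sum of stages for CV = 0.5, the exponential itself for CV = 1, and a balanced two-phase hyperexponential for CV = 1.5. The paper does not state which hypo-/hyperexponential variants were used, so the generator below is only an assumption-laden illustration:

```python
import random

def sample_time(mean, cv, rng=random):
    """Positive random variate with the given mean and coefficient of variation."""
    if cv == 1.0:  # exponential: CV = 1 by definition
        return rng.expovariate(1.0 / mean)
    if cv < 1.0:   # Erlang-k (hypoexponential with equal stages): CV = 1/sqrt(k)
        k = max(1, round(1.0 / (cv * cv)))
        return sum(rng.expovariate(k / mean) for _ in range(k))
    # Balanced two-phase hyperexponential H2 with squared CV = cv^2.
    c2 = cv * cv
    p = 0.5 * (1.0 + ((c2 - 1.0) / (c2 + 1.0)) ** 0.5)
    rate = 2.0 * p / mean if rng.random() < p else 2.0 * (1.0 - p) / mean
    return rng.expovariate(rate)
```

All three branches preserve the requested mean, so sweeping cv over {0.5, 1, 1.5} changes only the fluctuation of the interarrival (or service-demand) stream, which is exactly the experimental knob used here.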

4.1. Metrics

In addition to the energy consumption, which is the main metric in this study, a secondary metric was employed as well, in order to evaluate the system performance in terms of QoS. Specifically, the following two metrics were used:

1. Total Energy Consumption, which is the total energy, in kilowatt-hours (kWh), consumed by all of the processors of the system under study during the observed time period.

2. Deadline Miss Ratio, which is the ratio of the number of jobs that did not finish their execution within their deadline (and were thus lost) to the number of all of the jobs that arrived at the central scheduler of the system during the observed time period.

4.2. Simulation Setup

We implemented our own discrete-event simulator in C++, tailored to the specific requirements of the particular case study. In order to directly control the workload variability and obtain unbiased, general results, not applicable only to particular workload traces, synthetic workload was used. The input parameters of the simulation model are shown in Table 1.

The heterogeneity of the processors in the system was achieved through the utilization of three sets of processors, categorized according to their computational capacity. Specifically, 25% of the processors were considered to be small, 50% were considered to be medium, whereas the remaining 25% were considered to be large. The execution rate and power consumption characteristics of the small, medium and large processors were modeled after the Intel Xeon E3-1260L v5, Intel Xeon E5-2699 v4 and Intel Xeon Platinum 8180 processors, respectively, according to the SPECpower_ssj2008 benchmark. The average values per processor for benchmark results published in the period Jan. 1st 2016 - Jan. 1st 2018 were used, as shown in Table 1. According to the benchmark, the execution rate of the processors is expressed in Server-Side Java Operations Per Second (SSJ OPS), whereas the power consumption is expressed in watts (W).
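The processor classes can be written down directly from Table 1; for instance, the aggregate idle draw of the 256-processor fleet, a lower bound on the system's power consumption, follows from the per-class counts and idle powers. This is a hypothetical snippet built from the Table 1 figures, not part of the authors' simulator:

```python
# (class, count, execution rate in SSJ OPS, P_idle in W, P_max in W), per Table 1.
FLEET = [
    ("small",  64,  0.51e6, 16.37,  48.87),
    ("medium", 128, 1.74e6, 23.26, 133.62),
    ("large",  64,  2.86e6, 26.43, 226.56),
]

def fleet_idle_power_w():
    """Power drawn when all 256 processors sit idle."""
    return sum(count * p_idle for _, count, _, p_idle, _ in FLEET)
```

This comes to about 5.7 kW, i.e. even a fully idle system draws a non-negligible baseline, which is one reason wasted work on jobs that eventually miss their deadlines translates directly into wasted energy.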

The mean computational volume of the component tasks of the jobs was chosen to be w̄ = 5.22 · 10^8 Server-Side Java Operations (SSJ OPs), so that, on average, the computational cost of a task on a medium processor was approximately equal to 5 minutes. The mean interarrival time of the jobs was chosen to be ā = 43.48, in order for the system to be stable. The selection margin coefficient was selected to be φ = 0.25, since after conducting simulation runs with values in the range [0.05, 1.50] using 0.05 increments, this value gave the best results in terms of both energy efficiency and QoS.

We ran 30 replications of the simulation with different seeds of random numbers, for each set of input parameters. Each replication was terminated when 10^6 jobs had been completed. We found by experimentation that this simulation run length was sufficiently long to minimize the effect of warm-up time. For every mean value, a 95% confidence interval was evaluated. The half-widths of all of the confidence intervals were less than 5% of their respective mean values.

4.3. Simulation Results

The impact of interarrival time variability on the energy consumption of the distributed system under study is illustrated in Fig. 2. Figure 3 shows its effect on the deadline miss ratio metric. The simulation results indicate that as the variability in the interarrival times increases, the total energy consumption of the system, as well as the deadline miss ratio, increase, irrespective of the employed scheduling policy. On the other hand, as shown in Fig. 4 and Fig. 5, the variability in the computational volumes of the component tasks has a different impact on the energy efficiency of the system and the provided QoS,

which depends on the employed scheduling heuristic. Specifically, in the case of the baseline Min-Min and Max-Min policies, the energy consumption of the system, as well as the deadline miss ratio, increase with the increase of the computational volume variability. On the contrary, in the case of the energy-aware policies Min-EAMin and Max-EAMin, both metrics decrease as the computational volume variability increases. In absolute values, when CVa = 0.5, all scheduling algorithms give a smaller total energy consumption and deadline miss ratio than in the case where CVw = 0.5. On the other hand, when CVa = 1.5, all policies give a larger value for both metrics than in the case where CVw = 1.5.

Table 1: Input parameters of the simulation model.

Parameter                                                Value
Number of completed jobs                                 10^6
Number of tasks per job                                  n ∼ U[nmin, nmax]
Minimum number of tasks per job                          nmin = 1
Maximum number of tasks per job                          nmax = 64
Mean job interarrival time                               ā = 43.48
Coefficient of variation of job interarrival time        CVa = {0.5, 1, 1.5}
Job relative deadline                                    RD ∼ U[L, 2L]
Mean task computational volume                           w̄ = 5.22 · 10^8 SSJ OPs
Coefficient of variation of task computational volume    CVw = {0.5, 1, 1.5}
Total number of processors                               q = 256
Number of small processors                               qs = 64
Number of medium processors                              qm = 128
Number of large processors                               ql = 64
Small processor execution rate                           µs = 0.51 · 10^6 SSJ OPS
Medium processor execution rate                          µm = 1.74 · 10^6 SSJ OPS
Large processor execution rate                           µl = 2.86 · 10^6 SSJ OPS
Small processor power consumption when idle              P^s_idle = 16.37 W
Small processor power consumption @ 100% utilization     P^s_max = 48.87 W
Medium processor power consumption when idle             P^m_idle = 23.26 W
Medium processor power consumption @ 100% utilization    P^m_max = 133.62 W
Large processor power consumption when idle              P^l_idle = 26.43 W
Large processor power consumption @ 100% utilization     P^l_max = 226.56 W
Selection margin coefficient                             φ = 0.25
Scheduling policy                                        Min-Min, Max-Min, Min-EAMin, Max-EAMin

It can be observed that the Max-Min and Max-EAMin policies outperform their corresponding Min-Min and Min-EAMin heuristics. This is due to the fact that Max-Min and Max-EAMin are likely to give a smaller total execution time for a job than Min-Min and Min-EAMin, respectively, since they schedule the component tasks with larger execution times at earlier iterations. Therefore, there is a higher probability that the tasks with smaller execution times that are

subsequently scheduled will be executed in parallel with the tasks with larger execution times. Consequently, with Max-Min and Max-EAMin fewer jobs miss their deadline and thus less energy is wasted than in the case of Min-Min and Min-EAMin, respectively. Min-EAMin and Max-EAMin yield better results than their corresponding baseline policies, Min-Min and Max-Min, since they take into account the energy consumption of the processors at each scheduling decision. Overall, in all cases of workload variability, the proposed Max-EAMin policy outperforms the other heuristics.

The impact of workload variability, in terms of percentage change (increase or decrease) compared to the exponential case of each workload parameter (interarrival time and computational volume), is summarized in Table 2. It can be observed that the energy-aware heuristics Min-EAMin and Max-EAMin are more sensitive to changes in the interarrival time variability, exhibiting a larger decrease and increase compared to the baseline policies. On the other hand, while Min-EAMin and Max-EAMin exhibit greater sensitivity to a decrease in the computational volume variability compared to Min-Min and Max-Min, they are less sensitive to an increase in the computational volume variability compared to the baseline heuristics.

Table 2: Impact of workload variability (% change compared to exponential case). TEC is the Total Energy Consumption metric, whereas DMR is the Deadline Miss Ratio metric.

Policy       CVa    TEC      DMR        CVw    TEC      DMR
Min-Min      0.5    −1.25    −2.93      0.5    −0.02    −0.50
Min-EAMin    0.5    −1.92    −20.50     0.5     3.10     50.42
Max-Min      0.5    −1.95    −14.77     0.5    −1.65    −14.18
Max-EAMin    0.5    −2.16    −66.17     0.5     2.20    122.54
Min-Min      1.5     7.29     36.79     1.5     4.55     19.47
Min-EAMin    1.5     7.63    113.02     1.5    −0.53    −7.20
Max-Min      1.5     6.36     71.67     1.5     3.43     34.09
Max-EAMin    1.5     7.05    315.87     1.5    −1.18    −32.43

Figure 2: Total Energy Consumption vs. CVa.

Figure 3: Deadline Miss Ratio vs. CVa.


Figure 4: Total Energy Consumption vs. CVw .

Figure 5: Deadline Miss Ratio vs. CVw .

5. Conclusions and Future Work

In this paper we investigated via simulation the impact of workload variability, in terms of computational volume and interarrival times, on the energy consumption of a large-scale heterogeneous distributed system. For comparison purposes, a secondary metric was employed as well, in order to evaluate the system performance in terms of QoS. Four heuristics were employed for the scheduling of the workload: two commonly used baseline policies, Min-Min and Max-Min, and two proposed energy-aware heuristics, Min-EAMin and Max-EAMin. The simulation results demonstrate that workload variability has a significant impact on the energy consumption of the system, as well as the provided QoS, and that the severity of the impact depends on the employed scheduling technique.

Our future work plans include the evaluation of the energy efficiency of the system under various processor heterogeneity degrees, utilizing different scheduling heuristics.

References

[1] F. A. B. Silva, H. Senger, Scalability limits of bag-of-tasks applications running on hierarchical platforms, Journal of Parallel and Distributed Computing 71 (6) (2011) 788–801. doi:10.1016/j.jpdc.2011.01.002.

[2] G. L. Stavrinides, H. D. Karatza, The impact of input error on the scheduling of task graphs with imprecise computations in heterogeneous distributed real-time systems, in: Proceedings of the 18th International Conference on Analytical and Stochastic Modelling Techniques and Applications (ASMTA'11), 2011, pp. 273–287. doi:10.1007/978-3-642-21713-5_20.

[3] G. L. Stavrinides, H. D. Karatza, Scheduling real-time jobs in distributed systems - simulation and performance analysis, in: Proceedings of the 1st International Workshop on Sustainable Ultrascale Computing Systems (NESUS'14), 2014, pp. 13–18.

[4] Y. Chen, W. T. Tsai, Service-Oriented Computing and Web Software Integration: From Principles to Development, 5th Edition, Kendall Hunt Publishing, 2015.

[5] G. L. Stavrinides, F. R. Duro, H. D. Karatza, J. G. Blas, J. Carretero, Different aspects of workflow scheduling in large-scale distributed systems, Simulation Modelling Practice and Theory 70 (2017) 120–134. doi:10.1016/j.simpat.2016.10.009.

[6] G. L. Stavrinides, H. D. Karatza, Scheduling real-time bag-of-tasks applications with approximate computations in SaaS clouds, Concurrency and Computation: Practice and Experience (2017) e4208. doi:10.1002/cpe.4208.

[7] G. L. Stavrinides, H. D. Karatza, The effect of workload computational demand variability on the performance of a SaaS cloud with a multi-tier SLA, in: Proceedings of the IEEE 5th International Conference on Future Internet of Things and Cloud (FiCloud'17), 2017, pp. 10–17. doi:10.1109/FiCloud.2017.26.

[8] G. L. Stavrinides, H. D. Karatza, Energy-aware scheduling of real-time workflow applications in clouds utilizing DVFS and approximate computations, in: Proceedings of the IEEE 6th International Conference on Future Internet of Things and Cloud (FiCloud'18), 2018, pp. 33–40. doi:10.1109/FiCloud.2018.00013.

[9] Standard Performance Evaluation Corporation (SPEC), SPECpower_ssj2008, https://www.spec.org/power_ssj2008/, accessed: 13 Jul 2018 (2018).

AC

of-tasks in large-scale distributed systems, in: Proceedings of the 17th International Symposium on High Performance Distributed Computing (HPDC’08), 2008, pp. 97–108. doi:10.1145/1383422.1383435.

[11] G. L. Stavrinides, H. D. Karatza, Simulation-based performance evaluation of an energy-aware heuristic for the scheduling of hpc applications in

17

ACCEPTED MANUSCRIPT

large-scale distributed systems, in: Proceedings of the 8th ACM/SPEC International Conference on Performance Engineering (ICPE’17), 3rd International Workshop on Energy-aware Simulation (ENERGY-SIM’17), 2017,

CR IP T

pp. 49–54. doi:10.1145/3053600.3053611. [12] O. H. Ibarra, C. E. Kim, Heuristic algorithms for scheduling independent

tasks on nonidentical processors, Journal of the ACM 24 (2) (1977) 280– 289. doi:10.1145/322003.322011.

[13] Y. Li, Y. Liu, D. Qian, A heuristic energy-aware scheduling algorithm for

AN US

heterogeneous clusters, in: Proceedings of the 15th International Confer-

ence on Parallel and Distributed Systems (ICPADS’09), 2009, pp. 407–413. doi:10.1109/ICPADS.2009.33.

[14] C. O. Diaz, M. Guzek, J. E. Pecero, P. Bouvry, S. U. Khan, Scalable and energy-efficient scheduling techniques for large-scale systems, in: Proceed-

M

ings of the IEEE 11th International Conference on Computer and Information Technology (CIT’11), 2011, pp. 641–647. doi:10.1109/CIT.2011.

ED

106.

[15] P. Lindberg, J. Leingang, D. Lysaker, K. Bilal, S. U. Khan, P. Bouvry, N. Ghani, N. Min-Allah, J. Li, Comparison and analysis of greedy energy-

PT

efficient scheduling algorithms for computational grids, John Wiley & Sons, 2012, Ch. 7, pp. 189–214. doi:10.1002/9781118342015.ch7.

CE

[16] S. Nesmachnow, B. Dorronsoro, J. E. Pecero, P. Bouvry, Energyaware scheduling on multicore heterogeneous grid computing systems,

AC

Journal of Grid Computing 11 (4) (2013) 653–680.

doi:10.1007/

s10723-013-9258-3.

[17] X. Zhu, X. Qin, M. Qiu, Qos-aware fault-tolerant scheduling for real-time tasks on heterogeneous clusters, IEEE Transactions on Computers 60 (6) (2011) 800–812. doi:10.1109/TC.2011.68.

18

[18] G. C. Buttazzo, Hard Real-Time Computing Systems: Predictable Scheduling Algorithms and Applications, 3rd Edition, Springer, 2011. doi:10.1007/978-1-4614-0676-1.

[19] G. L. Stavrinides, H. D. Karatza, Scheduling multiple task graphs in heterogeneous distributed real-time systems by exploiting schedule holes with bin packing techniques, Simulation Modelling Practice and Theory 19 (1) (2011) 540–552.

[20] G. L. Stavrinides, H. D. Karatza, The impact of data locality on the performance of a SaaS cloud with real-time data-intensive applications, in: Proceedings of the 21st IEEE/ACM International Symposium on Distributed Simulation and Real Time Applications (DS-RT'17), 2017, pp. 1–8. doi:10.1109/DISTRA.2017.8167683.

[21] R. N. Calheiros, R. Buyya, Energy-efficient scheduling of urgent bag-of-tasks applications in clouds through DVFS, in: Proceedings of the 6th IEEE International Conference on Cloud Computing Technology and Science (CloudCom'14), 2014, pp. 342–349. doi:10.1109/CloudCom.2014.20.

[22] A. Beloglazov, J. Abawajy, R. Buyya, Energy-aware resource allocation heuristics for efficient management of data centers for cloud computing, Future Generation Computer Systems 28 (5) (2012) 755–768. doi:10.1016/j.future.2011.04.017.

[23] G. L. Stavrinides, H. D. Karatza, Scheduling data-intensive workloads in large-scale distributed systems: trends and challenges, 1st Edition, Vol. 36 of Studies in Big Data, Springer, 2018, Ch. 2, pp. 19–43. doi:10.1007/978-3-319-73767-6_2.

[24] D. Mukherjee, S. Dhara, S. C. Borst, J. S. van Leeuwaarden, Optimal service elasticity in large-scale distributed systems, Proceedings of the ACM on Measurement and Analysis of Computing Systems 1 (1) (2017) 25:1–25:28. doi:10.1145/3084463.