Accepted Manuscript

The Impact of Workload Variability on the Energy Efficiency of Large-Scale Heterogeneous Distributed Systems
Georgios L. Stavrinides, Helen D. Karatza

PII: S1569-190X(18)30140-0
DOI: https://doi.org/10.1016/j.simpat.2018.09.013
Reference: SIMPAT 1860

To appear in: Simulation Modelling Practice and Theory

Received date: 18 August 2018
Revised date: 21 September 2018
Accepted date: 24 September 2018

Please cite this article as: Georgios L. Stavrinides, Helen D. Karatza, The Impact of Workload Variability on the Energy Efficiency of Large-Scale Heterogeneous Distributed Systems, Simulation Modelling Practice and Theory (2018), doi: https://doi.org/10.1016/j.simpat.2018.09.013

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
The Impact of Workload Variability on the Energy Efficiency of Large-Scale Heterogeneous Distributed Systems

Georgios L. Stavrinides∗, Helen D. Karatza

Department of Informatics, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
Abstract

Previous studies have shown that workload variability has a serious impact on the performance of large-scale distributed architectures, since it may cause significant fluctuations in service demands. Energy efficiency is one of the aspects of such platforms that are of paramount importance, and therefore it is imperative to investigate how it may also be affected by this factor. Towards this direction, in this paper we investigate via simulation the impact of workload variability, in terms of computational volume and interarrival times, on the energy consumption of a large-scale heterogeneous distributed system. The workload consists of real-time bag-of-tasks jobs that arrive dynamically at the system. The execution rate and power consumption characteristics of the processors are modeled after real-world processors, according to the Standard Performance Evaluation Corporation (SPEC) Power benchmark. Four heuristics are employed for the scheduling of the workload: two commonly used baseline policies and two proposed energy-aware heuristics. The simulation results reveal that workload variability has a significant impact on the energy consumption of the system and that the severity of the impact depends on the employed scheduling technique.

Keywords: Energy Efficiency, Workload Variability, Large-Scale Distributed Systems, Heterogeneity, Real-Time Jobs

∗ Corresponding author. Tel.: +30 2310 997974.
Email addresses: [email protected] (Georgios L. Stavrinides), [email protected] (Helen D. Karatza)

Preprint submitted to Simulation Modelling Practice and Theory, September 24, 2018
1. Introduction
In recent years, energy consumption has become one of the main challenges of large-scale distributed architectures. The ever-growing demand for effective and highly parallel processing, along with rapid technological advancements, has driven distributed systems to unprecedented scales and computational capabilities. Furthermore, such infrastructures gradually grow more heterogeneous due to budget-constrained upgrade and replacement cycles, encompassing processors of different generations with various energy consumption characteristics. Inevitably, these systems continue to show a dramatic increase in their carbon footprint. Due to the heterogeneity and inherent complexity of such platforms, their performance is usually evaluated via simulation, rather than by analytical methods. Analytical modeling in such cases would require many simplifying assumptions that could yield misleading conclusions.

The jobs processed on such platforms typically feature a high degree of parallelism, consisting of independent tasks that can be executed in any order, without any intertask communication among them. Such jobs are commonly referred to as bags-of-tasks. Prominent examples are image processing, chromosome mapping, massive searches and data mining jobs [1]. Quality of Service (QoS), often expressed in terms of timeliness, is another important aspect of such systems. That is, jobs are often real-time, having a deadline within which they must complete their execution. Since a late result would be useless, in case a job fails to meet its deadline, it is aborted from the system [2, 3, 4, 5].

1.1. Motivation

Previous studies have shown that workload variability has a serious impact on the performance of large-scale distributed architectures [6, 7, 8]. It may cause significant fluctuations in service demands, as their low, average and peak values can differ by several orders of magnitude. However, previous studies ignored the impact of these fluctuations on the energy consumption of the system. Since energy efficiency is one of the aspects of such platforms that are of paramount importance, it is imperative to investigate how it may also be affected by this factor.

1.2. Contribution

Towards this direction, in this paper we investigate via simulation the impact of workload variability, in terms of computational volume and interarrival times, on the energy consumption of a large-scale heterogeneous distributed system. The workload consists of real-time bag-of-tasks jobs that arrive dynamically at the system. The execution rate and power consumption characteristics of the processors are modeled after real-world Intel processors, according to the Standard Performance Evaluation Corporation (SPEC) SPECpower_ssj2008 benchmark, which is the first industry-standard benchmark that evaluates the power and performance characteristics of servers [9]. Four heuristics are employed for the scheduling of the workload: two commonly used baseline policies and two proposed energy-aware heuristics.
2. Background and Related Work
In this paper, the interarrival time and computational volume variability are expressed in terms of the coefficient of variation of the corresponding random variables. The coefficient of variation CV of a random variable X is the ratio of its standard deviation σ to its mean X̄ and determines the variability of its values in relation to the mean (i.e. CV = σ/X̄). When CV = 1, X is exponentially distributed with mean X̄ and exhibits moderate variability in its values. On the one hand, when CV = 0.5, X features low variability, compared to the exponential distribution. In this case, X follows a hypoexponential distribution with mean X̄ and its values are closer to the mean X̄, compared to the exponential distribution case. On the other hand, when CV = 1.5, X features high variability in its values, compared to the exponential distribution. In this case, X follows a hyperexponential distribution with mean X̄ and takes a large number of very small values and a small number of very large values, compared to the mean X̄.

Many research efforts have focused on the effects of workload variability in distributed architectures [10, 6, 7]. However, none of these works investigate the impact of workload variability on the energy efficiency of the computing resources. The work presented in this paper differs from our previous work in [6, 7], since in this case the energy consumption of the system is taken into account, in contrast to our previous efforts, which only considered the impact of workload variability on the provided QoS. Furthermore, in this study the workload variability is expressed in terms of both computational volume and interarrival times, whereas our previous studies only considered the variability in the computational volume of the workload. Moreover, this study differs from our previous work in [11], since the main goal of this study is to investigate the impact of workload variability on the energy efficiency of the system, whereas the workload variability was neither modeled nor examined in [11]. Furthermore, the system model and the proposed energy-aware heuristics employed in this paper are different from the ones used in all of our previous research efforts.
Two of the most widely used scheduling heuristics for bags-of-tasks are the Min-Min and Max-Min policies [12]. Both heuristics are based on a two-step iterative process. In the first step, the minimum completion time of each unassigned task is calculated over all of the processors in the system. In the second step, the task with the minimum (in the case of Min-Min) or maximum (in the case of Max-Min) minimum completion time is assigned to the corresponding processor, hence the name of each algorithm. At each iteration, the current load of the processors is taken into account for the next scheduling decision. Several energy-aware variants of the above algorithms have been proposed in the literature. However, the vast majority of them compromise system performance in order to achieve energy savings [13, 14, 15, 16]. The energy-aware heuristics proposed in this paper improve both the energy consumption of the system and the provided QoS.
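The two-step process can be sketched in Python (an illustrative sketch only; the paper's simulator is written in C++, and the deadline-ordered queue insertion that determines when a processor can start a task is abstracted here into a single per-processor ready time):

```python
def schedule_job(task_volumes, rates, ready, max_min=False):
    """Sketch of the Min-Min / Max-Min two-step loop for one bag-of-tasks job.

    task_volumes: w_i of each task; rates: mu_j of each processor;
    ready: mutable list of processor ready times. Returns {task index: proc index}.
    """
    unassigned = list(range(len(task_volumes)))
    assignment = {}
    while unassigned:
        # Step 1: minimum estimated completion time of every unassigned task.
        best = {}
        for i in unassigned:
            cts = [ready[j] + task_volumes[i] / rates[j] for j in range(len(rates))]
            j = min(range(len(rates)), key=lambda k: cts[k])
            best[i] = (cts[j], j)
        # Step 2: pick the task with the minimum (Min-Min) or maximum (Max-Min)
        # of these minimum completion times and assign it to its processor.
        pick = (max if max_min else min)(unassigned, key=lambda i: best[i][0])
        ct, j = best[pick]
        assignment[pick] = j
        ready[j] = ct  # the processor load now reflects this decision
        unassigned.remove(pick)
    return assignment
```

For example, with task volumes [4, 2] and execution rates [1, 2], Min-Min places both tasks on the fast processor one after the other, whereas Max-Min schedules the long task first and the short task then runs in parallel on the slower processor.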
Figure 1: The queueing model of the target distributed system.
3. Problem Definition

3.1. System Model

The target distributed system consists of a set Q = {p1, p2, ..., pq} of q fully connected heterogeneous processors. Each processor pi has its own memory and serves its own local queue of tasks, with an execution rate µi. There is a central scheduler with a global waiting queue, running on a dedicated processor, responsible for scheduling the component tasks of the submitted jobs to the other processors of the system. A centralized scheduler is utilized in this study, as it is simpler and easier to implement than a distributed scheduler, which would require additional communication and synchronization overhead. Furthermore, with a centralized scheduler that has global information on the state of the whole system, better scheduling decisions can be made, ultimately providing better QoS [17]. The scheduler is invoked upon the arrival of each job. Figure 1 shows the queueing model of the system under study.
3.2. Workload Model

Real-time jobs arrive dynamically at the central scheduler, with interarrival time a. The interarrival time of the jobs follows a distribution with a coefficient of variation CVa and a mean ā. Each job is a bag-of-tasks, consisting of a set J = {t1, t2, ..., tn} of n independent, parallel tasks, without intertask communication or precedence constraints among them. n is uniformly distributed in the range [nmin, nmax]. There are no dependencies between the submitted jobs. The component tasks of the jobs are not preemptible, as preemption of tasks with time constraints may eventually lead to performance degradation [18, 19, 20].

Each component task ti of a job has a weight wi that denotes its computational volume, i.e. the amount of computational operations required by the particular task. This information can be obtained by using profiling or from historical data of previous executions of the particular job [21]. The computational volume of the tasks follows a distribution with a coefficient of variation CVw and a mean w̄.

The computational cost of a task ti on a processor pj of the system is defined as:

    Comp(ti, pj) = wi / µj        (1)

where µj is the execution rate of processor pj. The length L of a job is equal to the largest average computational cost of its component tasks, over all of the processors in the system.
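As a toy illustration of Eq. (1) and of the job length L (the volumes and rates below are made up for the example, not taken from Table 1):

```python
rates = [0.5, 1.0, 2.0]     # mu_j of three processors
volumes = [4.0, 8.0]        # w_i of a two-task job

# Comp(ti, pj) = wi / mu_j, Eq. (1)
comp = [[w / mu for mu in rates] for w in volumes]
# Average computational cost of each task over all processors.
avg_cost = [sum(row) / len(row) for row in comp]
# Job length L: the largest average computational cost among the tasks.
job_length = max(avg_cost)
```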
Each job has an end-to-end deadline D:

    D = AT + RD        (2)

where AT is the arrival time of the job and RD is its relative deadline. RD is uniformly distributed in the range [L, 2L]. When a job fails to complete its execution within its deadline, it is aborted and thus considered lost.

3.3. Energy Consumption Model
In this paper we focus on the energy consumption of the processors, since compared to other system components, they typically consume the largest amount of energy [22, 23]. The power P^i consumed by a processor pi in the system is given by:

    P^i(u_i(τ)) = P^i_idle + (P^i_max − P^i_idle) · u_i(τ)        (3)

where P^i_idle and P^i_max are the power consumed by the processor when idle and at 100% utilization, respectively, whereas u_i(τ) is the utilization of the processor as a function of time. Consequently, the energy consumption E^i of a processor pi over a period of time [τ0, τ1] can be defined as the integral of its power consumption P^i [22, 11]:

    E^i = ∫_{τ0}^{τ1} P^i(u_i(τ)) dτ        (4)
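The linear power model of Eq. (3) and the integral of Eq. (4) can be illustrated with a short sketch; assuming piecewise-constant utilization, the integral reduces to a sum. The idle and peak figures used in the example are the medium-processor values that appear later in Table 1:

```python
def power(u, p_idle, p_max):
    """Power draw in watts at utilization u in [0, 1], Eq. (3)."""
    return p_idle + (p_max - p_idle) * u

def energy_kwh(segments, p_idle, p_max):
    """Energy in kWh over piecewise-constant (duration_s, utilization) segments.

    With piecewise-constant utilization, Eq. (4) reduces to a weighted sum.
    """
    joules = sum(power(u, p_idle, p_max) * dt for dt, u in segments)
    return joules / 3.6e6  # 1 kWh = 3.6e6 J

# One hour fully loaded plus one hour idle on a medium processor.
e = energy_kwh([(3600.0, 1.0), (3600.0, 0.0)], p_idle=23.26, p_max=133.62)
```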
3.4. Scheduling Heuristics

Four bag-of-tasks scheduling heuristics are employed: two baseline non-energy-aware policies, Min-Min and Max-Min, and two proposed energy-aware heuristics, Min-EAMin and Max-EAMin. The first two policies were selected due to their widespread use, whereas the last two were proposed in order to investigate the impact of workload variability on the energy consumption of the system also in the case where energy-aware scheduling heuristics are employed.

3.4.1. Min-Min & Max-Min
The baseline Min-Min and Max-Min scheduling policies were adapted in order to take into account the deadlines of the jobs during the scheduling process. Specifically, in the first step of each iteration of the scheduling algorithm, the estimated completion time CT of each component task ti of a newly arrived job is calculated on each processor pj of the target system:

    CT(ti, pj) = τ_idle(ti, pj) + Comp(ti, pj)        (5)

where τ_idle(ti, pj) is the time at which processor pj will be able to execute task ti. It depends on the position in which task ti would be placed in pj's queue, if it were actually assigned to that particular processor. This position is determined according to the deadline D of ti's job, so that task ti would precede other tasks in the queue with later deadlines.

Subsequently, the minimum estimated completion time MCT of each task ti over all of the processors in the system is calculated, as follows:

    MCT(ti) = min_{pj∈Q} {CT(ti, pj)}        (6)

In the second step of each iteration of the scheduling process, the task with the minimum (in the Min-Min case) or maximum (in the Max-Min case) MCT among all of the tasks of the job is selected and assigned to the corresponding processor. Each iteration of the scheduling process takes into account the current load of the processors, as it results from the scheduling decision of the previous iteration. The iterations continue until all of the component tasks of the newly arrived job are scheduled.

3.4.2. Min-EAMin & Max-EAMin
The Min-EAMin and Max-EAMin scheduling heuristics are modified versions of the Min-Min and Max-Min policies described above, which take into account the energy consumption of the processors during the scheduling process (the EA initials in the name of the algorithms stand for the term energy-aware). The energy-awareness of the algorithms is achieved by using an enhanced version of the heuristic used in our previous work in [11]. During the first step, along with the estimated completion time CT of each component task ti on each processor pj, the estimated energy consumption E^j of each processor is also determined, calculated based on the amount of energy required to finish the execution of all of pj's currently assigned tasks, including task ti. It is given by:

    E^j = P^j_max · (CT(ti, pj) − τ)        (7)

where P^j_max is the power consumption of processor pj at 100% utilization, whereas τ is the current time.

Subsequently, the minimum estimated completion time MCT of task ti is calculated, as in the case of Min-Min and Max-Min. Thereafter, we determine the set of processors that can provide task ti with an estimated completion time that satisfies the following condition:

    CT(ti, pj) ≤ MCT(ti) + φ · σ_CT(ti)        (8)

where φ is the selection margin coefficient, which is constant for all tasks, and σ_CT(ti) is the standard deviation of the estimated completion time of task ti over the processors in the system. It is given by:

    σ_CT(ti) = sqrt( (1/q) · Σ_{pj∈Q} (CT(ti, pj) − C̄T)² )        (9)

where C̄T is the mean estimated completion time of task ti on the processors of the system, whereas q is the number of processors in the system. Condition (8) constitutes the main enhancement of this heuristic over the one we proposed in [11], where the same fixed margin was always used for the determination of the set of processors.

From this set, the processor pk that has the minimum estimated energy consumption is selected for task ti. Consequently, the minimum estimated completion time of the task is recalculated as MCT′(ti) = CT(ti, pk). The second step of each iteration is performed in the same manner as in the case of Min-Min and Max-Min, respectively. That is, the task with the minimum (in the Min-EAMin case) or maximum (in the Max-EAMin case) MCT′ is selected and assigned to the corresponding processor.
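The processor-selection step of Eqs. (7)-(9) can be sketched as follows (an illustrative Python fragment; `cts`, `p_max` and `now` are hypothetical inputs holding the values CT(ti, pj), P^j_max and the current time τ):

```python
import math

def select_processor(cts, p_max, now, phi=0.25):
    """Pick the processor for task ti among those within the selection margin."""
    q = len(cts)
    mct = min(cts)                                   # MCT(ti), Eq. (6)
    mean_ct = sum(cts) / q
    sigma = math.sqrt(sum((ct - mean_ct) ** 2 for ct in cts) / q)  # Eq. (9)
    # Candidate set: processors satisfying condition (8).
    candidates = [j for j, ct in enumerate(cts) if ct <= mct + phi * sigma]
    # Estimated energy to drain pj's queue including ti, Eq. (7).
    return min(candidates, key=lambda j: p_max[j] * (cts[j] - now))
```

For cts = [10.0, 10.5, 20.0] and peak powers [226.56, 133.62, 48.87] W, a plain Min-Min step would pick processor 0, whereas the energy-aware selection picks processor 1, which finishes only slightly later at a much lower power draw.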
4. Experimental Evaluation
In order to investigate the impact of workload variability on the energy efficiency of the target system, the coefficient of variation of the interarrival time of the jobs was considered to be CVa = {0.5, 1, 1.5}. Similarly, the coefficient of variation of the computational volume of the component tasks was considered to be CVw = {0.5, 1, 1.5}. In the first set of experiments, where CVa = {0.5, 1, 1.5}, the computational volume of the component tasks (the secondary parameter in this case) was considered to be exponentially distributed, i.e. CVw = 1. Correspondingly, in the second set of experiments, where CVw = {0.5, 1, 1.5}, the interarrival time of the jobs (the secondary parameter in this case) was considered to be exponentially distributed, i.e. CVa = 1. The exponential distribution was chosen for the secondary parameter in each set of experiments, since in a large-scale distributed system with no workload fluctuations the service and interarrival times are typically exponentially distributed [24].
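The three variability levels can be generated, for instance, as sketched below. The paper does not state which hypoexponential and hyperexponential parameterizations were used, so this sketch makes a common choice: an Erlang-k stage distribution (a special case of the hypoexponential) for CV < 1 and a two-phase balanced-means hyperexponential for CV > 1:

```python
import math
import random

def sample_with_cv(mean, cv, rng=random):
    """Draw one value with the given mean and coefficient of variation cv."""
    if cv == 1:
        return rng.expovariate(1 / mean)            # exponential
    if cv < 1:
        k = round(1 / cv ** 2)                      # Erlang stages; cv=0.5 -> k=4
        return sum(rng.expovariate(k / mean) for _ in range(k))
    scv = cv ** 2                                   # squared coefficient of variation
    p = 0.5 * (1 + math.sqrt((scv - 1) / (scv + 1)))
    lam = (2 * p / mean, 2 * (1 - p) / mean)        # balanced means: p/lam1 = (1-p)/lam2
    return rng.expovariate(lam[0] if rng.random() < p else lam[1])
```

Sampling a few hundred thousand interarrival times with mean ā = 43.48 and cv = 1.5 reproduces the target mean and coefficient of variation to within a few percent.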
4.1. Metrics

In addition to the energy consumption, which is the main metric in this study, a secondary metric was employed as well, in order to evaluate the system performance in terms of QoS. Specifically, the following two metrics were used:

1. Total Energy Consumption, which is the total energy, in kilowatt-hours (kWh), consumed by all of the processors of the system under study, during the observed time period.

2. Deadline Miss Ratio, which is the ratio of the number of jobs that did not finish their execution within their deadline (and were thus lost), over the number of all of the jobs that arrived at the central scheduler of the system, during the observed time period.

4.2. Simulation Setup
We implemented our own discrete-event simulator in C++, tailored to the specific requirements of the particular case study. In order to directly control the workload variability and obtain unbiased, general results, not applicable only to particular workload traces, a synthetic workload was used. The input parameters of the simulation model are shown in Table 1.

The heterogeneity of the processors in the system was achieved through the utilization of three sets of processors, categorized according to their computational capacity. Specifically, 25% of the processors were considered to be small, 50% of the processors were considered to be medium, whereas the remaining 25% of the processors were considered to be large. The execution rate and power consumption characteristics of the small, medium and large processors were modeled after the Intel Xeon E3-1260L v5, Intel Xeon E5-2699 v4 and Intel Xeon Platinum 8180 processors, respectively, according to the SPECpower_ssj2008 benchmark. The average values per processor for benchmark results published in the period January 1st, 2016 to January 1st, 2018 were used, as shown in Table 1. According to the benchmark, the execution rate of the processors is expressed in Server-Side Java Operations Per Second (SSJ OPS), whereas the power consumption is expressed in watts (W).

The mean computational volume of the component tasks of the jobs was chosen to be w̄ = 5.22 · 10^8 Server-Side Java Operations (SSJ OPs), so that, on average, the computational cost of a task on a medium processor was approximately equal to 5 minutes. The mean interarrival time of the jobs was chosen to be ā = 43.48, in order for the system to be stable. The selection margin coefficient was selected to be φ = 0.25, since after conducting simulation runs with values in the range [0.05, 1.50] using 0.05 increments, this value gave the best results in terms of both energy efficiency and QoS.

We ran 30 replications of the simulation with different random number seeds for each set of input parameters. Each replication was terminated when 10^6 jobs had been completed. We found by experimentation that this simulation run length was sufficiently long to minimize the effect of warm-up time. For every mean value, a 95% confidence interval was evaluated. The half-widths of all of the confidence intervals were less than 5% of their respective mean values.

4.3. Simulation Results
The impact of interarrival time variability on the energy consumption of the distributed system under study is illustrated in Fig. 2. Figure 3 shows its effect on the deadline miss ratio metric. The simulation results indicate that as the variability in the interarrival times increases, the total energy consumption of the system, as well as the deadline miss ratio, increase, irrespective of the employed scheduling policy. On the other hand, as shown in Fig. 4 and Fig. 5, the variability in the computational volumes of the component tasks has a different impact on the energy efficiency of the system and the provided QoS, which depends on the employed scheduling heuristic. Specifically, in the case of the baseline Min-Min and Max-Min policies, the energy consumption of the system, as well as the deadline miss ratio, increase with the increase of the computational volume variability. On the contrary, in the case of the energy-aware policies Min-EAMin and Max-EAMin, both metrics decrease as the computational volume variability increases. In absolute values, when CVa = 0.5 all
Table 1: Input parameters of the simulation model.

Parameter                                                   Value
Number of completed jobs                                    10^6
Number of tasks per job                                     n ~ U[nmin, nmax]
Minimum number of tasks per job                             nmin = 1
Maximum number of tasks per job                             nmax = 64
Mean job interarrival time                                  ā = 43.48
Coefficient of variation of job interarrival time           CVa = {0.5, 1, 1.5}
Job relative deadline                                       RD ~ U[L, 2L]
Mean task computational volume                              w̄ = 5.22 · 10^8 SSJ OPs
Coefficient of variation of task computational volume       CVw = {0.5, 1, 1.5}
Total number of processors                                  q = 256
Number of small processors                                  qs = 64
Number of medium processors                                 qm = 128
Number of large processors                                  ql = 64
Small processor execution rate                              µs = 0.51 · 10^6 SSJ OPS
Medium processor execution rate                             µm = 1.74 · 10^6 SSJ OPS
Large processor execution rate                              µl = 2.86 · 10^6 SSJ OPS
Small processor power consumption when idle                 P^s_idle = 16.37 W
Small processor power consumption @ 100% utilization        P^s_max = 48.87 W
Medium processor power consumption when idle                P^m_idle = 23.26 W
Medium processor power consumption @ 100% utilization       P^m_max = 133.62 W
Large processor power consumption when idle                 P^l_idle = 26.43 W
Large processor power consumption @ 100% utilization        P^l_max = 226.56 W
Selection margin coefficient                                φ = 0.25
Scheduling policy                                           Min-Min, Max-Min, Min-EAMin, Max-EAMin
scheduling algorithms give a smaller total energy consumption and deadline miss ratio, compared to the case where CVw = 0.5. On the other hand, when CVa = 1.5, all policies give a larger value for both metrics, compared to the case where CVw = 1.5.

It can be observed that the Max-Min and Max-EAMin policies outperform their corresponding Min-Min and Min-EAMin heuristics. This is due to the fact that Max-Min and Max-EAMin are likely to give a smaller total execution time for a job than Min-Min and Min-EAMin, respectively, since they schedule the component tasks with larger execution times at earlier iterations. Therefore, there is a higher probability that the tasks with smaller execution times that are subsequently scheduled will be executed in parallel with the tasks with larger execution times. Consequently, with Max-Min and Max-EAMin fewer jobs miss their deadline and thus less energy is wasted than in the case of Min-Min and Min-EAMin, respectively. Min-EAMin and Max-EAMin yield better results than their corresponding baseline policies, Min-Min and Max-Min, since they take into account the energy consumption of the processors at each scheduling decision. Overall, in all cases of workload variability, the proposed Max-EAMin policy outperforms the other heuristics.

The impact of workload variability, in terms of percentage change (increase or decrease) compared to the exponential case of each workload parameter (interarrival time and computational volume), is summarized in Table 2. It can be observed that the energy-aware heuristics Min-EAMin and Max-EAMin are more sensitive to changes in the interarrival time variability, exhibiting larger decreases and increases, compared to the baseline policies. On the other hand, while Min-EAMin and Max-EAMin exhibit greater sensitivity to a decrease of the computational volume variability, compared to Min-Min and Max-Min, they are less sensitive to an increase of the computational volume variability, compared to the baseline heuristics.

Table 2: Impact of workload variability (% change compared to exponential case). TEC is the Total Energy Consumption metric, whereas DMR is the Deadline Miss Ratio metric.
Policy       CVa    TEC      DMR       CVw    TEC      DMR
Min-Min      0.5    −1.25    −2.93     0.5    −0.02    −0.50
Min-EAMin    0.5    −1.95    −66.17    0.5    3.10     50.42
Max-Min      0.5    −1.92    −14.77    0.5    −1.65    −14.18
Max-EAMin    0.5    −2.16    −20.50    0.5    2.20     122.54
Min-Min      1.5    7.29     36.79     1.5    4.55     19.47
Min-EAMin    1.5    7.63     113.02    1.5    −0.53    −7.20
Max-Min      1.5    6.36     71.67     1.5    3.43     34.09
Max-EAMin    1.5    7.05     315.87    1.5    −1.18    −32.43
Figure 2: Total Energy Consumption vs. CVa.

Figure 3: Deadline Miss Ratio vs. CVa.

Figure 4: Total Energy Consumption vs. CVw.

Figure 5: Deadline Miss Ratio vs. CVw.
5. Conclusions and Future Work

In this paper we investigated via simulation the impact of workload variability, in terms of computational volume and interarrival times, on the energy consumption of a large-scale heterogeneous distributed system. For comparison purposes, a secondary metric was employed as well, in order to evaluate the system performance in terms of QoS. Four heuristics were employed for the scheduling of the workload: two commonly used baseline policies, Min-Min and Max-Min, and two proposed energy-aware heuristics, Min-EAMin and Max-EAMin. The simulation results demonstrate that the workload variability has a significant impact on the energy consumption of the system, as well as on the provided QoS, and that the severity of the impact depends on the employed scheduling technique.

Our future work plans include the evaluation of the energy efficiency of the system under various processor heterogeneity degrees, utilizing different scheduling heuristics.
References

[1] F. A. B. Silva, H. Senger, Scalability limits of bag-of-tasks applications running on hierarchical platforms, Journal of Parallel and Distributed Computing 71 (6) (2011) 788–801. doi:10.1016/j.jpdc.2011.01.002.

[2] G. L. Stavrinides, H. D. Karatza, The impact of input error on the scheduling of task graphs with imprecise computations in heterogeneous distributed real-time systems, in: Proceedings of the 18th International Conference on Analytical and Stochastic Modelling Techniques and Applications (ASMTA'11), 2011, pp. 273–287. doi:10.1007/978-3-642-21713-5_20.

[3] G. L. Stavrinides, H. D. Karatza, Scheduling real-time jobs in distributed systems - simulation and performance analysis, in: Proceedings of the 1st International Workshop on Sustainable Ultrascale Computing Systems (NESUS'14), 2014, pp. 13–18.

[4] Y. Chen, W. T. Tsai, Service-Oriented Computing and Web Software Integration: From Principles to Development, 5th Edition, Kendall Hunt Publishing, 2015.

[5] G. L. Stavrinides, F. R. Duro, H. D. Karatza, J. G. Blas, J. Carretero, Different aspects of workflow scheduling in large-scale distributed systems, Simulation Modelling Practice and Theory 70 (2017) 120–134. doi:10.1016/j.simpat.2016.10.009.

[6] G. L. Stavrinides, H. D. Karatza, Scheduling real-time bag-of-tasks applications with approximate computations in SaaS clouds, Concurrency and Computation: Practice and Experience (2017) e4208. doi:10.1002/cpe.4208.

[7] G. L. Stavrinides, H. D. Karatza, The effect of workload computational demand variability on the performance of a SaaS cloud with a multi-tier SLA, in: Proceedings of the IEEE 5th International Conference on Future Internet of Things and Cloud (FiCloud'17), 2017, pp. 10–17. doi:10.1109/FiCloud.2017.26.

[8] G. L. Stavrinides, H. D. Karatza, Energy-aware scheduling of real-time workflow applications in clouds utilizing DVFS and approximate computations, in: Proceedings of the IEEE 6th International Conference on Future Internet of Things and Cloud (FiCloud'18), 2018, pp. 33–40. doi:10.1109/FiCloud.2018.00013.

[9] Standard Performance Evaluation Corporation, SPECpower_ssj2008, https://www.spec.org/power_ssj2008/, accessed: 13 Jul 2018 (2018).

[10] A. Iosup, O. Sonmez, S. Anoep, D. Epema, The performance of bags-of-tasks in large-scale distributed systems, in: Proceedings of the 17th International Symposium on High Performance Distributed Computing (HPDC'08), 2008, pp. 97–108. doi:10.1145/1383422.1383435.

[11] G. L. Stavrinides, H. D. Karatza, Simulation-based performance evaluation of an energy-aware heuristic for the scheduling of HPC applications in large-scale distributed systems, in: Proceedings of the 8th ACM/SPEC International Conference on Performance Engineering (ICPE'17), 3rd International Workshop on Energy-aware Simulation (ENERGY-SIM'17), 2017, pp. 49–54. doi:10.1145/3053600.3053611.

[12] O. H. Ibarra, C. E. Kim, Heuristic algorithms for scheduling independent tasks on nonidentical processors, Journal of the ACM 24 (2) (1977) 280–289. doi:10.1145/322003.322011.

[13] Y. Li, Y. Liu, D. Qian, A heuristic energy-aware scheduling algorithm for heterogeneous clusters, in: Proceedings of the 15th International Conference on Parallel and Distributed Systems (ICPADS'09), 2009, pp. 407–413. doi:10.1109/ICPADS.2009.33.

[14] C. O. Diaz, M. Guzek, J. E. Pecero, P. Bouvry, S. U. Khan, Scalable and energy-efficient scheduling techniques for large-scale systems, in: Proceedings of the IEEE 11th International Conference on Computer and Information Technology (CIT'11), 2011, pp. 641–647. doi:10.1109/CIT.2011.106.

[15] P. Lindberg, J. Leingang, D. Lysaker, K. Bilal, S. U. Khan, P. Bouvry, N. Ghani, N. Min-Allah, J. Li, Comparison and analysis of greedy energy-efficient scheduling algorithms for computational grids, John Wiley & Sons, 2012, Ch. 7, pp. 189–214. doi:10.1002/9781118342015.ch7.

[16] S. Nesmachnow, B. Dorronsoro, J. E. Pecero, P. Bouvry, Energy-aware scheduling on multicore heterogeneous grid computing systems, Journal of Grid Computing 11 (4) (2013) 653–680. doi:10.1007/s10723-013-9258-3.

[17] X. Zhu, X. Qin, M. Qiu, QoS-aware fault-tolerant scheduling for real-time tasks on heterogeneous clusters, IEEE Transactions on Computers 60 (6) (2011) 800–812. doi:10.1109/TC.2011.68.

[18] G. C. Buttazzo, Hard Real-Time Computing Systems: Predictable Scheduling Algorithms and Applications, 3rd Edition, Springer, 2011. doi:10.1007/978-1-4614-0676-1.

[19] G. L. Stavrinides, H. D. Karatza, Scheduling multiple task graphs in heterogeneous distributed real-time systems by exploiting schedule holes with bin packing techniques, Simulation Modelling Practice and Theory 19 (1) (2011) 540–552.

[20] G. L. Stavrinides, H. D. Karatza, The impact of data locality on the performance of a SaaS cloud with real-time data-intensive applications, in: Proceedings of the 21st IEEE/ACM International Symposium on Distributed Simulation and Real Time Applications (DS-RT'17), 2017, pp. 1–8. doi:10.1109/DISTRA.2017.8167683.

[21] R. N. Calheiros, R. Buyya, Energy-efficient scheduling of urgent bag-of-tasks applications in clouds through DVFS, in: Proceedings of the 6th IEEE International Conference on Cloud Computing Technology and Science (CloudCom'14), 2014, pp. 342–349. doi:10.1109/CloudCom.2014.20.

[22] A. Beloglazov, J. Abawajy, R. Buyya, Energy-aware resource allocation heuristics for efficient management of data centers for cloud computing, Future Generation Computer Systems 28 (5) (2012) 755–768. doi:10.1016/j.future.2011.04.017.

[23] G. L. Stavrinides, H. D. Karatza, Scheduling data-intensive workloads in large-scale distributed systems: trends and challenges, 1st Edition, Vol. 36 of Studies in Big Data, Springer, 2018, Ch. 2, pp. 19–43. doi:10.1007/978-3-319-73767-6_2.

[24] D. Mukherjee, S. Dhara, S. C. Borst, J. S. van Leeuwaarden, Optimal service elasticity in large-scale distributed systems, Proceedings of the ACM on Measurement and Analysis of Computing Systems 1 (1) (2017) 25:1–25:28. doi:10.1145/3084463.