A new energy-aware task scheduling method for data-intensive applications in the cloud


Qing Zhao a,*, Congcong Xiong a, Ce Yu b, Chuanlei Zhang a, Xi Zhao a

a School of Computer Science and Information Technology, Tianjin University of Science and Technology, 300222 Tianjin, China
b School of Computer Science and Technology, Tianjin University, 300072 Tianjin, China

* Corresponding author. E-mail address: [email protected] (Q. Zhao).


Keywords: Energy-aware scheduling; Data-intensive application; SLA violation rate; Data correlation

Abstract: Maximizing energy efficiency while ensuring the user's Service-Level Agreement (SLA) is very important both for environmental protection and for profit maximization of cloud service providers. In this paper, an energy- and deadline-aware task scheduling method for data-intensive applications is proposed. In this method, first, the datasets and tasks are modeled as a binary tree by a data correlation clustering algorithm, in which both the data correlations generated from the initial datasets and those from the intermediate datasets are considered. Hence, the amount of global data transmission can be reduced greatly, which is beneficial to the reduction of the SLA violation rate. Second, a "Tree-to-Tree" task scheduling approach based on the calculation of the Task Requirement Degree (TRD) is proposed, which can improve the energy efficiency of the whole cloud system by optimizing the utilization of its computing resources and network bandwidth. Experimental results show that the power consumption of the cloud system can be reduced efficiently while maintaining a low SLA violation rate.

1. Introduction

With the arrival of Big Data, many applications with a large amount of data have been abstracted as scientific workflows and run on cloud platforms. Cloud computing has been envisioned as the next-generation computing paradigm because of its advantages in powerful computing capacity and low application cost (Buyya, 2008; Armbrust et al., 2010; Pedram, 2012). However, the growing quantity of cloud data centers has greatly increased the total energy consumption in the world, which has become a critical environmental issue because of high carbon emissions. On the other hand, high power consumption is also a big problem in terms of economic cost from the perspective of cloud service providers. Researchers (Qureshi et al., 2009) have found that a 3% reduction in energy cost for a large company like Google can translate into over a million dollars in cost savings. High energy consumption not only translates to high cost but also leads to high carbon emissions, which are not environmentally friendly. Therefore, the problem of energy-aware performance optimization has attracted significant attention. In the literature, many existing works (Gorbenko and Popov, 2012; Fard et al., 2012) have shown that the task scheduling strategy is crucial for the overall energy efficiency of cloud workflow systems.


The high ratio of under-loaded machines is the main reason for low energy efficiency. Researchers (Chen et al., 2008) have found that even during idle periods, most of today's servers consume up to 50% of their peak power. Virtualization technology (Barham et al., 2003) is a significant technology for improving the utilization of resources, and it is also the technical foundation of cloud computing. Therefore, the virtual machine (VM) is the basic deployment unit in this paper. A reasonable task scheduling strategy based on VM placement can make hosts work at a proper load so as to achieve the objective of high energy efficiency.

With the development of precision instrument technologies, many scientific research fields, such as astronomy, meteorology, and bioinformatics, have accumulated vast amounts of scientific data. To solve the energy problem of these data-intensive applications, a specialized task scheduling method is presented in this paper. The scheduling principles can be derived from the characteristics of data-intensive applications as follows:

1. Decrease network traffic through rational data layout and task scheduling. I/O operations are the most time-consuming part of data-intensive applications in the cloud. A bad task scheduling strategy can greatly increase the amount of data transmission and would directly result in an increased task response time. Then, to satisfy the user's service-level agreement (SLA), the cloud system would have to allocate more computing resources


to applications to reduce their computation time. An indirect result is that the energy consumption on the server side would increase. On the other hand, frequent data transmission also leads to a large amount of power consumption. As shown in (Cavdar and Alagoz, 2012), network devices consume up to 1/3 of the total energy consumption (excluding cooling equipment), and network conflicts are the main reason for this high consumption. Frequent movement of large amounts of data will inevitably prolong the total network transmission time and increase the risk of online conflicts. Therefore, we believe it is important to reduce the amount of data transmission for data-intensive applications in the cloud. Fortunately, there are data dependencies between tasks, and reasonable data placement and task scheduling based on these dependencies can decrease the number and time consumption of data transfers. This is the first optimization objective of this paper.

2. Improve the utilization of servers in the cloud, and reduce the generation of inefficient energy. As mentioned above, under-loading of a machine is the main cause of low energy efficiency. Therefore, if the utilization of one machine is low, two strategies can be applied. One is to allocate more tasks to this host in order to improve its resource utilization. The other is to migrate the tasks on it to other machines so that it can be shut down. On the other hand, the overloaded status also needs to be changed, because the error rate increases greatly under overload, leading to a sharp increase in

power consumption. In the literature (Srikantaiah et al., 2009), the optimal CPU utilization of today's servers is about 70% in terms of energy efficiency. Therefore, making the active servers work at a balanced, energy-efficient utilization rate and turning the under-loaded servers off is an intelligent strategy. This is the other optimization objective of this paper.

For the convenience of readers, the symbols defined in this paper are listed in Table 1.

The remainder of the paper is organized as follows. Section 2 presents related works. Section 3 builds the user workflow model. Section 4 builds the cloud environment model. Section 5 shows our energy consumption model. Section 6 gives the details of the task scheduling method. Section 7 presents and analyzes the simulation results. Finally, Section 8 addresses conclusions and future work.

2. Related works

The great amounts of energy consumed by supercomputers and computing centers have become a major resource and environmental concern for today's society. In data centers, large amounts of energy are wasted by leaving computing and networking devices (such as servers, switches, and routers) powered on in a low utilization state. A survey of 188 data centers, mostly located in the United States, points out that on average 10% of servers are never utilized (Grid, 2010). A proportion of this power could be saved if these servers were powered off or switched to a low-power mode while idle. Therefore, improving the energy efficiency of large-scale distributed systems has become a very active research and development area.

Table 1. Symbol illustration.

$|D|$: the number of global data items
$d_i\ (1 \le i \le |D|)$: the $i$-th data item
$D_{in}(t)$: the set of input data items of task $t$
$Size_i$: the size of data item $d_i$
$|S|$: the number of physical servers in the cloud environment
$s_j\ (1 \le j \le |S|)$: the $j$-th physical server
$SC_j^{storage}$: the storage capacity of server $s_j$
$D_j$: the set of data items stored on server $s_j$
$VM_x\ (1 \le x \le m)$: the $x$-th type of VM in the cloud platform
$VC_x^{cpu}$: the CPU capacity of the VM type $VM_x$
$VC_x^{mem}$: the memory capacity of the VM type $VM_x$
$t_k$: the $k$-th task
$WCET_k$: the worst-case execution time of task $t_k$ when run on a specific type of VM
$T_k^{deadline}$: the deadline restriction of the whole application that task $t_k$ belongs to
$b_{i,j}$: the network bandwidth between server $s_i$ and server $s_j$
$SC_j^{cpu}$: the CPU capacity of the physical server $s_j$
$SC_j^{mem}$: the memory capacity of the physical server $s_j$
$n_{VM_x}^j$: the number of VMs of type $VM_x$ allocated to server $s_j$
$Util_t^j$: the CPU utilization rate of $s_j$ at time $t$
$Opt_j$: the optimal utilization level in terms of performance-per-watt for server $s_j$
$EffecUtil_t^j$: the effective CPU utilization of server $s_j$ at time $t$
$n_{VM_x\_runningTask}^j(t)$: the number of VMs of type $VM_x$ that are running tasks on server $s_j$ at time $t$
$|D_{in}(t_k)|$: the number of data items in the input set of task $t_k$
$c_{i,j}^{initialData}$: the data correlation between $d_i$ and $d_j$ derived from the initial data items
$c_{i,j}^{intermediateData}$: the data correlation between $d_i$ and $d_j$ derived from the intermediate data items
$c_{i,j}$: the integrated correlation between $d_i$ and $d_j$
$T_i$: the set of tasks that need data item $d_i$ as input
TRD: Task Requirement Degree
OOED: on-off expectation degree at time $t$
$TaskMargin_j(t)$: how much additional workload is needed to raise the machine's utilization to $Opt_j$ in order to improve the energy efficiency of this server
$aveUtil_p$: the average CPU utilization in the period $p$
$Power_j$: the overall power consumption of server $s_j$
$Power_j(t)$: the power consumption of server $s_j$ at time $t$
$Total\_Power$: the total power consumption of the cloud system


Benefiting from recent Web developments and the advent of the era of Big Data, data centers have become the main model for computation resource provision and data storage. However, in order to meet the resource requirements of sporadic peaks, a low utilization rate of resources (Iosup et al., 2006) and wastage of energy seem inevitable most of the time. Cloud computing provides an interesting resolution because of its elastic and scalable service supply based on virtualization technology. Due to the benefits of VM migration technologies and workload consolidation, computing resources can be allocated more intelligently in the global scope to improve the effective utilization of cloud devices. At the same time, because of the opaqueness of the cloud system, the competition among coexisting users, and the system's global optimization goal, the strategies of task scheduling and resource allocation are quite different from those of traditional multiprocessor systems (Kwok and Ahmad, 1998; Topcuoglu et al., 2002; Gao et al., 2013). Take the resource allocation framework Nephele (Warneke and Kao, 2009) as an example: it considers the dynamic configurability of cloud resources and the changing workload, but performs optimization from the perspective of a single application. From the perspective of cloud service providers, global performance optimization (including SLA violation and energy efficiency) is the main concern. Therefore, there is active research in the area of cloud resource allocation.

In recent years, significant research (Bobroff et al., 2007; Wood et al., 2009) has been done to improve application scheduling in cloud data centers. The resource allocation procedure consists of two steps: allocating resources to applications to ensure the QoS, and mapping applications or VMs to physical servers. For the first step, the objective is to allocate an appropriate amount of resources (normally in the form of VMs) to satisfy user demands. This problem can be formulated and solved differently according to how the cloud platform is modeled. If the user requests are assumed to be independent workloads with a specified arrival rate, many existing resource allocation strategies can be applied. Among them, multidimensional bin packing algorithms are the most classic (Ajiro and Tanaka, 2007; Wang et al., 2011; Tammaro et al., 2011), and custom workload prediction algorithms inspired by queuing theory (Mazzucco et al., 2010; Lin et al., 2013; Bi et al., 2010) are also commonly used. Network communication is also an important factor, and against the background of Big Data, it is particularly important for data-intensive applications. In (Aoun et al., 2010), a graph model of a cloud system is built based on distributed data storage, and the resource provisioning problem is formulated as a Mixed Integer Linear Program. Other schemas have also been introduced, such as dynamic VM provisioning and allocation based on bundled workloads (Aoun et al., 2010) and performance-alterable VMs (Bjorkqvist et al., 2012).

For the second step, the objective is to achieve high energy efficiency of devices by improving resource utilization. If the workload is independent, many existing approaches can be applied. In (Srikantaiah et al., 2009), a bin packing method is proposed, not only to reduce the number of physical servers but also to ensure load balancing between the active servers. As another example, in (Hadji and Zeghlache, 2012) a minimum-cost maximum-flow algorithm is proposed for resource placement in clouds confronted with dynamic workloads and flow variations. Much other research has been conducted on energy-aware task scheduling and VM placement (Srikantaiah et al., 2009; Knauth and Fetzer, 2012; Adhikary et al., 2013); however, few of the aforementioned strategies are simultaneously oriented to data-intensive applications and workflows. This is the purpose of this paper.


One characteristic of our method is that it is oriented to data-intensive applications, and the tasks are scheduled according to their data correlation; hence it can greatly reduce the amount of data movement. There are some similar existing works. In (Yuan et al., 2010), a matrix-based k-means clustering strategy has been proposed. This strategy can effectively reduce the data movement of workflow cloud systems; nevertheless, the proposed method is more suitable for a homogeneous environment, since the differences in data sizes and in the data centers' computing capacities are not incorporated. On the other hand, that method is only intended to decrease the data movement frequency, but reducing the transfer frequency neither means decreasing the amount of data movement nor is it equivalent to cutting down the time consumed by data transmissions. In (Zhao et al., 2012), a data placement strategy based on a heuristic genetic algorithm has been proposed. This method is intended to reduce data movement while balancing the loads of data centers. However, it only considers the data placement problem and is not energy aware.

Another significant characteristic of the method proposed in this paper is that the tasks are modeled as graphs with dependencies, instead of independent batches. In more detail, each user workload is viewed at a finer granularity as a task graph with output dependencies. This fine-grained treatment of workloads provides many opportunities for performance and energy optimizations, enabling the cloud service provider (CSP) to meet user deadlines at lower operation costs. This model is especially suitable for large scientific and engineering applications (Isard et al., 2007; Chen et al., 2010).

3. User workflow model

In this paper, the user workload is modeled at a fine granularity by using a workflow graph. This type of workflow model for a cloud system is much more complicated than the coarse-granularity model in which each application is represented as an atomic task. By including the task dependencies, our workflow model is more suitable for large scientific and engineering applications. Xavier and Lovesum (2013) have given a detailed analysis of these two modeling methods. Fig. 1 provides an example illustration of the workflow model (or the dependency model). From the figure we can see that there is some data correlation within the datasets, as some tasks need more than one input data item, and some data items are required by more than one task. In addition to the data correlation, task dependencies also exist. Therefore an intelligent data placement and task scheduling strategy can greatly reduce global data movements. As our work is oriented to data-intensive applications, data movement can be the main time-consuming factor of the whole system. As a large amount of data transfer can greatly increase the energy consumption of the running servers as well as the energy consumption of network transmission, reasonable data placement based on data dependencies is very important.

Fig. 1. Application model.


3.1. Storage requirement


We denote the set of input data items of task $t$ as $D_{in}(t)$. The sizes of the data items in this research can be very large and very different; hence the storage capacity of each physical machine is a significant element that must be considered in data distribution. We denote the size of data item $d_i$ ($1 \le i \le |D|$) as $Size_i$ and the storage capacity of server $s_j$ ($1 \le j \le |S|$) as $SC_j^{storage}$. Since a large amount of intermediate data may be generated during the execution stage, the whole storage space cannot be occupied at the build-time stage. Accordingly, an empirical parameter $\lambda$ is introduced, which denotes the allowable usage ratio of the servers' storage capacity. During scheduling, each server $s_j$ is associated with a set of data items denoted as $D_j$. Hence, the data allocation of $D_j$ must abide by the initial-stage storage condition, i.e., $\sum_{d_i \in D_j} Size_i \le SC_j^{storage} \cdot \lambda$.


3.2. Computing resource requirement


Besides storage resources, each task also requires a certain amount of computing resources, and the resource requirement is represented by a virtual machine (VM). We assume that there are m types of VMs in the cloud platform, distinct in the amount of CPU and memory they require. Hence, each $VM_x$ ($1 \le x \le m$) corresponds to a binary tuple $(VC_x^{cpu}, VC_x^{mem})$, which specifies the CPU capacity and memory capacity of this type of virtual machine. In the preprocessing stage, each task $t_k$ is coupled with a quintuple $(VM_k, WCET_k, T_k^{deadline}, EST_k, LST_k)$. Here, $VM_k$ is the type of VM that $t_k$ needs to run on, and $WCET_k$ is the worst-case execution time of task $t_k$ when run on that type of virtual machine. The WCET is a commonly estimated parameter and is often used as an input to scheduling analysis. The next value, $T_k^{deadline}$, is the deadline restriction of the whole workflow, and it is specified by the user. Although users cannot control the real scheduling of their tasks on physical resources, they can still state their service quality requirement through this parameter, and it is an important constituent of the user's SLA. This deadline parameter is also used to deduce the last two parameters, $EST_k$ and $LST_k$, which denote the earliest start time and the latest start time of task $t_k$, respectively. Because we adopt the task graph workload model (i.e., there are dependencies between tasks on the time dimension), these two parameters determine the effective range of the start time of a task: $[EST_k, LST_k]$. When a task is specified with a tight deadline, its effective start time range is compact too; hence it may need to be allocated to a separate virtual machine. Otherwise, as the effective start time range is wide, the task may share a virtual machine with other tasks in order to reduce energy consumption. Using the parameter $WCET_k$, whether a task can be accommodated by a certain server can be calculated. As shown in Fig. 2, there are 4 tasks in this scene: $t_1$, $t_2$, $t_3$ and $t_4$. Assume they belong to 4 different workflows. Under the constraints of these parameters, $t_1$ and $t_2$ can be allocated to the same virtual machine, whereas $t_3$ and $t_4$ cannot share one.

The parameters EST and LST are vital judgment conditions for task allocation, and they are calculated based on the critical path of the whole workflow. Therefore, the first step in deducing EST and LST is to determine the critical path of the application. After this step, the tasks' ESTs can be calculated in forward order, and their LSTs can be derived in reverse order. In the task allocation process, every task is scheduled based on these two parameters and its WCET value, and after each task allocation, the EST and LST values of the affected unscheduled tasks are updated. This is a critical step after every task assignment, ensuring the feasibility of subsequent task scheduling.
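A minimal Python sketch of this EST/LST derivation is given below, assuming the task list is already topologically sorted; the structures `preds`, `wcet`, and `deadline` are illustrative assumptions of ours, not an interface defined in the paper.

```python
from collections import defaultdict

def est_lst(tasks, preds, wcet, deadline):
    """Forward/backward pass over a task DAG to derive EST and LST values."""
    succs = defaultdict(list)
    for t in tasks:
        for p in preds[t]:
            succs[p].append(t)
    est = {}
    for t in tasks:                        # forward pass (tasks topologically sorted)
        est[t] = max((est[p] + wcet[p] for p in preds[t]), default=0.0)
    lst = {}
    for t in reversed(tasks):              # backward pass anchored at the deadline
        lst[t] = min((lst[s] for s in succs[t]), default=deadline) - wcet[t]
    return est, lst
```

Under this reading, two tasks can share a VM only if their $[EST_k, LST_k]$ windows and WCETs allow a non-overlapping execution order, as in the $t_1$/$t_2$ case of Fig. 2.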


4. Cloud environment model

We assume that there are m geographically distributed organizations in the cloud-based computation platform that execute scientific workflows (see Fig. 3 for an example). The participating organizations are heterogeneous, with different computation capabilities, energy efficiency conditions, and storage spaces, as well as different network bandwidths among them. The whole workload on the cloud platform consists of multiple simultaneously running workflows, as demonstrated in Fig. 3. The tasks are allocated to servers in the form of VM placement, and both computing resources and network conditions are considered in this process.

4.1. Network structure model

In this sub-section, the heterogeneous network transmission capacities among the servers are the main concern. We expect frequent data movements to happen on high-speed channels in order to reduce the time consumed by data transmission and decrease the application's response time. Another benefit is that this reduces the possibility of SLA violations. To achieve this goal, we first propose a hierarchical clustering method for the servers based on the network conditions among them. Using this method, the cloud system is abstracted into a tree structure, and this structure plays an important role in the subsequent "Tree-to-Tree" task assignment method presented in Section 6. The channel bandwidth between server $s_i$ and server $s_j$ is denoted $b_{i,j}$. In the extreme case $b_{i,j} = 0$, the two nodes are disconnected, i.e., a dataset cannot be transferred from $s_i$ to $s_j$ or from $s_j$ to $s_i$.


Fig. 2. Task allocation example.

Fig. 3. The cloud environment.


Therefore, a global network condition matrix $B$ of the cloud system can be built as follows:

$$B = \begin{pmatrix} b_{1,1} & b_{1,2} & \cdots & b_{1,|S|} \\ b_{2,1} & b_{2,2} & \cdots & b_{2,|S|} \\ \vdots & \vdots & & \vdots \\ b_{|S|,1} & b_{|S|,2} & \cdots & b_{|S|,|S|} \end{pmatrix}$$

If the cloud architecture is based on a LAN, the network structure can usually be abstracted into a tree structure naturally, as long as disaster recovery devices are ignored. However, a more typical situation is that a big cloud system is built on the Internet and composed of multiple server farms; hence, it cannot be mapped into a natural tree structure. Therefore, we recursively bisect the hosts so as to convert the cloud system into an approximate tree model. First, a BEA (Bond Energy Algorithm) transformation is applied to matrix $B$ to bring similar values together. Then a partition point $p$ is selected, and the hosts are divided into two sets $\{s_1, ..., s_p\}$ and $\{s_{p+1}, ..., s_{|S|}\}$, which maximizes the objective function:

$$PM = \sum_{i=1}^{p}\sum_{j=1}^{p} b_{i,j} \cdot \sum_{i=p+1}^{|S|}\sum_{j=p+1}^{|S|} b_{i,j} - \Big(\sum_{i=1}^{p}\sum_{j=p+1}^{|S|} b_{i,j}\Big)^2 \quad (1)$$

This type of partition is applied recursively until the whole system is converted into a tree model suitable for the "Tree-to-Tree" task assignment method outlined in Section 6.
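As an illustration, the following Python sketch shows one way to implement this recursive bisection under the PM objective of Eq. (1); the matrix B is assumed to be BEA-reordered already, and the `min_size` stopping criterion is our own assumption rather than part of the paper.

```python
import numpy as np

def split_servers(B, servers, min_size=2):
    """Recursively bisect a BEA-reordered bandwidth matrix into a binary tree."""
    n = len(servers)
    if n <= min_size:
        return list(servers)                   # leaf: a small group of servers
    best_p, best_pm = 1, -np.inf
    for p in range(1, n):                      # candidate partition points
        inner_left = B[:p, :p].sum()           # bandwidth kept inside the left part
        inner_right = B[p:, p:].sum()          # bandwidth kept inside the right part
        cross = B[:p, p:].sum()                # bandwidth cut by this partition
        pm = inner_left * inner_right - cross ** 2   # Eq. (1)
        if pm > best_pm:
            best_p, best_pm = p, pm
    left = split_servers(B[:best_p, :best_p], servers[:best_p], min_size)
    right = split_servers(B[best_p:, best_p:], servers[best_p:], min_size)
    return (left, right)                       # internal node of the server tree
```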

4.2. Virtual machine placement

During virtual machine placement, each host $s_j$ corresponds to a vector of length $m$: $N_{VM}^j = \{n_{VM_1}^j, n_{VM_2}^j, ..., n_{VM_m}^j\}$, where $n_{VM_x}^j$ indicates the number of type-$x$ VMs allocated to the server $s_j$. The VM placement must satisfy the resource limits of each physical server, which means that for every $s_j$ we have $\sum_{x=1}^{m} n_{VM_x}^j \cdot VC_x^{cpu} \le SC_j^{cpu}$ and $\sum_{x=1}^{m} n_{VM_x}^j \cdot VC_x^{mem} \le SC_j^{mem}$.

5. Server energy consumption model

5.1. Utilization ratio and power consumption

As mentioned earlier, today's servers normally consume up to 50% of their peak power when idle (Chen et al., 2008). This is because the power consumption of a server $s_j$ consists of two parts: the static power consumption $Power_{static}^j$ and the dynamic power consumption $Power_{dynamic}^j$. This can be represented as Eq. (2):

$$Power^j(t) = Power_{static}^j + Power_{dynamic}^j(t) \quad (2)$$

where $Power_{static}^j$ is constant as long as the machine is in the active state, and $Power_{dynamic}^j$ is related to the utilization rate $Util_t^j$ of $s_j$ at time $t$. Their detailed relation is shown in Eq. (3):

$$Power_{dynamic}^j(t) = \begin{cases} Util_t^j \cdot \alpha_j & \text{if } Util_t^j < Opt_j \\ Opt_j \cdot \alpha_j + (Util_t^j - Opt_j)^2 \cdot \beta_j & \text{otherwise} \end{cases} \quad (3)$$

where $Opt_j$ is the optimal utilization level in terms of performance-per-watt; in practice, this value is about 70%. When a server's utilization level is below $Opt_j$, there is a linear relationship between the dynamic power consumption and the utilization ratio; otherwise, the "excess" utilization results in a rapid, super-linear growth of the dynamic energy consumption (Pedram, 2012; Chen et al., 2008). $\alpha_j$ and $\beta_j$ are the coefficients of the linear part and the super-linear part for the server $s_j$, respectively.

The relationship between utilization rate and energy consumption becomes clearer after an analysis of Fig. 4. The $\alpha_j$ and $\beta_j$ coefficients differ between hosts and are related to a server's energy efficiency performance. Hence, through these two coefficients, the server's level of energy efficiency can be determined. They are also useful in deciding the priorities of hosts in VM placement, because the host with the highest energy efficiency has a higher priority for loading tasks. Another important revelation of Fig. 4 is that it is best to make every machine run at, or approach, the utilization ratio $Opt_j$. Therefore, if a host is under-loaded, one strategy is to allocate more tasks to this server in order to improve its resource utilization, and another is to migrate all the tasks on it to other nodes and then shut it down to reduce unnecessary power consumption. This is the most important optimization idea of this paper.
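The following small function transcribes Eqs. (2) and (3) directly, assuming utilization is given as a fraction in [0, 1]; the parameter names are ours, not the paper's.

```python
def server_power(util, p_static, alpha, beta, opt=0.70):
    """Power of one server at a given utilization, per Eqs. (2)-(3)."""
    if util < opt:
        p_dynamic = util * alpha                       # linear region of Eq. (3)
    else:
        # beyond Opt_j, the "excess" utilization is penalized quadratically
        p_dynamic = opt * alpha + (util - opt) ** 2 * beta
    return p_static + p_dynamic                        # Eq. (2)
```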

5.2. Utilization, effective utilization and VM sharing

In Eq. (3), the utilization ratio $Util_t^j$ is simplified to the CPU utilization ratio, and other factors are ignored. Its detailed calculation is shown in Eq. (4):

$$Util_t^j = \frac{\sum_{x=1}^{m} n_{VM_x}^j \cdot VC_x^{cpu}}{SC_j^{cpu}} \times 100\% \quad (4)$$

where $n_{VM_x}^j$ represents the number of type-$x$ VMs allocated on the server $s_j$, $VC_x^{cpu}$ denotes the CPU requirement of the virtual machine type $VM_x$, and $SC_j^{cpu}$ indicates the CPU capacity of the host $s_j$. When we calculate the CPU utilization ratio at time $t$, we do not distinguish whether the VMs are running real tasks or are idle, because background CPU activities are needed even during idle periods. This utilization parameter is also called the total utilization.


Fig. 4. Relationship between utilization rate and power consumption.


Fig. 5. Utilization and effective utilization.


Frequent start-up and shutdown of machines increases power consumption and is also very time consuming; therefore, idle CPU periods are unavoidable. Hence, we introduce another parameter, $EffecUtil_t^j$, to represent the real effective utilization of the CPU resources of server $s_j$ at time $t$ (see Eq. (5)):

$$EffecUtil_t^j = \frac{\sum_{x=1}^{m} n_{VM_x\_runningTask}^j(t) \cdot VC_x^{cpu}}{SC_j^{cpu}} \times 100\% \quad (5)$$

Fig. 5 gives a simple example of the calculation of the two parameters, utilization and effective utilization. In simple terms, a server's utilization indicates its power consumption, while its effective utilization represents the effectiveness of that power consumption. Therefore, we expect the effective utilization to be as close to the total utilization as possible, and the total utilization to approach the value of $Opt_j$. If the average effective utilization in a certain period is low, we say the host is under-loaded during that period.

VM sharing is an efficient mechanism for increasing the effective utilization. The same example as Fig. 5, but with new tasks allocated to the last two VMs, is shown in Fig. 6. Obviously, in this way, the gap between the total utilization and the effective utilization can be narrowed. On the other hand, VM sharing can also reduce the probability of starting a new VM or awakening a new server; thus, the global energy consumption can be further reduced. Therefore, VM sharing is a vital strategy for improving the energy efficiency of a cloud system. In the process of task allocation (Section 6), we increase the ratio of VM sharing so as to improve the effective utilization of computation resources.
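A sketch of the two measures follows; `placed` lists the VM types placed on the server, `running` those currently executing a task, and `vc_cpu`/`sc_cpu` mirror $VC_x^{cpu}$ and $SC_j^{cpu}$ (the container types are our own assumption).

```python
def total_utilization(placed, vc_cpu, sc_cpu):
    """Eq. (4): all placed VMs count, whether busy or idle."""
    return sum(vc_cpu[x] for x in placed) / sc_cpu * 100.0

def effective_utilization(running, vc_cpu, sc_cpu):
    """Eq. (5): only VMs that are actually running tasks count."""
    return sum(vc_cpu[x] for x in running) / sc_cpu * 100.0
```

The gap between the two values is the capacity held by idle VMs; VM sharing narrows it by packing compatible tasks into VMs that are already running.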

6. Task assignment method

As this paper is aimed at the data placement and task scheduling problem of data-intensive applications, reducing the amount of data transmission is the first target. A bad data placement and task scheduling strategy may result in a large increase in data movements, which would significantly extend the execution time of applications. The direct result is that the user's SLA would be hard to satisfy, and the cloud system would have to allocate more computation resources to the application so as to speed up its computation part. Even if the network transmission delay can be compensated in this way, more power consumption is inevitable. On the other hand, network communication is itself very energy-consuming. The literature (Cavdar and Alagoz, 2012) has shown that the power consumption of network devices can account for one-third of the total consumption of cloud data centers.

Based on the above analysis, the basic idea of our proposed scheduling method is as follows: first, we cluster the datasets involved in the applications in terms of their data dependencies in order to reduce the amount of data movement and its time consumption at runtime. Second, according to the data clustering result, we generate the clustering structure of tasks, and finally

Fig. 6. Using VM sharing to improve a server's effective utilization.

we assign both data items and tasks to the appropriate nodes together according to the clustering results. In the process of task scheduling, some measures are implemented to improve the energy efficiency of the servers.

6.1. Correlation-based clustering

6.1.1. Data correlation calculation

Previously, we have done some research on data correlation calculation, published in the proceedings of CCGrid 2015 (Zhao et al., 2015). In this paper, those methods are applied and extended so as to reduce the amount of global data movement in the system. Complex data dependencies always exist in a workflow system. As can be seen from Fig. 1 in Section 3, some tasks have more than one input dataset, while some datasets are required by more than one task. Hence, it is better to place the datasets with a close relationship on the same center to reduce data transfers. First, we calculate the correlation between each pair of data items. The dependency calculation differs from the classical methods in (Yuan et al., 2010) in two respects: the correlation computation is based on the volume of data transferred rather than the frequency, and the correlation derived from the size prediction of the intermediate data is integrated, which is called the first order conduction correlation (FOC-Correlation). In particular, the data correlation matrix can be built in the following two steps.

Step 1: Derive the data correlation from the initially existing data. It is commonly believed that the more tasks are related to two datasets, the higher the dependency between those datasets. In fact, the dependency between two datasets derived in this way cannot fully reflect the increased transmission cost when they are separated, because there can be huge differences between the sizes of the data items. Therefore, we define the correlation between two datasets as the amount by which data transfer will increase if they are put on different servers. For a task that needs only two data items as its input, the produced dependency is valued as the size of the smaller of the two, because transferring the smaller chunk to the place where the larger one is located is the most efficient solution. For tasks that require more than two data chunks, the situation is more complicated. The detailed calculations for the two cases are given below.

1. For the tasks $t_k \in T$ with $|D_{in}(t_k)| = 2$:

$$c_{i,j}^{initialData} = |T_i \cap T_j| \cdot \min(Size_i, Size_j), \quad d_i, d_j \in D_{in}(t) \quad (6)$$

2. For each task $t_k \in T$ with $|D_{in}(t_k)| > 2$:

$$c_{i,j}^{initialData} = \max\big(\{Size_x \mid d_x \in D_{in}(t),\ x \ne i,\ x \ne j\} \cup \{Size_i + Size_j\}\big) - \max\{Size_x \mid d_x \in D_{in}(t)\} \quad (7)$$

In the first case, $|T_i \cap T_j|$ is the number of tasks that need $d_i$ and $d_j$ as their entire input data, and $\min(Size_i, Size_j)$ is the size of the smaller data item between $d_i$ and $d_j$. In the second case, the derivation is more complex. Here we give an example. A task $t_k$ has three input data items $d_1$, $d_2$, and $d_3$, of sizes 3 GB, 6 GB, and 7 GB. This can be represented as $D_{in}(t) = \{d_1, d_2, d_3\}$, $Size_1 = 3$ GB, $Size_2 = 6$ GB, $Size_3 = 7$ GB. Assume that in the initial state the three data items are stored on three different servers; then the correlation gain of $\{d_1, d_2, d_3\}$ is $3 + 6 = 9$ (GB), because moving $d_1$ and $d_2$ to the node where $d_3$ is placed is the optimal strategy.


Fig. 7. Correlation calculation of tasks with more than 2 input datasets.

Then consider the situation where $d_1$ and $d_2$ are together while $d_3$ is placed on another server. Obviously, at least 7 GB of transmission is needed to run this task. Hence, the correlation gain between $d_1$ and $d_2$ here is $9 - 7 = 2$ (GB). In general, the correlation gain between $d_i$ and $d_j$ can be calculated as:

$$c_{i,j}^{initialData} = \Big(\sum_{d_x \in D_{in}(t)} Size_x - \max\{Size_x \mid d_x \in D_{in}(t)\}\Big) - \Big(\sum_{d_x \in D_{in}(t)} Size_x - \max\big(\{Size_x \mid d_x \in D_{in}(t),\ x \ne i,\ x \ne j\} \cup \{Size_i + Size_j\}\big)\Big) \quad (8)$$

Eq. (8) simplifies to Eq. (7). In Fig. 7(a) and (b), two typical examples are given.

Step 2: Derive the first order conduction correlation from the intermediate data. As shown in Fig. 1, temporal relationships exist between tasks, i.e., some tasks require the data generated by other tasks as their input data. In this step, this kind of dependency generated by the intermediate data is considered. The contribution of introducing this correlation is twofold: first, some sequential relationships between tasks can be incorporated into our method, and hence the subsequent task scheduling can be more reasonable; second, some runtime data movements can be eliminated at build time. We name this type of data correlation, generated by two consecutive dependent tasks, the first order conduction correlation (FOC-Correlation). An example is shown in Fig. 8(a). It is part of an entire workflow and includes only two tasks, $t_1$ and $t_2$. It looks like there is no correlation between $d_1$ and $d_4$, as no task needs both of them simultaneously. However, a further analysis shows that there is a high degree of correlation between them if the newly generated intermediate dataset $d_3$ is considered.

In Fig. 8, we first define some notation. If $d_1$, $d_2$, $d_4$ and $d_5$ are stored on 4 different nodes, the required amount of data movement for this layout is denoted as [1, 2, 4, 5]. If the 4 data items are placed on the same server, the required data transfer amount is denoted as [1 2 4 5]. Similarly, [1 2, 4, 5] represents the data transfer amount of the layout in which $d_1$ and $d_2$ are stored together while $d_4$ and $d_5$ are placed on two other different nodes. In accordance with this notation, the detailed derivation of the correlation gain between each pair among $d_1$, $d_2$, $d_4$ and $d_5$ is described in Fig. 8(b). The results show that the datasets $d_1$ and $d_2$ have a strong dependence with $d_4$ and $d_5$. These results make these 4 data items more likely to be placed together in the subsequent clustering stage.

Fig. 8. The FOC-Correlation generated from dependent tasks.

Hence the runtime data transfer amount can be reduced, because the intermediate data item $d_3$ no longer needs to be transferred for executing the task $t_2$. Here, we have assumed that the size of the intermediate data item $d_3$ can be estimated roughly. In fact, the format and approximate size of intermediate data can often be known in advance, although the exact value can only be known after the preceding task has finished. Furthermore, in the case that the sizes of some intermediate data items cannot be predicted precisely, omitting this FOC-Correlation calculation step would be equivalent to ignoring the dependency of the two tasks, or regarding the size of the intermediate data as zero. This is clearly less reasonable, especially as the amount of intermediate data is usually very large. In other words, if we do not introduce the first order conduction correlation into the correlation matrix, more movements of the newly generated data are needed at runtime, which inevitably increases the workload of the cloud system.

After the two steps mentioned above, all of the correlation items in the matrix have been derived except those on the diagonal. The items on the diagonal, marked by equal indices on the two dimensions, can simply be calculated as the sum of all the other items in the same row. A workflow instance is given in Fig. 9, and its correlation matrix is calculated as shown in Fig. 10. As shown in Figs. 9 and 10, if the tasks have strong dependencies (in other words, if the newly generated intermediate datasets are large in size), a high percentage of the correlation between the data items is generated in the second step. Therefore, the step of deriving the FOC-Correlation from the dependencies between tasks is essential.
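As an illustration of Step 1, the sketch below accumulates the initial-data correlation over all tasks, using Eq. (6) for two-input tasks and Eq. (7) (equivalently Eq. (8)) for larger input sets; `tasks` and `size` are assumed dictionaries of our own choosing, not an API from the paper.

```python
from itertools import combinations
from collections import defaultdict

def initial_correlation(tasks, size):
    """Accumulate c_{i,j}^{initialData} over all tasks (Eqs. (6)-(8))."""
    c = defaultdict(float)
    for inputs in tasks.values():          # inputs: list of data ids of one task
        if len(inputs) == 2:
            i, j = inputs
            c[frozenset((i, j))] += min(size[i], size[j])    # Eq. (6), one task's share
        elif len(inputs) > 2:
            biggest = max(size[d] for d in inputs)
            for i, j in combinations(inputs, 2):
                rest = [size[d] for d in inputs if d not in (i, j)]
                merged = max(rest + [size[i] + size[j]])
                c[frozenset((i, j))] += merged - biggest     # Eq. (7)
    return c

# For the worked example above (sizes 3, 6 and 7 GB), this yields a gain of
# 9 - 7 = 2 GB for the pair (d1, d2), matching the derivation in the text.
```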

6.1.2. Hierarchical data clustering

In this section, we perform hierarchical clustering of the datasets in terms of their mutual dependencies. The BEA transformation is applied to the correlation matrix to bring similar values together. Then recursive partition operations are performed according to a newly proposed partition strategy named the Correlation Density-based Partition Strategy. In detail, for each division, a partition point $p$ is selected to divide the items in the correlation matrix into two sets $\{d_1, ..., d_p\}$ and $\{d_{p+1}, ..., d_n\}$ while maximizing the following objective function CDR (correlation density ratio):


$$CDR = \frac{\Big(\sum_{i=1}^{p}\sum_{j=1}^{p} c_{ij} + \sum_{i=p+1}^{n}\sum_{j=p+1}^{n} c_{ij}\Big) \Big/ \big|\{c_{ij} \mid i,j = 1,...,p\} \cup \{c_{ij} \mid i,j = p+1,...,n\}\big|}{\Big(\sum_{i=1}^{p}\sum_{j=p+1}^{n} c_{ij}\Big) \Big/ \big|\{c_{ij} \mid i = 1,...,p;\ j = p+1,...,n\}\big|} \quad (9)$$

Here the numerator represents the density of the correlation preserved by this partitioning, while the denominator indicates the density of the correlation broken. The equation thus reflects the ratio of the correlation densities of the two parts: the correlation preserved and the correlation broken. As shown in Fig. 11, we label the 4 parts after one partition as X, Y, Z and Z. After we calculate the values of CDR for $p = 1, 2, ..., n-1$, the first partition position is selected as $p = 2$, where the ratio of the average density of the region X+Y to that of Z reaches its peak. The same correlation-density-based partition operations are performed recursively until the ratio CDR falls below a threshold. The specific threshold is determined according to the real situation of the cloud system and the data items. Consequently, a binary tree is established through this recursive partitioning strategy, as illustrated in Fig. 11. Comparative experiments show that our density ratio strategy has a proper equivalent-division tendency and performs better than that of D. Yuan (Yuan et al., 2010). It is also better than some other common methods, such as maximizing the dependency summation of the region X+Y, or maximizing the ratio of the two dependency summations of the region X+Y and the region Z, since those two methods have a certain degree of skew-partition tendency. Through experiments we found that an excessive equivalent-division tendency may cause data items with high dependency to be separated prematurely, whereas too little equivalent-division tendency may make the binary tree too unbalanced, with excessive layers, and therefore difficult to allocate to data centers in the next step.
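A sketch of one partition step under Eq. (9) is given below; the correlation matrix C is assumed to be BEA-reordered, and cell counts serve as the set cardinalities in Eq. (9).

```python
import numpy as np

def best_partition(C):
    """Return the split point p maximizing the CDR of Eq. (9)."""
    n = C.shape[0]
    best_p, best_cdr = 1, -np.inf
    for p in range(1, n):
        kept = C[:p, :p].sum() + C[p:, p:].sum()      # correlation preserved
        kept_cells = p * p + (n - p) * (n - p)
        broken = C[:p, p:].sum()                      # correlation broken
        broken_cells = p * (n - p)
        if broken == 0:
            return p, np.inf                          # a free cut: nothing is broken
        cdr = (kept / kept_cells) / (broken / broken_cells)   # Eq. (9)
        if cdr > best_cdr:
            best_p, best_cdr = p, cdr
    return best_p, best_cdr

# Recursing on the two halves until CDR drops below the chosen threshold
# yields the binary data tree of Fig. 11.
```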

Fig. 9. An example workflow.

Fig. 11. Recursive partitioning of data into a binary tree.

6.1.3. Mapping tasks in the data tree

After the above hierarchical clustering steps, the datasets have been modeled as a binary tree. Each node of this tree represents a subset of related datasets, and the lower the node, the more closely related the datasets it contains. In this sub-section, the tasks are mapped onto the nodes of the data tree. The basic mapping principle is that a task should be mapped to the node that includes most of its required data, so as to minimize the amount of input data transmission. If a task requires no input dataset, it is mapped to the node where its preceding task resides. Fig. 12 shows a very simple example of the whole process, from data correlation matrix calculation, to hierarchical data clustering, and finally to task mapping. From this figure we can see that after the task mapping process, the data-related tasks have been clustered together successfully. The reason is that both the initial data correlation and the FOC-Correlation represent the data relationships effectively, and to some extent the FOC-Correlation reflects the sequential relationship between different tasks.

6.2. Server's task assignment priority

6.2.1. Task Requirement Degree (TRD) calculation

The power consumption model was built in Section 5. Based on the previous analysis, an energy saving strategy can be formulated:

1. In a period of time, if a machine's mean effective utilization is higher than the lowest utilization threshold, we believe it should be in the "on" state for this period. If its utilization is lower than the optimal utilization in terms of energy efficiency, allocating more tasks to it is necessary.

2. In a period of time, if a machine's mean effective utilization is very low (lower than the lowest utilization threshold), and the desired server states for the periods immediately before and after are both "off", then it should be in the "off" state for this period. Therefore, it should not be assigned more tasks, and the tasks already allocated to it should be migrated to other servers.

Fig. 10. The correlation matrix of the workflow in Fig. 9.

In this sub-section, we propose a special parameter, the Task Requirement Degree (TRD), to represent how much workload


should be assigned to a server in order to improve its energy efficiency. The TRD value determines the machine's task assignment priority in the "Tree-to-Tree" task scheduling method. The basic calculation is shown in Eq. (10):

$$TRD_j(t) = OOED_j(t) \cdot TaskMargin_j(t) \quad (10)$$

where $OOED_j(t)$ is the on-off expectation degree of $s_j$ at time $t$, which represents whether the machine $s_j$ should be in the on-state or not: a value of 1 means the machine should be active, while 0 means it should be shut down. $TaskMargin_j(t)$ is the amount of additional workload needed to raise the machine's utilization to $Opt_j$ in order to improve the energy efficiency of this server. From Eq. (10) we can see that the TRD parameter has the following functions:

1. If the server $s_j$ is expected to be active at this time, its CPU utilization rate should be adjusted towards $Opt_j$.

2. If the server $s_j$ is expected to be off at this time ($OOED_j(t) = 0$), its TRD value will also be 0. Hence, the task assignment priority of this server should be very low.

3. If the server $s_j$ is expected to be active, i.e., $OOED_j(t) = 1$, then the $TRD_j(t)$ value depends on the value of $TaskMargin_j(t)$. In particular, the larger $TaskMargin_j(t)$ is, the higher the task assignment priority. Therefore, the TRD-based task assignment strategy also has a load-balancing effect, which can further improve the QoS level of the cloud system.

The detailed calculation methods of $OOED_j(t)$ and $TaskMargin_j(t)$ are shown in Sections 6.2.2 and 6.2.3.

6.2.2. OOED (on-off expectation degree) calculation

In the initial stage of task scheduling, the task load of all servers is zero, so for any value of $t$ and $j$, $OOED_j(t) = 0$. Then, after each task assignment, $OOED_j(t)$ is updated for the influenced servers. The detailed calculation is shown in Algorithm 1.

Algorithm 1. The OOED calculation algorithm
1: for all j and t, initialize OOED_j(t) = 0
2: repeat
3:   task_assignment(a node of the data-task tree);
4:   for j = 1 to |S|
5:     foreach period p with task assignment
6:       aveUtil_p <- get_mean_utilization(p);
7:       if (aveUtil_p >= lowestUtil)
8:         OOED_j(t) = 1, for t in p;
9:     end foreach
10:    foreach period p where OOED_j(t) = 0
11:      foreach t in p
12:        a <- HaveActiveStateOrNot(t - t_restart * gamma, t);
13:        if a == true
14:          OOED_j(t) += 0.5;
15:        b <- HaveActiveStateOrNot(t, t + t_restart * gamma);
16:        if b == true
17:          OOED_j(t) += 0.5;
18:      end foreach
19:    end foreach
20:  end for
21: until task assignment is terminated

Fig. 12. The steps from correlation calculation to task mapping.


From lines 5 to 9 of Algorithm 1, we can see that if a server's average utilization in a period is above the lowest utilization threshold, the value 1 is assigned to the OOED parameter for that period. From lines 10 to 18, we can see that even if a server has no task in a certain period, its OOED may still be non-zero: the OOED situation of a certain time span before and after is taken into account, since frequently starting and shutting down a server would waste a large amount of power. Specifically, for a moment $t$ with $Util_t = 0$, if there is a left interval in $(t - \delta t, t)$ in which OOED = 1, then OOED(t) = 0.5; similarly, if there is a right interval in $(t, t + \delta t)$ in which OOED = 1, then OOED(t) = 0.5. If both the left interval and the right interval contain at least one period in which OOED = 1, then OOED(t) = 1. Here, $\delta t$ is a relatively long time interval chosen to avoid frequent startup and shutdown of the machines.

6.2.3. Server's TaskMargin calculation

$TaskMargin_j(t)$ represents how much additional task load is needed to raise the machine's utilization to $Opt_j$ in order to improve the energy efficiency of this server. It is derived as shown in Eq. (11):

$$TaskMargin_j(t) = Opt_j - EffecUtil_j(t) \quad (11)$$

The contributions of Eq. (11) are twofold: first, if the effective utilization is lower than $Opt_j$, more tasks are expected to be allocated to this machine, by opening new VMs or through VM sharing, to improve the server's energy efficiency; second, among the hosts that are all expected to be active, those with lighter loads have higher priority for receiving more tasks than those with heavier loads. Therefore, this method is also helpful for load balancing. To express this concept more clearly, Fig. 13 shows a TRD calculation example. From this figure we can see that in the central interval (the gray area) the average CPU utilization is very low (< 20%), but it is not labeled as OOED = 0 directly: there are periods with OOED = 1 on both the left-hand side and the right-hand side, so some areas of the central interval can be labeled as OOED = 0.5.


Fig. 13. A TRD calculation example.

Here, the judgment time interval is selected as 20 times $t_{restart}$; in other words, $\gamma$ is fixed to 20 in Fig. 13.
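A one-line transcription of Eqs. (10) and (11) is given below; the function signature and the fractional representation of utilization are our own assumptions.

```python
def trd(ooed, effec_util, opt=0.70):
    """Task Requirement Degree of a server for one period (Eqs. (10)-(11))."""
    task_margin = opt - effec_util      # Eq. (11): distance to the optimal load
    return ooed * task_margin           # Eq. (10): zero when the server should be off
```

A server expected to be off (OOED = 0) thus gets TRD = 0 and a very low assignment priority, while among active servers the lighter-loaded one has the larger margin and therefore the higher priority.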

priorities cannot be differentiated, then the node that has a lower layer in the task tree should have a higher task assignment priority. Method 3: If using the above 2 methods, the two nodes' assignment priorities still cannot be differentiated, then the node that has a larger mean TRD value should have a higher task assignment priority. Method 4: If using the above 2 methods, the two nodes' assignment priorities still cannot be differentiated, then the one whose inherent energy efficiency is higher should have a higher priority. Based on these 4 methods, the nodes' task assignment priority can be determined. And then we will try to allocate the task node to the server node in this sequence. Step 4: Try to find whether the task node can be scheduled to the selected server node. We will attempt to allocate the tasks in this node to a VM in a server of the server group (server node). If the task cannot share a running VM with previous tasks, a new VM can be started as long as the computing resources of this server are enough. To judge whether a task can be allocated to a certain host successfully, the following conditions should be tested: Condition 1:After assignment, both the CPU utilization rate and the memory occupation rate in the corresponding periods cannot exceed 70% of the total amount. Condition 2: After assignment, the total data amount in this server should not be more than 50% of the total capacity. If all of the tasks can be scheduled to the hosts in this server node in terms of the above 2 conditions, this assignment is finished. Otherwise, go to Step 5. Step 5: Try to schedule the task node to the next high-priority server node in the sequence. This step will be performed recursively until a successful assignment has been found. If all these attempts fail or the found assignment layout is too loose (for example, the task node includes only 5 tasks, but they are allocated to a server node with 10 servers), go to Step 6. Step 6: Split the task node, in other words select the child nodes of the task node with the assignment failure above. Then try to schedule this smaller task group to servers by above Step 2 to Step 5. Finally, when all of the tasks have been allocated to servers successfully, the task assignment is finished. If there are some tasks that still cannot be assigned to any nodes, some applications must be rejected. We should decrease the number of rejected applications. The basic strategy is: first, select the applications with the most unscheduled tasks, then reject them and remove all the tasks from them. As some computation resources have been saved, some unassigned tasks that do not belong to this application may have an extra chance.
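To make the priority ordering of Step 3 and the feasibility test of Step 4 concrete, the following Python sketch encodes the four judgement methods as a comparator, together with the two allocation conditions. It is a minimal illustration; the class and field names (ServerNode, mean_trd, layer, efficiency, can_host) are our own assumptions, not identifiers from the paper.

```python
from dataclasses import dataclass
from functools import cmp_to_key

@dataclass
class ServerNode:
    name: str
    mean_trd: float   # average TRD over the related periods (Step 2)
    layer: int        # depth in the server tree (larger = lower layer)
    efficiency: float # inherent energy efficiency of the node

def compare_nodes(a: ServerNode, b: ServerNode) -> int:
    """Four judgement methods of Step 3, most important first.
    A negative return value means `a` has the higher priority."""
    # Method 1: a node with mean TRD > 0 beats a node with mean TRD == 0.
    if a.mean_trd > 0 and b.mean_trd == 0:
        return -1
    if b.mean_trd > 0 and a.mean_trd == 0:
        return 1
    # Method 2: the node on a lower layer of the server tree wins.
    if a.layer != b.layer:
        return -1 if a.layer > b.layer else 1
    # Method 3: the larger mean TRD value wins.
    if a.mean_trd != b.mean_trd:
        return -1 if a.mean_trd > b.mean_trd else 1
    # Method 4: higher inherent energy efficiency breaks the final tie.
    if a.efficiency != b.efficiency:
        return -1 if a.efficiency > b.efficiency else 1
    return 0

def can_host(cpu_after: float, mem_after: float,
             data_after: float, data_capacity: float) -> bool:
    """Conditions 1 and 2 of Step 4, checked after a tentative assignment."""
    return cpu_after <= 0.70 and mem_after <= 0.70 \
        and data_after <= 0.50 * data_capacity

nodes = [ServerNode("rack-A", 0.0, 3, 0.9),
         ServerNode("rack-B", 1.2, 2, 0.6),
         ServerNode("rack-C", 0.8, 2, 0.7)]
ranked = sorted(nodes, key=cmp_to_key(compare_nodes))
print([n.name for n in ranked])  # rack-B and rack-C precede rack-A
```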

6.4. Re-adjustment

After the above steps, all the tasks have been scheduled to servers unless some applications were rejected. Now the tasks and VMs residing on under-loaded hosts should be re-adjusted so that more servers can be shut down. The re-adjustment proceeds as follows.

Step 1: Move the tasks that reside on machines whose CPU utilization is less than 20% for all of the time intervals; in other words, tasks on a server satisfying the condition ∀p, EffecUtil_p < 20% should be migrated first. The destination is selected by the following principles (a search sketch is given at the end of this subsection):

1. Starting from the nearest server in the server tree, try to find a VM of the same type in a period for which OOED = 1. If such a VM can accommodate the task, the migration succeeds. Otherwise, go to the next step.
2. Still starting from the nearest server in the server tree, try to find a period with OOED = 1 in which the task can be executed by starting a new VM, under the additional condition that the utilization does not exceed Opt_j after this allocation. If no suitable host can be found after a global search, go to step 3.
3. Among periods for which OOED = 0 or 0.5, but whose host includes at least one period with OOED = 1, check whether the task can be moved by starting a new VM. If no such host can be found, the migration is declared failed: label the task "cannot be migrated" and set the OOED value of the period it resides on to 1.

Step 2: Try to migrate the tasks residing on periods with OOED < 1 whose host has at least one period with OOED > 0. The destination is selected using principles 1 and 2 of Step 1 above. If the migration fails, do not go on to principle 3; simply label the task "cannot be migrated" and set the OOED of its period to 1.

During re-adjustment, new SLA violations may be generated. After the above steps, if some tasks still reside on under-loaded servers and cannot be migrated to other nodes, the cloud system must decide whether to reject some applications in order to turn off the under-loaded machines. The benefit of this high-price rejection solution is that the average utilization of the computing resources increases and a certain amount of power is saved. The obvious disadvantage is that some users' demands cannot be satisfied, i.e., the SLA violation rate increases. In the experiments, both solutions are tested.
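As a concrete illustration of the destination search in Step 1, the sketch below walks the three principles in order over hosts sorted nearest-first in the server tree. The data model (Period, Host, Task and their fields) is a hypothetical simplification introduced here for illustration; the paper itself does not prescribe these structures.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Period:
    ooed: float     # 0, 0.5, or 1
    vm_types: set   # VM types already running in this period
    util: float     # CPU utilization during this period

@dataclass
class Host:
    name: str
    opt: float      # optimal utilization point Opt_j
    periods: List[Period]

@dataclass
class Task:
    vm_type: str
    cpu_demand: float

def find_destination(task: Task, hosts: List[Host]) -> Optional[Tuple[str, str]]:
    """Three-principle destination search; `hosts` is assumed sorted
    nearest-first in the server tree. Returns (host name, action) or None."""
    # Principle 1: reuse a same-type VM in a period with OOED == 1.
    for h in hosts:
        if any(p.ooed == 1 and task.vm_type in p.vm_types for p in h.periods):
            return (h.name, "reuse-vm")
    # Principle 2: start a new VM in an OOED == 1 period, provided the
    # utilization after allocation stays within Opt_j.
    for h in hosts:
        if any(p.ooed == 1 and p.util + task.cpu_demand <= h.opt
               for p in h.periods):
            return (h.name, "new-vm")
    # Principle 3: an OOED 0 / 0.5 period on a host that has at least one
    # OOED == 1 period elsewhere.
    for h in hosts:
        if any(p.ooed == 1 for p in h.periods):
            if any(p.ooed < 1 and p.util + task.cpu_demand <= h.opt
                   for p in h.periods):
                return (h.name, "new-vm")
    return None  # failed: mark "cannot be migrated" and set its OOED to 1

host = Host("s1", 0.7, [Period(1, {"small"}, 0.3)])
print(find_destination(Task("small", 0.2), [host]))  # ('s1', 'reuse-vm')
```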

7. Simulation and results analysis

7.1. Experimental methods

7.1.1. Generating simulated workflows
There are 3 solutions of workflow experimental data, representing low, medium, and high workloads respectively. In each solution, every group contains 10 workflows, and there are 20 groups of workflow data. Therefore, every experiment in Section 7.2 is tested on all 20 data groups. Their generation methods are given below (a generator sketch follows this list).

1. Low-workload workflow data solution. In every group of experiment data, 10 workflows (applications) are generated in a random manner. Every workflow includes 3–6 tasks, and the dependencies of these tasks are random. Every task needs 1–4 random input datasets and produces 0–2 random output datasets. All data items are sized randomly between 0.5 and 50 GB, and the total number of datasets is fixed at 45. Using this strategy, 20 groups of low-workload workflow data are generated. The total number of tasks per group is between 34 and 53, averaging 46.

2. Medium-workload workflow data solution. In every group of experiment data, 10 workflows (applications) are generated in a random manner. Every workflow includes 6–9 tasks, and the dependencies of these tasks are random. Every task needs 1–4 random input datasets and produces 0–2 random output datasets. All data items are sized randomly between 0.5 and 50 GB, and the total number of datasets is fixed at 75. Using this strategy, 20 groups of medium-workload workflow data are generated. The total number of tasks per group is between 66 and 82, averaging 77.5.

3. High-workload workflow data solution. In every group of experiment data, 10 workflows (applications) are generated in a random manner. Every workflow includes 9–14 tasks, and the dependencies of these tasks are random. Every task needs 1–4 random input datasets and produces 0–2 random output datasets. All data items are sized randomly between 0.5 and 50 GB, and the total number of datasets is fixed at 115. Using this strategy, 20 groups of high-workload workflow data are generated. The total number of tasks per group is between 101 and 129, averaging 118.
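The following Python sketch shows one way to implement the random generation rules above, using the low-workload parameters as defaults. The function name and dictionary fields are illustrative assumptions, not artifacts of the paper's own tooling.

```python
import random

def generate_workflow_group(num_workflows=10, tasks_per_workflow=(3, 6),
                            total_datasets=45, size_range=(0.5, 50.0)):
    """Generate one group of experiment data following the rules of
    Section 7.1.1 (defaults are the low-workload parameters)."""
    # Dataset sizes in GB, drawn uniformly from the given range.
    datasets = [round(random.uniform(*size_range), 1)
                for _ in range(total_datasets)]
    workflows = []
    for w in range(num_workflows):
        n_tasks = random.randint(*tasks_per_workflow)
        tasks = []
        for t in range(n_tasks):
            tasks.append({
                "id": (w, t),
                # random dependencies on earlier tasks in the same workflow
                "deps": random.sample(range(t), random.randint(0, t)),
                # 1-4 random input datasets, 0-2 output datasets
                "inputs": random.sample(range(total_datasets),
                                        random.randint(1, 4)),
                "outputs": random.randint(0, 2),
            })
        workflows.append(tasks)
    return datasets, workflows

datasets, workflows = generate_workflow_group()
print(sum(len(wf) for wf in workflows), "tasks in this group")
```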

7.1.2. Simulated cloud systems
For all of the experiments, the server configuration in the cloud is fixed: there are 10 servers, including 2 machines with 8 cores and 16 GB of memory, 6 machines with 16 cores and 32 GB of memory, and 2 machines with 32 cores and 64 GB of memory. There are 4 types of network structures among the servers, and the concrete bandwidth values between each pair of nodes are also generated randomly. It should be noted that every workflow data group is run on all 4 network structures.

7.2. Experiment strategies
To evaluate the performance of this energy-aware task scheduling method, we run each workflow group through 3 simulation strategies. The first two are the methods we proposed, distinguished by whether they reject the high-price applications in the re-adjustment step of Section 6.4; hence, they are denoted "proposed method without high-price rejection" and "proposed method with high-price rejection". The last method is the baseline schedule used for performance comparison. In this method, the energy efficiency problem is ignored. The workflow with the earliest deadline has the highest priority, and its tasks are scheduled in time order. When selecting the target position, the host is chosen at random and power efficiency is not taken into consideration. VM sharing is also encouraged. If the resource capacity is sufficient, all the tasks in a workflow are assigned to the same server; otherwise, the system attempts to schedule the remaining tasks to a nearby server. When task scheduling is finished, the servers with no tasks are shut down.

7.3. Evaluation indicators
In the experiment result analysis, we analyze 3 evaluation indicators: total server power consumption, SLA violation rate, and the global amount of data movement. Because the power consumption saved on network transmission has not been measured directly, the third indicator can reflect this aspect of performance to a certain extent.

Indicator 1: Total server power consumption (Total_Power). In Section 4.2, the calculation method of a server's instantaneous power consumption was given in Eqs. (2) and (3). The overall power consumption of a certain machine $s_j$ is:

$$
Power_j=
\begin{cases}
\sum_{p_x}\left(\left(Power_{static}+util^{j}_{p_x}\cdot\alpha\right)\cdot\Delta t_{p_x}\right), & util^{j}_{p_x}\le opt_j\\
\sum_{p_x}\left(\left(Power_{static}+opt_j\cdot\alpha+\left(util^{j}_{p_x}-opt_j\right)^{2}\cdot\beta\right)\cdot\Delta t_{p_x}\right), & util^{j}_{p_x}> opt_j
\end{cases}
\qquad(12)
$$

where $p_x$ is the $x$th time period, $util^{j}_{p_x}$ is the total CPU utilization rate of machine $s_j$ in the $x$th period, and $\Delta t_{p_x}$ is the length of this period. Therefore, the total server power consumption is:

$$
Total\_Power=\sum_{j=1}^{|S|}Power_j \qquad(13)
$$

Indicator 2: SLA violation rate. The SLA violation rate is defined as the percentage of tasks that cannot be finished by their deadlines.
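For a quick sanity check of Eqs. (12) and (13), the following Python sketch renders the piecewise power model directly. It is a minimal illustration; the parameter values in the example (static power, α, β, opt_j) are invented for demonstration rather than taken from the server model of Section 4.2.

```python
def server_power(periods, power_static, alpha, beta, opt):
    """Power_j per Eq. (12): linear in CPU utilization up to the optimal
    point `opt`, with a quadratic penalty beyond it. `periods` is a list
    of (utilization, duration) pairs for the periods p_x."""
    total = 0.0
    for util, dt in periods:
        if util <= opt:
            total += (power_static + util * alpha) * dt
        else:
            total += (power_static + opt * alpha
                      + (util - opt) ** 2 * beta) * dt
    return total

def total_power(servers):
    """Total_Power per Eq. (13): the sum of Power_j over all servers."""
    return sum(server_power(**s) for s in servers)

# Illustrative two-server example; all numbers are made up.
servers = [
    {"periods": [(0.4, 10.0), (0.9, 5.0)],
     "power_static": 100.0, "alpha": 150.0, "beta": 300.0, "opt": 0.7},
    {"periods": [(0.6, 15.0)],
     "power_static": 120.0, "alpha": 160.0, "beta": 320.0, "opt": 0.7},
]
print(total_power(servers))
```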


Indicator 3: Global amount of data movement. As mentioned in Section 6.1, tasks are packed together with datasets, and they are assigned to servers together. As the work in this paper is oriented to data-intensive applications, the datasets are usually large; therefore, we assume every dataset is stored on exactly one node, i.e., no duplicates exist. However, some data items are needed by multiple tasks, and newly generated data items are not always required by the local tasks, so data movements are unavoidable at runtime. Based on the dependencies between tasks and data, the total amount of data transmitted by the 10 applications can be calculated.

7.4. Results and analysis

7.4.1. Energy consumption analysis
For each workload type, the 20 groups of workflow experiment data (10 workflows per group) and the 4 cloud structures are combined into 80 groups of experiment data. All 80 groups are tested by the three experiment methods, and the average results are shown in Fig. 14. From this figure we can see that, compared with the baseline method, our two proposed methods reduce energy consumption efficiently. A detailed analysis of the two proposed methods follows.

1. The method without high-price rejection. At low workload, power consumption is reduced by 37.5% compared with the baseline method; at medium workload the reduction is 36.6%; and at high workload it is 24.5%. The improvement shrinks at high workload because the workload approaches saturation: the CPU load of every machine may be heavy even under an intelligent scheduling method, so it is hard to shut down machines to decrease energy consumption. The reason becomes clearer if we look at how many servers can be shut down after task scheduling; the result is shown in Table 2. Comparing the two methods, the gap in the average number of closed servers is only 1 server when the workload is high.

From Table 2, some further conclusions can be drawn. At the low and medium workload levels in particular, our method can close many more servers on average than the baseline. The main reasons are: first, by means of data correlation clustering of the datasets and tasks, many data movements are eliminated; second, through the tree-structured modeling of the cloud system in terms of network conditions, the data transmission time can be further reduced. Since a large amount of time is saved by the proposed method, task scheduling becomes more flexible, which is why many more machines can be turned off.

Fig. 14. Energy consumption comparison.

Table 2
Comparison results with different workload levels.

Workload level | Method         | Closed server number | Average closed server number | Consumption reduced rate
Low            | Our 1st method | 4–6                  | 5.2                          | 37.5%
Low            | Baseline       | 0–4                  | 1.8                          | –
Medium         | Our 1st method | 1–4                  | 3                            | 36.6%
Medium         | Baseline       | 0–2                  | 0.8                          | –
High           | Our 1st method | 0–3                  | 1.2                          | 24.5%
High           | Baseline       | 0–1                  | 0.2                          | –

Therefore, we can draw the conclusion that, for a reasonable workload level, our proposed method can reduce energy consumption very efficiently whether or not the high-price application rejection step is performed.

2. The method with high-price rejection. Here, we mainly compare the experiment results of this method with those of the method without high-price rejection. In Section 6.4, it was mentioned that if some hosts still have very low utilization after task re-adjustment and their tasks cannot be moved to other nodes, one solution is to reject several applications in order to shut down these under-loaded machines. The benefit of this solution is that the average utilization of the computing resources can be increased and a certain amount of power can be saved; the obvious disadvantage is that some users' demands cannot be satisfied, i.e., the SLA violation rate increases. So a trade-off must be made between these two effects. As shown in Fig. 14, compared with the first proposed method, the decrease in energy consumption achieved by the high-price application rejection operation is relatively obvious at the low and medium workload levels: 10.5% and 14.1%, respectively. However, in the high-workload situation, only 2.3% of power is saved. The reason is that when the workload is very high, the utilization of every host may already be high, so it becomes less and less likely that this task migration method can free up an unused machine. The reason will become clearer after the SLA violation comparisons.

7.4.2. SLA violation analysis
The SLA violation rate is very important in task scheduling. Table 3 shows the comparison results of these methods. As shown in Table 3, compared with the first proposed method, the baseline method results in a much higher SLA violation rate. Especially at the high workload level, the ratio of applications that cannot be finished is very high under the baseline method. This is because the baseline has no transmission-aware optimization, and as the datasets are large, a large amount of data movement is unavoidable; when the workload is very high, there are simply not enough computing resources. What can also be seen from Table 3 is that, after introducing the high-price rejection strategy, the SLA violation rate increases by a certain amount. When the workload level is high, the increase in the SLA violation rate is not the highest; instead, it is the lowest. This is because, as the workload gets higher, most machines are moderately loaded or overloaded and therefore should not be turned off in most situations. This can also be seen from Table 3: even without the high-price rejection operation, 3.1% of applications still cannot be finished by their deadlines at the high load level. So we believe the utilization of the servers is higher at the high workload level.


Table 3
SLA violation rate comparison.

Method                                                              | Low | Medium | High
Our 1st method (without high-price rejection)                       | 0   | 0      | 3.1%
Our 2nd method (with high-price rejection): generated by rejection  | 6%  | 6.8%   | 3.5%
Our 2nd method (with high-price rejection): overall violation rate  | 6%  | 6.8%   | 6.6%
Baseline method                                                     | 0   | 2.2%   | 11.5%


Fig. 15. Comparison of the amounts of data transmission.

Reviewing Fig. 14, with the high-price task rejection operation, power consumption can be reduced especially at the low and medium workload levels. Therefore, whether the high-price application rejection operation should be performed depends on the real requirements of the system. For instance, if the system has a high demand for QoS, the high-price rejection operation should not be performed.

7.4.3. Amounts of data transmission
In this paper, the power consumption saved on network devices is not measured directly, but the indicator "amount of data transmission" can reflect this performance to a certain extent. The comparison results are shown in Fig. 15. It is clear that our proposed method can greatly reduce the global amount of data movement in the cloud system; therefore, the decrease in energy consumption on network devices will also be very obvious. The conclusion can be drawn that our proposed energy-efficiency-aware task scheduling method reduces energy consumption very efficiently while also keeping the SLA violation rate low.

8. Conclusions and future work

In this paper, a new offline task scheduling method for the cloud was proposed in order to reduce energy consumption. The objective of this work is to provide a scheduling framework with high energy efficiency and a low SLA violation rate for data-intensive applications in the era of Big Data, which is expected to satisfy the technological requirements of cloud computing. Both the datasets and the cloud system were built into tree-structure models by data-correlation-based clustering. Hence, the amount of global data movement can be reduced greatly, which decreases the SLA violation rate and improves the energy efficiency of servers and network devices in the cloud. In addition, an energy consumption model was built, based on which the principles of power-efficiency-aware scheduling were applied. Then a vital parameter, the Task Requirement Degree (TRD), was proposed, which improves resource utilization efficiency and decreases energy consumption during the scheduling process. Experiment results demonstrated that the proposed scheduling method can simultaneously increase energy efficiency and meet users' deadline constraints. However, as a heuristic method is used for task assignment in this paper, only a near-optimal energy-aware solution can be found. In the future, optimization algorithms based on evaluation functions will be researched to achieve higher energy efficiency. On the other hand, the online task scheduling issue has not been researched here. For the purpose of fine-grained workflow modeling, a running-time estimation method based on WCET and network transmission time has been used in task scheduling. In the future, we will focus on online task scheduling and VM placement.


Acknowledgments

This work is supported by the National Natural Science Foundation of China (61272509 and 61402332) and the Natural Science Foundation of Tianjin University of Science and Technology (20130124).

References

Buyya R. Market-oriented cloud computing: vision, hype, and reality of delivering computing as the 5th utility. In: Proceedings of the 9th IEEE/ACM international symposium on Cluster Computing and the Grid (CCGrid); 2008.
Armbrust M, et al. A view of cloud computing. Commun ACM 2010:50–8.
Pedram M. Energy-efficient datacenters. IEEE Trans Comput-Aided Des Integr Circuits Syst 2012:1465–84.
Qureshi A, Weber R, Balakrishnan H, Guttag J, Maggs B. Cutting the electric bill for Internet-scale systems. In: Proceedings of the ACM SIGCOMM 2009 conference on data communication; 2009. p. 123–34.
Gorbenko A, Popov V. Task-resource scheduling problem. Int J Autom Comput 2012;9(4):429–41.
Fard HM, Prodan R, Barrionuevo JJD, Fahringer T. A multi-objective approach for workflow scheduling in heterogeneous environments. In: Proceedings of the 12th IEEE/ACM international symposium on Cluster, Cloud and Grid Computing; 2012. p. 300–9.
Chen G, et al. Energy-aware server provisioning and load dispatching for connection-intensive internet services. In: Proceedings of the 5th USENIX symposium on networked systems design and implementation; 2008. p. 337–50.
Barham P, et al. Xen and the art of virtualization. In: Proceedings of the 19th ACM symposium on operating system principles; 2003. p. 164–77.
Cavdar D, Alagoz F. A survey of research on greening data centers. In: Proceedings of the IEEE global telecommunications conference; 2012. p. 3237–42.
Srikantaiah S, et al. Energy aware consolidation for cloud computing. Cluster Comput 2009;12(1):1–15.
The Green Grid. Unused servers survey results analysis. Tech. rep.; 2010.
Iosup A, Dumitrescu C, Epema D, Li H, Wolters L. How are real Grids used? The analysis of four Grid traces and its implications. In: Proceedings of the IEEE/ACM international conference on Grid Computing; 2006.
Kwok Y, Ahmad I. Benchmarking the task graph scheduling algorithms. In: Proceedings of the first merged international parallel processing symposium and symposium on parallel and distributed processing; 1998. p. 531–7.
Topcuoglu H, et al. Performance-effective and low-complexity task scheduling for heterogeneous computing. IEEE Trans Parallel Distrib Syst 2002;13(3):260–74.


Gao Y, et al. Using explicit output comparisons for fault tolerant scheduling (FTS) on modern high-performance processors. In: Proceedings of Design, Automation and Test in Europe; 2013. p. 927–32.
Warneke D, Kao O. Nephele: efficient parallel data processing in the cloud. In: Proceedings of the 2nd ACM workshop on many-task computing on Grids and supercomputers; 2009.
Bobroff N, Kochut A, Beaty K. Dynamic placement of virtual machines for managing SLA violations. In: Proceedings of the 10th IFIP/IEEE international symposium on integrated network management; 2007. p. 119–28.
Wood T, Shenoy P, Venkataramani A, Yousif M. Sandpiper: black-box and gray-box resource management for virtual machines. Comput Netw 2009;53:2923–38.
Ajiro Y, Tanaka A. Improving packing algorithms for server consolidation. In: Proceedings of the international conference for the Computer Measurement Group; 2007. p. 399–407.
Wang M, Meng X, Zhang L. Consolidating virtual machines with dynamic bandwidth demand in data centers. In: Proceedings of the IEEE INFOCOM 2011 mini-conference; 2011. p. 71–5.
Tammaro D, et al. Dynamic resource allocation in cloud environment under time-variant job requests. In: Proceedings of the 2011 IEEE 3rd international conference on cloud computing technology and science; 2011. p. 592–8.
Mazzucco M, et al. Maximizing cloud providers' revenues via energy aware allocation policies. In: Proceedings of the IEEE 3rd international conference on cloud computing; 2010. p. 131–8.
Lin M, et al. Dynamic right-sizing for power-proportional data centers. IEEE/ACM Trans Netw 2013:1378–91.
Bi J, et al. Dynamic provisioning modeling for virtualized multi-tier applications in cloud data center. In: Proceedings of the IEEE 3rd international conference on cloud computing; 2010. p. 370–7.
Aoun R, et al. Resource provisioning for enriched services in cloud environment. In: Proceedings of the 2010 IEEE 2nd international conference on cloud computing technology and science; 2010. p. 296–303.

Zaman S, et al. An online mechanism for dynamic VM provisioning and allocation in clouds. In: Proceedings of the IEEE 5th international conference on cloud computing; 2012. p. 253–60.
Bjorkqvist M, et al. Opportunistic service provisioning in the cloud. In: Proceedings of the IEEE 5th international conference on cloud computing; 2012. p. 237–44.
Halder K, et al. Risk aware provisioning and resource aggregation based consolidation of virtual machines. In: Proceedings of the IEEE 5th international conference on cloud computing; 2012. p. 598–605.
Hadji M, Zeghlache D. Minimum cost maximum flow algorithm for dynamic resource allocation in clouds. In: Proceedings of the IEEE 5th international conference on cloud computing; 2012.
Knauth T, Fetzer C. Energy-aware scheduling for infrastructure clouds. In: Proceedings of the IEEE 4th international conference on cloud computing technology and science; 2012. p. 58–65.
Adhikary T, Das A, Razzaque M, Sarkar A. Energy-efficient scheduling algorithms for data center resources in cloud computing. In: Proceedings of the IEEE international conference on embedded and ubiquitous computing; 2013. p. 1715–20.
Yuan D, Yang Y, Liu X, Chen J. A data placement strategy in scientific cloud workflows. Future Gener Comput Syst 2010;26(8):1200–14.
Zhao E, Qi Y, Xiang X, Chen Y. A data placement strategy based on genetic algorithm for scientific workflows. In: Proceedings of the international conference on computational intelligence and security; 2012. p. 146–9.
Isard M, et al. Dryad: distributed data-parallel programs from sequential building blocks. Oper Syst Rev 2007;41(3):59–72.
Chen R, et al. On the efficiency and programmability of large graph processing in the cloud. Technical Report MSR-TR-2010-44, Microsoft Research; 2010.
Xavier S, Lovesum SJ. A survey of various workflow scheduling algorithms in cloud environment. Int J Sci Res Publ 2013;3(2).
Zhao Q, et al. A data placement strategy for data-intensive scientific workflows in cloud. In: Proceedings of the IEEE/ACM international symposium on Cluster, Cloud, and Grid Computing; 2015.
