Future Generation Computer Systems 29 (2013) 1152–1163
Decreasing power consumption with energy efficient data aware strategies

Susan V. Vrbsky*, Michael Galloway, Robert Carr, Rahul Nori, David Grubic

Department of Computer Science, The University of Alabama, Tuscaloosa, AL 35487-0290, United States
Article history: Received 24 March 2011; Received in revised form 17 December 2012; Accepted 21 December 2012; Available online 8 January 2013

Keywords: Data backfilling; Data cluster; Data grid; Data intensive computing; Data replication; Energy; Power consumption; QLFU
Abstract

Regardless of whether data is stored in a cluster, grid, or cloud, data management is being recognized as a significant bottleneck. Computing elements can be located far away from the data storage elements. The energy efficiency of the data centers storing this data is one of the biggest issues in data intensive computing. In order to address such issues, we are designing and analyzing a series of energy efficient data aware strategies involving data replication and CPU scheduling. In this paper, we present a new strategy for data replication, called Queued Least-Frequently-Used (QLFU), and study its performance to determine if it is an energy efficient strategy. We also study the benefits of using a data aware CPU scheduling strategy, called data backfilling, which uses job preemption in order to maximize CPU usage and allows for longer periods of suspension time to save energy. We measure the performance of QLFU and existing replica strategies on a small green cluster to study the running time and power consumption of the strategies with and without data backfilling. Results from this study have demonstrated that energy efficient data management can reduce energy consumption without negatively impacting response time.

© 2013 Elsevier B.V. All rights reserved.
1. Introduction

Data intensive scientific applications are becoming larger in scale as cluster, grid, and cloud computing systems provide access to data and storage not previously available to many users. Clusters are tightly coupled systems in which the individual components can act together as a single computer. The connection of the components through a LAN provides high performance and high availability. Grid systems allow the sharing of data and resources in dynamic, multi-institutional virtual organizations [1] across the globe. Grids involve loosely coupled jobs and can have computationally intensive tasks and/or the need to manage an extremely large number of data sets, which are typically larger than 1 TB. A data grid computing system is utilized for processing and managing this very large amount of distributed data, allowing thousands of clients around the world to access the data. Grid systems also host user-level services, such as scientific simulation and modeling [2]. A data grid can incorporate a cluster, and the cluster's resources can be managed as part of the grid. Cloud computing systems also increase the availability of computational resources to users. Cloud infrastructures [3] typically provide elasticity in terms of certain system resources, in which users can increase or decrease the amount of resources requested, depending on their needs. The usage and
* Corresponding author. E-mail addresses: [email protected] (S.V. Vrbsky), [email protected] (M. Galloway).

doi:10.1016/j.future.2012.12.016
sharing of data and resources can require paying for services, and unlike a data grid, a cloud is managed by a single provider. Regardless of whether data is stored in a cluster, grid, or cloud, data management is being recognized as a significant bottleneck [4]. This is because data resources in a traditional distributed system are typically treated as second class objects. To address the problems inherent in processing massive amounts of data, new strategies for scheduling and resource allocation that consider data as first class objects are needed. While some data aware strategies have been proposed to reduce the bottleneck [4], the power efficiency of these strategies is not considered. Storing and managing large amounts of data are recognized as extremely energy inefficient [1]. Power consumption and cooling are the biggest issues in data centers [5–8]. A data center containing 1000 racks in 25,000 square feet would require 10 MW of power for the computing infrastructure and an additional 5 MW to dissipate the heat using current semiconductor technology [1]. A 2006 US government report found that data centers consumed 1.5% of all electricity consumed in the US [6]. Power demand was reported as growing every year, with storage devices having the fastest growing power consumption rate (191%) and with overall power consumption growing 23% from 2000 to 2006 [7]. Data centers in the US were reported to have used 2% of the country's total electricity in 2010 [9]. Subsequent to the 2006 report, much attention has been paid to this issue. Some current approaches to reducing the energy consumed involve multi-core architectures with higher performance, lower power budgets, and virtualization [10]. Moving the
data centers to locations near renewable energy sources, such as wind, geothermal, and solar, also effectively decreases the amount of power consumed from the US electricity grid. Another approach is to consolidate data centers to make larger data centers that are more efficient [11]. The US government created Energy Star standards [6] for servers, and a Green Grid consortium comprised of prominent industry vendors is also addressing the problem [5]. The three possible areas to address in designing data aware energy efficient strategies for data intensive computing are: disk storage, the amount of data transmitted, and CPU usage. An analysis by Dell Computers of their data centers [12] determined that 37% of data center utility power is consumed by data storage equipment, 23% by network equipment and 40% by servers. Data storage is expected to surpass all other power consumers, as its current growth rate of 20% is expected to accelerate [13]. In a data grid system, computing elements are often located far away from the data storage elements. A common solution to improve availability and file access time is to replicate the data, resulting in the creation of copies of data files at many different locations [14]. Data placement and replacement strategies that decrease the amount of data transmitted (the file access times), while not requiring an increase in data storage, are needed for energy efficiency. There will always be a trade-off between high availability and high energy efficiency when using data management algorithms. In order to address such issues, we are designing and analyzing a series of energy efficient data aware strategies, involving data placement and CPU scheduling. In other words, we propose solutions that are energy efficient and also reduce data bottlenecks.
Our work involves two facets: (1) Designing data aware strategies: designing energy efficient strategies for data intensive computing systems that minimize disk storage, the amount of data that needs to be transmitted, and the power used by the CPU. (2) Impact of data aware strategies on energy efficiency: implementing these strategies on a prototype green cluster to determine the benefits to the system. We will focus on data grids and will use a green cluster to simulate grid behavior. Without direct access to modify the physical characteristics of the resources used in data centers, we have to rely solely on software algorithms to produce the highest efficiency available. For every resource used, there is a maximum efficiency threshold that cannot be exceeded except by replacing that resource with a more efficient version. The idea is to bring all resources to their maximum efficiency (in terms of requests per Watt) over their lifespan and eventually justify the need to replace any resource with a more efficient version as electrical engineers advance in the area of semiconductor and storage devices. In this paper, we present a new strategy for data replication, called Queued Least-Frequently-Used (QLFU). We implement QLFU on our green system and compare it to traditional strategies to determine the energy efficiency of these strategies. We also study the benefits of data aware CPU scheduling strategies for job preemption, in order to maximize CPU usage and to allow for longer periods of suspension time to save energy. The control over CPU energy usage is much greater than over other system components. Techniques to decrease CPU energy consumption include low-power/low-frequency processing and increasing the time the CPU is suspended. We analyze one such CPU scheduling strategy, called data backfilling [15], to determine its effect on the power consumed by our green system.
We analyze the performance of data backfilling for different replication strategies, including QLFU. The remainder of this paper is organized as follows. In Section 2 we describe related work. In Section 3 we present our new replication strategy of QLFU and study its performance in terms of time, power and energy consumed. In Section 4 we describe our CPU scheduling strategy of data backfilling and also study its performance in terms of time, power and energy consumed. Section 5 provides conclusions and future work.
2. Related work

The need to consider the energy efficiency of computing systems is becoming increasingly popular. In a number of white papers [5], a group of information technology professionals called the Green Grid has been proposing energy efficient strategies for large scale data centers. Their goal was to design data metrics that would analyze and scale the efficiency and productivity of data transfers. In their work they suggested two major metrics [5]:

Power usage effectiveness (PUE) = Total data center facility power / IT equipment power

Data center infrastructure efficiency (DCiE) = 1 / PUE.
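As a minimal illustration, the two Green Grid metrics above can be computed as follows (the 10 MW facility load and 6 MW IT load are hypothetical figures, not taken from the paper):

```python
# Hedged sketch of the two Green Grid metrics defined above.
# The 10 MW / 6 MW figures below are illustrative only.

def pue(total_facility_power_w, it_equipment_power_w):
    """Power usage effectiveness: total facility power / IT equipment power."""
    return total_facility_power_w / it_equipment_power_w

def dcie(total_facility_power_w, it_equipment_power_w):
    """Data center infrastructure efficiency: 1 / PUE."""
    return 1.0 / pue(total_facility_power_w, it_equipment_power_w)

# A facility drawing 10 MW in total to deliver 6 MW of IT load.
print(pue(10e6, 6e6))    # PUE > 1; closer to 1 is more efficient
print(dcie(10e6, 6e6))   # DCiE is the reciprocal, expressed as a fraction
```

A PUE of 1.0 (DCiE of 100%) would mean every Watt entering the facility reaches the IT equipment; real data centers are always above 1.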
They also consider the Telecommunications Equipment Energy Efficiency Rating (TEEER) developed by Verizon. In [11] the authors propose a blueprint for an Energy Aware Grid that incorporates energy efficiency and thermal management for data centers. Their results demonstrate the economic feasibility and energy savings of their approach. As another example of the increasing awareness of the need for energy efficiency, IEEE Computer magazine devoted an issue to green computing [16] and dedicated a regular column to Green IT. Insight into the CPU utilization of servers is provided in [17], indicating servers are almost never totally idle or operating at their maximum utilization. Peak energy efficiency (processed requests per Watt) of servers occurs at peak utilization. However, most servers spend most of their time in the 20%–30% utilization range, at which point the energy efficiency is less than half of the energy efficiency found at peak performance. Server processors can provide a wide dynamic power range, while the range of other components is much smaller, e.g. 25% for disk drives and 15% for networking switches [16]. Disk devices have a latency and energy penalty for the transition from an inactive to an active mode, and networking equipment typically does not have low-power modes. As a result, alternative strategies are needed to reduce the data transmitted and the amount of disk storage utilized in a data grid. Benchmarks for the power supplied to a single unit server, such as SPEC, have been applied by some organizations that analyze performance [18]. The authors of [18] propose a power consumption model based on a benchmark from a previously published report by the Transaction Processing Performance Council (TPC-C). They apply various power usage trends to the model to analyze the consumption trends and to identify the components that are power-intensive.
They report advancements in software and hardware systems that could make them energy efficient. This acts as a standard for balancing the performance and price effectiveness of computing systems. Other work [19] surveys power and energy management techniques that are applied to modern day server systems. The authors link these results to the workloads of each of the servers and identify techniques that could provide significant savings in energy with minimal loss in the systems' performance. Their survey concentrates on energy saving at the individual component level, as well as cluster-wide power savings in a data grid. They compare current and past work and suggest a future course of action in the area of power savings in a data grid. Data replication is one way to reduce the amount of data transmitted, but smart replication is needed to reduce storage. Early work on data replication [20–22] in data grids has placed the most
attention on decreasing the data access latency and the network bandwidth consumption. In [21], six replica strategies are presented, and the results indicate that the best strategy has significant savings in latency and bandwidth consumption if the access patterns contain a moderate amount of geographical locality. In [22], the HotZone algorithm is presented to place replicas in a wide-area network so that the client-to-replica latency is minimized. In [20] a dynamic replication strategy, HBR, is proposed that benefits from ''network-level locality'', utilizing the broadest bandwidth to the site of the job execution. Later, researchers began to focus on the availability of the data. In [23], the authors propose providing replicated fragments of files to increase data access performance and load balancing. In [24], a dynamic model-driven replication approach is presented in which peers create replicas automatically in a decentralized fashion. The authors propose to meet the data availability goal based on the assumption that the total system replica storage is large enough to hold all the data replica copies. In [25] the authors propose a data replication strategy for star-topology data grids that replicates files based on their popularity, as calculated periodically. We note that none of this work on data replication considers power consumption. There has been much research in the area of Dynamic Voltage Scaling (DVS) [26,27], which builds on the tradeoff between performance and energy savings. It has been recognized that low-performance, low-power processing is adequate in many situations. Low performance results from lowering the operating frequency of the processor. By lowering the frequency, the processor can operate at a lower voltage and, hence, requires less energy [27]. This work demonstrates the potential for energy savings if the performance scales better than linearly as CPU frequency is decreased.
Recent work [28,29] considers powering down disk storage during non-peak hours. In [28] a subset of the disks is powered down, but all data items are still available in the system. The goal is to have little impact on the performance of the system during normal operation. Virtual nodes and migration are utilized in [29], so that a small number of disks can be used without overloading them. The authors also introduce an extension of their strategy in which some of the data is replicated. We note that while we also consider replication in our work, we do not power down the disk storage. The work in the Petashare project [4] proposes several approaches to data aware scheduling for CPU-oriented batch schedulers and proposes a new data aware paradigm. Experiments indicate that increasing parallelism and concurrency does not always result in an increase in the data transfer rate. In addition, their results indicate that data placement by a scheduler can improve the performance and reliability of a system. The authors also consider such factors as remote I/O versus staging. The authors do not consider the energy efficiency of their approaches.

3. Data replication

When the data are distributed among different sites in a data intensive environment, there is a high probability the system will access data that is not at the local site. Remote data file access can be a very expensive operation, considering the sizes of the files and limited networking bandwidth or network congestion. To reduce the amount of time needed to access the data, data replication can be utilized to avoid remote file access. However, in order to decrease the amount of energy needed to store the data, it is imperative that a limit be placed on the size of the storage. We propose to minimize the energy consumed through the use of smart data replication to reduce the cost of accessing and storing the data.
Fig. 1. Data grid architecture.
3.1. Data grid architecture

We consider single-tier grids in our work. We assume that in an organization a mixture of applications exists, whereby some of the grids will be single-tier in structure [30], while other applications will use a more complicated structure, such as a multi-tier grid. We expect the strategies developed for single-tier grids here can be used within the multi-tier structure and will investigate this in the future. Fig. 1 illustrates a single-tier grid architecture. A data grid typically consists of job execution sites containing the following components: a Computing Element (CE), a Storage Element (SE) and a Replica Manager containing a replica optimizer. A set of jobs, J = (J1, J2, J3, ..., JN), is submitted to the Resource Broker agent through a user interface. The resource broker agent then schedules the jobs to the appropriate computing sites, where they may be queued in the site's job queue. Each job will request one or more files from the file set, F = (f1, f2, ..., fk), where each file fi is of size Si, to complete its task. It is common for a job in a data grid to list all the files needed to complete its task. Hence, we utilize this aspect in designing a data replication scheme.

3.2. QLFU strategy

We now present our new replication and replacement scheme QLFU, the Queued LFU (least-frequently-used) replica scheme. QLFU considers the number of times a file is accessed in the future and how far the replica optimizer will look ahead into the job queue. Traditional LFU considers the references to the file in the past and replaces the file that has been used the least frequently in the past. LFU assumes a file not frequently used in the past is not likely to be used in the future. This differs from QLFU, which considers only future requests when making decisions for the immediate present. QLFU replaces files that are the least likely to be used in the future.
This means every time a pending request for a file is queued, a count associated with that file is incremented, which represents the number of times the file will be requested in the future. The file with the smallest count that is resident in the storage element is the one to be replaced. For example, assume a total of 5 files in our system, where files are numbered from 1 to 5. Also assume the storage element can hold 4 files and contains file 2, file 1, file 3 and file 4. The queue of pending requests contains 11 files. File 5 needs to be stored in the SE for the next job.
File 5 will not fit into the SE unless one of the files is replaced. According to the QLFU strategy, the file with the fewest pending requests will be replaced. If we look ahead 6 file requests into the future (after the request for file 5), we have the following numbers of pending requests.

File:                         1   2   3   4   5
Number of pending requests:   3   2   1   0   1
File 1 has 3 pending requests, file 2 has 2 pending requests, files 3 and 5 each have 1 pending request, while file 4 has no pending requests. The file in the storage element with the least number of future requests is file 4, so it will be replaced with file 5. The resulting storage element now contains the following files:

Storage element:   2   1   3   5
If instead we look ahead 10 file requests into the future, we have the following numbers of pending requests.

File:                         1   2   3   4   5
Number of pending requests:   3   3   1   2   2

Fig. 2. Algorithm QLFU.
File 1 has 3 pending requests, file 2 has 3 pending requests, file 3 has 1 pending request, and files 4 and 5 each have 2 pending requests. File 3 has the least number of future requests, so it will be replaced with file 5. The resulting storage element in this case now contains the following files:

Storage element:   2   1   5   4
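The replacement decision in the worked example above can be sketched in a few lines of Python (the helper name `choose_victim` is ours, not from the paper):

```python
# QLFU victim selection: the resident file with the fewest pending requests.
def choose_victim(storage_element, pending_counts):
    return min(storage_element, key=lambda f: pending_counts.get(f, 0))

se = [2, 1, 3, 4]                                # files resident in the SE

# Looking ahead 6 requests: counts from the first table above.
counts_6 = {1: 3, 2: 2, 3: 1, 4: 0, 5: 1}
victim = choose_victim(se, counts_6)             # file 4 has no pending requests
se = [5 if f == victim else f for f in se]
print(victim, se)                                # 4 [2, 1, 3, 5]

# Looking ahead 10 requests: counts from the second table above.
counts_10 = {1: 3, 2: 3, 3: 1, 4: 2, 5: 2}
print(choose_victim([2, 1, 3, 4], counts_10))    # 3
```

Note that the same storage element yields a different victim (file 4 versus file 3) purely because the lookahead horizon changes the pending-request counts.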
An implementation of Algorithm QLFU appears in Fig. 2. We assume each site in the grid contains a replica manager (Fig. 1) which implements Algorithm QLFU. In Algorithm QLFU, a counting array (Count_Array) is maintained which represents each of the possible files in the system and the number of times it is requested in the queue of pending requests at the local site. When a new nextJob arrives at a site, the count for each file requested by the new nextJob is incremented in Count_Array and the new nextJob is added to the current list of pending jobs (Job_Curr) in FIFO order (lines 9–12). When a nextJob is completed, the count for each file requested by the nextJob is decremented in Count_Array (lines 13–16). As the Count_Array is incremented and decremented, the Count_Array is reordered (line 17). To accomplish this reordering efficiently, the system can also maintain an ordering array, which is an array of pointers to the Count_Array positions ordered by count of requests from greatest to least. The Count_Array is reordered as needed, and after the order is initially set up, the reordering can be limited to swapping the positions of a small number of pointers. As the nextJob is processed, each neededFile requested by nextJob, but not stored in the SE, is handled as follows (lines 18–23). If there is room for the neededFile, it is stored in the SE; otherwise, the Count_Array is used to determine which file to replace. The resident file with the current smallest value in the Count_Array is the one replaced by the neededFile. In Algorithm QLFU, the file that is replaced is the one with the shortest waiting request queue, and in essence, is used the least frequently in the future. By maintaining the count of requests for each file, replication choices can be made that will be statistically beneficial to the entire queue rather than the upcoming jobs only.
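The bookkeeping described above can be sketched as follows. This is our reading of Algorithm QLFU, not a copy of Fig. 2: the class and method names are ours, and the ordering-array optimization is omitted (a linear `min` scan stands in for it).

```python
from collections import defaultdict, deque

class QLFUReplicaManager:
    """Sketch of Algorithm QLFU: Count_Array tracks how many pending
    requests reference each file; when the SE is full, the resident
    file with the smallest count is evicted."""

    def __init__(self, capacity):
        self.capacity = capacity        # SE size, in files
        self.storage = set()            # files resident in the SE
        self.count = defaultdict(int)   # Count_Array
        self.jobs = deque()             # Job_Curr, FIFO order

    def job_arrives(self, files):       # lines 9-12: increment counts, enqueue
        for f in files:
            self.count[f] += 1
        self.jobs.append(files)

    def job_completes(self, files):     # lines 13-16: decrement counts
        for f in files:
            self.count[f] -= 1

    def access(self, needed):          # lines 18-23: admit, evicting if full
        if needed in self.storage:
            return None                 # already resident, nothing evicted
        victim = None
        if len(self.storage) >= self.capacity:
            victim = min(self.storage, key=lambda f: self.count[f])
            self.storage.remove(victim)
        self.storage.add(needed)
        return victim

mgr = QLFUReplicaManager(capacity=2)
mgr.job_arrives([1, 2])
mgr.job_arrives([2, 3])
mgr.access(1)
mgr.access(2)                 # SE now full: {1, 2}
print(mgr.access(3))          # file 1 (1 pending request vs. 2) is evicted
```

The eviction choice depends only on the queued requests, so a file that was popular in the past but appears in no pending job is the first to go.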
This strategy does not use information about past files nor does it forecast requests that may arrive in the future. Instead, it examines the current pending requests for files which are updated with the arrival of each new request. These are the requests for files that
have already arrived, but have not been serviced yet. It simply examines the existing requests list and prioritizes the replicas based on those needs. One issue to consider for QLFU is how far into the future to consider the pending requests. The success of QLFU is based on how many pending requests we use to make current decisions, which we call the LookAhead parameter. If we do not have many pending requests, then we cannot look far enough into the future to make knowledgeable decisions. On the other hand, if there are many pending requests and all pending requests are considered, then we are utilizing requests that will be satisfied far into the future to make decisions about now. Using all pending requests is only useful if those requests are similar to the files currently demanded. We study the effect of the LookAhead parameter on the performance of QLFU in Section 3.3.1.

3.3. Performance results

We evaluate the performance of our QLFU replica strategy using a small energy efficient cluster called Sage, which we have built at the University of Alabama. Our Sage cluster currently consists of 9 nodes, where each node is composed of:
1. Intel D201GLY2 mainboard with 1.2 GHz Celeron CPU ($70 each)
2. 1 GB 533 MHz RAM ($20 each)
3. 80 GB SATA 3 hard drive ($29 each).
The cluster nodes are connected to a central switch on a 100 Mbps Ethernet in a star configuration, using the D201GLY2 motherboard's on-board 10/100 Mbps LAN. The operating system is Ubuntu Linux 7.10. Shared memory is implemented through NFS mounting of the head node's home directory. The Intel boards were chosen as a compromise between energy usage, computational power, and cost. There are lower-powered boards available, but they either offer significantly less computational power or cost significantly more. As configured, the total cost of the cluster is approximately $1500. Sage can be powered from a single 120 VAC outlet. While booting, peak energy usage rates are approximately 430 W.
While running but idle, Sage consumes 225 W (with CPU cooling fans on). In a cool room, Sage can run without cooling fans; with the fans on, it can run in a warm or even hot room. Sage is able to accommodate an additional 20 boards on a single 15 A 120 VAC household outlet/circuit. As currently configured with 10 nodes, Sage has a theoretical peak performance of 0.024 Tflops.
3.3.1. Performance environment

We now describe the environment utilized for the performance results of the replica strategies. As mentioned previously, a data grid can incorporate a cluster and the resources can be managed as part of a grid. Although we did not run our performance experiments with our Sage cluster as part of a grid, we do manage our resources similarly to a grid. We assume an architecture for our Sage cluster similar to that illustrated in Fig. 1. One of the nodes (the server node) performs the tasks of the resource broker, e.g. all requests are sent to this node, where they are queued and assigned to the remaining nodes in the cluster (the client nodes). The client nodes are responsible for processing the requests, maintaining replica copies of the data and notifying the server when a job is completed. In order to simulate a grid using our cluster, we need to simulate the transfer of data across a WAN. Tests were done to send files to various locations and at different times of day to obtain realistic data transfer times. Delays can be introduced into our cluster to simulate such a data transfer. In order to determine any improvement provided by our QLFU replica strategy, we need to compare it to existing strategies. We compare our QLFU scheme to the existing schemes: FIFO, LFU, MRU, LRU and SWIN. FIFO (first-in-first-out) replaces the file that has been stored the longest, assuming it is no longer needed because it is old. LFU (least-frequently-used) replaces the file that was least frequently used, assuming it is not a popular file and not likely to be used in the future. LRU (least-recently-used) replaces the least recently used file stored, with the assumption that a file not used recently is not likely to be used in the future. MRU (most-recently-used) replaces the most recently used file. MRU is most useful when an older file is more likely to be requested, such as in a large cyclical loop or in repeated scans over large datasets.
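The victim choices of the four classical policies just described can be sketched side by side (the function names and the toy access history below are ours, for illustration only):

```python
# Each policy evicts a different file from the same access history.

def fifo_victim(insertion_order):        # oldest stored file
    return insertion_order[0]

def lru_victim(last_access_time):        # least recently used
    return min(last_access_time, key=last_access_time.get)

def mru_victim(last_access_time):        # most recently used
    return max(last_access_time, key=last_access_time.get)

def lfu_victim(past_use_count):          # least frequently used in the past
    return min(past_use_count, key=past_use_count.get)

last_access = {'a': 1, 'b': 5, 'c': 3}   # logical access timestamps
print(lru_victim(last_access))               # a (touched longest ago)
print(mru_victim(last_access))               # b (touched most recently)
print(fifo_victim(['b', 'a', 'c']))          # b (stored first)
print(lfu_victim({'a': 7, 'b': 2, 'c': 4}))  # b (used least often)
```

The contrast with QLFU is that all four of these policies look only at past history, never at the queue of pending requests.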
We describe the SWIN strategy in detail below. The FIFO, LFU, LRU and MRU replacement strategies were used as replication policies. They were implemented in C++ on our Sage cluster as follows. If file fi is needed for a job but is not stored locally, then the file is obtained from a remote site. If there is enough free space to store file fi in the site's storage element, then a copy of fi is stored. Otherwise, if there is not enough empty storage space, then a file currently stored in the Storage Element must be removed from storage to make room for file fi. The FIFO, LFU, LRU or MRU strategy is applied to determine which file to replace. The SWIN strategy is the Sliding Window strategy presented in [31]. The idea of this replica scheme is to build a ''sliding window'', a set of files which will be used immediately in the future. The sum of the sizes of those files is bounded by the size of the local Storage Element. The sliding window is a set of distinct files, which includes all the files the current job will access and the distinct files from the newly arrived job(s), with the constraint that the sum of the sizes of the files in the sliding window will be at most the size of the local Storage Element. The sliding window moves forward one more file each time the system finishes processing one file. In this way, the sliding window will keep changing, and each time the window changes, the files in the SE are changed accordingly. QLFU differs from the SWIN strategy because SWIN only considers a number of files into the future equal to the size of the replica storage, while QLFU allows a variable window into the future. QLFU also does not change the files in the SE until a file must be replaced. Table 1 illustrates the default parameters used in our experiments. Many of the parameters that we use are similar to those in OptorSim v2.0 [14].
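Our reading of the SWIN window construction can be sketched as follows (the function name and the size bookkeeping are our assumptions, not code from [31]):

```python
def sliding_window(future_requests, file_sizes, se_capacity):
    """Collect distinct files from the future request stream, in order,
    until the next distinct file would exceed the SE capacity."""
    window, used = [], 0
    for f in future_requests:
        if f in window:
            continue                    # each distinct file appears once
        if used + file_sizes[f] > se_capacity:
            break                       # window is bounded by the SE size
        window.append(f)
        used += file_sizes[f]
    return window

# All files of size 1; the SE holds 3 files.
print(sliding_window([5, 1, 2, 1, 3, 4], {f: 1 for f in range(1, 6)}, 3))
# [5, 1, 2]
```

After each processed file the stream advances by one and the window is rebuilt, which is why SWIN's effective lookahead is fixed at the SE size, whereas QLFU's LookAhead parameter is independent of it.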
OptorSim is a simulation package that was created by the EU DataGrid Project [14] to test dynamic replica schemes. Similar to the CERN site on the EU data grid, we assume that all of the files are available on the server. As the files
Table 1. Default experiment parameters.

Description                                                      Value
Number of files                                                  30
Storage available at an SE (number of files × size of files)     10 files
Number of files accessed by a job                                1–5
Arrival distribution                                             Poisson
Request distribution                                             Zipf
QLFU LookAhead parameter v                                       25
are distributed and stored locally on each client node, a file request can be satisfied by any client node or the server. As shown in Table 1, for our default parameters, we assume there are 30 different files in the grid and all the files have the same size. The sizes of the files are 40 and 80 MB. The size of the SE and the corresponding sliding window is 10 (i.e. 10 files × the size of the files). The files are requested following a Zipf distribution and each job requests 1–5 files. Our metrics are the average runtime per job, the average power and the average energy consumed per job. These metrics were calculated by averaging the results over multiple experimental runs comprising 200–500 jobs per replica strategy. Our first metric is the average runtime for a job. Jobs in our experiments vary from a few seconds to roughly one-quarter of a minute, depending on the strategy used. Our second metric is the average power consumed in terms of Watts. This metric was obtained by taking a sample every 15 s from a Wattmeter while an experimental run was executing and averaging the Watts for each replica strategy. In our experiments, all nodes were continually powered on, even if they were not being utilized during the experiment. The third metric is the average energy consumed to process a job in terms of Joules, which is computed as average_Watts × runtime. We note that the processing of a job in our experiments includes not only the time and power required for the computation of the replica strategy, but also the time and power required to send, receive and write the files locally. Hence, a replica strategy whose algorithm requires little power to compute may require a higher cost to send/receive files because more of the files needed may not be locally resident. It was also necessary to choose a good value for the QLFU LookAhead parameter v.
A LookAhead value of v means that the file requests (in FIFO order) for the next v pending jobs are used in the Count_Array by the QLFU algorithm in Fig. 2. We ran experiments to measure the running time when files are 40M and 80M, using the default parameters in Table 1, for LookAhead values v = 10, 25, and 50. Fig. 3(a) and (b) illustrate the average running time for the three different values of v when file sizes are 40M and 80M, respectively. QLFU has the shortest average runtime when v = 25 for all numbers of nodes and file sizes, although the improvement provided by v = 25 is most notable when the file sizes are 80M. In most cases v = 10 has the worst performance, while v = 50 has the second best. This is because a window that is too small does not anticipate the files that will be needed in the future. Similarly, a window that is too large considers too many jobs in the distant future and does not represent the files that will be requested by the jobs in the immediate future. As such, we chose a default value of v = 25 for all experiments involving QLFU.

Lastly, we again note that in our experiments, all nodes were continually powered on, even if they were not being utilized during the experiment. In [31], we considered the case when only the nodes being used in an experimental run are powered on. This means that if 4 nodes were needed for the experiment, only 4 of the nodes in the cluster were powered on. As expected, the results in [31] indicate that the power needed to process an experiment increases as the number of nodes powered on increases.
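The role of the Count_Array can be illustrated with a short sketch of a QLFU-style eviction decision (a hypothetical reconstruction based on the description above, not the actual code of Fig. 2; jobs are represented simply as lists of requested file names):

```python
from collections import Counter

def qlfu_victim(local_files, pending_jobs, v=25):
    """Evict the locally stored file that is requested least often
    by the next v pending jobs (taken in FIFO order)."""
    counts = Counter()
    for job in pending_jobs[:v]:      # the LookAhead window
        for f in job:                 # files requested by this job
            counts[f] += 1
    # files never mentioned in the window count as 0 and are evicted first
    return min(local_files, key=lambda f: counts[f])
```

For example, with local files ['a', 'b', 'c'] and pending jobs [['a', 'b'], ['a'], ['b']], file 'c' has no pending requests in the window and would be the eviction victim.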
Fig. 3. Average job runtime for LookAhead parameter values: (a) 40M file size; (b) 80M file size.
Fig. 4. Average runtime (s) per job: (a) 40M file size; (b) 80M file size.
3.3.2. Experiment 1—average runtime for each replication strategy

In our first experiment, we want to determine the effect of the replication strategy on the runtime per job. While it is important to conserve energy, it is also important to make sure that energy conserving strategies do not negatively affect the response time of the user requests. This experiment measures the average runtime per job for each of six replication strategies: FIFO, LFU, LRU, MRU, SWIN and QLFU. The default parameters from Table 1 are used in this experiment. Results from this experiment, and from Experiments 2 and 3 in Sections 3.3.3 and 3.3.4, respectively, will allow us to determine whether replication strategies that are more energy efficient have a longer running time. We expect the runtimes to decrease as the number of nodes increases, and the runtimes to increase as the file sizes increase. We also expect QLFU and SWIN to have the fastest runtimes and MRU to have the slowest.

We observe in Fig. 4 the average runtime per job for the six replication strategies. Fig. 4(a) utilizes files of size 40M, while Fig. 4(b) utilizes files of size 80M. As expected, the runtimes decrease as the number of nodes increases from 1 to 8, and they increase as the file sizes increase from 40M to 80M. In both figures, the QLFU strategy clearly has the shortest runtime compared to the other strategies regardless of the number of nodes. Contrary to our expectations, the results for SWIN and the remaining strategies are not as clear. We can say, however, that SWIN is always faster than FIFO, LRU and MRU for both 40M and 80M files. MRU has the poorest performance in general for 80M files. The trends for 40M files are similar to the trends for 80M files, as MRU and LRU perform worse than the other strategies, and they even perform worse than FIFO for many numbers of nodes.
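For reference, the baseline replacement policies being compared can be sketched as follows (a hypothetical illustration under our own bookkeeping assumptions, not the paper's implementation; per-file statistics such as insertion order, past request count and last access time are assumed to be tracked at the SE):

```python
def victim(policy, local_files, stats):
    """Pick the file to evict under the classic policies compared above.
    stats[f] holds 'arrival' (insertion order), 'freq' (past request
    count) and 'last_use' (time of last access) for each local file."""
    if policy == 'FIFO':  # evict the file stored earliest
        return min(local_files, key=lambda f: stats[f]['arrival'])
    if policy == 'LFU':   # evict the least frequently used file
        return min(local_files, key=lambda f: stats[f]['freq'])
    if policy == 'LRU':   # evict the least recently used file
        return min(local_files, key=lambda f: stats[f]['last_use'])
    if policy == 'MRU':   # evict the most recently used file
        return max(local_files, key=lambda f: stats[f]['last_use'])
    raise ValueError('unknown policy: %s' % policy)
```

SWIN and QLFU differ from these in that they consult a sliding window of recent requests and the queue of pending requests, respectively, rather than only per-file history.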
QLFU provides a decrease in running time compared to LRU for 40M files that ranges from 19.8% for 1 node, increases to 25% for 6 nodes, and then drops to 18% for 8 nodes. Similarly, the decrease provided by QLFU compared to MRU is 26% for 1 node, increases to 27% for 6 nodes, but then decreases to 19.5% for 8 nodes. When compared to SWIN, QLFU provides a decrease in running time that drops from 23% for 1 node to 8% for 8 nodes. For 80M files, QLFU provides a decrease in running time compared to LRU that ranges from 27% for 1 node, increases to 39% for 4 nodes, and then drops to 31% and 25% for 6 and 8 nodes, respectively. Similarly, the decrease provided by QLFU compared to MRU is 27% for 1 node, increases to 42% for 4 nodes, but then decreases to 30% for 6 and 8 nodes. SWIN exhibits similar behavior compared to MRU and LRU, although the performance gain is much smaller for SWIN than for QLFU.

In conclusion, results from these experiments indicate that the runtime can be notably improved by a smart data replication strategy. QLFU has the fastest average runtime for all numbers of nodes and file sizes. Contrary to our expectations, SWIN did not perform as well as expected, as LFU performed better than SWIN when the number of nodes is small. This indicates that the sliding window does not provide enough information to make good replica decisions when the system has a heavy workload. Since QLFU always performed better than LFU, the results indicate that past behavior, as used by LFU, is not as good a predictor as the current pending requests used by QLFU, for both shorter queue lengths (number of nodes = 6 and 8) and longer queue lengths (number of nodes = 1 and 2). However, a question remains as to whether the improvement provided by QLFU also applies to the power consumed.

3.3.3. Experiment 2—average power consumed for each replication strategy

In this experiment we compare the average power in terms of Watts consumed by each of the replica strategies: FIFO, LFU, LRU, MRU, SWIN and QLFU. We used the same parameters as those in Experiment 1 in Section 3.3.2, which are the default parameters
Fig. 5. Average power (W) per job: (a) 40M file size; (b) 80M file size.
from Table 1. As mentioned previously, we compute the power consumed by taking a sample every 15 s from a Wattmeter while an experimental run was executing and averaging the Watts for each replica strategy. We expect the average power to increase as the number of nodes increases and as the size of the files increases. We do not have any expectations as to which replication strategy will consume more Watts.

We observe in Fig. 5(a) the average power in terms of the Watts sampled for each of the replica strategies for 40M files, and in Fig. 5(b) the results for 80M files. As expected, in Fig. 5(a) the power consumed increases as the number of nodes increases. The number of Watts does not vary considerably among the different replica strategies. The SWIN strategy demonstrated the most variation in power, as it averaged the fewest Watts for 1 and 2 nodes, but the most Watts for 8 nodes, for 40M files. The average Watts range from a low of 229 W for 1 node to a high of 245 W for 8 nodes for the SWIN replica strategy. The LRU replica strategy uses the most Watts per job for 2 and 4 nodes, while LFU consumes the second most Watts for 2, 6 and 8 nodes. The Watts consumed by QLFU range from 230 for 1 node to 244 for 8 nodes. Although our proposed QLFU strategy has the fastest runtime for all numbers of nodes, it does not have the lowest power. In fact, it has the highest Watts for 6 nodes and the second highest Watts for 8 nodes.

In Fig. 5(b) for 80M files, the Watts range from a low of 230 W, for 1 node and the QLFU replica strategy, to a high of 244 W, for 8 nodes and the LFU replica strategy. QLFU consumes fewer Watts than the other replica strategies for 4 and 6 nodes. However, QLFU consumes more power than some of the other strategies when the number of nodes is 8. In general, LFU consumes the most Watts. The power consumed does not strictly increase as the number of nodes increases.
There is little variation in the Watts consumed among the replica strategies for the larger file size when the number of nodes is 1 or 2 and the system load is high. The highest Watts occur when the number of nodes is 8, while the Watts are very similar for 4 and 6 nodes. This means the additional load resulting from the increase in nodes from 4 to 6 is offset by the advantage provided by a smaller workload at each node. However, when the number of nodes is 8, the increase in power required by the additional nodes is not offset by the decreased workload per node. The average power for 40M and 80M files is similar when the number of nodes is 1, but contrary to our expectations the power consumed is less for 80M files than for 40M files when the number of nodes ranges from 2 to 8. Obviously, as illustrated in Fig. 3, the runtime is longer with the larger file size.

In conclusion, there is no one replication strategy that clearly consumes the most or least Watts across all experiments. Instead, there was little variation amongst the replication strategies. While it may seem counterintuitive that the power consumed was less for 80M files than for 40M files, it is due to the fact that the
system has an increase in idle time waiting for the larger files to be transferred over the network. These results indicate that it is important to consider both the runtime and the Watts consumed to determine the overall energy consumed, as we do in the next section.

3.3.4. Experiment 3—average energy consumed per job for each replication strategy

In this experiment we study the average energy consumed per job for each of the six replica strategies: FIFO, LFU, LRU, MRU, SWIN and QLFU. Again, we used the same parameters as those in Experiment 1 in Section 3.3.2, which are the default parameters from Table 1. As mentioned previously, the average energy consumed per job in terms of Joules is computed as: (average_Watts × runtime). We expect the average energy per job to increase as the size of the files increases, and we expect QLFU to consume the least and MRU the most energy per job.

We observe in Fig. 6(a) and (b) the average energy consumed per job for 40M and 80M files, respectively. Fig. 6 illustrates that the energy consumed per job is much higher for the larger file size of 80M than for the smaller file size of 40M. For example, for LFU, the energy consumed for 80M files is 1.69 times higher than for 40M files for 1 node, and 1.93 times higher for 4 nodes. Fig. 6 also illustrates that as the number of nodes increases, the average energy per job decreases irrespective of the file size, although the decrease diminishes as the number of nodes increases. This differs from Fig. 5, in which the power during processing increases as the number of nodes increases. Despite the fact that an increase in the number of nodes results in an increase in the power consumed, the average energy consumed per job decreases as the number of nodes increases.
This is because the average runtime per job decreases as the number of nodes increases, resulting in an overall decrease in the energy consumed per job. QLFU has the lowest average energy consumed per job for all numbers of nodes and file sizes. This differs from Fig. 5, where QLFU does not have the lowest power consumed. However, because of its superior running time, QLFU has the lowest average energy consumed per job. In general, MRU has the highest average energy consumed per job for 80M files, while MRU and LRU have the highest average energy per job for 40M files.

In conclusion, the results in Experiments 1–3 indicate that the replication strategy utilized can impact the average energy consumed. They also indicate that despite the increase in Watts needed to process a job with more nodes, the corresponding decrease in running time results in a lower average energy consumed with more nodes. Because there is not a large variation in average Watts per job, the trends for the average energy consumed are very similar to the trends for the average runtime. In the next section, we consider data backfilling, for which the average runtime per job is not always a good predictor of the energy consumed per job.
Fig. 6. Average energy consumed per job: (a) 40M file size; (b) 80M file size.
4. CPU scheduling

4.1. Data backfilling

The geographical separation in a grid can lead to problems transferring the large amounts of data necessary between locations. Due to the dynamic nature of a grid, there is a probability that a user will need data that is not at the local site. Transferring this large amount of data can be expensive, due to the large file sizes and limitations of the network bandwidth or network congestion. In a typical space sharing environment, the system would remain idle while waiting for a remote data transfer. In order to avoid idling or even busy waiting, in [15] we proposed a new strategy called data backfilling, which works as follows. While the remote I/O is being accessed, we allow the job that requested the remote data to be preempted and the CE to switch to another job. The job chosen for processing next is the next data ready job, meaning the data file(s) it needs are available locally. Hence, we are making data aware decisions that are driven by the availability of the data. In data-intensive computing, each job will access one or more very large data files, which will be known before the job is submitted. The next data ready job to process is chosen on a FCFS basis. We note that it is still possible for the system to idle if no other jobs are data ready. This strategy is similar to the traditional backfilling scheduler [32], in that data backfilling will be used to fill any ‘‘hole’’ in the utilization of the computing element. However, in traditional backfilling, decisions are made based on the amount of processing time required by a job, as opposed to the availability of the data. In [15] we presented some simulation results of the performance of data backfilling, but we did not consider the power consumption of the strategy, nor did we have results from our Sage cluster.

4.2.
Performance results

In order to study the energy efficiency of our data backfilling strategy, we implemented data backfilling on our Sage cluster. We study the average runtime per job, the average power (Watts), and the average energy consumed per job (Joules) for the six replica strategies: FIFO, LFU, LRU, MRU, SWIN and QLFU (v = 25), for 40M and 80M files. We compare the results when data backfilling is used and when no data backfilling is used.

4.2.1. Experiment 4—average runtime for data backfilling versus no data backfilling

In this experiment, we want to determine the effect of data backfilling on the runtime per job for each of six replication strategies: FIFO, LFU, LRU, MRU, SWIN and QLFU. We can then compare the average runtime when both data backfilling and no
data backfilling are used. The default parameters from Table 1 are used in this experiment. Results from this experiment, and from Experiments 5 and 6 in Sections 4.2.2 and 4.2.3, respectively, will allow us to determine the relationship between the average runtime, the average power consumed, and the average energy consumed per job when data backfilling is utilized. We expect the average runtime to decrease with data backfilling, and we expect the runtime with data backfilling to decrease as the number of nodes and the file sizes increase. We also expect data backfilling to have little effect on the performance of the replication strategies relative to each other.

We observe in Fig. 7(a) the average runtime per job for file size 40M, and in Fig. 7(b) the average runtime per job for file size 80M, when no data backfilling versus data backfilling is used. Fig. 7 demonstrates that regardless of the replica scheme utilized, there is a decrease in the runtime when data backfilling is used compared to no data backfilling. Fig. 7(a) shows that for a file size of 40M, the decrease in average runtime when data backfilling is utilized is most dramatic when the system has a heavy workload, e.g. when the number of nodes is 1. For a file size of 40M, QLFU still has the best performance with data backfilling compared to the other replica strategies, although it has only a slight performance advantage over them. Contrary to our expectations, data backfilling affected strategies differently. LRU benefits the most from data backfilling, as it has the second best performance, followed by SWIN. Without data backfilling, LRU is one of the worst performers in terms of runtime. For the smaller 40M files, the runtime is decreased for QLFU by 48% for 1 node and by 15% for 8 nodes when data backfilling is used. The running time is decreased for LRU by 61% for 1 node and 25% for 8 nodes when data backfilling is used.
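The data backfilling step described in Section 4.1 can be sketched as follows (a hypothetical sketch based on the description there, not the scheduler code from [15]; jobs are represented as dictionaries listing the files they need):

```python
def next_data_ready_job(queue, local_files):
    """Data backfilling scheduler step: scan the FCFS queue and return
    the first 'data ready' job, i.e. one whose required files are all
    locally resident. A job waiting on a remote transfer is skipped
    (preempted) rather than blocking the computing element."""
    for job in queue:                          # FCFS order
        if set(job['files']) <= local_files:   # all needed files are local
            return job
    return None                                # no data ready job: the CE idles
```

For example, if the job at the head of the queue needs a file that is still being transferred, the scheduler runs the next queued job whose files are already local.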
Fig. 7(a) also illustrates that the difference in runtime among the replica strategies is much smaller with data backfilling than without it for a larger number of nodes. For example, QLFU provides only a 9% decrease in runtime over MRU for 4 nodes when data backfilling is used, compared to a 25% decrease when no data backfilling is used, and only a 4% decrease over MRU for 8 nodes when data backfilling is used, compared to a 20% decrease when no data backfilling is used. The difference is even smaller for LRU, as QLFU provides only a 1% decrease in runtime over LRU for 4 nodes when data backfilling is used, compared to a 20% decrease when no data backfilling is used, and a 1% decrease over LRU for 8 nodes when data backfilling is used, compared to an 18% decrease when no data backfilling is used.

As shown by Fig. 7, results from our study indicate that any benefit to runtime from increasing the number of nodes is smaller with data backfilling than without it. For 40M files, QLFU and data backfilling, there is a 42% decrease in runtime when
Fig. 7. Average runtime (s) per job, data backfilling versus no backfilling: (a) 40M files; (b) 80M files.
the number of nodes increases from 1 to 2, a 6% decrease when the number of nodes increases from 2 to 4, a 1% decrease when the number of nodes increases from 4 to 6, and a 2% decrease when the number of nodes increases from 6 to 8. Without data backfilling there is a 67% decrease in runtime when the number of nodes increases from 1 to 2, a 32% decrease when the number of nodes increases from 2 to 4, a 9% decrease when the number of nodes increases from 4 to 6, and an 8% decrease when the number of nodes increases from 6 to 8.

Fig. 7(b) demonstrates that the benefit provided by data backfilling is more significant for the larger file size of 80M. In contrast to the results in Fig. 7(a), this benefit is notable even when the number of nodes is 8. The runtime dramatically decreases when data backfilling is used for file size 80M. LRU benefits the most from data backfilling, and has an even faster runtime than QLFU for all numbers of nodes. QLFU has the second best performance for all numbers of nodes, followed by SWIN. When data backfilling is used, the runtime is decreased for QLFU by 71% for 1 node and 38% for 8 nodes for 80M files, compared to 48% for 1 node and 15% for 8 nodes for 40M files. The running time is decreased for LRU by 79% for 1 node and 55% for 8 nodes for 80M files, compared to 61% for 1 node and 26% for 8 nodes for 40M files. Of all the replica strategies, MRU has the worst performance overall, both with and without data backfilling. Unlike the smaller 40M files, an increase in the number of nodes does not strictly decrease the runtime with data backfilling for the larger 80M files, specifically for the strategies with the faster runtimes. For example, in Fig. 7(b), the runtime increases 1% for LRU when the number of nodes is increased from 4 to 6, and the runtime increases 3% for QLFU and 6% for SWIN when the number of nodes increases from 2 to 4.
For all other replica strategies and for all other numbers of nodes, the runtime decreases as the number of nodes increases. This indicates the strategies with the faster runtimes need more than 2 nodes (4 nodes) to provide a decrease in runtime, while the slower strategies benefit the most from any additional nodes. Results in Fig. 7 also indicate that with data backfilling, a change in file size from 40M to 80M does not have a large impact on the runtime. For example, for QLFU, there is less than a 1% difference in the running time between the 40M and 80M file sizes for 1 node when data backfilling is used, and a 10% decrease in runtime using 40M versus 80M files for 8 nodes when data backfilling is used. This is in contrast to no data backfilling, where for QLFU there is a 44% decrease in the running time using 40M versus 80M files for 1 node, and a 36% decrease in runtime using 40M versus 80M files for 8 nodes.

In conclusion, results from this experiment indicate that the differences in runtime between the replica strategies are much smaller when data backfilling is used than when it is not used. This implies that the particular replica strategy utilized has less of an effect on runtime. Contrary to what we expected, the results in
Fig. 7 also indicate there is less difference in the runtime between the 40M and 80M file sizes when data backfilling is used compared to no data backfilling. This means that increasing the file sizes can have less effect on the runtime if smart data aware strategies are used. Similarly, an increase in the number of nodes has less of an effect on the runtimes when data backfilling is used, particularly for a larger number of nodes. This is because not all of the files are available at all nodes, so there is not a corresponding increase in data ready jobs as the number of nodes increases.

4.2.2. Experiment 5—average power consumed for data backfilling versus no data backfilling

In this experiment we want to determine the effect of data backfilling on the Watts consumed by each of the replica strategies: FIFO, LFU, LRU, MRU, SWIN and QLFU. We can then compare the power consumed when data backfilling and no data backfilling are used. We used the same parameters as those in Experiment 1 in Section 3.3.2, which are the default parameters from Table 1. As mentioned previously, we compute the power by taking a sample every 15 s from a Wattmeter while an experimental run was executing and averaging the Watts for each replica strategy. We expect the Watts consumed to be higher when data backfilling is used than when no data backfilling is used. We also expect the Watts to increase as the number of nodes increases and the file sizes increase, and we expect more Watts to be consumed by the replica strategies with faster runtimes.

We observe in Fig. 8(a) and (b) the average power per job in terms of Watts, when data backfilling versus no data backfilling is utilized, for files of size 40M and 80M, respectively. As expected, the power consumed with data backfilling is higher.
While there is little difference between the power consumed with data backfilling and without it for 1 node, for both 40M and 80M files, the number of Watts increases as the number of nodes increases. This increase in Watts as a result of data backfilling is much more notable for a larger number of nodes. The difference in the number of Watts for QLFU, MRU and LRU is less than 1% for both file sizes. SWIN uses more Watts when 2, 4, and 8 nodes are used for 40M files, and more Watts for 4 and 6 nodes for 80M files. Similar to the results illustrated in Fig. 5, the Watts for the larger file size are less than for the smaller file size even with data backfilling.

In conclusion, as shown in Fig. 8 and as expected, data backfilling uses more Watts than no backfilling, because instead of idling while waiting for a file, the node performs computation with an available file. This increase in Watts is much more notable for a larger number of nodes because the number of non-idle nodes is increased as well, contributing to the overall power consumed. The negligible difference between the power consumed with data backfilling and without it for 1 node is due to the fact that the node has many requests and may not have many data ready jobs.
Fig. 8. Average power (W) per job with data backfilling: (a) 40M files; (b) 80M files.
Similar to the results in Fig. 5, the Watts for the larger file size are less than for the smaller file size with data backfilling when 1 to 6 nodes are used. This is because even though there may be a data ready job that can be run, if a file must be transferred, there is a longer idle time for larger files. However, when the number of nodes is 8, data backfilling does consume more power for larger files than for smaller files. This is because the amount of power consumed by the larger number of nodes processing ready jobs is greater than any increased idle time due to the larger files. SWIN and QLFU use more Watts compared to some of the other strategies because they are good predictors of which files will be needed next, and hence there is always a ready file, which increases the power consumed. However, to determine the energy efficiency of our strategies, we must consider not only the power consumed and the runtime, but also the energy consumed per job when considering any power savings provided by data backfilling.

4.2.3. Experiment 6—average energy consumed per job for data backfilling versus no data backfilling

In this experiment we want to determine the effect of data backfilling on the average energy consumed per job for each of the replica strategies: FIFO, LFU, LRU, MRU, SWIN and QLFU. We can then compare the average energy when data backfilling and no data backfilling are used. We used the same parameters as those in Experiment 1 in Section 3.3.2, which are the default parameters from Table 1. As mentioned previously, the average energy consumed per job is computed as: (average_Watts × runtime). We expect data backfilling to have a lower average energy per job compared to no data backfilling, and we expect QLFU to have the lowest average energy consumed per job. We also expect the average energy consumed per job with data backfilling to decrease as the number of nodes increases. We observe in Fig.
9(a) and (b) the average energy consumed per job, when no backfilling versus backfilling is utilized, for file sizes of 40M and 80M, respectively. Fig. 9 illustrates that the data backfilling strategy utilizes much less energy than no data backfilling. For example, as shown in Fig. 9(a), for the QLFU replica strategy and a file size of 40M, the average energy consumed per job is reduced by 48% for 1 node and 6% for 8 nodes. For the LFU replica strategy and a file size of 40M, the average energy consumed per job is decreased by 49% for 1 node and 15% for 8 nodes. The results for the larger 80M files are much more dramatic, as shown in Fig. 9(b). For the QLFU replica strategy, the average energy consumed per job is reduced by 74% for 1 node and 42% for 8 nodes, while for the LFU replica strategy, the average energy consumed per job is decreased by 72% for 1 node and 42% for 8 nodes.

As illustrated in Fig. 9(a), when the file size is 40M, the average energy consumed per job decreases as the number of nodes increases from 1 to 2. When the number of nodes increases to 4 and 6, the energy consumed increases slightly. The energy changes
little or decreases slightly for all strategies except LFU and QLFU as the number of nodes increases to 8. The energy consumed per job increases slightly for the LFU strategy and remains the same for the QLFU strategy as the number of nodes is increased to 8. These results differ from our expectations and from the results when no data backfilling is utilized, where the energy consumed continues to decrease as the number of nodes increases to 8.

As indicated above, Fig. 9(b) shows that the data backfilling strategy consumes much less energy than no data backfilling, and this is most dramatic for the larger 80M files. Fig. 9 also illustrates that as the number of nodes increases from 1 to 2, the energy per job decreases when data backfilling is used. Similar to the results in Fig. 9(a), as the number of nodes increases to 4, the energy per job for LRU, MRU, SWIN and QLFU begins to increase slightly, and the energy begins to increase at 6 nodes for FIFO and LFU. The energy consumed continues to increase as the number of nodes is increased to 8. This is different from the results with no data backfilling, where the addition of 2 extra nodes always requires less energy. However, these results are similar to the average runtime results, in which the runtime did not always decrease as the number of nodes increased for 80M files.

The energy per job when backfilling is used has much less variation than when no backfilling is utilized, particularly as the number of nodes increases. For example, for 40M files, QLFU uses 19% less energy than LFU for 1 node and 9% less energy for 8 nodes, while for 80M files, QLFU uses 13% less energy for 1 node and 15% less energy for 8 nodes. QLFU uses 20% less energy than MRU for 1 node and 4% less energy for 8 nodes for 40M files, and 28% less energy than MRU for 1 node and 18% less energy for 8 nodes for 80M files. In Fig.
6, the average energy consumed per job was very similar to the average runtime per job. This is not the case when data backfilling is used. Comparing Fig. 9(a) to Fig. 7(a), we can see that there are some differences, particularly for the larger number of nodes. For example, although its runtime is similar to the runtime of MRU, SWIN consumes more energy than MRU for 4 nodes and 40M files. Similarly, although FIFO's average energy per job when data backfilling is used is similar to that of QLFU when no data backfilling is used for 6 nodes, there is a more notable difference in the runtime between the two. When 8 nodes are used, LFU also consumes more energy than FIFO when data backfilling is used, although it has a faster runtime than FIFO.

In conclusion, despite the increase in power consumed, results from these experiments indicate that data aware CPU scheduling with data backfilling can provide a savings in energy consumed per job. While data backfilling requires additional power because it has less idle time, it provides a significant reduction in energy consumed compared to no data backfilling, due to the decreased runtime. However, results indicate that adding additional nodes when utilizing data backfilling does not always reduce the energy consumed, because the decrease in runtime does not offset the increase in power demanded by additional
Fig. 9. Average energy consumed per job, no backfilling versus backfilling: (a) 40M files; (b) 80M files.
nodes processing the jobs. Results have demonstrated that data backfilling provides a benefit over no data backfilling, with the greatest benefit occurring for the larger file sizes and when the system has a heavy workload. We have shown that smart data replication and data aware CPU scheduling strategies using job preemption can be used to maximize CPU usage and minimize energy consumption without negatively impacting the response time.

5. Conclusions and future work

In this paper, we propose that smart strategies for replicating files are one way to minimize the amount of energy consumed in a data grid. As a result, we present a replica strategy, called QLFU, that utilizes pending file requests to make current replication decisions. QLFU is designed to minimize the amount of data transmitted and the storage needed. The QLFU strategy was implemented on our small Sage cluster, which was built considering both energy efficiency and cost. Performance results on our cluster indicate that our QLFU strategy performs better than existing strategies, such as FIFO, LFU, LRU, MRU and SWIN, in terms of both average running time and energy consumed per job. QLFU is beneficial in terms of energy consumed per job regardless of the number of nodes or the sizes of the files.

We also studied a CPU scheduling strategy that involves preemption, called data backfilling. Data backfilling provided a dramatic decrease in the runtime compared to jobs that did not use backfilling. This was particularly notable for larger files. However, because data backfilling processes the next data ready job instead of waiting for a file, the power consumed in terms of Watts was higher. Nevertheless, the average energy consumed per job was much lower for data backfilling than for no data backfilling. With data backfilling, the file sizes and the replica strategy had less of an effect on the power per job.
This indicates that the use of data backfilling can result in a system with not only lower, but also more stable, energy consumption. The results from our study indicate that the choice of replica strategy and of CPU scheduling can have an impact on the amount of energy used, and they highlight the need for power management decisions to consider all aspects of the system when attempting to minimize the power consumed.

Although our experiments were run on a small cluster, in the future we plan to demonstrate the scalability of our strategies on larger systems and with varying file sizes. We also plan to combine data backfilling with job scheduling so that we can utilize the resulting longer periods of suspension time for power saving, and we are studying powering down nodes depending on the load of the cluster. Data gathered from these tests would reveal the worth, or cost, of dynamically powering down nodes to decrease power consumption at the expense of a slower response to sudden, dramatic increases in job requests. We anticipate that the strategies
presented in this paper could also be useful for saving energy in service-oriented grids and clouds, although their use will require balancing energy savings against quality of service.

Acknowledgments

We would like to thank Dr. John Lusth for his conception and construction of the Celadon cluster, which was the basis for the Sage cluster. We would also like to thank Dr. Brian Hyslop for his expertise.
J. Michael Galloway received the A.A.S. degree in Electronics Technology from Central Alabama Community College, Alexander City, AL in 2001. He received the B.W.E. degree in Wireless Software Engineering from Auburn University, Auburn, AL in 2005. He is currently enrolled as a Ph.D. student in the Department of Computer Science at the University of Alabama. His current research interests are power management in cloud platforms, resource load balancing and distribution, network protocols, and distributed systems. He is a student member of IEEE and ACM.
Susan V. Vrbsky is an Associate Professor of Computer Science at The University of Alabama, Tuscaloosa, AL. She received her Ph.D. in Computer Science from The University of Illinois, Urbana-Champaign. She received an M.S. in Computer Science from Southern Illinois University, Carbondale, IL and a B.A. from Northwestern University in Evanston, IL. She is the Advisor of the Cloud and Cluster Computing Lab and the Graduate Program Director. Her research interests include database systems, data management in clouds, green computing, data intensive computing and database security.
David Grubic received his M.S. and B.S. degrees in Computer Science from the University of Alabama. He is currently an analyst at CTS, Inc. in Birmingham, AL.
Robert Carr is currently a DBA at FastHealth, Inc. in Tuscaloosa, AL. He received his B.S. and M.S. degrees in Computer Science from the University of Alabama.
Rahul Nori received an M.S. degree in Computer Science from the University of Alabama. He is currently a Ph.D. student in Computer Science at the University of North Dakota.