SLA enabled CARE resource broker

P. Balakrishnan ∗, Thamarai Selvi Somasundaram
Centre for Advanced Computing Research and Education (CARE), Department of Information Technology, MIT Campus of Anna University, Chromepet, Chennai 600 044, Tamilnadu, India
∗ Corresponding author. Tel.: +91 44 22516306. E-mail address: [email protected] (P. Balakrishnan).

Article history: Received 28 June 2010; Received in revised form 26 August 2010; Accepted 3 September 2010; Available online 13 October 2010

Keywords: Grid computing; Virtualization; Service level agreements; VMware; Xen

Abstract

Limitations of the Grid, such as provisioning dynamic execution environments, can be handled by virtualization technology. While delivering virtual resources for executing applications, the user's Quality of Service (QoS) requirements must be ensured. This QoS management is instrumented with the help of Service Level Agreements (SLAs) that clearly specify the guaranteed QoS, the restrictions on resource usage and the penalties associated with deviant service behavior. To support such demanding QoS, grid meta-schedulers must be empowered with the capability to discover resources that satisfy the required quality, quantity and usage policies as per the requirements of an application. In this paper, we propose a Service Level Agreement based on-demand resource virtualization framework that provides support for specifying usage policies and creates and manages virtual machines over the SLA-negotiated resources using either VMware or Xen. The proposed architecture is integrated with our grid meta-scheduler, the CARE Resource Broker, as a value-addition component. This SLA enabled broker is evaluated with the help of real-time application execution by varying the number of resources available in the testbed and the number of policies per resource. The results show that the inclusion of SLAs affects the resource selection behavior of the broker. In addition, the overall performance of the system increases in terms of job throughput with a minimal extra overhead in request processing due to usage policy matching, while realizing a controlled grid resource sharing environment. The effects presented here may be useful when new designs are proposed to take advantage of SLAs at the meta-scheduler level.

© 2010 Elsevier B.V. All rights reserved.

1. Introduction

The Grid is a dynamic framework that enables the sharing and selection of scalable, fault-tolerant, loosely coupled, heterogeneous resources over geographically distributed locations using the concept of a Virtual Organization (VO) [1]. A VO is a collaborative computing environment in which a set of individuals or organizations share their resources, abiding by a common sharing policy (the VO policy) to meet a common objective. The grid middleware provides a runtime environment for executing jobs on grid resources. The grid meta-scheduler gathers the grid resources available in the grid environment and provides a gateway for the user to submit jobs to that environment. Whenever the user submits jobs to the grid meta-scheduler, it gathers all the available resources' information and matches every resource's information with the job requirements. The outcome of this matchmaking places each resource into one of three cases: over matches (Plug-in), exactly matches (Exact) or under matches (Subsume). For each job, all the available resources may fall into any one of these regions at the end of matchmaking.
If a job has resources in the exact or plug-in region, a single resource provider is sufficient to execute that job request. But if a job has resources only in the subsume region, then that job may not be run by any of the conventional grid meta-schedulers. In order to execute jobs that have resources only in the subsume region, we must explore the cause that makes these resources fall into the subsume region. In other words, we must identify the resource parameters that do not match the job requirements so that we can explore the possibility of using these resources by adopting some new scheduling use cases. In general, any job requirement consists of three parts: hardware requirements, software requirements and Quality of Service (QoS) requirements. The hardware requirements consist of parameters such as the number of CPUs required for executing the job, the size of RAM, the amount of secondary storage required for storing the input files or results, the CPU speed of each processor, etc. The operating system and libraries required come under software requirements, whereas the bandwidth and latency fall into QoS requirements. Suppose there is a list of resources that satisfy all the above mentioned requirements except the number of CPUs required for executing the job. Then these resources are grouped together and their total capability is used to execute the job. This kind of scheduling scenario is called physical co-allocation. In a virtual co-allocation [2] scheduling scenario, the additional CPUs required for application execution are created as
virtual machines (VMs) over the same or remote physical clusters. This combination of newly created VMs and the existing physical resources is used together for application execution. These combined resources (i.e. physical and VMs) can be managed from a single administrative domain, and this eliminates the need for grouping multiple physical resources that are managed by different administrative domains, as in physical co-allocation. Suppose there is a group of resources that satisfy the hardware and QoS requirements, but not the software requirements. This situation is handled by creating the required number of VMs, seamlessly deploying them over the virtualization layer of the existing grid resources, creating a cluster and executing the job over this newly created virtual cluster. This is called a virtual cluster scheduling scenario. We believe that virtualization [3] technology achieves the isolation of jobs from the underlying physical resources as in high level grids. In this paper, we propose an architecture that dynamically provisions virtual machines for executing applications based on Service Level Agreements (SLAs) [4] and integrate it with our CARE Resource Broker (CRB) [5] as a value-addition component. Our existing CRB does not address the following issues (which were future aspects at that time). It does not have a policy management system to express and enforce usage policies, which is a mandatory requirement for resource providers to express their desired usage scenarios. Also, the time complexity of the existing resource scheduling algorithm is high, since it checks the three scheduling use cases [5] in a round robin fashion. In addition, it does not have an image repository to store pre-configured VM images. Also, it has a short VM provisioning lifecycle: the existing lifecycle generates the VM creation request, identifies the suitable physical resource over which the VM is to be deployed and finally activates that VM. It does not deal with the selection of a suitable pre-configured VM (i.e. there is no image repository), the modification of existing VM image configurations, or the deletion of deployed VMs. Furthermore, it cannot realize a controlled grid resource sharing environment as it lacks a mechanism for creating and managing SLAs. The above mentioned limitations are overcome by the proposed architecture. This architecture has a provision to express and enforce usage policies, and it replaces the existing scheduling algorithm with a deviation based resource scheduling algorithm [6,7] that reduces the time complexity. With the help of pre-configured VM images, the time taken for processing a job request may be greatly reduced. Hence, the proposed architecture contains an image repository for storing the pre-configured VM images; their meta-data is stored in an XML file that is helpful while matchmaking the hardware and software configurations of each VM against the job request to find a suitable VM image. Further, the VM provisioning lifecycle has been expanded with support to select a suitable VM image from the image repository, modify the VM image configuration and recycle the newly deployed VM after the job has executed.
The Grid SLA Management Architecture (GSMA) [6,7] module has been integrated with the CRB in such a way as to provide support for negotiating with the resources, monitoring the job execution to ensure the non-violation of service level objectives and penalizing the user/resource provider in case of any violation, thereby leading to controlled grid resource sharing. The existing CRB gives tightly coupled support to Xen based virtualization only. We overcome this limitation by introducing the concept of a hypervisor adapter, a loosely coupled architecture in which a new virtualization vendor can be easily plugged in to the CRB. With this, we have successfully tested the VMware virtualization support of our CRB. The summarized contributions of this paper are:

• Policy Management System that gives resource owners the provision to express their desired usage scenarios as resource usage policies.
• Generic schema for expressing the resource usage policies.
• Deviation based Resource Scheduling algorithm that handles the different scheduling scenarios with reduced time complexity.
• SLA based virtual resource provisioning using GSMA that supports the entire lifecycle of SLAs.
• An extensible and inter-operable Virtual Resource Management (VRM) system using a hypervisor adapter.
• VM image repository that supports dynamic querying about VM image configuration.
• Integration of this VRM and VM image repository with the CRB.
• Real-time application execution that shows the time reduction in making scheduling decisions, the percentage increase in scheduling rate, and the overhead incurred due to SLAs in request processing.

The rest of the paper is organized as follows. Section 2 provides the background on the need for SLAs in a grid environment and the significance of virtualization technology in complementing the grid environment. Previous works related to usage policies, SLAs and the application of virtualization technology are given in Section 3. The proposed architecture for GSMA based VRM is elucidated in Section 4. The implementation of the proposed architecture is given in Section 5 and the test-bed is explained in Section 6. Finally, we analyze the results in Section 7 and present the conclusion in Section 8.

2. Background

The main objective of grid computing is to provide a large amount of computing power to solve computationally intensive problems while allowing anyone to access/provide resources from/to the grid environment. Resource sharing is a serious issue in grid computing, since resources that belong to different administrative domains may have different usage policies in different VOs. So if a user wishes to access a grid resource belonging to another domain, he first has to satisfy the policies defined in the VO where that resource is logically contributed, and then the usage policies defined in the local domain where the resource actually resides. Though these policies are broad and unclear, it is important to express them and integrate them with the grid meta-scheduler in order to realize controlled grid resource sharing. The role of the grid meta-scheduler is to coordinate and negotiate with the resource providers on behalf of the user. These negotiations lead to a contract between the resource providers and the user that clearly states the QoS to be provided, the restrictions on resource utilization and the penalties for violation of the objectives. The QoS parameters that become part of an SLA should be taken from a known set of variables, such as bandwidth, RAM, secondary storage, etc., and they should be measurable parameters in order to monitor the Service Level Objectives (SLOs). Consider the scenario where a user U submits the following job request to the meta-scheduler, which matchmakes the job request with the available resource providers. ''Need 16 CPUs of PARAMPADMA and 32 CPUs of a PC cluster with MATLAB-7.0 for six hours between 2010/08/12 06:00 pm and 2010/08/15 06:00 pm and a network connection providing a bandwidth of 100 Mbit/s between them. The input file (of size 40 GB) and the executable should be transferred from Space Application Center, Ahmedabad to the execution site. From all resource providers (pair of PARAMPADMA, PC cluster and SAC Ahmedabad) choose the one with the highest network bandwidth among them''. The above mentioned request clearly states three parts: hardware, software and QoS requirements. Apart from these
requirements, it clearly specifies temporal parameters, such as the time at which the job requires the resources; spatial connections, such as bandwidth and delay; external information, such as the location of the input file; and finally the choice of resources, such as the highest bandwidth when multiple resources are selected. In order to schedule the above job request, the orchestrator (i.e. the grid meta-scheduler) has to filter out the potential resources based on their dynamic information, such as the available period of the resources and their load, and their static information, such as the number of CPUs, OS and RAM size, get their final commitment based on some negotiation and finally execute the job over these committed resources. Even though the grid resources have the potential to fulfill the job request based on their static and dynamic information, their allowed or permitted resource usage level for the grid is defined and restricted by the resource providers as resource usage policies or Resource Service Level Agreements (RSLA) [8]. So whenever a job request is submitted to the grid meta-scheduler, it has to filter out the resources based on their usage policies first, then match the resultant resources against the job request in order to identify the capable resources, and finally start negotiation with these resources in order to get their final commitment; this automatically leads to a reduction in the number of negotiations per SLA. In order to illustrate the significance of resource usage policies, we list some sample resource usage policies (RUP):

RUP1: Grid users are allowed to store their temporary files under the /tmp directory with a maximum size of 40 GB.
RUP2: The temporary files are deleted after the successful execution of the submitted job.
RUP3: Grid users are allowed to create virtual machines using Xen on the resources belonging to the CARE cluster.
RUP4: Virtual machine creation is allowed only when the physical machine's past 5 min average CPU utilization is less than 80%.
RUP5: Only grid users belonging to the MIT domain are allowed to create their virtual machines on the non-dedicated desktop PCs of the MIT domain.
RUP6: Whenever there is a key/mouse press on the non-dedicated desktop PCs, all grid operations are wound up and the VM is migrated to some other suitable physical machine.

So it is mandatory to decide the parameters that are to be considered for expressing the resource usage policies, a common schema for resource usage policy expression and a policy engine to interpret and evaluate the usage policies, with the functionality to add new usage policies, manage the existing policies and integrate the same with the grid meta-scheduler in order to realize a controlled grid environment.

3. Related work

Since our proposed framework involves resource usage policies, SLAs, and virtual resource provisioning and management, we group our related work into these categories.

3.1. Resource usage policies

In [9], a model is proposed to facilitate resource usage policy based allocation in grids. The authors implement the model and integrate it with the Maui scheduling mechanism. The literature [10] proposes a mechanism to express resource usage policies and enforce them in a grid. It uses a request/response paradigm based on XACML and introduces the relevant attributes to express and enforce grid resource usage policies.
In our work, we propose a policy management system and a generic schema to express the usage policies for grid resources based on their availability, peak load and off-peak load time, etc. Hence, to find suitable resources for a request,
all the available grid resources are filtered against their usage policy first and then against the hardware requirements. Also, a resource contributing to the grid environment need not have a single resource usage policy; it may have 'n' policies, with each policy having 'm' parameters. We also look into this issue and evaluate the proposed system behavior by varying the number of resources and the number of policies per resource. Since the policy management system is integrated with the CRB, it is helpful in realizing controlled grid resource sharing.

3.2. Service Level Agreements

In [11], a business oriented virtual resource provisioning architecture using SLAs is proposed. However, that approach does not concentrate on resource usage policy based SLA creation and does not deal with the significance of virtualization concepts in its architecture. The advantages of computing power services are explained in [12]; it also deals with service negotiation and SLA mappings in the Grid/Cloud arena. The algorithm in [13] allocates resources by considering the SLA alone; it does not deal with any virtualization techniques to provision the resources. The QoS provisioning method in [14] considers not only real-time monitoring information but also data obtained from previous experience. In [15], a cooperative model that tailors the P2P interaction model to Grid environments for QoS aware resource discovery and SLA negotiation is illustrated. A dynamic negotiation strategy for creating SLAs using WS-Agreement is specified in [16]. The Virtual Resource Manager proposed in [17] bridges the gap between computing demands with a particular level of QoS and the local resource management systems that provide the resources to the grid environment. In [18], Sahai et al. introduce SLA formation in commercial Grids. Keller and Ludwig [19] propose the Web Service Level Agreement (WSLA) specification, which is an SLA language. In order to support Web Service Resource Framework (WSRF) services, an agreement based SLA specification was defined by Andrieux et al. [20] and accepted by the Global Grid Forum as WS-Agreement. The automation of monitoring SLAs for Web Services is proposed by Sahai et al. [21]. Nudd et al. [22] develop PACE (Performance Analysis and Characterization Environment), which incorporates source code analysis and hardware modeling. Further in the discussion, PACE is employed for Grid resource scheduling, taking into consideration static hardware components but not dynamic changes to resource performance. Smith [23] uses historic information to predict the execution run-time of parallel applications; the discussion does not provide heuristics on anticipated execution time. Huedo et al. [24] consider workload parameters to periodically evaluate adaptive execution actions. The formal SLA specification, integration and calculation of remaining execution time are not addressed in this model. The literature [25] explains a methodology to federate multiple resources using agent negotiation.
In our current work, we implement the proposed GSMA architecture as grid services and integrate it with the CRB in order to ensure that the user's QoS requirements are fulfilled until the end of the execution. Further, we make the CRB intelligent enough to select a minimal number of resource providers in such a way as to minimize the number of negotiations between the CRB and the resource providers. This intelligence is incorporated in the DRS scheduling algorithm, which can predict the response of the negotiation (based on the accepted usage scenario or resource capability) with some degree of accuracy and decide whether or not to select that particular resource provider for negotiation. This reduces both the traffic for each SLA generation and the average SLA creation time. Also, the QoS aware match-making of DRS supports QoS aware resource management. Moreover, the GSMA supports the automated creation/deletion of SLAs, the monitoring of the runtime parameters of a job and the automatic enforcement of SLAs against any violation produced by the consumer/resource provider.

3.3. Virtualization

In [26], Foster et al. discuss the commonalities between cloud and grid computing in terms of vision, architecture and technology, differentiate them in terms of programming, business and computation models, and finally highlight the challenges in realizing cloud computing. In [27], Buyya et al. explain the vision of cloud computing and propose a market-oriented allocation architecture for resources within Clouds with negotiation of QoS and SLA support. Keahey et al. [28,29] introduce the concept of the virtual workspace (VW), which aims to provide a customizable and controllable remote job execution environment for the Grid. Virtual Workspaces support unmanned installation of legacy applications, which can effectively reduce the deployment time. Freeman et al. [30] address the management issues arising from the division of labor. The abstractions and tools allow clients to dynamically configure, deploy and manage required execution environments in application-independent ways as well as to negotiate enforceable resource allocations for the execution of these environments. We also use a similar approach to deploy virtual resources on a physical resource, but the virtual resource information is aggregated and integrated with Globus' MDS. With this approach, both physical and virtual resource information can be aggregated at the meta-scheduler level. Foster et al. [31] extend the virtual workspace to encompass the notion of a cluster. They describe the extensions needed for workspace definition, protocols and other services to create a virtual cluster for authorized grid clients. They have conducted several experiments to analyze the time complexity of virtual cluster creation and the deployment of a required execution environment. A dedicated private virtual cluster creation per VO that provides customized, homogeneous execution environments on an existing grid backbone is illustrated in [32]. The methodology to suspend, migrate and resume heavy weight virtual clusters without disrupting the application execution is explained in [33]. Sotomayor [34] develops a model in which a virtual workspace is associated with a well-defined resource allocation strategy to enable the accurate, efficient and timely creation of virtual workspaces. It allows a remote client to create virtual resources securely using standard Web services based protocols and services. Further, in [35], they address methods to minimize the overhead introduced by the deployment of VM images before the start of a resource lease. An open source virtualization tool called OpenNebula, which manages the dynamic deployment and replacement of virtual machines in a data center or in a cluster, is explained in [36]. It mainly concentrates on virtualizing the cluster nodes from the head node, whereas our research work focuses on virtualizing resources from the meta-scheduler. GREEN, a distributed matchmaking mechanism that orders the grid resources by executing some benchmarks independent of the underlying middleware, is explained in [37]. In our previous work [5], the proposed framework is capable of creating virtual resources over the existing grid resources across the network. It applies resource scheduling strategies to decide on which resource the virtual machines are to be created. But the overhead involved in deciding the case (i.e. single cluster, virtual co-allocation or virtual cluster) is high, since it adopts the various brokering strategies one after the other (round robin).
In our current work, we adopt a deviation based resource scheduling algorithm that decides the region in which a resource falls in a single comparison. So the overhead in making a decision to identify the case (i.e. single cluster, virtual co-allocation or virtual cluster) is greatly reduced. Also, this classification provides a way to optimize the resource allocation by combining other resource scheduling techniques with DRS, thereby leading to an increase in the throughput of the meta-scheduler. In addition, the user is hidden from the underlying complexity of negotiating with the potential grid resources and from complex resource usage policies.

4. Virtual resource management system

Static and operational entities are the two mandatory requirements for effectively managing any virtualized environment. The static entities include a concrete view of the participating physical systems and their static information about the hardware, software environments, policy information, supported hypervisor, logical volume manager and the virtual machine images available in the image repository. These static entities are constant during the operation of the Grid. The operational entities include the dynamic information about the current status of the physical resources, lifecycle management of VM provisioning, monitoring of VMs and the catalog of deployed virtual machines and their hosting physical resources. Lifecycle management of any VM has to manage the following three entities: dynamic reconfiguration, VM provisioning and VM tracking. The dynamic reconfiguration entity deals with ways to modify the VM image configuration parameters. The reliable mechanism to manage all the VM creation requests is handled by the VM provisioning entity. Finally, the VM tracking entity provides the capability to track the current status of any VM (i.e. start, stop, suspend, pause and destroy). The monitoring entity deals with the health of the deployed VMs, the physical resources on which these VMs are deployed and their associated run time attributes, such as RAM, secondary storage, number of CPUs and CPU utilization. The virtual machine images are stored in the image repository. This repository provides a store for a variety of images, which are needed for the execution of jobs. These images can be used for VM creation when needed, depending on the job requirements. The hypervisor adapter present in the VRM can contact both Xen and VMware. After identifying both the VM image that is suitable for the current job requirements from the image repository and the physical machine on which this VM is going to be deployed, the scheduler initiates the transporter to transfer the compressed VM image to the selected physical machine, activates the VM image (i.e. assigning the hostname and IP address) and begins booting the newly deployed VM. After the booting up of the VMs or virtual cluster, the job is submitted, and at the end of its execution these VMs are deleted. The entire operation of the above mentioned components constitutes the Virtual Resource Management System (VRMS).

4.1. Policy management system (PMS)

Managing grid resources is a complex task, since the grid spans multiple organizations. Every organization has its own resource usage policies expressed using its own mechanism. Whenever the resources from these organizations are contributed to the grid environment, it is not possible to set a single usage policy as a whole, because the nature, locality and business hours of an organization may determine the available time of the resources, the amount of resources that can be contributed to the grid and the desired usage scenarios. Currently, there is no mechanism available to give assurance to the resource providers that their local resources are never overrun. In order to realize a controlled grid resource sharing environment, there must be a common mechanism to express these usage policies for all resource providers participating in the grid, and these policies should be interconnected with the grid meta-scheduler in order to enforce the desired usage scenario. In general, there are two kinds of policies in the grid environment: access policies and usage policies.
The access policies are used for authentication and authorization, and the usage policies specify the accepted usage scenario. The policy management system proposed in this paper has the following capabilities:

• The resource providers can express their resource usage policies through the policy editor.

• Adding/deleting the new/existing policies to/from the policy repository.

• Policy matchmaking mechanism that can identify the appropriate policy from a set of usage policies and compare the usage policy against the current job requirements.
• Integration of this policy management system with our CARE resource broker.

The resource usage policy expression has to view the policies at different levels of the grid environment, which include local policies, grid site policies and operational policies. The local policies mainly concentrate on the resources within a domain. They mainly focus on the amount of contribution (such as CPU load and storage) towards the grid environment of every resource that participates in the grid. For example, a usage policy that specifies ''. . . provide 20% of disk space towards grid on machine A. . . '' falls under this category. When many resources belonging to different domains combine together to form a grid, there must be some rules and regulations about the sharing of resources and the quality of service to be maintained during the interaction between them. These are specified under grid site policies; such policies are otherwise called service level agreements. For example, ''. . . the MIT has to provide 60 CPUs to Garuda whereas the Garuda has to provide 100 Mbps connectivity to MIT . . . '' comes under grid site policies. These policies mainly concentrate on single sites rather than the entire operation of a grid environment. The policies that concentrate on the entire operation of a grid environment are called operational policies. For example, ''. . . The overall load for VO1 should be less than 70%. . . '' belongs to this category. After viewing the policies at various levels, the next step is to express the policies using some existing/new policy language. Grid resource policy expression is a twofold process. The first step is to identify the policy language for expressing the policy information. The next step is to define the policy schema (i.e. vocabulary) for that particular domain. In a grid environment, no standard language has been followed to express policy information. We use the WS-Policy specification to express policy information, thereby making it possible to address interoperability issues across various grids and grid resources. WS-Policy provides a flexible and extensible grammar for expressing the capabilities, requirements and general characteristics of entities in an XML Web services based system. We define a new XML schema for grid environments to express the usage policies. The usage policy expressed (only local policies) using this schema is embedded into WS-Policy and put into a policy repository. Every resource provider can specify one or more usage policies for every resource contributed to the grid environment. The proposed schema for resource usage policy is shown in Fig. 1.

Fig. 1. Proposed schema for resource usage policy.

Every resource provider has a unique ID called the RPID, specified within the ResourceProvider tag. Every policy has a unique ID (PID), specified within the Policy tag. The name of the resource bounded by the current ID is specified in the ResourceName tag under the Resource tag. The duration of availability of each resource is specified in the Availability tag. Also, if the resource provider is willing to allow only users from a particular domain, they can use the Allow tag that is available under the Security tag. The FileSystem tag is used to specify the mount point where the users' jobs use the space available in this filesystem for executables, input and output files. The number of CPUs, the amount of CPU load, the bandwidth and the RAM contributed towards the grid environment are specified in the CPUCount, CPULoad, NetworkBandwidth and PhysicalMemory tags, respectively.
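To make the schema concrete, the following is a minimal sketch, not the CRB code, of how a usage policy could be serialized into an RSLA.xml fragment using the tags described above; the identifiers, values and the helper method are illustrative assumptions only.

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class RslaPolicyWriter {

    public static void main(String[] args) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();

        Element provider = doc.createElement("ResourceProvider");
        provider.setAttribute("RPID", "RP-CARE-01");          // unique resource-provider ID (illustrative)
        doc.appendChild(provider);

        Element policy = doc.createElement("Policy");
        policy.setAttribute("PID", "P-001");                  // unique policy ID (illustrative)
        provider.appendChild(policy);

        Element resource = doc.createElement("Resource");
        policy.appendChild(resource);
        append(doc, resource, "ResourceName", "care-xen-cluster");
        append(doc, resource, "Availability", "2010-08-12T18:00/2010-08-15T18:00");

        Element security = doc.createElement("Security");
        resource.appendChild(security);
        append(doc, security, "Allow", "mit.annauniv.edu");   // users from this domain only

        append(doc, resource, "FileSystem", "/tmp");          // mount point for executables and I/O files
        append(doc, resource, "CPUCount", "16");
        append(doc, resource, "CPULoad", "80");               // maximum CPU load (%) contributed to the grid
        append(doc, resource, "NetworkBandwidth", "100");     // Mbit/s
        append(doc, resource, "PhysicalMemory", "8192");      // MB

        // Serialize; in the broker such a fragment would be embedded into WS-Policy
        // and appended to RSLA.xml in the policy repository.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.transform(new DOMSource(doc), new StreamResult(System.out));
    }

    private static void append(Document doc, Element parent, String tag, String text) {
        Element e = doc.createElement(tag);
        e.setTextContent(text);
        parent.appendChild(e);
    }
}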
Fig. 2. Design of policy editor.

In order to make policy creation easy, we have transformed the above mentioned tags into a portal, where a resource provider can simply enter the desired usage scenario and submit it to the policy manager (refer to Fig. 2). The policy manager then converts the entries in the portal into an XML file (RSLA.xml) and stores it in the policy repository. There is also portal support to add/modify/delete new/existing policies in the repository. Whenever a resource provider wishes to add a new policy, he/she simply logs in as a resource provider and clicks the addPolicy button. The portal is then redirected to the createPolicy page, where the resource provider enters the desired usage scenario and submits it to the policy engine, which updates the policy repository. The policy engine adds one more Policy element (tag) to the corresponding ResourceProvider tag, identified with the help of the logged-in RPID, and updates the RSLA.xml file. To modify an existing policy, the resource provider clicks the modifyPolicy button, which shows all the policies created by him one by one. He may use the next navigation key to navigate through the policies, make the appropriate changes and add the result to the policy repository. In this case, the policy engine updates the corresponding modified Policy tag, identified with the help of the PID within the related ResourceProvider tag tracked by the RPID. Existing policies can be deleted through the deletePolicy button. Whenever the resource provider clicks the delete button, all the policies belonging to that resource provider are retrieved from the repository and displayed. The resource provider may then navigate to the appropriate policy, delete it and update the repository by removing the corresponding Policy tag.

In the third phase, we first find the appropriate usage policy and then compare that usage policy against the current job requirements in order to test the suitability of that particular resource. For example, a particular resource provider may define many usage policies for a single resource based on his business hours. In other words, the resource provider determines the resource-available time and the amount of resources contributed to the grid environment based on local usage. Whenever the user submits a job to the meta-scheduler, he may want to run the job immediately (i.e. on-demand) or some time later (i.e. advance reservation). The meta-scheduler then has to identify the resources that are available immediately or at the Expected Time of Run (ETR), respectively. In both cases, the meta-scheduler first has to identify the appropriate usage policy for all the available resources. In order to select the correct usage policy, it uses the availability of resources specified in the usage policy as its primary key. For example, suppose a user submits a job to the meta-scheduler and expects it to run at a later time. There may currently be 100 resources available for job submission, but there is no assurance that all 100 resources will be available at the expected time of run. To discover the resources that are available at the ETR, the meta-scheduler simply refers to the available time of the resources specified in the usage policy. After consulting the usage policy, it may find that only 50 resources are available at the desired time of run. So the first level of filtering is done simply by referring to the usage policy. The next level of filtering is done by matchmaking the capability of the resultant resources (i.e. 50 resources) against the current job requirements. The next step is to create SLAs by negotiating with these resources; in order to reduce the number of negotiations, we use our own deviation based resource scheduling algorithm [6,7]. The pseudo code for selecting the appropriate policy is shown below, where A = {host_1, host_2, . . . , host_n}, RUP(A[i]) = {RUP_1, RUP_2, . . . , RUP_n}, i = 0, 1, . . . , n, and ETR = Expected Time of Run.

Select_RUP (A[ ], RUP[ ], ETR)
  for s ← 1 to length[A]
    for t ← 1 to length[RUP[A[s]]]
      if ETR ⊆ Available_time[t]
        then set_matched_resources (A[s], RUP[t])
      else
        continue
      endif
    endfor
  endfor
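The following is a minimal Java sketch, under assumed class and field names rather than the broker's actual implementation, of the first-level filtering performed by Select_RUP: a host is retained only if at least one of its usage policies has an availability window covering the job's ETR.

import java.time.LocalDateTime;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class SelectRupSketch {

    /** One resource usage policy; only the availability window matters in this phase. */
    record UsagePolicy(String pid, LocalDateTime availableFrom, LocalDateTime availableTo) {
        boolean covers(LocalDateTime etr) {
            return !etr.isBefore(availableFrom) && !etr.isAfter(availableTo);
        }
    }

    /** First-level filtering: policy availability only; capability is matched later. */
    static List<String> selectRup(Map<String, List<UsagePolicy>> policiesPerHost, LocalDateTime etr) {
        return policiesPerHost.entrySet().stream()
                .filter(e -> e.getValue().stream().anyMatch(p -> p.covers(etr)))
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        LocalDateTime etr = LocalDateTime.of(2010, 8, 12, 18, 0);   // expected time of run
        Map<String, List<UsagePolicy>> policies = Map.of(
                "care-xen-cluster", List.of(new UsagePolicy("P1",
                        LocalDateTime.of(2010, 8, 12, 8, 0), LocalDateTime.of(2010, 8, 12, 20, 0))),
                "mit-desktop-pool", List.of(new UsagePolicy("P2",
                        LocalDateTime.of(2010, 8, 13, 0, 0), LocalDateTime.of(2010, 8, 14, 0, 0))));
        System.out.println("Policy-matched resources: " + selectRup(policies, etr));
    }
}

The capability check (matchmaking and deviation computation) then runs only on the hosts that survive this filter, which is what reduces the number of negotiations per SLA.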

The last phase is to integrate this policy engine with our own grid meta-scheduler. While integrating, the first two phases are independent and are integrated with the resource provider login. The third phase has to be integrated with the matchmaking module of our CARE resource broker, so that the CRB uses this policy information as a key factor to select the resources for job submission rather than selecting the resources based only on their capability.

4.2. Scheduler

The match-making component of the CRB gathers the available physical resources' information using the monitoring and discovery component of the grid middleware (here Globus Toolkit 4.0) and filters out the resources by verifying the usage policy to ensure that these resources will be available at the ETR of a job. The next level of matchmaking is then done by measuring the capability of the resultant resources obtained from the previous phase against the current job requirements. We integrated our own deviation based resource scheduling algorithm with the CRB to estimate the capability of the resources. This algorithm uses the deviation against the current job requirements as its metric to calibrate the resources' capacity. The pseudo code for matchmaking is given below, where TSLA denotes the current job requirements.

Matchmaking (TSLA, get_matched_resources ( ))
  for s ← 1 to length[get_matched_resources ( )]
    if TSLA ⊆ RUP
      then deviation[s] = RUP − TSLA
    else
      deviation[s] = TSLA − RUP
    endif
  endfor

Then the scheduler orders the resources based on their deviation values. If a job has any resources in the exact or plug-in region, then the job can be executed with the help of a single physical resource provider, and there is no need to co-allocate more than one resource provider (physical resources) or to create virtual machines. Suppose a job does not have any resources in the exact or plug-in region but has resources only in the subsume region. In that case, we carefully explore this region and propose some scheduling use cases in such a way as to use the resources in this region to the fullest possible extent. The proposed scheduling use cases are physical co-allocation, virtual co-allocation and virtual cluster creation. Physical co-allocation takes place when the resources in the subsume region satisfy the software requirements, QoS requirements and hardware requirements except for the number of CPUs needed for job execution. In such a scenario, the meta-scheduler co-allocates two or more physical resources and executes the job on the newly created logical resource. The virtual co-allocation scenario is the same as physical co-allocation; the only difference is that, instead of co-allocating physical resources, it explores the possibility of creating virtual machines on the physical resources themselves, thereby growing the number of resources by co-allocating these newly created virtual machines with physical resources. Finally, virtual cluster creation is initiated only when the resources in the subsume region satisfy the hardware and QoS requirements but not the software requirements. The scheduling scenarios derived from the deviation based resource scheduling algorithm are shown in Table 1.

Table 1
Scheduling scenario for CRB using DRS.

Scheduling scenario      | Hardware requirements (HR)  | Software requirements (SR) | QoS requirements | Region
                         | NCPU   RAM   SS   Speed     | OS    Software             | BW    Latency    |
Single physical          | ✓      ✓     ✓    ✓         | ✓     ✓                    | ✓     ✓          | Exact or Plug-in
Physical co-allocation   | ×      ✓     ✓    ✓         | ✓     ✓                    | ✓     ✓          | Subsume
Virtual co-allocation    | ×      ✓     ✓    ✓         | ✓     ✓                    | ✓     ✓          | Subsume
Virtual cluster          | ×|✓    ✓     ✓    ✓         | ×     ×                    | ✓     ✓          | Subsume

The pseudo code for deviation based resource scheduling is given below, where A = {host1, host2, . . . , hostn} and B = {job1, job2, . . . , jobn}.

Schedule (A, B)
  for j ← 1 to length[B]
    get_matched_resources ← Select_RUP (A[ ], RUP[ ], ETR)
    deviation[ ] ← Matchmaking (TSLA, get_matched_resources)
    selected_host[ ] ← DRS (deviation[ ], B[j])
    switch (scenario)
      case 1:
        submit_physical (selected_host[ ], B[j])
        break
      case 2:
        submit_physical_coalloc (selected_host[ ], B[j])
        break
      case 3:
        selected_vm[ ] ← VM_Select (TSLA, VM_image_repository, selected_host[ ])
        tr_status ← VM_Transfer (selected_host[ ], selected_vm)
        if tr_status = true then
          job_completed ← execute_job (B[j])
          if (job_completed = true)
            then VM_Recycle (selected_host[ ], selected_vm)
          endif
        endif
        break
      case 4:
        selected_vm ← VM_Select (TSLA, VM_image_repository, selected_host[ ])
        tr_status ← VM_Transfer (selected_host[ ], selected_vm)
        if tr_status = true then
          virtual_cluster_creation ( )
          job_completed ← execute_job (B[j])
          if (job_completed = true)
            then VM_Recycle (selected_host[ ], selected_vm)
          endif
        endif
        break
      default:
        exit (1)
  endfor

DRS (deviation[ ], B[j])
  Sort_descending (deviation[ ])
  for n ← 1 to length[deviation]
    if (deviation[n] >= 0)
      then scenario = 1
           selected_host1[ ] = sla_negotiate (A[n])
    if (deviation[n] < 0 && (!NCPU && HR && SR && QoS))
      then scenario = 2
           selected_host2[ ] = sla_negotiate (A[n])
    if (deviation[n] < 0 && (!NCPU && HR && SR && QoS))
      then scenario = 3
           selected_host3[ ] = sla_negotiate (A[n])
    if (deviation[n] < 0 && ((!NCPU || NCPU) && RAM && !SR && QoS))
      then scenario = 4
           selected_host[ ] = sla_negotiate (A[n])
  endfor
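As an illustration of the classification the DRS pseudo code performs, the sketch below, with assumed record and field names rather than the CRB's data structures, maps a policy-filtered host offer and a job request onto the scheduling scenarios of Table 1.

import java.util.List;

public class DrsSketch {

    /** Per-host capabilities as advertised through its usage policy (illustrative fields). */
    record HostOffer(String name, int cpus, int ramMb, int diskGb,
                     boolean softwareSatisfied, boolean qosSatisfied) {}

    /** Hardware part of the job requirements (TSLA). */
    record JobRequest(int cpus, int ramMb, int diskGb) {}

    enum Scenario { SINGLE_PHYSICAL, PHYSICAL_COALLOCATION, VIRTUAL_CLUSTER, UNSCHEDULABLE }

    /** Scenario 1 when every requirement is met (exact/plug-in region); the subsume
     *  cases mirror the conditions checked in the DRS pseudo code above. */
    static Scenario classify(HostOffer h, JobRequest job) {
        boolean ncpuOk = h.cpus() >= job.cpus();
        boolean hwRestOk = h.ramMb() >= job.ramMb() && h.diskGb() >= job.diskGb();

        if (ncpuOk && hwRestOk && h.softwareSatisfied() && h.qosSatisfied()) {
            return Scenario.SINGLE_PHYSICAL;                 // exact or plug-in region
        }
        if (!ncpuOk && hwRestOk && h.softwareSatisfied() && h.qosSatisfied()) {
            // Only the CPU count falls short: co-allocate physical resources or grow the
            // host with additional VMs (virtual co-allocation); the physical case is returned here.
            return Scenario.PHYSICAL_COALLOCATION;
        }
        if (hwRestOk && !h.softwareSatisfied() && h.qosSatisfied()) {
            return Scenario.VIRTUAL_CLUSTER;                 // deploy pre-configured VM images
        }
        return Scenario.UNSCHEDULABLE;
    }

    public static void main(String[] args) {
        JobRequest job = new JobRequest(32, 8192, 100);
        List<HostOffer> offers = List.of(
                new HostOffer("xen-cluster", 48, 16384, 500, true, true),
                new HostOffer("pc-cluster", 16, 16384, 500, true, true),
                new HostOffer("vmware-cluster", 64, 32768, 500, false, true));
        for (HostOffer h : offers) {
            System.out.println(h.name() + " -> " + classify(h, job));
        }
    }
}

The actual broker additionally orders hosts by their deviation values and distinguishes physical from virtual co-allocation based on the resources actually available, which this sketch does not attempt.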

4.3. Service level agreements

Service level agreements (SLAs) are powerful mechanisms for expressing all the commitments, expectations and restrictions in a business transaction. In a grid environment, the brokers should have negotiation support to establish SLAs, which is still lacking in most conventional brokers/meta-schedulers. These newly created SLAs clearly express the required QoS to be maintained until the end of application execution, the restrictions on resource usage and the penalties in case of failure to provide the accepted QoS or breaching the restrictions on resource usage. After the successful creation of SLAs, the resource provider allows the user to gain access to the resources. These resources should be monitored in order to ensure that the committed QoS is obeyed. In case of any violation of these committed values, there should be a mechanism to penalize the user or resource provider. The proposed schema for each SLA is shown in Fig. 3; the WS-Agreement acts as a wrapper around this schema.

Fig. 3. Proposed schema for service level agreement.

The proposed schema consists of one root element called the SLA tag. It consists of five child elements: JobID, ID, Expiration, SLO (Service Level Objective) and Penalty. The JobID tag specifies the job ID for which this SLA is created. The ID tag specifies the SLA ID. The expiration time of this SLA is specified in the Expiration tag. The requirements of each job are specified in the child element called Job, whereas the committed values of the resources against this job are specified in the Resource child element. If there is any violation of an SLO on either side (resource provider or user), then the penalty mechanism is specified in the Penalty tag. The GSMA consists of four grid services: SLA Creation Service, SLA Monitoring Service, SLA Job Monitoring Service and SLA Enforcement Service. The SLA Creation, SLA Monitoring and SLA Enforcement services are deployed in the meta-scheduler, whereas the SLA Job Monitoring service is deployed in each resource provider.

1. SLA creation. The SLA creation service of GSMA starts with negotiation. It implements the Mutual Agreement Protocol (MAP) [6,7] for negotiation and supports pluggable negotiation strategies. Negotiation is a methodology by which an agreement is established on a range of objectives based on an interaction between the
participants using some strategies. Here, the interactions are done through a common negotiation protocol that supports multiple negotiation strategies, such as unilateral bargain, bilateral bargain and yes/no (i.e. no more bargaining is supported; we implement this as our negotiation strategy). In our implementation, the requirements of an application are expressed as a tender and submitted to the CRB. The resource providers express their bids as resource usage policies. Each resource provider can have more than one resource usage policy based on the availability of resources, peak load time, least load time, etc. The matchmaking module of the CRB selects a resource provider for negotiation only when the resource provider has a favorable resource usage policy against the tender in all aspects at the expected time of job execution. For instance, if a tender needs a job to be executed five hours after the time of submission, then the matchmaker identifies the appropriate resource usage policy to be applied for that time (i.e. after 5 h), extracts the bid from that policy and compares it against the tender. If the bid satisfies the tender, then the resource provider will be selected for negotiation. If not, then the resource provider will be selected for negotiation only when the negotiation strategy is other than yes/no; the latter tries to reduce the length of the negotiation process by considering the success probability of bargaining, whereas the former is a certain event. At the end of the SLA creation phase, the global SLA is formed for that particular tender, containing the parties (i.e. resource providers) involved, the service level objectives (SLOs), penalties, etc. Once the global SLA is created, it is stored in an SLA database (i.e. repository), which is a plain XML file, and the parties involved are reserved against that particular job. The SLA Creation service takes care of the negotiation objectives, the parties involved in the negotiation, the accepted functional and non-functional objectives and the expiration time of SLAs, and all these terms and conditions are stored in an XML file. This repository is referred to by the SLA monitoring service while executing the job in order to monitor the Service Level Objectives (SLOs). The SLA creation module consists of the SLA Creation Client service and the SLA Creation Server service.
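The sketch below illustrates only the yes/no strategy described above, with illustrative record names rather than the MAP implementation: the bid extracted from the usage policy valid at the expected time of execution is accepted for SLA negotiation only if it satisfies every term of the tender.

public class YesNoNegotiationSketch {

    record Tender(int cpus, int ramMb, double bandwidthMbps) {}
    record Bid(int cpus, int ramMb, double bandwidthMbps) {}

    /** Yes/no strategy: no bargaining, accept only if every term of the tender is met. */
    static boolean acceptForNegotiation(Tender tender, Bid bid) {
        return bid.cpus() >= tender.cpus()
                && bid.ramMb() >= tender.ramMb()
                && bid.bandwidthMbps() >= tender.bandwidthMbps();
    }

    public static void main(String[] args) {
        Tender tender = new Tender(32, 8192, 100.0);
        Bid bidFromPolicy = new Bid(48, 16384, 100.0);   // extracted from the RUP valid at the ETR
        System.out.println("Select provider for SLA negotiation: "
                + acceptForNegotiation(tender, bidFromPolicy));
    }
}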

SLA creation client service: The scheduler invokes the SLA Creation Client service with a pool of Job objects accumulated in a queue. The job object contains the matched resources for that job. The SLA Creation Client service invokes the SLA Creation Server service in order to negotiate with the matched resources. The matched resources are ordered based on their deviation values. After successful negotiation with the ordered matched resource providers, the concrete SLAs are created and stored in the SLA repository.

SLA creation server service: The SLA Creation Server service extracts the ordered matched resources and the job requirements from the Job object and starts negotiation with the matched resource providers against the functional and non-functional requirements. The functional requirements involve the hardware and software requirements, and the non-functional requirements involve the response time, bandwidth, latency, etc. The committed functional and non-functional values are recorded as Service Level Objectives, the resource providers participating in the negotiation are termed the parties involved, and the punishments against any violations are termed penalties. These terms are gathered into a concrete SLA and stored in the SLA repository. Apart from the above mentioned terms, the concrete SLA contains the jobID for which the SLA is created, a unique SLAID term and the expiration time of this SLA.

2. SLA monitoring. Next, the application is submitted to those committed parties through the executor of the CRB. When the application occupies the executing resources, the job status is modified from pending to active by the executor of the CRB. Then the SLA Monitoring service is invoked, which in turn creates a separate application monitoring thread for each active application and subscribes to the SLA Job Monitoring service in order to get notification of violations against the guaranteed objectives. The runtime parameters of an active application are gathered by the SLA Job Monitoring service. It uses software sensors to gather the hardware and process information. The SLA Accounting module gets the usage information from the SLA Job Monitoring service and estimates the usage cost and penalty. The SLA monitoring module consists of two services: the SLA Monitoring Client service and the SLA Monitoring Server service. The SLA Monitoring Client service is invoked by the Executor, which in turn invokes the SLA Monitoring Server service. The SLA Monitoring Server service invokes the corresponding SLA Job Monitoring service of the resources on which the job is executing. The detailed implementation of the SLA monitoring module is explained in the following sections.

SLA monitoring client service: After the successful transfer of the executables and input files to the targeted resource, the Executor starts the execution on the targeted resources. After the successful initiation of the job, the executor invokes the SLA Monitoring Client service with the Job ID as its argument. The SLA Monitoring Client service starts the SLA Monitoring Server service in order to capture violations.

SLA monitoring server service: The SLA Monitoring Server service gets the job ID from the SLA Monitoring Client service and extracts the job object for that job ID using the job helper class. The job helper is a helper class that gets the job ID as its input, iterates through all the job objects available in the job pool and returns the appropriate job object for that job ID. Moreover, any job object contains information such as the requirements of that job, all the matched resources for that job and the scheduled resource(s) (i.e. the resources actually executing the job). The SLA Monitoring Server service extracts the scheduled resource(s) from the job object. After identifying the resources that are actually executing the job, the SLA Monitoring Server service invokes the SLA Job Monitoring service that is deployed in the Globus container of the executing resource. While invoking, the SLA Monitoring Server service passes the SLOs to the SLA Job Monitoring service.
3. SLA job monitoring service. The SLA Job Monitoring service consists of scripts for extracting the job's runtime parameters, such as RAM, secondary storage, CPU load, bandwidth and latency. These scripts are generated automatically in the job directory, since we modified the job manager scripts (i.e. pbs.pm and sge.pm). The runtime parameters of the job are compared against the committed values. Only when the runtime values exceed the SLOs is a notification sent to the SLA Monitoring Server service, from where the SLA Enforcement Client service is initiated.

4. SLA enforcement. The SLA enforcement module is used to identify the violation producer and to determine the penalty for the violation. In our case, we consider four kinds of penalty: Credit, Block, Purge and Penalty. The enforcement module consists of the SLA Enforcement Client service and the SLA Enforcement Server service. The SLA Enforcement service is invoked when the SLA Monitoring service gets the violation notification from the SLA Job Monitoring service, and it penalizes the user or resource provider who caused the violation (i.e. bilateral enforcement).

SLA enforcement client service: This service is invoked by the SLA Monitoring Server service and investigates the violator (user or provider) of the objective and the parameter (RAM, secondary storage, load) which violates the objective. Based on this decision, it invokes the SLA Enforcement Server service, which refers to the SLA in the SLA repository, decides the kind of penalty and invokes the appropriate penalty mechanism.

SLA enforcement server service: The SLA Enforcement Server service gets the inputs from the SLA Enforcement Client service and queries the SLA repository to get the appropriate penalty mechanism to be invoked. Since we implement the penalty mechanism as a pluggable architecture, we can incorporate additional penalty mechanisms in the SLA enforcement engine. Based on the kind of penalty obtained from the SLA repository, it invokes the appropriate penalty mechanism.
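A minimal sketch of the check performed by the SLA job monitoring service is shown below; the metric names and values are assumptions, and the real service gathers them through the software sensors generated by the modified job manager scripts.

import java.util.List;
import java.util.Map;

public class SloCheckSketch {

    /** One committed objective, e.g. RAM <= 8192 MB or CPU load <= 80%. */
    record Slo(String metric, double committedMax) {}

    /** Returns the metrics whose measured value breaks the committed objective. */
    static List<String> findViolations(List<Slo> objectives, Map<String, Double> measured) {
        return objectives.stream()
                .filter(slo -> measured.getOrDefault(slo.metric(), 0.0) > slo.committedMax())
                .map(Slo::metric)
                .toList();
    }

    public static void main(String[] args) {
        List<Slo> slos = List.of(new Slo("ram_mb", 8192), new Slo("disk_gb", 40), new Slo("cpu_load", 80));
        Map<String, Double> runtime = Map.of("ram_mb", 6100.0, "disk_gb", 55.0, "cpu_load", 72.0);

        List<String> violations = findViolations(slos, runtime);
        if (!violations.isEmpty()) {
            // In the broker this notification goes to the SLA monitoring server service,
            // which then invokes the SLA enforcement client service.
            System.out.println("SLO violation(s) detected: " + violations);
        }
    }
}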
4.4. Virtual Resource Manager (VRM)

The VRM is a system that simplifies and automates the deployment and management of a virtual computing infrastructure in which the VMs are provisioned, monitored and managed from birth to death. The VRMS provisions the VMs from a pool of VM images stored in an image repository. In our context, every VM request must pass through the following VM provisioning lifecycle: Request, Select, Provision, Transfer, Activate and Recycle. When there is no single physical cluster or resource available to satisfy the current job request, the Request phase gets the VM creation request from the meta-scheduler with the hardware and software requirements of the VMs, such as the number of CPUs per VM, the number of VMs, RAM, hard disk size and the software libraries of each VM. The mandatory requirement for provisioning these newly created VMs is the selection of suitable physical resources over which they are to be deployed. The Select phase gets the suitable physical resources by matching them only against the hardware requirements specified in the job request with the help of the Deviation based Resource Scheduling (DRS) algorithm. Based on the hypervisor available (Xen or VMware) on those selected resources, it selects the type of VM image to be deployed (.img or .vmdk). After selecting the type of VM image, it matches the hardware and software requirements with the meta-data of the VM images available in the image repository. The meta-data provides information such as the operating system, the number of CPUs per VM, RAM, hard disk size and the software libraries of that particular VM. At the end of the Select phase, the VM images and the physical resource(s) over which these selected VMs are to be deployed are decided. The Provision phase analyzes the possibility of re-modifying the configuration of the VM images available in the repository if and only if no suitable VM image is found in the Select phase. The transfer of the selected VM image to the selected physical resource is then done in the Transfer phase, following which connecting to and using the VM is done in the Activate phase, which includes assigning the IP address and hostname and booting or migrating the VM. At the end, deleting the VMs and reclaiming the physical resources are handled in the Recycle phase. The entire VM provisioning lifecycle is shown in Fig. 4.

Fig. 4. VM provisioning lifecycle.

1. Hypervisor adapter. The virtualization layer installed over the physical machine is responsible for the transfer of VM images and for gathering the current resource state and status information. The VRM should have the capability to interoperate with the underlying virtualization layer (VMware, Xen, etc.). In order to achieve interoperability between the virtualization layers, we propose a hypervisor adapter in the VRM architecture. The hypervisor adapter obtains information about the hypervisor installed over the physical machine using an external information provider integrated with the Monitoring and Discovery component of the Globus toolkit through the Resource Property provider framework. Based on the hypervisor installed, it invokes the appropriate hypervisor adapter (either Xen or VMware) for getting information about the VMs and the physical machine. The VMware adapter is developed using the VI-SDK API. It includes functionalities to establish a session with the VMware Server installed on the physical machine, to get the physical host metrics and inventory information, the number of VMs present in the inventory, the power state of the VMs, user management, etc. The Xen adapter is developed using the Xen API. It includes functionalities to establish a session with a Xen installed machine using RPC, to get all the VMs and their UUIDs, power state, status, utilization information, etc. The pseudo code for the hypervisor adapter is shown below, where A = {host1, host2, . . . , hostn}.

Hypervisor_Adapter (A)
  for j ← 1 to length[A]
    hypervisor ← A[j].getHypervisor()
    do
      if hypervisor = vmware then
        VMWare_Adapter (A[j])
      else if hypervisor = xen then
        Xen_Adapter (A[j])
      else
        continue
    done
  endfor
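The adapter idea can be sketched as follows; the interface and the placeholder return values are assumptions, since the real adapters call the VI-SDK and Xen APIs as described above.

import java.util.List;
import java.util.Map;

public class HypervisorAdapterSketch {

    /** Common view the VRM needs, regardless of the underlying virtualization layer. */
    interface HypervisorAdapter {
        List<String> listVirtualMachines(String host);
        String powerState(String host, String vmName);
    }

    static class XenAdapter implements HypervisorAdapter {
        public List<String> listVirtualMachines(String host) { return List.of("xen-vm-01"); } // placeholder
        public String powerState(String host, String vmName) { return "running"; }            // placeholder
    }

    static class VMwareAdapter implements HypervisorAdapter {
        public List<String> listVirtualMachines(String host) { return List.of("vmw-vm-01"); } // placeholder
        public String powerState(String host, String vmName) { return "poweredOn"; }          // placeholder
    }

    /** Chooses the adapter from the hypervisor name published through the MDS resource property. */
    static HypervisorAdapter forHypervisor(String hypervisor) {
        switch (hypervisor.toLowerCase()) {
            case "xen":    return new XenAdapter();
            case "vmware": return new VMwareAdapter();
            default: throw new IllegalArgumentException("unsupported hypervisor: " + hypervisor);
        }
    }

    public static void main(String[] args) {
        Map<String, String> hostHypervisor = Map.of("xen-head", "xen", "vmware-head", "vmware");
        hostHypervisor.forEach((host, hv) -> {
            HypervisorAdapter adapter = forHypervisor(hv);
            System.out.println(host + " VMs: " + adapter.listVirtualMachines(host));
        });
    }
}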

5. Implementation

The CRB has been developed and implemented at the Centre for Advanced Computing Research and Education (CARE), Anna University, India. Our experimental setup consists of two clusters, one with six computing nodes, called the Xen cluster, and the other with four computing nodes, called the VMware cluster. Both clusters have the Globus middleware deployed, with the VRAS, VMS and VCS services running on each of their head nodes [5]. The Xen cluster is installed with Xen-3.1 as its hypervisor, whereas the VMware cluster uses VMware Server-2.0.0. The detailed experimental setup is explained in the next section. We assume that the user's application can be executed in any one of the images available in the image repository. The VRAS service runs periodically, collects the resource information specific to virtual cluster formation and registers it with Globus' WS-MDS. The WS-MDS uses the default index service to collect the available resources' information (both physical and virtual) periodically (approximately every 300 s). This information is provided to the information manager of the CRB and maintained in the host pool. The SLA enabled CRB is shown in Fig. 5.

Fig. 5. Architecture of modified CRB.

Before entering into the flow of the SLA enabled CRB, we briefly explain the modifications made to the existing CRB. The PMS is integrated with the host identifier, which is available in the controller component, in such a way that the policy information is used while selecting the resources. The image repository and the SLA creation modules are interconnected with the scheduler module; this enables negotiation with the resources selected by the scheduler and the selection of the VM image from the repository. The SLA monitoring module is integrated with both the Executor of the CRB and the SLA enforcement module, in order to monitor the SLOs and to detect violations, respectively. The VRM module is integrated with the hypervisor adapter in order to manage the created VMs.

The CRB accepts jobs in the standard Job Submission Description Language (JSDL) [38] format. Once the jobs are submitted to the CRB, the host identifier gets all the available resources' information from the host pool. It then requests the PMS to find the resources that are available at the ETR of the jobs. This is a twofold process: finding the appropriate policy from the set of policies defined for each particular resource, and generating the policy matched resources list based on the ETR alone, irrespective of capability. Every resource participating in the grid may have more than one usage policy, based on its availability. The PMS therefore gets the resource list and the ETR of the jobs from the host identifier, and contacts the policy repository (i.e. the DB) to obtain all the policies for each resource in the resource list. It compares the ETR of each job with the availability tag of the usage policy. It is mandatory to ensure that the resources are not reserved by other jobs during that time; this is done by consulting our advance reservation service (which is out of the scope of this paper). If a resource is available in that period, it is added to the policy matched resources list. This list is sent back to the host identifier to identify which resource in the list falls into which region (i.e. plug-in, exact or subsume) using the DRS algorithm. After this step, every submitted job has a set of matched resources (both policy and capability) that are grouped into three regions.

The scheduling type of each job is then decided based on the resource availability in these regions (i.e. which region, and which resource in that region). If a job has resources in all three regions, we use lollipop based selection [6,7] to decide the region to be considered. If a job has more than one resource in a region, the selection of resources from that single region is done based on the preferences specified by the user; the resources are ordered according to these preferences (highest bandwidth or lowest latency) before SLA negotiation is initiated. At the end of the SLA negotiation, the concrete SLA is created and stored in the SLA repository. The negotiated resources are reserved so that they are not considered for another job scheduling at that time. If the scheduling type of a job is single physical, the job is submitted to the physical cluster whose name is available within the Hostname tag of the SLA schema. If it is a physical co-allocation, the co-allocation service of Globus is called (this is also out of the scope of this paper).
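To make the two-step filtering described above more concrete, the sketch below first keeps only the resources whose usage-policy availability window covers the job's ETR, and then classifies each remaining resource into the plug-in, exact or subsume region. The data layout, the single-attribute comparison and the classification rule are simplifying assumptions for illustration only; the actual DRS algorithm and policy schema of the CRB are richer than this.

    from dataclasses import dataclass
    from typing import List

    @dataclass
    class UsagePolicy:
        avail_start: float   # start of the availability window (epoch seconds)
        avail_end: float     # end of the availability window

    @dataclass
    class Resource:
        name: str
        cpus: int
        ram_mb: int
        policies: List[UsagePolicy]

    def policy_matched(resource, etr_start, etr_end):
        """Keep a resource only if some usage policy covers the job's ETR window."""
        return any(p.avail_start <= etr_start and etr_end <= p.avail_end
                   for p in resource.policies)

    def classify_region(resource, req_cpus, req_ram_mb):
        """Rough stand-in for region classification: plug-in, exact or subsume."""
        if resource.cpus == req_cpus and resource.ram_mb == req_ram_mb:
            return "exact"
        if resource.cpus >= req_cpus and resource.ram_mb >= req_ram_mb:
            return "plug-in"
        return "subsume"

    def match(resources, etr_start, etr_end, req_cpus, req_ram_mb):
        regions = {"plug-in": [], "exact": [], "subsume": []}
        for r in resources:
            if policy_matched(r, etr_start, etr_end):
                regions[classify_region(r, req_cpus, req_ram_mb)].append(r.name)
        return regions

    # Example: one job needing 4 CPUs / 4 GB RAM between t = 0 and t = 3600.
    pool = [Resource("xen-head", 8, 16384, [UsagePolicy(0, 86400)]),
            Resource("vmware-head", 2, 2048, [UsagePolicy(0, 86400)])]
    print(match(pool, 0, 3600, req_cpus=4, req_ram_mb=4096))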

In case of virtual co-allocation, we create additional virtual machines in the selected physical resources specified in the Hostname tag of the SLA and add them to the existing cluster [2] by invoking the Virtual Machine Service (VMS). The hypervisor installed over the selected resource(s) is obtained from MDS. Based on the available hypervisor, the image repository is queried to get a suitable VM image (.img or .vmdk). The images in the repository may be bundle based or image based. Image based images are those that contain the operating system, dependent software, libraries etc. as a single image. Their disadvantage is that it is tedious to add or delete the software contained in the image; their advantages are that they are ready-to-use components and that the bundling overhead is avoided. In the bundle based type, the OS and software are available as separate entities, so all the required combinations have to be bundled into a single, ready-to-use image. The advantage of this type is that any number of image combinations can be created as needed; the disadvantage is the additional overhead incurred by bundling on demand.

The selected VM image is then transferred from the CRB node to the selected host with the help of a transporter. After the successful transfer of the VM image, the scheduler calls the hypervisor adapter with the selected VM image name and the selected hostname. The hypervisor adapter then initiates the appropriate adapter (Xen or VMware) with this information. Next, it establishes a connection with the selected host and adds/activates the context information, such as the IP address and hostname, in the transferred image. In case no suitable image is found in the image repository, the scheduler modifies the configuration of an existing image to meet the request. Once the VM image is activated, the hypervisor adapter boots the VM and adds it to the existing cluster. The scheduler then submits the job to the new hybrid resource. The virtual cluster scheduling type creates the virtual cluster in the committed resources and executes the job in the newly created virtual cluster with the help of the Virtual Cluster Service (VCS) and the VMS. The VCS performs both the VM creation and the cluster set-up.
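The transfer, contextualization and boot sequence described above could be orchestrated roughly as in the following sketch. All function names, paths and hostnames here are hypothetical placeholders for illustration, not the CRB's transporter or hypervisor adapter interfaces.

    # Illustrative activation flow (hypothetical helpers; not the CRB's actual code).

    def transfer_image(image_path, host):
        """Placeholder for the transporter: copy the VM image from the CRB node to the host."""
        print(f"transferring {image_path} to {host}")

    def contextualize(image_path, host, ip, hostname):
        """Placeholder: write the IP address and hostname into the transferred image."""
        print(f"setting context ip={ip} hostname={hostname} on {host}:{image_path}")

    def boot_and_join(host, image_path, cluster):
        """Placeholder: boot the VM via the hypervisor adapter and add it to the cluster."""
        print(f"booting {image_path} on {host} and joining cluster {cluster}")

    def activate_vm(image_path, host, ip, hostname, cluster):
        transfer_image(image_path, host)
        contextualize(image_path, host, ip, hostname)
        boot_and_join(host, image_path, cluster)

    activate_vm("/images/sl5-mpi.img", "xen-node3", "192.168.1.53", "vm-53", "xen-cluster")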


Once the job is submitted over the resources, the SLA monitoring module starts its thread and monitors the committed SLOs by measuring the runtime parameters of the running job. In order to retrieve the appropriate SLA, it uses the JobID tag as its key. If there is any violation (either on the provider's side or on the user's side) during the execution of the job, it invokes the corresponding penalty action specified in the Penalty tag. After the completion of the job, the SLA monitoring thread deletes the corresponding SLA from the SLA repository.

6. Experimental set up

We evaluate the performance of the SLA enabled CRB using real-time application execution in the following experimental set up, which consists of one Xen cluster and one VMware cluster.

Resource configuration: The Xen cluster consists of one head node and 6 computing nodes. The head node is installed with RHEL-5.0 as its OS, Xen-3.1 as its hypervisor, Globus Toolkit-4.0.0 as its grid middleware and Torque as its local resource manager, with 2 GB RAM, a 160 GB hard disk and a 3.2 GHz processor. The computing nodes have RHEL-5.0, Xen-3.1 and Torque client configurations, with 2 GB RAM, a 160 GB hard disk and a 3.2 GHz processor. The VMware cluster consists of one head node and 4 computing nodes. The head node has RHEL-4.0, VMware Server-2.0.0 as its hypervisor, Globus Toolkit-4.0.0 as grid middleware and Torque as its local resource manager, with 2 GB RAM, a 160 GB hard disk and a 3.2 GHz processor. The computing nodes have RHEL-4.0, VMware Server-2.0.0 and Torque client configurations, with 2 GB RAM, a 160 GB hard disk and a 3.2 GHz processor. The CRB is installed on server hardware with 4 quad-core CPUs at 2000 MHz per processor and 16 GB RAM, running RHEL-5.0 with the Xen API and VMware API archives and Globus Toolkit-4.0.0. The two clusters push their resource information upstream to the CRB and thus become grid resources of the CRB.

Image configuration: Every VM image in the repository contains one of the following minimal operating systems (648 MB for VMware and 1.0 GB for Xen), viz. Red Hat Linux 4.0, Scientific Linux 5.0, Fedora Linux 3.0 or Gentoo Linux, together with the set of software and libraries required for job execution, in both formats (.img and .vmdk) (image based images). The meta-data about each image is represented in an XML file.

7. Results and discussion

Simulation: We have simulated the scheduling algorithm explained in this paper and compared its performance with other conventional scheduling methods, as we did in our previous work. The experimental model topologically maps onto one grid meta-scheduler with multiple resource providers. The experiment considers nearly 12 500 resources with different capabilities available in a database (i.e. number of CPUs, RAM and CPU load). Every job request is generated using the Feitelson model [39]. This model generates job parameters such as the length of the job (tD), the number of CPUs required (NCPU), the job arrival rate (TA) and the number of jobs (NJ). In addition to these parameters, we intentionally generate some additional parameters, namely the job execution frame, i.e. the job start time TS and job finish time TF, the size of the RAM for executing the job (SRAM) and the size of the secondary storage (SSS). The same job requests and the available resources are submitted to the simulated scheduling algorithms of Gridway, the existing CRB and the SLA enabled CRB. We used a random access model that generates 10–100 short job requests in random fashion, selecting resources randomly from the 12 500 resources available in the database.
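As a rough illustration of the simulated workload, the sketch below generates job requests carrying the parameters listed above. It uses plain uniform sampling as a stand-in for brevity; the actual experiments derive tD, NCPU, TA and NJ from the Feitelson model [39].

    import random

    def generate_job(job_id, now=0.0):
        """Generate one synthetic job request (simplified stand-in for the Feitelson model)."""
        t_d = random.uniform(60, 3600)          # job length tD, in seconds
        t_s = now + random.uniform(0, 600)      # job start time TS
        return {
            "id": job_id,
            "n_cpu": random.choice([1, 2, 4, 8]),          # NCPU, CPUs required
            "t_d": t_d,
            "t_s": t_s,
            "t_f": t_s + t_d,                              # TF, job finish time
            "s_ram_mb": random.choice([512, 1024, 2048]),  # SRAM, RAM required
            "s_ss_gb": random.choice([1, 5, 10]),          # SSS, secondary storage required
        }

    # A batch of 10-100 short job requests, as in the random access model.
    jobs = [generate_job(i) for i in range(random.randint(10, 100))]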
In Gridway, the resources are filtered against the job request, the rank of the resulting resources is computed and the job is submitted to the resource with the highest rank. If the job is submitted successfully, it is treated as a success; otherwise it is a failure. The existing CRB gets the job requests, matches them against the resources to identify the scheduling case (i.e. physical, virtual co-allocation or virtual cluster) and submits the job to the selected resource.

Fig. 6. Job scheduling rate.

Fig. 7. Time taken to schedule.

In case more resources are matched against a single job request, the existing CRB submits the job request to the resource with the highest capability, as it does not have any resource optimization technique. In the SLA enabled CRB, a deviation value is computed for all the resources against the restrictions in the job; if an exact or plug-in match is found, the job is submitted to that resource, otherwise it initiates negotiation in the subsume region, creates the SLA and submits the job across all the resources that generated the SLA. In case more resources are matched against a single job request, the SLA enabled CRB submits the job request to the resource selected using the lollipop based resource selection technique, which allocates the resource in a more optimized manner. This leads to a higher throughput in the SLA enabled CRB than in the existing CRB. The scheduling rates of Gridway, the existing CRB and the SLA enabled CRB for these 100 job requests are compared in Fig. 6. As expected, the scheduling rate is higher in the SLA enabled CRB than in Gridway and the existing CRB. In addition, we have measured the time taken to make the scheduling decision in the existing and the SLA enabled CRB to demonstrate the efficiency of DRS. From Fig. 7, we conclude that the time taken for decision making is significantly reduced, since the SLA enabled CRB decides the region of resources in a single comparison.

Application execution: In India, the Central Electronics Engineering Research Institute (CEERI) is an institute that endeavors to apply recent trends in information technology to areas such as horticulture and agriculture. CEERI has developed an image processing application, sequential in nature, that detects mechanical defects in fruits such as mango and apple using a watershed algorithm. Fruit flows on a conveyor belt under four cameras, and pictures of four different sides of each fruit are taken. The application takes these pictures as input and compares them with test images to determine the defect in a single fruit. The time complexity of this application mainly depends on the amount of fruit. We have parallelized this application so that every view of the fruit is processed on one processor, rather than processing all four views on a single processor before taking the final decision. In this manner, the execution time is greatly reduced as the amount of fruit (i.e. the number of conveyor belts) increases. In our setup, 4 items of fruit flow on 4 conveyor belts, and the camera on one side captures a single image that contains the four fruits' images in one position. Likewise, the other three views are taken by the remaining three cameras.


Fig. 8a. Four views of four apples.

Fig. 8b. Processed image for single view.

Fig. 8c. Image-processor allocation strategy.

Fig. 8d. Design of VM image used for the experiment.

In total, there are 4 images, each containing one view of 4 fruits. The modified CRB has been tested with this application: 16 images taken from the CEERI database (i.e. 16 items of fruit) are given as input to the system. Based on the application's requirements, we have built pre-configured VM images that are stored in the image repository. Each application instance can process 4 images (i.e. 4 items of fruit) with the help of 4 processors, so 4 application instances require 16 processors to process these 16 images. The first step is therefore to submit the request for 4 application instances to the CRB. The CRB then provisions these 16 processors as follows: 4 processors from the Xen cluster as physical resources, 4 processors from the VMware cluster as physical resources, 2 processors from the Xen cluster together with 2 processors created as VMs over these physical resources as a virtual co-allocation, and finally 4 processors from the VMware cluster as a virtual cluster. Thus the fruits with defects are determined, and the results are collected over about ten iterations to obtain consistency of observation. Fig. 8a shows the four views of four fruits, where A1–A4 denote the apples and S1–S4 their views. All four application instances take these images as input and process them: first the background of the images is removed, then the images are segmented and compared with the reference images to identify the defects. Fig. 8b shows the background-removed and segmented image (for a single view) that clearly shows the defects, the image–processor allocation strategy is shown in Fig. 8c, and the design of the VM image used for this experiment is shown in Fig. 8d.

The total job processing time over ten iterations, which includes VM transfer, VM boot and application execution time, is shown for VMware and Xen in Figs. 9 and 10 respectively. In addition, each resource has a varied number of usage policies, ranging from 1 to 6. The CRB schedules the jobs directly if possible; if not, it invokes either VMS or VCS. In the modified CRB, our focus is on creating VMs or virtual clusters using both hypervisors. The total resource usage policy processing time for one and two resource(s) is shown in Figs. 11 and 12 respectively. This total processing time includes the time to find the appropriate policy and the request processing time.

Fig. 9. Total job processing time in VMware.

Fig. 10. Total job processing time in Xen.

Since every resource may have a varied number of policies, we vary the number of policies from 1 to 6 per resource. We have also measured the overhead of adding the two resources one by one. The experimental results show that the total policy processing time varies linearly with the number of policies per resource. When the number of resources increases, the policy finding time and the request processing time also increase.

8. Conclusion

In this paper, we implement and fulfill the future aspects of the CRB mentioned in our earlier work.


Fig. 11. RUP processing time for a single resource.

Fig. 12. RUP processing time for two resources.

To emphasize the significance of resource usage policies in realizing a controlled grid resource sharing environment, we propose a policy management system and integrate it with our CRB as a value addition component. The additional overhead due to the inclusion of policies, for a varying number of policies and a varying number of resources, is also evaluated. With this SLA enabled CRB, we ensure that the resource providers' desired usage scenario is always obeyed. Further, these resource usage policies are taken into account while selecting the resources to which jobs are submitted. Based on these usage policies, an SLA is generated for every job and maintained in the repository. The SLA monitoring and enforcement modules are used, respectively, to maintain the desired usage scenario and to penalize the user or resource provider in case of a violation of an objective. In our earlier work, we considered the on-demand bundling of VM images while provisioning them; in the current work, we consider only image based images, which reduces the runtime environment preparation overhead. In addition, we propose a pluggable hypervisor adapter that enables the CRB to support multi-vendor virtualization products. Moreover, we modify the scheduling algorithm of the CRB with DRS, which provides pluggable support for scheduling parameters and allows the CRB to enhance resource selection based on the user's preferences. Due to the above mentioned characteristics, we strongly believe that the SLA enabled CRB architecture will be a stepping stone towards realizing controlled grid resource sharing.

Acknowledgements

We sincerely thank the Ministry of Communication and Information Technology, Government of India, for financially supporting this project. We also thank our team members Mr. R. Kumar, Mr. K. Rajendar, Mr. G. Kannan, Mr. R. Rajiv and Mr. E. Mahendran for their valuable suggestions and support.

References

[1] Ian Foster, C. Kesselman, S. Tuecke, The anatomy of the grid: enabling scalable virtual organizations, The International Journal of Supercomputer Applications 15 (3) (2001) 200–222.

[2] Thamarai Selvi Somasundaram, Balachandar R. Amarnath, Balakrishnan Ponnuram, Kumar Rangasamy, Rajendar Kandan, Rajiv Rajaian, Rajesh Britto Gnanapragasam, Mahendran Ellappan, Madusudhanan Bairappan, Achieving co-allocation through virtualization in grid environment, in: The Proceeding of 4th International Conference, GPC 2009, Geneva, Switzerland, May 4–8, 2009, pp. 235–243. [3] Susanta Nanda, Tzi-Cker Chiueh, Survey on Virtualization Technologies, the Research Proficiency Report, Stony Brook, ECSL-TR-179, February 2005. www.ecsl.cs.sunysb.edu/tr/TR179.pdf. [4] R.L. Keeny, H. Raiffa, Decisions with Multiple Objectives: Preferences and Value Trade-Offs, John Wiley and Sons, 1976. [5] T.S. Somasundaram, B.R. Amarnath, R. Kumar, P. Balakrishnan, K. Rajendar, R. Rajiv, G. Kannan, G. Rajesh Britto, E. Mahendran, B. Madusudhanan, CARE resource broker: a framework for scheduling and supporting virtual resource management, Future Generation Computer Systems (2009) doi:10.1016/j.future.2009.10.005. [6] P. Balakrishnan, Thamarai Selvi Somasundaram, G. Rajesh Britto, GSMA based automated negotiation model for grid scheduling, in: The Proceeding of 5th IEEE International Conference on Services Computing, SCC WIP 2008, Honolulu, Hawaii, USA, July 8–11, pp. 569–570. [7] P. Balakrishnan, Thamarai Selvi Somasundaram, G.Rajesh Britto, Service level agreement based grid scheduling, in: The Proceeding of 6th IEEE International Conference on Web Services, ICWS 2008, SERVICES 2008, Beijing, China, September 23–26, pp. 203–210. [8] K. Czajkowski, I. Foster, C. Kesselman, S. Tuecke, Grid service level agreements, in: J. Nabrzyski, J.M. Schopf, J. Weglarz (Eds.), Grid Resource Management, Kluwer Academic Publishers, 2004, pp. 119–134. [9] C.T. Dumitrescu, M. Wilde, I. Foster, A model for usage policy-based resource allocation in grids, in: The Proceedings of the Sixth IEEE International Workshop on Policies for Distributed Systems and Networks, Stockholm, Sweden, June 6–8, 2005, pp. 191–200. [10] Jun Feng, G. Wasson, M. Humphrey, Resource usage policy expression and enforcement in grid computing, in: The Proceeding of 8th IEEE/ACM International Conference on Grid Computing, 19–21 September 2007, pp. 66–73. [11] Attila Kertesz, Gabot Kecskemeti, Ivona Brandic, An sla-based resource virtualization approach for on-demand service provision, in: The Proceedings of 3rd VTDC’09, 2009, Barcelona, Spain, June 15–19, 2009, pp. 27–34. [12] Ivona Brandic, Dejan Music, Schahram Dustar, Service mediation and negotiation bootstrapping as first achievements towards self-adaptable grid and cloud services, in: The Proceeding of GMAC’09, Barcelona, Spain, June 15–19, 2009, pp. 1–8. [13] M.Q. Dang, J. Altmann, Resource allocation algorithm for light communication grid-based workflows within an SLA context, International Journal of Parallel, Emergent and Distributed Systems 24 (1) (2009) 31–48. [14] Andreas Menychtas, Dimosthenis Kyriazis, Konstantinos Tserpes, Realtime reconfiguration for guaranteeing QoS provisioning levels in grid environments, Future Generation Computer Systems 25 (7) (2009) 779–784. [15] Antonella Di Stefano, Giovanni Morana, Daniele Zito, A P2P strategy for QoS discovery and SLA negotiation in grid environment, Future Generation Computer Systems 25 (8) (2009) 862–875. [16] Antonie Pichot, Philipp Weider, Oliver Waldrich, Wolfgang Ziegler, Dynamic SLA negotiation based on WS-agreements, in: WEBIST 2008, Proceedings of the Fourth International Conference on Web Information Systems and Technologies, vol. 
1, Funchal, Madeira, Portugal, May 4–7, 2008, pp. 38–45. [17] Lars-Olof Burchard, Matthias Hovestadt, Odej Kao, Axel Keller, Barry Linnert, The virtual resource manager: an architecture for SLA-aware resource management, in: The Proceeding of 4th IEEE/ACM International Symposium on Cluster Computing and the Grid, CCGrid 2004, Chicago, IL, USA, April 19–22, 2004, pp. 126–133. [18] A. Sahai, A. Graupner, V. Machiraju, A. Van Moorsel, Specifying and monitoring guarentees in commercial grids through SLA, in: Proc. 3rd IEEE/ACM International Symposium on Cluster Computing and the Grid, Tokyo, Japan, May 12–15, 2003, pp. 292–299. [19] Alexander Keller, Heiko Ludwig, The WSLA framework: specifying and monitoring service level agreements for web services, Journal of Network and Systems Management 11 (1) (2003) 57–81. [20] A. Andrieux, K. Czajkowski, A. Dan, K. Keahey, H. Ludwig, J. Pruyne, J. Rofrano, S. Tuecke, M. Xu, Web services agreement specification (WS-agreement) draft 20, 2004. https://forge.gridforum.org/projects/graap-wg/. [21] Akhil Sahai, Vijay Machiraju, Mehmet Sayal, Aad P.A. van Moorsel, Fabio Casati, Automated SLA monitoring for web services, in: The Proceedings of the 13th IFIP/IEEE International Workshop on Distributed Systems: Operations and Management: Management Technologies for E-Commerce and E-Business Applications 2002, October 21–23, 2002, pp. 28–41. [22] G.R. Nudd, D.J. Kerbyson, E. Papaefstathiou, S.C. Perry, J.S. Harper, D.V. Wilcox, Pace—a toolset for the performance prediction of parallel and distributed systems, International Journal of High Performance Computing Applications 14 (3) (2000) 228–251. [23] Warren Smith, Improving resource selection and scheduling using predictions, in: J. Nabrzyski, J.M. Schopf, J. Weglarz (Eds.), Grid Resource Management: State of the Art and Future Trends, Kluwer Academic Publishers, Norwell, MA, USA, 2004, pp. 237–253. [24] Eduardo Huedo, Ruben S. Montero, Ignacio M. Llorente, A framework for adaptive execution in grids, Software: Practice & Experience 34 (7) (2004) 631–651.

[25] Wai-Khuen Cheng, Boon-Yaik Ooi, Huah-Yong Chan, Resource federation in grid using automated intelligent agent negotiation, Future Generation Computer Systems 26 (8) (2010) 1116–1126. [26] I. Foster, Yong Zhao, I. Raicu, S. Lu, Cloud computing and grid computing 360-degree compared, in: The Proceedings of the Grid Computing Environments Workshop, Austin, Texas, 12–16 November, 2008, pp. 1–10. [27] Rajkumar Buyya, Chee Shin Yeo, Srikumar Venugopal, James Broberg, Ivona Brandic, Cloud computing and emerging IT platforms: vision, hype, and reality for delivering computing as the 5th utility, Future Generation Computer Systems 25 (6) (2009) 599–616 (ISSN: 0167-739X). [28] Katarzyna Keahey, Ian T. Foster, Timothy Freeman, Xuehai Zhang, Daniel Galron, Virtual workspaces in the grid, in: José C. Cunha, Pedro D. Medeiros (Eds.), Euro-Par 2005, Parallel Processing, 11th International Euro-Par Conference, Lisbon, Portugal, August 30–September 2, 2005, Proceedings, in: Lecture Notes in Computer Science, vol. 3648, Springer, 2005, pp. 421–431. [29] K. Keahey, I. Foster, T. Freeman, X. Zhang, Virtual workspaces: achieving quality of service and quality of life in the grid, Scientific Programming Journal 13 (4) (2005) 265–276. Special Issue: Dynamic Grids and Worldwide Computing. [30] Timothy Freeman, Katarzyna Keahey, Ian T. Foster, A. Rana, Borja Sotomayor, Frank Würthwein, Division of labor: tools for growing and scaling grids, in: Asit Dan, Winfried Lamersdorf (Eds.), 4th International Conference on Service-Oriented Computing, ICSOC 2006, Chicago, IL, USA, December 4–7, 2006, in the Proceedings, in: Lecture Notes in Computer Science, vol. 4294, Springer, 2006, pp. 40–51. [31] Ian T. Foster, Timothy Freeman, Katarzyna Keahey, Doug Scheftner, Borja Sotomayor, Xuehai Zhang, Virtual clusters for grid communities, in: The Proceedings of 6th IEEE International Symposium on Cluster Computing and the Grid, CCGrid 2006, Singapore, 16–19 May 2006, pp. 513–520. [32] Michael A. Murphy, Sebastien Goasguen, Virtual organization clusters: self-provisioned clouds on the grid, Future Generation Computer Systems 26 (8) (2010) 1271–1281. [33] Paolo Anedda, Simone Leo, Simone Manca, Massimo Gaggero, Gianluigi Zanetti, Suspending, migrating and resuming HPC virtual clusters, Future Generation Computer Systems 26 (8) (2010) 1063–1072. [34] B. Sotomayor, A resource management model for VM-based virtual workspaces, Masters Paper, University of Chicago, February 2007. [35] Borja Sotomayor, Kate Keahey, Ian T. Foster, Combining batch execution and leasing using virtual machines, in: Manish Parashar, Karsten Schwan, Jon B. Weissman, Domenico Laforenza (Eds.), The Proceedings of the 17th International Symposium on High-Performance Distributed Computing, HPDC-17 2008, Boston, MA, USA, 23–27 June 2008, pp. 87–96. [36] J. Fontán, T. Vázquez, L. Gonzalez, R.S. Montero, I.M. Llorente, OpenNEbula: the open source virtual machine manager for cluster computing, in: The Proceeding of Open Source Grid and Cluster Software Conference, San Francisco, USA, 12–16 May 2008. [37] A. Clematis, A. Corana, D. D'Agostino, A. Galizia, A. Quarati, Job-resource matchmaking on grid through two-level benchmarking, Future Generation Computer Systems 26 (8) (2010) 1165–1179. [38] Job submission description language specification, Version 1.0. www.gridforum.org/documents/GFD.56.pdf. [39] Dror G. Feitelson, Larry Rudolph, Metrics and benchmarking for parallel job scheduling, in: D.G. Feitelson, L. Rudolph (Eds.), The Proceeding of Job Scheduling Strategies for Parallel Processing, in: LNCS, vol. 1459, Springer, 1998, pp. 1–24.

P. Balakrishnan has worked as a Senior Research Associate at CARE since October 2005. He is currently pursuing a Ph.D. on policies in grid and virtualization frameworks at Anna University. He completed his Masters in Engineering in 2004. He has 7 years of research and software development experience. His major research areas are security, SLAs, virtualization, desktop grids and cloud computing. He has presented more than 5 research publications at international conferences.

Thamarai Selvi Somasundaram is a professor at the Department of Computer Technology, MIT Campus of Anna University. She is also the director of CARE and has successfully coordinated several research projects funded by various funding agencies across India. She has more than 27 years of teaching experience and 10 years of research experience. She has won many awards for her excellence in academic contributions. Her research areas include grid computing, virtualization technologies, neural networks, cloud computing and mobile computing. She has presented more than 100 publications at international conferences, published 10 papers in renowned journals, and has authored 4 books.