Extending OCCI for autonomic management in the cloud

ARTICLE IN PRESS, The Journal of Systems and Software (2016) 1–14
Contents lists available at ScienceDirect. Journal homepage: www.elsevier.com/locate/jss

Controversy Corner

Mohamed Mohamed (a,∗), Djamel Belaïd (b), Samir Tata (a,b)

(a) IBM Research, Almaden Research Center, 650 Harry Road, San Jose, CA 95120, USA
(b) SAMOVAR, Télécom SudParis, CNRS, Université Paris-Saclay, 9 rue Charles Fourier, 91011 Évry, France

Article info

Article history:
Received 15 October 2014
Revised 23 December 2015
Accepted 2 January 2016
Available online xxx

Keywords: Cloud computing; Autonomic computing; OCCI

Abstract

Cloud Computing is an emerging paradigm involving different kinds of Information Technology (IT) services. One of the major advantages of this paradigm resides in its pay-as-you-go economic model. To remain efficient, it becomes necessary to couple this model with autonomic computing. By autonomic computing we mean the ability of the system to automatically and dynamically manage its resources to respond to the requirements of the business based on Service Level Agreements (SLAs). In this paper, we propose an extension of the Open Cloud Computing Interface (OCCI) to support the different aspects of autonomic computing. This OCCI extension describes new Resources and Links that are generic Kinds and are specialized using OCCI Mixins. We introduce the Autonomic Manager as a special Resource that, starting from an SLA, instantiates all the entities needed to automatically establish an infrastructure for the autonomic management of Cloud resources. The other introduced OCCI Resources are the Analyzer, which analyzes monitoring data based on specific analysis rules, and the Reconfiguration Manager, which generates reconfiguration actions based on reconfiguration strategies. These Resources are linked using newly defined Link entities. We describe herein a real use case to show that we can apply our approach to the different levels of the Cloud (i.e., IaaS, PaaS and SaaS) at the same time. We also present the implementation details as well as preliminary evaluation results, which are encouraging. © 2016 Elsevier Inc. All rights reserved.

1. Introduction

Cloud Computing is a challenging area involving different aspects of IT (Information Technology) services. Its adoption is increasing due to its economic model based on the pay-as-you-go principle. A survey conducted by the Cloud Industry Forum (Cloud Industry Forum, 2012) in 2012 involving 300 companies showed that 53% of the companies were adopting the Cloud at the time. The same survey showed that 73% of them were planning to increase their adoption of Cloud services in the following 12 months. Many aspects related to Cloud Computing are well explored. Meanwhile, the management of resources has not received the needed attention in research works. In fact, Cloud resources are exposed to dynamic evolution during their lifecycle due to the dynamics of their environment. To cope with Cloud dynamics, monitoring and reconfiguration mechanisms might be offered by cloud providers. Monitoring consists in informing interested parties about the current status of a resource or a service, while reconfiguration is a runtime modification of the structure or the implementation of an infrastructure

∗ Corresponding author. Tel.: +1 408 409 3538.
E-mail addresses: [email protected] (M. Mohamed), [email protected] (D. Belaïd), [email protected] (S. Tata).

(Hnětynka and Plášil, 2006). Such mechanisms should enable the dynamic management of Cloud resources with minimal cost and minimal performance degradation.

1.1. Problem statement

There are several issues related to the monitoring and reconfiguration of Cloud resources to be addressed, such as heterogeneity, scalability and security. In this paper, we focus on the following three problems: (1) the heterogeneity of managed resources, (2) the heterogeneity of the resources used for monitoring and reconfiguration and (3) dealing with multiple SLA description languages. Users of monitoring and reconfiguration mechanisms in Cloud environments may specify in a granular way what they want to monitor and reconfigure, going from a simple attribute to a complex system. This brings a big challenge to monitoring and reconfiguration: the heterogeneity of the resources that need to be monitored and reconfigured. An additional problem is the fact that monitoring and reconfiguration have been treated as separate aspects. Generally, the resources used for these aims are not compatible with one another. Actually, almost all cloud computing projects suppose that monitoring is used just to gather Key Performance Indicators (KPIs) for billing or to notify the interested client of possible

http://dx.doi.org/10.1016/j.jss.2016.01.002 0164-1212/© 2016 Elsevier Inc. All rights reserved.

Please cite this article as: M. Mohamed et al., Extending OCCI for autonomic management in the cloud, The Journal of Systems and Software (2016), http://dx.doi.org/10.1016/j.jss.2016.01.002


M. Mohamed et al. / The Journal of Systems and Software 000 (2016) 1–14

problems (CompatibleOne, 2013; EASI-CLOUDS, 2014; Cloudware, 2013; Cloud4SOA, 2012). They also consider that reconfiguration consists of a new deployment of resources (CompatibleOne, 2013; EASI-CLOUDS, 2014; Cloudware, 2013; Cloud4SOA, 2012). Therefore, reconfiguration is as expensive as a new deployment. Some attempts have been made to use autonomic computing to bridge the gap between monitoring and reconfiguration. Nevertheless, they just cope with specific use cases. Thus, they cannot respond to the dynamics of the cloud in different situations. The third problem that we need to overcome in this paper is the ability to deal with the different languages used to describe Service Level Agreements between consumers and providers (e.g., WS-Agreement (Andrieux et al., 2005), USDL (Kadner et al., 2011)). Since adopting a new SLA language is a cumbersome task, there is a need to offer a way to process SLAs described using any language with minimal effort, to facilitate the on-boarding of new agreements.
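To picture this third problem, the pluggable kind of SLA processing we aim for can be sketched as a registry of per-language parsers behind a single interface, so that a new language is on-boarded without touching the rest of the system. This is only an illustrative sketch: the names SlaParser and SlaTerms and the fake WS-Agreement stub are assumptions made for the example, not part of our proposal.

```java
import java.util.HashMap;
import java.util.Map;

public class SlaProcessing {
    // Common, language-independent view of an SLA (hypothetical helper type).
    static class SlaTerms {
        final Map<String, Double> thresholds = new HashMap<>();
    }

    // One parser per SLA language, all behind the same interface.
    interface SlaParser {
        SlaTerms parse(String document);
    }

    static final Map<String, SlaParser> PARSERS = new HashMap<>();
    static {
        // A WS-Agreement stub: real code would walk the XML guarantee terms;
        // here we fake the extraction of a single response-time threshold.
        PARSERS.put("ws-agreement", doc -> {
            SlaTerms t = new SlaTerms();
            if (doc.contains("ResponseTime")) t.thresholds.put("ResponseTime", 200.0);
            return t;
        });
    }

    static SlaTerms process(String language, String document) {
        SlaParser p = PARSERS.get(language);
        if (p == null) throw new IllegalArgumentException("No parser for " + language);
        return p.parse(document);
    }

    public static void main(String[] args) {
        SlaTerms t = process("ws-agreement", "<GuaranteeTerm>ResponseTime</GuaranteeTerm>");
        System.out.println(t.thresholds);
    }
}
```

Supporting an additional language then amounts to registering one more parser, which is the effect the Mixin mechanism achieves in our proposal.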

1.2. Approach overview

In this paper, we propose to adopt the Open Cloud Computing Interface (OCCI) recommendation and Autonomic Computing to tackle the abovementioned problems. OCCI plays a key role in resolving the heterogeneity of resources and SLA languages, while Autonomic Computing enables coupling monitoring and reconfiguration in a holistic loop. According to Buyya et al. (2012), coupling Autonomic Computing with Cloud Computing increases the availability of Cloud resources and reduces their costs. However, this coupling remains critical and challenging since Cloud environments “are composed of thousands of physical hosts and virtual machines, connected by many network elements” (Buyya et al., 2012). Autonomic Computing can reduce the difficulty of managing these environments (Entrialgo et al., 2011). Autonomic Computing (Jacob et al., 2004) is the ability to manage computing resources automatically and dynamically to respond to the requirements of the business based on SLAs. Autonomic computing is usually represented by an autonomic loop that bridges the gap between monitoring and reconfiguration. This loop consists in harvesting monitoring data, analyzing them and generating reconfiguration strategies to correct violations (self-healing and self-protecting) or to target a new state of the system (self-configuring and self-optimizing). Autonomic management is based on monitoring and reconfiguration, allowing the continuity of services and fast reaction to any detected violation. To resolve heterogeneity problems, the most natural way is to use standard interfaces. For Cloud Computing, the de facto standard is the Open Cloud Computing Interface (OCCI). OCCI is defined by the Open Grid Forum as “an abstraction of real world resources, including the means to identify, classify, associate and extend those resources” (Nyren et al., 2011). Using the OCCI standard also enables a granular description of autonomic computing requirements. The extension mechanism of OCCI, based on the usage of Mixins, makes it easy to add new functionalities. The rendering of the infrastructure is based on REST interfaces using HTTP (Metsch and Edmonds, 2011b). To support different SLA languages, we propose to use the Mixin mechanism to be able to process as many SLA description languages as possible. In a previous work (Mohamed et al., 2013b) we presented some aspects of monitoring and reconfiguration for OCCI Resources. In this paper, we push our work further and propose an on-demand autonomic computing infrastructure based on the OCCI standard. This infrastructure is based on new OCCI entities (i.e., Resources and Links) and Mixins. We aim at adding the needed facilities to OCCI Resources independently of the level (XaaS).
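Since the proposal manipulates OCCI Entities, Kinds and Mixins throughout, the relations between these core types (Nyren et al., 2011) can be sketched in simplified Java as follows. The classes are an illustrative reduction of the core model (attributes, Actions and identifiers are omitted), not our implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class OcciCoreSketch {
    // Category is the basis of the OCCI type system; Kind and Mixin specialize it.
    static class Category {
        final String scheme, term;
        Category(String scheme, String term) { this.scheme = scheme; this.term = term; }
        String identifier() { return scheme + term; }
    }
    static class Kind extends Category {
        Kind(String scheme, String term) { super(scheme, term); }
    }
    static class Mixin extends Category {
        Mixin(String scheme, String term) { super(scheme, term); }
    }

    // Every Entity is typed by exactly one Kind and extended by any number of Mixins.
    static abstract class Entity {
        final Kind kind;
        final Set<Mixin> mixins = new LinkedHashSet<>();
        Entity(Kind kind) { this.kind = kind; }
        void addMixin(Mixin m) { mixins.add(m); } // extension at creation time or at runtime
    }
    static class Resource extends Entity {
        final List<Link> links = new ArrayList<>();
        Resource(Kind kind) { super(kind); }
    }
    static class Link extends Entity {
        final Resource source, target;
        Link(Kind kind, Resource source, Resource target) {
            super(kind);
            this.source = source;
            this.target = target;
            source.links.add(this);
        }
    }

    public static void main(String[] args) {
        Kind compute = new Kind("http://schemas.ogf.org/occi/infrastructure#", "compute");
        Resource vm = new Resource(compute);
        // Hypothetical Mixin scheme, for illustration only.
        vm.addMixin(new Mixin("http://example.org/occi/autonomic#", "polling"));
        System.out.println(vm.kind.identifier() + " mixins=" + vm.mixins.size());
    }
}
```

The point of the sketch is that a Mixin can be attached to an already-instantiated Entity, which is what lets our extension add monitoring and reconfiguration capabilities to existing Resources.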

In order to add autonomic computing facilities to resources, we propose new Mixins to enable the following functionalities: (1) inspecting specific SLAs related to resources to enable defining the needed elements of the autonomic loop, (2) monitoring resources and extracting their relevant data, (3) handling subscriptions/notifications from/to interested parties and (4) enabling reconfigurations of managed resources. The proposed infrastructure consumes monitoring data using these latter Mixins to analyze them and generate reconfiguration actions. The established infrastructure possibly needs some Mixins for specific processing (e.g., adding specific analysis rules and reconfiguration strategies). These Mixins can be adapted at runtime to change the behavior of the system. To ease the user's task, we define an Autonomic Manager as an abstract Resource responsible for inspecting the SLA and automatically instantiating all the resources needed to establish the autonomic infrastructure. The Autonomic Manager Resource is specialized by the usage of specific Mixins able to inspect specific types of SLAs. In this paper we do not cover all the self-CHOP (configuring, healing, optimizing and protecting) properties. Instead, we propose the mechanisms that could be used to cover these properties. The originality of our approach resides in the fact that our proposed mechanisms are independent of the Cloud service provider and of the Cloud service layer. Consequently, they can be applied to heterogeneous resources spanning different layers and providers. Our solution differs from all the existing ones in the state of the art. Its major assets are being generic and agnostic to the service layer and the used technology. This allows us to reuse and even enhance existing off-the-shelf solutions for the management of Cloud resources by simply mapping their offered services to our defined Mixins.

1.3. Paper structure

The remainder of this paper is organized as follows. In Section 2 we define autonomic computing and its basic aspects (i.e., monitoring and reconfiguration). Then we give an overview of OCCI. Section 3 is the core of our proposition; it introduces the different OCCI Entities (i.e., Resources and Links) and Mixins that we use to establish an autonomic computing infrastructure. In Section 4.1 we present the interactions between the different defined elements. The details of our implementation are presented in Section 5.1. In Section 5.2, we present the use case that we used to evaluate our proposal as well as the results of our experiments. Afterwards, we discuss related work in Section 6. Finally, we conclude the paper and give some directions for future work in Section 7.

2. Background

In this section, we introduce the different terms related to our work. We start by describing autonomic computing and its basic aspects. Due to the large spectrum of this domain, we limit our description to monitoring and reconfiguration. Then, we introduce the Open Cloud Computing Interface (OCCI).

2.1. Autonomic computing

IBM defines Autonomic Computing (Jacob et al., 2004) as the ability to manage computing resources automatically and dynamically to respond to the requirements of the business based on SLAs. In the context of Cloud Computing, it increases the availability of resources and reduces their costs. Autonomic systems exhibit the ability to continuously sense their attributes and services and the ability to change their states to adapt their behavior according to their environments. Consequently, these systems need


to enable the cooperation between granular monitoring systems and granular reconfiguration mechanisms. A good autonomic computing system implements the needed functionalities to collect the relevant monitoring data, to apply analysis rules on these data, to generate reconfiguration strategies when needed and to apply these strategies on the concerned resources. In this autonomic loop, monitoring and reconfiguration play a basic role. In the following, we define monitoring and reconfiguration since they are the key terms for the understanding of the rest of the paper.

2.2. Monitoring

Monitoring consists of informing interested parties of the status of a property or a service. In our work (Mohamed et al., 2013c, 2013a), we consider two models of monitoring: monitoring by polling or by subscription. Polling is the simpler way of monitoring, as it allows the observer to request the current state of a property whenever there is a need. The interested party can generally interact with a specific interface that provides a getter for the needed property. The monitoring-by-subscription model is based on a publish/subscribe system, which is defined as a set of nodes divided into publishers and subscribers. Subscribers express their interests by subscribing for specific notifications independently of the publishers. Publishers produce notifications that are asynchronously sent to subscribers whose subscriptions match these notifications (Baldoni et al., 2004).

2.3. Reconfiguration

Reconfiguration is a runtime modification of the structure or the implementation of an infrastructure (Hnětynka and Plášil, 2006). In the literature, we can classify reconfigurations into four classes: (1) Structural: the reconfiguration leads to a change in the structure of the system (removing resources, adding resources, creating or removing links between resources); (2) Geometric: the reconfiguration leads to a new mapping of the existing resources to new locations (migration of resources); (3) Implementation: the reconfiguration leads to a change of the implementation of one or more resources, but the structure of the system remains unchanged; and (4) Behavioral: the reconfiguration leads to a change of the behavior of one or more resources by changing some attributes.

2.4. OCCI specification overall

The Open Grid Forum defines the Open Cloud Computing Interface (OCCI) (Nyren et al., 2011) as “an abstraction of real world resources, including the means to identify, classify, associate and extend those resources.” OCCI provides a model to represent resources and an API to manage them. The OCCI model can be extended to cover different levels in the cloud (i.e., IaaS, PaaS, SaaS, etc.). The OCCI specifications consist essentially of three documents. The first is OCCI Core (Nyren et al., 2011), which formally describes the OCCI Core Model shown in Fig. 1. It describes Cloud resources as instances of the class Entity, which can be a Resource or a Link. Each Entity instance is typed using an instance of the class Kind and can be extended using one or more instances of the class Mixin. Kind and Mixin are subtypes of Category, and each Category instance can have one or more instances of the class Action. The second document is the OCCI HTTP Rendering (Metsch and Edmonds, 2011b), which describes how to interact with OCCI Core using the HTTP protocol. The HTTP rendering is based on the REST architecture and uses the REST verbs to manipulate the different defined Resources and their states. The third document is OCCI Infrastructure (Metsch and Edmonds, 2011a), which is an extension of the Core Model to represent the cloud infrastructure layer. This extension basically brings the definition of new Resources that inherit the core basic types Resource and Link. The new Resource subclasses are Network, Compute and Storage, while the new Link subclasses are StorageLink and NetworkInterface. We need to highlight that the extensibility of OCCI is basically due to the usage of Mixins. This mechanism allows new Resource capabilities to be added at instantiation time or at runtime.

Fig. 1. Overview of OCCI core model.

Fig. 2. Overview of the infrastructure extension for OCCI.

3. OCCI extension for autonomic management

In this section, we define the different types that we need to establish an autonomic computing system using OCCI resources. In an autonomic system, monitoring components are usually used to get information to verify whether an SLA is met or not. This latter may concern one or more resources to be managed (i.e., monitored and reconfigured). We assume that each Resource has its own SLA. We define a new OCCI Resource called Autonomic Manager. This is an abstract Resource responsible for inspecting the SLA and instantiating the needed Resources of our autonomic infrastructure. It is responsible for adding the needed facilities to gather monitoring information. Based on this information, the established infrastructure can take decisions to trigger reconfiguration actions to apply on Cloud resources. In order to offer an autonomic computing infrastructure on demand to a provider, we need to define new Resource types, new Links and new Mixins. In the following subsections we define these elements.

3.1. OCCI resources

To provide a generic description for autonomic computing using OCCI, we defined new Resources. Basically, these Resources are the Autonomic Manager Resource, the Analyzer Resource and the Reconfiguration Manager Resource. These Resources inherit the Resource base type defined in OCCI Core (Nyren et al., 2011) (see Fig. 3).

3.1.1. Autonomic manager resource

In order to automatize the establishment of the autonomic infrastructure, we defined an Autonomic Manager Resource as a


generic OCCI resource. It inspects a given SLA and carries out the list of actions needed to build the autonomic computing infrastructure. From a given SLA, this Resource determines the monitoring targets (i.e., the metrics to be monitored). It is also responsible for extracting the rules to be used by the Analyzer Resource (see Section 3.1.2) and the reconfiguration strategies to be used by the Reconfiguration Manager (see Section 3.1.3). After inspecting the contract (SLA), the Autonomic Manager instantiates the needed Entities (i.e., Resources and Links). Then, it customizes these Entities with the needed Mixins and eventually configures them with the needed parameters. Since the SLA can be described using different specifications (e.g., WS-Agreement, USDL), the Autonomic Manager uses a SpecificSLA Mixin (see Section 3.3.1) that describes how to deal with a specific SLA. As shown in Table 1, we specify the name of the Autonomic Manager to instantiate and its version. Moreover, we specify the location of the SLA. This SLA will be used to provision the Resource and to establish the rest of the autonomic infrastructure. This last attribute is not mandatory because we can get the SLA directly if we have an instance of the AgreementLink (see Section 3.2.1).

Fig. 3. Resource subclasses for autonomic computing.

Table 1. Definition of the Autonomic Manager (AM).

  Model attribute   Value
  Scheme            http://ogf.schemas.sla/occi/autonomic#
  Term              Autonomic manager
  Attributes        (see below)
  Related           http://ogf.schemas.sla/occi/core#resource

  Attributes for the Autonomic Manager:
  Name          Type     Mut.  Req.  Description
  Name          String   yes   yes   AM name
  Version       String   yes   no    AM version
  SLALocation   String   yes   no    SLA location

3.1.2. Analyzer

It is a generic OCCI Resource that allows to analyze monitoring data and eventually to generate alerts. Notifications are received through an instance of the NotificationLink (described in Section 3.2.3). The Analyzer Resource is an abstract description that is specialized using the RuleSet Mixin collection (described in Section 3.3.5). It uses a RuleSet Mixin to specify the rules to apply on monitoring data. The description of this Resource is depicted in Table 2. As shown in Table 2, in order to provision an Analyzer Resource, we need to specify its name and version as well as eventual max and min thresholds. These thresholds are used for analysis aims.

Table 2. Definition of the Analyzer.

  Model attribute   Value
  Scheme            http://ogf.schemas.sla/occi/autonomic#
  Term              Analyzer
  Attributes        (see below)
  Related           http://ogf.schemas.sla/occi/core#resource

  Attributes for the Analyzer:
  Name           Type     Mut.  Req.  Description
  Name           String   yes   yes   Analyzer name
  Version        String   yes   no    Analyzer version
  maxThreshold   String   yes   no    Max threshold
  minThreshold   String   yes   no    Min threshold

3.1.3. Reconfiguration manager

It is a generic OCCI Resource that receives alerts from the Analyzer through an AlertLink (described in Section 3.2.4). To receive alerts, this Resource must be the target of one or more AlertLink instances. The Reconfiguration Manager Resource is an abstract description that is specialized using the StrategySet Mixin collection (described in Section 3.3.9). This latter is used to specify the strategies to apply for each received alert. Based on these alerts, the Reconfiguration Manager may generate reconfiguration actions to apply on Resources. The description of the Reconfiguration Manager is depicted in Table 3. As shown in Table 3, in order to instantiate a Reconfiguration Manager Resource, we need to specify its name and version.

Table 3. Definition of the Reconfiguration Manager (RM).

  Model attribute   Value
  Scheme            http://ogf.schemas.sla/occi/autonomic#
  Term              Reconfiguration Manager
  Attributes        (see below)
  Related           http://ogf.schemas.sla/occi/core#resource

  Attributes for the Reconfiguration Manager:
  Name      Type     Mut.  Req.  Description
  Name      String   yes   yes   RM name
  Version   String   yes   no    RM version

3.2. OCCI links

To link the different Resources, we defined new Links inheriting the Link base type defined in OCCI Core (Nyren et al., 2011) (see Fig. 4). Due to space limits, the attributes of the Links are not specified.[1]

Fig. 4. Link subclasses for autonomic computing.

3.2.1. AgreementLink

It is a Link between an Autonomic Manager and a managed Resource. It allows an Autonomic Manager to inspect the SLA of the managed Resource.

3.2.2. SubscriptionLink

It is a Link between a subscriber and the Resource to be monitored (via its Subscription Mixin described in Section 3.3.3). It models the occurrence of subscriptions on specific metrics. A SubscriptionLink is related to one consumer that requires monitoring

[1] A complete description of Link attributes is available at http://www-inf.it-sudparis.eu/SIMBAD/tools/OCCI/autonomic/occilinks.html.


Fig. 5. Mixin subclasses for autonomic computing.

information and one managed Resource that has a Subscription Mixin. A Subscription is specified with a Mixin instance from the SubscriptionTool Mixin collection.

3.2.3. NotificationLink

It is a Link between a monitored Resource (via its Subscription Mixin) and a subscriber. It models the activity of notifying subscribers about the state of a monitored metric. This activity depends on the type of the subscription. If the subscription is on change, a notification will occur whenever the monitored metric changes. If the subscription is on interval, a notification will occur periodically based on the period specified in the subscription. A notification is specified by an instance of the NotificationTool Mixin collection (described in Section 3.3.7). Eventually, a NotificationLink is instantiated whenever a subscription occurs.

3.2.4. AlertLink

It is a Link between the Analyzer Resource and the Reconfiguration Manager Resource. Its role is to drive alerts from the Analyzer to the Reconfiguration Manager. An alert is generated when the Analyzer detects that one rule in the RuleSet Mixin collection is not respected. An alert is specified by an instance of the AlertTool Mixin collection (described in Section 3.3.8).

3.2.5. ActionLink

It is a Link between a Reconfiguration Manager Resource and a managed Resource on which the latter applies reconfiguration actions. It models the transfer of actions from the Reconfiguration Manager to be applied on the managed Resource. An ActionLink is characterized by the activity that applies reconfiguration actions on other Resources. This activity is specified with a Mixin instance from the ActionTool Mixin collection (described in Section 3.3.10).

3.3. OCCI mixins

To customize the different Resources and Links, we defined different Mixins. The latter inherit the Mixin base type defined in OCCI Core (Nyren et al., 2011) (see Fig. 5). Due to space limits, the attributes of the Mixins are not described herein.[2]

3.3.1. SpecificSLA Mixin

In Cloud Computing, there are different languages to describe SLAs. To tackle this heterogeneity, we defined this Mixin, which describes the needed tools allowing an Autonomic Manager to extract the needed information from a specific SLA. This Mixin allows the Autonomic Manager to get the metrics that need to be monitored, the analysis rules and the reconfiguration strategies. Based on these metrics, the Autonomic Manager instantiates the needed Resources to establish the autonomic loop and configures them. Technically, a Mixin instance of SpecificSLA implements the interface SpecificSLAInterface. This interface defines an abstract method

[2] A complete description of Mixin attributes is available at http://www-inf.it-sudparis.eu/SIMBAD/tools/OCCI/autonomic/occiMixins.html.

called inspectSLA(File SLA). For each type of SLA we can implement a Mixin that describes exactly how to process SLAs. An example is a WS-Agreement Mixin using a WSAgreementParser that we developed to inspect any SLA based on WS-Agreement. At this stage of the research, we do not tackle the semantics of the described SLA. The SpecificSLA Mixin uses a simple XML file that syntactically matches the information described in the SLA with the existing Mixins. However, in our future work we need to extend this aspect to take into account the semantic interpretation of SLA information.

3.3.2. Polling Mixin

It describes the needed operations to monitor a Resource by polling. By extending a Resource with this Mixin, we add a list of actions that render a Resource monitorable by polling. This Mixin is an implementation of the interface PollingInterface that describes two abstract methods, getAttributes() and getAttributeValue(attributeName). The first method, getAttributes(), returns the list of the attributes of the managed Resource. The second method, getAttributeValue(attributeName), returns the value of a given attribute. These two methods are implemented by the Polling Mixin according to the specific use case. For example, to be able to monitor by polling the memory usage of an OCCI Compute instance, the Polling Mixin can be implemented as a call to an external program able to retrieve this data, or it can be a remote UNIX command "top | grep KiB", assuming that the Compute is a Linux instance.

3.3.3. Subscription Mixin

It allows an interested party to subscribe to get monitoring data. For each Resource that we want to manage, we add a Subscription Mixin to manage subscriptions on specific metrics in order to send notifications containing monitoring data. Notifications may be on interval, so that the client periodically receives a notification containing the state of one or more metrics. They may also be on change, so that the Subscription Mixin pushes the new state of the metric whenever a change occurs. This Mixin implements the interface SubscriptionInterface that defines a list of actions to manage clients' subscriptions (i.e., subscribe(), unsubscribe(), etc.). The implementation must define how to realize these actions according to a specific scenario. For example, the subscribe() method could be implemented as an HTTP Restlet that can receive subscriptions via REST queries.

3.3.4. Reconfiguration Mixin

It provides the needed functionalities to ensure reconfiguration. By extending a Resource with this Mixin, we add new methods that render a resource reconfigurable. Technically, this Mixin implements a ReconfigurationInterface that defines three abstract methods. The method setAttributeValue(attributeName) changes the value of a given attribute. The method getActions() returns the list of actions that one could apply on the managed Resource. The method invokeAction(action) invokes a given action on the Resource. We also defined a reconfigure() method that allows to implement a specific way to reconfigure a Resource. For a specific use case, the implementation of this Mixin must describe exactly how to realize these actions.
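A minimal sketch of such a ReconfigurationInterface and one possible implementation for a Compute-like resource is given below. The explicit value parameter of setAttributeValue, the action names and the memory-doubling policy in reconfigure() are assumptions made for the example, not part of the specification.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ReconfigurationSketch {
    // Sketch of the ReconfigurationInterface described in Section 3.3.4.
    interface ReconfigurationInterface {
        void setAttributeValue(String attributeName, String value);
        List<String> getActions();
        String invokeAction(String action);
        void reconfigure();
    }

    // Toy implementation that manages a Compute-like resource in memory.
    static class ComputeReconfiguration implements ReconfigurationInterface {
        final Map<String, String> attributes = new LinkedHashMap<>();

        ComputeReconfiguration() { attributes.put("occi.compute.memory", "2"); }

        public void setAttributeValue(String name, String value) { attributes.put(name, value); }

        public List<String> getActions() { return List.of("start", "stop", "scaleUp"); }

        public String invokeAction(String action) {
            // A real Mixin would call the IaaS manager here (e.g., over REST).
            return "applied:" + action;
        }

        public void reconfigure() {
            // Example policy (assumed): double the memory of the managed Compute.
            int gb = Integer.parseInt(attributes.get("occi.compute.memory"));
            attributes.put("occi.compute.memory", String.valueOf(gb * 2));
        }
    }

    public static void main(String[] args) {
        ComputeReconfiguration c = new ComputeReconfiguration();
        c.reconfigure();
        System.out.println(c.attributes);
    }
}
```

In a concrete deployment, invokeAction() and reconfigure() would delegate to the underlying manager rather than mutate an in-memory map, as the examples in the text suggest.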


Fig. 6. Autonomic computing Infrastructure establishment.

For example, we can implement the reconfigure() method to resize the disk of an OCCI Compute instance. In this case, the implementation can be a REST query to the infrastructure manager (e.g., OpenNebula, 2014 or OpenStack, 2014) to resize the disk. It can also be a usage of libvirt API 3 through the virt-resize command that allows to resize a given disk. 3.3.5. RuleSet Mixin It represent a computation rule that based on incoming monitoring information triggers specific alerts. It represents a function applied by the Analyzer to compare monitoring information values against previously defined thresholds or conditions. Whenever a condition is not respected, the Mixin instance makes the Analyzer trigger a specific alert. This Mixin implements the RuleSetInterface that defines a method analyze(notification). An implementation of this method describes how the analyzer process the incoming notifications. An example of this Mixin can use the Event-ConditionAction model. 3.3.6. SubscriptionTool Mixin It models a Subscription instance. It contains the different details of a subscription (i.e., subscription type, duration, and eventually filters). This Mixin may describe the mechanism of the subscription or may refer to an external program that handles this task. This Mixin implements a SubscriptionToolInterface that defines subscribe() method specifying how to pass a subscription from a subscriber to the managed resource. For example, this method can prepare a REST query with the different details of the subscription and send it to the managed resource. 3.3.7. NotificationTool Mixin It models the needed tools by which notifications are sent to subscribers. An instance of the NotificationTool may enable notification messages by different tools like emails, messages, or simple HTTP notifications. This Mixin may implement the needed mechanisms to send the notification or it may refer to an external program that handles this task. 
It implements a NotificationToolInterface that describes a notify() method specifying how a notification is passed from a managed resource to a subscriber. For example, an implementation of this method can create a REST query with the information of the notification and send it to the subscriber.
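As an illustration of the notify() contract just described, here is a minimal self-contained sketch. The class name, the JSON payload shape and the endpoint layout are assumptions for illustration, not the paper's actual implementation; only the request construction is shown, the HTTP transport is left out.

```java
// Sketch of a NotificationTool Mixin implementation (names and payload
// shape are assumptions, not the paper's actual classes): notify() would
// render a notification as a simple JSON body for an HTTP POST.
public class RestNotificationTool {

    // Builds the POST body for one monitoring notification.
    public static String buildBody(String metric, double value, long timestamp) {
        return String.format(
            "{\"metric\":\"%s\",\"value\":%s,\"timestamp\":%d}",
            metric, value, timestamp);
    }

    // Target URI on the subscriber side (endpoint layout is hypothetical).
    public static String targetUri(String subscriberBase) {
        return subscriberBase + "/notifications";
    }

    public static void main(String[] args) {
        String body = buildBody("responseTime", 2970.0, 1454284800L);
        System.out.println("POST " + targetUri("http://analyzer:8182") + " " + body);
    }
}
```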

3 http://libvirt.org/.
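The Event-Condition-Action model mentioned for the RuleSet Mixin in Section 3.3.5 can be sketched as follows. This is a minimal illustration under assumptions: the Notification shape, the class names and the alert naming convention are ours, not the paper's code.

```java
// Sketch of a RuleSet Mixin using the Event-Condition-Action model
// (class and alert names are illustrative): analyze() checks an incoming
// notification against a threshold and returns the alert to raise, or
// null when the SLA condition holds.
public class ThresholdRuleSet {

    public static final class Notification {
        public final String metric;
        public final double value;
        public Notification(String metric, double value) {
            this.metric = metric;
            this.value = value;
        }
    }

    private final String metric;   // event: which metric this rule watches
    private final double maxValue; // condition: SLA upper bound

    public ThresholdRuleSet(String metric, double maxValue) {
        this.metric = metric;
        this.maxValue = maxValue;
    }

    // Action: name of the alert the Analyzer should emit on violation.
    public String analyze(Notification n) {
        if (!metric.equals(n.metric)) return null;          // not our event
        return n.value > maxValue ? metric + "Overloaded" : null;
    }

    public static void main(String[] args) {
        ThresholdRuleSet rule = new ThresholdRuleSet("memory", 80.0);
        System.out.println(rule.analyze(new Notification("memory", 92.0))); // memoryOverloaded
    }
}
```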

3.3.8. AlertTool Mixin
It models the tools by which alerts can reach a Reconfiguration Manager. This Mixin implements an AlertToolInterface that defines a method alert(). An implementation of this method can send a simple message containing the description of the encountered violation, or create an alert and send it to a given HTTP interface via REST. It may also refer to an external program that handles the alerting task.

3.3.9. StrategySet Mixin
Each instance of this Mixin implements a computation strategy that, based on incoming alerts, triggers reconfiguration actions on specific Resources. It represents a function applied by the Reconfiguration Manager to process incoming alerts. This Mixin implements the StrategySetInterface that defines a method generateReconfigurationActions(alert). An implementation of this method describes how to process the incoming alert in order to generate reconfiguration actions. It can also refer to an external program that handles planning for reconfigurations. For example, this Mixin can make a direct matching between the alert and existing actions: if the incoming alert is a scaleUpAlert, this Mixin generates a scaleUp action.

3.3.10. ActionTool Mixin
It models the mechanisms by which reconfiguration actions are applied on a specific Resource. An instance of the ActionTool may represent a simple Action as defined in the OCCI Core model (Nyren et al., 2011). It may also be a composed action that refers to a list of actions. This Mixin implements an ActionToolInterface that defines a method applyAction(reconfigurationAction). An implementation of this Mixin must describe how actions are executed: they can be applied directly using the Reconfiguration Mixin of a given resource, or they can be calls to the IaaS or PaaS managers, for example to add or remove resources.

4. Using OCCI extension to manage cloud resources

4.1. Autonomic infrastructure establishment
In this section, we detail the usage of the previously defined OCCI Entities (i.e., Resources and Links) and Mixins to establish an autonomic computing infrastructure (see Fig. 6). To establish our autonomic computing infrastructure, we start by setting up our OCCI Server. This server is responsible for instantiating any OCCI Entity. The first Resource instantiated in this

Please cite this article as: M. Mohamed et al., Extending OCCI for autonomic management in the cloud, The Journal of Systems and Software (2016), http://dx.doi.org/10.1016/j.jss.2016.01.002

Fig. 7. Overview of the defined OCCI platform types.

server is the Autonomic Manager Resource. This manager uses the SLA to establish and customize the autonomic infrastructure as shown in Fig. 6. Based on the SLA, the Autonomic Manager detects the Mixins that need to be added to the resource. Then, instead of instantiating a standard (OCCI) application resource, the Autonomic Manager extends the Resource with Polling, Subscription and Reconfiguration Mixins to render it monitorable and reconfigurable. The mixed Resource offers the same services as the original one, with the newly added services of monitoring and reconfiguration. The Polling Mixin can get the monitoring data of the resource. It parses the data to extract just the needed metrics. The Subscription Mixin periodically gets the data from the Polling Mixin and sends notifications to the Analyzer through the Notification Link. The Autonomic Manager sends a request to the OCCI Server to instantiate the mixed OCCI Resource (i.e., the Resource with its newly added Mixins). Whenever this Resource is ready, it orders the Server to deploy and start it. The next step realized by the Autonomic Manager is a series of queries addressed to the OCCI Server to instantiate and customize the needed Resources and Links. It starts by instantiating the Analyzer, which consumes monitoring data and applies analysis rules. To receive monitoring notifications, the Analyzer must be subscribed to the monitored Resource using a SubscriptionLink having the Analyzer itself as source and the Resource to be monitored as target. Since the SubscriptionLink is abstract, it needs to be specified using a SubscriptionTool Mixin. When a subscription occurs, a NotificationLink is instantiated between the managed resource and the Analyzer to convey notifications. Technically, the aspects related to notification transfer are described using instances of the NotificationTool Mixin.
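The polling-then-notify loop of the Subscription Mixin can be approximated with a small runnable sketch. The interfaces here are assumptions: a Supplier stands in for the Polling Mixin and a Consumer for the NotificationLink toward the Analyzer; the paper's actual Mixin classes are not reproduced.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Sketch of how a Subscription Mixin could periodically pull a metric
// value from the Polling Mixin and push notifications to the Analyzer
// through a NotificationLink (all interfaces here are assumptions).
public class PeriodicSubscription {

    private final Supplier<Double> pollingMixin;     // reads the monitored metric
    private final Consumer<Double> notificationLink; // delivers to the Analyzer

    public PeriodicSubscription(Supplier<Double> poll, Consumer<Double> notify) {
        this.pollingMixin = poll;
        this.notificationLink = notify;
    }

    // One subscription cycle: poll, then notify.
    public void tick() {
        notificationLink.accept(pollingMixin.get());
    }

    // Runs tick() at the period stated in the subscription details.
    public ScheduledExecutorService start(long periodSeconds) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        timer.scheduleAtFixedRate(this::tick, 0, periodSeconds, TimeUnit.SECONDS);
        return timer;
    }
}
```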
On receiving a notification, the Analyzer uses RuleSet Mixin instances to analyze the incoming monitoring data. If the data does not meet the SLA, the Analyzer may raise alerts to the Reconfiguration Manager. Accordingly, the Autonomic Manager orders the OCCI Server to instantiate the Reconfiguration Manager Resource and link it to the Analyzer via an Alert link specified with an AlertTool Mixin. The Reconfiguration Manager uses specific strategies to generate reconfiguration actions. These strategies are instances of the StrategySet Mixin. The last step is to link the Reconfiguration Manager to the managed Resource using an ActionLink that takes the generated reconfiguration actions and applies them on the Resource. We specify how to apply these actions using an ActionTool Mixin. The ActionLink can use the actions provided by the Reconfiguration Mixin to apply reconfigurations. These actions can be used as methods of the resource itself since they are implemented by the Mixin. It is noteworthy that an instance of the Analyzer Resource can subscribe to different Resources to receive monitoring data. Similarly, the Reconfiguration Manager can be linked to one or more Resources to apply reconfiguration actions.
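The direct alert-to-action matching described for the StrategySet Mixin (a scaleUpAlert yielding a scaleUp action) can be sketched as a lookup table. The alert and action names below are illustrative, and the static map is a simplification of what a real StrategySetInterface implementation would do.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Sketch of a StrategySet Mixin doing direct alert-to-action matching
// (alert and action names are illustrative): an incoming scaleUpAlert
// simply yields a scaleUp action; unknown alerts produce no action.
public class DirectMatchStrategySet {

    private static final Map<String, List<String>> MATCHES = Map.of(
        "scaleUpAlert",   List.of("scaleUp"),
        "scaleDownAlert", List.of("scaleDown"),
        "diskFullAlert",  List.of("hotPlugDisk"));

    // Returns the reconfiguration actions to apply for a given alert.
    public static List<String> generateReconfigurationActions(String alert) {
        return MATCHES.getOrDefault(alert, Collections.emptyList());
    }
}
```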

4.2. Supported managed resources
In our work, we chose to use OCCI in order to remain agnostic to the resource level. By doing so, our mechanisms can be applied to the different layers of the Cloud. For the IaaS level, we use the OCCI extension for Infrastructure (Metsch and Edmonds, 2011a) proposed by the OCCI working groups. As shown in Fig. 2, this extension brings new Resources and Links to describe Cloud infrastructures. The new Resource subclasses are Network, Compute and Storage, while the new Link subclasses are StorageLink and NetworkInterface. In the use case presented in Section 5, we will add management facilities to Compute instances. In order to cover the PaaS level, we defined a new OCCI extension for the Platform (Yangui et al., 2013). This extension models all resources that can be provisioned by a PaaS provider to make up an environment hosting an application. As shown in Fig. 7, the main defined Resource subclasses are (1) Database to describe storage resources at the PaaS level, (2) Container to describe the different engines that host and run applications, and (3) Router to describe resources that provide protocols, message format transformations and routing. We also defined a set of Link subclasses to connect and interact with PaaS resources. These subclasses are (1) ContainerLink to connect to Container resources, (2) RouterLink to connect to Router resources, and (3) DatabaseLink to connect a Container resource to a Database resource. Moreover, in order to cover the SaaS level, we defined a new OCCI extension for Applications (Sellami et al., 2013a). This extension allows describing applications provided in the Cloud. The defined Resource subclasses are (1) Application, which is the software or program that can be deployed on top of a PaaS (Java Web application, Ruby program, etc.), (2) Environment, which represents a set of resources needed to host and run an Application (e.g.
runtime, framework, message queue, etc.), and (3) Deployable, which represents the Application deployables (e.g., source archives). We also defined EnvironmentLink as a Link subclass to connect an Application to an Environment.

5. Implementation and evaluation

5.1. Implementation
In this section, we present the different aspects of our implementation. To implement the different Resources, Links and Mixins previously defined, we used a Java implementation provided by the OCCI working group called occi4java.4 This implementation is developed by the OCCI working group according to the specifications. The project is divided into three parts. Each of these parts

4 https://github.com/occi4java/occi4java.


Fig. 8. Overview of the defined OCCI application types.

corresponds to one of the three basic documents of the OCCI specifications (i.e., OCCI Core, OCCI Infrastructure and OCCI HTTP Rendering). We already used this project to implement our extensions of OCCI Platform (Yangui et al., 2013; Sellami et al., 2013b) and Application (Sellami et al., 2013a). Using occi4java, we extended the Resource class to create our own resources for autonomic computing. We also extended the Link class to define our links. Since the Mixin mechanism is not a native functionality of Java, we used the mixin4j framework.5 This framework allows creating Java Mixins through annotations. The classes to be extended are defined as abstract classes annotated with "@MixinBase", indicating that the abstract class will be extended using Mixins. To create the Mixin itself, it is necessary to create an interface for it. Then, the @MixinType annotation is used to state that this interface represents a Mixin. The implementation of the interface must be specified following the @MixinType annotation. Consequently, the framework allows instantiating new classes from the abstract classes annotated with @MixinBase mixed with implementations of the interfaces annotated with @MixinType. We used this framework to define our Mixins. We annotated our OCCI Resources and Links with @MixinBase to be able to extend them later. Then, we defined all the interfaces of our Mixins and annotated them with @MixinType. Moreover, we implemented the different Mixins containing the functionalities needed for our autonomic infrastructure. In our implementation, the Autonomic Manager is implemented as an abstract Mixin Base. Using mixin4j, we mixed it with an implementation of the SpecificSLAInterface as a MixinType and a WSAgMixinImpl that represents the actual implementation, which uses WSAgreementParser to parse WS-Agreement SLAs. Similarly, the Analyzer and the Reconfiguration Manager are implemented as Mixin Bases.
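mixin4j's exact API is not reproduced here; the composition it automates can be approximated in plain Java, with an abstract class playing the role of a @MixinBase, an interface playing the role of a @MixinType, and a mixed class combining both. All names below are illustrative.

```java
// Library-free approximation of the mixin4j pattern used above (this is
// NOT mixin4j's API): the @MixinBase analog is an abstract class, the
// @MixinType analog an interface, and the "mixed" Resource combines both.
public class MixinCompositionSketch {

    // Plays the role of a class annotated with @MixinBase.
    public abstract static class AnalyzerBase {
        public abstract String analyze(String notification);
    }

    // Plays the role of an interface annotated with @MixinType.
    public interface RuleSetInterface {
        String analyze(String notification);
    }

    // A concrete RuleSet implementation to be mixed in.
    public static class SimpleRuleSet implements RuleSetInterface {
        public String analyze(String notification) {
            return notification.contains("violation") ? "alert" : "ok";
        }
    }

    // The "mixed" Analyzer: base class extended with the Mixin behavior.
    public static AnalyzerBase mix(RuleSetInterface ruleSet) {
        return new AnalyzerBase() {
            public String analyze(String notification) {
                return ruleSet.analyze(notification);
            }
        };
    }
}
```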
The RuleSet and StrategySet Mixins implement, respectively, the two MixinType interfaces RuleSetInterface and StrategySetInterface. For REST HTTP Rendering, the occi4java project uses the Restlet framework.6 Restlet is a lightweight framework for adding REST mechanisms. After adding the needed libraries, one just needs to add the Restlet annotations to implement the different REST actions (i.e., POST, GET, PUT and DELETE). Then, to set up the server, one has to create one or more Restlet Components and attach the resources to them. We used the proposed annotations to define the needed actions for our OCCI Resources, Links and Mixins. This allowed us to use the HTTP REST Rendering to manage our autonomic infrastructure. And in order to enforce the scalability of our

5 http://hg.berniecode.com/java-mixins.
6 http://restlet.org/.

solution, we implemented different Restlet Components to allow the distribution of our infrastructure. We also used this framework to implement all the communication mechanisms used by the Mixins that specify the Links (i.e., the SubscriptionTool, NotificationTool, AlertTool and ActionTool Mixins) (Fig. 8). Furthermore, we used WS-Agreement to describe SLAs (see Listing 1). WS-Agreement is an extensible language for describing SLAs. Its extensibility allows defining user-specific elements and using them. In our example, we were able to specify the attributes to be monitored and their different thresholds. We started from the WS-Agreement XSD7 and added new elements to describe the analysis rules and the reconfiguration strategies. The Mixin that we implemented to parse this kind of SLA is used by the Autonomic Manager to detect the content of these elements and perform a syntactic matching against the list of existing Mixins of each type (RuleSet or StrategySet Mixins). If it finds a Mixin that matches the description, it uses it to extend the concerned resource. More work is needed to specify rules and strategies in a more rigorous way. In order to parse WS-Agreement SLAs, we used the Eclipse Modeling Framework8 (EMF) to implement WSAgreementParser. EMF is a modeling framework that allows building tools and applications based on structured data models (e.g., XSD). We used the WS-Agreement XSD to generate the classes needed to process WS-Agreements. Based on these classes, we developed a parser that responds to our requirements. This parser is used by an instance of the SpecificSLA Mixin. To realize an evaluation use case, we implemented different monitoring and reconfiguration Mixins based on OCCI REST queries. Table 4 summarizes the actions performed by each Mixin. The monitoring Mixin for Applications is based on the REST API proposed by the NewRelic (Cloud Foundry Blog, 2014) service integrated into the CloudFoundry PaaS.
We also implemented three Mixins to monitor Compute resources. These three Mixins communicate with the OCCI server proposed by the IaaS manager (we used OpenNebula in our use case). These monitoring Mixins get the needed information about the resource, create notifications related to the monitored attributes and send them to the Analyzer. We also implemented different reconfiguration Mixins; each one implements the reconfigure() method differently. The Mixin implementations that reconfigure OCCI Compute resources address REST queries to the OCCI server of the IaaS manager (i.e., OpenNebula). As shown in Table 4, we need to send an XML file

7 http://schemas.ggf.org/graap/2007/03/ws-agreement.
8 http://www.eclipse.org/modeling/emf/.


Table 4
Summary of the monitoring and reconfiguration Mixins.

Mixin role | REST query
Monitoring Compute memory consumption | GET http://OpenNebulaOCCIserver/compute/id/
Monitoring Compute remaining free disk | GET http://OpenNebulaOCCIserver/compute/id/
Monitoring Compute availability | GET http://OpenNebulaOCCIserver/compute/id/
Monitoring Application average response time | GET http://NewRelicInstance/v2/applications.xml
Adding memory to Compute | PUT http://OpenNebulaOCCIserver/compute/id/compute.xml
Removing memory from Compute | PUT http://OpenNebulaOCCIserver/compute/id/compute.xml
Hot-plugging a disk to Compute | PUT http://OpenNebulaOCCIserver/compute/id/compute.xml
Redeploying Compute | PUT http://OpenNebulaOCCIserver/compute/id/compute.xml
Hot-detaching a disk from Compute | PUT http://OpenNebulaOCCIserver/compute/id/compute.xml
Scaling up an Application | PUT http://COAPSAddress/coaps/appid/instances/nbr
Scaling down an Application | PUT http://COAPSAddress/coaps/appid/instances/nbr

describing the new targeted state of the Compute resource. Finally, we implemented a Mixin able to reconfigure applications deployed on CloudFoundry by adding or removing application instances. The latter uses a REST query to communicate with the generic COAPS9 API (Sellami et al., 2013b). COAPS allows interacting seamlessly with different PaaS offerings in a generic manner. In this paper we will not go into the details of these aspects but will use some actions proposed by COAPS. As shown in Table 4, the queries sent to COAPS must contain the identifier of the application as well as the new number of instances. A description of the realized work is available at http://www-inf.it-sudparis.eu/SIMBAD/tools/OCCI/autonomic. The page contains a link to download an archive of all the implementations. The archive also includes a user guide that explains how to test the project using an Application resource that we implemented. This resource consists of a servlet that generates two random matrices and calculates their product. The same page provides the client that we used to call the application. It is a REST client that sends requests to the application with a parameter specifying the size of the matrices to generate.

5.2. Evaluation
In order to validate our proposal, we present in this section the use case that we realized to this aim. We describe the experiment environment and the preliminary results.

5.3. Evaluation environment
To perform our tests we used the NCF (Network and Cloud Federation) experimental platform deployed in our laboratory. The NCF platform provides 380 Intel Xeon Nehalem cores, 1.17 TB of RAM and 100 TB of shared storage. We used OpenNebula, a virtual infrastructure engine that provides the needed functionality to deploy and manage virtual machines (VMs) on a pool of distributed physical resources. During our experiments, we used a template with the following characteristics: 4 cores (2.66 GHz) and 4 GB of RAM. We also used our customized virtual machines containing an instance of the open source PaaS CloudFoundry. In our tests we deploy an application (i.e., SaaS) on our private instance of CloudFoundry. Our goal is to show how to dynamically establish an autonomic management infrastructure around Cloud resources in a generic manner, as explained in Section 5.4. The established infrastructure has to harvest monitoring data, analyze this data and generate reconfiguration actions when needed. In the following, we describe the use case that we implemented to perform our tests.

9 http://www-inf.it-sudparis.eu/SIMBAD/tools/COAPS/.
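The PUT queries of Table 4 carry an XML description of the targeted Compute state. A minimal sketch of how a reconfiguration Mixin might build such a request follows; the element names and server name are placeholders, and a real body must follow the OpenNebula OCCI compute schema.

```java
// Sketch of a reconfiguration Mixin preparing the PUT body sent to the
// IaaS manager's OCCI server. The element names below are illustrative;
// the real document must follow the OpenNebula OCCI compute schema.
public class ComputeResizeRequest {

    // Builds an XML description of the targeted Compute state.
    public static String buildBody(String computeId, int memoryMb) {
        return "<COMPUTE>"
             + "<ID>" + computeId + "</ID>"
             + "<MEMORY>" + memoryMb + "</MEMORY>"
             + "</COMPUTE>";
    }

    // Target of the PUT query, following Table 4 (server name is a placeholder).
    public static String targetUri(String server, String computeId) {
        return "http://" + server + "/compute/" + computeId + "/compute.xml";
    }
}
```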

5.4. Use case
To evaluate our work, we present a real use case that spans the different levels of the Cloud (i.e., the IaaS, PaaS and SaaS levels). It consists of the deployment of an application (i.e., the SaaS) on a private instance of the CloudFoundry PaaS.10 Our PaaS is deployed on three virtual machines managed by the OpenNebula IaaS manager. OpenNebula provides an OCCI server that allows us to describe and manage the VMs as OCCI Compute resources. It is noteworthy that we used the OCCI extension for Infrastructure (Metsch and Edmonds, 2011a) to describe the IaaS level. Moreover, we used our Platform and Application extensions for PaaS and SaaS resources. We also used COAPS to interact with the CloudFoundry PaaS. The application is composed of two servlets that collaborate to perform numerical analysis on matrices. The first servlet is the endpoint that receives client queries containing the size of the matrices. It generates the matrices and saves them on disk. Then it sends their locations to the second servlet, which reads them, applies the desired computation and saves the result on disk. In our use case, we focus on matrix product calculation. We assume that these matrices may be large; consequently, the used disk could become saturated. The number of client queries can also increase or decrease, so the application instances may be more or less solicited. Moreover, we consider that the virtual machine instances could encounter failures or disconnections. In order to ensure the continuity of the application execution with all its required resources, we applied our approach to add autonomic behaviors to the Application and Compute resources. Our targeted metrics to monitor are (1) Compute disk usage, (2) Compute memory usage, (3) Application response time, and (4) Compute availability.
The analysis of the monitoring data for these metrics can result in the following reconfiguration actions: (1) reconfiguring the disk used by the Compute instance by hot-plugging a new disk or by resizing the original one, (2) resizing the Compute instance to add or remove memory capacity, (3) auto-scaling the application by adding or removing instances, and (4) redeploying a new instance of the Compute instead of the failing one. To monitor the Compute instances (i.e., the 3 VMs used), we used three of the implemented monitoring Mixins. As described in Section 5.1, the first Mixin monitors the memory consumption of the VM, the second monitors its remaining free disk, while the third monitors its state. All these Mixins are based on OCCI REST queries sent to the OpenNebula OCCI server. The collected data is sent to the Analyzer as notifications. The Analyzer uses its RuleSet Mixins to analyze these data and to generate alerts if the SLA is violated.
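On the monitoring side, a Compute Mixin has to extract metric values from the response returned by the IaaS manager's OCCI server. The sketch below assumes an XML compute description; the USED_MEMORY element name is illustrative, since the real response follows OpenNebula's schema.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

// Sketch of a Compute monitoring Mixin extracting one metric from the
// XML returned by the IaaS manager's OCCI server (the element name is
// an assumption, not OpenNebula's documented schema).
public class ComputeMonitoringMixin {

    // Pulls the USED_MEMORY element out of a compute description.
    public static double usedMemoryMb(String computeXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(
                    computeXml.getBytes(StandardCharsets.UTF_8)));
            String text = doc.getElementsByTagName("USED_MEMORY")
                .item(0).getTextContent();
            return Double.parseDouble(text);
        } catch (Exception e) {
            throw new IllegalArgumentException("unparsable compute XML", e);
        }
    }
}
```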

10 http://www.cloudfoundry.com/.


Table 5
Description of monitoring metrics, thresholds, alerts and reconfigurations of our use case.

Resource (level) | Metrics | Thresholds | Alerts | Reconfigurations
Compute (IaaS) | Memory | Max 80% / Min 20% | Memory overloaded / Memory underloaded | Add memory to compute / Remove memory from compute
Compute (IaaS) | Disk | Max 80% | Disk full | Hot-plug disk to compute
Compute (IaaS) | Availability | – | Compute unavailable | Redeploy compute
Application (SaaS) | Response time | Max 3230 / Min 2205 | Application overloaded / Application underloaded | Scale up application / Scale down application

Table 6
Average times needed to apply reconfiguration actions.

Reconfiguration action | Time (s)
Adding memory to compute | 23.5
Removing memory from compute | 24.6
Hot-plugging disk to compute | 7.9
Hot-detaching disk from compute | 8.2
Redeploying compute instance | 20.2
Scaling up an application | 11.8
Scaling down an application | 12.1

Accordingly, we need to monitor the Application resource. To do that, we used the associated Mixin that gathers response times for a CloudFoundry Application. This Mixin also sends notifications to the Analyzer, which applies the related rules to verify whether the SLA is violated. Table 5 summarizes the monitoring metrics of our use case and their associated thresholds, alerts and reconfigurations. This information is directly used by our implemented Mixins. In each case, the Analyzer uses its rules to verify whether the monitoring data respects the defined thresholds. If the SLA is violated, the Analyzer sends the associated alert to the Reconfiguration Manager. The Reconfiguration Manager uses its Mixin to generate the associated reconfiguration actions. For example, if the Compute resource consumes more than 80% of its memory, the Analyzer sends an alert stating that the Compute memory is over-used. Consequently, the Reconfiguration Manager generates actions to reconfigure the Compute instance. This reconfiguration is based on the associated Mixin described in Section 5.1. It is worth noting that the thresholds are deduced from the SLA and that the analysis rules are proactive. The rules do not wait until the SLAs are violated; instead, they anticipate by sending alerts before the violation occurs. More work in this area will be conducted in our future work. Since this is not the main objective of this paper, we fixed the thresholds to ±5% of the hard thresholds specified in the SLA. This choice was made based on our experiments and is specific to this use case.

5.4.1. Use case instantiation
The input of our OCCI Server is an SLA describing the needed Compute and Application instances and their requirements, as well as the needed details for the autonomic aspects. It is passed to the Autonomic Manager in an HTTP POST request.

The Autonomic Manager uses the SLA to build the environment where the application is deployed. It carries out the list of queries to the OCCI Server to instantiate the three mixed Compute instances (with monitoring and reconfiguration Mixins) using their identifiers. These instances are configured to run a CloudFoundry environment. Once deployed, they offer an endpoint of this PaaS that we set to api.cloudfoundry.io. Afterwards, the Autonomic Manager sends the needed queries so that the OCCI Server deploys and starts the Application extended with monitoring and reconfiguration Mixins. To this aim, the OCCI Server uses the actions proposed by COAPS. After the provisioning of the application, the Autonomic Manager proceeds to the instantiation of all the needed Resources, Links and Mixins via the OCCI Server. From the SLA, the Autonomic Manager detects the metrics to be monitored. It also extracts the defined thresholds as well as the analysis rules and the reconfiguration strategies. This information is described using the user-specific elements that we added to WS-Agreement, as shown in Listing 1.

5.4.2. Evaluation results
In these evaluations, we measured the time needed to deploy and start Application and Compute resources and the time needed to build the autonomic infrastructure. The measurements show that the deployment and starting time depends on the size of the Resource, whereas the establishment of the rest of the autonomic infrastructure is independent of the Resource. The instantiation of a Compute instance takes 15.8 s; since the queries are simultaneous, the instantiation of the three instances takes 38.7 s. Deploying our test application, with a size of 14.4 Kb, to CloudFoundry took 68.2 s, and starting this same application took 15.6 s. The time needed to establish the rest of the autonomic infrastructure is independent of the resource size, but it depends on the number of managed resources. This time was around 12.3 s, which is encouraging since it is negligible compared to the resource deployment and starting times. During the experiments, we measured the time needed to apply the different reconfiguration actions.
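One reading of the proactive ±5% margin used for the analysis rules can be sketched as follows. The text does not state whether the margin is relative or absolute; a relative margin on each hard SLA bound is assumed here.

```java
// Sketch of how proactive analysis thresholds could be derived from the
// hard SLA bounds: alerts fire slightly before the bound is reached.
// The 5% relative margin is one interpretation of the paper's setting.
public class ProactiveThresholds {

    // Tightens a hard upper bound: alert slightly below the SLA maximum.
    public static double proactiveMax(double hardMax) {
        return hardMax * 0.95;
    }

    // Tightens a hard lower bound: alert slightly above the SLA minimum.
    public static double proactiveMin(double hardMin) {
        return hardMin * 1.05;
    }
}
```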
Table 6 summarizes the average time needed to apply each reconfiguration action. Furthermore, we compared the behavior of the application before and after adding our autonomic infrastructure. To this end, we developed a REST client able to call the deployed application. We deployed this client on different VMs in order to launch a large number of parallel calls, creating a burst of client queries targeting the application. In each series of experiments we varied the number of clients targeting our application and recorded the measured response times. The results of this evaluation are shown in Fig. 9(a). The response time of the application without our mechanisms increases proportionally with the number of client queries. Moreover, after 5000 parallel clients, the application went down and its execution continuity was broken. In contrast, using our mechanisms, whenever the response time approached the max threshold that we specified in the SLA, a reconfiguration action was triggered to scale up the application by adding a new instance. This reconfiguration decreases the response time of the application. If the response time approaches the min threshold, a reconfiguration action is triggered to scale down the application. The experiments show the efficiency of our approach to enhance the


Fig. 9. Response time and memory consumption of application resource before and after adding our mechanisms.

behavior of the application. In fact, over 1000 clients, the response time remains around the specified thresholds. The experiments also show that we reached 7000 clients without any downtime of the application. Beyond the response time of the Application, we measured its memory consumption. Fig. 9(b) shows the difference between the memory consumption of an Application with and without our mechanisms. The memory consumption of a non-autonomic Application is constant, since it has a fixed allocated memory and no new instances are added. Meanwhile, for an autonomic Application, the curve shows that the memory consumption changes whenever there is a duplication or a consolidation. The cost of a duplication corresponds to 512 MB, the size of an added application instance. These changes in memory consumption can have an impact on the other metrics and reconfiguration actions (i.e., the memory usage of the Compute instances and their configuration). These results open new perspectives for further studies to fine-tune the choice of thresholds. As previously mentioned, the thresholds used for the analysis rules are deduced from the values specified in the SLA. We started our experiments using the hard thresholds specified in the SLA and noticed some violations. In order to guarantee the accuracy of the system regarding the specified SLA, we gradually added a ± percentage to each threshold value. After multiple series of experiments, we reached 100% accuracy by adding ±5% to all the thresholds specified in the SLA. We note that this value is specific to this use case, and this direction needs more attention in our future work. In this section, we showed by experiments how our approach adds autonomic behaviors to heterogeneous Cloud Resources deployed over different levels of a real Cloud.

6. State of the art
In Cloud environments, there are different research works around autonomic computing.
In the following we give an overview of these works. One of the pioneers in the Autonomic Computing field is IBM, which proposed an entire toolkit called the IBM Autonomic Computing Toolkit (Jacob et al., 2004). The authors gave IBM's definition of Autonomic Computing as well as the steps needed to add autonomic capabilities to components. The proposed toolkit is a collection of technologies and tools that allows a user to develop autonomic behavior for his systems. One of the basic tools is the Autonomic Management Engine, which includes representations of the autonomic loop. Several tools are proposed to allow managed resources to create log messages in a standard format comprehensible by the Autonomic Manager. This is realized by creating a touchpoint containing a sensor and an effector. The sensor's role is to sense the state of the resource and generate logs, while the

effector is used to execute adaptations on the resource. Moreover, IBM proposed an Adapter Rule Builder that allows creating specific rules to generate adaptations. In Solomon et al. (2010), the authors proposed a process to develop autonomic management systems for Cloud Computing. This process is a list of steps to follow in order to obtain an autonomic system. The first step is to identify the control parameters, i.e., the elements to be monitored or reconfigured in the system. The second step is to define a system model that provides a view of the managed system. The next step is to identify the inputs of the system that need to be monitored. Afterward, the parameters of the previously defined system model are initialized. The administrator of the system may decide the update rate of the model according to the measurements. He can also specify the type of decisional system to be used. The different components are linked using a coordinator. Afterwards, sensors and filters may be deployed to take measurements. At this point the autonomic system is in place to control the managed system and apply reconfiguration actions on it. In their work, Buyya et al. (2012) proposed a conceptual architecture to enhance autonomic computing in Cloud environments. The proposed architecture is composed of a SaaS web application used to negotiate the SLA between the provider and its customers, an Autonomic Management System (AMS) located in the PaaS layer, and an IaaS layer that provides public or private Cloud resources. The AMS incorporates an Application scheduler and an Energy-efficient scheduler. The Application scheduler is responsible for assigning applications to Cloud resources, and the Energy-efficient scheduler aims at minimizing the energy consumption of the overall system. The AMS implements the logic for provisioning and managing virtual resources.
The PaaS layer also contains a component to detect security infringements and attack attempts. Li et al. (2011) proposed an integrated approach to automate the management of virtualized resources in the Cloud. In their proposal they use different analytic techniques to control the virtual resources, including feedback control theory, statistical machine learning and system identification. They used KVM as a virtualization solution and monitor in their implementation. In the state of the art, there are other works applying the autonomic computing paradigm to different areas (Baker et al., 2013; Bezemer and Zaidman, 2014; Wang et al., 2008; Jiao et al., 2010; Cheng and Garlan, 2012; Perez-Palacin et al., 2012; Peng et al., 2012). Almost all of these works propose specific solutions. Especially when coupled with Cloud environments, the proposed works target a specific area and cannot be generalized. In contrast, what we propose in this paper is a generic solution that can be applied at all levels. In order to pass from a
level to another, we can keep our Resources and Links unchanged while adding just the Mixins needed to specify the behavior of the infrastructure. Hereafter, we give an overview of the state of the art of monitoring and reconfiguration, since they are the basic concepts of autonomic computing.

6.1. Monitoring and reconfiguration

In the literature, there are many attempts to provide monitoring in the Cloud and in distributed systems. In this subsection, we present some approaches proposed in this area. We conclude by explaining the limitations of these approaches and stating our objectives to overcome them. Different approaches for monitoring and reconfiguration have been proposed in the state of the art (Massie et al., 2004; Brandt et al., 2009; Entreprises, 2014; Rak et al., 2011; Ferretti et al., 2010; Katsaros et al., 2011; Pellegrini, 1999; Hnětynka and Plášil, 2006; Ruz et al., 2010). Almost all of these solutions target specific environments and do not provide generic monitoring and reconfiguration of Cloud resources that span the different layers. We were inspired by the work of Ciuffoletti (2014) to extend OCCI to provide a monitoring and reconfiguration infrastructure on demand. Consequently, our approach is generic and is based on the OCCI standards. The extensibility of our work is guaranteed by OCCI Mixins, which can specify technical aspects for each level separately. Since the Resources and Links are abstract, we can keep them unchanged from one level to another. Furthermore, to enable different kinds of reconfigurations, all we need is to provide new Mixins that add new specific behaviors.

7. Conclusions

Autonomic computing remains an unavoidable way to efficiently deal with the dynamicity and heterogeneity of Cloud environments. Indeed, in such environments the management of Cloud resources in an automated and integrated way is a challenging task (Buyya et al., 2012).
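The Mixin-based specialization discussed above can be made concrete with a small sketch of the OCCI text rendering, in which a request carries one Category header entry per Kind or Mixin. The scheme URI and the terms below are hypothetical illustrations, not the actual identifiers of our extension.

```python
# Sketch of the OCCI text rendering of Category headers (one per Kind or
# Mixin), following the OCCI RESTful HTTP Rendering format. The scheme
# URI and the terms below are hypothetical, not those of our extension.

def render_category(term, scheme, cls, title=None):
    """Render a single OCCI Category header line."""
    value = '%s; scheme="%s"; class="%s"' % (term, scheme, cls)
    if title:
        value += '; title="%s"' % title
    return "Category: " + value

SCHEME = "http://example.org/occi/autonomic#"  # hypothetical scheme URI

# A request creating an Analyzer resource specialized by a mixin would
# carry, for example:
headers = [
    render_category("analyzer", SCHEME, "kind", title="Analyzer Resource"),
    render_category("sla_violation", SCHEME, "mixin"),  # hypothetical mixin
]
for h in headers:
    print(h)
```

Passing from one Cloud level to another would then amount to swapping the Mixin categories while the generic Kind, and thus the Resource itself, stays unchanged.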
In this paper we proposed an extension of the Open Cloud Computing Interface (OCCI) to provide autonomic computing on demand. We defined a list of OCCI Entities and Mixins to enable the autonomic loop for OCCI resources. The establishment of the infrastructure is based on the OCCI HTTP Rendering, and all the descriptions respect the latest revision of the OCCI standards. To support our proposal, we presented a real use case that spans the different layers of the Cloud and shows how we can establish an autonomic computing infrastructure for different OCCI resources to deal with their evolution during their life cycles. We then described the different aspects of our implementation and gave a brief introduction to each feature used. Combining different conceptual and technical mechanisms allowed us to provide a solution that can bypass the heterogeneity obstacle in Cloud environments. The evaluation that we conducted encouraged us to go further in this work and opened vast perspectives. In our future work, we plan to integrate our work into the CompatibleOne (CompatibleOne, 2013) and EASI-CLOUDS (EASI-CLOUDS, 2014) projects. The challenging part of this work is that we advocate that our proposal can be applied independently of the service level (i.e., IaaS, PaaS, SaaS). Consequently, we need to implement Mixins that allow establishing the autonomic loop for heterogeneous resources spanning the different layers of the Cloud. Moreover, to enhance the automatic establishment of the autonomic infrastructure, we need to clearly describe how the Autonomic Manager can extract in a standard way the different information contained in the SLA (a WS-Agreement sample is given in Listing 1). Basically, we need to add the description of the rules used by the Analyzer resource and the
strategies used by the Reconfiguration Manager resource in a standard way. In order to get efficient rules and strategies, we plan to apply the same formal approach realized by Amziani et al. (2013), which consists of a generic elasticity controller that allows the definition and evaluation of different elasticity rules and strategies.

Acknowledgments

The work presented in this paper was partially funded by the French FUI (CompatibleOne, 2013), the French FSN (Open PaaS, 2014) and the European ITEA (EASI-CLOUDS, 2014) projects.

References

Amziani, M., Melliti, T., Tata, S., 2013. Formal modeling and evaluation of stateful service-based business process elasticity in the cloud. In: Meersman, R., Panetto, H., Dillon, T., Eder, J., Bellahsene, Z., Ritter, N., De Leenheer, P., Dou, D. (Eds.), OTM Conferences: On the Move to Meaningful Internet Systems. Lecture Notes in Computer Science, Vol. 8185. Springer Berlin Heidelberg, pp. 21–38.
Andrieux, A., et al., 2005. Web services agreement specification (WS-Agreement). Technical Report. Global Grid Forum, Grid Resource Allocation Agreement Protocol (GRAAP) WG.
Baker, T., Rana, O., Calinescu, R., Tolosana-Calasanz, R., Bañares, J., 2013. Towards autonomic cloud services engineering via intention workflow model. In: Altmann, J., Vanmechelen, K., Rana, O. (Eds.), Economics of Grids, Clouds, Systems, and Services. Lecture Notes in Computer Science. Springer International Publishing.
Baldoni, R., Beraldi, R., Piergiovanni, S., Virgillito, A., 2004. Measuring notification loss in publish/subscribe communication systems. In: Proceedings of the 10th IEEE Pacific Rim International Symposium on Dependable Computing, Papeete, Tahiti, French Polynesia, pp. 84–93.
Bezemer, C.-P., Zaidman, A., 2014. Performance optimization of deployed software-as-a-service applications. J. Syst. Softw. 87 (0), 87–103.
Brandt, J., Gentile, A., Mayo, J., Pebay, P., Roe, D., Thompson, D., Wong, M., 2009.
Resource monitoring and management with OVIS to enable HPC in cloud computing environments. In: Proceedings of the IEEE International Symposium on Parallel Distributed Processing, Rome, Italy, pp. 1–8.
Buyya, R., Calheiros, R., Li, X., 2012. Autonomic cloud computing: open challenges and architectural elements. In: Proceedings of the Third International Conference on Emerging Applications of Information Technology (EAIT), Kolkata, India, pp. 3–10.
Cheng, S.-W., Garlan, D., 2012. Stitch: a language for architecture-based self-adaptation. J. Syst. Softw. 85 (12), 2860–2875 (Self-Adaptive Systems).
Ciuffoletti, A., 2014. A simple and generic interface for a cloud monitoring system. In: Proceedings of the 3rd International Conference on Cloud Computing and Services Science, CLOSER. SciTePress, Barcelona, Spain.
Cloud Foundry Blog, 2014. Monitoring cloud foundry applications with new relic. http://blog.cloudfoundry.org/2013/10/10/monitoring-cloud-foundry-applications-with-new-relic/.
Cloud Industry Forum, 2012. UK cloud adoption and trends for 2013. Technical Report. http://www.cloudindustryforum.org/white-papers/uk-cloud-adoption-and-trends-for-2013.
Cloud4SOA, 2012. Cloud4SOA. http://www.cloud4soa.eu/.
Cloudware, O., 2013. Open Cloudware Project. http://www.opencloudware.org/bin/view/Main/WebHome.
CompatibleOne, 2013. CompatibleOne: the first real open cloud broker.
EASI-CLOUDS, 2014. EASI-CLOUDS: extensible architecture and service infrastructure for Cloud-aware Software. http://easi-clouds.eu.
Entreprises, N., 2014. Nagios Documentation. http://www.nagios.org/documentation.
Entrialgo, J., García, D.F., García, J., García, M., Valledor, P., Obaidat, M.S., 2011. Dynamic adaptation of response-time models for QoS management in autonomic systems. J. Syst. Softw. 84 (5), 810–820.
Ferretti, S., Ghini, V., Panzieri, F., Pellegrini, M., Turrini, E., 2010. QoS aware clouds. In: Proceedings of the IEEE 3rd International Conference on Cloud Computing (CLOUD), Miami, Florida, USA, pp.
321–328.
Hnětynka, P., Plášil, F., 2006. Dynamic reconfiguration and access to services in hierarchical component models. In: Gorton, I., Heineman, G., Crnković, I., Schmidt, H., Stafford, J., Szyperski, C., Wallnau, K. (Eds.), Component-Based Software Engineering. Lecture Notes in Computer Science, Vol. 4063. Springer Berlin Heidelberg, pp. 352–359.
Jacob, B., Lanyon-Hogg, R., Nadgir, D.K., Yassin, A.F., 2004. A practical guide to the IBM autonomic computing toolkit. IBM Redbooks. http://www.redbooks.ibm.com/.
Jiao, W., Sun, Y., Mei, H., 2010. Automated assembly of internet-scale software systems involving autonomous agents. J. Syst. Softw. 83 (10), 1838–1850.
Kadner, K., et al., 2011. Unified service description language XG final report. Technical Report. http://www.w3.org/2005/Incubator/usdl/XGR-usdl-20111027/.
Katsaros, G., Gallizo, G., Kübert, R., Wang, T., Fitó, J.O., Henriksson, D., 2011. A multi-level architecture for collecting and managing monitoring information in cloud environments. In: Proceedings of the Third International Conference on Cloud Computing and Services Science, CLOSER. SciTePress, Noordwijkerhout, The Netherlands, pp. 232–239.


Li, Q., Hao, Q.-f., Xiao, L.-m., Li, Z.-j., 2011. An integrated approach to automatic management of virtualized resources in cloud environments. Comput. J. 54 (6), 905–919.
Massie, M.L., Chun, B.N., Culler, D.E., 2004. The ganglia distributed monitoring system: design, implementation, and experience. Parallel Comput. 30 (5-6), 817–840.
Metsch, T., Edmonds, A., 2011. Open cloud computing interface - infrastructure. Technical Report. Open Grid Forum. http://www.ogf.org/documents/GFD.184.pdf.
Metsch, T., Edmonds, A., 2011. Open cloud computing interface - RESTful HTTP rendering. Technical Report. Open Grid Forum. http://www.ogf.org/documents/GFD.185.pdf.
Mohamed, M., Belaïd, D., Tata, S., 2013a. Adding monitoring and reconfiguration facilities for service-based applications in the cloud. In: Proceedings of the IEEE 27th International Conference on Advanced Information Networking and Applications (AINA), Barcelona, Spain, pp. 756–763.
Mohamed, M., Belaïd, D., Tata, S., 2013b. Monitoring and reconfiguration for OCCI resources. In: Proceedings of the IEEE 5th International Conference on Cloud Computing Technology and Science (CloudCom), Bristol, UK, Vol. 1, pp. 539–546.
Mohamed, M., Belaïd, D., Tata, S., 2013c. Monitoring of SCA-based applications in the cloud. In: Proceedings of the 3rd International Conference on Cloud Computing and Services Science, pp. 47–57.
Nyren, R., Edmonds, A., Papaspyrou, A., Metsch, T., 2011. Open cloud computing interface - core. Technical Report. Open Grid Forum. http://www.ogf.org/documents/GFD.183.pdf.
Open PaaS, 2014. Open PaaS Project. https://open-paas.org.
OpenNebula, 2014. OpenNebula: Flexible Enterprise Cloud Made Simple. http://opennebula.org.
OpenStack, 2014. OpenStack: Open source software for building private and public clouds. http://www.openstack.org.
Pellegrini, N.-C., 1999. Dynamic reconfiguration of Corba-based applications. In: Proceedings of Technology of Object-Oriented Languages and Systems.
IEEE Computer Society, Lante, USA, p. 329.
Peng, X., Chen, B., Yu, Y., Zhao, W., 2012. Self-tuning of software systems through dynamic quality tradeoff and value-based feedback control loop. J. Syst. Softw. 85 (12), 2707–2719 (Self-Adaptive Systems).
Perez-Palacin, D., Mirandola, R., Merseguer, J., 2012. QoS and energy management with Petri nets: a self-adaptive framework. J. Syst. Softw. 85 (12), 2796–2811 (Self-Adaptive Systems).
Rak, M., Venticinque, S., Mahr, T., Echevarria, G., Esnal, G., 2011. Cloud application monitoring: the mOSAIC approach. In: Proceedings of the IEEE International Conference on Cloud Computing Technology and Science, CloudCom, Athens, Greece, pp. 758–763.
Ruz, C., Baude, F., Sauvan, B., 2010. Component-based generic approach for reconfigurable management of component-based SOA applications. In: Proceedings of the 3rd International Workshop on Monitoring, Adaptation and Beyond. ACM, Ayia Napa, Cyprus, pp. 25–32.
Sellami, M., Yangui, S., Mohamed, M., Tata, S., 2013a. Open cloud computing interface - application. Technical Report.
Sellami, M., Yangui, S., Mohamed, M., Tata, S., 2013b. PaaS-independent provisioning and management of applications in the cloud. In: Proceedings of the 2013 IEEE Sixth International Conference on Cloud Computing (CLOUD), Santa Clara Marriott, CA, USA, pp. 693–700.
Solomon, B., Ionescu, D., Litoiu, M., Iszlai, G., 2010. Designing autonomic management systems for cloud computing. In: Proceedings of the International Joint Conference on Computational Cybernetics and Technical Informatics (ICCC-CONTI), Timisoara, Romania, pp. 631–636.
Wang, X., Du, Z., Chen, Y., Li, S., 2008. Virtualization-based autonomic resource management for multi-tier Web applications in shared data center. J. Syst. Softw. 81 (9), 1591–1608 (Gauging the progress of Software Architecture research: three selected papers from Working IEEE/IFIP Conference on Software Architecture (WICSA) 200).
Yangui, S., Mohamed, M., Sellami, M., Tata, S., 2013. Open cloud computing interface - platform. Technical Report.

Mohamed Mohamed is a researcher at IBM Research, Almaden Research Center, San Jose, CA, USA, where he works in the Platform and Mobile team. His research interests include the provisioning and management of Cloud platform resources as well as SLA management and PaaS persistence. Mohamed is a committee member of many international conferences in his domain. For more information, see http://goo.gl/F7j3sN.

Djamel Belaïd is a Professor and the Head of the Computer Science Department at Télécom SudParis, part of Institut Mines-Télécom and University of Paris-Saclay, France. He works in the area of distributed computing systems. His interests include middleware for pervasive environments, service-oriented architecture (SOA), cloud computing, and autonomic service management. He has worked on a number of French and European collaborative research projects involving collaboration with academic institutions, research laboratories, large companies and SMEs.

Samir Tata is a Professor and Research Group Leader at Télécom SudParis, part of Institut Mines-Télécom and University of Paris-Saclay, France. His current research areas include service computing, Cloud computing and business process management. He was chair of several international conferences and workshops, and is a member of the steering or program committees of several international conferences. For more information, see http://www-inf.it-sudparis.eu/~tata.

Please cite this article as: M. Mohamed et al., Extending OCCI for autonomic management in the cloud, The Journal of Systems and Software (2016), http://dx.doi.org/10.1016/j.jss.2016.01.002