Accepted manuscript, Computer Standards & Interfaces (2017). DOI: 10.1016/j.csi.2017.07.001. Received 7 February 2017; revised 13 June 2017; accepted 8 July 2017.

Highlights

- An analytical model that captures the dynamics and behavior of SaaS cloud services is presented. For any given workload, the model can estimate the minimum number of multi-core Virtual Machines (VMs) needed to satisfy QoS parameters.
- Mathematical formulas for key performance metrics are derived. These formulas can be used for capacity engineering and scalability solutions for SaaS cloud services.
- Discrete event simulation is used to validate the analytical model.
- Results show that SaaS systems under heavy workload can benefit, in terms of cost efficiency and system responsiveness, from deploying multi-core VMs as opposed to single-core VMs.


Performance Analysis of Multi-Core VMs Hosting Cloud SaaS Applications

Said El Kafhali (1) and Khaled Salah (2)

(1) Computer, Networks, Mobility and Modeling Laboratory, National School of Applied Sciences, Hassan 1st University, Settat, Morocco. Email: [email protected]
(2) Electrical and Computer Engineering Department, Khalifa University of Science, Technology and Research (KUSTAR), UAE. Email: [email protected]

Abstract

Today's data centers are designed to scale up and respond to the offered workload in a rapid, efficient, and effective manner, while at the same time satisfying Service Level Agreement (SLA) requirements. This opens up many interesting and challenging research issues and opportunities. Software-as-a-Service (SaaS) is the most popular cloud service model in use today, in which multi-core VMs must be allocated efficiently to meet the offered workload while avoiding any violation of the agreed SLA. This entails the need to model SaaS services in order to predict performance and overall system cost, and to estimate the required number of VM resources and their respective multi-core capacity prior to actual deployment. To this end, we present in this paper a queuing mathematical model to study and analyze the performance of multi-core VMs hosting cloud SaaS applications. Under any offered workload, our analytical model estimates the number of multi-core VM instances required to satisfy the Quality of Service (QoS) parameters. The model is validated using discrete event simulation (DES). Results obtained from both the analytical and simulation models show that the proposed model can correctly and effectively predict system performance and cost, and can determine the number of VM cores needed for SaaS services to achieve QoS targets under different workload conditions.

KEYWORDS: Cloud Computing Data Center, Queuing Theory, Performance Analysis, SaaS Applications, Multi-Core VMs.

1. Introduction

Cloud computing has three popular service models [1]. The first, Infrastructure as a Service (IaaS), such as Amazon EC2 [2], offers virtualized hardware, storage, and physical devices over the Internet. The Software as a Service (SaaS) model, such as Customer Relationship Management (CRM) [3], offers software and hosted applications over the Internet. Finally, as a combination of both, the Platform as a Service (PaaS) model, such as Google App Engine (GAE) [4], offers the capability of deploying applications created using programming languages, libraries, services, and tools supported by the cloud provider, where the consumer does not manage or control the underlying cloud infrastructure but has control over the deployed applications. Moreover, clouds can be deployed as public, private, hybrid, or community clouds, depending on how they are managed [1]. Private clouds are operated by a single company or organization, whilst public clouds are available for external and public usage. Community clouds share their infrastructure between multiple companies or organizations. Finally, hybrid clouds are a composition of the previous deployment models. Some underlying technologies for SaaS cloud computing, such as on-demand software delivery and the Application Service Provider (ASP) model, open up interesting research problems related to service performance modeling and evaluation [5], with the aim of achieving proper resource allocation, efficiency, and scalability.

SaaS applications and services are widely used today and include email, sales management, Customer Relationship Management (CRM), online gaming, online tax services, financial management, human resource management, billing, and business collaboration [3]. In SaaS, user applications are delivered as services to online clients, with a middleware layer to support those services and guarantee a QoS according to an SLA agreement. In this type of cloud service, cloud consumers deploy their applications in the cloud environment, which application users can access over the network from various clients (e.g., web browser, PDA, etc.) [6]. Cloud consumers do not have control over the cloud infrastructure, which often employs a multi-tenancy architecture; namely, different cloud consumer applications are organized in a single logical environment to achieve economies of scale and optimization in terms of speed, security, availability, disaster recovery, and maintenance [7]. In a typical cloud data center (CDC) environment, a single VM can support multiple SaaS client requests. Additionally, SaaS applications can be planned and executed with minimal effort, creating one of the shortest time-to-value intervals possible for a major IT investment.

Virtualization is a key enabling technology for cloud computing. Virtualization technologies have continuously evolved to provide different capabilities, such as the execution of multiple applications in parallel and the movement of running applications to other hosts [8]. Virtualization also provides the basic building blocks of cloud computing and enhances agility and flexibility, enabling multiple VMs to share underlying PMs. Computational capacity is almost always captured by a single number per machine; in reality, both PMs and VMs may have multiple CPU cores [9]. When a VM is mapped to a PM, each of the VM processor (vCPU) cores must also be mapped to one of the PM processor (pCPU) cores. The vCPUs are the CPUs visible to the guest Operating System (OS); a vCPU does not necessarily correspond to a pCPU or to a pCPU core. Typically, running multiple VMs on the same multi-core PM, which is called VM consolidation, can improve resource utilization. As multi-core PMs become pervasive from embedded to high-end computing infrastructures, it becomes increasingly important to exploit performance opportunities and improve the efficiency of mapping virtualized multi-core VMs onto multi-core PMs [10].

Some SaaS applications are small and can be hosted on a single-core or multi-core VM. For large SaaS applications (engineering applications, healthcare applications, social networks, etc.), we need to allocate several VMs or a VM with multiple processor cores. If several VMs are used to host such applications, we must account for functional dependencies between services deployed to separate VMs during application scaling; these dependencies can increase, for instance, the response time of SaaS applications. Moreover, when VM resources (CPU, memory, etc.) become saturated and demand keeps increasing, the QoS of SaaS requests, including response time and availability, deteriorates, which may result in end users' dissatisfaction and financial penalties due to SLA violations. Hence, in this paper we allocate multi-core VMs from multi-core PMs so that SLA requirements are satisfied for different SaaS applications. In this way, each VM has a multi-core processor and can serve multiple SaaS requests at the same time (depending on the number of cores of its processor), minimizing the dependency time between VMs. Also, if each application is deployed in a separate VM instance, it can be scaled up or down by adding or removing VM processor cores.

Multi-core systems can be beneficial to today's high-performance PMs, which commonly host VMs on multi-core architectures, because VMs hosting multiple applications often have multiple resource needs. The commercial success of any CDC platform depends on its ability to deliver guaranteed QoS, so evaluating performance by measurement, simulation, or modeling is of paramount importance. Application performance metrics measure client satisfaction by capturing some QoS level of performance, which may include the mean response time, mean throughput, the number of SLA violations, the mean waiting time, and the number of accepted requests. In [11], we presented a brief and preliminary work on a simple analytical model to analyze the performance of a CDC and to estimate how many single-core VMs are needed to achieve a target QoS metric. The simple model presented in [11] does not capture the vast complexities related to cloud resources or cloud service models, specifically multi-core VMs hosting multiple SaaS applications. In this paper, we present a detailed analytical model to aid in studying and analyzing the performance of multi-core VMs hosting multiple cloud SaaS applications. Our primary contributions in this paper can be summarized as follows:

- A detailed queuing analytical model is presented to capture the dynamics and behavior of SaaS cloud services. The analytical model is composed of two concatenated subsystems: the scheduler and the VM queuing models.
- Mathematical equations for key performance measures are derived from the analytical model.
- Numerical examples are given to show how our analytical model can be used to predict the performance of SaaS services and to determine the required number of VM cores under variable workload conditions.
- A simulation model using the Java Modeling Tools (JMT) simulator is presented to cross-check and validate the accuracy of our analytical model.

The rest of the paper is organized as follows: Section 2 summarizes the related work. The proposed CDC model is presented in Section 3. Section 4 presents the analytical model. Section 5 presents numerical and simulation results. Finally, Section 6 includes concluding remarks and future work.

2. Related Work

A number of papers have been published on the subject of performance analysis of cloud computing data centers. In this section, we present published work related to performance evaluation of cloud environments.

The authors in [12] proposed admission control and scheduling algorithms for resource allocation that allow SaaS providers to minimize the total cost and SLA violations. The proposed algorithms consider QoS parameters such as response time and service initiation time to satisfy clients while minimizing the use of hardware resources. The experimental results show that the proposed algorithms offer significant improvement (cost savings of up to 40%) and cover multiple types of QoS measures. Espadas et al. [13] proposed a resource allocation approach to deploy and maintain SaaS applications over cloud computing platforms and create a cost-effective, scalable environment. They used a tenant-based isolation model, which encapsulates the execution of each tenant; each tenant may have different QoS requirements. The obtained results show the method to be efficient in reducing response time and system cost. Genez et al. [14] used two Integer Linear Program (ILP) formulations to schedule cloud resources so as to execute user tasks with two SLA levels, from both the cloud user and the provider perspectives. The ILP is formulated to schedule SaaS clients' workflows onto multiple IaaS providers. The simulation results show that the optimal run of the proposed ILP is able to find low-cost solutions for short deadlines. Keqin et al. [15] formulated and solved three optimization problems related to optimal virtual server configuration in a cloud computing environment, namely, optimal multi-core server processor partitioning, optimal multi-core server processor partitioning with a power constraint, and optimal power allocation. The system performance measures considered are the mean task response time and the mean power consumption. The authors presented two models for multi-core CPU and power consumption, namely the idle-speed model and the constant-speed model, and treated the multi-core server processor as a group of queuing systems with multiple servers. The obtained results show that the idle-speed and constant-speed models can be used to handle the optimization of power consumption and resource provisioning. The authors in [16] presented an analytical model for evaluating the performance of heterogeneous data centers. Based on the proposed model, several performance measures are analyzed, such as the mean response time, the mean waiting time, the blocking (loss) probability, and the probability of immediate execution. They also conducted simulation experiments to confirm the validity of the proposed queuing model. The obtained results indicate that the proposed model is effective in accurately estimating the performance of a heterogeneous data center. Guo et al. [17] presented a dynamic performance optimization model to optimize the performance of on-demand services in a cloud center using an M/M/m queuing system. Using the queuing model, they proposed function, strategy, and synthesis optimization modes. They also compared the simulation results with classical optimization methods (shortest service time first, and first-in first-out). The simulation results show that the proposed model allows a shorter waiting time and queue length and lets more clients obtain service as the number of servers increases. Akingbesote et al. [18] extended existing and widely adopted theories to a non-preemptive model by using queuing theory and a simulation model in the context of healthcare services. They evaluated the waiting-time performance of mobile healthcare device requests using the non-preemptive priority and the non-priority service disciplines. The results reveal that the unconditional average waiting time remains the same, with a reduction in waiting time under the non-preemptive priority model.

Meyer et al. [19] modeled and studied the performance of software routers to show the linear scaling of packet processing performance with multi-core CPUs. Based on that, they showed that the maximum packet processing throughput scales nearly linearly with the number of CPU cores. They calibrated and validated the proposed model in a case study, which showed that the model is able to predict the performance behavior of the tested software router in a realistic manner, even when parallel processing with multi-core processors is applied. Stavrinides et al. [20] investigated the performance of strategies for scheduling complex workloads in SaaS cloud applications in the presence of periodic soft real-time single-task jobs. They examined the impact of gang service time variability on the performance of the scheduling algorithms by considering service demands that follow a hyper-exponential process. Simulation results show that the relative performance of the employed scheduling strategies depends on the type of workload. Salah et al. [21] provide a queuing model and closed-form solutions for estimating cloud application resources and propose a very fast solution for estimating the minimum number of VMs required to achieve performance objectives. The infrastructure is modeled as a finite M/G/1/K queuing system. The proposed model is validated with experimental measurements for cloud servers reported in the literature, and a discrete event simulation was used to verify the correctness of the proposed analytical model. Truyen et al. [22] outlined a container-based architecture for multi-tenant SaaS applications. They evaluated the technical strengths, weaknesses, opportunities, and threats that should be taken into account by a SaaS provider when considering the adoption of such a container-based architecture.


In contrast with previous works reported in the literature, our proposed model is different in two main ways: (1) we optimize cloud resource allocation and workload scheduling to minimize the overall resource cost (by allocating the smallest number of multi-core VMs) while at the same time meeting the required performance parameters such as response time; (2) the model can also be used for scalability solutions and dynamic scaling, in which resources are allocated as the incoming workload changes. If the aggregated workload goes up, more multi-core VMs have to be allocated to meet such an increase; conversely, if the workload goes down, the number of provisioned multi-core VMs can be decreased. In addition, none of these models has focused on determining the number of multi-core VM instances required to meet SLA performance parameters for cloud SaaS services. Also, some of these models ignore the role of the load balancer, which plays an important role in dispatching, monitoring, and tracking the availability of compute instances at the CDC. Our proposed model is compared with existing models from the literature in Table 1.

| Related works | Modeling approach | Multi-core technology | SaaS applications | Simulation validation | QoS parameters | Load balancer |
| Wu et al. [12] | Analytical | No | Yes | No | Response time, service initiation time, and cost | No |
| Espadas et al. [13] | Analytical | No | Yes | No | Response time and system cost | No |
| Genez et al. [14] | Analytical | No | Yes | Yes | User deadlines and cost | No |
| Keqin et al. [15] | Analytical | Yes | No | No | CPU utilization and power consumption | No |
| Bai et al. [16] | Analytical | No | No | Yes | Response time, waiting time, loss probability, and probability of immediate execution | Yes |
| Guo et al. [17] | Analytical | No | No | Yes | Waiting time and queue length | No |
| Akingbesote et al. [18] | Analytical | No | No | Yes | Waiting time | No |
| Meyer et al. [19] | Analytical | Yes | No | Yes | Throughput | No |
| Stavrinides et al. [20] | Analytical | No | Yes | Yes | Response time | Yes |
| Salah et al. [21] | Analytical | No | Yes | Yes | Response time, loss probability, CPU utilization, and bandwidth | Yes |
| Our model | Analytical | Yes | Yes | Yes | Response time, waiting time, probability of immediate execution, throughput, CPU utilization, number of requests in the system, and system cost | Yes |

Table 1: Summary of related work


3. Cloud Data Center Model

3.1. Cloud Data Center Architecture

In our system model, depicted in Figure 1, we consider a large-size data center with a collection of N homogeneous PMs, where VMs run on top of the PMs according to the VM requests. Indeed, large data centers of Google, Microsoft, Yahoo, Amazon, etc., contain tens of thousands of PMs [23]. Each VM is allocated to one PM, whereas a PM can be allocated K VMs through a hypervisor. Each PM is characterized by its Central Processing Unit (CPU), RAM size, hard-drive storage, and network bandwidth. End-clients have applications running on VMs that require resources from the PMs, defined in terms of CPU capacity, amount of RAM, and network bandwidth. The requests from the various clients are stored in a buffer connected to the scheduling queue for allocation of resources in the cloud, and the queued requests are directed to one of the local VM queues. When final clients need a new VM with the required specifications to run their applications, an SLA is established between the final clients and the cloud service provider to specify the QoS that will be delivered. The cloud service provider pays a penalty to the final clients if a customer reports a violation of the agreed SLA parameters. In summary, final clients send requests to the scheduler. The scheduler has a load-balancing functionality: it receives all the incoming requests from clients and distributes them evenly among the VMs in the CDC. Final clients are represented in the model by the generated SaaS requests, whereas the scheduling queue and the local queues are the processing stations for these requests. Each request is assigned to a unique VM, and each SaaS application is hosted in a different VM.

Figure 1. Cloud Data Center system model


3.2. Cloud Data Center Queuing Model

SaaS clients use the software application through a browser from anywhere; thus the requests received from clients to use SaaS applications can arrive arbitrarily over time. To be able to provide an acceptable QoS, the SaaS provider must be prepared to host multiple requests at the same time, and its computational capacity should be sufficient to deal with the multiple client applications. Practically, clients requiring a SaaS service submit their requests to be executed and finished within a given time, as specified in the SLA agreement. The SaaS provider offers clients software on demand. Therefore, our proposed model can be applied to describe and capture the behavior of cloud cluster jobs, MapReduce, cloud gaming, social networking, the Google Apps suite, and elastic cloud applications whose arrivals can be considered random. Consider, for example, the Gmail SaaS service: the client request is submitted through a form and runs in software on Google servers hosted in VMs. The client can access this software to run the request provided that Internet access is available. The clients do not own the application, which is solely Google's property; they just exploit the services provided by Google.

To deal with the above complexities related to hosting multiple applications in clouds, cloud providers typically use a queuing system with multi-core processors. To depict such dynamics, Figure 2 shows a queuing network model of a homogeneous CDC with a large number of multi-core VMs. Two concatenated queuing models (the scheduling queuing model and the VM queuing model) make up the queuing model that characterizes the CDC and the service processes involved in cloud computing. End-clients in the model are represented by the generated SaaS requests, whereas the scheduling queue and the local queues are the processing stations for these requests. All client requests are received by the scheduling queue, which organizes the input requests into a number of local input queues allocated to VM instances. The scheduling queue is modeled as an M/M/1/C queue with finite capacity C. On the other hand, each VM instance is modeled as an M/M/n queue. Client requests are submitted to the scheduling queue and then processed on a First-In First-Out (FIFO) basis. Request arrivals follow a Poisson process with arrival rate \lambda^{(s)}; the inter-arrival times are independent and identically distributed exponential random variables with mean 1/\lambda^{(s)}. Queued requests are distributed evenly to the different VMs, and the scheduling rate depends on the scheduling queue server capacity. We assume that the service time of the scheduling queue server is exponentially distributed with mean service time 1/\mu^{(s)}. The requests are distributed evenly by the scheduling queue server to each VM with the same probability 1/(K \cdot N), where K and N are the number of VMs in each PM and the number of PMs in the CDC, respectively.
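As a minimal illustration of the even-split routing just described, the following Python sketch (the function name and the example values are ours, not the paper's) computes the arrival rate offered to a single VM; scheduler losses, derived in Section 4.1, are neglected here.

```python
# Sketch of the even-split routing described above; the function name and the
# example values are illustrative assumptions, and scheduler losses are ignored.

def per_vm_arrival_rate(lambda_s: float, K: int, N: int) -> float:
    """Arrival rate seen by one VM when each request is routed to a VM
    with probability 1/(K*N)."""
    return lambda_s / (K * N)

lambda_s = 5000.0   # aggregate SaaS request arrival rate (req/s)
K, N = 5, 5         # VMs per PM, PMs in the data center
print(f"per-VM arrival rate: {per_vm_arrival_rate(lambda_s, K, N):.1f} req/s")
```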


Figure 2. Queuing model of cloud data centers

4. Analytical Modeling


4.1. Scheduler queuing model

To date, request scheduling is one of the challenging issues in cloud computing and plays a key role in modeling and allocating resources that efficiently and satisfactorily fulfill cloud users' requests [24]. In a CDC, resources are composed of VMs, which are deployed on PMs. It is advantageous for the cloud provider to accept and fulfill as many incoming requests as possible in order to improve its resource utilization and maximize the return on investment [25]. The number of accepted requests is limited by the size of the PMs and also by the terms of the SLA. However, the random nature of request arrivals makes the decision whether to admit or reject a service request a non-trivial and challenging task. Since the management of computing resources and the scheduling of requests across the CDC are done by a unified resource management platform, and the system has finite capacity [16], [26], [27], the Scheduling Queue (SQ) is modeled as an M/M/1/C queue [28]. The maximum number of requests in the system is C, which implies a maximum queue length of C-1. An arriving request enters the queue if it finds fewer than C requests in the system and is lost otherwise. Using the balance equations and the normalization condition

\[
\begin{cases}
\lambda^{(s)} \pi_0^{(s)} = \mu^{(s)} \pi_1^{(s)} \\
(\lambda^{(s)} + \mu^{(s)})\, \pi_1^{(s)} = \lambda^{(s)} \pi_0^{(s)} + \mu^{(s)} \pi_2^{(s)} \\
(\lambda^{(s)} + \mu^{(s)})\, \pi_k^{(s)} = \lambda^{(s)} \pi_{k-1}^{(s)} + \mu^{(s)} \pi_{k+1}^{(s)}, \quad k = 2, \ldots, C-1 \\
\mu^{(s)} \pi_C^{(s)} = \lambda^{(s)} \pi_{C-1}^{(s)} \\
\sum_{k=0}^{C} \pi_k^{(s)} = 1
\end{cases} \tag{1}
\]

we obtain the steady-state probability of k requests in the SQ, \pi_k^{(s)}, as follows:

\[
\pi_0^{(s)} = \frac{1-\rho^{(s)}}{1-(\rho^{(s)})^{C+1}}, \qquad
\pi_k^{(s)} = \frac{1-\rho^{(s)}}{1-(\rho^{(s)})^{C+1}}\,(\rho^{(s)})^{k}, \quad k = 1, 2, \ldots, C \tag{2}
\]

where \rho^{(s)} = \lambda^{(s)}/\mu^{(s)} denotes the offered load at the SQ server.

Now we can derive important performance measures. First, the blocking probability due to lack of space in the SQ is

\[
P_{loss}^{(s)} = \pi_C^{(s)} = \frac{\bigl(1-\rho^{(s)}\bigr)\,(\rho^{(s)})^{C}}{1-(\rho^{(s)})^{C+1}} \tag{3}
\]

The mean throughput X^{(s)} is given by

\[
X^{(s)} = \lambda^{(s)}\bigl(1 - P_{loss}^{(s)}\bigr) = \lambda^{(s)}\,\frac{1-(\rho^{(s)})^{C}}{1-(\rho^{(s)})^{C+1}} \tag{4}
\]

The SQ server utilization is

\[
U^{(s)} = \frac{X^{(s)}}{\mu^{(s)}} = \rho^{(s)}\,\frac{1-(\rho^{(s)})^{C}}{1-(\rho^{(s)})^{C+1}} \tag{5}
\]

The mean number of requests in the SQ (including the mean number of waiting requests in the queue and the mean number of scheduled requests) is

\[
E^{(s)} = \sum_{k=1}^{C} k\,\pi_k^{(s)} = \frac{\rho^{(s)}\bigl[\,1-(C+1)(\rho^{(s)})^{C} + C(\rho^{(s)})^{C+1}\bigr]}{\bigl(1-\rho^{(s)}\bigr)\bigl(1-(\rho^{(s)})^{C+1}\bigr)}, \quad \rho^{(s)} \neq 1 \tag{6}
\]

The mean number of scheduled requests is

\[
E_{sch}^{(s)} = 1 - \pi_0^{(s)} = \rho^{(s)}\,\frac{1-(\rho^{(s)})^{C}}{1-(\rho^{(s)})^{C+1}} \tag{7}
\]

Hence, the mean number of waiting requests in the SQ can be determined as

\[
E_{w}^{(s)} = E^{(s)} - E_{sch}^{(s)} \tag{8}
\]

Using the effective arrival rate and Little's law [29], we can determine the remaining performance measures of the SQ. The mean waiting time in the SQ, T_w^{(s)}, is

\[
T_{w}^{(s)} = \frac{E_{w}^{(s)}}{\lambda^{(s)}\bigl(1 - P_{loss}^{(s)}\bigr)} \tag{9}
\]

The mean time spent at the SQ (waiting plus scheduling service), T_q^{(s)}, can be written as

\[
T_{q}^{(s)} = T_{w}^{(s)} + \frac{1}{\mu^{(s)}} \tag{10}
\]

Finally, the mean response time of requests at the SQ is

\[
T^{(s)} = \frac{E^{(s)}}{X^{(s)}} = \frac{1}{\mu^{(s)} - \lambda^{(s)}} - \frac{C\,(\rho^{(s)})^{C}}{\mu^{(s)}\bigl(1-(\rho^{(s)})^{C}\bigr)} \tag{11}
\]
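For illustration, the following Python sketch evaluates the M/M/1/C scheduling-queue formulas above; the function and variable names are ours, not the paper's, and the example values are assumed from Table 2.

```python
# Sketch of the M/M/1/C scheduling-queue formulas; names are ours, not the paper's.
from dataclasses import dataclass

@dataclass
class SchedulerMetrics:
    p_loss: float       # blocking probability, Eq. (3)
    throughput: float   # mean throughput X^(s), Eq. (4)
    utilization: float  # server utilization U^(s), Eq. (5)
    mean_number: float  # mean number of requests E^(s), Eq. (6)
    resp_time: float    # mean response time T^(s), Eq. (11)

def mm1c_scheduler(lam: float, mu: float, C: int) -> SchedulerMetrics:
    rho = lam / mu                                   # offered load rho^(s)
    if abs(rho - 1.0) < 1e-12:
        raise ValueError("Eq. (6) requires rho != 1")
    denom = 1.0 - rho ** (C + 1)
    p_loss = (1.0 - rho) * rho ** C / denom          # Eq. (3)
    x = lam * (1.0 - p_loss)                         # Eq. (4)
    e = rho * (1 - (C + 1) * rho ** C + C * rho ** (C + 1)) / ((1 - rho) * denom)  # Eq. (6)
    return SchedulerMetrics(p_loss, x, x / mu, e, e / x)   # Eqs. (5) and (11)

# Example with values assumed from Table 2: mu^(s) = 1/0.0001 = 10000 req/s, C = 300.
print(mm1c_scheduler(lam=5000.0, mu=10000.0, C=300))
```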

4.2. Multi-core Virtual machine model

A multi-core VM is defined as a VM having at least one vCPU with several processing cores. The core is the part of the vCPU that performs the execution of the requests. With a single-core vCPU, only one request can be processed at a time, while a multi-core vCPU can execute several requests. This is interesting because it allows VMs with only one vCPU to run requests in parallel.

We assume that each VM works as an execution server, which is a multi-core vCPU that has n identical cores, where n is a multiple of 2. Since each multi-core vCPU can process multiple requests in parallel, it is modeled as an M/M/n queuing system [30, 31], as shown in Figure 3. Arriving requests form a single waiting buffer in the order of their arrival, and each request requires exactly one processor core. The request execution times on the VM cores are independent and identically distributed exponential random variables with mean 1/\mu. As client requests come from all over the world and cloud computing can provide elastic service, the queue capacity is not limited. The state of the M/M/n system is completely characterized by the number of requests in the VM. If all the cores of the VM are busy when a new request arrives, the arriving request joins the queue and waits its turn to receive service from the first idle core. Figure 4 exhibits the state transition diagram for a single VM. When the state is i (0 \le i \le n), i cores are busy and the other n - i cores are idle. When i > n, all n cores are busy and i - n requests are waiting.


Figure 3. Multi-core VM Queuing Model

Figure 4. Continuous Time Markov Chain for new request in a single VM

Let \pi_i^{(j)} denote the equilibrium probability that there are i requests in the execution queuing model of the j-th VM (j = 1, \ldots, K \cdot N). We suppose that the utilization of any individual core, \rho = \lambda/(n\mu), is smaller than one. The balance equations are the following:

\[
\lambda\,\pi_{i-1}^{(j)} + (i+1)\mu\,\pi_{i+1}^{(j)} = (\lambda + i\mu)\,\pi_{i}^{(j)}, \quad 1 \le i < n \tag{12}
\]

\[
\lambda\,\pi_{i-1}^{(j)} + n\mu\,\pi_{i+1}^{(j)} = (\lambda + n\mu)\,\pi_{i}^{(j)}, \quad i \ge n \tag{13}
\]

From equations (12) and (13), we can obtain the following steady-state probabilities:

\[
\pi_i^{(j)} =
\begin{cases}
\pi_0^{(j)}\,\dfrac{(\lambda/\mu)^{i}}{i!}, & i \le n \\[2ex]
\pi_0^{(j)}\,\dfrac{(\lambda/\mu)^{i}}{n!\,n^{\,i-n}}, & i \ge n
\end{cases} \tag{14}
\]

from which, together with the normalization equation, we obtain

\[
\sum_{i=0}^{\infty}\pi_i^{(j)} = 1 \;\;\Longrightarrow\;\;
\pi_0^{(j)} = \left[\sum_{i=0}^{n-1}\frac{(\lambda/\mu)^{i}}{i!} + \frac{(\lambda/\mu)^{n}}{(n-1)!\,\bigl(n-\lambda/\mu\bigr)}\right]^{-1} \tag{15}
\]

Now we can derive important performance parameters. First, the average number of requests in the j-th VM is given by

\[
E^{(j)}[N] = \sum_{i=0}^{\infty} i\,\pi_i^{(j)} = n\rho + \rho\,\frac{(n\rho)^{n}\,\pi_0^{(j)}}{n!\,(1-\rho)^{2}} \tag{16}
\]

The average number of waiting requests in the j-th VM is

\[
E^{(j)}[W] = \sum_{i=n+1}^{\infty} (i-n)\,\pi_i^{(j)} = \pi_0^{(j)}\,\frac{(n\rho)^{n}\,\rho}{n!\,(1-\rho)^{2}} \tag{17}
\]

Let the random variable B denote the number of busy cores; then

\[
P(B = i) =
\begin{cases}
P(N = i) = \pi_i^{(j)}, & 0 \le i \le n-1 \\[1ex]
P(N \ge n) = \displaystyle\sum_{i \ge n}\pi_i^{(j)} = \dfrac{\pi_n^{(j)}}{1-\rho}, & i = n
\end{cases} \tag{18}
\]

The average number of busy cores is then

\[
E^{(j)}[B] = \sum_{i=0}^{n-1} i\,\pi_i^{(j)} + \frac{n\,\pi_n^{(j)}}{1-\rho} \tag{19}
\]

The probability of queuing (i.e., the probability that a newly arrived request must wait because all processor cores are busy) is

\[
P_{queue}^{(j)} = \sum_{i \ge n}\pi_i^{(j)} = \frac{\pi_n^{(j)}}{1-\rho} = \frac{(n\rho)^{n}\,\pi_0^{(j)}}{n!\,(1-\rho)} \tag{20}
\]

The probability that a newly arrived request forwarded from the SQ to the j-th VM will be executed immediately is

\[
P_{exe}^{(j)} = 1 - \sum_{i \ge n}\pi_i^{(j)} = 1 - P_{queue}^{(j)} \tag{21}
\]

Lastly, the mean response time and mean waiting time in the j-th VM are computed using Little's law as follows:

\[
T^{(j)}[R] = \frac{E^{(j)}[N]}{\lambda}, \qquad T^{(j)}[W] = \frac{E^{(j)}[W]}{\lambda} \tag{22}
\]
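The per-VM M/M/n formulas above can be evaluated with the following Python sketch; the helper name and the example arrival rate are illustrative assumptions rather than values taken from the paper.

```python
# Sketch of the per-VM M/M/n formulas (Eqs. (15)-(22)); naming is ours.
from math import factorial

def mmn_vm(lam: float, mu: float, n: int) -> dict:
    """Metrics for one VM with n cores, per-core rate mu, and arrival rate lam."""
    a = lam / mu                  # offered load lambda/mu
    rho = a / n                   # per-core utilization, must be < 1
    if rho >= 1.0:
        raise ValueError("unstable VM: per-core utilization rho >= 1")
    pi0 = 1.0 / (sum(a ** i / factorial(i) for i in range(n))
                 + a ** n / (factorial(n - 1) * (n - a)))        # Eq. (15)
    p_queue = (n * rho) ** n * pi0 / (factorial(n) * (1 - rho))  # Eq. (20)
    e_w = p_queue * rho / (1 - rho)                              # Eq. (17), rewritten
    e_n = e_w + n * rho                                          # Eq. (16)
    return {"pi0": pi0, "P_queue": p_queue, "P_exe": 1.0 - p_queue,   # Eq. (21)
            "E[N]": e_n, "E[W]": e_w,
            "T[R]": e_n / lam, "T[W]": e_w / lam}                # Eq. (22)

# Example (illustrative): one 2-core VM with 1/mu = 0.01 s receiving 160 req/s.
print(mmn_vm(lam=160.0, mu=100.0, n=2))
```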

4.3. Performance Metrics for the VMs

To analyze the performance of the execution model in the CDC, we derive formulas for key performance metrics related to the execution queuing system. First, the mean number of requests and the mean number of waiting requests in the execution queuing model of the data center are

\[
E_e[N] = \sum_{j=1}^{K N} E^{(j)}[N], \qquad E_e[W] = \sum_{j=1}^{K N} E^{(j)}[W] \tag{23}
\]

The mean probability that a newly arrived request queues in the execution queuing model of the data center is

\[
P_{queue} = \frac{1}{K N}\sum_{j=1}^{K N} P_{queue}^{(j)} \tag{24}
\]

The mean probability that a newly arrived request entering the execution queuing model of the data center is executed immediately is

\[
P_{exe} = \frac{1}{K N}\sum_{j=1}^{K N} P_{exe}^{(j)} \tag{25}
\]

The mean response time and mean waiting time of the execution queuing model in the data center are

\[
T_R = \frac{1}{K N}\sum_{j=1}^{K N} T^{(j)}[R], \qquad T_W = \frac{1}{K N}\sum_{j=1}^{K N} T^{(j)}[W] \tag{26}
\]

Since the VMs are homogeneous and receive the same share of the workload, these mean quantities coincide with the per-VM values of Section 4.2.

4.4. Performance Metrics for the Overall System

Based on the analysis of the two concatenated queuing models, we now derive key performance formulas for the entire queuing system. First, the mean response time in the CDC is the sum of the response times of the two queuing models:

\[
T = T_R + T^{(s)} \tag{27}
\]

The mean waiting time for a request is

\[
T_{wait} = T_W + T_q^{(s)} \tag{28}
\]

Since the SQ is a finite-capacity queuing model, the loss probability of the system is determined by the finite capacity C of the SQ and the scheduling rate \mu^{(s)} of the main scheduler server:

\[
P_{loss} = P_{loss}^{(s)} \tag{29}
\]

The probability of immediate execution, i.e., the probability that a request is executed immediately by a VM in the data center after being scheduled by the SQ and sent to the VM, is obtained as follows:

\[
P_{exe}^{sys} = P_{exe} \tag{30}
\]

In order to estimate the system costs, we consider two important factors: (1) the system service cost and (2) the waiting-time cost of requests in the system. The cost analysis may be helpful in establishing the trade-off between the increased cost associated with better service and the decreased request waiting times derived from providing that service.

The expected service cost of the system is expressed as

\[
SC = n_{VM}\,C_{VM} \tag{31}
\]

where n_{VM} is the number of VMs and C_{VM} is the service cost of each VM. The service cost of each VM is

\[
C_{VM} = \sum_{i=1}^{n} C_i \tag{32}
\]

where C_i is the service cost of each processor core. The expected waiting-time cost of the system is calculated as

\[
WC = \lambda\,T_{wait}\,C_w \tag{33}
\]

where \lambda is the request arrival rate, T_{wait} is the average waiting time spent in the system, and C_w is the opportunity cost of waiting time per request. Finally, we obtain the overall system cost as

\[
Cost_{sys} = SC + WC \tag{34}
\]
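The following sketch combines the scheduler and VM metrics into the system-level quantities of Eqs. (27)-(30) and the cost of Eqs. (31)-(34), reusing the mm1c_scheduler() and mmn_vm() helpers sketched earlier. The mapping of the scheduler throughput onto per-VM arrival rates and the per-second basis of the costs are our assumptions, so absolute cost values need not match Figure 12.

```python
# Sketch of the end-to-end metrics (Eqs. (27)-(30)) and cost model (Eqs. (31)-(34)),
# reusing the mm1c_scheduler() and mmn_vm() helpers sketched earlier. The per-VM
# arrival-rate mapping and the per-second cost basis are assumptions of this sketch.

def system_metrics(lam_s, mu_s, C, mu_core, n_cores, K, N,
                   cost_core=0.01, cost_wait=10.0):
    sched = mm1c_scheduler(lam_s, mu_s, C)          # Section 4.1 sketch
    lam_vm = sched.throughput / (K * N)             # accepted load split evenly
    vm = mmn_vm(lam_vm, mu_core, n_cores)           # Section 4.2 sketch

    T = vm["T[R]"] + sched.resp_time                # Eq. (27): mean response time
    T_wait = vm["T[W]"] + sched.resp_time           # Eq. (28), with T_q^(s) = T^(s)
    sc = (K * N) * n_cores * cost_core              # Eqs. (31)-(32): service cost per second
    wc = lam_s * T_wait * cost_wait                 # Eq. (33): waiting-time cost per second
    return {"T": T, "T_wait": T_wait,
            "P_loss": sched.p_loss,                 # Eq. (29)
            "P_exe": vm["P_exe"],                   # Eq. (30)
            "cost": sc + wc}                        # Eq. (34)

# Example with 2, 4, and 8 cores per VM at 3000 req/s (the 2-core configuration
# saturates near 5000 req/s under this mapping, so a lower rate is used here).
for cores in (2, 4, 8):
    print(cores, system_metrics(3000.0, 10000.0, 300, 100.0, cores, K=5, N=5))
```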

5. Simulation and Numerical Results

5.1. Discrete Event Simulator

Several cloud simulators have been developed specifically for the performance analysis of cloud computing environments, with CloudSim [32], CloudAnalyst [33], GreenCloud [34], iCanCloud [35], GridSim [36], MDCSim [37], NetworkCloudSim [38], and EMUSIM [39] being the most prominent ones. None of these available tools is capable of accurately capturing the internal behavior and dynamics of the cloud computing environment considered here. For this reason, we chose the Java Modeling Tools (JMT) to implement and evaluate the proposed model. JMT [40] is a suite of open-source toolkits for performance evaluation and workload characterization of computer and communication systems based on queuing models [41]. JMT includes tools for workload characterization (JWAT), solution of queuing networks with analytical algorithms (JMVA), simulation of general-purpose queuing models (JSIM), bottleneck identification (JABA), and teaching support for Markov chain models underlying queuing systems (JMCH) [42]. This discrete event simulator also supports advanced features including finite capacity regions, bursty processes, fork-join servers, and load-dependent service times [43], in addition to several distributions for the arrival and service processes.

5.2. Simulation Environment

We assume that the simulation environment consists of a typical small-size CDC with five PMs, where each PM can support at most five VMs. The average request arrival rate to the system is varied from 1000 to 5000 requests per second. The service times of requests at the SQ are exponentially distributed with an average of 0.0001 seconds, and the maximum number of requests in the SQ is 300. The request execution times on the processor cores of each VM are independent and identically distributed exponential random variables with an average of 0.01 seconds. For clarity, all the numerical parameters used in our simulation are listed in Table 2. All simulations were performed on a PC configured with a 2.40 GHz Intel Core i5 CPU, 4 GB of memory, and a 250 GB disk. The simulation started with a total of six VMs, each having two processing cores, as shown in Figure 5. At the end of each simulation run, the number of VMs in each PM was increased by one while the QoS record of the system was kept.


| Parameter | Description | Value |
| \lambda^{(s)} | Request arrival rate | [1000 to 5000] (req/s) |
| 1/\mu^{(s)} | Mean request SQ service time | 0.0001 (s) |
| 1/\mu | Mean request core service time | 0.01 (s) |
| C | Maximum number of requests in the SQ | 300 |
| N | Number of PMs in the data center | 5 |
| K | Number of VMs in each PM | 5 |

Table 2. Input parameters and their respective values


Figure 5. Simulation model for six VMs


5.3. Results and Discussion

First, we cross-validated the results obtained from our analytical model with those obtained by simulation to assess their accuracy. From the figures presented in this section, and for the most part, the analytical results and those produced by the simulation experiments are in good agreement, which validates our analytical model. The simulation results reported in this section are the average of five runs, and the average is recorded in the plots. The results obtained from simulation are represented by black circles, whereas the curves represented by lines are those of the analysis. We have analyzed the performance curves for multi-core VM instances as a function of the request arrival rate when the number of cores in each VM equals 2, 4, and 8. Each figure presents a performance measure calculated with the analytical model for varying arrival rate values and compares it with the simulation results. The performance curves plotted in the figures are the mean response time, mean waiting time, probability of immediate execution, mean throughput, CPU utilization, and mean number of requests in the system.

Figure 6. Impact of the number of VM cores on mean response time

In Figure 6, we plot the mean response time as a function of the request arrival rate for three different numbers of VM cores; in our simulation we considered eight, four, and two cores. We observe that, for a given arrival rate, the mean response time decreases as the number of VM cores increases. Using eight cores per VM improves the mean response time and enhances the QoS. For example, when there are 4500 requests per second in the system, with an 8-core VM the mean response time is about 10 ms, while with 2 cores it tends to 30 ms. This demonstrates that a multi-core VM gives better performance than a single-core VM, especially for a SaaS cloud provider that must be prepared to serve multiple requests at the same time. Intuitively, for a VM with several processor cores, the mean response time decreases: when the number of cores is increased, the gain in terms of waiting time outweighs the overhead caused by the use of multiple cores.

Figure 7 illustrates the impact of the number of VM cores on mean waiting time when the arrival rate is varied. It is clear that with a higher number of cores the VM can process more requests, which decreases the mean waiting time and makes the service available to users on time. In other words, when the request arrival rate increases and the number of cores is large, the mean waiting time stays low. In the 2-core case, the mean waiting time reaches 0.2 ms at an arrival rate of 5000 requests per second, whereas with 8 cores it does not exceed 0.1 ms.

In Figure 8, we present the impact of the number of VM cores on the probability of immediate execution versus the request arrival rate. We remark that, as the request arrival rate increases, using VMs with more cores keeps the probability of immediate execution high. For 4500 requests per second, with 2 cores the probability reaches 0.885, with 4 cores it is equal to 0.9, and with 8 cores the probability of immediate execution tends to 0.92.

Figure 7. Impact of the number of VM cores on mean waiting time

Figure 9 shows the mean throughput as a function of the request arrival rate for different numbers of VM cores. We observe that VMs with a higher number of cores sustain a higher mean throughput as the request arrival rate increases.

Figure 8. Impact of the number of VM cores on probability of immediate execution


Figure 9. Impact of the number of VM cores on mean throughput

Figure 10. Impact of the number of VM cores on utilization

Figure 10 presents the impact of the number of VM cores on CPU utilization as the request arrival rate varies. With an 8-core VM, the CPU utilization approaches 50%, while with 2 cores the CPU utilization reaches 100% when the system receives 5000 requests per second. Clearly, it can be concluded that to reduce CPU utilization it is highly recommended to launch VMs with more cores.


Figure 11. Impact of the number of VM cores on mean number of requests

The impact of the number of VM cores on the mean number of requests is given in Figure 11. We remark that when the arrival rate does not exceed 3000 requests per second, there is no difference between VMs with 2, 4, or 8 cores in terms of the mean number of requests in the system. As the request arrival rate increases with 2-core VMs, the mean number of requests in the system increases; but with 4 and 8 cores, the mean number of requests remains constant and considerably low.

To compute the total cost of the system, we assume that the service cost of each VM core is $0.01 per second (C_i = $0.01/s) and the waiting-time cost is $10 per second (C_w = $10/s). Suppose we want to find the number of VM cores that minimizes the expected total cost. These parameter values are based on empirical measurements from prior works [44, 45, 46, 47], which use the pricing and costs proposed by AWS EC2 [48, 49] for VM usage per time unit; we used representative values according to these references in our simulations. For the waiting-time cost, we referred to the results obtained in [45].

In Figure 12, we study the impact of the number of VM cores on the overall system cost. The system cost evaluation is performed for request arrival rates varying from 1000 to 5000 requests per second. We remark that when the request arrival rate increases, the system cost increases. Up to an incoming workload of 2500 requests per second, the 2-core configuration shows the best cost compared to the other systems. For workloads ranging from 2500 to 5000 requests/s, the cost becomes much higher in the case of a VM with two cores and reaches $12, whereas with 4 cores the cost is equal to $9 and with an 8-core VM the cost is reduced to $7 at 5000 requests per second. Consequently, it can be concluded that SaaS systems under heavy workload can benefit from allocating VMs with more cores, specifically in terms of minimizing the cost and the waiting time. When the number of cores is increased, the gain in cost in terms of waiting time outweighs the loss caused by the additional cores.
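In principle, analytical curves such as those in Figures 6 and 12 can be regenerated by sweeping the arrival rate for each core count, as in the sketch below, which reuses the hypothetical system_metrics() helper from Section 4.4; the resulting values are illustrative only, since they depend on the assumed parameter mapping and cost basis.

```python
# Sketch: sweeping the arrival rate per core count to regenerate analytical curves
# such as those in Figures 6 and 12, reusing the hypothetical system_metrics()
# helper from Section 4.4. Values are illustrative; the 2-core configuration
# saturates near 5000 req/s under this mapping, so the sweep stops at 4500.
for cores in (2, 4, 8):
    for lam in range(1000, 4501, 500):
        m = system_metrics(float(lam), 10000.0, 300, 100.0, cores, K=5, N=5)
        print(f"cores={cores} lambda={lam:5d} "
              f"T={m['T'] * 1000:6.2f} ms  cost={m['cost']:.2f}")
```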

Figure 12. Impact of the number of VM cores on system cost

6. Conclusion

In this paper, we have presented an analytical queuing model to determine, under any given workload conditions, the minimum number of compute resources required for hosting SaaS cloud applications. We have derived formulas for key performance metrics such as response time, request waiting time, probability of immediate execution, CPU utilization, and throughput. Furthermore, we have shown how our analytical model can be used to estimate the overall system cost. Our analytical model has been cross-validated by simulation using the popular JMT simulator. We have given many numerical examples involving multi-core VMs with 2, 4, and 8 cores. Surprisingly, we found that SaaS systems under heavy workload can benefit from allocating VMs with more cores (as opposed to VMs with a single core or a few cores), specifically in terms of minimizing the cost and the waiting time. As future work, we plan to implement our formulas as an integral part of a comprehensive solution that can dynamically and efficiently scale resources for SaaS applications on publicly available cloud platforms such as Amazon AWS. Also, the proposed model will be extended to cover heterogeneous clouds with heterogeneous VMs, representing the different resources that can be provided to customers.


Acknowledgments

The authors thank the anonymous reviewers for their valuable comments, which have helped us to considerably improve the content, quality, and presentation of this paper.

References

[1] P. Mell and T. Grance, "The NIST definition of cloud computing," 2011.

[2] E. Amazon, "Amazon Elastic Compute Cloud," Retrieved Feb, vol. 10, 2009.
[3] M. Cusumano, "Cloud Computing and SaaS as New Computing Platforms," Communications of the ACM, vol. 53, no. 4, pp. 27–29, 2010.
[4] E. Ciurana, Developing with Google App Engine. Apress, 2009.
[5] Q. Duan, "Cloud service performance evaluation: Status, challenges, and opportunities – a survey from the system modeling perspective," Digital Communications and Networks, 2016.
[6] J. A. Gonzalez-Martinez, M. L. Bote-Lorenzo, E. Gomez-Sanchez, and R. Cano-Parra, "Cloud computing and education: A state-of-the-art survey," Computers & Education, vol. 80, pp. 132–151, 2015.
[7] T. Dillon, C. Wu, and E. Chang, "Cloud computing: issues and challenges," in 2010 24th IEEE International Conference on Advanced Information Networking and Applications. IEEE, 2010, pp. 27–33.
[8] I. Pietri and R. Sakellariou, "Mapping virtual machines onto physical machines in cloud computing: A survey," ACM Computing Surveys (CSUR), vol. 49, no. 3, pp. 49:1–49:30, 2016.
[9] Z. Mann, "Multicore-aware virtual machine placement in cloud data centers," IEEE Transactions on Computers, vol. 65, no. 11, pp. 3357–3369, 2016.
[10] Y. Cheng, W. Chen, Z. Wang, and Y. Xiang, "Precise contention-aware performance prediction on virtualized multicore system," Journal of Systems Architecture, pp. 1–9, 2016.
[11] S. El Kafhali and K. Salah, "Stochastic modelling and analysis of cloud computing data center," in Proceedings of the 20th ICIN Conference on Innovations in Clouds, Internet and Networks, Paris, France, March 7–9, 2017, pp. 122–126.
[12] L. Wu, S. K. Garg, and R. Buyya, "SLA-based admission control for a software-as-a-service provider in cloud computing environments," Journal of Computer and System Sciences, vol. 78, no. 5, pp. 1280–1299, 2012.
[13] J. Espadas, A. Molina, G. Jimenez, M. Molina, R. Ramirez, and D. Concha, "A tenant-based resource allocation model for scaling software-as-a-service applications over cloud computing infrastructures," Future Generation Computer Systems, vol. 29, no. 1, pp. 273–286, 2013.
[14] T. A. Genez, L. F. Bittencourt, and E. R. Madeira, "Workflow scheduling for SaaS/PaaS cloud providers considering two SLA levels," in Network Operations and Management Symposium (NOMS), 2012 IEEE. IEEE, 2012, pp. 906–912.
[15] K. Li, "Optimal partitioning of a multicore server processor," The Journal of Supercomputing, vol. 71, no. 10, pp. 3744–3769, 2015.
[16] W.-H. Bai, J.-Q. Xi, J.-X. Zhu, and S.-W. Huang, "Performance analysis of heterogeneous data centers in cloud computing using a complex queuing model," Mathematical Problems in Engineering, vol. 2015, 2015.
[17] L. Guo, T. Yan, S. Zhao, and C. Jiang, "Dynamic performance optimization for cloud computing using M/M/m queueing system," Journal of Applied Mathematics, vol. 2014, 2014.
[18] A. O. Akingbesote, M. O. Adigun, S. Xulu, and E. Jembere, "Performance modeling of proposed GUISET middleware for mobile healthcare services in e-marketplaces," Journal of Applied Mathematics, vol. 2014, 2014.
[19] T. Meyer, F. Wohlfart, D. Raumer, B. E. Wolfinger, and G. Carle, "Validated model-based performance prediction of multi-core software routers," PIK – Praxis der Informationsverarbeitung und Kommunikation, vol. 37, no. 2, pp. 93–107, 2014.
[20] G. L. Stavrinides and H. D. Karatza, "Scheduling different types of applications in a SaaS cloud," in Proceedings of the 6th International Symposium on Business Modeling and Software Design (BMSD16), 2016, pp. 144–151.
[21] K. Salah, K. Elbadawi, and R. Boutaba, "An analytical model for estimating cloud resources of elastic services," Journal of Network and Systems Management, vol. 24, no. 2, pp. 285–308, April 2016.
[22] E. Truyen, D. Van Landuyt, V. Reniers, A. Rafique, B. Lagaisse, and W. Joosen, "Towards a container-based architecture for multi-tenant SaaS applications," in Proceedings of the 15th International Workshop on Adaptive and Reflective Middleware. ACM, 2016, p. 6.
[23] R. H. Katz, "Tech titans building boom," IEEE Spectrum, vol. 2, no. 46, pp. 40–54, 2009.
[24] L. Li, "An optimistic differentiated service job scheduling system for cloud computing service users and providers," in Third International Conference on Multimedia and Ubiquitous Engineering (MUE'09). IEEE, 2009, pp. 295–299.
[25] H. Khojasteh and J. Misic, "Task admission control policy in cloud server pools based on task arrival dynamics," Wireless Communications and Mobile Computing, vol. 16, no. 11, pp. 1363–1376, 2016.
[26] B. Hindman, A. Konwinski, M. Zaharia, A. Ghodsi, A. D. Joseph, R. H. Katz, S. Shenker, and I. Stoica, "Mesos: A platform for fine-grained resource sharing in the data center," in NSDI, vol. 11, 2011, pp. 22–22.
[27] V. K. Vavilapalli, A. C. Murthy, C. Douglas, S. Agarwal, M. Konar, R. Evans, T. Graves, J. Lowe, H. Shah, S. Seth et al., "Apache Hadoop YARN: Yet another resource negotiator," in Proceedings of the 4th Annual Symposium on Cloud Computing. ACM, 2013, pp. 5–21.
[28] K. Salah and S. El Kafhali, "Performance modeling and analysis of hypoexponential network servers," Telecommunication Systems, 2017, doi:10.1007/s11235-016-0262-3.
[29] R. Nelson, Probability, Stochastic Processes, and Queueing Theory: The Mathematics of Computer Performance Modeling. Springer Science & Business Media, 2013.
[30] U. Narayan Bhat, An Introduction to Queueing Theory: Modeling and Analysis in Applications. Birkhäuser, Springer, New York, 2015.
[31] G. R. Dattatreya, Performance Analysis of Queuing and Computer Networks. CRC Press, Boca Raton, 2008.
[32] R. N. Calheiros, R. Ranjan, A. Beloglazov, C. A. De Rose, and R. Buyya, "CloudSim: a toolkit for modeling and simulation of cloud computing environments and evaluation of resource provisioning algorithms," Software: Practice and Experience, vol. 41, no. 1, pp. 23–50, 2011.
[33] B. Wickremasinghe, R. N. Calheiros, and R. Buyya, "CloudAnalyst: A CloudSim-based visual modeller for analysing cloud computing environments and applications," in 2010 24th IEEE International Conference on Advanced Information Networking and Applications. IEEE, 2010, pp. 446–452.
[34] L. Liu, H. Wang, X. Liu, X. Jin, W. B. He, Q. B. Wang, and Y. Chen, "GreenCloud: a new architecture for green data center," in Proceedings of the 6th International Conference Industry Session on Autonomic Computing and Communications. ACM, 2009, pp. 29–38.
[35] A. Nunez, J. L. Vazquez-Poletti, A. C. Caminero, G. G. Castane, J. Carretero, and I. M. Llorente, "iCanCloud: A flexible and scalable cloud infrastructure simulator," Journal of Grid Computing, vol. 10, no. 1, pp. 185–209, 2012.
[36] R. Buyya and M. Murshed, "GridSim: A toolkit for the modeling and simulation of distributed resource management and scheduling for grid computing," Concurrency and Computation: Practice and Experience, vol. 14, no. 13-15, pp. 1175–1220, 2002.
[37] S.-H. Lim, B. Sharma, G. Nam, E. K. Kim, and C. R. Das, "MDCSim: A multi-tier data center simulation platform," in 2009 IEEE International Conference on Cluster Computing and Workshops. IEEE, 2009, pp. 1–9.
[38] S. K. Garg and R. Buyya, "NetworkCloudSim: Modelling parallel applications in cloud simulations," in 2011 Fourth IEEE International Conference on Utility and Cloud Computing (UCC). IEEE, 2011, pp. 105–113.
[39] R. N. Calheiros, M. A. Netto, C. A. De Rose, and R. Buyya, "EMUSIM: an integrated emulation and simulation environment for modeling, evaluation, and validation of performance of cloud computing applications," Software: Practice and Experience, vol. 43, no. 5, pp. 595–612, 2013.
[40] M. Bertoli, G. Casale, and G. Serazzi, "JMT: performance engineering tools for system modeling," ACM SIGMETRICS Performance Evaluation Review, vol. 36, no. 4, pp. 10–15, 2009.
[41] G. Bolch, S. Greiner, H. de Meer, and K. S. Trivedi, Queueing Networks and Markov Chains: Modeling and Performance Evaluation with Computer Science Applications. John Wiley & Sons, 2006.
[42] U. N. Bhat, An Introduction to Queueing Theory: Modeling and Analysis in Applications. Birkhäuser, 2015.
[43] G. Fishman, Discrete-Event Simulation: Modeling, Programming, and Analysis. Springer Science & Business Media, 2013.
[44] Y.-J. Chiang, Y.-C. Ouyang, and C.-H. Hsu, "Performance and cost effectiveness analyses for cloud services based on rejected and impatient users," IEEE Transactions on Services Computing, vol. 9, no. 3, pp. 446–455, 2016.
[45] S. Genaud and J. Gossa, "Cost-wait trade-offs in client-side resource provisioning with elastic clouds," in Cloud Computing (CLOUD), 2011 IEEE International Conference on. IEEE, 2011, pp. 1–8.
[46] Y.-J. Chiang, Y.-C. Ouyang, and C.-H. R. Hsu, "An efficient green control algorithm in cloud computing for cost optimization," IEEE Transactions on Cloud Computing, vol. 3, no. 2, pp. 145–155, 2015.
[47] I. A. Moschakis and H. D. Karatza, "Evaluation of gang scheduling performance and cost in a cloud computing system," The Journal of Supercomputing, vol. 59, no. 2, pp. 975–992, 2012.
[48] H. Wang, Q. Jing, R. Chen, B. He, Z. Qian, and L. Zhou, "Distributed systems meet economics: Pricing in the cloud," HotCloud, vol. 10, pp. 1–6, 2010.
[49] Amazon EC2 Pricing, https://aws.amazon.com/ec2/pricing/on-demand/?nc1=h_ls.