Future Generation Computer Systems 41 (2014) 53–64

Multi-site data distribution for disaster recovery—A planning framework

Shubhashis Sengupta, K.M. Annervaz
Accenture Technology Labs, Bangalore, India

Highlights

• We describe a fault-tolerant multi-cloud data backup scheme using erasure coding.
• The data is distributed using a plan driven by a multi-criteria optimization.
• The plan uses parameters like cost, replication level, recoverability objective etc.
• Both single customer and multiple customer cases are tackled.
• Simulation results for the plans and sensitivity analyses are discussed.

Article history: Received 4 July 2012; received in revised form 17 July 2014; accepted 29 July 2014; available online 8 August 2014.

Keywords: Disaster recovery; Planning; Data distribution; Optimization

Abstract

In this paper, we present DDP-DR: a Data Distribution Planner for Disaster Recovery. DDP-DR provides an optimal way of backing up critical business data into data centers (DCs) across several geographic locations. DDP-DR provides a plan for replicating backup data across a potentially large number of data centers so that (i) the client data is recoverable in the event of catastrophic failure at one or more data centers (disaster recovery) and (ii) the client data is replicated and distributed in an optimal way, taking into consideration major business criteria such as cost of storage, protection level against site failures, and other business and operational parameters like recovery point objective (RPO) and recovery time objective (RTO). The planner uses Erasure Coding (EC) to divide and codify data chunks into fragments and distribute the fragments across DR sites or storage zones so that the failure of one or more sites/zones can be tolerated and the data can be regenerated. We describe data distribution planning approaches for both single customer and multiple customer scenarios.

1. Introduction

In today's enterprise computing, data centers generate an overwhelming volume of data. Applications such as particle physics [1], storing of web pages and indexes [2], social networking applications, and engineering applications of pharmaceutical and semiconductor companies can easily generate petabytes of data over days and weeks. Disaster Recovery (DR) and Business Continuity Planning (BCP) require that critical enterprise data is backed up periodically and kept in geographically separate and secure locations. In the event of an operational disruption at the primary site, operations can be resumed at an alternate site to which the backed up data and log files are shipped and where applications/services can be instantiated again. Additionally, recent regulatory and compliance standards like HIPAA, SOX and GLBA mandate that all operational data is retained for a certain period of time and made available for auditing. With the increasing volume of data and the increasing emphasis on service availability and data retention, the technology and process of handling backup and recovery have come under renewed scrutiny.

1.1. Traditional backup methodology

Traditionally, data backup and archival are done using magnetic tapes which are processed and transported to a remote location. However, such a procedure is manual and cumbersome (therefore slow), and rapid data restoration and service resumption are often not possible. Recently, with the advent of cheap, improved storage and online disk backup technology, and advances in networking, online remote backup options have become attractive


Fig. 1. Schematic multi-site DR infrastructure.

[3]. Storage area network and virtualization technologies have become sophisticated enough to replicate a storage volume snapshot to a remote site [4]. Increasingly, open-source tools such as RSync [5] are being used to achieve the same goals, albeit with lower efficiency.

1.2. Multi-site data replication and backup

Cloud computing and cheap online storage technologies are promising to change the landscape of disaster recovery. The data from the primary site is now backed up in the cloud and/or in multiple geographically separated data centers to improve fault tolerance and availability [6]. Several cloud infrastructure and storage vendors, such as Amazon S3, Glacier [7], and Rackspace [8], provide storage for backup. Other vendors, like Zamanda [9], use cloud storage such as Amazon S3 to provide backup services. Organizations are also adopting a hybrid approach, where very critical or sensitive data is stored within the enterprise and non-sensitive data is dispatched to the cloud. While backup using a single cloud or online storage is cheap and practical, storing encrypted backup data online with a single third-party storage provider may not be prudent due to the lack of operational control and to security, reliability and availability issues. It is advisable that organizations hedge their bets by replicating data to multiple cloud locations and data centers. It is also observed that large organizations with data centers (DCs) in multiple geographies may use one regional data center as an alternate site for another by replicating data. Replicating DR data across sites improves availability by reducing the risk of simultaneous co-related failures.

In this context, we present a schematic diagram for a multi-site DR in Fig. 1. The primary site (DC1) hosts the servers and storage for production, test, and development. Historical operational data

is periodically copied to the staging servers, where aggregation and de-duplication are run. The "backup"-ready data is then replicated to multiple data centers1 that the firm owns (DC2 and DC3) and/or to public cloud storage providers. In the event of a failure at the primary site, the data can be recovered to the recovery (also called secondary) site on demand. Data recovery or retrieval may require additional compute resources to carry out costly operations such as de-compression and decryption of data. Therefore, recovery can optionally be offloaded to a server farm (DC4) or to dedicated processing hardware in the DR sites that can do bulk recovery for multiple customers within stipulated time bounds. Since we propose a distributed storage substrate, one possible mechanism to maintain data consistency across backup sites is to create a peer-to-peer storage overlay layer across the sites. Various distributed archival storage substrates are discussed in the literature [10,11], but this is not the primary concern of this paper.

1.3. Optimal data distribution plan for multi-site backup (motivation of this work)

The current approach to multi-site backup is to replicate data to single or multiple remote sites so that co-related storage or network failures do not hamper data availability. Replication, however, increases data redundancy linearly with the number of sites. Plain replication, even with data compression technologies, makes the data footprint quite large. Additionally, it is often seen that the strategy of data placement and distribution is not driven

1 We use the term data center in a broad sense. It may also mean a set of storage nodes/cluster.


from a recovery standpoint (there is no way of telling whether data recovery can happen on time if any of the primary sites fails), and the overall storage topology may become sub-optimal. Therefore, there is a need for rationalizing and optimizing distributed backup storage—from the point of view of data footprint, cost, security, availability and data recoverability (within time and cost). Disaster recovery planning [12] often overlooks this critical issue.

In this paper, we propose a novel planning approach, called Data Distribution Planner for Disaster Recovery (DDP-DR in short). The planner creates a plan for distributing data across heterogeneous DR sites in a manner that satisfies multiple objectives: the overall cost of storage can be kept below a customer-specified bound, the data can survive outages of up to a pre-specified number of sites, and the distributed data can be recovered and reassembled within a customer-specified time bound. To reduce the data replication footprint, we take recourse to Erasure Coding (EC) [13]. We combine EC-based data encoding with a Linear Programming (LP) based constraint satisfaction algorithm. Backup data files are broken up through EC into multiple data and code fragments. The coding rate of EC determines how many data and code fragments will be created from a data file. In this work, we drive the coding rate, and therefore the data footprint, through a set of optimized parameters based on the failure protection level required by the customers. This data fragmentation and coding technique creates considerably less data redundancy overhead than the conventional multi-site replication method. The optimization, a key part of the work, is done through constraint-based mathematical problem formulation and the Linear Programming method.

In our research problem formulation, we analyze two likely scenarios for cloud-based distributed DR system planning. In the first scenario, we discuss the problem from the viewpoint of a customer (an institution or an individual) who wishes to back up and archive data remotely across different cloud-based data centers. The customer wants to create a data distribution strategy so that the data can be distributed in a redundant manner to achieve a certain degree of tolerance to the failure of the backup data centers; at the same time, the customer has a threshold for the time taken to back up the incremental data at a certain interval and to recover data within a certain period to a preferred secondary site if the primary site is struck by disaster. Additionally, the customer has a limit on how much can be spent as the rental cost of storage. We term this the Single Customer Problem. In the second scenario, we discuss the formulation from the viewpoint of a DR Service provider, running (or renting out) multiple storage sites, so that it can accommodate individual customers with different needs for data redundancy, cost, and data backup/recovery time bounds. We call this the Multi-Customer Problem.

As a research methodology, we take recourse to mathematical formulation of the problem and validation through simulation. We take hypothetical but realistic values for storage media cost, cost and bandwidth of network links, and representative data volumes for archival and backup. The representative cost values are obtained from the public data of storage vendors. The storage costs of commercial public cloud providers are not used in our calculation, as the actual type and nature of the underlying storage media are not known to us. It is more likely that a large corporation or scientific institution adopting multi-data-center based archival will adopt a strategy where it can rent storage directly from providers (such as large Telcos) and, therefore, will have the option to select the type of media, replication and bandwidth. However, our formulation can work for public cloud providers as well if similar data is made available from them.

To our knowledge, the proposed formulation and approach for planning data placement in DDP-DR is unique, as no other similar approach has been reported. The cloud storage vendors and DR


Service providers highlight the virtues of distributed data replicas across regions and data centers for availability reasons [14], but no structured mathematical formulation has been reported so far. Similarly, traditional works on static and dynamic data placement and replication in Grid environments have focused on efficient job execution through the enablement of data locality [15,16]. The usage of distributed cloud-based infrastructure as a storage substrate has been studied extensively from the perspective of system development and not from the perspective of data placement planning.

The organization of the paper is as follows: in Section 2, we describe the topology and the data backup and recovery processes associated with DR, and also state our assumptions. We present the formulation for the Single Customer Problem in Section 3. The model extension for the Multiple Customer variant is presented in Section 4. The implementation and sample results are described in Section 5, and the scalability of the approach in Section 6. Related works are described in Section 7, followed by the conclusion and possible extensions of the work (Section 8).

2. Multi-site DR process, topology and assumptions

2.1. DR process and topological assumptions

In DR, one of the critical IT processes involves copying and distributing data offsite to remote location(s) and keeping the data synchronized as much as possible with the primary copy. This is because, in the event of a failure, we want to resume operations with data that is as near in time as possible to the point at which the operation failed. The backup data is treated as a single file even though it can be an aggregate of multiple user and application data, log files, system configuration files etc. We call this file the Level 0 or full copy backup (L0). Subsequently, the customer can upload changes/additions to the backup data on a periodic basis, say between time intervals Ti and Ti−1. We call these additional files the L1 (delta) changes. Please note that the backup and refresh of data between the Primary and Recovery sites happen over a rolling time horizon. This means that all Primary changes will eventually be propagated to the Recovery site through the backup substrate. When the time interval [i − 1, i] is infinitesimally small, the situation is akin to hot standby. In this context, two definitions are pertinent to DR planning.

• RTO: Recovery Time Objective—this specifies an upper bound on the time within which, after a disaster at the primary site, the entire backup data (L0) needs to be restored at the recovery site to resume operations.
• RPO: Recovery Point Objective—this provides an upper bound on the time within which the L1 incremental backups of added and changed data are to be committed to the system. This value, in effect, also determines how old the DR data is at the time of disaster.

In DDP-DR we model a multi-DC backup and disaster recovery scenario. One of the DCs is the Primary site of a customer. Another DC may be chosen as the Recovery site. The other DCs are used for storage of backup data. Each DC/storage zone may have different types of storage. We model three types of storage—SATA (Serial ATA), iSCSI (internet Small Computer System Interface) and FC (Fiber Channel). These protocols provide different data write and read rates into and from the storage, and they vary in cost. We also assume that each storage unit in a DC is divided into storage blocks or Buckets of certain capacities. S3 storage blocks in Amazon or Swift blocks in the OpenStack cloud [17] architecture are examples of such storage buckets. The data fragments get stored in these buckets.


One of the key parameters of our planning is the Protection Level (PL), defined as the degree (an integer) to which the customer requires a guarantee against simultaneous data center failures. For example, if there are n DCs, the maximum protection level that can be guaranteed to the customer is n − 1. Further, a customer can select whether a particular DC is to be excluded from, or included in, the list of sites where the backup data can be placed. For example, a banking customer can demand that his data be placed only in a PCI (credit card standard) compliant zone and not otherwise. We capture this with the Placement Constraint (PC), which is taken as 0 if the data for a customer is not to be placed in a particular DC and 1 otherwise. Finally, we note that restoring data to the Recovery site is only the first step. It is followed by Virtual Machine restart, application configuration, and service health-checks; those steps are beyond the scope of our current work.

2.2. Data fragmentation strategy

Spreading backup data across multiple sites increases the data replication volume. To reduce the volume of replicated data, we take recourse to Erasure Coding (EC) [13]. Erasure Coding is a technique frequently used in networking (multicast FEC), distributed storage, security etc. We use MDS coding to code m data chunks with k error-correcting code fragments to obtain n = (m + k) fragments with coding rate r = m/(m + k); any m of the n fragments are sufficient to get back the data. The backup file is broken into chunks and each chunk is then erasure coded. The encoded blocks can then be dispatched to different DCs and public cloud storage. In the event of a failure, the remaining encoded data fragments can be transmitted back to the Recovery site and decoded. The data distribution plan for any customer should be such that there are enough encoded fragments available for the customer to tolerate failures of data centers up to the Protection Level. In this paper, we concentrate on the data distribution strategy and not on the actual implementation of backup to different clouds.

It has been established that EC requires less data volume and less network bandwidth than a plain replication scheme for similar resilience (system durability) [18]. Additionally, network coding schemes are found to be more efficient globally than many other local storage-level data protection schemes [19]. One drawback of EC is that to regenerate data after partial failures, one must recall all the remaining encoded fragments to decode, re-encode and redistribute. However, the storage substrate can be made more resilient by distributing extra encoded fragments. Newer network coding techniques [20] allow partial regeneration of data and are, therefore, more efficient.
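To make the fragmentation arithmetic above concrete, the following short Python sketch (our illustration, not part of DDP-DR) sizes an even spread of coded fragments for a given protection level and compares the resulting footprint with plain replication. The function name and the even-spread assumption are ours; the numeric example reuses the 350 GB backup, 10 GB buckets and 6 DCs of Section 5.1.

import math

def ec_even_spread(file_gb, bucket_gb, n_dcs, protection_level):
    # m data fragments come from chunking the backup file into buckets (m_e in the paper)
    m = math.ceil(file_gb / bucket_gb)
    # any `protection_level` DCs may fail, so the remaining DCs must still hold >= m fragments
    surviving_dcs = n_dcs - protection_level
    per_dc = math.ceil(m / surviving_dcs)                # fragments per DC under an even spread
    n = per_dc * n_dcs                                   # total coded fragments (n_e), coding rate m/n
    ec_gb = n * bucket_gb                                # EC storage footprint
    replication_gb = (protection_level + 1) * file_gb    # plain replication footprint
    return m, n, ec_gb, replication_gb

print(ec_even_spread(350, 10, 6, 1))                     # (35, 42, 420, 700)

Because the even spread ignores the RPO/RTO, cost and bandwidth constraints introduced in Section 3, the optimized plans of Section 5 end up with somewhat more fragments (470 GB rather than 420 GB in this example), yet both remain well below the 700 GB that two-way replication would need.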
3. DDP-DR approach for Single Customer Problem

Consider a scenario where a customer (an institution or an individual) wants to back up data to a multitude of remotely located cloud-based data centers to support a disaster recovery strategy. As motivated in Section 1.3, such a customer will want to devise an optimized plan for data distribution, where data can be backed up across possibly heterogeneous storage nodes in remote data centers across different zones or geographies in a manner that meets the time limits for backup and retrieval and stays within a certain storage cost. Given a customer and a set of backup data centers, our objective is to create an optimal plan for distributing data across the DCs. An optimal plan is one in which the total cost of storage, the data backup time (RPO) and the data recovery time (RTO) are minimized. The resultant decision problem has a multi-objective function. We break the multi-objective decision

problem by considering single objectives one at a time:

Objective 1: minimize the cost of storage and replication for a customer while maintaining the RTO and RPO time bounds and other policy-level constraints;
Objective 2: minimize the RTO (recovery time) for a customer while maintaining the cost bound, the RPO time bound and other policy-level constraints;
Objective 3: minimize the RPO (backup time) for a customer while maintaining the cost bound, the RTO time bound and other policy-level constraints.

In each case, customer policy-level parameters such as the Protection Level (PL) and the Placement Constraint (PC) are satisfied. Looking at the resultant distribution plans, the customer will be able to select a suitable operating point. We consider that network links to the data centers are of equal bandwidth and cost and, hence, do not affect our solution; however, our plan formulation can be augmented with network considerations as well. Additionally, we carry out the distribution of data in the form of erasure-coded fragments so that the overall volume of coded data can be minimized.2 EC-coded fragments can be distributed across the DCs in such a way that even if some of the DCs (storage) are compromised, data integrity is maintained, as one needs at least m encoded fragments to get back the original file. The placement problem is a Mixed Integer Programming [22] problem. However, we have relaxed the problem into an LP and tried finding feasibility in a convex hull. We established that, because of this relaxation, the solutions do not deviate from the actual MIP solution by more than a small bound.

3.1. Planning parameters

We list the main parameters of our formulation in Table 1 with explanations. In this table, the known parameters are: FCust, BCust, me, er, dr, bucket_size, BW_ij, ci, Pb, Tb, IOPS, no_of_disks, segment_size, Cb. The parameters to be determined are Xi, the number of encoded fragments per data center, and therefore ne, the total number of encoded fragments for distribution.

3.2. Constraints

We now discuss the formulation of the constraints to the objective.

RPO constraint: for a customer Cust, encoding of the L1 delta backup data and transfer of the encoded fragments to the storage DCs should happen within Pb time. We assume that data can be pushed simultaneously to all selected DCs. Mathematically,

(1/er) ∗ BCust + max_{j=1..n} [ (1/BW_sj^act) ∗ (BCust/me) ∗ Xj ] ≤ Pb.

Re-writing,

max_{j=1..n} [ (1/BW_sj^act) ∗ Xj ] ≤ (Pb/BCust − 1/er) ∗ me.

Re-writing again in standard LP form,

∀j ∈ {1, . . . , n}: (1/BW_sj^act) ∗ Xj ≤ (Pb/BCust − 1/er) ∗ me.    (3.2.1)

2 It is easily observed that if we keep the data Protection Level at its maximum, where data has to be protected against the simultaneous failure of all but one data center, then replication and EC-based schemes require the same number of backup copies. If we decrease the protection level below n − 1, i.e. to n − 2, . . . , 1, where n is the number of DCs, then the EC-based scheme requires less data volume than replication [21].


Table 1
Formulation parameters for single customer case.

Parameter: Explanation
FCust: Total backup file size (L0) for the customer
BCust: Size of the delta backup (L1) for the customer
Bucket_size: Bucket size, presumed to be the same across all storage units in DCs
me: Total number of input (to-be-coded) fragments for the customer, ⌈FCust/bucket_size⌉
ne: Total number of output (encoded) fragments for the customer
er, dr: Rate of encoding and rate of decoding at the server
i = {1, . . . , n}: Data centers
BW_ij: Available incoming bandwidth to data center j from data center i
IOPS: Data read–write rate in the group of storage servers in a zone (FC, iSCSI or SATA)
No_of_disks: Number of disks in a storage server in a storage zone
Segment_size: Size of the read/write segment in MB in the server
BW_ij^act: Actual available bandwidth between data centers i and j
ci: Weighted average cost of storage in the ith data center
Xi: Number of encoded fragments to be stored in data center i (ne = Σ_i Xi)
Pb: Backup time bound for the customer data of size BCust (equivalent to RPO)
Tb: Recovery time bound for the customer data of size FCust (equivalent to RTO)
Cb: Cost bound for the customer
Si: Total available storage in the ith data center

RTO constraint: this constraint is connected with data recovery, i.e., the encoded L0 data for customer Cust is to be pulled out of the storage data centers and decoded within the time bound Tb. As in the previous case, we assume that the data is fetched simultaneously to the Recovery site through channels from the different DCs. Mathematically,

(1/dr) ∗ FCust + max_{j=1..n} [ (1/BW_js^act) ∗ (FCust/me) ∗ Xj ] ≤ Tb.

Re-writing in standard LP form,

max_{j=1..n} [ (1/BW_js^act) ∗ Xj ] ≤ (Tb/FCust − 1/dr) ∗ me,

i.e., ∀j ∈ {1, . . . , n}: (1/BW_js^act) ∗ Xj ≤ (Tb/FCust − 1/dr) ∗ me.    (3.2.2)

Storage cost constraint: the customer Cust may have a cost bound Cb. The total cost of the allocated storage across all DCs has to be within this bound. Mathematically,

Σ_{i=1..n} (FCust/me) ∗ ci ∗ Xi ≤ Cb.    (3.2.3)

PL constraint: enough encoded fragments are to be spread across the DCs so that the failure of up to PL DCs can be tolerated. To support the failure of a single DC j, there should be enough fragments available across the other DCs. As me is the number of fragments required to recover the data for the customer, to support the failure of DC j we need

∀j ∈ {1, . . . , n}: Σ_{i=1, i≠j}^{n} Xi ≥ me.

So, in order to support a protection level of simultaneous failures of up to k DCs, there should be enough encoded fragments in every set of the remaining (n − k) DCs. Thus,

S = {1, . . . , n}; ∀O ∈ ℜ(S, n − k): Σ_{i∈O} Xi ≥ me,    (3.2.4)

where ℜ(S, n − k) is the set of combinations of the data center set S taken (n − k) at a time.

PC constraint: the customer may want to exclude a subset Q of the available n DCs from storing any fragments of the backup data. Hence, for any DC in the subset Q, the number of fragments must be zero:

Xi = 0, i ∈ Q, Q ⊂ {1, . . . , n}.    (3.2.5)

Storage availability constraint: the total size of all the fragments stored in DC i should be less than the storage space available in DC i:

∀i ∈ {1, . . . , n}: (Xi ∗ FCust)/me < Si.    (3.2.6)

Available bandwidth constraint: the actual rate of data transfer from DC i to DC j is the smaller of the network bandwidth and the read/write rate of the storage unit. The read/write rate (in GB) of the storage unit can be determined as (IOPS ∗ no_of_disks ∗ segment_size)/1024. Mathematically, the available bandwidth is derived as

BW_ij^act = MIN ( (IOPS ∗ no_of_disks ∗ segment_size)/1024, BW_ij ).    (3.2.7)
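As an illustration of how the constraints above can be fed to an off-the-shelf solver, the Python sketch below encodes constraints (3.2.1)–(3.2.7) together with the storage-cost objective of the next subsection using the open-source PuLP modeller. It is our own minimal reconstruction, not the authors' DDP-DR tool (which uses IBM ILOG CPLEX); all numeric inputs are hypothetical placeholders, and BW^act is assumed symmetric and already reduced per Eq. (3.2.7).

from itertools import combinations
import pulp

# hypothetical planning inputs (see Table 1 for the meaning of the symbols)
n = 6                                            # number of backup DCs
F_cust, B_cust, bucket = 350.0, 35.0, 10.0       # L0 size, L1 size, bucket size (GB)
m_e = int(F_cust // bucket)                      # input fragments
e_r = d_r = 100.0                                # assumed encode/decode rates (GB/h)
P_b, T_b, C_b = 1.0, 6.0, 100.0                  # RPO (h), RTO (h), cost bound ($)
PL = 1                                           # protection level
Q = []                                           # excluded DCs (placement constraint)
S = [3000.0] * n                                 # free storage per DC (GB)
c = [0.008, 0.0006, 0.09, 0.008, 0.0006, 0.09]   # per-GB storage cost (illustrative)
bw_act = [36.0] * n                              # min(network, disk) bandwidth per Eq. (3.2.7), GB/h

frag_l0, frag_l1 = F_cust / m_e, B_cust / m_e    # fragment sizes for L0 and L1 data

X = [pulp.LpVariable(f"X{j}", lowBound=0) for j in range(n)]    # LP relaxation of fragment counts
prob = pulp.LpProblem("ddp_dr_single_customer", pulp.LpMinimize)
prob += pulp.lpSum(frag_l0 * c[j] * X[j] for j in range(n))     # objective (3.3.1): storage cost

for j in range(n):
    prob += (frag_l1 / bw_act[j]) * X[j] <= P_b - B_cust / e_r  # RPO constraint (3.2.1)
    prob += (frag_l0 / bw_act[j]) * X[j] <= T_b - F_cust / d_r  # RTO constraint (3.2.2)
    prob += frag_l0 * X[j] <= S[j]                              # storage availability (3.2.6)
prob += pulp.lpSum(frag_l0 * c[j] * X[j] for j in range(n)) <= C_b   # cost bound (3.2.3)
for O in combinations(range(n), n - PL):                        # protection level (3.2.4)
    prob += pulp.lpSum(X[j] for j in O) >= m_e
for j in Q:                                                     # placement constraint (3.2.5)
    prob += X[j] == 0

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print(pulp.LpStatus[prob.status], [round(x.value(), 1) for x in X])

Rounding the relaxed X_j up to integers recovers a valid fragment placement; as noted above, the paper argues that this LP relaxation stays within a small bound of the exact MIP solution.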

3.3. Problem formulation

As discussed earlier, we have broken the multi-objective minimization problem into three single-objective minimizations and allow the user to choose from the available feasible solutions. We now discuss the corresponding Linear Programming formulations one after the other.

Storage cost minimization: for a customer, the objective function for storage cost minimization can be written as the sum, over all n DCs, of the product of the size of each encoded fragment, the number of encoded fragments written to DC i, and the cost of storage per unit in DC i:

Minimize Σ_{i=1..n} (FCust/me) ∗ ci ∗ Xi.    (3.3.1)

The minimization is subject to the RTO, RPO, PL, PC, Storage Availability and Bandwidth constraints detailed in the previous section.

RPO minimization: the objective function for the minimization of the RPO involves minimizing the total time to encode the incremental backup L1 and store its encoded fragments across the DCs:

Minimize (1/er) ∗ BCust + max_{j=1..n} [ (1/BW_sj^act) ∗ (BCust/me) ∗ Xj ].


This can be written as: Minimize t, such that

∀j ∈ {1, . . . , n}: (1/er) ∗ BCust + (1/BW_sj^act) ∗ (BCust/me) ∗ Xj ≤ t.    (3.3.2)

The minimization is subject to the Storage Cost, RTO, PL, PC, Storage Availability and Bandwidth constraints detailed in the previous section.

RTO minimization: the objective function for minimizing the RTO involves minimizing the time to retrieve the full backup, L0, from each of the DCs that store coded fragments of the backup data:

Minimize (1/dr) ∗ FCust + max_{j=1..n} [ (1/BW_js^act) ∗ (FCust/me) ∗ Xj ].

This can be written as: Minimize t, such that

∀j ∈ {1, . . . , n}: (1/dr) ∗ FCust + (1/BW_js^act) ∗ (FCust/me) ∗ Xj ≤ t.    (3.3.3)

The minimization is subject to the Storage Cost, RPO, PL, PC, Storage Availability and Bandwidth constraints detailed in the previous section.

4. DDP-DR approach for Multi-Customer Problem

For the multiple customer case, we motivate the problem from the perspective of a storage or backup service provider (or department) that provides multi-data-center or cloud-based storage services. Consider a DR provider or a consulting company which owns or leases storage zones and network links across different data centers so that it can host its tenants' (customers') archived data redundantly across the centers, providing data availability and recovery if the primary operation site of any of its customers fails. The DR provider would often enter a contractual agreement with its customers to maintain certain levels of service parameters, such as the time taken to back up and retrieve the data, and also to conduct operations within certain cost bounds. Here, we formulate the problem of optimized data distribution on behalf of such service providers. Therefore, the objective and approach are slightly different from those discussed in Section 3, although we build on the notations and formulations made in that section.

To make the problem of multiple customers computationally tractable, we assume that there is a finite set of service level categories for customers. Consider the case of a DR Service provider catering to a set of customers with different service level agreements, say Gold, Silver and Bronze. The Gold set of customers has stricter SLAs in terms of RPO and RTO bounds than the Silver set, while the Gold set may be willing to pay more than the Silver set, and so on. In [23], the authors have discussed practical examples of user categorization and SLA servicing in various shared cloud data centers. The DR service provider has to create an optimum storage data distribution plan for p categories of customers, where the RPO and RTO bounds for each individual customer in a category are to be satisfied, and the service provider has to minimize its total cost of storage and total network link cost.3 The resultant decision problem has a multi-objective function. We break the multi-objective decision problem by solving the following individual objectives one at a time:

Objective 1: minimize the overall total cost of storage across all data centers subject to the constraints imposed by all categories of customers. This objective is similar to the one created for the single customer, except that in this case it is the service provider which is trying to reduce its overall cost given the backup (RPO) and recovery (RTO) time bounds, the individual storage cost constraints, and the protection and exclusion levels for each category of customers; and

Objective 2: minimize the overall cost of network link usage subject to the constraints imposed by all categories of customers, as given in Objective 1.

Like in the previous section, we will discuss the various constraints and objectives in detail. We start with a brief description of the various parameters used in the subsequent discussion.

3 Note that in the Multiple Customer case we bring network link costs into consideration (while this was not so for the Single Customer). This is because we assume that, in the single customer case, the links are leased by the customer from the provider and the storage cost the customer pays to the provider is inclusive of the link cost, whereas the DR service provider pays the physical link charges to the Telcos separately from the storage charges. So the provider has two cost minimization objectives—one for the storage and another for the network links. For a detailed discussion on how network links should be priced, please refer to [24].

4.1. Planning parameters

We list the main parameters of our formulation for the multi-customer case, with explanations, in Table 2. Only the parameters that are changed or are not present in the formulation of Section 3.1/Table 1 are presented here.

4.2. Multi-customer constraints

Backup deadline constraints: considering that data from each of the p customers is required to be backed up into the n backup data centers within the RPO bound, the summation of the total bandwidth available to the customer across all the time quanta has to be sufficient to carry the coded data to the backup centers. Thus, the backup deadline constraints for all of the p customers may be modeled as:

∀i ∈ {1, . . . , p}, ∀j ∈ {1, . . . , n}: Σ_{k=1..PCustib} (LB^i_sjk ∗ τk) ≥ (BCusti/meCusti) ∗ Xij,    (4.2.1)

where s is the source data center. And,

∀i ∈ {1, . . . , p}, ∀j ∈ {1, . . . , n}: ∀k ∈ {PCustib + 1, . . . , gRPO}, LB^i_sjk = 0.    (4.2.2)

Recovery deadline constraints: as with the previous constraint, the recovery deadline constraint warrants that the total bandwidth available to customers across the time quanta must be sufficient for transferring data to the recovery center within the RTO limits. This can be modeled as:

∀i ∈ {1, . . . , p}, ∀j ∈ {1, . . . , n}: Σ_{k=1..TCustib} (LR^i_jdk ∗ τk) ≥ (FCusti/meCusti) ∗ Xij,    (4.2.3)

where d is the recovery data center, and

∀i ∈ {1, . . . , p}, ∀j ∈ {1, . . . , n}: ∀k ∈ {TCustib + 1, . . . , gRTO}, LR^i_jdk = 0.    (4.2.4)


Table 2
Formulation parameters for multiple customer case.

Parameter: Explanation
Cust_i: ith category of customer in a set of p categories of customers: {1, . . . , p}
FCusti: Average size of the full backup file (L0) for Cust_i
BCusti: Total size of the delta backup file (L1) for Cust_i
τ_i: Unit of time quantum for the solution. For formulating the network cost optimization problem, we have broken the total backup/recovery periods into time quanta. A quantum of time in our context is a positive real number which, multiplied by a time unit (seconds, minutes or hours), gives the elapsed time; this second entity is called the Time Quanta Length
gRPO, gRTO: Length of the plan; the total number of time quanta available by which the backups and recoveries, respectively, of all p categories of customers are to be completed
LB^i_abk: Share of link bandwidth allotted to Cust_i from data center a to data center b in the kth time quantum during L1 backup. The unit may be bandwidth units, such as megabytes per second
LR^i_abk: Share of link bandwidth allotted to Cust_i from data center a to data center b in the kth time quantum during recovery. The unit may be bandwidth units, such as megabytes per second
γ_ab: Average cost of usage of the link per quantum in a pay-per-use model
PCustib: The time quantum that is the incremental (delta) backup deadline for Cust_i (equivalent to RPO)
TCustib: The time quantum that is the recovery deadline for Cust_i (equivalent to RTO)
CCustib: Cost bound for Cust_i
PLCusti: Protection level for Cust_i
Q_i: The data center exclusion list for Cust_i
meCusti: Total number of input (to-be-coded) fragments for customer i, based on the backup size
BW_ij: Actual bandwidth capacity between the ith and jth data centers
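To show how the time-quantum link shares of Table 2 turn into LP constraints, the following Python sketch (again ours, with hypothetical values) encodes the backup-deadline constraints (4.2.1)–(4.2.2) and the link-capacity constraint (4.2.5) with PuLP for a toy instance. The fragment counts X_ij, which in the full model are tied to the cost and protection-level constraints, are fixed here to keep the example short, and the objective is a stand-in for the link-cost objective of Section 4.3 with uniform γ.

import pulp

p, n, g_rpo = 2, 3, 4                   # customers, backup DCs, plan length in quanta
tau = 0.5                               # time quanta length (h)
B = [35.0, 45.0]                        # delta backup sizes L1 (GB)
m_e = [35, 45]                          # input fragments per customer
P_dead = [3, 4]                         # backup deadline quantum per customer (PCustib)
X = [[4, 4, 4], [5, 5, 5]]              # assumed fragment placement X_ij
BW_sj = [36.0, 36.0, 36.0]              # source-to-DC link capacities (GB/h)

prob = pulp.LpProblem("multi_customer_backup_window", pulp.LpMinimize)
LB = [[[pulp.LpVariable(f"LB_{i}_{j}_{k}", lowBound=0) for k in range(g_rpo)]
       for j in range(n)] for i in range(p)]
# stand-in objective: total granted bandwidth-quanta (link-cost objective with gamma = 1)
prob += pulp.lpSum(LB[i][j][k] for i in range(p) for j in range(n) for k in range(g_rpo))

for i in range(p):
    for j in range(n):
        # (4.2.1): shares granted before the deadline must move all coded delta data for DC j
        prob += pulp.lpSum(LB[i][j][k] * tau for k in range(P_dead[i])) >= (B[i] / m_e[i]) * X[i][j]
        # (4.2.2): no share after the customer's backup deadline
        for k in range(P_dead[i], g_rpo):
            prob += LB[i][j][k] == 0
# (4.2.5): per-quantum shares on each link cannot exceed its capacity
for j in range(n):
    for k in range(g_rpo):
        prob += pulp.lpSum(LB[i][j][k] for i in range(p)) <= BW_sj[j]

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print(pulp.LpStatus[prob.status])       # "Optimal" here means the RPO windows are satisfiable

The recovery-side constraints (4.2.3)–(4.2.4) and (4.2.6) follow the same pattern with LR variables and the RTO deadlines.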

Link capacity constraints: the link capacity (cumulative bandwidth) must be sufficient to carry the backup data of all of the p customers:

∀j ∈ {1, . . . , n}, ∀k ∈ {1, . . . , gRPO}: Σ_{i=1..p} LB^i_sjk ≤ BW_sj,    (4.2.5)

and the recovery of the backup data of all of the p customers, copied from each of the data centers to the recovery data center d:

∀j ∈ {1, . . . , n}, ∀k ∈ {1, . . . , gRTO}: Σ_{i=1..p} LR^i_jdk ≤ BW_jd.    (4.2.6)

Storage cost constraints: the storage cost constraint for each of the p customers may be modeled as:

∀i ∈ {1, . . . , p}: Σ_{j=1..n} Xij ∗ (FCusti/meCusti) ∗ cj ≤ CCustib.    (4.2.7)

Data center exclusion list: the data center exclusion list constraint for any of the p customers may be modeled as:

∀i ∈ {1, . . . , p}: ∀j ∈ Qi, Xij = 0,    (4.2.8)

where Qi ⊂ {1, . . . , n} is the data center exclusion list for Cust_i.

Protection Level (PL) constraints: the protection level constraint for any of the p customers may be modeled as:

∀i ∈ {1, . . . , p}: S = {1, . . . , n}, ∀O ∈ ℜ(S, n − PLCusti): Σ_{j∈O} Xij ≥ meCusti.    (4.2.9)

Storage capacity constraints: the total storage capacity of a data center must accommodate the sum of the storage allocated to all customer data fragments placed in that data center. This can be modeled as:

∀j ∈ {1, . . . , n}: Σ_{i=1..p} (Xij ∗ FCusti)/meCusti < Sj.    (4.2.10)

4.3. Objectives for Multi-Customer Problem

Storage cost minimization objective: the objective function for minimizing the total storage cost to the DR Service provider involves minimizing the cost of storing the coded fragments of all p customers across the data centers:

Minimize Σ_{i=1..p} Σ_{j=1..n} (FCusti/meCusti) ∗ Xij ∗ cj.    (4.3.1)

This objective function is subject to the constraints detailed in Section 4.2.

Link cost minimization objective: if the links are purchased by the DR provider on a pay-per-use basis, then the distribution plan has an associated per-link cost. Accordingly, an objective function for minimizing the total link cost to the DR Service provider for backup involves minimizing the cost of copying the coded fragments of the backup data over the links during backup:

Minimize Σ_{i=1..p} Σ_{j=1..n} Σ_{k=1..gRPO} (LB^i_sjk ∗ γ_sj),

with the same set of constraints used for minimizing the total storage cost. Similarly, an objective function for minimizing the total link cost to the DR Service provider for recovery involves minimizing the cost of copying the coded fragments of the backup data over the links during recovery:

Minimize Σ_{i=1..p} Σ_{j=1..n} Σ_{k=1..gRTO} (LR^i_jdk ∗ γ_jd),

with the same set of constraints used for minimizing the total storage cost.

In the formulation above we have considered the network bandwidth to be divided across the customers in each time quantum. However, this division is only required in the formulation to check the satisfiability of the individual customers' time constraints. For a set of RPO and RTO time windows that is verified by the LP formulation above, we can come up with a schedule that does not divide the bandwidth: for a satisfiable set of RPO and RTO time windows verified by the LP formulation with division of bandwidth, a greedy strategy of choosing the customer with the earliest deadline first, without dividing the bandwidth for backup and recovery, will still meet all the constraints.
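The earliest-deadline-first claim above can be illustrated with a toy single-link Python simulation (ours, with values loosely taken from Tables 9–11: the DC3 fragment counts of Table 11, 1 GB delta fragments, 0.5 h quanta, and a link of roughly 36 GB/h, i.e. about the 10 MBPS links of Section 5.1). Customers are served one at a time, undivided, in deadline order, and their finish times are checked against their RPO deadlines.

def edf_schedule(jobs, bandwidth_gb_per_h, quantum_h):
    """jobs: list of (customer, volume_gb, deadline_in_quanta); serve whole jobs in EDF order."""
    schedule, clock = [], 0.0
    for name, volume, deadline in sorted(jobs, key=lambda job: job[2]):
        clock += volume / bandwidth_gb_per_h   # transfer the whole job without dividing the link
        schedule.append((name, round(clock, 2), deadline * quantum_h, clock <= deadline * quantum_h))
    return schedule

# coded L1 volumes headed to DC3 (fragments x 1 GB) and RPO deadlines in quanta (Table 9)
jobs = [("C4", 22.0, 4), ("C5", 26.0, 4), ("C1", 9.0, 5), ("C3", 16.0, 6), ("C2", 12.0, 6)]
for name, finish_h, deadline_h, met in edf_schedule(jobs, 36.0, 0.5):
    print(name, finish_h, "h <=", deadline_h, "h:", met)   # every deadline is met

Repeating this per link and per direction (backup and recovery) mirrors the feasibility that the divided-bandwidth LP certifies.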

Table 3
Customer parameters.

Customer | Incremental backup size (L1) | Total backup size (L0) | Protection level
C1 | 35.0 GB | 350.0 GB | 1

Table 4
Data center details.

Data center | Storage type | Free storage (TB) | Per MB storage cost^a ($) | IOPS
DC1 | iSCSI | 3 | 0.008 | 750
DC2 | SATA | 3.9 | 0.0006 | 450
DC3 | FC | 2 | 0.09 | 24 000
DC4 | iSCSI | 4 | 0.008 | 750
DC5 | SATA | 5 | 0.0006 | 450
DC6 | FC | 3.2 | 0.09 | 24 000

a Representative cost, not actual.

Table 5
Minimized cost and distribution plan for various RPO/RTO bounds.

RPO bound (h) | RTO bound (h) | Minimized cost objective ($) | m | n | Data center shares (DC1, DC2, DC3, DC4, DC5, DC6) of fragments
1.0 | 4.0 | Infeasible | - | - | -
1.0 | 4.3 | Infeasible | - | - | -
1.0 | 4.6 | 15 231.0 | 35 | 47 | 8, 8, 8, 8, 8, 7
1.0 | 6.0 | 7289.0 | 35 | 46 | 10, 10, 6, 10, 10, 0
1.0 | 7.0 | 3033.0 | 35 | 49 | 12, 12, 1, 12, 12, 0
1.0 | 24.0 | 1763.0 | 35 | 53 | 17, 17, 0, 2, 17, 0

5. Implementation and sample results

DDP-DR has been implemented as a standalone software tool. The tool accepts the business SLAs through an input file as a set of values, bounds and parameters. For erasure coding, we use a standard MDS module implementing the Reed–Solomon technique. A linear programming solver (IBM ILOG CPLEX Optimizer) was used. We have taken extensive runs with the number of DCs varying from 3 to 50. For infrastructure and customer parameters similar to those of the illustrative example discussed next, the problem formulation consisted of an LP with fewer than 50 variables. On a machine with 2 GB of RAM and a 2.6 GHz dual core processor, running Ubuntu 10.04, the results for each of the cases took less than a few seconds. The number of iterations for the solution to converge varied between 10 and 30, depending on the numerical values.

5.1. Discussion of results for single customer case

Let us discuss a sample result to illustrate the concept. We consider a hypothetical example with a customer C1 and 6 DCs. The parameters specific to the customer are listed in Table 3 and those specific to the storage DCs are presented in Table 4. We take the Bucket_size of the servers as 10 GB. Please note that we did not put any explicit Placement Constraint on the customer data at any of the DCs. The values of the parameters are chosen for illustrative purposes and are not based on any particular criterion. We have made no assumption on the value ranges of the various parameters in the formulation, and the model will cater to any values of the parameters from the corresponding value ranges. The bandwidth of the network links was taken as 10 MBPS. DC1 was chosen as the Primary data center and DC2 as the Recovery site.

We run DDP-DR on this dataset. For the cost minimization objective, we run the optimization routine (objective A) with progressively relaxed RPO and RTO bounds. As RPO will typically involve a far smaller volume of data than RTO, the time bound for RPO will be smaller than that for RTO. As depicted in Table 5, we start with an RPO bound of 1 h and an RTO bound of 4 h and find that a data distribution plan is not feasible (i.e., the solution does not converge). We progressively relax the RTO bound to 4.6 h and find a feasible plan with the following distribution pattern: DC1 to DC5 hold 80 GB of data each while DC6 holds 70 GB. The total storage cost for this plan is $15 231, which is the lowest cost. A total of 470 GB of storage is used to store the 350 GB backup of customer C1 (contrast this with a pure replication scheme, where 700 GB would be required). Accordingly, the erasure coding rate selected by DDP-DR is ne/me (for an explanation of the variables, please refer to Table 1), which equals 47/35. In practice, the L0 data would be broken up into 10 GB chunks and each chunk encoded into EC fragments with a coding ratio of 47/35.

We show a few more iterations with different RTO and RPO bounds in Table 5. To achieve a distribution with an RPO bound of 1 h and an RTO bound of 6 h, the total cost incurred is 7289 dollars and the distribution plan places 10 fragments (100 GB) on each data center except DC3 and DC6; DC3 hosts 6 fragments (60 GB) and no data is placed on DC6. If we further relax the RTO bound to 7 h, the data placed in DC3 is reduced to 1 fragment (10 GB) while DC6 still hosts no data; the other DCs host 12 fragments (120 GB) each. It is evident that relaxing the RTO bound allows the planner to take the expensive data centers out of the solution plan (both DC3 and DC6 are expensive, as they have FC storage disks with a storage cost of 0.09 dollars per MB). We can get a feasible distribution plan with a cost of 1763 dollars, but the RTO bound has to be relaxed to 24 h. The data centers with the least expensive storage (DC2 and DC5) have the maximum shares in this plan. The total data storage requirements and coding ratios (i.e. n/m) are different for the different distribution plans, as expected.

Table 6 describes scenarios where customer C1's RPO was minimized while continuing to satisfy the specified RTO and cost bounds (objective B). Note that for an RTO bound of 24 h and a cost bound of $1950, we get a distribution plan {9, 14, 0, 14, 14, 0} which minimizes the RPO (0.86 h). If we relax the cost bound to $2000, we get a better RPO bound and a different fragment distribution (comparatively more DCs with iSCSI disks get used). From the table, it is also evident that for a cost bound of 1950 dollars a feasible plan is not possible unless the RTO bound is kept around 24 h. This result is numerically in conformance with the results described in Table 5.
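The arithmetic behind the first feasible row of Table 5 can be checked with a few lines of Python (ours); it reproduces the 470 GB footprint, the protection level, and (taking the Table 4 costs as per MB) approximately the reported $15 231 storage cost.

shares = [8, 8, 8, 8, 8, 7]                         # fragments on DC1..DC6 (Table 5, RTO = 4.6 h)
frag_gb = 350.0 / 35                                # fragment size F_Cust / m_e = 10 GB
cost_per_mb = [0.008, 0.0006, 0.09, 0.008, 0.0006, 0.09]   # Table 4
print(sum(shares) * frag_gb)                        # 470.0 GB stored vs 700 GB for 2-way replication
print(min(sum(shares) - s for s in shares) >= 35)   # True: any single DC failure leaves >= m_e fragments
cost = sum(s * frag_gb * 1024 * c for s, c in zip(shares, cost_per_mb))
print(round(cost))                                  # 15233, within rounding of the $15 231 in Table 5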


Table 6
Minimized RPO and distribution plan for various cost and RTO bounds.

Minimized RPO (h) | RTO bound (h) | Cost bound ($) | m | n | Data center shares (DC1, DC2, DC3, DC4, DC5, DC6) of fragments
Infeasible | 6.0 | 1950.0 | - | - | -
Infeasible | 7.0 | 1950.0 | - | - | -
0.8628334 | 24.0 | 1950.0 | 35 | 51 | 9, 14, 0, 14, 14, 0
Infeasible | 4.0 | 1950.0 | - | - | -
Infeasible | 7.0 | 2000.0 | - | - | -
0.82172227 | 24.0 | 2000.0 | 35 | 50 | 11, 13, 0, 13, 13, 0

Table 7
Minimized RTO and distribution plan for various cost and RPO bounds.

RPO bound (h) | Minimized RTO (h) | Cost bound ($) | m | n | Data center shares (DC1, DC2, DC3, DC4, DC5, DC6) of fragments
1.0 | 8.503 | 1950.0 | 35 | 51 | 9, 14, 0, 14, 14, 0
1.2 | 8.503 | 1950.0 | 35 | 51 | 9, 14, 0, 14, 14, 0
1.4 | 8.503 | 1950.0 | 35 | 51 | 9, 14, 0, 14, 14, 0
1.8 | 8.503 | 1950.0 | 35 | 51 | 14, 14, 0, 9, 14, 0

Table 8
Customer parameters.

Customer | Incremental backup size (L1) (GB) | Total backup size (L0) (GB) | Protection level
C4 | 65.0 | 650.0 | 3
C3 | 55.0 | 550.0 | 1
C5 | 75.0 | 750.0 | 1
C2 | 45.0 | 450.0 | 2
C1 | 35.0 | 350.0 | 2

Table 9
Customer backup and recovery deadlines.

Customer | RPO deadline time quanta | RTO deadline time quanta
C4 | 4 | 11
C3 | 6 | 10
C5 | 4 | 7
C2 | 6 | 9
C1 | 5 | 9

Table 7 describes scenarios where C1's RTO was minimized while continuing to satisfy the specified RPO and cost bounds (objective C). For example, an RTO of 8.503 h can be achieved with an RPO bound of 1.8 h and a cost bound of 1950 dollars. Even if we tighten the RPO bound to 1 h, the same RTO limit can be achieved. Note that the RTO is relatively insensitive to the RPO bounds, as the data involved in recovery is much larger than that in the periodic backups.

5.1.1. Selecting an operating point from a range

Let us now discuss how a DR Service provider or broker can offer different operating choices to a customer. Suppose that a customer provides a range of RPOs, an operating range of RTOs and an overall cost bound. The service provider can take these inputs and create a set of operating points that are feasible within this cost bound. In Graph 1, we show some such operating points within an RPO range of 0.6–1.2 h and an RTO range of 5–18 h. The other data is the same as in Section 5.1. If the customer is more sensitive towards minimizing the RPO (timely backup is more important), then he has a choice of operating at certain points. If the customer is inclined towards minimizing the RTO (timely recovery is more important), other operating points become important to him. However, generally speaking, one can see (from the line graph for cost shown in Graph 1) that the operating cost increases with tighter RTO and RPO bounds. This trend is not strictly monotonic.

Graph 1. Solution points for single customer case: the X-axis represents various feasible points; 25 such points are shown. The Y-axis represents the value of RTO in hours, RPO in hours and cost in 1000 dollars. At each solution point there are 3 bars; the leftmost bar stands for RTO (in hours), the middle bar for RPO (in hours) and the rightmost bar for cost (in 1000 dollars).

5.2. Discussion of results for the multi-customer case

Let us discuss a case and a sample result for a multi-customer scenario. We consider an example with 5 sample customers from different categories and 6 DCs. The average volumetric parameters specific to each category are listed in Table 8, and those specific to the storage DCs are the same as in the single customer case, as presented in Table 4. All other infrastructure parameters are the same as before. As explained in Section 5.1, the values of the parameters are chosen for illustrative purposes and are not based on any particular criterion. We have made no assumption on the value ranges of the various parameters in the formulation, and the model will cater to any values of the parameters from the corresponding value ranges.

As depicted in Table 9, for each of the customers we assume the time quanta by which their backup and recovery have to be completed. For example, customer C1 of category 1 would like an RPO bound of not more than 5 time quanta and an RTO bound of not more than 9 time quanta, and so forth. We ran the test for variable lengths of the time quanta for backup and recovery. As shown in Table 10, for the first experiment we ascribe an RPO Time Quanta Length of 0.5 h and an RTO Time Quanta Length of 3 h. This means, for example, that a feasible plan does not exist where RPO/RTO bounds of {2.5/27}, {3/27}, {2/21}, {3/30} and {2/33}

Table 10
Scenario varying recovery time quanta length.

Number | RPO time quanta | RPO time quanta length (h) | RTO time quanta | RTO time quanta length (h) | Result
1 | 7 | 0.5 | 11 | 3 | Infeasible
2 | 7 | 0.5 | 11 | 7 | Refer to Table 11

Table 11
Distribution results for case 2 of Table 10.

Customer | Fragments per data center (DC1, DC2, DC3, DC4, DC5, DC6)
C4 | 22, 22, 22, 22, 22, 22
C3 | 18, 12, 16, 18, 12, 1
C5 | 25, 0, 26, 25, 0, 25
C2 | 12, 12, 12, 12, 12, 12
C1 | 9, 9, 9, 9, 9, 9
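As a sanity check (ours) on the plan in Table 11, the Python snippet below verifies that every customer keeps at least m_e fragments even when its protection-level worth of best-stocked data centers fails simultaneously (fragment size 10 GB, so m_e = L0/10 from Table 8).

plan = {                                    # fragments per DC, ordered DC1..DC6, from Table 11
    "C4": [22, 22, 22, 22, 22, 22],
    "C3": [18, 12, 16, 18, 12, 1],
    "C5": [25, 0, 26, 25, 0, 25],
    "C2": [12, 12, 12, 12, 12, 12],
    "C1": [9, 9, 9, 9, 9, 9],
}
m_e = {"C4": 65, "C3": 55, "C5": 75, "C2": 45, "C1": 35}
pl  = {"C4": 3,  "C3": 1,  "C5": 1,  "C2": 2,  "C1": 2}      # protection levels from Table 8

for cust, frags in plan.items():
    worst_loss = sum(sorted(frags, reverse=True)[:pl[cust]])  # the pl best-stocked DCs fail
    print(cust, sum(frags) - worst_loss >= m_e[cust])         # True for every customer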

Graph 2. Solution points graph for the 12 data centers case: the X-axis represents various feasible points; 13 such points are shown. The Y-axis represents the value of RTO in hours, RPO in hours and cost in 1000 dollars. At each solution point there are 3 bars; the leftmost bar stands for RTO (in hours), the middle bar for RPO (in hours) and the rightmost bar for cost (in 1000 dollars).

hours are possible for customers C1, C2, C5, C3, and C4 respectively. However, if we relax the RTO Time Quanta Length to 7 h, a feasible data distribution plan is possible; it is presented in Table 11, which contains the distribution plan for the various customers across the data centers. The distribution plan is shown only for the storage cost minimization scenario; we omit the distribution plan for the link usage cost minimization.

6. Scalability of the solution approach

The scalability of our solution, i.e. its ability to give results for large planning sets, depends primarily on the scalability of the underlying linear programming solver. With the current linear programming solver (IBM ILOG CPLEX Optimizer), we have taken extensive runs with the number of DCs varying from 3 to 50. A summary of the results for the single customer case, when the number of DCs is 12 and 24, is presented below. We generated a set of input values for the Storage Types, Free Storage Capacity, Storage Cost and IOPS values of 24 data centers within specific ranges. All the other settings and parameters remain the same as in the primary experiment scenario described in Section 5. For the same customer parameters described in Section 5, we ran our planning module for 12 data centers and then for 24 data centers. For the sake of brevity, and as the objective here is to show the scalability of the approach, we only present the solution point graphs (Graphs 2 and 3) for these two planning experiments. For both Graphs 2 and 3, a linear cost curve arrived at through regression is also plotted.

Graph 2 depicts the solution points for the customer described in Section 5, when there are 12 data centers available for storage.

Graph 3. Solution points graph for the 24 data centers case: the X-axis represents various feasible points; 19 such points are shown. The Y-axis represents the value of RTO in hours, RPO in hours and cost in 1000 dollars. At each solution point there are 3 bars; the leftmost bar stands for RTO (in hours), the middle bar for RPO (in hours) and the rightmost bar for cost (in 1000 dollars).

13 solution points are shown for this case. Graph 3 depicts the solution points for the same customer described in Section 5, when there are 24 data centers available for storage; 19 such solution points are shown. Please note that the actual values of the solution points will differ, based on the characteristics of the data centers such as Storage Types, Free Storage Capacity, Storage Cost


and IOPS values. However, the capability to generate the solution points depends only on the number of parameters, rather than on their actual values. So, irrespective of the actual values of the solution points, the ability to generate them in the cases of 12 and 24 data centers shows the scalability of the approach. As can be observed in these figures, the planning is very much scalable and the results are also predictable—a general increase in cost with tighter RPO and RTO bounds.

7. Related work

There is little published literature on the planning of distributed storage on the basis of performance and recoverability SLAs or for DR purposes. The Minerva system, discussed by Alvarez et al. [25], works on creating an array of storage nodes based on performance (transaction I/O supported, disk characteristics etc.). This work, however, focuses on how to design and optimize the storage array for optimal transaction rates; it does not deal with archival data with incremental data additions into the substrate, or with data availability during failure of nodes. Cost-based data placement for storage has been discussed in [26]; however, this work deals with static data and from the perspective of achieving data locality for supporting deterministic grid computing jobs running on the distributed sites. Recently, a work has been published on storage planning for disaster recovery [27]. The authors have characterized the planning process for storage volumes, paths, and DC zones. However, in that work the authors have characterized the planning problem as an infrastructure provisioning problem (i.e., creation or sizing of new DR storage infrastructure for a workload). Our work, in contrast, views planning as a constraint satisfaction problem for optimal data placement given a set of storage sites. Data recovery scheduling after a disaster has been studied in [28] using heuristic (GA-based) techniques. Cost-based resource provisioning of elastic cloud resources for workflow-based applications has been discussed in [29]. That work focuses on provisioning and aggregating differently priced virtual machine resources (compute and disk) having different capabilities to support a particular workload, based on a mixed integer programming formulation. Cost-minimized execution of a specific type of distributed workflow application over cloud nodes (where compute and data nodes are distributed), through a non-linear programming based model, is discussed in [30].

Distributed data coding schemes like Erasure Coding have been studied extensively in [13,31], especially for storage applications; these papers study various EC algorithms such as Reed–Solomon, Cauchy Reed–Solomon, Liberation coding and other forms of MDS coding. Works on using Erasure-like distributed coding for distributed fault tolerance in large-scale archival systems include OceanStore [10], PAST [32], and Farsite [11]. One important distinction of our work from these is that these works have an eventual recovery model, i.e., no recovery time objective SLA is set, while in our case the coding structure and data distribution strategy are strictly SLA dependent. The proposition discussed in [10] provides a significant motivation for our study. In recent times, other vendors such as EMC have come up with Erasure Coding based cloud backup systems like ATMOS [33]. The growing use of EC-like coding schemes in distributed backup systems underscores the efficacy of the model.
Multi-site replication of data for availability reasons is a well investigated problem. Assignment of distributed data files and allocation of nodes to files are studied by Wah [34]. Problems regarding page replication and migration are studied by Fleischer et al. [35]. There has been a series of works around data placement and replication in data grid environments, such as Bell [15]. Data replication and job scheduling have been studied in [36]. QoS-aware replication has been studied in [16]. QoS-aware brokering


services to cloud infrastructure have been studied in [37]. Our problem is of a different nature, as we are studying replication for archival fault-tolerance of data and not for performance.

There has been an extensive body of work on the diversity and economic aspects of cloud storage. In [38], the authors study the cost implications of using a Grid market place for various types of workload (a download server, an upload server, a computational server and a web server), where the CPU, storage and bandwidth requirements differ. In [39], the authors have built a cost model for a hybrid cloud scenario taking into consideration important variable and fixed input costs: electricity, hardware (compute and storage), software, premises, data transfer, labor, services etc. It is conjectured that for certain services the savings from moving to the cloud are significant and in other cases not so. In [40], the authors discuss the merits of a multi-cloud storage substrate, primarily to avoid vendor lock-in and to obtain distributed fault-tolerance, a key premise of this paper. Multiple aspects of the cost of cloud (across compute, storage and networks) vis-à-vis in-house data centers have been discussed in [41]. The authors point out significant benefits of cloud-based deployments in terms of lower provisioning and maintenance cost, higher utilization and more flexibility. However, the authors do not specifically discuss how data placement can be planned. The novelty of cloud storage architecture (which masks many internal complexities of storage management) is presented in [42]; the simplicity of using cloud storage is perhaps driving better adoption of the cloud. In [43], the authors discuss a cost-based dynamic data replication scheme based on user access patterns; however, the treatise is primarily on how to plan replication to support transactional queues (especially for cloud-based big data applications). Procuring and managing network services through cloud providers is tackled in [44], where service composition is viewed as a virtual graph mapping problem over different topologies, with the objective of cost minimization, studied through simulations; the nodes of the graph denote service placements and the edges denote the links between the services.

8. Conclusion and future work

With the advent of sophisticated online backup, multi-geography sites and cloud computing, multi-site DR is increasingly becoming popular for better reliability and availability of operational data. In this paper, we describe DDP-DR, a novel data distribution planner for multi-site disaster recovery, where backup data can reside in multiple data centers including the public cloud. The planner takes customer policy-level constraints and infrastructural constraints into consideration to suggest a series of data distribution plans. The work tries to fit customer requirements into existing DR topologies, as opposed to provisioning servers for DR as most of the other DR planning works have suggested. The generated plans help a client to plan its disaster recovery strategy with respect to multiple online backup providers. We also discuss an approach for a DR Service provider to test its capacity planning and operational feasibility in terms of the recovery objectives of different sets of clients.

The implications of this work are as follows. The work can be used as a guideline for formulations by organizations that are looking to build a DR planning strategy embracing remote distributed data centers and the cloud.
Although the actual planning issues in DR can be more complicated (especially on the operations side), the strategy for efficient data distribution and storage is often overlooked. This work shows that significant cost savings can be achieved by using a more heterogeneous storage mix (where cheaper nodes store redundant data) and by adopting erasure-coding-based data replication rather than plain replication of data across sites. The distribution strategy can take the RPO and RTO constraints into consideration right at the planning stage, so that data is distributed in a balanced manner across storage sites.
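To make the overhead comparison concrete, the following minimal sketch (written for this discussion only; it is not part of DDP-DR and not the paper's actual formulation) contrasts the storage overhead of a (k, m) erasure code with plain r-way replication, and brute-forces fragment placements over a handful of candidate sites against a budget, a protection level and a crude RTO proxy. All site names, prices and bandwidth figures below are invented for illustration.

from itertools import combinations

# Hypothetical inputs: one 1 TB backup set and five candidate storage sites,
# each described as (name, $ per GB-month, restore bandwidth in GB/hour).
DATA_GB = 1000.0
SITES = [
    ("dc-east", 0.030, 400.0),
    ("dc-west", 0.025, 300.0),
    ("cloud-a", 0.018, 150.0),
    ("cloud-b", 0.015, 120.0),
    ("cloud-c", 0.012, 100.0),
]

def replication_overhead(r):
    # Plain replication keeps r full copies: overhead factor r.
    return float(r)

def erasure_overhead(k, m):
    # A (k, m) code stores k data + m parity fragments, each of size DATA/k:
    # overhead factor (k + m) / k.
    return (k + m) / float(k)

def feasible_plans(k, m, failures_tolerated, rto_hours, budget_per_month):
    # Enumerate placements of the k + m fragments (one fragment per site) and
    # keep those meeting the protection level, the budget and a worst-case RTO proxy.
    if m < failures_tolerated:       # losing a site loses exactly one fragment
        return []
    fragment_gb = DATA_GB / k
    plans = []
    for subset in combinations(SITES, k + m):
        cost = sum(price * fragment_gb for _, price, _ in subset)
        # Fragments are restored in parallel, but the slowest site in the
        # placement may be among the k survivors that must serve one.
        restore_hours = fragment_gb / min(bw for _, _, bw in subset)
        if cost <= budget_per_month and restore_hours <= rto_hours:
            plans.append((round(cost, 2), round(restore_hours, 2),
                          [name for name, _, _ in subset]))
    return sorted(plans)

if __name__ == "__main__":
    # Both schemes below tolerate two simultaneous site failures.
    print("3-way replication overhead:", replication_overhead(3))        # 3.0
    print("(3, 2) erasure overhead:", round(erasure_overhead(3, 2), 2))  # ~1.67
    for plan in feasible_plans(3, 2, failures_tolerated=2,
                               rto_hours=4, budget_per_month=40.0):
        print("feasible plan:", plan)

With one fragment per site, the (3, 2) code survives any two site failures at roughly 1.67 times the raw data size, whereas three-way replication needs three times the capacity for the same protection level; this gap, combined with heterogeneous per-site prices and bandwidths, is what the planner exploits.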

In current practice, RTO and RPO are tested only later, as part of DR operational testing, and it is often found that lop-sided data placement has choked either the secondary sites or the network, so that the time bound for recovery cannot be met. Similarly, the work can serve as guidance for the new breed of online DR service providers who plan federated, cloud-based backup storage and disaster recovery services for customers, primarily in the small and medium business segments. The providers can use the formulation to create and serve different categories of customers based on the storage and bandwidth inventory they possess. They can also use the formulation to check a priori whether enough capacity exists to induct extra tenants into their system without affecting overall system performance. The work can also help researchers working in the cloud and utility computing space to build on and extend the optimization strategy. We are currently enhancing the formulation for capacity planning of multi-class services (storage, DR, staging and test environments) in cloud-based data centers. Also in progress is the notion of incremental cost planning for customer cloud bursting. Another possible extension is to adapt the formulation to different categories of data; for example, the objective when planning storage for mission-critical data would be to maximize data availability and fault tolerance.

References

[1] J. Schiers, Multi-PB distributed databases, IT Division, DB Group, CERN, presentation available at: http://www.jamie.web.cern.ch/jamie/LHC-overview.ppt.
[2] A. Gulli, A. Signorini, The indexable web is more than 11.5 billion pages, http://www.divms.uiowa.edu/~asignori/papers/the-indexable-web-is-morethan-11.5-billion-pages/.
[3] Mozy Online Backup Storage at http://www.mozy.com/.
[4] NetApp SnapMirror technical documentation at www.symantec.com/business/support/resources/.../288533.pdf.
[5] K.S. Zahed, P.S. Rani, U.V. Saradhi, A. Potluri, Reducing storage requirements of snapshot backups based on rsync utility, in: Proceedings of the First International Workshop on Communication Systems and Networks, COMSNETS, 2009.
[6] J. Chidambaram, C. Prabhu, et al., A methodology for high availability of data for business continuity planning/disaster recovery in a grid using replication in a distributed database, in: Proceedings of TELECON08 Conference, 2008.
[7] Amazon Glacier storage system at http://aws.amazon.com/glacier/.
[8] Rackspace Cloudfile storage system at http://www.rackspace.com/cloud/files/.
[9] Zamanda cloud backup at http://www.zmanda.com/cloud-backup.html.
[10] J. Kubiatowicz, D. Bindel, Y. Chen, S. Czerwinski, P. Eaton, D. Geels, R. Gummadi, S. Rhea, H. Weatherspoon, W. Weimer, C. Wells, B. Zhao, OceanStore: an architecture for global-scale persistent storage, in: Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS, 2000.
[11] A. Adya, W.J. Bolosky, M. Castro, G. Cermak, R. Chaiken, J.R. Douceur, J. Howell, J.R. Lorch, M. Theimer, R.P. Wattenhofer, FARSITE: federated, available, and reliable storage for an incompletely trusted environment, in: Proceedings of the 5th Symposium on Operating Systems Design and Implementation, OSDI, USENIX, 2002.
[12] M. Wallace, L. Webber, The Disaster Recovery Handbook—A Step-by-Step Plan to Ensure Business Continuity and Protect Vital Operations, Facilities, and Assets, American Management Association, 2007.
[13] J.S. Plank, Erasure codes for storage applications, tutorial given at FAST-2005: 4th Usenix Conference on File and Storage Technologies, 2005.
[14] Disaster Recovery issues whitepaper at http://www.hds.com/assets/pdf/wp_117_02_disaster_recovery.pdf.
[15] W.H. Bell, D.G. Cameron, R. Carvajal-Schiaffino, A.P. Millar, K. Stockinger, F. Zini, Evaluation of an economy-based file replication strategy for a data grid, in: Proceedings of the 3rd International Symposium on Cluster Computing and the Grid, CCGRID, 2003.
[16] X. Tang, J. Xu, QoS-aware replica placement for content distribution, IEEE Trans. Parallel Distrib. Syst. 16 (10) (2005).
[17] H. Bin, A general architecture for monitoring data storage with openstack cloud storage and RDBMS, Appl. Mech. Mater. 347–350 (2013) 739–742.
[18] R. Rodrigues, B. Liskov, High availability in DHTs: erasure coding vs. replication, in: Proceedings of P2P Systems IV, in: Lecture Notes in Computer Science, vol. 3640, 2005, pp. 226–239.
[19] R. Rodrigues, B. Liskov, High availability in DHTs: erasure coding vs. replication, in: Proceedings of the International Workshop on Peer-to-Peer Systems, IPTPS, 2005.
[20] Z. Chen, X. Wang, Y. Jin, W. Zhou, Exploring fault-tolerant distributed storage system using GE code, in: Proceedings of the Embedded Software and Systems Conference, IFIP, 2008.

[21] H. Weatherspoon, J.D. Kubiatowicz, Erasure coding vs. replication: a quantitative comparison, in: Lecture Notes in Computer Science, vol. 2429, Springer, 2002.
[22] G. Nemhauser, L. Wolsey, Integer and Combinatorial Optimization, John Wiley and Sons Inc., 1988.
[23] M. Macias, J. Guitart, Client classification policies for SLA negotiation and allocation in shared cloud data centers, in: Proceedings of Economics of Grids, Clouds, Systems and Services—8th International Workshop, GECON 2011, 2011.
[24] J. Altmann, C. Karyen, How to charge for network services—flat-rate or usage-based, J. Comput. Netw. 36 (5–6) (2011) 519–531.
[25] G.A. Alvarez, B. Borowsky, Minerva: an automated resource provisioning tool for large-scale storage systems, ACM Trans. Comput. Syst. (TOCS) (2001).
[26] U. Cibej, B. Silvnik, B. Robic, The complexity of static data replication in data grids, J. Parallel Comput. 31 (8–9) (2005) 900–912.
[27] S. Gopisetty, E. Butler, S. Jaquet, S. Korupolu, M. Seaman, et al., Automated planners for storage provisioning and disaster recovery, IBM J. Res. Dev. 52 (4–5) (2008) 353–366.
[28] K. Keeton, D. Beyer, et al., On road to recovery—restoring data after disaster, in: Proceedings of the European Systems Conference, EuroSys, 2006.
[29] E. Byun, Y.-S. Kee, J.-S. Kim, Cost optimized provisioning for elastic resources for application workloads, J. Future Gener. Comput. Syst. 27 (8) (2008) 1011–1028.
[30] S. Pandey, A. Barker, K.K. Gupta, R. Buyya, Minimizing execution costs when using globally distributed cloud services, in: 24th IEEE International Conference on Advanced Information Networking and Applications, AINA, April 2010, pp. 222–229.
[31] J.A. Cooley, J.L. Mineweaser, L.D. Servi, E. Tsung, Software based erasure codes for scalable distributed storage, in: IEEE Symposium on Mass Storage, 2003.
[32] M. Castro, P. Duschel, A. Rowstron, D.S. Wallach, Secure routing for structured peer-to-peer overlay networks, in: Proceedings of SIGCOMM, 2001.
[33] EMC ATMOS cloud storage site at http://atmosonline.com/atmos/.
[34] B.J. Wah, An efficient heuristic for file placement in distributed databases, in: Proceedings of the ACM Computer Software and Applications Conference, 1981.
[35] R. Fleischer, S. Seiden, New results for online page replication, in: Proceedings of the Third International Workshop on Approximation Algorithms for Combinatorial Optimization, 2000.
[36] K. Ranganathan, I. Foster, Simulation studies of computation and data scheduling algorithms for data grids, J. Grid Comput. 1 (1) (2003) 53–62.
[37] D. D'Agostino, A. Galizia, A. Clematis, M. Mangini, I. Porro, A. Quarati, A QoS-aware broker for hybrid clouds, J. Comput. 95 (1) (2013) 89–109, Springer.
[38] M. Risch, J. Altmann, Cost analysis of current grids and its implications for future grid markets, in: GECON 2008, 5th International Workshop on Grid Economics and Business Models, 2008, pp. 13–27.
[39] Mahdi M. Kashef, J. Altmann, A cost model for hybrid clouds, in: GECON 2011, 8th International Workshop on Economics of Grids, Clouds, Systems, and Services, in: LNCS, Springer, Paphos, Cyprus, 2011.
[40] H. Abu-Libdeh, L. Pricehouse, H. Weatherspoon, RACS: a case for cloud storage diversity, in: Proceedings of the ACM Symposium on Cloud Computing, SoCC, 2010.
[41] A. Greenberg, J. Hamilton, D.A. Maltz, P. Patel, The cost of a cloud: research problems in data center networks, ACM SIGCOMM Comput. Commun. Rev. 39 (1) (2009).
[42] W. Zeng, Y. Zhao, K. Ou, W. Song, Research on cloud storage architecture and key technologies, in: Proceedings of the 2nd International Conference on Interaction Sciences: Information Technology, Culture and Human, 2009.
[43] Q. Wei, B. Veeravalli, B. Gong, L. Zeng, CDRM: a cost-effective dynamic replication management scheme for cloud storage cluster, in: Proceedings of the IEEE International Conference on Cluster Computing, 2009.
[44] K. Tran, N. Agoulmine, Y. Iraqi, Cost effective complex service mapping in cloud infrastructures, in: IEEE-IFIP Network Operations and Management Symposium, NOMS 2012, Hawaii, USA, May 2012.

Shubhashis Sengupta is a senior researcher with Accenture Technology Labs. He received a Bachelor's degree in Computer Science from Jadavpur University, Calcutta, and a Ph.D. in Management Information Systems from the Indian Institute of Management. His research interests include distributed computing, performance engineering, grids, cloud computing and software architecture. Shubhashis is a senior member of the ACM and a member of the IEEE Computer Society.

Annervaz K M is a researcher at Accenture Technology Labs, Bangalore. He graduated with a Master's in Computer Science and Engineering from the Indian Institute of Technology Bombay. His main research interests are in algorithms, system models, formal verification, machine learning and optimization techniques. His current research focus is on applying optimization and machine learning techniques to real-world problems, especially in the areas of software engineering and system architecture.