Journal Pre-proof
Strict timed causal consistency as a hybrid consistency model in the cloud environment
Hesam Nejati Sharif Aldin, Hossein Deldari, Mohammad Hossein Moattar, Mostafa Razavi Ghods
PII: S0167-739X(19)32105-3
DOI: https://doi.org/10.1016/j.future.2019.11.038
Reference: FUTURE 5310
To appear in:
Future Generation Computer Systems
Received date: 7 August 2019
Revised date: 1 October 2019
Accepted date: 27 November 2019
Please cite this article as: H. Nejati Sharif Aldin, H. Deldari, M.H. Moattar et al., Strict timed causal consistency as a hybrid consistency model in the cloud environment, Future Generation Computer Systems (2019), doi: https://doi.org/10.1016/j.future.2019.11.038.
This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
© 2019 Published by Elsevier B.V.
Strict timed causal consistency as a hybrid consistency model in the cloud environment
Hesam Nejati Sharif Aldin (a,∗), Hossein Deldari (b), Mohammad Hossein Moattar (a), Mostafa Razavi Ghods (a,∗)
(a) Department of Computer Engineering, Mashhad Branch, Islamic Azad University, Mashhad, Iran
(b) Department of Computer Engineering, Salman Institute of higher education, Mashad, Iran
Abstract
Cloud computing is a model of distributed systems that gives users access to virtual resources such as processing power, storage, and applications. Storage as a Service (SaaS) is one of the cloud computing services. Cloud storage systems provide this service to end-users and deliver data availability, durability, and global accessibility over the Internet. High data availability and scalability are crucial criteria for end-users of cloud storage systems, and achieving them requires replication. However, replication causes data to fall out of sync among replicas in different cloud data-centers, which also reduces the performance of cloud storage systems. Replication is therefore one of the crucial challenges in these systems, which must keep data synchronized among replicas by implementing consistency policies. In this paper, we present Strict Timed Causal Consistency (STCC), a hybrid consistency model for the cloud environment. The model has two components: client-side and server-side. At the client-side, it supports the monotonic read, monotonic write, read your write, and write follow read consistencies. At the server-side, it supports Timed Causal Consistency (TCC). It is stronger than the client-centric approaches and more flexible than the data-centric approaches. Even under network partitioning, the proposed method guarantees consistency and satisfies data availability. Cassandra is a NoSQL database with high scalability and availability that offers multiple consistency levels as a service, such as ONE, ALL, and QUORUM. We have examined the proposed approach against these consistency levels of Cassandra and against Causal Consistency (CC).
The Yahoo Cloud Serving Benchmark (YCSB) provides a number of workloads, which we use to evaluate the proposed method. We executed different workloads on Cassandra clusters and compared the performance of the proposed method with that of four other consistency levels in Cassandra. The experimental comparison of the proposed method with the ONE, ALL, QUORUM, and CC consistencies on a Cassandra cluster with 24 nodes shows that, on average, our approach reduced the stale read rate by 24% on workload-A and by 25% on workload-B. System throughput increased by more than 20% on workload-A and by almost 35% on workload-B.

Keywords: Replication, Session Consistency, Causal Consistency, Timed Consistency, Timed Causal Consistency, Cloud Computing, Data-centric, Client-centric

Contents
1 Introduction
2 Related works
3 Proposed method
  3.1 Scenario
  3.2 Assumptions
  3.3 Distributed Users Operations Table (DUOT)
  3.4 Auditing strategy
  3.5 Strict Timed Causal Consistency
  3.6 Garbage collection mechanism
  3.7 Estimation of Stale Read Rate
4 Experimental setup
  4.1 Consistencies in Cassandra
  4.2 Performance Evaluation
    4.2.1 Practical Performance
    4.2.2 System throughput
    4.2.3 Stale read rate estimation
    4.2.4 Network latency of read operations
5 Discussion
6 Conclusion and future works

∗ Corresponding author
Email addresses: [email protected] (Hesam Nejati Sharif Aldin), [email protected] (Hossein Deldari), [email protected] (Mohammad Hossein Moattar), [email protected] (Mostafa Razavi Ghods)
Preprint submitted to Elsevier
1. Introduction

Context. In recent years, cloud computing has gained great popularity among users and organizations. This technology is a computational model that provides resources such as processing units, storage, and services [1]. Most web services and social networks, such as Facebook, Instagram, and Google, produce huge volumes of data, so it is necessary for them to avoid problems like data flooding. Various definitions of big data exist in the literature. Here we quote one of them: “the amount of data which is more than the ability of technology to store, manage and process efficiently” [2, 3]. Big data needs to be stored on a very large storage space. Cloud storage systems and the so-called NoSQL systems on cloud computing are generally compatible with each other and can provide the space to store big data [4]. Cloud computing models can be categorized into three types: (a) A private cloud uses an internal data-center in an organization and is not accessible for common use [1, 5, 6]. (b) A public cloud is a cloud infrastructure offered for general use on the pay-as-you-go model [1, 5, 6]. (c) A hybrid/multi cloud is a combination of two or more distinct infrastructures (private and public cloud) [1, 5, 6]. Cloud computing provides cost-effective services to end-users through the Internet [7]. Its services fall into three main categories: Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS). A number of new services have recently been forked out of these three main categories, e.g., Model as a Service (MaaS) [5], Database as a Service (DaaS) [8–10], and Storage as a Service (SaaS) [11, 12]. In contrast to traditional storage systems, cloud storage systems are cost-effective, and many clients consider them more flexible than local storage systems [13].

Cloud storage systems are classified into three categories: (a) object-oriented storage systems such as Amazon S3 and Amazon EC2; (b) personal storage systems such as Microsoft OneDrive and iCloud; and (c) database systems such as Microsoft Azure SQL and Dynamo DB [14]. These storage systems pursue goals such as reliability, scalability, availability, and data sharing, all of which require replication [15]. Replication is one of the most important challenges in cloud storage systems. These systems place replicas across multiple data-centers to serve clients more quickly. The replicas therefore have to be available at the nearest data storage, so that clients can request data locally from the nearest replica and receive a quick response.

Existing challenges. The replication process increases the data availability of the system. However, replication works against convergence among the replicas, and a reduction in convergence degrades the performance of cloud storage systems. The replication process thus brings about some important challenges for cloud storage systems. One of these challenges is to ensure that the data are synchronized among the replicas. Consistency, based on storage system policies, ensures this synchronization in cloud storage systems [16]. Various theories have introduced different consistency policies for storage systems, and many researchers have based their policy on the CAP theorem [17, 18]. According to the CAP theorem, linear consistency is stronger than the data-centric models (and their subcategories such as sequential and causal) and the client-centric models (and their subcategories such as eventual and monotonic read) in terms of the execution order of operations. Furthermore, in spite of fair-loss in a system with network partitioning, consistency is satisfied in the cloud storage systems. However, the convergence that linear consistency demands among the replicas is unattractive for Cloud Service Providers (CSPs) [19]. Strong consistencies such as linear require effort to coordinate replicas in different locations, so users face high network latency [16], which runs against cloud storage system policies. Eventual consistency, in contrast, is a very popular, though highly specific, alternative. This model tolerates inconsistency for some periods of time while ensuring that all replicas converge over time. The lower the convergence among the replicas, the higher the availability of the replicas for the clients [19].

Additionally, some researchers have introduced other client-centric consistencies, namely monotonic read, monotonic write, read your write, and write follow read, as session consistency models [20–26]. Each session determines the consistency domain of the model. One condition for realizing this model is that the client keeps the state of its previous session. Monotonic read reflects continuous reads of the set of write operations [24]. If the continuous read is done by a client, the update will be performed by the same client [27]. Read your write requires that every write operation be visible in order of execution. In other words, each read operation reflects the result of its previous write operation [24]. In monotonic write, a client's write operation on a replica is guaranteed only once all previous write operations by that client on the same replica are registered [27, 28]. Write follow read consistency is also known as causal session consistency [22]. In this model, update propagation is the result of the previous read operation. However, session consistency is a weak consistency that sacrifices consistency for data availability. To prevent data breaches among the replicas, CSPs do not use this consistency unless the clients' applications call for it, as with distributed mailboxes, web page updates, function and software libraries, or users' newsgroups [28].

Causal consistency (CC) was proposed for distributed systems based on the happened-before relation of Lamport [29]. This model makes a trade-off between consistency and data availability. Some data-centric models that are stronger than causal, such as linear and sequential, are not practically applicable, because the replicas are always prone to change and no strict convergence happens among them. The convergence feature specifies that all replicas should be in the same state, whereas causal consistency allows them to be temporarily divergent. Causal consistency not only tolerates network partitioning with a low degree of convergence, but also guarantees consistency and data availability with the highest satisfaction [16]. Moreover, causal consistency can be considered the strongest consistency that combines causal arbitration and causal visibility [30]. It also has low network latency while partitioning occurs in the network [17, 30, 31]. Since this model does not consider physical time, it cannot provide convergence by itself. Implementing it therefore requires a logical time [32] for each event. Adding logical time to the model yields the Timed Causal Consistency (TCC) [20, 33–35].

Proposed solution. In this paper, we propose the Strict Timed Causal Consistency (STCC) as a hybrid consistency model. In each session, it is intended to guarantee consistency between a user's operations in consecutive sessions and the operations performed by other users, at both the client-side and the server-side. Causal relations can exist among the users' operations, the data in shared data storage, the stored keys in a system, the user sessions, and the log files in the replicas. In session consistency, too, the target of the consistency can differ: sometimes these models are applied to the client's data, sometimes to the sessions, and sometimes to the operations executed by the client. The data-centric and client-centric models are similar in where they apply: both concern the consistency between a user's operations and sessions, as well as the causal relations between the operations of users in different sessions. They differ in that causal consistency focuses on the execution of users' operations at the server-side, whereas the client-centric models (monotonic read, monotonic write, read your write, and write follow read) are applied to users' operations at the client-side.

Contributions. There is a trade-off between convergence and consistency. If the consistency is weak, the convergence among replicas is reduced and the probability of stale reads increases. The stale read rate of the client-centric models is therefore higher than that of the others [7, 36, 37]. More precisely, the major contributions of this paper are as follows:
• We have proposed a consistency that, like the linear, has a high degree of convergence among the replicas at the server-side.
• The proposed method tackles the low scalability and low degree of convergence of the client-centric models at the client-side.
• In the proposed consistency, as in eventual consistency, users are satisfied with respect to network partitioning and data availability, which reduces the probability of stale reads.
• The proposed method reduces the synchronization among the replicas, and therefore the network latency of updating them decreases.
• Because the network latency is low, the system can handle more operations, read operations in particular, which increases the system throughput.

Experimental setup. Storage systems such as Cassandra [38], Google Bigtable [39], and Amazon Dynamo [40] have proved efficient for storing NoSQL data and providing services at large scale. Most of these systems, such as Amazon Dynamo, have chosen eventual consistency or weak session consistency [41, 42], in which replicas are updated gradually, reducing the synchronization overhead. Each cloud database pursues a specific goal. For example, Apache Cassandra [38] is an open source cloud database used by many applications such as AppScale [43], Instagram, and Facebook [44]. The proposed method is implemented on the Cassandra cloud database. It tries to tackle challenges such as network latency, response time, and system throughput [21, 24, 26]. All previous works apply the Complete Replication and Propagation (CRP) protocol, which makes causal consistency easier to achieve [14, 17, 20, 33–35]. To apply the CRP in Cassandra, the NetworkTopologyStrategy is employed and all the replicas in our cloud data-centers are updated.

The remainder of this paper is organized as follows. In section 2, different consistency models are reviewed. The proposed model and the studied scenario are introduced in section 3. Section 4 presents the evaluation of the proposed model, and the results are discussed and concluded in sections 5 and 6, respectively. Finally, the Appendix explains four different client-centric and causal consistency algorithms.

2. Related works

Timed consistency determines the execution order of the operations on the shared entities in cloud storage systems. By registering the logical time of the executed operations, it also validates the execution order among the operations [45]. However, since the actual time is not available, causal consistency models fail to provide convergence. To tackle this problem, we need a causal consistency model within a time framework. This model is called TCC [33]. Causal+, which is used in the COPS storage system, guarantees convergence by combining causal consistency with convergent conflict handling, which ensures that the replicas do not remain inconsistent permanently [46]. Our proposed STCC overcomes the convergence problem at the client-side by supporting monotonic read, monotonic write, read your write, and write follow read.

The application of meta-data on small audit clouds as a consistency service has been proposed in [20]. The meta-data stored in a cloud requires causal consistency, and the approach then investigates whether the appointed consistency level is reached or not. Finally, local auditing is used to check monotonic read and read your write. Additionally, all users have the same view of the meta-data of each node. STCC is a data-centric model that supports monotonic read and read your write at the client-side, along with the monotonic write and write follow read schemes.

Replication is applied to make cloud storage systems more reliable and fault tolerant, and it reduces the data access time. Applying the CRP protocol makes causal consistency easier to execute. Using the Full-Track algorithm under the CRP protocol, this consistency updates the local replica and reduces false causal relations. The results of this algorithm are fed to the Opt-Track-CRP algorithm, which reduces the control information and update messages [47]. These two algorithms allow causal consistency to be executed efficiently with partial replication.

Caelus [14] is a device which registers information about the time and order of operations from the CSPs and then sends it to the other devices. This operation only needs to be confident about the duration and length of the operations as well as their accordance with the causal consistency model. Ultimately, all the devices have the same view of the event time and order of the operations. The mechanism applied by Caelus first verifies whether there are any inconsistencies in the cloud. Then, it validates the time and order of the operations in the cloud. Other devices need to be connected directly in order to verify the consistency of the operations after reading the information from the cloud services.

Weak causal consistency is a type of causal consistency [48]. It tries to maintain the causal relationships while focusing on convergence. Causal convergence is another type of causal consistency, which combines weak causal consistency and convergence. Finally, there is the causal consistency which matches causal memory while using shared memory. A causal consistency algorithm can be simply implemented in cloud storage systems. This model varies according to whether it is used with partial or complete replication protocols. In [35], comparative causal consistency algorithms have been investigated on the partial and full replication protocols, and the performance of these two algorithms has been evaluated.
3. Proposed method

Apache Cassandra is a cloud database that is distributed over different nodes of the data center. This database supports consistency as a service. The consistency levels it offers include ONE, ALL, QUORUM, and SERIAL. In this paper, we have implemented our proposed method in Cassandra and compared its performance with different levels of consistency in Cassandra. The description of our proposed method is organized into seven sections. In section 3.1, we consider the scenario for our approach, which targets a social network. Users send their messages by connecting to a CSP server, and other users comment on their posts. The purpose of this scenario is to provide different levels of consistency to the users at the client-side, as well as to the CSPs at the server-side. In section 3.2, we make some simplifying assumptions to ease the implementation of our approach. In section 3.3, we propose the Distributed User Operations Table (DUOT). In this table, users' operations are registered according to their timestamps. The table is also replicated on the other nodes. In section 3.4, the strategy is to inspect the rules on the operations in order to find, by our proposed method, the causal relationships among a single user's operations in the client-side table and among the operations of several users at the server-side. Then, in section 3.5, we present the STCC and its implementation on the DUOT. Our proposed approach examines the consistency of user operations based on the timestamps in the table, and provides the consistency to the user or to the CSPs. In section 3.6, we use the Garbage Collection mechanism to remove stale operations from the DUOT. Finally, in section 3.7, we describe how the stale read rate is estimated, so that we can evaluate the staleness of our proposed method in comparison with the consistency levels provided by Cassandra.

3.1. Scenario

Consider Fig. 1, in which several cloud servers are available to the user. A user tweets his/her message by connecting to a server. When this user reconnects to the same cloud server, or moves to another place and connects to another server, the following situations may occur:
• The user sees his/her previous or most recent tweet when connected to the new cloud server.
• The user retweets when connected to the new server.
• The user sees his/her new tweet upon connecting to a new cloud server.
• After reading his/her previous tweet, the user retweets again upon connecting to a new cloud server.
• A user tweets a text message when connected to the cloud server. Then another user reads the message content and answers the tweet of the first user.

Figure 1: A scenario in which an application requires the STCC.

In this section, we first introduce consistency as a cloud service in the Cassandra cloud database (Fig. 1). Then we describe the DUOT, along with how user operations are recorded. Subsequently, we introduce the types of operations performed by the user and how logical clocks are assigned to user operations in the distributed table. Then we present the structure of the audit strategy and its related definitions. Finally, the implementation of session consistency by the STCC is introduced. Consistency is a service model that includes cloud data on which users audit their own operations. This cloud is maintained by the cloud server, which includes the column-family data storage system [38]. Users are required to check their requested operations before running them. For example, if a user's document or program is sent to the cloud, its attribute is sent based on the user's unique identifier. Before uploading it to the cloud, the user checks the following three questions:

1. Is the level of the STCC provided by the cloud?
2. Has the cloud been violated or not?
3. Is the value read stale?

In our proposed system, the audit is carried out as a global strategy. Before executing its operations, each user registers them in the DUOT. All users access the DUOT simultaneously. Access to the DUOT is governed by timed sequential consistency, which is used to manage the operations on the data in this table. Each user operation is inserted in this table with a timestamp, so that the users' view of the operations in this distributed table is ordered by timestamp. This audit is carried out globally by all users in order to execute the correct operations. Each user's operations are reviewed against that user's latest operations as well as the other users' operations on the shared resource.

3.2. Assumptions

• Data replication is performed by the CRP, and all nodes have a replica [47].
• The communication cost of contacting the server to insert operations in the DUOT is considered negligible (close to zero). This covers the recording of the user operations and the insertion of each operation, along with its event time (logical clock as a timestamp), in a record in the DUOT [38].
• Write operations are based on three factors [20]. These factors determine the relations between operations, which represent the types of consistency levels:
  - Write operations are not related to any previous read operations. This factor shows that the write operation was performed on a server for the first time by a client. Read your write is correct when each read operation reflects the result of its previous write operations. Monotonic read is correct when it reflects continuous reads of the set of write operations.
  - Write operations are related to one or more read operations. Write follow read is correct when the update it propagates is the result of the previous read operation.
  - Write operations follow the previous write operations. The write operations are performed in the same order in which a process started them.
• By reading the value of a resource, the logical time can be specified by the timestamp [20].

3.3. Distributed Users Operations Table (DUOT)

Each user has access to the distributed user operations table and records its operations there. Each operation recorded in this table contains elements such as the type of operation, the ID of the user requesting the operation, the name of the resource, and the global logical clock vector. An operation is inserted together with the user, the type of operation (read/write), and the resource x on which the operation is performed; its logical clock is inserted at the same time. There are two types of operations: a read from a resource and a write to a resource. For example, R(x)a denotes reading the value a from the resource x, and W(x)a denotes writing the value a to the resource x. The values read or written can be either unique or common. In the DUOT, logical time takes the place of the physical clock, so all users have the same view of the order in which operations are executed. Each user maintains a logical clock to keep track of the logical time of its operations. Assume that we have N users. The logical clock of their operations on the common resource is a vector of N logical clocks, indexed by user ID, so that there is one clock for each user. For user i, 1 ≤ i ≤ N, the logical clock is < LC_1, LC_2, ..., LC_N > [49]. If LC_i is the logical clock of user i, it is stored in the DUOT as soon as the user sends its request to the CSPs. The user corresponding to LC_j likewise registers its logical clock in the DUOT when it sends its request to the CSPs. Initially, all logical clocks are zero, meaning that no user operations have been performed and the DUOT is empty. The blue circles in Fig. 2 illustrate read/write events on a common resource. Logical time increases as user operations are performed. If the first operation, W(x)a, is carried out by U1, then U1's request to write the value a to the common resource x is registered in the DUOT with the logical clock < 1, 0, 0 >. This logical clock is the timestamp stored in the DUOT, and other users such as U2 or U3 register their requests in the DUOT in the same way.
Figure 2: The logical clock of user operations.
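The recording of operations in the DUOT with vector timestamps can be sketched as follows. This is a minimal in-memory illustration, not the authors' implementation: the class and field names (`DUOT`, `Operation`, `record`) are hypothetical, and the merge rule (a user first takes the component-wise maximum with the newest row in the table, then increments its own component) is an assumption. Under that assumption the sketch reproduces the first ten timestamps listed in Table 1; for the final row it yields < 3, 5, 3 > rather than the < 3, 4, 3 > printed there.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Operation:
    user: int                # 0-based user index (U1 -> 0); hypothetical layout
    kind: str                # "R" for read, "W" for write
    resource: str            # shared resource name, e.g. "x"
    value: str               # value read or written, e.g. "a"
    clock: Tuple[int, ...]   # vector timestamp stored with the row


class DUOT:
    """Illustrative in-memory stand-in for the Distributed User Operations Table."""

    def __init__(self, n_users: int) -> None:
        self.n_users = n_users
        # One logical clock vector per user, all entries initially zero.
        self.clocks: List[List[int]] = [[0] * n_users for _ in range(n_users)]
        self.rows: List[Operation] = []

    def record(self, user: int, kind: str, resource: str, value: str) -> Operation:
        clock = self.clocks[user]
        if self.rows:
            # Assumption: the user first catches up with the newest row it
            # can see in the shared table (component-wise maximum).
            latest = self.rows[-1].clock
            for i in range(self.n_users):
                clock[i] = max(clock[i], latest[i])
        clock[user] += 1  # then ticks its own component for the new operation
        op = Operation(user, kind, resource, value, tuple(clock))
        self.rows.append(op)
        return op


if __name__ == "__main__":
    table = DUOT(n_users=3)
    table.record(0, "W", "x", "a")        # U1: W(x)a
    table.record(0, "W", "x", "b")        # U1: W(x)b
    op = table.record(1, "R", "x", "a")   # U2: R(x)a
    print(op.clock)                       # (2, 1, 0)
```

Keeping one clock vector per user, rather than a single shared counter, is what later allows the audit to distinguish causally ordered operations from concurrent ones.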
(1d)
lP
3.4. Auditing strategy
logical clock < 1, 0, 0 > < 2, 0, 0 > < 2, 1, 0 > < 2, 2, 0 > < 2, 3, 0 > < 2, 3, 1 > < 2, 3, 2 > < 2, 3, 3 > < 2, 4, 3 > < 2, 5, 3 > < 3, 4, 3 >
urn a
Operation (R/W) W(x)a W(x)b R(x)a R(x)b W(x)d R(x)a R(x)b R(x)d R(x)d W(x)c R(x)b
(O1 = r(x)a) ∧ (O2 = w(x)b)
re-
Table 1: Distributed User Operation Table.
User U1 U1 U2 U2 U2 U3 U3 U3 U2 U2 U1
In this strategy, each user inserts its operations into the DUOT independently. Each requested operation is then analyzed against the user's previous operations and against the operations requested by other users on the same resource, in order to detect causal relations between the new operation and the earlier ones. This strategy is performed globally, and the user operations are only read and write operations. Consistency is one of the major challenges in distributed storage systems, because a resource is accessed by multiple users simultaneously and the execution of their operations can therefore become inconsistent. The operations are recorded in the DUOT by the users according to their timestamps, and they may be recorded by a single client or by different clients. The executed operations are considered as follows:

$$(O_1 = r(x)a) \wedge (O_2 = r(x)b) \tag{1a}$$
$$(O_1 = w(x)a) \wedge (O_2 = r(x)a) \tag{1b}$$
$$(O_1 = w(x)a) \wedge (O_2 = w(x)b) \tag{1c}$$
$$(O_1 = r(x)a) \wedge (O_2 = w(x)b) \tag{1d}$$

Eq. 1a describes a read of value a from resource x on server $S_i$ by $O_1$, followed by a read of value b from resource x on server $S_j$ by $O_2$; at the client side this corresponds to monotonic read. At the server side, value a is read from x on $S_i$ and written to x on $S_j$ before value b is read from x on $S_j$. Eq. 1b describes a write of value a to resource x on server $S_i$ by $O_1$, followed by a read of value a from x on server $S_j$ by $O_2$; at the client side this corresponds to read your write. At the server side, value a is written to x on $S_j$ before it is read from x on $S_j$. Eq. 1c describes a write of value a to resource x by $O_1$, followed by a write of value b to x by $O_2$; at the client side this corresponds to monotonic write. At the server side, value a is written to x on $S_i$ and on $S_j$ before value b is written to x on $S_j$. Eq. 1d describes a read of value a from resource x on server $S_i$ by $O_1$, followed by a write of value b to x on server $S_j$ by $O_2$; at the client side this corresponds to write follow read. At the server side, value a is written to x on $S_i$ and on $S_j$ before value b is written to x on $S_j$. These server-side orderings indicate that the resource has causal consistency. The operations of user $C_i$ are considered in the order of the timestamps $T_{O_1} < T_{O_2}$ stored in the DUOT. Consequently, a user's new operations are compared with its previous operations and with the operations of the other users.

The comparison criteria are as follows [22], where $\rightsquigarrow$ denotes the causal relation between two operations:
• Causality between write operations on the same resource.
• The operations $o_1$ and $o_2$ are executed by the same client: $\exists\, o_1 \overset{C_i}{\rightsquigarrow} o_2$.
• A middle operation $o$ exists: $\exists_{o \in O}\,(O_1 \rightsquigarrow o \wedge o \rightsquigarrow O_2)$ is the causal relation between operations $o_1$ and $o_2$ issued by one or two different clients.
• The operations are performed concurrently by the same client or by two different clients: $\nexists\,(O_1 \xrightarrow{P_i} O_2 \vee O_2 \xrightarrow{P_i} O_1) \Rightarrow O_1 \parallel O_2$. Operations that have no causality are executed at the same time.

Causal consistency can be defined using Rule 1, which describes the behavior of this consistency model on a shared resource [22]:

$$\forall\, O_1, O_2 \in O_w \cup O_{S_i}:\ (O_1 \rightsquigarrow O_2 \Rightarrow O_1 \xrightarrow{S_i} O_2) \tag{Rule 1}$$
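The pairwise cases of Eqs. 1a to 1d and the ordering required by Rule 1 can be illustrated with a minimal Python sketch. The `Op` record, the function names, and the representation of a server's history as a list are assumptions made for illustration only; they are not part of the paper's implementation, and the value constraints of the equations (for example, that Eq. 1b reads the same value that was written) are not checked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """One DUOT entry (hypothetical layout): who did what, where, and when."""
    client: str
    kind: str       # "r" for read, "w" for write
    resource: str
    value: str
    ts: int         # logical timestamp stored in the DUOT


# Mapping from (kind of O1, kind of O2) to the session guarantee of Eqs. 1a-1d.
GUARANTEES = {
    ("r", "r"): "monotonic read",      # Eq. 1a
    ("w", "r"): "read your write",     # Eq. 1b
    ("w", "w"): "monotonic write",     # Eq. 1c
    ("r", "w"): "write follow read",   # Eq. 1d
}


def session_guarantee(o1, o2):
    """Return the guarantee that governs (o1, o2), or None if none applies.

    Assumption: a guarantee applies only when both operations come from the
    same client, touch the same resource, and o1 is older than o2.
    """
    same_session = o1.client == o2.client and o1.resource == o2.resource
    if not same_session or not o1.ts < o2.ts:
        return None
    return GUARANTEES.get((o1.kind, o2.kind))


def respects_rule1(causal_pairs, server_order):
    """Rule 1: every causally related pair must be applied in causal order.

    `server_order` is the list of operations in the order a server applied
    them; pairs with an operation the server has not seen are skipped.
    """
    pos = {op: i for i, op in enumerate(server_order)}
    return all(pos[a] < pos[b]
               for a, b in causal_pairs
               if a in pos and b in pos)


w_a = Op("C1", "w", "x", "a", ts=1)
r_a = Op("C1", "r", "x", "a", ts=2)
print(session_guarantee(w_a, r_a))
print(respects_rule1([(w_a, r_a)], [w_a, r_a]))
print(respects_rule1([(w_a, r_a)], [r_a, w_a]))
```

The last call models a server that applied the read before the write it depends on, which is the kind of ordering Rule 1 forbids.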
Rule 1 states that if the execution of operation $o_1$ causes operation $o_2$ on the replica held by server $S_i$, then the other processes must also observe operation $o_1$ first and operation $o_2$ second on their own servers [22]. In other words, execution follows the cause-and-effect relation between the operations.

3.5. Strict Timed Causal Consistency

Our proposed method has two components. The first is strictness: it enforces consistency at the client side and supports session consistency based on causal relationships. The second enforces the TCC at the server side. Both components rely on the occurrence time of each operation, at the client side or the server side, together with the causality between the operations. Fig. 3 depicts the behavior of the STCC model on user operations at the client side and the server side in terms of elapsed time, which is measured with the logical clock in the DUOT. Fig. 3 also illustrates the operations of client $C_i$ in terms of the elapsed times defined in [50]. Client $C_1$ writes value a at time $\varepsilon$ and then value b at time $\gamma$ on server $S_j$. The causal relation among the write and read operations of client $C_1$ is verified at the client side: $C_1$ can read value b from server $S_j$ only once value a has been written before value b. On this basis, consistency at the client side is violated in Fig. 3. Additionally, at the server side there is a causal relation between the write operations performed by client $C_1$ and the read operations performed by client $C_2$ on server $S_j$, and Fig. 3 shows that causal consistency among the client operations is violated as well. To avoid this inconsistency, our proposed method enforces consistency at both the client and the server sides. Fig. 3 thus shows violations among the operations of user $C_i$ in STCC, as well as differences in the execution order observed by $C_i$ or by the other clients. If the conditions of the following case studies are met, our proposed method executes the operations of Fig. 3 correctly.

Figure 3: Violation in causality order among the operations.

Case study 1: Client $C_i$ writes value v at time $\varepsilon$ on server $S_i$, and the same client should then read value v at time $\lambda$. If client $C_i$ wants to write value u on server $S_j$, value v must first be written on server $S_j$ at time $\varepsilon$ so that value u can be written on server $S_j$ at time $\lambda$. Then value u is read by client $C_i$ at time $\theta$. Therefore, the monotonic read and the STCC are not violated.

$$w(x_i,\varepsilon)v \xrightarrow{C_i} r(x_i,\gamma)v \mapsto w(x_i,\gamma)v \xrightarrow{S_j} w(x,\delta)u \xrightarrow{C_i} r(x_j,\theta)u \;\Rightarrow\; w(x,\varepsilon)v \xrightarrow{C_i} w(x,\gamma)u \xrightarrow{C_i} r(x_j,\theta)u \tag{2}$$

Case study 2: Client $C_i$ writes value v at time $\varepsilon$ on server $S_i$ and then wants to write value u on server $S_j$. Value v must be written on server $S_j$ at time $\gamma$ so that value u can be written by client $C_i$ at time $\delta$. In this case, the monotonic write and the STCC are not violated.

$$w(x_i,\varepsilon)v \mapsto w(x,\gamma)v \xrightarrow{S_j} w(x,\delta)u \;\Rightarrow\; w(x,\varepsilon)v \xrightarrow{C_i} w(x_j,\delta)u \tag{3}$$

Case study 3: Client $C_i$ writes value v at time $\varepsilon$ on server $S_i$ and then wants to read value u on server $S_j$. Value v must be written on server $S_j$ at time $\gamma$ so that value u can be read correctly by client $C_i$ at time $\theta$. Therefore, the read your write and the STCC are not violated in this type of operation.

$$w(x_i,\varepsilon)v \mapsto w(x,\gamma)v \xrightarrow{S_j} w(x,\delta)u \xrightarrow{C_i} r(x_j,\theta)u \;\Rightarrow\; w(x,\varepsilon)v \xrightarrow{C_i} w(x_j,\delta)u \xrightarrow{C_i} r(x_j,\theta)u \tag{4}$$

Case study 4: Client $C_i$ writes value v at time $\varepsilon$ on server $S_i$ and then wants to read value v on server $S_i$. If value u is to be written on server $S_j$, value v must first be read from server $S_i$ at time $\lambda$. Value v is then written on server $S_j$ at time $\delta$, and value u is written on the same server at time $\theta$. Therefore, the write follow read and the STCC are not violated in this type of operation.

$$w(x_i,\varepsilon)v \xrightarrow{C_i} r(x_i,\gamma)v \mapsto w(x,\delta)v \xrightarrow{S_j} w(x,\theta)u \;\Rightarrow\; w(x_i,\varepsilon)v \xrightarrow{C_i} r(x_i,\gamma)u \xrightarrow{C_i} w(x,\theta)u \tag{5}$$

Case study 5: If two causally related operations are executed by two different clients $C_i$ and $C_j$ under conditions 3 and 4, our proposed method performs them on the other servers according to their event times and the causality among them, and the STCC is not violated.

Algorithm 1 is our proposed method at the client side. Based on the causal relations between user operations and the operation event times, it supports monotonic read, monotonic write,
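Algorithm 1 Strict Timed Causal Consistency
Input: Distributed Users Operations Table (DUOT); $O_1$: old operation; $O_2$: new operation; $\alpha$ and $\beta$: the same value on different servers; $\theta$: new value on an available server; $\varepsilon$: timestamp; $\lambda$: next timestamp; $\delta$: latest timestamp.
procedure STCC
    while n < latest operation on DUOT do
        if x = y then
            if $O_1 \overset{C_i}{\rightsquigarrow} O_2$ then
                if $(O_1 \rightsquigarrow O_2) \Rightarrow O_1 \xrightarrow{S_i} O_2$ then
                    $\forall\,(w(x,\varepsilon)\alpha \xrightarrow{S_i} r(x,\gamma)\beta)$
                    if $\exists_{C_i} \exists_{S_j} \left[ (r(x,\varepsilon)\alpha \overset{C_i}{\rightsquigarrow} r(y_j,\gamma)\alpha) \vee (w(x,\varepsilon)\alpha \xrightarrow{C_i} w(y_j,\gamma)\theta) \mapsto r(y_j,\delta)\theta \right]$ then
                        Monotonic Read is correct
                    if $\exists_{C_i} \exists_{S_j}\, (w(x,\varepsilon)\alpha \overset{C_i}{\rightsquigarrow} w(y_j,\gamma)\theta) \wedge (w(x,\gamma)\beta \xrightarrow{S_j} w(y,\delta)\theta)$ then
                        Monotonic Write is correct
                    if $\exists_{C_i} \exists_{S_j}\, (w(x,\varepsilon)\alpha \overset{C_i}{\rightsquigarrow} r(x_j,\gamma)\theta \vee w(x,\varepsilon)\beta \xrightarrow{S_j} w(x,\gamma)\theta) \mapsto r(x_j,\delta)\theta$ then
                        Read Your Write is correct
                    if $\exists_{C_i} \exists_{S_j}\, (r(x,\varepsilon)\alpha \overset{C_i}{\rightsquigarrow} w(y,\delta)\theta) \wedge (w(x,\varepsilon)\alpha \xrightarrow{S_j} r(y,\gamma)\beta) \mapsto w(y_j,\delta)\theta$ then
                        Write Follow Read is correct
                if $(O_1 \rightsquigarrow O_2) \Rightarrow O_1 \xrightarrow{S_i} O_2$ then
                    $\forall\,(O_1 \xrightarrow{S_i} O_2)$
                    $\forall\, O_1 \wedge o \in O_{S_i}\, (O_1 \xrightarrow{S_i} o) \wedge \forall\, o \wedge O_2 \in O_{S_i}\, (o \xrightarrow{S_j} O_2)$
                    Timed Causal Consistency is correct
        $O_1 \leftarrow$ operation by $C_i$
        $O_2 \leftarrow$ operation by $C_j$
        if $O_1 \rightsquigarrow O_2$ then
            if $(O_1 \rightsquigarrow O_2) \Rightarrow O_1 \xrightarrow{S_i} O_2$ then
                $\forall\,(O_1 \xrightarrow{S_i} O_2)$
                $\forall\, O_1 \wedge o \in O_{S_i}\, (O_1 \xrightarrow{S_i} o) \wedge \forall\, o \wedge O_2 \in O_{S_i}\, (o \xrightarrow{S_j} O_2)$
                Timed Causal Consistency is correct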
read your write, and write follow read. If the operations are performed at the server side according to the two components, our model also supports the TCC. Based on Rule 1 and the five case studies above, our proposed method combines the client-centric and data-centric approaches as STCC in cloud computing. Algorithm 1 shows that the event time of an operation is important: the operations $O_1$ and $O_2$ are old or new with respect to their occurrence time. $\alpha$ and $\beta$ are identical values written on different servers, and $\theta$ is a new value written by the new operation on the nearest server. In this algorithm, if the resources in the two CSPs are not identical and the old operation $O_1$ and the new operation $O_2$ are issued by the same client $C_i$, the operations are executed on the CSPs. If the request for operation $O_1$ is sent by client $C_i$ to the CSPs before the request for $O_2$, then on all servers $S_i$ first $O_1$ ($W(x,\varepsilon)a$) and then $O_2$ ($R(y,\gamma)b$) is executed. Therefore, if the write operation has not been performed on a server, a read operation on the resource is a violation. Consequently, if the operations are causally related as in the aforementioned case studies, our algorithm works at the client side and the server side as follows:

• Client-side:
According to case study 1, when a client requests a read operation on a resource and the previous value $\alpha$ has been written in the CSPs, the client reads the latest value $\theta$ of the resource correctly. Therefore, the STCC guarantees that the monotonic read will not be violated.
According to case study 2, the write operation of the new value $\theta$ on the CSPs is performed correctly if the previous value $\alpha$ has been written on the CSPs. Thus, the STCC guarantees that the monotonic write will not be violated.
According to case study 3, the client can read the latest value written to the resource in the CSPs, whether by itself or by any other client, once the previous value $\alpha$ has been written by the client to the resource as well as to the other resources in each of the CSPs; the client then reads the new value $\theta$. Hence, the STCC guarantees that the read your write will not be violated.
According to case study 4, the client can write the new value $\theta$ to the resource in the CSPs if the previous value $\alpha$ already exists in the resource and is read before the new value is written. Therefore, the STCC guarantees that the write follow read will not be violated.

• Server-side:
According to case study 5, if the operations $O_1$ and $O_2$ are causally related, they can be executed by two different clients, and the CSPs execute them according to the logical event time of each operation. In this case, the STCC guarantees that the TCC will not be violated.

3.6. Garbage collection mechanism

During the audit process, each client stores its operations in the distributed table. Without intervention, the size of this table grows without bound, and its storage cost becomes too high. With a garbage collection mechanism, trivial operations can be eliminated while the audit remains effective. In our garbage collection mechanism, clients store only their latest read and write operations in the distributed table of client operations after each operation interval. This ensures that the table always contains the last read and the last write of each client, so unnecessary operations are eliminated and the table does not keep growing with the operation history. The audit then checks each new operation against these entries. If the current read operation of a client precedes its last read operation recorded in the table, the monotonic read consistency has been violated. If the value of the client's current write precedes the last value the client wrote to the table, the monotonic write consistency has been violated. If the client reads a value that differs from its own latest write, the read your write consistency has been violated. Finally, the client's writes are checked against its last recorded read to verify the write follow read consistency.

3.7. Estimation of Stale Read Rate

In this section, the stale read rate of the system is estimated using probability calculations [51]. This estimation model requires prior knowledge of the application's access pattern and of the network delay in the storage system. Network delay is important in this work and is defined as the time needed to propagate an update to the other replicas. The access pattern comprises two key factors, the read rate and the write rate. These two rates determine the consistency requirement of the storage system, and they are set by the requirements of the application. A read issued while an update is still in progress may be stale. The probability of a stale read is calculated with the exponential distribution: the function computes this probability for the period in which the client is updating the data locally or the updated value is being propagated among the replicas in the nodes of the data centers.

4. Experimental setup

Apache Cassandra is a distributed storage system whose design is inspired by Amazon Dynamo and Google BigTable. The selection and implementation of consistency in this system is very similar to Amazon Dynamo, and its data model is similar to that of Google BigTable [52]. The Yahoo Cloud Serving Benchmark (YCSB) is a framework that provides a set of workloads, called the YCSB core package, to evaluate different aspects of a system's performance. A package is a set of related workloads (workload-A, workload-B, . . .). Each workload represents a specific mix of read/write operations, data sizes, request distributions, and so on. A package tests a wide section of the performance space [52].
The workloads in the core package are variations of the same basic application. In this application, there is a table of records, each of which has F fields. Each record is identified by a primary key, which is a string such as "user112432". The fields are named field0, field1, and so on. The value of each field is a random string of ASCII characters of length L. Each operation against the data store is randomly chosen to be one of:

• Insert: Insert a new record.
• Update: Update a record by replacing the value of one field.
• Read: Read a record, either one randomly chosen field or all fields.
• Scan: Scan records in order, starting at a randomly chosen record key. The number of records to scan is randomly chosen.

The YCSB is applied to evaluate Cassandra. This benchmark offers several workloads (workload-A, workload-B, etc.), which can be varied. In this section we compare the proposed method with the ONE, QUORUM, and ALL consistencies using workloads A and B of the YCSB [52] in terms of run-time, throughput, stale read rate, and network latency.

4.1. Consistencies in Cassandra

In this section we define some of the consistency levels used in Cassandra [53]. These levels are compared with the proposed method in the following sections, with a focus on the weak and strong consistencies.

• ALL: This level of strong consistency is static. To avoid violations, all replica nodes are updated simultaneously.
• QUORUM: This policy provides eventual consistency. The number of replicas that must be accessible to perform an operation is calculated as $\lfloor \text{replication factor} / 2 \rfloor + 1$. Therefore, the immediate response from a replica under this policy is independent of its location (data center, rack, etc.).
• ONE: This is the first level of eventual consistency, in which only one replica, the fastest to respond, is responsible for the read or write operation.

In Cassandra, data are stored in tables and indexed by row keys. The data model is based on column families: column keys are grouped into sets named column families. The column families are defined for each table, and column keys can be generated dynamically. This data model offers great potential for big data structures and is also efficient and flexible [44]. Additionally, Cassandra provides efficient dynamic storage management. Like Google BigTable, Cassandra stores data in small tables named MemTables in memory. As soon as the size of a MemTable exceeds the threshold, the data are stored in an SSTable [44, 51].

4.2. Performance Evaluation

Three Cassandra clusters are used to evaluate the STCC. These three clusters contain a total of 24 nodes with 45 cores overall, each of which has 4 GB of memory. The local network is Gigabit Ethernet. The storage accessible to Cassandra is 12 TiB; the share of each cluster is 4 TB and the share of each node is 512 GB. The NetworkTopologyStrategy is applied as the replication strategy; with this strategy, data are stored in all clusters and racks. The YCSB benchmark is used to conduct experiments on cloud-based data storage and represents the workloads of current cloud serving systems. It has been extended for open-source databases such as MongoDB [44], Hadoop HBase [52], and Cassandra [38], and it offers workloads with different proportions of read and write operations for evaluating our proposed method. We applied YCSB 0.14.0 to Cassandra in order to analyze the different consistency levels at run-time. Workload-A consists of 50% read and 50% write operations and is called update heavy. Workload-B consists of 95% read and 5% write operations and is called read heavy. In our experiments, 4 million records are loaded and 5 million operations are executed for workloads A and B. We executed both workloads on 2, 8, and 16 nodes from two different data centers, and on 24 nodes from three different data centers. The benchmark was executed ten times for each of the ONE, QUORUM, ALL, and our proposed STCC consistency levels.

Figure 4: The communication between the Cassandra data-center and the execution of YCSB on these clusters.

4.2.1. Practical Performance

System Run-time. The different consistency levels were presented in the previous sections. In this section we study the run-time of each workload at every consistency level and investigate system performance considering workloads, threads
number, nodes number, and type of consistencies used between nodes. The run-time of the workloads is reported in milliseconds.

Table 2: Run-time of workload-A at different nodes with different consistency levels.
Run-time of workload-A (ms)
Nodes   ALL       ONE       QUORUM    CC        STCC
2       2105885   1006523   1457578   1284673   976334
8       2079575   906523    1536137   881643    808611
16      1470966   698428    1392261   657198    564722
24      1176947   531674    934889    510973    435132

Workload-A. Based on Table 2, the run-time of workload-A decreases when the number of threads is increased. In addition, increasing the number of nodes decreases the run-time at every consistency level, and Table 2 shows that this decrease is significant. Running workload-A with our proposed consistency (STCC) on 2 nodes took less time than the ONE, QUORUM, ALL, and CC consistencies (3%, 34%, 54%, and 24% respectively), as the values in Table 2 show. With 8 nodes, our proposed method took less time than ONE, QUORUM, ALL, and CC (11%, 47%, 61%, and 8% respectively). When the system ran workload-A with 16 nodes, STCC took less time than ONE, QUORUM, ALL, and CC (19%, 59%, 62%, and 14% respectively). Finally, the proposed method was applied on 24 nodes, where it took less time than ONE, QUORUM, ALL, and CC (18%, 53%, 63%, and 15% respectively). The run-time of STCC decreases significantly as the number of nodes increases, and STCC needs less time than the other consistency levels to run the workload. This decrease in run-time shows the scalability of our proposed method.

Figure 5: Throughput of the system based on workload-A in 2 nodes.

Figure 6: Throughput of the system based on workload-A in 8 nodes.

Table 3: Run-time of workload-B in different nodes with different consistency levels.
Run-time of workload-B (ms)
Nodes   ALL       ONE       QUORUM    CC        STCC
2       880459    5032061   1264137   754993    609820
8       4832753   4288880   3645790   3519443   3284879
16      4031881   4075193   3185798   2816341   2342415
24      3437235   3848012   2532984   2515927   2008968

Workload-B. The run-time of workload-B is shown in Table 3 under the same network-delay conditions, so that the effect of the workload type on the system can be observed. Workload-B, which has 45% more read operations than workload-A, shows that increasing the number of nodes does not necessarily decrease the run-time; depending on the consistency level, the run-time increases by a factor of 2 to 8. STCC has the smallest run-time with 2 nodes, and ONE has the highest. With 2 nodes, STCC has a lower run-time than ALL (31%), QUORUM (52%), and CC (19%), and according to Table 3 its run-time is 88% lower than that of ONE. Our proposed method performs better as the number of nodes increases and has a lower run-time than the other consistencies. With 8 nodes, STCC takes less time than ALL, ONE, QUORUM, and CC (32%, 23%, 10%, and 7% respectively) to run workload-B. With 16 nodes, STCC takes less time than ALL, ONE, QUORUM, and CC (42%, 43%, 26%, and 17% respectively). Finally, with 24 nodes, our proposed method takes less time than ALL, ONE, QUORUM, and CC (42%, 48%, 21%, and 20% respectively). Overall, with the workload held constant, our proposed consistency performs better in terms of run-time as the number of nodes increases.
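The percentages quoted for the run-time comparison can be recomputed directly from Tables 2 and 3. The following minimal Python sketch does so; the dictionary layout and the function name `reduction_pct` are illustrative assumptions, while the run-time values are copied from the two tables.

```python
# Run-times in milliseconds, copied from Table 2 (workload-A) and Table 3 (workload-B).
RUNTIME_MS = {
    "A": {
        2:  {"ALL": 2105885, "ONE": 1006523, "QUORUM": 1457578, "CC": 1284673, "STCC": 976334},
        8:  {"ALL": 2079575, "ONE": 906523,  "QUORUM": 1536137, "CC": 881643,  "STCC": 808611},
        16: {"ALL": 1470966, "ONE": 698428,  "QUORUM": 1392261, "CC": 657198,  "STCC": 564722},
        24: {"ALL": 1176947, "ONE": 531674,  "QUORUM": 934889,  "CC": 510973,  "STCC": 435132},
    },
    "B": {
        2:  {"ALL": 880459,  "ONE": 5032061, "QUORUM": 1264137, "CC": 754993,  "STCC": 609820},
        8:  {"ALL": 4832753, "ONE": 4288880, "QUORUM": 3645790, "CC": 3519443, "STCC": 3284879},
        16: {"ALL": 4031881, "ONE": 4075193, "QUORUM": 3185798, "CC": 2816341, "STCC": 2342415},
        24: {"ALL": 3437235, "ONE": 3848012, "QUORUM": 2532984, "CC": 2515927, "STCC": 2008968},
    },
}


def reduction_pct(workload, nodes, baseline):
    """Percentage by which the STCC run-time is below the baseline run-time."""
    row = RUNTIME_MS[workload][nodes]
    return 100.0 * (row[baseline] - row["STCC"]) / row[baseline]


for workload in ("A", "B"):
    for nodes in (2, 8, 16, 24):
        cells = ", ".join(
            f"{level} {reduction_pct(workload, nodes, level):.0f}%"
            for level in ("ALL", "ONE", "QUORUM", "CC")
        )
        print(f"workload-{workload}, {nodes} nodes: STCC is faster than {cells}")
```

A positive result means that STCC needed less time than the baseline level for the same workload and node count.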
4.2.2. System throughput

In this paper, we assess the system throughput by considering different consistency levels (ALL, ONE, QUORUM, CC, and STCC), different numbers of nodes (2, 8, 16, and 24 nodes), and the two workloads A and B. The results of this assessment are illustrated in Figures 5, 6, 7, and 8.

Workload-A. Workload-A is run with 1, 16, 64, and 100 threads in a system with 2, 8, 16, and 24 nodes, and the results are investigated. Throughput is studied for the different consistency levels introduced in the previous sections. Throughput is the number of operations executed by the system per unit of time (in seconds) under the promised consistency level, and it depends on the number of threads a client runs during the workload. Figures 5, 6, 7, and 8 show the throughput of the system for workload-A with 2, 8, 16, and 24 nodes. It can be clearly seen that the throughput increases up to 64 threads with the ALL, ONE, QUORUM, CC, and STCC consistencies. However, as the number of nodes increases, the growth of our proposed method slows down. According to the throughput results, STCC increases dramatically up to 16 threads, where the throughput is 2 times higher than with 1 thread. Contrary to expectations, the throughput of this consistency level decreases when the number of threads is raised to 100 and when the number of nodes is increased; this decreasing trend reduces the system throughput by 11%. Our proposed method consistently performs better than the ALL, ONE, QUORUM, and CC consistencies when running workload-A with 2, 8, 16, and 24 nodes. Also, a 35% growth in STCC is observed when the workload is run with 100 threads, compared with the other consistencies. In contrast, as the number of threads increases, the system throughput with the ALL, ONE, QUORUM, and CC consistencies decreases by between 3% and 17%, depending on the number of nodes. This decreasing trend in ALL, ONE, QUORUM, and CC slows down as the number of nodes increases.

Figure 7: Throughput of the system based on workload-A in 16 nodes.

Figure 8: Throughput of the system based on workload-A in 24 nodes.

Figure 9: Throughput of the system based on workload-B in 2 nodes.

Figure 10: Throughput of the system based on workload-B in 8 nodes.
Increasing the number of threads and nodes does not decrease the system throughput of our proposed method. Instead, it leads to an increase of almost 21% in system throughput. This increase is due to the composition of workload-A, which consists of 50% read and 50% write operations: in our proposed method, consistency matters only for operations that have a cause-and-effect relation. Read and write operations that have no cause-and-effect relation can run concurrently, which increases the number of executed operations and, ultimately, the system throughput. This is why our proposed consistency does not lose throughput as the number of threads and nodes increases, but gains it.
Workload-B. We run workload-B with different numbers of nodes and threads and assess the system throughput. Figures 9, 10, 11, and 12 illustrate the system throughput with 2, 8, 16, and 24 nodes. According to Fig. 9, the system throughput is significantly higher with the STCC than with the other consistencies, almost up to 100 threads. In other words, the system throughput with the ALL, ONE, and CC consistencies is respectively 33%, 17%, and 13% lower than with the STCC. As the number of threads in the system increases, the throughput of all consistencies decreases, but our proposed STCC method shows only a slight decrement. The three other figures show that our proposed method increases the system throughput, owing to the larger share of read operations compared with write operations in workload-B and to the type of relation between the operations: a read operation that has no cause-and-effect relation can run concurrently in a system with STCC. In Figures 9, 10, 11, and 12, with 2, 8, 16, and 24 nodes, the system throughput with ONE and QUORUM grows as the number of threads increases up to 64, but the throughput with ONE is almost 23% lower than with STCC, and the throughput with QUORUM is almost 29% lower than with STCC. The system throughput with CC is almost 12% lower than with STCC. Unlike our proposed method, ONE experiences a 22% drop in system throughput when the number of threads is increased to 100. The ALL consistency shows a 10% rise in system throughput as the number of threads increases on two nodes. ALL is the strong consistency. At this level, the system throughput depends on the amount of write operations and on the time needed to replicate them to more replicas. When the system takes too long to perform the write and replication operations, a read operation called by the client waits for a long time. For workload-B, in which 95% of the operations are reads, a longer write and replication delay leads to a longer waiting time for read operations and consequently decreases the system throughput. The throughput of a system with ALL is almost 42% lower than that of a system with STCC. The throughput of a system with the ALL consistency decreases by almost 43% if the number of threads exceeds 64 and the number of nodes is greater than 2.

Figure 11: Throughput of the system based on workload-B in 16 nodes.

Figure 12: Throughput of the system based on workload-B in 24 nodes.

Figure 13: Stale read estimation rate based on workload-A with 24 nodes.

Figure 14: Stale read estimation rate based on workload-B with 24 nodes.
4.2.3. Stale read rate estimation

Figures 13 and 14 show the probability of stale reads for workloads A and B with 24 nodes. They also show that although the estimated stale read rate with the ALL consistency is significantly lower, ALL has the lowest system throughput of all the consistencies. Therefore, ALL can be disregarded when high system throughput is required. Our STCC decreases the stale read rate significantly compared with QUORUM, ONE, and CC. With STCC, the stale read rate of the system with 24 nodes decreases by between 8% and 24% in workload-A and by between 2% and 15% in workload-B compared with QUORUM. Also, as the number of threads increases up to 64, the stale read rate with CC and STCC is reduced, in contrast with the QUORUM and ONE consistencies. Moreover, compared with CC, the estimated stale read rate with STCC is reduced by between 6% and 16% in workload-A and by between 4% and 17% in workload-B. Compared with the ONE consistency, the stale read rate of a system with 24 nodes using STCC has decreased by between 9% and 38% for workload-A, and for workload-B it has dropped
crease latency of read operation as the number of threads increases, but it has less delay than ALL. This level of consistency sacrifices consistency in exchange for lower latency. When running workload-A, our proposed method spends more time on write operations and their propagation to the other replicas because of the larger share of write operations, and it also investigates the cause-and-effect relation between operations. Nevertheless, the latency of read operations in our proposed method is lower than in the other compared methods, because STCC reduces the latency of the write operation as well as the time required to investigate the cause-and-effect relation between the operations. Thanks to the 5% share of write operations in workload-B, its latency is significantly lower than that of workload-A, which has 50% write operations. Therefore, the response time to a client's read request is lower than with QUORUM, ONE, and CC. The read latency obtained with STCC is lower than with ALL, ONE, QUORUM, and CC (by 29%, 21%, 9%, and 9.5% respectively).
Nodes = 24 ALL ONE Quorum Causal STCC
1
16
64
100
pro of
Client Threads (Workload-A)
1300 1200 1100 1000 900 800 700 600 500 400 300 200 100 0
Nodes = 24
ALL ONE
5. Discussion
Quorum Causal
Auditing strategy [20] is divided into two categories: local auditing, and global auditing. Cloud storage systems need additional time for each strategy, user should send their ”user operation table” to the auditor. The propagation of the user’s operation table to the auditor is temporally ineffective, as it must be analyzed by the auditor and then its result is sent back to the user. The time taken to choose the auditor increases as the number of users grows up. Also, the volume of tables which are being sent to the user increases and the auditor needs more time to investigate user tables. Therefore, system is faced with lack of scalability and consequently the response time increases as the number of users grow higher. There is a positive correlation between the number users and the number of messages sent/received by the auditor. Therefore, the more the number of the users, the higher the chances would be for the creation of a bottleneck at the CSPs. By adding consistencies like monotonic write and write follow read there would be an increment in the number of operations recorded in users’ operation table. This brings about more complexity in the analysis performed by the auditor. Thanks to our proposed STCC, the user operations table could be replaced by DUOT.
STCC
1
16
64
re-
99th Percentail Latency (ms)
Figure 15: Network latency of read operations by running workload-A.
100
Client Threads (Workload-B)
lP
Figure 16: Network latency of read operations by running workload-B.
urn a
between 6% to 32%, while using 16 threads. Stale read rate falls in our method by increasing the number of threads; however, it grows higher in a system with ONE consistency. As it can be seen in Fig 13 and 14, the highest difference in stale read values is between STCC and ONE on workload-A and workload-B with 100 threads. Moreover, the least difference in stale read values is between STCC and CC running workload-A and workload-B with 1 thread. Therefore we come to a conclusion that, the more replication time is taken between the replicas, the highest the probability of read operation from other nodes will be. Hence, the probability of stale read rate increases. Consistencies like QUORUM, ONE, and CC spend more time on replication in nodes. Therefore, the probability of stale read rate in a period of running node replication with read operation call will be more than our method.
Jo
6. Conclusion and future works
4.2.4. Network latency of read operations Figures 15 and 16 show the 99th percent-ail in read operation. This latency is based on the number of threads and workloads which run in 24 nodes. As it can be seen, ALL increases read operation time by increasing the number of threads. Therefore, if there is an increasing growth in latency, a system with ALL should wait longer for a response from different nodes of different racks when the read operation is called because after read it, the nearest replica of each node responds to the client’s request, although it violates the consistency. ONE will also in-
Convergence among replicas is an important issue in replication. Many different methods have been proposed for data consistency management in cloud storage systems. These methods target goals such as cost, energy consumption, scalability, performance, reliability, the number of violations, and the degree of staleness. STCC is a data-centric model. This model guarantees consistency at both the client side and the server side in cloud storage systems. It also provides a high degree of convergence. Moreover, in comparison with consistencies like monotonic read, monotonic write, read your write, and write follow read, our proposed method considerably improves the system throughput and reduces staleness and network latency. Our proposed STCC could be further improved in three respects. First, by considering the number of violations, we can prevent violations in the ordering of operations at the replicas. Second, by adding the stale read rate to STCC, we can guarantee that the latest updates are applied to the replicas. Third, there is a tradeoff between monetary cost and staleness, known as the consistency-cost; therefore, by reducing the consistency-cost we can lower the monetary cost.
References
[1] F. Zafar, A. Khan, S. U. R. Malik, M. Ahmed, A. Anjum, M. I. Khan, N. Javed, M. Alam, F. Jamil, A survey of cloud computing data integrity schemes: Design challenges, taxonomy and future trends, Computers and Security 65 (2017) 29–49. doi:10.1016/j.cose.2016.10.006. URL http://dx.doi.org/10.1016/j.cose.2016.10.006
[2] J. Manyika, M. Chui, B. Brown, J. Bughin, R. Dobbs, C. Roxburgh, A. H. Byers, Big data: The next frontier for innovation, competition, and productivity (2011).
[3] S. Yu, M. Liu, W. Dou, X. Liu, S. Zhou, Networking for Big Data: A Survey, IEEE Communications Surveys & Tutorials 19 (1) (2017) 531–549. doi:10.1109/COMST.2016.2610963. URL http://ieeexplore.ieee.org/document/7571188/
[4] D. Bermbach, S. Tai, Eventual consistency: How soon is eventual? An evaluation of Amazon S3's consistency behavior, Proceedings of the 6th Workshop on Middleware for Service Oriented Computing, ACM (2011) 1–6. doi:10.1145/2093185.2093186. URL http://dl.acm.org/citation.cfm?doid=2093185.2093186
[5] C. Yang, Q. Huang, Z. Li, K. Liu, F. Hu, Big Data and cloud computing: innovation opportunities and challenges, International Journal of Digital Earth 10 (1) (2017) 13–53. doi:10.1080/17538947.2016.1239771.
[6] A. Goyal, S. Dadizadeh, A survey on cloud computing, Technical Report for CS 58 (December) (2009) 55–58. doi:10.17148/IJARCCE.2016.54261. URL http://unbreakablecloud.com/wordpress/wp-content/uploads/2011/02/A-Survey-On-Cloud-Computing.pdf
[7] D. Bermbach, S. Tai, Benchmarking eventual consistency: Lessons learned from long-term experimental studies, Proceedings - 2014 IEEE International Conference on Cloud Engineering, IC2E 2014 (2014) 47–56. doi:10.1109/IC2E.2014.37.
[8] A. Arasu, S. Blanas, K. Eguro, M. Joglekar, R. Kaushik, D. Kossmann, R. Ramamurthy, P. Upadhyaya, R. Venkatesan, Secure database-as-a-service with cipherbase, in: Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data, ACM, 2013, pp. 1033–1036.
[9] M. A. Alzain, E. Pardede, Using multi shares for ensuring privacy in database-as-a-service, in: 2011 44th Hawaii International Conference on System Sciences, IEEE, 2011, pp. 1–9.
[10] C. Curino, E. P. C. Jones, R. A. Popa, N. Malviya, E. Wu, S. Madden, H. Balakrishnan, N. Zeldovich, Relational cloud: A database-as-a-service for the cloud (2011).
[11] B. Calder, J. Wang, A. Ogus, N. Nilakantan, A. Skjolsvold, S. McKelvie, Y. Xu, S. Srivastav, J. Wu, H. Simitci, Others, Windows Azure Storage: a highly available cloud storage service with strong consistency, in: Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles, ACM, 2011, pp. 143–157.
[12] M. Nabeel, E. Bertino, Privacy preserving delegated access control in the storage as a service model, in: 2012 IEEE 13th International Conference on Information Reuse & Integration (IRI), IEEE, 2012, pp. 645–652.
[13] A. Shraer, C. Cachin, A. Cidon, I. Keidar, Y. Michalevsky, D. Shaket, Venus: Verification for untrusted cloud storage, Ccsw'10 (2010) 19–29. doi:10.1145/1866835.1866841.
[14] B. H. Kim, D. Lie, Caelus: Verifying the consistency of cloud services with battery-powered devices, Proceedings - IEEE Symposium on Security and Privacy 2015-July (2015) 880–896. doi:10.1109/SP.2015.59.
[15] Consistency models in distributed systems: A survey on definitions, disciplines, challenges and applications, 1–52. arXiv:1902.03305v1.
[16] R. Guerraoui, Trade-offs in Replicated Systems (2016) 14–26.
[17] E. Brewer, CAP twelve years later: How the "rules" have changed, Computer 45 (2) (2012) 23–29. doi:10.1109/MC.2012.37. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=6133253
[18] S. Gilbert, N. Lynch, Perspectives on the CAP Theorem, Computer 45 (2) (2012) 30–36. doi:10.1109/MC.2011.389.
[19] P. Mahajan, L. Alvisi, M. Dahlin, Consistency, Availability, and Convergence (2011) 1–53.
[20] Q. Liu, G. Wang, J. Wu, Consistency as a service: Auditing cloud consistency, IEEE Transactions on Network and Service Management 11 (1) (2014) 25–35. doi:10.1109/TNSM.2013.122613.130411.
[21] R. Kotla, M. Balakrishnan, D. Terry, M. K. Aguilera, Transactions with consistency choices on geo-replicated cloud storage, Microsoft, MSR-TR-2013-82, Tech. Rep (2013).
[22] J. Brzezinski, C. Sobaniec, D. Wawrzyniak, From Session Causality to Causal Consistency, in: PDP, 2004, pp. 152–158.
[23] Y. Zhu, J. Wang, Client-centric consistency formalization and verification for system with large-scale distributed data storage, Future Generation Computer Systems 26 (8) (2010) 1180–1188. doi:10.1016/j.future.2010.06.006. URL http://dx.doi.org/10.1016/j.future.2010.06.006
[24] D. Terry, A. J. Demers, K. Petersen, M. Spreitzer, M. Theimer, B. Welch, Session guarantees for weakly consistent replicated data, Proceedings of 3rd International Conference on Parallel and Distributed Information Systems (1994) 140–149. doi:10.1109/PDIS.1994.331722. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=331722
[25] D. B. Terry, V. Prabhakaran, R. Kotla, M. Balakrishnan, M. K. Aguilera, H. Abu-Libdeh, Consistency-based service level agreements for cloud storage, Sosp '13 (2013) 309–324. doi:10.1145/2517349.2522731. URL http://dl.acm.org/citation.cfm?doid=2517349.2522731
[26] P. Bailis, A. Ghodsi, J. M. Hellerstein, I. Stoica, Bolt-on causal consistency, ACM SIGMOD (2013) 761. doi:10.1145/2463676.2465279.
[27] Y. Saito, M. Shapiro, Optimistic replication, ACM Computing Surveys 37 (1) (2005) 42–81. doi:10.1145/1057977.1057980. URL http://doi.acm.org/10.1145/1057977.1057980
[28] A. S. Tanenbaum, M. Van Steen, Distributed Systems: Principles and Paradigms, 2/E, 2007. doi:10.1002/1521-3773(20010316)40:6<9823::AID-ANIE9823>3.3.CO;2-C.
[29] L. Lamport, Time, Clocks, and the Ordering of Events in a Distributed System, Communications of the ACM 21 (7) (1978) 558–565. arXiv:10614036, doi:10.1145/359545.359563.
[30] S. Burckhardt, Principles of Eventual Consistency, Principles of Eventual Consistency 1 (1-2) (2014) 1–150. doi:10.1561/2500000011. URL http://research.microsoft.com/apps/pubs/default.aspx?id=230852
[31] R. Guerraoui, M. Pavlovic, D.-A. Seredinschi, Trade-offs in replicated systems, IEEE Data Engineering Bulletin 39 (ARTICLE) (2016).
[32] L. Lamport, How to make a multiprocessor computer that correctly executes multiprocess programs, IEEE Transactions on Computers (9) (1979) 690–691.
[33] F. J. Torres-Rojas, E. Meneses, Convergence Through a Weak Consistency Model: Timed Causal Consistency, CLEI Electronic Journal 8 (2) (2005).
[34] C. Li, D. Porto, A. Clement, J. Gehrke, N. Preguiça, R. Rodrigues, Making Geo-Replicated Systems Fast as Possible, Consistent when Necessary, OSDI'12 Proceedings of the 10th USENIX Conference on Operating Systems Design and Implementation (2012) 265–278.
[35] T.-Y. Hsu, A. D. Kshemkalyani, M. Shen, Causal consistency algorithms for partially replicated and fully replicated systems, Future Generation Computer Systems 86 (2018) 1118–1133.
[36] H. Yu, A. Vahdat, Design and evaluation of a conit-based continuous consistency model for replicated services, ACM Transactions on Computer Systems 20 (3) (2002) 239–282. doi:10.1145/566340.566342. URL http://portal.acm.org/citation.cfm?doid=566340.566342
[37] W. Golab, X. Li, M. Shah, Analyzing consistency properties for fun and profit, Proceedings of the 30th Annual ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing, ACM (2011) 197–206. doi:10.1145/1993806.1993834. URL http://dl.acm.org/citation.cfm?id=1993834
[38] A. Lakshman, P. Malik, Cassandra: a decentralized structured storage system, ACM SIGOPS Operating Systems Review 44 (2) (2010) 35–40.
[39] F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M. Burrows, T. Chandra, A. Fikes, R. E. Gruber, Bigtable: A distributed storage system for structured data, ACM Transactions on Computer Systems (TOCS) 26 (2) (2008) 4.
[40] G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman, A. Pilchin, S. Sivasubramanian, P. Vosshall, W. Vogels, Dynamo: Amazon's Highly Available Key-value Store, Proceedings of the Symposium on Operating Systems Principles (2007) 205–220. arXiv:z0024, doi:10.1145/1323293.1294281. URL http://dl.acm.org/citation.cfm?id=1323293.1294281
[41] W. Vogels, Eventually consistent, Communications of the ACM 52 (1) (2009) 40–44.
[42] L. Brutschy, D. Dimitrov, P. Müller, M. Vechev, Static serializability analysis for causal consistency, in: Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation, ACM, 2018, pp. 90–104.
[43] C. Bunch, N. Chohan, C. Krintz, Appscale: open-source platform-as-a-service, UCSB Technical Report 2011-01 (2011).
[44] M. N. Giannakos, K. Chorianopoulos, K. Giotopoulos, P. Vlamos, Using Facebook out of habit, Behaviour & Information Technology 32 (6) (2013) 594–602.
[45] F. J. Torres-Rojas, M. Ahamad, M. Raynal, Timed consistency for shared distributed objects, Proceedings of the Eighteenth Annual ACM Symposium on Principles of Distributed Computing - PODC '99 (1999) 163–172. doi:10.1145/301308.301350.
[46] W. Lloyd, M. J. Freedman, M. Kaminsky, D. G. Andersen, Don't Settle for Eventual: Scalable Causal Consistency for Wide-Area Storage with COPS, Proceedings of the Symposium on Operating Systems Principles (2011) 1–16. doi:10.1145/2043556.2043593. URL http://doi.acm.org/10.1145/2043556.2043593
[47] M. Shen, A. D. Kshemkalyani, T. Y. Hsu, Causal Consistency for Geo-Replicated Cloud Storage under Partial Replication, Proceedings - 2015 IEEE 29th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2015 (2015) 509–518. doi:10.1109/IPDPSW.2015.68.
[48] M. Perrin, A. Mostefaoui, C. Jard, Causal consistency: beyond memory, in: ACM SIGPLAN Notices, Vol. 51, ACM, 2016, p. 26.
[49] C. J. Fidge, Timestamps in Message-Passing Systems That Preserve the Partial Ordering, 11th Australian Computer Science Conference (ACSC'88) 10 (1) (1988) 56–66.
[50] H. Nejati Sharif Aldin, H. Deldari, M. H. Moattar, M. Razavi Ghods, Consistency models in distributed systems: A survey on definitions, disciplines, challenges and applications, arXiv e-prints (2019) arXiv:1902.03305.
[51] H. E. Chihoub, S. Ibrahim, G. Antoniu, M. S. Pérez, Harmony: Towards automated self-adaptive consistency in cloud storage, Proceedings - 2012 IEEE International Conference on Cluster Computing, CLUSTER 2012 (2012) 293–301. doi:10.1109/CLUSTER.2012.56.
[52] B. F. Cooper, A. Silberstein, E. Tam, R. Ramakrishnan, R. Sears, Benchmarking cloud serving systems with YCSB, in: Proceedings of the 1st ACM Symposium on Cloud Computing, ACM, 2010, pp. 143–154.
[53] L. Bougé, G. Antoniu, Managing Consistency for Big Data Applications on Clouds: Tradeoffs and Self-Adaptiveness, Ph.D. thesis, INRIA Rennes, France (2013).
Appendix A: Stale read calculations
If a read $X_r$ starts between the start of the last write $X_w$ and the time at which that write has finished propagating to the other copies, the value returned may be stale. The same condition applies to every other write operation happening in the system. $T_p$ is the time needed to replicate a write operation, that is, to update all replicas. Transaction arrivals are generally modeled as a Poisson process [53]. The arrivals of reads and writes are assumed to follow Poisson processes with parameters $\lambda_r$ and $\lambda_w$, respectively. These parameters change dynamically and are obtained by monitoring the storage system as read and write calls arrive. The waiting time between two arrivals of a Poisson process is exponentially distributed. The random variables $X_r$ and $X_w$ are the read and write times, exponentially distributed with parameters $\lambda_r$ and $\lambda_w$. The probability that the next read returns a stale value is given by equation .1, for a system with replication factor $N$ and $X_n$ replicas involved in the read operation [51].
\[
\Pr(\mathit{Stale}_{read}) = \sum_{i=0}^{\infty} \left( \frac{N - X_n}{N} \times \Pr\left(X_w^i < X_r < X_w^i + T + T_p\right) + \frac{X_n}{N} \times \Pr\left(X_w^i < X_r < X_w^i + T\right) \right) \tag{.1}
\]
All write times in the system are exponentially distributed. The sum $X_w^i$ of the write times therefore follows a gamma distribution with parameters $i$ and $\lambda_w$. Writing $f_w^i$ for the density of $X_w^i$ and $F_r$ for the cumulative distribution function of $X_r$, the probability in equation .1 can be written as follows:
\[
\Pr(\mathit{Stale}_{read}) = \sum_{i=0}^{\infty} \left( \frac{N - 1}{N} \times \int_0^{\infty} f_w^i(t) \left( F_r(t + T + T_p) - F_r(t) \right) dt + \frac{1}{N} \int_0^{\infty} f_w^i(t) \left( F_r(t + T) - F_r(t) \right) dt \right) \tag{.2}
\]
The time $T$ for a local write is negligible compared with $T_p$, so we set it to zero. Substituting the gamma density of the write times and the cumulative distribution function of the exponential distribution gives the following probability:
\[
\Pr(\mathit{Stale}_{read}) = \sum_{i=0}^{\infty} \frac{N - 1}{N} \times \int_0^{\infty} \frac{t^{i-1} e^{-t/\lambda_w}}{\Gamma(i) \lambda_w^i} \left( e^{-\lambda_r t} - e^{-\lambda_r (t + T_p)} \right) dt \tag{.3}
\]
Finally, the probability that the next read returns a stale value is given by the following simplified formula:
\[
\Pr(\mathit{Stale}_{read}) = \frac{(N - 1)\left(1 - e^{-\lambda_r T_p}\right)\left(1 + \lambda_r \lambda_w\right)}{N \lambda_r \lambda_w} \tag{.4}
\]
Appendix B: Consistency algorithms
The algorithms presented in this section are the functions applied in the proposed method. These algorithms relate to client-centric consistency. Algorithm 2 handles monotonic read consistency, Algorithm 3 monotonic write consistency, Algorithm 4 read your write consistency, and Algorithm 5 write follow read consistency. All of these algorithms are used in Algorithm 1, which is presented in Section 3.
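To make the read-side guarantees named above more concrete, the following minimal sketch checks monotonic read and read your write over a simplified operation log. It is not the authors' Algorithms 2 and 4: the `Op` record, its integer `version` field (standing in for a timestamp), and the assumption that the log is ordered by issue time per client are all hypothetical simplifications of the DUOT introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """One row of a simplified operations log (hypothetical layout, not the DUOT schema)."""
    client: str   # client that issued the operation
    kind: str     # "r" for read, "w" for write
    key: str      # data item accessed
    version: int  # assumed logical timestamp of the value read or written


def monotonic_read_ok(ops):
    """Each client's successive reads of a key never return an older version."""
    last_read = {}
    for op in ops:
        if op.kind != "r":
            continue
        slot = (op.client, op.key)
        if slot in last_read and op.version < last_read[slot]:
            return False
        last_read[slot] = op.version
    return True


def read_your_writes_ok(ops):
    """Each client's read of a key returns at least the version it last wrote."""
    last_written = {}
    for op in ops:
        slot = (op.client, op.key)
        if op.kind == "w":
            last_written[slot] = max(op.version, last_written.get(slot, op.version))
        elif slot in last_written and op.version < last_written[slot]:
            return False
    return True


# Illustrative logs (made-up data).
log_ok = [Op("c1", "w", "x", 1), Op("c1", "r", "x", 1), Op("c1", "r", "x", 2)]
log_bad_mr = [Op("c1", "r", "x", 2), Op("c1", "r", "x", 1)]
log_bad_ryw = [Op("c1", "w", "x", 3), Op("c1", "r", "x", 2)]
```

Both checks make a single pass over the log and keep one entry per (client, key) pair, so a violation is reported at the first offending read. Monotonic write and write follow read constrain the order in which writes are applied at replicas, so they cannot be checked from this per-client view alone.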
Si
10: 11: 12: 13: 14: 15:
if (O1
Si
O2 ) ⇒ O1 −→ O2 then Si
reSj
∃ ∀ (W(x, )α −→ r(x, γ)β ∧ (w(x, )α −−→ w(y, γ)θ) 7−→ r(y j , δ)θ) Ci S j i h Ci if ∃ ∃ (r(x, )α C+i r(y j , γ)α)) ∨ (w(x, )α −→ then w(y , γ)θ)) − 7 → r(y , δ)θ j j Ci S j Monotonic Read is correct Sj (r(x, )α Ci r(y , γ)φ) ∨ (w(x, )φ −− → w(y, γ)θ j then + else if ∃ ∃ Sj Ci S j 7−→ r(y j , δ)θ) ∨ (w(x, )α −−→ w(y, γ)θ 7−→ r(y j , δ)α)
9:
Si
∀(r(x, )α −→ r(x, δ)θ)
Si
Monotonic Read is violated
8:
Algorithm 2 Monotonic read consistency
Input: Distributed Users Operations Table (DUOT), O1 : old operation, O2 : new operation, α & β: same value on different servers, θ: new value on an available server, : timestamp, λ: next timestamp, δ: latest timestamp.
1: procedure Monotonic Read Consistency
2:   O1 ← read operation: r(x)α
3:   O2 ← read operation: r(y)θ
4:   while n < latest operation on DUOT do
5:     if x = y then
6:       if O1 C+i O2 then
7:         for ∀(IF O1 ∧ O2 ∈ OS i ) do
Si
10: 11: 12:
13:
14: 15:
Si
Si
re-
if (O1 O2 ) ⇒ O1 −→ O2 then Sj ∃ ∀ (w(x, )α C+i w(y j , γ)θ) ∧ (w(x, γ)β −−→ w(y, δ)θ) Ci S j Sj if ∃ ∃ (w(x, )α C+i w(y j , γ)θ) ∧ (w(x, γ)β −−→ w(y, δ)θ) then Ci S j Monotonic Write is correct Sj w(x, )φ Ci w(y , γ)θ) 7−→ (w(x, γ)φ −− → w(y, δ)θ j + then else if ∃ ∃ Ci S j S j ∨ w(x, )α C+i w(y j , γ)θ 7−→ w(x, γ)θ −−→ w(y, δ)β
urn a
9:
Si
∀(w(x, )α −→ w(x, δ)θ)
Monotonic Write is violated
8:
Algorithm 3 Monotonic write consistency
Input: Distributed Users Operations Table (DUOT), O1 : old operation, O2 : new operation, α & β: same value on different servers, θ: new value on an available server, : timestamp, λ: next timestamp, δ: latest timestamp.
1: procedure Monotonic Write Consistency
2:   O1 ← write operation: w(x)α
3:   O2 ← write operation: w(y)θ
4:   while n < latest operation on DUOT do
5:     if x = y then
6:       if O1 C+i O2 then
7:         for ∀(IF O1 ∧ O2 ∈ OS i ) do
Si
9: 10: 11: 12: 13:
if (O1
Si
O2 ) ⇒ O1 −→ O2 then Si
∃ ∀ (w(x, )α −→ w(x, γ)θ 7−→ r(x j )θ) Sj if ∃ ∃ (w(x, )α C+i r(x j , γ)θ ∨ w(x, )β −−→ w(x, γ)θ) 7−→ r(x j , δ)θ then Ci S j Read Your Write is correct (w(x, )φ C+i r(x j , γ)β) ∨ (w(x, )α C+i r(x j , γ)β then else if ∃ ∃ Sj Ci S j ∧(w(x, )α −−→ w(x, γ)θ) 7−→ r(x j , δ)α)
Ci S j
Read Your Write is violated
14: 15:
Si
∀(w(x, )α −→ r(xi , δ)θ)
Si
8:
Algorithm 4 Read your write consistency
Input: Distributed Users Operations Table (DUOT), O1 : old operation, O2 : new operation, α & β: same value on different servers, θ: new value on an available server, : timestamp, λ: next timestamp, δ: latest timestamp.
1: procedure Read Your Write Consistency
2:   O1 ← write operation: w(x)α
3:   O2 ← read operation: r(xi )θ
4:   while n < latest operation on DUOT do
5:     if x = xi then
6:       if O1 C+i O2 then
7:         for ∀(IF O1 ∧ O2 ∈ OS i ) do
8: 9: 10: 11: 12: 13: 14:
Si
urn a
Algorithm 5 write follow read consistency Input: Distributed Users Operations Table (DUOT), O1 : old operation, O2 : new operation, α & β: same value on diff servers, θ: new value on an available server, : timestamp, λ: next timestamp, δ: latest timestamp. 1: procedure Write Follow Read Consistency 2: O1 ← read operation: r(x)α 3: O2 ← write operation: w(y)θ 4: while n < latest operation on DUOT do 5: if x = y then 6: if O1 C+i O2 then 7: for ∀(IF O1 ∧ O2 ∈ OS i ) do Si
∀(w(x, )α −→ r(x, γ)β)
Si
Si
if (O1 O2 ) ⇒ O1 −→ O2 then Sj ∃ ∀ (w(x, )α C+i w(y j , γ)θ) ∧ (w(x, γ)β −−→ w(y, δ)θ) Ci S j Sj if ∃ ∃ (r(x, )α C+i w(y, δ)θ ∧ (w(x, )α −−→ then r(y, γ)β) − 7 → w(y , δ)θ j Ci S j Write Follow Read is correct Sj else if ∃ ∃ r(x, )φ C+i w(y, δ)θ) ∨ (r(x, )φ C+i w(y, δ)θ ∧ w(x, γ)φ −−→ w(y, δ)θ then Ci S j
Write Follow Read is violated
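The write-side guarantees of Algorithms 3 and 5 depend on the order in which writes are applied at each replica. The sketch below illustrates that idea under stated assumptions and is not a transcription of the algorithms above: a session is a hypothetical list of `(kind, write_id)` pairs in issue order, where a read records the identifier of the write whose value it returned, and each replica log is a hypothetical list of write identifiers in the order that replica applied them.

```python
def applied_before(log, first, second):
    """True if `second` is absent from the replica log, or `first` precedes it there."""
    if second not in log:
        return True
    return first in log and log.index(first) < log.index(second)


def monotonic_writes_ok(session, replica_logs):
    """Every replica applies a session's writes in the order the session issued them."""
    writes = [wid for kind, wid in session if kind == "w"]
    return all(
        applied_before(log, earlier, later)
        for earlier, later in zip(writes, writes[1:])
        for log in replica_logs
    )


def writes_follow_reads_ok(session, replica_logs):
    """A write issued after a read is applied, at every replica, after the write that was read."""
    seen_reads = []
    for kind, wid in session:
        if kind == "r":
            seen_reads.append(wid)
            continue
        for read_wid in seen_reads:
            if not all(applied_before(log, read_wid, wid) for log in replica_logs):
                return False
    return True


# Illustrative session: write w1, read the value written by w0, then write w2.
session = [("w", "w1"), ("r", "w0"), ("w", "w2")]
logs_ok = [["w0", "w1", "w2"], ["w0", "w1"]]
logs_bad_mw = [["w0", "w2", "w1"]]    # w2 applied before w1
logs_bad_wfr = [["w1", "w2", "w0"]]   # w2 applied before the write that was read
```

A replica that has not yet applied the later write is treated as consistent so far, which is why `applied_before` returns `True` when `second` is missing from the log.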
Hesam Nejati is currently a member of the Department of Computer Engineering, Mashhad Branch, Islamic Azad University, Mashhad, Iran. He received his Master's degree in software engineering from the Islamic Azad University of Mashhad, Iran, in 2017. His work during the master's period involved workflow scheduling in distributed systems, multi-join queries in distributed databases, parallel algorithms, and consistency in cloud computing. He received his Bachelor's degree in Computer Engineering from the Islamic Azad University of Mashhad, Iran, in 2014. Hesam's research interests fall broadly into IoT and cloud computing.
Dr. Hossein Deldari is currently a member of the Department of Computer Engineering, Salman Institute of Higher Education, Mashhad, Iran. He received his B.Sc. degree in Physics from the University of Mashhad, Iran, in 1970. Afterwards, he received his Master's degree in Computer Science from the University of Oregon, Eugene, Oregon, USA, in 1979. He received his Ph.D. in parallel and distributed systems from the University of Leeds, Leeds, England, in 1995. Dr. Deldari has been involved in the research and development of parallel and distributed systems for almost a decade. His research interests include parallel algorithmic skeletons, parallel fuzzy genetic algorithms, and grid/cluster computing.
Dr. Mohammad Hossein Moattar is currently a member of the Department of Computer Engineering, Mashhad Branch, Islamic Azad University, Mashhad, Iran. He received his B.Sc. degree in software engineering from the Islamic Azad University of Mashhad, Iran, in 2002. Afterwards, he received his Master's degree in artificial intelligence from Amirkabir University of Technology, Tehran, Iran, in 2004. He also received his Ph.D. in artificial intelligence systems from Amirkabir University of Technology, Tehran, Iran, in 2009. Dr. Moattar has been involved in the research and development of pattern recognition, machine learning, and cloud computing related applications for almost a decade. His research interests include deep learning, natural language processing, pattern recognition, and cloud computing.
Mostafa Razavi Ghods is currently a member of the Department of Computer Engineering, Mashhad Branch, Islamic Azad University, Mashhad, Iran. He received his Master's degree in artificial intelligence from the Islamic Azad University of Mashhad, Iran, in 2017. His work during the master's period involved consistency in distributed systems, cloud computing, machine learning, and deep learning. He received his Bachelor's degree in Computer Engineering from the Islamic Azad University of Mashhad, Iran, in 2011. Mostafa's research interests fall broadly into deep learning, IoT, and cloud computing.
Conflict of Interest and Authorship Conformation Form
Please check the following as appropriate:
All authors have participated in (a) conception and design, or analysis and interpretation of the data; (b) drafting the article or revising it critically for important intellectual content; and (c) approval of the final version.
This manuscript has not been submitted to, nor is under review at, another journal or other publishing venue.
The authors have no affiliation with any organization with a direct or indirect financial interest in the subject matter discussed in the manuscript.
The following authors have affiliations with organizations with direct or indirect financial interest in the subject matter discussed in the manuscript:
Author's name
Affiliation
None of the authors have any affiliations with organizations with direct or indirect financial interest in the subject matter discussed in the manuscript.
1. Cloud storage systems provide Storage as a Service for end-users, and deliver data availability and durability as well as global accessibility throughout the Internet.
2. Replication brings about the asynchronization of data among replicas in different cloud data-centers.
3. These systems need to ensure that data is synchronized among different replicas by implementing consistency policies.
4. Strict timed causal consistency is proposed as a hybrid consistency model that can be considered an extension to cloud computing.
5. This consistency model has two components: client-side and server-side.
6. At the client-side, this model supports client-centric consistency. At the server-side, it supports timed causal consistency as well.
7. Strict timed causal consistency is stronger than the client-centric approaches and more flexible than the data-centric approaches.
8. Our proposed method guarantees consistency and satisfies data availability.
9. Cassandra is a NoSQL database with high scalability and availability. Cassandra comes with multiple consistency levels as a service, such as ONE, ALL, and QUORUM.