The goal of this article is to give more clarity to the theorem and show pros and cons of ACID and BASE models that might stand in the way of implementing distributed systems.
What is CAP about?
The (CAP) theorem (Consistency, Availability and Partitioning tolerance) was given by Eric Brewer, a professor at the University of California, Berkeley and one of the founders of Google, in 2001 in the keynote of Principles of Distributed Computing.
The theorem states:
Though its desirable to have Consistency, High-Availability and Partition-tolerance in every system, unfortunately no system can achieve all three at the same time.
In other words a system can have at most two of three desirable properties at the same time in presence of errors.
Let’s first give definitions to these 3 terms:
Consistency: A service that is consistent should follow the rule of ordering for updates that spread across all replicas in a cluster – “what you write is what you read”, regardless of location. For example, Client A writes 1 then 2 to location X, Client B cannot read 2 followed by 1. This rule has another name “Strong consistency”.
Availability: A service should be available. There should be a guarantee that every request receives a response about whether it was successful or failed. If the system is not available it can be still consistent. However, consistency and availability cannot be achieved at the same time. This means that one has two choices on what to leave. Relaxing consistency will allow the system to remain highly available under the partitioning conditions (see next definition) and strong consistency means that under certain conditions the system will not be available.
Partition tolerance: The system continues to operate despite arbitrary message loss or failure of part of the system. A simple example, when we have a cluster of N replicated nodes and for some reason a network is unavailable among some number of nodes (e.g. a network cable got chopped). This leads to inability to synchronize data. Thus, only some part of the system doesn’t work, the other one does. If you have a partition in your network, you lose either consistency (because you allow updates to both sides of the partition) or you lose availability (because you detect the error and shut down the system until the error condition is resolved).
There are lots of articles about this theorem these days around but not many of them reveal real meaning behind this, neither CAP theorem talks about the normal operation of a distributed system when there are no errors. A simple meaning of this theorem is “It is impossible for a protocol to guarantee both consistency and availability in a partition prone distributed system”. This was mentioned above in examples.
Most of the NoSQL database system architectures favour one factor over the other:
- BigTable, used by Google App engine, and HBase, which runs over Hadoop, claim to be strongly consistent within a data-center and highly available meaning there’s an eventual consistency between data-centers. Updates are propagated to all replicas asynchronously.
- Google Spanner, a new globally-distributed, and synchronously-replicated database – the successor to BigTable. Updates are propagated to all replicas synchronously. Google Spanner supports strong consistency even in the the presence of wide-area replication unlike BigTable which can only support eventual-consistent (see below for definitions) replication across data-centers.
- Amazon’s Dynamo, Cassandra and Riak instead sacrifice consistency in favor of availability and partition tolerance. They achieve a weaker form of consistency known as eventual consistency – updates are propagated to all replicas asynchronously, without guarantees on the order of updates across replicas and when they will be applied.
- Oracle NoSQL allows to choose a consistency policy which might affect performance depending on a level selected.
- Apache Cassandra is similar to BigTable, but it has a tunable consistency model.
Let’s recall in brief what ACID (Atomicity, Consistency, Isolation and Durability) means in traditional RDBMS community before moving to the next topic.
ACID transactions provide 4 properties which must be guaranteed:
Atomicity: All of the operations in the transaction will complete, or none will. If one part of the transaction fails, the entire transaction fails.
Consistency: The database will be in a consistent state when the transaction begins and ends. This property ensures that any transaction will bring the database from one valid state to another. In high availability environment this rule must be satisfied for all nodes in a cluster.
Isolation: The transaction will behave as if it is the only operation being performed upon the database. Each transaction has to execute in total isolation from the rest.
Durability: Upon completion of the transaction, the operation will not be reversed.
ACID is guaranteed by A Two-phase commit – a distributed algorithm that ensures this across multiple database instances when performing transaction.
Eventual consistency (BASE) != Strong consistency
Eventual consistency (normally asynchronous transactions) is a form of a weaker consistency which allows to improve speed and availability, because ACID provides strong consistency (synchronous transactions) for partitioned databases and thus gets in the way of availability. A transaction that involves N nodes in a cluster that uses 2-phase commit also reduces the availability. The term eventual consistency or as it is called BASE (Basically Available, Soft state, Eventual consistency) is the opposite of ACID (Atomicity, Consistency, Isolation and Durability). Where ACID is pessimistic and requires consistency at the end of every operation, BASE is optimistic and accepts that the database consistency will be in a state of flux. The eventual consistency is simply an acknowledgement that there is an unbounded delay in propagating a change made on one machine to all the other copies which might lead to stale data. For instance, a distributed system maintains copies of shared data on multiple machines in a cluster to ensure high availability. When data gets updated in a cluster there might be some interval of time during which some of the copies will be updated, but others won’t. Eventually the changes will be propagated to all remaining machines. That’s why it is named eventual consistency. BASE trades consistency for availability and doesn’t give any ordering guarantees at all. Eventual consistency has nothing to do with a single node systems since there’s no need for propagation. If the database system only supports eventual consistency, then the application will need to handle the possibility of reading stale (inconsistent) data. There are different techniques how it can be achieved as well as other forms of weak consistency and out of scope of this article. Eventual consistency is only one form from the list of consistency-models which are out of scope of this article.
NRW notation (Read-Your-Writes)
NRW (Node, Read, Write) allows to analyse and tune how a distributed database will trade off consistency, read / write performance.
- N = the number of nodes that keep copies of a record distributed to.
- W = the number of nodes that must successfully acknowledge a write to be successfully committed.
- R = the number of nodes that must send back the same value of a unit of data for it to be accepted as read by the system.
The majority of NoSQL databases use N>W>1 – more than one write must complete, but not all nodes need to be updated immediately.
- W < N – high write availability
- R < N – high read availability
- W+R > N – is a strong consistency, read/write are fully overlapped
- W+R <= N – is an eventual consistency, meaning that there is no overlap in the read and write set;
You can set these parameters and see what you will get online.Thus, varying the parameters we can tune a wide variety of scenarios with different properties of availability, consistency, reliability, and speed.
A strongly consistent system gives up availability upon a certain kind of failure, and eventually-consistent system gives up consistency upon a certain kind of failure which improves availability. The bottom line: It is impossible to guarantee consistency while providing high availability and network partition tolerance. This makes ACID databases less powerful for highly distributed environments and led to the emergence of alternate data stores that are target to high availability and high performance. The eventual consistency is one of approaches to achieve this.