According to Brewer’s CAP theorem, it is impossible for any distributed computer system to simultaneously provide all three of Consistency, Availability and Partition Tolerance. You can’t have the three at the same time and get an acceptable latency.
Cassandra values Availability and Partitioning tolerance (AP). Trade-offs between consistency and latency are tunable in Cassandra. You can get strong consistency with Cassandra (with an increased latency).
Cassandra extends the concept of eventual consistency. Eventual consistency implies the storage system guarantees that if no new updates are made to the object, eventually all accesses will return the last updated value.
How does C* know which record in the cluster is the latest one?
Each Column Family is composed of rows of row metadata and columns. A Column is the smallest increment of data in Cassandra. It is a tuple containing a name, a value and a timestamp. Cassandra uses the column timestamp to determine the most recent update to a column. The timestamp is provided by the client application. The latest timestamp always wins when requesting data, so if multiple client sessions update the same columns in a row concurrently, the most recent update is the one that will eventually persist.
C* provides tunable consistency by allowing to set the replication factor, replica placement strategy and consistency level.
Replication factor is total number of replicas across the cluster. So, a replication factor of 2 implies that there are two copies of each row and each copy is on a different node. All replicas are equally important; there is no primary or master replica.
As a general rule, the replication factor should not exceed the number of nodes in the cluster. However, you can increase the replication factor and then add the desired number of nodes afterwards.
When replication factor exceeds the number of nodes, writes are rejected, but reads are served as long as the desired consistency level can be met.
Consistency level is used to decide the number and location of nodes to which a write must be written to the commit log and memory table of. In case of read, Consistency level is used to decide the number and location of nodes whose responses must be compared for fetching the most recent record.
The Consistency Levels in C* are ONE, TWO, THREE, QUORUM, LOCAL_QUORUM, EACH_QUORUM, ALL, ANY.
ONE, TWO and THREE imply the number each represents.
QUORUM => a quorum of replica nodes
LOCAL_QUORUM => a quorum of replica nodes in the same data center as the coordinator node.
EACH_QUORUM => a quorum of replica nodes in all data centers.
ALL => all replica nodes in the cluster
ANY => at least one nodes in the cluster
The size of the quorum is calculated as (replication_factor / 2) + 1
Selecting a Consistency Level
Consistency level “ANY” should only be used where the absolute requirement is that a write never fails. This consistency level has the highest probability of a read not returning the latest written values.
For write operations, ANY is the lowest consistency (but highest availability), and ALL is the highest consistency (but lowest availability).
For read operations, ONE is the lowest consistency (but highest availability), and ALL is the highest consistency (but lowest availability).
QUORUM is a good middle-ground ensuring strong consistency, yet still tolerating some level of failure.
In multi data center clusters, contacting multiple data centers for a read or write request can slow down the response. The consistency level LOCAL_QUORUM is specifically designed for doing quorum reads and writes in multi data center clusters.