
Consistency Models and Trade-offs

Choose the right consistency guarantee for your data, balancing correctness, latency, and availability

TL;DR

Consistency models form a spectrum. Strong consistency (linearizability) guarantees all reads see all completed writes, but sacrifices latency and availability. Eventual consistency guarantees all writes eventually propagate, accepting temporary inconsistency. Between them lie causal consistency and other models. Choose based on your data's importance and your tolerance for inconsistency.

Learning Objectives

  • Understand the spectrum of consistency models
  • Distinguish strong, causal, and eventual consistency
  • Recognize the latency and availability cost of stronger consistency
  • Apply per-operation and per-data consistency strategies

Motivating Scenario

Your e-commerce system needs inventory to be accurate. A customer buys the last item; you must not sell it again. But performance testing shows that strong consistency drops throughput by 60%. What if you used eventual consistency for inventory as well? Risk: the same item sells twice. What if you used strong consistency only for inventory changes? Complexity: application logic must vary by operation. This is the consistency trade-off in practice.

The Consistency Spectrum

[Figure: Consistency Models Spectrum]

Strong Consistency (Linearizability)

Definition: Every read returns the result of the most recent write. All operations appear to execute atomically in a single total order.

Characteristics:

  • All nodes agree on data values
  • Reads never return stale data
  • Highest latency (waiting for confirmation from replicas)
  • Lowest throughput (mutations must be coordinated)
  • Easiest to reason about for application developers

When to Use:

  • Financial transactions (must be accurate)
  • Inventory management (must prevent overselling)
  • Atomic counters or versioning
  • Any scenario where consistency errors cascade

Trade-off: You pay in latency and availability. During a network partition, you must choose between serving possibly stale data (losing consistency) and refusing requests (losing availability).

Time 0: Alice writes balance = 100 to Node A
Time 1: Bob reads from Node B
Result: Bob sees 100 (not the old value)

Guarantee: Bob MUST see Alice's write
Cost: Node B must sync with Node A first
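The timeline above can be sketched with a toy in-memory store that acknowledges a write only after every replica has applied it. This is a minimal illustration under stated assumptions, not a linearizable implementation: the class names (`Replica`, `SyncReplicatedStore`, `ReplicaDown`) and the `reachable` flag are hypothetical, and concurrency between writers is ignored.

```python
class ReplicaDown(Exception):
    """Raised when a replica cannot acknowledge a write."""


class Replica:
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.reachable = True  # flip to False to simulate a partition

    def apply(self, key, value):
        if not self.reachable:
            raise ReplicaDown(self.name)
        self.data[key] = value


class SyncReplicatedStore:
    """Acknowledges a write only after every replica has applied it."""

    def __init__(self, replicas):
        self.replicas = replicas

    def write(self, key, value):
        # Check reachability first so a refused write changes no replica.
        for replica in self.replicas:
            if not replica.reachable:
                raise ReplicaDown(replica.name)
        for replica in self.replicas:
            replica.apply(key, value)

    def read(self, key, replica_index):
        return self.replicas[replica_index].data.get(key)


node_a, node_b = Replica("A"), Replica("B")
store = SyncReplicatedStore([node_a, node_b])

store.write("balance", 100)           # Time 0: Alice writes
bob_sees = store.read("balance", 1)   # Time 1: Bob reads from Node B

node_b.reachable = False              # simulate a partition
try:
    store.write("balance", 50)
    write_accepted = True
except ReplicaDown:
    write_accepted = False            # the write is refused: availability is lost
```

Bob's read returns 100 because the write did not complete until Node B had it. Once Node B is unreachable, the store refuses the write rather than let the replicas disagree, which is the availability cost described in the trade-off.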

Causal Consistency

Definition: Operations that are causally related are seen by all processes in the same order. Causally unrelated (concurrent) operations may be seen in different orders by different processes.

Characteristics:

  • Weaker than strong consistency (allows some ordering divergences)
  • Stronger than eventual consistency (respects causal relationships)
  • Better latency than strong consistency
  • Useful for operations with dependencies

When to Use:

  • Message threads (replies must follow messages)
  • Collaborative editing (edits depend on previous state)
  • Comment threads (replies must appear after comments)
  • Any scenario with ordered dependencies

Trade-off: You avoid some latency costs of strong consistency while maintaining logical ordering.

Time 0: Alice posts message "Hello"
Time 1: Bob reads message "Hello"
Time 2: Bob replies "Hi there"
Time 3: Charlie reads both message and reply

Guarantee: Charlie sees the message before the reply
(causal relationship preserved)

No guarantee: If Dave posts an unrelated message,
its order relative to Alice's message is undefined
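One common way to track causal relationships is a vector clock: each message carries a per-node counter map, and one message causally precedes another only if its counters are all less than or equal. The sketch below is illustrative; the `compare` function and the clock values are assumptions chosen to mirror the timeline above.

```python
def compare(a, b):
    """Compare two vector clocks given as dicts of node -> counter."""
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"      # a causally precedes b
    if b_le_a:
        return "after"       # b causally precedes a
    return "concurrent"      # no causal relationship


# Alice posts "Hello": her counter advances.
hello = {"alice": 1}
# Bob has read "Hello", so his reply carries Alice's entry plus his own.
reply = {"alice": 1, "bob": 1}
# Dave posts without having seen either message.
unrelated = {"dave": 1}

hello_vs_reply = compare(hello, reply)        # "before"
hello_vs_dave = compare(hello, unrelated)     # "concurrent"
```

A replica that respects causal consistency would delay showing the reply until it has the message it depends on, but it is free to show Dave's post in any position.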

Eventual Consistency

Definition: All updates eventually propagate to all replicas. No guarantees about timing, but all writes eventually appear everywhere.

Characteristics:

  • Highest throughput (no coordination overhead)
  • Lowest latency (writes don't wait for confirmation)
  • Temporary inconsistency (reads may be stale)
  • Requires application logic to handle conflicts
  • Simple to scale horizontally

When to Use:

  • Social media feeds (eventual correctness is fine)
  • Product recommendations (stale data acceptable)
  • Caching layers (temporary inconsistency expected)
  • Systems with high read traffic (reads scale out by adding replicas)
  • Analytics and logging (eventual consistency natural)

Trade-off: Simplest to implement and scale, but application must handle temporary inconsistency.

Time 0: Alice writes counter = 100 to Node A
(doesn't wait for Node B to acknowledge)
Time 1: Bob reads counter from Node B
Result: Bob might see 99 (old value)

Time 2: Update propagates to Node B
(Bob's next read sees 100)

No guarantee: When exactly Bob sees the update
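The same timeline can be simulated with a primary that acknowledges writes immediately and a replica that catches up later. This is a minimal sketch; the class `AsyncReplicatedCounter` and its `pending` queue are hypothetical stand-ins for asynchronous replication.

```python
from collections import deque


class AsyncReplicatedCounter:
    """Primary acknowledges writes at once; the replica catches up later."""

    def __init__(self, initial):
        self.node_a = initial          # primary copy
        self.node_b = initial          # replica copy
        self.pending = deque()         # updates not yet applied on Node B

    def write(self, value):
        self.node_a = value
        self.pending.append(value)     # queued; no wait for Node B

    def read_from_b(self):
        return self.node_b

    def propagate(self):
        """Apply queued updates on Node B in order."""
        while self.pending:
            self.node_b = self.pending.popleft()


counter = AsyncReplicatedCounter(initial=99)
counter.write(100)                     # Time 0: Alice writes to Node A
stale_read = counter.read_from_b()     # Time 1: Bob reads from Node B
counter.propagate()                    # Time 2: update reaches Node B
fresh_read = counter.read_from_b()     # Bob's next read
```

Here `stale_read` is the old value and `fresh_read` is the new one. In a real system the call to `propagate` happens on the replication layer's schedule, which is why the application cannot say when the update becomes visible.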

Real-World Consistency Failures and How to Handle Them

Case Study 1: Overselling in E-Commerce

Problem: Two orders claim the last item in inventory at the same time

Time 0: Item inventory = 1
Time 1: Order A checks inventory (sees 1, places order)
Time 2: Order B checks inventory (sees 1, places order - race condition!)
Time 3: System processes Order A (decrements to 0)
Time 4: System processes Order B (decrements to -1, error!)

Solutions by consistency model:

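# Strong consistency: serialize with a lock
# (lock and database are placeholders for a distributed mutex and a data store)
def purchase_item_strong(item_id):
    with lock(item_id):  # Mutex: one purchase per item at a time
        inventory = database.get(item_id)
        if inventory > 0:
            database.update(item_id, inventory - 1)
            return "success"
        return "out_of_stock"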

# Eventual consistency: accept, then reconcile
def purchase_item_eventual(item_id):
    # Optimistic: assume success and decrement without coordination
    database.decrement(item_id)

# Later, in a background job: reconcile if inventory went negative
def backfill_inventory(item_id):
    inventory = database.get(item_id)
    if inventory < 0:
        notify_customer_cancellation()
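To see the lock-based approach prevent the race, the following self-contained sketch runs two concurrent orders against a stock of one. It uses a process-local `threading.Lock`, which is an assumption made for illustration: across multiple machines the same effect needs a distributed lock or a conditional update in the data store. The `Inventory` class and `place_order` helper are hypothetical.

```python
import threading


class Inventory:
    def __init__(self, stock):
        self.stock = stock
        self._lock = threading.Lock()

    def purchase(self):
        # The check and the decrement happen atomically with respect
        # to other threads, so two orders cannot both see stock == 1.
        with self._lock:
            if self.stock > 0:
                self.stock -= 1
                return True
            return False


inventory = Inventory(stock=1)
results = []
results_lock = threading.Lock()


def place_order():
    ok = inventory.purchase()
    with results_lock:
        results.append(ok)


threads = [threading.Thread(target=place_order) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Exactly one order succeeds and the stock ends at zero, whichever thread runs first.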

Case Study 2: Payment Processing

Problem: Customer is charged but the order is never created (and a naive retry can charge the customer twice)

Strong Consistency:
- Atomic transaction: both succeed or both fail
- Guaranteed: if charged, order created

Eventual Consistency:
- Charge succeeds, order creation fails
- Must detect and handle: refund customer, retry order creation
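The detect-and-handle path can be sketched as a compensating action: if order creation fails after the charge succeeded, refund the charge. All names here (`checkout`, `OrderCreationError`, the `fake_*` helpers, and the `ledger` dict) are hypothetical in-memory stand-ins, not a real payment API.

```python
class OrderCreationError(Exception):
    pass


def checkout(charge, create_order, refund, amount):
    """Charge first; if the order cannot be created, refund the charge."""
    charge_id = charge(amount)
    try:
        return create_order(charge_id)
    except OrderCreationError:
        refund(charge_id)          # compensating action undoes the charge
        return None


# Hypothetical stand-ins for a payment provider and an order store.
ledger = {}


def fake_charge(amount):
    charge_id = f"ch_{len(ledger) + 1}"
    ledger[charge_id] = amount
    return charge_id


def fake_refund(charge_id):
    ledger[charge_id] = 0


def failing_create_order(charge_id):
    raise OrderCreationError("order store unavailable")


def working_create_order(charge_id):
    return {"order_for": charge_id}


failed = checkout(fake_charge, failing_create_order, fake_refund, amount=25)
succeeded = checkout(fake_charge, working_create_order, fake_refund, amount=25)
```

The sketch refunds immediately; a production flow might instead retry order creation a few times before refunding, and would also need to survive a crash between the charge and the refund.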

ACID vs BASE

These frameworks describe consistency philosophies:

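ACID (Atomicity, Consistency, Isolation, Durability): traditional databases. Typical of:

  • SQL databases
  • Transactional systems
  • Financial ledgers

BASE (Basically Available, Soft state, Eventual consistency): distributed systems. Typical of:

  • NoSQL databases
  • Distributed caches
  • Event-driven systems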
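As a small illustration of the ACID side, the sketch below uses Python's standard `sqlite3` module to show atomicity: a transfer either applies both updates or neither. The `accounts` table, its `CHECK` constraint, and the `transfer` helper are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money atomically: both updates commit, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the CHECK constraint rejected an overdraft


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


ok = transfer(conn, "alice", "bob", 30)
rejected = transfer(conn, "alice", "bob", 500)   # would overdraw alice
```

In the rejected transfer, the credit to bob runs first and is then rolled back when the debit violates the constraint, so no partial state is visible afterwards. A BASE-style system would typically accept both steps independently and repair the inconsistency later.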

Practical Strategies

1. Hybrid Consistency

Use different consistency models for different operations:

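class InventoryService:
    # Assumes inventory_store and recommendation_cache are clients that
    # accept a per-call consistency level (hypothetical API).
    def purchase_item(self, user_id, item_id):
        """Inventory operations need strong consistency."""
        key = f"item:{item_id}"
        # Use quorum read/write (strong consistency)
        quantity = self.inventory_store.get(key=key, consistency='strong')
        if quantity <= 0:
            return False
        # A real store also needs a conditional write here to avoid
        # the check-then-act race from Case Study 1.
        self.inventory_store.put(
            key=key,
            value=quantity - 1,
            consistency='strong'
        )
        return True

    def view_recommendations(self, user_id):
        """Recommendations can use eventual consistency."""
        # Use fast local replica (eventual consistency)
        return self.recommendation_cache.get(
            key=f"recommendations:{user_id}",
            consistency='eventual'
        )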
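One way to keep per-operation choices manageable is to centralize them in a policy table instead of scattering string literals through the code. The sketch below is an assumption about how such a table might look; the `Consistency` enum, the `POLICY` entries, and `consistency_for` are hypothetical names.

```python
from enum import Enum


class Consistency(Enum):
    STRONG = "strong"
    CAUSAL = "causal"
    EVENTUAL = "eventual"


# Hypothetical policy: data category -> required consistency level.
POLICY = {
    "inventory": Consistency.STRONG,
    "payment": Consistency.STRONG,
    "message_thread": Consistency.CAUSAL,
    "recommendations": Consistency.EVENTUAL,
}


def consistency_for(category):
    """Look up the level for a category, defaulting to the safest option."""
    return POLICY.get(category, Consistency.STRONG)
```

Defaulting unknown categories to strong consistency is a design choice: a new data type is slow until someone decides it can tolerate staleness, which is safer than the reverse.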

2. Conflict Resolution

With eventual consistency, you need conflict resolution strategies:

  • Last-Write-Wins: The write with the latest timestamp wins. Simple, but concurrent writes are silently lost.
  • Application Logic: Custom merge logic. Preserves data, but is complex to write and test.
  • Quorum/Voting: Replicas vote, and the value held by a majority wins.
  • Operational Transformation: Track and merge concurrent edits (Google Docs).
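The difference between the first two strategies shows up when two replicas accept concurrent additions to the same shopping cart. This is a minimal sketch with hypothetical function names; the union merge assumes an add-only cart and cannot express removals.

```python
def merge_last_write_wins(a, b):
    """Each argument is (value, timestamp); the newer timestamp wins."""
    return a if a[1] >= b[1] else b


def merge_cart_union(a, b):
    """Application-level merge for add-only carts: keep items from both sides."""
    return (a[0] | b[0], max(a[1], b[1]))


# Two replicas accepted concurrent additions to the same cart.
replica_1 = (frozenset({"book"}), 10.0)
replica_2 = (frozenset({"pen"}), 10.5)

lww = merge_last_write_wins(replica_1, replica_2)   # keeps only the pen
union = merge_cart_union(replica_1, replica_2)      # keeps book and pen
```

Last-write-wins drops the book because replica 2 wrote slightly later. The union merge keeps both items, at the cost of logic that is specific to this data type.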

3. Monotonic Reads

Prevent a user from seeing data move backward in time. Once a user has read a value, later reads must not return an older one:

Time 0: Alice writes value = 100
Time 1: Alice reads value (sees 100)
Time 2: Alice's next request is routed to a different replica that has not received the write yet
Without monotonic reads: Alice sees 99 (inconsistent!)
With monotonic reads: Alice still sees 100
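One way to provide this guarantee is for each client session to remember the highest version it has observed and to skip replicas that are behind it. The sketch below assumes each replica exposes a monotonically increasing version number; `VersionedReplica` and `Session` are hypothetical names.

```python
class VersionedReplica:
    def __init__(self, value, version):
        self.value = value
        self.version = version


class Session:
    """Remembers the highest version this client has already observed."""

    def __init__(self):
        self.last_seen_version = 0

    def read(self, replicas):
        # Use the first replica that is at least as fresh as our last read.
        for replica in replicas:
            if replica.version >= self.last_seen_version:
                self.last_seen_version = replica.version
                return replica.value
        raise RuntimeError("no replica is fresh enough; retry or use the primary")


fresh = VersionedReplica(value=100, version=2)
lagging = VersionedReplica(value=99, version=1)

alice = Session()
first = alice.read([fresh])             # Alice sees 100 at version 2
second = alice.read([lagging, fresh])   # the lagging replica is skipped
```

A simpler alternative with the same effect is sticky routing, where a given user's reads always go to the same replica.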

Trade-off Matrix

| Property | Strong Consistency | Causal | Eventual |
|---|---|---|---|
| Latency | High | Medium | Low |
| Throughput | Low | Medium | High |
| Availability | Lower | Medium | Higher |
| Reasoning Difficulty | Easy | Medium | Hard |
| Scalability | Limited | Better | Excellent |

Implementation Patterns

Read-After-Write Consistency

Guarantee user sees their own writes immediately:

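import time

class ConsistencyManager:
    def __init__(self, primary_db, replica_db, window_seconds=1.0):
        self.primary_db = primary_db
        self.replica_db = replica_db
        # Assumed upper bound on replication lag; tune for your system
        self.window_seconds = window_seconds
        self.user_write_times = {}

    def write(self, user_id, key, value):
        # Write to primary (strong consistency)
        self.primary_db.put(key, value)
        # Remember when this user last wrote
        self.user_write_times[user_id] = time.monotonic()

    def read(self, user_id, key):
        # If the user wrote recently, read from the primary
        last_write = self.user_write_times.get(user_id)
        if last_write is not None and time.monotonic() - last_write < self.window_seconds:
            return self.primary_db.get(key)
        # Otherwise, read from a replica (faster)
        return self.replica_db.get(key)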

Quorum-Based Consistency

Ensure majority agreement:

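from collections import Counter

class QuorumReplicaSet:
    def __init__(self, replicas):
        self.replicas = replicas

    def write(self, key, value, quorum_size):
        """Succeed once quorum_size replicas acknowledge the write."""
        acks = 0
        for replica in self.replicas:
            if replica.write(key, value):
                acks += 1
                if acks >= quorum_size:
                    return True  # Quorum reached
        return False  # Too few acknowledgements

    def read(self, key, quorum_size):
        """Return the value that at least quorum_size replicas agree on."""
        votes = Counter(replica.read(key) for replica in self.replicas)
        value, count = votes.most_common(1)[0]
        # Without a quorum there is no agreed value; the caller must retry.
        # Production systems typically compare version numbers instead of
        # counting identical values.
        return value if count >= quorum_size else None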
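Quorum reads see the latest quorum write only when the two quorums are guaranteed to share a replica. With N replicas, a write quorum of W, and a read quorum of R, that holds exactly when R + W > N. The sketch below states the rule and checks it by brute-force enumeration; the function names are illustrative.

```python
from itertools import combinations


def quorums_overlap(n, w, r):
    """True if every read quorum shares a replica with every write quorum."""
    return r + w > n


def majority(n):
    """Smallest quorum size that is a strict majority of n replicas."""
    return n // 2 + 1


def overlap_by_enumeration(n, w, r):
    """Check the same property by trying every pair of quorums."""
    nodes = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(nodes, w)
        for read_set in combinations(nodes, r)
    )
```

For example, with three replicas, W = 2 and R = 2 overlap, while W = 1 and R = 1 do not, so a read may miss the most recent write. Using `majority(n)` for both W and R always satisfies the rule.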

Eventual Consistency with Conflict Resolution

import time

class DataStore:
    def __init__(self):
        self.data = {}  # key -> (value, timestamp)

    def put(self, key, value, timestamp=None):
        """Store with a timestamp for conflict resolution."""
        if timestamp is None:
            timestamp = time.time()
        self.data[key] = (value, timestamp)

    def merge_replica(self, other_store):
        """Merge another replica, resolving conflicts by timestamp."""
        for key, (value, timestamp) in other_store.data.items():
            local = self.data.get(key)
            # Last-write-wins: keep the entry with the later timestamp.
            # Equal timestamps keep the local value, so ties need a
            # deterministic tiebreaker for replicas to converge.
            if local is None or timestamp > local[1]:
                self.data[key] = (value, timestamp)

Trade-off Decision Matrix

| Scenario | Consistency | Reason |
|---|---|---|
| Bank balance | Strong | Money must be accurate; double spending is unacceptable |
| Post likes | Eventual | Approximate count OK; stale count acceptable |
| Friend list | Causal | If A adds B, then B sees A (causal relationship) |
| Cache | Eventual | Stale cached data is expected |
| E-commerce inventory | Strong | Can't sell the same item twice |
| User profile name | Eventual | Small delay in name change acceptable |
| Message thread | Causal | Replies must follow messages |
| Leaderboard | Eventual | Approximate rankings acceptable |
| Session token | Strong | Must validate immediately |

Monitoring and Observability

# alert() is an assumed notification hook (pager, log, metrics, ...)
class ConsistencyMonitor:
    def __init__(self):
        self.replication_lag_ms = []  # Lag samples in milliseconds
        self.divergence_count = 0     # Checks that found diverged replicas

    def measure_replication_lag(self, primary, replica, key):
        """Measure how far the replica is behind the primary for a key."""
        # Assumes get_version returns the last-applied write time in seconds
        primary_version = primary.get_version(key)
        replica_version = replica.get_version(key)
        lag_ms = (primary_version - replica_version) * 1000
        self.replication_lag_ms.append(lag_ms)

    def alert_if_high_lag(self, threshold_ms=100):
        if not self.replication_lag_ms:
            return  # Nothing measured yet
        avg_lag = sum(self.replication_lag_ms) / len(self.replication_lag_ms)
        if avg_lag > threshold_ms:
            alert(f"High replication lag: {avg_lag:.1f}ms")

    def detect_divergence(self, replicas, key):
        """Detect whether replicas hold different values for a key."""
        values = [r.get(key) for r in replicas]
        if len(set(values)) > 1:
            self.divergence_count += 1
            alert(f"Replicas diverged: {values}")
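An average can hide or distort lag spikes, so it may be worth tracking percentiles of the lag distribution as well. The sketch below uses a simple nearest-rank percentile; the `percentile` helper and the sample values are assumptions made for illustration.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list (pct in 1..100)."""
    if not samples:
        raise ValueError("no samples recorded")
    ordered = sorted(samples)
    rank = math.ceil(pct / 100 * len(ordered))   # 1-based rank
    return ordered[rank - 1]


lag_ms = [5, 7, 6, 8, 5, 6, 7, 900, 6, 5]   # one slow outlier
mean_lag = sum(lag_ms) / len(lag_ms)
p50 = percentile(lag_ms, 50)
p99 = percentile(lag_ms, 99)
```

With these samples the mean is pulled far above the typical lag by one outlier, the median shows what most reads experience, and the 99th percentile exposes the outlier itself. Alerting on a high percentile tends to match what the slowest users actually see.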

Self-Check

For each scenario, decide what consistency model to use and explain why:

  1. User's bank balance? Strong. Money errors have severe consequences; exact balance critical.
  2. Social media post likes? Eventual. Approximate counts acceptable; consistency delay fine.
  3. User's profile name? Weak/Eventual. Small propagation delay acceptable; not mission-critical.
  4. Shopping cart contents? Strong/Causal. Users must see their additions immediately.
  5. Product reviews? Eventual. Slight delay in review appearing acceptable.
  6. Authentication token validation? Strong. Must immediately reject revoked tokens.
  7. Recommendation feed? Eventual. Stale recommendations acceptable; data doesn't need to match perfectly.
  8. Distributed lock? Strong. Lock coordination requires immediate visibility.

One Takeaway

Consistency isn't binary—it's a spectrum. Stronger consistency costs latency and availability. Choose the weakest consistency model that's correct for your use case.

Next Steps

  1. Handle Failures: Read Partition Tolerance and Failure Modes
  2. Enable Retries: Learn about Idempotency
  3. Implement Communication: Explore API Styles
