Event Sourcing

Store the complete history of changes as immutable events rather than storing only the current state.

TL;DR

Instead of storing current state (user's name, balance, status), store the complete history of changes as immutable events: "NameChanged", "DepositMade", "StatusUpdated". Rebuild current state by replaying events. This gives you a complete audit trail, enables temporal queries (what was the balance on March 5?), and makes debugging easier—you can replay events to see exactly what happened. The tradeoff: reading requires replaying events, which is slower than a state table. Use snapshots to optimize: instead of replaying 1 million events, take a snapshot at event 500,000 and replay only 500,000 more. Event sourcing is powerful but adds complexity—use it when complete history, auditability, or temporal queries are requirements, not just nice-to-haves.

Learning Objectives

  • Understand event sourcing vs. state-based storage
  • Design event streams for your domain
  • Implement event replay and state reconstruction
  • Use snapshots to optimize performance
  • Create projections from event streams
  • Handle event versioning and schema evolution
  • Enable temporal queries and debugging

Motivating Scenario

A financial system stores only the current account balance: 8000. A user disputes a transaction from two months ago. How do you reconstruct exactly what happened? Without event sourcing, the history is lost. With event sourcing, you have the complete ledger: "Deposit 10000", "Withdraw 3000", "Fee 500", "Deposit 1500". Every change is recorded, so you can replay the ledger to see the balance at any point in time: 10000, 7000, 6500, then 8000.

Core Concepts

Event Store vs. State Store

Traditional: store the current state and, optionally, log changes. Event sourcing: store only events and derive state from them. Events are immutable facts; state is derived, so you can change how it is computed as you understand your domain better. Because stored events are never altered, you can replay them to debug a problem or to audit what happened.
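To make "derive state" concrete, here is a minimal sketch that folds a list of events into a balance, then derives a second, different view from the same events. The tuple-based event shape and the function names are assumptions made for illustration only.

```python
from functools import reduce

# Hypothetical event shape for illustration: (event_type, amount) tuples.
events = [
    ("MoneyDeposited", 100),
    ("MoneyWithdrawn", 30),
    ("MoneyDeposited", 50),
]


def apply_to_balance(balance, event):
    """Apply one event to the running balance and return the new balance."""
    event_type, amount = event
    if event_type == "MoneyDeposited":
        return balance + amount
    if event_type == "MoneyWithdrawn":
        return balance - amount
    return balance  # unknown event types leave the balance unchanged


def count_withdrawals(count, event):
    """A second derived view, computed from the same stored events."""
    return count + 1 if event[0] == "MoneyWithdrawn" else count


balance = reduce(apply_to_balance, events, 0)
withdrawals = reduce(count_withdrawals, events, 0)
print(balance, withdrawals)  # 120 1
```

The events never change; only the functions that interpret them do. That is what allows new views to be added later without migrating stored data.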

Event Streams

Organize events by aggregate (domain object). An "Account" aggregate has events like AccountCreated, MoneyDeposited, MoneyWithdrawn. Keep all events for one account together in a single ordered stream, so that reads and writes for that account stay consistent.
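One common way to keep writes to a stream consistent is an expected-version check (optimistic concurrency): a writer states which version of the stream it last saw, and the append is rejected if the stream has moved on. The text above does not prescribe this technique, so treat the following as a sketch under that assumption. `InMemoryEventStore`, `ConcurrencyError`, and the dict-based events are hypothetical names for illustration.

```python
from collections import defaultdict


class ConcurrencyError(Exception):
    """Raised when a writer's expected version does not match the stream."""


class InMemoryEventStore:
    """Toy store: one ordered list of events per stream ID (illustrative only)."""

    def __init__(self):
        self._streams = defaultdict(list)

    def append(self, stream_id, event, expected_version):
        stream = self._streams[stream_id]
        # The stream version is simply the number of events already stored.
        if len(stream) != expected_version:
            raise ConcurrencyError(
                f"{stream_id}: expected version {expected_version}, found {len(stream)}"
            )
        stream.append(event)
        return len(stream)  # the new stream version

    def read(self, stream_id):
        # Return a copy so callers cannot mutate the stored history.
        return list(self._streams.get(stream_id, []))


store = InMemoryEventStore()
version = store.append("account-1", {"type": "AccountCreated"}, expected_version=0)
version = store.append(
    "account-1", {"type": "MoneyDeposited", "amount": 100}, expected_version=version
)

# A second writer that last saw version 1 is now stale and gets rejected.
try:
    store.append(
        "account-1", {"type": "MoneyWithdrawn", "amount": 30}, expected_version=1
    )
    conflict_detected = False
except ConcurrencyError:
    conflict_detected = True
```

A rejected writer would typically reload the stream, re-check its business rules against the newer state, and retry.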

Snapshots

Replaying every event gets slower as a stream grows. Take periodic snapshots: "At event 100,000, the balance was 5000". To rebuild state, load the latest snapshot and replay only the events recorded after it. This trades extra storage for faster reads.
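The snapshot idea can be sketched as follows. The snapshot layout (a dict with `event_number` and `balance`) and the `rebuild` helper are assumptions for illustration; a real store would load both from persistent storage.

```python
def apply_event(balance, event):
    """Apply one (event_type, amount) tuple to the balance."""
    event_type, amount = event
    if event_type == "MoneyDeposited":
        return balance + amount
    if event_type == "MoneyWithdrawn":
        return balance - amount
    return balance


def rebuild(events, snapshot=None):
    """Return (balance, number_of_events_replayed).

    snapshot, if given, is a dict with 'event_number' (how many events are
    already folded into it) and 'balance' (the state at that point).
    """
    if snapshot is None:
        start, balance = 0, 0
    else:
        start, balance = snapshot["event_number"], snapshot["balance"]

    replayed = 0
    for event in events[start:]:
        balance = apply_event(balance, event)
        replayed += 1
    return balance, replayed


events = [("MoneyDeposited", 10)] * 1000 + [("MoneyWithdrawn", 5)] * 200

# Full replay touches every event.
full_balance, full_replayed = rebuild(events)

# Take a snapshot after the first 1000 events, then replay only the rest.
snapshot_balance, _ = rebuild(events[:1000])
snapshot = {"event_number": 1000, "balance": snapshot_balance}
fast_balance, fast_replayed = rebuild(events, snapshot)

print(full_balance, full_replayed)  # 9000 1200
print(fast_balance, fast_replayed)  # 9000 200
```

Both paths arrive at the same balance. The snapshot path just does less work, which is why a snapshot can always be discarded and regenerated from the events.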

Projections

Create materialized views from events. A "CustomerDashboard" projection consumes events and updates a view table: "When a UserCreated event arrives, insert a row into the users table". Projections make reads cheap and make it easier to integrate with systems that are not event-sourced.
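A projection like the one described above might look like the sketch below. A dict stands in for the users view table, and the event fields (`user_id`, `name`, `amount`) are assumptions for illustration.

```python
class CustomerDashboardProjection:
    """Keeps a read-optimized view up to date by handling events one by one."""

    def __init__(self):
        # Stand-in for the users view table: user_id -> row.
        self.users = {}

    def handle(self, event):
        kind = event["type"]
        if kind == "UserCreated":
            self.users[event["user_id"]] = {"name": event["name"], "balance": 0}
        elif kind == "MoneyDeposited":
            self.users[event["user_id"]]["balance"] += event["amount"]
        elif kind == "MoneyWithdrawn":
            self.users[event["user_id"]]["balance"] -= event["amount"]
        # Events this view does not care about are ignored.


events = [
    {"type": "UserCreated", "user_id": "u1", "name": "Ada"},
    {"type": "MoneyDeposited", "user_id": "u1", "amount": 100},
    {"type": "MoneyWithdrawn", "user_id": "u1", "amount": 30},
    {"type": "StatusUpdated", "user_id": "u1", "status": "active"},
]

dashboard = CustomerDashboardProjection()
for event in events:
    dashboard.handle(event)

print(dashboard.users)  # {'u1': {'name': 'Ada', 'balance': 70}}
```

Because the view is derived, it can be dropped and rebuilt at any time by feeding the full event history through `handle` again.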

Practical Example

# ❌ POOR - State-based storage loses history
# (`db` is a placeholder for your database client)
class Account:
    def __init__(self, account_id, balance=0):
        self.account_id = account_id
        self.balance = balance

    def deposit(self, amount):
        # Overwrites the stored balance; the previous value is gone
        self.balance += amount
        db.update('accounts', self.account_id, {'balance': self.balance})

    def get_balance(self):
        return self.balance

# ✅ EXCELLENT - Event sourcing with complete history
import json
from datetime import datetime

class AccountEvent:
    def __init__(self, event_type, data, timestamp=None):
        self.event_type = event_type
        self.data = data
        # Default to "now" when the event is first created
        self.timestamp = timestamp or datetime.now()

class Account:
    def __init__(self, account_id, event_store):
        self.account_id = account_id
        self.event_store = event_store
        self.version = 0

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("Amount must be positive")
        # Record what happened instead of overwriting a balance
        event = AccountEvent('MoneyDeposited', {'amount': amount})
        self.event_store.append(self.account_id, event)

    def withdraw(self, amount):
        if amount <= 0:
            raise ValueError("Amount must be positive")
        event = AccountEvent('MoneyWithdrawn', {'amount': amount})
        self.event_store.append(self.account_id, event)

    def get_current_state(self):
        """Reconstruct current state by replaying events"""
        events = self.event_store.get_events(self.account_id)
        state = {'balance': 0, 'created_at': None}

        for event in events:
            if event.event_type == 'AccountCreated':
                state['created_at'] = event.timestamp
            elif event.event_type == 'MoneyDeposited':
                state['balance'] += event.data['amount']
            elif event.event_type == 'MoneyWithdrawn':
                state['balance'] -= event.data['amount']

        return state

    def get_balance_at_date(self, date):
        """Temporal query: what was the balance on this date?"""
        events = self.event_store.get_events(self.account_id)
        balance = 0

        for event in events:
            # Events arrive in append order, so stop at the first later one
            if event.timestamp > date:
                break
            if event.event_type == 'MoneyDeposited':
                balance += event.data['amount']
            elif event.event_type == 'MoneyWithdrawn':
                balance -= event.data['amount']

        return balance

class EventStore:
    def __init__(self, db):
        self.db = db

    def append(self, stream_id, event):
        """Append event to stream"""
        self.db.insert('events', {
            'stream_id': stream_id,
            'event_type': event.event_type,
            'data': json.dumps(event.data),
            'timestamp': event.timestamp
        })

    def get_events(self, stream_id):
        """Get all events for a stream, in append order"""
        rows = self.db.query(
            'SELECT * FROM events WHERE stream_id = %s ORDER BY id',
            stream_id
        )
        return [AccountEvent(row['event_type'], json.loads(row['data']), row['timestamp'])
                for row in rows]

    def get_snapshot(self, stream_id):
        """Get latest snapshot for optimization"""
        return self.db.query(
            'SELECT * FROM snapshots WHERE stream_id = %s ORDER BY event_number DESC LIMIT 1',
            stream_id
        )

When to Use / When Not to Use

When to Use Event Sourcing
  1. Systems requiring complete audit trails and regulatory compliance
  2. Applications where temporal queries (what was the state on date X) are important
  3. Debugging complex business processes by replaying events
  4. Systems with complex domain logic that benefit from event-driven design
  5. High-scale systems where the event stream itself is valuable input for analysis

When NOT to Use Event Sourcing
  1. Simple CRUD applications where history isn't a requirement (the added complexity isn't justified)
  2. Teams without event-driven architecture maturity
  3. Systems with high update frequency where event storage becomes a bottleneck

Design Review Checklist

  • All state changes are represented as events
  • Events are immutable and append-only
  • Event store supports efficient querying by stream ID and timestamp
  • Snapshots are taken at regular intervals for performance
  • Event versioning strategy handles schema evolution
  • Events include sufficient context for debugging (user ID, timestamp, reasons)
  • Projections derive read models from events for query performance
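As one way to approach the immutability and context items on this checklist, an event can be modeled as a frozen dataclass that carries its metadata with it. The field names below (`user_id`, `reason`, and so on) are assumptions chosen to mirror the checklist, not a required schema.

```python
from dataclasses import FrozenInstanceError, dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Event:
    """An event envelope: the fact itself plus context for debugging."""

    stream_id: str
    event_type: str
    data: dict
    user_id: str  # who triggered the change
    reason: str   # why it happened
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


event = Event(
    stream_id="account-1",
    event_type="MoneyWithdrawn",
    data={"amount": 30},
    user_id="u1",
    reason="ATM withdrawal",
)

# frozen=True blocks attribute reassignment after construction.
try:
    event.event_type = "MoneyDeposited"
    mutated = True
except FrozenInstanceError:
    mutated = False

print(mutated)  # False
```

Note that `frozen=True` only protects the attributes themselves; the `data` dict can still be mutated in place, so true append-only guarantees have to come from the event store.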

Self-Check

  • How would you reconstruct state at a specific point in time?
  • What are the performance implications of replaying events?
  • How do you handle schema changes in events that have already been stored?

One Takeaway

Event sourcing trades simple reads and a simpler overall design for complete history and auditability. Store immutable events, rebuild state by replaying them, and use snapshots to keep replay fast. The investment in this pattern pays off when history, auditability, or temporal queries are central to your application.

Next Steps

  • Implement event store with persistence and efficient querying
  • Design snapshots to optimize event replay performance
  • Create projections from events to support different query patterns
  • Build event versioning and upcasting for schema evolution
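As a starting point for the versioning step, upcasting can be sketched as a chain of small functions that convert old event shapes to the current one at read time, leaving stored events untouched. The event layout, the `version` field, and the "USD" default are all assumptions made for illustration.

```python
def upcast_deposit_v1_to_v2(event):
    """v1 deposits had no currency; v2 requires one.

    The "USD" default is an assumption for this sketch.
    """
    data = dict(event["data"], currency="USD")
    return {**event, "version": 2, "data": data}


# Maps (event type, stored version) to the function that lifts it one version.
UPCASTERS = {
    ("MoneyDeposited", 1): upcast_deposit_v1_to_v2,
}


def upcast(event):
    """Apply upcasters repeatedly until the event is at its latest version."""
    while (event["type"], event["version"]) in UPCASTERS:
        event = UPCASTERS[(event["type"], event["version"])](event)
    return event


stored = {"type": "MoneyDeposited", "version": 1, "data": {"amount": 100}}
current = upcast(stored)

print(current)
# {'type': 'MoneyDeposited', 'version': 2, 'data': {'amount': 100, 'currency': 'USD'}}
```

Each upcaster returns a new dict rather than modifying its input, so the stored event stays exactly as it was written.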
