Data Modeling & Access
Design schemas and access patterns for performance, maintainability, and scale
Overview
How you model data and access it profoundly impacts system performance, maintainability, and scalability. This section covers schema design principles, trade-offs between normalization and denormalization, query optimization patterns, and multi-tenancy approaches.
Core Topics
- Normalization vs Denormalization - Trade-offs between data consistency and query performance
- ORMs & Data Mappers - Active Record vs Data Mapper patterns for persistence
- Query Patterns - Pagination, filtering, sorting for efficient data retrieval
- Indexing Strategies - Single, composite, partial indexes for query optimization
- Multi-Tenancy - Pooled vs siloed architectures for SaaS systems
Key Principles
- Understand Access Patterns First - Schema design should follow queries, not precede them
- Measure Before Optimizing - Use metrics to guide indexing and denormalization decisions
- Normalize for Consistency - Reduce anomalies and enforce data integrity
- Denormalize for Performance - Cache computed values and embed related data
- Use Appropriate Abstractions - ORMs help, but understand their costs
Design Decision Flow
1. Understand Access Patterns
↓
2. Choose Storage Model
↓
3. Design Base Schema (Normalized)
↓
4. Identify Performance Bottlenecks
↓
5. Add Indexes/Denormalization
↓
6. Measure & Iterate
Quick Comparison
| Pattern | Best For | Trade-off |
|---|---|---|
| Fully Normalized (3NF) | Write-heavy, transactional | Slower reads with joins |
| Denormalized | Read-heavy, analytical | Update complexity, data duplication |
| Mixed (denorm + normalized) | Most production systems | Complexity, consistency window |
| Wide-column denorm | Time-series, analytics | Single partition dependency |
Next Steps
- Start with Normalization vs Denormalization for foundational concepts
- Explore ORMs for persistence layer patterns
- Learn Query Patterns for access optimization
- Study Indexing for query performance
- Understand Multi-Tenancy for SaaS systems
Common Pitfalls
- Denormalizing without measuring first
- Ignoring access patterns during schema design
- Over-indexing (write penalty)
- Under-indexing (read performance cliffs)
- Improper multi-tenancy isolation