Data Residency & Sovereignty
Manage where data physically resides and who can access it, respecting legal boundaries and customer requirements.
TL;DR
Data residency specifies where data is physically stored. Data sovereignty specifies which jurisdiction's laws govern the data and, as a result, who may access and control it. Example: a customer contract or internal policy may require that EU customers' data reside in EU data centers and be accessible only to EU-based employees. Violating such requirements creates legal liability, regulatory fines, and customer trust issues. Plan early: where your data lives affects architecture, infrastructure, and operations.
Learning Objectives
- Understand the purpose and scope of data residency & sovereignty
- Learn practical implementation approaches and best practices
- Recognize common pitfalls and how to avoid them
- Build sustainable processes that scale with your organization
- Mentor others in applying these principles effectively
- Map regulatory requirements to technical constraints
- Design multi-region architectures respecting data boundaries
- Implement access controls reflecting sovereignty requirements
- Document data residency and sovereignty in architecture
Motivating Scenario
Without clear processes and alignment on where data may live and who may access it, teams work in silos and make duplicate or conflicting decisions about storage, backups, and access.
Real situation: Your US-based SaaS company operates in the EU. You store all customer data in AWS us-east-1 (Virginia). GDPR restricts transfers of personal data outside the EEA unless an approved safeguard, such as an adequacy decision or standard contractual clauses, is in place, and you have none. A customer files a complaint: the data of their 50 EU employees is in the US. The maximum GDPR fine is 20 million euros or 4% of worldwide annual revenue, whichever is higher.
Even worse: your backup system copies everything to a global read-only replica for disaster recovery, so the data also lands in every region that hosts the replica. Each copy outside the EU is another potential violation.
This section provides frameworks, templates, and practices to move forward with confidence and coherence.
Core Concepts
Key Terms
Data Residency: Physical location where data is stored. Example: "Customer data must reside in AWS eu-central-1."
Data Sovereignty: Legal/political control and jurisdiction over data. Example: "EU data cannot be accessed by US employees."
Personal Data: Any information relating to an identified or identifiable person. Example: names, emails, IP addresses.
Sensitive Data: Additional categories requiring stricter control. Example: payment info, health data, biometric data.
Jurisdiction: Legal authority governing the data. GDPR (EU), CCPA (California), LGPD (Brazil), etc.
Common Regulatory Requirements
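| Regulation / Standard | Scope | Key Requirement |
|---|---|---|
| GDPR | Personal data of individuals in the EU/EEA | Transfers outside the EEA need an adequacy decision or safeguards such as standard contractual clauses |
| CCPA | California residents | Right to delete, right to know, opt-out of sale |
| LGPD | Personal data of individuals in Brazil | International transfers need a legal basis (e.g., adequacy, contractual clauses, or consent) |
| SOC 2 | Service organizations (attestation framework, not a law) | Audited security controls |
| HIPAA | Healthcare data (US) | Safeguards such as encryption, limited access, audit logs |
| PCI-DSS | Payment card data (industry standard) | Encrypted, isolated from other systems |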
Purpose and Value
Data Residency & Sovereignty matters because it creates clarity without creating bureaucracy. When processes are lightweight and transparent, teams understand what decisions matter and can move fast with safety.
Key Principles
- Clarity: Make the "why" behind processes explicit
- Lightweight: Every process should create more value than it costs
- Transparency: Document criteria so teams know what to expect
- Evolution: Regularly review and refine based on experience
- Participation: Include affected teams in designing processes
Implementation Pattern
Most successful implementations follow this pattern: understand current state, design minimal viable process, pilot with early adopters, gather feedback, refine, and scale.
Governance Without Bureaucracy
The hard part is scaling without creating approval bottlenecks. This requires clear decision criteria, asynchronous review mechanisms, and truly delegating decisions to teams.
Practical Example
Three artifacts follow: a process implementation roadmap, a standard template, and a governance model.
# Data Residency & Sovereignty - Implementation Roadmap
Week 1-2: Discovery & Design
- Understand current pain points
- Design minimal viable process
- Identify early adopter teams
- Create templates and documentation
Week 3-4: Pilot & Feedback
- Run process with pilot teams
- Gather feedback weekly
- Make quick adjustments
- Document lessons learned
Week 5-6: Refinement & Documentation
- Incorporate feedback
- Create training materials
- Prepare communication plan
- Build tools to support process
Week 7+: Scaling & Iteration
- Roll out to all teams
- Monitor adoption metrics
- Gather feedback monthly
- Continuously improve based on learning
# Data Residency & Sovereignty - Quick Reference
## What This Is
[One sentence explanation]
## When to Use This
- Situation 1
- Situation 2
- Situation 3
## Process Steps
1. [Step with owner and timeline]
2. [Step with owner and timeline]
3. [Step with owner and timeline]
## Success Criteria
- [Measurable outcome 1]
- [Measurable outcome 2]
## Roles & Responsibilities
- [Role 1]: [Specific responsibility]
- [Role 2]: [Specific responsibility]
## Decision Criteria
- [Criterion that allows action]
- [Criterion that requires escalation]
- [Criterion that allows exception]
## Common Questions
Q: What if...?
A: [Clear answer]
Q: Who decides...?
A: [Clear authority]
# Governance Approach
Decision Tier 1: Team-Level (Own It)
- Internal team decisions
- No cross-team impact
- Timeline: Team decides
- Authority: Tech Lead
- Process: Documented in code review
Decision Tier 2: Cross-Team (Collaborate)
- Affects multiple teams or shared systems
- Requires coordination
- Timeline: 1-2 weeks
- Authority: System/Solution Architect
- Process: ADR review, stakeholder feedback
Decision Tier 3: Org-Level (Align)
- Organization-wide impact
- Strategic implications
- Timeline: 2-4 weeks
- Authority: Enterprise Architect
- Process: Design review, exception evaluation
Escape Hatch: Exception
- Justified deviation from standard
- Time-boxed (3-6 months)
- Requires rationale and review plan
- Authority: Decision owner for the relevant tier + affected team lead
Core Principles in Practice
- Make the Why Clear: Teams will follow processes they understand the purpose of
- Delegate Authority: Push decisions down; keep strategy centralized
- Use Asynchronous Review: Documents and ADRs scale better than meetings
- Measure Impact: Track metrics that show whether process is working
- Iterate Quarterly: Regular review keeps processes relevant
Success Indicators
- ✓ Teams proactively engage in the process
- ✓ 80%+ adoption without enforcement
- ✓ Clear reduction in the pain point the process addresses
- ✓ Minimal time overhead (less than 5% of team capacity)
- ✓ Positive feedback in retrospectives
Pitfalls to Avoid
- ❌ Process theater: Requiring documentation no one reads
- ❌ Over-standardization: Same rules for all teams and all decisions
- ❌ Changing frequently: Processes need 3-6 months to stabilize
- ❌ Ignoring feedback: Refusing to adapt based on experience
- ❌ One-size-fits-all: Different teams need different process levels
- ❌ No documentation: Unwritten processes get inconsistently applied
Related Concepts
This practice connects to:
- Architecture Governance & Organization (overall structure)
- Reliability & Resilience (ensuring systems stay healthy)
- Documentation & ADRs (capturing decisions and rationale)
- Team Structure & Communication (enabling effective collaboration)
Checklist: Before You Implement
- Clear problem statement: "This process solves [X]"
- Stakeholder input: Teams that will use it helped design it
- Minimal viable version: Start simple, add complexity only if needed
- Success metrics: Define what "better" looks like
- Communication plan: How will people learn about this?
- Pilot plan: Early adopters to validate before scaling
- Review schedule: When will we revisit and refine?
Self-Check
- Can you explain the purpose of this process in one sentence? If not, it's too complex.
- Do 80% of teams engage without being forced? If not, reconsider its value.
- Have you measured the actual impact? Or are you assuming it works?
- When did you last gather feedback? If >3 months, do it now.
Takeaway
The best processes are rarely the most comprehensive ones. They're the ones teams choose to follow because they see the value. Start lightweight, measure impact, gather feedback, and iterate. A simple process that 90% of teams adopt is infinitely better than a perfect process that 30% of teams bypass.
Next Steps
- Define the problem: What specifically are you trying to solve?
- Understand current state: How do teams work today?
- Design minimally: What's the smallest change that creates value?
- Pilot with volunteers: Find early adopters who see the value
- Gather feedback: Weekly for the first month, then monthly
- Refine and scale: Incorporate feedback and expand gradually
Architectural Patterns
Pattern: Data Residency by Region
Separate databases per region, replicated minimally:
US Customers → AWS us-east-1 (US database)
                   ↓
               Backup (us-west-2, still US)

EU Customers → AWS eu-central-1 (EU database)
                   ↓
               Backup (eu-west-1, still EU)

Shared Services (logs, metrics) → Global (anonymized only)
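The layout above can be expressed as a small placement table that is checked at lookup time. The following is a minimal sketch, not a definitive implementation: `REGION_JURISDICTION`, `RegionPlacement`, `PLACEMENTS`, and `placement_for` are hypothetical names introduced for illustration, and the region-to-jurisdiction mapping covers only the four regions named in the diagram.

```python
from dataclasses import dataclass

# Hypothetical mapping from cloud region to jurisdiction (illustration only).
REGION_JURISDICTION = {
    "us-east-1": "US",
    "us-west-2": "US",
    "eu-central-1": "EU",
    "eu-west-1": "EU",
}


@dataclass(frozen=True)
class RegionPlacement:
    primary: str  # region hosting the regional database
    backup: str   # region hosting its backup


# One placement per customer jurisdiction, mirroring the diagram.
PLACEMENTS = {
    "US": RegionPlacement(primary="us-east-1", backup="us-west-2"),
    "EU": RegionPlacement(primary="eu-central-1", backup="eu-west-1"),
}


def placement_for(customer_jurisdiction: str) -> RegionPlacement:
    """Return the primary and backup regions for a customer's jurisdiction."""
    try:
        placement = PLACEMENTS[customer_jurisdiction]
    except KeyError:
        raise ValueError(
            f"No placement defined for {customer_jurisdiction!r}"
        ) from None
    # Guard against misconfiguration: both regions must stay in-jurisdiction.
    for region in (placement.primary, placement.backup):
        if REGION_JURISDICTION.get(region) != customer_jurisdiction:
            raise ValueError(f"{region} is outside {customer_jurisdiction}")
    return placement
```

Failing loudly on an unknown jurisdiction, rather than falling back to a default region, is a deliberate choice in this sketch: a silent default is how data tends to end up in the wrong place.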
Pattern: Data Classification
Tag data by sensitivity and location requirements:
class DataTag:
    PERSONAL_EU = "personal:eu"     # GDPR-protected
    PERSONAL_US = "personal:us"     # CCPA-protected
    PAYMENT = "sensitive:payment"   # PCI-DSS
    HEALTH = "sensitive:health"     # HIPAA
    ANONYMOUS = "public:anonymous"  # No residency requirement
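Tags become useful once something enforces them. One possible approach, sketched below, is a lookup from tag to permitted storage regions with deny-by-default behavior. The block restates the tags as an `Enum` so that it is self-contained; `ALLOWED_REGIONS` and `can_store` are hypothetical names, and the region sets are assumptions for illustration, not a statement of what any regulation requires.

```python
from enum import Enum


class DataTag(Enum):
    PERSONAL_EU = "personal:eu"
    PERSONAL_US = "personal:us"
    PAYMENT = "sensitive:payment"
    HEALTH = "sensitive:health"
    ANONYMOUS = "public:anonymous"


# Hypothetical policy table. None means "no residency restriction".
# Tags missing from the table (PAYMENT, HEALTH here) are denied everywhere
# until someone writes an explicit policy for them.
ALLOWED_REGIONS = {
    DataTag.PERSONAL_EU: {"eu-central-1", "eu-west-1"},
    DataTag.PERSONAL_US: {"us-east-1", "us-west-2"},
    DataTag.ANONYMOUS: None,
}


def can_store(tag: DataTag, region: str) -> bool:
    """Return True if data carrying this tag may be stored in the region."""
    if tag not in ALLOWED_REGIONS:
        return False  # deny by default when no policy exists
    allowed = ALLOWED_REGIONS[tag]
    return allowed is None or region in allowed
```

A check like this could run in a storage wrapper or in CI against infrastructure definitions; where it runs is a design decision this sketch leaves open.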
Pattern: Access Control by Jurisdiction
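# requires_jurisdiction, anonymous_only, database_eu, database_us, and
# global_log_service are assumed to be defined elsewhere.

# Only EU employees can access EU data
@requires_jurisdiction("EU")
def get_eu_customer_data(customer_id):
    return database_eu.query(customer_id)

# Only US employees can access US data
# (a distinct name, so this does not overwrite the EU function)
@requires_jurisdiction("US")
def get_us_customer_data(customer_id):
    return database_us.query(customer_id)

# Logs can be global if anonymized
@anonymous_only
def log_event(event):
    return global_log_service.store(event)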
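The pattern leaves `requires_jurisdiction` undefined. One way it might be written is shown below as a sketch under stated assumptions: the caller's jurisdiction is read from a request-scoped `contextvars.ContextVar` named `current_jurisdiction`, which a real system would populate from the authenticated session. That variable, `JurisdictionError`, and the stand-in function body are all hypothetical.

```python
import contextvars
import functools

# Hypothetical request-scoped value holding the caller's jurisdiction.
# A real system would set it from the authenticated session.
current_jurisdiction = contextvars.ContextVar(
    "current_jurisdiction", default=None
)


class JurisdictionError(PermissionError):
    """Raised when a caller outside the required jurisdiction calls in."""


def requires_jurisdiction(required):
    """Allow the wrapped function to run only for callers in `required`."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            actual = current_jurisdiction.get()
            if actual != required:
                raise JurisdictionError(
                    f"{func.__name__} requires {required}, caller is {actual}"
                )
            return func(*args, **kwargs)
        return wrapper
    return decorator


@requires_jurisdiction("EU")
def get_eu_customer_data(customer_id):
    # Stand-in for a query against the EU database.
    return {"customer_id": customer_id, "region": "EU"}
```

Because the default is `None`, a request that never had its jurisdiction set is rejected rather than allowed. An application-level check like this complements, and does not replace, database and network-level controls.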
Pitfalls and Mitigations
Pitfall: Backups Cross Borders
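Your primary data is in-region, but backups go global. Copying personal data out of region this way can violate GDPR transfer rules and any residency commitments made to customers.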
Mitigation: Backup location must respect residency. Use regional backup services.
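A backup plan can be audited mechanically before it is deployed. The sketch below assumes the plan is available as a dictionary mapping each primary region to its backup regions; that shape, the `REGION_JURISDICTION` table, and the function name `find_cross_border_backups` are hypothetical and exist only for illustration.

```python
# Hypothetical mapping from cloud region to jurisdiction (illustration only).
REGION_JURISDICTION = {
    "us-east-1": "US",
    "us-west-2": "US",
    "eu-central-1": "EU",
    "eu-west-1": "EU",
}


def find_cross_border_backups(backup_plan):
    """Return (primary, backup) pairs whose backup leaves the jurisdiction.

    backup_plan maps a primary region to a list of backup regions.
    Regions missing from REGION_JURISDICTION are reported as violations,
    since their jurisdiction cannot be confirmed.
    """
    violations = []
    for primary, backups in backup_plan.items():
        home = REGION_JURISDICTION.get(primary)
        for backup in backups:
            if home is None or REGION_JURISDICTION.get(backup) != home:
                violations.append((primary, backup))
    return violations


plan = {
    "eu-central-1": ["eu-west-1", "us-east-1"],  # second target leaves the EU
    "us-east-1": ["us-west-2"],
}
violations = find_cross_border_backups(plan)
```

Here `violations` contains the single pair `("eu-central-1", "us-east-1")`. Running a check like this in CI would catch a global replica before it ships.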
Pitfall: Logging Captures Personal Data
Your logging system logs all requests, including personal data. Logs go to a global system, so personal data leaves its region without anyone deciding that it should.
Mitigation: Redact personal data before logging. Or keep logs in-region.
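Redaction can be attached to the logging pipeline itself so that individual call sites do not have to remember it. The sketch below uses Python's standard `logging.Filter` and two deliberately simple regular expressions for emails and IPv4 addresses. The patterns and the names `redact` and `RedactingFilter` are assumptions for illustration; real personal data takes many more forms (names, IDs, free text), so this should be read as a starting point rather than complete coverage.

```python
import logging
import re

# Simple illustrative patterns. They will not catch every form of
# personal data and may over-match in some inputs.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def redact(message: str) -> str:
    """Replace emails and IPv4 addresses with placeholders."""
    message = EMAIL_RE.sub("[EMAIL]", message)
    return IPV4_RE.sub("[IP]", message)


class RedactingFilter(logging.Filter):
    """Redact the fully formatted message before any handler sees it."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.msg = redact(record.getMessage())
        record.args = ()  # arguments are already merged into msg
        return True


logger = logging.getLogger("app.requests")
logger.addFilter(RedactingFilter())
```

With the filter attached, a call such as `logger.warning("login by %s from %s", email, ip)` reaches handlers with the placeholders in place of the raw values.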
Pitfall: Analytics and Metrics Mix Data
Your analytics system aggregates data from all regions to a central data warehouse. Now personal data is outside its jurisdiction.
Mitigation: Aggregate at the source. Compute metrics in each region, send only metrics (numbers, no personal data) globally.
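Aggregating at the source might look like the following sketch: each region reduces its raw events to counts, and only those counts cross the border. The event shape (dictionaries with a `type` key) and the function names `regional_metrics` and `merge_global` are hypothetical.

```python
from collections import Counter


def regional_metrics(region, events):
    """Reduce raw events to per-type counts. Runs inside the region.

    Only the event type is read; every other field (emails, IDs, ...)
    is dropped and never appears in the result.
    """
    counts = Counter(event["type"] for event in events)
    return {
        "region": region,
        "event_counts": dict(counts),
        "total": sum(counts.values()),
    }


def merge_global(metrics_list):
    """Combine regional counts centrally. Sees numbers only."""
    combined = Counter()
    for metrics in metrics_list:
        combined.update(metrics["event_counts"])
    return dict(combined)


eu = regional_metrics("eu", [
    {"type": "login", "email": "a@example.com"},
    {"type": "login", "email": "b@example.com"},
    {"type": "purchase", "email": "a@example.com"},
])
us = regional_metrics("us", [{"type": "login", "email": "c@example.com"}])
global_counts = merge_global([eu, us])
```

Counts over very small groups can still identify individuals, so a production version may need a minimum group size before a metric is exported.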