Data Residency & Sovereignty

Manage where data physically resides and who can access it, respecting legal boundaries and customer requirements.

TL;DR

Data residency specifies where data is physically stored. Data sovereignty concerns which jurisdiction's laws govern the data and, as a consequence, who may access and control it. Example: a policy that EU customers' data must reside in EU data centers and be accessible only to EU-based employees. Violating such requirements creates legal liability, regulatory fines, and loss of customer trust. Plan early: where your data lives affects architecture, infrastructure, and operations.

Learning Objectives

Understand the purpose and scope of data residency & sovereignty
Learn practical implementation approaches and best practices
Recognize common pitfalls and how to avoid them
Build sustainable processes that scale with your organization
Mentor others in applying these principles effectively
Map regulatory requirements to technical constraints
Design multi-region architectures respecting data boundaries
Implement access controls reflecting sovereignty requirements
Document data residency and sovereignty in architecture

Motivating Scenario

Your organization stores data for customers in more than one legal jurisdiction. Without clear processes and alignment, teams work in silos and make duplicate or conflicting decisions about where that data lives.

Real situation: Your US-based SaaS company operates in the EU. You store all customer data in AWS us-east-1 (Virginia). GDPR restricts transfers of personal data outside the EU/EEA unless a legal mechanism (such as an adequacy decision or standard contractual clauses) covers them, and you have none in place. A customer files a complaint: the data of their 50 EU employees is in the US. GDPR fines can reach 20 million euros or 4% of global annual revenue, whichever is higher.

Even worse: your backup system copies everything to a global read-only replica for disaster recovery. Technically the data left the EU. Violation.

This section provides frameworks, templates, and practices to move forward with confidence and coherence.

Core Concepts

Key Terms

Data Residency: Physical location where data is stored. Example: "Customer data must reside in AWS eu-central-1."

Data Sovereignty: Legal/political control and jurisdiction over data. Example: "EU data cannot be accessed by US employees."

Personal Data: Any information relating to an identified or identifiable person. Example: names, emails, IP addresses.

Sensitive Data: Additional categories requiring stricter control. Example: payment info, health data, biometric data.

Jurisdiction: Legal authority governing the data. Example: the EU (GDPR), California (CCPA), Brazil (LGPD).

Common Regulatory Requirements

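Regulation / Standard	Scope	Key Requirement
GDPR	Personal data of people in the EU/EEA	Transfers outside the EEA need a legal mechanism (adequacy decision, standard contractual clauses)
CCPA	California residents	Right to delete, right to know, right to opt out of sale
LGPD	People in Brazil	International transfers need a legal basis (for example consent or an adequacy finding)
SOC 2	Service organizations (audit framework, not a law)	Audited security controls
HIPAA	Protected health information (US)	Encrypted, limited access, audit logs
PCI-DSS	Payment card data (industry standard)	Encrypted, isolated from other systems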
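
To map requirements like those in the table to technical constraints, one option is a small policy table that deployment checks can query. The following is a minimal sketch, not legal guidance: the `Constraint` type, the `POLICY` values, and the `violations` helper are all hypothetical, and the region lists stand in for whatever your own legal review decides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constraint:
    """Technical constraint a company derives from a regulation (illustrative)."""
    allowed_regions: frozenset  # empty set means "no residency restriction"
    encrypt_at_rest: bool
    audit_log: bool


# Hypothetical internal policy values; an assumption, not a statement of law.
POLICY = {
    "GDPR": Constraint(frozenset({"eu-central-1", "eu-west-1"}), True, True),
    "LGPD": Constraint(frozenset({"sa-east-1"}), True, True),
    "HIPAA": Constraint(frozenset(), True, True),
}


def violations(regulation, region, encrypted, audited):
    """Return human-readable problems for one datastore under one regulation."""
    constraint = POLICY[regulation]
    problems = []
    if constraint.allowed_regions and region not in constraint.allowed_regions:
        problems.append(f"{regulation}: region {region} not allowed")
    if constraint.encrypt_at_rest and not encrypted:
        problems.append(f"{regulation}: encryption at rest required")
    if constraint.audit_log and not audited:
        problems.append(f"{regulation}: audit logging required")
    return problems


print(violations("GDPR", "us-east-1", encrypted=True, audited=True))
```

Keeping the policy as data rather than scattered `if` statements means a legal change becomes a one-line edit that every check picks up.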

Purpose and Value

A data residency & sovereignty process matters because it creates clarity without creating bureaucracy. When the process is lightweight and transparent, teams understand which decisions matter and can move fast safely.

Key Principles

Clarity: Make the "why" behind processes explicit
Lightweight: Every process should create more value than it costs
Transparency: Document criteria so teams know what to expect
Evolution: Regularly review and refine based on experience
Participation: Include affected teams in designing processes

Implementation Pattern

Most successful implementations follow this pattern: understand current state, design minimal viable process, pilot with early adopters, gather feedback, refine, and scale.

Governance Without Bureaucracy

The hard part is scaling without creating approval bottlenecks. This requires clear decision criteria, asynchronous review mechanisms, and truly delegating decisions to teams.

Practical Example


# Data Residency & Sovereignty - Implementation Roadmap

Week 1-2: Discovery & Design
  - Understand current pain points
  - Design minimal viable process
  - Identify early adopter teams
  - Create templates and documentation

Week 3-4: Pilot & Feedback
  - Run process with pilot teams
  - Gather feedback weekly
  - Make quick adjustments
  - Document lessons learned

Week 5-6: Refinement & Documentation
  - Incorporate feedback
  - Create training materials
  - Prepare communication plan
  - Build tools to support process

Week 7+: Scaling & Iteration
  - Roll out to all teams
  - Monitor adoption metrics
  - Gather feedback monthly
  - Continuously improve based on learning

# Data Residency & Sovereignty - Quick Reference

## What This Is
[One sentence explanation]

## When to Use This
- Situation 1
- Situation 2
- Situation 3

## Process Steps
1. [Step with owner and timeline]
2. [Step with owner and timeline]
3. [Step with owner and timeline]

## Success Criteria
- [Measurable outcome 1]
- [Measurable outcome 2]

## Roles & Responsibilities
- [Role 1]: [Specific responsibility]
- [Role 2]: [Specific responsibility]

## Decision Criteria
- [Criterion that allows action]
- [Criterion that requires escalation]
- [Criterion that allows exception]

## Common Questions
Q: What if...?
A: [Clear answer]

Q: Who decides...?
A: [Clear authority]

# Governance Approach

Decision Tier 1: Team-Level (Own It)
  - Internal team decisions
  - No cross-team impact
  - Timeline: Team decides
  - Authority: Tech Lead
  - Process: Documented in code review

Decision Tier 2: Cross-Team (Collaborate)
  - Affects multiple teams or shared systems
  - Requires coordination
  - Timeline: 1-2 weeks
  - Authority: System/Solution Architect
  - Process: ADR review, stakeholder feedback

Decision Tier 3: Org-Level (Align)
  - Organization-wide impact
  - Strategic implications
  - Timeline: 2-4 weeks
  - Authority: Enterprise Architect
  - Process: Design review, exception evaluation

Escape Hatch: Exception
  - Justified deviation from standard
  - Time-boxed (3-6 months)
  - Requires rationale and review plan
  - Authority: Role + affected team lead

Core Principles in Practice

Make the Why Clear: Teams will follow processes they understand the purpose of
Delegate Authority: Push decisions down; keep strategy centralized
Use Asynchronous Review: Documents and ADRs scale better than meetings
Measure Impact: Track metrics that show whether process is working
Iterate Quarterly: Regular review keeps processes relevant

Success Indicators

✓ Teams proactively engage in the process
✓ 80%+ adoption without enforcement
✓ Clear reduction in the pain point the process addresses
✓ Minimal time overhead (less than 5% of team capacity)
✓ Positive feedback in retrospectives

Pitfalls to Avoid

❌ Process theater: Requiring documentation no one reads
❌ Over-standardization: Same rules for all teams and all decisions
❌ Changing frequently: Processes need 3-6 months to stabilize
❌ Ignoring feedback: Refusing to adapt based on experience
❌ One-size-fits-all: Different teams need different process levels
❌ No documentation: Unwritten processes get inconsistently applied

Related Concepts

This practice connects to:

Architecture Governance & Organization (overall structure)
Reliability & Resilience (ensuring systems stay healthy)
Documentation & ADRs (capturing decisions and rationale)
Team Structure & Communication (enabling effective collaboration)

Checklist: Before You Implement

Clear problem statement: "This process solves [X]"
Stakeholder input: Teams that will use it helped design it
Minimal viable version: Start simple, add complexity only if needed
Success metrics: Define what "better" looks like
Communication plan: How will people learn about this?
Pilot plan: Early adopters to validate before scaling
Review schedule: When will we revisit and refine?

Self-Check

Can you explain the purpose of this process in one sentence? If not, it's too complex.
Do 80% of teams engage without being forced? If not, reconsider its value.
Have you measured the actual impact? Or are you assuming it works?
When did you last gather feedback? If >3 months, do it now.

Takeaway

The best processes are rarely the most comprehensive ones. They're the ones teams choose to follow because they see the value. Start lightweight, measure impact, gather feedback, and iterate. A simple process that 90% of teams adopt is infinitely better than a perfect process that 30% of teams bypass.

Next Steps

Define the problem: What specifically are you trying to solve?
Understand current state: How do teams work today?
Design minimally: What's the smallest change that creates value?
Pilot with volunteers: Find early adopters who see the value
Gather feedback: Weekly for the first month, then monthly
Refine and scale: Incorporate feedback and expand gradually

Architectural Patterns

Pattern: Data Residency by Region

Separate databases per region, replicated minimally:

US Customers → AWS us-east-1 (US database)
              ↓
              Backup (us-west-2, still US)

EU Customers → AWS eu-central-1 (EU database)
             ↓
             Backup (eu-west-1, still EU)

Shared Services (logs, metrics) → Global (anonymized only)
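
The routing in the diagram above can be sketched as a lookup that fails closed. This is a minimal illustration: `REGION_PLAN` and `storage_region` are hypothetical names, and the region pairs simply mirror the diagram.

```python
# Hypothetical mapping from customer jurisdiction to primary and backup regions,
# mirroring the diagram. Adjust to your own deployment.
REGION_PLAN = {
    "US": {"primary": "us-east-1", "backup": "us-west-2"},
    "EU": {"primary": "eu-central-1", "backup": "eu-west-1"},
}


def storage_region(jurisdiction, purpose="primary"):
    """Return the region for a jurisdiction, refusing to guess for unknown input."""
    try:
        return REGION_PLAN[jurisdiction][purpose]
    except KeyError:
        # Fail closed: an unmapped jurisdiction must not fall back to a default region.
        raise ValueError(
            f"no {purpose} region configured for jurisdiction {jurisdiction!r}"
        ) from None


print(storage_region("EU"))            # eu-central-1
print(storage_region("EU", "backup"))  # eu-west-1
```

Raising on an unknown jurisdiction is a deliberate choice here: a silent default region is how data ends up in the wrong place.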

Pattern: Data Classification

Tag data by sensitivity and location requirements:

class DataTag:
    PERSONAL_EU = "personal:eu"      # GDPR-protected
    PERSONAL_US = "personal:us"      # e.g. CCPA (California residents)
    PAYMENT = "sensitive:payment"    # PCI-DSS
    HEALTH = "sensitive:health"      # HIPAA
    ANONYMOUS = "public:anonymous"   # No residency requirement
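
Tags become useful once something enforces them. The sketch below assumes a hypothetical `ALLOWED_REGIONS` table and `may_store` helper; the region sets are illustrative policy choices, not requirements taken from the regulations themselves.

```python
class DataTag:
    PERSONAL_EU = "personal:eu"
    PERSONAL_US = "personal:us"
    PAYMENT = "sensitive:payment"
    HEALTH = "sensitive:health"
    ANONYMOUS = "public:anonymous"


# Hypothetical policy: explicit region allowlists per tag (None = any region).
# Explicit names are used instead of a prefix match such as "eu-", because
# eu-west-2 is London, which is outside the EU.
ALLOWED_REGIONS = {
    DataTag.PERSONAL_EU: {"eu-central-1", "eu-west-1"},
    DataTag.PERSONAL_US: {"us-east-1", "us-west-2"},
    DataTag.PAYMENT: {"eu-central-1", "us-east-1"},
    DataTag.HEALTH: {"us-east-1", "us-west-2"},
    DataTag.ANONYMOUS: None,
}


def may_store(tags, region):
    """A record may live in a region only if every one of its tags allows it."""
    if not tags:
        return False  # untagged data is unclassified: fail closed
    for tag in tags:
        if tag not in ALLOWED_REGIONS:
            return False  # unknown tag: fail closed
        allowed = ALLOWED_REGIONS[tag]
        if allowed is not None and region not in allowed:
            return False
    return True


print(may_store({DataTag.PERSONAL_EU}, "eu-central-1"))  # True
print(may_store({DataTag.PERSONAL_EU}, "us-east-1"))     # False
```

The most restrictive tag wins, so a record mixing anonymous and personal fields is treated as personal.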

Pattern: Access Control by Jurisdiction

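# Assumes requires_jurisdiction, anonymous_only, database_eu, database_us,
# and global_log_service are defined elsewhere in the codebase.

# Only EU employees can access EU data
@requires_jurisdiction("EU")
def get_eu_customer_data(customer_id):
    return database_eu.query(customer_id)

# Only US employees can access US data.
# Distinct name: reusing get_customer_data would silently replace the EU function.
@requires_jurisdiction("US")
def get_us_customer_data(customer_id):
    return database_us.query(customer_id)

# Logs can be global if anonymized
@anonymous_only
def log_event(event):
    return global_log_service.store(event)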
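
The `requires_jurisdiction` decorator is not defined in this section. One possible shape for it is sketched below, assuming the caller's identity is passed explicitly as the first argument. The `Caller` type, `JurisdictionError`, and the in-memory `FAKE_EU_DB` are hypothetical stand-ins for a real identity system and database.

```python
import functools
from dataclasses import dataclass


@dataclass(frozen=True)
class Caller:
    """Hypothetical identity record; a real system would derive this from auth."""
    user_id: str
    jurisdiction: str


class JurisdictionError(PermissionError):
    """Raised when a caller's jurisdiction does not match the data's."""


def requires_jurisdiction(required):
    """Allow the wrapped function only for callers in the required jurisdiction."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(caller, *args, **kwargs):
            if caller.jurisdiction != required:
                raise JurisdictionError(
                    f"{caller.user_id} ({caller.jurisdiction}) may not call "
                    f"{func.__name__}; requires {required}"
                )
            return func(caller, *args, **kwargs)
        return wrapper
    return decorator


FAKE_EU_DB = {"c-1": {"name": "Example GmbH"}}  # stand-in for a regional database


@requires_jurisdiction("EU")
def read_eu_customer(caller, customer_id):
    return FAKE_EU_DB[customer_id]


print(read_eu_customer(Caller("alice", "EU"), "c-1"))
```

An application-level check like this complements, rather than replaces, enforcement at the database and network layers.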

Pitfalls and Mitigations

Pitfall: Backups Cross Borders

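Your primary data is in-region, but backups go global. Copying personal data out of the region without a legal transfer mechanism can violate GDPR just as the primary store would.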

Mitigation: Backup location must respect residency. Use regional backup services.
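
A backup plan can be checked automatically before it is deployed. The sketch below assumes a hypothetical plan format (a list of dicts) and a hypothetical `ALLOWED_BACKUP_REGIONS` allowlist; real backup configuration would come from your infrastructure tooling.

```python
# Hypothetical allowlist of backup regions per jurisdiction.
ALLOWED_BACKUP_REGIONS = {
    "EU": {"eu-central-1", "eu-west-1"},
    "US": {"us-east-1", "us-west-2"},
}


def backup_violations(plan):
    """Return (dataset, region) pairs where a backup leaves its jurisdiction."""
    found = []
    for entry in plan:
        # Unknown jurisdictions get an empty allowlist, so every region is flagged.
        allowed = ALLOWED_BACKUP_REGIONS.get(entry["jurisdiction"], set())
        for region in entry["backup_regions"]:
            if region not in allowed:
                found.append((entry["dataset"], region))
    return found


plan = [
    {"dataset": "eu-customers", "jurisdiction": "EU",
     "backup_regions": ["eu-west-1", "us-east-1"]},
    {"dataset": "us-customers", "jurisdiction": "US",
     "backup_regions": ["us-west-2"]},
]

print(backup_violations(plan))  # [('eu-customers', 'us-east-1')]
```

Running a check like this in CI catches a cross-border replica before it ships rather than after a complaint.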

Pitfall: Logging Captures Personal Data

Your logging system logs all requests, including personal data. Logs go to a global system. Violation.

Mitigation: Redact personal data before logging. Or keep logs in-region.
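
Redaction before logging can be sketched with the standard `logging` module. This example is deliberately simple: the two regular expressions only catch email addresses and IPv4 addresses, and the `redact` function and `RedactingFilter` class are hypothetical names. Real personal data takes many more forms, so treat pattern matching as one layer, not a guarantee.

```python
import logging
import re

# Simplified patterns (an assumption): they cover common emails and IPv4 only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def redact(message):
    """Replace emails and IPv4 addresses with placeholders."""
    message = EMAIL_RE.sub("[email]", message)
    return IPV4_RE.sub("[ip]", message)


class RedactingFilter(logging.Filter):
    """Rewrite each record's message in place before any handler emits it."""

    def filter(self, record):
        # Format first so values passed as log arguments are redacted too.
        record.msg = redact(record.getMessage())
        record.args = ()
        return True  # keep the record; it has only been rewritten


handler = logging.StreamHandler()
handler.addFilter(RedactingFilter())
logger = logging.getLogger("app")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.info("login ok for %s from %s", "anna@example.com", "203.0.113.7")
```

Attaching the filter to the handler means every record passing through it is redacted, regardless of which module logged it.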

Pitfall: Analytics and Metrics Mix Data

Your analytics system aggregates data from all regions to a central data warehouse. Now personal data is outside its jurisdiction.

Mitigation: Aggregate at the source. Compute metrics in each region, send only metrics (numbers, no personal data) globally.
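
Aggregating at the source can be sketched as two steps: each region reduces its own events to counts, and only those counts are merged globally. The event shape (`type`, `user_id`) and both function names are assumptions for illustration. Very small counts can still single out individuals, so a real pipeline may also need minimum thresholds.

```python
from collections import Counter


def regional_metrics(region, events):
    """Runs inside one region; the result contains counts only, no identifiers."""
    return {
        "region": region,
        "event_counts": dict(Counter(e["type"] for e in events)),
        "active_users": len({e["user_id"] for e in events}),
    }


def merge_global(summaries):
    """Runs centrally; it only ever sees the regional summaries."""
    total = Counter()
    for summary in summaries:
        total.update(summary["event_counts"])
    return {
        "event_counts": dict(total),
        # Assumes each user belongs to exactly one region, so sums do not double count.
        "active_users": sum(s["active_users"] for s in summaries),
    }


eu_events = [
    {"type": "login", "user_id": "u1"},
    {"type": "login", "user_id": "u2"},
    {"type": "export", "user_id": "u1"},
]
us_events = [{"type": "login", "user_id": "u9"}]

summaries = [
    regional_metrics("eu-central-1", eu_events),
    regional_metrics("us-east-1", us_events),
]
print(merge_global(summaries))
```

Only the output of `regional_metrics` crosses a border; the raw events, including `user_id`, never leave their region.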
