Synchronous vs Asynchronous Communication

Master the fundamental choice that determines coupling, latency, and availability of your system

TL;DR

Synchronous: A calls B and waits for the response. Tight coupling and a simple flow, but A is unavailable whenever B is. Asynchronous: A sends a message to B and continues immediately. Loose coupling and better scaling, but more complexity. Most systems use both: synchronous for user-facing operations that need immediate feedback, asynchronous for background work and inter-service communication.

Learning Objectives

  • Understand the difference between synchronous and asynchronous communication
  • Recognize coupling implications of each approach
  • Apply each style to appropriate scenarios
  • Understand how to combine both effectively

Motivating Scenario

Your e-commerce system receives an order. The customer is waiting for confirmation. Payment processing might take seconds. Email delivery might take minutes. Inventory updates might queue up. If you make the customer wait for all of it synchronously, the request times out. If you hand everything off asynchronously and respond immediately, the customer never learns whether the order went through. The solution: synchronous for operations the response depends on (payment), asynchronous for operations that can complete later (inventory, email).

Synchronous Communication

Definition: Service A sends a request to Service B and waits for a response before continuing.

Characteristics:

  • Blocking: A doesn't proceed until B responds
  • Request-response pattern
  • Immediate feedback
  • Tight coupling (A depends on B being up)
  • Simple to reason about
  • Limited scalability (waiting requests consume resources)

Timing:

Time 0: A sends request ──────────→ B
Time 1: B processes
Time 2: A receives response ←────── B
A continues

When to Use:

  • User-facing operations needing immediate response
  • Operations where failure is obvious (show error to user)
  • Operations where response is required to proceed
  • Small microservices environments with good networks

User clicks "Purchase"
├─ Check inventory (sync) ────→ Inventory Service
├─ Wait for response ✓
├─ Process payment (sync) ────→ Payment Service
├─ Wait for response ✓
└─ Return confirmation to user ✓

Time: ~500ms-2s (depending on latencies)
If Inventory Service is down: User sees error immediately
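
The blocking flow above can be sketched in a few lines. This is a minimal illustration, not a real client: `check_inventory`, `process_payment`, and `ServiceUnavailable` are hypothetical stand-ins, and the 50 ms delays are arbitrary values that simulate network latency with `time.sleep`.

```python
import time


class ServiceUnavailable(Exception):
    """Raised by a simulated downstream service that is down."""


def check_inventory(item_id: str, available: bool = True) -> bool:
    # Hypothetical stand-in for a blocking call to an inventory service.
    time.sleep(0.05)
    if not available:
        raise ServiceUnavailable("inventory service is down")
    return True


def process_payment(amount: float) -> str:
    # Hypothetical stand-in for a blocking call to a payment service.
    time.sleep(0.05)
    return "payment-ok"


def purchase_sync(item_id: str, amount: float, inventory_up: bool = True) -> dict:
    """Each step blocks until the previous one has answered."""
    started = time.perf_counter()
    try:
        check_inventory(item_id, available=inventory_up)
        receipt = process_payment(amount)
    except ServiceUnavailable as exc:
        # The caller learns about the failure right away.
        return {"status": "error", "reason": str(exc)}
    elapsed = time.perf_counter() - started
    return {"status": "confirmed", "receipt": receipt, "elapsed_s": elapsed}
```

Calling `purchase_sync("sku-1", 20.0)` takes at least the sum of both simulated delays, while `purchase_sync("sku-1", 20.0, inventory_up=False)` returns an error result as soon as the first call fails.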

Asynchronous Communication

Definition: Service A sends a message to Service B (usually through a broker) and continues immediately without waiting for processing.

Characteristics:

  • Non-blocking: A continues after sending
  • Fire-and-forget or publish-subscribe
  • Decoupled: A doesn't need B to be up when it sends (though the broker must be available)
  • No immediate feedback
  • Complex to reason about
  • High scalability (no waiting)

Timing:

Time 0: A sends message ──────→ Queue ←─── B subscribing
A continues immediately
Time 1: B picks up message
Time 2: B processes
(A already finished long ago)

When to Use:

  • Background operations (emails, analytics)
  • Work that takes time (image processing)
  • Operations where immediate feedback isn't needed
  • Decoupling services
  • Event-driven architectures
  • Handling spikes in traffic

User clicks "Purchase"
├─ Check inventory (sync) ────→ Inventory Service ✓
├─ Send order event (async) ──→ Message Queue
└─ Return confirmation immediately ✓

Later, asynchronously:
├─ Payment Service picks up order
├─ Process payment
└─ Send payment confirmation email

Time: ~100ms (only for critical path)
If Email Service is down: Order still processes, email sent later
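
A small in-process sketch of the same idea, assuming Python's standard `queue.Queue` as a stand-in for a real message broker. The names `purchase_async`, `worker`, and `processed` are illustrative only. Note that the consumer is started after the producer has already returned, which mirrors a downstream service that is temporarily unavailable.

```python
import queue
import threading

order_queue = queue.Queue()   # stand-in for a message broker
processed: list[str] = []


def worker() -> None:
    # Consumer: runs independently of the request that enqueued the work.
    while True:
        message = order_queue.get()
        if message is None:  # sentinel used to stop the worker
            order_queue.task_done()
            break
        processed.append(message["order_id"])  # stand-in for payment + email
        order_queue.task_done()


def purchase_async(order_id: str) -> dict:
    # Producer: enqueue the work and return without waiting for processing.
    order_queue.put({"order_id": order_id})
    return {"status": "accepted", "order_id": order_id}


response = purchase_async("order-42")  # returns before any processing happens
consumer = threading.Thread(target=worker, daemon=True)
consumer.start()                        # the consumer comes up later
order_queue.put(None)
order_queue.join()                      # demo only: wait for the backlog to drain
```

The producer's response says "accepted" rather than "confirmed": at that point it only knows the message was queued, not that the work succeeded.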

Comparison

Synchronous
  1. REST API calls
  2. gRPC requests
  3. Database queries
Asynchronous
  1. Message queues
  2. Event streams
  3. Webhooks

The Latency Difference

[Figure: Latency: Synchronous vs Asynchronous]

Hybrid Approaches

The best systems use both:

  1. Synchronous for critical path: User needs immediate feedback
  2. Asynchronous for background: Processing that can happen later

@app.post('/orders')
def create_order(request):
    # Synchronous: Check inventory, required for decision
    if not inventory.has_stock(request.item_id):
        return error('Out of stock')

    # Synchronous: Process payment, required for decision
    payment_result = payment_service.charge(request.amount)
    if not payment_result.success:
        return error('Payment failed')

    # Create order (synchronous)
    order = Order.create(...)

    # Asynchronous: Send confirmation email (can happen later)
    queue.send('email.new_order', order.id)

    # Asynchronous: Update analytics (can happen later)
    queue.send('analytics.order_created', order.id)

    # Return immediately with confirmed order
    return success(order)

Failure Handling

[Figure: Failure Handling in Sync vs Async]

Architecture Patterns for Sync/Async

Saga Pattern (Distributed Transactions)

import asyncio


class OrderSaga:
    """Coordinate order creation across multiple services."""

    async def create_order(self, order_data):
        """
        Orchestrate order across multiple services.
        Synchronous critical path, asynchronous updates.
        """
        try:
            # Synchronous: Must succeed or entire order fails
            order = await self.create_order_in_db(order_data)

            # Concurrent: individual service failures don't fail the order,
            # but they must be tracked for compensation
            tasks = [
                self.reserve_inventory(order),
                self.authorize_payment(order),
                self.create_shipment(order),
                self.send_confirmation_email(order)
            ]

            # return_exceptions=True collects failures as values
            # instead of raising the first one
            results = await asyncio.gather(*tasks, return_exceptions=True)

            # Track failures for compensation
            failures = [r for r in results if isinstance(r, Exception)]
            if failures:
                # Log failures, trigger compensation
                await self.handle_partial_failure(order, failures)

            return order

        except Exception:
            # Rollback on critical failure
            await self.rollback_order(order_data)
            raise

    async def handle_partial_failure(self, order, failures):
        """
        Compensate for failed async operations.
        Example: inventory reserved but payment failed.
        """
        for failure in failures:
            await self.compensation_service.compensate(order, failure)
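
The compensation step can be made concrete with a sequential variant of the saga, in which each action is paired with an undo action and completed steps are undone in reverse order when a later step fails. This is a sketch under assumptions, not the class above: `run_saga`, `PaymentDeclined`, and the four step functions are hypothetical, and the payment failure is simulated.

```python
import asyncio
from typing import Awaitable, Callable

Step = Callable[[dict], Awaitable[None]]


class PaymentDeclined(Exception):
    pass


async def run_saga(order: dict, steps: list[tuple[Step, Step]]) -> bool:
    """Run (action, compensation) pairs; undo completed steps on failure."""
    completed: list[Step] = []
    for action, compensation in steps:
        try:
            await action(order)
        except Exception:
            # Undo in reverse order so the most recent step is rolled back first.
            for undo in reversed(completed):
                await undo(order)
            return False
        completed.append(compensation)
    return True


log: list[str] = []


async def reserve_inventory(order: dict) -> None:
    log.append("inventory reserved")


async def release_inventory(order: dict) -> None:
    log.append("inventory released")


async def authorize_payment(order: dict) -> None:
    raise PaymentDeclined("card declined")  # simulated failure


async def void_payment(order: dict) -> None:
    log.append("payment voided")


ok = asyncio.run(run_saga(
    {"id": "order-1"},
    [(reserve_inventory, release_inventory), (authorize_payment, void_payment)],
))
```

Here the payment step fails, so only the inventory reservation is compensated; `void_payment` never runs because the payment was never authorized.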

Outbox Pattern (Reliable Publishing)

import logging

logger = logging.getLogger(__name__)


class OrderService:
    """Ensure events are published even if system crashes."""

    async def create_order(self, order_data):
        # Transaction 1: Create order and event in same transaction
        async with db.transaction():
            order = await self.orders.create(order_data)

            # Write event to outbox (same transaction)
            await self.outbox.insert({
                'event_type': 'OrderCreated',
                'order_id': order.id,
                'payload': order.to_dict(),
                'published': False
            })

        return order

    # Later (separate process): Publish events
    # Even if app crashes, events are in DB and will be retried
    async def publish_pending_events(self):
        events = await self.outbox.find_unpublished()
        for event in events:
            try:
                await self.message_broker.publish(event)
                await self.outbox.mark_published(event.id)
            except Exception as e:
                logger.error(f"Failed to publish event {event.id}: {e}")
                # Will retry on next run
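
For a version that runs as-is, the same pattern can be sketched with the standard-library `sqlite3` module and an in-memory database. The table layout and the `create_order` and `publish_pending` functions are assumptions for illustration, and the `publish` callback stands in for a real broker client.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE outbox ("
    "id INTEGER PRIMARY KEY, event_type TEXT, payload TEXT, published INTEGER DEFAULT 0)"
)


def create_order(item: str) -> int:
    # One transaction: the order row and its event commit or roll back together.
    with conn:
        cur = conn.execute("INSERT INTO orders (item) VALUES (?)", (item,))
        order_id = cur.lastrowid
        conn.execute(
            "INSERT INTO outbox (event_type, payload) VALUES (?, ?)",
            ("OrderCreated", json.dumps({"order_id": order_id, "item": item})),
        )
    return order_id


def publish_pending(publish) -> int:
    """Relay step: publish unpublished rows, marking each one only on success."""
    rows = conn.execute(
        "SELECT id, event_type, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    sent = 0
    for row_id, event_type, payload in rows:
        try:
            publish(event_type, json.loads(payload))
        except Exception:
            break  # leave the row unpublished; the next run retries it
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
        sent += 1
    return sent


delivered: list[tuple[str, dict]] = []
create_order("book")
first_run = publish_pending(lambda kind, body: delivered.append((kind, body)))
second_run = publish_pending(lambda kind, body: delivered.append((kind, body)))
```

Because a crash between publishing and marking a row would cause that event to be published again on the next run, this design gives at-least-once delivery, so consumers should tolerate duplicates.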

Self-Check

Which communication style for each?

  1. User clicks "Send Email" - needs immediate confirmation? Async (return immediately, send email in background)
  2. Processing a batch of images overnight? Async (background job, no immediate response)
  3. Checking account balance? Sync (user needs immediate response)
  4. Recording analytics events? Async (not critical, eventual consistency OK)
  5. Processing a refund? Sync for validation, async for notification
  6. Loading product catalog? Sync with caching
  7. Updating inventory after purchase? Async with retries
  8. Validating user input on form? Sync (immediate feedback)
  9. Sending SMS notification? Async (can fail gracefully)
  10. Checking if username available? Sync (user needs answer)

One Takeaway

Synchronous is simple but scales poorly. Asynchronous is complex but scales well. Use synchronous for critical decisions, asynchronous for everything else.

Next Steps

  1. Messaging Details: Read Messaging
  2. API Gateway: Learn API Gateway
  3. Resilience: Explore Timeouts and Retries

Advanced Synchronous Patterns

Request-Response with Timeouts

Always set timeouts on synchronous calls. Without one, a hung dependency ties up the caller's threads and connections indefinitely, and the failure cascades to everything upstream:

import logging
import time

import requests
from requests.exceptions import Timeout

logger = logging.getLogger(__name__)


def call_with_timeout(url, timeout_seconds=5):
    try:
        response = requests.get(url, timeout=timeout_seconds)
        return response.json()
    except Timeout:
        # Don't wait forever. Log and re-raise so that callers (such as the
        # circuit breaker below) can count the timeout as a failure.
        logger.error(f"Request to {url} timed out after {timeout_seconds}s")
        raise


class CircuitBreakerOpen(Exception):
    """Raised when the breaker rejects a call without attempting it."""


# Usage with circuit breaker
class CircuitBreaker:
    def __init__(self, failure_threshold=5, reset_timeout=60):
        self.failures = 0
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout  # seconds to stay open before a trial call
        self.last_failure_time = None
        self.state = 'closed'  # closed, open, half-open

    def call(self, func, *args, **kwargs):
        if self.state == 'open':
            if time.time() - self.last_failure_time > self.reset_timeout:
                # Cool-down elapsed: allow one trial call
                self.state = 'half-open'
            else:
                # Recently failed, don't retry yet
                raise CircuitBreakerOpen()

        try:
            result = func(*args, **kwargs)
            self.failures = 0
            self.state = 'closed'
            return result
        except Exception:
            self.failures += 1
            self.last_failure_time = time.time()
            if self.failures >= self.failure_threshold:
                self.state = 'open'
            raise


# Usage
breaker = CircuitBreaker(failure_threshold=3)

def call_inventory_service():
    return breaker.call(lambda: call_with_timeout(
        'https://inventory.service/check',
        timeout_seconds=2
    ))

Synchronous Request Chaining

Be careful with long chains of synchronous calls:

Request Chain Pattern (ANTI-PATTERN):
Client → API Gateway (100ms)
→ User Service (100ms)
→ Auth Service (100ms)
→ Product Service (100ms)
→ Inventory Service (100ms)
→ Price Service (100ms)

Total: 600ms (each call must wait for previous)

Problem: Any single slow service slows the entire chain, because the latencies add up hop by hop.
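
The arithmetic behind the chain can be worked through directly. The latencies come from the diagram above; the per-hop availability of 99.9% is an assumed figure for illustration, and the calculation further assumes that hops fail independently.

```python
# Six sequential hops at 100 ms each, as in the chain above.
hop_latencies_ms = [100, 100, 100, 100, 100, 100]

# Assumed for illustration (not from the text): each hop is up 99.9% of the time.
per_hop_availability = 0.999

# Sequential calls: latencies add.
total_latency_ms = sum(hop_latencies_ms)

# Every hop must be up for the request to succeed: availabilities multiply.
chain_availability = per_hop_availability ** len(hop_latencies_ms)

print(f"total latency: {total_latency_ms} ms")
print(f"chain availability: {chain_availability:.4f}")
```

Under these assumptions the total latency is the sum of the hops, while the chain's availability drops to roughly 99.4%, lower than that of any single service in it.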

Solution: Parallelize where possible:

import asyncio

async def get_order_details(order_id):
    # Parallel requests instead of sequential.
    # The three helpers are assumed to be coroutine functions defined elsewhere.
    user_task = get_user_details()
    product_task = get_product_details()
    inventory_task = get_inventory_status()

    # Wait for all to complete
    user, products, inventory = await asyncio.gather(
        user_task, product_task, inventory_task
    )

    return {
        'user': user,
        'products': products,
        'inventory': inventory
    }

# Total time: max(100ms, 100ms, 100ms) = 100ms (not 300ms)
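
A self-contained way to observe the difference is to time both variants against simulated calls. In this sketch `fetch` is a hypothetical stand-in that uses `asyncio.sleep` in place of a network request, and the 0.1 s delay matches the 100 ms figure used above.

```python
import asyncio
import time


async def fetch(name: str, delay_s: float) -> str:
    # Hypothetical stand-in for a network call; asyncio.sleep yields to the loop.
    await asyncio.sleep(delay_s)
    return name


async def sequential() -> float:
    start = time.perf_counter()
    for name in ("user", "products", "inventory"):
        await fetch(name, 0.1)  # each call waits for the previous one
    return time.perf_counter() - start


async def parallel() -> float:
    start = time.perf_counter()
    # All three calls are in flight at the same time.
    await asyncio.gather(*(fetch(n, 0.1) for n in ("user", "products", "inventory")))
    return time.perf_counter() - start


sequential_s = asyncio.run(sequential())
parallel_s = asyncio.run(parallel())
print(f"sequential: {sequential_s:.2f}s, parallel: {parallel_s:.2f}s")
```

The sequential run takes about the sum of the delays, and the parallel run about the longest single delay. This only helps when the calls are independent; a call that needs another call's result still has to wait for it.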

Advanced Asynchronous Patterns

Event-Driven Architecture

Instead of direct calls, publish events that other services subscribe to:

class OrderService:
    def create_order(self, order_data):
        order = Order.create(order_data)
        self.order_repo.save(order)

        # Publish event instead of calling other services
        self.event_bus.publish('order.created', {
            'order_id': order.id,
            'user_id': order.user_id,
            'items': order.items,
            'total': order.total
        })

        return order

# Other services subscribe independently
class PaymentService:
    def on_order_created(self, event):
        order_id = event['order_id']
        # Charge payment asynchronously
        self.process_payment(order_id)

class NotificationService:
    def on_order_created(self, event):
        user_id = event['user_id']
        # Send confirmation email
        self.send_confirmation_email(user_id)

class InventoryService:
    def on_order_created(self, event):
        items = event['items']
        # Allocate inventory
        self.allocate_items(items)

Benefits: Services are loosely coupled. New subscribers can be added without changing OrderService. If one subscriber fails, the others aren't affected.
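
The subscriber-isolation property can be illustrated with a tiny in-memory bus. `InMemoryEventBus` is a hypothetical class written for this sketch; unlike a real broker it dispatches synchronously inside `publish`, so it shows only the decoupling and failure isolation, not asynchronous delivery.

```python
from collections import defaultdict
from typing import Callable

Handler = Callable[[dict], None]


class InMemoryEventBus:
    def __init__(self) -> None:
        self._handlers: dict[str, list[Handler]] = defaultdict(list)
        self.errors: list[tuple[str, Exception]] = []

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._handlers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> None:
        for handler in self._handlers[topic]:
            try:
                handler(event)
            except Exception as exc:
                # Isolate subscribers: one failure must not block the others.
                self.errors.append((topic, exc))


bus = InMemoryEventBus()
seen: list[str] = []


def broken_subscriber(event: dict) -> None:
    raise RuntimeError("notification provider is down")


bus.subscribe("order.created", lambda e: seen.append(f"payment:{e['order_id']}"))
bus.subscribe("order.created", broken_subscriber)
bus.subscribe("order.created", lambda e: seen.append(f"inventory:{e['order_id']}"))
bus.publish("order.created", {"order_id": 7})
```

The publisher never references a subscriber by name, and the failing subscriber is recorded in `bus.errors` while the other two still receive the event.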

Request-Reply Pattern with Correlation ID

For async request-reply, use correlation IDs to match responses:

import asyncio
import uuid

class AsyncRequestReply:
    def __init__(self, message_broker):
        self.broker = message_broker
        self.pending_requests = {}

    async def send_request(self, service_name, request_data, timeout=10):
        # Generate unique ID
        correlation_id = str(uuid.uuid4())

        # Register the future before publishing so a fast reply is not missed
        future = asyncio.get_running_loop().create_future()
        self.pending_requests[correlation_id] = future

        try:
            # Send request
            self.broker.publish(f'{service_name}.requests', {
                'correlation_id': correlation_id,
                'payload': request_data
            })

            # Wait for the response; raises asyncio.TimeoutError
            # if none arrives within `timeout` seconds
            return await asyncio.wait_for(future, timeout=timeout)
        finally:
            self.pending_requests.pop(correlation_id, None)

    def handle_response(self, message):
        correlation_id = message['correlation_id']
        future = self.pending_requests.get(correlation_id)
        # Ignore replies for unknown or already-completed requests
        if future is not None and not future.done():
            future.set_result(message['payload'])
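
To see correlation IDs matching replies to callers end to end, here is a self-contained sketch that replaces the broker with an `asyncio.Queue`. Everything in it (`demo`, `responder`, the `sku` payload) is hypothetical and exists only to make the round trip runnable.

```python
import asyncio
import uuid


async def demo() -> dict:
    requests: asyncio.Queue = asyncio.Queue()  # stands in for the request topic
    pending: dict[str, asyncio.Future] = {}    # correlation_id -> waiting caller

    async def responder() -> None:
        # Hypothetical remote service: echoes the correlation ID in each reply.
        while True:
            msg = await requests.get()
            reply = {
                "correlation_id": msg["correlation_id"],
                "payload": {"in_stock": msg["payload"]["sku"] == "sku-1"},
            }
            future = pending.get(reply["correlation_id"])
            if future is not None and not future.done():
                future.set_result(reply["payload"])

    async def send_request(payload: dict, timeout: float = 1.0) -> dict:
        correlation_id = str(uuid.uuid4())
        future = asyncio.get_running_loop().create_future()
        pending[correlation_id] = future       # register before publishing
        await requests.put({"correlation_id": correlation_id, "payload": payload})
        try:
            return await asyncio.wait_for(future, timeout=timeout)
        finally:
            pending.pop(correlation_id, None)

    task = asyncio.create_task(responder())
    try:
        # Two concurrent requests; each caller receives its own reply.
        first, second = await asyncio.gather(
            send_request({"sku": "sku-1"}), send_request({"sku": "sku-2"})
        )
    finally:
        task.cancel()
    return {"first": first, "second": second, "pending_left": len(pending)}


result = asyncio.run(demo())
```

Both requests share one queue and one responder, yet each caller gets the answer to its own question because the reply carries the ID the caller is waiting on.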

Dead Letter Queues

Messages that fail repeatedly go to a dead letter queue for inspection:

import logging

logger = logging.getLogger(__name__)


class RobustMessageProcessor:
    def __init__(self, queue, max_retries=3):
        self.queue = queue
        self.max_retries = max_retries
        self.dead_letter_queue = queue.dead_letter_queue

    async def process_messages(self):
        while True:
            msg = await self.queue.receive()

            retry_count = msg.get('retry_count', 0)

            try:
                await self.handle_message(msg)
                await self.queue.acknowledge(msg)
            except Exception as e:
                if retry_count < self.max_retries:
                    # Retry: re-queue a copy with an incremented counter
                    msg['retry_count'] = retry_count + 1
                    await self.queue.send(msg)
                    logger.warning(f"Retrying message {msg.get('id')}: {e}")
                else:
                    # Max retries exceeded: send to dead letter queue
                    await self.dead_letter_queue.send({
                        'original_message': msg,
                        'error': str(e),
                        'retry_count': retry_count
                    })
                    logger.error(f"Message {msg.get('id')} moved to DLQ: {e}")
                # Acknowledge the original in both cases so the broker does
                # not redeliver it alongside the re-queued copy
                await self.queue.acknowledge(msg)
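
The retry-then-dead-letter flow can be traced with plain in-memory structures. This sketch uses a `collections.deque` as the main queue and a list as the dead letter queue; `drain`, `handler`, and the "poison" message are assumptions made for the example.

```python
from collections import deque

MAX_RETRIES = 3  # mirrors the max_retries default used above


def drain(main_queue: deque, dead_letters: list, handler) -> None:
    """Process until the main queue is empty, retrying failures up to MAX_RETRIES."""
    while main_queue:
        msg = main_queue.popleft()
        try:
            handler(msg)
        except Exception as exc:
            retries = msg.get("retry_count", 0)
            if retries < MAX_RETRIES:
                # Re-queue a copy at the back so other messages are not blocked.
                main_queue.append({**msg, "retry_count": retries + 1})
            else:
                dead_letters.append({"original_message": msg, "error": str(exc)})


handled: list[str] = []
dead_letters: list[dict] = []


def handler(msg: dict) -> None:
    if msg["body"] == "poison":
        raise ValueError("cannot parse message")  # simulated permanent failure
    handled.append(msg["body"])


main_queue = deque([{"body": "ok-1"}, {"body": "poison"}, {"body": "ok-2"}])
drain(main_queue, dead_letters, handler)
```

The poison message is attempted four times in total (one initial attempt plus three retries) before it lands in the dead letter list, and the healthy messages around it are processed normally.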

Choosing Sync vs Async: Decision Tree

Use this decision tree to determine the best approach:

Do you need immediate feedback?
├─ YES: "Is it user-facing (direct request)?"
│   ├─ YES: Synchronous (API call, REST request)
│   │   └─ Examples: Login, fetch product, check balance
│   └─ NO: "Can it fail gracefully?"
│       ├─ YES: Asynchronous with user notification
│       │   └─ Examples: Generating report, video encoding
│       └─ NO: Synchronous (critical operation)
│           └─ Examples: Process payment, create order
└─ NO: Asynchronous (background job)
    ├─ Can happen later: Message queue
    │   └─ Examples: Send email, update analytics, cleanup
    └─ Time-critical but not user-facing: Event stream
        └─ Examples: Real-time notifications, audit logging
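
The tree can also be written as a small function, which makes the branching explicit. The function name, parameter names, and return strings are choices made for this sketch; the logic follows the tree above branch for branch.

```python
def choose_style(
    needs_immediate_feedback: bool,
    user_facing: bool = False,
    can_fail_gracefully: bool = False,
    time_critical: bool = False,
) -> str:
    """Return the recommendation from the decision tree above."""
    if needs_immediate_feedback:
        if user_facing:
            return "synchronous"
        if can_fail_gracefully:
            return "asynchronous with user notification"
        return "synchronous (critical operation)"
    # No immediate feedback needed: background work.
    return "event stream" if time_critical else "message queue"


print(choose_style(True, user_facing=True))   # e.g. login
print(choose_style(False))                    # e.g. sending an email
```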

Real-World Trade-Offs Example

E-commerce checkout:

class CheckoutService:
    def checkout(self, order):
        # SYNCHRONOUS: Critical for user experience
        # Must complete or fail immediately

        try:
            # 1. Validate inventory (sync): must know if in stock
            if not self.inventory.has_stock(order.items):
                raise OutOfStock()

            # 2. Process payment (sync): must know if payment succeeded
            payment = self.payment.charge(order.user_id, order.total)
            if not payment.success:
                raise PaymentFailed()

            # 3. Create order record (sync): must persist before responding
            saved_order = self.order_repo.save(order)

            # ASYNCHRONOUS: Can happen in background
            # User doesn't wait for these

            # 4. Send confirmation email (async): user can wait for the email
            self.queue.send('email.order_confirmation', {'order_id': saved_order.id})

            # 5. Update analytics (async): not critical
            self.queue.send('analytics.checkout_completed', {'order_id': saved_order.id})

            # 6. Notify warehouse (async): has time window
            self.queue.send('warehouse.new_order', {'order_id': saved_order.id})

            # User gets response immediately
            return {'status': 'success', 'order_id': saved_order.id}

        except (OutOfStock, PaymentFailed):
            # Propagate so the user sees the error immediately
            raise

Critical Path (Synchronous): about 300ms total

  • Inventory check: 50ms
  • Payment processing: 200ms
  • Order creation: 50ms

Background Tasks (Asynchronous): Happen later

  • Email sent in 1-5 seconds
  • Analytics updated in 10 seconds
  • Warehouse notified in 30 seconds

The user sees the order confirmation in about 300ms, even though the background work takes up to 30 seconds to finish.
