
Webhooks and Callbacks

Master event-driven integrations where servers push updates to clients

TL;DR

Webhooks are server-initiated HTTP callbacks: server POSTs to your endpoint when something happens (instead of you polling). Sign webhooks with HMAC-SHA256 to verify authenticity. Handle duplicate deliveries with idempotent endpoints (track event IDs). Implement exponential backoff retries: give webhooks max 24 hours to deliver. Use at-least-once semantics (duplicates expected, your code must handle). Webhooks enable real-time integrations; use message queues for guaranteed ordering and delivery.

Learning Objectives

  • Understand webhook architecture and push-vs-pull patterns
  • Implement webhook signing and signature verification
  • Handle duplicate deliveries with idempotency and deduplication
  • Design reliable retry strategies with exponential backoff
  • Build webhook infrastructure (dispatcher, retry queue, dead-letter handling)
  • Recognize webhook limitations and alternative patterns
  • Debug webhook delivery failures
  • Design webhook event schemas

Motivation: Webhooks vs Polling

Imagine you integrate with a payment processor. Two approaches:

Polling (Old Way):

Your App:
while True:
    sleep(1)  # Every second
    status = payment_processor.get_status(order_id)
    if status.paid:
        fulfill_order()
        break  # Stop polling once the order is fulfilled

Problems: Wasteful (nearly every call returns no new data), slow (up to 1 second of added latency), expensive (1,000 pending orders * 1 call/sec = 86,400,000 API calls/day).

Webhooks (Better Way):

Payment Processor → Your App:
POST /webhook/payment with { orderId, status, amount }
Your App immediately processes, no polling needed

Benefits: Instant notification, no wasted API calls, real-time.

But webhooks introduce challenges of their own: network failures, duplicate deliveries, spoofed requests, and the need for a fast, reliable receiving endpoint.

Webhook vs Polling: Architecture Comparison

Core Concepts

Webhook: HTTP POST request from server to client endpoint when something happens. Asynchronous, unidirectional.

Event ID: Unique identifier for webhook. Enables deduplication if webhook delivered twice.

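Signature: HMAC-SHA256 of the payload, computed with a secret shared between sender and receiver. Proves the webhook came from the trusted server (not an attacker) and was not modified in transit.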

Idempotency: Webhook endpoint safe to call twice; second call doesn't duplicate side-effects.

At-Least-Once Delivery: Webhook delivered at least once, possibly multiple times. Client must handle duplicates.

Retry Strategy: If webhook delivery fails, retry with exponential backoff (1s, 2s, 4s, 8s...) up to 24 hours.
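
The concepts above meet in the event payload itself. The following is a minimal sketch of an event envelope, not any particular provider's schema: the class name WebhookEvent, the field names (id, type, created, data), and the evt_ prefix are all illustrative assumptions.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class WebhookEvent:
    """Hypothetical event envelope; field names are illustrative only."""
    type: str   # e.g. 'payment.confirmed'
    data: dict  # event-specific body
    # Unique event ID, used by receivers for deduplication
    id: str = field(default_factory=lambda: f"evt_{uuid.uuid4().hex}")
    # Creation time in UTC, ISO 8601
    created: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_json(self) -> str:
        # Compact, key-sorted JSON so the sender signs a stable byte string
        return json.dumps(asdict(self), separators=(",", ":"), sort_keys=True)


event = WebhookEvent(type="payment.confirmed", data={"order_id": 123})
body = event.to_json()
```

The sender would sign body and POST it; the receiver verifies the signature, then uses id for deduplication.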

Practical Examples

# Receiving webhooks from Stripe payment processor

import hmac
import json
import hashlib
from datetime import datetime, timezone
from flask import Flask, request
from database import db  # Application-specific database helper

app = Flask(__name__)
STRIPE_WEBHOOK_SECRET = 'whsec_...' # From Stripe dashboard

@app.post('/webhook/stripe')
def handle_stripe_webhook():
    """
    Handle payment events from Stripe.
    Stripe sends:
    charge.succeeded, charge.failed, payment_intent.succeeded, etc.
    """

    # Get raw body (must verify signature before parsing JSON)
    payload = request.get_data(as_text=True)
    signature = request.headers.get('Stripe-Signature', '')

    # Verify signature
    try:
        event = verify_stripe_signature(payload, signature)
    except ValueError:
        return {'error': 'Invalid signature'}, 401

    # Extract event details
    event_id = event['id']  # Unique per webhook
    event_type = event['type']
    event_data = event['data']['object']

    # Deduplication: check if already processed
    existing = db.query_one(
        'SELECT * FROM webhooks WHERE event_id = ?',
        event_id
    )
    if existing:
        return {'ok': True, 'duplicate': True}

    # Process based on event type
    try:
        if event_type == 'charge.succeeded':
            handle_charge_succeeded(event_data)
        elif event_type == 'charge.failed':
            handle_charge_failed(event_data)
        elif event_type == 'payment_intent.payment_failed':
            handle_payment_failed(event_data)
        else:
            # Unknown event type, still acknowledge receipt
            pass

        # Mark as processed (deduplication)
        db.execute(
            'INSERT INTO webhooks (event_id, event_type, received_at) VALUES (?, ?, ?)',
            event_id, event_type, datetime.now(timezone.utc)
        )
        return {'ok': True}

    except Exception as e:
        # Don't mark as processed; Stripe will retry
        app.logger.error(f'Webhook processing failed: {e}', exc_info=True)
        return {'error': 'Processing failed'}, 500

def verify_stripe_signature(payload, signature_header):
    """
    Verify Stripe signature format: t=timestamp,v1=signature
    Returns the parsed event; raises ValueError if verification fails.
    """
    try:
        # Split each 'key=value' element on the first '=' only
        parts = dict(part.strip().split('=', 1) for part in signature_header.split(','))
        timestamp = int(parts['t'])
        signature = parts['v1']
    except (KeyError, ValueError):
        raise ValueError('Invalid signature format')

    # Stripe uses: hmac_sha256(timestamp.payload, secret)
    signed_content = f'{timestamp}.{payload}'
    expected = hmac.new(
        STRIPE_WEBHOOK_SECRET.encode(),
        signed_content.encode(),
        hashlib.sha256
    ).hexdigest()

    # Constant-time comparison avoids leaking information through timing
    if not hmac.compare_digest(expected.encode(), signature.encode()):
        raise ValueError('Signature mismatch')

    return json.loads(payload)

def handle_charge_succeeded(charge):
    """Process successful payment"""
    order_id = charge['metadata']['order_id']
    amount = charge['amount'] / 100  # Stripe amounts are in the smallest currency unit (e.g., cents)

    db.execute(
        'UPDATE orders SET status = ? WHERE id = ?',
        'PAYMENT_CONFIRMED', order_id
    )

    # Trigger fulfillment (application-specific helper, defined elsewhere)
    send_to_fulfillment_queue(order_id)

def handle_charge_failed(charge):
    """Process failed payment"""
    order_id = charge['metadata']['order_id']
    reason = charge['failure_message']

    db.execute(
        'UPDATE orders SET status = ?, failure_reason = ? WHERE id = ?',
        'PAYMENT_FAILED', reason, order_id
    )

    # Notify customer (application-specific helper, defined elsewhere)
    send_payment_failure_email(order_id)
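
Because the t= timestamp is part of the signed content, a receiver can also use it to reject deliveries that were captured and replayed later. A minimal sketch of such a freshness check follows; the function name is_timestamp_fresh and the five-minute window are assumptions for illustration, not values taken from the handler above.

```python
import time
from typing import Optional

TOLERANCE_SECONDS = 300  # Assumed window: five minutes


def is_timestamp_fresh(
    timestamp: int,
    now: Optional[float] = None,
    tolerance: int = TOLERANCE_SECONDS,
) -> bool:
    """Return True if the signed timestamp is within `tolerance` seconds of now."""
    current = time.time() if now is None else now
    # abs() also rejects timestamps too far in the future (clock skew or forgery)
    return abs(current - timestamp) <= tolerance
```

This would be called after the HMAC comparison succeeds, with the timestamp parsed from the signature header; a False result would be treated the same as a signature mismatch.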

Webhook Signing and Verification

CRITICAL: Always verify webhook signatures to prevent spoofing:

# Sender (Service) creates signature

import hmac
import hashlib
import json
import requests

def create_signed_webhook(payload: dict, secret: str) -> tuple:
    """Create webhook with HMAC signature"""

    # Convert to JSON (compact, sorted keys, so the signed bytes are stable)
    payload_json = json.dumps(payload, separators=(',', ':'), sort_keys=True)

    # Create signature: HMAC-SHA256(payload, secret)
    signature = hmac.new(
        secret.encode(),
        payload_json.encode(),
        hashlib.sha256
    ).hexdigest()

    return payload_json, signature

# Send webhook
payload = {'orderId': 123, 'status': 'paid', 'amount': 99.99}
secret = 'your_webhook_secret_key'
payload_json, signature = create_signed_webhook(payload, secret)

# Send exactly the bytes that were signed (data=, not json=)
requests.post(
    'https://client.example.com/webhook',
    data=payload_json,
    headers={
        'Content-Type': 'application/json',
        'X-Webhook-Signature': f'sha256={signature}'
    },
    timeout=30
)

# Receiver (Client) verifies signature

from flask import Flask, request
import hmac
import hashlib
import json

app = Flask(__name__)
SECRET = 'your_webhook_secret_key'

@app.post('/webhook')
def handle_webhook():
    """Verify webhook before processing"""

    # Get signature from header
    signature_header = request.headers.get('X-Webhook-Signature', '')

    # Get raw body (MUST use raw bytes, not parsed JSON)
    payload_raw = request.get_data()

    # Calculate expected signature over the exact bytes received
    expected_signature = hmac.new(
        SECRET.encode(),
        payload_raw,
        hashlib.sha256
    ).hexdigest()

    # Compare signatures (timing-safe comparison!)
    # Compare as bytes so a non-ASCII header value cannot raise TypeError
    if not hmac.compare_digest(
        f'sha256={expected_signature}'.encode(),
        signature_header.encode()
    ):
        return {'error': 'Invalid signature'}, 401

    # Signature verified! Safe to process
    payload = json.loads(payload_raw)
    process_webhook(payload)
    return {'ok': True}

def process_webhook(payload):
    """Handle verified webhook"""
    print(f'Processing order {payload["orderId"]}')
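
Why the raw body matters: the signature covers exact bytes, so parsing the JSON and serializing it again can change whitespace or key order and invalidate an otherwise genuine signature. The sketch below illustrates this with a made-up body and secret; the sign helper is a hypothetical name introduced here.

```python
import hashlib
import hmac
import json


def sign(body: bytes, secret: bytes) -> str:
    """HMAC-SHA256 of the body, hex-encoded."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


secret = b"your_webhook_secret_key"

# Bytes exactly as the sender transmitted them (note the spaces after ':' and ',')
sent_body = b'{"orderId": 123, "status": "paid"}'
sent_signature = sign(sent_body, secret)

# Verifying against the raw bytes succeeds
raw_ok = hmac.compare_digest(sign(sent_body, secret), sent_signature)

# Parsing and re-serializing produces different bytes (no spaces here),
# so the recomputed signature no longer matches
reserialized = json.dumps(json.loads(sent_body), separators=(",", ":")).encode()
reserialized_ok = hmac.compare_digest(sign(reserialized, secret), sent_signature)
```

Here raw_ok is True and reserialized_ok is False, even though both byte strings decode to the same JSON object.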

Idempotency and Duplicate Handling

Webhooks use at-least-once delivery semantics. Duplicates WILL happen:

# Idempotent endpoint: safe to call twice
# (signature verification omitted here for brevity; verify before trusting the body)

from datetime import datetime, timezone
from flask import Flask, request
from database import db  # Application-specific database helper

app = Flask(__name__)

@app.post('/webhook/payment')
def handle_payment_webhook():
    """Process payment webhook idempotently"""

    event = request.json
    event_id = event['id']  # Unique per webhook (e.g., 'evt_123abc')

    # Step 1: Check if already processed (deduplication)
    existing = db.query_one(
        'SELECT id FROM webhook_events WHERE event_id = ?',
        event_id
    )

    if existing:
        # Already processed, return same response (idempotent)
        return {
            'ok': True,
            'status': 'already_processed',
            'event_id': event_id
        }, 200

    # Step 2: Process the event (atomic transaction)
    try:
        # Use database transaction
        with db.transaction():
            # Record event as processed FIRST (prevents duplicate processing)
            db.execute(
                '''
                INSERT INTO webhook_events
                (event_id, event_type, received_at)
                VALUES (?, ?, ?)
                ''',
                event_id,
                event['type'],
                datetime.now(timezone.utc)
            )

            # Then process the actual event
            if event['type'] == 'payment.confirmed':
                payment_data = event['data']
                db.execute(
                    '''
                    UPDATE orders
                    SET status = ?, payment_id = ?
                    WHERE id = ?
                    ''',
                    'CONFIRMED',
                    payment_data['payment_id'],
                    payment_data['order_id']
                )

            # The insert and the update commit together, or not at all

        return {'ok': True, 'event_id': event_id}, 200

    except Exception as e:
        # If error occurs, the transaction rolls back and the event is not recorded
        # Webhook will retry
        app.logger.error(f'Webhook failed: {e}', exc_info=True)
        return {'error': 'Processing failed'}, 500
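
A check-then-insert sequence can still race when two copies of the same event arrive at nearly the same time. One common way to close that gap is to let a uniqueness constraint on the event ID decide which delivery wins. The following is a self-contained sketch using the standard-library sqlite3 module and an in-memory database; the table layout and the process_once helper are assumptions for illustration, not the schema used above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE webhook_events (event_id TEXT PRIMARY KEY, event_type TEXT NOT NULL)"
)
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT NOT NULL)")
conn.execute("INSERT INTO orders (id, status) VALUES (1, 'PENDING')")
conn.commit()


def process_once(conn: sqlite3.Connection, event: dict) -> bool:
    """Apply the event exactly once. Returns False for a duplicate delivery."""
    try:
        with conn:  # Commits on success, rolls back if an exception escapes
            # The PRIMARY KEY makes this insert fail for a repeated event_id
            conn.execute(
                "INSERT INTO webhook_events (event_id, event_type) VALUES (?, ?)",
                (event["id"], event["type"]),
            )
            conn.execute(
                "UPDATE orders SET status = 'CONFIRMED' WHERE id = ?",
                (event["data"]["order_id"],),
            )
        return True
    except sqlite3.IntegrityError:
        # Duplicate event ID: the side effect was already applied
        return False


event = {"id": "evt_123abc", "type": "payment.confirmed", "data": {"order_id": 1}}
first = process_once(conn, event)
second = process_once(conn, event)
```

A handler built this way would return 200 for both outcomes, since a duplicate is not an error from the sender's point of view.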

Retry Strategy with Exponential Backoff

Design webhooks for eventual consistency with retries:

import time
import random
import requests

def retry_webhook_with_backoff(webhook_url, payload, max_attempts=10):
    """Retry webhook delivery with exponential backoff"""

    for attempt in range(1, max_attempts + 1):
        try:
            response = requests.post(
                webhook_url,
                json=payload,
                timeout=30
            )

            if 200 <= response.status_code < 300:
                return True  # Success (any 2xx)

            # Server returned error (4xx, 5xx)
            raise RuntimeError(f'HTTP {response.status_code}')

        except (requests.RequestException, RuntimeError) as e:
            # Check if max attempts reached
            if attempt == max_attempts:
                # All retries exhausted
                # send_alert: application-specific alerting hook, defined elsewhere
                send_alert(f'Webhook failed after {max_attempts} attempts: {e}')
                return False

            # Calculate backoff: exponential + jitter
            base_delay = 2 ** (attempt - 1)  # 1, 2, 4, 8, 16, 32...
            jitter = random.uniform(0, base_delay * 0.1)  # Up to +10% random
            delay = base_delay + jitter

            print(f'Attempt {attempt} failed, retrying in {delay:.1f}s: {e}')
            time.sleep(delay)

    return False

# Retry schedule (example with 10 attempts, ~8.5 minutes total):
# Attempt 1: Immediate
# Attempt 2: 1s + jitter
# Attempt 3: 2s + jitter
# Attempt 4: 4s + jitter
# Attempt 5: 8s + jitter
# Attempt 6: 16s + jitter
# Attempt 7: 32s + jitter (~1 minute total)
# Attempt 8: 64s + jitter (~2 minutes total)
# Attempt 9: 128s + jitter (~4 minutes total)
# Attempt 10: 256s + jitter (~8.5 minutes total)
# Total time: ~8.5 minutes of waiting (511s plus jitter and request time) to exhaust retries
# For longer retries (24h): use longer base delays or more attempts
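
To see what stretching the schedule to 24 hours might look like, the sketch below computes delays that double up to a cap and stop once the total wait would exceed a budget. The function name backoff_schedule, the one-hour cap, and the omission of jitter are assumptions made to keep the arithmetic easy to follow.

```python
from typing import List


def backoff_schedule(
    base: float = 1.0,
    factor: float = 2.0,
    cap: float = 3600.0,              # Assumed ceiling: never wait more than 1 hour
    max_total: float = 24 * 3600.0,   # Delivery budget: 24 hours
) -> List[float]:
    """Return the delays (in seconds) between attempts, without jitter."""
    delays: List[float] = []
    elapsed = 0.0
    delay = base
    # Keep adding delays while the next one still fits in the budget
    while elapsed + delay <= max_total:
        delays.append(delay)
        elapsed += delay
        delay = min(delay * factor, cap)
    return delays


schedule = backoff_schedule()
total_hours = sum(schedule) / 3600
```

With these defaults the delays double from 1s to 2048s and then stay at the one-hour cap, so the number of attempts is set by the budget rather than by a fixed count. In a real dispatcher each delay would be scheduled through a persistent retry queue rather than time.sleep, so pending retries survive a restart.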

Common Webhook Patterns and Pitfalls

Webhook vs Alternatives

Pattern | Pros | Cons | Use Case
--- | --- | --- | ---
Webhooks | Real-time, no polling, simple | Unreliable network, duplicates, no ordering | Real-time notifications (payments, deploys)
Message Queue | Ordered (per queue or partition), durable, retries built-in | Complex setup, extra infrastructure to operate; delivery is usually at-least-once, so consumers still need idempotency | Mission-critical (financial txns, inventory)
Polling | Simple, reliable | Wasteful, latency, scales poorly | Non-critical, low-frequency changes
WebSocket | Bidirectional, real-time | Requires persistent connection, firewall issues | Chat, live updates, games
Server-Sent Events (SSE) | Simple streaming, one-way | Server-to-client only, requires a long-lived connection, firewall/proxy issues | Live updates, dashboard feeds

Webhook Delivery Checklist

  • Are webhooks signed with HMAC-SHA256?
  • Is signature verified using timing-safe comparison (hmac.compare_digest)?
  • Are endpoint handlers idempotent (safe to call twice)?
  • Is event_id used for deduplication?
  • Are failed webhooks retried with exponential backoff?
  • Is max TTL enforced (e.g., no retries older than 24h)?
  • Do endpoints respond < 5 seconds (heavy work deferred to queue)?
  • Are failed webhooks sent to dead-letter queue for review?
  • Is there a webhook delivery dashboard (status, logs, retry)?
  • Are webhook failures alerted to ops/monitoring?
  • Can old webhooks be replayed (for debugging)?
  • Is the webhook format documented (schema, examples)?
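
Two checklist items, responding quickly by deferring heavy work and routing exhausted failures to a dead-letter queue, can be sketched together. The example below uses an in-process queue.Queue and a plain list purely for illustration; the names receive, worker, dead_letters, and MAX_PROCESSING_ATTEMPTS are hypothetical, and a production system would more likely use a durable queue so events survive a crash.

```python
import queue
import threading

work_queue: "queue.Queue" = queue.Queue()
dead_letters: list = []          # Stand-in for a real dead-letter queue
MAX_PROCESSING_ATTEMPTS = 3      # Assumed limit for this sketch


def receive(event: dict) -> tuple:
    """Called by the HTTP handler after signature verification; returns at once."""
    work_queue.put((event, 1))
    return {"ok": True}, 200


def worker(handler) -> None:
    """Process queued events; park events that keep failing in dead_letters."""
    while True:
        item = work_queue.get()
        if item is None:          # Sentinel: shut the worker down
            work_queue.task_done()
            break
        event, attempt = item
        try:
            handler(event)
        except Exception as exc:
            if attempt < MAX_PROCESSING_ATTEMPTS:
                work_queue.put((event, attempt + 1))  # Try again later
            else:
                dead_letters.append({"event": event, "error": str(exc)})
        finally:
            work_queue.task_done()
```

The worker would run in a background thread, for example threading.Thread(target=worker, args=(my_handler,), daemon=True).start(), where my_handler is whatever function applies the event.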

Self-Check

  1. Why sign webhooks? Prove they came from trusted source, prevent spoofing.
  2. What's idempotency? Webhook endpoint safe to call twice; second call doesn't duplicate side-effects.
  3. Why retry with backoff? Avoid overwhelming receiver; transient failures recover naturally.
  4. What if webhook endpoint is down? Retry with exponential backoff up to 24h; eventually give up and send to DLQ.
  5. Can webhooks be out of order? Yes. Don't assume ordering; design handlers to be order-agnostic.
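
One way to make a handler tolerate out-of-order delivery is to remember the newest event already applied and ignore anything older. The sketch below assumes each event carries a created value that increases over time (a timestamp or sequence number set by the sender); that field, the OrderState class, and apply_event are illustrative names.

```python
from dataclasses import dataclass


@dataclass
class OrderState:
    status: str = "PENDING"
    last_event_created: int = 0  # 'created' value of the newest applied event


def apply_event(state: OrderState, event: dict) -> bool:
    """Apply the event only if it is newer than what was already applied."""
    if event["created"] <= state.last_event_created:
        return False  # Stale or duplicate event: ignore it
    state.status = event["status"]
    state.last_event_created = event["created"]
    return True


state = OrderState()
# The 'PAID' event (created=200) arrives before the older 'PROCESSING' event
applied_new = apply_event(state, {"created": 200, "status": "PAID"})
applied_old = apply_event(state, {"created": 100, "status": "PROCESSING"})
```

After both calls the order remains PAID, because the late-arriving older event is skipped rather than overwriting newer state.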

One Takeaway: Webhooks are excellent for real-time integrations but require careful handling: sign them, handle duplicates, retry failures, and monitor delivery. For mission-critical operations, use message queues instead.

Next Steps

  • Implement webhook delivery dashboard for monitoring and replay
  • Study message queues (Kafka, RabbitMQ) for guaranteed delivery
  • Learn event sourcing for building auditable systems
  • Explore CQRS (command query responsibility segregation) pattern
  • Build webhook templates for common integrations (Stripe, GitHub, etc.)

References