Real-Time Systems: Latency and Determinism

Hard vs soft real-time requirements, scheduling algorithms, and priority inversion prevention

TL;DR

Real-time systems guarantee predictable, bounded latency and deadline adherence, not necessarily fast responses. Hard real-time means missing a deadline is unacceptable (aircraft control). Soft real-time means missing a deadline degrades quality but is recoverable (video playback). Determinism requires a schedulable task set (Rate Monotonic, Earliest Deadline First) and protection against priority inversion; watchdog timers add recovery when a task hangs. A real-time operating system (RTOS) provides priority-based preemption and bounded interrupt latency.

Learning Objectives

  • Distinguish hard vs soft real-time requirements
  • Understand scheduling algorithms and their tradeoffs
  • Prevent and handle priority inversion
  • Design deterministic systems with bounded latency
  • Implement watchdog timers and fault recovery
  • Choose RTOS vs general-purpose OS

Motivating Scenario

You're building a braking system for an autonomous vehicle. A sensor detects an obstacle 50 meters ahead. The software must process the detection, compute the required deceleration, and send a command to the brakes within 50 milliseconds (a hard deadline). Missing that deadline, even by 1 ms, could cause a crash. The OS scheduler must guarantee that the brake logic thread runs within that window, regardless of other tasks (music playback, telemetry logging). A general-purpose OS (Linux, Windows) offers no such guarantee in its default configuration; it is optimized for throughput, not deadline predictability. An RTOS (FreeRTOS, QNX, VxWorks) provides determinism through priority-based preemptive scheduling.

Core Concepts

Real-time systems guarantee bounded latency through deterministic scheduling and execution:

Hard Real-Time: Missing deadline is catastrophic (surgery, aircraft, power plants).

Soft Real-Time: Missing deadline degrades quality but system continues (video streaming, game AI).

Firm Real-Time: A result delivered after its deadline is useless and is discarded, but an occasional miss is not catastrophic (a dropped video frame in a livestream).

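Scheduling: The policy that decides which ready task runs next. Real-time schedulers are typically preemptive and priority-driven, so an urgent task can take the CPU in time to meet its deadline.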

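Priority Inversion: A low-priority task holds a resource (lock) that a high-priority task needs, blocking it. If a medium-priority task then preempts the lock holder, the high-priority task can be delayed for an unbounded time.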

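Watchdog Timer: Hardware timer that counts down independently of the software. Healthy code periodically "feeds" it, which restarts the countdown. If the code hangs and stops feeding it, the timer expires and forces a recovery action (usually a system reset).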
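The feed-or-expire behavior can be illustrated with a small software stand-in. This is a minimal sketch, not a real watchdog: the class name `SoftwareWatchdog`, its `poll` method, and the injectable `clock` parameter are all assumptions made for illustration. A hardware watchdog counts down on its own and needs no polling.

```python
import time
from typing import Callable


class SoftwareWatchdog:
    """Software stand-in for a hardware watchdog (hypothetical helper)."""

    def __init__(self, timeout_s: float, on_expire: Callable[[], None],
                 clock: Callable[[], float] = time.monotonic):
        self.timeout_s = timeout_s
        self.on_expire = on_expire      # recovery action, e.g. trigger a reset
        self.clock = clock              # injectable so the logic can be tested
        self.expired = False
        self.deadline = clock() + timeout_s

    def feed(self) -> None:
        """Restart the countdown; called from the healthy main loop."""
        self.deadline = self.clock() + self.timeout_s

    def poll(self) -> bool:
        """Check the countdown; run on_expire once if it has run out."""
        if not self.expired and self.clock() >= self.deadline:
            self.expired = True
            self.on_expire()
        return self.expired


if __name__ == "__main__":
    wd = SoftwareWatchdog(0.05, lambda: print("watchdog expired: recovering"))
    for _ in range(3):
        time.sleep(0.01)    # healthy loop iteration
        wd.feed()
        wd.poll()
    time.sleep(0.1)         # simulated hang: no feed for longer than the timeout
    wd.poll()
```

In this sketch something still has to call `poll`, which is exactly the weakness a hardware watchdog avoids: if the whole process hangs, only an independent timer can notice.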

Real-time scheduling: Rate Monotonic vs Earliest Deadline First
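The two algorithms can be compared with their classic utilization tests. The sketch below assumes independent, preemptible periodic tasks on a single processor with deadlines equal to periods; the function names are illustrative. Rate Monotonic is guaranteed to work if total utilization stays at or below the Liu and Layland bound n(2^(1/n) - 1), while EDF is schedulable whenever utilization is at most 1.

```python
def utilization(tasks):
    """Total CPU utilization U = sum(C_i / T_i) for (wcet, period) pairs."""
    return sum(wcet / period for wcet, period in tasks)


def rm_bound(n):
    """Liu and Layland bound for n tasks under Rate Monotonic scheduling."""
    return n * (2 ** (1 / n) - 1)


def rm_guaranteed(tasks):
    """Sufficient test only: False means 'not guaranteed', not 'will fail'."""
    return utilization(tasks) <= rm_bound(len(tasks))


def edf_schedulable(tasks):
    """Exact test for EDF under the assumptions stated above."""
    return utilization(tasks) <= 1.0


if __name__ == "__main__":
    # (wcet_ms, period_ms) pairs; the first set mirrors the brake example.
    brake_set = [(1, 5), (2, 10), (5, 100)]
    tight_set = [(2, 4), (3, 6)]
    for name, tasks in (("brake", brake_set), ("tight", tight_set)):
        print(name, round(utilization(tasks), 3),
              "RM guaranteed:", rm_guaranteed(tasks),
              "EDF schedulable:", edf_schedulable(tasks))
```

The tradeoff shows up in the second task set: at 100% utilization EDF still fits, whereas the Rate Monotonic bound for two tasks is only about 83%, so no guarantee can be given.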

Key Concepts

Preemptive Scheduling: Higher-priority task immediately interrupts lower-priority task.
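Preemption is easiest to see in a tick-based simulation where the scheduler re-decides at every tick. The following is a toy model under stated assumptions: one CPU, integer ticks, deadlines equal to periods, zero context-switch cost, and late jobs are dropped. The function name `simulate_schedule` and the job dictionary layout are hypothetical.

```python
def simulate_schedule(tasks, policy, horizon):
    """Tick-based preemptive simulation on one CPU.

    tasks: list of (wcet, period) in ticks; each deadline equals its period.
    policy: "rm" (shortest period first) or "edf" (earliest deadline first).
    Returns a list of (task_index, deadline_tick) for every missed deadline.
    """
    if policy == "rm":
        key = lambda job: job["period"]
    elif policy == "edf":
        key = lambda job: job["deadline"]
    else:
        raise ValueError(f"unknown policy: {policy}")

    jobs, misses = [], []
    for t in range(horizon + 1):
        # Jobs still unfinished at their deadline are recorded and dropped.
        for job in [j for j in jobs if j["deadline"] <= t]:
            misses.append((job["task"], job["deadline"]))
            jobs.remove(job)
        if t == horizon:
            break
        # Release a new job of every task whose period starts now.
        for i, (wcet, period) in enumerate(tasks):
            if t % period == 0:
                jobs.append({"task": i, "period": period,
                             "deadline": t + period, "remaining": wcet})
        # Re-deciding every tick is what makes the scheduler preemptive:
        # a newly released urgent job takes the CPU from a running one.
        if jobs:
            job = min(jobs, key=key)
            job["remaining"] -= 1
            if job["remaining"] == 0:
                jobs.remove(job)
    return misses


if __name__ == "__main__":
    tight_set = [(2, 4), (3, 6)]          # 100% utilization
    print("RM misses: ", simulate_schedule(tight_set, "rm", 12))
    print("EDF misses:", simulate_schedule(tight_set, "edf", 12))
```

For the assumed task set, Rate Monotonic lets the second task miss its deadline at tick 6 because the shorter-period task preempts it, while EDF completes every job within the 12-tick hyperperiod.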

Context Switch: Saving/restoring thread state (registers, stack). Adds bounded overhead (microseconds).

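Priority Inversion Prevention: Priority inheritance and priority ceiling protocols do not eliminate blocking, but they bound it: a high-priority task waits at most for the duration of a lower-priority task's critical section, and medium-priority tasks cannot extend that wait.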
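The effect of priority inheritance can be shown with a deterministic toy simulation. Everything here is an assumption for illustration: the `Task` class, the one-character-per-tick "program" encoding, and the three-task scenario (L holds the lock, H needs it, M is CPU-bound). It models one CPU and one mutex, not a real RTOS kernel.

```python
from dataclasses import dataclass


@dataclass(eq=False)
class Task:
    name: str
    priority: int       # larger number = higher priority
    release: int        # tick at which the task becomes ready
    program: str        # one char per tick: 'c' needs the lock, 'n' does not
    pc: int = 0         # index of the next tick to execute
    finish: int = -1    # tick at which the task completed

    def done(self) -> bool:
        return self.pc >= len(self.program)

    def wants_lock(self) -> bool:
        return not self.done() and self.program[self.pc] == "c"


def simulate_mutex(tasks, inheritance: bool, max_ticks: int = 100):
    """Single-CPU, fixed-priority, preemptive simulation with one mutex."""
    holder = None
    for t in range(max_ticks):
        if all(x.done() for x in tasks):
            break
        ready = [x for x in tasks if x.release <= t and not x.done()]
        blocked = [x for x in ready
                   if x.wants_lock() and holder is not None and holder is not x]

        def effective(x):
            # With inheritance, the lock holder runs at the priority of the
            # highest-priority task it is currently blocking.
            if inheritance and x is holder and blocked:
                return max(x.priority, max(b.priority for b in blocked))
            return x.priority

        runnable = [x for x in ready if all(x is not b for b in blocked)]
        if not runnable:
            continue                      # idle tick
        current = max(runnable, key=effective)
        if current.wants_lock():
            holder = current              # acquire (or keep) the lock
        current.pc += 1
        if holder is current and not current.wants_lock():
            holder = None                 # critical section ended
        if current.done():
            current.finish = t + 1
    return {x.name: x.finish for x in tasks}


def scenario():
    """L takes the lock first; H and M arrive one tick later."""
    return [Task("L", 1, 0, "cccc"), Task("M", 2, 1, "nnnnn"), Task("H", 3, 1, "cn")]


if __name__ == "__main__":
    print("without inheritance:", simulate_mutex(scenario(), inheritance=False))
    print("with inheritance:   ", simulate_mutex(scenario(), inheritance=True))
```

In this scenario H finishes at tick 11 without inheritance, because M runs ahead of the lock holder, and at tick 6 with inheritance. H still waits for the rest of L's critical section in both cases, which is the bounded blocking the protocol is meant to achieve.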

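Bounded Execution: Each task's worst-case execution time (WCET) must be known, and its worst-case response time (WCET plus preemption and blocking delays) must be provably no greater than its deadline.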
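One standard way to check this for fixed-priority scheduling is response-time analysis, which iterates R = C_i + sum over higher-priority tasks of ceil(R / T_j) * C_j until R stops changing. The sketch below assumes integer time units, deadlines no longer than periods, no blocking on shared resources, and zero context-switch cost; the function name is illustrative.

```python
def response_time(index, tasks):
    """Worst-case response time of tasks[index] under fixed priorities.

    tasks: list of (wcet, period, deadline), highest priority first.
    Returns the response time, or None if it would exceed the deadline.
    """
    wcet, _, deadline = tasks[index]
    higher = tasks[:index]
    r = wcet
    while True:
        # Each higher-priority task preempts once per release within r.
        # -(-r // t_j) is integer ceiling division.
        interference = sum(-(-r // t_j) * c_j for c_j, t_j, _ in higher)
        r_next = wcet + interference
        if r_next > deadline:
            return None
        if r_next == r:
            return r
        r = r_next


if __name__ == "__main__":
    # (wcet_ms, period_ms, deadline_ms), in rate-monotonic priority order.
    brake_set = [(1, 5, 5), (2, 10, 10), (5, 100, 100)]
    for i, task in enumerate(brake_set):
        print(task, "worst-case response:", response_time(i, brake_set))
```

With the assumed numbers, the three tasks have worst-case response times of 1, 3, and 9 ms, all within their deadlines. The result is only as trustworthy as the WCET inputs.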

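Interrupt Latency: Time from a hardware interrupt being raised to the start of its interrupt service routine (ISR). It must have a known upper bound, which means keeping ISRs short and limiting how long interrupts stay disabled so that pending interrupts cannot pile up.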
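Python cannot observe ISR entry, so the sketch below measures a user-space proxy instead: how late a thread wakes up after a timed sleep. The helper names and the choice of a 1 ms interval are assumptions. The numbers it produces describe the host OS scheduler, not interrupt latency on an embedded target.

```python
import time


def measure_wakeup_latency(interval_s=0.001, samples=200):
    """Return per-sample lateness in microseconds (actual minus requested wake-up)."""
    lateness_us = []
    for _ in range(samples):
        target_ns = time.monotonic_ns() + int(interval_s * 1e9)
        time.sleep(interval_s)
        lateness_us.append((time.monotonic_ns() - target_ns) / 1000.0)
    return lateness_us


def summarize(lateness_us):
    """Report the values that matter for bounding: the tail, not the mean."""
    ordered = sorted(lateness_us)
    p99_index = min(len(ordered) - 1, int(0.99 * len(ordered)))
    return {
        "min_us": ordered[0],
        "p99_us": ordered[p99_index],
        "max_us": ordered[-1],
    }


if __name__ == "__main__":
    print(summarize(measure_wakeup_latency()))
```

On a general-purpose OS the maximum is typically far above the minimum and changes from run to run. That spread is the jitter a real-time design has to bound.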

Practical Example

import threading
import time


def now_ms() -> float:
    """Monotonic time in milliseconds (unaffected by wall-clock adjustments)."""
    return time.monotonic_ns() / 1_000_000


class RealTimeTask:
    """Simplified periodic task with a relative deadline."""

    def __init__(self, name: str, period_ms: int, deadline_ms: int, work_ms: int):
        self.name = name
        self.period_ms = period_ms
        self.deadline_ms = deadline_ms
        self.work_ms = work_ms
        self.deadline_missed = 0
        self.deadline_met = 0

    def execute(self) -> None:
        """Simulate work that takes approximately work_ms."""
        time.sleep(self.work_ms / 1000.0)

    def record(self, response_ms: float) -> bool:
        """Count one job as met or missed, given its release-to-completion time."""
        met = response_ms <= self.deadline_ms
        if met:
            self.deadline_met += 1
        else:
            self.deadline_missed += 1
        return met


class RealTimeScheduler:
    """Simulation of periodic tasks kept in rate-monotonic order.

    Python threads have no priorities, so this measures deadlines but
    cannot enforce rate-monotonic preemption.
    """

    def __init__(self):
        self.tasks = []
        self.shared_resource = threading.Lock()
        self.stop_event = threading.Event()

    def add_task(self, task: RealTimeTask):
        """Add task, sorted by period (shorter period = higher priority)."""
        self.tasks.append(task)
        self.tasks.sort(key=lambda t: t.period_ms)  # Rate-monotonic order

    def run_task(self, task: RealTimeTask):
        """Run task periodically with deadline checking."""
        while not self.stop_event.is_set():
            release = now_ms()

            # threading.Lock does not implement priority inheritance; an RTOS
            # mutex would boost the holder while a higher-priority task waits.
            with self.shared_resource:  # Simulate critical section
                task.execute()

            # Measure from release so lock-wait time counts against the deadline.
            elapsed = now_ms() - release
            if not task.record(elapsed):
                print(f"{task.name} DEADLINE MISSED: {elapsed:.1f}ms > {task.deadline_ms}ms")

            # Sleeping for the rest of the period lets release times drift;
            # an RTOS would wake the task at absolute release times instead.
            time.sleep(max(0.0, (task.period_ms - elapsed) / 1000.0))

    def start_all(self):
        """Start all tasks in separate threads."""
        threads = []
        for task in self.tasks:
            # In a real RTOS, set OS-level priority here.
            # On Linux: os.sched_setscheduler(), which typically requires root.
            thread = threading.Thread(target=self.run_task, args=(task,), daemon=False)
            thread.start()
            threads.append(thread)
        return threads

    def stop_all(self):
        self.stop_event.set()


if __name__ == "__main__":
    # Example: Brake system (hard real-time)
    scheduler = RealTimeScheduler()

    # High-priority: Sensor reading (must meet 5ms deadline)
    sensor_task = RealTimeTask("BrakeSensor", period_ms=5, deadline_ms=5, work_ms=1)
    # Medium-priority: Brake actuation (must meet 10ms deadline)
    brake_task = RealTimeTask("BrakeActuator", period_ms=10, deadline_ms=10, work_ms=2)
    # Low-priority: Telemetry logging (soft real-time, 100ms deadline)
    telemetry_task = RealTimeTask("Telemetry", period_ms=100, deadline_ms=100, work_ms=5)

    scheduler.add_task(sensor_task)
    scheduler.add_task(brake_task)
    scheduler.add_task(telemetry_task)

    # All three tasks share one lock, so Telemetry's 5ms critical section can
    # block BrakeSensor past its 5ms deadline: some misses are expected.
    threads = scheduler.start_all()

    # Run for 2 seconds
    time.sleep(2)
    scheduler.stop_all()

    for t in threads:
        t.join()

    # Results
    print(f"\nSensor: {sensor_task.deadline_met} met, {sensor_task.deadline_missed} missed")
    print(f"Brake: {brake_task.deadline_met} met, {brake_task.deadline_missed} missed")
    print(f"Telemetry: {telemetry_task.deadline_met} met, {telemetry_task.deadline_missed} missed")

When to Use / When Not to Use

When to Use

  1. Hard deadlines (safety-critical systems)
  2. Predictable latency is non-negotiable
  3. Embedded/IoT with predictable workloads
  4. Industrial automation, aerospace, medical devices
  5. Need deterministic behavior across all conditions
  6. Performance variability could cause harm

When Not to Use

  1. Soft guarantees are acceptable
  2. General-purpose server with variable workloads
  3. Web applications, CRUD services
  4. Average-case performance matters more than worst-case
  5. Team lacks real-time OS expertise
  6. Complexity/cost not justified by requirements

Patterns and Pitfalls

Design Review Checklist

  • Are deadlines clearly specified (hard vs soft vs firm)?
  • Has a scheduling algorithm been chosen (Rate Monotonic or another fixed-priority scheme, or EDF)?
  • Can you prove all tasks meet deadlines under worst-case load?
  • Are priority inversion risks identified and mitigated?
  • Is interrupt latency bounded and measured?
  • Is memory pre-allocated (no malloc in critical path)?
  • Are watchdog timers implemented for critical tasks?
  • Can you measure worst-case execution time (WCET)?
  • Is thread/task synchronization lock-free where possible?
  • Does the team understand the limitations and capabilities of the chosen RTOS?

Self-Check

  1. What's the difference between hard and soft real-time? Hard real-time: missing deadline is catastrophic (surgery, aircraft). Soft real-time: missing deadline degrades quality but is recoverable (video playback).
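  2. Why is priority inversion a problem? A low-priority task holds a lock that a high-priority task needs, so the high-priority task cannot run until the lock is released. If medium-priority tasks keep preempting the lock holder, that wait has no upper bound, which defeats the purpose of prioritization.
  3. How do you prevent priority inversion? Use a priority inheritance or priority ceiling protocol. Under priority inheritance, the low-priority task temporarily runs at the blocked high-priority task's priority while it holds the contested resource, so medium-priority tasks cannot preempt it.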
One Takeaway: Real-time is about predictability, not raw speed. A system that's slow but predictable beats one that's fast but unpredictable when missing deadlines is catastrophic.

Next Steps

  • Scheduling Theory: Rate Monotonic Analysis (RMA), Earliest Deadline First (EDF)
  • RTOS Examples: FreeRTOS, QNX, VxWorks, Zephyr
  • Worst-Case Execution Time (WCET): Static analysis tools and timing validation
  • Interrupt Handling: Latency bounds, deferred execution, ISR design
  • Lock-Free Programming: Alternatives to locks for synchronization
