Timeout Pattern

Prevent operations from hanging indefinitely by enforcing a maximum execution time. A resilience pattern for predictable response times.

Topics: design

Overview

The Timeout Pattern is a resilience pattern that prevents operations from hanging indefinitely by enforcing a maximum execution time. Without timeouts, a single slow downstream service can hold up threads, connections, and user requests indefinitely, causing cascading failures across the system.

When to Use

Use the Timeout Pattern when:

  • You call external services, databases, or APIs that may become unresponsive
  • You need to guarantee maximum response times to users or upstream callers
  • Hanging operations could exhaust thread pools, connection pools, or memory
  • You want to fail fast rather than wait indefinitely for a response
  • You use, or plan to use, Retry for transient issues and Circuit Breaker for chronic failures; both rely on a timeout to turn a hung call into a failure they can act on

Solution

Python

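import functools
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

def with_timeout(seconds: float):
    """Decorator: raise TimeoutError if the call takes longer than `seconds`."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # No `with` block here: its exit calls shutdown(wait=True), which
            # would block until the slow task finishes and defeat the timeout.
            executor = ThreadPoolExecutor(max_workers=1)
            future = executor.submit(func, *args, **kwargs)
            try:
                return future.result(timeout=seconds)
            except FutureTimeout:
                raise TimeoutError(f"Operation timed out after {seconds}s") from None
            finally:
                # Return immediately; the worker thread is abandoned, not killed.
                executor.shutdown(wait=False)
        return wrapper
    return decorator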

@with_timeout(seconds=2.0)
def fetch_slow_data():
    import time
    time.sleep(5)
    return "data"

# Usage
try:
    result = fetch_slow_data()
    print(result)
except TimeoutError as e:
    print(f"Failed: {e}")

JavaScript

function withTimeout(fn, timeoutMs) {
  return function(...args) {
    return new Promise((resolve, reject) => {
      const timer = setTimeout(() => {
        reject(new Error(`Operation timed out after ${timeoutMs}ms`));
      }, timeoutMs);

      Promise.resolve(fn(...args))
        .then(resolve)
        .catch(reject)
        .finally(() => clearTimeout(timer));
    });
  };
}

async function fetchSlowData() {
  await new Promise(r => setTimeout(r, 5000));
  return "data";
}

const timedFetch = withTimeout(fetchSlowData, 2000);

// Usage
timedFetch()
  .then(console.log)
  .catch(e => console.log("Failed:", e.message));

Java

import java.util.concurrent.*;

public class Timeout {
    public static <T> T execute(Callable<T> task, long timeoutMs) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<T> future = executor.submit(task);
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
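        } catch (TimeoutException e) {
            // Keep the original exception as the cause for debugging.
            throw new RuntimeException("Operation timed out after " + timeoutMs + "ms", e);
        } finally {
            // Interrupts the worker thread; the task stops only if it responds to interruption.
            executor.shutdownNow();
        }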
    }

    public static void main(String[] args) {
        try {
            String result = execute(() -> {
                Thread.sleep(5000);
                return "data";
            }, 2000);
            System.out.println(result);
        } catch (Exception e) {
            System.out.println("Failed: " + e.getMessage());
        }
    }
}

Explanation

The Timeout Pattern enforces a hard deadline on operations:

  • Deadline: The maximum time an operation is allowed to run
  • Cancellation: When the deadline expires, the operation is interrupted or abandoned
  • Propagation: Timeouts should propagate through call chains. If an API call has a 5s budget and it calls a DB, the DB call should use a shorter timeout (e.g., 3s) so the caller still has time to handle the result or the failure

This prevents thread pool exhaustion, connection leaks, and poor user experience from unresponsive dependencies.
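
The propagation idea above can be sketched as a small deadline object that every layer consults before calling downstream. This is a minimal illustration, not a standard API: the `Deadline` class, `timeout_for`, `query_db`, and `handle_request` are hypothetical names, and the 5s budget and 2s margin are assumed values chosen to match the example in the list.

```python
import time


class Deadline:
    """Tracks one end-to-end time budget shared across nested calls."""

    def __init__(self, budget_seconds: float):
        # Monotonic clock: unaffected by wall-clock adjustments.
        self._expires_at = time.monotonic() + budget_seconds

    def remaining(self) -> float:
        """Seconds left in the budget, never negative."""
        return max(0.0, self._expires_at - time.monotonic())

    def timeout_for(self, margin_seconds: float = 0.0) -> float:
        """Timeout to hand to a downstream call, keeping a margin for the caller."""
        left = self.remaining() - margin_seconds
        if left <= 0:
            raise TimeoutError("Deadline exceeded before the call started")
        return left


def query_db(timeout: float) -> str:
    # Hypothetical downstream call; a real client would receive `timeout`.
    return f"rows (allowed {timeout:.1f}s)"


def handle_request(deadline: Deadline) -> str:
    # Reserve 2s of the budget for work after the DB call returns.
    return query_db(timeout=deadline.timeout_for(margin_seconds=2.0))


print(handle_request(Deadline(budget_seconds=5.0)))
```

Because each layer asks for the time remaining rather than using its own fixed value, a slow upstream step automatically shrinks what downstream calls are allowed to spend.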

Variants

Variant | Description | Use Case
Fixed Timeout | Same timeout for all calls | Simple, predictable behavior
Adaptive Timeout | Timeout based on historical latencies (P99) | Dynamic response to service health
Deadline Propagation | Pass remaining time through the call chain | End-to-end request latency budgets
Partial Results | Return what was fetched before timeout | Streaming, search, aggregation
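
As one possible reading of the Adaptive Timeout variant, the sketch below keeps a sliding window of observed latencies and derives the timeout from their P99. The `AdaptiveTimeout` class and its defaults (a 100-sample window, a 1.5x multiplier, a 50ms floor, and a 5s ceiling) are assumptions for illustration, not values from any particular library.

```python
import math
from collections import deque


class AdaptiveTimeout:
    """Derives a timeout from the P99 of recently observed latencies."""

    def __init__(self, floor=0.05, ceiling=5.0, multiplier=1.5, window=100):
        self.floor = floor            # never go below this (seconds)
        self.ceiling = ceiling        # never go above this; also the cold-start value
        self.multiplier = multiplier  # headroom on top of P99
        self._samples = deque(maxlen=window)  # sliding window of latencies

    def record(self, latency_seconds: float) -> None:
        self._samples.append(latency_seconds)

    def current(self) -> float:
        if not self._samples:
            return self.ceiling
        ordered = sorted(self._samples)
        # Nearest-rank P99: smallest sample with at least 99% of samples at or below it.
        rank = math.ceil(99 * len(ordered) / 100)
        p99 = ordered[rank - 1]
        return min(self.ceiling, max(self.floor, p99 * self.multiplier))


timeouts = AdaptiveTimeout()
for ms in range(1, 101):             # simulated latencies: 1ms .. 100ms
    timeouts.record(ms / 1000)
print(round(timeouts.current(), 4))  # 0.1485
```

The floor and ceiling keep the computed value sane when the window is empty or when a degraded service inflates its own P99.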

Best Practices

  • Always set timeouts on external calls — network I/O, database queries, HTTP requests
  • Propagate deadlines through your call chain (e.g., gRPC context, HTTP headers)
  • Set timeouts shorter at lower levels — leave headroom for retries and fallbacks
  • Log timeout events with the target service name for debugging
  • Combine with Circuit Breaker — if timeouts are frequent, stop calling the failing service
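  • Use Promise.race in JavaScript and Future.get(timeout) in Java to enforce the deadline, then cancel explicitly (AbortController, future.cancel(true)); neither call stops the underlying work on its own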

Common Mistakes

  • Not setting any timeout, allowing operations to hang forever
  • Setting timeouts too long, defeating the purpose of failing fast
  • Setting timeouts too short, causing unnecessary failures during normal spikes
  • Not canceling the underlying operation when the timeout fires (resource leaks)
  • Ignoring timeout propagation, causing cascading deadline misses
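
To illustrate the resource-leak mistake above: Python threads cannot be killed from outside, so one way to avoid abandoned work is cooperative cancellation. The sketch below assumes the task accepts a `threading.Event` and checks it between steps; `run_with_cancellable_timeout` and `chunked_work` are hypothetical names.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def run_with_cancellable_timeout(task, seconds: float):
    """Run task(cancel_event); set the event if the deadline passes."""
    cancel = threading.Event()
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(task, cancel)
    try:
        return future.result(timeout=seconds)
    except FutureTimeout:
        cancel.set()  # Ask the worker to stop; it must check the flag itself.
        raise TimeoutError(f"Operation timed out after {seconds}s") from None
    finally:
        executor.shutdown(wait=False)  # Do not block on the worker thread.


def chunked_work(cancel: threading.Event) -> str:
    for step in range(50):
        # Waits up to 0.1s per step, but wakes immediately once cancelled.
        if cancel.wait(timeout=0.1):
            return f"cancelled at step {step}"
    return "done"


try:
    run_with_cancellable_timeout(chunked_work, seconds=0.3)
except TimeoutError as e:
    print(f"Failed: {e}")
```

This only helps when the task is written to check the flag; a blocking call that ignores it still needs a timeout of its own at the client level.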

Frequently Asked Questions

Q: What timeout value should I use? A: Base it on your SLA and the downstream service’s P99 latency. If your API promises 500ms response time, and a DB call takes 100ms at P99, set the DB timeout to ~150ms to leave room for retries and processing.
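
The arithmetic behind that answer can be written out as a small helper. This is a sketch under stated assumptions: the 50ms of processing overhead and the 3 attempts are illustrative figures (the answer above does not specify them), picked because they reproduce the ~150ms result, and `per_attempt_timeout_ms` is a hypothetical name.

```python
def per_attempt_timeout_ms(sla_ms: float, overhead_ms: float, attempts: int) -> float:
    """Split what is left of the SLA evenly across all attempts."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    budget_ms = sla_ms - overhead_ms
    if budget_ms <= 0:
        raise ValueError("overhead leaves no time for the downstream call")
    return budget_ms / attempts


# Assumed inputs: 500ms SLA, 50ms of local processing, up to 3 attempts.
timeout_ms = per_attempt_timeout_ms(sla_ms=500, overhead_ms=50, attempts=3)
p99_ms = 100                 # downstream P99 from the example above
print(timeout_ms)            # 150.0
print(timeout_ms >= p99_ms)  # True: the timeout still covers normal P99 latency
```

If the computed value falls below the downstream P99, the budget cannot support that many attempts and either the attempt count or the SLA has to change.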

Q: Does timeout cancel the underlying operation? A: It depends on the implementation. Thread interruption signals cancellation but doesn’t force it, and Java’s CompletableFuture.cancel() does not interrupt the running task at all. With cancellation-aware APIs (for example, JavaScript’s AbortController passed to fetch), you can properly cancel the underlying I/O.
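
For comparison, Python's `asyncio.wait_for` is an example of a timeout that does cancel the work: when the timeout fires, the awaited coroutine receives `CancelledError` and gets a chance to clean up. In this small sketch, `slow_io` is a hypothetical stand-in for a real I/O call.

```python
import asyncio


async def slow_io(log: list) -> str:
    try:
        await asyncio.sleep(5)   # stand-in for a slow network call
        return "data"
    except asyncio.CancelledError:
        log.append("cancelled")  # release connections, files, etc. here
        raise                    # always re-raise so cancellation completes


async def main() -> list:
    log = []
    try:
        await asyncio.wait_for(slow_io(log), timeout=0.1)
    except asyncio.TimeoutError:
        log.append("timed out")
    return log


print(asyncio.run(main()))  # ['cancelled', 'timed out']
```

The coroutine is cancelled before the caller sees the timeout, so nothing keeps running in the background after the timeout is reported.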

Q: Should I retry after a timeout? A: Yes, if the operation is idempotent and the timeout might have been caused by a transient network issue. But if timeouts are frequent, combine with Circuit Breaker to avoid wasting retries on a chronically slow service.
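
A retry-on-timeout loop for an idempotent operation might look like the sketch below. The helper `retry_on_timeout`, the simulated `flaky_read`, and the chosen attempt count and timeout are all assumptions for illustration; a production version would typically add jittered backoff and a circuit breaker around it.

```python
import time


def retry_on_timeout(operation, attempts: int, per_attempt_timeout: float,
                     backoff_seconds: float = 0.0):
    """Call operation(timeout) until it succeeds or attempts run out.

    Only safe when the operation is idempotent.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return operation(per_attempt_timeout)
        except TimeoutError as exc:
            last_error = exc
            if attempt < attempts:
                time.sleep(backoff_seconds * attempt)  # linear backoff
    raise TimeoutError(f"Gave up after {attempts} timed-out attempts") from last_error


calls = {"count": 0}


def flaky_read(timeout: float) -> str:
    # Simulated dependency: times out twice, then answers.
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError(f"no reply within {timeout}s")
    return "data"


print(retry_on_timeout(flaky_read, attempts=3, per_attempt_timeout=0.15))  # data
```

Other exception types are deliberately not retried here, so a genuine error surfaces immediately instead of being repeated.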