Saga Pattern

A microservices pattern for managing distributed transactions across multiple services by chaining local transactions, each paired with a compensating action for rollback.

Topics: design

Overview

The Saga Pattern manages distributed transactions across multiple services by breaking a long-running transaction into a sequence of local transactions. Each local transaction updates a single service and publishes an event or message to trigger the next step. If a step fails, the saga runs compensating transactions, in reverse order, to undo the changes made by the steps that already completed.

When to Use

Use the Saga Pattern when:

  • A business operation spans multiple microservices or databases
  • Two-phase commit (2PC) is too slow or unavailable
  • You need eventual consistency across distributed services
  • Each service must remain autonomous with its own transaction boundaries
  • The workflow resembles e-commerce checkout, travel booking, financial transfers, or order fulfillment

Solution

Python

from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional

@dataclass
class SagaResult:
    success: bool
    data: Any = None
    error: Optional[str] = None
    step_index: int = 0  # index of the failing step (0 if the saga succeeded)

class SagaStep:
    def __init__(self, name: str, action: Callable, compensation: Optional[Callable] = None):
        self.name = name
        self.action = action              # forward (local) transaction
        self.compensation = compensation  # undo operation; None if there is nothing to undo

class SagaOrchestrator:
    def __init__(self):
        self.steps: List[SagaStep] = []
        self.completed: List[Dict] = []

    def add_step(self, name: str, action: Callable, compensation: Optional[Callable] = None):
        self.steps.append(SagaStep(name, action, compensation))

    def execute(self, context: Dict) -> SagaResult:
        self.completed = []
        for i, step in enumerate(self.steps):
            try:
                step.action(context)
                # Snapshot the context so the compensation sees the state as of this step
                self.completed.append({"step": step.name, "context": dict(context)})
                print(f"Step '{step.name}' completed")
            except Exception as e:
                print(f"Step '{step.name}' failed: {e}")
                self.rollback(i)
                return SagaResult(success=False, error=str(e), step_index=i)
        return SagaResult(success=True, data=context)

    def rollback(self, failed_index: int):
        print(f"Rolling back {failed_index} completed steps...")
        # Compensate in reverse order; steps[j] and completed[j] line up because
        # every step before failed_index finished successfully
        for j in range(failed_index - 1, -1, -1):
            step = self.steps[j]
            if step.compensation:
                try:
                    state = self.completed[j]
                    step.compensation(state["context"])
                    print(f"Compensated '{step.name}'")
                except Exception as e:
                    # Keep going so one failed compensation does not block the others
                    print(f"Compensation failed for '{step.name}': {e}")

# Usage: Travel booking saga
saga = SagaOrchestrator()

saga.add_step(
    "reserve_flight",
    action=lambda ctx: ctx.update({"flight": "FL123"}) or True,
    compensation=lambda ctx: print("Canceling flight reservation")
)

saga.add_step(
    "reserve_hotel",
    action=lambda ctx: ctx.update({"hotel": "HT456"}) or True,
    compensation=lambda ctx: print("Canceling hotel reservation")
)

def decline_payment(ctx):
    # Simulate a failing step to trigger the rollback
    raise Exception("Payment declined")

saga.add_step(
    "charge_payment",
    action=decline_payment,
    compensation=lambda ctx: print("Refunding payment")
)

result = saga.execute({"user": "alice"})
print(f"Saga success: {result.success}")

JavaScript

class SagaStep {
  constructor(name, action, compensation) {
    this.name = name;
    this.action = action;
    this.compensation = compensation;
  }
}

class SagaOrchestrator {
  constructor() {
    this.steps = [];
    this.completed = [];
  }

  addStep(name, action, compensation) {
    this.steps.push(new SagaStep(name, action, compensation));
  }

  async execute(context) {
    this.completed = [];
    for (let i = 0; i < this.steps.length; i++) {
      try {
        await this.steps[i].action(context);
        // Snapshot the context so the compensation sees the state as of this step
        this.completed.push({ step: this.steps[i].name, context: { ...context } });
        console.log(`Step '${this.steps[i].name}' completed`);
      } catch (e) {
        console.log(`Step '${this.steps[i].name}' failed: ${e.message}`);
        // Undo the steps that completed before index i, in reverse order
        await this.rollback(i);
        return { success: false, error: e.message, stepIndex: i };
      }
    }
    return { success: true, data: context };
  }

  async rollback(failedIndex) {
    console.log(`Rolling back ${failedIndex} completed steps...`);
    for (let j = failedIndex - 1; j >= 0; j--) {
      const step = this.steps[j];
      if (step.compensation) {
        try {
          await step.compensation(this.completed[j].context);
          console.log(`Compensated '${step.name}'`);
        } catch (e) {
          console.log(`Compensation failed for '${step.name}': ${e.message}`);
        }
      }
    }
  }
}

// Usage
const saga = new SagaOrchestrator();
saga.addStep("reserveFlight",
  async (ctx) => { ctx.flight = "FL123"; },
  async () => console.log("Canceling flight")
);
saga.addStep("reserveHotel",
  async (ctx) => { ctx.hotel = "HT456"; },
  async () => console.log("Canceling hotel")
);
saga.addStep("chargePayment",
  async () => { throw new Error("Payment declined"); },
  async () => console.log("Refunding payment")
);

saga.execute({ user: "alice" }).then(r => console.log("Success:", r.success));

Java

import java.util.*;
import java.util.function.Consumer;

class SagaStep {
    String name;
    Consumer<Map<String, Object>> action;
    Consumer<Map<String, Object>> compensation;

    SagaStep(String name, Consumer<Map<String, Object>> action, Consumer<Map<String, Object>> compensation) {
        this.name = name;
        this.action = action;
        this.compensation = compensation;
    }
}

class SagaResult {
    boolean success;
    String error;
    int stepIndex;

    SagaResult(boolean success, String error, int stepIndex) {
        this.success = success;
        this.error = error;
        this.stepIndex = stepIndex;
    }
}

class SagaOrchestrator {
    private final List<SagaStep> steps = new ArrayList<>();
    private final List<Map<String, Object>> completed = new ArrayList<>();

    void addStep(String name, Consumer<Map<String, Object>> action, Consumer<Map<String, Object>> compensation) {
        steps.add(new SagaStep(name, action, compensation));
    }

    SagaResult execute(Map<String, Object> context) {
        completed.clear();
        for (int i = 0; i < steps.size(); i++) {
            try {
                steps.get(i).action.accept(context);
                completed.add(new HashMap<>(context));
                System.out.println("Step '" + steps.get(i).name + "' completed");
            } catch (Exception e) {
                System.out.println("Step '" + steps.get(i).name + "' failed: " + e.getMessage());
                rollback(i);
                return new SagaResult(false, e.getMessage(), i);
            }
        }
        return new SagaResult(true, null, steps.size());
    }

    void rollback(int failedIndex) {
        System.out.println("Rolling back " + failedIndex + " completed steps...");
        // Compensate in reverse order, passing each step its own context snapshot
        for (int j = failedIndex - 1; j >= 0; j--) {
            SagaStep step = steps.get(j);
            if (step.compensation != null) {
                try {
                    step.compensation.accept(completed.get(j));
                    System.out.println("Compensated '" + step.name + "'");
                } catch (Exception e) {
                    // Keep going so one failed compensation does not block the others
                    System.out.println("Compensation failed for '" + step.name + "': " + e.getMessage());
                }
            }
        }
    }
}

// Usage (statements must live inside a method; save the listing as SagaDemo.java)
public class SagaDemo {
    public static void main(String[] args) {
        SagaOrchestrator saga = new SagaOrchestrator();
        saga.addStep("reserveFlight",
            ctx -> ctx.put("flight", "FL123"),
            ctx -> System.out.println("Canceling flight")
        );
        saga.addStep("reserveHotel",
            ctx -> ctx.put("hotel", "HT456"),
            ctx -> System.out.println("Canceling hotel")
        );
        saga.addStep("chargePayment",
            ctx -> { throw new RuntimeException("Payment declined"); },
            ctx -> System.out.println("Refunding payment")
        );

        SagaResult result = saga.execute(new HashMap<>(Map.of("user", "alice")));
        System.out.println("Success: " + result.success);
    }
}

Explanation

The Saga Pattern has two styles:

  • Orchestration: A central orchestrator manages the sequence and handles failures
  • Choreography: Services communicate via events; each service listens for events and acts, publishing the next event

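Both approaches use compensating transactions to undo work when a step fails. Unlike ACID transactions, sagas do not provide isolation: each local transaction commits on its own, so intermediate states are visible to other requests and the system is only eventually consistent.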
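The Solution section shows orchestration only, so a choreographed version of the same travel booking flow may help for comparison. This is a minimal sketch, not a production design: `EventBus` is a hypothetical in-memory stand-in for a message broker with synchronous delivery, and the event names (`BookingRequested`, `FlightReserved`, and so on) are invented for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class EventBus:
    """Hypothetical in-memory broker; delivers events synchronously."""

    def __init__(self) -> None:
        self.handlers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)
        self.log: List[str] = []  # every event published, in order

    def subscribe(self, event: str, handler: Callable[[dict], None]) -> None:
        self.handlers[event].append(handler)

    def publish(self, event: str, payload: dict) -> None:
        self.log.append(event)
        for handler in list(self.handlers[event]):
            handler(payload)


def wire_services(bus: EventBus, payment_ok: bool) -> None:
    """Each service reacts to events and publishes the next one; no coordinator."""
    # Flight service: reserves on request, cancels once the hotel is cancelled.
    bus.subscribe("BookingRequested",
                  lambda p: bus.publish("FlightReserved", {**p, "flight": "FL123"}))
    bus.subscribe("HotelCancelled", lambda p: bus.publish("FlightCancelled", p))

    # Hotel service: reserves after the flight, cancels when payment fails.
    bus.subscribe("FlightReserved",
                  lambda p: bus.publish("HotelReserved", {**p, "hotel": "HT456"}))
    bus.subscribe("PaymentFailed", lambda p: bus.publish("HotelCancelled", p))

    # Payment service: the outcome is forced by a flag for this demo.
    def charge(payload: dict) -> None:
        bus.publish("PaymentCompleted" if payment_ok else "PaymentFailed", payload)

    bus.subscribe("HotelReserved", charge)


bus = EventBus()
wire_services(bus, payment_ok=False)
bus.publish("BookingRequested", {"user": "alice"})
print(bus.log)
# ['BookingRequested', 'FlightReserved', 'HotelReserved',
#  'PaymentFailed', 'HotelCancelled', 'FlightCancelled']
```

The compensation chain here is itself event-driven: the failure event propagates backwards through the services, which is why choreographed flows become harder to follow as the number of steps grows.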

Variants

| Variant | Description | Use Case |
|---|---|---|
| Orchestrated Saga | Central coordinator manages flow | Complex flows; need visibility |
| Choreographed Saga | Event-driven, no central coordinator | Simple flows; loose coupling |
| Parallel Saga | Independent steps run concurrently | Non-dependent operations |
| Nested Saga | A saga calls another saga | Complex domain decompositions |
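The parallel variant can be sketched with a thread pool. The example below is one possible shape under stated assumptions, not a reference implementation: `run_parallel` is a hypothetical helper, each action receives its own copy of the context and returns the fields it produced (which avoids shared mutable state between threads), and compensations run sequentially for the steps that succeeded.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

# (name, action returning new context fields, compensation)
Step = Tuple[str, Callable[[Dict], Dict], Callable[[Dict], None]]


def run_parallel(steps: List[Step], context: Dict) -> Tuple[bool, Dict, List[str]]:
    """Run independent steps concurrently; if any fails, compensate the rest."""
    succeeded: List[Tuple[str, Callable[[Dict], None]]] = []
    failed: List[str] = []
    produced: Dict = {}

    with ThreadPoolExecutor(max_workers=max(1, len(steps))) as pool:
        futures = [(name, comp, pool.submit(action, dict(context)))
                   for name, action, comp in steps]
        # Collect results in submission order so the outcome is deterministic.
        for name, comp, future in futures:
            try:
                produced.update(future.result())
                succeeded.append((name, comp))
            except Exception:
                failed.append(name)

    if not failed:
        return True, {**context, **produced}, []

    compensated: List[str] = []
    for name, comp in succeeded:
        comp(context)
        compensated.append(name)
    return False, context, compensated


def no_cars(ctx: Dict) -> Dict:
    raise RuntimeError("No cars available")  # simulated failure


steps: List[Step] = [
    ("reserve_flight", lambda ctx: {"flight": "FL123"}, lambda ctx: None),
    ("reserve_hotel", lambda ctx: {"hotel": "HT456"}, lambda ctx: None),
    ("reserve_car", no_cars, lambda ctx: None),
]
ok, final_context, compensated = run_parallel(steps, {"user": "alice"})
print(ok, compensated)  # False ['reserve_flight', 'reserve_hotel']
```

Because the steps are independent, the compensation order does not matter here; in a sequential saga it does.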

Best Practices

  • Compensations: Design them first; every step with side effects needs a reliable undo operation
  • Idempotency: Steps and compensations should be safe to run multiple times
  • Timeouts: Each step must have a timeout; missing responses should trigger compensation
  • Logging: Log every step, compensation, and failure for observability
  • Retries: Retry transient failures within a step before declaring failure
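The idempotency and retry practices above can be combined as wrappers around a step's action. The following is a sketch under assumptions: `TransientError`, `with_retries`, and `idempotent` are hypothetical names, the context is assumed to carry a `saga_id`, and the `processed` set stands in for a durable store that a real service would update atomically with the side effect.

```python
import time
from typing import Callable, Dict, Set


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. a network timeout)."""


def with_retries(action: Callable[[Dict], None], attempts: int = 3,
                 base_delay: float = 0.01) -> Callable[[Dict], None]:
    """Retry transient failures with exponential backoff before giving up."""
    def wrapped(context: Dict) -> None:
        for attempt in range(1, attempts + 1):
            try:
                action(context)
                return
            except TransientError:
                if attempt == attempts:
                    raise  # let the orchestrator start compensation
                time.sleep(base_delay * 2 ** (attempt - 1))
    return wrapped


def idempotent(step_name: str, action: Callable[[Dict], None],
               processed: Set[str]) -> Callable[[Dict], None]:
    """Skip the action if this saga already completed this step."""
    def wrapped(context: Dict) -> None:
        key = f"{context['saga_id']}:{step_name}"
        if key in processed:
            return  # duplicate delivery: do nothing
        action(context)
        processed.add(key)
    return wrapped


calls = {"attempts": 0, "charges": 0}


def flaky_charge(context: Dict) -> None:
    calls["attempts"] += 1
    if calls["attempts"] < 3:
        raise TransientError("gateway timeout")  # fails twice, then succeeds
    calls["charges"] += 1


processed: Set[str] = set()
charge = idempotent("charge_payment", with_retries(flaky_charge), processed)
ctx = {"saga_id": "saga-1"}
charge(ctx)
charge(ctx)  # simulated redelivery of the same message
print(calls)  # {'attempts': 3, 'charges': 1}
```

Only `TransientError` is retried; any other exception propagates immediately, so a permanent failure such as a declined card reaches the rollback logic without delay.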

Common Mistakes

  • Forgetting compensations for steps that have side effects
  • Not handling partial failures in compensations (some succeed, some fail)
  • Allowing sagas to run indefinitely without timeouts
  • Not making steps idempotent, causing duplicate side effects on retry
  • Mixing synchronous and async compensations inconsistently
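To avoid the unbounded-saga mistake above, each step can be given a deadline. The sketch below uses only the standard library; `call_with_timeout` is a hypothetical helper. Because a timeout surfaces as an exception, an orchestrator that rolls back on any exception would treat it like any other failed step.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as StepTimeout
from typing import Any, Callable, Dict


def call_with_timeout(action: Callable[[Dict], Any], context: Dict,
                      timeout_s: float) -> Any:
    """Run one step in a worker thread and stop waiting after timeout_s seconds."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(action, context)
    try:
        return future.result(timeout=timeout_s)  # raises StepTimeout if too slow
    finally:
        # Do not block on the worker; the abandoned call may still finish later.
        pool.shutdown(wait=False)


def slow_hotel(context: Dict) -> None:
    time.sleep(0.2)  # simulated slow downstream service
    context["hotel"] = "HT456"


try:
    call_with_timeout(slow_hotel, {"user": "alice"}, timeout_s=0.05)
    outcome = "completed"
except StepTimeout:
    outcome = "timed out"
print(outcome)  # timed out
```

A timeout only means the caller stopped waiting. The remote operation may still have succeeded, so the compensation for a timed-out step should tolerate both outcomes, which is another reason to make compensations idempotent.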

Frequently Asked Questions

Q: What is the difference between Saga and 2PC? A: 2PC holds locks on resources across services until the coordinator commits, which gives atomic commitment but makes the protocol blocking and brittle. A saga commits each local transaction immediately and releases its locks, achieving eventual consistency with better availability and performance.

Q: How do I handle a compensation that also fails? A: Log the failure and alert an operator. Some compensations may require manual intervention (e.g., refunding a payment). Design compensations to be as simple and reliable as possible.
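One way to make failed compensations visible to an operator is to have the rollback return a report instead of only printing. This is a sketch under assumptions: `RollbackReport` and `rollback_with_report` are hypothetical names, the retry count is arbitrary, and the `amount` field in the usage is illustrative data.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# (step name, compensation, context snapshot taken when the step completed)
Completed = Tuple[str, Callable[[Dict], None], Dict]


@dataclass
class RollbackReport:
    compensated: List[str] = field(default_factory=list)
    needs_manual_intervention: List[Tuple[str, str]] = field(default_factory=list)


def rollback_with_report(completed: List[Completed], attempts: int = 2) -> RollbackReport:
    """Compensate in reverse order; record what could not be undone."""
    report = RollbackReport()
    for name, compensation, snapshot in reversed(completed):
        last_error = ""
        for _ in range(attempts):
            try:
                compensation(snapshot)
                report.compensated.append(name)
                break
            except Exception as exc:
                last_error = str(exc)
        else:
            # Every attempt failed: hand the step over to an operator.
            report.needs_manual_intervention.append((name, last_error))
    return report


def refund_unavailable(snapshot: Dict) -> None:
    raise RuntimeError("refund API unavailable")  # simulated failure


report = rollback_with_report([
    ("reserve_flight", lambda snapshot: None, {"flight": "FL123"}),
    ("charge_payment", refund_unavailable, {"amount": 100}),
])
print(report.compensated)                # ['reserve_flight']
print(report.needs_manual_intervention)  # [('charge_payment', 'refund API unavailable')]
```

The rollback continues past the failed refund so the flight is still released, and the report gives an alerting or ticketing system something concrete to act on.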

Q: Orchestration vs. Choreography — which should I use? A: Use orchestration for complex flows where visibility and control are critical. Use choreography for simpler flows where loose coupling and autonomy are more important.