advanced By StackPractices

Workflow Engines

Orchestrate complex business processes with workflow engines, state machines, and long-running task coordination across distributed services.

Note: This guide follows English-language naming conventions and terminology standards common in international development teams. Examples use English identifiers and comments to maximize compatibility across codebases and tooling.

Overview

Workflow engines orchestrate complex, multi-step business processes that span services, time, and failure domains. Unlike simple job queues that execute independent tasks, workflows manage state transitions, retries, timeouts, and compensations across distributed systems. Whether processing an e-commerce order, underwriting an insurance policy, or approving a loan, workflow engines ensure each step executes in the right order with proper error handling.

When to Use

Use this resource when:

  • Business processes have 5+ sequential steps with failure handling requirements
  • Steps need to wait for human approval or external events (hours or days)
  • Partial failures require compensating transactions (saga pattern)
  • You need audit trails and visibility into long-running process state

Solution

Temporal Workflow (TypeScript)

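import { proxyActivities } from '@temporalio/workflow';

// Minimal order shape used by this example
interface Order {
  id: string;
  customerEmail: string;
  total: number;
}

// Activities run outside the workflow; the proxy applies the timeout and retry policy to each call
const { sendEmail, sendCompensationEmail, chargePayment, refundPayment, shipOrder } = proxyActivities<{
  sendEmail(email: string, message?: string): Promise<void>;
  sendCompensationEmail(order: Order): Promise<void>;
  chargePayment(amount: number): Promise<string>;
  refundPayment(paymentId: string): Promise<void>;
  shipOrder(orderId: string): Promise<string>;
}>({
  startToCloseTimeout: '30 seconds',
  retry: { maximumAttempts: 3 }
});

export async function orderWorkflow(order: Order): Promise<void> {
  await sendEmail(order.customerEmail);

  const paymentId = await chargePayment(order.total);
  if (!paymentId) {
    await sendCompensationEmail(order);
    throw new Error('Payment failed');
  }

  try {
    await shipOrder(order.id);
  } catch (err) {
    // Compensation: the payment succeeded but shipping failed, so refund before failing
    await refundPayment(paymentId);
    throw err;
  }

  await sendEmail(order.customerEmail, 'Order shipped!');
}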

State Machine (Python + transitions)

from transitions import Machine

class OrderWorkflow:
    states = ['pending', 'paid', 'shipped', 'cancelled']

    def __init__(self):
        self.machine = Machine(
            model=self,
            states=OrderWorkflow.states,
            initial='pending',
            transitions=[
                {'trigger': 'pay', 'source': 'pending', 'dest': 'paid'},
                {'trigger': 'ship', 'source': 'paid', 'dest': 'shipped'},
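                # Cancelling before payment needs no refund
                {'trigger': 'cancel', 'source': 'pending', 'dest': 'cancelled'},
                # Cancelling after payment refunds the customer
                {'trigger': 'cancel', 'source': 'paid', 'dest': 'cancelled',
                 'after': 'refund_payment'}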
            ]
        )

    def refund_payment(self):
        print("Refunding payment...")

order = OrderWorkflow()
order.pay()      # pending -> paid
order.ship()     # paid -> shipped

Camunda BPMN Process

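<?xml version="1.0" encoding="UTF-8"?>
<bpmn:definitions xmlns:bpmn="http://www.omg.org/spec/BPMN/20100524/MODEL"
                  xmlns:camunda="http://camunda.org/schema/1.0/bpmn"
                  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                  targetNamespace="http://example.com/bpmn">
  <bpmn:process id="OrderProcess" isExecutable="true">
    <bpmn:startEvent id="StartEvent" />
    <bpmn:sequenceFlow id="Flow_1" sourceRef="StartEvent" targetRef="CheckInventory" />

    <bpmn:serviceTask id="CheckInventory" camunda:delegateExpression="${inventoryChecker}" />
    <bpmn:sequenceFlow id="Flow_2" sourceRef="CheckInventory" targetRef="Gateway_1" />

    <!-- Flow_4 is the default path, taken when no condition matches -->
    <bpmn:exclusiveGateway id="Gateway_1" default="Flow_4" />
    <bpmn:sequenceFlow id="Flow_3" sourceRef="Gateway_1" targetRef="ProcessPayment">
      <bpmn:conditionExpression xsi:type="bpmn:tFormalExpression">${inventoryAvailable}</bpmn:conditionExpression>
    </bpmn:sequenceFlow>
    <bpmn:sequenceFlow id="Flow_4" sourceRef="Gateway_1" targetRef="NotifyOutOfStock" />

    <!-- The ProcessPayment and NotifyOutOfStock tasks and the end events are omitted for brevity -->
  </bpmn:process>
</bpmn:definitions>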

Explanation

Core concepts:

  • Workflow definition: The blueprint describing steps, transitions, and conditions
  • Activity: A single unit of work (API call, database update, human task)
  • State: The current position in the workflow (persisted for durability)
  • Compensation: Reversing already completed steps when a later step fails
  • Timer: Delaying execution or setting deadlines for activities
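To make the compensation concept concrete, here is a minimal in-memory sketch in Python. It is not how any particular engine implements sagas; the `run_saga` helper, the step names, and the no-op actions are all hypothetical and exist only for illustration. Each step pairs an action with the function that undoes it, and a failure triggers the undo functions of the already completed steps in reverse order.

```python
from typing import Callable, List, Tuple

# (step name, action, compensation) -- all names here are illustrative
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_saga(steps: List[Step], log: List[str]) -> bool:
    """Run steps in order; on failure, undo completed steps in reverse order."""
    completed: List[Step] = []
    for step in steps:
        name, action, _ = step
        try:
            action()
        except Exception:
            log.append(f"failed:{name}")
            # Compensate newest-first so later effects are reversed before earlier ones
            for done_name, _, compensate in reversed(completed):
                compensate()
                log.append(f"compensated:{done_name}")
            return False
        log.append(f"done:{name}")
        completed.append(step)
    return True


def noop() -> None:
    """Stand-in for a real action or compensation."""


def fail_shipping() -> None:
    raise RuntimeError("carrier unavailable")


log: List[str] = []
ok = run_saga(
    [
        ("reserve_stock", noop, noop),
        ("charge_payment", noop, noop),
        ("ship_order", fail_shipping, noop),
    ],
    log,
)
```

In this run the shipping step fails, so `ok` is `False` and the log shows the payment being compensated before the stock reservation. A real engine would also persist the list of completed steps so that compensation survives a crash.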

When to use workflow engines vs. code:

| Complexity | Approach | Example |
|---|---|---|
| 1-2 steps | Direct function calls | Sending a welcome email |
| 3-5 steps | Code with retry logic | Order processing with payment |
| 5+ steps | Workflow engine | Loan approval with 10+ departments |
| Human tasks | BPMN engine | Insurance claim review |

Variants

| Engine | Model | Best For |
|---|---|---|
| Temporal | Code-as-workflow | Developer-centric; durable execution |
| Camunda | BPMN | Business analyst visibility; human tasks |
| Apache Airflow | DAGs | Data pipelines; scheduled workflows |
| Netflix Conductor | JSON DSL | Microservices orchestration |
| AWS Step Functions | State machines | Serverless; AWS integration |

Best Practices

  • Idempotent activities: Running the same activity twice should produce the same result. See message idempotency.
  • Idempotency keys: Pass unique keys to external APIs to prevent double charges
  • Set timeouts on everything: Give every activity an explicit timeout (for example, a 10-minute default) so a hung call cannot stall the workflow indefinitely
  • Version workflow definitions: New deployments shouldn’t break in-flight workflows
  • Query workflow state: Expose APIs to check progress without inspecting internal state
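The first two practices work together: a retried activity is only safe if the downstream API can recognize the retry. The sketch below illustrates this with a hypothetical in-memory `FakePaymentGateway` that loses its first response after recording the charge. The class, the `charge_with_retry` helper, and the key format are assumptions made for illustration, not the API of any real payment provider.

```python
from typing import Dict, Optional


class FakePaymentGateway:
    """Hypothetical in-memory stand-in for a payment API that honors idempotency keys."""

    def __init__(self) -> None:
        self._charges: Dict[str, str] = {}  # idempotency key -> payment id
        self.charge_count = 0
        self._drop_next_response = True

    def charge(self, amount_cents: int, idempotency_key: str) -> str:
        # A repeated key returns the original result instead of charging again
        if idempotency_key in self._charges:
            return self._charges[idempotency_key]
        self.charge_count += 1
        payment_id = f"pay_{self.charge_count}"
        self._charges[idempotency_key] = payment_id
        if self._drop_next_response:
            # Simulate the worst case: the charge happened but the caller never hears back
            self._drop_next_response = False
            raise TimeoutError("response lost after the charge was recorded")
        return payment_id


def charge_with_retry(
    gateway: FakePaymentGateway,
    amount_cents: int,
    idempotency_key: str,
    max_attempts: int = 3,
) -> str:
    """Retry on timeout, reusing the same key so the customer is charged at most once."""
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        try:
            return gateway.charge(amount_cents, idempotency_key)
        except TimeoutError as exc:
            last_error = exc
    raise RuntimeError("charge failed after retries") from last_error


gateway = FakePaymentGateway()
# Derive the key from stable identifiers so every retry of this step sends the same key
key = "order-1234-charge"
payment_id = charge_with_retry(gateway, 4999, key)
```

The first attempt times out after the charge is recorded, and the retry returns the original payment ID, so `gateway.charge_count` stays at 1. Generating a fresh key on each attempt would defeat the purpose.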

Common Mistakes

  1. Tight coupling to orchestrator: Business logic bleeding into workflow definitions makes testing hard
  2. No compensation paths: Failed workflows that already charged the customer need explicit refunds. Learn more in saga pattern.
  3. Polling instead of events: Checking status every 30 seconds wastes resources and adds latency; have the external system call back or signal the workflow instead
  4. Ignoring workflow history: Old completed workflows fill storage; implement retention policies
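  5. No replay testing: Temporal and similar engines rebuild workflow state by replaying recorded history, so non-deterministic workflow code (random values, wall-clock time, direct I/O) breaks on replay; test new code against recorded histories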
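The replay problem in the last item can be sketched with a toy recorder and replayer in Python. This is a simplified model, not Temporal's actual mechanism or API: `record`, `replay`, `NonDeterminismError`, and the workflow functions are hypothetical names. The `coin` callable stands in for any non-deterministic input such as a random number or the current time.

```python
from typing import Callable, Dict, List, Tuple

History = List[Tuple[str, str]]  # (activity name, recorded result)
Schedule = Callable[[str], str]
Workflow = Callable[[Schedule], None]


class NonDeterminismError(Exception):
    """Raised when replayed code asks for a different activity than history recorded."""


def record(workflow: Workflow, activities: Dict[str, Callable[[], str]]) -> History:
    """Run the workflow for real, recording each activity and its result."""
    history: History = []

    def schedule(name: str) -> str:
        result = activities[name]()
        history.append((name, result))
        return result

    workflow(schedule)
    return history


def replay(workflow: Workflow, history: History) -> None:
    """Re-run the workflow, feeding recorded results instead of executing activities."""
    position = 0

    def schedule(name: str) -> str:
        nonlocal position
        if position >= len(history) or history[position][0] != name:
            raise NonDeterminismError(f"step {position}: workflow asked for {name!r}")
        result = history[position][1]
        position += 1
        return result

    workflow(schedule)


def order_workflow(schedule: Schedule) -> None:
    schedule("charge_payment")
    schedule("ship_order")


def make_branching_workflow(coin: Callable[[], bool]) -> Workflow:
    def workflow(schedule: Schedule) -> None:
        schedule("charge_payment")
        # Branching on a value that is not in history is what breaks replay
        schedule("send_sms" if coin() else "send_email")

    return workflow


activities = {
    "charge_payment": lambda: "pay_1",
    "ship_order": lambda: "track_1",
    "send_sms": lambda: "ok",
    "send_email": lambda: "ok",
}

history = record(order_workflow, activities)
replay(order_workflow, history)  # deterministic code replays cleanly

branch_history = record(make_branching_workflow(lambda: False), activities)
try:
    replay(make_branching_workflow(lambda: True), branch_history)
    replay_failed = False
except NonDeterminismError:
    replay_failed = True
```

A replay test in a real project follows the same idea: keep histories from production or staging runs and replay them against the new workflow code before deploying.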

Frequently Asked Questions

Q: When should I use a workflow engine instead of a message queue? A: Use message queues for independent, parallel tasks. Use workflow engines for coordinated, sequential processes with state.

Q: How do workflow engines handle crashes? A: They persist state after each activity. On restart, they resume from the last completed step.
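As a rough illustration of that answer, the Python sketch below checkpoints each step's result to a JSON file and skips recorded steps on the next run. Production engines typically persist a full event history in a database rather than a results file, and the `run_with_checkpoints` helper, step names, and file name are hypothetical.

```python
import json
import os
import tempfile
from typing import Callable, Dict, List, Tuple


def run_with_checkpoints(
    steps: List[Tuple[str, Callable[[], str]]], state_path: str
) -> Dict[str, str]:
    """Run steps in order, persisting each result; skip steps already recorded."""
    results: Dict[str, str] = {}
    if os.path.exists(state_path):
        with open(state_path) as f:
            results = json.load(f)
    for name, action in steps:
        if name in results:
            continue  # completed before the crash, do not run again
        results[name] = action()
        # Persist after every step (a real engine would make this write atomic)
        with open(state_path, "w") as f:
            json.dump(results, f)
    return results


calls: List[str] = []
crash_once = {"armed": True}


def charge() -> str:
    calls.append("charge")
    return "pay_1"


def ship() -> str:
    calls.append("ship")
    if crash_once["armed"]:
        crash_once["armed"] = False
        raise RuntimeError("worker crashed")  # simulated crash mid-workflow
    return "track_1"


steps = [("charge", charge), ("ship", ship)]
state_path = os.path.join(tempfile.mkdtemp(), "workflow_state.json")

try:
    run_with_checkpoints(steps, state_path)
except RuntimeError:
    pass  # the first worker dies after charging

# A restarted worker resumes from the persisted state
final_state = run_with_checkpoints(steps, state_path)
```

After the restart, `calls` shows that `charge` ran only once while `ship` ran twice. That is also why the step interrupted by the crash must be idempotent: the engine cannot know how far it got.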

Q: Can business analysts modify workflows without developers? A: BPMN-based engines (Camunda) allow this. Code-based engines (Temporal) require developers but offer more flexibility.