API Performance Budget Template

Overview

Every API endpoint consumes resources: CPU, memory, database connections, and network bandwidth. Without explicit budgets, teams add features until the system collapses under load. This template defines performance budgets as a contract between product requirements and infrastructure capacity, making trade-offs explicit before they become outages.

When to Use

Use this resource when:

Designing a new API or adding a new endpoint to an existing one
Planning capacity for a marketing campaign or seasonal traffic spike
Investigating why an API is slow under load

Solution

# API Performance Budget: `<API Name>`

## 1. Endpoint Budgets

### 1.1. Latency Budget (p95)

| Endpoint | Target | Current | Buffer | Notes |
|----------|--------|---------|--------|-------|
| `GET /health` | < 10ms | 8ms | 2ms | Must not depend on DB |
| `GET /users/{id}` | < 100ms | 85ms | 15ms | Single DB query allowed |
| `POST /orders` | < 250ms | 210ms | 40ms | Multi-table transaction |
| `GET /reports/aggregate` | < 2s | 1.8s | 200ms | Read replica allowed |

### 1.2. Throughput Budget

| Endpoint | RPS Target | Max Burst | Connection Pool | Notes |
|----------|------------|-----------|-----------------|-------|
| `GET /products` | 5,000 | 10,000 | 50 | Cached in Redis |
| `POST /checkout` | 500 | 1,000 | 20 | DB write, idempotent |

### 1.3. Payload Budget

| Direction | Max Size | Typical | Compression |
|-----------|----------|---------|-------------|
| Request body | 1 MB | 10 KB | Gzip |
| Response body | 5 MB | 50 KB | Gzip + Brotli |
| Header total | 16 KB | 1 KB | None |

### 1.4. Dependency Budget

| Dependency | Max Latency Contribution | Max Concurrent Calls | Fallback |
|------------|--------------------------|----------------------|----------|
| PostgreSQL | 40ms | 20 | Read replica |
| Redis | 5ms | 100 | Skip cache (slower) |
| Payment Gateway | 200ms | 10 | Queue for async |
| Search Service | 100ms | 50 | Basic LIKE query |

## 2. Resource Budgets

| Resource | Baseline | Headroom | Max | Alert At |
|----------|----------|----------|-----|----------|
| CPU / core | 40% | 30% | 70% | 60% |
| Memory / pod | 512 MB | 256 MB | 1 GB | 800 MB |
| DB connections | 50 | 30 | 80 | 70 |
| Network egress | 100 Mbps | 50 Mbps | 200 Mbps | 150 Mbps |

## 3. Budget Enforcement

- [ ] Latency assertions in CI (fail build if p95 exceeds budget)
- [ ] Load test gates in deployment pipeline
- [ ] Payload size validation at API gateway
- [ ] Circuit breaker on dependency calls exceeding max latency
- [ ] Autoscaling triggered at 60% CPU, not 80%

## 4. Review Cadence

| Trigger | Action | Owner |
|---------|--------|-------|
| Weekly | Review dashboards for drift above budget | SRE |
| Monthly | Adjust budgets based on traffic growth | Platform |
| Quarterly | Re-negotiate budgets with product team | Engineering Lead |
| Incident | Post-incident budget audit | Incident Commander |

Explanation

A latency budget is a chain: frontend (100ms) + network (50ms) + API (200ms) + database (40ms) = 390ms total. If any single link exceeds its allocation, the whole user experience degrades. The buffer column exists because real-world performance is noisy. Dependency budgets prevent a slow downstream service from consuming the entire API budget. Throughput budgets define connection pools and cache sizes. Resource budgets prevent one heavy endpoint from starving others on the same host.

Variants

Context	Approach	Notes
GraphQL	Query complexity budget	Limit total field cost per query, not just depth
Streaming API	Throughput per shard	Budget per partition, not global
Mobile SDK	Bundle size + call frequency	Budget includes payload size and number of parallel requests

Best Practices

Set budgets during design, not after launch
Include budgets in API design review checklists
Fail CI builds when latency tests exceed budget by any margin
Reserve 20-30% buffer in every budget for traffic spikes
Document who can approve budget exceptions (usually engineering lead)

Common Mistakes

Basing budgets on best-case lab conditions instead of production percentiles
Ignoring dependency latency when calculating endpoint budgets
Setting one global budget for all endpoints regardless of criticality
Treating budgets as one-time decisions instead of living documents
Failing to enforce budgets in CI or deployment gates

Frequently Asked Questions

What happens when an endpoint exceeds its budget?

First, optimize: add caching, denormalize queries, or paginate. If the endpoint fundamentally needs more time, escalate to the engineering lead who can either approve a budget increase or downgrade the SLO. Never silently ignore budget violations.

How do I budget for endpoints with unknown traffic patterns?

Start with conservative estimates based on similar endpoints. Monitor for 2 weeks after launch, then adjust. Use feature flags to ramp traffic gradually so you can observe real performance before committing to a budget.

Should budgets apply to batch / async endpoints?

Yes, but use different metrics. For async, budget on queue depth, processing rate (items/second), and end-to-end latency from enqueue to completion. Throughput matters more than response time for batch jobs.

API Performance Budget Template

Overview

When to Use

Solution

Explanation

Variants

Best Practices

Common Mistakes

Frequently Asked Questions

What happens when an endpoint exceeds its budget?

How do I budget for endpoints with unknown traffic patterns?

Should budgets apply to batch / async endpoints?

API Lifecycle Management Template

API Monitoring & Alerting Template

Microservice Contract Template

Service Dependency Map Template

System Diagram Template

Overview

When to Use

Solution

Explanation

Variants

Best Practices

Common Mistakes

Frequently Asked Questions

What happens when an endpoint exceeds its budget?

How do I budget for endpoints with unknown traffic patterns?

Should budgets apply to batch / async endpoints?

Related Resources

API Lifecycle Management Template

API Monitoring & Alerting Template

Microservice Contract Template

Service Dependency Map Template

System Diagram Template