Health Endpoint Monitoring Pattern
Expose lightweight health endpoints so orchestrators, load balancers, and monitoring tools can verify service availability.
Note: This guide follows English-language naming conventions and terminology standards common in international development teams. Examples use English identifiers and comments to maximize compatibility across codebases and tooling.
Overview
The Health Endpoint Monitoring Pattern exposes lightweight endpoints that report whether a service is alive and ready to handle traffic. Load balancers, container orchestrators, and monitoring tools can call these endpoints to decide whether to route traffic to an instance or restart it.
This pattern underpins self-healing systems and matters most for services that run in dynamic environments, where instances can fail or restart at any time.
When to Use
Use this pattern when:
- You run services in containers or behind a load balancer
- You want an orchestrator to restart unhealthy instances automatically
- You need to distinguish between “the process is running” and “the service is usable”
- You want to add dependency health checks without modifying client code
- You need to surface health data to a monitoring dashboard or alert system
Solution
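// Express health endpoints with liveness and readiness probes
const express = require('express');
const app = express();

// Placeholder dependency checks: replace with real pings to your database
// and cache. Each should resolve to true when the dependency is reachable.
async function checkDatabaseConnection() { return true; }
async function checkCacheConnection() { return true; }

// Liveness: answers as long as the process can serve HTTP; no dependencies.
app.get('/health/live', (req, res) => {
  res.status(200).json({ status: 'alive' });
});

// Readiness: 200 only when every required dependency is healthy.
app.get('/health/ready', async (req, res) => {
  try {
    const [dbHealthy, cacheHealthy] = await Promise.all([
      checkDatabaseConnection(),
      checkCacheConnection(),
    ]);
    const ready = dbHealthy && cacheHealthy;
    res.status(ready ? 200 : 503).json({ status: ready ? 'ready' : 'not ready' });
  } catch (err) {
    // A check that throws counts as not ready instead of leaving the request hanging.
    res.status(503).json({ status: 'not ready' });
  }
});

app.listen(3000);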
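# Kubernetes liveness and readiness probes
# apps/v1 Deployments require a selector that matches the pod template labels;
# the "app: api-service" label value below is illustrative.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-service
spec:
  selector:
    matchLabels:
      app: api-service
  template:
    metadata:
      labels:
        app: api-service
    spec:
      containers:
        - name: api
          image: api:latest
          livenessProbe:
            httpGet:
              path: /health/live
              port: 3000
            initialDelaySeconds: 10
            periodSeconds: 15
          readinessProbe:
            httpGet:
              path: /health/ready
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 10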
Explanation
Health endpoints separate two concerns:
- Liveness: the process is running and should not be restarted. If liveness fails, the orchestrator kills the container and starts a new one.
- Readiness: the service is ready to accept traffic. If readiness fails, the load balancer stops sending requests but does not restart the instance.
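The difference between the two probes can be illustrated with a small model of the orchestrator's decision. This is only a sketch of the behaviour described above, not Kubernetes source code; `isProbeSuccess` and `decideAction` are hypothetical names, and the threshold of three consecutive failures is an assumption chosen to mirror a typical `failureThreshold` setting.

```javascript
// Kubernetes HTTP probes treat any status from 200 to 399 as success.
function isProbeSuccess(statusCode) {
  return statusCode >= 200 && statusCode < 400;
}

// Decide what happens after `failureThreshold` consecutive failed probes.
// `recentStatusCodes` is the probe history, oldest first.
function decideAction(probeType, recentStatusCodes, failureThreshold = 3) {
  const lastN = recentStatusCodes.slice(-failureThreshold);
  const failing =
    lastN.length === failureThreshold &&
    lastN.every((code) => !isProbeSuccess(code));

  if (!failing) return 'none';
  // Liveness failures restart the container; readiness failures only
  // take the instance out of rotation.
  return probeType === 'liveness'
    ? 'restart container'
    : 'remove from load balancing';
}
```

A single failed probe does nothing in this model; only a run of consecutive failures triggers an action, which is why transient slowness should not cause restarts.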
By checking dependencies such as databases, caches, and message queues, readiness probes prevent traffic from reaching an instance that cannot serve requests correctly. This improves reliability and reduces error rates during deployments or outages.
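One way to combine several dependency checks into a single readiness result is sketched below. It assumes each check is an async function that resolves to a truthy value when the dependency is reachable; the helper names `withTimeout` and `evaluateReadiness`, the `failed` field, and the one-second default timeout are illustrative choices, not part of Express or any library.

```javascript
// Resolve to false if the check does not settle within `ms` milliseconds.
function withTimeout(check, ms) {
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve(false), ms);
  });
  const attempt = Promise.resolve()
    .then(check)
    .then(Boolean)
    .catch(() => false); // a thrown error counts as unhealthy
  return Promise.race([attempt, timeout]).finally(() => clearTimeout(timer));
}

// `checks` maps a dependency name to an async function returning true/false.
async function evaluateReadiness(checks, timeoutMs = 1000) {
  const names = Object.keys(checks);
  // Run all checks in parallel so one slow dependency does not delay the rest.
  const results = await Promise.all(
    names.map((name) => withTimeout(checks[name], timeoutMs))
  );
  const failed = names.filter((_, i) => !results[i]);
  const ready = failed.length === 0;
  return {
    httpStatus: ready ? 200 : 503,
    status: ready ? 'ready' : 'not ready',
    failed,
  };
}
```

In a route handler, the result could be returned with `res.status(result.httpStatus).json(result)`. The timeout should stay below the probe timeout configured in the orchestrator, so that the endpoint answers 503 itself rather than being counted as unresponsive.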
Variants
| Endpoint | Purpose | Response |
|---|---|---|
| Liveness | Is the process alive? | 200 when running; an error status or no response otherwise |
| Readiness | Can it handle traffic? | 200 when dependencies are healthy, 503 otherwise |
| Startup | Has it finished starting? | 200 after initialization is complete, 503 before |
| Deep health | Detailed subsystem status | JSON with per-dependency health |
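A deep health response could be assembled along the following lines. This is a minimal sketch: the payload shape (`status`, `checks`, `latencyMs`) and the function name `buildHealthReport` are assumptions made for illustration, not a standard format.

```javascript
// Build a deep-health payload from individual check results.
// `results` is an array of { name, healthy, latencyMs } objects.
// `startupComplete` is false while the service is still initializing.
function buildHealthReport(results, startupComplete) {
  const checks = {};
  for (const { name, healthy, latencyMs } of results) {
    checks[name] = { status: healthy ? 'up' : 'down', latencyMs };
  }
  const allHealthy = results.every((r) => r.healthy);

  let status;
  if (!startupComplete) {
    status = 'starting';
  } else {
    status = allHealthy ? 'ready' : 'not ready';
  }

  return {
    httpStatus: status === 'ready' ? 200 : 503,
    body: { status, checks },
  };
}
```

The same function can back both a startup and a deep health endpoint: the startup endpoint looks only at `httpStatus`, while a dashboard reads the per-dependency entries in `body.checks`.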
Best Practices
- Keep the liveness probe lightweight and dependency-free
- Make the readiness probe reflect the actual ability to serve requests
- Return consistent status codes (200 for healthy, 503 for unhealthy)
- Avoid heavy operations in health checks to prevent false failures
- Add timeouts and retry budgets for dependency checks
- Log health check failures for debugging but do not spam logs on every call
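The logging advice can be sketched as a helper that writes a log line only when a check changes state, so a dependency that stays down for an hour produces one entry instead of hundreds. `createTransitionLogger` is a hypothetical helper, and logging through `console.warn` by default is an assumption.

```javascript
// Log a health check result only when it differs from the previous result
// for the same check name.
function createTransitionLogger(log = console.warn) {
  const lastState = new Map(); // check name -> last observed healthy flag

  return function record(name, healthy) {
    const previous = lastState.get(name);
    lastState.set(name, healthy);
    // The first observation is logged only if it is a failure.
    const changed = previous === undefined ? !healthy : previous !== healthy;
    if (changed) {
      log(`health check "${name}" is now ${healthy ? 'healthy' : 'unhealthy'}`);
    }
    return changed;
  };
}
```

Calling `record('db', dbHealthy)` on every probe keeps the failure and the recovery visible in the logs without repeating the same message on each call.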
Common Mistakes
- Using a single endpoint that returns OK even when the service is broken
- Making health checks depend on external services that are not critical
- Returning 500 for liveness when only a dependency has failed, causing unnecessary restarts
- Forgetting to test readiness probes during deployment rollouts
- Exposing health endpoints publicly without authentication or rate limiting
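To avoid tying readiness to services that are not critical, each dependency can be tagged and only the critical ones allowed to fail the probe. The sketch below assumes a `critical` flag on each dependency and a `degraded` status for non-critical failures; both are illustrative conventions, as is the function name `classifyReadiness`.

```javascript
// Only dependencies marked `critical` can make the instance not ready.
// `dependencies` is an array of { name, healthy, critical } objects.
function classifyReadiness(dependencies) {
  const failed = dependencies.filter((d) => !d.healthy);
  const criticalFailed = failed.filter((d) => d.critical).map((d) => d.name);
  const optionalFailed = failed.filter((d) => !d.critical).map((d) => d.name);

  if (criticalFailed.length > 0) {
    return { httpStatus: 503, status: 'not ready', failed: criticalFailed };
  }
  if (optionalFailed.length > 0) {
    // Keep serving traffic, but surface the degradation for dashboards.
    return { httpStatus: 200, status: 'degraded', failed: optionalFailed };
  }
  return { httpStatus: 200, status: 'ready', failed: [] };
}
```

With this split, an outage in an optional service such as a recommendations API shows up in monitoring but does not pull every instance out of the load balancer.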
Frequently Asked Questions
Q: Should a liveness probe check the database?
A: No. Liveness should only verify that the process is running. If the database is down, a readiness probe should fail, not the liveness probe.
Q: What status code should a readiness probe return when unhealthy?
A: Return 503 Service Unavailable. This tells the orchestrator to stop routing traffic without restarting the container.
Q: Can I expose health endpoints to the public internet?
A: Only if they do not leak sensitive information. Internal deep-health endpoints should be protected by network policies or authentication.