intermediate

Caching & Memoization

How to cache expensive computations and API responses using in-memory, LRU, and distributed caches across Python, JavaScript, and Java.

Topics: data

Overview

Caching stores the result of expensive computations so that subsequent requests for the same data can be served faster. Memoization is a specific form of caching where function return values are cached based on their arguments.

Caching is one of the most effective performance optimizations, but it introduces complexity: stale data, cache invalidation, and distributed consistency.

When to Use

Use this recipe when:

  • Calling expensive database queries or API endpoints repeatedly
  • Computing complex mathematical or statistical results
  • Serving static or slowly-changing configuration data
  • Reducing latency in high-traffic read-heavy systems
  • Offloading load from downstream services

Solution

Python

from functools import lru_cache
from cachetools import TTLCache  # third-party package: cachetools

# Built-in LRU memoization
@lru_cache(maxsize=128)
def fibonacci(n):
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

print(fibonacci(100))  # 354224848179261915075; fast because each subproblem is computed once

# TTL cache with expiration
api_cache = TTLCache(maxsize=100, ttl=300)  # 5 minutes

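def fetch_user(user_id):
    # `db` stands for the application's database handle (not defined here).
    # A single .get() avoids the entry expiring between a membership test and the read.
    user = api_cache.get(user_id)
    if user is not None:
        return user
    user = db.query("SELECT * FROM users WHERE id = %s", user_id)
    api_cache[user_id] = user
    return user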

JavaScript

// Simple memoization (keys are built with JSON.stringify, so arguments must be JSON-serializable)
function memoize(fn) {
  const cache = new Map();
  return (...args) => {
    const key = JSON.stringify(args);
    if (cache.has(key)) return cache.get(key);
    const result = fn(...args);
    cache.set(key, result);
    return result;
  };
}

const fib = memoize((n) => (n < 2 ? n : fib(n - 1) + fib(n - 2)));
console.log(fib(50)); // 12586269025; fib(100) would exceed Number.MAX_SAFE_INTEGER and lose precision

// LRU cache with size limit
class LRUCache {
  constructor(capacity) {
    this.capacity = capacity;
    this.cache = new Map();
  }
  get(key) {
    if (!this.cache.has(key)) return undefined;
    const value = this.cache.get(key);
    this.cache.delete(key);
    this.cache.set(key, value); // Move to end (most recent)
    return value;
  }
  set(key, value) {
    if (this.cache.has(key)) this.cache.delete(key);
    else if (this.cache.size >= this.capacity) {
      const first = this.cache.keys().next().value;
      this.cache.delete(first);
    }
    this.cache.set(key, value);
  }
}

Java

import com.github.benmanes.caffeine.cache.*;
import java.time.Duration;

Cache<String, User> userCache = Caffeine.newBuilder()
    .maximumSize(100)
    .expireAfterWrite(Duration.ofMinutes(5))
    .build();

// Get or compute (db and userId are assumed to exist in the surrounding code)
User user = userCache.get(userId, id -> db.findById(id));

// Manual put
userCache.put(userId, updatedUser);

// Invalidate
userCache.invalidate(userId);

Cache Invalidation Strategies

Strategy            | When to Use               | Trade-off
TTL (Time To Live)  | Data changes predictably  | May serve stale data briefly
Write-through       | Consistency is critical   | Slower writes, simpler reads
Write-behind        | High write throughput     | Risk of data loss on crash
Cache-aside         | Flexibility, read-heavy   | Application manages cache logic
Eviction (LRU/LFU)  | Memory constraints        | May evict hot data prematurely
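
To make two of these rows concrete, here is a minimal sketch contrasting cache-aside with write-through. `SlowStore`, `CacheAside`, and `WriteThrough` are hypothetical names invented for this illustration, not part of any library, and the plain dict cache is left unbounded only to keep the example short.

```python
class SlowStore:
    """Hypothetical backing store standing in for a database."""

    def __init__(self):
        self.rows = {}
        self.reads = 0  # counts how often the store is actually hit

    def load(self, key):
        self.reads += 1
        return self.rows.get(key)

    def save(self, key, value):
        self.rows[key] = value


class CacheAside:
    """Reads fill the cache on a miss; writes drop the cached entry."""

    def __init__(self, store):
        self.store = store
        self.cache = {}  # unbounded for brevity; bound it in real code

    def get(self, key):
        if key in self.cache:
            return self.cache[key]
        value = self.store.load(key)
        self.cache[key] = value
        return value

    def put(self, key, value):
        self.store.save(key, value)
        self.cache.pop(key, None)  # invalidate; the next read reloads


class WriteThrough(CacheAside):
    """Writes update the store and the cache in the same call."""

    def put(self, key, value):
        self.store.save(key, value)
        self.cache[key] = value
```

With cache-aside, the first read after a write pays for a store lookup. With write-through, that read is a hit, at the cost of doing cache work on every write.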

Best Practices

  • Cache at the right level: Don’t cache everything. Cache the most expensive and most frequently accessed data.
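  • Set TTLs thoughtfully: Too short and the cache rarely serves a hit. Too long and users see stale data.
  • Monitor hit rates: As a rule of thumb, a cache with a hit rate below 80% is usually not worth the complexity, though the break-even point depends on how expensive a miss is.
  • Handle cache failures gracefully: If the cache (for example, Redis) is down, fall back to the database. Don’t fail the request.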
  • Version cache keys: Include the data version or app version in the key to prevent stale data after deployments.
  • Invalidate proactively: Clear cache entries when underlying data changes, not just when TTL expires.
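
Three of the practices above (versioned keys, graceful fallback, and hit-rate monitoring) could be sketched roughly as follows. `CACHE_VERSION`, `versioned_key`, `get_with_fallback`, and `hit_rate` are hypothetical helpers made up for this example. The `cache` argument is assumed to be any object with `get(key)` and `set(key, value)` methods. The only library call relied on is `cache_info()` from `functools.lru_cache`.

```python
from functools import lru_cache

CACHE_VERSION = "v2"  # hypothetical; bump when the cached data's shape changes


def versioned_key(kind, ident):
    """Build a key such as 'v2:user:42' so entries from older versions are never read."""
    return f"{CACHE_VERSION}:{kind}:{ident}"


def get_with_fallback(cache, key, load):
    """Try the cache first; if the cache itself fails, serve from the loader."""
    try:
        value = cache.get(key)
        if value is not None:  # None is treated as a miss in this sketch
            return value
    except Exception:
        # Cache unavailable: answer from the source instead of failing.
        # Real code would catch the client's specific connection error.
        return load()
    value = load()
    try:
        cache.set(key, value)
    except Exception:
        pass  # a failed cache write should not fail the request
    return value


@lru_cache(maxsize=256)
def square(n):
    return n * n


def hit_rate(cached_fn):
    """Compute hits / (hits + misses) from lru_cache statistics."""
    info = cached_fn.cache_info()
    total = info.hits + info.misses
    return info.hits / total if total else 0.0
```

Calling `hit_rate(square)` periodically and exporting the value to a metrics system is one way to notice a cache that is not earning its keep.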

Common Mistakes

  • Caching data that changes too frequently or is rarely requested
  • Not handling cache stampede (thundering herd) when TTL expires
  • Storing unbounded caches that grow until out-of-memory
  • Ignoring cache consistency in distributed systems
  • Forgetting to invalidate cache after mutations

Frequently Asked Questions

Q: What is cache stampede and how do I prevent it? A: Cache stampede happens when many requests miss on the same key at once (often right after its TTL expires) and all recompute the value simultaneously. Use locking, per-key semaphores, or probabilistic early expiration.
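
The locking approach can be sketched with one lock per key, so that only the first caller on a miss runs the expensive loader while the others wait and reuse its result. `SingleFlightCache` is a hypothetical class written for this illustration. It is in-process only and leaves out TTLs and eviction to stay short.

```python
import threading
from collections import defaultdict


class SingleFlightCache:
    """On a miss, one thread per key runs the loader; the rest wait and reuse it."""

    def __init__(self):
        self._values = {}
        self._key_locks = defaultdict(threading.Lock)
        self._guard = threading.Lock()  # protects _key_locks

    def get(self, key, load):
        if key in self._values:
            return self._values[key]
        with self._guard:
            key_lock = self._key_locks[key]
        with key_lock:
            # Re-check: another thread may have filled the entry while we waited.
            if key in self._values:
                return self._values[key]
            value = load()
            self._values[key] = value
            return value
```

Across multiple application instances the same idea needs a shared lock (for example, one held in the distributed cache itself), which this sketch does not cover.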

Q: When should I use Redis instead of in-memory caching? A: Use Redis when you need shared cache across multiple application instances, persistence, or advanced data structures.

Q: Should I cache API responses? A: Yes, if the data is cacheable and the endpoint is read-heavy. Use the Cache-Control header to communicate cacheability to clients and CDNs.
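
As a small illustration of the Cache-Control answer, the helper below builds the header for a response. `cache_headers` is a hypothetical function, not part of any web framework. The directives it emits (`public`, `private`, `max-age`, `no-store`) are standard HTTP caching directives.

```python
def cache_headers(max_age_seconds, shared=True):
    """Build response headers telling clients and CDNs how long to reuse a response."""
    if max_age_seconds <= 0:
        # no-store asks clients and intermediaries not to keep a copy at all
        return {"Cache-Control": "no-store"}
    # public allows shared caches such as CDNs; private limits reuse to the client
    scope = "public" if shared else "private"
    return {"Cache-Control": f"{scope}, max-age={max_age_seconds}"}
```

A read-heavy endpoint serving the same data to everyone might use `cache_headers(300)`, while a per-user response would pass `shared=False` so CDNs do not store it.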