Caching Strategies: Where to Cache and What to Invalidate
Phil Karlton’s observation that cache invalidation is one of the two hard problems in computer science has become a joke repeated so often it’s lost its teeth. It shouldn’t have. Cache invalidation is genuinely hard, not because the concept is complex, but because the consequences of getting it wrong are subtle and the bugs are often invisible until a user notices something wrong.
A cache serves stale data silently. It doesn’t crash. It doesn’t throw an error. It just returns something that used to be true.
Before getting into strategies, a useful framing: a cache is a contract. You are saying “this data is stable enough to serve repeatedly without re-fetching from the source.” Understanding what that stability actually looks like for your data is the prerequisite for invalidating it correctly.
The layers where caching happens
Caching is not one thing. It happens at multiple layers of a system, each with different characteristics.
Browser cache
The browser caches HTTP responses based on headers the server sets. Cache-Control is the main one:
Cache-Control: max-age=3600, public
This tells the browser (and any intermediate caches) to serve the cached response for up to 3600 seconds before re-fetching. public means the response can be stored by shared caches like CDNs.
For static assets (JS bundles, CSS, images) that change with each deployment, the standard approach is content-based cache busting: include a hash of the file contents in the filename. The URL changes when the file changes, so you can set max-age to a year or more. When you deploy, the new filename bypasses all caches.
Cache-Control: max-age=31536000, immutable
immutable tells the browser not to revalidate even when the user refreshes - the content will never change at this URL.
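As a rough illustration of content-based cache busting, here is a minimal sketch of how a build step might derive a hashed filename. The function name `hashed_filename`, the 8-character digest length, and the choice of SHA-256 are assumptions made for this example; real bundlers have their own naming schemes.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_filename(filename: str, content: bytes, digest_chars: int = 8) -> str:
    """Return the filename with a short content hash before the extension."""
    digest = hashlib.sha256(content).hexdigest()[:digest_chars]
    path = PurePosixPath(filename)
    # "app.js" -> "app.<hash>.js"; the name changes whenever the bytes change.
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))


old_name = hashed_filename("static/app.js", b"console.log('v1');")
new_name = hashed_filename("static/app.js", b"console.log('v2');")
# old_name and new_name differ, so the new deploy never collides with cached copies.
```

Because the URL is derived from the bytes, an unchanged file keeps its URL (and its cached copies) across deployments.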
For HTML pages or API responses that change, you have two options: short TTL (accept some staleness), or Cache-Control: no-cache (re-validate every time, using ETags or Last-Modified).
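To make the re-validation option concrete, here is a simplified sketch of the server side of an ETag check. The helper names `make_etag` and `respond` are hypothetical, and the comparison is deliberately naive: a real server would also handle weak validators, lists of tags, and `*` in If-None-Match.

```python
import hashlib
from typing import Optional


def make_etag(body: bytes) -> str:
    """Derive an ETag from the response body (quoted, per HTTP syntax)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str]) -> tuple[int, dict[str, str], bytes]:
    """Return (status, headers, body) for a GET that may carry If-None-Match."""
    etag = make_etag(body)
    headers = {"Cache-Control": "no-cache", "ETag": etag}
    if if_none_match == etag:
        # The client's copy is still current: send headers only, no body.
        return 304, headers, b""
    return 200, headers, body
```

The client still makes a request every time, but an unchanged resource costs a small 304 response instead of a full download.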
CDN cache
CDNs sit between your users and your origin server. They cache responses at edge nodes geographically close to users. The same Cache-Control headers that tell browsers what to cache tell CDNs what to cache.
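CDNs add the ability to invalidate programmatically - when you deploy new content, you can purge affected URLs from the CDN cache. A purge is not instant (it has to propagate to every edge node), but it is much faster than waiting for TTLs to expire on their own. It also only reaches the CDN: copies already sitting in browser caches stay there until their own TTL runs out.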
For user-specific or authenticated responses, be careful: CDN caches are shared. A response with a user’s private data must not be cached by a CDN. The Cache-Control: private directive tells the CDN not to store it (only the browser can).
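One way to keep these rules from being forgotten is to centralize the header choice. The sketch below is an illustrative policy, not a recommendation for every application; the function name `cache_control_header` and its parameters are made up for this example.

```python
def cache_control_header(*, user_specific: bool, fingerprinted: bool = False,
                         max_age: int = 300) -> str:
    """Pick a Cache-Control value for a response (illustrative policy only)."""
    if user_specific:
        # Shared caches such as CDNs must not store this response.
        return f"private, max-age={max_age}"
    if fingerprinted:
        # Content-hashed URL: safe to cache for a year without revalidation.
        return "public, max-age=31536000, immutable"
    return f"public, max-age={max_age}"
```

Checking `user_specific` first means a response can never be marked public by accident just because another flag was also set.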
Application-level cache
This is what most developers think of when they hear “caching” - storing the result of expensive operations in memory or Redis, keyed by some identifier.
def get_user_profile(user_id: str) -> UserProfile:
    cache_key = f"user:profile:{user_id}"
    cached = redis.get(cache_key)
    if cached is not None:  # hit: return the cached copy
        return deserialize(cached)
    # miss: fetch from the source, then populate the cache
    profile = db.fetch_user_profile(user_id)
    redis.setex(cache_key, 300, serialize(profile))  # TTL: 5 minutes
    return profile
This is cache-aside (also called lazy loading) - the most common application caching pattern. The cache is populated on first access. On cache miss, you fetch from the source and populate. On cache hit, you return the cached value.
The obvious downside: the first request after a cache miss is slow. Under high load, a cache miss on a popular key can cause a thundering herd - many simultaneous requests all see a miss and all go to the database at once. Solutions include locking (only one request populates the cache), probabilistic early expiration (each reader has a small, rising chance of refreshing the entry shortly before its TTL expires), or background refresh.
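Probabilistic early expiration is the least intuitive of these, so here is a minimal sketch. It uses an exponential-jitter rule (sometimes called XFetch) and an in-process dict as a stand-in for a shared cache. The names `should_refresh_early`, `fetch`, and `beta` are assumptions for this example, not part of any library.

```python
import math
import random
import time
from typing import Any, Callable

# In-process stand-in for a shared cache: key -> (value, recompute_seconds, expires_at).
_store: dict[str, tuple[Any, float, float]] = {}


def should_refresh_early(now: float, expires_at: float, recompute_seconds: float,
                         beta: float = 1.0,
                         rand: Callable[[], float] = random.random) -> bool:
    """Return True with a probability that rises as expiry approaches."""
    # 1 - rand() lies in (0, 1], so log() is defined and <= 0.
    jitter = -recompute_seconds * beta * math.log(1.0 - rand())
    return now + jitter >= expires_at


def fetch(key: str, ttl: float, recompute: Callable[[], Any]) -> Any:
    now = time.monotonic()
    entry = _store.get(key)
    if entry is not None:
        value, recompute_seconds, expires_at = entry
        if not should_refresh_early(now, expires_at, recompute_seconds):
            return value
    # Miss, expired, or chosen for early refresh: rebuild and record the cost.
    started = time.monotonic()
    value = recompute()
    recompute_seconds = time.monotonic() - started
    _store[key] = (value, recompute_seconds, time.monotonic() + ttl)
    return value
```

The jitter scales with how long the rebuild took, so expensive entries start refreshing earlier. Raising `beta` above 1 makes early refreshes more likely; an already expired entry always refreshes.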
Database query cache
Some databases have internal caches for query results. PostgreSQL doesn’t have a user-visible query result cache, but it keeps recently used data pages in memory in its shared buffers. Other databases have had explicit query result caches: MySQL’s was deprecated in 5.7 and removed in 8.0, because it was easy to misconfigure and its invalidation was crude (any write to a table discarded every cached result for that table).
In practice, when developers talk about “database caching” they usually mean application-level caching sitting in front of database queries, not database-internal caching.
Invalidation strategies
Cache invalidation is deciding when the cached value is no longer valid. There are three real approaches:
Time-based expiry (TTL)
Set a time-to-live. After it expires, the next access re-fetches from source.
Simple to implement. The tradeoff is staleness: you’re accepting that the cache might serve data that’s up to one full TTL out of date. This is fine for data that doesn’t change often or where short-term staleness is acceptable.
Choose TTL based on acceptable staleness for the data, not based on what’s convenient. A product catalog might be fine with 15-minute caching. A user’s account balance should not be cached with a 5-minute TTL if accurate reads matter.
Long TTLs amplify the impact of bugs. If you cache the wrong value for 24 hours, that’s 24 hours of incorrect data.
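To see the staleness window directly, here is a small in-memory TTL cache with an injectable clock. The class `TTLCache` is a teaching sketch written for this article, not a replacement for Redis or a production cache library.

```python
import time
from typing import Any, Callable, Optional


class TTLCache:
    """Tiny in-memory cache whose entries expire ttl seconds after being set."""

    def __init__(self, ttl: float, clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl
        self._clock = clock
        self._entries: dict[str, tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired: drop it so the caller re-fetches from the source.
            del self._entries[key]
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)


# A fake clock makes the staleness window visible without sleeping.
now = [0.0]
cache = TTLCache(ttl=300, clock=lambda: now[0])
cache.set("price", 10)
now[0] = 299.0
still_cached = cache.get("price")  # served even if the source changed at t=1
now[0] = 300.0
after_expiry = cache.get("price")  # gone: the next read goes to the source
```

Nothing in the cache can tell whether the source changed at second 1 or never. The TTL is the only bound on how wrong the answer can be.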
Event-driven invalidation
When the source data changes, immediately invalidate the cache entry.
def update_user_profile(user_id: str, data: dict) -> UserProfile:
    profile = db.update_user_profile(user_id, data)
    redis.delete(f"user:profile:{user_id}")  # Invalidate immediately
    return profile
This is more work - every write path must know which cache keys to invalidate. It also requires that all writes go through code you control. A direct database update from a migration script or admin tool bypasses the cache invalidation.
The benefit: near-zero staleness. Users see updated data immediately after a write.
A common variant is write-through: update the cache at write time instead of deleting it. This avoids the cache miss (and potential thundering herd) on the next read, but requires that the write path can also compute the cached representation, and concurrent writers need care so that an older value does not overwrite a newer one.
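A minimal sketch of the write-through idea, using plain dicts as hypothetical stand-ins for the database and the cache, and JSON as an assumed serialization format. It ignores concurrency entirely.

```python
import json

# Hypothetical in-memory stand-ins for the database and the cache.
database: dict[str, dict] = {}
cache: dict[str, str] = {}


def update_user_profile_write_through(user_id: str, data: dict) -> dict:
    """Write to the source first, then overwrite the cached copy."""
    profile = {**database.get(user_id, {}), **data}
    database[user_id] = profile
    # Store the serialized representation that readers expect to find.
    cache[f"user:profile:{user_id}"] = json.dumps(profile)
    return profile


def get_user_profile(user_id: str) -> dict:
    cached = cache.get(f"user:profile:{user_id}")
    if cached is not None:
        return json.loads(cached)
    # Fallback for entries that were evicted or never written through.
    profile = database[user_id]
    cache[f"user:profile:{user_id}"] = json.dumps(profile)
    return profile
```

With a real cache you would likely still set a TTL on the written entry, as a safety net for writes that bypass this code path.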
Versioned keys
Instead of invalidating a key, change the key itself.
def cache_key(user_id: str, version: int) -> str:
    return f"user:profile:{user_id}:v{version}"
When the data changes, increment the version. Old cache entries become unreachable (and are eventually evicted). This sidesteps the invalidation problem by making stale data unreachable rather than deleted.
Works well when the version can be stored cheaply (in the database row, for example). Leaves stale entries sitting in cache until eviction - not a problem if your cache has reasonable eviction policies and sufficient size.
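Here is a sketch of how the read and write paths might fit together, assuming the version lives in the same row as the data. The dicts `rows` and `cache` and the helpers `update_name` and `get_profile` are hypothetical stand-ins.

```python
# Hypothetical in-memory stand-ins: rows carry a version column; the cache is a dict.
rows: dict[str, dict] = {"42": {"version": 1, "name": "Ada"}}
cache: dict[str, dict] = {}


def cache_key(user_id: str, version: int) -> str:
    return f"user:profile:{user_id}:v{version}"


def update_name(user_id: str, name: str) -> None:
    row = rows[user_id]
    row["name"] = name
    row["version"] += 1  # bump the version in the same write as the data


def get_profile(user_id: str) -> dict:
    # Reading the current version is assumed to be much cheaper than
    # building the full profile.
    version = rows[user_id]["version"]
    key = cache_key(user_id, version)
    if key not in cache:
        cache[key] = {"name": rows[user_id]["name"]}  # stands in for the expensive build
    return cache[key]
```

After an update, readers compute a new key and miss. The old entry is never read again and simply waits for eviction.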
What not to cache
Some things should never be cached, or should be cached with extreme care:
User-specific sensitive data in shared caches. Authentication tokens, session state, financial data. The risk of serving one user’s data to another is severe. If you cache user-specific data, make sure the cache key includes the user’s identity and that the response can never land in a shared HTTP cache such as a CDN.
Results of operations that must reflect the latest state. Inventory levels, seat availability, anything transactional where correctness matters more than speed. Caching can introduce race conditions where stale data leads to overselling or double-booking.
Data that changes more often than the TTL. A 5-minute cache on data that changes every 30 seconds is mostly serving stale data and adding complexity for little benefit.
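The arithmetic behind that last point can be written down directly. This is a back-of-the-envelope sketch that assumes the data changes at a regular interval and that the entry is refilled as soon as it expires; `min_stale_fraction` is a name invented for this example.

```python
def min_stale_fraction(ttl_seconds: float, change_interval_seconds: float) -> float:
    """Lower bound on the share of an entry's lifetime during which it is stale."""
    if ttl_seconds <= 0 or change_interval_seconds <= 0:
        raise ValueError("both arguments must be positive")
    # The entry can be fresh for at most one change interval after it is filled.
    fresh_share = min(1.0, change_interval_seconds / ttl_seconds)
    return 1.0 - fresh_share


# 5-minute TTL on data that changes every 30 seconds.
stale_share = min_stale_fraction(ttl_seconds=300, change_interval_seconds=30)
```

For the 5-minute TTL and 30-second change interval above, the bound comes out at roughly 0.9: the entry is out of date for at least nine tenths of its life.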
Cache stampede and warming
Two practical problems worth knowing about.
A cache stampede (or thundering herd) happens when a hot cache entry expires and many concurrent requests simultaneously see a miss. All of them go to the database. If the database query is slow, they all pile up. The mitigation is to serialize cache population - use a distributed lock so only one request rebuilds the cache while others wait or serve the expired value briefly.
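The serialization idea can be sketched in a single process with one lock per key. This example uses `threading.Lock` as a stand-in for a distributed lock: across several servers you would need something shared instead, for example a Redis key set with the NX option and an expiry. The names `get_or_rebuild` and `_lock_for` are assumptions for illustration.

```python
import threading
from typing import Any, Callable

_cache: dict[str, Any] = {}
_locks: dict[str, threading.Lock] = {}
_locks_guard = threading.Lock()


def _lock_for(key: str) -> threading.Lock:
    # Guard the lock table itself so two threads never create two locks for one key.
    with _locks_guard:
        return _locks.setdefault(key, threading.Lock())


def get_or_rebuild(key: str, rebuild: Callable[[], Any]) -> Any:
    value = _cache.get(key)
    if value is not None:
        return value
    with _lock_for(key):
        # Re-check: another thread may have rebuilt the entry while we waited.
        value = _cache.get(key)
        if value is None:
            value = rebuild()
            _cache[key] = value
        return value
```

The second lookup inside the lock is what stops the herd: waiting threads find the fresh value and skip the rebuild. This version makes waiters block; serving the expired value instead would require keeping it around past its TTL.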
Cache warming is proactively populating the cache before traffic hits it. After a deployment, all caches start cold. Under high load, a cold cache means a sudden spike of database traffic. For critical data, consider warming the cache as part of your deployment process.
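A warming step can be as simple as a loop over the keys you know will be hot. The sketch below assumes you can supply a `load` function (reads from the source) and a `store` function (writes to the cache); both, like `warm_cache` itself, are hypothetical names.

```python
from typing import Callable, Iterable


def warm_cache(keys: Iterable[str],
               load: Callable[[str], object],
               store: Callable[[str, object], None]) -> list[str]:
    """Populate the cache for each key; return the keys that failed to load."""
    failed = []
    for key in keys:
        try:
            store(key, load(key))
        except Exception:
            # Warming is best effort: a failure here should not block the deploy.
            failed.append(key)
    return failed


# Usage with dicts standing in for the source and the cache.
cache: dict[str, object] = {}
source = {"product:1": "Widget", "product:2": "Gadget"}
failed = warm_cache(["product:1", "product:2", "product:404"],
                    load=lambda k: source[k],
                    store=cache.__setitem__)
```

Returning the failed keys rather than raising lets the deployment decide whether a partially warm cache is acceptable.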
A simple decision framework
When you’re deciding whether to cache something and how:
- What is the read/write ratio? High read, low write - good candidate. High write rate - caching adds complexity with less benefit.
- How often does it change? If it changes every few seconds, TTL-based caching probably isn’t the right tool.
- What’s the cost of staleness? A user seeing their profile from 5 minutes ago is usually fine. A user seeing an incorrect balance is not.
- What’s the cost of the operation you’re caching? If the database query takes 2ms, a Redis round trip costs about the same, so caching it adds complexity for marginal gain.
- What’s the blast radius of a bug? A caching bug on a user’s session is bad. A caching bug on public product listings is low risk.
The fastest operation is the one you don’t do. Caching is a way of not doing operations. But every cache is a copy of data that will eventually disagree with the source. The engineering work is in deciding how much disagreement you can tolerate, and building the invalidation to match.