The Hardest Problem

“There are only two hard things in computer science”

Phil Karlton famously said: “There are only two hard things in Computer Science: cache invalidation and naming things.”

Cache invalidation is the act of removing or updating cached data when the underlying data changes. It sounds simple — data changed, delete the cache. But in practice, it creates surprisingly difficult problems.

Why invalidation is hard

Problem 1: Knowing what to invalidate. A new review is posted for book-1. Which cache entries need updating?

book:book-1 (the book detail with avg_rating)
top-books (the ranking might have changed)
books:genre=science-fiction:page=1 (if book-1 is sci-fi, the listing might have changed)
leaderboard:most-reviewed (the review count changed)
catalog-stats (the total review count changed)

One write affects many cache entries. Missing one means stale data. Tracking all of them is error-prone.

Problem 2: Race conditions. Two requests arrive simultaneously:

Request A: POST /reviews (creates a review)
Request B: GET /books/top (reads top books)

Timeline:
1. A queries current top books (old data)
2. B creates a review → invalidates "top-books" cache
3. A stores old data in "top-books" cache
→ Cache has stale data until next TTL expiry

Request A read the data before the review was created, but stored it after the invalidation. The cache now contains stale data that was written after the invalidation.

Problem 3: Distributed invalidation. With multiple servers, each has its own in-memory cache. Invalidating on server A does not clear server B’s cache. Users see different data depending on which server handles their request.

The strategies

There is no perfect solution. Every strategy trades simplicity for consistency:

Time-based (TTL): Accept staleness. Data refreshes when the TTL expires. Simple, works everywhere, but data can be stale for the entire TTL window. Covered in the next lesson.

Event-based: Delete cache entries when data changes. More consistent but requires tracking which entries to invalidate. Covered in lesson 15.

Tag-based: Tag cache entries with their dependencies. Invalidate by tag. Balances simplicity and consistency. Covered in lesson 16.

The practical rule

For most applications, TTL-based invalidation is sufficient. A 60-second TTL means data is at most 60 seconds stale — acceptable for catalogs, listings, and non-critical displays.

Event-based invalidation adds complexity but reduces staleness to near-zero. Use it for data where even brief staleness is noticeable (a user edits their name and it does not update on the page).

Tag-based invalidation is the sweet spot for complex applications with many interrelated cache entries.

This section covers all three, building from simple to sophisticated.

Exercises

Exercise 1: Post a review. Immediately call /books/top. Is the new review reflected? (It depends on the TTL.)

Exercise 2: List all cache keys that should be invalidated when a review is posted for book-1. How many did you find?

Exercise 3: Describe the race condition: two requests, one writes data and one reads it. How could the cache end up with stale data even after invalidation?