The Hardest Problem
“There are only two hard things in computer science”
Phil Karlton famously said: “There are only two hard things in Computer Science: cache invalidation and naming things.”
Cache invalidation is the act of removing or updating cached data when the underlying data changes. It sounds simple — data changed, delete the cache. But in practice, it creates surprisingly difficult problems.
Why invalidation is hard
Problem 1: Knowing what to invalidate. A new review is posted for book-1. Which cache entries need updating?
book:book-1(the book detail with avg_rating)top-books(the ranking might have changed)books:genre=science-fiction:page=1(if book-1 is sci-fi, the listing might have changed)leaderboard:most-reviewed(the review count changed)catalog-stats(the total review count changed)
One write affects many cache entries. Missing one means stale data. Tracking all of them is error-prone.
Problem 2: Race conditions. Two requests arrive simultaneously:
Request A: POST /reviews (creates a review)
Request B: GET /books/top (reads top books)
Timeline:
1. A queries current top books (old data)
2. B creates a review → invalidates "top-books" cache
3. A stores old data in "top-books" cache
→ Cache has stale data until next TTL expiry Request A read the data before the review was created, but stored it after the invalidation. The cache now contains stale data that was written after the invalidation.
Problem 3: Distributed invalidation. With multiple servers, each has its own in-memory cache. Invalidating on server A does not clear server B’s cache. Users see different data depending on which server handles their request.
The strategies
There is no perfect solution. Every strategy trades simplicity for consistency:
Time-based (TTL): Accept staleness. Data refreshes when the TTL expires. Simple, works everywhere, but data can be stale for the entire TTL window. Covered in the next lesson.
Event-based: Delete cache entries when data changes. More consistent but requires tracking which entries to invalidate. Covered in lesson 15.
Tag-based: Tag cache entries with their dependencies. Invalidate by tag. Balances simplicity and consistency. Covered in lesson 16.
The practical rule
For most applications, TTL-based invalidation is sufficient. A 60-second TTL means data is at most 60 seconds stale — acceptable for catalogs, listings, and non-critical displays.
Event-based invalidation adds complexity but reduces staleness to near-zero. Use it for data where even brief staleness is noticeable (a user edits their name and it does not update on the page).
Tag-based invalidation is the sweet spot for complex applications with many interrelated cache entries.
This section covers all three, building from simple to sophisticated.
Exercises
Exercise 1: Post a review. Immediately call /books/top. Is the new review reflected? (It depends on the TTL.)
Exercise 2: List all cache keys that should be invalidated when a review is posted for book-1. How many did you find?
Exercise 3: Describe the race condition: two requests, one writes data and one reads it. How could the cache end up with stale data even after invalidation?
Why is cache invalidation called 'the hardest problem'?