# Caching Computed Results

## Beyond raw queries
Some cached data is not the result of a single database query: it is a computed result that requires multiple queries, transformations, or aggregations. Computing it is expensive, and the result changes infrequently. Caching it avoids repeating the computation.
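The examples in this lesson rely on a `cacheThrough` helper. As a reminder, here is a minimal sketch of what such a read-through TTL cache can look like, assuming a single-process, in-memory `Map` (the course's actual implementation may differ):

```ts
// Minimal read-through TTL cache sketch. Assumes one process and an
// in-memory Map; entries past their expiry are recomputed on next read.
type CacheEntry = { value: unknown; expiresAt: number };
const cache = new Map<string, CacheEntry>();

async function cacheThrough<T>(
  key: string,
  ttlMs: number,
  compute: () => T | Promise<T>,
): Promise<T> {
  const hit = cache.get(key);
  if (hit && hit.expiresAt > Date.now()) {
    return hit.value as T; // cache hit: skip the computation entirely
  }
  const value = await compute(); // cache miss: compute and store
  cache.set(key, { value, expiresAt: Date.now() + ttlMs });
  return value;
}
```

Expired entries are not evicted eagerly here; they are simply overwritten on the next miss, which keeps the sketch small.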
## Average ratings

The book detail endpoint computes the average rating with a subquery:

```sql
(SELECT AVG(rating) FROM reviews WHERE book_id = books.id) AS avg_rating
```

This subquery runs on every request. An alternative: compute the result once and cache it.
```ts
async function getBookWithRating(bookId: string) {
  return cacheThrough(`book-with-rating:${bookId}`, 5 * 60_000, () => {
    const book = db
      .prepare(
        "SELECT books.*, authors.name AS author_name FROM books JOIN authors ON books.author_id = authors.id WHERE books.id = ?",
      )
      .get(bookId) as any;
    if (!book) return null;
    const stats = db
      .prepare(
        "SELECT AVG(rating) AS avg_rating, COUNT(*) AS review_count FROM reviews WHERE book_id = ?",
      )
      .get(bookId) as any;
    return { ...book, avgRating: stats.avg_rating, reviewCount: stats.review_count };
  });
}
```

Both queries run once, and the combined result is cached for 5 minutes.
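A 5-minute TTL means ratings can be up to 5 minutes stale after a new review is posted. If fresher data matters, the write path can invalidate the key explicitly. A sketch, assuming the in-memory `Map`-based cache and a hypothetical `cacheDelete` helper (the names here are assumptions, not the course's API):

```ts
// Hypothetical invalidation helper; assumes the cache is an in-memory Map.
const cache = new Map<string, { value: unknown; expiresAt: number }>();

function cacheDelete(key: string): boolean {
  return cache.delete(key);
}

// Called by the review-creation handler after its INSERT succeeds, so the
// next read recomputes every cached result derived from this book's reviews.
function invalidateBookCaches(bookId: string) {
  cacheDelete(`book-with-rating:${bookId}`);
  cacheDelete("leaderboard:most-reviewed");
  cacheDelete("catalog-stats");
}
```

The trade-off: every place that writes reviews must know every key derived from them, which is why short TTLs are often preferred when slight staleness is acceptable.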
> [!NOTE]
> The Database Design course’s When to Denormalize lesson solved this with denormalized `review_count` and `avg_rating` columns on the `books` table, updated by triggers. Caching is an alternative: instead of storing computed values in the database, store them in memory. Both approaches avoid recomputing on every request. Denormalization is better when many different queries need the value; caching is better when only a few endpoints use it.
## Leaderboards and rankings

A “most reviewed books” leaderboard requires counting reviews, sorting, and limiting. This is a multi-table aggregation:
```ts
async function getMostReviewed() {
  return cacheThrough("leaderboard:most-reviewed", 10 * 60_000, () =>
    db
      .prepare(
        `
        SELECT books.id, books.title, authors.name AS author_name,
               COUNT(reviews.id) AS review_count
        FROM books
        JOIN authors ON books.author_id = authors.id
        JOIN reviews ON reviews.book_id = books.id
        GROUP BY books.id
        ORDER BY review_count DESC
        LIMIT 20
        `,
      )
      .all(),
  );
}
```

Leaderboards are perfect cache candidates: expensive to compute, requested frequently, and slow to change (they only shift when new reviews are posted).
## Aggregated statistics

Dashboard-style endpoints that summarize data are another good fit:
```ts
async function getCatalogStats() {
  return cacheThrough("catalog-stats", 15 * 60_000, () => {
    const bookCount = (db.prepare("SELECT COUNT(*) AS c FROM books").get() as any).c;
    const authorCount = (db.prepare("SELECT COUNT(*) AS c FROM authors").get() as any).c;
    const reviewCount = (db.prepare("SELECT COUNT(*) AS c FROM reviews").get() as any).c;
    const avgRating = (db.prepare("SELECT AVG(rating) AS avg FROM reviews").get() as any).avg;
    return {
      books: bookCount,
      authors: authorCount,
      reviews: reviewCount,
      averageRating: avgRating ? Math.round(avgRating * 10) / 10 : null,
    };
  });
}
```

All four queries run once, and the summary is cached for 15 minutes. A dashboard that polls this endpoint every 5 seconds makes 180 requests in 15 minutes, and only the first one touches the database.
## Precomputing on a schedule

For expensive computations, precompute the result in a background job instead of on the first request:
```ts
// Background job handler (from the Background Jobs course)
const handlers = {
  compute_leaderboard: async () => {
    const leaderboard = db.prepare("SELECT ...").all();
    cacheSet("leaderboard:most-reviewed", leaderboard, 15 * 60_000);
    console.log("[CACHE] Leaderboard precomputed");
  },
};

// Cron: recompute every 10 minutes
registerCron("compute_leaderboard", "*/10 * * * *");
```

> [!NOTE]
> This combines the Background Jobs course’s cron scheduling with caching. The leaderboard is always warm: no user ever triggers the expensive computation, and the first visitor sees precomputed data, not a cache miss. Because the 15-minute TTL exceeds the 10-minute recompute interval, each entry is overwritten before it can expire.
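The two pieces this pattern relies on can be sketched minimally, assuming the same in-memory `Map` cache. `cacheSet` writes an entry directly rather than through a read miss, and a plain `setInterval` stands in for the course's cron scheduler (an assumption, not the real `registerCron`):

```ts
const cache = new Map<string, { value: unknown; expiresAt: number }>();

// Write a value into the cache directly, bypassing the read-through path.
function cacheSet(key: string, value: unknown, ttlMs: number) {
  cache.set(key, { value, expiresAt: Date.now() + ttlMs });
}

// Stand-in for cron scheduling: run the job immediately, then repeat on a
// fixed interval. Returns the timer so callers can cancel it.
function scheduleEvery(intervalMs: number, job: () => void): NodeJS.Timeout {
  job();
  return setInterval(job, intervalMs);
}
```

Running the job once at registration warms the cache at startup, so even the very first request after a deploy is a hit.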
## Exercises

Exercise 1: Cache a computed result (book with rating stats). Verify the two queries run only once per TTL window.

Exercise 2: Create a catalog stats endpoint and cache it for 15 minutes. Call it 100 times and verify the four queries run only once.

Exercise 3: Precompute the leaderboard with a background job. Verify the endpoint never triggers a database query (always a cache hit).

When is caching a better choice than denormalization for computed values?