Caching Database Queries
Which queries to cache
Not every query benefits from caching. The best candidates are queries that are expensive (JOINs, aggregations, subqueries), frequent (called on every page load), and stable (the result does not change often).
Expensive queries
The top books query from this course joins three tables and computes an average:
const topBooks = await cacheThrough("top-books", 5 * 60_000, () =>
db
.prepare(
`
SELECT books.id, books.title, authors.name AS author_name,
AVG(reviews.rating) AS avg_rating, COUNT(reviews.id) AS review_count
FROM books
JOIN authors ON books.author_id = authors.id
JOIN reviews ON reviews.book_id = books.id
GROUP BY books.id
ORDER BY avg_rating DESC, review_count DESC
LIMIT 10
`,
)
.all(),
); This query is a good cache candidate: it joins three tables, computes aggregates, sorts, and limits. It runs on every homepage visit. The result changes only when reviews are posted (infrequently).
Building cache keys from query parameters
Queries with parameters need cache keys that include the parameters:
route.get("/books", {
resolve: async (c) => {
const genre = (c.input.query as any)?.genre;
const page = (c.input.query as any)?.page ?? "1";
const key = genre ? `books:genre=${genre}:page=${page}` : `books:page=${page}`;
const books = await cacheThrough(key, 60_000, () => {
let sql = `
SELECT books.*, authors.name AS author_name
FROM books JOIN authors ON books.author_id = authors.id
`;
const params: any[] = [];
if (genre) {
sql += " WHERE books.genre = ?";
params.push(genre);
}
sql += " ORDER BY books.title LIMIT 20 OFFSET ?";
params.push((parseInt(page) - 1) * 20);
return db.prepare(sql).all(...params);
});
return Response.json(books);
},
}); Different query parameters produce different cache keys. books:genre=fiction:page=1 and books:genre=fantasy:page=1 are separate cache entries because they return different data.
Single-record caching
Individual records are cached by their ID:
route.get("/books/:id", {
resolve: async (c) => {
const id = c.input.params.id as string;
const book = await cacheThrough(`book:${id}`, 10 * 60_000, () =>
db
.prepare(
`
SELECT books.*, authors.name AS author_name,
(SELECT AVG(rating) FROM reviews WHERE book_id = books.id) AS avg_rating,
(SELECT COUNT(*) FROM reviews WHERE book_id = books.id) AS review_count
FROM books JOIN authors ON books.author_id = authors.id
WHERE books.id = ?
`,
)
.get(id),
);
if (!book) return Response.json({ error: "Not found" }, { status: 404 });
return Response.json(book);
},
}); [!WARNING] Be careful caching “not found” results. If
cacheThroughcachesundefined(no book found), every request for that non-existent ID returnsundefinedfrom cache — even if the book is created later. Either do not cache misses, or use a very short TTL for them.
Queries NOT worth caching
Queries that are already fast. A simple SELECT * FROM books WHERE id = ? with an indexed primary key takes under 1ms. Caching adds complexity without meaningful improvement.
Queries with unique parameters. A search query like WHERE title LIKE '%user input%' has infinite possible inputs. Each input produces a different cache key. The cache fills up with entries that are never reused.
Queries that must be real-time. Inventory checks for purchase decisions, account balances, authentication status. Stale data here causes real problems.
Exercises
Exercise 1: Add caching to the /books/top endpoint with cacheThrough. Log hits and misses. Call it 10 times. Verify 1 miss and 9 hits.
Exercise 2: Add caching to /books with genre and page in the cache key. Verify different parameters use different cache entries.
Exercise 3: Cache /books/:id. Post a review for that book. Verify the cached detail still shows the old rating (stale until TTL expires).
Why should the cache key include query parameters like genre and page?