Tag-Based Invalidation

The manual tracking problem

The Event-Based Invalidation lesson showed hard-coded invalidation:

cacheDelete(`book:${bookId}`);
cacheDelete("top-books");
cacheDelete("catalog-stats");
cacheDelete("leaderboard:most-reviewed");

This is fragile. Every new endpoint that caches book data adds another key, and you must remember to delete that key in every write path that changes books. Forget one and clients get stale data.

Tag-based invalidation solves this: tag cache entries with the entities they depend on. When an entity changes, invalidate all entries with that tag — no need to track individual keys.

How tags work

When caching data, tag the entry with the entities it depends on:

// "top-books" depends on books and reviews
cacheSetWithTags("top-books", books, 5 * 60_000, ["books", "reviews"]);

// "book:book-1" depends on book-1 and its reviews
cacheSetWithTags("book:book-1", book, 10 * 60_000, ["book:book-1", "reviews:book-1"]);

// "catalog-stats" depends on books, authors, and reviews
cacheSetWithTags("catalog-stats", stats, 30 * 60_000, ["books", "authors", "reviews"]);

When a review is posted for book-1, invalidate the tags:

invalidateByTag("reviews"); // Clears: top-books, catalog-stats
invalidateByTag("reviews:book-1"); // Clears: book:book-1

Every cache entry tagged with "reviews" or "reviews:book-1" is deleted. You do not need to know which specific keys to delete — the tags handle it.
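To make the walk-through above concrete, here is a minimal, self-contained sketch that traces which keys each call clears. It is a stand-in for illustration only: `entries`, `keysByTag`, `setTagged`, and `invalidate` are hypothetical names, values and TTLs are left out, and a second book (`book:book-2`) is added to show what survives.

```typescript
// Minimal stand-in for a tagged cache (no values, no TTL), used only to trace the example.
const entries = new Map<string, string[]>();      // cache key -> its tags
const keysByTag = new Map<string, Set<string>>(); // tag -> cache keys carrying it

function setTagged(key: string, tags: string[]): void {
  entries.set(key, tags);
  for (const tag of tags) {
    const keys = keysByTag.get(tag) ?? new Set<string>();
    keys.add(key);
    keysByTag.set(tag, keys);
  }
}

// Deletes every entry carrying the tag and returns the deleted keys.
function invalidate(tag: string): string[] {
  const keys = keysByTag.get(tag);
  if (!keys) return [];
  const removed: string[] = [];
  for (const key of Array.from(keys)) {
    // Unlink the key from every tag it carries, then drop the entry itself
    for (const t of entries.get(key) ?? []) keysByTag.get(t)?.delete(key);
    entries.delete(key);
    removed.push(key);
  }
  keysByTag.delete(tag);
  return removed;
}

setTagged("top-books", ["books", "reviews"]);
setTagged("book:book-1", ["book:book-1", "reviews:book-1"]);
setTagged("book:book-2", ["book:book-2", "reviews:book-2"]);
setTagged("catalog-stats", ["books", "authors", "reviews"]);

// A review is posted for book-1:
const clearedBroad = invalidate("reviews");           // ["top-books", "catalog-stats"]
const clearedSpecific = invalidate("reviews:book-1"); // ["book:book-1"]
const remaining = Array.from(entries.keys());         // ["book:book-2"]
```

The write path names only two tags, yet three entries are cleared and the unrelated `book:book-2` entry stays cached.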

Implementation

// src/tagged-cache.ts
interface TaggedCacheEntry<T> {
  value: T;
  expiresAt: number;
  tags: string[];
}

const cache = new Map<string, TaggedCacheEntry<any>>();
const tagIndex = new Map<string, Set<string>>(); // tag → set of cache keys

export function cacheSetWithTags<T>(key: string, value: T, ttlMs: number, tags: string[]): void {
  // Remove old entry's tag associations if replacing
  const existing = cache.get(key);
  if (existing) {
    for (const tag of existing.tags) {
      tagIndex.get(tag)?.delete(key);
    }
  }

  // Store the entry
  cache.set(key, { value, expiresAt: Date.now() + ttlMs, tags });

  // Index by tags
  for (const tag of tags) {
    if (!tagIndex.has(tag)) tagIndex.set(tag, new Set());
    tagIndex.get(tag)!.add(key);
  }
}

export function cacheGetTagged<T>(key: string): T | undefined {
  const entry = cache.get(key);
  if (!entry) return undefined;

  if (Date.now() > entry.expiresAt) {
    // Expired — clean up
    for (const tag of entry.tags) {
      tagIndex.get(tag)?.delete(key);
    }
    cache.delete(key);
    return undefined;
  }

  return entry.value;
}

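export function invalidateByTag(tag: string): number {
  const keys = tagIndex.get(tag);
  if (!keys) return 0;

  let count = 0;
  // Iterate over a snapshot, because the loop removes keys from this same set
  for (const key of Array.from(keys)) {
    const entry = cache.get(key);
    if (entry) {
      // Remove the key from every tag it is indexed under, not only `tag`
      for (const t of entry.tags) {
        const tagged = tagIndex.get(t);
        tagged?.delete(key);
        // Drop other tag sets that become empty so the index does not keep growing
        if (t !== tag && tagged?.size === 0) tagIndex.delete(t);
      }
      cache.delete(key);
      count++;
    }
  }

  tagIndex.delete(tag);
  return count; // number of cache entries removed
}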

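The tagIndex maps each tag to the set of cache keys that carry it. invalidateByTag looks up the keys for a tag, deletes each entry from the cache, removes the key from the sets of the entry's other tags, and returns the number of entries deleted.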
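One property of this design is worth noting: an expired entry is only cleaned up when it is read again, so a key that is never requested after it expires stays in both maps. A periodic sweep is one possible remedy. The sketch below is an assumption and not part of the lesson's code: `store`, `index`, `put`, and `sweepExpired` are hypothetical names that mirror the shapes of cache and tagIndex.

```typescript
// Self-contained sketch with the same shapes as the lesson's cache and tagIndex.
interface TaggedEntry {
  value: unknown;
  expiresAt: number; // absolute timestamp in milliseconds
  tags: string[];
}

const store = new Map<string, TaggedEntry>();
const index = new Map<string, Set<string>>(); // tag -> cache keys

function put(key: string, value: unknown, expiresAt: number, tags: string[]): void {
  store.set(key, { value, expiresAt, tags });
  for (const tag of tags) {
    const keys = index.get(tag) ?? new Set<string>();
    keys.add(key);
    index.set(tag, keys);
  }
}

// Hypothetical helper: removes expired entries and prunes tag sets that become empty.
// `now` is a parameter so the function can be exercised without waiting for real time.
function sweepExpired(now: number = Date.now()): number {
  let removed = 0;
  for (const [key, entry] of Array.from(store)) {
    if (now <= entry.expiresAt) continue; // still fresh
    for (const tag of entry.tags) {
      const keys = index.get(tag);
      if (!keys) continue;
      keys.delete(key);
      if (keys.size === 0) index.delete(tag);
    }
    store.delete(key);
    removed++;
  }
  return removed;
}

put("top-books", [], 1_000, ["books", "reviews"]);
put("catalog-stats", {}, 5_000, ["books"]);
const swept = sweepExpired(2_000); // at t = 2000 only "top-books" has expired
```

In an application, something like `setInterval(() => sweepExpired(), 60_000)` could run the sweep once a minute. The interval is an arbitrary choice here, not a recommendation from the lesson.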

Using tags in the application

// Caching with tags
route.get("/books/top", {
  resolve: async () => {
    const cached = cacheGetTagged<any[]>("top-books");
    if (cached) return Response.json(cached);

    const books = db.prepare("SELECT ...").all();
    cacheSetWithTags("top-books", books, 5 * 60_000, ["books", "reviews"]);
    return Response.json(books);
  },
});

route.get("/books/:id", {
  resolve: async (c) => {
    const id = c.input.params.id as string;
    const cached = cacheGetTagged<any>(`book:${id}`);
    if (cached) return Response.json(cached);

    const book = db.prepare("SELECT ...").get(id);
    if (!book) return Response.json({ error: "Not found" }, { status: 404 });
    // Instance-level tags only, so updating one book does not clear every other book's detail entry
    cacheSetWithTags(`book:${id}`, book, 10 * 60_000, [`book:${id}`, `reviews:${id}`]);
    return Response.json(book);
  },
});

// Invalidation — clean and simple
route.post("/books/:id/reviews", {
  resolve: async (c) => {
    const bookId = c.input.params.id as string;
    db.prepare("INSERT INTO reviews ...").run(/* ... */);

    // Two lines invalidate everything that depends on reviews for this book
    invalidateByTag("reviews");
    invalidateByTag(`reviews:${bookId}`);

    return Response.json({ status: "created" }, { status: 201 });
  },
});

route.put("/books/:id", {
  resolve: async (c) => {
    const bookId = c.input.params.id as string;
    db.prepare("UPDATE books SET ...").run(/* ... */);

    // Invalidate everything that depends on this book
    invalidateByTag(`book:${bookId}`);
    invalidateByTag("books");

    return Response.json({ status: "updated" });
  },
});

Adding a new cached endpoint that depends on reviews? Tag it with "reviews". It will be automatically invalidated when reviews change — no need to update the write endpoints.
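The read handlers above all repeat the same check, compute, and store steps. One way to keep new endpoints short is a small read-through helper. This is a sketch under assumptions, not the lesson's API: `getOrCompute` is a hypothetical helper, and `getTagged`, `setTagged`, and `invalidate` are simplified stand-ins for cacheGetTagged, cacheSetWithTags, and invalidateByTag (for example, `setTagged` does not re-index a replaced key). The loader is synchronous, matching the synchronous database calls in the routes.

```typescript
// Simplified in-memory stand-ins so the sketch runs on its own.
const entries = new Map<string, { value: unknown; expiresAt: number; tags: string[] }>();
const keysByTag = new Map<string, Set<string>>();

function getTagged<T>(key: string): T | undefined {
  const entry = entries.get(key);
  if (!entry || Date.now() > entry.expiresAt) return undefined;
  return entry.value as T;
}

function setTagged<T>(key: string, value: T, ttlMs: number, tags: string[]): void {
  entries.set(key, { value, expiresAt: Date.now() + ttlMs, tags });
  for (const tag of tags) {
    const keys = keysByTag.get(tag) ?? new Set<string>();
    keys.add(key);
    keysByTag.set(tag, keys);
  }
}

function invalidate(tag: string): void {
  const keys = keysByTag.get(tag);
  if (!keys) return;
  for (const key of Array.from(keys)) {
    for (const t of entries.get(key)?.tags ?? []) keysByTag.get(t)?.delete(key);
    entries.delete(key);
  }
  keysByTag.delete(tag);
}

// Hypothetical helper: return the cached value, or compute it, tag it, and cache it.
function getOrCompute<T>(key: string, ttlMs: number, tags: string[], compute: () => T): T {
  const hit = getTagged<T>(key);
  if (hit !== undefined) return hit;
  const value = compute();
  setTagged(key, value, ttlMs, tags);
  return value;
}

// A new cached read (for example, books by genre) only has to declare its tags.
let queries = 0;
const loadFantasy = (): string[] => {
  queries++; // counts how often the "database" is actually hit
  return ["book-1", "book-7"];
};

getOrCompute("genre:fantasy", 60_000, ["books"], loadFantasy); // miss: runs the loader
getOrCompute("genre:fantasy", 60_000, ["books"], loadFantasy); // hit: loader skipped
invalidate("books");                                           // e.g. a book was added
getOrCompute("genre:fantasy", 60_000, ["books"], loadFantasy); // miss again: reloads
```

The loader runs twice in this sequence: once for the initial miss and once after the "books" tag is invalidated. The existing write path did not need to know the "genre:fantasy" key exists.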

Tag naming conventions

Use a consistent scheme:

// Entity type (broad): invalidates all entries for that type
"books"; // All book-related caches
"reviews"; // All review-related caches
"authors"; // All author-related caches

// Entity instance (specific): invalidates entries for one record
"book:book-1"; // Caches specific to book-1
"reviews:book-1"; // Caches for book-1's reviews
"author:author-1"; // Caches specific to author-1

Use broad tags for broad changes (a new book is added, so invalidate all listings) and specific tags for specific changes (a review is posted for book-1, so invalidate book-1’s detail).
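A scheme like this only helps if every read and write path spells the tags the same way; a typo such as "review:book-1" would silently never be invalidated. One option is to build tags in a single place. The following is a sketch, and `Tags`, `tagsForNewReview`, and `tagsForBookUpdate` are hypothetical names, not part of the lesson's code.

```typescript
// Hypothetical tag builders: one place that defines the naming scheme.
const Tags = {
  // Entity type (broad)
  books: "books",
  reviews: "reviews",
  authors: "authors",
  // Entity instance (specific)
  book: (id: string) => `book:${id}`,
  reviewsFor: (bookId: string) => `reviews:${bookId}`,
  author: (id: string) => `author:${id}`,
} as const;

// Tags to invalidate when a review is posted for one book.
function tagsForNewReview(bookId: string): string[] {
  return [Tags.reviews, Tags.reviewsFor(bookId)];
}

// Tags to invalidate when one book is updated.
function tagsForBookUpdate(bookId: string): string[] {
  return [Tags.book(bookId), Tags.books];
}
```

A write handler could then loop over `tagsForNewReview(bookId)` and call invalidateByTag for each tag, and a read handler could pass `[Tags.book(id), Tags.reviewsFor(id)]` when caching a book's detail.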

Exercises

Exercise 1: Implement the tagged cache. Tag /books/top with ["books", "reviews"]. Tag /books/:id with ["book:ID", "reviews:ID"]. Verify both endpoints cache correctly.

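Exercise 2: Post a review for one book. Call invalidateByTag("reviews") and invalidateByTag("reviews:ID") for that book. Verify that /books/top and that book's /books/:id entry are cleared, but /books/:id for a different book is not.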

Exercise 3: Add a new cached endpoint (e.g., /genres/:genre/books). Tag it with ["books"]. Verify it is automatically invalidated when a book is added — without modifying the book creation endpoint.

What is the main advantage of tag-based invalidation over manually tracking cache keys?