Dead Letter Queues
Jobs that never succeed
After 5 retries (covered in the previous lesson), a job is marked "failed". It sits in the jobs table forever. Nobody looks at it. The email was never sent. The customer never got their confirmation.
A dead letter queue (DLQ) is a separate holding area for permanently failed jobs. Instead of rotting in the main table, failed jobs are moved to a dedicated table where they can be reviewed, debugged, and reprocessed.
The dead letter table
CREATE TABLE IF NOT EXISTS dead_letter_jobs (
id TEXT PRIMARY KEY,
original_job_id TEXT NOT NULL,
type TEXT NOT NULL,
payload TEXT NOT NULL,
attempts INTEGER NOT NULL,
last_error TEXT,
failed_at TEXT NOT NULL DEFAULT (datetime('now')),
reviewed INTEGER NOT NULL DEFAULT 0,
notes TEXT
); The DLQ table stores the same job data plus metadata for investigation: when it failed, whether someone has reviewed it, and notes about the resolution.
Moving failed jobs to the DLQ
Update failJob to move permanently failed jobs:
export function failJob(jobId: string, error: string): void {
const job = db.prepare("SELECT * FROM jobs WHERE id = ?").get(jobId) as Job;
const newAttempts = job.attempts + 1;
if (newAttempts >= job.max_attempts) {
// Permanently failed — move to dead letter queue
db.transaction(() => {
db.prepare(
`
INSERT INTO dead_letter_jobs (id, original_job_id, type, payload, attempts, last_error)
VALUES (?, ?, ?, ?, ?, ?)
`,
).run(crypto.randomUUID(), jobId, job.type, job.payload, newAttempts, error);
db.prepare("DELETE FROM jobs WHERE id = ?").run(jobId);
})();
console.error(`[DLQ] Job ${jobId} (${job.type}) moved to dead letter queue: ${error}`);
return;
}
// Still has retries — schedule for later
db.prepare(
`
UPDATE jobs
SET status = 'pending', attempts = ?, last_error = ?,
locked_by = NULL, locked_at = NULL, updated_at = datetime('now'),
scheduled_at = datetime('now', '+' || ? || ' seconds')
WHERE id = ?
`,
).run(newAttempts, error, newAttempts * 30, jobId);
} [!NOTE] The
db.transaction()wrapping the INSERT and DELETE ensures atomicity — either both succeed or neither does. Without it, a crash between the two statements could lose the job entirely. The Database Design course’s Transactions lesson explains why this matters.
Monitoring the DLQ
route.get("/admin/dead-letter-jobs", {
resolve: () => {
const jobs = db
.prepare(
`
SELECT id, type, last_error, failed_at, reviewed
FROM dead_letter_jobs
ORDER BY failed_at DESC LIMIT 50
`,
)
.all();
const count = db
.prepare("SELECT COUNT(*) as count FROM dead_letter_jobs WHERE reviewed = 0")
.get() as { count: number };
return Response.json({ unreviewed: count.count, jobs });
},
}); When the DLQ is not empty, someone should investigate. Common causes: the customer deleted their account after the job was queued, a persistent service outage, or a bug in the handler code.
Reprocessing dead letter jobs
After fixing the underlying issue, requeue dead letter jobs:
route.post("/admin/dead-letter-jobs/:id/reprocess", {
resolve: (c) => {
const dlj = db
.prepare("SELECT * FROM dead_letter_jobs WHERE id = ?")
.get(c.input.params.id) as any;
if (!dlj) return Response.json({ error: "Not found" }, { status: 404 });
db.transaction(() => {
enqueue(dlj.type, JSON.parse(dlj.payload));
db.prepare(
"UPDATE dead_letter_jobs SET reviewed = 1, notes = 'Reprocessed' WHERE id = ?",
).run(dlj.id);
})();
return Response.json({ status: "requeued" });
},
}); Exercises
Exercise 1: Create the dead_letter_jobs table. Set a job’s max_attempts to 1. Make it fail. Verify it appears in the dead letter table.
Exercise 2: Build the DLQ monitoring endpoint. Check the count of unreviewed jobs.
Exercise 3: Fix a broken handler. Reprocess the dead letter job. Verify it succeeds.
Why move failed jobs to a separate table instead of leaving them with status 'failed'?