Error Handling and Resilience with @hectoday/http

Capstone: resilient e-commerce API

What we built

Over the course of these lessons, we built a complete error handling and resilience system for an e-commerce API. Let’s take a step back and look at everything together.

| Layer | What it does | Lesson |
| --- | --- | --- |
| Error response helpers | Consistent error format, status codes baked in | Custom error classes |
| Error classes | Carry status code and error code for service-level errors | Custom error classes |
| Global error handler | Catches unexpected throws, logs structured JSON | A global error handler |
| Operational vs programmer | Return expected errors, catch unexpected ones | Operational vs programmer |
| Retries | Exponential backoff with jitter for transient failures | Retries |
| Timeouts | AbortController, never wait forever | Timeouts |
| Circuit breakers | Stop calling failing services, fail fast | Circuit breakers |
| Fallbacks | Queue, cache, defaults for non-critical failures | Fallbacks and degradation |
| Graceful shutdown | SIGTERM, finish in-flight, close connections | Graceful shutdown |
| Process handlers | uncaughtException, unhandledRejection | Uncaught exceptions |
| Health checks | Dependency health, degraded vs unhealthy | Health checks under failure |

Each layer handles a different kind of failure. Together, they form a system where no error goes unhandled, no failure goes unlogged, and no user gets a worse experience than necessary.

The error handling architecture

Here is how a request flows through the system:

Incoming request
|
|-- Route handler
|   |-- Input validation --> return validationFailed() (400)
|   |-- Resource lookup --> return notFound() (404)
|   |
|   |-- External service call
|   |   |-- Circuit breaker check
|   |   |-- Timeout wrapper (10s max)
|   |   |-- Retry with backoff (3 attempts)
|   |   |-- Fallback on failure (queue, cache, default)
|   |
|   |-- Response
|
|-- Global error handler (onError callback)
|   |-- AppError --> specific status code + error code
|   |-- Unknown error --> 500 + generic message + log stack
|
|-- Process handlers
|   |-- uncaughtException --> log + shutdown
|   |-- unhandledRejection --> log
|
|-- Health check
    |-- healthy (200) --> all dependencies up
    |-- degraded (200) --> non-critical services down
    |-- unhealthy (503) --> critical dependency down

Each class of failure has a handler. Validation errors and missing resources are returned directly from the route. Service failures are handled by the resilience stack. Unexpected errors thrown inside a route are caught by the global handler. Errors that escape all of these are caught by the process handlers.
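The two branches of the global error handler in the diagram can be sketched as follows. This is a minimal sketch, not the course's error-handler.ts: the AppError field names (status, code), the JSON body shape, and the "INTERNAL_ERROR" code are assumptions made for illustration.

```typescript
// Hypothetical AppError shape; the course's errors.ts may differ.
class AppError extends Error {
  readonly status: number;
  readonly code: string;

  constructor(message: string, status: number, code: string) {
    super(message);
    this.name = "AppError";
    this.status = status;
    this.code = code;
  }
}

// Known errors keep their own status and code. Anything else becomes a
// generic 500, and the stack trace goes to the log, never to the client.
function handleError(error: unknown, request: Request): Response {
  if (error instanceof AppError) {
    return Response.json(
      { error: { code: error.code, message: error.message } },
      { status: error.status },
    );
  }

  const err = error instanceof Error ? error : new Error(String(error));
  console.error(
    JSON.stringify({
      level: "error",
      method: request.method,
      url: request.url,
      message: err.message,
      stack: err.stack,
    }),
  );

  return Response.json(
    { error: { code: "INTERNAL_ERROR", message: "Something went wrong" } },
    { status: 500 },
  );
}
```

The point of the split is that only errors the application created on purpose are allowed to shape the response. Everything else is treated as a bug and hidden behind a generic message.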

The complete order flow

This is the heart of the application. Every pattern from the course appears in this single route. Compare this to the naive version from the project setup. The OrderBody schema is the same, but every failure is now handled:

Code along
// src/app.ts (updated)
import { setup, route } from "@hectoday/http";
import { z } from "zod/v4";
import db from "./db.js";
import { handleError } from "./error-handler.js";
import { notFound, validationFailed, conflict, serviceUnavailable } from "./errors.js";
import { chargeCard } from "./services/payment.js";
import { sendEmail } from "./services/email.js";
import { reserveStock } from "./services/inventory.js";
import { paymentCircuit } from "./circuits.js";
import { withRetry } from "./retry.js";
import { withTimeout } from "./timeout.js";
import { enqueue } from "./queue.js";
import { checkHealth } from "./health.js";

const OrderBody = z.object({
  userId: z.string(),
  productId: z.string(),
  quantity: z.number().int().positive(),
  paymentToken: z.string(),
});

export const app = setup({
  onError: ({ error, request }) => handleError(error, request),
  routes: [
    route.get("/health", {
      resolve: () => {
        const health = checkHealth();
        const statusCode = health.status === "unhealthy" ? 503 : 200;
        return Response.json(health, { status: statusCode });
      },
    }),

    route.get("/products", {
      resolve: () => {
        const products = db.prepare("SELECT * FROM products").all();
        return Response.json(products);
      },
    }),

    route.get("/products/:id", {
      request: { params: z.object({ id: z.string() }) },
      resolve: (c) => {
        if (!c.input.ok)
          return validationFailed(
            c.input.issues.map((i) => ({ field: i.path.join("."), message: i.message })),
          );

        const product = db.prepare("SELECT * FROM products WHERE id = ?").get(c.input.params.id);
        if (!product) return notFound("Product");

        return Response.json(product);
      },
    }),

    route.post("/orders", {
      request: { body: OrderBody },
      resolve: async (c) => {
        // Validation: return error directly
        if (!c.input.ok) {
          return validationFailed(
            c.input.issues.map((i) => ({ field: i.path.join("."), message: i.message })),
          );
        }

        const { userId, productId, quantity, paymentToken } = c.input.body;

        // Resource lookup: return 404 if missing
        const product = db.prepare("SELECT * FROM products WHERE id = ?").get(productId) as any;
        if (!product) return notFound("Product");

        // Stock check: return 409 if insufficient
        if (product.stock < quantity) {
          return conflict(`${product.name} has only ${product.stock} in stock`);
        }

        // CRITICAL: create order in database (no fallback, must succeed)
        const orderId = `ord_${crypto.randomUUID().slice(0, 8)}`;
        const total = product.price * quantity;

        db.prepare("INSERT INTO orders (id, user_id, status, total) VALUES (?, ?, ?, ?)").run(
          orderId,
          userId,
          "pending",
          total,
        );
        db.prepare(
          "INSERT INTO order_items (order_id, product_id, quantity, price) VALUES (?, ?, ?, ?)",
        ).run(orderId, productId, quantity, product.price);

        // IMPORTANT: deduct stock
        db.prepare("UPDATE products SET stock = stock - ? WHERE id = ?").run(quantity, productId);

        // IMPORTANT: charge payment (retry, timeout, circuit breaker, then queue)
        try {
          await paymentCircuit.call(() =>
            withTimeout(
              () =>
                withRetry(() => chargeCard(total, paymentToken), {
                  maxRetries: 3,
                  baseDelayMs: 500,
                }),
              10_000,
              "Payment",
            ),
          );
          db.prepare("UPDATE orders SET status = ? WHERE id = ?").run("paid", orderId);
        } catch {
          enqueue("charge_card", { orderId, amount: total, token: paymentToken });
          db.prepare("UPDATE orders SET status = ? WHERE id = ?").run("payment_pending", orderId);
        }

        // NICE-TO-HAVE: send confirmation email (fire-and-forget)
        sendEmail(userId, "Order Confirmed", `Your order ${orderId} has been placed.`).catch(() => {
          enqueue("send_email", { to: userId, orderId });
        });

        const order = db.prepare("SELECT * FROM orders WHERE id = ?").get(orderId);
        return Response.json(order, { status: 201 });
      },
    }),
  ],
});

Read through this carefully. The validation and lookup errors (validationFailed, notFound, conflict) are returned directly. No throwing, no try-catch for expected cases. The route checks the condition, returns the error response, and that is it.
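For reference, response helpers of this kind can be very small. The sketch below assumes a particular JSON body shape and error code strings, which are illustrative and may not match the course's errors.ts. Only the status codes (400, 404, 409) come from the flow above.

```typescript
type FieldIssue = { field: string; message: string };

// Shared builder so every error response has the same shape (assumed shape).
function errorResponse(
  status: number,
  code: string,
  message: string,
  details?: unknown,
): Response {
  const error = details === undefined ? { code, message } : { code, message, details };
  return Response.json({ error }, { status });
}

// Each helper bakes in its status code, so routes cannot pick the wrong one.
const validationFailed = (issues: FieldIssue[]): Response =>
  errorResponse(400, "VALIDATION_FAILED", "Request validation failed", issues);

const notFound = (resource: string): Response =>
  errorResponse(404, "NOT_FOUND", `${resource} not found`);

const conflict = (message: string): Response => errorResponse(409, "CONFLICT", message);
```

Because each helper returns a plain Response, a route can `return notFound("Product")` without any throw or try-catch involved.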

The payment call is wrapped in three layers. The innermost layer is withRetry, which retries up to 3 times with exponential backoff and a 500 ms base delay. That is wrapped in withTimeout, which gives the whole retry sequence, not each attempt, 10 seconds before giving up. That is wrapped in paymentCircuit.call, which fails fast when the circuit is open instead of calling a payment service that is already known to be failing. If any of these layers throws, the catch block queues the charge for later and marks the order as payment_pending. The try-catch here is appropriate because a failure in the payment service is genuinely unpredictable.
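To make the nesting concrete, here is one way the two inner wrappers could look. This is a sketch under assumptions, not the course's retry.ts or timeout.ts. The call signatures mirror how the route uses them, but the jitter formula, the error message, and the reading of maxRetries as "retries after the first attempt" are guesses.

```typescript
type RetryOptions = { maxRetries: number; baseDelayMs: number };

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Calls fn again after a failure, doubling the delay each time and adding jitter.
// Assumption: maxRetries counts retries after the first attempt.
async function withRetry<T>(
  fn: () => Promise<T>,
  { maxRetries, baseDelayMs }: RetryOptions,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      if (attempt === maxRetries) break;
      const backoff = baseDelayMs * 2 ** attempt;
      const jitter = Math.random() * baseDelayMs;
      await sleep(backoff + jitter);
    }
  }
  throw lastError;
}

// Rejects if fn has not settled within ms. The AbortSignal lets fn cancel
// its own work (for example by passing it to fetch) when the timer fires.
async function withTimeout<T>(
  fn: (signal: AbortSignal) => Promise<T>,
  ms: number,
  label: string,
): Promise<T> {
  const controller = new AbortController();
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => {
      controller.abort();
      reject(new Error(`${label} timed out after ${ms}ms`));
    }, ms);
  });
  try {
    return await Promise.race([fn(controller.signal), timeout]);
  } finally {
    clearTimeout(timer);
  }
}
```

One caveat of this sketch: when the timer wins the race, the retry loop inside keeps running unless the wrapped function observes the signal. Passing the signal down to the actual network call is what stops the work, not the race itself.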

The email is fire-and-forget. If it fails, it is queued and the user still gets their order response.

The user always gets a response. The payment is eventually consistent. The app degrades gracefully when dependencies fail.
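The same idea of degrading rather than failing drives the /health route, which returns 503 only for "unhealthy". A sketch of how the three statuses might be derived is below. The Dependency type and the summarize function are hypothetical names, and which dependencies count as critical is an assumption, so the course's health.ts may decide this differently.

```typescript
type HealthStatus = "healthy" | "degraded" | "unhealthy";
type Dependency = { name: string; critical: boolean; up: boolean };

// Any critical dependency down -> unhealthy (503).
// Only non-critical dependencies down -> degraded (still 200).
function summarize(deps: Dependency[]): {
  status: HealthStatus;
  httpStatus: number;
  down: string[];
} {
  const down = deps.filter((d) => !d.up);
  const status: HealthStatus = down.some((d) => d.critical)
    ? "unhealthy"
    : down.length > 0
      ? "degraded"
      : "healthy";
  return {
    status,
    httpStatus: status === "unhealthy" ? 503 : 200,
    down: down.map((d) => d.name),
  };
}
```

Returning 200 for "degraded" keeps an instance in rotation when, say, only email is down, while the body still tells operators what is wrong.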

Project structure

src/
  app.ts                    # Hectoday HTTP setup, routes, global error handler
  server.ts                 # HTTP server, graceful shutdown
  db.ts                     # Database schema, connection, seed data
  errors.ts                 # Response helpers + AppError classes
  error-handler.ts          # Global error handler (handleError function)
  circuits.ts               # Circuit breaker instances (payment, email, inventory)
  circuit-breaker.ts        # CircuitBreaker class
  retry.ts                  # withRetry function
  timeout.ts                # withTimeout function
  queue.ts                  # Job queue (enqueue function)
  health.ts                 # Health check with dependency checks
  process-handlers.ts       # uncaughtException, unhandledRejection
  services/
    payment.ts              # Payment service (simulated, 20% failure)
    email.ts                # Email service (simulated, 10% failure)
    inventory.ts            # Inventory service (simulated, 5% failure)

The resilience stack

Each layer in the resilience stack protects against a specific failure mode:

Request arrives
|
|-- Circuit breaker (protects against: cascading failures)
|   |-- Timeout (protects against: slow responses)
|       |-- Retry (protects against: transient failures)
|           |-- Fallback (protects against: prolonged outages)
|               |-- Error handler (protects against: unhandled errors)
|                   |-- Process handler (protects against: crashes)
|                       |-- Docker restart (protects against: process death)

Remove any layer and a specific failure mode goes unhandled. No timeouts? Slow dependencies hold resources forever. No circuit breakers? Failed services get hammered with requests. No retries? Transient blips become user-visible errors. No fallbacks? Every failure blocks the user.
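The outermost layer, the circuit breaker, is the least familiar of the four, so a compact sketch may help. It assumes a consecutive-failure threshold and a fixed reset timeout. The constructor parameters, default values, and getState method are illustrative and may not match the course's circuit-breaker.ts. Only the call(fn) shape is taken from the order route.

```typescript
type CircuitState = "closed" | "open" | "half-open";

class CircuitBreaker {
  private state: CircuitState = "closed";
  private failures = 0;
  private openedAt = 0;
  private readonly failureThreshold: number;
  private readonly resetTimeoutMs: number;
  private readonly now: () => number;

  // now is injectable so the breaker can be tested without real waiting.
  constructor(failureThreshold = 5, resetTimeoutMs = 30_000, now: () => number = Date.now) {
    this.failureThreshold = failureThreshold;
    this.resetTimeoutMs = resetTimeoutMs;
    this.now = now;
  }

  async call<T>(fn: () => Promise<T>): Promise<T> {
    if (this.state === "open") {
      // Fail fast while the reset timeout has not elapsed.
      if (this.now() - this.openedAt < this.resetTimeoutMs) {
        throw new Error("Circuit is open");
      }
      // Timeout elapsed: let a trial call through.
      this.state = "half-open";
    }

    try {
      const result = await fn();
      this.failures = 0;
      this.state = "closed";
      return result;
    } catch (error) {
      this.failures++;
      // A failed trial call, or too many consecutive failures, opens the circuit.
      if (this.state === "half-open" || this.failures >= this.failureThreshold) {
        this.state = "open";
        this.openedAt = this.now();
      }
      throw error;
    }
  }

  getState(): CircuitState {
    return this.state;
  }
}
```

This sketch lets every concurrent caller through while half-open. A production breaker would usually admit a single trial call and reject the rest until that call settles.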

Challenges

If you want to go further, here are some challenges that build on everything in this course.

Challenge 1: dead letter queue. When a queued job exceeds its max attempts, move it to a dead letter queue instead of retrying forever. Alert the team so they can investigate.
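As a starting point, the core of a dead letter queue fits in a few lines. The sketch below uses in-memory arrays and a hypothetical MAX_ATTEMPTS of 3. The Job shape and the processNext and alert names are assumptions, not part of the course's queue.ts.

```typescript
type Job = { id: string; type: string; payload: unknown; attempts: number };

const MAX_ATTEMPTS = 3; // hypothetical limit
const queue: Job[] = [];
const deadLetters: Job[] = [];
let nextId = 1;

function enqueue(type: string, payload: unknown): Job {
  const job: Job = { id: `job_${nextId++}`, type, payload, attempts: 0 };
  queue.push(job);
  return job;
}

// Processes one job. A failed job goes back on the queue until it has used
// up MAX_ATTEMPTS, then it is parked in deadLetters and the team is alerted.
async function processNext(
  handler: (job: Job) => Promise<void>,
  alert: (job: Job, error: unknown) => void,
): Promise<void> {
  const job = queue.shift();
  if (!job) return;
  try {
    await handler(job);
  } catch (error) {
    job.attempts++;
    if (job.attempts >= MAX_ATTEMPTS) {
      deadLetters.push(job);
      alert(job, error);
    } else {
      queue.push(job);
    }
  }
}
```

A real implementation would persist both queues and add a delay between attempts. The sketch only shows the bookkeeping that stops a job from retrying forever.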

Challenge 2: rate-aware retries. If the external service returns 429 (rate limited), read the Retry-After header and wait that long before retrying instead of using exponential backoff.
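One detail worth knowing before starting: Retry-After can carry either a number of seconds or an HTTP date. A sketch of a parser that handles both is below. The function names parseRetryAfter and retryDelayMs are hypothetical, and the fallback formula is a plain exponential backoff chosen for illustration.

```typescript
// Returns how long to wait in milliseconds, or null if the header is
// missing or unparseable. Accepts delay-seconds ("120") or an HTTP date.
function parseRetryAfter(header: string | null, now: number = Date.now()): number | null {
  if (header === null) return null;
  const value = header.trim();
  if (/^\d+$/.test(value)) return Number(value) * 1000;
  const date = Date.parse(value);
  if (Number.isNaN(date)) return null;
  return Math.max(0, date - now);
}

// Prefers the server's hint on 429, otherwise falls back to exponential backoff.
function retryDelayMs(response: Response, attempt: number, baseDelayMs: number): number {
  if (response.status === 429) {
    const retryAfter = parseRetryAfter(response.headers.get("Retry-After"));
    if (retryAfter !== null) return retryAfter;
  }
  return baseDelayMs * 2 ** attempt;
}
```

It is usually wise to cap the returned delay as well, so a very large Retry-After value cannot stall a request far beyond its timeout.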

Challenge 3: distributed tracing. Generate a request ID at the start of each request. Pass it through every log entry, error, and external service call. This lets you trace a single request through the entire system.
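On Node.js, AsyncLocalStorage is one way to carry a request ID through async calls without threading it through every function signature. The sketch below assumes an incoming x-request-id header and a JSON log format. How withRequestId would be wired into @hectoday/http is not shown, because that depends on hooks this lesson does not cover.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";
import { randomUUID } from "node:crypto";

const requestContext = new AsyncLocalStorage<{ requestId: string }>();

// Runs fn with a request ID attached to the async context. Reuses an
// upstream ID when one is provided, otherwise generates a new one.
function withRequestId<T>(request: Request, fn: () => T): T {
  const requestId = request.headers.get("x-request-id") ?? randomUUID();
  return requestContext.run({ requestId }, fn);
}

// Every log line picks up the current request ID automatically.
function log(
  level: "info" | "error",
  message: string,
  extra: Record<string, unknown> = {},
): string {
  const line = JSON.stringify({
    level,
    message,
    requestId: requestContext.getStore()?.requestId ?? null,
    ...extra,
  });
  console.log(line);
  return line;
}
```

To trace across services, the same ID would also be sent as a header on outgoing calls, so the payment or email service can log it too.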

Challenge 4: monitoring dashboard. Build a monitoring endpoint that shows: error rate over the last hour, circuit breaker states, queue depth, and average response time. Use the patterns from this course to expose operational health.

In the order flow, why is the payment queued on failure instead of returning an error to the user?

What is the most important layer in the resilience stack?


© 2026 hectoday. All rights reserved.