hectoday
Error Handling and Resilience with @hectoday/http

Circuit breakers

The problem retries do not solve

Retries handle transient failures, those brief blips that clear up in a few seconds. But what if the payment service is completely down? Not a brief timeout, but genuinely offline for minutes or hours.

Every incoming request tries to call the payment service. Each one waits for the 5-second timeout. Each one retries 3 times. Each request takes 15 or more seconds just to fail. Meanwhile, the server is holding connections and memory for all those slow, doomed requests. Users are staring at a loading spinner.

We are wasting time calling a service we already know is down. What we need is a way to detect that a service is failing and stop calling it.

That is what a circuit breaker does. Like a circuit breaker in your house that trips when there is an electrical fault, a software circuit breaker trips when a service is failing and cuts off all calls to it. Instead of waiting for timeouts, requests fail immediately. Periodically, the circuit breaker lets one test request through to check if the service has recovered.

The three states

A circuit breaker has three states:

CLOSED --> (failures reach threshold) --> OPEN --> (timer expires) --> HALF-OPEN
  ^                                        ^                               |
  |                                        |_ (test fails; timer resets) __|
  |________ (test request succeeds) _______________________________________|

CLOSED is the normal state. Requests flow through to the service as usual. The circuit breaker counts consecutive failures, and any success resets the count to zero. When the failure count reaches a threshold (say, 5 failures), the circuit opens.

OPEN means the service is presumed down. Requests are rejected immediately with an error. No calls are made to the service at all. After a timeout period (say, 30 seconds), the circuit moves to half-open.

HALF-OPEN is the testing state. A single test request is allowed through to the service. If it succeeds, the service has recovered and the circuit closes. If it fails, the service is still down, so the circuit opens again and the reset timer restarts.
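The three states and their transitions can also be read as a small lookup. The following is a minimal sketch, not the course's implementation: the `CircuitEvent` type and the `nextState` function are hypothetical names introduced only for illustration, and the function deliberately holds no counters or timers of its own.

```typescript
type CircuitState = "closed" | "open" | "half-open";

// Hypothetical event type: the things that can move the breaker
// from one state to another.
type CircuitEvent =
  | { kind: "success" }
  | { kind: "failure"; failures: number; threshold: number }
  | { kind: "reset-timeout-elapsed" };

// Pure transition function: given the current state and an event,
// return the next state.
function nextState(state: CircuitState, event: CircuitEvent): CircuitState {
  switch (event.kind) {
    case "success":
      // Any success (normal traffic or a half-open test) closes the circuit.
      return "closed";
    case "failure":
      // A failed half-open test reopens the circuit immediately.
      if (state === "half-open") return "open";
      // Otherwise open only once the failure count has reached the threshold.
      return event.failures >= event.threshold ? "open" : state;
    case "reset-timeout-elapsed":
      // Only an open circuit reacts to the timer.
      return state === "open" ? "half-open" : state;
  }
}
```

Keeping the transitions in one pure function makes them easy to check in isolation; the class in the next section folds the same rules into its call, onSuccess, and onFailure methods.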

Implementation

Let’s build it.

Code along
// src/circuit-breaker.ts

type CircuitState = "closed" | "open" | "half-open";

export class CircuitBreaker {
  private state: CircuitState = "closed";
  private failures = 0;
  private lastFailureTime = 0;
  private readonly threshold: number;
  private readonly resetTimeoutMs: number;

  constructor(
    private readonly name: string,
    options: { threshold?: number; resetTimeoutMs?: number } = {},
  ) {
    this.threshold = options.threshold ?? 5;
    this.resetTimeoutMs = options.resetTimeoutMs ?? 30_000;
  }

  async call<T>(fn: () => Promise<T>): Promise<T> {
    if (this.state === "open") {
      // Check if enough time has passed to try again
      if (Date.now() - this.lastFailureTime > this.resetTimeoutMs) {
        this.state = "half-open";
        console.log(`[${this.name}] Circuit half-open, testing...`);
      } else {
        throw new Error(`${this.name} circuit is open`);
      }
    }

    try {
      const result = await fn();
      this.onSuccess();
      return result;
    } catch (err) {
      this.onFailure();
      throw err;
    }
  }

  private onSuccess(): void {
    if (this.state === "half-open") {
      console.log(`[${this.name}] Circuit closed, service recovered`);
    }
    this.failures = 0;
    this.state = "closed";
  }

  private onFailure(): void {
    this.failures++;
    this.lastFailureTime = Date.now();

    if (this.failures >= this.threshold) {
      this.state = "open";
      console.log(`[${this.name}] Circuit OPEN, ${this.failures} failures`);
    }
  }

  isOpen(): boolean {
    return this.state === "open" && Date.now() - this.lastFailureTime <= this.resetTimeoutMs;
  }

  getState(): { state: CircuitState; failures: number } {
    return { state: this.state, failures: this.failures };
  }
}

Let’s trace through the call method, because that is where everything happens.

When you call circuitBreaker.call(fn), the first thing it checks is whether the circuit is open. If it is, there are two possibilities. Either enough time has passed since the last failure (the reset timeout expired), in which case the circuit moves to half-open and lets this request through as a test. Or the timeout has not expired, in which case the request is rejected immediately by throwing an error. No call to the service. No waiting.

If the circuit is closed or half-open, the function fn is called. If it succeeds, onSuccess resets the failure count to zero and closes the circuit. If it was half-open, that means the test request passed and the service is back.

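If fn fails, onFailure increments the failure count and records the time of the failure. If the count has reached the threshold, the circuit opens. Either way, the original error is rethrown, so the caller still sees why the call failed.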

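The isOpen method lets a caller check, without making a call, whether a request would be rejected right now. The getState method is for monitoring, which we will wire up shortly.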
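One detail worth noticing in the trace: once the state is half-open, call lets every caller through, so several concurrent requests can all act as the test request. The sketch below shows one possible way to enforce a single test request. It is a variant written for illustration, not the course's class: the name `SingleProbeCircuitBreaker`, the `probeInFlight` flag, and the injectable `now` clock are all assumptions.

```typescript
type CircuitState = "closed" | "open" | "half-open";

// Hypothetical variant of the lesson's class that admits only one
// test request while half-open.
class SingleProbeCircuitBreaker {
  private state: CircuitState = "closed";
  private failures = 0;
  private lastFailureTime = 0;
  private probeInFlight = false;

  constructor(
    private readonly name: string,
    private readonly threshold = 5,
    private readonly resetTimeoutMs = 30_000,
    // Injectable clock (an assumption, for testability); defaults to Date.now.
    private readonly now: () => number = Date.now,
  ) {}

  async call<T>(fn: () => Promise<T>): Promise<T> {
    if (this.state === "open") {
      if (this.now() - this.lastFailureTime <= this.resetTimeoutMs) {
        throw new Error(`${this.name} circuit is open`);
      }
      this.state = "half-open";
    }

    // Remember whether this particular call is the test request.
    let isProbe = false;
    if (this.state === "half-open") {
      // Only one caller may test the service; everyone else fails fast.
      if (this.probeInFlight) {
        throw new Error(`${this.name} circuit is half-open, test in progress`);
      }
      this.probeInFlight = true;
      isProbe = true;
    }

    try {
      const result = await fn();
      this.failures = 0;
      this.state = "closed";
      return result;
    } catch (err) {
      this.failures++;
      this.lastFailureTime = this.now();
      if (this.failures >= this.threshold) this.state = "open";
      throw err;
    } finally {
      // Release the slot only if this call was the one holding it.
      if (isProbe) this.probeInFlight = false;
    }
  }

  getState(): { state: CircuitState; failures: number } {
    return { state: this.state, failures: this.failures };
  }
}
```

The trade-off is a little extra state in exchange for protecting a recovering service from a burst of simultaneous test traffic. Whether that matters depends on how much concurrent load the route sees.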

Using the circuit breaker

import { CircuitBreaker } from "../circuit-breaker.js";
import { serviceUnavailable } from "../errors.js";

const paymentCircuit = new CircuitBreaker("payment-service", {
  threshold: 5, // Open after 5 failures
  resetTimeoutMs: 30_000, // Try again after 30 seconds
});

route.post("/orders", {
  resolve: async (c) => {
    // ... validate order ...

    // Check if the circuit is open before even trying
    if (paymentCircuit.isOpen()) {
      return serviceUnavailable("Payment service");
    }

    try {
      const charge = await paymentCircuit.call(() => chargeCard(order.total, "tok_123"));
      // ... complete order ...
    } catch (err) {
      // Payment failed (service error or circuit just opened)
      return serviceUnavailable("Payment service");
    }
  },
});

The route checks the circuit state and returns a serviceUnavailable response directly. When the circuit is open, the user sees a friendly 503 response without waiting for a timeout. No throwing, no crashing. The route handles the failure and tells the user what happened.
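If several routes repeat the same check-then-call pattern, it can be pulled into a helper. The following is a sketch under stated assumptions: the `Breaker` interface, the `GuardedResult` type, and the `callGuarded` function are hypothetical names, not part of @hectoday/http. The interface lists only the two methods the helper needs, which the CircuitBreaker class from this lesson already provides.

```typescript
// Minimal shape of the breaker this helper needs; the lesson's
// CircuitBreaker class satisfies it structurally.
interface Breaker {
  isOpen(): boolean;
  call<T>(fn: () => Promise<T>): Promise<T>;
}

// Outcome reported as a value instead of an exception.
type GuardedResult<T> =
  | { ok: true; value: T }
  | { ok: false; reason: "circuit-open" | "call-failed"; error?: unknown };

// Hypothetical helper: runs fn through the breaker and tells the caller
// whether it was skipped (circuit open) or attempted and failed.
async function callGuarded<T>(
  breaker: Breaker,
  fn: () => Promise<T>,
): Promise<GuardedResult<T>> {
  if (breaker.isOpen()) {
    // Fail fast without touching the service.
    return { ok: false, reason: "circuit-open" };
  }
  try {
    return { ok: true, value: await breaker.call(fn) };
  } catch (error) {
    // The call was attempted and failed; keep the error for logging.
    return { ok: false, reason: "call-failed", error };
  }
}
```

A route could then branch on `result.ok` and map both failure reasons to the same 503 response, while still logging which of the two actually happened.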

The benefit

Let’s put some numbers on it.

Without a circuit breaker: 100 requests try the payment service. Each waits 5 seconds for the timeout. Each retries 3 times. Total wasted time: 100 requests times 15 seconds equals 1,500 seconds of hanging connections.

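With a circuit breaker: the first 5 requests fail the slow way, at about 15 seconds each, and trip the circuit. The next 95 requests fail immediately, in milliseconds, not seconds. The server stays responsive. Users get an instant error message instead of a 15-second loading spinner. (This assumes the requests arrive one after another; requests already in flight when the circuit opens still have to wait out their own timeouts.)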
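The arithmetic above can be written out as a short calculation. This is only a sketch of the numbers in the text, under two assumptions: each request makes 3 attempts of 5 seconds each (which is what the 15-second figure implies), and requests arrive one after another so that exactly `threshold` of them fail slowly before the circuit opens. All names here are illustrative.

```typescript
// Numbers taken from the example in the text.
const timeoutSeconds = 5;
const attemptsPerRequest = 3; // assumption: 3 attempts in total per request
const requests = 100;
const threshold = 5;

// Time one request spends waiting before it finally fails.
const secondsPerSlowFailure = timeoutSeconds * attemptsPerRequest; // 15

// Total waiting time for a given number of slow failures.
function totalWaitSeconds(slowFailures: number): number {
  return slowFailures * secondsPerSlowFailure;
}

// Without a breaker, every request fails slowly.
const withoutBreaker = totalWaitSeconds(requests); // 1500

// With a breaker, only the requests that trip the circuit fail slowly;
// the rest are rejected almost instantly and are treated as zero here.
const withBreaker = totalWaitSeconds(Math.min(threshold, requests)); // 75
```

Under these assumptions the breaker cuts the total time spent on doomed calls from 1,500 seconds to 75 seconds.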

Circuit breaker per service

You should create one circuit breaker per external service. Put them in a shared module so both routes and health checks can access the same instances:

Code along
// src/circuits.ts
import { CircuitBreaker } from "./circuit-breaker.js";

export const paymentCircuit = new CircuitBreaker("payment-service", { threshold: 5 });
export const emailCircuit = new CircuitBreaker("email-service", { threshold: 3 });
export const inventoryCircuit = new CircuitBreaker("inventory-service", { threshold: 10 });

Each service has its own failure count and state. If the payment service goes down, the email circuit stays closed. The email service keeps working. You do not want one failing dependency to block everything.

Monitoring circuit state

Expose circuit states in a monitoring endpoint so you can see at a glance which services are healthy:

import { paymentCircuit, emailCircuit, inventoryCircuit } from "./circuits.js";

route.get("/admin/circuits", {
  resolve: () => {
    return Response.json({
      payment: paymentCircuit.getState(),
      email: emailCircuit.getState(),
      inventory: inventoryCircuit.getState(),
    });
  },
});

When a circuit opens, that is a signal that something needs attention. Many production systems surface dependency health in a similar way. We will connect this to health checks later in the course.

Exercises

Exercise 1: Implement the CircuitBreaker class. Set the threshold to 3. Make the simulated payment service fail 100% of the time. After 3 failures, verify the circuit opens.

Exercise 2: Wait for the reset timeout, then send another request. Verify the circuit moves to half-open and lets that request through as a test.

Exercise 3: Make the payment service recover (stop failing). Verify the circuit closes after a successful half-open test.

Circuit breakers stop us from calling services that are down. But they raise a new question: if the payment service is down and the circuit is open, what do we tell the user? Do we just fail the whole order? Maybe not. Next, we will look at fallbacks and graceful degradation, where the app keeps working with reduced functionality instead of failing outright.

© 2026 hectoday. All rights reserved.