Health Checks

Why health checks matter

A container can be running while the app inside it is broken: deadlocked, starved of memory, or unable to connect to the database. From Docker’s perspective, the container is “running” because the process has not exited, yet the app is not serving requests.

Health checks tell Docker, and the tools around it such as Docker Compose, Kubernetes, and load balancers, whether the app inside the container is actually healthy.

The /health endpoint

Every course in this series starts with a /health endpoint. This is why:

route.get("/health", {
  resolve: () => {
    // Optionally check database connectivity
    try {
      db.prepare("SELECT 1").get();
    } catch {
      return Response.json({ status: "unhealthy", error: "database" }, { status: 503 });
    }

    return Response.json({ status: "ok" });
  },
});

This is a simple GET /health route that returns 200 if the app is working and 503 if something is wrong. The health check can verify database connectivity, cache connections, or anything else the app needs to function.
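When the endpoint has to verify more than one dependency, the checks can be collected and run in a loop. The following is a minimal sketch under stated assumptions, not part of the course code: runHealthChecks is a hypothetical helper, and the two checks are stand-ins for real database and cache probes.

```javascript
// Hypothetical helper: runs each named check and builds the data for the /health response.
// A check is any function that throws when its dependency is unavailable.
function runHealthChecks(checks) {
  const failed = [];
  for (const [name, check] of Object.entries(checks)) {
    try {
      check();
    } catch {
      failed.push(name); // remember which dependency failed
    }
  }
  if (failed.length > 0) {
    return { httpStatus: 503, body: { status: "unhealthy", failed } };
  }
  return { httpStatus: 200, body: { status: "ok" } };
}

// Stand-in checks for illustration; real ones would query the database or ping the cache.
const result = runHealthChecks({
  database: () => {}, // pretend the query succeeded
  cache: () => { throw new Error("connection refused"); }, // pretend the cache is down
});
console.log(result.httpStatus, JSON.stringify(result.body));
// prints: 503 {"status":"unhealthy","failed":["cache"]}
```

Inside the route handler, the result could be returned with Response.json(result.body, { status: result.httpStatus }).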

The HEALTHCHECK instruction

Add a health check to the Dockerfile:

HEALTHCHECK --interval=30s --timeout=5s --retries=3 --start-period=10s \
  CMD wget --no-verbose --tries=1 --spider http://localhost:3000/health || exit 1

--interval=30s — Check every 30 seconds.

--timeout=5s — If the check takes more than 5 seconds, consider it failed.

--retries=3 — After 3 consecutive failures, mark the container as unhealthy.

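--start-period=10s — Give the app 10 seconds to start up. Checks that fail during this window do not count toward the retry limit. A check that succeeds during the window marks the container healthy right away, and failures after that count as usual.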

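The command: wget makes an HTTP request to the health endpoint. If the request fails (a connection error or a non-2xx response such as 503), wget exits with a non-zero code, and || exit 1 turns that into exit code 1, which tells Docker the check failed. A check that runs longer than --timeout also counts as a failure.

[!NOTE] We use wget instead of curl because Alpine images include wget (via BusyBox) but not curl. If curl is installed in your image, as it is in many Debian-based images, curl works too: CMD curl -f http://localhost:3000/health || exit 1.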
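To see how --interval, --retries, and --start-period interact, the status transitions can be modeled in a few lines. This is a simplified sketch, not Docker's actual implementation: simulateHealth is a hypothetical function, and it assumes the first check runs one interval after the container starts and that each check completes instantly.

```javascript
// Simplified model of Docker's health status transitions (not Docker's real code).
// `results` lists the outcome of each check in order: true = exit 0, false = failure.
function simulateHealth({ intervalSec, retries, startPeriodSec }, results) {
  let status = "starting";
  let failingStreak = 0;
  const timeline = [];

  results.forEach((passed, i) => {
    const t = (i + 1) * intervalSec; // seconds since container start
    if (passed) {
      failingStreak = 0;
      status = "healthy";
    } else {
      // Failures inside the start period are ignored while the status is still "starting".
      const graced = status === "starting" && t < startPeriodSec;
      if (!graced) {
        failingStreak += 1;
        if (failingStreak >= retries) status = "unhealthy";
      }
    }
    timeline.push(`${t}s:${status}`);
  });
  return timeline;
}

const options = { intervalSec: 30, retries: 3, startPeriodSec: 10 };
// One passing check, then the app breaks and every later check fails.
console.log(simulateHealth(options, [true, false, false, false]).join(" "));
// prints: 30s:healthy 60s:healthy 90s:healthy 120s:unhealthy
```

Under this model, the first check runs at 30 seconds, after the 10-second start period has already ended. The grace period only matters when the start period is longer than the time to the first check.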

Checking health status

docker run -d --name myapp -p 3000:3000 myapp

# Check health status
docker ps
# CONTAINER ID   IMAGE   STATUS                    PORTS
# abc123         myapp   Up 30s (health: starting)  0.0.0.0:3000->3000/tcp

# After the first successful check:
# abc123         myapp   Up 60s (healthy)           0.0.0.0:3000->3000/tcp

# If the app breaks:
# abc123         myapp   Up 5m (unhealthy)          0.0.0.0:3000->3000/tcp

The STATUS column shows healthy, unhealthy, or health: starting.
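For more detail than the STATUS column gives, docker inspect --format '{{json .State.Health}}' myapp prints the health state as JSON, including the failing streak and a log of recent checks. The sketch below condenses that JSON. summarizeHealth is a hypothetical helper, and the sample data is made up for illustration rather than captured from a real container.

```javascript
// Hypothetical helper: condenses the JSON printed by
//   docker inspect --format '{{json .State.Health}}' myapp
function summarizeHealth(healthJson) {
  const health = JSON.parse(healthJson);
  const log = health.Log || []; // Log may be null before the first check
  const last = log[log.length - 1]; // most recent check, if any
  return {
    status: health.Status,
    failingStreak: health.FailingStreak,
    lastExitCode: last ? last.ExitCode : null,
    lastOutput: last ? last.Output.trim() : "",
  };
}

// Made-up sample for illustration; real data comes from docker inspect.
const sample = JSON.stringify({
  Status: "unhealthy",
  FailingStreak: 3,
  Log: [
    {
      Start: "2024-01-01T00:00:00Z",
      End: "2024-01-01T00:00:01Z",
      ExitCode: 1,
      Output: "wget: server returned error: HTTP/1.1 503 Service Unavailable\n",
    },
  ],
});
console.log(summarizeHealth(sample));
```

The last log entry's output is often the quickest way to see why a container was marked unhealthy.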

How orchestrators use health checks

Docker Compose: depends_on with condition: service_healthy waits for a service to be healthy before starting dependent services.

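Docker Swarm: Tasks whose containers become unhealthy are automatically stopped and replaced. A container started with plain docker run is only marked unhealthy; Docker does not restart it on its own.

Kubernetes: The Dockerfile HEALTHCHECK is ignored. Kubernetes uses its own liveness and readiness probes, which can call the same /health endpoint to restart containers or stop routing traffic to them.

Load balancers: Most load balancers probe an endpoint such as /health themselves and remove instances that fail from the pool, so traffic goes only to healthy containers.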

Exercises

Exercise 1: Add the HEALTHCHECK instruction to your Dockerfile. Build and run. Watch the health status transition from “starting” to “healthy” with docker ps.

Exercise 2: Update the /health endpoint to check database connectivity. Simulate a database failure (rename the database file). Watch the container become “unhealthy.”

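Exercise 3: Set --retries=1. Once the container is healthy, break the health endpoint. How quickly does the container become unhealthy? (Answer: the next scheduled check fails and marks the container unhealthy immediately, so it takes at most one interval, about 30 seconds, plus up to 5 seconds if the check times out.)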

Why does the HEALTHCHECK have a start-period?