# Health and Metrics
## From logs to metrics
Individual log entries tell you what happened to one request. Metrics tell you what is happening to all requests: request rate, error rate, average response time, queue depth. Metrics are aggregated numbers derived from log data.
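To make the distinction concrete, here is a minimal sketch of turning raw log entries into aggregate numbers. The entry shape (`status`, `durationMs`) is an assumption for illustration, not a format from this course:

```typescript
// Hypothetical log entries; the shape (status, durationMs) is an assumption.
interface LogEntry {
  status: number;
  durationMs: number;
}

// Aggregate individual entries into metrics: totals, error rate, average latency.
function aggregate(entries: LogEntry[]) {
  const total = entries.length;
  const errors = entries.filter((e) => e.status >= 500).length;
  const avgMs =
    total === 0 ? 0 : entries.reduce((sum, e) => sum + e.durationMs, 0) / total;
  return { total, errors, errorRate: total === 0 ? 0 : errors / total, avgMs };
}

const entries: LogEntry[] = [
  { status: 200, durationMs: 12 },
  { status: 200, durationMs: 20 },
  { status: 500, durationMs: 40 },
  { status: 200, durationMs: 8 },
];
console.log(aggregate(entries)); // one summary object instead of four log lines
```

Each log entry answers "what happened to this request?"; the aggregate answers "how is the service doing?".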
## Types of metrics
- **Counters** — values that only go up: total requests, total errors, total orders placed.
- **Gauges** — values that go up and down: active connections, queue depth, cache size.
- **Histograms** — distributions of values: response time percentiles (p50, p95, p99), request body sizes.
## A simple metrics collector
```ts
// src/metrics.ts
class Metrics {
  private counters: Map<string, number> = new Map();
  private gauges: Map<string, number> = new Map();
  private histograms: Map<string, number[]> = new Map();

  increment(name: string, value: number = 1): void {
    this.counters.set(name, (this.counters.get(name) ?? 0) + value);
  }

  gauge(name: string, value: number): void {
    this.gauges.set(name, value);
  }

  observe(name: string, value: number): void {
    const values = this.histograms.get(name) ?? [];
    values.push(value);
    this.histograms.set(name, values);
  }

  getSnapshot(): Record<string, unknown> {
    const snapshot: Record<string, unknown> = {};
    for (const [name, value] of this.counters) {
      snapshot[`counter.${name}`] = value;
    }
    for (const [name, value] of this.gauges) {
      snapshot[`gauge.${name}`] = value;
    }
    for (const [name, values] of this.histograms) {
      const sorted = [...values].sort((a, b) => a - b);
      snapshot[`histogram.${name}`] = {
        count: sorted.length,
        min: sorted[0],
        max: sorted[sorted.length - 1],
        avg: Math.round(sorted.reduce((a, b) => a + b, 0) / sorted.length),
        p95: sorted[Math.floor(sorted.length * 0.95)],
        p99: sorted[Math.floor(sorted.length * 0.99)],
      };
    }
    return snapshot;
  }

  reset(): void {
    this.histograms.clear();
    // Counters and gauges persist between resets
  }
}

export const metrics = new Metrics();
```

## Collecting metrics from requests
```ts
onResponse: ({ response, locals }) => {
  const duration = Date.now() - (locals.startTime as number);

  // Count requests
  metrics.increment("http.requests.total");
  metrics.increment(`http.requests.${response.status}`);

  // Track response time
  metrics.observe("http.response_time_ms", duration);

  // Track errors
  if (response.status >= 500) {
    metrics.increment("http.errors.5xx");
  }

  // ... normal request logging
},
```

Every request increments the counters and records the response time. After 1,000 requests, the metrics snapshot shows total requests, the error rate, and response time percentiles.
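You can see what a snapshot looks like without running a server by driving the collector directly. The sketch below uses a condensed copy of the `Metrics` class (counters and histograms only) and simulates 100 requests, a few of them slow and a couple of them failing:

```typescript
// Condensed copy of the Metrics collector above: counters + histograms only.
class MiniMetrics {
  private counters = new Map<string, number>();
  private histograms = new Map<string, number[]>();

  increment(name: string, value = 1): void {
    this.counters.set(name, (this.counters.get(name) ?? 0) + value);
  }
  observe(name: string, value: number): void {
    const values = this.histograms.get(name) ?? [];
    values.push(value);
    this.histograms.set(name, values);
  }
  snapshot(): Record<string, unknown> {
    const out: Record<string, unknown> = {};
    for (const [name, v] of this.counters) out[`counter.${name}`] = v;
    for (const [name, values] of this.histograms) {
      const sorted = [...values].sort((a, b) => a - b);
      out[`histogram.${name}`] = {
        count: sorted.length,
        max: sorted[sorted.length - 1],
        p95: sorted[Math.floor(sorted.length * 0.95)],
      };
    }
    return out;
  }
}

const m = new MiniMetrics();
// Simulate 100 requests: 97 fast (10 ms), 3 slow (500 ms), 2 server errors.
for (let i = 0; i < 100; i++) {
  m.increment("http.requests.total");
  m.observe("http.response_time_ms", i < 97 ? 10 : 500);
  if (i < 2) m.increment("http.errors.5xx");
}
console.log(m.snapshot());
```

Note how the p95 stays at 10 ms even though the max is 500 ms: only 3% of requests were slow, which is exactly the kind of distinction an average would hide.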
## A health/metrics endpoint
```ts
route.get("/health", {
  resolve: () => {
    const snapshot = metrics.getSnapshot();
    const dbOk = checkDatabase();
    const uptime = process.uptime();
    return Response.json({
      status: dbOk ? "healthy" : "degraded",
      uptime: Math.round(uptime),
      metrics: snapshot,
    });
  },
});
```

> [!NOTE]
> The Deploying with Docker course added a `/health` endpoint for container health checks. Now it returns metrics too — the operations team sees request rate, error rate, and response times at a glance.
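The endpoint above calls a `checkDatabase()` helper without showing it. One possible shape (an assumption, not part of the original): since the route calls it synchronously, run the real probe in the background and have `checkDatabase()` return the last known result, so the health endpoint never blocks on the database:

```typescript
// Sketch of a non-blocking checkDatabase(): a background probe updates a
// cached flag, and the health endpoint reads that cache synchronously.
let dbHealthy = true; // last known state; optimistic until a probe fails

// Hypothetical async probe, e.g. a `SELECT 1` against the real database.
async function pingDatabase(): Promise<void> {
  // ... real query here
}

// Re-check the database on a fixed interval and cache the result.
function startDbProbe(intervalMs = 10_000): void {
  setInterval(async () => {
    try {
      await pingDatabase();
      dbHealthy = true;
    } catch {
      dbHealthy = false;
    }
  }, intervalMs);
}

function checkDatabase(): boolean {
  return dbHealthy;
}
```

The trade-off is staleness: the health status can lag behind reality by up to one probe interval, in exchange for a `/health` route that responds instantly even when the database is down.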
## Periodic metric logging
Log metrics at a fixed interval for trend analysis:
```ts
setInterval(() => {
  const snapshot = metrics.getSnapshot();
  logger.info("metrics snapshot", snapshot);
  metrics.reset(); // Reset histograms for the next interval
}, 60_000); // Every 60 seconds
```

```json
{
  "level": "info",
  "message": "metrics snapshot",
  "counter.http.requests.total": 1234,
  "counter.http.errors.5xx": 3,
  "histogram.http.response_time_ms": {
    "count": 1234,
    "min": 2,
    "max": 450,
    "avg": 18,
    "p95": 45,
    "p99": 120
  }
}
```

One log line per minute with the full picture. Search for `message: "metrics snapshot"` to plot trends over time.
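Once snapshots are logged, derived numbers fall out of them directly. A sketch, using the flat field names from the example above (note that because counters persist across resets, per-interval rates would come from the delta between two consecutive snapshots):

```typescript
// Derive an overall error rate from a snapshot's flat counter fields.
function errorRate(snapshot: Record<string, unknown>): number {
  const total = (snapshot["counter.http.requests.total"] as number) ?? 0;
  const errors = (snapshot["counter.http.errors.5xx"] as number) ?? 0;
  return total === 0 ? 0 : errors / total;
}

// The snapshot from the example log line above.
const snapshot = {
  "counter.http.requests.total": 1234,
  "counter.http.errors.5xx": 3,
};

console.log((errorRate(snapshot) * 100).toFixed(2) + "%"); // → "0.24%"
```

The same pattern works for requests per second (counter delta divided by the interval length) or any other ratio of snapshot fields.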
## Exercises
**Exercise 1:** Build the `Metrics` class. Collect request counts and response times in `onResponse`.

**Exercise 2:** Add a `/health` endpoint that returns the metrics snapshot. Make 50 requests. Check the snapshot.

**Exercise 3:** Add periodic metric logging (every 30 seconds). Watch the snapshots accumulate in the logs.
Why collect metrics in addition to request logs?