Concurrent writers

Back in section 1 we mentioned, in passing, that running two server processes against the same flat file would cause problems. We pointed at it, said “more on that later,” and kept moving. This is the later.

So far, every approach we have built assumes exactly one server process. A single process is a quiet, cooperative place. There is only ever one thing running your code at a time. The moment you step outside that assumption, a bunch of guarantees quietly stop being true, and the storage strategy you picked suddenly matters a lot more. This lesson is about what changes.

What “single-threaded” actually means

JavaScript runtimes are single-threaded. Inside one Node process, there is exactly one thread executing your application code at any moment. Two HTTP requests cannot both be running createUser at the same instant. One of them is running. The other is waiting on the event loop.

That is hugely convenient for us. The in-memory Map<string, User> cannot be in an inconsistent state from a partial write, because there is no such thing as a partial write at the JavaScript level. Either users.set(id, user) has executed fully, or it has not started yet. There is no in-between state any other code can observe.

The same is roughly true for appendFileSync. The file is opened with O_APPEND, which makes the seek-to-end and the write a single atomic step in the kernel, so two appends can never land at the same offset. (The often-cited PIPE_BUF guarantee is actually about pipes, not regular files, but local filesystems serialize small appends to ordinary files anyway.) One write, one record, no interleaving.

So inside one process, our flat-file approaches just work for concurrent requests. There are no race conditions because, with fully synchronous handlers, there is no real concurrency. Every request is handled start to finish before the next one starts.
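
To make that concrete, here is roughly what the map-backed write path from earlier lessons looks like. Treat it as a sketch: the User shape, the file name, and the function name are stand-ins for the course code.

import { appendFileSync } from "node:fs";
import { randomUUID } from "node:crypto";

type User = { id: string; name: string; email: string };

const FILE = "users.jsonl";            // one file on disk
const users = new Map<string, User>(); // one map, one process

function createUser(name: string, email: string): User {
  const user: User = { id: randomUUID(), name, email };
  // One small write(2) under O_APPEND: the record lands as one contiguous line.
  appendFileSync(FILE, JSON.stringify(user) + "\n");
  // Everything here is synchronous, so no other request can observe the map
  // between the append and the set. No partial states, no races.
  users.set(user.id, user);
  return user;
}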

What multiple processes break

The moment you start a second Node process, for any reason (redundancy, horizontal scaling, a rolling deploy with overlap, autoscaling, debugging), all of those nice guarantees get quietly revoked.

Let us be concrete. Consider the in-memory map store. Two Node processes, same users.jsonl file on disk, each with their own in-memory Map<string, User>.

Process A receives a POST /users request. It appends a line to the file, updates its own map, returns.

Process B has no idea any of this happened. Its map does not contain the new user. If a subsequent request comes in to process B and asks for that new user, B returns a 404. The user exists on disk. The user exists in process A’s map. Process B has never heard of them.

And the file itself? The file is fine, actually. OS-level append is atomic for small writes, so both processes can append simultaneously and both records land cleanly with no interleaving. The file is consistent. The problem is that each process’s in-memory copy has drifted from the file and from each other.

What do you think happens if a third request creates another user through B? The file now has two new records, each of which only one process knows about. And if both processes happen to generate the same random UUID? Unlikely but possible, and now you have two different users with the same id. Every layer below the in-memory map is fine. Every layer above it is broken.
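
You can reproduce the drift by hand with the servers from the benchmarking lesson. The exact routes and payload shape here are assumptions; adjust to whatever your server.ts actually exposes.

# Two map-backed processes, same file
APPROACH=map PORT=8081 node src/server.ts &
APPROACH=map PORT=8082 node src/server.ts &

# Create a user through process A
curl -s -X POST -H 'Content-Type: application/json' \
  -d '{"name":"Ada","email":"ada@example.com"}' http://localhost:8081/users
# => 201, with the new user's id in the response

# Immediately fetch that user through process B
curl -s http://localhost:8082/users/<id-from-above>
# => 404. The record is on disk and in A's map. B has never heard of it.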

You can work around this in a few ways, and they all have real costs.

  • Re-read the file on every request. You are back to linear scan performance.
  • Watch the file for changes and apply new lines to the map as they arrive. File watchers are not fully reliable, and you have to handle partial reads.
  • Use a pub/sub system to broadcast writes between processes. You have just introduced a distributed cache invalidation problem, which is one of the hardest problems in computing.
  • Skip the in-memory map entirely and always read from the file (sketched just below).

You can do any of these. They all cost something. They are, collectively, why databases exist.
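
Here is a minimal sketch of that last option, assuming the one-JSON-object-per-line users.jsonl format from earlier lessons. It is correct under any number of processes, because the file is the only source of truth, but every lookup pays the full scan:

import { readFileSync } from "node:fs";

type User = { id: string; name: string; email: string };

// No cache. Every lookup re-reads the whole file, so a record appended by
// any other process is visible immediately. Correctness bought with O(n) reads.
function getUser(id: string): User | undefined {
  for (const line of readFileSync("users.jsonl", "utf8").split("\n")) {
    if (!line) continue;
    const user: User = JSON.parse(line);
    if (user.id === id) return user;
  }
  return undefined;
}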

How SQLite handles this

SQLite has a coherent and genuinely impressive answer for multi-process access. Multiple processes can safely use the same SQLite file at the same time, without a server, without a coordinator. Let us look at how.

SQLite uses file locks at the OS level. On Linux and macOS that is fcntl; on Windows it is LockFileEx. There are five lock states, and a SQLite connection transitions between them as it reads and writes.

  • UNLOCKED. Nobody has the file.
  • SHARED. One or more readers have the file open. Other readers can join.
  • RESERVED. One writer is preparing to write. Readers can still acquire SHARED locks.
  • PENDING. The writer is waiting for current readers to finish before it can upgrade.
  • EXCLUSIVE. The writer has full access. Nobody else can read or write.

The basic flow is: a reader acquires SHARED, reads what it needs, releases. A writer starts by acquiring RESERVED, then upgrades to PENDING (which prevents new readers from joining), waits for any existing readers to release their SHARED locks, upgrades to EXCLUSIVE, writes, then releases everything.

In rollback journal mode, which is still SQLite's default unless you opt into WAL, writers block all readers during the actual write. Safe, but contended under load.

In WAL mode, which is what we enabled in the SQLite lesson, writers go to a separate WAL file while readers continue reading the main database file. Readers see a consistent snapshot of the state from before the writer started. The writer only blocks other writers, not readers.

But here is the catch that is easy to miss. SQLite still only allows one writer at a time, even in WAL mode. Two processes cannot both be writing to the same SQLite database at the same exact moment. The second writer blocks until the first one commits. If it waits too long, it gets SQLITE_BUSY and has to retry.
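
The standard mitigation is a busy timeout: instead of failing immediately, a blocked writer waits up to some interval for the lock before giving up with SQLITE_BUSY. A sketch using the better-sqlite3 driver (which may not be what your server.ts uses; the table is assumed from the SQLite lesson):

import Database from "better-sqlite3";

const db = new Database("users.db");
db.pragma("journal_mode = WAL");  // writers append to the WAL, readers keep reading
db.pragma("busy_timeout = 5000"); // a blocked writer waits up to 5s before SQLITE_BUSY

const insert = db.prepare("INSERT INTO users (id, name, email) VALUES (?, ?, ?)");

function createUser(id: string, name: string, email: string): void {
  try {
    insert.run(id, name, email);
  } catch (err) {
    // Another process held the write lock past the timeout.
    if ((err as { code?: string }).code === "SQLITE_BUSY") {
      throw new Error("database busy, retry the write");
    }
    throw err;
  }
}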

What this means in practice

For a single server with one Node process, none of this matters. Single-threaded JavaScript makes coordination free. Flat files are fine, the map is fine, SQLite is fine.

For a single server with multiple Node processes (say four worker processes behind a local load balancer for redundancy or CPU utilization), SQLite handles it gracefully. All four can read concurrently without blocking each other. Writes serialize through one writer lock, but for most CRUD workloads, writes are infrequent enough that the serialization is invisible.

For a multi-server deployment, SQLite cannot save you. The SQLite file is local to one machine. Sharing it across machines over NFS is a famously bad idea, because file locking does not work reliably on network filesystems, and SQLite's safety depends on those locks. Giving each server its own SQLite file means your data is now sharded by “whichever server handled the request first,” which is almost never what you want. This is the point where you finally reach for a real client/server database.

That transition, by the way, is not about user count or data size. It is about topology. The moment your application runs on more than one machine and all the machines need to see the same data, you need something that speaks over a network. Postgres. MySQL. Or a service that exposes SQLite over a network, like Turso or LiteFS.

Measuring write contention

Let us actually see what happens when two writers contend on the same SQLite file.

Start two Node servers against the same users.db:

APPROACH=sqlite PORT=8081 node src/server.ts &
APPROACH=sqlite PORT=8082 node src/server.ts &

This relies on the PORT env var we added to server.ts back in the benchmarking lesson, so the two processes bind to different ports.

In two separate terminals, hit them simultaneously:

# Terminal 1
wrk -t2 -c25 -d10s -s post_users.lua http://localhost:8081

# Terminal 2
wrk -t2 -c25 -d10s -s post_users.lua http://localhost:8082

Each process is now driving 25 concurrent writers against the same SQLite file. What you should see:

  • Combined throughput is roughly the same as running one server with -c50.
  • Average latency is noticeably higher, because writes are serializing through the lock.
  • Occasionally, you see SQLITE_BUSY errors if a writer times out waiting.

The contention is real. Multiple writers on the same SQLite file do not give you any throughput improvement. They just add the overhead of acquiring and releasing the writer lock. SQLite is a single-writer system by design.

For comparison, try the same test with the flat-file APPROACH=map implementation on both servers. Both processes will happily write to the same file. Both will return 201 Created quickly. And both will silently disagree about what the data looks like in their respective in-memory maps. The throughput chart looks great. The data integrity is a mess.

A quick check before moving on. Two Node processes both serve writes to the same JSONL file using appendFileSync. Both have their own in-memory map. A user is created via process A, then immediately fetched via process B. What happens?

The Postgres comparison

Postgres handles concurrent writers very differently. Multiple connections can write to the same database simultaneously. It uses row-level locking plus MVCC (multi-version concurrency control) to keep things consistent. Two clients inserting different users do not contend at all. Two clients updating the same row contend, but only on that row.
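
A sketch of what that looks like from Node with the pg driver. The connection details and table are placeholders; the point is that both inserts can be in flight at once, and neither waits on the other:

import { Pool } from "pg";

const pool = new Pool(); // reads the standard PG* environment variables

async function demo(): Promise<void> {
  // Two different rows: with MVCC plus row-level locks, these never contend.
  await Promise.all([
    pool.query("INSERT INTO users (id, name) VALUES ($1, $2)", ["u1", "Ada"]),
    pool.query("INSERT INTO users (id, name) VALUES ($1, $2)", ["u2", "Lin"]),
  ]);
  await pool.end();
}

demo();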

That is the architectural unlock that lets Postgres scale to dozens or hundreds of concurrent writers. SQLite’s single-writer model gives up that scalability in exchange for a single-file, no-separate-process model that is dramatically simpler to operate.

For most applications, SQLite’s model is fine. Modern hardware can serialize thousands of writes per second through a single writer, which is more throughput than most products actually generate. When it stops being fine (when your write throughput genuinely exceeds what a single writer can serialize), you move to Postgres.

A note on Node worker threads

What about scaling inside a single Node process using worker threads? This seems like it could avoid the multi-process problem, because the workers live inside the same runtime instance.

It does not help here, for two reasons. First, Node’s worker threads each have their own isolated JavaScript engine state. There is no shared Map<string, User> across workers. Messages pass between them by structured cloning, not shared memory (SharedArrayBuffer is the exception, and you cannot put a Map in one). So the coherence problem reappears between workers. Second, the use cases for worker threads are mostly CPU-bound work like parsing, encoding, or image processing. For I/O-bound HTTP serving, the standard pattern is multiple processes behind a load balancer, not multiple workers inside one process.
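
You can see the isolation from the first point directly. A Map handed to a worker is structured-cloned, so mutations on the worker’s copy never reach the parent. A sketch (whether a worker can load a .ts entry point depends on your runner):

import { Worker, isMainThread, parentPort, workerData } from "node:worker_threads";

if (isMainThread) {
  const users = new Map([["u1", { name: "Ada" }]]);
  // workerData is structured-cloned: the worker gets a copy, not a reference.
  const worker = new Worker(new URL(import.meta.url), { workerData: users });
  worker.on("message", () => {
    console.log(users.has("u2")); // false: the worker mutated its own copy
  });
} else {
  (workerData as Map<string, { name: string }>).set("u2", { name: "Lin" });
  parentPort!.postMessage("done");
}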

So the multi-writer story is the same whether the “second writer” is a worker inside the same process, another Node process on the same machine, or another machine entirely.

What this means for your design

Three simple rules of thumb.

One server, one process. Anything works. Flat files are fine, the map is fine, SQLite is fine. The complexity savings from a single-process deployment are real and underrated. Never miss an opportunity to stay here.

One server, multiple processes. Use SQLite or a network database. Flat files with in-memory state diverge. SQLite handles it correctly, but serializes writes. Postgres handles it natively.

Multiple servers. You need a network database. A SQLite-over-the-network service like Turso or LiteFS can work in some configurations. Postgres and MySQL handle this natively. There is no good flat-file answer for multi-server.

If you can keep your application at “one server, one process” for as long as possible, you avoid an entire category of operational complexity. Many products stay there for years. Some never leave. There is nothing embarrassing about shipping the simpler thing.

In the final lesson of this section we look at the last constraint that breaks flat files completely. Atomic writes across multiple records, and what it takes to get them right.
