Transaction basics

You just learned how to insert rows. Most interesting apps, though, do not insert one row at a time and call it a day. A signup creates a user row and a matching profile row together. Adding a tag to a book creates multiple rows in a junction table. Even the migrations you have been writing insert into two tables: the schema change itself and the tracking row in schema_migrations. In every one of those cases, the writes have to hold together. If any one of them fails, you want all of them to fail. That is what a transaction gives you.

You have also been staring at BEGIN and COMMIT in every migration file since Section 3 without a proper definition. This lesson is where that wrapper stops being a mystery, and where we set up a pattern we will use for the rest of the course: any time we do something that only makes sense if all its pieces succeed, we wrap it in a transaction.

A transaction groups multiple SQL statements into a single atomic unit. “Atomic” means either every statement succeeds, or none of them do. There is no in-between state where half the changes went through and the other half did not. That guarantee is the whole point.

The three statements

Every transaction uses some combination of three statements:

BEGIN starts a transaction.
COMMIT saves all changes made since BEGIN.
ROLLBACK discards all changes made since BEGIN.

You have already seen BEGIN and COMMIT in every migration file. ROLLBACK is the one we have not used yet. It is the “undo” button you reach for when something goes wrong partway through.

Why you want atomicity

The clearest example is one you have been running without thinking about it: every migration. Each migration does two things that belong together — it applies the schema change, and it inserts a row into schema_migrations to record that the change ran.

ALTER TABLE users ADD COLUMN bio TEXT;
INSERT INTO schema_migrations (version, name) VALUES ('002', 'add_bio_to_users');

Without a transaction, these two statements are independent. If the INSERT fails (maybe a constraint rejection, maybe the process dies between them), the ALTER TABLE has already landed. You now have a schema that is technically correct but not recorded anywhere. The next time the runner starts, it will try to apply migration 002 again and immediately fail because the bio column already exists.

Wrapping the pair in a transaction fixes it:

BEGIN;
  ALTER TABLE users ADD COLUMN bio TEXT;
  INSERT INTO schema_migrations (version, name) VALUES ('002', 'add_bio_to_users');
COMMIT;

If anything between BEGIN and COMMIT fails, neither change is saved. The database is left exactly as it was before. This is the shape every migration file takes, and now you can see why.

The same logic applies to anything in your application where multiple writes need to hold together. In the next section we will build tables for authors, books, tags, profiles, and reviews. Signing up a user will insert into users and into profiles. Tagging a book will insert several rows into book_tags. Inserting an author’s first book will insert one row into authors and another into books that references it. Each of those is two or more writes that only make sense together, so each one will get wrapped in a transaction, exactly like the migration above.

Rolling back explicitly

You can also roll back a transaction yourself, without waiting for an error. This is useful when your application decides mid-way that the transaction should not go through after all. Try it in the shell to make it stick:

Code along

BEGIN;
INSERT INTO users (name, email) VALUES ('Test', '[email protected]');
SELECT * FROM users WHERE name = 'Test';   -- the row is visible inside the transaction
ROLLBACK;
SELECT * FROM users WHERE name = 'Test';   -- gone after the rollback

Inside the transaction, the INSERT behaves like any other write. You can query it, join against it, whatever you want — it is just not permanent yet. Once you ROLLBACK, it is as if that INSERT never happened. Swap the ROLLBACK for a COMMIT and the row sticks.

ROLLBACK is what saves you when your application logic decides mid-transaction that the whole thing should not go through. Imagine a signup flow that creates a user row and then runs a spam check that says the signup is a bot: you issue ROLLBACK and the database looks exactly like the signup never started.

What we are not covering here

Transactions have more depth than the basics. A helper for writing transactions from application code, the surprisingly large performance win that one transaction gives you over a thousand individual inserts, keeping denormalized counts in sync across statements, savepoints for nested transactions — all of that comes later in the course, after you have seen indexes and queried data across related tables. For now you only need this layer: atomicity, BEGIN, COMMIT, ROLLBACK.

Exercises

Exercise 1: Open the SQLite shell and run BEGIN;, then any INSERT, then ROLLBACK;. Run SELECT * FROM users afterwards to confirm the row is not there. Now do the same thing but use COMMIT; instead. Verify the row persisted this time.

Exercise 2: Run BEGIN; and two INSERT statements that add two different users. Before committing, open a new sqlite3 connection in another terminal and run SELECT * FROM users. Do you see the two new rows from the first session? (You should not, until the first session commits.)

Exercise 3: Open any migration file you have written so far. Confirm it starts with BEGIN and ends with COMMIT, with the INSERT INTO schema_migrations row just before the COMMIT. In one sentence, explain what would go wrong if that tracking INSERT came after the COMMIT instead of before.

What does a transaction guarantee?

When would you use ROLLBACK instead of COMMIT?

You now know enough about transactions to read any migration file and understand exactly why it is shaped the way it is. From here on in the course, whenever we do something that only makes sense if all its steps succeed (inserting an author and their first book in one go, tagging a book with several tags at once, signing a user up with a matching profile row), we wrap it in BEGIN / COMMIT. In the next lesson we turn to the other half of CRUD, the one you will do the most: reading data back out with SELECT.

Access Required

Transaction basics

The three statements

Why you want atomicity

Rolling back explicitly

What we are not covering here

Exercises