JOINs
In the last section, we spent a lot of time splitting data across tables. Authors live in authors, books live in books, and tags live in tags. That separation is great for data integrity, but it creates a small problem. If you want to display something as simple as “The Old Man and the Sea by Ernest Hemingway,” the title lives in one table and the author’s name lives in another. You need a way to stitch those rows back together at read time.
That is what JOINs are for. A JOIN tells SQLite, “here are two tables, here is how they relate, please give me a result that combines them.” In this lesson we will cover the JOINs you will use every day: INNER JOIN, LEFT JOIN, chained joins across many tables, and self-joins where a table references itself.
INNER JOIN
Let’s start with the most common one. An INNER JOIN returns only the rows that have a match in both tables:
SELECT books.title, authors.name AS author_name
FROM books
INNER JOIN authors ON books.author_id = authors.id; Read this from the top. We are selecting the title column from books and the name column from authors, renaming the second one to author_name so the output is clearer. The FROM books tells SQLite that books is our starting table. Then INNER JOIN authors ON books.author_id = authors.id says: for each row in books, find the row in authors whose id equals this book’s author_id.
If SQLite finds a match, the combined row shows up in the result. If a book somehow had an author_id that did not match any author, it would be excluded. In practice this should not happen if you defined your foreign keys correctly, but it is good to know the rule.
One small note. INNER JOIN is the default flavor, so writing just JOIN means the same thing. You will see both styles in the wild.
LEFT JOIN
A LEFT JOIN is a little more forgiving. It returns every row from the left table, even when there is no match in the right table. When there is no match, the columns from the right table come back as NULL:
SELECT authors.name, books.title
FROM authors
LEFT JOIN books ON books.author_id = authors.id; Here authors is the “left” table because it comes first. For every author, SQLite tries to find their books. If an author has books, they appear once per book. If an author has no books at all, they still appear in the result, but books.title is NULL.
| name | title |
|--------------------|--------------------------|
| Ernest Hemingway | The Old Man and the Sea |
| Ernest Hemingway | A Farewell to Arms |
| New Author | NULL | <- no books Why do we care about this? Two big reasons.
The first is counting with zeros. Imagine you want to answer “How many books does each author have?” If you use an INNER JOIN, authors with no books disappear completely from the result. That is almost never what you want. You want them to show up with a count of 0.
The second is optional relationships. Think “show users with their profiles.” Some users have not filled out a profile yet. If you INNER JOIN, those users vanish. With a LEFT JOIN, they still appear, just with NULLs where the profile data would be.
The difference matters
Let’s make the difference concrete with a counting example:
-- INNER JOIN: only authors who have books
SELECT authors.name, COUNT(books.id) AS book_count
FROM authors
INNER JOIN books ON books.author_id = authors.id
GROUP BY authors.id;
-- "New Author" (0 books) is MISSING from results
-- LEFT JOIN: all authors, including those with 0 books
SELECT authors.name, COUNT(books.id) AS book_count
FROM authors
LEFT JOIN books ON books.author_id = authors.id
GROUP BY authors.id;
-- "New Author" appears with book_count = 0 Same data, two different questions. The INNER version answers “for authors who have books, how many?” The LEFT version answers “for every author, how many books, even if the answer is zero?”
A good rule of thumb: use INNER JOIN when you only care about rows with matches, and LEFT JOIN when “no match” is still a meaningful answer you want to see.
Multiple JOINs
You are not limited to two tables. You can chain JOINs to pull data from several tables in a single query:
SELECT
books.title,
authors.name AS author_name,
tags.name AS tag_name
FROM books
JOIN authors ON books.author_id = authors.id
LEFT JOIN book_tags ON books.id = book_tags.book_id
LEFT JOIN tags ON book_tags.tag_id = tags.id; Walk through this one step at a time. We start from books. Then we INNER JOIN authors because every book must have an author. Then we LEFT JOIN book_tags, the join table that connects books and tags, because a book may or may not have tags. Finally, we LEFT JOIN tags to get the actual tag names.
The shape of the result might surprise you at first. You get one row per book-tag combination. If a book has three tags, it shows up three times. If a book has no tags at all, it still appears once, with tag_name as NULL because of the LEFT JOIN. This is normal, and aggregation in the next lesson will help us reshape it when we need to.
Table aliases
Long table names make queries noisy. SQL lets you give each table a short alias:
SELECT b.title, a.name AS author_name
FROM books b
JOIN authors a ON b.author_id = a.id
WHERE a.name LIKE 'Ernest%'; Writing books b creates an alias b for the books table. From that point on, b.title means books.title. Same idea for authors a. Once your queries start pulling from four or five tables, aliases become essential for readability. This is exactly how production queries are written.
Self-joins
Sometimes a table needs to JOIN to itself. This sounds strange at first, but it comes up any time a row references another row in the same table. A classic example is an employees table where each employee has a manager, and the manager is also an employee:
CREATE TABLE employees (
id TEXT PRIMARY KEY,
name TEXT NOT NULL,
manager_id TEXT,
FOREIGN KEY (manager_id) REFERENCES employees(id)
);
SELECT e.name, m.name AS manager_name
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.id; Here employees appears twice with different aliases. e represents the employee, and m represents the manager (who is also sitting in the employees table). The join matches each employee’s manager_id to the id of another row in the same table. We use LEFT JOIN so that employees without a manager, like the CEO, still show up in the result with a NULL manager_name.
What do you think would happen if we used INNER JOIN here? The CEO would disappear, because there is no row to match their NULL manager_id. That is almost never what you want for a hierarchy query.
Exercises
Exercise 1: Write an INNER JOIN to get all books with their author names. What happens to books without authors?
Exercise 2: Rewrite the query as a LEFT JOIN. What changes in the results?
Exercise 3: Write a query that joins books, authors, and tags. Show each book’s title, author name, and all tag names.
Exercise 4: Count books per author using LEFT JOIN. Verify that authors with 0 books show a count of 0.
When should you use LEFT JOIN instead of INNER JOIN?
JOINs combine rows from different tables, which is great when you want to see the details. But sometimes the question is not “show me the rows,” it is “give me a summary.” How many books per author? What is the average rating? Which genre has the most? That kind of question needs aggregation, and that is what the next lesson is about.