Retrieving Data from Multiple Tables
Because a relational database consists of many tables linked with related columns, each individual table is usually incomplete for most business purposes. Different table’s data usually needs to be combined to give complete results for business purposes. In order to retrieve data from more than one table you use a JOIN. The data that had been split apart during normalization can be joined back together in one result set. The JOIN is based on the related column between the tables. A table is related to another table with a foreign key. It is usually with this relationship of the primary key of one table and the foreign key of the other table, which are the matching columns that are used with the JOIN operator.
There are different types of JOINS:
INNER JOIN Returns rows that have matching values in both tables LEFT JOIN Returns all rows from the left table, and matched rows from the right table RIGHT JOIN Returns all rows from the right table, and matching rows from the left table FULL OUTER JOIN Returns all rows from both tables, even if there is not a related key
Let’s take a look at the ERD and the same 3 tables with data.
magKey | magName | magPrice |
---|---|---|
1 | Fishing in the Mojave | 13.95 |
2 | Car Racing Made Easy | 15.45 |
3 | Pine Cone Computing | 17.50 |
4 | Cooking Like Mad | 18.00 |
5 | If Only I Could Sing | 12.45 |
6 | Beautiful Birds | 12.45 |
7 | Corn Shucking for Fun and Profit | 15.05 |
8 | MySQL Magic | 10.95 |
subKey | magKey | scribKey | subStartDate | SubLength |
---|---|---|---|---|
1 | 1 | 1 | 2011-03-01 | 12 |
2 | 2 | 2 | 2011-03-01 | 14 |
3 | 6 | 3 | 2012-02-01 | 12 |
4 | 6 | 5 | 2012-02-01 | 12 |
5 | 4 | 3 | 2011-09-01 | 12 |
6 | 7 | 5 | 2012-07-01 | 24 |
7 | 7 | 4 | 2012-08-01 | 12 |
8 | 1 | 3 | 2011-05-01 | 12 |
9 | 1 | 4 | 2011-09-01 | 12 |
10 | 5 | 3 | 2011-12-01 | 12 |
11 | 3 | 3 | 2011-05-01 | 18 |
scribKey | lastName | firstName | address | city | state | zip |
---|---|---|---|---|---|---|
1 | Johnston | Julie | 10336 NE 187th St | Bothell | WA | 98012 |
2 | Anderson | Albert | 220 K Street Southeast | Auburn | WA | 98002 |
3 | Sanders | Samantha | 316 Union Ave | Snohomish | WA | 98290 |
4 | Jimenez | Jose | 187 27th Ave | Seattle | WA | 98122 |
5 | Lamont | Lucy | 175 Smokey Point Dr | Lakewood | WA | 98409 |
6 | Wong | Walter | 1073 South 323rd Street | Federal Way | WA | 98003 |
Using single table queries, you could get information like a list of the magazine subscribers or a list of the magazines. But to get useful information like all the subscribers of a certain magazine, you would need to use a JOIN clause.
INNER JOIN
The most common way to join tables is with an INNER JOIN. There are different ways to write the syntax to join tables but we will be using the explicit syntax, which is considered the “best practice” way to write the syntax for a join.
SELECT select_list FROM table_1 JOIN table_2 ON join_condition_1 JOIN table_3 ON join_condition_2 [. . .]
To create the join you specify all the tables that will have data that will be included in the result set or the filtering condition and specify how they are related to each other. Nothing in the database table definitions can instruct MySQL how to join the tables; we have to do that ourselves.
Lets join two of the tables together. Say we need to have a list of the subscribers names and subscription start dates that belong to each person. Some people have more than one magazine that they subscribe to, so there might be people listed more than once with different subscription start dates.
SELECT lastName, firstName, subStartDate
We know that for this query’s SELECT we will have columns from more than one table. The first and last name from the subscriber table and the subscription start date from the subscription table. We don’t have a filter, or WHERE clause in this case, but if we did, we’d also want to take in to consideration what table column was being used there as well. But, in our case, we know we need to join the subscriber and the subscription tables together.
With an INNER JOIN only people who actually have a subscription, and only subscription start dates that correspond to subscribers will show up. Remember: INNER JOINS will only return rows that have matching values in both tables.
The INNER JOIN keyword follows the first table and specifies which table you are joining to the first table. Then the ON clause specifies the relationship between the two tables; which are the columns that each table has in common. In our case the scribKey as the primary key from the subscriber table and the scribKey as the foreign key from the subscription table.
FROM subscriber INNER JOIN subscription ON subscriber.scribKey = subscription.scribKey
Because, in our case, the primary and foreign key have the exact same name, we must qualify the column name by preceding it with the table name and a period. Each of the two tables is represented, this is important because MySQL needs to know from which table the column name came from.
Here is the whole query with an ORDER BY to sort by last name.
SELECT lastName, firstName, subStartDate FROM subscriber INNER JOIN subscription ON subscriber.subKey = subscription.subKey ORDER BY lastName;
lastName | firstName | subStartDate |
---|---|---|
Anderson | Albert | 2011-03-01 |
Jimenez | Jose | 2012-08-01 |
Jimenez | Jose | 2011-09-01 |
Johnston | Julie | 2011-03-01 |
Lamont | Lucy | 2012-07-01 |
Lamont | Lucy | 2012-02-01 |
Sanders | Samantha | 2011-05-01 |
Sanders | Samantha | 2011-12-01 |
Sanders | Samantha | 2012-02-01 |
Sanders | Samantha | 2011-05-01 |
Sanders | Samantha | 2011-09-01 |
Now we have a result set with data from two different tables. We can see that most of our subscribers have a subscription to more than one magazine and therefore have a number of different start dates.
Let’s look at another INNER JOIN. In this example we will be joining all 3 tables. We will be using table aliases for each table that we can then use to qualify the tables in the ON clause. ‘sr’ to represent ‘subscriber’, ‘sn’ to represent ‘subscription’, and m to represent magazine. You can use the keyword AS before each alias, but it is optional.
Once you give a table an alias you must refer to that alias throughout the query if you need to qualify any column names.
We will also be using just the keyword JOIN instead of INNER JOIN, they keyword INNER is optional. Both JOIN and INNER JOIN mean exactly the same thing. When you use the keyword JOIN it is assumed that you are using INNER JOIN.
This query will get a list of people who subscribe to the magazine called ‘Beautiful Birds’.
SELECT lastName Last_Name, firstName First_Name FROM subscriber sr JOIN subscription sn ON sr.subKey = sn.subKey JOIN magazine m ON sn.magKey = m.magKey WHERE magName = ‘Beautiful Birds’ ORDER BY last_Name;
Last_Name | First_Name |
---|---|
Lamont | Lucy |
Sanders | Samantha |
We can see that we want the first and last names again, which is only from one table but in order to see who has subscribed to ‘Beautiful Birds’ we have to include a WHERE clause that uses magName from the magazine table. So we are using data from multiple tables. Because the subscriber table that lists the people and the magazine table that lists the magazines do not have a common key that related the two we will have to use the linking subscription table in our join as well. So we join the subscriber table to the subscription table using the related column of subKey. Then we join the magazine table to the subscription table using the related column of magKey.
Order of Execution
Let’s take a minute to look at the Order of Execution with joins. Remember: The order that we write the syntax of our queries is not the order that the clauses actually execute. All of the SQL clauses have complex interdependencies on each other, and they need to execute in a certain order to behave properly. The execution order is:
1. FROM and JOIN 2. WHERE 3. SELECT 4. ORDER BY
Using the last query we looked at, the FROM is the first thing that executes. Before anything else all the rows from all the tables are retrieved. This gets us the total working set of data that is being queried. JOIN is an operator of FROM, not an independent clause so it executes at the same time as FROM.
scribKey | lastName | firstName | address | city | state | zip | subKey | magKey | scribKey | subStartDate | subLength | magKey | magName | magPrice |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | Johnston | Julie | 10336 NE 187th St | Bothell | WA | 98012 | 1 | 1 | 1 | 2011-03-01 | 12 | 1 | Fishing in the Mojave | 13.95 |
3 | Sanders | Samantha | 316 Union Ave | Snohomish | WA | 98290 | 8 | 1 | 3 | 2011-05-01 | 12 | 1 | Fishing in the Mojave | 13.95 |
4 | Jimenez | Jose | 187 27th Ave | Seattle | WA | 98122 | 9 | 1 | 4 | 2011-09-01 | 12 | 1 | Fishing in the Mojave | 13.95 |
2 | Anderson | Albert | 220 K Street Southeast | Auburn | WA | 98002 | 2 | 2 | 2 | 2011-03-01 | 14 | 2 | Car Racing Made Easy | 15.45 |
3 | Sanders | Samantha | 316 Union Ave | Snohomish | WA | 98290 | 11 | 3 | 3 | 2011-05-01 | 18 | 3 | Pine Cone Computing | 17.50 |
3 | Sanders | Samantha | 316 Union Ave | Snohomish | WA | 98290 | 5 | 4 | 3 | 2011-09-01 | 12 | 4 | Cooking Like Mad | 18.00 |
3 | Sanders | Samantha | 316 Union Ave | Snohomish | WA | 98290 | 10 | 5 | 3 | 2011-12-01 | 12 | 5 | If Only I Could Sing | 12.45 |
3 | Sanders | Samantha | 316 Union Ave | Snohomish | WA | 98290 | 3 | 6 | 3 | 2012-02-01 | 12 | 6 | Beautiful Birds | 12.45 |
5 | Lamont | Lucy | 175 Smokey Point Dr | Lakewood | WA | 98409 | 4 | 6 | 5 | 2012-02-01 | 12 | 6 | Beautiful Birds | 12.45 |
5 | Lamont | Lucy | 175 Smokey Point Dr | Lakewood | WA | 98409 | 6 | 7 | 5 | 2012-07-01 | 24 | 7 | Corn Shucking for Fun and Profit | 15.05 |
4 | Jimenez | Jose | 187 27th Ave | Seattle | WA | 98122 | 7 | 7 | 4 | 2012-08-01 | 12 | 7 | Corn Shucking for Fun and Profit | 15.05 |
So now that we have our set of data, let’s filter it down to exactly what we want. That’s done with WHERE next. With WHERE we can throw away the data we don’t need by setting up conditions on what it is exactly we want or don’t want. The WHERE then narrows down the set of data.
scribKey | lastName | firstName | address | city | state | zip | subKey | magKey | subKey | subStartDate | subLength | magKey | magName | magPrice |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
3 | Sanders | Samantha | 316 Union Ave | Snohomish | WA | 98290 | 3 | 6 | 3 | 2012-02-01 | 12 | 6 | Beautiful Birds | 12.45 |
5 | Lamont | Lucy | 175 Smokey Point Dr | Lakewood | WA | 98409 | 4 | 6 | 5 | 2012-02-01 | 12 | 6 | Beautiful Birds | 12.45 |
The data is filtered but all columns included at this point. SELECT now executes and we can specify which columns we want to show up.
Last_Name | First_Name |
---|---|
Sanders | Samantha |
Lamont | Lucy |
Then the ORDER BY executes and sorts the results
Last_Name | First_Name |
---|---|
Lamont | Lucy |
Sanders | Samantha |
OUTER JOIN
Another type of join is the OUTER JOIN. This type of join returns all rows from one table and just the matched rows from another table.
Let’s take a look at the ERD and the same 3 tables with data again.
magKey | magName | magPrice |
---|---|---|
1 | Fishing in the Mojave | 13.95 |
2 | Car Racing Made Easy | 15.45 |
3 | Pine Cone Computing | 17.50 |
4 | Cooking Like Mad | 18.00 |
5 | If Only I Could Sing | 12.45 |
6 | Beautiful Birds | 12.45 |
7 | Corn Shucking for Fun and Profit | 15.05 |
8 | MySQL Magic | 10.95 |
subKey | magKey | scribKey | subStartDate | SubLength |
---|---|---|---|---|
1 | 1 | 1 | 2011-03-01 | 12 |
2 | 2 | 2 | 2011-03-01 | 14 |
3 | 6 | 3 | 2012-02-01 | 12 |
4 | 6 | 5 | 2012-02-01 | 12 |
5 | 4 | 3 | 2011-09-01 | 12 |
6 | 7 | 5 | 2012-07-01 | 24 |
7 | 7 | 4 | 2012-08-01 | 12 |
8 | 1 | 3 | 2011-05-01 | 12 |
9 | 1 | 4 | 2011-09-01 | 12 |
10 | 5 | 3 | 2011-12-01 | 12 |
11 | 3 | 3 | 2011-05-01 | 18 |
scribKey | lastName | firstName | address | city | state | zip |
---|---|---|---|---|---|---|
1 | Johnston | Julie | 10336 NE 187th St | Bothell | WA | 98012 |
2 | Anderson | Albert | 220 K Street Southeast | Auburn | WA | 98002 |
3 | Sanders | Samantha | 316 Union Ave | Snohomish | WA | 98290 |
4 | Jimenez | Jose | 187 27th Ave | Seattle | WA | 98122 |
5 | Lamont | Lucy | 175 Smokey Point Dr | Lakewood | WA | 98409 |
6 | Wong | Walter | 1073 South 323rd Street | Federal Way | WA | 98003 |
Notice there is one subscriber ‘Walter Wong’ that is not subscribed to any magazine yet. There is also one magazine ‘MySQL Magic’ that has no subscribers yet. When we do an inner join as we did above, Walter never showed up because there was no matching foreign key that matched his primary scribKey. In other words, there was no scribKey 6 in the linking table as a foreign key. What if we wanted to show all subscribers last and first name and their subscription start dates even if that subscriber hadn’t subscribed to a magazine yet. We would have to use an OUTER JOIN.
SELECT lastName, firstName, subStartDate FROM subscriber LEFT OUTER JOIN subscription ON subscriber.scribKey = subscription.scribKey ORDER BY lastName;
lastName | firstName | subStartDate |
---|---|---|
Anderson | Albert | 2011-03-01 |
Jimenez | Jose | 2012-08-01 |
Jimenez | Jose | 2011-09-01 |
Johnston | Julie | 2011-03-01 |
Lamont | Lucy | 2012-07-01 |
Lamont | Lucy | 2012-02-01 |
Sanders | Samantha | 2011-05-01 |
Sanders | Samantha | 2011-12-01 |
Sanders | Samantha | 2012-02-01 |
Sanders | Samantha | 2011-05-01 |
Sanders | Samantha | 2011-09-01 |
Wong | Walter | NULL |
The keywords LEFT OUTER JOIN specified that all records from the table on the left would be returned and only the matching rows from the other table are returned. Because Walter Wong was from the subscriber table (the table on the left in the query) he showed up. But notice that the subscription start date is null (there is no foreign key in the subscription table and therefore no subscription start date. So all the rows are returned from the subscriber table and only the matching rows of the subscription table.
RIGHT JOIN is used the same way but refers to the table on the right as the table that will have all the rows in the result set. So the query above could be rewritten as follows and get the same results.
SELECT lastName, firstName, subStartDate FROM subscription RIGHT OUTER JOIN subscriber ON subscriber.scribKey = subscription.scribKey ORDER BY lastName;
Usually a LEFT JOIN is used placing the table you want all the results from first (or on the left) so RIGHT JOIN doesn’t have to ever be used.
Let’s look at a join that uses all 3 tables.
If we wanted all the magazines and their associated subscribers, even if they didn’t have any subscribers, we’d use an outer join. This time we will use the keywords LEFT JOIN. The keyword OUTER is optional. Using just LEFT JOIN will be exactly the same.
SELECT magName, magPrice, lastName, firstName FROM magazine LEFT JOIN subscription ON magazine.magKey = subscription.magKey LEFT JOIN subscriber ON subscription.scribKey = subscriber.scribKey ORDER BY lastName;
magName | magPrice | lastName | firstName |
---|---|---|---|
MySQL Magic | 10.95 | null | null |
Car Racing Made Easy | 15.45 | Anderson | Albert |
Fishing in the Mojave | 13.95 | Jimenez | Jose |
Corn Shucking for Fun and Profit | 15.05 | Jimenez | Jose |
Fishing in the Mojave | 13.95 | Johnston | Julie |
Beautiful Birds | 12.45 | Lamont | Lucy |
Corn Shucking for Fun and Profit | 15.05 | Lamont | Lucy |
Cooking Like Mad | 18.00 | Sanders | Samantha |
If Only I Could Sing | 12.45 | Sanders | Samantha |
Beautiful Birds | 12.45 | Sanders | Samantha |
Fishing in the Mojave | 13.95 | Sanders | Samantha |
Pine Cone Computing | 17.50 | Sanders | Samantha |
Now all the rows from the magazine table show up even if the magazine has no subscriptions associate with it. ‘MySQL Magic’ is a magazine that has no associated foreign key in another table. Therefore it shows null for both the last and first name, since no one has subscribed to it yet.
FULL OUTER JOIN
Full outer joins are not very common but worth mentioning here. A full outer join will return all rows from both tables, even if there is not a related key. So if we wanted all magazines and all subscribers even if they didn’t have related records we would use the keyword UNION to combine the rows of two different results sets together to simulate a full outer join. Both result sets must return the same number of columns and the corresponding columns must have compatible data types.
SELECT magName, magPrice, lastName, firstName, subStartDate FROM magazine m LEFT JOIN subscription sn ON m.magKey = sn.magKey LEFT JOIN subscriber sr ON sn.subKey = sr.subKey UNION SELECT magName, magPrice, lastName, firstName, subStartDate FROM magazine m RIGHT JOIN subscription sn ON m.magazineKey = sn.magazineKey RIGHT JOIN subscriber sr ON sn.subscriberKey = sr.subscriberKey;
magName | magPrice | lastName | firstName | subStartDate |
---|---|---|---|---|
Fishing in the Mojave | 13.95 | Johnston | Julie | 2011-03-01 |
Fishing in the Mojave | 13.95 | Sanders | Samantha | 2011-05-01 |
Fishing in the Mojave | 13.95 | Jimenez | Jose | 2011-09-01 |
Car Racing Made Easy | 15.45 | Anderson | Albert | 2011-03-01 |
Pine Cone Computing | 17.50 | Sanders | Samantha | 2011-05-01 |
Cooking Like Mad | 18.00 | Sanders | Samantha | 2011-09-01 |
If Only I Could Sing | 12.45 | Sanders | Samantha | 2011-12-01 |
Beautiful Birds | 12.45 | Sanders | Samantha | 2012-02-01 |
Beautiful Birds | 12.45 | Lamont | Lucy | 2012-02-01 |
Corn Shucking for Fun and Profit | 15.05 | Lamont | Lucy | 2012-07-01 |
Corn Shucking for Fun and Profit | 15.05 | Jimenez | Jose | 2012-08-01 |
MySQL Magic | 10.95 | null | null | null |
null | null | Wong | Walter | null |
Notice how MySQL Magic shows up even though no one has subscribed to it and how Walter Wong shows up even though he has not subscribed to any magazines. Again, full outer joins are rare but might be used with something like exception reports, which are used to discover abnormal data that might need attention.