Coder Social home page Coder Social logo

dsc-join-statements-hbs-ds-060120's Introduction

Join Statements

Introduction

In this section, you will learn about several types of JOIN statements. Joins are the primary mechanism for combining data from multiple tables. In order to do this, you define the common attribute(s) between tables in order for them to be combined.

Objectives

You will be able to:

  • Write SQL queries that make use of various types of joins
  • Compare and contrast the various types of joins
  • Discuss how primary and foreign keys are used in SQL
  • Decide and perform whichever type of join is best for retrieving desired data

CRM Schema

In almost all industry cases, rather than just working with a single table you will generally need data from multiple tables. Doing this requires the use of joins using shared columns from the two tables. For example, here's a diagram of a mock customer relationship management (CRM) database.

Connecting to the Database

As usual, you'll start by connecting to the database.

import sqlite3
import pandas as pd
conn = sqlite3.connect('data.sqlite')
cur = conn.cursor()

Displaying product details along with order details

Let's say you need to generate some report that includes details about products from orders. To do that, we would need to take data from multiple tables in a single statement.

cur.execute("""SELECT * 
               FROM orderdetails
               JOIN products
               ON orderdetails.productCode = products.productCode
               LIMIT 10;
               """)

# Take results and create DataFrame
df = pd.DataFrame(cur.fetchall()) 
df.columns = [i[0] for i in cur.description]
df.head()
<style scoped> .dataframe tbody tr th:only-of-type { vertical-align: middle; }
.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}
</style>
orderNumber productCode quantityOrdered priceEach orderLineNumber productCode productName productLine productScale productVendor productDescription quantityInStock buyPrice MSRP
0 10100 S18_1749 30 136.00 3 S18_1749 1917 Grand Touring Sedan Vintage Cars 1:18 Welly Diecast Productions This 1:18 scale replica of the 1917 Grand Tour... 2724 86.70 170.00
1 10100 S18_2248 50 55.09 2 S18_2248 1911 Ford Town Car Vintage Cars 1:18 Motor City Art Classics Features opening hood, opening doors, opening ... 540 33.30 60.54
2 10100 S18_4409 22 75.46 4 S18_4409 1932 Alfa Romeo 8C2300 Spider Sport Vintage Cars 1:18 Exoto Designs This 1:18 scale precision die cast replica fea... 6553 43.26 92.03
3 10100 S24_3969 49 35.29 1 S24_3969 1936 Mercedes Benz 500k Roadster Vintage Cars 1:24 Red Start Diecast This model features grille-mounted chrome horn... 2081 21.75 41.03
4 10101 S18_2325 25 108.06 4 S18_2325 1932 Model A Ford J-Coupe Vintage Cars 1:18 Autoart Studio Design This model features grille-mounted chrome horn... 9354 58.48 127.13

Compared to the individual tables:

orderdetails table:

cur.execute("""SELECT * FROM orderdetails LIMIT 10;""")
df = pd.DataFrame(cur.fetchall()) 
df.columns = [i[0] for i in cur.description]
df.head()
<style scoped> .dataframe tbody tr th:only-of-type { vertical-align: middle; }
.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}
</style>
orderNumber productCode quantityOrdered priceEach orderLineNumber
0 10100 S18_1749 30 136.00 3
1 10100 S18_2248 50 55.09 2
2 10100 S18_4409 22 75.46 4
3 10100 S24_3969 49 35.29 1
4 10101 S18_2325 25 108.06 4

products table:

cur.execute("""SELECT * FROM products LIMIT 10;""")
df = pd.DataFrame(cur.fetchall()) 
df.columns = [i[0] for i in cur.description]
df.head()
<style scoped> .dataframe tbody tr th:only-of-type { vertical-align: middle; }
.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}
</style>
productCode productName productLine productScale productVendor productDescription quantityInStock buyPrice MSRP
0 S10_1678 1969 Harley Davidson Ultimate Chopper Motorcycles 1:10 Min Lin Diecast This replica features working kickstand, front... 7933 48.81 95.70
1 S10_1949 1952 Alpine Renault 1300 Classic Cars 1:10 Classic Metal Creations Turnable front wheels; steering function; deta... 7305 98.58 214.30
2 S10_2016 1996 Moto Guzzi 1100i Motorcycles 1:10 Highway 66 Mini Classics Official Moto Guzzi logos and insignias, saddl... 6625 68.99 118.94
3 S10_4698 2003 Harley-Davidson Eagle Drag Bike Motorcycles 1:10 Red Start Diecast Model features, official Harley Davidson logos... 5582 91.02 193.66
4 S10_4757 1972 Alfa Romeo GTA Classic Cars 1:10 Motor City Art Classics Features include: Turnable front wheels; steer... 3252 85.68 136.00

The USING clause

A more concise way to join the tables, if the column name is identical, is the USING clause. Rather then saying on tableA.column = tableB.column we can simply say using(column). Again, this only works if the column is identically named for both tables.

cur.execute("""SELECT * FROM orderdetails
               JOIN products
               USING(productCode)
               LIMIT 10;
               """)
df = pd.DataFrame(cur.fetchall()) 
df.columns = [i[0] for i in cur.description]
df.head()
<style scoped> .dataframe tbody tr th:only-of-type { vertical-align: middle; }
.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}
</style>
orderNumber productCode quantityOrdered priceEach orderLineNumber productName productLine productScale productVendor productDescription quantityInStock buyPrice MSRP
0 10100 S18_1749 30 136.00 3 1917 Grand Touring Sedan Vintage Cars 1:18 Welly Diecast Productions This 1:18 scale replica of the 1917 Grand Tour... 2724 86.70 170.00
1 10100 S18_2248 50 55.09 2 1911 Ford Town Car Vintage Cars 1:18 Motor City Art Classics Features opening hood, opening doors, opening ... 540 33.30 60.54
2 10100 S18_4409 22 75.46 4 1932 Alfa Romeo 8C2300 Spider Sport Vintage Cars 1:18 Exoto Designs This 1:18 scale precision die cast replica fea... 6553 43.26 92.03
3 10100 S24_3969 49 35.29 1 1936 Mercedes Benz 500k Roadster Vintage Cars 1:24 Red Start Diecast This model features grille-mounted chrome horn... 2081 21.75 41.03
4 10101 S18_2325 25 108.06 4 1932 Model A Ford J-Coupe Vintage Cars 1:18 Autoart Studio Design This model features grille-mounted chrome horn... 9354 58.48 127.13

More Aliasing

Alternatively, you can also assign tables an alias by entering an alternative shorthand name directly after them. This is slightly different than the previous lesson where we included the AS keyword when creating an alias. Here we use the aliases 'o' and 'p' for orderdetails and products, respectively.

cur.execute("""SELECT * FROM orderdetails o
               JOIN products p
               ON o.productCode = p.productCode
               LIMIT 10;
               """)
df = pd.DataFrame(cur.fetchall()) 
df.columns = [i[0] for i in cur.description]
df.head()
<style scoped> .dataframe tbody tr th:only-of-type { vertical-align: middle; }
.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}
</style>
orderNumber productCode quantityOrdered priceEach orderLineNumber productCode productName productLine productScale productVendor productDescription quantityInStock buyPrice MSRP
0 10100 S18_1749 30 136.00 3 S18_1749 1917 Grand Touring Sedan Vintage Cars 1:18 Welly Diecast Productions This 1:18 scale replica of the 1917 Grand Tour... 2724 86.70 170.00
1 10100 S18_2248 50 55.09 2 S18_2248 1911 Ford Town Car Vintage Cars 1:18 Motor City Art Classics Features opening hood, opening doors, opening ... 540 33.30 60.54
2 10100 S18_4409 22 75.46 4 S18_4409 1932 Alfa Romeo 8C2300 Spider Sport Vintage Cars 1:18 Exoto Designs This 1:18 scale precision die cast replica fea... 6553 43.26 92.03
3 10100 S24_3969 49 35.29 1 S24_3969 1936 Mercedes Benz 500k Roadster Vintage Cars 1:24 Red Start Diecast This model features grille-mounted chrome horn... 2081 21.75 41.03
4 10101 S18_2325 25 108.06 4 S18_2325 1932 Model A Ford J-Coupe Vintage Cars 1:18 Autoart Studio Design This model features grille-mounted chrome horn... 9354 58.48 127.13

LEFT JOINs

By default a JOIN is an INNER JOIN, or the intersection between two tables. In other words, the JOIN between orders and products is only for productCodes that are in both the orderdetails and products tables. If a product had yet to be ordered (and wasn't in the orderdetails table) then it would also not be in the result of the JOIN.

The LEFT JOIN keyword returns all records from the left table (table1), and the matched records from the right table (table2). The result is NULL from the right side if there is no match.

There are many other types of joins, displayed below. Of these, sqlite does not support outer joins, but it is good to be aware of as more powerful versions of sql such as PostgreSQL support these additional functions.

For example, the statement

SELECT * FROM products LEFT JOIN orderdetails;

would return all products, even those that hadn't been ordered. You can imagine that all products in inventory should have a description in the product table, but perhaps not every product is represented in the orderdetails table.

cur.execute("""SELECT * 
               FROM products
               LEFT JOIN orderdetails
               USING(productCode);
               """)
df = pd.DataFrame(cur.fetchall()) 
df.columns = [i[0] for i in cur.description]
print(len(df))
print(len(df[df.orderNumber.isnull()]))
df[df.orderNumber.isnull()].head()
2997
1
<style scoped> .dataframe tbody tr th:only-of-type { vertical-align: middle; }
.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}
</style>
productCode productName productLine productScale productVendor productDescription quantityInStock buyPrice MSRP orderNumber quantityOrdered priceEach orderLineNumber
1122 S18_3233 1985 Toyota Supra Classic Cars 1:18 Highway 66 Mini Classics This model features soft rubber tires, working... 7733 57.01 107.57 NaN NaN NaN NaN

As you can see, it's a rare occurrence, but there is one product that has yet to be ordered.

Primary Versus Foreign Keys

Another important consideration when performing joins is to think more about the key or column you are joining on. As you'll see in upcoming lessons, this can lead to interesting behavior if the join value is not unique in one or both of the tables. In all of the above examples, you joined two tables using the primary key. The primary key(s) of a table are those column(s) which uniquely identify a row. You'll also see this designated in our schema diagram with the asterisk (*).

You can also join tables using foreign keys which are not the primary key for that particular table, but rather another table. For example, employeeNumber is the primary key for the employees table and corresponds to the salesRepEmployeeNumber of the customers table. In the customers table, salesRepEmployeeNumber is only a foreign key, and is unlikely to be a unique identifier, as it is likely that an employee serves multiple customers. As such, in the resulting view employeeNumber would no longer be a unique field.

cur.execute("""SELECT * 
               FROM customers c
               JOIN employees e
               ON c.salesRepEmployeeNumber = e.employeeNumber
               ORDER BY employeeNumber;
               """)
df = pd.DataFrame(cur.fetchall())
df.columns = [i[0] for i in cur.description]
df.head()
<style scoped> .dataframe tbody tr th:only-of-type { vertical-align: middle; }
.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}
</style>
customerNumber customerName contactLastName contactFirstName phone addressLine1 addressLine2 city state postalCode ... salesRepEmployeeNumber creditLimit employeeNumber lastName firstName extension email officeCode reportsTo jobTitle
0 124 Mini Gifts Distributors Ltd. Nelson Susan 4155551450 5677 Strong St. San Rafael CA 97562 ... 1165 210500 1165 Jennings Leslie x3291 [email protected] 1 1143 Sales Rep
1 129 Mini Wheels Co. Murphy Julie 6505555787 5557 North Pendale Street San Francisco CA 94217 ... 1165 64600 1165 Jennings Leslie x3291 [email protected] 1 1143 Sales Rep
2 161 Technics Stores Inc. Hashimoto Juri 6505556809 9408 Furth Circle Burlingame CA 94217 ... 1165 84600 1165 Jennings Leslie x3291 [email protected] 1 1143 Sales Rep
3 321 Corporate Gift Ideas Co. Brown Julie 6505551386 7734 Strong St. San Francisco CA 94217 ... 1165 105000 1165 Jennings Leslie x3291 [email protected] 1 1143 Sales Rep
4 450 The Sharp Gifts Warehouse Frick Sue 4085553659 3086 Ingle Ln. San Jose CA 94217 ... 1165 77600 1165 Jennings Leslie x3291 [email protected] 1 1143 Sales Rep

5 rows ร— 21 columns

Notice that this also returned both columns: salesRepEmployeeNumber and employeeNumber.

Summary

In this lesson, you investigated joins. This included implementing the ON and USING clauses, aliasing table names, implementing LEFT JOIN, and using primary vs. foreign keys.

dsc-join-statements-hbs-ds-060120's People

Contributors

loredirick avatar mas16 avatar mathymitchell avatar sproulhimself avatar sumedh10 avatar tkoar avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.