SQL - Data Query Language - II
Mark Fontenot, PhD
Northeastern University
1
Joins
2
Level 0: Simple Joins
Use JOIN keyword to combine data from two tables.
Must provide ON or USING
(or you just get the cartesian product)
3
SELECT <list of columns>
FROM <table name> [JOIN <table name>
ON <condition>]
[WHERE <condition list>]
[ORDER BY <column-order list>];
General Form
Level 0: Simple Joins
- Provide a list of all product names with the name of the category each belongs to.
4
SELECT ProductName, CategoryName
FROM Products JOIN Categories
ON Products.CategoryID = Categories.CategoryID;
Example:
the JOIN Condition
without it, you get Cartesian product
Level 1: Natural Joins
NATURAL JOIN will join on attributes with the same names.
- adds an implicit ON clause.
5
SELECT <list of columns>
FROM <table name> NATURAL JOIN <table name>
[WHERE <condition list>]
[ORDER BY <column-order list>];
General Form
Level 1: Natural Joins
- Provide a list of all product names with the name of the category each belongs to.
6
SELECT ProductName, CategoryName
FROM Products NATURAL JOIN Categories
Example:
If no matching attribute names, you get the Cartesian product.
Aside - the soft Join
FROM clause provides comma-separated list of tables
WHERE clause provides a predicate for each join.
This style mixes the filtering of rows of tables with the join condition… can get a little messy.
Stick with the syntax on the previous slides.
7
SELECT ProductName, CategoryName
FROM Products, Categories
WHERE Products.CategoryID = Categories.CategoryID;
Example:
Your Turn
1. Which cities in either Ireland or Poland did Michael Suyama (an employee) ship items to customers?
8
Your Turn
2a. Which products have suppliers in Germany?
2b. Which, if any, of those products from Germany are on backorder?
9
Your Turn
3. Provide a list of all suppliers that Northwind purchases products in the Confections category from.
10
Outer Join
The joins we’ve seen so far (also sometimes called Inner Joins) retain only rows from the Cartesian product where the join condition is true.
An Outer Join includes everything from the inner join, but also includes one or both of the following:
11
Left Outer Join - Intuition
12
sID | Name |
112 | Mark |
234 | Sam |
643 | Rohit |
sID | cID | Grade |
112 | cs3200 | A |
112 | cs2500 | B- |
643 | cs1800 | A- |
643 | ds2000 | A- |
Student
Grades
Student LEFT OUTER JOIN Grades
ON Student.sID = Grades.sID
Student.sID | Name | Grades.sID | cID | Grade |
112 | Mark | 112 | cs3200 | A |
112 | Mark | 112 | cs2500 | B- |
643 | Rohit | 643 | cs1800 | A- |
643 | Rohit | 643 | ds2000 | A- |
234 | Sam | null | null | null |
Full Outer Join - Intuition
13
p | q |
1 | ball |
2 | bat |
3 | buzz |
r | s |
2 | cat |
3 | dog |
4 | fish |
A
B
A FULL OUTER JOIN B
ON A.p = B.r
p | q | r | s |
1 | ball | Null | Null |
2 | bat | 2 | cat |
3 | buzz | 3 | dog |
Null | Null | 4 | fish |
Thinking Through Joins
Thinking Through Joins
Northwind has been having some problems with Speedy Express shipping company… they don’t seem to be too speedy. Provide a list of all orders shipped via Speedy Express and include the order id, customer name, order date, and shipped date for all orders in 2022.
15
Maybe you think to yourself… “Where do I start?”
Thinking Through Joins
Northwind has been having some problems with Speedy Express shipping company… they don’t seem to be too speedy. Provide a list of all orders shipped via Speedy Express and include the order id, customer name, order date, and shipped date for all orders in 2022.
16
1. Think about what columns are needed in the final result.
Thinking Through Joins
Northwind has been having some problems with Speedy Express shipping company… they don’t seem to be too speedy. Provide a list of all orders shipped via Speedy Express and include the order id, customer name, order date, and shipped date for all orders in 2022.
17
2. What tables are those attributes in?
- Minimally, your FROM clause will need to include those tables.
- Are there other tables needed that link the above?
Tables:�- Shippers
- Orders
- Customers
Thinking Through Joins
18
3. What join conditions do I need for the tables?
- Can I natural join all of them?
- If no, what attributes link the different pairs of tables?
- Since Shippers and Orders do not have a common attribute name, we can’t use a Natural Join.
- Must use a JOIN … ON
Thinking Through Joins
Northwind has been having some problems with Speedy Express shipping company… they don’t seem to be too speedy. Provide a list of all orders shipped via Speedy Express and include the order id, customer name, order date, and shipped date for all orders in 2022.
19
4. What, if any, filters are needed for the rows?
Thinking Through Joins – Putting it all Together
20
1. Think about what columns are needed in the final result.
2. What tables are those attributes in?
3. What join conditions do I need for the tables?
4. What, if any, filters are needed for the rows?
SELECT o.orderid, c.CompanyName, o.OrderDate, o.ShippedDate
FROM Shippers s JOIN Orders o ON s.shipperid = o.shipvia
JOIN Customers c ON o.customerid = c.customerid
WHERE s.CompanyName = ‘Speedy Express’
AND o.OrderDate >= ‘2022-01-01’
What’s Wrong With the Following?
Northwind has been having some problems with Speedy Express shipping company… they don’t seem to be too speedy. Provide a list of all orders shipped via Speedy Express and include the order id, customer name, order date, and shipped date for all orders in 2022.
21
SELECT o.orderid,
c.CompanyName,
o.OrderDate,
o.ShippedDate
FROM shippers s
JOIN orders o ON s.shipperid = o.shipvia
NATURAL JOIN customers c
WHERE s.companyName = 'Speedy Express'
AND o.orderdate >= '2022-01-01';
GROUP BY and HAVING Clauses
Applying Aggregates to Groups
- The GROUP BY clause allows you to define subgroups for the application of an aggregate function.
- The GROUP BY list should be a subset of the SELECT attributes.
23
SELECT [DISTINCT] <attribute list>
FROM <table list>
[WHERE <condition list>]
[GROUP BY <attribute list>]
[ORDER BY <attribute-order list>];
Group By -
Intuition
Problem: Provide the average GPA for all students in each year of their education.
24
sID | Name | Year | GPA |
121 | Rohit | 2 | 3.8 |
134 | Jack | 1 | 3.7 |
231 | Sam | 1 | 3.6 |
114 | Sal | 2 | 3.5 |
153 | Alex | 1 | 3.5 |
Group By -
Intuition
Problem: Provide the average GPA for all students in each year of their education.
… GROUP BY Year
25
sID | Name | Year | GPA |
121 | Rohit | 2 | 3.8 |
134 | Jack | 1 | 3.7 |
231 | Sam | 1 | 3.6 |
114 | Sal | 2 | 3.5 |
153 | Alex | 1 | 3.5 |
sID | Name | Year | GPA |
153 | Alex | 1 | 3.5 |
134 | Jack | 1 | 3.7 |
231 | Sam | 1 | 3.6 |
114 | Sal | 2 | 3.5 |
121 | Rohit | 2 | 3.8 |
Average these vals
Average these vals…
Group By -
Intuition
Problem: Provide the average GPA for all students in each year of their education.
SELECT year, AVG(GPA) as avg_gpa
FROM students
GROUP BY year;
26
sID | Name | Year | GPA |
121 | Rohit | 2 | 3.8 |
134 | Jack | 1 | 3.7 |
231 | Sam | 1 | 3.6 |
114 | Sal | 2 | 3.5 |
153 | Alex | 1 | 3.5 |
Sample Group By
Determine the average price of products from each of our suppliers.
27
You can apply multiple agg functions to a group in 1 stmt.
For each category of products, give the name of the product category, min product unit price, max product unit price, and average product unit price.
28
SELECT A, MAX(B), MIN(B)
FROM table
GROUP BY A;
SELECT categoryName,
min(unitPrice) AS minPrice,
max(unitPrice) AS maxPrice,
avg(unitPrice) AS avgPrice
FROM Categories C JOIN Products P
ON C.categoryID = P.categoryID
GROUP BY categoryName;
HAVING Clause
- used in conjunction with GROUP BY
- a set of boolean expressions that are evaluated for each group
- If true, group is returned. If false, group is discarded.
29
Provide a list of all regions of the world where Northwind has more than 10 customers.
SELECT region,
count(customerid) AS numCustomers
FROM Customers
GROUP BY region
HAVING count(customerid) >= 10;
HAVING
30
SELECT <list of columns>
FROM <table name> [JOIN <table name> � ON <join condition> [...]]
[WHERE <condition list>]
[GROUP BY <attribute list>]
[HAVING <group conditions>]
[ORDER BY <column [ASC|DESC]>,[...]]
General�Form
Challenge Problem
Provide a list of all customers whose average order total is below the average for all companies in their same region.
31
Review Questions