CSE 414: Section 2
A SeQueL to SQL
January 16th, 2020
1
Administrivia
HW2 due Monday, January 21st, 2020
Contact the teaching staff for any partner related troubles
2
Importing Files (shown in lecture)
First, make the table.
Then, specify the format
.mode csv�Then, import the data.� .import ./carriers.csv Carriers� .import ./flights.csv Flights
.import /path/to/file NameOfTable
3
3-Valued Logic
4
SQL 3-Valued Logic
Real data often has missing information
DBMSs often model missing information with NULL
5
SQL 3-Valued Logic
[ex] price < 25 is UNKNOWN when price = NULL
6
SQL 3-Valued Logic
Formal definitions:
C1 AND C2 min(C1,C2)� C1 OR C2 max(C1,C2)� NOT C 1 — C
The rule for SELECT ... FROM ... WHERE <C> ... is the following:� if C = TRUE, then include the row in the output� if C = FALSE or UNKNOWN, then do not include it
7
SQL 3-Valued Logic
What is the output?
SELECT P.name
FROM People AS P
WHERE P.age >= 21
8
People | |
name | age |
Bob | 19 |
Amy | 32 |
Joe | NULL |
NULL | 24 |
SQL 3-Valued Logic
What is the output?
SELECT P.name
FROM People AS P
WHERE P.age >= 21
9
People | |
name | age |
Bob | 19 |
Amy | 32 |
Joe | NULL |
NULL | 24 |
SQL 3-Valued Logic
Why might NULL and 3-valued logic fail us?
10
SQL 3-Valued Logic
Why might NULL and 3-valued logic fail us?
SELECT P.name
FROM People AS P
WHERE P.age >= 21
OR P.age < 21
11
People | |
name | age |
Bob | 19 |
Amy | 32 |
Joe | NULL |
NULL | 24 |
SQL 3-Valued Logic
Why might NULL and 3-valued logic fail us?
SELECT P.name
FROM People AS P
WHERE P.age >= 21
OR P.age < 21
12
People | |
name | age |
Bob | 19 |
Amy | 32 |
Joe | NULL |
NULL | 24 |
Always true?
SQL 3-Valued Logic
Why might NULL and 3-valued logic fail us?
SELECT P.name
FROM People AS P
WHERE P.age >= 21
OR P.age < 21
13
People | |
name | age |
Bob | 19 |
Amy | 32 |
Joe | NULL |
NULL | 24 |
Nope.
SQL 3-Valued Logic
Another weird case
SELECT P.name
FROM People AS P
WHERE P.age = P.age
14
People | |
name | age |
Bob | 19 |
Amy | 32 |
Joe | NULL |
NULL | 24 |
SQL 3-Valued Logic
How to improve DBMSs to better deal with missing information is debated.
15
SQL Joins
16
⋈
Joining
Inner vs. Outer
Self Joins
17
Join Semantics
18
For more information and different types of joins see:
https://blogs.msdn.microsoft.com/craigfr/2006/08/16/summary-of-join-properties/
Nested Loop Semantics
SELECT x_1.a_1, …, x_n.a_n�FROM x_1, …, x_n�WHERE <cond>
for each tuple in x_1:
…
for each tuple in x_n:
if <cond>(x_1, …, x_n):
output(x_1.a_1, …, x_n.a_n)
19
A for-loop per table in FROM
Practice!
20
SQL Aggregates
21
Aggregates in the most tangible sense
COUNT(attribute)
SUM(attribute)
MIN/MAX(attribute)
AVG(attribute)
...
22
Aggregates in the most tangible sense
COUNT(attribute)
SUM(attribute)
MIN/MAX(attribute)
AVG(attribute)
...
23
What am I aggregating over?
Aggregates in the most tangible sense
COUNT(attribute)
SUM(attribute)
MIN/MAX(attribute)
AVG(attribute)
...
24
What am I aggregating over?
Some collection of tuples, a group!
Aggregates in the most tangible sense
COUNT(attribute)
SUM(attribute)
MIN/MAX(attribute)
AVG(attribute)
...
25
What am I aggregating over?
If only there were a way to specify groups...
Some collection of tuples, a group!
Grouping
GROUP BY [attribute_1], …, [attribute_n]
26
Grouping
27
People(name, age, job)
SELECT *
FROM People AS P
GROUP BY P.age, P.job
People | ||
name | age | job |
Bob | 22 | SDE |
Amy | 33 | QA |
Joe | 22 | QA |
Sue | 22 | SDE |
Carl | 22 | QA |
Kim | 33 | SDE |
Group
on
???
Grouping
28
People(name, age, job)
SELECT *
FROM People AS P
GROUP BY P.age, P.job
People | ||
name | age | job |
Bob | 22 | SDE |
Amy | 33 | QA |
Joe | 22 | QA |
Sue | 22 | SDE |
Carl | 22 | QA |
Kim | 33 | SDE |
Group
on
age
Grouping
29
People(name, age, job)
SELECT *
FROM People AS P
GROUP BY P.age, P.job
People | ||
name | age | job |
Bob | 22 | SDE |
Amy | 33 | QA |
Joe | 22 | QA |
Sue | 22 | SDE |
Carl | 22 | QA |
Kim | 33 | SDE |
Bob | 22 | SDE |
Joe | 22 | QA |
Sue | 22 | SDE |
Carl | 22 | QA |
Amy | 33 | QA |
Kim | 33 | SDE |
Group
on
age
Grouping
30
People(name, age, job)
SELECT *
FROM People AS P
GROUP BY P.age, P.job
People | ||
name | age | job |
Bob | 22 | SDE |
Amy | 33 | QA |
Joe | 22 | QA |
Sue | 22 | SDE |
Carl | 22 | QA |
Kim | 33 | SDE |
Bob | 22 | SDE |
Joe | 22 | QA |
Sue | 22 | SDE |
Carl | 22 | QA |
Amy | 33 | QA |
Kim | 33 | SDE |
Group
on
age
Group
on
job
Grouping
31
People(name, age, job)
SELECT *
FROM People AS P
GROUP BY P.age, P.job
People | ||
name | age | job |
Bob | 22 | SDE |
Amy | 33 | QA |
Joe | 22 | QA |
Sue | 22 | SDE |
Carl | 22 | QA |
Kim | 33 | SDE |
Bob | 22 | SDE |
Joe | 22 | QA |
Sue | 22 | SDE |
Carl | 22 | QA |
Amy | 33 | QA |
Kim | 33 | SDE |
Bob | 22 | SDE |
Sue | 22 | SDE |
Amy | 33 | QA |
Joe | 22 | QA |
Carl | 22 | QA |
Kim | 33 | SDE |
Group
on
age
Group
on
job
Grouping
32
People(name, age, job)
SELECT *
FROM People AS P
GROUP BY P.age, P.job
Bob | 22 | SDE |
Sue | 22 | SDE |
Amy | 33 | QA |
Joe | 22 | QA |
Carl | 22 | QA |
Kim | 33 | SDE |
Select
*
What are my output columns?
name? age? job?
Grouping
33
People(name, age, job)
SELECT *
FROM People AS P
GROUP BY P.age, P.job
Bob | 22 | SDE |
Sue | 22 | SDE |
Amy | 33 | QA |
Joe | 22 | QA |
Carl | 22 | QA |
Kim | 33 | SDE |
Select
*
What are my output columns?
n̶a̶m̶e̶ age job
Grouping
34
People(name, age, job)
SELECT *
FROM People AS P
GROUP BY P.age, P.job
Bob | 22 | SDE |
Sue | 22 | SDE |
Amy | 33 | QA |
Joe | 22 | QA |
Carl | 22 | QA |
Kim | 33 | SDE |
Select
*
age | job |
22 | SDE |
22 | QA |
33 | QA |
33 | SDE |
Aliasing
Good style for renaming attribute operations to more intuitive labels
Essential for self joins (ex: FROM [table] AS T1, [table] AS T2)
SELECT …
FROM [table] AS [table_name]
WHERE [table_name].[attribute] -- use the alias
35
SQL Query Evaluation Order
FWGHOS
(From, Where, Group By, Having, Order By, Select)
36
Relational Algebra (RA)
RA Operators
Standard:
⋃ - Union
⎼ - Diff.
σ - Select
π - Project
⍴ - Rename
Extended:
δ - Duplicate Elim.
ɣ - Group/Agg.
τ - Sorting
Joins:
⨝ - Nat. Join
⟕ - L.O. Join
⟖ - R.O. Join
⟗ - F.O. Join
✕- Cross Product
⋂ - Intersect
R1⋂R2 = R1–(R1–R2)
R1⋂R2 = R1⨝R2
38
A Few More SQL Keywords
(<sub>) INTERSECT (<sub>)
(<sub>) UNION (<sub>)
(<sub>) EXCEPT (<sub>)
39
Ɣ Notation
Grouping and aggregation on group:
ɣattr_1, …, attr_k, count/sum/max/min(attr) -> alias
Aggregation on the entire table:
ɣcount/sum/max/min(attr) -> alias
40
Query Plans (Example SQL -> RA)
Select-Join-Project structure
Make this SQL query into RA (remember FWGHOS):
SELECT R.b, T.c, max(T.a) AS T_max
FROM Table_R AS R, Table_T AS T
WHERE R.b = T.b
GROUP BY R.b, T.c
HAVING max(T.a) > 99
41
Query Plans (Example SQL -> RA)
Select-Join-Project structure
Make this SQL query into RA (remember FWGHOS):
SELECT R.b, T.c, max(T.a) AS T_max
FROM Table_R AS R, Table_T AS T
WHERE R.b = T.b
GROUP BY R.b, T.c
HAVING max(T.a) > 99
πR.b, T.c, T_max(σT_max>99(ɣR.b, T.c, max(T.a)->T_max(R ⨝R.b=T.b T)))
42
Practice!
43
Other ways to alter output
LIMIT number - limits the amount of tuples returned
[ex] SELECT * FROM table LIMIT 1;
DISTINCT - only returns different values (gets rid of duplicates)
[ex] SELECT DISTINCT column_name FROM table;
44