1 of 44

CSE 414: Section 2

A SeQueL to SQL

January 16th, 2020

1

2 of 44

Administrivia

HW2 due Monday, January 21st, 2020

  1. Remember that you have only 1 LATE DAY for hw2 and hw3 TOGETHER
  2. Additional late days at 10% penalty
  3. No submission allowed after Wednesday, January 23rd, 2020

Contact the teaching staff for any partner related troubles

2

3 of 44

Importing Files (shown in lecture)

First, make the table.

Then, specify the format

.mode csv�Then, import the data.� .import ./carriers.csv Carriers� .import ./flights.csv Flights

.import /path/to/file NameOfTable

3

4 of 44

3-Valued Logic

4

5 of 44

SQL 3-Valued Logic

Real data often has missing information

DBMSs often model missing information with NULL

5

6 of 44

SQL 3-Valued Logic

  • FALSE = 0
  • TRUE = 1
  • UNKNOWN = 0.5

[ex] price < 25 is UNKNOWN when price = NULL

6

7 of 44

SQL 3-Valued Logic

Formal definitions:

C1 AND C2 min(C1,C2)� C1 OR C2 max(C1,C2)� NOT C 1 — C

The rule for SELECT ... FROM ... WHERE <C> ... is the following:� if C = TRUE, then include the row in the output� if C = FALSE or UNKNOWN, then do not include it

7

8 of 44

SQL 3-Valued Logic

What is the output?

SELECT P.name

FROM People AS P

WHERE P.age >= 21

8

People

name

age

Bob

19

Amy

32

Joe

NULL

NULL

24

9 of 44

SQL 3-Valued Logic

What is the output?

SELECT P.name

FROM People AS P

WHERE P.age >= 21

9

People

name

age

Bob

19

Amy

32

Joe

NULL

NULL

24

10 of 44

SQL 3-Valued Logic

Why might NULL and 3-valued logic fail us?

10

11 of 44

SQL 3-Valued Logic

Why might NULL and 3-valued logic fail us?

SELECT P.name

FROM People AS P

WHERE P.age >= 21

OR P.age < 21

11

People

name

age

Bob

19

Amy

32

Joe

NULL

NULL

24

12 of 44

SQL 3-Valued Logic

Why might NULL and 3-valued logic fail us?

SELECT P.name

FROM People AS P

WHERE P.age >= 21

OR P.age < 21

12

People

name

age

Bob

19

Amy

32

Joe

NULL

NULL

24

Always true?

13 of 44

SQL 3-Valued Logic

Why might NULL and 3-valued logic fail us?

SELECT P.name

FROM People AS P

WHERE P.age >= 21

OR P.age < 21

13

People

name

age

Bob

19

Amy

32

Joe

NULL

NULL

24

Nope.

14 of 44

SQL 3-Valued Logic

Another weird case

SELECT P.name

FROM People AS P

WHERE P.age = P.age

14

People

name

age

Bob

19

Amy

32

Joe

NULL

NULL

24

15 of 44

SQL 3-Valued Logic

How to improve DBMSs to better deal with missing information is debated.

15

16 of 44

SQL Joins

16

17 of 44

Joining

Inner vs. Outer

Self Joins

17

18 of 44

Join Semantics

  • Primary focus on nested loops semantics for conceptualizing joins

  • NOT always the most efficient implementation!
    • Hash Join
    • Sort-Merge Join

18

For more information and different types of joins see:

https://blogs.msdn.microsoft.com/craigfr/2006/08/16/summary-of-join-properties/

19 of 44

Nested Loop Semantics

SELECT x_1.a_1, …, x_n.a_n�FROM x_1, …, x_n�WHERE <cond>

for each tuple in x_1:

for each tuple in x_n:

if <cond>(x_1, …, x_n):

output(x_1.a_1, …, x_n.a_n)

19

A for-loop per table in FROM

20 of 44

Practice!

20

21 of 44

SQL Aggregates

21

22 of 44

Aggregates in the most tangible sense

COUNT(attribute)

SUM(attribute)

MIN/MAX(attribute)

AVG(attribute)

...

22

23 of 44

Aggregates in the most tangible sense

COUNT(attribute)

SUM(attribute)

MIN/MAX(attribute)

AVG(attribute)

...

23

What am I aggregating over?

24 of 44

Aggregates in the most tangible sense

COUNT(attribute)

SUM(attribute)

MIN/MAX(attribute)

AVG(attribute)

...

24

What am I aggregating over?

Some collection of tuples, a group!

25 of 44

Aggregates in the most tangible sense

COUNT(attribute)

SUM(attribute)

MIN/MAX(attribute)

AVG(attribute)

...

25

What am I aggregating over?

If only there were a way to specify groups...

Some collection of tuples, a group!

26 of 44

Grouping

GROUP BY [attribute_1], …, [attribute_n]

26

27 of 44

Grouping

27

People(name, age, job)

SELECT *

FROM People AS P

GROUP BY P.age, P.job

People

name

age

job

Bob

22

SDE

Amy

33

QA

Joe

22

QA

Sue

22

SDE

Carl

22

QA

Kim

33

SDE

Group

on

???

28 of 44

Grouping

28

People(name, age, job)

SELECT *

FROM People AS P

GROUP BY P.age, P.job

People

name

age

job

Bob

22

SDE

Amy

33

QA

Joe

22

QA

Sue

22

SDE

Carl

22

QA

Kim

33

SDE

Group

on

age

29 of 44

Grouping

29

People(name, age, job)

SELECT *

FROM People AS P

GROUP BY P.age, P.job

People

name

age

job

Bob

22

SDE

Amy

33

QA

Joe

22

QA

Sue

22

SDE

Carl

22

QA

Kim

33

SDE

Bob

22

SDE

Joe

22

QA

Sue

22

SDE

Carl

22

QA

Amy

33

QA

Kim

33

SDE

Group

on

age

30 of 44

Grouping

30

People(name, age, job)

SELECT *

FROM People AS P

GROUP BY P.age, P.job

People

name

age

job

Bob

22

SDE

Amy

33

QA

Joe

22

QA

Sue

22

SDE

Carl

22

QA

Kim

33

SDE

Bob

22

SDE

Joe

22

QA

Sue

22

SDE

Carl

22

QA

Amy

33

QA

Kim

33

SDE

Group

on

age

Group

on

job

31 of 44

Grouping

31

People(name, age, job)

SELECT *

FROM People AS P

GROUP BY P.age, P.job

People

name

age

job

Bob

22

SDE

Amy

33

QA

Joe

22

QA

Sue

22

SDE

Carl

22

QA

Kim

33

SDE

Bob

22

SDE

Joe

22

QA

Sue

22

SDE

Carl

22

QA

Amy

33

QA

Kim

33

SDE

Bob

22

SDE

Sue

22

SDE

Amy

33

QA

Joe

22

QA

Carl

22

QA

Kim

33

SDE

Group

on

age

Group

on

job

32 of 44

Grouping

32

People(name, age, job)

SELECT *

FROM People AS P

GROUP BY P.age, P.job

Bob

22

SDE

Sue

22

SDE

Amy

33

QA

Joe

22

QA

Carl

22

QA

Kim

33

SDE

Select

*

What are my output columns?

name? age? job?

33 of 44

Grouping

33

People(name, age, job)

SELECT *

FROM People AS P

GROUP BY P.age, P.job

Bob

22

SDE

Sue

22

SDE

Amy

33

QA

Joe

22

QA

Carl

22

QA

Kim

33

SDE

Select

*

What are my output columns?

n̶a̶m̶e̶ age job

34 of 44

Grouping

34

People(name, age, job)

SELECT *

FROM People AS P

GROUP BY P.age, P.job

Bob

22

SDE

Sue

22

SDE

Amy

33

QA

Joe

22

QA

Carl

22

QA

Kim

33

SDE

Select

*

age

job

22

SDE

22

QA

33

QA

33

SDE

35 of 44

Aliasing

Good style for renaming attribute operations to more intuitive labels

Essential for self joins (ex: FROM [table] AS T1, [table] AS T2)

SELECT

FROM [table] AS [table_name]

WHERE [table_name].[attribute] -- use the alias

35

36 of 44

SQL Query Evaluation Order

FWGHOS

(From, Where, Group By, Having, Order By, Select)

36

37 of 44

Relational Algebra (RA)

38 of 44

RA Operators

Standard:

⋃ - Union

⎼ - Diff.

σ - Select

π - Project

⍴ - Rename

Extended:

δ - Duplicate Elim.

ɣ - Group/Agg.

τ - Sorting

Joins:

- Nat. Join

- L.O. Join

- R.O. Join

- F.O. Join

- Cross Product

⋂ - Intersect

R1⋂R2 = R1–(R1–R2)

R1⋂R2 = R1R2

38

39 of 44

A Few More SQL Keywords

(<sub>) INTERSECT (<sub>)

(<sub>) UNION (<sub>)

(<sub>) EXCEPT (<sub>)

39

40 of 44

Ɣ Notation

Grouping and aggregation on group:

ɣattr_1, …, attr_k, count/sum/max/min(attr) -> alias

Aggregation on the entire table:

ɣcount/sum/max/min(attr) -> alias

40

41 of 44

Query Plans (Example SQL -> RA)

Select-Join-Project structure

Make this SQL query into RA (remember FWGHOS):

SELECT R.b, T.c, max(T.a) AS T_max

FROM Table_R AS R, Table_T AS T

WHERE R.b = T.b

GROUP BY R.b, T.c

HAVING max(T.a) > 99

41

42 of 44

Query Plans (Example SQL -> RA)

Select-Join-Project structure

Make this SQL query into RA (remember FWGHOS):

SELECT R.b, T.c, max(T.a) AS T_max

FROM Table_R AS R, Table_T AS T

WHERE R.b = T.b

GROUP BY R.b, T.c

HAVING max(T.a) > 99

πR.b, T.c, T_maxT_max>99R.b, T.c, max(T.a)->T_max(R R.b=T.b T)))

42

43 of 44

Practice!

43

44 of 44

Other ways to alter output

LIMIT number - limits the amount of tuples returned

[ex] SELECT * FROM table LIMIT 1;

DISTINCT - only returns different values (gets rid of duplicates)

[ex] SELECT DISTINCT column_name FROM table;

44