1 of 25

A SeQueL to SQL

Feel free to grab a worksheet at the front!

(also available on the website)

CSE 414: Section 2

2 of 25

Announcements

  • HW1 due tomorrow, 10/05 at 11:59 PM
    • You can use up to 2 late days per assignment
    • Late deadline 10/07 at 11:59pm
  • HW2 released tomorrow, due October 5th @ 11:59pm
    • Name the files EXACTLY as specified

3 of 25

Importing Files (HW2)

First, create the table.�Then, import the data.

.mode csv� .import population.csv Population� .import gdp.csv GDP

.import /path/to/file NameOfTable

Make sure you import the tables in the order you create them so there are no foreign key constraint issues. For example, if GDP had a foreign key constraint to Population, it would be illegal to import GDP before Population.

3

4 of 25

Review

SQL has 3-valued logic

  • FALSE = 0

[ex] price < 25 is FALSE when price = 99

  • UNKNOWN = 0.5

[ex] price < 25 is UNKNOWN when price = NULL

  • TRUE = 1

[ex] price < 25 is TRUE when price = 19

4

5 of 25

Review

Formal definitions:

C1 AND C2 means min(C1,C2)� C1 OR C2 means max(C1,C2)� NOT C means means 1-C

The rule for SELECT ... FROM ... WHERE C is the following:� if C = TRUE then include the row in the output� if C = FALSE or C = unknown then do not include it

5

6 of 25

Worksheet Exercise: 3-Valued Logic

CREATE TABLE A (a int, b int);

INSERT INTO A VALUES (1, 1), (2, 10), (3, NULL);

Given tables created with these commands:

1:

SELECT A.a FROM A� WHERE A.b < 5

What’s the output for each of the following:

2:

SELECT A.a FROM A� WHERE A.b >= 5

7 of 25

Worksheet Exercise: 3-Valued Logic

(1, 1) 1 < 5

True

(2, 10) 10 < 5

False

(3, NULL); NULL < 5

Unknown

1:

SELECT A.a FROM A� WHERE A.b < 5

8 of 25

Worksheet Exercise: 3-Valued Logic

2:

SELECT A.a FROM A� WHERE A.b >= 5

(1, 1) 1 >= 5

False

(2, 10) 10 >= 5

True

(3, NULL); NULL >= 5

Unknown

9 of 25

Worksheet Exercise: 3-Valued Logic

CREATE TABLE A (a int, b int);

INSERT INTO A VALUES (1, 1), (2, 10), (3, NULL);

Given tables created with these commands:

3:

SELECT A.a FROM A� WHERE A.b != 1 AND

A.b != 10;

What’s the output for each of the following:

4:

SELECT A.a FROM A� WHERE A.b < 5 OR

A.b IS NULL;

10 of 25

Worksheet Exercise: 3-Valued Logic

3:

SELECT A.a FROM A� WHERE A.b != 1 AND

A.b != 10;

(1, 1) 1 != 1 AND 1 != 10

False && True

== False

(2, 10) 10 != 1 AND 10 != 10

True && False

== False

(3, NULL); NULL != 1 AND NULL != 10

Unknown && Unknown

== Unknown

11 of 25

Worksheet Exercise: 3-Valued Logic

4:

SELECT A.a FROM A� WHERE A.b < 5 OR

A.b IS NULL;

(1, 1) 1 < 5 OR 1 IS NULL

True || False

== True

(2, 10) 10 < 5 OR 10 IS NULL

False || False

== False

(3, NULL); NULL < 5 OR NULL IS NULL

Unknown || True

== True

12 of 25

Aliasing

  • Good style for renaming attribute operations to more intuitive labels
  • Essential for self joins (ex: FROM [table] AS T1, [table] AS T2)
  • You can alias without “AS” in the FROM clause (i.e. “AS” keyword can be omitted)

SELECT [attribute] AS [attribute_name]

FROM [table] AS [table_name]

[table_name].[attribute_name]

12

13 of 25

Misc. Filters

LIMIT number - limits the amount of tuples returned

[ex] SELECT * FROM table LIMIT 1;

DISTINCT - only returns unique values (eliminates duplicates)

[ex] SELECT DISTINCT column_name FROM table;

13

14 of 25

Join Semantics

  • Think as “nested loops”.

  • NOT the most efficient implementation on a large database! (we will talk about other ways to join later in the course)
  • Hash Join
  • Sort-Merge Join

14

For more information and different types of joins see:

https://blogs.msdn.microsoft.com/craigfr/2006/08/16/summary-of-join-properties/

15 of 25

Nested Loop Semantics

SELECT x_1.a_1, …, x_n.a_n�FROM x_1, …, x_n�WHERE <cond>

for each tuple in x_1:

for each tuple in x_n:

if <cond>(x_1, …, x_n):

output(x_1.a_1, …, x_n.a_n)

15

16 of 25

Join Types

There will be times we use inner join, full outer join, and left outer join.

There is never a scenario in this class we need to use a right outer join and sqlite3 does not support this operation. It also doesn’t support full outer join, which you most likely won’t need for this class.

16

17 of 25

Where we started

FWS

(From, Where, Select)

17

18 of 25

And now...

FWGHOSTM

(From, Where, Group By, Having, Order By, Select)

18

19 of 25

Group By

  • Powerful tool to handle “categories”
    • Groups rows with the same value of an attribute into a “bucket” (think dividing into categories)
  • Careful when selecting
    • Only select attributes in GROUP BY or aggregates
    • SQLite will guess (arbitrarily pick a value)
    • SQL Server will throw an error

19

20 of 25

Aggregates

  • Computes summary values for a set of tuples.

COUNT(attribute) - counts the number of tuples

SUM(attribute) - sums the value of the attribute among all tuples in set

MIN/MAX(attribute) - min/max value of the attribute among all tuples in the set

AVG(attribute) - avg value of the attribute among all tuples in the set

20

21 of 25

Group By - Examples

Do these queries work?

Enrolled(stu_id, course_num)

SELECT stu_id, course_num

FROM Enrolled

GROUP BY stu_id

SELECT stu_id, count(course_num)

FROM Enrolled

GROUP BY stu_id

21

johndoe

311

johndoe

344

maryjane

311

maryjane

351

maryjane

369

22 of 25

Group By - Examples

Do these queries work?

Enrolled(stu_id, course_num)

SELECT stu_id, course_num

FROM Enrolled

GROUP BY stu_id

SELECT stu_id, count(course_num)

FROM Enrolled

GROUP BY stu_id

22

johndoe

?

maryjane

?

23 of 25

Group By - Examples

Do these queries work?

Enrolled(stu_id, course_num)

SELECT stu_id, course_num

FROM Enrolled

GROUP BY stu_id

SELECT stu_id, count(course_num)

FROM Enrolled

GROUP BY stu_id

23

johndoe

2

maryjane

3

johndoe

311

johndoe

344

maryjane

311

maryjane

351

maryjane

369

24 of 25

Grouping and Ordering

GROUP BY [attribute], …, [attribute_n]

HAVING [predicate] - operates on groups, chooses to keep or remove the entire group

ORDER BY [attribute] [ASC/DESC]

24

25 of 25

Worksheet