1 of 20

CS W186 Exam Prep Section 1:

SQL

2 of 20

SQL

3 of 20

Single-Table SQL

SELECT [DISTINCT] <column list>� FROM <table1>� [WHERE <predicate>]� [GROUP BY <column list>]� [HAVING <predicate>][ORDER BY <column list> [DESC/ASC]][LIMIT <amount>];

4 of 20

Single-Table SQL: Logical Processing Order

  1. FROM <table1> - which table are we drawing data from
  2. [WHERE <predicate>] - only keep rows where <predicate> is satisfied
  3. [GROUP BY <column list>] - group together rows by value of columns in <column list>
  4. [HAVING <predicate>] - only keep groups having <predicate> satisfied
  5. SELECT <column list> - select columns in <column list> to keep
    1. [DISTINCT] - keep only distinct rows (filter out duplicates)
  6. [ORDER BY <column list> [DESC/ASC]] - order the output by value of the columns in <column list>, ASCending by default
  7. [LIMIT <amount>] - limit the output to just the first <amount> rows

5 of 20

Join Variants

SELECT * FROM

T1 INNER JOIN T2

ON T1.a = T2.a;

  • The different types of joins determine what we do with rows that don’t ever match the “join condition”
  • Default is INNER JOIN === FROM t1, t2 WHERE t1.id = t2.id
  • Fill entries with no matching data with NULL
  • If no join condition, return cartesian product (i.e. FROM t1, t2)

Join Condition

6 of 20

String Comparison

LIKE: following expression follows SQL specified format

  • _: Any single character
  • %: Zero, one, or more characters

Examples:

  • LIKE ‘z%’ starts with z
  • LIKE ‘z_’ exactly 2 letters, 1st is z
  • LIKE ‘_z%’ 2nd letter is a z

Note: ~ cannot be used in SQLite

7 of 20

More SQL Things

  • Sets - a collection with no duplicates
    • UNION - all items in either set, INTERSECTION - all items in both sets.
    • In SQL, these don’t have to be used on sets (there can be duplicates in the inputs). No duplicates in the final set!
    • Must add ALL to include duplicates - UNION ALL, INTERSECTION ALL
    • Used between two queries (SELECT … FROM … UNION SELECT … FROM …)
  • Correlated Query
    • Subquery depends on values from the outer query, subquery is recalculated for every row of the outer query’s table
    • SELECT S.sname FROM Sailors S WHERE EXISTS (SELECT * FROM Reserves R WHERE R.bid=102 AND S.sid=R.sid)

8 of 20

More SQL Things

  • Aggregations - COUNT, SUM, AVG, etc.
    • Using an aggregate in WHERE is not allowed! WHERE count(*) > 500 is an invalid query!
    • NULL column values are ignored by aggregate functions
  • Don’t use HAVING without GROUP BY - Just use WHERE instead!
  • DISTINCT removes all duplicate rows

9 of 20

Worksheet

10 of 20

Question 1

11 of 20

Question 1a

Return the bid and genre of each book that has ever been checked out. Remove any duplicate rows with the same bid and genre.

Need to join Books and Checkouts to get genre and the fact that a book is checked out

INNER JOIN would work as well!

12 of 20

Question 1b

Find all of the fantasy book titles that have been checked out and the date when they were checked out. Even if a book hasn’t been checked out, we still want to output the title (i.e. the row should look like (title, NULL)).

Wanting NULL in the output is a good sign of a LEFT OUTER or RIGHT OUTER join

LEFT JOIN to preserve all books

Use WHERE to filter genre of the joined table

13 of 20

Question 1c

Select the name of the book that has been checked out the most times and the corresponding checked out count. You can assume that each book was checked out a unique number of times, and that the titles of the books are all unique

Want title from Books table, # of checked out from aggregating on Checkouts. Group by title, not bid --- bid is unique for each “instance” of a book!

14 of 20

Question 1d

Select the name of all of the pairs of libraries that have books with matching titles. Include the name of both libraries and the title of the book. There should be no duplicate rows, and no two rows that are the same except the libraries are in opposite order. To ensure this, the first library name should be alphabetically less than the second library name. There may be zero, one, or more than one correct answer.

Incorrect: does not return books with matching titles. Also incorrectly uses lname to compare to Books.library

Correct: Filters rows in cross join where the alphabetic order is respected and book titles are the same

Correct: Finds book-library pairs as inner subqueries. Outer query does the cross join of these pairs such that the titles are the same and l1 is alphabetically “less than” l2

15 of 20

Question 2

16 of 20

Question 2a

Select all of the following queries which return the rid of the Rider with the most bikes. Assume all Riders have a unique number of bikes.

Correct: owner references rid so we don’t need to join on another table. Just aggregate!

Correct: returns “all” owners that have at least as many bikes as all owners. Because all riders have a unique # of bikes, this returns the 1 rider with the most bikes

Incorrect: using MAX on the table bikes is nonsensical (what even would be aggregated?)

17 of 20

Question 2b

Select the bid of all Bikes that have never been ridden

Incorrect: Finds all bikes without an owner --- not what we want!

Correct: Finds all bikes in the Bikes table for which there are no entries in the Rides table

Correct: Finds all bikes in the Bikes table that do not exist in the join of Rides and Bikes

18 of 20

Question 2c

Select the name of the rider and the city_name of the src and dest locations of all their journeys for all rides. Even if a rider has not ridden a bike, we still want to output their name.

None of these are correct!

The INNER JOINs and WHERE clauses will filter out rows with NULL values produced by the OUTER JOIN.

19 of 20

Question 2c

Select the name of the rider and the city_name of the src and dest locations of all their journeys for all rides. Even if a rider has not ridden a bike, we still want to output their name.

How would we construct a correct answer?

Need to do a JOIN on Locations to get the city_name, but need to do this before the join onto riders

20 of 20

Attendance Link