CSE 344: Section 4
Relational Algebra, Subqueries (again)
Oct 17th, 2024
Announcements
Relational Algebra
RA Operators
Standard:
⋂ - Intersect
⋃ - Union
⎼ - Difference
σ - Select
π - Project
⍴ - Rename
Extended:
δ - Duplicate Elim.
ɣ - Group/Agg.
τ - Sorting
Joins:
⨝ - Nat. Join
⟕ - L.O. Join
⟖ - R.O. Join
⟗ - F.O. Join
✕- Cross Product
4
Ɣ Notation
Grouping and aggregation on group:
ɣattr_1, …, attr_k, count/sum/max/min(attr) -> alias
Aggregation on the entire table:
ɣcount/sum/max/min(attr) -> alias
5
Format
6
Query Plans (Example SQL -> RA)
Select-Join-Project structure
Make this SQL query into RA (remember FWGHOS):
SELECT R.b, T.c, max(T.a) AS T_max
FROM Table_R AS R, Table_T AS T
WHERE R.b = T.b
GROUP BY R.b, T.c
HAVING max(T.a) > 99
7
Query Plans (Example SQL -> RA)
Select-Join-Project structure
Make this SQL query into RA (remember FJWGHOS):
SELECT R.b, T.c, max(T.a) AS T_max
FROM Table_R AS R, Table_T AS T
WHERE R.b = T.b
GROUP BY R.b, T.c
HAVING max(T.a) > 99
πR.b, T.c, T_max
|
σT_max > 99
|
ɣR.b, T.c, max(T.a)->T_max
|
⨝R.b=T.b
/ \
R T
8
Nested Queries Review!
Nested Queries
Nested Queries
11
Subquery in SELECT
12
SELECT DISTINCT C.cname, (SELECT COUNT(*)
FROM Product P
WHERE P.cid = C.cid)
FROM Company C;
Subquery in SELECT
Unnest using JOIN and GROUP BY
13
SELECT C.cname, count(P.cid)
FROM Company C LEFT OUTER JOIN
Product P ON C.cid = P.cid
GROUP BY C.cname;
Nested Queries
14
Subquery in FROM
15
SELECT X.pname
FROM (SELECT *
FROM Product
WHERE price > 20) AS X
WHERE X.price < 500;
More readable: WITH <name> AS (<subquery>)
Subquery in FROM
16
WITH price_more_20 AS (
SELECT *
FROM Product
WHERE price > 20
)
SELECT X.pname
FROM price_more_20 AS X
WHERE X.price < 500;
Subquery in FROM
Unnest using WHERE
17
SELECT X.pname
FROM Product AS X
WHERE X.price < 500 AND X.price > 20;
Subquery in WHERE (example 1)
18
SELECT DISTINCT C.cname
FROM Company C
WHERE EXISTS (SELECT *
FROM Product P
WHERE C.cid = P.cid AND P.price < 200);
Subquery in WHERE (example 1)
19
SELECT DISTINCT C.cname
FROM Company C, Product P
WHERE C.cid = P.cid AND P.price < 200;
Subquery in WHERE (example 2)
20
SELECT S1.major, S1.num_of_students, S1.school_name
FROM Schools S1
WHERE S1.num_of_students >= ALL (SELECT S2.num_of_students
FROM Schools S2
WHERE S1.school_name =
S2.school_name);
Subquery in WHERE (example 2)
21
SELECT S1.major, S1.num_of_students, S2.school_name
FROM Schools S1, Schools S2
WHERE S1.school_name = S2.school_name
GROUP BY S1.major, S1.num_of_students, S2.school_name
HAVING S1.num_of_students = MAX(S2.num_of_students);
Subquery in WHERE Syntax
22
Witnessing (i.e. argmax)
Find the student(s) who is taking the most classes.
Student(stu_id, id_num)
Enrolled(id_num, class)
SELECT S.stu_id
FROM Student S, Enrolled E
WHERE S.id_num = E.id_num
GROUP BY S.stu_id
HAVING count(E.class) >= ALL(
SELECT count(E1.class)
FROM Enrolled E1
GROUP BY E1.id_num);
23
johndoe | 973 |
maryjane | 712 |
alsmith | 899 |
973 | CSE 311 |
973 | CSE 344 |
712 | CSE 311 |
899 | CSE 351 |
Witnessing (i.e. argmax) (another way)
24
WITH class_counts AS (SELECT COUNT(E1.class) AS cnt
FROM Enrolled E1
GROUP BY E1.id_num),
max_counts AS (SELECT MAX(cnt) AS max FROM class_counts)
SELECT S.stu_id
FROM Students S, Enrolled E, max_counts Emax
WHERE S.id_num = E.id_num
GROUP BY S.stu_id, Emax.max
HAVING COUNT(E.class) = Emax.max;
To Nest or Not to Nest
25
Monotonic Query
Query that does not lose any output tuples with the addition of new tuples in the database.
Theorem/helpful hint: If a query is a SELECT-FROM-WHERE query without nested subqueries or aggregates then it is monotone.
Consequence: If a query is not monotonic, then we cannot write it as a SELECT-FROM-WHERE query without nested subqueries or aggregates.
(Non-)monotonic Queries
SELECT count(*)
FROM T1
GROUP BY T1.attr
27
(Non-)monotonic Queries
SELECT count(*)
FROM T1
GROUP BY T1.attr
No! This query does not satisfy set containment.
Ex:
Current output: {(6), (23), (10)}
After more data: {(6), (23), (11)}
{(6), (23), (10)} ⊄ {(6), (23), (11)}
28
Worksheet