1 of 18

CSE 344: Section 3

Grouping and (Un)Nesting

January 24th, 2019

1

2 of 18

Group By

  • Powerful tool to handle “categories”
    • Treat rows with a same attribute as a category
  • Careful when selecting
    • Only select attributes appeared in GROUP BY or aggregates
    • SQLite will guess (arbitrarily pick a value)¯\_(ツ)_/¯
    • SQL Server will throw an error ง •̀_•́)ง

2

3 of 18

Group By - Examples

Do these queries work?

Enrolled(stu_id, course_num)

SELECT stu_id, course_num

FROM Enrolled

GROUP BY stu_id

SELECT stu_id, count(course_num)

FROM Enrolled

GROUP BY stu_id

3

johndoe

311

johndoe

344

maryjane

311

maryjane

351

maryjane

369

4 of 18

Group By - Examples

Do these queries work?

Enrolled(stu_id, course_num)

SELECT stu_id, course_num

FROM Enrolled

GROUP BY stu_id

SELECT stu_id, count(course_num)

FROM Enrolled

GROUP BY stu_id

4

johndoe

?

maryjane

?

5 of 18

Group By - Examples

Do these queries work?

Enrolled(stu_id, course_num)

SELECT stu_id, course_num

FROM Enrolled

GROUP BY stu_id

SELECT stu_id, count(course_num)

FROM Enrolled

GROUP BY stu_id

5

johndoe

2

maryjane

3

johndoe

311

johndoe

344

maryjane

311

maryjane

351

maryjane

369

6 of 18

Witnessing (i.e. argmax)

Find the student who is taking the most classes.

Student(stu_id, id_num)

Enrolled(id_num, class)

SELECT S.stu_id

FROM Student S, Enrolled E

WHERE S.id_num = E.id_num

GROUP BY S.stu_id

HAVING COUNT(E.class) >= ALL(

SELECT COUNT(E1.class)

FROM Enrolled E1

GROUP BY E1.id_num);

6

johndoe

973

maryjane

712

alsmith

899

973

CSE 311

973

CSE 344

712

CSE 311

899

CSE 351

7 of 18

Witnessing (i.e. argmax)

Find the student who is taking the most classes.

Student(stu_id, id_num)

Enrolled(id_num, class)

SELECT S.stu_id

FROM Student S, Enrolled E

WHERE S.id_num = E.id_num

GROUP BY S.stu_id

HAVING COUNT(E.class) = (

SELECT MAX(C) FROM (

SELECT COUNT(E1.class) AS C

FROM Enrolled E1

GROUP BY E1.id_num));

7

johndoe

973

maryjane

712

alsmith

899

973

CSE 311

973

CSE 344

712

CSE 311

899

CSE 351

Alternative to “ALL”

8 of 18

Nested Queries

  • Avoid when possible
  • Danger of making simple queries slow and complicated
  • Just because you can do it, doesn’t mean you should

8

9 of 18

Subquery in SELECT

SELECT DISTINCT C.cname, (SELECT count(*)

FROM Product P

WHERE P.cid=C.cid)

FROM Company C

9

10 of 18

Subquery in SELECT

Unnest using JOIN and GROUP BY

SELECT C.cname, count(P.cid)

FROM Company C LEFT OUTER JOIN

Product P ON C.cid = P.cid

GROUP BY C.cname;

10

11 of 18

Subquery in FROM

SELECT X.pname � FROM (SELECT *

FROM Product

WHERE price > 20) AS X

WHERE X.price < 500

More readable: WITH <alias> AS (<subquery>)

11

12 of 18

Subquery in FROM

Unnest using WHERE

SELECT X.pname � FROM Product AS X

WHERE X.price < 500 AND X.price > 20;

12

13 of 18

Subquery in WHERE

SELECT DISTINCT C.cname

FROM Company C

WHERE EXISTS (SELECT *� FROM Product P� WHERE C.cid = P.cid AND P.price < 200)

13

14 of 18

Subquery in WHERE

SELECT DISTINCT C.cname

FROM Company C, Product P

WHERE C.cid = P.cid AND P.price < 200

14

15 of 18

Subquery in WHERE Syntax

  • SELECT ……… WHERE EXISTS (<sub>);
  • SELECT ……… WHERE NOT EXISTS (<sub>);
  • SELECT ……… WHERE attribute IN (<sub>);
  • SELECT ……… WHERE attribute NOT IN (<sub>);
  • SELECT ……… WHERE attribute > ANY (<sub>);
  • SELECT ……… WHERE attribute > ALL (<sub>);

15

16 of 18

(Non-)monotonic Queries

  • “Can we take back outputs by looking at more data?”
  • Is this a monotonic query?

SELECT count(*)

FROM T1

GROUP BY T1.attr

16

17 of 18

(Non-)monotonic Queries

  • “Can we take back outputs by looking at more data?”
  • Is this a monotonic query?

SELECT count(*)

FROM T1

GROUP BY T1.attr

No! This query does not satisfy set containment.

Ex:

Current output: {(6), (23), (10)}

After more data: {(6), (23), (11)}

{(6), (23), (10)} ⊄ {(6), (23), (11)}

17

18 of 18

To Nest or Not to Nest

  • Not an exact science
  • Figuring out what is actually wanted will help you find simpler solutions (best way is to practice)
  • Trigger words to use sub-querying
    • Every, All (universal quantifiers)
    • No, None, Never (negation)
    • Only

18