CSE 344: Section 7
Cardinality Estimation, Join Algorithms, Flight App
Nov 8th, 2025
Announcements
Announcements
Cardinality Estimation Example
Cardinality Estimation (Point Selection)
Assume a uniform distribution.
σcolor = “orange”
X ≅ 1 / V(color)
≅ 1/4
Cardinality Estimation (Union)
Assume conditions are disjoint.
σcolor = “orange” OR color = “green”
X ≅ 1 / V(color) + 1 / V(color)
≅ 1/4 + 1/4 = 1/2
Cardinality Estimation (Intersection)
Assume independence.
σc1 = “red” AND c2 = “yellow”
X ≅ 1 / V(c1) · 1 / V(c2)
≅ 1/2 · 1/2 = 1/4
c1
c2
Cardinality Estimation (Ranges)
Schema: Weather(date, pressure). Assume a uniform distribution.
σpressure < p
σpressure > p
σp < pressure < q
X ≅ (p − min) / (max - min + 1)
X ≅ (max − p) / (max - min + 1)
X ≅ (q − p) / (max - min + 1)
min(pressure)
max(pressure)
p
q
Cardinality Estimation (Equijoin)
Assume uniform distribution and full overlap.
⋈c1 = c2
X ≅ (1 / (V(c1) · V(c2))) · min(V(c1), V(c2))
≅ 1 / max(V(c1), V(c2))
≅ 1/4
c1
c2
Cardinality Estimation (Equijoin)
Assume uniform distribution and full overlap.
⋈c1 = c2
X ≅ (1 / (V(c1) · V(c2))) · min(V(c1), V(c2))
≅ 1 / max(V(c1), V(c2))
≅ 1/4 (didn’t change)
c1
c2
Cardinality Estimation (Equijoin)
Assume uniform distribution and full overlap.
⋈c1 = c2
X ≅ (1 / (V(c1) · V(c2))) · min(V(c1), V(c2))
≅ 1 / max(V(c1), V(c2))
≅ 1/5 (decreased)
c1
c2
Summary
+1
Join Algorithms
Definitions
B(R) = # blocks for relation R
T(R) = # tuples for relation R
M = # of blocks that can fit in memory
Nested Loop Join (⋈) - Naive Version
Naive: B(R) + T(R)B(S)
for each tuple t1 in R do
for each tuple t2 in S do
if t1 and t2 join then output (t1,t2)
This version of the algo reserves an extra block for the joined output; lecture doesn't
Nested Loop Join (⋈) - Block-at-a-time Version
Block-at-a-time: B(R) + B(S)B(R)
for each block bR in R:
for each block bS in S:
for each tuple tR in bR:
for each tuple tS in bS:
if tR and tS can join:
output (tR,tS)
Nested Loop Join (⋈) - Block Version
Block-nested-loop: B(R) + B(R)B(S)/(M-2)
for each group of M-2 blocks bR in R:
for each block bS in S:
for each tuple tR in bR:
for each tuple tS in bS:
if tR and tS can join:
output (tR,tS)
This version of the algo reserves an extra block for the joined output; lecture doesn't.��This version does reserve one block for S, however.
Hash Join (⋈)
R joined with S (assume R is smaller in size)
B(R) + B(S)
Assuming B(R) < M-2 for one pass (look at each table once) efficiency, read all of R into a hash table and join with all of S
Hash Join (⋈)
for tuple x in R:
insert(x.r, x)
for tuple y in S:
x = find(y.s)
output(x, y)
Hash Join (⋈)
for tuple x in R:
insert(x.r, x)
for tuple y in S:
x = find(y.s)
output(x, y)
Which table is R?
Hash Join (⋈)
for tuple x in R:
insert(x.r, x)
for tuple y in S:
x = find(y.s)
output(x, y)
Which table is R?
In this case, R is table with PK. Why?
Hash Join (⋈)
for tuple x in R:
insert(x.r, x)
for tuple y in S:
for x in find(y.s)
output(x, y)
What if |FK table| << |PK table| ?
Hash Join (⋈)
for tuple x in R:
insert(x.r, x)
for tuple y in S:
for x in find(y.s)
output(x, y)
What if |FK table| << |PK table| ?
R is better as FK table
Sort-Merge Join (⋈)
B(R) + B(S)
One pass (look at each table once); Both must be small (B(R) + B(S) < M)
Why would we use this over Hash Join?
Sort-Merge Join (⋈)
sort(R); sort(S); // Better if sorted already!
x = R.first()
y = S.first()
while y != NULL do:
case:
x.r < y.s: x.next();
x.r = y.s: output(x, y); y.next();
x.r > y.s: y.next();
Note: R is table with PK. Why?
JDBC and Prepared Statements
Java: JDBC
ResultSet
// ... continued from previous slide
ResultSet rs = ps.executeQuery();
while (rs.next()) { // check if there is another row and move to the next row
String destCity = rs.getString("dest_city"); // Gets the value of the attribute "dest_city" for the current row
...
} // When `next` finally returns false, we know we’ve seen every row
rs.close(); // Remember to close the ResultSet
Statements and PreparedStatements
How they work:
String q1 = "SELECT * FROM Flights WHERE origin_city = 'SEA' AND day_of_month = 1";
String q2 = "SELECT * FROM Flights WHERE origin_city = 'PDX' AND day_of_month = 1";
String q3 = "SELECT * FROM Flights WHERE origin_city = 'SEA' AND day_of_month = 15";
…
PreparedStatements
String rawQuery = "SELECT * FROM Flights WHERE origin_city = ? AND day_of_month = ?";
PreparedStatement ps = conn.prepareStatement(rawQuery); // Pre-compiles the query into a PreparedStatement Object
ps.clearParameters(); // Clears parameters from previous use
ps.setString(1, originCity); // Sets the first parameter (the first "?") to the value of the variable "originCity"
ps.setInt(2, dayOfMonth); // Sets the second parameter (the second "?") to the value of the variable "dayOfMonth"
ResultSet rs = ps.executeQuery(); // Executes the query and stores the ResultSet in the variable "rs"
SQL Injections
SQL Injection
Hashing
Salting
HW6 Hashing/Salting
Flight App Notes
Demo
Worksheet