Cardinality Estimation
See speaker notes for explanations.
Cardinality Estimation (Point Selection)
Assume a uniform distribution.
σcolor = “orange”
X ≅ 1 / V(color)
≅ 1/4
Cardinality Estimation (Union)
Assume conditions are disjoint.
σcolor = “orange” OR color = “green”
X ≅ 1 / V(color) + 1 / V(color)
≅ 1/4 + 1/4 = 1/2
Cardinality Estimation (Intersection)
Assume independence.
σc1 = “red” AND c2 = “yellow”
X ≅ 1 / V(c1) · 1 / V(c2)
≅ 1/2 · 1/2 = 1/4
c1
c2
Cardinality Estimation (Ranges)
Schema: Weather(date, pressure)
Assume a uniform distribution.
σpressure < p
σpressure > p
σp < pressure < q
X ≅ (p − min) / (max - min)
X ≅ (max − p) / (max - min)
X ≅ (q − p) / (max - min)
min(pressure)
max(pressure)
p
q
Cardinality Estimation (Equijoin)
Assume uniform distribution and full overlap.
⋈c1 = c2
X ≅ 1 / (V(c1) · V(c2)) · min(V(c1), V(c2))
≅ 1 / max(V(c1), V(c2))
≅ 1/4
c1
c2
Cardinality Estimation (Equijoin)
Assume uniform distribution and full overlap.
⋈c1 = c2
X ≅ 1 / (V(c1) · V(c2)) · min(V(c1), V(c2))
≅ 1 / max(V(c1) · V(c2))
≅ 1/4 (didn’t change)
c1
c2
Cardinality Estimation (Equijoin)
Assume uniform distribution and full overlap.
⋈c1 = c2
X ≅ 1 / (V(c1) · V(c2)) · min(V(c1), V(c2))
≅ 1 / max(V(c1) · V(c2))
≅ 1/5 (decreased)
c1
c2