CSE 344: Section 9
Join Algorithms, B+ Trees, Indexing
November 21st, 2024
Announcements
Announcements
Join Algorithms
Nested Loop Join (⋈)
for each tuple t1 in R do
for each tuple t2 in S do
if t1 and t2 join then output (t1,t2)
Nested Loop Join (⋈)
for each block bR in R:
for each block bS in S:
for each tuple tR in bR:
for each tuple tS in bS:
if tR and tS can join:
output (tR,tS)
Nested Loop Join (⋈)
for each group of M blocks bR in R:
for each block bS in S:
for each tuple tR in bR:
for each tuple tS in bS:
if tR and tS can join:
output (tR,tS)
Hash Join (⋈)
for tuple x in R:
insert(x.r, x)
for tuple y in S:
x = find(y.s)
output(x, y)
Hash Join (⋈)
for tuple x in R:
insert(x.r, x)
for tuple y in S:
x = find(y.s)
output(x, y)
Which table is R?
Hash Join (⋈)
for tuple x in R:
insert(x.r, x)
for tuple y in S:
x = find(y.s)
output(x, y)
Which table is R?
In this case, R is table with PK. Why?
Hash Join (⋈)
What if |FK table| << |PK table| ?
R is better as FK table
Hash Join (⋈)
for tuple x in R:
insert(x.r, x)
for tuple y in S:
for x in find(y.s)
output(x, y)
What if |FK table| << |PK table| ?
R is better as FK table
Sort-Merge Join (⋈)
sort(R); sort(S); // Better if sorted already!
x = R.first()
y = S.first()
while y != NULL do:
case:
x.r < y.s: x.next();
x.r = y.s: output(x, y); y.next();
x.r > y.s: y.next();
Note: R is table with PK. Why?
B+ Trees
B+ Trees
Goal: Reduce + optimize disk accesses
Keys k < 30
Keys 30<=k<120
Keys 120<=k<240
Keys 240<=k
Left pointer of k:
to keys < k
Right pointer of k:
to keys >= k
30 | 120 | 240 | | ||||
| | | | | |||
B+ Trees
Goal: Reduce + optimize disk accesses
40 | 50 | 60 | 70 | ||||
| | | | | |||
40
50
60
Next leaf
Data records
70
Worksheet Problem 1: Insert
20 | | | | ||||
| | | | | |||
(Root)
INSERT 20
20
20 | 40 | | | ||||
| | | | | |||
(Root)
INSERT 40
40
20
20 | 40 | 72 | | ||||
| | | | | |||
(Root)
INSERT 72
40
20
72
20 | 25 | 40 | 72 | ||||
| | | | | |||
(Root)
INSERT 25
25
20
40
72
(Root)
INSERT 63
20 | 25 | 40 | 63 | 72 | |||||
| | | | | | ||||
25
20
40
63
72
(Root)
20 | 25 | 40 | 63 | 72 | |||||
| | | | | | ||||
Violates m <= 4, split!
25
20
40
63
72
40 | | | | ||||
| | | | | |||
(Root)
20 | 25 | | | ||||
| | | | | |||
40 | 63 | 72 | | ||||
| | | | | |||
20
25
40
63
72
40 | | | | ||||
| | | | | |||
(Root)
INSERT 78
20 | 25 | | | ||||
| | | | | |||
40 | 63 | 72 | 78 | ||||
| | | | | |||
20
25
40
63
72
78
40 | | | | ||||
| | | | | |||
(Root)
INSERT 2
2 | 20 | 25 | | ||||
| | | | | |||
40 | 63 | 72 | 78 | ||||
| | | | | |||
2
20
40
63
72
78
25
40 | 65 | | | ||||
| | | | | |||
(Root)
INSERT 65
2 | 20 | 25 | | ||||
| | | | | |||
40 | 63 | | | ||||
| | | | | |||
2
20
40
63
25
65 | 72 | 78 | | ||||
| | | | | |||
72
78
65
40 | 65 | | | ||||
| | | | | |||
(Root)
INSERT 86
2 | 20 | 25 | | ||||
| | | | | |||
40 | 63 | | | ||||
| | | | | |||
2
20
40
63
25
65 | 72 | 78 | 86 | ||||
| | | | | |||
72
78
65
86
40 | 65 | 78 | | ||||
| | | | | |||
(Root)
INSERT 92
2
20
40
63
25
72
78
65
86
2 | 20 | 25 | | ||||
| | | | | |||
40 | 63 | | | ||||
| | | | | |||
65 | 72 | | | ||||
| | | | | |||
78 | 86 | 92 | | ||||
| | | | | |||
92
40 | 65 | 78 | | ||||
| | | | | |||
(Root)
INSERT 95
2
20
40
63
25
72
78
65
86
2 | 20 | 25 | | ||||
| | | | | |||
40 | 63 | | | ||||
| | | | | |||
65 | 72 | | | ||||
| | | | | |||
78 | 86 | 92 | 95 | ||||
| | | | | |||
92
95
40 | 65 | 78 | 92 | ||||
| | | | | |||
(Root)
INSERT 97
2
20
40
63
25
72
78
65
86
2 | 20 | 25 | | ||||
| | | | | |||
40 | 63 | | | ||||
| | | | | |||
65 | 72 | | | ||||
| | | | | |||
78 | 86 | | | ||||
| | | | | |||
92
95
92 | 95 | 97 | | ||||
| | | | | |||
97
40 | 65 | 78 | 92 | ||||
| | | | | |||
(Root)
INSERT 99
2
20
40
63
25
72
78
65
86
2 | 20 | 25 | | ||||
| | | | | |||
40 | 63 | | | ||||
| | | | | |||
65 | 72 | | | ||||
| | | | | |||
78 | 86 | | | ||||
| | | | | |||
92
95
92 | 95 | 97 | 99 | ||||
| | | | | |||
97
99
40 | 65 | | | ||||
| | | | | |||
(New root)
INSERT 98
78
86
78 | 86 | | | ||||
| | | | | |||
92
95
92 | 95 | | | ||||
| | | | | |||
92 | 97 | | | ||||
| | | | | |||
78 | | | | ||||
| | | | | |||
97 | 98 | 99 | | ||||
| | | | | |||
...
97
98
99
(split old root into two nodes)
Worksheet Problem 2: Delete
40 | 65 | | | ||||
| | | | | |||
DELETE 92
78
86
78 | 86 | | | ||||
| | | | | |||
92
95
92 | 95 | | | ||||
| | | | | |||
92 | 97 | | | ||||
| | | | | |||
78 | | | | ||||
| | | | | |||
97 | 98 | 99 | | ||||
| | | | | |||
97
98
99
...
40 | 65 | | | ||||
| | | | | |||
78
86
78 | 86 | | | ||||
| | | | | |||
95
95 | | | | ||||
| | | | | |||
95 | 97 | | | ||||
| | | | | |||
78 | | | | ||||
| | | | | |||
97 | 98 | 99 | | ||||
| | | | | |||
97
98
99
Violates m >= 2, check left and right to steal
...
40 | 65 | | | ||||
| | | | | |||
78
86
78 | 86 | | | ||||
| | | | | |||
95
95 | 97 | | | ||||
| | | | | |||
95 | 98 | | | ||||
| | | | | |||
78 | | | | ||||
| | | | | |||
98 | 99 | | | ||||
| | | | | |||
97
98
99
...
40 | 65 | | | ||||
| | | | | |||
78
86
78 | 86 | | | ||||
| | | | | |||
95
95 | 97 | | | ||||
| | | | | |||
95 | 98 | | | ||||
| | | | | |||
78 | | | | ||||
| | | | | |||
98 | 99 | | | ||||
| | | | | |||
97
98
99
DELETE 95
...
40 | 65 | | | ||||
| | | | | |||
78
86
78 | 86 | | | ||||
| | | | | |||
97
97 | | | | ||||
| | | | | |||
97 | 98 | | | ||||
| | | | | |||
78 | | | | ||||
| | | | | |||
98 | 99 | | | ||||
| | | | | |||
98
99
Violates m >= 2, check left and right to steal
...
40 | 65 | | | ||||
| | | | | |||
78
86
78 | 86 | | | ||||
| | | | | |||
97
97 | | | | ||||
| | | | | |||
97 | 98 | | | ||||
| | | | | |||
78 | | | | ||||
| | | | | |||
98 | 99 | | | ||||
| | | | | |||
98
99
Nothing to steal, so merge!
...
40 | 65 | | | ||||
| | | | | |||
78
86
78 | 86 | 97 | | ||||
| | | | | |||
97
98 | | | | ||||
| | | | | |||
78 | | | | ||||
| | | | | |||
98 | 99 | | | ||||
| | | | | |||
98
99
2
20
40
63
25
72
65
2 | 20 | 25 | | ||||
| | | | | |||
40 | 63 | | | ||||
| | | | | |||
65 | 72 | | | ||||
| | | | | |||
Violates m >= 2, check left and right to steal
40 | 65 | | | ||||
| | | | | |||
78
86
78 | 86 | 97 | | ||||
| | | | | |||
97
98 | | | | ||||
| | | | | |||
78 | | | | ||||
| | | | | |||
98 | 99 | | | ||||
| | | | | |||
98
99
2
20
40
63
25
72
65
2 | 20 | 25 | | ||||
| | | | | |||
40 | 63 | | | ||||
| | | | | |||
65 | 72 | | | ||||
| | | | | |||
Nothing to steal, so merge!
40 | 65 | 78 | 98 | ||||
| | | | | |||
(Root)
2
20
40
63
25
72
78
65
86
2 | 20 | 25 | | ||||
| | | | | |||
40 | 63 | | | ||||
| | | | | |||
65 | 72 | | | ||||
| | | | | |||
78 | 86 | 97 | | ||||
| | | | | |||
98
99
98 | 99 | | | ||||
| | | | | |||
97
Indexing
Definitions
14 | |
15 | |
16 | |
14 | tuple1 |
15 | tuple2 |
16 | tuple3 |
Data file sorted on age
Index File
Index on age
Why Indexing?
Consider you have a deck of 52 cards. I asked you to pick out the 8 of hearts. Calculate the average number of flips required if:
Problem 1
To find the 8 of hearts from a randomly shuffled cards, on average I would require 26 flips through the deck, about half the deck.
Problem 2
Answer
Problem 1:
Average Flips = 26
Problem 2:
Average Flips = 9
Example
Types of Indexes
Definitions