Discussion 4
Buffer Management & Spatial Indexes
Announcements
Project 3 (Joins and QO) has been released and pt1 is due 10/10 at 11:59 PM.
Vitamin 4 (Buffer Management) is due Monday 9/30 at 11:59 PM
Midterm 1 - Tuesday 10/1
1 pfe double-sided cheat sheet and a calculator is permitted
Review for Disk and Memory
Disk vs Memory
Any action moving a page of data from disk to memory (reading from that page) or from memory to disk (writing to that page) will incur 1 I/O
Buffer Management
DBMS Architecture
Buffer Management Abstraction
Motivation
Buffer Manager
Page Replacement Policies
LRU (Least Recently Used)
LRU: Repeated Scan of File
A
B
C
D
File
Empty
Frame
Empty
Frame
Empty
Frame
Buffer
Scan
Cache Hits:
Attempts:
LRU: Repeated Scan of File
A
B
C
D
File
Empty
Frame
Empty
Frame
Empty
Frame
Buffer
Scan
Cache Hits:
Attempts:
A
0
1
Scan
LRU: Repeated Scan of File
A
B
C
D
File
Empty
Frame
Empty
Frame
Empty
Frame
Buffer
Scan
Cache Hits:
Attempts:
A
0
2
B
Scan
LRU: Repeated Scan of File
A
B
C
D
File
Empty
Frame
Empty
Frame
Empty
Frame
Buffer
Scan
Cache Hits:
Attempts:
A
0
3
B
C
Scan
LRU: Repeated Scan of File
A
B
C
D
File
Empty
Frame
Empty
Frame
Empty
Frame
Buffer
Scan
Cache Hits:
Attempts:
0
3
B
C
LRU
D
D
Scan
A
LRU: Repeated Scan of File
A
B
C
D
File
Empty
Frame
Empty
Frame
Empty
Frame
Buffer
Scan
Cache Hits:
Attempts:
0
4
C
LRU
Scan
D
A
A
B
Clock
Clock Example
Empty Frame
Empty Frame
Empty Frame
Empty Frame
Empty Frame
Empty Frame
Clock Example
Want to insert page:
Current frame has�pin-count > 0 �→ Skip
Empty Frame
Empty Frame
Empty Frame
Empty Frame
Empty Frame
Empty Frame
A
B
C
D
E
F
Pinned
Pinned
G
Clock Example
Want to insert page:
Current frame has�pin-count > 0 �→ Skip
Empty Frame
Empty Frame
Empty Frame
Empty Frame
Empty Frame
Empty Frame
A
B
C
D
E
F
Pinned
Pinned
G
Clock Example
Want to insert page:
Not Pinned and �Ref. bit is set →
Empty Frame
Empty Frame
Empty Frame
Empty Frame
Empty Frame
Empty Frame
A
B
C
D
E
F
Pinned
Pinned
G
Clock Example
Want to insert page:
Not pinned and�Ref. bit unset:
Empty Frame
Empty Frame
Empty Frame
Empty Frame
Empty Frame
Empty Frame
A
B
C
G
E
F
Pinned
Pinned
G
D
Clock Example
Request for Page C:
Cache Hit!:
Empty Frame
Empty Frame
Empty Frame
Empty Frame
Empty Frame
Empty Frame
A
B
C
G
E
F
Pinned
Pinned
Pinned
Pinned
C
LRU/Clock
LRU: Repeated Scan of File
A
B
C
D
File
Empty
Frame
Empty
Frame
Empty
Frame
Buffer
Scan
Cache Hits:
Attempts:
LRU: Repeated Scan of File
A
B
C
D
File
Empty
Frame
Empty
Frame
Empty
Frame
Buffer
Scan
Cache Hits:
Attempts:
A
0
1
Scan
LRU: Repeated Scan of File
A
B
C
D
File
Empty
Frame
Empty
Frame
Empty
Frame
Buffer
Scan
Cache Hits:
Attempts:
A
0
2
B
Scan
LRU: Repeated Scan of File
A
B
C
D
File
Empty
Frame
Empty
Frame
Empty
Frame
Buffer
Scan
Cache Hits:
Attempts:
A
0
3
B
C
Scan
LRU: Repeated Scan of File
A
B
C
D
File
Empty
Frame
Empty
Frame
Empty
Frame
Buffer
Scan
Cache Hits:
Attempts:
0
3
B
C
LRU
D
D
Scan
A
LRU: Repeated Scan of File
A
B
C
D
File
Empty
Frame
Empty
Frame
Empty
Frame
Buffer
Scan
Cache Hits:
Attempts:
0
4
C
LRU
Scan
D
A
A
B
LRU: Repeated Scan of File
A
B
C
D
File
Empty
Frame
Empty
Frame
Empty
Frame
Buffer
Scan
Cache Hits:
Attempts:
0
5
A
LRU
Scan
D
B
B
C
LRU: Repeated Scan of File
A
B
C
D
File
Empty
Frame
Empty
Frame
Empty
Frame
Buffer
Scan
Cache Hits:
Attempts:
0
6
A
LRU
Scan
D
B
No Cache Hits!
Sequential Flooding
MRU (Most Recently Used)
MRU: Repeated Scan of File
A
B
C
D
File
Empty
Frame
Empty
Frame
Empty
Frame
Buffer
Scan
Cache Hits:
Attempts:
0
0
Scan
MRU: Repeated Scan of File
A
B
C
D
File
Empty
Frame
Empty
Frame
Empty
Frame
Buffer
Scan
Cache Hits:
Attempts:
0
1
Scan
A
MRU
MRU: Repeated Scan of File
A
B
C
D
File
Empty
Frame
Empty
Frame
Empty
Frame
Buffer
Scan
Cache Hits:
Attempts:
0
2
Scan
A
B
MRU
MRU: Repeated Scan of File
A
B
C
D
File
Empty
Frame
Empty
Frame
Empty
Frame
Buffer
Scan
Cache Hits:
Attempts:
0
3
Scan
A
B
C
MRU
MRU: Repeated Scan of File
A
B
C
D
File
Empty
Frame
Empty
Frame
Empty
Frame
Buffer
Scan
Cache Hits:
Attempts:
0
4
Scan
A
B
D
MRU
D
D
MRU: Repeated Scan of File
A
B
C
D
File
Empty
Frame
Empty
Frame
Empty
Frame
Buffer
Scan
Cache Hits:
Attempts:
1
5
Scan
A
B
D
MRU
Hit!
MRU: Repeated Scan of File
A
B
C
D
File
Empty
Frame
Empty
Frame
Empty
Frame
Buffer
Scan
Cache Hits:
Attempts:
2
6
Scan
A
B
D
MRU
Hit!
MRU: Repeated Scan of File
A
B
C
D
File
Empty
Frame
Empty
Frame
Empty
Frame
Buffer
Scan
Cache Hits:
Attempts:
2
Scan
A
C
D
MRU
7
C
C
MRU: Repeated Scan of File
A
B
C
D
File
Empty
Frame
Empty
Frame
Empty
Frame
Buffer
Scan
Cache Hits:
Attempts:
3
8
Scan
A
C
D
MRU
Hit!
Improved
Cache
Hit Rate!
Summary of Replacement Policies
Worksheet
Note on access patterns
Worksheet: LRU
A B C D A F A D G D G E D F
Worksheet: LRU
A B C D A F A D G D G E D F
Worksheet: MRU
A B C D A F A D G D G E D F
Worksheet: MRU
A B C D A F A D G D G E D F
Worksheet: Clock
A B C D A F A D G D G E D F
A B C D A F A D G D G E D F
* For this problem, none of the pages remained pinned. The animation for pinning/unpinning each page is not shown.
Empty Frame
Empty Frame
0
Empty Frame
0
Empty Frame
0
0
A B C D A F A D G D G E D F
1
A
Empty Frame
0
Empty Frame
0
Empty Frame
0
A B C D A F A D G D G E D F
1
A
Empty Frame
0
Empty Frame
0
Empty Frame
0
A B C D A F A D G D G E D F
1
A
Empty Frame
0
Empty Frame
0
B
1
A B C D A F A D G D G E D F
1
A
Empty Frame
0
Empty Frame
0
B
1
A B C D A F A D G D G E D F
1
A
Empty Frame
Empty Frame
0
B
1
C
1
A B C D A F A D G D G E D F
1
A
Empty Frame
Empty Frame
0
B
1
C
1
A B C D A F A D G D G E D F
1
A
Empty Frame
B
1
C
1
1
D
A B C D A F A D G D G E D F
1
A
Empty Frame
B
1
C
1
1
D
A B C D A F A D G D G E D F
Hit
Set ref bit = 1
Do not move clock hand
1
A
Empty Frame
B
1
C
1
1
D
A B C D A F A D G D G E D F
Current frame’s ref bit = 1 →
Set ref bit to 0
Advance clock hand
A
Empty Frame
B
1
C
1
1
D
0
A B C D A F A D G D G E D F
Current frame’s ref bit = 1 →
Set ref bit to 0
Advance clock hand
A
Empty Frame
B
C
1
1
D
0
0
A B C D A F A D G D G E D F
Current frame’s ref bit = 1 →
Set ref bit to 0
Advance clock hand
A
Empty Frame
B
C
0
1
D
0
0
A B C D A F A D G D G E D F
Current frame’s ref bit = 1 →
Set ref bit to 0
Advance clock hand
A
Empty Frame
B
C
0
0
D
0
0
A B C D A F A D G D G E D F
Current frame’s ref bit = 0 →
Evict existing page
Read in new page
Set ref bit = 1
Advance clock hand
F
Empty Frame
B
C
0
0
D
1
0
A B C D A F A D G D G E D F
Current frame’s ref bit = 0 →
Evict existing page
Read in new page
Set ref bit = 1
Advance clock hand
F
Empty Frame
B
C
0
0
D
1
0
A B C D A F A D G D G E D F
Current frame’s ref bit = 0 →
Evict existing page
Read in new page
Set ref bit = 1
Advance clock hand
F
Empty Frame
A
C
0
0
D
1
1
A B C D A F A D G D G E D F
Current frame’s ref bit = 0 →
Evict existing page
Read in new page
Set ref bit = 1
Advance clock hand
F
Empty Frame
A
C
0
0
D
1
1
A B C D A F A D G D G E D F
Hit
Set ref bit = 1
Do not move clock hand
F
Empty Frame
A
C
0
D
1
1
1
A B C D A F A D G D G E D F
Current frame’s ref bit = 0 →
Evict existing page
Read in new page
Set ref bit = 1
Advance clock hand
F
Empty Frame
A
G
D
1
1
1
1
A B C D A F A D G D G E D F
Current frame’s ref bit = 0 →
Evict existing page
Read in new page
Set ref bit = 1
Advance clock hand
F
Empty Frame
A
G
D
1
1
1
1
A B C D A F A D G D G E D F
Hit
Set ref bit = 1
Do not move clock hand
F
Empty Frame
A
G
D
1
1
1
1
A B C D A F A D G D G E D F
Hit
Set ref bit = 1
Do not move clock hand
F
Empty Frame
A
G
D
1
1
1
1
A B C D A F A D G D G E D F
Current frame’s ref bit = 1 →
Set ref bit to 0
Advance clock hand
F
Empty Frame
A
G
D
1
1
0
1
A B C D A F A D G D G E D F
Current frame’s ref bit = 1 →
Set ref bit to 0
Advance clock hand
F
Empty Frame
A
G
D
1
1
0
0
A B C D A F A D G D G E D F
Current frame’s ref bit = 1 →
Set ref bit to 0
Advance clock hand
F
Empty Frame
A
G
D
1
0
0
0
A B C D A F A D G D G E D F
Current frame’s ref bit = 1 →
Set ref bit to 0
Advance clock hand
F
Empty Frame
A
G
D
0
0
0
0
A B C D A F A D G D G E D F
Current frame’s ref bit = 0 →
Evict existing page
Read in new page
Set ref bit = 1
Advance clock hand
F
Empty Frame
A
G
E
0
0
0
1
A B C D A F A D G D G E D F
Current frame’s ref bit = 0 →
Evict existing page
Read in new page
Set ref bit = 1
Advance clock hand
F
Empty Frame
A
G
E
0
0
0
1
A B C D A F A D G D G E D F
Current frame’s ref bit = 0 →
Evict existing page
Read in new page
Set ref bit = 1
Advance clock hand
D
Empty Frame
A
G
E
0
0
1
1
A B C D A F A D G D G E D F
Current frame’s ref bit = 0 →
Evict existing page
Read in new page
Set ref bit = 1
Advance clock hand
D
Empty Frame
A
G
E
0
0
1
1
A B C D A F A D G D G E D F
Current frame’s ref bit = 0 →
Evict existing page
Read in new page
Set ref bit = 1
Advance clock hand
D
Empty Frame
F
G
E
0
1
1
1
A B C D A F A D G D G E D F (Hit rate = 4/14)
Current frame’s ref bit = 0 →
Evict existing page
Read in new page
Set ref bit = 1
Advance clock hand
D
Empty Frame
F
G
E
0
1
1
1
Another visualization for Clock Policy
Red frame outline is where the clock hand points at the end of the access
Red box means the second chance bit is zero
Green box means the second chance bit is one
* means a hit
Red frame outline is where the clock hand points at the end of the access
Red box means the second chance bit is zero
Green box means the second chance bit is one
* means a hit
Red frame outline is where the clock hand points at the end of the access
Red box means the second chance bit is zero
Green box means the second chance bit is one
* means a hit
Red frame outline is where the clock hand points at the end of the access
Red box means the second chance bit is zero
Green box means the second chance bit is one
* means a hit
Red frame outline is where the clock hand points at the end of the access
Red box means the second chance bit is zero
Green box means the second chance bit is one
* means a hit
Red frame outline is where the clock hand points at the end of the access
Red box means the second chance bit is zero
Green box means the second chance bit is one
* means a hit
Red frame outline is where the clock hand points at the end of the access
Red box means the second chance bit is zero
Green box means the second chance bit is one
* means a hit
Red frame outline is where the clock hand points at the end of the access
Red box means the second chance bit is zero
Green box means the second chance bit is one
* means a hit
Red frame outline is where the clock hand points at the end of the access
Red box means the second chance bit is zero
Green box means the second chance bit is one
* means a hit
Red frame outline is where the clock hand points at the end of the access
Red box means the second chance bit is zero
Green box means the second chance bit is one
* means a hit
Red frame outline is where the clock hand points at the end of the access
Red box means the second chance bit is zero
Green box means the second chance bit is one
* means a hit
Red frame outline is where the clock hand points at the end of the access
Red box means the second chance bit is zero
Green box means the second chance bit is one
* means a hit
Red frame outline is where the clock hand points at the end of the access
Red box means the second chance bit is zero
Green box means the second chance bit is one
* means a hit
Red frame outline is where the clock hand points at the end of the access
Red box means the second chance bit is zero
Green box means the second chance bit is one
* means a hit
Red frame outline is where the clock hand points at the end of the access
Red box means the second chance bit is zero
Green box means the second chance bit is one
* means a hit
Red frame outline is where the clock hand points at the end of the access
Red box means the second chance bit is zero
Green box means the second chance bit is one
* means a hit
Red frame outline is where the clock hand points at the end of the access
Red box means the second chance bit is zero
Green box means the second chance bit is one
* means a hit
Red frame outline is where the clock hand points at the end of the access
Red box means the second chance bit is zero
Green box means the second chance bit is one
* means a hit
Red frame outline is where the clock hand points at the end of the access
Red box means the second chance bit is zero
Green box means the second chance bit is one
* means a hit
Worksheet: LRU w/ Pinned Pages
A, B, C, D, A, F (remains pinned), D, G, D, unpin F, G, E, D, F
Worksheet: LRU w/ Pinned Pages
A, B, C, D, A, F (remains pinned), D, G, D, unpin F, G, E, D, F
A | | | | * | | | | | | | | E | | | 6 13 |
| B | | | | F | x | x | x | x | x | | | | * | |
| | C | | | | | | G | | | * | | | | |
| | | D | | | | * | | * | | | | * | |
* = hit Remember that unpinning doesn’t contribute to the hit count
X = pinned
Worksheet: MRU w/ Pinned Pages
A, B, C, D, A, F (remains pinned), D, G, D, unpin F, G, E, D, F
Worksheet: MRU w/ Pinned Pages
A, B, C, D, A, F (remains pinned), D, G, D, unpin F, G, E, D, F
A | | | | * | F | x | x | x | x | x | G | E | | | 3 13 |
| B | | | | | | | | | | | | | | |
| | C | | | | | | | | | | | | | |
| | | D | | | | * | G | D | | | | * | F |
* = hit Remember that unpinning doesn’t contribute to the hit count!
X = pinned
Worksheet 1 (continued)
(c) Is MRU ever better than LRU?
(d) Why would we use a clock replacement policy over LRU?
(e) Why would it be useful for a database management system to implement its own buffer replacement policy? Why shouldn’t we just rely on the operating system?
Worksheet 1 (continued)
(c) Is MRU ever better than LRU?
Yes, MRU prevents sequential flooding during sequential scans
(d) Why would we use a clock replacement policy over LRU?
Efficiency (approximation of LRU; don’t need to maintain entire ordering)
(e) Why would it be useful for a database management system to implement its own buffer replacement policy? Why shouldn’t we just rely on the operating system?
The database management system knows its data access patterns, which allows it to optimize its buffer replacement policy for each case
Spatial Indexes
Spatial Indexes
id | vec |
1 | [0, 10] |
2 | [10, 0] |
3 | [5, 5] |
4 | [6, 2] |
5 | [3, 8] |
6 | [2, 3] |
7 | [6, 9] |
8 | [3, 2] |
Query: Find the closest vector to [9, 5] with respect to the manhattan distance
d([9, 5], [0, 10]) = |9 - 0| + |5 - 10| = 14
d([9, 5], [10, 0]) = |9 - 10| + |5 - 10| = 6
d([9, 5], [5, 5]) = |9 - 5| + |5 - 5| = 4
d([9, 5], [6, 2]) = |9 - 6| + |5 - 2| = 6
d([9, 5], [3, 8]) = |9 - 3| + |5 - 8| = 9
Closest
d([9, 5], [2, 3]) = |9 - 2| + |5 - 3| = 9
d([9, 5], [6, 9]) = |9 - 6| + |5 - 9| = 7
d([9, 3], [6, 2]) = |9 - 3| + |6 - 2| = 10
Using Indexes
K-d Tree Construction
Example
id | vec |
1 | [0, 10] |
2 | [10, 0] |
3 | [5, 5] |
4 | [6, 2] |
5 | [3, 8] |
6 | [2, 3] |
7 | [6, 9] |
8 | [3, 2] |
Stored as a Tree
L
R
U
D
U
D
L
L
L
L
R
R
R
R
[0, 10]
[3, 8]
[2, 3]
[3, 2]
[5, 5]
[6, 9]
[6, 2]
[10, 0]
K-d Tree Nearest Neighbor Search
Example
id | vec |
1 | [0, 10] |
2 | [10, 0] |
3 | [5, 5] |
4 | [6, 2] |
5 | [3, 8] |
6 | [2, 3] |
7 | [6, 9] |
8 | [3, 2] |
Example
id | vec |
1 | [0, 10] |
2 | [10, 0] |
3 | [5, 5] |
4 | [6, 2] |
5 | [3, 8] |
6 | [2, 3] |
7 | [6, 9] |
8 | [3, 2] |
Example
id | vec |
1 | [0, 10] |
2 | [10, 0] |
3 | [5, 5] |
4 | [6, 2] |
5 | [3, 8] |
6 | [2, 3] |
7 | [6, 9] |
8 | [3, 2] |
Example
7
id | vec |
1 | [0, 10] |
2 | [10, 0] |
3 | [5, 5] |
4 | [6, 2] |
5 | [3, 8] |
6 | [2, 3] |
7 | [6, 9] |
8 | [3, 2] |
Backtrack
7
3.5
id | vec |
1 | [0, 10] |
2 | [10, 0] |
3 | [5, 5] |
4 | [6, 2] |
5 | [3, 8] |
6 | [2, 3] |
7 | [6, 9] |
8 | [3, 2] |
Backtrack
4
id | vec |
1 | [0, 10] |
2 | [10, 0] |
3 | [5, 5] |
4 | [6, 2] |
5 | [3, 8] |
6 | [2, 3] |
7 | [6, 9] |
8 | [3, 2] |
Backtrack
4
1.5
id | vec |
1 | [0, 10] |
2 | [10, 0] |
3 | [5, 5] |
4 | [6, 2] |
5 | [3, 8] |
6 | [2, 3] |
7 | [6, 9] |
8 | [3, 2] |
Backtrack
4
id | vec |
1 | [0, 10] |
2 | [10, 0] |
3 | [5, 5] |
4 | [6, 2] |
5 | [3, 8] |
6 | [2, 3] |
7 | [6, 9] |
8 | [3, 2] |
Backtrack
4
6
id | vec |
1 | [0, 10] |
2 | [10, 0] |
3 | [5, 5] |
4 | [6, 2] |
5 | [3, 8] |
6 | [2, 3] |
7 | [6, 9] |
8 | [3, 2] |
Backtrack
4
6
id | vec |
1 | [0, 10] |
2 | [10, 0] |
3 | [5, 5] |
4 | [6, 2] |
5 | [3, 8] |
6 | [2, 3] |
7 | [6, 9] |
8 | [3, 2] |
Backtrack
4
5
id | vec |
1 | [0, 10] |
2 | [10, 0] |
3 | [5, 5] |
4 | [6, 2] |
5 | [3, 8] |
6 | [2, 3] |
7 | [6, 9] |
8 | [3, 2] |
Done
4
id | vec |
1 | [0, 10] |
2 | [10, 0] |
3 | [5, 5] |
4 | [6, 2] |
5 | [3, 8] |
6 | [2, 3] |
7 | [6, 9] |
8 | [3, 2] |
Worksheet #2a
How many distance computations with database points are needed to find the nearest neighbor using brute force?
Worksheet #2a
How many distance computations with database points are needed to find the nearest neighbor using brute force?
8, since we need to compare our query point to each database point
Worksheet #2b
How many distance computations with database points are needed to find the nearest neighbor using our k-d tree?
Worksheet #2b
How many distance computations with database points are needed to find the nearest neighbor using our k-d tree?
2, since we need to check both the points towards the bottom right
Attendance Link