BufferPool - Spill to Disk
Design Summary
High-level Design
BufferPool API
ReservationTracker
In-Memory Buffer/Page Mgmt
Operators & Query Lifecycle
Buffer-splitting allocator for hash tables
Spilled Page Mgmt
Scalable Buffer Allocator
Tmp File Mgmt
DiskIoMgr improvements
Completed
Code available
Work in progress/Not started
Planner Reservation Computation/Allocation
Spilled Page Management
Spilled Page Mgmt - Page States
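States: Pinned, Unpinned (dirty), Unpinned (write in flight), Unpinned (clean), Unpinned (evicted), Destroyed
Transitions:
• Create() -> Pinned
• Pinned -> Unpin() -> Pinned (pin_count > 0) or Unpinned (dirty) (pin_count == 0)
• Unpinned (dirty) -> Flush -> Unpinned (write in flight)
• Unpinned (write in flight) -> Write Complete -> Unpinned (clean)
• Unpinned (clean) -> Evict -> Unpinned (evicted)
• Unpinned (dirty), Unpinned (write in flight), Unpinned (clean) -> Pin() -> Pinned
• Unpinned (evicted) -> Pin() + blocking read -> Pinned
• Any state -> Close() -> Destroyed
• Error, Write Error (source and target states not shown here)
Legend: Has buffer / No buffer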
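The page states above can be sketched as a small state machine. This is a minimal illustration, not the actual implementation: the `Page` class, the `PageState` names, and the `HAS_BUFFER` grouping are hypothetical, and which states hold a buffer is assumed from the legend (evicted and destroyed pages are taken to have none). The Write Error path is left out because its target state is not given here.

```python
from enum import Enum, auto


class PageState(Enum):
    PINNED = auto()
    UNPINNED_DIRTY = auto()
    UNPINNED_WRITE_IN_FLIGHT = auto()
    UNPINNED_CLEAN = auto()
    UNPINNED_EVICTED = auto()
    DESTROYED = auto()


# Assumption: these states keep an in-memory buffer ("Has buffer" in the legend).
HAS_BUFFER = {
    PageState.PINNED,
    PageState.UNPINNED_DIRTY,
    PageState.UNPINNED_WRITE_IN_FLIGHT,
    PageState.UNPINNED_CLEAN,
}


class Page:
    def __init__(self):
        # Create() yields a pinned page with a single pin.
        self.state = PageState.PINNED
        self.pin_count = 1

    def has_buffer(self):
        return self.state in HAS_BUFFER

    def _move(self, expected, new_state):
        # Helper: allow a transition only from the expected state.
        if self.state is not expected:
            raise RuntimeError(f"invalid transition from {self.state.name}")
        self.state = new_state

    def pin(self):
        if self.state is PageState.DESTROYED:
            raise RuntimeError("page is destroyed")
        # Pinning an evicted page would need a blocking read from disk;
        # only the state change is modeled here.
        self.state = PageState.PINNED
        self.pin_count += 1

    def unpin(self):
        if self.state is not PageState.PINNED:
            raise RuntimeError("page is not pinned")
        self.pin_count -= 1
        if self.pin_count == 0:
            self.state = PageState.UNPINNED_DIRTY

    def flush(self):
        self._move(PageState.UNPINNED_DIRTY, PageState.UNPINNED_WRITE_IN_FLIGHT)

    def write_complete(self):
        self._move(PageState.UNPINNED_WRITE_IN_FLIGHT, PageState.UNPINNED_CLEAN)

    def evict(self):
        self._move(PageState.UNPINNED_CLEAN, PageState.UNPINNED_EVICTED)

    def close(self):
        # Close() is allowed from any state.
        self.state = PageState.DESTROYED
        self.pin_count = 0
```

A page that is created, unpinned, flushed, and evicted ends in `UNPINNED_EVICTED` with no buffer; a later `pin()` returns it to `PINNED`.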
Buffer Ownership
Each buffer is owned by either:
• a BufferPool client, e.g. ExecNode or MemPool
• a page (see above)
• internal free lists (if freed by client or a page was evicted)
Note: there is now also a Pinned (read in flight) state. The page state diagram needs to be updated.
Spilled Page Mgmt - Policies
Spilled Page Mgmt
Client:
• Pinned pages
• Dirty pages (LIFO)
• Write in-flight pages
• Clean pages
• Evicted pages
Page flow: Pinned -> Unpin() -> Dirty -> flush -> Write in-flight -> write complete -> Clean -> evict -> Evicted
Clean Page Lists (one per core), by buffer size: 64kb, ..., 2mb, ..., MAX_MB mb
Memory allocator:
• Free buffer lists (one per core), by buffer size: 64kb, ..., 2mb, ..., MAX_MB mb
TBD: eviction policy - FIFO?
Implementation notes:
These data structures will be combined to reduce the number of lock acquisitions.
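The movement of pages between these lists can be sketched as follows. This is an illustrative sketch under stated assumptions: the class `SpilledPageLists` and its method names are hypothetical, a single clean list stands in for the per-core lists, and FIFO eviction is assumed only because the slide floats it as a candidate (the policy is still TBD).

```python
from collections import deque


class SpilledPageLists:
    """Hypothetical bookkeeping for unpinned pages (not the real data structures)."""

    def __init__(self):
        self.dirty = []          # LIFO: the most recently unpinned page is flushed first
        self.in_flight = set()   # pages whose writes have been issued
        self.clean = deque()     # assumed FIFO eviction order (policy is TBD)
        self.evicted = set()     # pages whose buffers have been reclaimed

    def unpin(self, page):
        # Unpin() with pin_count == 0 makes the page dirty.
        self.dirty.append(page)

    def flush_one(self):
        # Start a write for the most recently unpinned dirty page.
        page = self.dirty.pop()
        self.in_flight.add(page)
        return page

    def write_complete(self, page):
        # The write finished, so the page is now clean and evictable.
        self.in_flight.remove(page)
        self.clean.append(page)

    def evict_one(self):
        # Reclaim the buffer of the page that has been clean the longest.
        page = self.clean.popleft()
        self.evicted.add(page)
        return page
```

Keeping the four collections in one object also hints at why combining them helps: one lock can then guard a page's move between lists.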
Scalable Memory Allocator
Free buffer lists per core:
• Core 0: 64kb, ..., 16mb
• Core 1: 64kb, ..., 16mb
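A per-core free-list layout like this one could look roughly like the sketch below. Several details are assumptions made for illustration: buffer sizes are taken to be powers of two between 64kb and 16mb, an empty local list falls back to other cores' lists before allocating fresh memory, and the names `size_class` and `PerCoreFreeLists` are hypothetical.

```python
MIN_BUFFER_LEN = 64 * 1024          # 64kb, smallest size shown on the slide
MAX_BUFFER_LEN = 16 * 1024 * 1024   # 16mb, largest size shown on the slide


def size_class(length):
    """Index of the smallest power-of-two size class that fits `length` bytes."""
    if not 0 < length <= MAX_BUFFER_LEN:
        raise ValueError("unsupported buffer length")
    cls, size = 0, MIN_BUFFER_LEN
    while size < length:
        size *= 2
        cls += 1
    return cls


class PerCoreFreeLists:
    def __init__(self, num_cores):
        num_classes = size_class(MAX_BUFFER_LEN) + 1
        # lists[core][size class] is a stack of free buffers.
        self.lists = [[[] for _ in range(num_classes)] for _ in range(num_cores)]

    def allocate(self, core, length):
        cls = size_class(length)
        # Try the local core first, then the other cores (assumed fallback).
        order = [core] + [c for c in range(len(self.lists)) if c != core]
        for c in order:
            if self.lists[c][cls]:
                return self.lists[c][cls].pop()
        # Nothing free anywhere: allocate a new buffer of the rounded-up size.
        return bytearray(MIN_BUFFER_LEN << cls)

    def free(self, core, buf):
        # Return the buffer to the free list of the core that released it.
        self.lists[core][size_class(len(buf))].append(buf)
```

In a real multi-threaded allocator each core's lists would need their own lock; the point of the per-core split is that threads on different cores rarely contend for the same one.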
Buffer-splitting Allocator for Hash Tables
Managing split between BufferPool and non-BufferPool Memory
Tmp File Management
Tmp File Management Design
DiskIoMgr Improvements