Imprecise Store Exceptions
Siddharth Gupta Yuanlong Li Qingxuan Kang�Abhishek Bhattacharjee Babak Falsafi Yunho Oh Mathias Payer�midgard.epfl.ch
Memory Capacity is Increasing
2
Future servers will have access to PBs of memory capacity
Memory Capacity
Technology
GBs
~100 GBs
~10 TBs
PBs
Virtual Memory (VM)
Address translation + permission check
TBs-PBs of memory capacity in servers
3
ST
ST
LD
ST
Performance bottleneck at the core for high memory capacity
Permission�Check
~1000 TLB entries
$
Memory
Previous Proposals
Virtual cache hierarchies
Intermediate address spaces
4
ST
ST
LD
ST
Page Table Walk
Proposals to push address translation towards memory
$
Memory
~1000 TLB entries
Permission�Check
Late Exception Detection
Store instructions can retire before completion
Precise handling requires too much state
5
Memory consistency model bars precise exception handling
ST
ST
LD
ST
$
Memory
Permission�Check
ST
Page Table Walk
This Paper: Imprecise Store Exceptions
Retired-but-incomplete stores are pending
Handle store exceptions imprecisely
6
$
Exceptions triggered imprecisely on an unrelated instruction
ST
Page Table Walk
Memory
Permission�Check
Contributions
OS + microarchitecture codesign with formalism
Key results
7
OS + microarchitecture codesign with formalism for exception handling
Reason for Imprecision
Out-of-order cores retire stores early
8
Exceptions generated after store retirement cannot be handled precisely
Load Store�Queue
Store�Buffer
Page Table Walk
ST
Core
Page Fault
Retire
Imprecise Store Exceptions: Design
Problem: Stores pending in the store buffer
Proposal: Hand off stores to the OS
9
Microarchitecture will record, OS will replay the stores
Implementation
Microarchitecture records stores in an in-memory queue
OS reads the recorded stores in the exception handler
10
Simple per-core controllers to record stores
ST
ST
ST
ST
µArch
In-Memory Queue
OS
Correctness Through Formalism
Does the final design comply with the memory consistency model?
11
Formalism is essential for proving compliance with the memory model
Proofs in the paper
Not Just Virtual Memory!
Imprecise exceptions might be caused by
12
täkō
Load Store�Queue
Store�Buffer
ST
Core
Page Fault
Retire
Evaluation
Full-system prototype with FireSim
Results
13
Artifact available on:
github.com/parsa-epfl/imprecise_store_exceptions
Conclusion
14
15
Memory Consistency Litmus Tests
16
Ordering relation | Cases covered |
Dependencies | 2366 |
Program order (same location) | 368 |
Preserved program order | 733 |
External read-from order | 1544 |
Internal read-from order | 1304 |
Coherence order | 747 |
From-read order | 976 |
Barriers | 1581 |
All litmus tests pass on the RISC-V prototype
RISC-V Litmus Tests [Alglave, CAV’10]