A Game of Cache Attacks and Defense
Moinuddin Qureshi
MAD Workshop
June 19, 2022
Shared Caches: Good for Performance, Bad for Security!
2
Shared LLC helps improve utilization and performance,
But timing difference (LLC hit vs miss) & sharing of LLC space leads to vulnerabilities!
L1 Cache
L2 Cache
CORE-0
Process-0
Shared
Last Level Cache (LLC)
L1 Cache
L2 Cache
CORE-1
Process-1
LLC-Hit 🡪 Fast
LLC-Miss to DRAM 🡪 Slow
Private
Conflict-Based Cache Attacks
B
LLC
CORE
(Spy)
CORE
(Victim)
A
B
V
Co-running spy can infer access pattern of victim by causing cache conflicts
Conflicts leak access pattern, used to obtain secrets [AES – Bernstein’05]
Miss for B
Victim Accessed Set
Solutions Space
Way Partitioning
Mapping Table
(MT)
Randomization
Not scalable to many core
Mapping Table large for LLC (MBs)
OS support needed to protect Table
Inefficient use of cache space
Outline
Shared Cache
Cache Attack
Shared Cache 🡺 Proprietary Hash
Key Attack Step: Form Eviction Set
B
LLC
CORE
(Spy)
CORE
(Victim)
A
B
Eviction Set: A minimal group of lines that can evict the victim line (conflict)
Formation of eviction-set is needed to launch conflict-based attack
Proprietary Indexing Function
Cache index function determines the line to set mapping
Proprietary index functions can make it hard to form eviction set
B
B
Line Address
Proprietary
Function
?
Line Address
Example: Line-to-Bank Mapping
Modern Intel processors (Sandy Bridge) use banked cache
The proprietary index function used for load balancing and obfuscation
Outline
Shared Cache
Cache Attack
Shared Cache 🡺 Proprietary Hash
Cache Attack 🡺 O(N^2) Attack
Attacking Under Unknown Function
Attacker needs an efficient search algorithm to form eviction set
LLC
X
Find “L” random lines that can cause conflict miss on target set
Step-1: Trial-and-Error
Use a search algorithm, find “W+1” conflicting lines in L
Step-2: Search Algorithm
Launch conflict-based attack
Step-3: Launch Attack
Learning Eviction Set
D
B
A
C
LLC
E
Attacker can form eviction-set with about O(N^2) accesses 🡺 milliseconds
[Liu et al. S&P, 2015]
Form pattern such that cache has a conflict miss
Removed line NOT
in conflicting set
Removed line maps
to conflicting set
Remove one line from pattern & check conflict
Conflict Miss?
No
Yes
Breaking one machine allows the attacker to break other machines
Outline
Shared Cache
Cache Attack
Shared Cache 🡺 Proprietary Hash
Cache Attack 🡺 O(N^2) Attack
Shared Cache 🡺 Proprietary Hash 🡺 CEASER
MICRO 2018, Best Paper Award
CEASE: Cache using Encrypted Address Space
Insight: Don’t memorize the random mapping, compute it
LLC
xCAFE0000
Physical Line
Address (PLA)
Encrypt
Key
Dirty Evict
ELA
(ELA)
Decrypt
Key
0xa17b20cf
CEASE
Localized change (ELA visible only within the cache)
Cache operations (access, coherence, prefetch) all remain unchanged
Randomization via Encryption
Lines that mapped to the same set, get scattered to different sets
A’
B’
LLC
Encrypt
Key
CEASE
B’’
A’’
LLC
Encrypt
Key
CEASE
A
B
LLC
Mapping depends on the key, different machines have different keys
Learning Eviction Set
D
B
A
C
LLC
E
Attacker can break CEASE within 22 seconds (8MB LLC)
[Liu et al. S&P, 2015]
Form pattern such that cache has a conflict miss
Removed line NOT
in conflicting set
Removed line MAPS
to conflicting set
Remove one line from pattern & check conflict
Conflict Miss?
No
Yes
CEASER: CEASE with Remapping
CEASER uses gradual remapping to periodically change the keys
Split time into Epoch of N accesses (change Key every Epoch)
CACHE
Key
Key
Key
CurrKey
NextKey
CurrKey
NextKey
CurrKey
NextKey
EPOCH
BULK
GRADUAL
time
Key Takeaway for CEASER
Robust to O(N^2)attack
Negligible slowdown (~ 1%)
No OS support needed
Negligible storage overheads
Localized change (within cache)
Cache
Line Address
Encrypt
Key1
Key2
Change Key, Periodically
Need practical solution to protect LLC from conflict-based attacks
CEASER can enable practical randomization for LLCs
Outline
Shared Cache
Cache Attack
Shared Cache 🡺 Proprietary Hash
Cache Attack 🡺 O(N^2) Attack
Shared Cache 🡺 Proprietary Hash 🡺 CEASER
Cache Attack 🡺 O(N^2) Attack 🡺 O(N) Attack
ISCA 2019
S&P 2019
State-of-the-Art Search Algorithm (SHM)
D
A
C
B
LLC
X
Single Holdout Method (SHM): for L=1200, W=16, need about 1M accesses Can we develop faster attacks than state-of-the-art?
[Liu et al. S&P, 2015]
Form pattern such that cache has a conflict miss
Removed line NOT
in conflicting set
Removed line MAPS
to conflicting set
Remove one line from pattern & check conflict
Conflict Miss?
No
Yes
Attack-1: Fast Search Algorithm (GEM)
Group Elimination Method : L=1200, W=16, accesses reduced from 1M to 40K
🡺 Remap Rate of CEASER must be increased to >25% to mitigate GEM
Divide L in “W+1” groups. Hold out 1 group. If conflict, discard the group.
(R=1)
(R=2)
(R=3)
Attack-2: Exploit Replacement Policy
Attack-2 avoids search algorithm : L=1200, accesses reduced from 1M to ~3K
🡺 Remap Rate of CEASER must be increased to >100%
LRU susceptible to thrashing. Exploit property to avoid search algorithm
D
A
C
B
LLC
X
A
B
C
D
X
A
B
C
D
X
Access Pattern to Attack LRU
3 misses, all to conflicting set
No need for search algorithm
A
B
C
D
X
A
B
C
D
X
A
B
C
D
X
Access Pattern to Attack RRIP
Outline
Shared Cache
Cache Attack
Shared Cache 🡺 Proprietary Hash
Cache Attack 🡺 O(N^2) Attack
Shared Cache 🡺 Proprietary Hash 🡺 CEASER
Cache Attack 🡺 O(N^2) Attack 🡺 O(N) Attack
Shared Cache 🡺 Proprietary Hash 🡺 CEASER 🡺 CEASER-S
ISCA 2019
Skewed-Cache: Diversity of Locations
Skewed-Cache enables line to map to different sets. Static hash insecure.
A
B
Set-Associative Cache
X
X?
X?
Skewed Cache
A/B
h2
h1
A/B
X
The Good:
Map to different sets
Need (for security):
Unpredictable hashes
Dynamic changing hashes
Skewed-CEASER
Attacker needs to dislodge target line from multiple possible locations
Skewed-CEASER with 2+ partitions possible (stronger security)
Enable CEASER to map lines to different sets via principles of skewing
Security Analysis
Skewed-CEASER can tolerate fast attacks with Remap-Rate of 1%
Time to get enough number of “hard conflict” lines (dislodge both locations)
X?
X?
A/B
A/B
Key Takeaway
Skewed-CEASER has flexibility in placement of line, increased robustness
(performance and storage overhead remains negligible)
Cache randomization makes it harder for the adversary to form eviction sets
Attack-1: Fast search algorithm (GEM)
Attack-2: Exploit replacement policy, avoid search
Defense: Skewed-CEASER = CEASER + Skewed-Cache
Concurrent Paper
IEEE S&P, May 2019
Skewed Cache (Static)
USENIX Security, Aug 2019
Outline
Shared Cache
Cache Attack
Shared Cache 🡺 Proprietary Hash
Cache Attack 🡺 O(N^2) Attack
Shared Cache 🡺 Proprietary Hash 🡺 CEASER
Cache Attack 🡺 O(N^2) Attack 🡺 O(N) Attack
Shared Cache 🡺 Proprietary Hash 🡺 CEASER 🡺 CEASER-S
Cache Attack 🡺 O(N^2) Attack 🡺 O(N) Attack 🡺 Probabilistic Attack
arxiv 2019, S&P’21
MICRO 2020
Probabilistic Attack
Probabilistic attacks gather large number of soft conflicts for each partition
Try to get a “soft conflict” 🡺 conflicting line evicts target from one partition
X?
X?
A/B
A/B
Attack: Prime + Prune + Probe
Probabilistic attacks significantly weakens the security of skewed designs
Outline
Cache Attack
Cache Attack 🡺 O(N^2) Attack
Cache Attack 🡺 O(N^2) Attack 🡺 O(N) Attack
Shared Cache 🡺 Proprietary Hash 🡺 CEASER 🡺 CEASER-S
Cache Attack 🡺 O(N^2) Attack 🡺 O(N) Attack 🡺 Probabilistic Attack
Shared Cache 🡺 Proprietary Hash 🡺 CEASER 🡺 CEASER-S 🡺 MIRAGE
USENIX Security 2021
Goal: Eliminate Conflicts, avoid Eviction-Set
Eliminate, not obfuscate, to end the arms race in eviction-set formation
Fully-Associative Random Evictions
(no Set-Associative Evictions)
Security
Challenge: Fully-Associative Lookup
Impractical Lookup
Complexity for LLC
Drawing Inspiration from Load-Balancing to Eliminate Set-Associative-Evictions (SAE)
32
16 Balls in 4 Buckets (C=4)
4C
4C
4C
4C
Balls
(Cachelines)
Random
Buckets
(Sets)
C=4
16 Balls in 4 Buckets (C=8)
Bucket Overflows Almost After Every Ball Thrown
Bucket Overflow Likelihood Reduced, But Still Possible
16 Balls in 4 Buckets of C=8
Power of 2 Choices
[Mitzenmacher’96]
Bucket Overflow Improbable: Balanced Distribution
8C
8C
8C
8C
C=8
8C
8C
8C
8C
C=8
Random
Lower
Load
Security Guarantee With Power of 2 Choices
33
Simulation Results
Theoretical Model
0
1
2
3
4
5
6
Extra Capacity Per Bucket (Default-Capacity = 8)
Balls Per Bucket Overflow
With 75% extra capacity,
1 Bucket Overflow per 1034 Balls Thrown,
Frequency of Bucket Overflows (Set-Associative Evictions)
Equivalently, LLC with 75% extra capacity & Power of 2 Choices Indexing has
Security Guarantee of 1 Set-Associative Eviction Per 1034 LLC Installs (1017 years)
MIRAGE: Guaranteeing Globally Random Evictions
34
Tag-Store
One to One
Data-Store
Extra Tags Cheap, Extra Data Expensive (1:10)
Tag FPTR
Data RPTR
75% extra
Inspired from V-way Cache [ISCA’05]
Sets
Ways
Install in Invalid-Tag
Global Random Eviction
MIRAGE (Decouples Tag and Data)
MIRAGE: Guaranteeing Globally Random Evictions
35
Tag-Store
Data-Store
Extra Tags Cheap, Extra Data Expensive (1:10)
Tag FPTR
Data RPTR
Sets
Ways
Global Random Eviction
MIRAGE
Line Install
In Invalid Tag
Rf1
Skew-0
Skew-1
Rf2
3 Invalid
2 Invalid
Power-of-2-Choices
Indexing
Guarantees
Invalid Tags
Security Guarantee: With 75% extra tags, MIRAGE ensures 1 Set-Associative Eviction capable of leaking information every 1034 LLC Installs (once in 1017 years)
Results - Performance
36
2% Slowdown, 20% Storage Overhead (75% extra tags).
Storage neutral slowdown close to 3.5%.
MIRAGE-
2% slowdown
ScatterCache -
1.7% slowdown
Detailed Results at https://arxiv.org/abs/2009.09090
Trace-Based Simulation of 8-Core 16MB/16-way LLC System
Paper includes MIRAGE-Lite with 50% extra tags and additional optimizations for security
Summary: A Game of Cache Attacks & Defense
Cache Attack
Cache Attack 🡺 O(N^2) Attack
Cache Attack 🡺 O(N^2) Attack 🡺 O(N) Attack
Shared Cache 🡺 Proprietary Hash 🡺 CEASER 🡺 CEASER-S
Cache Attack 🡺 O(N^2) Attack 🡺 O(N) Attack 🡺 Probabilistic Attack
Shared Cache 🡺 Proprietary Hash 🡺 CEASER 🡺 CEASER-S 🡺 MIRAGE
38
Protecting Against Shared-Memory Attacks
Randomization Alone Cannot Mitigate
Shared-Memory Attacks
Duplication of Shared Cache Lines
Needed Across Domains [MICRO’18, SEC’19]
Domain-0
A
Domain-1
LLC
RF
A
A
MIRAGE and Scatter-Cache [SEC’19])
has Domain-ID as input to rand-function for duplication
Domain-0
A
A
Domain-1
RF0
RF1
A
A
LLC
Domain-0
A
A
Domain-1
D0| A
RF
D1| A
MIRAGE LLC
MIRAGE also stores Domain-ID with Tag to enable
LLC-Hit = Domain-Match && Tag-Match,
for duplication even when address maps to same set across domains
A,D1
A,D0
B,D0
B,D1
Cache Occupancy Attacks – Website Fingerprinting
Randomization vs. Cache-Partitioning
40
Cache-Randomization
Defense
Cache-Partitioning
Defense
Set-Conflict
Attacks
Shared-Memory
Attacks
Cache-Occupancy
Attacks
Spy
A
B
B
Victim
V
Spy
V
Victim
Miss Leaks
Victim Access
Hit Leaks
Victim Access
Spy
Victim
Number of Resident Lines
Principled Mitigation of Cache Attacks
Existing Cache-Partitioning Solutions & Limitations
41
Way-Partitioning
Page-Coloring
Need: Scalable & Flexible Cache-Partitioning Solution
with Customizable, Fine-Grain Cache-Allocations (Bespoke)
DAWG [MICRO’18]
STEALTHMEM [SEC’12], MI6 [MICRO’19]
Domain-Based
Way-Selection
Not Scalable (Number of Partitions limited by LLC Associativity)
Not Flexible (Requires DRAM and LLC allocations in same ratio)
Bespoke Cache Partitioning [Saileshwar & Qureshi, SEED 21]
42
Key Idea: Programmable Indexing Enabling “Bespoke” Partitions per Domain (VM/Process)
Scalable – Supports up to 512 Domains in 32MB LLC
Fine-Grain – Partition-size in multiples of 64KB Slice
Flexible – No Restriction on DRAM allocations
Design Challenges:
LLC Divided into Physical Slices of 64KB
Load-Balancing-Hash (LBH) Maps Address to Logical-Slices (LSID) within Partition
Slice-Indirection-Module (SIM)
Maps Logical-Slices to Phyiscal-Slices in LLC