1 of 42

A Game of Cache Attacks and Defense

Moinuddin Qureshi

MAD Workshop

June 19, 2022

2 of 42

Shared Caches: Good for Performance, Bad for Security!

2

Shared LLC helps improve utilization and performance,

But timing difference (LLC hit vs miss) & sharing of LLC space leads to vulnerabilities!

L1 Cache

L2 Cache

CORE-0

Process-0

Shared

Last Level Cache (LLC)

L1 Cache

L2 Cache

CORE-1

Process-1

LLC-Hit 🡪 Fast

LLC-Miss to DRAM 🡪 Slow

Private

3 of 42

Conflict-Based Cache Attacks

B

LLC

CORE

(Spy)

CORE

(Victim)

A

B

V

Co-running spy can infer access pattern of victim by causing cache conflicts

Conflicts leak access pattern, used to obtain secrets [AES – Bernstein’05]

Miss for B

Victim Accessed Set

4 of 42

Solutions Space

Way Partitioning

Mapping Table

(MT)

Randomization

Not scalable to many core

Mapping Table large for LLC (MBs)

OS support needed to protect Table

Inefficient use of cache space

5 of 42

Outline

Shared Cache

Cache Attack

Shared Cache 🡺 Proprietary Hash

6 of 42

Key Attack Step: Form Eviction Set

B

LLC

CORE

(Spy)

CORE

(Victim)

A

B

Eviction Set: A minimal group of lines that can evict the victim line (conflict)

Formation of eviction-set is needed to launch conflict-based attack

7 of 42

Proprietary Indexing Function

Cache index function determines the line to set mapping

Proprietary index functions can make it hard to form eviction set

B

B

Line Address

Proprietary

Function

?

Line Address

8 of 42

Example: Line-to-Bank Mapping

Modern Intel processors (Sandy Bridge) use banked cache

The proprietary index function used for load balancing and obfuscation

9 of 42

Outline

Shared Cache

Cache Attack

Shared Cache 🡺 Proprietary Hash

Cache Attack 🡺 O(N^2) Attack

10 of 42

Attacking Under Unknown Function

Attacker needs an efficient search algorithm to form eviction set

LLC

X

Find “L” random lines that can cause conflict miss on target set

Step-1: Trial-and-Error

Use a search algorithm, find “W+1” conflicting lines in L

Step-2: Search Algorithm

Launch conflict-based attack

Step-3: Launch Attack

11 of 42

Learning Eviction Set

D

B

A

C

LLC

E

Attacker can form eviction-set with about O(N^2) accesses 🡺 milliseconds

[Liu et al. S&P, 2015]

Form pattern such that cache has a conflict miss

Removed line NOT

in conflicting set

Removed line maps

to conflicting set

Remove one line from pattern & check conflict

Conflict Miss?

No

Yes

Breaking one machine allows the attacker to break other machines

12 of 42

Outline

Shared Cache

Cache Attack

Shared Cache 🡺 Proprietary Hash

Cache Attack 🡺 O(N^2) Attack

Shared Cache 🡺 Proprietary Hash 🡺 CEASER

MICRO 2018, Best Paper Award

13 of 42

CEASE: Cache using Encrypted Address Space

Insight: Don’t memorize the random mapping, compute it

LLC

xCAFE0000

Physical Line

Address (PLA)

Encrypt

Key

Dirty Evict

ELA

(ELA)

Decrypt

Key

0xa17b20cf

CEASE

Localized change (ELA visible only within the cache)

Cache operations (access, coherence, prefetch) all remain unchanged

14 of 42

Randomization via Encryption

Lines that mapped to the same set, get scattered to different sets

A’

B’

LLC

Encrypt

Key

CEASE

B’’

A’’

LLC

Encrypt

Key

CEASE

A

B

LLC

Mapping depends on the key, different machines have different keys

15 of 42

Learning Eviction Set

D

B

A

C

LLC

E

Attacker can break CEASE within 22 seconds (8MB LLC)

[Liu et al. S&P, 2015]

Form pattern such that cache has a conflict miss

Removed line NOT

in conflicting set

Removed line MAPS

to conflicting set

Remove one line from pattern & check conflict

Conflict Miss?

No

Yes

16 of 42

CEASER: CEASE with Remapping

CEASER uses gradual remapping to periodically change the keys

Split time into Epoch of N accesses (change Key every Epoch)

CACHE

Key

Key

Key

CurrKey

NextKey

CurrKey

NextKey

CurrKey

NextKey

EPOCH

BULK

GRADUAL

time

17 of 42

Key Takeaway for CEASER

Robust to O(N^2)attack

Negligible slowdown (~ 1%)

No OS support needed

Negligible storage overheads

Localized change (within cache)

Cache

Line Address

Encrypt

Key1

Key2

Change Key, Periodically

Need practical solution to protect LLC from conflict-based attacks

CEASER can enable practical randomization for LLCs

18 of 42

Outline

Shared Cache

Cache Attack

Shared Cache 🡺 Proprietary Hash

Cache Attack 🡺 O(N^2) Attack

Shared Cache 🡺 Proprietary Hash 🡺 CEASER

Cache Attack 🡺 O(N^2) Attack 🡺 O(N) Attack

ISCA 2019

S&P 2019

19 of 42

State-of-the-Art Search Algorithm (SHM)

D

A

C

B

LLC

X

Single Holdout Method (SHM): for L=1200, W=16, need about 1M accesses Can we develop faster attacks than state-of-the-art?

[Liu et al. S&P, 2015]

Form pattern such that cache has a conflict miss

Removed line NOT

in conflicting set

Removed line MAPS

to conflicting set

Remove one line from pattern & check conflict

Conflict Miss?

No

Yes

20 of 42

Attack-1: Fast Search Algorithm (GEM)

Group Elimination Method : L=1200, W=16, accesses reduced from 1M to 40K

🡺 Remap Rate of CEASER must be increased to >25% to mitigate GEM

Divide L in “W+1” groups. Hold out 1 group. If conflict, discard the group.

(R=1)

(R=2)

(R=3)

21 of 42

Attack-2: Exploit Replacement Policy

Attack-2 avoids search algorithm : L=1200, accesses reduced from 1M to ~3K

🡺 Remap Rate of CEASER must be increased to >100%

LRU susceptible to thrashing. Exploit property to avoid search algorithm

D

A

C

B

LLC

X

A

B

C

D

X

A

B

C

D

X

Access Pattern to Attack LRU

3 misses, all to conflicting set

No need for search algorithm

A

B

C

D

X

A

B

C

D

X

A

B

C

D

X

Access Pattern to Attack RRIP

22 of 42

Outline

Shared Cache

Cache Attack

Shared Cache 🡺 Proprietary Hash

Cache Attack 🡺 O(N^2) Attack

Shared Cache 🡺 Proprietary Hash 🡺 CEASER

Cache Attack 🡺 O(N^2) Attack 🡺 O(N) Attack

Shared Cache 🡺 Proprietary Hash 🡺 CEASER 🡺 CEASER-S

ISCA 2019

23 of 42

Skewed-Cache: Diversity of Locations

Skewed-Cache enables line to map to different sets. Static hash insecure.

A

B

Set-Associative Cache

X

X?

X?

Skewed Cache

A/B

h2

h1

A/B

X

The Good:

Map to different sets

Need (for security):

Unpredictable hashes

Dynamic changing hashes

24 of 42

Skewed-CEASER

Attacker needs to dislodge target line from multiple possible locations

Skewed-CEASER with 2+ partitions possible (stronger security)

Enable CEASER to map lines to different sets via principles of skewing

25 of 42

Security Analysis

Skewed-CEASER can tolerate fast attacks with Remap-Rate of 1%

Time to get enough number of “hard conflict” lines (dislodge both locations)

X?

X?

A/B

A/B

26 of 42

Key Takeaway

Skewed-CEASER has flexibility in placement of line, increased robustness

(performance and storage overhead remains negligible)

Cache randomization makes it harder for the adversary to form eviction sets

Attack-1: Fast search algorithm (GEM)

Attack-2: Exploit replacement policy, avoid search

Defense: Skewed-CEASER = CEASER + Skewed-Cache

Concurrent Paper

IEEE S&P, May 2019

Skewed Cache (Static)

USENIX Security, Aug 2019

27 of 42

Outline

Shared Cache

Cache Attack

Shared Cache 🡺 Proprietary Hash

Cache Attack 🡺 O(N^2) Attack

Shared Cache 🡺 Proprietary Hash 🡺 CEASER

Cache Attack 🡺 O(N^2) Attack 🡺 O(N) Attack

Shared Cache 🡺 Proprietary Hash 🡺 CEASER 🡺 CEASER-S

Cache Attack 🡺 O(N^2) Attack 🡺 O(N) Attack 🡺 Probabilistic Attack

arxiv 2019, S&P’21

MICRO 2020

28 of 42

Probabilistic Attack

Probabilistic attacks gather large number of soft conflicts for each partition

Try to get a “soft conflict” 🡺 conflicting line evicts target from one partition

X?

X?

A/B

A/B

29 of 42

Attack: Prime + Prune + Probe

Probabilistic attacks significantly weakens the security of skewed designs

30 of 42

Outline

Cache Attack

Cache Attack 🡺 O(N^2) Attack

Cache Attack 🡺 O(N^2) Attack 🡺 O(N) Attack

Shared Cache 🡺 Proprietary Hash 🡺 CEASER 🡺 CEASER-S

Cache Attack 🡺 O(N^2) Attack 🡺 O(N) Attack 🡺 Probabilistic Attack

Shared Cache 🡺 Proprietary Hash 🡺 CEASER 🡺 CEASER-S 🡺 MIRAGE

USENIX Security 2021

31 of 42

Goal: Eliminate Conflicts, avoid Eviction-Set

Eliminate, not obfuscate, to end the arms race in eviction-set formation

Fully-Associative Random Evictions

(no Set-Associative Evictions)

Security

Challenge: Fully-Associative Lookup

Impractical Lookup

Complexity for LLC

32 of 42

Drawing Inspiration from Load-Balancing to Eliminate Set-Associative-Evictions (SAE)

32

16 Balls in 4 Buckets (C=4)

4C

4C

4C

4C

Balls

(Cachelines)

Random

Buckets

(Sets)

C=4

16 Balls in 4 Buckets (C=8)

Bucket Overflows Almost After Every Ball Thrown

Bucket Overflow Likelihood Reduced, But Still Possible

16 Balls in 4 Buckets of C=8

Power of 2 Choices

[Mitzenmacher’96]

Bucket Overflow Improbable: Balanced Distribution

8C

8C

8C

8C

C=8

8C

8C

8C

8C

C=8

Random

Lower

Load

33 of 42

Security Guarantee With Power of 2 Choices

33

Simulation Results

Theoretical Model

0

1

2

3

4

5

6

Extra Capacity Per Bucket (Default-Capacity = 8)

Balls Per Bucket Overflow

With 75% extra capacity,

1 Bucket Overflow per 1034 Balls Thrown,

Frequency of Bucket Overflows (Set-Associative Evictions)

  • Strong Security: SAE unlikely in lifetime of universe
  • Modest Costs: 2% Slowdown, 18% Storage Overhead

Equivalently, LLC with 75% extra capacity & Power of 2 Choices Indexing has

Security Guarantee of 1 Set-Associative Eviction Per 1034 LLC Installs (1017 years)

34 of 42

MIRAGE: Guaranteeing Globally Random Evictions

34

Tag-Store

One to One

Data-Store

Extra Tags Cheap, Extra Data Expensive (1:10)

Tag FPTR

Data RPTR

75% extra

Inspired from V-way Cache [ISCA’05]

Sets

Ways

Install in Invalid-Tag

Global Random Eviction

MIRAGE (Decouples Tag and Data)

35 of 42

MIRAGE: Guaranteeing Globally Random Evictions

35

Tag-Store

Data-Store

Extra Tags Cheap, Extra Data Expensive (1:10)

Tag FPTR

Data RPTR

Sets

Ways

Global Random Eviction

MIRAGE

Line Install

In Invalid Tag

Rf1

Skew-0

Skew-1

Rf2

3 Invalid

2 Invalid

Power-of-2-Choices

Indexing

Guarantees

Invalid Tags

Security Guarantee: With 75% extra tags, MIRAGE ensures 1 Set-Associative Eviction capable of leaking information every 1034 LLC Installs (once in 1017 years)

36 of 42

Results - Performance

36

2% Slowdown, 20% Storage Overhead (75% extra tags).

Storage neutral slowdown close to 3.5%.

MIRAGE-

2% slowdown

ScatterCache -

1.7% slowdown

Trace-Based Simulation of 8-Core 16MB/16-way LLC System

Paper includes MIRAGE-Lite with 50% extra tags and additional optimizations for security

37 of 42

Summary: A Game of Cache Attacks & Defense

Cache Attack

Cache Attack 🡺 O(N^2) Attack

Cache Attack 🡺 O(N^2) Attack 🡺 O(N) Attack

Shared Cache 🡺 Proprietary Hash 🡺 CEASER 🡺 CEASER-S

Cache Attack 🡺 O(N^2) Attack 🡺 O(N) Attack 🡺 Probabilistic Attack

Shared Cache 🡺 Proprietary Hash 🡺 CEASER 🡺 CEASER-S 🡺 MIRAGE

38 of 42

38

Protecting Against Shared-Memory Attacks

Randomization Alone Cannot Mitigate

Shared-Memory Attacks

Duplication of Shared Cache Lines

Needed Across Domains [MICRO’18, SEC’19]

Domain-0

A

Domain-1

LLC

RF

A

A

MIRAGE and Scatter-Cache [SEC’19])

has Domain-ID as input to rand-function for duplication

Domain-0

A

A

Domain-1

RF0

RF1

A

A

LLC

Domain-0

A

A

Domain-1

D0| A

RF

D1| A

MIRAGE LLC

MIRAGE also stores Domain-ID with Tag to enable

LLC-Hit = Domain-Match && Tag-Match,

for duplication even when address maps to same set across domains

A,D1

A,D0

B,D0

B,D1

39 of 42

Cache Occupancy Attacks – Website Fingerprinting

40 of 42

Randomization vs. Cache-Partitioning

40

Cache-Randomization

Defense

Cache-Partitioning

Defense

Set-Conflict

Attacks

Shared-Memory

Attacks

Cache-Occupancy

Attacks

Spy

A

B

B

Victim

V

Spy

V

Victim

Miss Leaks

Victim Access

Hit Leaks

Victim Access

Spy

Victim

Number of Resident Lines

Principled Mitigation of Cache Attacks

41 of 42

Existing Cache-Partitioning Solutions & Limitations

41

Way-Partitioning

Page-Coloring

Need: Scalable & Flexible Cache-Partitioning Solution

with Customizable, Fine-Grain Cache-Allocations (Bespoke)

DAWG [MICRO’18]

STEALTHMEM [SEC’12], MI6 [MICRO’19]

Domain-Based

Way-Selection

Not Scalable (Number of Partitions limited by LLC Associativity)

Not Flexible (Requires DRAM and LLC allocations in same ratio)

42 of 42

Bespoke Cache Partitioning [Saileshwar & Qureshi, SEED 21]

42

Key Idea: Programmable Indexing Enabling “Bespoke” Partitions per Domain (VM/Process)

Scalable – Supports up to 512 Domains in 32MB LLC

Fine-Grain – Partition-size in multiples of 64KB Slice

Flexible – No Restriction on DRAM allocations

Design Challenges:

  1. Uniformly Map Addresses within Arbitrary-Size Partitions?
  2. Mapping Partitions to LLC Locations?

LLC Divided into Physical Slices of 64KB

Load-Balancing-Hash (LBH) Maps Address to Logical-Slices (LSID) within Partition

Slice-Indirection-Module (SIM)

Maps Logical-Slices to Phyiscal-Slices in LLC