1 of 49

CS-773 Paper Presentation
Randomised Row-Swap: Mitigating Row Hammer by Breaking Spatial Correlation between Aggressor and Victim Rows

Prokash Ghosh
The Booster Dose (#1)

214077001@iitb.ac.in

2 of 49

Outline

  • Classical Row Hammer Attacks
  • Existing Mitigation Techniques
  • Proposal: Randomized Row Swap
  • Results & Analysis
  • Conclusion

3 of 49

Classical Row Hammer

[Figure: a DRAM bank drawn as rows of cells. The wordline of the aggressor row toggles between VLOW (closed) and VHIGH (opened); the rows on either side of it are the victim rows.]

Repeatedly opening and closing a row induces Row Hammer bit flips in adjacent rows.

4 of 49

Spatial Connection Between Aggressor Rows and Victim Rows

[Figure: a DRAM bank with Rows 0 to 5; an aggressor row is repeatedly opened and closed, and the rows adjacent to it are the victim rows.]

5 of 49

Prior Victim-Focused Mitigations

[Figure: a DRAM bank in which the aggressor row is repeatedly opened and closed between two victim rows.]

Prior Defences: CAL ['14], TWiCe [ISCA '19], Graphene [MICRO '20], TRR (DDR4 standard)

  1. Track Aggressor
  2. Mitigative Action: Special Refresh (TRR)

DDR4/LPDDR4 Specification: https://www.jedec.org/

6 of 49

Prior Victim-Focused Mitigations

Can excessive use of special refresh corrupt data in the second-level neighbours of the aggressor row?

Target Row Refresh (TRR) 🡪 Defined in the DDR4/LPDDR4 specification: www.jedec.org

YES

7 of 49

Google’s Half-Double Row Hammer

Row 0
Row 1 (New Victim, Distance 2)
Row 2 (Far Aggressor)
Row 3 (Near Aggressor)
Row 4 (Far Aggressor)
Row 5 (New Victim, Distance 2)
Row 6
Row 7

  1. Near Aggressor
  2. Far Aggressor
  3. New Victim at Distance-2

Figure label: Special Refresh

Google, 2021: Half-Double: Next-Row-Over Assisted RowHammer. https://github.com/google/hammer-kit/blob/main/20210525_half_double.pdf.

8 of 49

Familiarity with Some Terms

ACT 🡪 Activate Command; opens the intended row.

TRR 🡪 Target Row Refresh (refreshes a selected row instead of the whole bank)

Note: ACTmax 🡪 Maximum number of activations possible within 64ms (~1.36 million)

9 of 49

Row-Hammer Threshold: DDR Generations

Sl.No   DRAM Generation   RH-Threshold
1       DDR3 (old)        139K
2       DDR3 (new)        22.4K
3       DDR4 (old)        17.5K
4       DDR4 (new)        10K
5       LPDDR4 (old)      16.8K
6       LPDDR4 (new)      4.8K

TRH = 4.8K

10 of 49

New Solution: Aggressor Focused Mitigations

  • Break the spatial correlation between the aggressor row and its victim rows (neighbours)
  • The intent is to reduce the time any aggressor can spend in one neighbourhood within an epoch
  • Periodically swap an aggressor row with a random row once it reaches a threshold number of activations

11 of 49

Randomized Row Swap : High Level

  • Starting Position: the aggressor sits between its two victim rows; Row X and Row Y are other rows in the bank
  • Swap 1: Row X ⬄ Aggressor
  • Swap 2: Row Y ⬄ new Aggressor

12 of 49

Overview of Row-Swap

[Figure: four numbered steps (1 to 4) of swapping rows X and Y]

  • The Row Indirection Table (RIT) holds the tuple (X, Y)
  • A row swap takes ~2.9us

13 of 49

RRS: In detail

[Figure: the path of a memory access through the RIT (Row Indirection Table), where the entry is either present or absent, and the HRT (Hot Row Tracker), which can signal Swap=Yes; the steps are numbered 1 to 5]

  • Phase 1 (for current access): 1 🡪 2 🡪 3 🡪 Swap
  • Phase 2 (for next access): 4 🡪 5 🡪 Swap

14 of 49

Tracking Aggressor: Hot Row Tracker(HRT)

Misra-Gries Algorithm (tracker state after each access):

Access    Tracked entries (Row Address: Count)   Spill Counter
Initial   A: 6, X: 3, Z: 5                       2
A         A: 7, X: 3, Z: 5                       2
B         A: 7, X: 3, Z: 5                       3
C         A: 7, C: 4, Z: 5                       3

Yeonhong Park, Woosuk Kwon, Eojin Lee, Tae Jun Ham, Jung Ho Ahn, and Jae W. Lee. Graphene: Strong yet Lightweight Row Hammer Protection. In MICRO 2020.
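
The Misra-Gries example on this slide can be replayed with a short sketch. This is a minimal software model rather than the hardware tracker itself: the class name `HotRowTracker`, its methods, and the choice to give a newly inserted row a count of spill + 1 are assumptions made for illustration, chosen to match the three-entry table and the A, B, C access sequence shown above.

```python
class HotRowTracker:
    """Toy Misra-Gries tracker: a small table of (row, count) plus a spill counter."""

    def __init__(self, num_entries, counts=None, spill=0):
        self.num_entries = num_entries
        self.counts = dict(counts or {})   # row address -> estimated activation count
        self.spill = spill                 # shared counter for rows that are not tracked

    def access(self, row):
        """Record one activation of `row` and return its estimated count."""
        if row in self.counts:                      # tracked row: bump its own counter
            self.counts[row] += 1
            return self.counts[row]
        if len(self.counts) < self.num_entries:     # free slot: start from the spill value
            self.counts[row] = self.spill + 1
            return self.counts[row]
        # Table full: look for an entry whose count equals the spill counter.
        victim = next((r for r, c in self.counts.items() if c == self.spill), None)
        if victim is None:                          # nothing replaceable: only spill grows
            self.spill += 1
            return self.spill
        del self.counts[victim]                     # replace it; new row inherits spill + 1
        self.counts[row] = self.spill + 1
        return self.counts[row]


# Replay the slide: start from A=6, X=3, Z=5 with spill counter 2.
hrt = HotRowTracker(num_entries=3, counts={"A": 6, "X": 3, "Z": 5}, spill=2)
for row in ["A", "B", "C"]:
    hrt.access(row)
print(hrt.counts, hrt.spill)
```

Access B finds no entry equal to the spill counter, so only the spill counter grows; access C then replaces X, whose count has been caught up by the spill counter.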

15 of 49

High Level Operations in RIT+HRT

  • The aggressor row is identified using the HRT
  • When the access count for a row in the HRT exceeds (TRRS - 1), a row-swap request is queued
  • All future accesses to the swapped rows must be redirected using the RIT
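
The three operations above can be sketched in software as follows. This is a simplified model under stated assumptions, not the paper's hardware design: the class `RowSwapSketch` and its method names are hypothetical, exact per-row counters stand in for the HRT, the RIT is a plain dictionary, and the counter is assumed to restart after a swap.

```python
import random


class RowSwapSketch:
    """Toy model of RRS bookkeeping for one bank (illustrative only)."""

    def __init__(self, num_rows, t_rrs, seed=0):
        if num_rows < 2:
            raise ValueError("need at least two rows to swap")
        self.num_rows = num_rows
        self.t_rrs = t_rrs
        self.rng = random.Random(seed)
        self.hrt = {}    # row -> activation count (exact counters stand in for the HRT)
        self.rit = {}    # row -> current physical location, only for swapped rows
        self.swaps = 0

    def translate(self, row):
        """Redirect an access through the RIT; unswapped rows map to themselves."""
        return self.rit.get(row, row)

    def access(self, row):
        """Activate `row`, return the physical row used, and swap if it became hot."""
        physical = self.translate(row)
        self.hrt[row] = self.hrt.get(row, 0) + 1
        if self.hrt[row] > self.t_rrs - 1:        # swap condition from the slide
            self._swap_with_random_row(row)
            self.hrt[row] = 0                     # assumed: counting restarts after a swap
        return physical

    def _swap_with_random_row(self, row):
        other = self.rng.randrange(self.num_rows)
        while other == row:                       # pick a different row as swap partner
            other = self.rng.randrange(self.num_rows)
        loc_row, loc_other = self.translate(row), self.translate(other)
        self.rit[row], self.rit[other] = loc_other, loc_row   # exchange locations
        self.swaps += 1


bank = RowSwapSketch(num_rows=16, t_rrs=4)
locations = [bank.access(5) for _ in range(8)]    # hammer row 5 eight times
print(locations, bank.swaps)
```

After every TRRS activations the hammered row moves to a new random physical location, so its activations are no longer concentrated next to the same neighbours.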

16 of 49

Security of Row Swap

  • The RIT is sized to hold every swapped row for the full 64ms epoch
  • The RIT lookup adds latency to each DRAM access
  • The row buffer of the bank is closed after a swap, and no accesses are allowed during the swap

17 of 49

Collision Avoidance Table (CAT)

  • The RIT must store all swap entries for a given epoch (refresh cycle)
  • RIT lookup is on the critical path, so low lookup latency is paramount
  • Therefore, a fully associative structure is not desirable for the RIT
  • The RIT is implemented as a CAT-type structure, which stores all entries yet keeps lookup latency as low as that of a set-associative cache

18 of 49

Comparison of RRS with Victim-Focused Mitigations

Attribute                                                     Victim-Focused   RRS
Slowdown                                                      <0.1%            0.4%
Mitigates Classic Row Hammer (neighbouring bit flips)
Mitigates Complex Patterns (far aggressors of Half-Double)    X
Works without knowing the DRAM mapping                        X

19 of 49

What is BlockHammer?

  • Tracks row activation rates using area-efficient Bloom filters
  • Uses the tracking data to ensure that no row is ever activated rapidly enough to induce Row Hammer bit flips
  • As a result, no DRAM row ever experiences a Row-Hammer-unsafe activation rate
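
The idea of tracking activation rates with a Bloom filter can be illustrated with a counting Bloom filter. This is a generic sketch and not BlockHammer's actual design: the class `CountingBloomFilter`, the helper `is_blacklisted`, the filter size, the number of hashes, and the threshold are all assumptions picked for illustration.

```python
import hashlib


class CountingBloomFilter:
    """Approximate per-row activation counts in a fixed array of counters."""

    def __init__(self, num_counters, num_hashes):
        self.counters = [0] * num_counters
        self.num_hashes = num_hashes

    def _indexes(self, row):
        # Derive one counter index per hash by salting BLAKE2b differently each time.
        for i in range(self.num_hashes):
            digest = hashlib.blake2b(
                row.to_bytes(8, "little"), digest_size=8, salt=f"h{i}".encode()
            ).digest()
            yield int.from_bytes(digest, "little") % len(self.counters)

    def insert(self, row):
        """Record one activation of `row`."""
        for idx in self._indexes(row):
            self.counters[idx] += 1

    def estimate(self, row):
        """Upper bound on the activations of `row` (collisions can only inflate it)."""
        return min(self.counters[idx] for idx in self._indexes(row))


def is_blacklisted(cbf, row, threshold):
    """A row is treated as rapidly accessed once its estimate reaches the threshold."""
    return cbf.estimate(row) >= threshold


cbf = CountingBloomFilter(num_counters=1024, num_hashes=4)
for _ in range(5):
    cbf.insert(7)
print(cbf.estimate(7), is_blacklisted(cbf, 7, threshold=5))
```

Because the estimate never undercounts, a hot row cannot escape the blacklist; the cost is that hash collisions may flag an innocent row, which is one source of the slowdown discussed on the next slide.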

20 of 49

Performance Comparison: RRS vs BlockHammer

Scheme        Worst-case Slowdown   Average Slowdown
BlockHammer   21.7%                 2%
RRS           7.6%                  0.4%

BlockHammer: Preventing RowHammer at Low Cost by Blacklisting Rapidly-Accessed DRAM Rows, HPCA, 2021.

21 of 49

Performance Sensitivity to RH-Threshold

As TRH increases, the slowdown decreases

22 of 49

Performance Overhead of Row-Swap

Average slowdown of 0.4% observed across 78 workloads

23 of 49

Results : Storage/Power Overhead

24 of 49

Summary

  • Mitigates all types of Row Hammer attack
  • 0.4% average slowdown across 78 workloads
  • 0.5% DRAM power overhead for RRS (per rank)
  • SRAM power overhead of 0.903mW per rank
  • RIT lookups add extra latency in the datapath

25 of 49

Conclusion

  • The victim-based mitigation techniques lead to new row-hammer victims
  • An aggressor-based mitigation technique mitigates both row-hammer and half-double row-hammer attacks

26 of 49

Disadvantages of RRS

  • Increased on-chip power consumption due to the added structures in the memory controller
  • Performance degradation due to false positives when detecting aggressor rows
  • A successful Row Hammer attack remains possible, with non-zero probability

27 of 49

Criticism Points

  • The authors have not discussed impact of false positives on system performance
  • The paper doesn’t report any analysis of variation of false positive rate with change in HRT size

28 of 49

Points to Discuss

  • Possibility of a side-channel attack during a swap
  • Wider window for speculation-based attacks, since DRAM accesses are blocked during a swap

29 of 49

Possible Extensions

  • Coming up with a better heuristic to select swap candidate instead of random selection
  • Reduction of false positive rates by improving the strategy to identify or track aggressive rows

30 of 49

References

  • Gururaj Saileshwar, Bolin Wang, Moinuddin Qureshi, and Prashant J. Nair. "Randomized Row-Swap: Mitigating Row Hammer by Breaking Spatial Correlation between Aggressor and Victim Rows". In ASPLOS, 2022.
  • A. G. Yaglikci, et al. "BlockHammer: Preventing RowHammer at Low Cost by Blacklisting Rapidly-Accessed DRAM Rows". In HPCA, 2021.
  • Eojin Lee, Ingab Kang, Sukhan Lee, G. Edward Suh, and Jung Ho Ahn. "TWiCe: Preventing Row-hammering by Exploiting Time Window Counters". In Proceedings of the 46th International Symposium on Computer Architecture, 2019.
  • Dae-Hyun Kim, Prashant J. Nair, and Moinuddin K. Qureshi. "Architectural Support for Mitigating Row Hammering in DRAM Memories". IEEE CAL 14, 1 (2014), 9–12.
  • J. S. Kim, et al. "Revisiting RowHammer: An Experimental Analysis of Modern DRAM Devices and Mitigation Techniques". In ISCA, 2020.

31 of 49

THANK YOU

32 of 49

Questions

33 of 49

Backup

34 of 49

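[Figure: DRAM command timing diagram. The command and address buses carry ACT, READ, and PRECHARGE while the row is ACTIVE; the annotated timing parameters are CAS latency, tRC, tCCD, tRTP, and tRP.]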

35 of 49

Finding the RRS interval

  1. The randomised row-swap threshold (T) is chosen such that TRH = k * T, where k is an integer.

  • For a successful attack, # of ACTs >= k * T, i.e. at least k swaps, each with its set of T activations.

64ms

37 of 49

An Event ( E) : Round of attack

E-1

Random Swap 1

E-2

Random Swap 2

  • In this manner, k swaps (events) are needed for an attack.

38 of 49

Statistical Modelling : Buckets and Balls

  1. Intention: find the probability of achieving k swaps on any row within 64ms

  • Max. # of ACTs to any bank within 64ms = A (= 1.36 million)

  • With RRS, a bank is busy with row swaps for a fraction of the 64ms

  • So only a duty cycle D of the 64ms is available for activations

39 of 49

Contd..

  1. A bank can undergo A*D activations in 64ms

  • A ball is thrown into one of N buckets each time event E triggers

  • The attacker can throw B = A*D/T such balls in 64ms

40 of 49

Contd..

  1. Goal: the probability of a row receiving k swaps in 64ms.

  • This equals the probability of a bucket holding k balls after B balls are thrown at random into N buckets, for a given T.

  • Bernoulli trials are used here.
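
The buckets-and-balls model can be turned into a rough calculation. The sketch below is illustrative only and rests on several assumptions that are not in the slides: N is taken as 128K rows per bank, the duty cycle D is derived from the ~2.9us swap time and the activation time implied by A, the probability is computed for at least k balls rather than exactly k, and the "any row" figure uses a simple union bound. The function name `prob_bucket_at_least_k` is hypothetical.

```python
import math


def prob_bucket_at_least_k(balls, buckets, k):
    """P(one given bucket receives >= k of `balls` balls thrown uniformly at random)."""
    if buckets < 2:
        raise ValueError("need at least two buckets")
    if k <= 0:
        return 1.0
    if k > balls:
        return 0.0
    p = 1.0 / buckets                    # each throw is a Bernoulli trial with success p
    # First term of the binomial tail: C(balls, k) * p^k * (1 - p)^(balls - k).
    term = math.comb(balls, k) * p**k * (1.0 - p) ** (balls - k)
    total = 0.0
    for j in range(k, balls + 1):
        total += term
        # Ratio between the binomial terms for j + 1 and j successes.
        term *= (balls - j) / (j + 1) * p / (1.0 - p)
        if term < total * 1e-18:         # remaining terms no longer change the sum
            break
    return total


ACT_MAX = 1_360_000           # A: max activations per bank in 64ms (from the slides)
T_RRS = 800                   # T: swap threshold (from the slides)
TRH = 4_800                   # Row Hammer threshold (from the slides)
EPOCH_US = 64_000.0           # 64ms epoch in microseconds
SWAP_US = 2.9                 # row swap time (from the slides)
ROWS_PER_BANK = 128 * 1024    # N: assumed number of rows per bank

k = TRH // T_RRS                                       # swaps needed on one row
act_us = EPOCH_US / ACT_MAX                            # time per activation implied by A
duty_cycle = (T_RRS * act_us) / (T_RRS * act_us + SWAP_US)   # assumed form of D
balls = int(ACT_MAX * duty_cycle / T_RRS)              # B = A * D / T
p_row = prob_bucket_at_least_k(balls, ROWS_PER_BANK, k)
p_any_row = min(1.0, ROWS_PER_BANK * p_row)            # union bound over all rows
print(k, round(duty_cycle, 3), balls, p_row, p_any_row)
```

The tail is summed term by term instead of as 1 minus the lower terms, because the result is far smaller than floating-point rounding error around 1.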

41 of 49

Attack iterations Calculations

  • The TRRS value considered in this paper is 800

42 of 49

Collision Avoidance Table (CAT): Storage Optimization

  • It is designed to store C items

  • It adds E extra ways per set to a cache with S sets

  • It has D demand ways per set, where D = C/(2S)
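
A small sketch can make the C, S, D and E parameters concrete. It assumes a two-table organisation with one hash per table and insertion into the less loaded candidate set, which is one reading of the D = C/2S sizing; the slides do not spell this out, so the class `CatSketch`, its hashing scheme, and the insertion policy should be treated as hypothetical.

```python
import hashlib


class CatSketch:
    """Two tables of S sets, each set holding up to D + E entries (illustrative only)."""

    def __init__(self, capacity, num_sets, extra_ways):
        self.num_sets = num_sets
        self.demand_ways = capacity // (2 * num_sets)      # D = C / (2S)
        self.ways = self.demand_ways + extra_ways          # D + E ways per set
        self.tables = [[{} for _ in range(num_sets)] for _ in range(2)]

    def _index(self, table, key):
        # A differently salted hash per table (the hash choice is an assumption).
        digest = hashlib.blake2b(
            key.to_bytes(8, "little"), digest_size=8, salt=f"table{table}".encode()
        ).digest()
        return int.from_bytes(digest, "little") % self.num_sets

    def lookup(self, key):
        """Check the one candidate set in each table; return the value or None."""
        for table in range(2):
            entries = self.tables[table][self._index(table, key)]
            if key in entries:
                return entries[key]
        return None

    def insert(self, key, value):
        """Install into the emptier candidate set; False signals a collision."""
        candidates = [self.tables[t][self._index(t, key)] for t in range(2)]
        for entries in candidates:
            if key in entries:                 # update an existing entry in place
                entries[key] = value
                return True
        target = min(candidates, key=len)      # load-aware choice between the two sets
        if len(target) >= self.ways:
            return False                       # both candidate sets are full
        target[key] = value
        return True


cat = CatSketch(capacity=64, num_sets=4, extra_ways=2)
results = [cat.insert(row, row + 100) for row in range(10)]
print(cat.demand_ways, cat.ways, all(results), cat.lookup(3))
```

A lookup touches only two sets, which is what keeps the latency close to that of a set-associative cache, while the extra ways make it unlikely that both candidate sets are full at once.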

43 of 49

Collision Avoidance Table (CAT): Storage Optimization

44 of 49

TRRS = 800

Entries required = 1.36 million / 800 = 1700

Number of RIT tuples = 3400
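
The sizing arithmetic on this slide can be restated in a few lines. The numbers come from the slides; reading the factor of two as one tuple for each of the two rows involved in a swap is an assumption, and the variable names are illustrative.

```python
ACT_MAX = 1_360_000     # maximum activations per bank in one 64ms epoch
T_RRS = 800             # activations that trigger one swap

swaps_per_epoch = ACT_MAX // T_RRS      # at most this many swaps fit in one epoch
rit_tuples = 2 * swaps_per_epoch        # assumed: two tuples recorded per swap

print(swaps_per_epoch, rit_tuples)      # prints "1700 3400"
```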

45 of 49

Issues in Evaluation Methodology

  • Evaluation methodology doesn’t segregate the IPC benefits due to
    • Adding new independent instructions in existing binary
    • Benefits due to software prefetching

47 of 49

Limitations

  • Extra storage of 42.9kB per bank, which can add up to 1.37MB per DIMM
  • It can reach several MBs for multi-channel memory controllers
  • It is a probabilistic, not a deterministic, mitigation

48 of 49

Performance Overhead of Row-Swap

49 of 49

Familiarity with Some Terms

Epoch 🡪 Refresh interval (64ms)

ACT 🡪 Activate Command

TRR 🡪 Target Row Refresh

tRC 🡪 Minimum time gap between two successive ACT commands to different rows of the same bank

ACTmax 🡪 Maximum number of activations possible within an epoch (1.36 million)

TRH 🡪 Minimum number of activations to any physical row address within 64ms required to trigger Row Hammer bit flips (4.8K activations)

Row Size 🡪 8kB

Note: DDR4/LPDDR4 is used in this paper. (Picture: tRC)