1 of 49

CS-773 Paper Presentation
Randomised Row-Swap: Mitigating Row Hammer by Breaking Spatial Correlation between Aggressor and Victim Rows

Prokash Ghosh
The Booster Dose (#1)

214077001@iitb.ac.in

1

2 of 49

2

Outline

Classical Row Hammer Attacks
Existing Mitigation Techniques
Proposal: Randomized Row Swap
Results & Analysis
Conclusion

3 of 49

3

DRAM Bank

Row of Cells

Row

Wordline

V_LOW

V_HIGH

Victim Row

Aggressor Row

Repeatedly opening and closing a row induces Row Hammer bit flips in adjacent rows

Opened

Closed

Classical Row Hammer

4 of 49

Spatial Connection Between Aggressor Rows and Victim Rows

Row 0

Row 1

Row 2

Row 3

Row 4

Row 2

open

Row 1

Row 3

closed

Row 3

open

Row 2

Row 4

Row 1

Row 5

Victim Row

Aggressor Row

Row 3

open

Row 3

closed

Aggressor Row

DRAM Bank

Flipping Bits in Memory Without Accessing Them: An Experimental Study of DRAM Disturbance Errors, (Kim et al., ISCA 2014)

To access data from a DRAM row, say row 2, the memory controller must first open (activate) the row.
After all requests to row 2 are serviced, the memory controller must close (precharge) the row before it can access data from another row.
Due to increased cell-to-cell interference, likely a result of higher cell packing density, rapidly activating and precharging a DRAM row can cause bit flips in nearby rows.
Continuing to access the same row causes even more failures in nearby rows. This phenomenon is known as RowHammer.
We refer to the rapidly accessed row as the aggressor row and the rows containing bit flips as victim rows.

5 of 49

Prior Victim-Focused Mitigations

Row 0

Row 1

Row 2

Row 3

Row 5

Row 2

open

Row 1

Row 3

Row 2

closed

Row 2

open

Row 2

Row 4

Row 1

Victim Row

Aggressor Row

Row 2

open

Row 3 (Aggressor row)

Prior Defences

CAL ['14], TWiCe [ISCA '19], Graphene [MICRO '20], TRR (DDR4 standard)

1

DRAM Bank

Track Aggressor

2

Mitigative Action

DDR4/LPDDR4 Specification: https://www.jedec.org/

Special Refresh (TRR)

6 of 49

6

Prior Victim-Focused Mitigations

Can excessive use of special refresh corrupt data in the second-level neighbours of the aggressor row?

Target Row Refresh (TRR) 🡪 Defined in the DDR4/LPDDR4 specification: www.jedec.org

YES

7 of 49

7

Google’s Half-Double Row Hammer

Row 2 (Far Aggressor )

Row 1 (New Victim-Distance2)

Row 0

Row 3 (Near Aggressor)

Row 6

Row 5 (New Victim-Distance2)

Row 7

Row 4 (Far Aggressor )

1

Near Aggressor

2

Far Aggressor

3

New Victim at Distance-2

Google, 2021: Half-Double: Next-Row-Over Assisted RowHammer. https://github.com/google/hammer-kit/blob/main/20210525_half_double.pdf.

Special Refresh

8 of 49

8

Familiarity with Some Terms

ACT 🡪 Activate Command, opens an intended row.

TRR 🡪 Target Row Refresh (refreshes a selected row instead of the whole bank)

Note: ACT_max 🡪 Maximum number of activations possible within 64 ms (~1.36 million)
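The ~1.36 million figure can be reproduced with a back-of-the-envelope calculation. The sketch below assumes DDR4-like timing values that are not stated on the slide (tRC = 45 ns, tRFC = 350 ns, 8192 refresh commands per 64 ms window); treat them as illustrative assumptions rather than the paper's exact parameters.

```python
# Rough estimate of ACT_max: activations that fit in one 64 ms refresh window.
# All timing values are assumed DDR4-like numbers, not taken from the slide.
REFRESH_WINDOW_NS = 64_000_000   # 64 ms refresh window
T_RC_NS = 45                     # assumed row-cycle time between ACTs to one bank
T_RFC_NS = 350                   # assumed busy time per refresh command
REFRESH_COMMANDS = 8192          # assumed refresh commands per window


def act_max(window_ns=REFRESH_WINDOW_NS, t_rc_ns=T_RC_NS,
            t_rfc_ns=T_RFC_NS, refresh_commands=REFRESH_COMMANDS):
    """Activations possible once time spent on refresh is subtracted."""
    refresh_busy_ns = refresh_commands * t_rfc_ns
    return (window_ns - refresh_busy_ns) // t_rc_ns


print(act_max())  # roughly 1.36 million
```

With different timing assumptions the result shifts slightly, but it stays in the low millions per bank per epoch.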

9 of 49

9

Row-Hammer Threshold: DDR Generations

Sl. No.	DRAM Generation	RH-Threshold
1	DDR3 (old)	139K
2	DDR3 (new)	22.4K
3	DDR4 (old)	17.5K
4	DDR4 (new)	10K
5	LPDDR4 (old)	16.8K
6	LPDDR4 (new)	4.8K

T_RH = 4.8K

10 of 49

10

New Solution: Aggressor Focused Mitigations

Break the spatial correlation between the aggressor row and the victim rows (its neighbours)
The intent is to reduce the time any aggressor can spend in one neighbourhood within an epoch
Periodically swap an aggressor row with a random row once the aggressor reaches a threshold number of activations

11 of 49

11

Randomized Row Swap: High Level

[Figure: three snapshots of a DRAM bank containing the Aggressor, its Victim, Row X, and Row Y]

Starting Position
Swap 1: Row X ⬄ Aggressor
Swap 2: Row Y ⬄ new Aggressor

12 of 49

12

Overview of Row-Swap

1

2

3

4

Row Indirection Table (RIT) holds the tuple (X, Y)
Row swap time: ~2.9 µs

X

Y

13 of 49

13

RRS: In detail

RIT(Row Indirection Table)

HRT(Hot Row Tracker)

Swap=Yes

1

Present

Absent

Swap=Yes

.………………..

2

3

5

4

Memory access

Phase 1 (for current access): 1 🡪 2 🡪 3 🡪 Swap
Phase 2 (for next access): 4 🡪 5 🡪 Swap

14 of 49

14

Tracking Aggressor: Hot Row Tracker(HRT)

Yeonhong Park, Woosuk Kwon, Eojin Lee, Tae Jun Ham, Jung Ho Ahn, and Jae W. Lee. Graphene: Strong yet Lightweight Row Hammer Protection. In MICRO 2020.

Access	Table Entries (Row Address: Count)	Spill Counter
(initial)	A: 6, X: 3, Z: 5	2
A	A: 7, X: 3, Z: 5	2
B	A: 7, X: 3, Z: 5	3
C	A: 7, C: 4, Z: 5	3

Misra-Gries Algorithm:
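The tracker states on this slide can be reproduced with a short sketch of a Misra-Gries style counter table. The class and method names are hypothetical, and the replacement rule (evict an entry whose count equals the spill counter, otherwise bump the spill counter) is my reading of the slide's example rather than a verified copy of the Graphene hardware design.

```python
class HotRowTracker:
    """Misra-Gries style tracker: a fixed number of entries plus a spill counter."""

    def __init__(self, num_entries, counts=None, spill=0):
        self.num_entries = num_entries
        self.counts = dict(counts or {})  # row address -> estimated activations
        self.spill = spill

    def access(self, row):
        """Record one activation of `row` and return its estimated count."""
        if row in self.counts:
            # Tracked row: just increment its counter.
            self.counts[row] += 1
        elif len(self.counts) < self.num_entries:
            # Free entry available: start tracking from the spill value.
            self.counts[row] = self.spill + 1
        else:
            # Table full: look for an entry whose count equals the spill counter.
            victim = next(
                (r for r, c in self.counts.items() if c == self.spill), None
            )
            if victim is None:
                # No replaceable entry: the access is absorbed by the spill counter.
                self.spill += 1
            else:
                # Replace that entry; the new row inherits spill + 1.
                del self.counts[victim]
                self.counts[row] = self.spill + 1
        return self.counts.get(row, self.spill)


# Replay the slide: start from {A: 6, X: 3, Z: 5} with spill = 2, then access A, B, C.
hrt = HotRowTracker(3, {"A": 6, "X": 3, "Z": 5}, spill=2)
for row in ("A", "B", "C"):
    hrt.access(row)
print(hrt.counts, hrt.spill)
```

Because a new row inherits the spill value, the tracker can overestimate a row's count but never underestimates it, which is the property a swap trigger relies on.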

15 of 49

15

High Level Operations in RIT+HRT

The aggressor row is identified using the HRT
When the access count for a row in the HRT exceeds (T_RRS - 1), a row-swap request is queued
All future accesses to the swapped rows are redirected through the RIT
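The interaction between the two structures can be sketched as follows. This is a simplified model under stated assumptions: a plain dictionary stands in for the HRT, the swap is applied immediately instead of being queued, no row data is moved, and the class and method names are hypothetical.

```python
import random


class RowSwapController:
    """Toy model of RRS: count activations, swap at T_RRS, redirect via the RIT."""

    def __init__(self, num_rows, t_rrs, seed=None):
        self.num_rows = num_rows          # must be at least 2
        self.t_rrs = t_rrs
        self.rng = random.Random(seed)
        self.counts = {}  # stand-in for the HRT: logical row -> activations
        self.rit = {}     # RIT: logical row -> physical row (swapped rows only)

    def physical(self, row):
        """Translate a logical row to its current physical location."""
        return self.rit.get(row, row)

    def activate(self, row):
        """Activate a logical row; return the physical row that was opened."""
        target = self.physical(row)
        self.counts[row] = self.counts.get(row, 0) + 1
        if self.counts[row] > self.t_rrs - 1:
            self._swap(row)
            self.counts[row] = 0
        return target

    def _swap(self, row):
        # Pick a random partner row and exchange the two physical locations.
        partner = self.rng.randrange(self.num_rows)
        while partner == row:
            partner = self.rng.randrange(self.num_rows)
        self.rit[row], self.rit[partner] = (
            self.physical(partner),
            self.physical(row),
        )


ctrl = RowSwapController(num_rows=1024, t_rrs=4, seed=1)
opened = [ctrl.activate(7) for _ in range(4)]
print(opened, ctrl.physical(7))
```

After the fourth activation the logical row 7 lives at a different physical location, so further hammering of the same address no longer disturbs the original neighbours.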

16 of 49

16

Security of Row Swap

The RIT is sized to retain every swapped row for 64 ms
The RIT lookup adds latency to each DRAM access
The row buffer of a bank is closed after a swap, and no accesses are allowed during the swap

17 of 49

17

Collision Avoidance Table (CAT)

The RIT must store all swap entries for a given epoch (refresh interval)
RIT lookup latency is on the critical path, so keeping it low is paramount
Therefore, it is not desirable to implement the RIT as a fully associative cache structure
The RIT is implemented as a CAT-style structure that stores all entries yet keeps lookup latency as low as that of a set-associative cache

18 of 49

18

Comparison of RRS with Victim Focused Mitigations

Attribute	Victim-Focused	RRS
Slowdown	<0.1%	0.4%
Mitigates Classic Row Hammer (Neighbouring Bit Flips)	√	√
Mitigates Complex Patterns (Far Aggressors of Half-Double)	X	√
Works Without Knowing DRAM Mapping	X	√

19 of 49

19

What is BlockHammer?

Track row activation rates using area-efficient Bloom filters
Use the tracking data to ensure that no row is ever activated rapidly enough to induce Row-Hammer bit-flips.
No DRAM row ever experiences a Row-Hammer unsafe activation rate.
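To illustrate the tracking idea, here is a generic counting Bloom filter that estimates per-row activation counts and flags rows that cross a threshold. It is a sketch only: the sizes, the SHA-256 based hashing, and all names are assumptions, and it does not model BlockHammer's actual filter organisation or its throttling logic.

```python
import hashlib


class CountingBloomFilter:
    """Approximate per-row activation counter; may overestimate, never underestimates."""

    def __init__(self, num_counters=1024, num_hashes=4):
        self.counters = [0] * num_counters
        self.num_hashes = num_hashes

    def _indexes(self, row):
        # Derive several counter indexes for one row by salting a hash.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{row}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % len(self.counters)

    def record_activation(self, row):
        for idx in self._indexes(row):
            self.counters[idx] += 1

    def estimate(self, row):
        # The smallest counter is the tightest upper bound on the true count.
        return min(self.counters[idx] for idx in self._indexes(row))

    def is_blacklisted(self, row, threshold):
        return self.estimate(row) >= threshold


cbf = CountingBloomFilter()
for _ in range(100):
    cbf.record_activation(42)
print(cbf.estimate(42), cbf.is_blacklisted(42, threshold=100))
```

Rows that share counters with a hot row can be flagged as well, which is the false-positive cost of using a compact filter instead of one counter per row.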

20 of 49

20

Performance Comparison: RRS vs BlockHammer

BlockHammer: Preventing RowHammer at Low Cost by Blacklisting Rapidly-Accessed DRAM Rows, HPCA, 2021.

Slowdown	Worst-case	Average
BlockHammer	21.7%	2%
RRS	7.6%	0.4%

21 of 49

21

Performance Sensitivity to RH-Threshold

As T_RH increases, the slowdown decreases

22 of 49

22

Performance Overhead of Row-Swap

Average slowdown of 0.4% observed across 78 workloads

23 of 49

23

Results : Storage/Power Overhead

24 of 49

24

Summary

Mitigates all types of Row Hammer attacks
0.4% average slowdown across 78 workloads
0.5% DRAM power overhead for RRS (per rank)
SRAM power overhead of 0.903 mW per rank
RIT lookups add extra latency to the datapath

25 of 49

25

Conclusion

Victim-focused mitigation techniques can create new Row Hammer victims
An aggressor-focused mitigation technique mitigates both classic Row Hammer and Half-Double attacks

26 of 49

26

Disadvantages of RRS

Increased on-chip power consumption due to added structures in memory controller
Performance degradation due to false positives while detecting aggressor row
A Row Hammer attack can still succeed with non-zero probability

27 of 49

27

Criticism Points

The authors have not discussed the impact of false positives on system performance
The paper does not analyse how the false-positive rate varies with HRT size

28 of 49

Points to Discuss

28

Possibility of a side-channel attack during a swap
Wider window for speculation-based attacks because DRAM accesses are blocked during a swap

29 of 49

Possible Extensions

29

Coming up with a better heuristic to select swap candidate instead of random selection
Reduction of false positive rates by improving the strategy to identify or track aggressive rows

30 of 49

References

30

Gururaj Saileshwar, Bolin Wang, Moinuddin Qureshi, and Prashant J. Nair. “Randomized Row-Swap: Mitigating Row Hammer by Breaking Spatial Correlation between Aggressor and Victim Rows.” In ASPLOS, 2022.
A. G. Yaglikci et al. “BlockHammer: Preventing RowHammer at Low Cost by Blacklisting Rapidly-Accessed DRAM Rows.” In HPCA, 2021.
Eojin Lee, Ingab Kang, Sukhan Lee, G. Edward Suh, and Jung Ho Ahn. “TWiCe: Preventing Row-hammering by Exploiting Time Window Counters.” In Proceedings of the 46th International Symposium on Computer Architecture, 2019.
Dae-Hyun Kim, Prashant J. Nair, and Moinuddin K. Qureshi. “Architectural Support for Mitigating Row Hammering in DRAM Memories.” IEEE CAL 14, 1 (2014), 9–12.
J. S. Kim et al. “Revisiting RowHammer: An Experimental Analysis of Modern DRAM Devices and Mitigation Techniques.” In ISCA, 2020.

31 of 49

THANK YOU

31

32 of 49

Questions

32

33 of 49

Backup

33

34 of 49

34

CASLAT

ACT

READ

PRECHARGE

Command Bus

Address Bus

ACTIVE

tRC

tCCD

tRTP

tRP

35 of 49

35

Finding the RRS interval

The randomized row-swap threshold T is chosen such that T_RH = k * T, where k is an integer.

For a successful attack, the number of ACTs must be >= k * T, which requires at least k swaps, each preceded by a set of T activations.

64ms

37 of 49

37

An Event (E): Round of Attack

E-1

Random Swap 1

E-2

Random Swap 2

In this manner, k swaps (events) are needed for a successful attack.

38 of 49

38

Statistical Modelling: Buckets and Balls

Goal: find the probability of achieving k swaps on any row within 64 ms

Maximum number of ACTs to any bank within 64 ms: A (= 1.36 million)

With RRS, a bank is busy performing row swaps for a fraction of the 64 ms

Let D be the remaining duty cycle available for activations.

39 of 49

39

Contd..

A bank can undergo A*D activations in 64 ms

Each time event E triggers, a ball is thrown into one of N buckets

The attacker can throw B = A*D/T such balls in 64 ms.

40 of 49

40

Contd..

Probability of a row receiving k swaps in 64 ms

= Probability of a bucket holding k balls after B balls are thrown at random into N buckets, for a given T.

Bernoulli trials are used to model this.
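The buckets-and-balls model can be evaluated numerically with a binomial tail. The sketch below takes A = 1.36 million and T = 800 from the slides; the duty cycle D = 0.9, the bucket count N = 131072 (128K rows per bank), and k = 6 are illustrative assumptions, and the function names are hypothetical. It gives the probability for one fixed row, which may differ from the paper's exact analysis.

```python
from math import exp, lgamma, log, log1p


def binom_pmf(n, i, p):
    """P(exactly i successes in n Bernoulli trials), computed in log space."""
    log_pmf = (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
               + i * log(p) + (n - i) * log1p(-p))
    return exp(log_pmf)


def prob_at_least_k(balls, buckets, k):
    """P(one given bucket receives >= k balls); requires buckets >= 2."""
    p = 1.0 / buckets
    # Sum the upper tail directly to avoid cancellation for tiny probabilities.
    return sum(binom_pmf(balls, i, p) for i in range(k, balls + 1))


A = 1_360_000      # max ACTs per bank per 64 ms (from the slides)
T = 800            # swap threshold T_RRS (from the slides)
D = 0.9            # assumed duty cycle left for activations
N = 131_072        # assumed number of rows (buckets) per bank
K = 6              # assumed swaps needed on the same row

B = round(A * D / T)   # balls the attacker can throw in one epoch
print(B, prob_at_least_k(B, N, K))
```

Multiplying the per-row value by N gives a simple upper bound on the chance that any row in the bank collects k swaps within one epoch.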

41 of 49

41

Attack iterations Calculations

The T_RRS value used in the paper is 800
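Combining this threshold with the relation T_RH = k * T gives the number of swaps an attacker needs on one row. The sketch below assumes T_RH = 4.8K, the LPDDR4 (new) value from the threshold table; the function name is hypothetical.

```python
def swaps_needed(t_rh, t_rrs):
    """Smallest integer k with k * t_rrs >= t_rh (ceiling division)."""
    return -(-t_rh // t_rrs)


T_RH = 4800   # assumed Row Hammer threshold (LPDDR4, new)
T_RRS = 800   # swap threshold from the slide

print(swaps_needed(T_RH, T_RRS))  # 6
```

So the aggressor would have to be placed next to the same victim six times within one epoch, each time after a random swap.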

42 of 49

42

Collision Avoidance Table (CAT): Storage Optimization

The CAT stores C items

It adds E extra ways per set to a cache with S sets

It has D demand ways per set, where D = C/(2S)

43 of 49

43

Collision Avoidance Table (CAT): Storage Optimization

44 of 49

44

T_RRS = 800

Entries required = 1.36 million / 800 = 1700

Number of RIT tuples = 3400
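The sizing arithmetic can be checked in a few lines. The factor of two is read off the slide's own numbers (3400 = 2 x 1700), presumably because each swap relocates two rows; that interpretation and the function name are assumptions.

```python
def rit_sizing(act_max, t_rrs, rows_per_swap=2):
    """Return (swaps per epoch, RIT tuples) for a given swap threshold."""
    swaps_per_epoch = act_max // t_rrs
    return swaps_per_epoch, rows_per_swap * swaps_per_epoch


swaps, tuples = rit_sizing(act_max=1_360_000, t_rrs=800)
print(swaps, tuples)  # 1700 3400
```

Lowering T_RRS increases the number of swaps per epoch and therefore the RIT capacity in the same proportion.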

45 of 49

Issues in Evaluation Methodology

45

Evaluation methodology doesn’t segregate the IPC benefits due to

Adding new independent instructions in existing binary
Benefits due to software prefetching


47 of 49

Limitations

47

Extra storage of 42.9 kB per bank, which can amount to 1.37 MB per DIMM
It can reach several MB for multi-channel memory controllers
It is a probabilistic mitigation, not a deterministic one

48 of 49

48

Performance Overhead of Row-Swap

49 of 49

49

Familiarity with Some Terms

Epoch 🡪 Refresh interval (64 ms)

ACT 🡪 Activate command

TRR 🡪 Target Row Refresh

t_RC 🡪 Minimum time gap between two successive ACT commands to different rows of the same bank

ACT_max 🡪 Maximum number of activations possible within an epoch (1.36 million)

T_RH 🡪 Minimum number of activations to a physical row within 64 ms required to trigger Row Hammer bit flips (4.8K)

Row Size 🡪 8 kB

Note: The paper uses DDR4/LPDDR4.