CS-773 Paper Presentation
Predicting Performance Impact of DVFS for Realistic Memory Systems
Abhinav Sridhar
The Booster Dose (#1)
abhinavsridhar22@iitb.ac.in
1
DVFS
2
DVFS
3
Image Reference: Slides by Prof. Biswabandan Panda
Two Phase View
4
Current Methods: Leading Loads
T = Ccompute * t + Tmemory
T → predicted execution time
Ccompute → # compute cycles
t → clock cycle time
Tmemory → memory access latency
5
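A minimal sketch of how the linear model T = Ccompute * t + Tmemory could be used to predict execution time at a new cycle time. The function name, its parameters, and the numbers are illustrative assumptions, not taken from the paper; Tmemory is assumed to be independent of core frequency.

```python
def predict_time_linear(total_time, memory_time, t_old, t_new):
    """Predict execution time at cycle time t_new under T = Ccompute * t + Tmemory.

    total_time  : measured execution time at cycle time t_old
    memory_time : estimated Tmemory (assumed not to scale with core frequency)
    All times must use the same unit (nanoseconds in the example below).
    """
    if t_old <= 0 or t_new <= 0:
        raise ValueError("cycle times must be positive")
    # Recover the compute cycle count from the measurement at t_old.
    c_compute = (total_time - memory_time) / t_old
    # Only the compute part scales with the new cycle time.
    return c_compute * t_new + memory_time


# Hypothetical numbers: 10 ns total, 4 ns of it memory, going from 2 GHz to 1 GHz.
predicted = predict_time_linear(total_time=10.0, memory_time=4.0, t_old=0.5, t_new=1.0)
print(predicted)
```

Leading loads and stall time differ in how they estimate Tmemory; the sketch only covers the prediction step they share.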
Current Methods: Stall Time
6
Shortcomings: DRAM
7
Shortcomings: Prefetching
8
Potential for Improvement
STATE-OF-THE-ART DVFS CONTROLLERS CRUMBLE UNDER A REALISTIC DRAM MODEL AND UNDER PREFETCHING
9
Realistic Execution Sequence
10
CRIT: Critical Path Calculation
11
CRIT: Critical Path Calculation
12
CRIT: Critical Path Calculation
PA = PB = 0, since Pglobal = 0
13
CRIT: Critical Path Calculation
Pglobal = max(0,A) = A
14
CRIT: Critical Path Calculation
15
CRIT: Critical Path Calculation
Pglobal = max(A, B) = B
16
CRIT: Critical Path Calculation
Pglobal = max(B, A+C) = A+C
17
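The Pglobal updates on the preceding slides can be sketched as a small event loop. This is an illustrative software model under assumed names (the event tuples, `crit_critical_path`, and the latencies are hypothetical), not the hardware implementation: each request copies Pglobal when it is issued, and on completion Pglobal becomes the maximum of its current value and the request's copy plus its latency.

```python
def crit_critical_path(events):
    """Return the final Pglobal for a time-ordered list of memory events.

    Each event is ("issue", name) or ("complete", name, latency).
    """
    p_global = 0
    issued = {}  # per-request copy of Pglobal taken at issue time (PA, PB, ...)
    for event in events:
        kind = event[0]
        if kind == "issue":
            issued[event[1]] = p_global
        elif kind == "complete":
            _, name, latency = event
            # Critical path through this request: its issue-time copy plus its latency.
            p_global = max(p_global, issued.pop(name) + latency)
        else:
            raise ValueError(f"unknown event kind: {kind}")
    return p_global


# Hypothetical latencies: A = 100, B = 120, C = 50, with C issued after A completes.
events = [
    ("issue", "A"),          # PA = 0
    ("issue", "B"),          # PB = 0
    ("complete", "A", 100),  # Pglobal = max(0, A) = A
    ("issue", "C"),          # PC = A
    ("complete", "B", 120),  # Pglobal = max(A, B) = B
    ("complete", "C", 50),   # Pglobal = max(B, A + C) = A + C
]
print(crit_critical_path(events))
```

With these numbers A + C exceeds B, so the final value follows the dependent chain A then C, as on the slide.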
CRIT: Critical Path Calculation
18
CRIT: Critical Path Calculation
19
CRIT: Critical Path Calculation
20
Effects of Prefetching
21
Limited Bandwidth Performance Model
22
Limited Bandwidth Performance Model
T = Tmin_memory ; ∀ t < tcrossover
T = Tdemand + t * Ccompute ; elsewhere
23
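A minimal sketch of the piecewise limited bandwidth model: T equals the minimum memory time for cycle times below tcrossover, and Tdemand + t * Ccompute elsewhere. The function name, parameter names, and numbers are illustrative assumptions; tcrossover is taken to be the cycle time at which the two branches are equal.

```python
def limited_bandwidth_time(t, c_compute, t_demand, t_min_memory):
    """Predicted execution time at cycle time t under the limited bandwidth model."""
    if c_compute <= 0:
        raise ValueError("c_compute must be positive")
    # Assumed crossover: the point where t_min_memory == t_demand + t * c_compute.
    t_crossover = (t_min_memory - t_demand) / c_compute
    if t < t_crossover:
        # Bandwidth-bound region: a faster core cannot go below the minimum memory time.
        return t_min_memory
    # Latency-bound region: the linear model applies.
    return t_demand + t * c_compute


# Hypothetical values: Ccompute = 10 cycles, Tdemand = 4, Tmin_memory = 12 (crossover at t = 0.8).
print(limited_bandwidth_time(0.5, 10, 4.0, 12.0))
print(limited_bandwidth_time(1.0, 10, 4.0, 12.0))
```

Under this assumption the model is equivalent to taking the maximum of the two branches.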
DRAM Slack
24
Hardware Overheads
25
Experimental Methodology
26
Policies for Comparison
27
Policies for Comparison
28
Policies for Comparison
29
Results: Energy Reduction
Memory Intensive Benchmarks (Without Prefetching)
30
Results: Energy Reduction
Non-Memory Intensive Benchmarks (Without Prefetching)
31
Results: Energy Reduction
Prefetch-Heavy Benchmarks
32
Results: Energy Reduction
Prefetch-Light Benchmarks
33
Results
34
Results
All prefetch-heavy benchmarks lie above y = x
35
Conclusion
36
References
Figures, unless otherwise mentioned, have been taken from: Rustam Miftakhutdinov, Eiman Ebrahimi, and Yale N. Patt. "Predicting Performance Impact of DVFS for Realistic Memory Systems." MICRO 2012.
37
THANK YOU
38
Critical Points
39
Current Methods: Leading Loads
40
Two Phase View
41
CRIT: Critical Path Calculation
42
CRIT: Critical Path Calculation
43
Current Methods: Leading Loads
T = Ccompute * t + Tmemory
44
Limited Bandwidth Performance Model
T = Tmin_memory ; ∀ t < tcrossover
T = Tdemand + t * Ccompute ; elsewhere
45
Results
All prefetch-heavy benchmarks lie above y = x
46