CS61C: Great Ideas in Computer Architecture (aka Machine Structures)
Lecture 25: Introduction to Performance Programming
Instructors: Lisa Yan, Justin Yokota
#
CS 61C
Spring 2024
Agenda
2
CS 61C
Spring 2024
Measuring Runtime: The <time.h> library
3
CS 61C
Spring 2024
Measuring Runtime: The <time.h> library
4
CS 61C
Spring 2024
Amdahl's Law: Analogy
25 mi/hr average | ? average |
Overall: 50 mi/hr
Berkeley
San Jose
Midpoint
CS 61C
Spring 2024
Amdahl's Law: Analogy
25 mi/hr -> 1 hour spent | ? average -> 0 hours spent |
Overall: 50 mi/hr -> 1 hour total
Berkeley: 0 miles
San Jose: 50 miles
Midpoint: 25 miles
CS 61C
Spring 2024
Amdahl's Law
1x speed | ? average |
Overall: 2x speed
Start
End
Half original runtime
CS 61C
Spring 2024
Amdahl's Law
CS 61C
Spring 2024
Amdahl's Law: Example
1x speed | 100x speed |
Start
End
CS 61C
Spring 2024
Amdahl's Law: Example
1x speed | 100x speed |
Start
End
CS 61C
Spring 2024
Amdahl's Law: Example
75 sec, 1x speed | |
Start
End
75 sec, 1x speed | 25 sec, 100x speed |
CS 61C
Spring 2024
Amdahl's Law in the Real World
CS 61C
Spring 2024
Amdahl's Law: Consequences
CS 61C
Spring 2024
Amdahl's Law: Consequences
CS 61C
Spring 2024
Case Study: Matrix Multiplication
15
| | | | | 17 | 18 | 19 | 20 |
| | | | | 21 | 22 | 23 | 24 |
| | | | | 25 | 26 | 27 | 28 |
| | | | | 29 | 30 | 31 | 32 |
| | | | | | | | |
1 | 2 | 3 | 4 | | | | | |
5 | 6 | 7 | 8 | | | | | |
9 | 10 | 11 | 12 | | | | | |
13 | 14 | 15 | 16 | | | | | |
CS 61C
Spring 2024
16
CS 61C
Spring 2024
17
CS 61C
Spring 2024
18
CS 61C
Spring 2024
Version 1: Matrix Multiply in C
19
CS 61C
Spring 2024
Version 1: Matrix Multiply in C
20
CS 61C
Spring 2024
Operation Time Costs
21
CS 61C
Spring 2024
Operation Time Costs in Matrix Multiply
22
CS 61C
Spring 2024
Optimization 1: Loop Unrolling and Function Inlining
int f(int i) {return i*i;}
for(int i = 0; i < max; i++)�{� arr[i] = f(i);�}
int i = 0;�for(; i < max/4*4; i+=4) {� arr[i] = i * i;� arr[i+1] = (i+1) * (i+1);� arr[i+2] = (i+2) * (i+2);� arr[i+3] = (i+3) * (i+3);�}�for(; i < max; i++) {� arr[i] = i * i;�}
23
CS 61C
Spring 2024
Optimization 1: Loop Unrolling and Function Inlining
24
CS 61C
Spring 2024
Loop Unrolling and Function Inlining Limitations
25
CS 61C
Spring 2024
Optimization 2: Variable Caching
for(int i = 0; i < max/4; i++)�{� f(i);�}
register int i = 0;�register int maxoverfour = max/4;�for(; i < maxoverfour; i++) {� f(i);�}
26
CS 61C
Spring 2024
Variable Caching Limitations
27
CS 61C
Spring 2024
Optimization 3: Data Caching
28
CS 61C
Spring 2024
Data Caching Example
29
CS 61C
Spring 2024
Data Caching for Matrix Multiplication
30
| | | | | 17 | 18 | 19 | 20 |
| | | | | 21 | 22 | 23 | 24 |
| | | | | 25 | 26 | 27 | 28 |
| | | | | 29 | 30 | 31 | 32 |
| | | | | | | | |
1 | 2 | 3 | 4 | | | | | |
5 | 6 | 7 | 8 | | | | | |
9 | 10 | 11 | 12 | | | | | |
13 | 14 | 15 | 16 | | | | | |
CS 61C
Spring 2024
Data Caching for Matrix Multiplication
31
| | | | | 17 | 21 | 25 | 29 |
| | | | | 18 | 22 | 26 | 30 |
| | | | | 19 | 23 | 27 | 31 |
| | | | | 20 | 24 | 28 | 32 |
| | | | | | | | |
1 | 2 | 3 | 4 | | | | | |
5 | 6 | 7 | 8 | | | | | |
9 | 10 | 11 | 12 | | | | | |
13 | 14 | 15 | 16 | | | | | |
CS 61C
Spring 2024
Data Caching for Matrix Multiply
32
CS 61C
Spring 2024
33
CS 61C
Spring 2024
34
CS 61C
Spring 2024
35
CS 61C
Spring 2024