Caches I
CS61C
Great Ideas
in�Computer Architecture
(a.k.a. Machine Structures)
UC Berkeley�Teaching Professor �Dan Garcia
cs61c.org
Garcia
Caches I (1)
Binary Prefix�(review)
Garcia
Caches I (2)
The way to remember #s
X=0 | --- | Y=0 | 1 |
X=1 | kibi ~103 | Y=1 | 2 |
X=2 | mebi ~106 | Y=2 | 4 |
X=3 | gibi ~109 | Y=3 | 8 |
X=4 | tebi ~1012 | Y=4 | 16 |
X=5 | pebi ~1015 | Y=5 | 32 |
X=6 | exbi ~1018 | Y=6 | 64 |
X=7 | zebi ~1021 | Y=7 | 128 |
X=8 | yobi ~1024 | Y=8 | 256 |
| | Y=9 | 512 |
MEMORIZE!
[review]
Garcia
Caches I (3)
What units do different storage use?
RAM and Caches use powers of 2.
Garcia
Caches I (4)
Library Analogy
Garcia
Caches I (5)
Why are Large Memories Slow? Library Analogy
What we want is a large, yet fast memory!
Garcia
Caches I (6)
What to do: Library Analogy
Garcia
Caches I (7)
Memory Hierarchy
Garcia
Caches I (8)
New-School Machine Structures
Parallel Requests
Assigned to computer
e.g., Search “Cats”
Parallel Threads
Assigned to core e.g., Lookup, Ads
Parallel Instructions�>1 instruction @ one time
e.g., 5 pipelined instructions
Parallel Data�>1 data item @ one time
e.g., Add of 4 pairs of words
Hardware descriptions�All gates work in parallel at same time
Smart�Phone
Warehouse Scale Computer
Software Hardware
Logic Gates
Core
Core
…
Memory (Cache)
Input/Output
Computer
Main Memory
Exec. Unit(s)
Functional
Block(s)
A0+B0
A1+B1
Harness�Parallelism &
Achieve High�Performance
Garcia
Caches I (9)
Processor-DRAM Gap (Latency)�
1980 microprocessor executes ~one instruction in same time as DRAM access
2020 microprocessor executes ~1000 instructions in same time as DRAM access
Slow DRAM access has disastrous impact on CPU performance!
🗹
Garcia
Caches I (10)
Great Idea #3: Principle of Locality / Memory Hierarchy
Extremely fast
Extremely expensive�Tiny capacity
Processor chip
Fast
Priced reasonably�Medium capacity
DRAM chip –e.g. �DDR3/4/5�HBM/HBM2/3
Faster
Expensive�Small capacity
A UC Berkeley EECS invention: “Cache” sounds like “Cash”, so use shorthand $
Garcia
Caches I (11)
Computer without Cache
Garcia
Caches I (12)
Computer with Cache (copies of recent data)
Lines (i.e., blocks) of data are copied from memory to the cache.
Garcia
Caches I (13)
Jim Gray’s Storage Latency Analogy: �How Far Away is the Data?
Jim Gray�Turing Award
B.S. Cal 1966
Ph.D. Cal 1969
Registers
On-chip cache
Memory
1
2
100
Sacramento
This Room
My Head
1.5 hr
1 min
2 min
[ns]
Garcia
Caches I (14)
Characteristics of the Memory Hierarchy
Increasing distance from the processor in access time
$
Main Memory
Secondary Memory
Processor
(Relative) size of the memory at each level
Inclusive– what is in Level 1 cache (L1$) is a subset what is in MM that is a subset of is in SM
4-8 bytes (word)
1 to 4 blocks
1,024+ bytes�(disk sector = page)
8-32 bytes (block)
Garcia
Caches I (15)
Typical Memory Hierarchy
Control
Processor
Datapath
Secondary
Memory
(Disk
Or Flash)
On-Chip Components
RegFile
Main
Memory
(DRAM)
Data
Cache
Instr
Cache
Speed (#cycles): ½’s 1’s 10’s 100’s 10,000’s
Size (bytes): 100’s 10K’s M’s G’s T’s
Cost: highest lowest
Garcia
Caches I (16)
Memory Hierarchy
🗹
Garcia
Caches I (17)
Main Memory is DRAM
Desktop/server DIMM
(dual inline memory module)
SRAM chip from NES clone
Garcia
Caches I (18)
Storage / “Disk” / Secondary Memory
Attached as a peripheral I/O device. Non-volatile
Garcia
Caches I (19)
Locality, Design, Management
Garcia
Caches I (20)
Caching: The basis of the memory hierarchy
| Temporal Locality | Spatial Locality |
Idea | If we use it now, chances are that we’ll want to use it again soon. | If we use a piece of memory, chances are we’ll use the neighboring pieces soon. |
Library Analogy | We keep a book on the desk while we check out another book. | If we check out a book’s vol. 1 while we’re at it, we’ll also check out vol. 2. |
Memory | If a memory location is referenced, then it will tend to be referenced again soon. Therefore, keep most recently accessed data items closer to the processor. | If a memory location is referenced, the locations with nearby addresses will tend to be referenced soon. Move lines consisting of contiguous words closer to the processor. |
Garcia
Caches I (21)
Cache Design
Garcia
Caches I (22)
Garcia
Caches I (23)
How is the Hierarchy Managed?
Also a type of cache
Garcia
Caches I (24)
“And in Conclusion…”
Garcia
Caches I (25)