CSE 451
Operating Systems
L15 - Address Translation
Slides by: Tom Anderson
Baris Kasikci
Main Points
Address Translation Concept
Virtual address: Intel 57 bits
Invalid translation: kernel trap
Physical address: Intel 52 bits
Cache line: 64 bytes
Cache coherent (~10TB)
Virtually Addressed Base and Bounds
Base
Base + bound
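A minimal sketch of base-and-bounds translation, assuming virtual addresses start at 0 and the hardware adds the base after checking the bound. The names `translate_base_bounds` and `BoundsError` are made up for illustration.

```python
class BoundsError(Exception):
    """Stands in for the trap raised on an out-of-range address."""


def translate_base_bounds(vaddr: int, base: int, bound: int) -> int:
    """Map a virtual address to a physical one using a base and a bound."""
    # Valid virtual addresses run from 0 to bound - 1; anything else traps.
    if vaddr < 0 or vaddr >= bound:
        raise BoundsError(f"address {vaddr:#x} is outside bound {bound:#x}")
    # The physical address is the base plus the virtual address,
    # so the process occupies physical memory [base, base + bound).
    return base + vaddr
```

For example, with base 0x4000 and bound 0x1000, virtual address 0x10 maps to physical address 0x4010, and virtual address 0x1000 traps.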
Paged Translation
Intel: the cr3 register holds the physical address of the page table
Figure: paged translation with 4 KB frames
Virtual address (64 bit address): 45 bit page # + 12 bit offset
Physical address: 40 bit frame + 12 bit offset
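The address split in the figure can be sketched with shifts and masks. This is an illustration only: the field widths (45, 40, 12) come from the slide, while the function names are hypothetical.

```python
OFFSET_BITS = 12        # 4 KB pages and frames
PAGE_NUMBER_BITS = 45   # virtual page number width from the slide
FRAME_NUMBER_BITS = 40  # physical frame number width from the slide


def split_virtual_address(vaddr: int) -> tuple[int, int]:
    """Return (page number, offset) for a virtual address."""
    offset = vaddr & ((1 << OFFSET_BITS) - 1)
    page = (vaddr >> OFFSET_BITS) & ((1 << PAGE_NUMBER_BITS) - 1)
    return page, offset


def join_physical_address(frame: int, offset: int) -> int:
    """Build a physical address from a frame number and an offset."""
    if frame >> FRAME_NUMBER_BITS or offset >> OFFSET_BITS:
        raise ValueError("frame or offset does not fit its field")
    return (frame << OFFSET_BITS) | offset
```

The offset passes through translation unchanged; only the page number is replaced by a frame number.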
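Figure: Process view, page table, and physical memory, with 4 byte pages and 8 page frames (3 bit frame number, 2 bit offset)
Process view: page 0 = A B C D, page 1 = E F G H, page 2 = I J K L
Page table: page 0 -> frame 4, page 1 -> frame 3, page 2 -> frame 1
Physical memory: frame 1 = I J K L, frame 3 = E F G H, frame 4 = A B C D
Example: virtual address 9 = 010 01 -> physical 001 01; virtual address 0 = 000 00 -> physical 100 00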
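The toy example in the figure (4 byte pages, 8 page frames, page table 4, 3, 1) is small enough to simulate directly. The sketch below is illustrative; the names `PAGE_TABLE` and `translate` are assumptions, not part of the slides.

```python
PAGE_SIZE = 4            # 4 byte pages, so the offset is 2 bits
NUM_FRAMES = 8           # 8 page frames, so the frame number is 3 bits
PAGE_TABLE = [4, 3, 1]   # virtual page number -> physical frame number

# Lay out physical memory as in the figure: "." marks an unused byte.
physical_memory = ["."] * (NUM_FRAMES * PAGE_SIZE)
for page, frame in enumerate(PAGE_TABLE):
    data = "ABCDEFGHIJKL"[page * PAGE_SIZE:(page + 1) * PAGE_SIZE]
    physical_memory[frame * PAGE_SIZE:(frame + 1) * PAGE_SIZE] = data


def translate(vaddr: int) -> int:
    """Translate a virtual address through the single-level page table."""
    page, offset = divmod(vaddr, PAGE_SIZE)
    if not 0 <= page < len(PAGE_TABLE):
        raise IndexError(f"virtual page {page} is not mapped")
    return PAGE_TABLE[page] * PAGE_SIZE + offset
```

Reading virtual addresses 0 through 11 via `translate` yields A through L in order, even though the pages are scattered in physical memory.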
Page Table Entry (Intel 64 bit)
Present bit: if the PTE is not present, unused virtual pages do not need a physical page frame
Writable bit: instruction pages = 0
User bit: kernel pages = 0
Accessed and dirty bits: used for page replacement; set by HW, cleared by OS
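A hedged sketch of decoding a 64-bit x86 page table entry. The bit positions follow the Intel layout for a 4 KB page entry (present = bit 0, writable = bit 1, user = bit 2, accessed = bit 5, dirty = bit 6, frame number in bits 12 to 51); the constant and function names are illustrative and do not come from xk.

```python
PTE_PRESENT = 1 << 0    # entry maps a page frame
PTE_WRITABLE = 1 << 1   # 0 for read-only pages such as instruction pages
PTE_USER = 1 << 2       # 0 for kernel-only pages
PTE_ACCESSED = 1 << 5   # set by hardware on any access
PTE_DIRTY = 1 << 6      # set by hardware on a write
PTE_FRAME_MASK = ((1 << 52) - 1) & ~0xFFF  # bits 12..51


def decode_pte(pte: int) -> dict:
    """Break a page table entry into its flag bits and frame number."""
    return {
        "present": bool(pte & PTE_PRESENT),
        "writable": bool(pte & PTE_WRITABLE),
        "user": bool(pte & PTE_USER),
        "accessed": bool(pte & PTE_ACCESSED),
        "dirty": bool(pte & PTE_DIRTY),
        "frame": (pte & PTE_FRAME_MASK) >> 12,
    }
```

An OS doing page replacement would periodically clear the accessed bit and check later whether hardware has set it again.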
Paging Questions
Paging Questions - 2
How big is the page table based on what we learned so far?
2^45 entries, 1 word = 64 bits = 8 bytes => 2^48 bytes (256 TB!)
An array is not going to work!
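The size estimate can be checked with a few lines of arithmetic. This is only a sanity check of the numbers on the slide; the function name is made up.

```python
def flat_page_table_bytes(page_number_bits: int, pte_bytes: int) -> int:
    """Size of a single-level page table with one entry per virtual page."""
    return (2 ** page_number_bits) * pte_bytes


# 45 bit page number, 8 byte entries, as on the slide.
table_bytes = flat_page_table_bytes(45, 8)
table_tb = table_bytes // 2 ** 40  # binary terabytes
```

Here `table_bytes` is 2^48 and `table_tb` is 256, per process, which is why a flat array is not practical.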
Single Level Page Table
Two Level Page Tables?
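Figure: two-level page table (Level 1 Table pointing to Level 2 Tables)
Virtual address: 36 bit index into level-1 page + 9 bit offset into level-2 page + 12 bit offset
Level 1 entry: page frame of level-2 table
Level 2 entry: page frame
Each level-2 table is one page of 512 entries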
Three Level Page Tables
Figure: three-level page table
Virtual address: 27 bit + 9 bit + 9 bit indexes + 12 bit offset
Entries marked "no mapping" have no lower-level table
If PTE is invalid in level 1, don’t need a level 2 table
If it is invalid in level 2, don’t need a level 3 table
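A rough sketch of why multilevel tables save space: lower-level tables are only created when something is mapped under them. It assumes the 27 + 9 + 9 + 12 bit split from the figure and uses Python dicts to stand in for page-sized tables; `split_indices`, `map_page`, and `walk` are hypothetical names.

```python
LEVEL_BITS = (27, 9, 9)  # index widths for levels 1, 2, 3
OFFSET_BITS = 12


def split_indices(vaddr: int) -> tuple[list[int], int]:
    """Return ([level-1, level-2, level-3 index], offset)."""
    offset = vaddr & ((1 << OFFSET_BITS) - 1)
    vpn = vaddr >> OFFSET_BITS
    indices = []
    for bits in reversed(LEVEL_BITS):  # peel off the lowest level first
        indices.append(vpn & ((1 << bits) - 1))
        vpn >>= bits
    indices.reverse()
    return indices, offset


def map_page(root: dict, vaddr: int, frame: int) -> None:
    """Map the page containing vaddr, creating lower tables on demand."""
    indices, _ = split_indices(vaddr)
    table = root
    for idx in indices[:-1]:
        table = table.setdefault(idx, {})
    table[indices[-1]] = frame


def walk(root: dict, vaddr: int):
    """Return the physical address, or None if any level has no mapping."""
    indices, offset = split_indices(vaddr)
    node = root
    for idx in indices:
        if idx not in node:
            return None  # invalid entry: no lower table exists
        node = node[idx]
    return (node << OFFSET_BITS) | offset
```

After mapping a single page, the structure holds one table per level rather than 2^45 entries.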
x86 Multilevel Paging
Paging and Sharing
Page Table Walk in xk (x86_vm64.c)
Page Translation in the OS
xk Paging Data Structures (vspace.h)
Multilevel Paging
Pros
Cons
Efficient Address Translation
Cost of translation = Cost of TLB lookup + Prob(TLB miss) * cost of page table lookup
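The cost formula is easy to experiment with. The sketch below just evaluates it; the function name and any latencies plugged in are illustrative assumptions, not measurements.

```python
def translation_cost(tlb_lookup_ns: float,
                     miss_probability: float,
                     page_table_lookup_ns: float) -> float:
    """Expected cost of one address translation, in nanoseconds."""
    if not 0.0 <= miss_probability <= 1.0:
        raise ValueError("miss_probability must be between 0 and 1")
    # Every access pays the TLB lookup; only misses pay the table walk.
    return tlb_lookup_ns + miss_probability * page_table_lookup_ns
```

With an assumed 1 ns TLB lookup and 100 ns page table lookup, a 50% miss rate costs 51 ns per translation, while a 0% miss rate costs 1 ns, so the miss rate dominates.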
TLB and Page Table Translation
TLB Lookup
Hit
Miss
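A small software model of the hit and miss paths. It assumes 4 KB pages, a page table represented as a dict from page number to frame, and LRU replacement in the TLB; real hardware TLBs differ in organization and replacement policy, and the class name `TLB` is just for illustration.

```python
from collections import OrderedDict

OFFSET_BITS = 12  # assumed 4 KB pages


class TLB:
    """Tiny cache of page -> frame translations in front of a page table."""

    def __init__(self, page_table: dict, capacity: int = 4):
        self.page_table = page_table
        self.capacity = capacity
        self.entries = OrderedDict()  # oldest entry first
        self.hits = 0
        self.misses = 0

    def translate(self, vaddr: int) -> int:
        page = vaddr >> OFFSET_BITS
        offset = vaddr & ((1 << OFFSET_BITS) - 1)
        if page in self.entries:
            # Hit: use the cached frame and mark it most recently used.
            self.hits += 1
            self.entries.move_to_end(page)
            frame = self.entries[page]
        else:
            # Miss: consult the page table (KeyError models a page fault).
            self.misses += 1
            frame = self.page_table[page]
            if len(self.entries) >= self.capacity:
                self.entries.popitem(last=False)  # evict least recently used
            self.entries[page] = frame
        return (frame << OFFSET_BITS) | offset
```

Two accesses to the same page produce one miss followed by one hit.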
Question
TLB Shootdown
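A simplified model of a TLB shootdown, assuming each CPU's TLB is a dict from page number to frame. When the OS changes a mapping, any cached copy on any CPU becomes stale and must be removed. Real systems do this with interprocessor interrupts and wait for acknowledgement; the function name `update_mapping` is hypothetical.

```python
def update_mapping(page_table: dict, cpu_tlbs: list, page: int, new_frame: int) -> None:
    """Change a page table entry and invalidate stale TLB entries on every CPU."""
    page_table[page] = new_frame
    for tlb in cpu_tlbs:
        # Drop the cached translation if this CPU has one; other entries stay.
        tlb.pop(page, None)
```

After the call, the next access to that page on any CPU misses in the TLB and reloads the new entry from the page table.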
Question
What is the cost of a TLB miss on a modern processor?
Hardware Design Principle
The bigger the memory, the slower the memory
Why we have multiple levels of cache
L1 cache – 80 KB, 1.5 ns
L2 cache – 2 MB, 4 ns
L3 cache – 320 MB, 22 ns
Also why we have multiple levels of TLBs
Question
Superpages
Virtually Addressed, Physically Tagged Caches
Virtually Addressed, Physically Tagged Cache
Address Translation Goals
Bonus Feature
Process Regions
memory mapped files, …
Question
Expand Stack on Reference
UNIX fork seems inefficient
Copy on Write
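One way to picture copy on write: fork shares every frame read-only, and a frame is copied only when one side writes to it. The sketch below is a toy model under that assumption, with reference-counted frames; the `Frame` and `AddressSpace` classes are invented for illustration and are not xk's data structures.

```python
class Frame:
    """A physical page frame with a count of address spaces sharing it."""

    def __init__(self, data: bytes):
        self.data = bytearray(data)
        self.refs = 1


class AddressSpace:
    def __init__(self):
        self.pages = {}  # page number -> [frame, writable]

    def map_page(self, page: int, data: bytes) -> None:
        self.pages[page] = [Frame(data), True]

    def fork(self) -> "AddressSpace":
        """Share all frames with the child and mark both sides read-only."""
        child = AddressSpace()
        for page, entry in self.pages.items():
            frame = entry[0]
            entry[1] = False
            frame.refs += 1
            child.pages[page] = [frame, False]
        return child

    def read(self, page: int, index: int) -> int:
        return self.pages[page][0].data[index]

    def write(self, page: int, index: int, value: int) -> None:
        entry = self.pages[page]
        frame, writable = entry
        if not writable:
            # Models the write fault: copy only if someone else shares it.
            if frame.refs > 1:
                frame.refs -= 1
                entry[0] = Frame(frame.data)
            entry[1] = True
        entry[0].data[index] = value
```

A child that never writes, for example one that immediately calls exec, never triggers a copy.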
Question
Fill On Demand
Address Translation Uses
Address Translation (more)
Address Translation (even more)