Multiprocessors and Thread-Level Parallelism (Part 2)
Chapter 5
Appendix F
Appendix I
Outline
Prof. Iyad Jafar
Distributed Shared Memory
Distributed Shared Memory Multiprocessors (DSMs)
Coherency in DSM
Example Protocol
P = requesting node number, A = requested address, and D = data contents
DSM and Directory-Based Coherence
State transition diagram for an individual cache block
Black: requests by the local processor
Gray: requests from the home directory
Bold: Actions by local processor
State transition diagram for the directory
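The directory's behaviour can be sketched in C as a small state machine. This is a minimal sketch and not the slide's diagram itself: it assumes the usual three directory states (uncached, shared, exclusive) and three incoming messages (read miss, write miss, data write-back). The names `dir_entry` and `dir_handle` and the 64-node bit vector are hypothetical, and the messages the directory would send (data replies, invalidates, fetches) appear only as comments.

```c
#include <stdint.h>

/* Assumed directory states and incoming messages (illustrative names). */
typedef enum { UNCACHED, SHARED, EXCLUSIVE } dir_state;
typedef enum { READ_MISS, WRITE_MISS, DATA_WRITE_BACK } dir_msg;

typedef struct {
    dir_state state;
    uint64_t  sharers;  /* bit p set => node p holds a copy (assumes <= 64 nodes) */
} dir_entry;

/* Update one directory entry for a message from requesting node p. */
void dir_handle(dir_entry *e, dir_msg m, unsigned p) {
    uint64_t bit = (uint64_t)1 << p;
    switch (m) {
    case READ_MISS:
        /* UNCACHED/SHARED: memory supplies the data.
           EXCLUSIVE: the owner would be asked to supply the block (fetch)
           and remains a sharer. */
        e->sharers |= bit;
        e->state = SHARED;
        break;
    case WRITE_MISS:
        /* SHARED: every sharer would be sent an invalidate.
           EXCLUSIVE: the old owner would be sent fetch/invalidate. */
        e->sharers = bit;
        e->state = EXCLUSIVE;
        break;
    case DATA_WRITE_BACK:
        /* The owner is replacing the block, so memory is up to date again. */
        if (e->state == EXCLUSIVE) {
            e->sharers = 0;
            e->state = UNCACHED;
        }
        break;
    }
}
```

For example, starting from an uncached block, a read miss from node 2 leaves the entry shared with only bit 2 set, and a later write miss from node 1 makes node 1 the sole (exclusive) owner.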
Synchronization
Basic Hardware primitives
Exchange
try:  MOV    R3,R4      ; copy exchange value to R3
      LL     R2,0(R1)   ; load linked
      SC     R3,0(R1)   ; store conditional
      BEQZ   R3,try     ; branch if store fails
      MOV    R4,R2      ; put loaded value in R4
Fetch-and-Inc
try:  LL     R2,0(R1)   ; load linked
      DADDUI R3,R2,#1   ; increment
      SC     R3,0(R1)   ; store conditional
      BEQZ   R3,try     ; branch if store fails
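The two load-linked/store-conditional sequences (atomic exchange and fetch-and-increment) can be approximated in portable C11. This is a sketch under assumptions: C has no direct LL/SC, so a compare-and-exchange retry loop stands in for the LL ... SC ... BEQZ pattern, and the function names `exchange` and `fetch_and_inc` are illustrative only. The weak compare-exchange may fail spuriously, much as SC can, which is why each call sits in a loop.

```c
#include <stdatomic.h>

/* Atomic exchange: store new_value and return the value that was there. */
long exchange(atomic_long *addr, long new_value) {
    long old = atomic_load(addr);                    /* plays the role of LL */
    while (!atomic_compare_exchange_weak(addr, &old, new_value)) {
        /* Failure is like a failed SC: old now holds the current value; retry. */
    }
    return old;
}

/* Fetch-and-increment: add 1 atomically and return the previous value. */
long fetch_and_inc(atomic_long *addr) {
    long old = atomic_load(addr);                    /* plays the role of LL */
    while (!atomic_compare_exchange_weak(addr, &old, old + 1)) {
        /* old was refreshed by the failed attempt; old + 1 is recomputed. */
    }
    return old;
}
```

In practice C11 already provides `atomic_exchange` and `atomic_fetch_add`; the explicit loops are written out here only to mirror the retry structure of the assembly.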
Locks and Coherence
        DADDUI R2,R0,#1
lockit: EXCH   R2,0(R1)    ; atomic exchange
        BNEZ   R2,lockit   ; already locked?
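The exchange-based spin lock above maps closely onto C11 atomics. The following is a minimal sketch; the type `exch_lock` and the functions `exch_lock_acquire` and `exch_lock_release` are hypothetical names, and the lock word uses 0 for free and 1 for held, as in the assembly.

```c
#include <stdatomic.h>

typedef struct {
    atomic_int locked;   /* 0 = free, 1 = held */
} exch_lock;

void exch_lock_acquire(exch_lock *l) {
    /* Swap in 1; an old value of 1 means another processor holds the lock. */
    while (atomic_exchange(&l->locked, 1) != 0) {
        /* spin: every iteration is a write and generates coherence traffic */
    }
}

void exch_lock_release(exch_lock *l) {
    atomic_store(&l->locked, 0);
}
```

Because every spin iteration performs a write, each waiting processor keeps invalidating the others' copies of the lock word.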
Locks and Coherence
lockit: LD     R2,0(R1)    ; load of lock
        BNEZ   R2,lockit   ; not available-spin
        DADDUI R2,R0,#1    ; load locked value
        EXCH   R2,0(R1)    ; swap
        BNEZ   R2,lockit   ; branch if lock wasn’t 0
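This version spins on an ordinary load and attempts the exchange only once the lock looks free, so waiting processors read their locally cached copy instead of writing on every iteration. A hedged C11 sketch of the same idea follows; the names `ttas_lock`, `ttas_acquire`, and `ttas_release` are illustrative, not from the slides.

```c
#include <stdatomic.h>

typedef struct {
    atomic_int locked;   /* 0 = free, 1 = held */
} ttas_lock;

void ttas_acquire(ttas_lock *l) {
    for (;;) {
        /* Read-only spin (LD / BNEZ): hits in the local cache while held. */
        while (atomic_load(&l->locked) != 0) {
        }
        /* The lock looked free: try to take it with an atomic swap (EXCH). */
        if (atomic_exchange(&l->locked, 1) == 0) {
            return;      /* old value was 0, so this caller owns the lock */
        }
        /* Another processor won the race; go back to read-only spinning. */
    }
}

void ttas_release(ttas_lock *l) {
    atomic_store(&l->locked, 0);
}
```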
Locks and Coherence
lockit: LL     R2,0(R1)    ; load linked
        BNEZ   R2,lockit   ; not available-spin
        DADDUI R2,R0,#1    ; locked value
        SC     R2,0(R1)    ; store
        BEQZ   R2,lockit   ; branch if store fails
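The LL/SC lock has a rough C11 analogue built on a weak compare-exchange, which, like SC, is allowed to fail and force a retry. This is a sketch under that assumption, not an exact translation: compare-exchange compares values, whereas SC fails on any intervening write to the linked address. The function names are hypothetical.

```c
#include <stdatomic.h>

void llsc_style_acquire(atomic_int *lock) {
    for (;;) {
        /* LL + BNEZ: spin with plain reads while the lock is held. */
        if (atomic_load(lock) != 0) {
            continue;
        }
        /* SC: try to change 0 -> 1; a failure (possibly spurious) restarts. */
        int expected = 0;
        if (atomic_compare_exchange_weak(lock, &expected, 1)) {
            return;
        }
    }
}

void llsc_style_release(atomic_int *lock) {
    atomic_store(lock, 0);
}
```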
Consistency Models