CS232 Computer Organization
Week 11.
Computer Arch & Machine Language
Prepared by: Dr. Jun Yuan
Portions credit:
Prof. Nisan & Schocken (www.nand2tetris.org)
Prof. Krste Asanovic
Prof. David Lu
Agenda
Review: Stored program concept
[Figure: a computer system. The CPU (ALU and registers) is connected to input, output, and memory. The memory is an addressed sequence of 16-bit words (addresses 0, 1, 2, ..., n, n+1, n+2, ...), such as 1100101010010101, and holds both the program (instructions) and the data.]
Machine language
[Figure: a computer system. The memory is an addressed sequence of 16-bit words (addresses 0, 1, 2, ..., n, n+1, n+2, ...), such as 1100101010010101, holding both instructions and data. The CPU (ALU and registers) is connected to input and output. The current instruction is highlighted.]
Handling instructions:
operation
addressing
control
Compilation
high-level program
while (n < 100) {
    sum += arr[n];
    n++;
}
compile
machine language
0101111100111100
1010101010101010
1101011010101010
1001101010010101
1101010010101010
1110010100100100
0011001010010101
1100100111000100
1100011001100101
0010111001010101
...
load and execute
Mnemonics
Interpretation 1:
Instruction:
1011000011000010
add
R3
R2
sample instruction
Mnemonics
Interpretation 2:
Instruction:
1011000011000010
add
R3
R2
sample instruction
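The correspondence between the bit pattern and the mnemonic can be sketched in a few lines of Python. The field layout used here (a 4-bit opcode followed by two 6-bit register numbers) is an assumption chosen only because it reproduces the sample instruction; the slide does not state the actual field boundaries, and the function name `decode` is hypothetical.

```python
# Assumed layout (not given on the slide): 4-bit opcode, then two 6-bit register fields.
OPCODES = {0b1011: "add"}  # only the mapping shown in the sample instruction


def decode(word: str) -> str:
    """Translate a 16-bit binary string into a mnemonic such as 'add R3 R2'."""
    if len(word) != 16 or set(word) - {"0", "1"}:
        raise ValueError("expected a 16-bit binary string")
    opcode = int(word[0:4], 2)    # bits 0-3
    first = int(word[4:10], 2)    # bits 4-9
    second = int(word[10:16], 2)  # bits 10-15
    return f"{OPCODES[opcode]} R{first} R{second}"


print(decode("1011000011000010"))  # add R3 R2
```

Under either interpretation the bits stay the same; the mnemonic is only a more readable way of writing them.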
Symbols
Instruction:
1011000110000001
(fields: add | 1 | Mem[129])
add 1, Mem[129]
Assembly:
add 1, index
Friendlier syntax: we assume that index stands for Mem[129].
The assembler will resolve the symbol index into a specific address.
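A minimal sketch of what "resolving a symbol" means, assuming a symbol table that already maps index to address 129 as in the example. The function name `resolve` and the table `SYMBOLS` are hypothetical; a real assembler would also build the table and emit binary code.

```python
SYMBOLS = {"index": 129}  # assumed symbol table, taken from the slide's example


def resolve(line: str, symbols: dict[str, int]) -> str:
    """Replace every symbolic operand with the memory location it stands for."""
    op, operands = line.split(maxsplit=1)
    resolved = []
    for operand in (part.strip() for part in operands.split(",")):
        if operand in symbols:
            operand = f"Mem[{symbols[operand]}]"
        resolved.append(operand)
    return f"{op} {', '.join(resolved)}"


print(resolve("add 1, index", SYMBOLS))  # add 1, Mem[129]
```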
Agenda
Machine language
Machine operations
“if (condition) then goto instruction n”
Addressing
[Figure: a computer system. The memory holds instructions and data as 16-bit words at addresses 0, 1, 2, ..., n, n+1, n+2, ...; the CPU (ALU and registers) is connected to input and output. The current instruction is highlighted.]
How does the language allow us to specify on which data the instruction should operate?
Memory hierarchy
Latency numbers every programmer should know
L1 cache reference ......................... 0.5 ns
Branch mispredict ............................ 5 ns
L2 cache reference ........................... 7 ns
Mutex lock/unlock ........................... 25 ns
Main memory reference ...................... 100 ns
Compress 1K bytes with Zippy ............. 3,000 ns = 3 µs
Send 2K bytes over 1 Gbps network ....... 20,000 ns = 20 µs
SSD random read ........................ 150,000 ns = 150 µs
Read 1 MB sequentially from memory ..... 250,000 ns = 250 µs
Round trip within same datacenter ...... 500,000 ns = 0.5 ms
Read 1 MB sequentially from SSD ..... 1,000,000 ns = 1 ms
Disk seek ........................... 10,000,000 ns = 10 ms
Read 1 MB sequentially from disk .... 20,000,000 ns = 20 ms
Send packet CA->Netherlands->CA .... 150,000,000 ns = 150 ms
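To put the table in perspective, the short sketch below expresses a few of its entries as multiples of an L1 cache reference. The numbers are copied from the table above; the helper name `slowdown` is hypothetical.

```python
# Latencies in nanoseconds, copied from the table above.
LATENCY_NS = {
    "L1 cache reference": 0.5,
    "Main memory reference": 100,
    "SSD random read": 150_000,
    "Disk seek": 10_000_000,
}


def slowdown(name: str, baseline: str = "L1 cache reference") -> float:
    """How many times slower `name` is than `baseline`."""
    return LATENCY_NS[name] / LATENCY_NS[baseline]


for name in LATENCY_NS:
    print(f"{name}: {slowdown(name):,.0f}x")
```

A main memory reference comes out 200 times slower than an L1 cache reference, which is the motivation for keeping frequently used data as close to the CPU as possible.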
Memory hierarchy
[Figure: the memory hierarchy, starting at the CPU (ALU and registers) and moving outward through cache memory, main memory, and disk memory. Moving away from the CPU: more storage space, slower access time.]
Memory hierarchy
Random Access vs Sequential Access...
Registers
[Figure: the CPU contains the ALU and a small set of registers (R1, R2, A); the memory is addressed from 0 to 32767.]
Data registers:
add R1, R2
before: R1 = 10, R2 = 25
after: R1 = 10, R2 = 35
Address registers:
store R1, @A
before: R1 = 77, A = 137
after: Mem[137] = 77
Addressing modes
Register
add R1, R2 // R2 ← R2 + R1
Direct
add R1, M[200] // Mem[200] ← Mem[200] + R1
Indirect
add R1, @A // Mem[A] ← Mem[A] + R1
Immediate
add 73, R1 // R1 ← R1 + 73
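The four modes differ only in where the operand and the destination live. The following sketch models them on a toy machine; the class and method names are hypothetical, and 16-bit overflow is ignored for brevity.

```python
class Machine:
    """Toy machine: a few named registers plus a word-addressed memory."""

    def __init__(self, memory_size: int = 32768):
        self.reg = {"R1": 0, "R2": 0, "A": 0}
        self.mem = [0] * memory_size

    def add_register(self, src: str, dst: str) -> None:
        # add R1, R2  ->  R2 <- R2 + R1
        self.reg[dst] += self.reg[src]

    def add_direct(self, src: str, address: int) -> None:
        # add R1, M[200]  ->  Mem[200] <- Mem[200] + R1
        self.mem[address] += self.reg[src]

    def add_indirect(self, src: str, address_reg: str) -> None:
        # add R1, @A  ->  Mem[A] <- Mem[A] + R1
        self.mem[self.reg[address_reg]] += self.reg[src]

    def add_immediate(self, constant: int, dst: str) -> None:
        # add 73, R1  ->  R1 <- R1 + 73
        self.reg[dst] += constant


m = Machine()
m.reg.update({"R1": 10, "R2": 25, "A": 137})
m.add_register("R1", "R2")   # R2 becomes 35
m.add_direct("R1", 200)      # Mem[200] becomes 10
m.add_indirect("R1", "A")    # Mem[137] becomes 10
m.add_immediate(73, "R1")    # R1 becomes 83
print(m.reg, m.mem[200], m.mem[137])
```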
Flow control
[Figure: a computer system. The memory holds instructions and data as 16-bit words at addresses 0, 1, 2, ..., n, n+1, n+2, ...; the CPU (ALU and registers) is connected to input and output. The current instruction is highlighted.]
How does the language allow us to decide, and specify, which instruction to process next?
Flow control
Example:
101: load R1,0
102: add 1, R1
103: ...
     // do something with R1 value
     ...
156: jmp 102 // goto 102
Symbolic version:
      load R1,0
LOOP: add 1, R1
      ...
      // do something with R1 value
      ...
      jmp LOOP // goto LOOP
Flow control
Example:
jgt R1, 0, CONT // if R1>0 jump to CONT
sub R1, 0, R1 // R1 ← (0 - R1)
CONT:
...
// Do something with positive R1
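The conditional jump above can be traced with a very small interpreter. This is a sketch under stated assumptions: the tuple encoding of instructions, the `negate` operation (standing in for the slide's sub instruction, R1 ← 0 - R1), and the function name `run` are all hypothetical.

```python
PROGRAM = [
    ("jgt", 0, "CONT"),  # if R1 > 0 jump to CONT
    ("negate",),         # R1 <- (0 - R1), standing in for the slide's sub
    # CONT: index 2, where the program ends in this sketch
]
LABELS = {"CONT": 2}


def run(r1: int) -> int:
    """Execute PROGRAM with the given initial R1 and return the final R1."""
    pc = 0
    while pc < len(PROGRAM):
        op, *args = PROGRAM[pc]
        pc += 1  # default flow: continue with the next instruction
        if op == "jgt":
            threshold, label = args
            if r1 > threshold:
                pc = LABELS[label]  # the jump overrides the default flow
        elif op == "negate":
            r1 = 0 - r1
    return r1


print(run(-5), run(7))  # 5 7
```

When R1 is positive the jump skips the negation; otherwise execution falls through and R1 is negated, so the code after CONT sees a non-negative value.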
Agenda
Hack computer: hardware
A 16-bit machine consisting of:
CPU
instruction memory
data memory
[Figure: the CPU receives instructions from the instruction memory and exchanges data (data in, data out) with the data memory.]
Harvard Model?
Computer architecture
[Figure: the memory holds the program and the data; the CPU contains the registers and the ALU and is connected to input and output. Information moves between the parts over a control bus, an address bus, and a data bus.]
Basic CPU loop
Repeat: fetch an instruction from the program memory, then execute it.
Fetching
[Figure: the Program Counter drives the memory's address input; the memory output is the instruction to be executed. The memory holds both the program and the data.]
Executing
Different subsets of the instruction bits control different aspects of the operation.
Fetch – Execute
[Figure: the memory holds the program and the data. An instruction address and a data address travel to the memory over address buses; an instruction and data come back, with the control bus steering the ALU. Data flows are not shown, to minimize clutter.]
Fetch – Execute clash
If the Memory is one address space, this scheme will not work: the memory has a single address input and a single output, but fetching needs the instruction address while executing needs the data address.
[Figure: the instruction address and the data address both compete for the one Memory address input; the one Memory output must carry either the instruction or the data.]
Solution: multiplex
[Figure: a mux, driven by a fetch/execute bit, selects which address reaches the Memory address input: the instruction address when fetching, the data address when executing. The Memory output is accordingly an instruction when fetching and data when executing.]
Solution: multiplex, using an instruction register
[Figure: the mux and fetch/execute bit as before, plus an instruction register that is loaded from the Memory output on fetch. The register keeps the instruction available while the Memory output carries data during execution. Data flows are not shown, to minimize clutter.]
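The multiplexing scheme can be sketched as follows. This is an illustration under simplifying assumptions, not the actual hardware: the memory is a Python list, the data address is passed in directly rather than derived from the instruction, and the names `mux` and `fetch_execute` are hypothetical.

```python
def mux(fetch: bool, instruction_address: int, data_address: int) -> int:
    """Select the address that reaches the memory's single address input."""
    return instruction_address if fetch else data_address


def fetch_execute(memory: list[int], pc: int, data_address: int) -> tuple[int, int]:
    """One fetch step followed by one execute step on a single shared memory."""
    # Fetch: the fetch/execute bit selects the instruction address, and the
    # memory output is loaded into the instruction register.
    instruction_register = memory[mux(True, pc, data_address)]
    # Execute: the bit selects the data address; the instruction is still
    # available because it was saved in the instruction register.
    data = memory[mux(False, pc, data_address)]
    return instruction_register, data


memory = [0] * 8
memory[0] = 0b1011000011000010  # an instruction word
memory[5] = 77                  # a data word
ir, data = fetch_execute(memory, pc=0, data_address=5)
print(f"{ir:016b}", data)
```

The instruction register is what makes the time-sharing workable: without it, the instruction would be lost as soon as the memory output switched to data.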
Why Harvard Architecture?
Two physically separate memory units:
Each can be addressed and manipulated separately, and simultaneously.
Hack computer: software
Hack machine language:
Hack program = sequence of instructions written in the Hack machine language
[Figure: the CPU reads instructions from the ROM and exchanges data (data in, data out) with the RAM.]
Hack computer: control
Control:
reset
Hack computer: registers
The Hack machine language recognizes three 16-bit registers:
M register
A register
D register
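A minimal sketch of how the three registers relate, assuming the usual Hack convention that D and A each hold a 16-bit value while M is not a separate storage cell but a name for the RAM location currently addressed by A. The class name and the RAM size are assumptions made for illustration.

```python
class HackRegisters:
    """Toy model of the A, D, and M registers (RAM size is an assumption)."""

    def __init__(self, ram_size: int = 32768):
        self.A = 0  # address (or data) register
        self.D = 0  # data register
        self.ram = [0] * ram_size

    @property
    def M(self) -> int:
        # M is whatever RAM word A currently points to.
        return self.ram[self.A]

    @M.setter
    def M(self, value: int) -> None:
        self.ram[self.A] = value & 0xFFFF  # keep values within 16 bits


r = HackRegisters()
r.A = 137    # select RAM[137]
r.D = 77
r.M = r.D    # RAM[137] <- 77
print(r.ram[137], r.M)  # 77 77
```

Changing A changes which memory word M refers to, which is why programs on such a machine typically set A before every memory access.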