The Processor: Single-Cycle Implementation
Chapter 4
Sections: 4.1-4.4, Appendix A, Appendix C.1 and C.2
Outline
Prof. Iyad Jafar
Introduction
Logic Design Conventions
[Figure: block symbols for a two-input gate (A, B, Y), an adder (A, B, Y), a multiplexer (inputs I0 and I1, select S, output Y), an ALU (A, B, function select F, output Y), and a state element (D, Clk, Q)]
Sequential Elements
[Figure: an edge-triggered state element (D, Clk, Q) and a state element with a Write enable (D, Write, Clk, Q), each shown with its timing waveform]
Registers
[Figures: 4-bit register; 4-bit register with load control (clock gating); 4-bit register with load control]
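The behaviour of a register with load control can be sketched in a few lines of Python. This is only a behavioural model under simple assumptions: the class name `Register` and the method `clock_edge` are hypothetical, and each call stands for one active clock edge.

```python
class Register:
    """Behavioural model of an N-bit register with a load (write-enable) input."""

    def __init__(self, width=4):
        self.mask = (1 << width) - 1  # keeps stored values within `width` bits
        self.q = 0                    # current output Q

    def clock_edge(self, d, load):
        """Model one active clock edge: capture D only when load is 1."""
        if load:
            self.q = d & self.mask
        return self.q


reg = Register(width=4)
reg.clock_edge(0b1010, load=1)  # load asserted: the value is captured
reg.clock_edge(0b0101, load=0)  # load deasserted: the old value is held
```

After the two edges, `reg.q` still holds `0b1010`, because the second edge arrived with the load input deasserted.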
Clocking Methodology
[Figure: combinational logic placed between two state elements driven by the same clock; signals must propagate from one state element to the next within one clock cycle]
Building Single-Cycle Datapath
Single-Cycle Implementation
Remember
Fetch → Decode → Execute
Fetch Datapath
[Figure: the PC drives the Read Address input of the Instruction Memory, whose Data output is the Instruction; an adder computes PC + 4]
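The fetch step can be sketched as a small Python function. The sketch assumes a byte-addressed PC, 32-bit instructions stored one per list entry, and the hypothetical names `fetch` and `imem`; the two instruction words are example values only.

```python
def fetch(pc, instruction_memory):
    """Return (instruction, next sequential PC) for a byte-addressed PC."""
    instruction = instruction_memory[pc // 4]  # one 32-bit word per 4 bytes
    return instruction, pc + 4                 # the adder computes PC + 4


imem = [0x00500093, 0x00208133]  # two example instruction words
instr, next_pc = fetch(0, imem)
```

Calling `fetch` again with `next_pc` returns the second word, which mirrors how the PC register is updated once per clock cycle.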
Decode Datapath
[Figure: fields of the Instruction drive the Read Addr 1, Read Addr 2, and Write Addr inputs of the Register File and the Control Unit; the Register File outputs Read Data 1 = R[rs1] and Read Data 2 = R[rs2], and has a Write Data input]
Inside the Register File
[Figure: read ports. The outputs of Register 0 through Register 31 feed inputs 0 to 31 of two 32-to-1 MUXes; rs1 selects the first MUX output, R[rs1], and rs2 selects the second, R[rs2]]
[Figure: write port. A 5-to-32 Decoder on rd, combined with RegWrite and Clock, drives the C input of exactly one of Register 0 through Register 31; Write Data is connected to the D input of every register]
[Figure: complete register file combining the write port (5-to-32 Decoder on rd, RegWrite, Clock, Write Data) with the two read ports (32-to-1 MUXes selected by rs1 and rs2, producing R[rs1] and R[rs2])]
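A behavioural sketch of the register file may help connect the decoder and multiplexers to what they do. The class and method names below are hypothetical, and the model deliberately ignores details the slides do not show, such as any special treatment of register 0.

```python
class RegisterFile:
    """32 registers, two read ports, one write port gated by RegWrite."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, rs1, rs2):
        """Read ports: each index acts like the select input of a 32-to-1 MUX."""
        return self.regs[rs1], self.regs[rs2]

    def clock_edge(self, rd, write_data, reg_write):
        """Write port: a 5-to-32 decoder enables at most one register."""
        enables = [bool(reg_write) and i == rd for i in range(32)]
        for i, enable in enumerate(enables):
            if enable:
                self.regs[i] = write_data & 0xFFFFFFFF


rf = RegisterFile()
rf.clock_edge(rd=5, write_data=42, reg_write=1)  # written
rf.clock_edge(rd=6, write_data=7, reg_write=0)   # ignored: RegWrite is 0
values = rf.read(5, 6)
```

Reads are combinational in this model, while writes only happen inside `clock_edge`, matching the clocking methodology described earlier.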
Execute Datapath for R-Type
R[rd] ← R[rs1] op R[rs2]
[Figure: the Instruction supplies Read Addr 1, Read Addr 2, and Write Addr (rd) to the Register File; Read Data 1 = R[rs1] and Read Data 2 = R[rs2] feed the ALU; the ALU result returns to Write Data; control signals are RegWrite and ALU Operation]
Not all instructions write to the register file, so a RegWrite signal is needed.
Instructions use the ALU differently, so an ALU Operation signal is needed.
Execute Datapath for LW
R[rd] ← M[R[rs1] + sign_ext(offset)]
[Figure: the R-type datapath extended with an Imm Gen unit and a Data Memory; the ALU adds R[rs1] and imm to form the Address, and the Data Memory output returns to the Register File Write Data input for rd; control signals are RegWrite, ALU Operation, and MemRead]
Not all instructions read from memory, so a MemRead signal is needed.
Execution Datapath for SW
M[R[rs1] + sign_ext(offset)] ← R[rs2]
[Figure: the load datapath with R[rs2] connected to the Data Memory Write Data input; the ALU adds R[rs1] and imm from Imm Gen to form the Address; control signals are RegWrite, ALU Operation, MemRead, and MemWrite]
Not all instructions write to memory, so a MemWrite signal is needed.
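The load and store transfers share the same address calculation, R[rs1] + sign_ext(offset). The following Python sketch illustrates that, assuming a 12-bit offset, a dictionary standing in for data memory, and hypothetical function names; it is not a model of the timing, only of the register transfers.

```python
def sign_extend(value, bits=12):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def effective_address(regs, rs1, offset):
    """ALU addition used by both lw and sw."""
    return (regs[rs1] + sign_extend(offset)) & 0xFFFFFFFF


def lw(regs, mem, rd, rs1, offset):
    regs[rd] = mem.get(effective_address(regs, rs1, offset), 0)


def sw(regs, mem, rs1, rs2, offset):
    mem[effective_address(regs, rs1, offset)] = regs[rs2]


regs = [0] * 32
mem = {}
regs[1], regs[2] = 100, 0xABCD
sw(regs, mem, rs1=1, rs2=2, offset=0xFFC)  # 0xFFC encodes an offset of -4
lw(regs, mem, rd=3, rs1=1, offset=0xFFC)
```

Here the store writes to address 96 (100 minus 4) and the load reads the same location back into register 3.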
Execution Datapath for BEQ
if (R[rs1] == R[rs2])
    PC ← PC + sign_ext(imm) × 2
else
    PC ← PC + 4
[Figure: the ALU compares R[rs1] and R[rs2] and produces the Zero output; one adder computes PC + 4 and a second adder computes the Branch Address from the PC and the Imm Gen output; a multiplexer (inputs 0 and 1) controlled by Branch and Zero selects the next PC]
Only beq loads the PC with the branch address, and only when the Zero signal is 1, so a Branch signal is needed.
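The next-PC selection for beq can be sketched as follows. The function name `next_pc` is hypothetical, `imm` is taken as an already sign-extended Python integer, and the Zero output is modelled as the result of a 32-bit subtraction being zero.

```python
def next_pc(pc, rs1_val, rs2_val, imm, branch):
    """Select between PC + 4 and the branch address."""
    zero = int(((rs1_val - rs2_val) & 0xFFFFFFFF) == 0)  # ALU Zero output
    branch_address = pc + imm * 2                        # PC + sign_ext(imm) x 2
    take = branch & zero                                 # multiplexer select
    return branch_address if take else pc + 4


taken = next_pc(16, 3, 3, imm=-4, branch=1)      # operands equal, branch taken
not_taken = next_pc(16, 3, 4, imm=-4, branch=1)  # operands differ
not_branch = next_pc(16, 3, 3, imm=-4, branch=0) # Branch deasserted
```

Only the first call selects the branch address; the other two fall through to PC + 4.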
Combining Datapaths for Single-Cycle
Immediate Generation Unit
[Figure: simplified immediate generator with a 2-bit input B1 B0 and a 4-bit output Out3 to Out0; each output comes from a multiplexer whose select is produced by Selection Logic driven by the Opcode]
Oper. | Out3 | Out2 | Out1 | Out0 |
(1) | B1 | B1 | B1 | B0 |
(2) | B1 | B1 | B0 | 0 |
How about generating an immediate for lui?
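The two operations of the simplified immediate generator can be sketched in Python. Reading operation (1) as sign extension and operation (2) as sign extension followed by a one-bit left shift is an interpretation of the output table, and the function name `imm_gen` is hypothetical.

```python
def imm_gen(b1, b0, oper):
    """Return (Out3, Out2, Out1, Out0) for the 2-bit input B1 B0."""
    if oper == 1:   # operation (1): replicate the sign bit B1
        return (b1, b1, b1, b0)
    if oper == 2:   # operation (2): sign-extend, then shift left by one
        return (b1, b1, b0, 0)
    raise ValueError("unknown operation")


extended = imm_gen(1, 0, oper=1)  # input 10 (-2) becomes 1110 (-2)
shifted = imm_gen(1, 1, oper=2)   # input 11 (-1) becomes 1110 (-2)
```

In hardware, the `if` statements correspond to the multiplexer selects produced by the selection logic from the opcode.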
Building Single-Cycle Control Unit
Building Control Unit
Multi-level Decoding and Control
[Figure: the ALU Control block takes funct3 (3 bits) and funct7 (7 bits) and drives the ALU]
Main Control Unit
[Figure: the Main Control block takes the 7-bit Opcode and produces ALUOp (2 bits), MemWrite, MemToReg, Branch, MemRead, ALUSrc, and RegWrite]
Datapath with Control
Control Signals for R-Type
[Figure: datapath annotated with the control signal values (0, 1, or X) for R-type instructions]
ALUOp values from main control:
Instruction | ALUOp1 | ALUOp0 |
R-Type | 1 | 0 |
LW | 0 | 0 |
SW | 0 | 0 |
BEQ | 0 | 1 |
Control Signals for LW
[Figure: datapath annotated with the control signal values (0, 1, or X) for LW]
Control Signals for SW
[Figure: datapath annotated with the control signal values (0, 1, or X) for SW]
Control Signals for BEQ
[Figure: datapath annotated with the control signal values (0, 1, or X) for BEQ]
Summary of Main Control Signals
Signal Name | Effect when Deasserted (0) | Effect when Asserted (1) |
Branch | Instruction is not a branch | Instruction is a branch |
MemRead | None | Contents of the addressed memory location are put on the memory data output |
MemtoReg | Data written to the register file comes from the ALU | Data written to the register file comes from memory |
ALUOp | Two-bit code used with the function fields (funct3 and funct7) to generate the ALU Operation signal that specifies the ALU operation | |
MemWrite | None | Data on the memory data input is stored at the specified address |
ALUSrc | The second ALU operand comes from R[rs2] | The second ALU operand is the sign-extended offset |
RegWrite | None | Enables writing to the register file |
Values of Control Signals
Inputs (Op6 to Op0) and outputs (Branch to ALUOp0):
Instruction | Op6 | Op5 | Op4 | Op3 | Op2 | Op1 | Op0 | Branch | MemRead | MemWrite | RegWrite | MemToReg | ALUSrc | ALUOp1 | ALUOp0 |
R-type | 0 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 |
LW | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 1 | 1 | 1 | 0 | 0 |
SW | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | X | 1 | 0 | 0 |
BEQ | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | X | 0 | 0 | 1 |
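The control-signal table can be read as a lookup keyed by the 7-bit opcode. The sketch below encodes the values listed for the four instruction classes; the names `CONTROL` and `main_control` are hypothetical, the two ALUOp bits are packed into one number, and don't-care (X) entries are represented as `None`.

```python
# Keys are the 7-bit opcodes; values follow the control-signal table.
CONTROL = {
    0b0110011: dict(Branch=0, MemRead=0, MemWrite=0, RegWrite=1,
                    MemToReg=0, ALUSrc=0, ALUOp=0b10),     # R-type
    0b0000011: dict(Branch=0, MemRead=1, MemWrite=0, RegWrite=1,
                    MemToReg=1, ALUSrc=1, ALUOp=0b00),     # LW
    0b0100011: dict(Branch=0, MemRead=0, MemWrite=1, RegWrite=0,
                    MemToReg=None, ALUSrc=1, ALUOp=0b00),  # SW (MemToReg = X)
    0b1100011: dict(Branch=1, MemRead=0, MemWrite=0, RegWrite=0,
                    MemToReg=None, ALUSrc=0, ALUOp=0b01),  # BEQ (MemToReg = X)
}


def main_control(instruction):
    """Decode the opcode in bits 6..0 of a 32-bit instruction word."""
    return CONTROL[instruction & 0x7F]


lw_signals = main_control(0xFFC0A183)  # example word whose opcode is 0000011
```

An opcode outside the table raises `KeyError` in this sketch; real hardware would instead produce whatever the logic equations yield for that input.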
Main Control Unit
[Figure: gate-level main control. Inputs I[6] to I[0] are decoded into the R-format, lw, sw, and beq lines, which are combined to produce ALUSrc, MemtoReg, RegWrite, MemRead, MemWrite, Branch, ALUOp1, and ALUOp0]
Opcode:
Instruction | I[6] | I[5] | I[4] | I[3] | I[2] | I[1] | I[0] |
R-type | 0 | 1 | 1 | 0 | 0 | 1 | 1 |
LW | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
SW | 0 | 1 | 0 | 0 | 0 | 1 | 1 |
BEQ | 1 | 1 | 0 | 0 | 0 | 1 | 1 |
ALU Control Unit
Function | Ainv | Bnegate | Op [1:0] |
and | 0 | 0 | 00 |
or | 0 | 0 | 01 |
add | 0 | 0 | 10 |
sub | 0 | 1 | 10 |
[Figure: the ALU Control block takes funct3 (3 bits), funct7 (7 bits), and ALUOp (2 bits) and produces Ainv (1 bit), Bnegate (1 bit), and Op (2 bits)]
ALU Control Unit
Inputs: ALUOp, funct7 = I[31:25], funct3 = I[14:12]. Outputs: Ainv, Bnegate, Op[1:0].
Instruction | ALUOp1 | ALUOp0 | I[31] | I[30] | I[29] | I[28] | I[27] | I[26] | I[25] | I[14] | I[13] | I[12] | Ainv | Bnegate | Op[1] | Op[0] |
LW | 0 | 0 | x | x | x | x | x | x | x | x | x | x | 0 | 0 | 1 | 0 |
SW | 0 | 0 | x | x | x | x | x | x | x | x | x | x | 0 | 0 | 1 | 0 |
BEQ | 0 | 1 | x | x | x | x | x | x | x | x | x | x | 0 | 1 | 1 | 0 |
ADD | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
SUB | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 |
AND | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 |
OR | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 1 |
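The second level of decoding can be sketched as a function from ALUOp and the function fields to the ALU control lines. The function name `alu_control` and the tuple layout (Ainv, Bnegate, Op) are assumptions made for illustration; the values follow the ALU control truth table for LW, SW, BEQ, ADD, SUB, AND, and OR.

```python
def alu_control(alu_op, funct7, funct3):
    """Return (Ainv, Bnegate, Op) where Op is the 2-bit operation select."""
    if alu_op == 0b00:           # LW and SW: add to form the address
        return (0, 0, 0b10)
    if alu_op == 0b01:           # BEQ: subtract to compare
        return (0, 1, 0b10)
    r_type = {                   # ALUOp = 10: decode funct7 and funct3
        (0b0000000, 0b000): (0, 0, 0b10),  # add
        (0b0100000, 0b000): (0, 1, 0b10),  # sub
        (0b0000000, 0b111): (0, 0, 0b00),  # and
        (0b0000000, 0b110): (0, 0, 0b01),  # or
    }
    return r_type[(funct7, funct3)]


sub_lines = alu_control(0b10, 0b0100000, 0b000)
```

Because the function fields are don't-cares for ALUOp 00 and 01, the first two branches ignore `funct7` and `funct3` entirely.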
Exercise 1
addi rd, rs1, immed
Exercise
Inputs (Op6 to Op0) and outputs (Branch to ALUOp0); the ADDI row is left to be completed:
Instruction | Op6 | Op5 | Op4 | Op3 | Op2 | Op1 | Op0 | Branch | MemRead | MemWrite | RegWrite | MemToReg | ALUSrc | ALUOp1 | ALUOp0 |
R-type | 0 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 |
LW | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 1 | 1 | 1 | 0 | 0 |
SW | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | X | 1 | 0 | 0 |
BEQ | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | X | 0 | 0 | 1 |
ADDI | 0 | 0 | 1 | 0 | 0 | 1 | 1 | | | | | | | | |
Exercise 2
jal rd, immed
Inputs (Op6 to Op0) and outputs (Branch to ALUOp0); the JAL row is left to be completed:
Instruction | Op6 | Op5 | Op4 | Op3 | Op2 | Op1 | Op0 | Branch | MemRead | MemWrite | RegWrite | MemToReg | ALUSrc | ALUOp1 | ALUOp0 |
R-type | 0 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 |
LW | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 1 | 1 | 1 | 0 | 0 |
SW | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | X | 1 | 0 | 0 |
BEQ | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | X | 0 | 0 | 1 |
ADDI | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 |
JAL | 1 | 1 | 0 | 1 | 1 | 1 | 1 | | | | | | | | |
Single-Cycle Performance
Performance Analysis
Unit | Delay |
ALU | 2 ns |
Memory | 2 ns |
Register File | 1 ns |

Instruction | Units used in order | Total |
R-Type | Fetch, Register, ALU, Register | 6 ns |
Load | Fetch, Register, ALU, Read Mem., Register | 8 ns |
Store | Fetch, Register, ALU, Write Mem. | 7 ns |
Branch | Fetch, Register, ALU | 5 ns |
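The per-instruction totals follow from adding the delays of the units each instruction uses. The short Python sketch below reproduces them; the dictionary names are hypothetical, and instruction fetch is assumed to cost one memory access (2 ns).

```python
# Unit delays in ns; Fetch, Read Mem., and Write Mem. are memory accesses.
DELAY_NS = {"Fetch": 2, "Register": 1, "ALU": 2, "Read Mem.": 2, "Write Mem.": 2}

STAGES = {
    "R-Type": ["Fetch", "Register", "ALU", "Register"],
    "Load":   ["Fetch", "Register", "ALU", "Read Mem.", "Register"],
    "Store":  ["Fetch", "Register", "ALU", "Write Mem."],
    "Branch": ["Fetch", "Register", "ALU"],
}

latency = {name: sum(DELAY_NS[s] for s in stages) for name, stages in STAGES.items()}

# In a single-cycle design, the clock period must cover the slowest instruction.
clock_period = max(latency.values())
```

With these delays the load determines the clock period, so every other instruction finishes before the end of its cycle.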
[Figure: timing diagram against the Clock. LW occupies all of Cycle 1; SW finishes before the end of Cycle 2, and the rest of that cycle is wasted]
Example
Use the information given in the tables to compare the performance of the two processors on a program that has the given instruction mix.
Unit | Time (ps) |
Memory | 200 |
ALU and adders | 100 |
Register File | 50 |

Instruction | % |
R-Type | 45 |
Load | 25 |
Store | 10 |
Branch | 15 |
Time used by each instruction class (ps), from the unit times:
Instruction | IM | Reg | ALU | DM | Reg | Total Time |
R-type | 200 | 50 | 100 | 0 | 50 | 400 |
Load | 200 | 50 | 100 | 200 | 50 | 600 |
Store | 200 | 50 | 100 | 200 | 0 | 550 |
Branch | 200 | 50 | 100 | 0 | 0 | 350 |
Processor B is 1.37 times faster.
So an adaptive clock cycle is faster; however, it is hard to implement.
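The 1.37 figure can be reproduced from the unit times and the instruction mix. The sketch below assumes that Processor A uses one fixed clock cycle long enough for the slowest instruction and that Processor B uses an ideal adaptive cycle matched to each instruction; these roles, and all names in the code, are assumptions for illustration. The mix is used exactly as listed, without renormalizing the percentages.

```python
UNIT_PS = {"IM": 200, "Reg": 50, "ALU": 100, "DM": 200}

USES = {
    "R-type": ["IM", "Reg", "ALU", "Reg"],
    "Load":   ["IM", "Reg", "ALU", "DM", "Reg"],
    "Store":  ["IM", "Reg", "ALU", "DM"],
    "Branch": ["IM", "Reg", "ALU"],
}

MIX_PERCENT = {"R-type": 45, "Load": 25, "Store": 10, "Branch": 15}

time_ps = {name: sum(UNIT_PS[u] for u in units) for name, units in USES.items()}

fixed_cycle_ps = max(time_ps.values())  # assumed Processor A
adaptive_avg_ps = sum(MIX_PERCENT[n] * time_ps[n] for n in time_ps) / 100  # assumed Processor B
speedup = fixed_cycle_ps / adaptive_avg_ps
```

Under these assumptions the fixed cycle is 600 ps, the mix-weighted adaptive time is 437.5 ps, and their ratio rounds to 1.37.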
Single-Cycle Summary
Suggested Problems