Code-generation
Chapter 6
Code generation
1.Code generator – introduction
2.Issues in the design of a code generator
3.The Target machine Language
4.Basic blocks and flow graphs
5.Adresseses in the target code
6.Optimization of basic blocks
7.A simple code generator
source front intermediate code intermediate code target
program end code optimizer code generator program
symbol
table
1.Issues in the design of a code generator
1.Input to the code generator
2.Target Program
3.Memory management
4.Instruction selection
5.Register allocation
6.Choices of Evaluation order
1.Input to the code generator
Input to the code generator
1.We assume that prior to code generating the front end has scanned ,parsed, and translated the source program into a reasonably detailed intermediate representation ,
2.Type checking has taken place
3.The code generation phase can therefore be proceed on the assumption that its input is free of errors.
2.Issue:Target program
a.Absolute machine language: It can be placed in a fixed location
in memory and immediately executed
b. Relocatable machine language : Allows subprograms to be
compiled separately. A set of relocatable object modules can be
linked together and loaded for execution by a linking loader.
c. Assembly language: Generates symbolic instructions and use the
macro facilities of the assembler to help generate code
3rd Issue:Memory Management�
4th Issue: Instruction selection
5th Issue:Register Allocation
1.During register allocation we select the set of variables that will reside in register at a point in the program.
2.During a subsequent register assignment phase we pick the specific register that a variable will reside in.
Example 1:
t = a+ b
t = t*c
t = t/d
Optimal machine code sequence
L R1,a R1 contains a
A R1,b R1 contains a+b
M R0,c R0 contains (a+b)*c
D R0,d R0 contains remainder of (a+b)*c/d
ST R1,t R1 contains quotient of (a+b)*c/d and the value of R1 is stored in t.
Example2
t = a+b
t=t+c
t = t/d
Optimal machine code sequence
L R0,a R0 contains a
A R0,b Ro contains a+b
A R0,c R0 contains a+b+c
D R0,d R0 contains the remainder of (a+b+c)/d
6th Issue:Choice of Evaluation order.
The Target Machine (Language)
The Target Machine
The address modes together with assembly language form and associated costs are
MODE FORM ADDRESS ADDED COST
absolute M M 1
register R R 0
indexed c(R) c+contents(R) 1
indirect register *R contents(R) 0
indirect indexed *c(R) contents(c+contents(R) 1
Examples:
1.MOV R0,M
2.MOV 4(R0),M :stores contents (4+contents(R0)) into memory location M
3.MOV *4(R0),M store the value contents(contents(4+contents(R0)))
A final address mode allows the source to be a construct:
MODE FORM CONSTANT ADDED COST
literal #c c 1
Thus the instruction
MOV #1,RO loads the constant 1 into register R0.
Instruction Cost
Examples:
1. MOV R0,R1 cost is 1
2.MOV R5,M cost is 2
3.ADD #1, R3 cost is 2
What is the cost?(a=b+c)
ADD c,R0
MOV R0,a
Ans
What is the cost?
MOV b,a
ADD c,a
Ans
What is the cost�?
MOV *R1,*R0
ADD *R2,*R0
Here R0,R1 and R2 contains the address of a,b and c respectively.
Ans
What is the cost?
ADD R2,R1
MOV R1,a
here R1 and R2 contains the values of b and c respectively.
Conclusion about Target machine
Example 1:
t = a+ b
t = t*c
t = t/d
Optimal machine code sequence
L R1,a R1 contains a
A R1,b R1 contains a+b
M R0,c R0 contains (a+b)*c
D R0,d R0 contains remainder of (a+b)*c/d
ST R1,t R1 contains quotient of (a+b)*c/d and the value of R1 is stored in t.
Example2
t = a+b
t=t+c
t = t/d
Optimal machine code sequence
L R0,a R0 contains a
A R0,b Ro contains a+b
A R0,c R0 contains a+b+c
SRDA R0,32 a+b+c is transferred to R1 and R0 is cleared
D R0,d R0 contains the remainder of (a+b+c)/d
ST R1,t R1 contains the quotient of (a+b+c)/d and it is stored in t.
Basic Blocks and Flow graphs
A basic block is a sequence of consecutive statements in which flow of control enters at the beginning and leaves at the end without halt or possibility of branching except at the end.
t1 = a*a
t2 = a*b
t3= 2*t2
t4= t1+t3
t5=b*b
t6=t4+t5
Algorithm:Partition into basic blocks
Input : A sequence of three -address statements
Output:A list of basic blocks with each three address statement in exactly one block.
Method
i)The first statement is a leader
ii)Any statement that is the target of a conditional or unconditional goto is a leader
iii)Any statement that immediately follows a goto or conditional goto statement is leader.
2.For each leader ,its basic block consists of the leader and all statements up to but not including the next leader or the end of the program.
Example
begin
prod= 0;
i =1;
do begin
prod = prod +a[i]*b[i];
i=i+1;
end
while i<=20
end
Three address code computing dot product
1 prod=0
2 i=1
3 t1=4*i
4. t2=a[t1]
5. t3=4*i
6. t4=b[t3]
7. t5=t2*t4
8 .t6=prod+t5
9. prod=t6
10. t7=i+1
11 i = t7
12 if i<=20 goto(3)
Class work
Ans
prod=0 B1
i=1
t1=4*i
t2=a[t1]
t3=4*i
t4=b[t3]
t5=t2*t4
.t6=prod+t5 B2
prod=t6
t7=I+1
i = t7
if i<=20 goto(3)
Representation of Basic Blocks