UNIT-5 CODE GENERATION & CODE OPTIMIZATION
UNIT-5
CODE GENERATION &
CODE OPTIMIZATION
Code Generator
Code generator is used to produce the target code for
three-address statements. It uses registers to store the operands of the three
address statement.
Example:
Consider the three address statement x:= y + z. It can have the
following sequence of codes:
             MOV
x, R0
        
ADD y, R0
Register and Address Descriptors:
- A
     register descriptor contains the track of what is currently in each
     register. The register descriptors show that all the registers are
     initially empty.
- An
     address descriptor is used to store the location where current value of
     the name can be found at run time.
A code-generation algorithm:
The algorithm takes a sequence of three-address statements as
input. For each three address statement of the form a:= b op c perform the
various actions. These are as follows:
- Invoke
     a function getreg to find out the location L where the result of
     computation b op c should be stored.
- Consult
     the address description for y to determine y'. If the value of y currently
     in memory and register both then prefer the register y' . If the value of
     y is not already in L then generate the instruction MOV y' , L to
     place a copy of y in L.
- Generate
     the instruction OP z' , L where z' is used to show the
     current location of z. if z is in both then prefer a register to a memory
     location. Update the address descriptor of x to indicate that x is in
     location L. If x is in L then update its descriptor and remove x from all
     other descriptor.
- If
     the current value of y or z have no next uses or not live on exit from the
     block or in register then alter the register descriptor to indicate that
     after execution of x : = y op z those register will no longer contain y or
     z.
Generating Code for Assignment Statements:
The assignment statement d:= (a-b) + (a-c) + (a-c) can be
translated into the following sequence of three address code:
1.    
t:= a-b  
2.    
        u:= a-c  
3.    
        v:= t +u   
4.    
        d:= v+u  
Code sequence for the example is as follows:
| Code Generated | Register descriptor | Address descriptor | |
| t:= a - b | MOV a, R0 | R0 contains t | t in R0 | 
| u:= a - c | MOV a, R1 | R0 contains t | t in R0 | 
| v:= t + u | ADD R1, R0 | R0 contains v | u in R1 | 
| d:= v + u | ADD R1, R0 | R0 contains d | d in R0 | 
Design
Issues
In the code generation phase, various issues can arises:
- Input
     to the code generator
- Target
     program
- Memory
     management
- Instruction
     selection
- Register
     allocation
- Evaluation
     order
1. Input to the code generator
- The
     input to the code generator contains the intermediate representation of
     the source program and the information of the symbol table. The source
     program is produced by the front end.
- Intermediate
     representation has the several choices:
 a) Postfix notation
 b) Syntax tree
 c) Three address code
- We
     assume front end produces low-level intermediate representation i.e.
     values of names in it can directly manipulated by the machine
     instructions.
- The
     code generation phase needs complete error-free intermediate code as an
     input requires.
2. Target program:
The target program is the output of the code generator. The
output can be:
a) Assembly language: It allows subprogram to
be separately compiled.
b) Relocatable machine language: It makes the
process of code generation easier.
c) Absolute machine language: It can be placed
in a fixed location in memory and can be executed immediately.
3. Memory management
- During
     code generation process the symbol table entries have to be mapped to actual
     p addresses and levels have to be mapped to instruction address.
- Mapping
     name in the source program to address of data is co-operating done by the
     front end and code generator.
- Local
     variables are stack allocation in the activation record while global
     variables are in static area.
4. Instruction selection:
- Nature
     of instruction set of the target machine should be complete and uniform.
- When
     you consider the efficiency of target machine then the instruction speed
     and machine idioms are important factors.
- The
     quality of the generated code can be determined by its speed and size.
Example:
The Three address code is:
1.    
a:= b + c  
2.    
d:= a + e  
Inefficient assembly code is:
1.    
MOV b, R0              R0→b                              
2.    
ADD c, R0   R0      c + R0  
3.    
MOV R0, a               a   →   R0  
4.    
MOV a, R0      R0→  a  
5.    
ADD e, R0               R0  →       e + R0  
6.    
MOV R0, d               d    →  R0  
5. Register allocation
Register can be accessed faster than memory. The instructions
involving operands in register are shorter and faster than those involving in
memory operand.
The following sub problems arise when we use registers:
Register allocation: In register allocation, we select the set of variables
that will reside in register.
Register assignment: In Register assignment, we pick the register that contains
variable.
Certain machine requires even-odd pairs of registers for some
operands and result.
For example:
Consider the following division instruction of the form:
1.    
D x, y  
Where,
x is the dividend
even register in even/odd register pair
y is the divisor
Even register is used to hold the reminder.
Old register is
used to hold the quotient.
6. Evaluation order
The efficiency of the target code can be affected by the order
in which the computations are performed. Some computation orders need fewer
registers to hold results of intermediate than others.
Target
Machine
- The
     target computer is a type of byte-addressable machine. It has 4 bytes to a
     word.
- The
     target machine has n general purpose registers, R0, R1,...., Rn-1. It also
     has two-address instructions of the form
1.    
op source, destination  
Where,
op is used as an op-code and source and destination are used as a data field.
- It has the following op-codes:
 ADD (add source to destination)
 SUB (subtract source from destination)
 MOV (move source to destination)
- The
     source and destination of an instruction can be specified by the
     combination of registers and memory location with address modes.
| MODE | FORM | ADDRESS | EXAMPLE | ADDED COST | 
| absolute | M | M | Add R0, R1 | 1 | 
| register | R | R | Add temp, R1 | 0 | 
| indexed | c(R) | C+ contents(R) | ADD 100 (R2), R1 | 1 | 
| indirect register | *R | contents(R) | ADD * 100 | 0 | 
| indirect indexed | *c(R) | contents(c+
  contents(R)) | (R2), R1 | 1 | 
| literal | #c | c | ADD #3, R1 | 1 | 
- Here,
     cost 1 means that it occupies only one word of memory.
- Each
     instruction has a cost of 1 plus added costs for the source and
     destination.
- Instruction
     cost = 1 + cost is used for source and destination mode.
Example:
1. Move register to memory R0 → M
1.    
MOV R0, M  
2.    
cost = 1+1+1    (since address of memory location M is in word following the instruction)  
2. Indirect indexed mode:
1.    
MOV * 4(R0), M  
2.    
     cost = 1+1+1   (since one word for memory location M, one word  
3.    
for result of *4(R0) and one for instruction)  
3. Literal Mode:
1.    
MOV #1, R0  
2.    
cost = 1+1+1 = 3   (one word for constant 1 and one for instruction)
RUN-TIME STORAGE MANAGEMENT
The information which
required during an execution of a procedure is kept in a block of storage
called an activation record. The activation record includes storage for names
local to the procedure.
We can describe
address in the target code using the following ways:
- Static allocation
- Stack allocation
In static allocation,
the position of an activation record is fixed in memory at compile time.
In the stack
allocation, for each execution of a procedure a new activation record is pushed
onto the stack. When the activation ends then the record is popped.
For the run-time
allocation and deallocation of activation records the following three-address
statements are associated:
- Call
- Return
- Halt
- Action, a placeholder for other
     statements
We assume that the
run-time memory is divided into areas for:
- Code
- Static data
- Stack
Static allocation:
1. Implementation of call statement:
The following code is
needed to implement static allocation:
1.     MOV #here + 20, callee.static_area     /*it saves return address*/</p>  
2.     GOTO callee.code_area     /* It transfers control to the target code for the called procedure*/  
Where,
callee.static_area shows the address of the activation record.
callee.code_area shows the address of the first instruction for called procedure.
#here + 20 literal are used to return address of the instruction
following GOTO.
2. Implementation of return statement:
The following code is
needed to implement return from procedure callee:
1.    
GOTO * callee.static_area  
It is used to transfer
the control to the address that is saved at the beginning of the activation
record.
3. Implementation of action statement:
The ACTION instruction
is used to implement action statement.
4. Implementation of halt statement:
The HALT statement is
the final instruction that is used to return the control to the operating
system.
Stack allocation
Using the relative
address, static allocation can become stack allocation for storage in
activation records.
In stack allocation,
register is used to store the position of activation record so words in
activation records can be accessed as offsets from the value in this register.
The following code is
needed to implement stack allocation:
1. Initialization of stack:
1.     MOV #stackstart , SP    /*initializes stack*/  
2.     HALT                             /*terminate execution*/  
2. Implementation of Call statement:
1.    
ADD #caller.recordsize, SP/* increment stack pointer */   
2.    
MOV #here + 16, *SP                  /*Save return address */   
3.    
GOTO callee.code_area  
Where,
caller.recordsize is the size of the activation record
#here + 16 is the address of the instruction following the GOTO
3. Implementation of Return statement:
1.     GOTO *0 ( SP ) /*return to the caller */   
2.     SUB #caller.recordsize, SP           /*decrement SP and restore to previous value */ 
Basic Block
Basic block contains a
sequence of statement. The flow of control enters at the beginning of the
statement and leave at the end without any halt (except may be the last
instruction of the block).
The following sequence
of three address statements forms a basic block:
1.     t1:= x * x  
2.     t2:= x * y  
3.     t3:= 2 * t2  
4.     t4:= t1 + t3  
5.     t5:= y * y  
6.     t6:= t4 + t5  
Basic block construction:
Algorithm: Partition into basic blocks
Input: It contains the sequence of three address statements
Output: it contains a list of basic blocks with each three address
statement in exactly one block
Method: First identify the leader in the code. The rules for
finding leaders are as follows:
- The first statement is a
     leader.
- Statement L is a leader if
     there is an conditional or unconditional goto statement like: if....goto L
     or goto L
- Instruction L is a leader if it
     immediately follows a goto or conditional goto statement like: if goto B
     or goto B
For each leader, its
basic block consists of the leader and all statement up to. It doesn't include
the next leader or end of the program.
Consider the following
source code for dot product of two vectors a and b of length 10:
1.     begin   
2.                prod :=0;   
3.     i:=1;   
4.     do begin   
5.     prod :=prod+ a[i] * b[i];   
6.     i :=i+1;   
7.     end   
8.     while i <= 10   
9.     end  
The three address code
for the above source program is given below:
B1
1.     (1) prod := 0   
2.     (2) i := 1  
B2
1.     (3) t1 := 4* i   
2.     (4) t2 := a[t1]   
3.     (5) t3 := 4* i   
4.     (6) t4 := b[t3]   
5.     (7) t5 := t2*t4   
6.     (8) t6 := prod+t5   
7.     (9) prod := t6   
8.     (10)    t7 := i+1   
9.     (11)    i := t7   
10.  (12)    if i<=10 goto (3)      
Basic block B1
contains the statement (1) to (2)
Basic block B2
contains the statement (3) to (12)
Flow Graph
Flow graph is a
directed graph. It contains the flow of control information for the set of
basic block.
A control flow graph
is used to depict that how the program control is being parsed among the
blocks. It is useful in the loop optimization.
Flow graph for the
vector dot product is given as follows:

- Block B1 is the initial node.
     Block B2 immediately follows B1, so from B2 to B1 there is an edge.
- The target of jump from last
     statement of B1 is the first statement B2, so from B1 to B2 there is an
     edge.
- B2 is a successor of B1 and B1
     is the predecessor of B2.
Optimization of Basic Blocks:
Optimization process
can be applied on a basic block. While optimization, we don't need to change
the set of expressions computed by the block.
There are two type of
basic block optimization. These are as follows:
- Structure-Preserving
     Transformations
- Algebraic Transformations
1. Structure preserving transformations:
The primary
Structure-Preserving Transformation on basic blocks is as follows:
- Common sub-expression
     elimination
- Dead code elimination
- Renaming of temporary variables
- Interchange of two independent
     adjacent statements
(a) Common sub-expression elimination:
In the common
sub-expression, you don't need to be computed it over and over again. Instead
of this you can compute it once and kept in store from where it's referenced
when encountered again.
1.     a : = b + c  
2.     b : = a - d   
3.     c : = b + c                          
4.     d : = a - d  
In the above
expression, the second and forth expression computed the same expression. So
the block can be transformed as follows:
1.     a : = b + c   
2.     b : = a - d                                                         
3.     c : = b + c  
4.     d : = b  
(b) Dead-code elimination
- It is possible that a program
     contains a large amount of dead code.
- This can be caused when once
     declared and defined once and forget to remove them in this case they
     serve no purpose.
- Suppose the statement x:= y + z
     appears in a block and x is dead symbol that means it will never
     subsequently used. Then without changing the value of the basic block you
     can safely remove this statement.
(c) Renaming temporary variables
A statement t:= b + c
can be changed to u:= b + c where t is a temporary variable and u is a new
temporary variable. All the instance of t can be replaced with the u without
changing the basic block value.
(d) Interchange of statement
Suppose a block has
the following two adjacent statements:
1.     t1 : = b + c   
2.     t2 : = x + y  
These two statements
can be interchanged without affecting the value of block when value of t1 does
not affect the value of t2.
2. Algebraic transformations:
- In the algebraic
     transformation, we can change the set of expression into an algebraically
     equivalent set. Thus the expression x:= x + 0 or x:= x *1 can be
     eliminated from a basic block without changing the set of expression.
- Constant folding is a class of
     related optimization. Here at compile time, we evaluate constant
     expressions and replace the constant expression by their values. Thus the
     expression 5*2.7 would be replaced by13.5.
- Sometimes the unexpected common
     sub expression is generated by the relational operators like <=, >=,
     <, >, +, = etc.
- Sometimes associative
     expression is applied to expose common sub expression without changing the
     basic block value. if the source code has the assignments
1.     a:= b + c  
2.                     e:= c +d +b  
The following
intermediate code may be generated:
1.     a:= b + c  
2.      t:= c +d  
3.      e:= t + b  
Machine-Independent Optimization
- Machine independent
     optimization attempts to improve the intermediate code to get a better
     target code. The part of the code which is transformed here does not
     involve any absolute memory location or any CPU registers.
- The process of intermediate
     code generation introduces much inefficiency like: using variable instead
     of constants, extra copies of variable, repeated evaluation of expression.
     Through the code optimization, you can remove such efficiencies and
     improves code.
- It can change the structure of
     program sometimes of beyond recognition like: unrolls loops, inline
     functions, eliminates some variables that are programmer defined.
Code Optimization can
perform in the following different ways:
(1) Compile Time Evaluation:
(a) z = 5*(45.0/5.0)*r
     Perform 5*(45.0/5.0)*r at compile time.
(b) x = 5.7
     y = x/3.6
    Evaluate x/3.6 as 5.7/3.6 at compile time.
(2) Variable Propagation:
Before Optimization
the code is:
1.     c = a * b                                                
2.     x = a                                                   
3.     till                                                            
4.     d = x * b + 4   
After Optimization the
code is:
1.     c = a * b    
2.     x = a  
3.     till  
4.         d = a * b + 4  
Here, after variable
propagation a*b and x*b identified as common sub expression.
(3) Dead code elimination:
Before elimination the
code is:
1.     c = a * b                                                  
2.     x = b                                                 
3.     till                                                          
4.     d = a * b + 4    
After elimination the
code is:
1.     c = a * b  
2.     till  
3.     d = a * b + 4  
Here, x= b is a dead
state because it will never subsequently used in the program. So, we can
eliminate this state.
(4) Code Motion:
- It reduces the evaluation
     frequency of expression.
- It brings loop invariant
     statements out of the loop.
1.     do  
2.     {  
3.        item = 10;  
4.        valuevalue = value + item;   
5.     } while(value<100);  
6.        
7.        
8.     //This code can be further optimized as  
9.       
10.  item = 10;  
11.  do  
12.  {  
13.     valuevalue = value + item;   
14.  } while(value<100);  
(5) Induction Variable and Strength Reduction:
- Strength reduction is used to
     replace the high strength operator by the low strength.
- An induction variable is used
     in loop for the following kind of assignment like i = i + constant.
Before reduction the
code is:
1.     i = 1;                                                    
2.     while(i<10)                                                
3.     {                                                        
4.         y = i * 4;   
5.     }  
After Reduction the
code is:
1.     i = 1  
2.     t = 4  
3.     {   
4.        while( t<40)   
5.       y = t;   
6.       t = t + 4;  
7.     }  
Loop Optimization
Loop optimization is
most valuable machine-independent optimization because program's inner loop
takes bulk to time of a programmer.
If we decrease the
number of instructions in an inner loop then the running time of a program may
be improved even if we increase the amount of code outside that loop.
For loop optimization
the following three techniques are important:
- Code motion
- Induction-variable elimination
- Strength reduction
1.Code Motion:
Code motion is used to
decrease the amount of code in loop. This transformation takes a statement or
expression which can be moved outside the loop body without affecting the
semantics of the program.
52M
935
Prime Ministers of India | List of Prime Minister of India
(1947-2020)
For example
In the while
statement, the limit-2 equation is a loop invariant equation.
1.     while (i<=limit-2)     /*statement does not change limit*/  
2.     After code motion the result is as follows:  
3.               a= limit-2;  
4.               while(i<=a)    /*statement does not change limit or a*/  
2.Induction-Variable Elimination
Induction variable
elimination is used to replace variable from inner loop.
It can reduce the
number of additions in a loop. It improves both code space and run time
performance.

In this figure, we can
replace the assignment t4:=4*j by t4:=t4-4. The only problem which will be
arose that t4 does not have a value when we enter block B2 for the first time.
So we place a relation t4=4*j on entry to the block B2.
3. Reduction in Strength
- Strength reduction is used to
     replace the expensive operation by the cheaper once on the target machine.
- Addition of a constant is
     cheaper than a multiplication. So we can replace multiplication with an
     addition within the loop.
- Multiplication is cheaper than
     exponentiation. So we can replace exponentiation with multiplication
     within the loop.
Example:
1.     while (i<10)  
2.             {  
3.     j= 3 * i+1;  
4.     a[j]=a[j]-2;  
5.     i=i+2;  
6.             }  
After strength
reduction the code will be:
1.     s= 3*i+1;  
2.           while (i<10)  
3.            {  
4.                 j=s;  
5.                 a[j]= a[j]-2;  
6.                 i=i+2;  
7.                 s=s+6;  
8.            }  
In the above code, it
is cheaper to compute s=s+6 than j=3 *i
DAG representation for basic blocks
A DAG for basic block
is a directed acyclic graph with the following labels on nodes:
- The leaves of graph are labeled
     by unique identifier and that identifier can be variable names or
     constants.
- Interior nodes of the graph is
     labeled by an operator symbol.
- Nodes are also given a sequence
     of identifiers for labels to store the computed value.
- DAGs are a type of data
     structure. It is used to implement transformations on basic blocks.
- DAG provides a good way to
     determine the common sub-expression.
- It gives a picture
     representation of how the value computed by the statement is used in
     subsequent statements.
Algorithm for construction of DAG
Input:It contains a basic block
Output: It contains the following information:
- Each node contains a label. For
     leaves, the label is an identifier.
- Each node contains a list of
     attached identifiers to hold the computed values.
1.     Case (i) x:= y OP z  
2.        Case (ii) x:= OP y  
3.        Case (iii) x:= y  
Method:
Step 1:
82.2K
OTD in Space - Sept 8: Genesis Spacecraft Crash-Lands in Utah
If y operand is
undefined then create node(y).
If z operand is
undefined then for case(i) create node(z).
Step 2:
For case(i), create
node(OP) whose right child is node(z) and left child is node(y).
For case(ii), check
whether there is node(OP) with one child node(y).
For case(iii), node n
will be node(y).
Output:
For node(x) delete x
from the list of identifiers. Append x to attached identifiers list for the
node n found in step 2. Finally set node(x) to n.
Example:
Consider the following
three address statement:
1.     S1:= 4 * i  
2.     S2:= a[S1]  
3.     S3:= 4 * i  
4.     S4:= b[S3]    
5.     S5:= s2 * S4  
6.     S6:= prod + S5  
7.     Prod:= s6  
8.     S7:= i+1  
9.     i := S7  
10.  if i<= 20 goto (1)   
Stages in DAG Construction:
 
  
  
  
  
  
  
 
Global data flow analysis
- To efficiently optimize the
     code compiler collects all the information about the program and
     distribute this information to each block of the flow graph. This process
     is known as data-flow graph analysis.
- Certain optimization can only
     be achieved by examining the entire program. It can't be achieve by
     examining just a portion of the program.
- For this kind of optimization
     user defined chaining is one particular problem.
- Here using the value of the
     variable, we try to find out that which definition of a variable is
     applicable in a statement.
Based on the local
information a compiler can perform some optimizations. For example, consider
the following code:
1.     x = a + b;  
2.        x = 6 * 3  
- In this code, the first
     assignment of x is useless. The value computer for x is never used in the
     program.
- At compile time the expression
     6*3 will be computed, simplifying the second assignment statement to x =
     18;
Some optimization
needs more global information. For example, consider the following code:
1.     a = 1;  
2.        b = 2;  
3.        c = 3;  
4.        if (....) x = a + 5;  
5.        else x = b + 4;  
6.        c = x + 1;  
In this code, at line
3 the initial assignment is useless and x +1 expression can be simplified as 7.
But it is less obvious
that how a compiler can discover these facts by looking only at one or two
consecutive statements. A more global analysis is required so that the compiler
knows the following things at each point in the program:
- Which variables are guaranteed
     to have constant values
- Which variables will be used
     before being redefined
Data flow analysis is
used to discover this kind of property. The data flow analysis can be performed
on the program's control flow graph (CFG).
The control flow graph
of a program is used to determine those parts of a program to which a
particular value assigned to a variable might propagate.
 
 
Comments
Post a Comment