
PERIYAR UNIVERSITY

I-M.Sc., CS

PAPER PRESENTATION





- P.SUGANYA
CONTENT
Compilers
Dynamic Structure of a Compiler
Compiler versus Interpreter
Static Structure of a Compiler
Lexical Analysis
Syntax Analysis or Parsing
Intermediate Code Generation
Code Optimization
Object Code Generation
Compilers
Compilers translate from a source language (typically a high
level language) to a functionally equivalent target language
(typically the machine code of a particular machine or a
machine-independent virtual machine).
Compilers for high level programming languages are among
the larger and more complex pieces of software.
Early compiled languages included Fortran and COBOL
Often built as multi-pass compilers (to facilitate memory reuse)
Compiler development contributed to better programming language design
Early development focused on syntactic analysis and optimization
Commercially, compilers are developed by very large software groups
Current focus is on optimization and smart use of resources for
modern RISC (reduced instruction set computer) architectures.
Dynamic Structure of a Compiler
character stream:   val = 10 * val + i

    lexical analysis (scanning)

token stream (token number, token value):
    1  ident   "val"
    3  assign  -
    2  number  10
    4  times   -
    1  ident   "val"
    5  plus    -
    1  ident   "i"

    syntax analysis (parsing)

syntax tree:   ident = number * ident + ident
               (nodes: Term, Expression, Statement)

Front end (analysis)
Dynamic Structure of a Compiler (cont'd)
syntax tree:   ident = number * ident + ident
               (nodes: Term, Expression, Statement)

    semantic analysis (type checking, ...)

intermediate representation:   syntax tree, symbol table, or three address code (TAC) ...

    optimization
    code generation

machine code:   const 10
                load 1
                mul
                ...

Front end
Back end (synthesis)
Compiler versus Interpreter
Compiler translates to machine code
    source code -> scanner -> parser -> ... -> code generator -> machine code -> loader

Variant: interpretation of intermediate code
    source code -> ... compiler ... -> intermediate code (e.g. Java bytecode) -> VM
    source code is translated into the code of a virtual machine (VM)
    VM interprets the code, simulating the physical machine

Interpreter executes source code "directly"
    source code -> scanner -> parser -> interpretation
    statements in a loop are scanned and parsed again and again
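
The cost of re-scanning and re-parsing a loop body can be sketched with Python's built-in `compile` and `exec`. This is only an illustration under stated assumptions: the function names `run_interpreted` and `run_compiled` are hypothetical, and Python's own bytecode stands in for the "intermediate code" variant described above.

```python
SOURCE = "val = 10 * val + i"  # the statement used in the earlier slides

def run_interpreted(source, iterations):
    """Hand the source text to exec on every pass, so it is parsed each time."""
    env = {"val": 0}
    for i in range(iterations):
        env["i"] = i
        exec(source, env)  # scanning and parsing are repeated here
    return env["val"]

def run_compiled(source, iterations):
    """Translate the source once, then execute the resulting code object."""
    code = compile(source, "<slide-example>", "exec")  # parsed exactly once
    env = {"val": 0}
    for i in range(iterations):
        env["i"] = i
        exec(code, env)  # only execution is repeated
    return env["val"]
```

Both functions compute the same value; the difference is only in how often the translation work is done.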
Static Structure of a Compiler
parser & sem. analysis    "main program", directs the whole compilation
scanner                   provides tokens from the source code
symbol table              maintains information about declared names and types
code generation           generates machine code

Arrow types in the diagram: "uses" and "data flow"
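
As a rough illustration of what "maintains information about declared names and types" can mean in practice, here is a minimal symbol table sketch. The class name `SymbolTable`, its methods, and the use of nested scopes are assumptions made for this example; the slides do not prescribe a particular design.

```python
class SymbolTable:
    """Maps declared names to type names, with nested scopes (an assumption)."""

    def __init__(self):
        self.scopes = [{}]  # innermost scope is the last element

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        self.scopes.pop()

    def declare(self, name, type_name):
        scope = self.scopes[-1]
        if name in scope:
            raise NameError(f"{name!r} is already declared in this scope")
        scope[name] = type_name

    def lookup(self, name):
        # Search from the innermost scope outwards.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        raise NameError(f"{name!r} is not declared")
```

The parser would call `declare` when it sees a declaration such as `int a;` and `lookup` whenever a name is used.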
Lexical Analysis
Stream of characters is grouped into tokens
Examples of tokens are identifiers, reserved words, integers, doubles or
floats, delimiters, operators and special symbols

int a;
a = a + 2;

int    reserved word
a      identifier
;      special symbol
a      identifier
=      operator
a      identifier
+      operator
2      integer constant
;      special symbol
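
The grouping of characters into tokens can be sketched with a regular-expression scanner. This is a minimal sketch, not a complete lexer: the keyword set, the pattern names, and the function `tokenize` are assumptions made for illustration, and multi-character operators such as `<=` are not handled.

```python
import re

KEYWORDS = {"int", "double", "if", "else", "while"}  # assumed reserved words

TOKEN_PATTERN = re.compile(r"""
    (?P<double>\d+\.\d+)
  | (?P<integer>\d+)
  | (?P<word>[A-Za-z_]\w*)
  | (?P<operator>[+\-*/=<>])
  | (?P<special>[;(){},])
  | (?P<space>\s+)
""", re.VERBOSE)

LABELS = {
    "double": "double constant",
    "integer": "integer constant",
    "operator": "operator",
    "special": "special symbol",
}

def tokenize(source):
    """Return a list of (text, category) pairs for the source string."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_PATTERN.match(source, pos)
        if match is None:
            raise SyntaxError(f"invalid character {source[pos]!r} at offset {pos}")
        pos = match.end()
        kind, text = match.lastgroup, match.group()
        if kind == "space":
            continue  # whitespace separates tokens but is not a token itself
        if kind == "word":
            label = "reserved word" if text in KEYWORDS else "identifier"
        else:
            label = LABELS[kind]
        tokens.append((text, label))
    return tokens
```

Calling `tokenize("int a;\na = a + 2;")` yields the same nine tokens, in the same order, as the listing above.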
Syntax Analysis or Parsing
Parsing uses a context-free grammar of valid programming
language structures to find the structure of the input
The result of parsing is usually represented by a syntax tree

Example of grammar rules:
expression -> expression + expression |
              variable | constant
variable   -> identifier
constant   -> intconstant | doubleconstant | ...

Example parse tree (for a = a + 2):

      =
     / \
    a   +
       / \
      a   2
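
One common way to turn a token sequence into such a tree is recursive descent. The sketch below is an illustration under assumptions, not the method of any particular compiler: it takes a plain list of token strings, handles only `identifier = expression` with `+` and `*`, treats `*` as binding tighter than `+`, and represents the tree as nested tuples. The name `parse_statement` is hypothetical.

```python
def parse_statement(tokens):
    """Parse `identifier = expression` into a nested-tuple syntax tree."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        pos += 1
        return token

    def factor():
        token = advance()
        if token.isdigit() or token.isidentifier():
            return token  # leaf: a constant or a variable
        raise SyntaxError(f"unexpected token {token!r}")

    def term():
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def expression():
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    target = advance()
    if not target.isidentifier():
        raise SyntaxError(f"expected an identifier, got {target!r}")
    if advance() != "=":
        raise SyntaxError("expected '='")
    tree = ("=", target, expression())
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree
```

For the tokens `a`, `=`, `a`, `+`, `2` this returns `("=", "a", ("+", "a", "2"))`, which has the same shape as the example parse tree.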
Intermediate Code Generation
An intermediate code representation often helps contain the
complexity of the compiler and expose opportunities for code optimization.
Typical choices include:
Annotated parse trees
Three Address Code (TAC), and abstract machine language
Bytecode, as in Java bytecode.

Example statements:

if (a <= b)

{ a = a - c; }

c = b * c

Resulting TAC:

_t1 = a > b
if _t1 goto L0
_t2 = a - c
a = _t2
L0: _t3 = b * c
c = _t3
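
The way temporaries such as `_t1` arise can be sketched by walking a syntax tree bottom-up. This is a minimal sketch under assumptions: the tree is given as nested tuples `(operator, left, right)` with strings as leaves, only assignments of arithmetic expressions are handled (no branches or labels), and the function name `generate_tac` is hypothetical.

```python
import itertools

def generate_tac(tree):
    """Translate ("=", target, expression) into a list of TAC lines."""
    counter = itertools.count(1)
    code = []

    def emit(node):
        if isinstance(node, str):
            return node  # a variable or constant needs no temporary
        op, left, right = node
        left_name = emit(left)
        right_name = emit(right)
        temp = f"_t{next(counter)}"
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    op, target, value = tree
    if op != "=":
        raise ValueError("expected an assignment at the root of the tree")
    code.append(f"{target} = {emit(value)}")
    return code
```

For the tree `("=", "c", ("*", "b", "c"))` this produces `_t1 = b * c` followed by `c = _t1`, the same pattern as the last two TAC lines above (with a different temporary number).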
Intermediate Code Generation (cont'd)
Example statements:

if (a <= b)

{ a = a - c; }

c = b * c

Postfix/Polish/Stack:

v1 v2 JumpIf(>)
v1 v3 - store(v1)
v2 v3 * store(v3)

Java bytecode (javap -c):

55: iload_1
56: iload_2
57: if_icmpgt 64

60: iload_1
61: iload_3
62: isub
63: istore_1

64: iload_2
65: iload_3
66: imul
67: istore_3
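
How a stack-based virtual machine executes such code can be sketched with a very small interpreter. This is a simplified model, not the JVM: it borrows the mnemonics from the listing above, but jump targets are list indices instead of byte offsets, local variables live in a dictionary, and the names `run` and `PROGRAM` are hypothetical.

```python
def run(program, local_vars):
    """Execute (opcode, operand) pairs using an operand stack."""
    stack = []
    pc = 0  # index of the next instruction
    while pc < len(program):
        opcode, operand = program[pc]
        pc += 1
        if opcode == "iload":
            stack.append(local_vars[operand])
        elif opcode == "istore":
            local_vars[operand] = stack.pop()
        elif opcode == "isub":
            right, left = stack.pop(), stack.pop()
            stack.append(left - right)
        elif opcode == "imul":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        elif opcode == "if_icmpgt":
            right, left = stack.pop(), stack.pop()
            if left > right:
                pc = operand  # jump past the body of the if
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
    return local_vars

# Slots 1, 2, 3 hold a, b, c, mirroring iload_1, iload_2, iload_3.
PROGRAM = [
    ("iload", 1), ("iload", 2), ("if_icmpgt", 7),                # if (a <= b)
    ("iload", 1), ("iload", 3), ("isub", None), ("istore", 1),   # a = a - c
    ("iload", 2), ("iload", 3), ("imul", None), ("istore", 3),   # c = b * c
]
```

With `a = 2`, `b = 5`, `c = 3` the run ends with `a = -1` and `c = 15`; with `a = 9` the subtraction is skipped and only `c` changes.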
Code Optimization
Compiler converts the intermediate representation to another
one that attempts to be smaller and faster.
Typical optimizations:
Suppressing code generation for unreachable segments
Getting rid of unused variables
Eliminating multiplication by 1 and addition of 0
Loop optimization: e.g. moving statements whose result does not
change inside the loop out of the loop
Common sub-expression elimination
. . .
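
Two of the optimizations listed above, eliminating multiplication by 1 and addition of 0, can be sketched as a rewrite over an expression tree, together with folding of constant operands. This is a minimal sketch under assumptions: expressions are nested tuples `(operator, left, right)`, leaves are Python ints or variable-name strings, only `+` and `*` are simplified, and the name `simplify` is hypothetical.

```python
def simplify(node):
    """Return an equivalent, possibly smaller, expression tree."""
    if not isinstance(node, tuple):
        return node  # a constant or a variable name
    op, left, right = node
    left, right = simplify(left), simplify(right)

    # Constant folding: both operands are known at compile time.
    if isinstance(left, int) and isinstance(right, int):
        if op == "+":
            return left + right
        if op == "*":
            return left * right

    # Algebraic identities: x * 1 == x and x + 0 == x.
    if op == "*" and right == 1:
        return left
    if op == "*" and left == 1:
        return right
    if op == "+" and right == 0:
        return left
    if op == "+" and left == 0:
        return right
    return (op, left, right)
```

For example, `simplify(("+", ("*", "x", 1), 0))` returns `"x"`, and `simplify(("*", ("+", 2, 3), "y"))` returns `("*", 5, "y")`.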
Object Code Generation
The target program is generated in the machine language of
the target architecture.
Memory locations are selected for each variable
Instructions are chosen for each operation
Individual tree nodes or TAC statements are translated into a sequence of
machine language instructions that perform the same task
Typical machine language instructions include things like
Load register
Add register to memory location
Store register to memory
. . .
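
The translation of TAC into load, operate, and store instructions can be sketched as follows. The instruction set here is invented for illustration (a single register `R0` and the mnemonics `LOAD`, `ADD`, `SUB`, `MUL`, `STORE`); it does not correspond to a real architecture, and the name `generate_object_code` is hypothetical. Only TAC lines of the form `x = y` and `x = y op z` are handled.

```python
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL"}  # assumed mnemonics

def generate_object_code(tac_lines):
    """Translate simple TAC lines into instructions for a one-register machine."""
    instructions = []
    for line in tac_lines:
        target, expr = [part.strip() for part in line.split("=", 1)]
        parts = expr.split()
        if len(parts) == 1:
            # Plain copy: x = y
            instructions += [f"LOAD R0, {parts[0]}", f"STORE R0, {target}"]
        elif len(parts) == 3 and parts[1] in OPCODES:
            # Binary operation: x = y op z
            left, op, right = parts
            instructions += [
                f"LOAD R0, {left}",
                f"{OPCODES[op]} R0, {right}",
                f"STORE R0, {target}",
            ]
        else:
            raise ValueError(f"unsupported TAC line: {line!r}")
    return instructions
```

For the TAC lines `_t3 = b * c` and `c = _t3`, this yields `LOAD R0, b`, `MUL R0, c`, `STORE R0, _t3`, `LOAD R0, _t3`, `STORE R0, c`. The second load is redundant, which is the kind of inefficiency a later object code optimization pass can remove.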
Object Code Optimization
It is possible to have another code optimization phase that
transforms the object code into more efficient object code.
These optimizations use features of the hardware itself to
make efficient use of processors and registers.
Specialized instructions
Pipelining
Branch prediction
Peephole optimizations
JIT (Just-In-Time) compilation of intermediate code (e.g.
Java bytecode) can discover more context-specific
optimizations not available earlier.
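
A peephole optimization looks at a short window of adjacent instructions and replaces wasteful patterns. The sketch below removes one such pattern: a load from a location immediately after a store of the same register to that location. The instruction syntax (`LOAD R0, x`, `STORE R0, x`) and the name `peephole` are assumptions for illustration, and the sketch assumes no jump targets the removed instruction.

```python
def peephole(instructions):
    """Drop a LOAD that directly follows a STORE of the same register and location."""
    optimized = []
    for instr in instructions:
        if optimized and instr.startswith("LOAD "):
            previous = optimized[-1]
            same_operands = previous[len("STORE "):] == instr[len("LOAD "):]
            if previous.startswith("STORE ") and same_operands:
                continue  # the register already holds this value
        optimized.append(instr)
    return optimized
```

Applied to `LOAD R0, b`, `MUL R0, c`, `STORE R0, _t3`, `LOAD R0, _t3`, `STORE R0, c`, it removes the fourth instruction and leaves the other four unchanged.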
Error Handling
Error handling and reporting also occur across many phases
Lexical analyzer reports invalid character sequences
Syntactic analyzer reports invalid token sequences
Semantic analyzer reports type and scope errors, and the like
The compiler may be able to continue with some errors, but
other errors may stop the process
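
Continuing after an error, rather than stopping at the first one, can be sketched for the lexical phase. This is a minimal sketch under assumptions: the set of valid characters is invented for the example, and the function name `find_lexical_errors` and the message format are hypothetical.

```python
import string

# Assumed alphabet of the source language for this illustration.
VALID_CHARS = set(string.ascii_letters + string.digits + "_+-*/=<>;(){}, \t")

def find_lexical_errors(source):
    """Collect one message per invalid character instead of stopping at the first."""
    errors = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ch not in VALID_CHARS:
                errors.append(
                    f"line {line_no}, column {col_no}: invalid character {ch!r}"
                )
    return errors
```

For the input `"int a;\na = a @ 2 $;"` both invalid characters on the second line are reported in one run, at columns 7 and 11.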
