
Compilers

CS143
11:00-12:15 TT
B03 Gates

Administrivia

• Everything is on the class Web site
http://www.stanford.edu/class/cs143/

• Syllabus is on-line, of course
– Assignment dates will not change
– Midterm
  • Thursday, 10/21
  • in class
– Final
  • Wednesday, 12/8
  • 7-10pm

• Communication
– Use newsgroup, email, phone, office hours
– But definitely prefer the newsgroup!

Prof. Aiken CS 143 Lecture 1 1 Prof. Aiken CS 143 Lecture 1 2

Staff

• Instructor
– Alex Aiken

• TAs
– Peter Boonstoppel
– Isil Dillig
– Tom Dillig
– Steven Elia

• Office hours, contact info on 143 web site

Text

• The Purple Dragon Book

• Aho, Lam, Sethi & Ullman

• Not required
– But a useful reference



Course Structure

• Course has theoretical and practical aspects

• Need both in programming languages!

• Written assignments = theory
– Class hand-in

• Programming assignments = practice
– Electronic hand-in

Academic Honesty

• Don’t use work from uncited sources
– Including old code

• We use plagiarism detection software
– Many cases in past offerings

The Course Project

• A big project

• … in 4 easy parts

• Start early!

How are Languages Implemented?

• Two major strategies:
– Interpreters (older)
– Compilers (newer)

• Interpreters run programs “as is”
– Little or no preprocessing

• Compilers do extensive preprocessing


Language Implementations

• Batch compilation systems dominate
– gcc

• Some languages are primarily interpreted
– Java bytecode

• Some environments (Lisp) provide both
– Interpreter for development
– Compiler for production

History of High-Level Languages

• 1954 IBM develops the 704
– Successor to the 701

• Problem
– Software costs exceeded hardware costs!

• All programming done in assembly


The Solution

• Enter “Speedcoding”

• An interpreter

• Ran 10-20 times slower than hand-written assembly

FORTRAN I

• Enter John Backus

• Idea
– Translate high-level code to assembly
– Many thought this impossible
– Had already failed in other projects

FORTRAN I (Cont.)

• 1954-7
– FORTRAN I project

• 1958
– >50% of all software is in FORTRAN

• Development time halved

FORTRAN I

• The first compiler
– Huge impact on computer science

• Led to an enormous body of theoretical work

• Modern compilers preserve the outlines of FORTRAN I


The Structure of a Compiler

1. Lexical Analysis
2. Parsing
3. Semantic Analysis
4. Optimization
5. Code Generation

The first 3, at least, can be understood by analogy to how humans comprehend English.

Lexical Analysis

• First step: recognize words.
– Smallest unit above letters

This is a sentence.


More Lexical Analysis

• Lexical analysis is not trivial. Consider:
ist his ase nte nce

And More Lexical Analysis

• Lexical analyzer divides program text into “words” or “tokens”
If x == y then z = 1; else z = 2;

• Units:
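The split into tokens described on this slide can be sketched in a few lines of C++. This is a minimal illustration, not the course's lexer: the `Kind` categories, the `Token` struct, and the `tokenize` function are hypothetical names, and the sketch assumes lowercase keywords (`if`, `then`, `else`) and only the characters that appear in the slide's example.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token kinds for the slide's toy language.
enum class Kind { Keyword, Identifier, Number, Operator, Punctuation };

struct Token {
    Kind kind;
    std::string text;
};

// Split program text into tokens, skipping whitespace.
// Characters this sketch does not know are skipped; a real lexer would report them.
std::vector<Token> tokenize(const std::string& src) {
    std::vector<Token> out;
    std::size_t i = 0;
    while (i < src.size()) {
        const unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) {
            ++i;
        } else if (std::isalpha(c)) {
            // A word: either a keyword or an identifier.
            std::size_t j = i;
            while (j < src.size() && std::isalnum(static_cast<unsigned char>(src[j]))) ++j;
            const std::string word = src.substr(i, j - i);
            const bool keyword = (word == "if" || word == "then" || word == "else");
            out.push_back(Token{keyword ? Kind::Keyword : Kind::Identifier, word});
            i = j;
        } else if (std::isdigit(c)) {
            std::size_t j = i;
            while (j < src.size() && std::isdigit(static_cast<unsigned char>(src[j]))) ++j;
            out.push_back(Token{Kind::Number, src.substr(i, j - i)});
            i = j;
        } else if (c == '=') {
            // Longest match: prefer "==" over "=" when both are possible.
            const bool twoChars = (i + 1 < src.size() && src[i + 1] == '=');
            out.push_back(Token{Kind::Operator, twoChars ? "==" : "="});
            i += twoChars ? 2 : 1;
        } else if (c == ';') {
            out.push_back(Token{Kind::Punctuation, ";"});
            ++i;
        } else {
            ++i;
        }
    }
    return out;
}
```

Calling `tokenize("if x == y then z = 1; else z = 2;")` yields 14 tokens. The `==` case shows why lexing is not trivial: the lexer has to look one character ahead to decide between comparison and assignment.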

Parsing

• Once words are understood, the next step is to understand sentence structure

• Parsing = Diagramming Sentences
– The diagram is a tree

Diagramming a Sentence

This     line  is    a        longer     sentence
article  noun  verb  article  adjective  noun
subject              object
sentence


Parsing Programs

• Parsing program expressions is the same

• Consider:
If x == y then z = 1; else z = 2;

• Diagrammed:
x == y       z 1          z 2
relation     assign       assign
predicate    then-stmt    else-stmt
if-then-else

Semantic Analysis

• Once sentence structure is understood, we can try to understand “meaning”
– But meaning is too hard for compilers

• Compilers perform limited analysis to catch inconsistencies
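The if-then-else diagram from the Parsing Programs slide can be built mechanically from a token sequence. The sketch below is a hand-written recursive-descent parser for exactly that one statement shape. `Node`, `Parser`, and `show` are hypothetical names, the grammar in the comments is an assumption inferred from the example, and the input is taken as already split into tokens.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// A tree node: a label plus ordered children (leaves have no children).
struct Node {
    std::string label;
    std::vector<Node> kids;
};

class Parser {
public:
    explicit Parser(std::vector<std::string> tokens) : tokens_(std::move(tokens)) {}

    // stmt := "if" relation "then" assign ";" "else" assign ";"
    Node parseIf() {
        expect("if");
        Node predicate = parseBinary("relation", "==");
        expect("then");
        Node thenStmt = parseBinary("assign", "=");
        expect(";");
        expect("else");
        Node elseStmt = parseBinary("assign", "=");
        expect(";");
        return Node{"if-then-else", {predicate, thenStmt, elseStmt}};
    }

private:
    // relation := operand "==" operand ; assign := operand "=" operand
    Node parseBinary(const std::string& label, const std::string& op) {
        Node lhs = operand();
        expect(op);
        Node rhs = operand();
        return Node{label, {lhs, rhs}};
    }

    // Any single token is accepted as an operand in this sketch.
    Node operand() {
        if (pos_ >= tokens_.size()) throw std::runtime_error("unexpected end of input");
        return Node{tokens_[pos_++], {}};
    }

    void expect(const std::string& token) {
        if (pos_ >= tokens_.size() || tokens_[pos_] != token)
            throw std::runtime_error("expected " + token);
        ++pos_;
    }

    std::vector<std::string> tokens_;
    std::size_t pos_ = 0;
};

// Render the tree as a parenthesized expression, root first.
std::string show(const Node& n) {
    if (n.kids.empty()) return n.label;
    std::string s = "(" + n.label;
    for (const Node& k : n.kids) s += " " + show(k);
    return s + ")";
}
```

For the slide's statement (with a lowercase `if`), `show` gives `(if-then-else (relation x y) (assign z 1) (assign z 2))`, which mirrors the diagram: the root is if-then-else and its three children are the predicate, the then-statement, and the else-statement.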

Semantic Analysis in English

• Example:
Jack said Jerry left his assignment at home.
What does “his” refer to? Jack or Jerry?

• Even worse:
Jack said Jack left his assignment at home?
How many Jacks are there?
Which one left the assignment?

Semantic Analysis in Programming

• Programming languages define strict rules to avoid such ambiguities

• This C++ code prints “4”; the inner definition is used

#include <iostream>

int main() {
    int Jack = 3;              // outer definition
    {
        int Jack = 4;          // inner definition hides the outer one
        std::cout << Jack;     // prints 4
    }
}
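One common way a compiler applies the "inner definition is used" rule is a stack of scopes searched from the innermost outward. The sketch below is only an illustration under that assumption. `ScopedSymbols` and its methods are hypothetical names, and each name is bound to an `int` value to keep the example small.

```cpp
#include <map>
#include <string>
#include <vector>

// A stack of scopes; each scope maps a name to its value.
class ScopedSymbols {
public:
    void enterScope() { scopes_.emplace_back(); }

    void exitScope() {
        if (!scopes_.empty()) scopes_.pop_back();
    }

    // Bind a name in the innermost scope (opening one if none exists yet).
    void define(const std::string& name, int value) {
        if (scopes_.empty()) enterScope();
        scopes_.back()[name] = value;
    }

    // Search from the innermost scope outward.
    // Returns nullptr when the name is not bound in any open scope.
    const int* lookup(const std::string& name) const {
        for (auto scope = scopes_.rbegin(); scope != scopes_.rend(); ++scope) {
            auto found = scope->find(name);
            if (found != scope->end()) return &found->second;
        }
        return nullptr;
    }

private:
    std::vector<std::map<std::string, int>> scopes_;
};
```

Replaying the slide's example (define `Jack` as 3, open a scope, define `Jack` as 4) makes `lookup("Jack")` find 4. After `exitScope()` the same lookup finds 3 again, because the outer binding was hidden rather than overwritten.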

More Semantic Analysis

• Compilers perform many semantic checks besides variable bindings

• Example:
Jack left her homework at home.

• A “type mismatch” between her and Jack; we know they are different people
– Presumably Jack is male

Optimization

• No strong counterpart in English, but akin to editing

• Automatically modify programs so that they
– Run faster
– Use less memory
– In general, conserve some resource

• The project has no optimization component

Optimization Example

X = Y * 0 is the same as X = 0

Code Generation

• Produces assembly code (usually)

• A translation into another language
– Analogous to human translation
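The rewrite of X = Y * 0 into X = 0 from the Optimization Example slide can be sketched as a pass over an expression tree. This sketch assumes integer arithmetic and operands without side effects. The rewrite is not valid for IEEE floating point, where NaN * 0 is NaN. `Expr`, `constant`, `var`, `mul`, and `simplify` are hypothetical names chosen for the illustration.

```cpp
#include <memory>
#include <string>

struct Expr;
using ExprPtr = std::shared_ptr<const Expr>;

// Hypothetical integer-only expression tree.
struct Expr {
    enum Kind { Const, Var, Mul };
    Kind kind;
    int value;          // meaningful when kind == Const
    std::string name;   // meaningful when kind == Var
    ExprPtr lhs, rhs;   // meaningful when kind == Mul
};

ExprPtr constant(int v) {
    return std::make_shared<Expr>(Expr{Expr::Const, v, "", nullptr, nullptr});
}

ExprPtr var(const std::string& n) {
    return std::make_shared<Expr>(Expr{Expr::Var, 0, n, nullptr, nullptr});
}

ExprPtr mul(ExprPtr a, ExprPtr b) {
    return std::make_shared<Expr>(Expr{Expr::Mul, 0, "", a, b});
}

static bool isZero(const ExprPtr& e) {
    return e->kind == Expr::Const && e->value == 0;
}

// Replace any product with a zero operand by the constant 0.
// Operands are simplified first, so nested products collapse too.
ExprPtr simplify(const ExprPtr& e) {
    if (e->kind != Expr::Mul) return e;
    ExprPtr l = simplify(e->lhs);
    ExprPtr r = simplify(e->rhs);
    if (isZero(l) || isZero(r)) return constant(0);
    return mul(l, r);
}
```

Here `simplify(mul(var("Y"), constant(0)))` returns a `Const` node holding 0, while `simplify(mul(var("Y"), constant(2)))` stays a `Mul` node. The hedge about floating point matters in practice: an optimizer may only apply a rewrite when it holds for every possible value of the operands.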


Intermediate Languages

• Many compilers perform translations between successive intermediate forms
– All but first and last are intermediate languages internal to the compiler
– Typically there is 1 IL

• ILs generally ordered in descending level of abstraction
– Highest is source
– Lowest is assembly

Intermediate Languages (Cont.)

• ILs are useful because lower levels expose features hidden by higher levels
– registers
– memory layout
– etc.

• But lower levels obscure high-level meaning
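To make "lower levels expose features hidden by higher levels" concrete, the sketch below lowers a nested expression into a flat list of one-operation instructions with explicit temporaries. This three-address form is swapped in as a familiar example of an IL; the slides do not name a particular one. `Tree` and `lower` are hypothetical names, and the temporaries `t1`, `t2`, … stand in for what would later become registers.

```cpp
#include <string>
#include <vector>

// Expression tree: a leaf when op == '\0', otherwise a binary operator node.
struct Tree {
    char op;
    std::string leaf;
    std::vector<Tree> kids;
};

// Emit one instruction per operator into `code` and return the name
// (variable, literal, or temporary) that holds the value of `e`.
std::string lower(const Tree& e, std::vector<std::string>& code, int& nextTemp) {
    if (e.op == '\0') return e.leaf;
    const std::string a = lower(e.kids[0], code, nextTemp);
    const std::string b = lower(e.kids[1], code, nextTemp);
    const std::string t = "t" + std::to_string(nextTemp++);
    code.push_back(t + " = " + a + " " + e.op + " " + b);
    return t;
}
```

Lowering the tree for `x + y * 2` with `nextTemp` starting at 1 emits `t1 = y * 2` followed by `t2 = x + t1`. The evaluation order and the intermediate values are now explicit, but the nesting that made the source expression easy to read is gone, which is the trade-off the slide describes.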

Issues

• Compiling is almost this simple, but there are many pitfalls.

• Example: How are erroneous programs handled?

• Language design has big impact on compiler
– Determines what is easy and hard to compile
– Course theme: many trade-offs in language design

Compilers Today

• The overall structure of almost every compiler adheres to our outline

• The proportions have changed since FORTRAN
– Early: lexing, parsing most complex, expensive
– Today: optimization dominates all other phases, lexing and parsing are cheap