
LR-Grammars

LR(0), LR(1), and LR(k)


Deterministic Context-Free Languages (DCFL)
A family of languages that are accepted
by a Deterministic Pushdown
Automaton (DPDA)
Many programming languages can be
described by means of DCFLs
Prefix and Proper Prefix
Prefix (of a string)
Any number of leading symbols of that
string
Example: abc
Prefixes: ε, a, ab, abc
Proper Prefix (of a string)
A prefix of a string, but not the string itself
Example: abc
Proper prefixes: ε, a, ab
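The two definitions can be checked mechanically. The following is a minimal sketch; the function names are illustrative and the empty string '' plays the role of ε.

```python
def prefixes(s):
    """Return every prefix of s, from the empty string up to s itself."""
    return [s[:i] for i in range(len(s) + 1)]


def proper_prefixes(s):
    """Return every prefix of s except s itself."""
    return prefixes(s)[:-1]


print(prefixes("abc"))         # ['', 'a', 'ab', 'abc']
print(proper_prefixes("abc"))  # ['', 'a', 'ab']
```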
Prefix Property
A Context-Free Language (CFL) L is said
to have the prefix property if, whenever w
is in L, no proper prefix of w is in L
Not considered a severe restriction
Why?
Because we can easily convert a DCFL to a
DCFL with the prefix property by introducing an
endmarker
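The endmarker trick can be illustrated on a small finite sample of strings. This is only a sketch: real DCFLs are usually infinite, the sample language is made up, and the marker "$" is assumed not to occur in the language's alphabet.

```python
def has_prefix_property(language):
    """True if no string in the finite sample is a proper prefix of another."""
    words = set(language)
    return not any(u != w and w.startswith(u) for u in words for w in words)


def add_endmarker(language, marker="$"):
    """Append the endmarker to every string (marker assumed unused elsewhere)."""
    return {w + marker for w in language}


sample = {"a", "ab", "abb"}                        # hypothetical sample language
print(has_prefix_property(sample))                 # False: "a" is a prefix of "ab"
print(has_prefix_property(add_endmarker(sample)))  # True
```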
Suffix and Proper Suffix
Suffix (of a string)
Any number of trailing symbols
Proper Suffix
A suffix of a string, but not the string itself
Example Grammar
This is the grammar that will be used in
many of the examples:
S' → Sc
S → SA | A
A → aSb | ab
(S' is the start symbol)
LR-Grammar
Left-to-right scan of the input producing
a rightmost derivation
Simply:
L stands for Left-to-right
R stands for rightmost derivation
LR-Items
An item (for a given CFG)
A production with a dot anywhere in the
right side (including the beginning and end)
In the event of an ε-production B → ε:
B → · is an item
Example: Items
Given our example grammar:
S' → Sc, S → SA | A, A → aSb | ab
The items for the grammar are:

S' → ·Sc, S' → S·c, S' → Sc·
S → ·SA, S → S·A, S → SA·, S → ·A, S → A·
A → ·aSb, A → a·Sb, A → aS·b, A → aSb·, A → ·ab, A → a·b, A → ab·
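The item list can be generated by placing a dot at every position of every right side. The sketch below assumes a representation chosen for illustration: a production is a (head, body) pair and an item is a (head, body, dot) triple.

```python
# Example grammar; "S'" is the start symbol.
GRAMMAR = [
    ("S'", ("S", "c")),
    ("S", ("S", "A")),
    ("S", ("A",)),
    ("A", ("a", "S", "b")),
    ("A", ("a", "b")),
]


def items(grammar):
    """Return (head, body, dot) for every dot position 0..len(body)."""
    return [(head, body, dot)
            for head, body in grammar
            for dot in range(len(body) + 1)]


def show(item):
    """Render an item with the dot placed inside the right side."""
    head, body, dot = item
    return f"{head} -> {''.join(body[:dot])}·{''.join(body[dot:])}"


all_items = items(GRAMMAR)
print(len(all_items))  # 15
for item in all_items:
    print(show(item))
```

An ε-production would have an empty body and so contribute exactly one item, with the dot at position 0.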
Some Notation
⇒* = zero or more steps in a derivation
⇒*_rm = zero or more steps in a rightmost derivation
⇒_rm = a single step in a rightmost derivation
Right-Sentential Form
A sentential form that can be derived by
a rightmost derivation
A string of terminals and variables α is
called a sentential form if S ⇒* α
More terms
Handle
A substring which matches the right-hand side of a
production and represents 1 step in the derivation
Or more formally:
A handle of a right-sentential form γ for CFG G
is a substring β such that:
S ⇒*_rm δAw ⇒_rm δβw
δβw = γ
If the grammar is unambiguous and has no useless symbols:
The rightmost derivation of each right-sentential
form, and therefore its handle, is unique
Example
Given our example grammar:
S' → Sc, S → SA | A, A → aSb | ab
An example rightmost derivation:
S' ⇒_rm Sc ⇒_rm SAc ⇒_rm SaSbc
Therefore we can say that SaSbc is a
right-sentential form
Its handle is aSb
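The derivation in this example can be replayed step by step. The following sketch assumes sentential forms are tuples of symbols; the helper name rightmost_step is illustrative. The body substituted in the last step is the handle of the final form.

```python
VARIABLES = {"S'", "S", "A"}  # variables of the example grammar


def rightmost_step(form, head, body):
    """Replace the rightmost variable of `form` (which must be `head`) by `body`."""
    for i in range(len(form) - 1, -1, -1):
        if form[i] in VARIABLES:
            if form[i] != head:
                raise ValueError(f"rightmost variable is {form[i]}, not {head}")
            return form[:i] + body + form[i + 1:]
    raise ValueError("no variable left to expand")


form = ("S'",)
for head, body in [("S'", ("S", "c")), ("S", ("S", "A")), ("A", ("a", "S", "b"))]:
    form = rightmost_step(form, head, body)
    print("".join(form))
# prints Sc, then SAc, then SaSbc
```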
More terms
Viable Prefix
(of a right-sentential form γ)
Is any prefix of γ ending no farther right
than the right end of a handle of γ
Complete item
An item where the dot is the rightmost
symbol
Example
Given our example grammar:
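S' → Sc, S → SA | A, A → aSb | ab
The right-sentential form abc:
S' ⇒*_rm Ac ⇒_rm abc
Valid items:
(an item A → α·β is valid for a viable prefix γ if there is a
rightmost derivation S' ⇒*_rm δAw ⇒_rm δαβw with δα = γ)
A → ab· for viable prefix ab
A → a·b for viable prefix a
A → ·ab for viable prefix ε
A → ab· is a complete item, so Ac is the right-sentential
form that precedes abc in the rightmost derivation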
LR(0)
Left-to-right scan of the input producing
a rightmost derivation with a look-ahead
(on the input) of 0 symbols
It is a restricted type of CFG
1st in the family of LR-grammars
LR(0) grammars define exactly the
DCFLs having the prefix property
Computing Sets of Valid Items
The definition of LR(0) and the method
of accepting L(G) for LR(0) grammar G
by a DPDA depends on:
Knowing the set of valid items for each
viable prefix γ
For every CFG G, the set of viable
prefixes is a regular set
This regular set is accepted by an NFA
whose states are the items for G
Continued
Given an NFA (whose states are the
items for G) that accepts the regular set
We can apply the subset construction to
this NFA and yield a DFA
The state this DFA reaches after reading a viable
prefix γ is the set of valid items for γ
NFA M
NFA M recognizes the viable prefixes for CFG G
M = (Q, V ∪ T, δ, q_0, Q)
Q = set of items for G plus state q_0
(every state of M is an accepting state)
G = (V, T, P, S)
Three Rules
δ(q_0, ε) = {S → ·α | S → α is a production}
δ(A → α·Bβ, ε) = {B → ·γ | B → γ is a production}
Allows expansion of a variable B appearing
immediately to the right of the dot
δ(A → α·Xβ, X) = {A → αX·β}
Permits moving the dot over any grammar symbol X
if X is the next input symbol
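The three rules translate almost directly into a transition function. The sketch below is written for the example grammar under assumed conventions: items are (head, body, dot) triples, the extra start state is the string "q0", and None stands for ε.

```python
GRAMMAR = [
    ("S'", ("S", "c")),
    ("S", ("S", "A")),
    ("S", ("A",)),
    ("A", ("a", "S", "b")),
    ("A", ("a", "b")),
]
START = "S'"
Q0 = "q0"  # extra start state of the NFA


def delta(state, symbol):
    """NFA transition function; `symbol` is None for an epsilon move."""
    if state == Q0:
        # Rule 1: from q0, guess a production of the start symbol.
        if symbol is None:
            return {(h, b, 0) for h, b in GRAMMAR if h == START}
        return set()
    head, body, dot = state
    if dot == len(body):
        return set()  # complete item: no moves
    nxt = body[dot]
    if symbol is None:
        # Rule 2: expand the variable immediately right of the dot.
        return {(h, b, 0) for h, b in GRAMMAR if h == nxt}
    if symbol == nxt:
        # Rule 3: move the dot over the matching grammar symbol.
        return {(head, body, dot + 1)}
    return set()


print(delta(Q0, None))
print(delta(("S'", ("S", "c"), 0), None))
print(delta(("S'", ("S", "c"), 0), "S"))
```

When the symbol right of the dot is a terminal, rule 2 yields the empty set because no production has a terminal as its head.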
Theorem 10.9
The NFA M has the property that δ(q_0, γ)
contains A → α·β iff A → α·β is valid for γ
This theorem gives a method for
computing the sets of valid items for any
viable prefix
Note: M is an NFA. It can be converted to a
DFA. Then, by inspecting each DFA state, it can
be determined whether G is an LR(0) grammar
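As an illustration of this method, the sketch below builds the DFA for the example grammar and reads off the valid items for a prefix. It uses the usual closure/goto formulation of the subset construction, leaving out the extra state q_0 and the empty (dead) state; all function names are illustrative.

```python
GRAMMAR = [
    ("S'", ("S", "c")),
    ("S", ("S", "A")),
    ("S", ("A",)),
    ("A", ("a", "S", "b")),
    ("A", ("a", "b")),
]


def closure(items):
    """Epsilon-closure: add B -> .gamma for every variable B right of a dot."""
    result, work = set(items), list(items)
    while work:
        _, body, dot = work.pop()
        if dot < len(body):
            for head, rhs in GRAMMAR:
                new = (head, rhs, 0)
                if head == body[dot] and new not in result:
                    result.add(new)
                    work.append(new)
    return frozenset(result)


def goto(state, symbol):
    """Move the dot over `symbol` wherever possible, then take the closure."""
    return closure({(h, b, d + 1) for h, b, d in state
                    if d < len(b) and b[d] == symbol})


START = closure({("S'", ("S", "c"), 0)})


def dfa_states():
    """All non-empty DFA states reachable from the start state."""
    states, work = {START}, [START]
    while work:
        state = work.pop()
        for symbol in {b[d] for _, b, d in state if d < len(b)}:
            target = goto(state, symbol)
            if target not in states:
                states.add(target)
                work.append(target)
    return states


def valid_items(prefix):
    """Set of items valid for `prefix` (empty if the prefix is not viable)."""
    state = START
    for symbol in prefix:
        state = goto(state, symbol)
    return state


print(len(dfa_states()))        # 9
print(valid_items(("a", "b")))  # only the complete item A -> ab.
```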
Definition of LR(0) Grammar
G is an LR(0) grammar if
The start symbol does not appear on the
right side of any production
For every viable prefix γ of G, whenever A → α· is a
complete item valid for γ, it is unique
i.e., there are no other complete items (and
there are no items with a terminal to the right of
the dot) that are valid for γ
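The second condition is a check on each set of valid items. A minimal sketch follows, assuming items are (head, body, dot) triples. The first state tested comes from the example grammar; the last two are made-up states that are not part of it.

```python
VARIABLES = {"S'", "S", "A", "B"}  # "B" appears only in the made-up state


def lr0_violations(state):
    """Return the reasons (possibly none) why `state` breaks the LR(0) condition."""
    complete = [it for it in state if it[2] == len(it[1])]
    terminal_after_dot = [it for it in state
                          if it[2] < len(it[1]) and it[1][it[2]] not in VARIABLES]
    problems = []
    if len(complete) > 1:
        problems.append("more than one complete item")
    if complete and terminal_after_dot:
        problems.append("complete item together with a terminal after the dot")
    return problems


ok_state = {("S", ("A",), 1)}                             # S -> A.
shift_reduce = {("A", ("a",), 1), ("A", ("a", "b"), 1)}   # hypothetical
reduce_reduce = {("A", ("a",), 1), ("B", ("a",), 1)}      # hypothetical

print(lr0_violations(ok_state))       # []
print(lr0_violations(shift_reduce))
print(lr0_violations(reduce_reduce))
```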
Facts we now know:
Every LR(0) grammar generates a
DCFL
Every DCFL with the prefix property has
an LR(0) grammar
Every language with an LR(0) grammar
has the prefix property
L is a DCFL iff L$ has an LR(0) grammar
($ is an endmarker)
DPDAs from LR(0) Grammars
We trace out the rightmost derivation in
reverse
The stack holds a viable prefix (of a right-
sentential form) and the current state (of
the DFA)
Viable prefix: X_1 X_2 … X_k
States: s_1, s_2, …, s_k
(s_i is the DFA state reached on X_1 … X_i)
Stack: s_0 X_1 s_1 … X_k s_k

Reduction
If s_k contains A → α·
Then A → α· is valid for X_1 X_2 … X_k
α is a suffix of X_1 X_2 … X_k
Let α = X_{i+1} … X_k
and let w be such that X_1 … X_k w is a right-sentential
form.
Reduction Continued
There is a derivation:
S ⇒*_rm X_1 … X_i A w ⇒_rm X_1 … X_k w
To obtain the right-sentential form that precedes
X_1 … X_k w in a rightmost derivation we
reduce α to A
Therefore, we pop X_{i+1} … X_k (and their states) from the stack
and push A onto the stack, followed by the DFA state δ(s_i, A)
Shift
If s_k contains only incomplete items
Then the right-sentential form that precedes X_1 … X_k w
cannot be formed using a reduction
Instead we simply shift the next input
symbol onto the stack
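The reduce and shift steps can be combined into a small recognizer for the example grammar. This is a sketch under simplifying assumptions, not the DPDA construction itself: the stack holds only DFA states (each state determines the grammar symbol beneath it), DFA transitions are computed on demand, and the parser stops at the reduction by the start production instead of emptying the stack.

```python
GRAMMAR = [
    ("S'", ("S", "c")),
    ("S", ("S", "A")),
    ("S", ("A",)),
    ("A", ("a", "S", "b")),
    ("A", ("a", "b")),
]
START = "S'"


def closure(items):
    """Add B -> .gamma for every variable B immediately right of a dot."""
    result, work = set(items), list(items)
    while work:
        _, body, dot = work.pop()
        if dot < len(body):
            for head, rhs in GRAMMAR:
                new = (head, rhs, 0)
                if head == body[dot] and new not in result:
                    result.add(new)
                    work.append(new)
    return frozenset(result)


def goto(state, symbol):
    """DFA transition: move the dot over `symbol`, then take the closure."""
    return closure({(h, b, d + 1) for h, b, d in state
                    if d < len(b) and b[d] == symbol})


def parse(tokens):
    """Return the reductions in order (a reversed rightmost derivation), or None."""
    stack = [closure({(START, ("S", "c"), 0)})]
    reductions, pos = [], 0
    while True:
        state = stack[-1]
        complete = [(h, b) for h, b, d in state if d == len(b)]
        if complete:
            head, body = complete[0]            # unique because the grammar is LR(0)
            del stack[len(stack) - len(body):]  # pop one state per handle symbol
            reductions.append((head, body))
            if head == START:                   # reduced to the start symbol
                return reductions if pos == len(tokens) else None
            stack.append(goto(stack[-1], head))
        else:
            if pos == len(tokens):
                return None                     # input exhausted before acceptance
            target = goto(state, tokens[pos])
            if not target:
                return None                     # no transition: not a viable prefix
            stack.append(target)                # shift
            pos += 1


print(parse("abc"))
print(parse("ab"))  # None
```

For the input abc the reductions come out as A → ab, S → A, S' → Sc, which is the rightmost derivation S' ⇒ Sc ⇒ Ac ⇒ abc read backwards.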
Theorem 10.10
If L is L(G) for an LR(0) grammar G,
then L is N(M) for a DPDA M
N(M) = the language accepted by empty
stack or null stack
Proof
Construct from G the DFA D
Transition function: recognizes G's viable prefixes
Stack Symbols of M are
Grammar Symbols of G
States of D
M has start state q and other states
used to perform reduction
We know that:
If G is LR(0) then
Reductions are the only way to get the
right-sentential form when the state of the
DFA (on the top of the stack) contains a
complete item
When M starts on input w it will
construct a right-most derivation for w in
reverse order
What we need to prove:
When a shift is called for and the top
DFA state on the stack has only
incomplete items then there are no
handles
(Note: if there was a handle, then some
DFA state on the stack would have a
complete item)
Suppose a state contains A → α· (a complete item)
Each state is put onto the top of the
stack when it is created
The handle α would then immediately be reduced to
A
Therefore, a complete item cannot
possibly become buried on the stack
Proof continued
M accepts when the top of the stack
contains the start symbol
The start symbol, by the definition of LR(0)
grammars, cannot appear on the right
side of a production
L(G) always has the prefix property if G
is LR(0)
Conclusion of Proof
Thus, if w is in L(G), M finds the
rightmost derivation of w, reduces w to
S, and accepts
If M accepts w, then the sequence of
right-sentential forms provides a
derivation of w from S
N(M) = L(G)
Corollary of Theorem 10.10
Every LR(0) grammar is unambiguous
Why?
The rightmost derivation of w is unique
(Given the construction we provided)
LR(1) Grammars
LR grammar with 1 look-ahead
All and only deterministic CFLs have
LR(1) grammars
Are of great importance to compiler design
Why?
Because they are broad enough to include the
syntax of almost all programming languages
Restrictive enough to have efficient parsers
(that are essentially DPDAs)
LR(1) Item
Consists of an LR(0) item followed by a
look-ahead set consisting of terminals
and/or the special symbol $
$ = the right end of the string
General Form:
A → α·β, {a_1, a_2, …, a_n}
The LR(1) items are the states of an NFA that
recognizes viable prefixes; converting this NFA
to a DFA gives the sets of valid items
A grammar is LR(1) if
The start symbol does not appear on the
right side of any productions
Whenever the set of items I valid for some viable
prefix includes a complete item A → α·,
{a_1, …, a_n}, then
No a_i appears immediately to the right of the
dot in any item of I
If B → β·, {b_1, …, b_k} is another complete item in
I, then a_i ≠ b_j for all 1 ≤ i ≤ n and 1 ≤ j ≤ k
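The two conditions on a set of items I can be tested directly. In the sketch below an LR(1) item is assumed to be a (head, body, dot, lookaheads) tuple. The three item sets are made up for illustration and are not derived from the example grammar.

```python
def lr1_conflicts(item_set):
    """List the conflicts in one set of LR(1) items (an empty list means none)."""
    complete = [it for it in item_set if it[2] == len(it[1])]
    conflicts = []
    # Condition 1: no lookahead of a complete item directly follows a dot.
    for _, _, _, look in complete:
        for _, body, dot, _ in item_set:
            if dot < len(body) and body[dot] in look:
                conflicts.append(("shift/reduce", body[dot]))
    # Condition 2: lookahead sets of distinct complete items are disjoint.
    for i, first in enumerate(complete):
        for second in complete[i + 1:]:
            for symbol in sorted(first[3] & second[3]):
                conflicts.append(("reduce/reduce", symbol))
    return conflicts


# Hypothetical item sets.
ok = [("A", ("a",), 1, {"$"}), ("A", ("a", "b"), 1, {"$"})]
sr = [("A", ("a",), 1, {"b"}), ("A", ("a", "b"), 1, {"$"})]
rr = [("A", ("a",), 1, {"$", "b"}), ("B", ("a",), 1, {"b"})]

print(lr1_conflicts(ok))  # []
print(lr1_conflicts(sr))  # [('shift/reduce', 'b')]
print(lr1_conflicts(rr))  # [('reduce/reduce', 'b')]
```

The first set would be rejected by the LR(0) condition (a complete item next to an item with a terminal after the dot), but the lookahead {$} resolves it here.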
Accepting LR(1) language:
Similar to the DPDA used with LR(0)
grammars
However, it is allowed to use the next
input symbol during its decision making
This is accomplished by appending a $
to the end of the input and the DPDA
keeps the next input symbol as part of
the state
LR(1) Rules for Reduce/Shift
If the top set of items has a complete item
A → α·, {a_1, a_2, …, a_n}, where A ≠ S, reduce
by A → α if the current input symbol is in
{a_1, a_2, …, a_n}
If the top set of items has an item S → α·,
{$}, then reduce by S → α and accept if the
current symbol is $ (i.e., the end of the
input is reached)
If the top set of items has an item
A → α·aβ, T, and a is the current input
symbol, then shift
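These three rules amount to a decision function on the top set of items and the current input symbol. A minimal sketch follows, assuming LR(1) items are (head, body, dot, lookaheads) tuples and that the start symbol is named "S" as in the rules. The item sets are made up for illustration.

```python
def action(item_set, symbol, start="S"):
    """Choose the parser action for the top item set and the current symbol."""
    for head, body, dot, look in item_set:
        if dot == len(body) and symbol in look:
            if head == start and symbol == "$":
                return ("accept", head, body)  # rule 2
            return ("reduce", head, body)      # rule 1
    for head, body, dot, look in item_set:
        if dot < len(body) and body[dot] == symbol:
            return ("shift", symbol)           # rule 3
    return ("error",)


# Hypothetical item sets.
top = [("A", ("a",), 1, {"$"}), ("A", ("a", "b"), 1, {"$"})]
final = [("S", ("A",), 1, {"$"})]

print(action(top, "$"))    # ('reduce', 'A', ('a',))
print(action(top, "b"))    # ('shift', 'b')
print(action(top, "a"))    # ('error',)
print(action(final, "$"))  # ('accept', 'S', ('A',))
```

Evaluating this function for every set of items and every terminal (plus $) fills in one cell of the parsing table per pair.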
Regarding the Rules
The definition of an LR(1) grammar guarantees
that at most one of the rules applies for any
input symbol or $
Often for practicality the information is
summarized into a table
Rows: sets of items
Columns: terminals and $
