6.1120: Parsing
Top-Down Parsing
Martin Rinard
Laboratory for Computer Science
Massachusetts Institute of Technology
MIT 6.1100 (6.035) �Top-Down Parsing
Orientation
Starting Point
Recognition Approach:�<2-1>+1
Open Int Op Int Close Op Int
Open Expr Op Int Close Op Int
Open Expr Op Expr Close Op Int
Open Expr Close Op Int
Expr Op Int
Expr Op Expr
Expr
Start
Op = +|-|*|/
Int = [0-9] [0-9]*
Open = <
Close = >
1) Start → Expr
2) Expr → Expr Op Expr
3) Expr → Int
4) Expr → Open Expr Close
Generative Approach: �<2-1>+1
Start
Expr
Expr Op Expr
Open Expr Close Op Expr
Open Expr Op Expr Close Op Expr
Open Int Op Expr Close Op Expr
Open Int Op Expr Close Op Int
Open Int Op Int Close Op Int
Op = +|-|*|/
Int = [0-9] [0-9]*
Open = <
Close = >
1) Start → Expr
2) Expr → Expr Op Expr
3) Expr → Int
4) Expr → Open Expr Close
Grammar for Today
Start → Expr
Expr → Expr + Term
Expr → Expr - Term
Expr → Term
Term → Term * Int
Term → Term / Int
Term → Int
Goal: Recursive Descent Parser �(Generative/Top-Down Approach
Term()
if (token = Int n)
oldToken = token; token = NextToken();
node = TermPrime();
if (node == NULL) return oldToken;
else return(new TermNode(oldToken, node);
else throw SyntaxError
TermPrime()
if (token = *) || (token = /)
first = token; next = NextToken();
if (next = Int n)
token = NextToken();
return(new TermPrimeNode(first, next, TermPrime())
else throw SyntaxError
else return(NULL)
Basic Approach
Graphical Illustration of Leftmost Derivation
Apply Production Here
NT1 T1 T2 T3 NT2 NT3
Not Here
Sentential Form
Start
Parse
Tree
Sentential Form
Remaining Input
2-2*2
Start
Current Position in Parse Tree
Parsing Example
Applied Production
Start → Expr
Start
Parse
Tree
Sentential Form
Remaining Input
2-2*2
Expr
Expr
Current Position in Parse Tree
Parsing Example
Applied Production
Expr → Expr - Term
Parse
Tree
Sentential Form
Remaining Input
2-2*2
Expr - Term
Start
Expr
Term
Expr
-
Parsing Example
Expr → Expr + Term
Expr → Expr - Term
Expr → Term
Applied Production
Expr → Term
Start
Parse
Tree
Sentential Form
Remaining Input
2-2*2
Term - Term
Expr
Term
Expr
-
Term
Parsing Example
Expr → Expr + Term
Expr → Expr - Term
Expr → Term
Applied Production
Term → Int
Start
Parse
Tree
Sentential Form
Remaining Input
2-2*2
Expr
Term
Expr
-
Term
Int
Int - Term
Parsing Example
Start
Parse
Tree
Sentential Form
Remaining Input
2-2*2
2 - Term
Expr
Term
Expr
-
Term
Match
Input
Token!
Int 2
Parsing Example
Start
Parse
Tree
Sentential Form
Remaining Input
-2*2
2 - Term
Expr
Term
Expr
-
Term
Match
Input
Token!
Int 2
Parsing Example
Start
Parse
Tree
Sentential Form
Remaining Input
2*2
2 - Term
Expr
Term
Expr
-
Term
Match
Input
Token!
Int 2
Parsing Example
Applied Production
Term → Term * Int
Start
Parse
Tree
Sentential Form
Remaining Input
2*2
2 - Term*Int
Expr
Term
Expr
-
Term
Term
Int
*
Int 2
Parsing Example
Applied Production
Term → Int
Start
Parse
Tree
Sentential Form
Remaining Input
2*2
2 - Int * Int
Expr
Term
Expr
-
Term
Term
Int
*
Int 2
Int
Parsing Example
Match
Input
Token!
Start
Parse
Tree
Sentential Form
Remaining Input
2*2
2 - 2* Int
Expr
Term
Expr
-
Term
Term
Int
*
Int 2
Int 2
Parsing Example
Match
Input
Token!
Start
Parse
Tree
Sentential Form
Remaining Input
*2
2 - 2* Int
Expr
Term
Expr
-
Term
Term
Int
*
Int 2
Int 2
Parsing Example
Match
Input
Token!
Start
Parse
Tree
Sentential Form
Remaining Input
2
2 - 2* Int
Expr
Term
Expr
-
Term
Term
Int
*
Int 2
Int 2
Parsing Example
Parsing Example
Start
Parse
Tree
Sentential Form
Remaining Input
2
2 - 2*2
Expr
Term
Expr
-
Term
Term
Int 2
*
Int 2
Int 2
Parse
Complete!
Summary
Act 1
Policy Problem
Backtracking Example
Start
Parse
Tree
Sentential Form
Remaining Input
2-2*2
Start
Applied Production
Start → Expr
Backtracking Example
Start
Parse
Tree
Sentential Form
Remaining Input
2-2*2
Expr
Expr
Applied Production
Expr → Expr + Term
Backtracking Example
Start
Parse
Tree
Sentential Form
Remaining Input
2-2*2
Expr + Term
Expr
Term
Expr
+
Applied Production
Expr → Term
Backtracking Example
Start
Parse
Tree
Sentential Form
Remaining Input
2-2*2
Term + Term
Expr
Term
Expr
+
Term
Applied Production
Term → Int
Backtracking Example
Start
Parse
Tree
Sentential Form
Remaining Input
2-2*2
Int + Term
Expr
Term
Expr
+
Term
Int
Match
Input
Token!
Applied Production
Term → Int
Backtracking Example
Start
Parse
Tree
Sentential Form
Remaining Input
-2*2
2 + Term
Expr
Term
Expr
+
Term
Int 2
Can’t
Match
Input
Token!
Applied Production
Start → Expr
Backtracking Example
Start
Parse
Tree
Sentential Form
Remaining Input
2-2*2
Expr
Expr
So
Backtrack!
Applied Production
Expr → Expr - Term
Backtracking Example
Start
Parse
Tree
Sentential Form
Remaining Input
2-2*2
Expr - Term
Expr
Term
Expr
-
Term
Applied Production
Expr → Term
Backtracking Example
Start
Parse
Tree
Sentential Form
Remaining Input
2-2*2
Term - Term
Expr
Term
Expr
-
Term
Applied Production
Term → Int
Backtracking Example
Start
Parse
Tree
Sentential Form
Remaining Input
2-2*2
Int - Term
Expr
Term
Expr
-
Int
Match
Input
Token!
Term
Backtracking Example
Start
Parse
Tree
Sentential Form
Remaining Input
-2*2
2 - Term
Expr
Term
Expr
-
Int 2
Match
Input
Token!
Term
Backtracking Example
Start
Parse
Tree
Sentential Form
Remaining Input
2*2
2 - Term
Expr
Term
Expr
-
Int 2
Act 2
Any one see a problem?
Term
Num
*
Term
Term
Term
Num
*
Term
Term
Num
*
Left Recursion + Top-Down Parsing = Infinite Loop
Term
Num
*
Term
Term
Term
Num
*
Term
Term
Num
*
General Search Issues
Eliminating Left Recursion
builds parse tree like this:
A
α
A
α
A
α
β
Eliminating Left Recursion
A
α
A
α
β
A
β
R
α
R
α
R
ε
Old Parse Tree
New Parse Tree
Hacked Grammar
Original Grammar Fragment
Term → Term * Int
Term → Term / Int
Term → Int
New Grammar Fragment
Term → Int Term’
Term’ → * Int Term’
Term’ → / Int Term’
Term’ → ε
Term
Int
*
Term
Int
*
Int
Term
Int
Term’
Int
*
Term’
Int
*
Term’
ε
Parse Tree Comparisons
Original Grammar
New Grammar
Eliminating Left Recursion
Act 3
Predictive Parsing
Predictive Parsing Example Grammar
Start → Expr
Expr → Term Expr’
Expr’ → + Term Expr’
Expr’ → - Term Expr’
Expr’ → ε
Term → Int Term’
Term’ → * Int Term’
Term’ → / Int Term’
Term’ → ε
Choice Points
Term’ → * Int Term’
Term’ → / Int Term’
Term’ → ε
Predictive Parsing Example
Start
Parse
Tree
Sentential Form
Remaining Input
*2
2 - 2 Term’
Expr
Term
Term
Int 2
Term’
ε
Expr’
Term
Int 2
Term’
-
Term → Int Term’
Term’ → * Int Term’
Term’ → / Int Term’
Term’ → ε
Predictive Parsing Example
Start
Parse
Tree
Sentential Form
Remaining Input
*2
2 - 2 Term’
Expr
Term
Term
Int 2
Term’
ε
Expr’
Int 2
Term’
-
Applied Production
Term’ → * Int Term’
*
Term’
Int
Predictive Parsing + Hand Coding = Recursive Descent Parser
Example
Boolean Term()
if (token = Int n) token = NextToken(); return(TermPrime())
else return(false)
Boolean TermPrime()
if (token = *)
token = NextToken();
if (token = Int n) token = NextToken(); return(TermPrime())
else return(false)
else if (token = /)
token = NextToken();
if (token = Int n) token = NextToken(); return(TermPrime())
else return(false)
else return(true)
Term → Int Term’
Term’ → * Int Term’
Term’ → / Int Term’
Term’ → ε
Next time: Parse
Term()
if (token = Int n)
oldToken = token; token = NextToken();
node = TermPrime();
if (node == NULL) return oldToken;
else return(new TermNode(oldToken, node);
else throw SyntaxError
TermPrime()
if (token = *) || (token = /)
first = token; next = NextToken();
if (next = Int n)
token = NextToken();
return(new TermPrimeNode(first, next, TermPrime())
else throw SyntaxError
else return(NULL)
Term → Int Term’
Term’ → * Int Term’
Term’ → / Int Term’
Term’ → ε
Building A Parse Tree
Term
Int
2
Term’
Int
3
*
Term’
Int
4
*
Term’
ε
Parse Tree