1 of 81

1

Quiz 4

2 of 81

2

Midterm 2 ADT, Second Chance

type ‘a option =

| None

| Some of ‘a

3 of 81

3

Programming Languages and Compilers (CS 421)

Talia Ringer (they/them)

4218 SC, UIUC

https://courses.grainger.illinois.edu/cs421/fa2023/

Based heavily on slides by Elsa Gunter, which were based in part on slides by Mattox Beckman, as updated by Vikram Adve and Gul Agha

4 of 81

Objectives for Today

  • Reminder: We want to turn strings (code) into computer instructions
  • Done in phases
    • Turn strings into abstract syntax trees (parse)
    • Translate abstract syntax trees into executable instructions (interpret or compile)
  • Last week we started the first step of parsing, which is lexing those input strings into tokens
  • Today we will finish lexing and move on to the rest of parsing

*

4

5 of 81

Objectives for Today

  • Reminder: We want to turn strings (code) into computer instructions
  • Done in phases
    • Turn strings into abstract syntax trees (parse)
    • Translate abstract syntax trees into executable instructions (interpret or compile)
  • Last week we started the first step of parsing, which is lexing those input strings into tokens
  • Today we will finish lexing and move on to the rest of parsing

*

5

6 of 81

Objectives for Today

  • Reminder: We want to turn strings (code) into computer instructions
  • Done in phases
    • Turn strings into abstract syntax trees (parse)
    • Translate abstract syntax trees into executable instructions (interpret or compile)
  • Last week we started the first step of parsing, which is lexing those input strings into tokens
  • Today we will finish lexing and move on to the rest of parsing

*

6

7 of 81

7

Questions from last week?

8 of 81

8

Recap

9 of 81

Example : using generated file

# #use "test.ml";;

val main : Lexing.lexbuf -> result = <fun>

val __ocaml_lex_main_rec :

Lexing.lexbuf -> int -> result = <fun>

hi there 234 5.2

- : result = String "hi”

*

9

Recap

10 of 81

Example : using generated file

# #use "test.ml";;

val main : Lexing.lexbuf -> result = <fun>

val __ocaml_lex_main_rec :

Lexing.lexbuf -> int -> result = <fun>

hi there 234 5.2

- : result = String "hi”

*

10

What happened to the rest?

Recap

11 of 81

Example : using generated file

# let b = Lexing.from_channel stdin;;

# main b;;

hi 673 there

- : result = String "hi"

# main b;;

- : result = Int 673

# main b;;

- : result = String "there"

*

11

Recall the hidden argument of type lexbuf

Recap

12 of 81

Example : using generated file

# let b = Lexing.from_channel stdin;;

# main b;;

hi 673 there

- : result = String "hi"

# main b;;

- : result = Int 673

# main b;;

- : result = String "there"

*

12

Recall the hidden argument of type lexbuf

Recap

13 of 81

Example : using generated file

# let b = Lexing.from_channel stdin;;

# main b;;

hi 673 there

- : result = String "hi"

# main b;;

- : result = Int 673

# main b;;

- : result = String "there"

*

13

Recall the hidden argument of type lexbuf

Recap

14 of 81

14

Fancy Lexing

15 of 81

Problem

  • How to get lexer to look at more than the first token at one time?
  • Answer: action has to tell it to recursive calls
  • Downside: Not what you want to sew this together with ocamlyacc (parser generator)
  • Side Benefit: can add “state” into lexing
  • Note: already used this with the _ case

*

15

Fancy Lexing

16 of 81

Problem

  • How to get lexer to look at more than the first token at one time?
  • Answer: action has to tell it to recursive calls
  • Downside: Not what you want to sew this together with ocamlyacc (parser generator)
  • Side Benefit: can add “state” into lexing
  • Note: already used this with the _ case

*

16

Fancy Lexing

17 of 81

Problem

  • How to get lexer to look at more than the first token at one time?
  • Answer: action has to tell it to recursive calls
  • Downside: Not what you want to sew this together with ocamlyacc (parser generator)
  • Side Benefit: can add “state” into lexing
  • Note: already used this with the _ case

*

17

Fancy Lexing

18 of 81

Example: Old Version

rule main = parse

| (digits)'.'digits as f

{ Float (float_of_string f) }

| digits as n

{ Int (int_of_string n) }

| letters as s

{ String s }

| _ { main lexbuf }

*

18

Fancy Lexing

19 of 81

Example: WIP New Version

rule main = parse

| (digits)'.'digits as f

{ Float (float_of_string f) :: main lexbuf }

| digits as n

{ Int (int_of_string n) :: main lexbuf }

| letters as s

{ String s :: main lexbuf }

| _ { main lexbuf }

*

19

Fancy Lexing

20 of 81

Example: New Version

rule main = parse

| (digits)'.'digits as f

{ Float (float_of_string f) :: main lexbuf }

| digits as n

{ Int (int_of_string n) :: main lexbuf }

| letters as s

{ String s :: main lexbuf }

| eof { [] }

| _ { main lexbuf }

*

20

Fancy Lexing

21 of 81

Example Results

hi there 234 5.2

- : result list = � [String "hi"; String "there"; Int 234; Float 5.2]

Used Ctrl-d to send the end-of-file signal

*

21

Fancy Lexing

22 of 81

22

Questions so far?

Fancy Lexing

23 of 81

Dealing with Comments (No Nesting)

let open_comment = "(*"

let close_comment = "*)"

rule main = parse

(* same as last time *)

| open_comment { comment lexbuf }

| eof { [] }

| _ { main lexbuf }

and comment = parse

| close_comment { main lexbuf }

| _ { comment lexbuf }

*

23

Fancy Lexing

24 of 81

Dealing with Comments (No Nesting)

let open_comment = "(*"

let close_comment = "*)"

rule main = parse

(* same as last time *)

| open_comment { comment lexbuf }

| eof { [] }

| _ { main lexbuf }

and comment = parse

| close_comment { main lexbuf }

| _ { comment lexbuf }

*

24

Fancy Lexing

25 of 81

Dealing with Comments (No Nesting)

let open_comment = "(*"

let close_comment = "*)"

rule main = parse

(* same as last time *)

| open_comment { comment lexbuf }

| eof { [] }

| _ { main lexbuf }

and comment = parse

| close_comment { main lexbuf }

| _ { comment lexbuf }

*

25

Fancy Lexing

26 of 81

Dealing with Comments (No Nesting)

let open_comment = "(*"

let close_comment = "*)"

rule main = parse

(* same as last time *)

| open_comment { comment lexbuf }

| eof { [] }

| _ { main lexbuf }

and comment = parse

| close_comment { main lexbuf }

| _ { comment lexbuf }

*

26

Fancy Lexing

27 of 81

Dealing with Comments (No Nesting)

let open_comment = "(*"

let close_comment = "*)"

rule main = parse

(* same as last time *)

| open_comment { comment lexbuf }

| eof { [] }

| _ { main lexbuf }

and comment = parse

| close_comment { main lexbuf }

| _ { comment lexbuf }

*

27

Fancy Lexing

28 of 81

28

Questions so far?

Fancy Lexing

29 of 81

Dealing with Nested Comments

rule main = parse

| open_comment { comment 1 lexbuf}

| eof { [] }

| _ { main lexbuf }

and comment depth = parse

| open_comment { comment (depth+1) lexbuf }

| close_comment { if depth = 1 then main lexbuf

else comment (depth - 1) lexbuf }

| _ { comment depth lexbuf }

*

29

Fancy Lexing

30 of 81

Dealing with Nested Comments

rule main = parse

| open_comment { comment 1 lexbuf}

| eof { [] }

| _ { main lexbuf }

and comment depth = parse

| open_comment { comment (depth + 1) lexbuf }

| close_comment { if depth = 1 then main lexbuf

else comment (depth - 1) lexbuf }

| _ { comment depth lexbuf }

*

30

Fancy Lexing

31 of 81

Dealing with Nested Comments

rule main = parse

| open_comment { comment 1 lexbuf}

| eof { [] }

| _ { main lexbuf }

and comment depth = parse

| open_comment { comment (depth + 1) lexbuf }

| close_comment { if depth = 1 then main lexbuf

else comment (depth - 1) lexbuf }

| _ { comment depth lexbuf }

*

31

Fancy Lexing

32 of 81

Dealing with Nested Comments

rule main = parse

| open_comment { comment 1 lexbuf}

| eof { [] }

| _ { main lexbuf }

and comment depth = parse

| open_comment { comment (depth + 1) lexbuf }

| close_comment { if depth = 1 then main lexbuf

else comment (depth - 1) lexbuf }

| _ { comment depth lexbuf }

*

32

Fancy Lexing

33 of 81

Dealing with Nested Comments

rule main = parse

| open_comment { comment 1 lexbuf}

| eof { [] }

| _ { main lexbuf }

and comment depth = parse

| open_comment { comment (depth + 1) lexbuf }

| close_comment { if depth = 1 then main lexbuf

else comment (depth - 1) lexbuf }

| _ { comment depth lexbuf }

*

33

Fancy Lexing

34 of 81

Dealing with Nested Comments

rule main = parse

| open_comment { comment 1 lexbuf}

| eof { [] }

| _ { main lexbuf }

and comment depth = parse

| open_comment { comment (depth + 1) lexbuf }

| close_comment { if depth = 1 then main lexbuf

else comment (depth - 1) lexbuf }

| _ { comment depth lexbuf }

*

34

Fancy Lexing

35 of 81

Dealing with Nested Comments

rule main = parse

| open_comment { comment 1 lexbuf}

| eof { [] }

| _ { main lexbuf }

and comment depth = parse

| open_comment { comment (depth + 1) lexbuf }

| close_comment { if depth = 1 then main lexbuf

else comment (depth - 1) lexbuf }

| _ { comment depth lexbuf }

*

35

Fancy Lexing

36 of 81

Dealing with Nested Comments

rule main = parse

| open_comment { comment 1 lexbuf}

| eof { [] }

| _ { main lexbuf }

and comment depth = parse

| open_comment { comment (depth+1) lexbuf }

| close_comment { if depth = 1 then main lexbuf

else comment (depth - 1) lexbuf }

| _ { comment depth lexbuf }

*

36

Fancy Lexing

37 of 81

37

Note: No Longer Regular!

Fancy Lexing

38 of 81

38

Often easier to defer non-regular

things to the parser generator.

Fancy Lexing

39 of 81

Problem

  • How to get lexer to look at more than the first token at one time?
  • Answer: action has to tell it to recursive calls
  • Downside: Not what you want to sew this together with ocamlyacc (parser generator)
  • Side Benefit: can add “state” into lexing
  • Note: already used this with the _ case

*

39

Fancy Lexing

40 of 81

40

Questions so far?

41 of 81

41

Parsing

42 of 81

Lexing and Parsing

Source Program

Tokens

Abstract Syntax

Semantic Analysis

Symbol Table

Evaluation/

Translation

Result/IR

Lexer

Parser

Parsing

42

43 of 81

Lexing and Parsing

Source Program

Tokens

Abstract Syntax

Semantic Analysis

Symbol Table

Evaluation/

Translation

Result/IR

Lexer

Parser

To parse our source program and get abstract syntax, we need a grammar defined in terms of the kinds of tokens we get out of our lexer.

Parsing

43

44 of 81

Lexing and Parsing

Source Program

Tokens

Abstract Syntax

Semantic Analysis

Symbol Table

Evaluation/

Translation

Result/IR

Lexer

Parser

To parse our source program and get abstract syntax, we need a grammar defined in terms of the kinds of tokens we get out of our lexer.

The output, an abstract syntax tree, will track not just categories, but also structure.

Parsing

44

45 of 81

Lexing and Parsing

Binary Operator +

The output, an abstract syntax tree, will track not just categories, but also structure.

*

45

Constant 1

Constant 2

Syntax

Parsing

46 of 81

Lexing and Parsing

Source Program

Tokens

Abstract Syntax

Semantic Analysis

Symbol Table

Evaluation/

Translation

Result/IR

Lexer

Parser

To parse our source program and get abstract syntax, we need a grammar defined in terms of the kinds of tokens we get out of our lexer.

The output, an abstract syntax tree, will track not just categories, but also structure.

Parsing

46

47 of 81

Sample Grammar

  • Language: Parenthesized sums of 0’s and 1’s

<Sum> ::= 0

<Sum> ::= 1

<Sum> ::= <Sum> + <Sum>

<Sum> ::= (<Sum>)

*

47

Parsing

48 of 81

Sample Grammar

  • Language: Parenthesized sums of 0’s and 1’s

<Sum> ::= 0

<Sum> ::= 1

<Sum> ::= <Sum> + <Sum>

<Sum> ::= (<Sum>)

*

48

Parsing

49 of 81

49

Context-Free Grammars

50 of 81

BNF Grammars

  • A notation for a context-free grammar
  • Start with a set of characters a, b, … (terminals)
  • Add different characters X, Y, … (nonterminals)
  • One special nonterminal S called start symbol
  • BNF rules (aka productions) have form� X ::= y�where X is any nonterminal and y is a string of terminals and nonterminals
  • BNF grammar is a set of BNF rules such that every nonterminal appears on the left of some rule

*

50

Context-Free Grammars

51 of 81

BNF Grammars

  • A notation for a context-free grammar
  • Start with a set of characters a, b, … (terminals)
  • Add different characters X, Y, … (nonterminals)
  • One special nonterminal S called start symbol
  • BNF rules (aka productions) have form� X ::= y�where X is any nonterminal and y is a string of terminals and nonterminals
  • BNF grammar is a set of BNF rules such that every nonterminal appears on the left of some rule

*

51

Context-Free Grammars

52 of 81

BNF Grammars

  • A notation for a context-free grammar
  • Start with a set of characters a, b, … (terminals)
  • Add different characters X, Y, … (nonterminals)
  • One special nonterminal S called start symbol
  • BNF rules (aka productions) have form� X ::= ywhere X is any nonterminal and y is a string of terminals and nonterminals
  • BNF grammar is a set of BNF rules such that every nonterminal appears on the left of some rule

*

52

Context-Free Grammars

53 of 81

BNF Grammars

  • A notation for a context-free grammar
  • Start with a set of characters a, b, … (terminals)
  • Add different characters X, Y, … (nonterminals)
  • One special nonterminal S called start symbol
  • BNF rules (aka productions) have form� X ::= ywhere X is any nonterminal and y is a string of terminals and nonterminals
  • BNF grammar is a set of BNF rules such that every nonterminal appears on the left of some rule

*

53

Context-Free Grammars

54 of 81

Sample BNF Grammar

  • Terminals: 0 1 + ( )
  • Nonterminals: <Sum>
  • Start symbol = <Sum>

<Sum> ::= 0

<Sum> ::= 1

<Sum> ::= <Sum> + <Sum>

<Sum> ::= (<Sum>)

*

54

Context-Free Grammars

55 of 81

Sample BNF Grammar

  • Terminals: 0 1 + ( )
  • Nonterminals: <Sum>
  • Start symbol = <Sum>

<Sum> ::= 0

<Sum> ::= 1

<Sum> ::= <Sum> + <Sum>

<Sum> ::= (<Sum>)

Can be abbreviated as

<Sum> ::= 0 | 1 | <Sum> + <Sum> | (<Sum>)

*

55

Context-Free Grammars

56 of 81

56

Questions so far?

Context-Free Grammars

57 of 81

BNF Semantics

  • Question: What does a BNF grammar mean?
  • Answer: The meaning of a BNF grammar is the set of all strings consisting only of terminals that can be derived from the Start symbol
  • Question: How do we determine that set?

*

57

Context-Free Grammars

58 of 81

BNF Semantics

  • Question: What does a BNF grammar mean?
  • Answer: The meaning of a BNF grammar is the set of all strings consisting only of terminals that can be derived from the Start symbol
  • Question: How do we determine that set?

*

58

Context-Free Grammars

59 of 81

BNF Deriviations

  • Given rules

X ::= yZw and Z ::= v

we may replace Z by v to say

X => yZw => yvw

  • Sequence of such replacements called derivation
  • Derivation called right-most if always replace the right-most non-terminal

*

59

Context-Free Grammars

60 of 81

BNF Deriviations

  • Given rules

X ::= yZw and Z ::= v

we may replace Z by v to say

X => yZw => yvw

  • Sequence of such replacements called derivation
  • Derivation called right-most if always replace the right-most non-terminal

*

60

Context-Free Grammars

61 of 81

BNF Derivations

Start with the start symbol:

<Sum> =>

*

61

Context-Free Grammars

62 of 81

BNF Derivations

Pick a non-terminal:

<Sum> =>

*

62

Context-Free Grammars

63 of 81

BNF Derivations

Pick a rule and substitute:

    • <Sum> ::= <Sum> + <Sum>

<Sum> => <Sum> + <Sum >

*

63

Context-Free Grammars

64 of 81

BNF Derivations

Pick a non-terminal:

<Sum> => <Sum> + <Sum >

*

64

Context-Free Grammars

65 of 81

BNF Derivations

Pick a rule and substitute:

    • <Sum> ::= ( <Sum> )

<Sum> => <Sum> + <Sum >

=> ( <Sum> ) + <Sum>

*

65

Context-Free Grammars

66 of 81

BNF Derivations

Pick a non-terminal:

<Sum> => <Sum> + <Sum >

=> ( <Sum> ) + <Sum>

*

66

Context-Free Grammars

67 of 81

BNF Derivations

Pick a rule and substitute:

    • <Sum> ::= <Sum> + <Sum>

<Sum> => <Sum> + <Sum >

=> ( <Sum> ) + <Sum>

=> ( <Sum> + <Sum> ) + <Sum>

*

67

Context-Free Grammars

68 of 81

BNF Derivations

Pick a non-terminal:

<Sum> => <Sum> + <Sum >

=> ( <Sum> ) + <Sum>

=> ( <Sum> + <Sum> ) + <Sum>

*

68

Context-Free Grammars

69 of 81

BNF Derivations

Pick a rule and substitute:

    • <Sum >::= 1

<Sum> => <Sum> + <Sum >

=> ( <Sum> ) + <Sum>

=> ( <Sum> + <Sum> ) + <Sum>

=> ( <Sum> + 1 ) + <Sum>

*

69

Context-Free Grammars

70 of 81

BNF Derivations

Pick a non-terminal:

<Sum> => <Sum> + <Sum >

=> ( <Sum> ) + <Sum>

=> ( <Sum> + <Sum> ) + <Sum>

=> ( <Sum> + 1 ) + <Sum>

*

70

Context-Free Grammars

71 of 81

BNF Derivations

Pick a rule and substitute:

    • <Sum >::= 0

<Sum> => <Sum> + <Sum >

=> ( <Sum> ) + <Sum>

=> ( <Sum> + <Sum> ) + <Sum>

=> ( <Sum> + 1 ) + <Sum>

=> ( <Sum> + 1 ) + 0

*

71

Context-Free Grammars

72 of 81

BNF Derivations

Pick a non-terminal:

<Sum> => <Sum> + <Sum >

=> ( <Sum> ) + <Sum>

=> ( <Sum> + <Sum> ) + <Sum>

=> ( <Sum> + 1 ) + <Sum>

=> ( <Sum> + 1 ) + 0

*

72

Context-Free Grammars

73 of 81

BNF Derivations

Pick a rule and substitute

    • <Sum> ::= 0

<Sum> => <Sum> + <Sum >

=> ( <Sum> ) + <Sum>

=> ( <Sum> + <Sum> ) + <Sum>

=> ( <Sum> + 1 ) + <Sum>

=> ( <Sum> + 1 ) 0

=> ( 0 + 1 ) + 0

*

73

Context-Free Grammars

74 of 81

BNF Derivations

( 0 + 1 ) + 0 is generated by the grammar.

<Sum> => <Sum> + <Sum >

=> ( <Sum> ) + <Sum>

=> ( <Sum> + <Sum> ) + <Sum>

=> ( <Sum> + 1 ) + <Sum>

=> ( <Sum> + 1 ) + 0

=> ( 0 + 1 ) + 0

*

74

Context-Free Grammars

75 of 81

75

Questions so far?

Context-Free Grammars

76 of 81

Extended BNF Grammars

  • Alternatives: allow rules of from X ::= y | z
    • Abbreviates X ::= y, X ::= z
  • Options: X ::= y[v]z
    • Abbreviates X ::= yvz, X ::= yz
  • Repetition: X ::= y{v}*z
    • Can be eliminated by adding new nonterminal V and rules X ::= yz, X ::= yVz, V ::= v, V ::= vV

*

76

Context-Free Grammars

77 of 81

Extended BNF Grammars

  • Alternatives: allow rules of from X ::= y | z
    • Abbreviates X ::= y, X ::= z
  • Options: X ::= y[v]z
    • Abbreviates X ::= yvz, X ::= yz
  • Repetition: X ::= y{v}*z
    • Can be eliminated by adding new nonterminal V and rules X ::= yz, X ::= yVz, V ::= v, V ::= vV

*

77

Context-Free Grammars

78 of 81

Extended BNF Grammars

  • Alternatives: allow rules of from X ::= y | z
    • Abbreviates X ::= y, X ::= z
  • Options: X ::= y[v]z
    • Abbreviates X ::= yvz, X ::= yz
  • Repetition: X ::= y{v}*z
    • Can be eliminated by adding new nonterminal V and rules X ::= yz, X ::= yVz, V ::= v, V ::= vV

*

78

Context-Free Grammars

79 of 81

79

Questions?

80 of 81

80

Next Class: From Tokens to ASTs

81 of 81

  • EC2 is up
  • WA7 due Thursday
  • MP8 due next Tuesday
  • All deadlines can be found on course website
  • Use office hours and class forums for help

Next Class

81