
Introduction to Computability Theory

Context Free Languages

Prof. Amos Israeli


Introduction and Motivation

In the last lecture we completed our study of regular languages. (There is still a lot to learn, but our time is limited…)


Introduction and Motivation

In our study of regular languages we covered:

  1. Motivation and definition of regular languages.
  2. DFA-s, NFA-s, RE-s and their equivalence.
  3. Non Regular Languages and the Pumping Lemma.


Introduction and Motivation

In this lecture, we turn to Context Free Grammars and Context Free Languages.

The class of Context Free Languages is an intermediate class between the class of regular languages and the class of Decidable Languages (To be defined).


Introduction and Motivation

A Context Free Grammar is a “machine” that creates a language.

A language created by a CF grammar is called a Context Free Language.

(We will show that) The class of Context Free Languages Properly Contains the class of Regular Languages.


Context Free Grammar - Example

Consider the following grammar G:

A CFG consists of substitution rules called Productions.

The capital letters are the Variables.

The other symbols are the Terminals.


Context Free Grammar - Example

Consider the following grammar G:

The grammar generates a language, called the language of G and denoted by L(G).


Context Free Grammar - Example

Consider the following grammar G:

This is a Derivation of a word by G:

On each step, a single rule is applied. This mechanism is nondeterministic.
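The per-step mechanism can be made concrete in code. The sketch below is ours, not the slides': it assumes the hypothetical rules S → 0S1 and S → ε and applies one production per step to the leftmost variable.

```python
# Hypothetical toy grammar: S -> 0S1 | ε ("" plays the role of ε).
rules = {"S": ["0S1", ""]}

def derive(choices, start="S"):
    """Apply one chosen production per step; return all intermediate strings."""
    steps = [start]
    current = start
    for rhs in choices:
        # find the leftmost variable and replace it by the chosen right-hand side
        i = next(k for k, c in enumerate(current) if c in rules)
        current = current[:i] + rhs + current[i + 1:]
        steps.append(current)
    return steps

print(derive(["0S1", "0S1", ""]))  # S => 0S1 => 00S11 => 0011
```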


Context Free Grammar - Example

This is a Parse Tree of the same word by G:


Context Free Grammar - Example

Each internal node of the tree is associated with a single production.


CF Grammar – A Formal Definition

A Context Free Grammar is a 4-tuple (V, Σ, R, S) where:

  1. V is a finite set called the variables.
  2. Σ is a finite set, disjoint from V, called the terminals.
  3. R is a finite set of rules, where each rule consists of a variable and a string of variables and terminals, and
  4. S ∈ V is the start variable.
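To make the 4-tuple concrete, here is a sketch (our own, using the assumed toy grammar S → 0S1 | ε) of the four components as Python values:

```python
# The 4-tuple (V, Sigma, R, S) for the hypothetical grammar S -> 0S1 | ε.
V = {"S"}                      # variables
Sigma = {"0", "1"}             # terminals, disjoint from V
R = {("S", "0S1"), ("S", "")}  # rules: (variable, string over V ∪ Sigma)
S = "S"                        # start variable

# Sanity checks that mirror the definition's requirements:
assert V.isdisjoint(Sigma)
assert all(head in V for head, _ in R)
assert S in V
```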


A Derivation – A Formal Definition

A word is a string of terminals.

A derivation of a word w from a Context Free Grammar G is a sequence of strings s₁, s₂, …, sₖ over V ∪ Σ, where:

  1. s₁ is the start variable of G.
  2. For each 1 < i ≤ k, sᵢ is obtained from s_{i-1} by applying a single production (rule) of G to one of the variables of s_{i-1}.


CF Grammar – A Formal Definition

A word w is in the language of grammar G, denoted by L(G), if there exists a derivation whose rightmost string is w.

Thus, L(G) = { w ∈ Σ* | G derives w }.

Example2: Arithmetical EXPS

Grammar G₂:

Rules:

  1. E → E + T | T
  2. T → T × F | F
  3. F → ( E ) | a

Derivation of a + a × a by Grammar G₂, applying a single rule at each step:

E ⇒ E + T ⇒ T + T ⇒ F + T ⇒ a + T ⇒ a + T × F ⇒ a + F × F ⇒ a + a × F ⇒ a + a × a

Note: There is more than one derivation.


Example3: The Language of Well-Formed Parentheses

To be demonstrated on the blackboard.


Ambiguity

We already saw that a word may have more than a single derivation from the same grammar.

A Leftmost Derivation is a derivation in which, at every step, the rule is applied to the leftmost variable.

A grammar is ambiguous if some word has two different parse trees (equivalently, two different leftmost derivations).
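Ambiguity can be checked mechanically for small words. The sketch below is our own illustration and assumes the hypothetical ambiguous grammar E → E+E | E×E | a (× written as * in code); it counts the distinct parse trees of a word:

```python
from functools import lru_cache

# Hypothetical ambiguous grammar E -> E+E | E*E | a.
def count_trees(s):
    @lru_cache(None)
    def count(i, j):                     # number of parse trees for s[i:j]
        n = 1 if s[i:j] == "a" else 0    # rule E -> a
        for k in range(i + 1, j - 1):    # split E op E at an operator
            if s[k] in "+*":
                n += count(i, k) * count(k + 1, j)
        return n
    return count(0, len(s))

print(count_trees("a+a*a"))  # 2 trees: (a+a)*a and a+(a*a) -> ambiguous word
print(count_trees("a+a"))    # 1 tree: this word is unambiguous
```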


Ambiguity

Reminder: Two parse trees are equal if they are equal as trees and all productions corresponding to their inner nodes are also equal.


Example4: Similar to Arith. EXPS

Grammar G₄:

Rules:

E → E + E | E × E | ( E ) | a


Example4: 1st Parse Tree for a + a × a


Example4: 2nd Parse Tree for a + a × a


Ambiguity

Note: Some ambiguous grammars may have an unambiguous equivalent grammar.

But: there exist Inherently Ambiguous Languages, i.e. context free languages all of whose grammars are ambiguous (for example, {aⁱbʲcᵏ | i = j or j = k}).


Discussion

Q: From a computational point of view, how strong are context free languages?

A: Since the language {0ⁿ1ⁿ | n ≥ 0} is not regular and it is CF, we conclude that the class CFL is not contained in the class RL.

Q: Can one prove RL ⊆ CFL?

A: Yes.


Discussion

Q: A language is regular if it is recognized by a DFA (or NFA). Does there exist a type of machine that characterizes CFL?

A: Yes, these are the Push-Down Automata (next lecture).

Q: Can one prove a language not to be CFL?

A: Yes, by the Pumping Lemma for CFL-s. For example, {aⁿbⁿcⁿ | n ≥ 0} is not CFL.


Introduction and Motivation

In this lecture we introduce Pushdown Automata, a computational model equivalent to context free languages.

A pushdown automaton is an NFA augmented with an unbounded stack.

The additional memory enables recognition of some non-regular languages.


Schematic of a Finite Automaton

[Figure: a finite control reading an input tape containing a b a a c]

Schematic of a Pushdown Automaton

[Figure: a finite control reading an input tape (symbols a, a, b, c, c) and accessing a stack (symbols x, y, z)]

Informal Description

A Pushdown Automaton (PDA) can write an unbounded number of Stack Symbols on the stack and read these symbols back later.

Writing a symbol onto the stack is called pushing; a push moves all symbols already on the stack one stack cell down.


Informal Description

Removing the topmost symbol from the stack is called popping; a pop moves every remaining symbol one stack cell up.

Note: A PDA can access only the stack’s topmost symbol (LIFO).
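The LIFO discipline in miniature, as a sketch using a Python list as the stack:

```python
# Only the topmost symbol is ever accessible.
stack = []
stack.append("x")   # push x
stack.append("y")   # push y: x moves one cell down
top = stack.pop()   # pop: returns "y"; x is on top again
print(top, stack)   # y ['x']
```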


A PDA Recognizing {0ⁿ1ⁿ | n ≥ 0}

This PDA reads symbols from the input.

As each 0 is read, it is pushed onto the stack.

As each 1 is read, a 0 is popped from the stack.

If the stack becomes empty exactly when the last 1 is read – accept.

Otherwise – reject.
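The informal description above can be sketched directly in code. This is our own deterministic illustration; the 0/1 alphabet and the idea of an emptiness marker $ follow the slides' description, while the function name is an assumption of ours:

```python
def accepts(w):
    """Accept exactly the words 0^n 1^n (n >= 0)."""
    stack = ["$"]              # $ marks the bottom of the stack
    seen_one = False
    for c in w:
        if c == "0":
            if seen_one:       # a 0 after a 1: wrong shape
                return False
            stack.append("0")  # push a 0 for every 0 read
        elif c == "1":
            seen_one = True
            if stack[-1] != "0":   # no matching 0 left
                return False
            stack.pop()        # pop a 0 for every 1 read
        else:
            return False
    # accept iff the stack became empty exactly at the end of the input
    return stack == ["$"]

print([w for w in ("", "01", "0011", "001", "10") if accepts(w)])  # ['', '01', '0011']
```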


Checking Stack Emptiness

The definition of a PDA does not give a special way to check emptiness.

One way to do it is to augment the stack alphabet with a special "emptiness" marker, the symbol $. (Note: there is nothing special about $; any other symbol not in the original stack alphabet would do.)


Checking Stack Emptiness

The computation is started by an ε-transition in which $ is pushed onto the stack.

If the end marker $ is found on the stack at the end of the computation, it is popped by a single additional transition after which the automaton “knows” that the stack is empty.


A PDA Recognizing {0ⁿ1ⁿ | n ≥ 0}

The label of each transition represents the input symbol read (left of the arrow) and the pushed stack symbol (right of the arrow).


A PDA Recognizing {0ⁿ1ⁿ | n ≥ 0}

The $ symbol, pushed onto the stack at the beginning of the computation, is used as an “empty” marker.


A PDA Recognizing {0ⁿ1ⁿ | n ≥ 0}

The PDA accepts either if the input is empty, or if scanning of the input is completed and the PDA is at an accepting state.


Nondeterministic PDAs

A Nondeterministic PDA allows nondeterministic transitions.

Nondeterministic PDA-s are strictly stronger than deterministic PDA-s.

In this respect the situation differs from that of DFA-s and NFA-s, which are equivalent.

Nondeterministic PDA-s are equivalent to CFL-s.


PDA – A Formal Definition

A pushdown automaton is a 6-tuple (Q, Σ, Γ, δ, q₀, F) where:

  1. Q is a finite set called the states.
  2. Σ is the input alphabet.
  3. Γ is the stack alphabet.
  4. δ: Q × Σ_ε × Γ_ε → P(Q × Γ_ε) is the transition function.
  5. q₀ ∈ Q is the start state, and
  6. F ⊆ Q is the set of accepting states.
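As a worked instance of the 6-tuple, here is the textbook-standard PDA for {0ⁿ1ⁿ | n ≥ 0}, together with a small configuration-search simulator. The Python encoding (dictionaries, "" for ε) is our own sketch, not part of the formal definition:

```python
from collections import deque

# The 6-tuple (Q, Sigma, Gamma, delta, q0, F); "" stands for epsilon.
Q = {"q1", "q2", "q3", "q4"}
Sigma, Gamma = {"0", "1"}, {"0", "$"}
delta = {
    ("q1", "", ""):   {("q2", "$")},  # start: push the emptiness marker
    ("q2", "0", ""):  {("q2", "0")},  # push a 0 for every 0 read
    ("q2", "1", "0"): {("q3", "")},   # first 1: pop a 0
    ("q3", "1", "0"): {("q3", "")},   # each further 1 pops a 0
    ("q3", "", "$"):  {("q4", "")},   # stack empty: move to accept
}
q0, F = "q1", {"q1", "q4"}

def pda_accepts(w):
    """Breadth-first search over configurations (state, position, stack)."""
    seen, queue = set(), deque([(q0, 0, ())])
    while queue:
        q, i, stack = queue.popleft()
        if (q, i, stack) in seen:
            continue
        seen.add((q, i, stack))
        if i == len(w) and q in F:
            return True
        a_opts = [""] + ([w[i]] if i < len(w) else [])
        x_opts = [""] + ([stack[-1]] if stack else [])
        for a in a_opts:          # try every applicable (read, pop) combination
            for x in x_opts:
                for (r, y) in delta.get((q, a, x), ()):
                    rest = stack if x == "" else stack[:-1]
                    queue.append((r, i + len(a), rest + ((y,) if y else ())))
    return False

print([w for w in ("", "01", "0011", "001", "10") if pda_accepts(w)])  # ['', '01', '0011']
```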


PDA - The Transition Function

Consider the expression δ(q, a, x).

Recall that a ∈ Σ_ε and x ∈ Γ_ε (where Σ_ε = Σ ∪ {ε} and Γ_ε = Γ ∪ {ε}).

Assume that the PDA is in state q, the next input symbol is a, and the top stack symbol is x.


PDA - The Transition Function

The next transition may depend on both the input symbol a and the stack symbol x, or only on a, or only on x, or on neither of them.

This choice is formally expressed by the argument of the transition function as detailed in the next slides.


Transition Function Sub-steps

Each step of the automaton is atomic, meaning it is executed in a single indivisible time unit.

For descriptive purposes only, each step is divided into two separate sub-steps:

Sub-step 1: A symbol may be read from the input, and a symbol may be read and popped off the stack.

Sub-step 2: A state transition is carried out, and a stack symbol may be pushed onto the stack.


Transition Function – 1st Sub-step

If the transition depends both on a and on x, we write δ(q, a, x). In this case a is consumed and x is removed from the stack.

If the transition depends only on a, we write δ(q, a, ε): a is consumed and the stack does not change.


Transition Function – 1st Sub-step

If the transition depends only on x, we write δ(q, ε, x). In this case a is not consumed and x is removed from the stack.

Finally, if the transition depends neither on a nor on x, we write δ(q, ε, ε). In this case a is not consumed and the stack is not changed.


PDA - The Transition Function

The range of the transition function is P(Q × Γ_ε): the power set of the Cartesian product of the set of PDA states and the stack alphabet (extended with ε).

Using pairs means that δ determines:

1. The new state to which the PDA moves.

2. The new stack symbol pushed onto the stack.


PDA - The Transition Function

Using the power set means that the PDA is nondeterministic: At any given situation, it may make a nondeterministic transition.

Finally, the use of Γ_ε means that at each transition the PDA may either push a stack symbol onto the stack or not (if the pushed component of the value is ε).


CFL-s and PDA-s are Equivalent

Theorem:

A language is CFL if and only if there exists a PDA accepting it.

Lemma:

For any CFL L, there exists a PDA P such that L(P) = L.


Proof Idea

Since L is a CFL, there exists a CFG G such that L(G) = L. We will present a PDA P that recognizes L.

The PDA P starts with a word w on its input.

In order to decide whether w ∈ L(G), P simulates a derivation of w.


Proof Idea (cont.)

Recall that a derivation is a sequence of strings, where each string contains variables and terminals. The first string is always the start symbol of G and each string is obtained from the previous one by a single activation of some rule.


Proof Idea (cont.)

A string may allow activation of several rules, and the PDA P nondeterministically guesses the next rule to be activated.

The initial idea for the simulation is to store each intermediate string on the stack: on each production, the string on the stack is transformed from the string before the production to the string after it.


Proof Idea (cont.)

Unfortunately, this idea does not quite work since at any given moment, P can only access the top symbol on the stack.

To overcome this problem, the stack holds only a suffix of each intermediate string, whose top symbol is the leftmost variable, the one to be substituted in the next production; the terminals preceding it have already been matched against the input.


The Intermediate String aaSbb

[Figure: a PDA whose input tape reads a a a b b; the stack holds S, b, b, $ (top to bottom), representing the intermediate string aaSbb]

Informal Description of P

Push the marker $ and the start symbol S on the stack.

Repeat

If the top symbol is a variable V – replace V by the right-hand side of some nondeterministically chosen rule whose left-hand side is V.

…..


Informal Description of P

Push the marker $ and the start symbol S on the stack.

Repeat

…..

If the top symbol is a terminal – compare it with the next input symbol. If they are equal – advance the input and pop the terminal; else – reject.


Informal Description of P

Push the marker $ and the start symbol S on the stack.

Repeat

…..

…..

If the top symbol is $ and the input is finished – accept; else – reject.
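The three-branch loop above can be sketched as a bounded breadth-first search over (input position, stack) configurations. The grammar S → 0S1 | ε and the stack-size bound are our own illustrative assumptions; the bound keeps the search finite for this grammar but is not part of the general construction:

```python
from collections import deque

# Hypothetical toy grammar: S -> 0S1 | ε.
rules = {"S": ["0S1", ""]}

def generates(w):
    max_stack = len(w) + 2                        # bound that suffices for this grammar
    seen, queue = set(), deque([(0, ("$", "S"))]) # push $ then the start symbol
    while queue:
        i, stack = queue.popleft()                # stack top = last element
        if (i, stack) in seen or len(stack) > max_stack:
            continue
        seen.add((i, stack))
        top = stack[-1]
        if top == "$":                            # marker on top:
            if i == len(w):
                return True                       # input finished -> accept
            continue                              # else this branch rejects
        if top in rules:                          # variable: guess a rule
            for rhs in rules[top]:
                # push rhs reversed, so its leftmost symbol ends up on top
                queue.append((i, stack[:-1] + tuple(reversed(rhs))))
        elif i < len(w) and w[i] == top:          # terminal: match input, pop it
            queue.append((i + 1, stack[:-1]))
    return False

print([w for w in ("", "01", "0011", "011", "10") if generates(w)])  # ['', '01', '0011']
```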


The Proof

We start by defining Extended Transitions:

Assume that PDA P is in state q, reads a from the input, pops x off the stack, and then moves to state r while pushing the string u = u₁u₂…uₗ onto the stack.

This is denoted by (r, u) ∈ δ(q, a, x).

Next, extended transitions are implemented.


Implementing Extended Trans.

Add new states q₁, q₂, …, q_{ℓ-1}.

Set the transition function as follows:

Add (q₁, uₗ) to δ(q, a, x).

Set δ(q₁, ε, ε) = {(q₂, u_{ℓ-1})},

δ(q₂, ε, ε) = {(q₃, u_{ℓ-2})},

……

δ(q_{ℓ-1}, ε, ε) = {(r, u₁)}.

(see next slide)


Implementing Extended Trans.

This extended transition is implemented by the following transition sequence:


The Proof

Let G be an arbitrary CFG. Now we are ready to construct the PDA P such that L(P) = L(G). The states of P are Q = {q_start, q_loop, q_accept} ∪ E,

where E contains all the states needed to implement the extended transitions presented in the previous slide.

The PDA P is presented on the next slide:


The Resulting PDA

This completes the proof.


Example

Consider the following CFG:

[Diagrams: the schematic PDA; implementing the first transition; implementing the 1st–4th rules with variables; implementing the rules with constants]

That’s All Folks!!!

Introduction and Motivation

In this lecture we present the Pumping Lemma for Context Free Languages.

This lemma enables us to prove that some languages are not CFL and hence are not recognizable by any PDA.


The Pumping Lemma

Let A be a context free language. There exists a number p (the pumping length) such that every w ∈ A with |w| ≥ p may be divided into five parts, w = uvxyz, satisfying:

  1. for each i ≥ 0, it holds that uvⁱxyⁱz ∈ A.
  2. |vy| > 0.
  3. |vxy| ≤ p.

Note: Without requirement 2 the theorem is trivial.


Proof Idea

If w is “long enough” (to be defined precisely below), it has a large parse tree containing a “long enough” path from its root to one of its leaves.

Under these conditions, some variable on that path must appear twice. This enables pumping of w, as demonstrated on the next slide:


Proof Idea

[Figure: a parse tree in which a variable R repeats on one root-to-leaf path, together with the trees obtained by pumping up and pumping down]

Reminder

Let T be a binary tree.

The 0-th level of T has 1 node.

The 1st level of T has at most 2 nodes.

The i-th level of T has at most 2ⁱ nodes.

If T′ is a b-ary tree, then its i-th level has at most bⁱ nodes.
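The counting fact this reminder is building towards, in one line:

```latex
\#\{\text{nodes at level } i\} \le b^{i}
\;\Longrightarrow\;
\#\{\text{leaves of a tree of height } h\} \le b^{h}
\;\Longrightarrow\;
\bigl(|w| > b^{h} \Rightarrow \text{height of the parse tree} > h\bigr).
```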


The Proof

Let G be a grammar for the language L. Let b be the maximum number of symbols (variables and terminals) on the right hand side of a rule of G (assume b ≥ 2). In any parse tree T for generating w from G, a node of T may have no more than b children.

If the height of T is h, then |w| ≤ b^h.


The Proof (cont.)

If the height of T is h, then |w| ≤ b^h. Conversely,

if |w| > b^h, then the height of T is greater than h.

Assume that G has |V| variables. Then we set p = b^(|V|+1).

Conclusion: For any w ∈ L, if |w| ≥ p, then the height of any parse tree of w is at least |V| + 1.


The Proof (cont.)

To see how pumping works, let τ be a parse tree of w with a minimal number of nodes. The height of τ is at least |V| + 1, so it has a path with at least |V| + 2 nodes from its root to some leaf. Such a path contains at least |V| + 1 variables and a single terminal (at the leaf).


The Proof (cont.)

Since G has only |V| variables and the path has at least |V| + 1 variables, there exists a variable R that repeats among the lowest |V| + 1 variables of the path, as depicted in the following picture:

[Figure: a parse tree with two occurrences of R on one root-to-leaf path]

The Proof (cont.)

Each occurrence of R has a sub-tree rooted at it. Let vxy be the word generated by the upper occurrence of R, and let x be the word generated by the lower occurrence of R.

[Figure: the two sub-trees rooted at the occurrences of R]

The Proof (cont.)

Since both sub-trees are generated by the same variable, each of these sub-trees can be replaced by the other. In particular, a new legal parse tree is obtained from τ by substituting the upper sub-tree at the lower occurrence of R.

[Figure: the parse tree after the substitution]

The Proof (cont.)

The word generated by the new tree is uv²xy²z, and since it is generated by a parse tree of G, we get uv²xy²z ∈ L(G). Additional substitutions of the upper sub-tree at the lower occurrence of R yield the conclusion uvⁱxyⁱz ∈ L(G) for each i ≥ 1.

[Figure: the pumped-up parse tree]

The Proof (cont.)

Substitution of the lower sub-tree at the upper occurrence of R yields a parse tree whose generated word is uxz. Since once again this is a legitimate parse tree, we get uxz = uv⁰xy⁰z ∈ L(G).

[Figure: the pumped-down parse tree]

The Proof (cont.)

To see that |vy| > 0, assume otherwise, i.e. v = y = ε. In this case the pumped-down tree is a parse tree for w with fewer nodes than τ, in contradiction with the choice of τ as a parse tree for w with a minimal number of nodes.

[Figure: the smaller parse tree for w]

The Proof (cont.)

In order to show that |vxy| ≤ p, recall that we chose R so that both its occurrences fall within the bottom |V| + 1 variables of a longest path of the tree. Hence the height of the sub-tree rooted at the upper occurrence of R is at most |V| + 1, and the number of its leaves is at most b^(|V|+1) = p.

[Figure: the sub-tree rooted at the upper occurrence of R, which generates vxy]

Using the Pumping Lemma

Now we use the pumping lemma for CFL-s to show that the language L = {aⁿbⁿcⁿ | n ≥ 0} is not CFL.

Assume towards a contradiction that L is CFL and let p be the pumping constant. Consider w = aᵖbᵖcᵖ. Obviously |w| ≥ p.


Using the Pumping Lemma

By the pumping lemma, there exists a partition w = uvxyz where |vy| > 0, |vxy| ≤ p, and for each i ≥ 0 it holds that uvⁱxyⁱz ∈ L.

Case 1: Each of v and y contains one kind of symbol. Together they may hold at most 2 of the 3 symbols, so in uv²xy²z the third symbol appears fewer times than the other two.


Using the Pumping Lemma

Case 2: Either v or y contains two kinds of symbols. In this case the word uv²xy²z has more than three blocks of identical letters; in other words, uv²xy²z ∉ L. Q.E.D.
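For a small p, the case analysis can be checked by brute force. The sketch below is our own demonstration: it tries every split w = uvxyz with |vy| > 0 and |vxy| ≤ p, and tests only i = 0 and i = 2, which is already enough to rule out every split here:

```python
def in_L(s):
    """Membership in L = {a^n b^n c^n | n >= 0}."""
    n = len(s) // 3
    return s == "a" * n + "b" * n + "c" * n

def has_pumpable_split(w, p):
    """Does SOME split w = uvxyz with |vy| > 0, |vxy| <= p survive i = 0 and i = 2?"""
    n = len(w)
    for start in range(n + 1):
        for vlen in range(p + 1):
            for xlen in range(p + 1 - vlen):
                for ylen in range(p + 1 - vlen - xlen):
                    if vlen + ylen == 0 or start + vlen + xlen + ylen > n:
                        continue
                    u = w[:start]
                    v = w[start:start + vlen]
                    x = w[start + vlen:start + vlen + xlen]
                    y = w[start + vlen + xlen:start + vlen + xlen + ylen]
                    z = w[start + vlen + xlen + ylen:]
                    if in_L(u + x + z) and in_L(u + v * 2 + x + y * 2 + z):
                        return True
    return False

p = 3
print(has_pumpable_split("a" * p + "b" * p + "c" * p, p))  # False: no split pumps
```

Since no split survives, the lemma's conclusion fails for w, confirming (for this p) that L cannot be context free.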


Discussion

Some weeks ago we started our quest to find out “What can be computed and what cannot?”

So far we have identified two classes, RL-s and CFL-s, and found some example languages that belong to neither class.


Discussion

This is what we got so far:

[Diagram: the classes seen so far — RL-s, properly contained in CFL-s, with example non-CFL languages outside, and ??? beyond]

Discussion

Moreover, our most complex example, namely the language {aⁿbⁿcⁿ | n ≥ 0}, is easily recognizable by your everyday computer, so we have not gotten very far yet. Our next attempt to grasp the essence of “What’s Computable?” is the Turing Machine.

Some surprises are awaiting…


Recap

In this lecture we introduced and proved the Pumping Lemma for CFL-s

Using this lemma we managed to prove that the fairly simple language {aⁿbⁿcⁿ | n ≥ 0} is not CFL.

The next step is to define Turing Machines.
