1 of 221

UNIT-1�INTRODUCTION TO

FINITE AUTOMATA��PREPARED BY:SRIRAM

2 of 221

Alphabet

An alphabet is a finite and non empty set of symbols.

The example of alphabet is,

S={a, b, c,……..z}

The elements a, b, c, …….z are alphabets.

If W={0,1}

Here 0 and 1 are alphabets, denoted by W.

3 of 221

String

TOC SP CHARI

  • It is finite collection of symbols from alphabet.

For example, if Σ = {a,b} then various strings that can be formed from Σ are {ab, ba, aaa, bbbb, aba, bab, ….}. An infinite number of strings can be generated.

Language

  • The language is a collection of appropriate strings.
  • The language is defined using an input set. The input set is typically denoted by letter Σ.

For example: Σ ={Ꜫ,0,00,000,0000,….}.

Here the language L is defined as ‘Any number of Zeros’.

4 of 221

Examples

TOC SP CHARI

  1. The set of all strings over {a,b,c} that have ac as substring can be written as {xacy | x,y {a,b,c}}.

This can also be written as {x{a,b,c}||x|ac ≥1},

stating that the set of all strings over {a,b,c} in which the number of occurrences of substring ac is at least 1.

  1. The set of all strings over some alphabet Σ with even number of a’s is {xΣ||x|a = 2n, for some nN}.

Equivalently, {xΣ||x|a ≡ 0 mod 2}.

  1. The set of all strings over some alphabet Σ with equal number

of a’s and b’s can be written as {xΣ||x|a =|x|b}.

  1. The set of all palindromes over an alphabet Σ can be written as

{xΣ| x = xR}, where xR is the string obtained by reversing x.

5 of 221

Finite State Machine

TOC notes by S P CHARI

  • The finite state machine represents a mathematical model of a system with certain input.

  • The model finally gives certain output.

  • The input is processed by various states, these states are called as intermediate states.

  • The finite state system is very good design tool for the programs such as TEXT EDITORS and LEXICAL ANALYZERS.

6 of 221

Definition of Finite Automata

TOC notes by S P CHARI

  • A finite automata is a collection of 5-tuple(Q,Σ,δ,q0,F) Where , Q is finite set of states, which is non empty.

Σ is input alphabet, indicates input set.

δ is transition function or mapping function. We can determine the next state using this function.

q0 is an initial state and is in Q F is set of final states.

7 of 221

Finite Automata Model

TOC notes by S P CHARI

  • The finite automata can be represented as follows:

Input Tape

A

B

A

B

A

B

A

B

A

B

Finite

Control

Tape

Reader

  • Input Tape: It is a linear tape having some number of cells. Each input symbol is placed in each cell.
  • Finite Control: The finite control decides the next state on receiving particular input from input tape.

Out put

8 of 221

Types of Automata

TOC notes by S P CHARI

Finite Automata

Deterministic Finite Automata

Non Deterministic Finite Automata

9 of 221

Types of Automata

  • Deterministic Finite Automata: The Finite Automata is called Deterministic Finite Automata if there is only one path for a specific input from current state to next state.

It can be represented as follows:

  • A machine M = (Q,Σ,δ,q0,F) Where ,

Q is finite set of states, which is non empty.

Σ is input alphabet, indicates input set.

δ is transition function or mapping function. We can determine the next state using this function.

q0 is an initial state and is in Q F is set of final states.

Where δ:Q X Σ -> Q

TOC notes by S P CHARI

KMIT

10 of 221

Types of Automata

  • Non Deterministic Finite Automata: The Finite Automata is called Non Deterministic Finite Automata if there are more than one path for a specific input from current state to next state.

It can be represented as follows:

  • A machine M = (Q,Σ,δ,q0,F) Where ,

Q is finite set of states, which is non empty.

Σ is input alphabet, indicates input set.

δ is transition function or mapping function. We can determine the next state using this function.

q0 is an initial state and is in Q F is set of final states.

Where δ:Q X Σ -> 2TOQC notes by Pallavi Joshi

11 of 221

Difference between DFA & NFA

TOC notes by S P CHARI

Deterministic Finite Automata

Non Deterministic Finite Automata

For Every symbol of the alphabet, there is only one

state transition in DFA.

We do not need to specify how does the NFA react

according to some symbol.

DFA cannot use Empty String transition.

NFA can use Empty String transition.

DFA can be understood as one machine.

NFA can be understood as multiple little machines

computing at the same time.

DFA will reject the string if it end at other than

accepting state.

If all of the branches of NFA dies or rejects the string,

we can say that NFA reject the string.

Backtracking is allowed in DFA.

Backtracking is not always allowed in NFA.

DFA is more difficult to construct.

NFA is easier to construct.

12 of 221

Hierarchy of languages

TOC notes by S P CHARI

Recursively Enumerable Languages

Recursive Languages Context-Free Languages

Regular Languages

Non-Recursively Enumerable Languages

13 of 221

Deterministic Finite State Automata (DFA)

TOC notes by S P CHARI

……..

ken into cells

ead.

One-way, infinite tape, bro One-way, read-only tape h Finite control, i.e.,

finite number of states, and

transition rules between them, i.e.,

a program, containing the position of the read head, current symbol being scanned, and the current “state.”

A string is placed on the tape, read head is positioned at the left end, and the DFA will read the string one symbol at a time until all symbols have been read. The DFA will then either accept or reject the string.

Finite Control

0

1

1

0

0

14 of 221

TOC notes by S P CHARI

The finite control can be described by a transition diagram or table:

Example #1:

1

q0 q0 q1

1 1

q0 q0

One state is final/accepting, all others are rejecting.

The above DFA accepts those strings that contain an even number of 0’s, including the

null string, over Sigma = {0,1}

L = {all strings with zero or more 0’s}

Note, the DFA must reject all other strings

q0

0 0

q1

0

q0

1

0

1

15 of 221

Note:

TOC notes by S P CHARI

  • Machine is for accepting a language, language is the purpose!

  • Many equivalent machines may accept the same language, but a machine cannot accept multiple languages!

L

  • Id’s of the characters or states are irrelevant,

you can call them by any names!

Sigma = {0, 1} ≡ {a, b}

States = {q0, q1} ≡ {u, v}, as long as they have

identical (isomorphic) transition table

M1

M2

….

M-inf

16 of 221

An equivalent machine to the previous example (DFA for even number of 0’s):

TOC notes by S P CHARI

q0 q3 q1 q2 q2 q2

One state is final/accepting, all others are rejecting.

The above DFA accepts those strings that contain an even number of 0’s, including null

string, over Sigma = {0,1}

Can you draw a machine for a language by excluding the null string from the language? L =

{all strings with 2 or more 0’s}

q0

1 0 0 1 1q1

0

0

1

accept string

q2

1

1

0

q3

1

0

17 of 221

Example #2:

TOC notes by S P CHARI

a

c

c

c

b

accepted

q0

q0

q1

q2

q2

q2

a

a

c

rejected

q0 q0 q0 q1

Accepts those strings that contain at least two c’s

q1

q0

q2

a

b

a

b

c

c

a/b/c

18 of 221

a/b/c

q2

a q0

a

q1

b

b

c

c

Inductive Proof (sketch): that the machine correctly accepts strings with at least two c’s

Proof goes over the length of the string.

Base: x a string with |x|=0. state will be q0 => rejected.

Inductive hypothesis: |x|= integer k, & string x is rejected - in state q0 (x must have zero c),

OR, rejected – in state q1 (x must have one c),

OR, accepted – in state q2 (x has already with two c’s)

Inductive steps: Each case for symbol p, for string xp (|xp| = k+1), the last symbol p = a, b or c

xa

xb

xc

x ends in q0

q0 =>reject

(still zero c => should reject)

q0 =>reject

(still zero c => should reject)

q1 =>reject

(still zero c => should reject)

x ends in q1

q1 =>reject

(still one c => should reject)

q1 =>reject

(still one c => should reject)

q2 =>accept

(two c now=> should

accept)

x ends in q2

q2 =>accept

(two c already => should

accept) TOC notes by Palla

q2 =>accept

(two c already => should vi Jaocscheipt)

q2 =>accept

(two c already => should

accept)

19 of 221

Formal Definition of a DFA

TOC notes by S P CHARI

A DFA is a five-tuple:

M = (Q, Σ, δ, q0, F)

Q Σ

q0

F

δ

A finite set of states

A finite input alphabet

The initial/starting state, q0 is in Q

A set of final/accepting states, which is a subset of Q

A transition function, which is a total function from Q x Σ to Q

δ: (Q x Σ) –> Q

δ(q,s) = q’

δ is defined for any q in Q and s in Σ, and

is equal to some state q’ in Q, could be q’=q

Intuitively, δ(q,s) is the state entered by M after reading symbol s while in state q.

20 of 221

Revisit example #1:

TOC notes by S P CHARI

Q = {q0, q1}

Σ = {0, 1}

Start state is q0

F = {q0}

δ:

0

q1

1

q0

q0

q1

q0

q1

q

0

q1

0

0

1

1

21 of 221

Revisit example #2:

TOC notes by S P CHARI

Q = {q0, q1, q2}

Σ = {a, b, c} Start state is q0 F = {q2}

δ:

a

q0

b q0

c

q1

q0

q1

q1

q2

q2

q2

q2

n, at each st

ep M has

exactly on

q1

q2

Since δ is a functio e option.

It follows that for a given string, there is exactly one computation.

q1

q0

q2

a

b

a

b

c

c

a/b/c

22 of 221

Extension of δ to Strings

TOC notes by S P CHARI

δ^ : (Q x Σ*) –> Q

δ^(q,w) – The state entered after reading string w having started in state q.

Formally:

  1. δ^(q, ε) = q, and
  2. For all w in Σ* and a in Σ

δ^(q,wa) = δ (δ^(q,w), a)

23 of 221

Recall Example #1:

TOC notes by S P CHARI

δ^(q0, 011)

= δ (δ^(q0,01), 1)

= δ (δ ( δ^(q0,0), 1), 1)

= δ (δ (δ (δ^(q0, λ), 0), 1), 1)

= δ (δ (δ(q0,0), 1), 1)

= δ (δ (q1, 1), 1)

= δ (q1, 1)

= q1

by rule #2

by rule #2

by rule #2

by rule #1

by definition of δ by definition of δ by definition of δ

Is 011 accepted? No, since δ^(q0, 011) = q1 is not a final state.

q0

q1

0

What is δ^(q0, 011)? Informally, it is the state entered by0 M after processing 011 having started in state q0.

Formally:

1

1

24 of 221

Note that:

TOC notes by S P CHARI

δ^ (q,a) = δ(δ^(q, ε), a)

= δ(q, a)

by definition of δ^, rule #2

by definition of δ^, rule #1

Therefore:

δ^ (q, a1a2…an) = δ(δ(…δ(δ(q, a1), a2)…), an)

However, we will abuse notations, and use δ in place of δ^: δ^(q, a1a2…an) = δ(q, a1a2…an)

25 of 221

Example #3:

TOC notes by S P CHARI

δ(q0, 011)

= δ (δ(q0,01), 1)

= δ (δ (δ(q0,0), 1), 1)

= δ (δ (q1, 1), 1)

= δ (q1, 1)

= q1

by rule #2

by rule #2

by definition of δ by definition of δ by definition of δ

Is 011 accepted? No, since δ(q0, 011) = q1 is not a final state.

Language?

L ={ all strings over {0,1} that has 2 or more 0 symbols}

q

q

0 1

q

2

1

1

0

0

1

What is δ(q0, 011)? Informally, it is the state entered by M after process0ing 011 having

started in state q0.

Formally:

26 of 221

Recall Example #3:

TOC notes by S P CHARI

What is δ(q1, 10)?

δ(q1, 10)

= δ (δ(q1,1), 0)

by rule #2

= δ (q1, 0)

by definition of δ

= q2

by definition of δ

Is 10 accepted? No, since δ(q0, 10) = q1 is not a final state. The fact that δ(q1, 10) = q2 is

irrelevant, q1 is not the start state!

0

q

1

q

0

q

2

1

1

0

1

0

27 of 221

Definitions related to DFAs

Let M = (Q, Σ, δ,q0,F) be a DFA and let w be in Σ*. Then w is accepted by M iff δ(q0,w) = p for some state p in F.

Let M = (Q, Σ, δ,q0,F) be a DFA. Then the language accepted by M is the set:

L(M) = {w | w is in Σ* and δ(q0,w) is in F} Another equivalent definition:

L(M) = {w | w is in Σ* and w is accepted by M}

Let L be a language. Then L is a regular language iff there exists a DFA M such that L = L(M).

Let M1 = (Q1, Σ1, δ1, q0, F1) and M2 = (Q2, Σ2, δ2, p0, F2) be DFAs. Then M1 and M2 are equivalent iff

L(M1) = L(M2).

TOC notes by S P CHARI

28 of 221

TOC notes by S P CHARI

Notes:

A DFA M = (Q, Σ, δ,q0,F) partitions the set Σ* into two sets: L(M) and

Σ* - L(M).

If L = L(M) then L is a subset of L(M) and L(M) is a subset of L (def. of set equality). Similarly, if L(M1) = L(M2) then L(M1) is a subset of L(M2) and L(M2) is a subset of L(M1). Some languages are regular, others are not. For example, if

Regular: L1 = {x | x is a string of 0's and 1's containing an even

number of 1's} and

Not-regular: L2 = {x | x = 0n1n for some n >= 0}

Can you write a program to “simulate” a given DFA, or any arbitrary input DFA?

Question we will address later:

How do we determine whether or not a given language is regular?

29 of 221

Give a DFA M such that:

TOC notes by S P CHARI

L(M) = {x | x is a string of 0’s and 1’s and |x| >= 2}

Prove this by induction

q1

q0

q2

0/1

0/1

0/1

30 of 221

Give a DFA M such that:

TOC notes by S P CHARI

L(M) = {x | x is a string of (zero or more) a’s, b’s and c’s such that x does not contain the substring aa}

Logic:

In Start state (q0): b’s and c’s: ignore – stay in same state

q0 is also “accept” state

First ‘a’ appears: get ready (q1) to reject

But followed by a ‘b’ or ‘c’: go back to start state q0

When second ‘a’ appears after the “ready” state: go to reject state q2 Ignore everything after getting to the “reject” state q2

a

a/b/c

a

q0 q1 q2

b/c

b/c

31 of 221

Give a DFA M such that:

TOC notes by S P CHARI

L(M) = {x | x is a string of a’s, b’s and c’s such that x contains the substring aba}

a

a/b/c

b

q0 q1

c

Logic: acceptance is straight forward, progressing on each expected symbol

However, rejection needs special cabre/c, in each state (for DFA, we will see this becomes

easier in NFA, non-deterministic machine)

q2 q3

b/c

a

a

32 of 221

Give a DFA M such that:

TOC notes by S P CHARI

q0

b

q7

Remember, you may havqe multiple “final” staqtes, but only one “stqart” state

4 5 6

b

b

b

a

q2

q1

q3

a

L(M) = {x | x is a string of a’s and b’s such that x

contains both aa and bb}

First do, for a language where ‘aa’ comes before ‘bb’

Then do its reverse; and then parallelize them.

a

a

b

a/b

b

a

a

a

b

33 of 221

TOC notes by S P CHARI

Let Σ = {0, 1}. Give DFAs for {}, {ε}, Σ*, and Σ+.

For {}: For {ε}:

For Σ*:

For Σ+:

0/1

q0

0/1

q0

q1

q0

0/1

0/1

0/1

q0

q1

0/1

34 of 221

Problem: Third symbol from last is 1

TOC notes by S P CHARI

0/1

q1

q0

q3

1

0/1

q2

0/1

Is this a DFA?

No, but it is a Non-deterministic Finite Automaton

35 of 221

Nondeterministic Finite State Automata (NFA)

TOC notes by S P CHARI

An NFA is a five-tuple:

M = (Q, Σ, δ, q0, F)

Q Σ

q0

F

δ

A finite set of states

A finite input alphabet

The initial/starting state, q0 is in Q

A set of final/accepting states, which is a subset of Q

A transition function, which is a total function from Q x Σ to 2Q

δ: (Q x Σ) –> 2Q :2Q is the power set of Q, the set of all subsets of Q

:The set of all states p such that there is a transition

labeled s from q to p

δ(q,s)

δ(q,s) is a function from Q x S to 2Q (but not only to Q)

36 of 221

TOC notes by S P CHARI

Example #1: one or more 0’s followed by one or more 1’s

Q = {q0, q1, q2}

Σ = {0, 1}

Start state is q0

F = {q2}

δ:

0

1

q0

q

q

1 {q0, q1}

{}

2 {}

{q1, q2}

{q2}

{q2}

q1

q0

q2

0

1

0

1

0/1

37 of 221

Example #2: pair of 0’s or pair of 1’s as substring

Q = {q0, q1, q2 , q3 , q4}

Σ = {0, 1}

Start state is q0

F = {q2, q4}

δ: 0

1

q0

q1

q2 q3 q4

q0

0/1

0

0

q3

q4

0/1

q1

q2

0/1

Joshi

1

1

{q0, q3}

{q0, q1}

{}

{q2}

{q2}

{q2}

{q4}

{}

{q4}

{q4}

TOC notes by Pallavi

38 of 221

Notes:

TOC notes by S P CHARI

δ(q,s) may not be defined for some q and s (what does that mean?) δ(q,s) may map to multiple q’s

A string is said to be accepted if there exists a path from q0 to some state in F

A string is rejected if there exist NO path to any state in F

The language accepted by an NFA is the set of all accepted strings

Question: How does an NFA find the correct/accepting path for a given string?

NFAs are a non-intuitive computing model

You may use backtracking to find if there exists a path to a final state (following slide)

Why NFA?

We are primarily interested in NFAs as language defining capability, i.e., do NFAs accept languages that

DFAs do not?

Other secondary questions include practical ones such as whether or not NFA is easier to develop, or how does one implement NFA

39 of 221

Determining if a given NFA (example #2) accepts a given string (001) can be done

algorithmically:

TOC notes by S P CHARI

q0

q0

q0

q3

q3

q4 q4

Each level will have at most n states:

Complexity: O(|x|*n), for running over a string x

accepted

1q0

q1

0

0

q0

0/1

0

q3

1

1

q2

q1

0

0/1

q4

0/1

40 of 221

Another example (010):

TOC notes by S P CHARI

q0

q0

q0

q0

q3

q1

q3

not accepted

All paths have been explored, and none lead to an accepting state.

0

1

0

q0

0/1

0

q3

q4

1

1

q2

q1

0

41 of 221

Question: Why non-determinism is useful?

TOC notes by S P CHARI

Non-determinism = Backtracking Compressed information

Non-determinism hides backtracking

Programming languages, e.g., Prolog, hides backtracking => Easy to program at a higher level: what we

want to do, rather than how to do it

Useful in algorithm complexity study

Is NDA more “powerful” than DFA, i.e., accepts type of languages that any DFA cannot?

42 of 221

TOC notes by S P CHARI

Let Σ = {a, b, c}. Give an NFA M that accepts:

L = {x | x is in Σ* and x contains ab}

Is L a subset of L(M)? Or, does M accepts all string in L?

Is L(M) a subset of L? Or, does M rejects all strings not in L?

Is an NFA necessary? Can you draw a DFA for this L?

Designing NFAs is not as trivial as it seems: easy to create bug accepting string outside

language

q0 q1

q2

a

a/b/c

b

a/b/c

43 of 221

TOC notes by S P CHARI

Let Σ = {a, b}. Give an NFA M that accepts:

L = {x | x is in Σ* and the third to the last symbol in x is b}

a/b

Is L a subset of L(M)?

Is L(M) a subset of L?

Give an equivalent DFA as an exercise.

q1

q0

b

q3

a/b

q2

a/b

44 of 221

Extension of δ to Strings and Sets of States

What we currently have:

δ : (Q x Σ) –> 2Q

q0

0

1

q1

q4

0

1

q2

0

0

1

What we want (why?): δ : (2Q x Σ*) –> 2Q

We will do this in two steps, which will be slightly different from the book, and we will make use of the following NFA.

0

q3

TOC notes by S P CHARI

0

45 of 221

Extension of δ to Strings and Sets of States

TOC notes by S P CHARI

Step #1:

Given δ: (Q x Σ) –> 2Q define δ#: (2Q x Σ) –> 2Q as follows:

for all subsets R of Q, and symbols a in Σ

Note that:

1) δ#(R, a) = δ(q, a)

qR

δ#({p},a) = δ(q, a)

= δ(p, a)

by definition of δ#, rule #1 above

q{ p}

Hence, we can use δ for δ#

δ({q0, q2}, 0)

δ({q0, q1, q2}, 0)

These now make sense, but previously they did not.

46 of 221

Example:

TOC notes by S P CHARI

δ({q0, q2}, 0)

= δ(q0, 0) U δ(q2, 0)

= {q1, q3} U {q3, q4}

= {q1, q3, q4}

δ({q0, q1, q2}, 1) = δ(q0, 1) U δ(q1, 1) U δ(q2, 1)

= {} U {q2, q3} U {}

= {q2, q3}

47 of 221

Step #2:

TOC notes by S P CHARI

Given δ: (2Q x Σ) –> 2Q define δ^: (2Q x Σ*) –> 2Q as follows:

δ^(R,w) – The set of states M could be in after processing string w, having started from any

state in R.

Formally:

  1. δ^(R, ε) = R
  2. δ^(R,wa) = δ (δ^(R,w), a)

for any subset R of Q

for any w in Σ*, a in Σ, and

subset R of Q

Note that:

δ^(R, a)

= δ(δ^(R, ε), a)

= δ(R, a)

by definition of δ^, rule #3 above by definition of δ^, rule #2 above

Hence, we can use δ for δ^

δ({q0, q2}, 0110)

δ({q0, q1, q2}, 101101)

These now make sense, but previously they did not.

48 of 221

TOC notes by S P CHARI

Example:

What is δ({q0}, 10)?

Informally: The set of states the NFA could be in after processing 10,

having started in state q0, i.e., {q1, q2, q3}.

Formally:

δ({q0}, 10) = δ(δ({q0}, 1), 0)

= δ({q0}, 0)

= {q1, q2, q3}

Is 10 accepted? Yes!

q0

0

1

q1

q3

0

1

q2

1

1

0

49 of 221

Example:

TOC notes by S P CHARI

What is δ({q0, q1}, 1)?

δ({q0 , q1}, 1)

= δ({q0}, 1) δ({q1}, 1)

= {q0} {q2, q3}

= {q0, q2, q3}

What is δ({q0, q2}, 10)?

δ({q0 , q2}, 10)

= δ(δ({q0 , q2}, 1), 0)

= δ(δ({q0}, 1) U δ({q2}, 1), 0)

= δ({q0} {q3}, 0)

= δ({q0,q3}, 0)

= δ({q0}, 0) δ({q3}, 0)

= {q1, q2, q3} {}

= {q1, q2, q3}

50 of 221

Example:

TOC notes by S P CHARI

δ({q0}, 101)

= δ(δ({q0}, 10), 1)

= δ(δ(δ({q0}, 1), 0), 1)

= δ(δ({q0}, 0), 1)

= δ({q1 , q2, q3}, 1)

= δ({q1}, 1) U δ({q2}, 1) U δ({q3}, 1)

= {q2, q3} U {q3} U {}

= {q2, q3}

Is 101 accepted? Yes! q3 is a final state.

51 of 221

Definitions for NFAs

TOC notes by S P CHARI

Let M = (Q, Σ, δ,q0,F) be an NFA and let w be in Σ*. Then w is accepted by M iff δ({q0}, w)

contains at least one state in F.

Let M = (Q, Σ, δ,q0,F) be an NFA. Then the language accepted by M is the set: L(M) = {w | w is in Σ* and δ({q0},w) contains at least one state in F}

Another equivalent definition:

L(M) = {w | w is in Σ* and w is accepted by M}

52 of 221

Equivalence of DFAs and NFAs

TOC notes by S P CHARI

Do DFAs and NFAs accept the same class of languages?

Is there a language L that is accepted by a DFA, but not by any NFA? Is there a language L that is accepted by an NFA, but not by any DFA?

Observation: Every DFA is an NFA, DFA is only restricted NFA.

Therefore, if L is a regular language then there exists an NFA M such that L = L(M). It follows that NFAs accept all regular languages.

But do NFAs accept more?

53 of 221

TOC notes by S P CHARI

Consider the following DFA: 2 or more c’s

Q = {q0, q1, q2}

Σ = {a, b, c} Start state is q0 F = {q2}

δ:

a

q0

b q0

c

q1

q0

q1

q2

q1

q1

q2

q2

q2

q2

q1

q0

q2

a

b

a

b

c

c

a/b/c

54 of 221

An Equivalent NFA:

TOC notes by S P CHARI

Q = {q0, q1, q2}

Σ = {a, b, c} Start state is q0 F = {q2}

δ:

a

b

c

{q0}

{q0}

{q1}

q0

q1

q2

{q1}

{q1} {q

2}

{q2}

{q2} {q

2}

q1

q0

q2

a

b

a

b

c

c

a/b/c

55 of 221

TOC notes by S P CHARI

Lemma 1: Let M be an DFA. Then there exists a NFA M’ such that L(M) = L(M’).

Proof: Every DFA is an NFA. Hence, if we let M’ = M, then it follows that L(M’) = L(M). The above is just a formal statement of the observation from the previous slide.

56 of 221

TOC notes by S P CHARI

Lemma 2: Let M be an NFA. Then there exists a DFA M’ such that L(M) = L(M’).

Proof: (sketch)

Let M = (Q, Σ, δ,q0,F).

Define a DFA M’ = (Q’, Σ, δ’,q0,F’) as:

Q’ = 2Q

= {Q0, Q1,…,}

Each state in M’ corresponds to a

subset of states from M

where Qu = [qi0, qi1,…qij]

F’ = {Qu | Qu contains at least one state in F} q0 = [q0]

δ’(Qu, a) = Qv iff δ(Qu, a) = Qv

57 of 221

TOC notes by S P CHARI

Example: empty string or start and end with 0

Q = {q0, q1}

Σ = {0, 1}

Start state is q0

F = {q0}

δ:

0

1

q0

q

1 {q1}

{}

{q0, q1}

{q1}

q1

q0

0

0/1

0

58 of 221

Example of creating a DFA out of an NFA (as per the constructive proof):

TOC notes by S P CHARI

δ for DFA:

0

->q0

1

q1

q0 -->q0

q1

0/1

0

0

[q1]{q1}

[ ] write as [q1]

{}

write as [ ]

59 of 221

TOC notes by S P CHARI

δ:

0

1

->q0

[q1 [ ]

[q01

q1

q0

Example of creating a DFA out of an NFA (as per the constructive proof):

0/1

0

0

]{q1} write

as [q1]

{}

{q ,q }

]w 0 e1 s

rit a

[q01]

{q1}

60 of 221

TOC notes by S P CHARI

δ:

0

1

->q0

[q1 [ ]

[q01

q1

q0

Example of creating a DFA out of an NFA (as per the constructive proof):

0/1

0

0

]{q1} write

as [q1]

{}

{q ,q }

]w 0 e1 s

rit a

[q01]

{q1}

[ ]

[ ]

61 of 221

TOC notes by S P CHARI

δ:

0

1

->q0

[q1 [ ]

[q01

q1

q0

Example of creating a DFA out of an NFA (as per the constructive proof):

0/1

0

0

]{q1} write

as [q1]

{}

{q ,q }

]w 0 e1 s

rit a

[q01]

{q1}

[ ]

[ ]

[q01]

[q1]

62 of 221

TOC notes by S P CHARI

Construct DFA M’ as follows:

δ({q0}, 0) = {q1}

δ({q0}, 1) = {}

δ({q1}, 0) = {q0, q1} =>

δ({q1}, 1) = {q1}

δ({q0, q1}, 0) = {q0, q1}

δ({q0, q1}, 1) = {q1} =>

δ({}, 0) = {}

δ({}, 1) = {}

=> δ’([q0], 0) = [q1]

=> δ’([q0], 1) = [ ]

δ’([q1], 0) = [q0q1]

=> δ’([q1], 1) = [q1]

=> δ’([q0q1], 0) = [q0q1]

δ’([q0q1], 1) = [q1]

=>

=>

δ’([ ], 0) = [ ]

δ’([ ], 1) = [ ]

[ ]

1

0

1

[q1]

0/1

[q0]

0

1

[q0q1]

0

63 of 221

TOC notes by S P CHARI

Theorem: Let L be a language. Then there exists an DFA M such that L = L(M) iff there

exists an NFA M’ such that L = L(M’).

Proof:

(if) Suppose there exists an NFA M’ such that L = L(M’). Then by Lemma 2 there

exists an DFA M such that L = L(M).

(only if) Suppose there exists an DFA M such that L = L(M). Then by Lemma 1

there exists an NFA M’ such that L = L(M’).

Corollary: The NFAs define the regular languages.

64 of 221

TOC notes by S P CHARI

Note: Suppose R = {}

δ(R, 0)

= δ(δ(R, ε), 0)

= δ(R, 0)

= δ(q, 0)

= {}

qR

Since R = {}

Exercise - Convert the following NFA to a DFA:

Q = {q0, q1, q2}

Σ = {0, 1}

Start state is q0

F = {q0}

δ: 0

q0

1

q1

q

2

{q0, q1}

{ }

{q1}

{q2}

{q2}

{q2}

65 of 221

Problem: Third symbol from last is 1

TOC notes by S P CHARI

0/1

q1

q0

q3

1

0/1

q2

0/1

Now, can you convert this NFA to a DFA?

66 of 221

NFAs with ε Moves

TOC notes by S P CHARI

An NFA-ε is a five-tuple:

M = (Q, Σ, δ, q0, F)

Q Σ

q0

F

δ

A finite set of states

A finite input alphabet

The initial/starting state, q0 is in Q

A set of final/accepting states, which is a subset of Q

A transition function, which is a total function from Q x Σ U {ε} to 2Q

δ: (Q x (Σ U {ε})) –> 2Q

δ(q,s)

-The set of all states p such that there is a transition labeled a from q to p, where a is in Σ U {ε}

Sometimes referred to as an NFA-ε other times, simply as an NFA.

67 of 221

Example:

TOC notes by S P CHARI

δ:

0

1

ε

q0

- A string w = w1w2…wn is processed

as w = ε*w1ε*w2ε* … ε*wnε*

q

q

Example: all computations on 00:

0 ε 0

q0 q0 q1 q2

:

q

1 {q0}

{ }

{q1} -

2 {q , q }

1 2

{q0, q3}

{q2}

3 {q }

2

{q2}

{ }

{ }

{ }

{ }

q0

ε

0/1

q2

1

0

q1

0

q3

ε

0

1

68 of 221

Informal Definitions

Let M = (Q, Σ, δ,q0,F) be an NFA-ε.

A String w in Σ* is accepted by M iff there exists a path in M from q0 to a state in F labeled by w and zero or more ε transitions.

The language accepted by M is the set of all strings from Σ* that are accepted by M.

TOC notes by S P CHARI

69 of 221

ε-closure

TOC notes by S P CHARI

Define ε-closure(q) to denote the set of all states reachable from q by zero or more ε transitions.

Examples: (for the previous NFA)

ε-closure(q0) = {q0, q1, q2}

ε-closure(q1) = {q1, q2}

ε-closure(q2) = {q2}

ε-closure(q3) = {q3}

ε-closure(q) can be extended to sets of states by defining:

ε-closure(P) = ε-closure(q)

Examples:

1 2 1 2

ε-closure({q , q }) = {q , q }

ε-closure({q0, q3}) = {q0, q1, q2, q3}

qP

q0

0/1

q2

ε

1

0

q1

0

q3

ε

0

1

70 of 221

Extension of δ to Strings and Sets of States

TOC notes by S P CHARI

What we currently have:

δ : (Q x (Σ U {ε})) –> 2Q

What we want (why?): δ : (2Q x Σ*) –> 2Q

As before, we will do this in two steps, which will be slightly different from the book, and we will make use of the following NFA.

q0

ε

0/1

q2

1

0

q1

0

ε

0

q3

1

71 of 221

TOC notes by S P CHARI

Step #1:

Given δ: (Q x (Σ U {ε})) –> 2Q define δ#: (2Q x (Σ U {ε})) –> 2Q as follows:

1) δ#(R, a) = δ(q, a) for all subsets R of Q, and symbols a in Σ U {ε}

Note that:

qR

δ#({p},a) = δ(q, a) by definition of δ#, rule #1 above

= δ(p, a)

Hence, we can use δ for δ

#

δ({q0, q2}, 0)

δ({q0, q1, q2}, 0)

These now make sense, but previously

they did not.

q{ p}

72 of 221

Examples:

TOC notes by S P CHARI

What is δ({q0 , q1, q2}, 1)?

δ({q0 , q1, q2}, 1) = δ(q0, 1) U δ(q1, 1) U δ(q2, 1)

= { } U {q0, q3} U {q2}

= {q0, q2, q3}

What is δ({q0, q1}, 0)?

δ({q0 , q1}, 0)

= δ(q0, 0) U δ(q1, 0)

= {q0} U {q1, q2}

= {q0, q1, q2}

73 of 221

Step #2:

TOC notes by S P CHARI

Given δ: (2Q x (Σ U {ε})) –> 2Q define δ^: (2Q x Σ*) –> 2Q as follows:

δ^(R,w) – The set of states M could be in after processing string w, having starting

from any state in R.

Formally:

2) δ^(R, ε) = ε-closure(R) - for any subset R of Q

3) δ^(R,wa) = ε-closure(δ(δ^(R,w), a))

- for any w in Σ*, a in Σ, and subset R of Q

Can we use δ for δ^?

74 of 221

TOC notes by S P CHARI

Consider the following example:

δ({q0}, 0) = {q0}

δ^({q0}, 0) = ε-closure(δ(δ^({q0}, ε), 0))

= ε-closure(δ(ε-closure({q0}), 0))

= ε-closure(δ({q0, q1, q2}, 0))

= ε-closure(δ(q0, 0) U δ(q1, 0) U δ(q2, 0))

= ε-closure({q0} U {q1, q2} U {q2})

By rule #3

By rule #2

By ε-closure

By rule #1

= ε-closure({q0, q1, q2})

= ε-closure({q0}) U ε-closure({q1}) U ε-closure({q2})

= {q0, q1, q2} U {q1, q2} U {q2}

= {q0, q1, q2}

So what is the difference?

δ(q0, 0) - Processes 0 as a single symbol, without ε transitions.

δ^(q0 , 0) - Processes 0 using as many ε transitions as are possible.

75 of 221

Example:

TOC notes by S P CHARI

δ^({q0}, 01) = ε-closure(δ(δ^({q0}, 0), 1))

= ε-closure(δ({q0, q1, q2}), 1)

= ε-closure(δ(q0, 1) U δ(q1, 1) U δ(q2, 1))

= ε-closure({ } U {q0, q3} U {q2})

= ε-closure({q0, q2, q3})

By rule #3 Previous slide

By rule #1

= ε-closure({q0}) U ε-closure({q2}) U ε-closure({q3})

= {q0, q1, q2} U {q2} U {q3}

= {q0, q1, q2, q3}

76 of 221

Definitions for NFA-ε Machines

Let M = (Q, Σ, δ,q0,F) be an NFA-ε and let w be in Σ*. Then w is accepted by M iff δ^({q0},

w) contains at least one state in F.

Let M = (Q, Σ, δ,q0,F) be an NFA-ε. Then the language accepted by M is the set: L(M) = {w | w is in Σ* and δ^({q0},w) contains at least one state in F}

Another equivalent definition:

L(M) = {w | w is in Σ* and w is accepted by M}

TOC notes by S P CHARI

77 of 221

Equivalence of NFAs and NFA-εs

TOC notes by S P CHARI

Do NFAs and NFA-ε machines accept the same class of languages?

Is there a language L that is accepted by a NFA, but not by any NFA? Is there a language L that is accepted by an NFA, but not by any DFA?

Observation: Every NFA is an NFA-ε.

Therefore, if L is a regular language then there exists an NFA-ε M such that L = L(M). It follows that NFA-ε machines accept all regular languages.

But do NFA-ε machines accept more?

78 of 221

TOC notes by S P CHARI

Lemma 1: Let M be an NFA. Then there exists a NFA-ε M’ such that L(M) = L(M’).

Proof: Every NFA is an NFA-ε. Hence, if we let M’ = M, then it follows that L(M’) = L(M).

The above is just a formal statement of the observation from the previous slide.

79 of 221

Lemma 2: Let M be an NFA-ε. Then there exists a NFA M’ such that L(M) = L(M’).

TOC notes by S P CHARI

Proof: (sketch)

Let M = (Q, Σ, δ,q0,F) be an NFA-ε. Define an NFA M’ = (Q, Σ, δ’,q0,F’) as:

F’ = F U {q} if ε-closure(q) contains at least one state from F

F’ = F otherwise

δ’(q, a) = δ^(q, a) - for all q in Q and a in Σ

Notes:

δ’: (Q x Σ) –> 2Q is a function

M’ has the same state set, the same alphabet, and the same start state as M

M’ has no ε transitions

80 of 221

Example:

TOC notes by S P CHARI

Step #1:

Same state set as M

q0 is the starting state

q0

ε

0/1

q2

1

0

q1

0

ε

0

q3

1

q2

q1

q3

q0

81 of 221

Example:

TOC notes by S P CHARI

Step #2:

q0 becomes a final state

q0

ε

0/1

q2

1

0

q1

0

ε

0

q3

1

q2

q1

q3

q0

82 of 221

Example:

TOC notes by S P CHARI

Step #3:

q0

ε

0/1

q2

1

0

q1

0

ε

0

q3

1

q2

q1

q3

q0

0

0

0

83 of 221

Example:

TOC notes by S P CHARI

Step #4:

q0

ε

0/1

q2

1

0

q1

0

ε

0

q3

1

q2

q1

q3

q0

0/1

0/1

0/1

1

84 of 221

Example:

TOC notes by S P CHARI

Step #5:

q0

ε

0/1

q2

1

0

q1

0

ε

0

q3

1

q2

q1

q3

q0

0/1

0/1

0/1

1

0

0

85 of 221

Example:

TOC notes by S P CHARI

Step #6:

q0

ε

0/1

q2

1

0

q1

0

ε

0

q3

1

q2

q1

q3

q0

0/1

0/1

1

0/1

0/1

0/1

1

1

86 of 221

Example:

TOC notes by S P CHARI

Step #7:

q0

ε

0/1

q2

1

0

q1

0

ε

0

q3

1

q2

q3

q0

0/1

0/1

1

0/1

0/1

0/1

1

1

0

q1

87 of 221

Example:

TOC notes by S P CHARI

Step #8:

Done!

[use table of e-closure]

q0

ε

0/1

q2

1

0

q1

0

ε

0

q3

1

q2

q1

q3

q0

0/1

0/1

1

0/1

0/1

0/1

1

1

0/1

88 of 221

TOC notes by S P CHARI

Theorem: Let L be a language. Then there exists an NFA M such that L= L(M) iff there

exists an NFA-ε M’ such that L = L(M’).

Proof:

(if) Suppose there exists an NFA-ε M’ such that L = L(M’). Then by Lemma 2

there exists an NFA M such that L = L(M).

(only if) Suppose there exists an NFA M such that L = L(M). Then by Lemma 1

there exists an NFA-ε M’ such that L = L(M’).

Corollary: The NFA-ε machines define the regular languages.

89 of 221

UNIT-2�REGULAR EXPRESSIONS

TOC notes by S P CHARI

90 of 221

Introduction

TOC notes by S P CHARI

  • The Language accepted by finite automata are easily described by simple expressions called regular expressions.

  • The regular expression is the most effective way to represent any language.

  • The language accepted by some regular expression is known as a regular language.

91 of 221

Regular Expressions

TOC notes by S P CHARI

  1. Any terminal symbol(i.e. an element of ∑), Ꜫ and φ are regular expressions.
  2. The union of two regular expressions R1 and R2, written as R1+R2, is also a regular expression.
  3. The concatenation of two regular expressions R1 and R2, written as R1R2, is also a regular expression.
  4. The iteration(or Closure) of a regular expression R, written as R* is also a regular expression.
  5. If R is a regular expression , then (R) is also a regular expression.

92 of 221

Regular Expressions

Identity Rules :

Let P,Q and R are the regular expressions then the identity rules are as follows:

  1. φ +R = R
  2. φR = Rφ = φ
  3. ꜪR = RꜪ = R
  4. Ꜫ* = Ꜫ and φ *= Ꜫ
  5. R + R = R
  6. R*R* = R*
  7. RR* = R*R
  8. (R*)*=R*
  9. Ꜫ + RR*= R* = Ꜫ +R*R
  10. (PQ)*P = P(QP)*
  11. (P+Q)* = (P*Q*)* = (P*TOC+noQtes b*y )Pallavi Joshi
  12. (P+Q)R = PR + QR and R(P+Q) = RP + RQ

93 of 221

Regular Expressions

TOC notes by S P CHARI

  • Exercise 2: Construct the regular expression for the language which accepts all the strings which begin or end with either 00 or 11.

Solution:

The R.E. can be categorized into two subparts.

R = L1 + L2

L1 = The String which begin with 00 or 11. L2 = The String which end with 00 or 11.

Let us find out L1 and L2.

L1 = (00+11)( any number of 0’s and 1’s ) L1 = (00+11)(0+1)*

L2 = ( any number of 0’s and 1’s ) (00+11) L2 = (0+1)*(00+11)

Hence

R = [(00+11)(0+1)*] + [(0+1)*(00+11)]

94 of 221

Algorithm to build R.E. from given DFA

TOC notes by S P CHARI

  1. Let q1 be the initial state.
  2. There are q2,q3,q4,….qn number of states. The final state may be some qj where j<=n.
  3. Let αji represents the transition from qj to qi.
  4. Calculate qi such that qi = αji .qj

If qi is a start state

qi = αji .qj +Ꜫ

  1. Similarly compute the final state which ultimately gives the regular expression r.

95 of 221

Regular Expressions

TOC notes by S P CHARI

  • Exercise 1: Write a regular expression for the set of strings that contains an even number of 1’s over ={0,1}. Treat zero 1’s as an even number.

Solution:

Q0

Q1

1

0

0

1

The regular expression for the above F.A. is 0*10*1

96 of 221

DFA to RE via State

TOC notes by S P CHARI

1E.limStairntiangtiowinth (1int)ermediate states and then moving to accepting states, apply the state elimination process to produce an equivalent automaton with regular expression labels on the edges.

  • The result will be a one or two state automaton with a start state and

accepting state.

97 of 221

DFA to RE State

TOC notes by S P CHARI

Elimin2a. tiIof tnhe(t2w)o states are different, we will have an automaton that looks like the following:

Start

S

R

T

We can describe this automaton as: (R+SU*T)*SU*

U

98 of 221

DFA to RE State

TOC notes by S P CHARI

Elimin3a. tIfiothne s(t3ar)t state is also an accepting state, then we must also perform a state elimination from the original automaton that gets rid of every state but the start state. This leaves the following:

R

Start

We can describe this automaton as simply R*.

99 of 221

DFA to RE State Elimination (4)

TOC notes by S P CHARI

4. If there are n accepting states, we must repeat the above steps for each accepting states to get n different regular expressions, R1, R2,

… Rn.

5. For each

repeat we turn any other accepting state to non-

accepting.

6. The desired regular expression for the automaton is then the union of each of

the n regular expressions: R1 R2 RN

100 of 221

DFA🡪RE Example

TOC notes by S P CHARI

  • Convert the following to a RE
  • First convert the edges to RE’s:

0

3

Start

1

2

1

1

0

0

0,1

3

Start

1

2

1

1

0

0+1

101 of 221

DFA 🡪 RE Example (2)

TOC notes by S P CHARI

  • Eliminate State 1:
  • To:

Start

0

0

3

1

2

1

1

0+1

3

Start

2

11

0+10

0+1

Note edge from 3🡪3

Answer: (0+10)*11(0+1)*

102 of 221

Second Example

TOC notes by S P CHARI

  • Automata that accepts even number of 1’s
  • Eliminate state 2:

1

Start

2

3

1

1

0

1

0 0

0+10*1

1

Start

3

0

10*1

103 of 221

Second Example (2)

  • Two accepting states, turn off state 3 first

Start

1 3

0

0+10*1

10*1

0

0+10*1

10*1

3

1

Start

This is just 0*; can ignoreTgOoC innotgestboy Psatlalatvei Jo3shsi ince we would “die”

104 of 221

Second Example (3)

  • Turn off state 1 second:

1

Start

3

0

0+10*1

10*1

This is just 0*10*1(0+10*1)*

Combine from previous slide to get

0* + 0*10*1(0+10*1)*

0+10*1

1

Start

3

0

10*1

TOC notes by S P CHARI

105 of 221

Converting a RE to an Automata

TOC notes by S P CHARI

  • We have shown we can convert an automata to a RE. To show equivalence we must also go the other direction, convert a RE to an automaton.
  • We can do this easiest by converting a RE to an ε-NFA

106 of 221

RE to ε-NFA

TOC notes by S P CHARI

  • Basis:

R=a

R=ε

ε

R=Ø

a

Next slide: More complex RE’s

107 of 221

TOC notes by S P CHARI

R=S+T

S

T

ε

ε

ε

ε

R=ST

S

T

ε

R=S*

ε

S

ε

ε

ε

108 of 221

RE to ε-NFA Example

TOC notes by S P CHARI

  • Convert R= (ab+a)* to an NFA
    • We proceed in stages, starting from simple elements and working our way up

a

a

b

b

ab

a

b

ε

109 of 221

RE to ε-NFA Example (2)

ab+a

(ab+a)*

a

b

ε

a

ε

ε

ε

ε

a

b

ε

a

ε

ε

ε

ε

ε

ε

ε

ε

TOC notes by S P CHARI

110 of 221

CONTEXT FREE GRAMMAR

TOC notes by S P CHARI

111 of 221

Prerequisite

TOC notes by S P CHARI

  1. At very first session we have seen that the Finite Automata and importance of Finite State Machines and the construction of Deterministic Finite Automata and some examples on it.
  2. In previous session we have seen that the construction of Regular Expressions for the given problems and how to generate a Regular expression from the DFA and vice versa.
  3. The above two concepts are required to understand the Context Free Grammar and the applications of CFG.

112 of 221

TOC notes by S P CHARI

DEFINITION

  • The Context Free Grammar can be defined as a set denoted by G=(V,T,P,S) Where

V- set of non-terminals Ex: { P,Q,R…} T- set of Terminals Ex: {p,q,r …}

P- set of Production rules

Where non-terminal -> non-terminal

non-terminal -> terminal

S- is start symbol.

113 of 221

  • G=({S},{0,1},P,S)

Where P = { S->0S

TOC notes by S P CHARI

EXAMPLE

S->1S S->Ꜫ }

114 of 221

TOC notes by S P CHARI

Example1 : Construct CFG which consists of all the strings having atleast one occurrence of ‘000’ over input {0,1}.

  • Solution:

We need to construct the equivalent R.E for the given problem

first.

The R.E=(any number of 0’s or 1’s) 000 (any number of 0’s or 1’s)

Thus R.E = (0+1)*000(0+1)*

Now let us build the production rules to satisfy the above R.E

{ S->ATA

A->0A|1A|Ꜫ T->000 }

115 of 221

Excercise1: Find the CFG for the odd length of strings in{a,b}* with middle symbol a

Excercise2: Generate the a CFG to obtain balanced set of parenthesis(i.e. every left parenthesis should match with corresponding right parenthesis)

TOC notes by S P CHARI

116 of 221

Derivation Trees

TOC notes by S P CHARI

  • Definition: Derivation tree is a graphical representation for the derivation of the given production rules for a given CFG.
  • It is the simple way to show how the derivation can be done to obtain some string from given set of production rules.
  • The derivation tree is also called as Parse tree. Properties of Derivation Tree:-
  • The root node is always a node indicating start symbol.
  • The derivation is read from left to right.
  • The leaf nodes are always terminal nodes.
  • The interior nodes are always the non-terminal nodes.

117 of 221

Example2: The CFG is given as S->bSb | a | b is a production rule. The S is a start symbol.

TOC notes by S P CHARI

  • Solution :

The Derivation tree for deriving a string bbabb as follows.

S

b S b

b S b a

By Simply reading the leaf nodes we can obtain the desired string.

118 of 221

TOC notes by S P CHARI

Exercise : Construct the derivation tree for the string ‘aabbabba’ from the CFG given by

S->aB|bA

A->a|aS|bAA B->b|bS|aBB

119 of 221

Difference between Regular Grammar and

Context Free Gram

mar

S.No.

REGULAR GRAMMAR

CONTEXT FREE GRAMMAR

1

Regular grammar is Defined as G=(V,T,P,S)

where V- set of Non-Terminals,

T- set of Terminals, P- set of Production rules, S-

start Symbol

Hence G be a regular Grammar.

The Context free grammar is defined as follows G=(V,T,P,S)

where V- set of Non-Terminals,

T- set of Terminals, P- set of Production rules, S- start Symbol

Hence G be a CFG.

2

Using this grammar we can draw DFA

Using this we can construct Push Down Automata(PDA)

3

Compiler can not make use of this for Parsing.

Compiler makes use of this for Parsing.

4

The grammar accepts small set of languages.

Hence it is type-3 grammar.

The grammar accepts a large set of languages. Hence it is

type-2 grammar.

5

This form of grammar can be represented by left

linear or right linear grammar

This form of grammar can be represented as nonterminal->

set of terminals and non terminals.

6

Every regular grammar is context free grammar.

Every Context Free Grammar can not be Regular

7

For Ex: S->aA|B

A->aA|a

B->bB|b

For Example:

S->aSb|bSb|Ꜫ

8

Syntax of any programming language can not be

represented completely by regularTOgCrnaomtesmbyaPrallavi Jos

Syntax of any programming language can be represented

hicompletely by regular grammar

120 of 221

MINIMIZATION OF CFG

TOC notes by S P CHARI

  • The properties of reduced grammar are as follows.
  • Each variable(i.e. non terminal) and each terminal of G appears in the derivation of some word in L.
  • There should not be any production as X->Y where X and Y are non terminal.
  • If Ꜫ is not in the language L then there need not be the production X-> Ꜫ.

121 of 221

The Reduction of Grammar as follows

TOC notes by S P CHARI

Removal of Useless

symbols

Elimination of

Ꜫ Productions

Removal of unit

production

122 of 221

Removal of useless symbols

TOC notes by S P CHARI

  • Any symbol is not useful when it appears on R.H.S. in the production rule. If no

such derivation exists then it is supposed to be a useless symbol.

For Example:

G= (V,T,P,S) where V={S,T,X} T={0,1} S->0T|1T|X|0|1

T->00

In the above grammar S->X there is no further rule as a definition to X. Hence we can declare X is a useless symbol. Therefore we can remove the S-> X production from the given grammar.

After removal of useless symbol form the given grammar the Grammar should be as follows.

G= (V,T,P,S) where V={S,T} T={0,1}

S->0T|1T|0|1

T->00

123 of 221

Elimination of Ꜫ production

TOC notes by S P CHARI

  • In CFG if there is Ꜫ production we can remove it ,without changing the meaning of the grammar.

Example:

S->0S|1S|

from the above Grammar we can remove production without changing the meaning of it.

Simply if we place S-> Ꜫ in other rules.

we get S->0 and S->1 Hence we can rewrite the productions as follows

S->0S|1S|0|1

Thus we can remove Ꜫ production from the given grammar

124 of 221

Example2: For the CFG given below remove Ꜫ production S->aSa

S->bSb

S-> Ꜫ

  • Solution:

According to the replacement procedure we will place Ꜫ at S in the sentential form.

S->aSa if S= Ꜫ S->aa

Similarly if

S->bSb if S= Ꜫ S->bb

Thus finally rules are

S->aSa|bSb|aa|bb

Thus we can remove Ꜫ prodTOuC ncottesioby nPallafvri Joosmhi the given grammar.

125 of 221

Excercise1: Eliminate Ꜫ productions from the CFG given below A->0B1|1B1

B->0B|1B|Ꜫ

TOC notes by S P CHARI

Exercise 2: Construct the CFG without Ꜫ productions form the given Grammar.

S->a|Ab|aBa A->b|Ꜫ

B->b|A

126 of 221

Removal of Unit Productions

S->0A|1B|01

  • The unit productions are the productions in which one non terminal gives another non terminal.

For example X->Y, Y->Z, Z->X are the unit productions.

Example 1: if CFG is as below

S->0A|1B|C A->0S|00

B->1|A

C->01 then remove the unit productions.

Solution:

Clearly S->C is a unit production. But while removing it we have to consider what C gives. So, we add a rule to S.

TOC notes by S P CHARI

127 of 221

Example 1(2):

TOC notes by S P CHARI

  • Similarly B->A is also unit production so we can modify it as

B->1|0S|00

Thus finally we can write CFG without unit productions as

S->0A|1B|01

A->0S|00

B->1|0S|00

C->01

128 of 221

TOC notes by S P CHARI

Exercise: Eliminate the unit productions fro the Grammar S->AB

A->a

B->C|b C->D

D->E|bC

E->d|Ab

129 of 221

Solution:

Thus the R.E = (a+b)n a (a+b)n Where n>=0

As the above given expression doesn’t represent regular language, The DFA can

not be designed for representing such language.

The CFG G= (V,T,P,S)

Where

V={S} T={a,b}

P={ S->aSa|bSb|aSb|bSa|a} S is Start Symbol.

Back

a or b

a or b

a

Length n

Length n

TOC notes by S P CHARI

130 of 221

Solution:

TOC notes by S P CHARI

Back

The well formed parenthesis means whenever opening parentheses is present there should be a closing parentheses associated with it.

Any expression must begin with opening bracket .We will consider T={(,[,),]} as set of brackets.

The CFG G=(V,T,P,S)

Where

V={S} T={(,[,),]}

P={ S->SS|()|[]|(S)|[S]

Where S is Start Symbol.

131 of 221

Solution :

TOC notes by S P CHARI

  • To draw a tree we will first try to

obtain derivation for the string aabbabba.

S

aB

a aBB aa bS B

aab bA B aabba B aabba bS aabbab bA aabbabba

S->aB B->aBB B->bS S->bA A->a

B->bS S->bA

A->a

  • Now let us draw a tree

S

a B

a B B

b S b S

b A b A

a

a

Back

132 of 221

Solution:

TOC notes by S P CHARI

Back

  • We will first write all the productions that are giving terminal symbols. A->a

B->aa

But if we look at start symbol S then S->aS|A|C

There is no production for B from start symbol. Thus B become a useless symbol.

Similarly, S->C->aCb->aaCbb->….

There is no terminating symbol for C, Therefore we will eliminate C. Then the set of rules are

S->aS|A A->a

133 of 221

TOC notes by S P CHARI

Back

SCoonlsuidetr ailol thne p:roductions that are giving terminal symbols

S->a

B->a

D->ddd

Now Consider

S->cC

C->cCD

D->ddd.

We will try to derive the string using production rule C we get ,

S->cC->ccCddd->cccCDddd->….

We will not get any terminal symbol for C. Thus we can eliminate C production . To reach D , Cis the only

production, As C is eliminated there is no point of D production. Hence , D is also eliminated.

Therefore,

S->aA|a|Bb

A->aB

B->a|Aa

Is the reduced Grammar.

134 of 221

Solution:

  • Now the Ꜫ production is B->Ꜫ we will delete this production, And then add the production having B replaced by

A->0B1 if B=

A->01

Similarly,

A->1B1->11

A->0B1|1B1|01|11

Similarly,

B->0B->0 B->1B->1

B->0B|1B |0|1

Collectively we can write

A->0B1|1B1|01|11 B->0B|1B|0|1

Back

TOC notes by S P CHARI

135 of 221

Solution:

  • As S->AB and A->a

There is only one rule with A which is giving terminal symbol. Hence there is no Unit production with A.

Now Consider,

B->C

C->D

D->E

It is clear that B,C and D are Unit Productions. As E->d|Ab, we will replace value of D. Then

D->d|Ab|bC

Similarly, a C->D we can write

C->d|Ab|bC

Hence B->d|Ab|bC|b

Thus the grammar after removing unit productions

S->AB

A->a

B->d|Ab|bC|b

C->d|Ab|bC

Back

TOC notes by S P CHARI

136 of 221

UNIT-3�PUSH DOWN AUTOMATA

TOC notes by S P CHARI

137 of 221

Prerequisite

TOC notes by S P CHARI

  • The basic relationship between CFG and R.E is that the CFG can be constructed for every R.E.

  • But more than that CFG can also be written for non regular languages like 0n1n.

  • Thus we can say that regular expressions are the subset of CFG.

  • For every R.E we can be drawn F.A., But F.A. is not sufficient to draw the CFG.

  • For better understanding of PDA we must know about Stack and its operations.

138 of 221

Push Down Automata

TOC notes by S P CHARI

  • The PDA will have input tape, finite control and stack.

TOS

Z0

Stack

  • The input tape is divided in many cells.
  • The finite control has some pointer which points to the current symbol which is to be read.
  • At the end of the input $ or Δ blank symbol is placed to identify the end of input.
  • Here Stack will be used to store the items temporarily.

A

A

B

B

$

Δ

Δ

Δ

Finite Control

139 of 221

Definition of a PDA

TOC notes by S P CHARI

  • A pushdown automaton (PDA) is a seven-tuple: M = (Q, Σ, Г, δ, q0, z0, F)

Q Σ

A finite set of states

A finite input alphabet Г A finite

stack alphabet

q0 The initial/starting state, q0 is in Q

z0 A starting stack symbol, is in Г // need not always remain at the bottom of stack

F

δ

A set of final/accepting states, which is a subset of Q

A transition function, where

For DPDA For NDPDA

δ: Q x (Σ U {ε}) x Г –> finite subsets of Q x Г*

δ: Q x (Σ U {ε}) x Г –> finite subsets of 2Q x Г*

140 of 221

Push Down Automata

TOC notes by S P CHARI

  • Any language which is accepted by a F.A. can also accepted by PDA.
  • PDA can also accepts the class of languages which are not accepted by F.A., Thus PDA is much more superior to F.A.

Example: Design a PDA for accepting the Language L= {anbn|n>=1}.

Solution:

The above given language is in which equal number of a’s are followed by equal number of b’s.

Logic for PDA:

  1. Push all a’s onto the stack
  2. On reading every single b pop one a from the stack.
  3. If the input string is reached end and the stack is empty then the string is accepted by the PDA.

141 of 221

PDA

TOC notes by S P CHARI

Q0

Q1

Q2

(b,a/Ꜫ)

(a,a/aa)

(a,z0/az0)

(b,a/Ꜫ)

($, z0/z0)

142 of 221

Example(2)

TOC notes by S P CHARI

  • The Description for the PDA can be given as follows.

0 0 0 0

δ(q ,a,z )=(q ,az )

δ(q0,a,a)=(q0,aa)

δ(q0,b,a)=(q1,Ꜫ)

δ(q1,b,a)=(q1,Ꜫ)

δ(q1,$, z0)=(q2,Ꜫ)

Where q0 is start state and q2 is accept state.

  • Simulation of PDA for the input string aaabbb as follows

δ(q0,aaabbb,z0) Ͱ(q0,aabbb$,az0)

0 0

Ͱ(q ,abbb$,aaz )

Ͱ(q0,bbb$,aaaz0) Ͱ(q1,bb$,aaz0) Ͱ(q1,b$,az0)

Ͱ(q1,$, z0)

Ͱ(q2, Ꜫ)

Accept State

Hence the string is accepted by the PDA.

143 of 221

Excercise1: Design a PDA that accept a string of well formed parenthesis. Consider the parenthesis is a (,),[,],{,}.

TOC notes by S P CHARI

Excercise2: Construct PDA for the language L={anb2n|n>=1}.

144 of 221

Conversion of CFG to PDA

TOC notes by S P CHARI

  • For the conversion of CFG to PDA the necessary condition is that the first symbol on R.H.S. production must be a terminal symbol.
    • Rule1: For non terminal symbol, add following rule

δ(q, Ꜫ,A)=(q,α)

Where the production rule is A-> α

    • Rule 2: For each terminal symbol, add the following rule.

δ(q, a,a)=(q, Ꜫ) for every terminal symbol.

145 of 221

Example: Construct PDA for the given CFG

S->0BB

TOC notes by S P CHARI

B->0S|1S|0 and test whether 0104 is accepted by this PDA

  • Let PDA

A={{q},{0,1},{S,B,0,1}, δ,q,S,F}

The Production rules δ can be written as:

R1: δ(q, Ꜫ,S) = {(q,0BB)}

R2: δ(q, Ꜫ,B) = {(q,0S),(q,1S),(q,0)} R3: δ(q, 0,0) = {(q, Ꜫ)}

R4: δ(q, 1,1) = {(q, Ꜫ)}

146 of 221

Example(2):

TOC notes by S P CHARI

  • Testing 0104 i.e. 010000 against PDA

δ(q,010000,S) Ͱ(q,010000,0BB)

Ͱ(q,10000,BB)

Ͱ(q,10000,1SB) Ͱ(q,0000,SB) Ͱ(q,0000,0BBB) Ͱ(q,000,BBB)

Ͱ(q,000,0BB)

Ͱ(q,00,BB)

Ͱ(q,00,0B)

Ͱ(q,0, B)

Ͱ(q,0,0)

Ͱ(q, Ꜫ)

ACCEPT

Since R1 Since R3 Since R2

Since R4

Since R1

Since R3

Since R2 Since R3 Since R2 Since R3 Since R2 Since R3

147 of 221

Excercise1: Construct the PDA for the given CFG.

S->0A

TOC notes by S P CHARI

A->0AB|1 B->1

Excercise1: Construct the PDA for the given CFG.

S->aSA|bSB|a|b A->a

B->b

148 of 221

Construction of CFG from PDA

  • Algorithm for getting production rules of CFG: Rule1: The start symbol production can be

S->[q0,Z0,q]

Where q indicates the next state an q0 is a start state.

Rule2: If there exists a move of PDA

δ(q,a,Z)={(q’, Ꜫ)}

then the production rule can be written as:

δ(q,Z,q’)->a

Rule3: If there exists a move of PDA as

δ(q,a,Z)={(q’,Z1,Z2,…..Zn)}

Then the production rule of CFG can be written as

δ(q,a,Z)->a[q

1 1, 2 2 2, 3 3 3, 4 m m, m+1

,Z q ] [q ,Z q ] [q ,Z q ]….. [q ,Z q ].

TOC notes by S P CHARI

149 of 221

Example: The PDA is given below A=({q0, q1 },{0,1},{S,A},δ, q0, S,φ)

TOC notes by S P CHARI

Where δ is given as follows:

δ(q0, 1,S)={(q0, AS)}

δ(q0, Ꜫ,S)={(q0, Ꜫ)}

δ(q0, 1,A)={(q0, AA)}

δ(q0, 0,A)={(q1, A)}

δ(q1, 1,A)={(q1, Ꜫ)}

δ(q1, 0,S)={(q0, S)}

Construct the CFG equivalent to this PDA.

150 of 221

Example(2):

TOC notes by S P CHARI

  • Solution:

Let we will construct a CFG

G=(V,T,P,S)

Here, T={0,1}, V={S U [q0,A,q0 ],[q0,A,q1 ],[q0,S,q0 ],[q0,S,q1 ],

[q1,A,q1 ], [q1,A,q0 ], [q1,S,q1 ], [q1,S,q0 ]}

Now let up build the production rules as:

Using rule1 from the algorithm P1: S->[q0,S,q0 ]

P2: S->[q0,S,q1 ]

Using rule3 of algorithm for the δ(q0, 1,S)={(q0, AS)} we get, P3: [q0,S,q0 ]->1[q0,A,q0 ][q0,S,q0 ]

P4: [q0,S,q0 ]->1[q0,A,q1 ][q1,S,q0 ]

P5: [q0,S,q1 ]->1[q0,A,q0 ][q0,S,q1 ]

P6: [q0,S,q1 ]->1[q0,A,q1 ][q1,S,q1 ]

151 of 221

Example(3

)N:ow for δ(q0, Ꜫ,S)={(q0, Ꜫ)} using rule2 of algorithm we get

P7: [q0,S,q0 ]-> Ꜫ

Similarly for δ(q0, 1,A)={(q0, AA)} Using rule3 we get

P8:[q0,A,q0 ]->1[q0,A,q0 ][q0,A,q0 ]

P9:[q0,A,q0 ]->1[q0,A,q1 ][q1,A,q0 ]

P10:[q0,A,q1 ]->1[q0,A,q0 ][q0,A,q1 ]

P11:[q0,A,q1 ]->1[q0,A,q1 ][q1,A,q1 ]

Similarly for δ(q0, 0,A)={(q1, A)} gives

P12:[q0,A,q0 ]->0[q1,A,q0 ]

P13:[q0,A,q1 ]->0[q1,A,q1 ]

δ(q1, 1,A)={(q1, Ꜫ)} Gives

P14:[q1,A,q1 ]->1

δ(q1, 0,S)={(q0, S)}

P15: [q1,S,q0 ]->0[q0,S,q0 ]

P16:[q1,S,q1 ]->0[q0,S,q1TO]C notes by Pallavi Joshi

152 of 221

Exercise: Find the CFG corresponding to PDA whose transition mapping is as follows:

δ(S, a , X) = (s, AX)

δ(S, b , A) = (s, AA)

δ(S, a , A) = (s, Ꜫ)

TOC notes by S P CHARI

153 of 221

TOC notes by S P CHARI

BACK

Given PDA:

δ(S, a , X) = (s, AX)

δ(S, b , A) = (s, AA)

δ(S, a , A) = (s, )

Solution: Now we will apply conversion algorithm for each transition and obtain

the production rules:

For δ(S, a , X) = (s, AX)

P1: (S,X,s)->a(S A S)(S X s ) |a(S A s)(s X s )

P2: (S,X,S)->a(S A S)(S X S ) | a(S A s)(s X S )

For δ(S, b , A) = (s, AA)

P3: (S, A, s)->b(S A S)(S A s ) | b(S A s)(s A s)

P4: (S, A, S)->b(S A S)(S A S) | b(S A s)(s A S)

For δ(S, a , A) = (s, Ꜫ)

P5: (S, A, s)-> a

Hence , P1,P2,P3,P4 and P5 are productions in CFG.

154 of 221

TOC notes by S P CHARI

BACK

Given PDA:

δ(S, a , X) = (s, AX)

δ(S, b , A) = (s, AA)

δ(S, a , A) = (s, )

Solution: Now we will apply conversion algorithm for each transition and obtain

the production rules:

For δ(S, a , X) = (s, AX)

P1: (S,X,s)->a(S A S)(S X s ) |a(S A s)(s X s )

P2: (S,X,S)->a(S A S)(S X S ) | a(S A s)(s X S )

For δ(S, b , A) = (s, AA)

P3: (S, A, s)->b(S A S)(S A s ) | b(S A s)(s A s)

P4: (S, A, S)->b(S A S)(S A S) | b(S A s)(s A S)

For δ(S, a , A) = (s, Ꜫ)

P5: (S, A, s)-> a

Hence , P1,P2,P3,P4 and P5 are productions in CFG.

155 of 221

TURING MACHINE

TOC notes by S P CHARI

156 of 221

Turing Machines (TM)

TOC notes by S P CHARI

Generalize the class of CFLs:

Recursively Enumerable Languages

Recursive Languages Context-Free Languages

Regular Languages

Non-Recursively Enumerable Languages

157 of 221

Another Part of the Hierarchy:

TOC notes by S P CHARI

Recursively Enumerable Languages

Recursive Languages Context-Sensitive Languages Context-Free Languages - ε

Regular Languages - ε

Non-Recursively Enumerable Languages

158 of 221

TOC notes by S P CHARI

Recursively enumerable languages are also known as type 0 languages.

Context-sensitive languages are also known as type 1 languages. Context-free languages are also known as type 2 languages.

Regular languages are also known as type 3 languages.

159 of 221

TMs model the computing capability of a general purpose computer, which informally can be described as:

Effective procedure

TOC notes by S P CHARI

Finitely describable

Well defined, discrete, “mechanical” steps

Always terminates

Computable function

A function computable by an effective procedure

TMs formalize the above notion.

Church-Turing Thesis: There is an effective procedure for solving a problem if and only if there is a TM that halts for all inputs and solves the problem.

There are many other computing models, but all are equivalent to or subsumed by TMs. There is no more powerful

machine (Technically cannot be proved).

DFAs and PDAs do not model all effective procedures or computable functions, but only a subset.

160 of 221

Deterministic Turing Machine (DTM)

TOC notes by S P CHARI

……..

Two-way, infinite tape, broken into cells, each containing one symbol.

Two-way, read/write tape head.

An input string is placed on the tape, padded to the left and right infinitely with blanks, read/write head is positioned at the left end of input string.

Finite control, i.e., a program, containing the position of the read head, current symbol being

scanned, and the current state.

In one move, depending on the current state and the current symbol being scanned, the TM 1) changes state, 2) prints a symbol over the cell being scanned, and 3) moves its’ tape head one cell left or right.

Many modifications possible, but Church-Turing declares equivalence of all.

Finite Control

B

B

0

1

1

0

……0..

B

B

161 of 221

Formal Definition of a DTM

TOC notes by S P CHARI

A DTM is a seven-tuple:

M = (Q, Σ, Γ, δ, q0, B, F)

Q Σ Γ B

q0

F

δ

A finite set of states

A finite input alphabet, which is a subset of Γ– {B} A finite tape alphabet, which is a strict superset of Σ A distinguished blank symbol, which is in Γ

The initial/starting state, q0 is in Q

A set of final/accepting states, which is a subset of Q

A next-move function, which is a mapping (i.e., may be undefined) from Q x Γ –> Q x Γ x {L,R}

Intuitively, δ(q,s) specifies the next state, symbol to be written, and the direction of tape head

movement by M after reading symbol s while in state q.

162 of 221

TOC notes by S P CHARI

Example #1: {w | w is in {0,1}* and w ends with a 0}

0

00

10

10110

Not ε

Q = {q0, q1, q2}

Γ = {0, 1, B}

Σ = {0, 1}

F = {q2}

δ:

0

->q0 (q0, 0, R)

1

(q0, 1, R)

B (q1, B, L)

q1 (q2, 0, R)

q2* -

-

-

-

-

q0 is the start state and the “scan right” state, until hits B

q1 is the verify 0 state q2 is the final state

163 of 221

Example #2: {0n1n | n ≥ 1}

0

1

X

B

->q0 (q1, X, R) -

-

Y

(q3, Y, R)0’s finished -

-

- (more 0’s)

q1 (q1, 0, R)ignore1 (q2, Y, L) q2 (q2, 0, L) ignore2 -

(q0, X, R)

-

- (more 1’s)

(q1, Y, R) ignore2 (q2, Y, L) ignore1 (q3, Y, R) ignore

(q4, B, R)

q3 -

q4* -

-

-

-

-

-

Sample Computation: (on 0011),

presume state q looks rightward

q00011BB.. |— Xq1011

|— X0q111

|— Xq20Y1

|— q2X0Y1

|— Xq00Y1

|— XXq1Y1

|— XXYq11

|— XXq2YY

|— Xq2XYY

|— XXq0YY

|— XXYq3Y B…

|— XXYYq3 BB…

|— XXYYBq4

TOC notes by S P CHARI

164 of 221

Making a TM for {0n1n | n >= 1}

Try n=2 or 3 first.

  • q0 is on 0, replaces with the character to X, changes state to q1, moves right
  • q1 sees next 0, ignores (both 0’s and X’s) and keeps moving right
  • q1 hits a 1, replaces it with Y, state to q2, moves left
  • q2 sees a Y or 0, ignores, continues left
  • when q2 sees X, moves right, returns to q0 for looping step 1 through 5
  • when finished, q0 sees Y (no more 0’s), changes to pre-final state q3
  • q3 scans over all Y’s to ensure there is no extra 1 at the end (to crash on seeing any 0 or 1)
  • when q3 sees B, all 0’s matched 1’s, done, changes to final state q4
  • blank line for final state q4

q00011BB.. |— Xq1011

Try n=1 next.

Make sure unbalanced 0’s and 1’s, or mixture of 0-1’s,

“crashes” in a state not q4, as it should be

|— X0q111

|— Xq20Y1

|— q2X0Y1

|— Xq00Y1

|— XXq1Y1

|— XXYq11

|— XXq2YY

|— Xq2XYY

|— XXq0YY

|— XXYq3Y B…

|— XXYYq3 BB…

|— XXYYBq4

TOC notes by S P CHARI

165 of 221

TOC notes by S P CHARI

Same Example #2: {0n1n | n ≥ 1}

Logic:

t to find next

q0

0

(q1, X, R)

1

-

X

-

Y

(q3, Y, R)

-

B

q1

(q1, 0, R)

(q2, Y, L)

-

(q1, Y, R)

-

q2

(q2, 0, L)

-

(q0, X, R)

(q2, Y, L)

-

q3

q4

-

-

-

-

-

-

(q3, Y, R)

-

(q4, B, R)

-

cross 0’s with X’s, scan right to look for corresponding 1, on finding it cross it with Y, and scan lef

leftmost 0, keep iterating until no more 0’s, then scan right looking for B.

The TM matches up 0’s and 1’s

q1 is the “scan right” state, looking for 1 q2 is the “scan left” state, looking for X q3 is “scan right”, looking for B

q4 is the final state

Can you extend the machine to include n=0?

How does the input-tape look like for string epsilon?

Other Examples:

00

001

000111

11

011

166 of 221

TOC notes by S P CHARI

Roger Ballard’s TM for Example #2, without any extra Tape Symbol: {0n1n | n ≥ 0}

q0

0

(q1, B, R)

1

B

(q4, B, R)

q1

(q1, 0, R)

(q1, 1, R)

(q2, B, L)

q2

-

(q3, B, L)

-

q3

(q3, 0, L)

(q3, 1, L)

(q0, B, R)

q4* - - -

Keep deleting 0 and corresponding 1 from extreme ends, until none left.

Logic:

q0 deletes a leftmost 0 and let q1 scan through end of string, q0 accepts on epsilon

q1 scans over the string and makes q2 expecting 1 on the left

q2 deletes 1 and let q3 “scan left” looking for the start of current string q3 lets q0 start the next iteration

q4 is the final state

Any bug?

Try on:

00

001

000111

11

011

167 of 221

TOC notes by S P CHARI

And his example of a correct TM for the language that goes on infinite loop outside language: {0n1n | n ≥ 0}

q0

0

(q1, B, R)

1

(q3, 1, L)

B

(q4, B, R)

q1

(q1, 0, R)

(q1, 1, R)

(q2, B, L)

q2

-

(q3, B, L)

-

q3

(q3, 0, L)

(q3, 1, L)

(q0, B, R)

q4* - - -

This machine still works correctly for all strings in the language, but

Logic:

start a string with 1 (not in the language), and it loops on B1 for ever.

168 of 221

TOC notes by S P CHARI

Exercises: Construct a DTM for each of the following.

{w | w is in {0,1}* and w ends in 00}

{w | w is in {0,1}* and w contains at least two 0’s}

{w | w is in {0,1}* and w contains at least one 0 and one 1}

Just about anything else (simple) you can think of

169 of 221

Formal Definitions for DTMs

TOC notes by S P CHARI

Let M = (Q, Σ, Г, δ, q0, B, F) be a TM.

Definition: An instantaneous description (ID) is a triple α12, where:

q, the current state, is in Q

α1α2, is in Г*, and is the current tape contents up to the rightmost non-blank symbol, or the symbol to the left of the tape head,

whichever is rightmost

The tape head is currently scanning the first symbol of α2

At the start of a computation α1= ε

If α2= ε then a blank is being scanned

Example: (for TM #1)

q00011 Xq1011

X0q111

Xq20Y1

q2X0Y1

Xq00Y1 XXq1Y1

XXYq11

XXq2YY

XXq0YY XXYq3Y

XXYYq3

XXYYBq4

Xq2XYY

170 of 221

TOC notes by S P CHARI

Suppose the following is the current ID of a DTM

x1x2…xi-1qxixi+1…xn Case 1) δ(q, xi) = (p, y, L)

  1. if i = 1 then qx1x2…xi-1xixi+1…xn |— pByx2…xi-1xixi+1…xn
  2. else x1x2…xi-1qxixi+1…xn |— x1x2…xi-2pxi-1yxi+1…xn

If any suffix of xi-1yxi+1…xn is blank then it is deleted.

Case 2) δ(q, xi) = (p, y, R)

x1x2…xi-1qxixi+1…xn |— x1x2…xi-1ypxi+1…xn

If i>n then the ID increases in length by 1 symbol

x1x2…xnq |— x1x2…xnyp

171 of 221

TOC notes by S P CHARI

Definition: Let M = (Q, Σ, Г, δ, q0, B, F) be a TM, and let w be a string in Σ*. Then w is accepted by M iff

q0w |—* α12

where p is in F and α1 and α2 are in Г*

Definition: Let M = (Q, Σ, Г, δ, q0, B, F) be a TM. The language accepted by M, denoted L(M), is the set

{w | w is in Σ* and w is accepted by M}

Notes:

In contrast to FA and PDAs, if a TM simply passes through a final state then the string is accepted.

Given the above definition, no final state of a TM need to have any transitions. Henceforth, this is our assumption.

If x is NOT in L(M) then M may enter an infinite loop, or halt in a non-final state.

Some TMs halt on ALL inputs, while others may not. In either case the language defined by TM is still well defined.

172 of 221

TOC notes by S P CHARI

Definition: Let L be a language. Then L is recursively enumerable if there exists a TM M such that L = L(M).

If L is r.e. then L = L(M) for some TM M, and

If x is in L then M halts in a final (accepting) state.

If x is not in L then M may halt in a non-final (non-accepting) state or no transition is available, or loop forever.

Definition: Let L be a language. Then L is recursive if there exists a TM M such that L = L(M) and M halts on all inputs.

If L is recursive then L = L(M) for some TM M, and

If x is in L then M halts in a final (accepting) state.

If x is not in L then M halts in a non-final (non-accepting) state or no transition is available (does not go to infinite loop).

Notes:

The set of all recursive languages is a subset of the set of all recursively enumerable languages

Terminology is easy to confuse: A TM is not recursive or recursively enumerable, rather a language is recursive or recursively

enumerable.

173 of 221

Recall the Hierarchy:

TOC notes by S P CHARI

Recursively Enumerable Languages

Recursive Languages Context-Sensitive Languages

Context-Free Languages - ε

Regular Languages - ε

Non-Recursively Enumerable Languages

174 of 221

TOC notes by S P CHARI

Observation: Let L be an r.e. language. Then there is an infinite list M0, M1, … of TMs such that L = L(Mi).

Question: Let L be a recursive language, and M0, M1, … a list of all TMs such that L = L(Mi), and choose any i>=0. Does Mi always halt?

Answer: Maybe, maybe not, but at least one in the list does.

Question: Let L be a recursive enumerable language, and M0, M1, … a list of all TMs such that L = L(Mi), and

choose any i>=0. Does Mi always halt?

Answer: Maybe, maybe not. Depending on L, none might halt or some may halt.

If L is also recursive then L is recursively enumerable, recursive is subset of r.e.

Question: Let L be a r.e. language that is not recursive (L is in r.e. – r), and M0, M1, … a list of all TMs such that L = L(Mi), and choose any i>=0. Does Mi always halt?

Answer: No! If it did, then L would not be in r.e. – r, it would be recursive.

175 of 221

TOC notes by S P CHARI

L is Recursively enumerable:

TM exist: M0, M1, …

They accept string in L, and do not accept any string outside L

L is Recursive:

at least one TM halts on L and on ∑*-L, others may or may not

L is Recursively enumerable but not Recursive:

TM exist: M0, M1, …

but none halts on all x in ∑*-L

M0 goes on infinite loop on a string p in ∑*-L, while M1 on q in ∑*-L

However, each correct TM accepts each string in L, and none in ∑*-L

L is not R.E:

no TM exists

176 of 221

Let M be a TM.

TOC notes by S P CHARI

Question: Is L(M) r.e.?

Answer: Yes! By definition it is!

Question: Is L(M) recursive?

Answer: Don’t know, we don’t have enough information.

Question: Is L(M) in r.e – r?

Answer: Don’t know, we don’t have enough information.

177 of 221

Let M be a TM that halts on all inputs:

TOC notes by S P CHARI

Question: Is L(M) recursively enumerable?

Answer: Yes! By definition it is!

Question: Is L(M) recursive?

Answer: Yes! By definition it is!

Question: Is L(M) in r.e – r?

Answer: No! It can’t be. Since M always halts, L(M) is recursive.

178 of 221

Let M be a TM.

TOC notes by S P CHARI

As noted previously, L(M) is recursively enumerable, but may or may not be recursive.

Question: Suppose, we know L(M) is recursive. Does that mean M always halts?

Answer: Not necessarily. However, some TM M’ must exist such that L(M’) = L(M) and M’ always halts.

Question: Suppose that L(M) is in r.e. – r. Does M always halt?

Answer: No! If it did then L(M) would be recursive and therefore not in r.e. – r.

179 of 221

Let M be a TM, and suppose that M loops forever on some string x.

Question: Is L(M) recursively enumerable?

Answer: Yes! By definition it is. But, obviously x is not in L(M).

Question: Is L(M) recursive?

Answer: Don’t know. Although M doesn’t always halt, some other TM M’ may exist such that L(M’) = L(M) and M’

always halts.

Question: Is L(M) in r.e. – r?

Answer: Don’t know.

May be another M’ will halt on x, and on all strings! May be no TM for this L(M) does halt on all strings! We just do

not know!

TOC notes by S P CHARI

180 of 221

Modifications of the Basic TM Model

TOC notes by S P CHARI

Other (Extended) TM Models:

One-way infinite tapes Multiple tapes and tape heads Non-Deterministic TMs

Multi-Dimensional TMs (n-dimensional tape)

Multi-Heads

Multiple tracks

All of these extensions are equivalent to the basic DTM model

181 of 221

Closure Properties for Recursive and Recursively Enumerable Languages

TOC notes by S P CHARI

TMs model General Purpose (GP) Computers:

If a TM can do it, so can a GP computer

If a GP computer can do it, then so can a TM

If you want to know if a TM can do X, then some equivalent question are:

Can a general purpose computer do X?

Can a C/C++/Java/etc. program be written to do X?

For example, is a language L recursive?

Can a C/C++/Java/etc. program be written that always halts and accepts L?

182 of 221

TM Block Diagrams:

TOC notes by S P CHARI

If L is a recursive language, then a TM M that accepts L and always halts can be pictorially represented by a “chip” or

“box” that has one input and two outputs.

If L is a recursively enumerable language, then a TM M that accepts L can be pictorially represented by a “box” that has

one output.

w

yes

no

M

yes

Conceivably, M could be pwrovided with an outputMfor “no,” but this output cannot be counted on. Consequently, we simply ignore it.

183 of 221

TOC notes by S P CHARI

Theorem 1: The recursive languages are closed with respect to complementation, i.e., if L is a recursive

language, then so is

Proof: Let M be a TM such that L = L(M) and M always halts. Construct TM M’ as follows:

Note That:

M’ accepts iff M does not

M’ always halts since M always halts

From this it follows that the complement of L is recursive. •

Question: How is the construction achieved? Do we simply complement the final states in the TM? No! A string in L could end up in the complement of L.

Suppose q5 is an accepting state in M, but q0 is not.

If we simply complemented the final and non-final states, then q0 would be an accepting state in M’ but q5 would not. Since q0 is an accepting state, by definition all strings are accepted by M’

L = Σ * L

w

yes

no

M

yes

no

M’

184 of 221

L = L L

Theorem 2: The recursive languages are closed with respect to union, i.e., if L1 and L2 are recursive languages, then so is

Proof: Let M1 and M2 be TMs such that L1 = L(M1) and L2 = L(M2) and M1 and M2 always halts.

Construct TM M’ as follows:

Note That:

L(M’) = L(M1) L(M2)

L(M’) is a subset of L(M1) U L(M2) L(M1) U L(M2) is a subset of L(M’)

1 2

It follows from this that

is recursive. •

3 1 2

w

yes

no

M1

yes

no

M2

start

M’

M’ always halts since M andM always halt

TOCLno3tes =by PaLllav1i Joshi L2

185 of 221

TOC notes by S P CHARI

Theorem 3: The recursive enumerable languages are closed with respect to union, i.e., if L1 and L2 are recursively

enumerable languages, then so is

Proof: Let M1 and M2 be TMs such that L1 = L(M1) and L2 = L(M2). Construct M’ as follows:

Note That:

L(M’) = L(M1) U L(M2)

L(M’) is a subset of L(M1) U L(M2)

L(M1) U L(M2) is a subset of L(M’)

M’ halts and accepts iff M1 or M2 halts and accepts

It follows from this that

is recursively enumerable. •

Question: How do you run two TMs in parallel?

L3 = L1 L2

2

3 1

L = L L

w

yes

M1

yes

yes

M2

M’

186 of 221

Suppose, M1 and M2 had outputs for “no” in the previous construction, and these were transferred to the “no” output for M’

TOC notes by S P CHARI

Question: What would happen if w is in L(M1) but not in L(M2)?

Answer: You could get two outputs – one “yes” and one “no.”

At least M1 will halt and answer accept, M2 may or may not halt.

As before, for the sake of convenience the “no” output will be ignored.

w

yes

M1

yes

yes

M2

M’

no

no

no

187 of 221

are both Lrecursively enumerable then L (and therefore

TOC notes by S P CHARI

Theorem 4: If L and

) is recursive.

Proof: Let M1 and M2 be TMs such that L = L(M1) and

L(M’) is a subset of L L is a subset of L(M’)

M’ is TM for L

M’ always halts since either M1 or M2 halts for any given string

M’ shows that L is recursive

It follows from this that L (and therefore its’ complement) is recursive.

So, is also recursive (we proved it before). •

L

= L(M2). Construct M’ as follows:

L

Note That:

L(M’) = L w

yes

M1

yes

yes

M2

M’

no

L

188 of 221

Corollary of Thm 4: Let L be a subset of Σ*. Then one of the following must be true:

TOC notes by S P CHARI

Both L and One of L and Neither L nor

are recursive.

is recursively enumerable but not recursive, and the other is not recursively enumerable, or

is recursively enumerable

In other words, it is impossiLble to have both L and

r.e. but not recursive

L

L

L

189 of 221

In terms of the hierarchy: (possibility #1)

TOC notes by S P CHARI

Recursive Languages

Recursively Enumerable Languages

Non-Recursively Enumerable Languages

L

L

190 of 221

In terms of the hierarchy: (possibility #2)

TOC notes by S P CHARI

Recursive Languages

Recursively Enumerable Languages

Non-Recursively Enumerable Languages

L

L

191 of 221

In terms of the hierarchy: (possibility #3)

TOC notes by S P CHARI

Recursive Languages

Recursively Enumerable Languages

Non-Recursively Enumerable Languages

L

L

192 of 221

In terms of the hierarchy: (Impossibility #1)

TOC notes by S P CHARI

Recursive Languages

Non-Recursively Enumerable Languages

L L

Recursively Enumerable Languages

193 of 221

In terms of the hierarchy: (Impossibility #2)

TOC notes by S P CHARI

L

Recursively Enumerable Languages

L

Recursive Languages

Non-Recursively Enumerable Languages

194 of 221

In terms of the hierarchy: (Impossibility #3)

TOC notes by S P CHARI

Recursive Languages

Recursively Enumerable Languages

Non-Recursively Enumerable Languages

L

L

195 of 221

Note: This gives/identifies three approaches to show that a language is not recursive.

TOC notes by S P CHARI

Show that the language’s complement is not recursive, in one of the two ways:

Show that the language’s complement is recursively enumerable but not recursive Show that the language’s complement is not even recursively enumerable

196 of 221

The Halting Problem - Background

TOC notes by S P CHARI

Definition: A decision problem is a problem having a yes/no answer (that one presumably wants to solve with a computer). Typically, there is a list of parameters on which the problem is based.

Given a list of numbers, is that list sorted?

Given a number x, is x even?

Given a C program, does that C program contain any syntax errors?

Given a TM (or C program), does that TM contain an infinite loop?

From a practical perspective, many decision problems do not seem all that interesting. However,

from a theoretical perspective they are for the following two reasons:

Decision problems are more convenient/easier to work with when proving complexity results. Non-decision counter-parts can always be created & are typically at least as difficult to solve.

Notes:

The following terms and phrases are analogous:

Algorithm Decision Problem (un)Decidable

  • A halting TM program
  • A language(will show shortly)
  • (non)Recursive

197 of 221

Statement of the Halting Problem

TOC notes by S P CHARI

Practical Form: (P1)

Input: Program P and input I.

Question: Does P terminate on input I?

Theoretical Form: (P2)

Input: Turing machine M with input alphabet Σ and string w in Σ*.

Question: Does M halt on w?

A Related Problem We Will Consider First: (P3)

Input: Turing machine M with input alphabet Σ and one final state, and string w in Σ*. Question: Is w in L(M)?

Analogy:

Input: DFA M with input alphabet Σ and string w in Σ*.

Question: Is w in L(M)?

Is this problem (regular language) decidable? Yes! DFA always accepts or rejects.

198 of 221

Over-All Approach:

TOC notes by S P CHARI

We will show that a language Ld is not recursively enumerable

From this it will follow that is not recursive

Using this we will show that a language Lu is not recursive

From this it will follow that the halting problemLdis undecidable.

As We Will See:

P3 will correspond to the language Lu

Proving P3 (un)decidable is equivalent to proving Lu (non)recursive

199 of 221

Converting the Problem to a Language

TOC notes by S P CHARI

Let M = (Q, Σ, Γ, δ, q1, B, {qn}) be a TM, where

Q = {q1, q2, … , qn}, order the states from 1 through n Σ = {x1, x2} = {0, 1}

Γ = {x1, x2, x3} = {0, 1, B}

Encode each transition:

δ(qi, xj) = (qk , xl, dm)

where qi and qk are in ordered Q xj and xl are in Σ,

and dm is in {L, R} = {d1, d2}

as:

0i10j10k10l10m where the number of 0’s indicate the corresponding id, and single 1 acts as a barrier

The TM M can then be encoded as:

111code111code211code311 … 11coder111

where each codei is one transitions’ encoding, and 11’s are barriers between transitions from the table row-major.

Let this encoding of M be denoted by <M>.

200 of 221

TOC notes by S P CHARI

Less Formally:

Every state, tape symbol, and movement symbol is encoded as a sequence of 0’s:

q1,

0

q2,

00

q3

000

:

0

0

1

00

B

000

L

0

R

00

Note that 1’s are not used to represent the above, since 1 is used as a special separator symbol.

Example:

δ(q2, 1) = (q3 , 0, R)

Is encoded as:

00100100010100

201 of 221

TOC notes by S P CHARI

q1

0

(q1, 0, R)

1

(q1, 1, R)

B

(q2, B, L)

q2

(q3, 0, R)

-

-

q3

-

-

-

What is the L(M)?

Coding for the above table:

1110101010100110100101001001101000100100010110010100010100111

Are the followings correct encoding of a TM?

01100001110001

111111

202 of 221

Definition:

TOC notes by S P CHARI

Lt = {x | x is in {0, 1}* and x encodes a TM}

Question: Is Lt recursive?

Answer: Yes. [Check only for format, i.e. the order and number of 0’s and 1’s, syntax checking]

Question: Is Lt decidable:

Answer: Yes (same question).

203 of 221

The Universal Language

TOC notes by S P CHARI

Define the language Lu as follows:

Lu = {x | x is in {0, 1}* and x = <M,w> where M is a TM encoding and w is in L(M)}

Let x be in {0, 1}*. Then either:

  1. x doesn’t have a TM prefix, in which case x is not in Lu
  2. x has a TM prefix, i.e., x = <M,w> and either:
    1. w is not in L(M), in which case x is not in Lu
    2. w is in L(M), in which case x is in Lu

204 of 221

TOC notes by S P CHARI

Recall:

0

1

B

q1

(q1, 0, R)

(q1, 1, R)

(q2, B, L)

q2

(q3, 0, R)

-

-

q3

-

-

-

Which of the following are in Lu?

1110101010100110100101001001101000100100010110010100010100111

111010101010011010010100100110100010010001011001010001010011101110

111010101010011010010100100110100010010001011001010001010011100110111

01100001110001

111111

205 of 221

TOC notes by S P CHARI

Compare P3 and Lu:

(P3):

Input: Turing machine M with input alphabet Σ and one final state, and string w in Σ*. Question: Is w in L(M)?

Lu = {x | x is in {0, 1}* and x = <M,w> where M is a TM encoding and w is in L(M)}

Universal TM (UTM) is the machine for Lu

presuming it is r.e.! Can you write a program to accept strings in Lu?

Notes:

Lu is P3 expressed as a language

Asking if Lu is recursive is the same as asking if P3 is decidable.

Can you write a Halting program for accept/reject of strings in Sigma* ?

We will show that Lu is not recursive, and from this it will follow that P3 is un-decidable. From this we can further show that the Halting problem is un-decidable.

=> A general concept: a decision problem ≡ a formal language

206 of 221

TOC notes by S P CHARI

Define another language Ld as follows:

[Ld_bar = {self accepting TM encodings}, everything else is Ld ]

Ld = {x | x is in {0, 1}* and (a) either x is not a TM,

(b) or x is a TM, call it M, and x is not in L(M)} (1)

Note, there is only one string x

And, the question really is the complement of “does a TM accept its own encoding?” (Ld-bar’s complement)

Let x be in {0, 1}*. Then either:

  1. x is not a TM, in which case x is in Ld
  2. x is a TM, call it M, and either:
    1. x is not in L(M), in which case x is in Ld
    2. x is in L(M), in which case x is not in Ld

207 of 221

TOC notes by S P CHARI

Recall:

0

1

B

q1

(q1, 0, R)

(q1, 1, R)

(q2, B, L)

q2

(q3, 0, R)

-

-

q3

-

-

-

Which of the following are in Ld?

11101010101001101001010010011010001000100010110010100010100111

01100001110001

Change above machine to accept strings ending with 1: the encoding will not be in Ld

208 of 221

TOC notes by S P CHARI

Lemma: Ld is not recursively enumerable. [No TM for Ld!!!]

Proof: (by contradiction)

Suppose that Ld is recursively enumerable. In other words, there exists a TM M such that:

Ld = L(M) (2)

Now suppose that w is a string encoding of M. (3) Case 1) w is in Ld (4)

By definition of Ld given in (1), either w does not encode a TM, or w does encode a TM, call it M, and w is not in L(M). But we know that w

encodes a TM (3: that’s where it came from). Therefore:

w is not in L(M)

(5)

But then (2) and (5) imply that w is not in Ld contradicting (4).

Case 2) w is not in Ld

(6)

By definition of Ld given in (1), w encodes a TM, call it M, and:

w is in L(M)

(7)

But then (2) and (7) imply that w is in Ld contradicting (6).

Since both case 1) and case 2) lead to a contradiction, no TM M can exist such that Ld = L(M). Therefore Ld is not recursively enumerable. •

209 of 221

TOC notes by S P CHARI

Note:

= {x | x is in {0, 1}*, x encodes a TM, call it M, and x is in L(M)}

is not recursive.

Proof: If

Ld

Corollary:

were recursive, then Ld would be recursive, and therefore recursively enumerable, a contradiction. •

Ld

Ld

210 of 221

TOC notes by S P CHARI

Suppose that M’ always halts and Lu = L(M’). It follows that:

M’’ always halts

Theorem: Lu is not recursive.

Proof: (by contradiction)

Suppose that Lu is recursive. Recall that:

Lu = {x | x is in {0, 1}* and x = <M,w> where M is a TM encoding and w is in L(M)}

Suppose that Lu = L(M’) where M’ is a TM that always halts. Construct an algorithm (i.e., a TM that always halts) for

as follows:

Ld

L(M’’) =

would therefore be recursive, a contradiction. •

Ld

Ld

Yes

No

Yes

No

Let M be the TM that w encodes.

M’:

UTM for Lu

Yes

No

<M,w>

(i.e., <w,w>)

Is w a TM?

Lt

w

M’’: for Ld-bar

211 of 221

L_u is recursively enumerable

(you may ignore this slide, for now)

TOC notes by S P CHARI

Input the string

Decode the TM prefix, if it doesn't have one then the string is not in Lu

Otherwise, run/simulate the encoded TM on the suffix

If it terminates and accepts then the original string is in Lu.

If a given string is in Lu, then the above algorithm will correctly determine that, halt and say yes. If the given string is not in Lu, then there are three cases:

  1. the string doesn't have a TM as a prefix. In this case the above algo correctly detects this fact, and reports the string is not in Lu.
  2. the string has a TM prefix, and the TM halts and rejects on the suffix. In this case the above algo correctly reports the string is not in Lu.
  3. the string has a TM prefix, but it goes into an infinite loop on the suffix. In this case the above algo also goes into an infinite loop, but that’s ok since the string as a whole is not in Lu anyway, and we are just trying to show there exists a TM for only accepting strings in Lu.

From this proof note that if the prefix TM is a DFA or PDA, then our machine will also halt in the 3rd case above, no matter what the suffix is.

-- due to Dr. Bernhard (edited by me)

212 of 221

TOC notes by S P CHARI

The over-all logic of the proof is as follows:

  1. If Lu were recursive, then so will be
  2. is not recursive, because Ld is not r.e.

3. It follows that Lu is not recursive.

The second point was established by the corollary.

The first point was established by the theorem on a preceding slide.

This type of proof is commonly referred to as a reduction. Specifically, the problem of recognizing

recognizing Lu

was reduced to the problem of

Ld

Ld

Ld

213 of 221

TOC notes by S P CHARI

Define another language Lh:

Lh = {x | x is in {0, 1}* and x = <M,w> where M is a TM encoding and M halts on w}

Note that Lh is P2 expressed as a language:

(P2):

Input: Turing machine M with input alphabet Σ and string w in Σ*.

Question: Does M halt on w?

214 of 221

TOC notes by S P CHARI

Theorem: Lh is not recursive.

Proof: (by contradiction)

Suppose that Lh is recursive. Recall that:

Lh = {x | x is in {0, 1}* and x = <M,w> where M is a TM encoding and M halts on w}

and

Lu = {x | x is in {0, 1}* and x = <M,w> where M is a TM encoding and w is in L(M)}

Suppose that Lh = L(M’) where M’ is a TM that always halts. Construct an algorithm (i.e., a TM that always halts) for Lu

as follows:

Suppose that M’ always halts and Lh = L(M’). It follows that:

M’’ always halts

L(M’’) = Lu

Lu would therefore be recursive, a contradiction. •

Yes

No

Yes

No

Simulate M On w

Yes

No

M’ for Lh: does M halt on w?

<M,w>

M’’ : UTM for Lu

start

215 of 221

TOC notes by S P CHARI

The over-all logic of the proof is as follows:

  1. If Lh is recursive, then so is Lu
  2. Lu is not recursive
  3. It follows that Lh is not recursive.

The second point was established previously.

The first point was established by the theorem on the preceding slide.

This proof is also a reduction. Specifically, the problem of recognizing Lu was reduced to the problem of recognizing Lh.

[Lu and Lh both are recursively enumerable: for proof see Dr. Shoaff!]

216 of 221

TOC notes by S P CHARI

217 of 221

TOC notes by S P CHARI

Define another language Lq:

Lq = {x | x is in {0, 1}*, x encodes a TM M, and M does not contain an infinite loop}

Or equivalently:

Lq = {x | x is in {0, 1}*, x encodes a TM M, and there exists no string w in {0, 1}*

such that M does not terminate on w}

Note that:

= {x | x is in {0, 1}*, and either x does not encode a TM, or it does encode a TM, call it M, and there exists a string w in {0, 1}* such that M does not terminate on w}

Note that the above languages correspond to the following problem:

(P0):

Input: Program P.

Question: Does P contain an infinite loop?

Using the techniques discussed, what can we prove about Lq or its’ complement?

Lq

218 of 221

TOC notes by S P CHARI

  • More examples of non-recursive languages:

Lne = {x | x is a TM M and L(M) is not empty} is r.e. but not recursive.

Le = {x | x is a TM M and L(M) is empty} is not r.e.

Lr = {x | x is a TM M and L(M) is recursive} is not r.e.

Note that Lr is not the same as Lh = {x | x is a TM M that always halts}

but Lh is in Lr.

Lnr = {x | x is a TM M and L(M) is not recursive} is not r.e.

219 of 221

Ignore this slide

Lemma: Ld is not recursively enumerable: [No TM for Ld!!!]

TOC notes by S P CHARI

Proof: (by contradiction)

Suppose that Ld were recursively enumerable. In other words, that there existed a TM M such that:

Ld = L(M) (2)

Now suppose that wj is a string encoding of M. (3) Case 1) wj is in Ld (4)

By definition of Ld given in (1), either wj does not encode a TM, or wj does encode a TM, call it M, and wj is not in L(M). But we know that

wj encodes a TM (3: that’s where it came from). Therefore:

wj is not in L(M)

But then (2) and (5) imply that wj is not in Ld contradicting (4).

Case 2) wj is not in Ld

(5)

(6)

By definition of Ld given in (1), wj encodes a TM, call it M, and:

wj is in L(M)

(7)

But then (2) and (7) imply that wj is in Ld contradicting (6).

Since both case 1) and case 2) lead to a contradiction, no TM M can exist such that Ld = L(M). Therefore Ld is not recursively enumerable. •

220 of 221

TOC notes by S P CHARI

(

)

B

findPair

(findPair2, "(", R)

-

(final, B, R)

findPair2

(findPair2, "(", R)

(removePair, ")", L)

-

removePair

(fetch, "(", R)

(fetch, ")", R)

(goBack, B, L)

fetch

(retrieve, "(", R)

(retreive, ")", R)

(retreive, B, R)

retreive

(returnOpen, "(", L)

(returnClosed, ")", L)

(returnBlank, B, L)

returnOpen

(writeOpen, "(", L)

(writeOpen, ")", L)

(writeOpen, B, L)

returnClosed

(writeClosed, "(", L)

(writeClosed, ")", L)

(writeClosed, B, L)

returnBlank

(writeBlank "(", L)

(writeBlank, ")", L)

(writeBlank, B, L)

writeOpen

(removePair, "(", R)

(removePair, "(", R)

-

writeClosed

(removePair, ")", R)

(removePair, ")", R)

-

writeBlank

(removePair, B, R)

(removePair, B, R)

-

goBack

-

-

(backAgain, B, L)

backAgain

-

-

(seekFront, B, L)

seekFront

(seekFront, "(", L)

(seekFront, ")", L)

(findPair, B, R)

final*

-

-

-

Roger’s TM for balanced parenthesis:

221 of 221

TOC notes by S P CHARI

On 111 111 as a TM encoding

<Quote> It was ambiguous, in my opinion, based on the definition in the Hopcroft book, i.e., the definition in the Hopcroft book was not clear/precise enought to

account this special case. I don't have the book in front of me right now, but I think this is the example I used in class: Consider the TM that has exactly one state, but no transitions. Perfectly valid TM, and it would give us this encoding (111111). In that case the encoded machine would accept sigma* because the highest numbered state would be q0, the only state, and that would be the final state under the Hopcroft encoding. Now consider the TM that has exactly two states, but no transitions. Also a perfectly valid TM, and it would give us the same encoding. In that case the encoded machine would not accept anything because the final state is q1 (highest numbered state), and there is no way to get to it. I used it only as a way to raise that issue in class, i.e., the the Hopcroft definition is a bit ambiguous in this case.

One way to resolve the ambiguity is to require the encoding to specifically specify the final state (at the end or something). In that case, 111111 isn't even a valid TM, since it doesn't specify the final state. Another related question is, does a TM even have to have any states at all to be a valid TM? The encoding would have to be able to isolate that as a unique string also. <End Quote>

Phil Bernhard