4 of 221

Examples

TOC SP CHARI

The set of all strings over {a,b,c} that have ac as substring can be written as {xacy | x,y ∈{a,b,c}∗}.

This can also be written as {x∈{a,b,c}∗ ||x|_ac≥1},

stating that the set of all strings over {a,b,c} in which the number of occurrences of substring ac is at least 1.

The set of all strings over some alphabet Σ with even number of a’s is {x∈Σ∗ ||x|_a= 2n, for some n∈N}.

Equivalently, {x∈Σ∗ ||x|_a≡ 0 mod 2}.

The set of all strings over some alphabet Σ with equal number

of a’s and b’s can be written as {x∈Σ∗ ||x|_a=|x|_b}.

The set of all palindromes over an alphabet Σ can be written as

{x∈Σ∗ | x = x^R}, where x^Ris the string obtained by reversing x.

5 of 221

Finite State Machine

TOC notes by S P CHARI

The finite state machine represents a mathematical model of a system with certain input.

The model finally gives certain output.

The input is processed by various states, these states are called as intermediate states.

The finite state system is very good design tool for the programs such as TEXT EDITORS and LEXICAL ANALYZERS.

6 of 221

Definition of Finite Automata

TOC notes by S P CHARI

A finite automata is a collection of 5-tuple(Q,Σ,δ,q₀,F) Where , Q is finite set of states, which is non empty.

Σ is input alphabet, indicates input set.

δ is transition function or mapping function. We can determine the next state using this function.

q₀is an initial state and is in Q F is set of final states.

7 of 221

Finite Automata Model

TOC notes by S P CHARI

The finite automata can be represented as follows:

Input Tape

A	B	A	B	A	B	A	B	A	B

Finite

Control

Tape

Reader

Input Tape: It is a linear tape having some number of cells. Each input symbol is placed in each cell.

Finite Control: The finite control decides the next state on receiving particular input from input tape.

Out put

8 of 221

Types of Automata

TOC notes by S P CHARI

Finite Automata

Deterministic Finite Automata

Non Deterministic Finite Automata

9 of 221

Types of Automata

Deterministic Finite Automata: The Finite Automata is called Deterministic Finite Automata if there is only one path for a specific input from current state to next state.

It can be represented as follows:

A machine M = (Q,Σ,δ,q₀,F) Where ,

Q is finite set of states, which is non empty.

Σ is input alphabet, indicates input set.

δ is transition function or mapping function. We can determine the next state using this function.

q₀is an initial state and is in Q F is set of final states.

Where δ:Q X Σ -> Q

TOC notes by S P CHARI

KMIT

10 of 221

Types of Automata

Non Deterministic Finite Automata: The Finite Automata is called Non Deterministic Finite Automata if there are more than one path for a specific input from current state to next state.

It can be represented as follows:

A machine M = (Q,Σ,δ,q₀,F) Where ,

Q is finite set of states, which is non empty.

Σ is input alphabet, indicates input set.

δ is transition function or mapping function. We can determine the next state using this function.

q₀is an initial state and is in Q F is set of final states.

_Where_δ_{:Q X}_Σ_{-> 2}TO_QC notes by Pallavi Joshi

11 of 221

Difference between DFA & NFA

TOC notes by S P CHARI

Deterministic Finite Automata	Non Deterministic Finite Automata
For Every symbol of the alphabet, there is only one state transition in DFA.	We do not need to specify how does the NFA react according to some symbol.
DFA cannot use Empty String transition.	NFA can use Empty String transition.
DFA can be understood as one machine.	NFA can be understood as multiple little machines computing at the same time.
DFA will reject the string if it end at other than accepting state.	If all of the branches of NFA dies or rejects the string, we can say that NFA reject the string.
Backtracking is allowed in DFA.	Backtracking is not always allowed in NFA.
DFA is more difficult to construct.	NFA is easier to construct.

12 of 221

Hierarchy of languages

TOC notes by S P CHARI

Recursively Enumerable Languages

Recursive Languages Context-Free Languages

Regular Languages

Non-Recursively Enumerable Languages

13 of 221

Deterministic Finite State Automata (DFA)

TOC notes by S P CHARI

……..

ken into cells

ead.

One-way, infinite tape, bro One-way, read-only tape h Finite control, i.e.,

finite number of states, and

transition rules between them, i.e.,

a program, containing the position of the read head, current symbol being scanned, and the current “state.”

A string is placed on the tape, read head is positioned at the left end, and the DFA will read the string one symbol at a time until all symbols have been read. The DFA will then either accept or reject the string.

Finite Control

0	1	1	0	0

14 of 221

TOC notes by S P CHARI

The finite control can be described by a transition diagram or table:

Example #1:

q₀q₀q₁

1 1

q₀q₀

One state is final/accepting, all others are rejecting.

The above DFA accepts those strings that contain an even number of 0’s, including the

null string, over Sigma = {0,1}

L = {all strings with zero or more 0’s}

Note, the DFA must reject all other strings

q₀

0 0

q₁

q₀

15 of 221

Note:

TOC notes by S P CHARI

Machine is for accepting a language, language is the purpose!

Many equivalent machines may accept the same language, but a machine cannot accept multiple languages!

Id’s of the characters or states are irrelevant,

you can call them by any names!

Sigma = {0, 1} ≡ {a, b}

States = {q0, q1} ≡ {u, v}, as long as they have

identical (isomorphic) transition table

….

M-inf

16 of 221

An equivalent machine to the previous example (DFA for even number of 0’s):

TOC notes by S P CHARI

q₀q₃q₁q₂q₂q₂

One state is final/accepting, all others are rejecting.

The above DFA accepts those strings that contain an even number of 0’s, including null

string, over Sigma = {0,1}

Can you draw a machine for a language by excluding the null string from the language? L =

{all strings with 2 or more 0’s}

q₀

1 0 0 1 1_q₁

accept string

q₂

q₃

17 of 221

Example #2:

TOC notes by S P CHARI

accepted

q₀

q₁

q₂

rejected

q₀q₀q₀q₁

Accepts those strings that contain at least two c’s

q₁

q₀

q₂

a/b/c

18 of 221

a/b/c

q₂

a q₀

q₁

Inductive Proof (sketch): that the machine correctly accepts strings with at least two c’s

Proof goes over the length of the string.

Base: x a string with |x|=0. state will be q0 => rejected.

Inductive hypothesis: |x|= integer k, & string x is rejected - in state q0 (x must have zero c),

OR, rejected – in state q1 (x must have one c),

OR, accepted – in state q2 (x has already with two c’s)

Inductive steps: Each case for symbol p, for string xp (|xp| = k+1), the last symbol p = a, b or c

	xa	xb	xc
x ends in q0	q0 =>reject (still zero c => should reject)	q0 =>reject (still zero c => should reject)	q1 =>reject (still zero c => should reject)
x ends in q1	q1 =>reject (still one c => should reject)	q1 =>reject (still one c => should reject)	q2 =>accept (two c now=> should accept)
x ends in q2	q2 =>accept (two c already => should ^accept)TOC notes by Palla	q2 =>accept (two c already => should _{vi J}a_oc_sc_he_ipt)	q2 =>accept (two c already => should accept)

19 of 221

Formal Definition of a DFA

TOC notes by S P CHARI

A DFA is a five-tuple:

M = (Q, Σ, δ, q₀, F)

Q Σ

q₀

A finite set of states

A finite input alphabet

The initial/starting state, q₀is in Q

A set of final/accepting states, which is a subset of Q

A transition function, which is a total function from Q x Σ to Q

δ: (Q x Σ) –> Q

δ(q,s) = q’

δ is defined for any q in Q and s in Σ, and

is equal to some state q’ in Q, could be q’=q

Intuitively, δ(q,s) is the state entered by M after reading symbol s while in state q.

20 of 221

Revisit example #1:

TOC notes by S P CHARI

Q = {q₀, q₁}

Σ = {0, 1}

Start state is q₀

F = {q₀}

δ:

q₁

q₀

q₁

q₀	q₁

q₁

21 of 221

Revisit example #2:

TOC notes by S P CHARI

Q = {q₀, q₁, q₂}

Σ = {a, b, c} Start state is q₀F = {q₂}

δ:

q₀

b q₀

q₁

q₀

q₁	q₁	q₂
q₂	q₂	q₂
n, at each st	ep M has	exactly on

q₁

q₂

Since δ is a functio e option.

It follows that for a given string, there is exactly one computation.

q₁

q₀

q₂

a/b/c

22 of 221

Extension of δ to Strings

TOC notes by S P CHARI

δ^{^}: (Q x Σ^*) –> Q

δ^{^}(q,w) – The state entered after reading string w having started in state q.

Formally:

δ^{^}(q, ε) = q, and
For all w in Σ^*and a in Σ

δ^{^}(q,wa) = δ (δ^{^}(q,w), a)

23 of 221

Recall Example #1:

TOC notes by S P CHARI

δ^{^}(q₀, 011)

= δ (δ^{^}(q₀,01), 1)

= δ (δ ( δ^{^}(q₀,0), 1), 1)

= δ (δ (δ (δ^{^}(q₀, λ), 0), 1), 1)

= δ (δ (δ(q₀,0), 1), 1)

= δ (δ (q₁, 1), 1)

= δ (q₁, 1)

= q₁

by rule #2

by rule #1

by definition of δ by definition of δ by definition of δ

Is 011 accepted? No, since δ^{^}(q₀, 011) = q₁is not a final state.

q₀

q₁

What is δ^{^}(q₀, 011)? Informally, it is the state entered by⁰M after processing 011 having started in state q_0.

Formally:

24 of 221

Note that:

TOC notes by S P CHARI

δ^{^}(q,a) = δ(δ^{^}(q, ε), a)

= δ(q, a)

by definition of δ^{^}, rule #2

by definition of δ^{^}, rule #1

Therefore:

δ^{^}(q, a₁a₂…a_n) = δ(δ(…δ(δ(q, a1), a2)…), a_n)

However, we will abuse notations, and use δ in place of δ^{^}: δ^{^}(q, a₁a₂…a_n) = δ(q, a₁a₂…a_n)

25 of 221

Example #3:

TOC notes by S P CHARI

δ(q₀, 011)

= δ (δ(q₀,01), 1)

= δ (δ (δ(q₀,0), 1), 1)

= δ (δ (q₁, 1), 1)

= δ (q₁, 1)

= q₁

by rule #2

by definition of δ by definition of δ by definition of δ

Is 011 accepted? No, since δ(q₀, 011) = q₁is not a final state.

Language?

L ={ all strings over {0,1} that has 2 or more 0 symbols}

⁰1

What is δ(q₀, 011)? Informally, it is the state entered by M after process0ing 011 having

started in state q_0.

Formally:

26 of 221

Recall Example #3:

TOC notes by S P CHARI

What is δ(q₁, 10)?

δ(q₁, 10)	= δ (δ(q₁,1), 0)	by rule #2
	= δ (q₁, 0)	by definition of δ
	= q₂	by definition of δ

Is 10 accepted? No, since δ(q₀, 10) = q₁is not a final state. The fact that δ(q₁, 10) = q₂is

irrelevant, q1 is not the start state!

27 of 221

Definitions related to DFAs

Let M = (Q, Σ, δ,q₀,F) be a DFA and let w be in Σ^*. Then w is accepted by M iff δ(q₀,w) = p for some state p in F.

Let M = (Q, Σ, δ,q₀,F) be a DFA. Then the language accepted by M is the set:

L(M) = {w | w is in Σ^*and δ(q₀,w) is in F} Another equivalent definition:

L(M) = {w | w is in Σ^*and w is accepted by M}

Let L be a language. Then L is a regular language iff there exists a DFA M such that L = L(M).

Let M₁= (Q₁, Σ₁, δ₁, q₀, F₁) and M₂= (Q₂, Σ₂, δ₂, p₀, F₂) be DFAs. Then M₁and M₂are equivalent iff

L(M₁) = L(M₂).

TOC notes by S P CHARI

28 of 221

TOC notes by S P CHARI

Notes:

A DFA M = (Q, Σ, δ,q₀,F) partitions the set Σ^*into two sets: L(M) and

Σ^*- L(M).

If L = L(M) then L is a subset of L(M) and L(M) is a subset of L (def. of set equality). Similarly, if L(M₁) = L(M₂) then L(M₁) is a subset of L(M₂) and L(M₂) is a subset of L(M₁). Some languages are regular, others are not. For example, if

Regular: L₁= {x | x is a string of 0's and 1's containing an even

number of 1's} and

Not-regular: L₂= {x | x = 0ⁿ1ⁿfor some n >= 0}

Can you write a program to “simulate” a given DFA, or any arbitrary input DFA?

Question we will address later:

How do we determine whether or not a given language is regular?

29 of 221

Give a DFA M such that:

TOC notes by S P CHARI

L(M) = {x | x is a string of 0’s and 1’s and |x| >= 2}

Prove this by induction

q₁

q₀

q₂

0/1

30 of 221

Give a DFA M such that:

TOC notes by S P CHARI

L(M) = {x | x is a string of (zero or more) a’s, b’s and c’s such that x does not contain the substring aa}

Logic:

In Start state (q0): b’s and c’s: ignore – stay in same state

q0 is also “accept” state

First ‘a’ appears: get ready (q1) to reject

But followed by a ‘b’ or ‘c’: go back to start state q0

When second ‘a’ appears after the “ready” state: go to reject state q2 Ignore everything after getting to the “reject” state q2

a/b/c

q₀q₁q₂

b/c

31 of 221

Give a DFA M such that:

TOC notes by S P CHARI

L(M) = {x | x is a string of a’s, b’s and c’s such that x contains the substring aba}

a/b/c

q₀q₁

Logic: acceptance is straight forward, progressing on each expected symbol

However, rejection needs special ca_bre_/c, in each state (for DFA, we will see this becomes

easier in NFA, non-deterministic machine)

q₂q₃

b/c

32 of 221

Give a DFA M such that:

TOC notes by S P CHARI

q₀

q₇

Remember, you may hav_qe multiple “final” sta_qtes, but only one “st_qart” state

4 5 6

q₂

q₁

q₃

L(M) = {x | x is a string of a’s and b’s such that x

contains both aa and bb}

First do, for a language where ‘aa’ comes before ‘bb’

Then do its reverse; and then parallelize them.

a/b

33 of 221

TOC notes by S P CHARI

Let Σ = {0, 1}. Give DFAs for {}, {ε}, Σ^*, and Σ⁺.

For {}: For {ε}:

For Σ^*:

For Σ⁺:

0/1

q₀

0/1

q₀

q₁

q₀

0/1

q₀

q₁

0/1

34 of 221

Problem: Third symbol from last is 1

TOC notes by S P CHARI

0/1

q₁

q₀

q₃

0/1

q₂

0/1

Is this a DFA?

No, but it is a Non-deterministic Finite Automaton

35 of 221

Nondeterministic Finite State Automata (NFA)

TOC notes by S P CHARI

An NFA is a five-tuple:

M = (Q, Σ, δ, q₀, F)

Q Σ

q₀

A finite set of states

A finite input alphabet

The initial/starting state, q₀is in Q

A set of final/accepting states, which is a subset of Q

A transition function, which is a total function from Q x Σ to 2^Q

δ: (Q x Σ) –> 2^Q:2^Qis the power set of Q, the set of all subsets of Q

:The set of all states p such that there is a transition

labeled s from q to p

δ(q,s)

δ(q,s) is a function from Q x S to 2^Q(but not only to Q)

36 of 221

TOC notes by S P CHARI

Example #1: one or more 0’s followed by one or more 1’s

Q = {q₀, q₁, q₂}

Σ = {0, 1}

Start state is q₀

F = {q₂}

δ:

q₀

¹{q₀, q₁}	{}
²{}	{q₁, q₂}
{q₂}	{q₂}

q₁

q₀

q₂

0/1

37 of 221

Example #2: pair of 0’s or pair of 1’s as substring

Q = {q₀, q₁, q₂, q₃, q₄}

Σ = {0, 1}

Start state is q₀

F = {q₂, q₄}

δ: 0

q₀

q₁

q₂q₃q₄

q₀

0/1

q₃

q₄

0/1

q₁

q₂

0/1

Joshi

{q₀, q₃}	{q₀, q₁}
{}	{q₂}
{q₂}	{q₂}
{q₄}	{}
{q₄}	{q₄} TOC notes by Pallavi

38 of 221

Notes:

TOC notes by S P CHARI

δ(q,s) may not be defined for some q and s (what does that mean?) δ(q,s) may map to multiple q’s

A string is said to be accepted if there exists a path from q₀to some state in F

A string is rejected if there exist NO path to any state in F

The language accepted by an NFA is the set of all accepted strings

Question: How does an NFA find the correct/accepting path for a given string?

NFAs are a non-intuitive computing model

You may use backtracking to find if there exists a path to a final state (following slide)

Why NFA?

We are primarily interested in NFAs as language defining capability, i.e., do NFAs accept languages that

DFAs do not?

Other secondary questions include practical ones such as whether or not NFA is easier to develop, or how does one implement NFA

39 of 221

Determining if a given NFA (example #2) accepts a given string (001) can be done

algorithmically:

TOC notes by S P CHARI

q₀

q₃

q₄q₄

Each level will have at most n states:

Complexity: O(|x|*n), for running over a string x

accepted

₁q₀

q₁

q₀

0/1

q₃

q₂

q₁

0/1

q₄

0/1

40 of 221

Another example (010):

TOC notes by S P CHARI

q₀

q₃

q₁

q₃

not accepted

All paths have been explored, and none lead to an accepting state.

q₀

0/1

q₃

q₄

q₂

q₁

41 of 221

Question: Why non-determinism is useful?

TOC notes by S P CHARI

Non-determinism = Backtracking Compressed information

Non-determinism hides backtracking

Programming languages, e.g., Prolog, hides backtracking => Easy to program at a higher level: what we

want to do, rather than how to do it

Useful in algorithm complexity study

Is NDA more “powerful” than DFA, i.e., accepts type of languages that any DFA cannot?

42 of 221

TOC notes by S P CHARI

Let Σ = {a, b, c}. Give an NFA M that accepts:

L = {x | x is in Σ^*and x contains ab}

Is L a subset of L(M)? Or, does M accepts all string in L?

Is L(M) a subset of L? Or, does M rejects all strings not in L?

Is an NFA necessary? Can you draw a DFA for this L?

Designing NFAs is not as trivial as it seems: easy to create bug accepting string outside

language

^q₀q₁

q₂

a/b/c

43 of 221

TOC notes by S P CHARI

Let Σ = {a, b}. Give an NFA M that accepts:

L = {x | x is in Σ^*and the third to the last symbol in x is b}

a/b

Is L a subset of L(M)?

Is L(M) a subset of L?

Give an equivalent DFA as an exercise.

q₁

q₀

q₃

a/b

q₂

a/b

44 of 221

Extension of δ to Strings and Sets of States

What we currently have:

δ : (Q x Σ) –> 2^Q

q₀

q₁

q₄

q₂

What we want (why?): δ : (2^Qx Σ^*) –> 2^Q

We will do this in two steps, which will be slightly different from the book, and we will make use of the following NFA.

q₃

TOC notes by S P CHARI

45 of 221

Extension of δ to Strings and Sets of States

TOC notes by S P CHARI

Step #1:

Given δ: (Q x Σ) –> 2^Qdefine δ^#: (2^Qx Σ) –> 2^Qas follows:

for all subsets R of Q, and symbols a in Σ

Note that:

1) δ^#(R, a) = δ(q, a)

∪

q∈R

δ^#({p},a) = δ(q, a)

= δ(p, a)

by definition of δ^#, rule #1 above

∪

q∈{ p}

Hence, we can use δ for δ^#

δ({q₀, q₂}, 0)

δ({q₀, q₁, q₂}, 0)

These now make sense, but previously they did not.

46 of 221

Example:

TOC notes by S P CHARI

δ({q₀, q₂}, 0)

= δ(q₀, 0) U δ(q₂, 0)

= {q₁, q₃} U {q₃, q₄}

= {q₁, q₃, q₄}

δ({q₀, q₁, q₂}, 1) = δ(q₀, 1) U δ(q₁, 1) U δ(q₂, 1)

= {} U {q₂, q₃} U {}

= {q₂, q₃}

47 of 221

Step #2:

TOC notes by S P CHARI

Given δ: (2^Qx Σ) –> 2^Qdefine δ^{^}: (2^Qx Σ*) –> 2^Qas follows:

δ^{^}(R,w) – The set of states M could be in after processing string w, having started from any

state in R.

Formally:

δ^{^}(R, ε) = R
δ^{^}(R,wa) = δ (δ^{^}(R,w), a)

for any subset R of Q

for any w in Σ^*, a in Σ, and

subset R of Q

Note that:

δ^{^}(R, a)

= δ(δ^{^}(R, ε), a)

= δ(R, a)

by definition of δ^{^}, rule #3 above by definition of δ^{^}, rule #2 above

Hence, we can use δ for δ^{^}

δ({q₀, q₂}, 0110)

δ({q₀, q₁, q₂}, 101101)

These now make sense, but previously they did not.

48 of 221

TOC notes by S P CHARI

Example:

What is δ({q₀}, 10)?

Informally: The set of states the NFA could be in after processing 10,

having started in state q₀, i.e., {q₁, q₂, q₃}.

Formally:

δ({q₀}, 10) = δ(δ({q₀}, 1), 0)

= δ({q₀}, 0)

= {q₁, q₂, q₃}

Is 10 accepted? Yes!

q₀

q₁

q₃

q₂

49 of 221

Example:

TOC notes by S P CHARI

What is δ({q₀, q₁}, 1)?

δ({q₀, q₁}, 1)

= δ({q₀}, 1) ∪ δ({q₁}, 1)

= {q₀} ∪ {q₂, q₃}

= {q₀, q₂, q₃}

What is δ({q₀, q₂}, 10)?

δ({q₀, q₂}, 10)

= δ(δ({q₀, q₂}, 1), 0)

= δ(δ({q₀}, 1) U δ({q₂}, 1), 0)

= δ({q₀} ∪ {q₃}, 0)

= δ({q₀,q₃}, 0)

= δ({q₀}, 0) ∪ δ({q₃}, 0)

= {q₁, q₂, q₃} ∪ {}

= {q₁, q₂, q₃}

50 of 221

Example:

TOC notes by S P CHARI

δ({q₀}, 101)

= δ(δ({q₀}, 10), 1)

= δ(δ(δ({q₀}, 1), 0), 1)

= δ(δ({q₀}, 0), 1)

= δ({q₁, q₂, q₃}, 1)

= δ({q₁}, 1) U δ({q₂}, 1) U δ({q₃}, 1)

= {q₂, q₃} U {q₃} U {}

= {q₂, q₃}

Is 101 accepted? Yes! q₃is a final state.

51 of 221

Definitions for NFAs

TOC notes by S P CHARI

Let M = (Q, Σ, δ,q₀,F) be an NFA and let w be in Σ^*. Then w is accepted by M iff δ({q₀}, w)

contains at least one state in F.

Let M = (Q, Σ, δ,q₀,F) be an NFA. Then the language accepted by M is the set: L(M) = {w | w is in Σ^*and δ({q₀},w) contains at least one state in F}

Another equivalent definition:

L(M) = {w | w is in Σ^*and w is accepted by M}

52 of 221

Equivalence of DFAs and NFAs

TOC notes by S P CHARI

Do DFAs and NFAs accept the same class of languages?

Is there a language L that is accepted by a DFA, but not by any NFA? Is there a language L that is accepted by an NFA, but not by any DFA?

Observation: Every DFA is an NFA, DFA is only restricted NFA.

Therefore, if L is a regular language then there exists an NFA M such that L = L(M). It follows that NFAs accept all regular languages.

But do NFAs accept more?

53 of 221

TOC notes by S P CHARI

Consider the following DFA: 2 or more c’s

Q = {q₀, q₁, q₂}

Σ = {a, b, c} Start state is q₀F = {q₂}

δ:

q₀

b q₀

q₁

q₀

q₁

q₂

q₁	q₁	q₂
q₂	q₂	q₂

q₁

q₀

q₂

a/b/c

54 of 221

An Equivalent NFA:

TOC notes by S P CHARI

Q = {q₀, q₁, q₂}

Σ = {a, b, c} Start state is q₀F = {q₂}

δ:

a	b	c
{q₀}	{q₀}	{q₁}

q₀

q₁

q₂

{q₁}	{q₁} {q	₂}
{q₂}	{q₂} {q	₂}

[q₀₁

q₁

q₀

Example of creating a DFA out of an NFA (as per the constructive proof):

0/1

^]{q₁} write as [q₁]	{}
{q ,q } ^]w ⁰e¹s rit a [q₀₁]	{q₁}

60 of 221

TOC notes by S P CHARI

δ:

->q₀

[q₁[ ]

[q₀₁

q₁

q₀

Example of creating a DFA out of an NFA (as per the constructive proof):

0/1

^]{q₁} write as [q₁]	{}
{q ,q } ^]w ⁰e¹s rit a [q₀₁]	{q₁}
[ ]	[ ]

61 of 221

TOC notes by S P CHARI

δ:

->q₀

[q₁[ ]

[q₀₁

q₁

q₀

Example of creating a DFA out of an NFA (as per the constructive proof):

0/1

^]{q₁} write as [q₁]	{}
{q ,q } ^]w ⁰e¹s rit a [q₀₁]	{q₁}
[ ]	[ ]
[q₀₁]	[q₁]

62 of 221

TOC notes by S P CHARI

Construct DFA M’ as follows:

δ({q₀}, 0) = {q₁}

δ({q₀}, 1) = {}

δ({q₁}, 0) = {q₀, q₁} =>

δ({q₁}, 1) = {q₁}

δ({q₀, q₁}, 0) = {q₀, q₁}

δ({q₀, q₁}, 1) = {q₁} =>

δ({}, 0) = {}

δ({}, 1) = {}

=> δ’([q₀], 0) = [q₁]

=> δ’([q₀], 1) = [ ]

δ’([q₁], 0) = [q₀q₁]

=> δ’([q₁], 1) = [q₁]

=> δ’([q₀q₁], 0) = [q₀q₁]

δ’([q₀q₁], 1) = [q₁]

δ’([ ], 0) = [ ]

δ’([ ], 1) = [ ]

[ ]

[q₁]

0/1

[q₀]

[q₀q₁]

63 of 221

TOC notes by S P CHARI

Theorem: Let L be a language. Then there exists an DFA M such that L = L(M) iff there

exists an NFA M’ such that L = L(M’).

Proof:

(if) Suppose there exists an NFA M’ such that L = L(M’). Then by Lemma 2 there

exists an DFA M such that L = L(M).

(only if) Suppose there exists an DFA M such that L = L(M). Then by Lemma 1

there exists an NFA M’ such that L = L(M’).

Corollary: The NFAs define the regular languages.

64 of 221

TOC notes by S P CHARI

Note: Suppose R = {}

δ(R, 0)

= δ(δ(R, ε), 0)

= δ(R, 0)

= δ(q, 0)

= _∪{}

q∈R

Since R = {}

Exercise - Convert the following NFA to a DFA:

Q = {q₀, q₁, q₂}

Σ = {0, 1}

Start state is q₀

F = {q₀}

δ: 0

q₀

q₁

{q₀, q₁}	{ }
{q₁}	{q₂}
{q₂}	{q₂}

65 of 221

Problem: Third symbol from last is 1

TOC notes by S P CHARI

0/1

q₁

q₀

q₃

0/1

q₂

0/1

Now, can you convert this NFA to a DFA?

66 of 221

NFAs with ε Moves

TOC notes by S P CHARI

An NFA-ε is a five-tuple:

M = (Q, Σ, δ, q₀, F)

Q Σ

q₀

A finite set of states

A finite input alphabet

The initial/starting state, q₀is in Q

A set of final/accepting states, which is a subset of Q

A transition function, which is a total function from Q x Σ U {ε} to 2^Q

δ: (Q x (Σ U {ε})) –> 2^Q

δ(q,s)

-The set of all states p such that there is a transition labeled a from q to p, where a is in Σ U {ε}

Sometimes referred to as an NFA-ε other times, simply as an NFA.

67 of 221

Example:

TOC notes by S P CHARI

δ:

q₀

- A string w = w₁w₂…w_nis processed

as w = ε^*w₁ε^*w₂ε^*… ε^*w_nε^*

Example: all computations on 00:

0 ε 0

q₀q₀q₁q₂

1 ^{q₀^}	{ }	{q₁} ^-
²{q , q } 1 2	{q₀, q₃}	{q₂}
³{q } 2	{q₂}	{ }
{ }	{ }	{ }

q₀

0/1

q₂

q₁

q₃

68 of 221

Informal Definitions

Let M = (Q, Σ, δ,q₀,F) be an NFA-ε.

A String w in Σ^*is accepted by M iff there exists a path in M from q₀to a state in F labeled by w and zero or more ε transitions.

The language accepted by M is the set of all strings from Σ^*that are accepted by M.

TOC notes by S P CHARI

69 of 221

ε-closure

TOC notes by S P CHARI

Define ε-closure(q) to denote the set of all states reachable from q by zero or more ε transitions.

Examples: (for the previous NFA)

ε-closure(q₀) = {q₀, q₁, q₂}

ε-closure(q₁) = {q₁, q₂}

ε-closure(q₂) = {q₂}

ε-closure(q₃) = {q₃}

ε-closure(q) can be extended to sets of states by defining:

ε-closure(P) = ε-closure(q)

Examples:

1 2 1 2

ε-closure({q , q }) = {q , q }

ε-closure({q₀, q₃}) = {q₀, q₁, q₂, q₃}

∪

q∈P

q₀

0/1

q₂

q₁

q₃

70 of 221

Extension of δ to Strings and Sets of States

TOC notes by S P CHARI

What we currently have:

δ : (Q x (Σ U {ε})) –> 2^Q

What we want (why?): δ : (2^Qx Σ^*) –> 2^Q

As before, we will do this in two steps, which will be slightly different from the book, and we will make use of the following NFA.

q₀

0/1

q₂

q₁

q₃

71 of 221

TOC notes by S P CHARI

Step #1:

Given δ: (Q x (Σ U {ε})) –> 2^Qdefine δ^#: (2^Qx (Σ U {ε})) –> 2^Qas follows:

1) δ^#(R, a) = δ(q, a) for all subsets R of Q, and symbols a in Σ U {ε}

Note that:

∪

q∈R

δ^#({p},a) = δ(q, a) by definition of δ^#, rule #1 above

= δ(p, a)

Hence, we can use δ for δ

δ({q₀, q₂}, 0)

δ({q₀, q₁, q₂}, 0)

These now make sense, but previously

they did not.

∪

q∈{ p}

72 of 221

Examples:

TOC notes by S P CHARI

What is δ({q₀, q₁, q₂}, 1)?

δ({q₀, q₁, q₂}, 1) = δ(q₀, 1) U δ(q₁, 1) U δ(q₂, 1)

= { } U {q₀, q₃} U {q₂}

= {q₀, q₂, q₃}

What is δ({q₀, q₁}, 0)?

δ({q₀, q₁}, 0)

= δ(q₀, 0) U δ(q₁, 0)

= {q₀} U {q₁, q₂}

= {q₀, q₁, q₂}

73 of 221

Step #2:

TOC notes by S P CHARI

Given δ: (2^Qx (Σ U {ε})) –> 2^Qdefine δ^{^}: (2^Qx Σ^*) –> 2^Qas follows:

δ^{^}(R,w) – The set of states M could be in after processing string w, having starting

from any state in R.

Formally:

2) δ^{^}(R, ε) = ε-closure(R) - for any subset R of Q

3) δ^{^}(R,wa) = ε-closure(δ(δ^{^}(R,w), a))

- for any w in Σ^*, a in Σ, and subset R of Q

Can we use δ for δ^{^}?

74 of 221

TOC notes by S P CHARI

Consider the following example:

δ({q₀}, 0) = {q₀}

δ^{^}({q₀}, 0) = ε-closure(δ(δ^{^}({q₀}, ε), 0))

= ε-closure(δ(ε-closure({q₀}), 0))

= ε-closure(δ({q_0,q₁, q₂}, 0))

= ε-closure(δ(q₀, 0) U δ(q₁, 0) U δ(q₂, 0))

= ε-closure({q₀} U {q₁, q₂} U {q₂})

By rule #3

By rule #2

By ε-closure

By rule #1

= ε-closure({q₀, q₁, q₂})

= ε-closure({q₀}) U ε-closure({q₁}) U ε-closure({q₂})

= {q₀, q₁, q₂} U {q₁, q₂} U {q₂}

= {q₀, q₁, q₂}

So what is the difference?

δ(q₀, 0) - Processes 0 as a single symbol, without ε transitions.

δ^{^}(q₀, 0) - Processes 0 using as many ε transitions as are possible.

75 of 221

Example:

TOC notes by S P CHARI

δ^{^}({q₀}, 01) = ε-closure(δ(δ^{^}({q₀}, 0), 1))

= ε-closure(δ({q₀, q₁, q₂}), 1)

= ε-closure(δ(q₀, 1) U δ(q₁, 1) U δ(q₂, 1))

= ε-closure({ } U {q₀, q₃} U {q₂})

= ε-closure({q₀, q₂, q₃})

By rule #3 Previous slide

By rule #1

= ε-closure({q₀}) U ε-closure({q₂}) U ε-closure({q₃})

= {q₀, q₁, q₂} U {q₂} U {q₃}

= {q₀, q₁, q₂, q₃}

76 of 221

Definitions for NFA-ε Machines

Let M = (Q, Σ, δ,q₀,F) be an NFA-ε and let w be in Σ^*. Then w is accepted by M iff δ^{^}({q₀},

w) contains at least one state in F.

Let M = (Q, Σ, δ,q₀,F) be an NFA-ε. Then the language accepted by M is the set: L(M) = {w | w is in Σ^*and δ^{^}({q₀},w) contains at least one state in F}

Another equivalent definition:

L(M) = {w | w is in Σ^*and w is accepted by M}

TOC notes by S P CHARI

77 of 221

Equivalence of NFAs and NFA-εs

TOC notes by S P CHARI

Do NFAs and NFA-ε machines accept the same class of languages?

Is there a language L that is accepted by a NFA, but not by any NFA-ε? Is there a language L that is accepted by an NFA-ε, but not by any DFA?

Observation: Every NFA is an NFA-ε.

Therefore, if L is a regular language then there exists an NFA-ε M such that L = L(M). It follows that NFA-ε machines accept all regular languages.

But do NFA-ε machines accept more?

78 of 221

TOC notes by S P CHARI

Lemma 1: Let M be an NFA. Then there exists a NFA-ε M’ such that L(M) = L(M’).

Proof: Every NFA is an NFA-ε. Hence, if we let M’ = M, then it follows that L(M’) = L(M).

The above is just a formal statement of the observation from the previous slide.

79 of 221

Lemma 2: Let M be an NFA-ε. Then there exists a NFA M’ such that L(M) = L(M’).

TOC notes by S P CHARI

Proof: (sketch)

Let M = (Q, Σ, δ,q₀,F) be an NFA-ε. Define an NFA M’ = (Q, Σ, δ’,q₀,F’) as:

F’ = F U {q} if ε-closure(q) contains at least one state from F

F’ = F otherwise

δ’(q, a) = δ^{^}(q, a) - for all q in Q and a in Σ

Notes:

δ’: (Q x Σ) –> 2^Qis a function

M’ has the same state set, the same alphabet, and the same start state as M

M’ has no ε transitions

80 of 221

Example:

TOC notes by S P CHARI

Step #1:

Same state set as M

q₀is the starting state

q₀

0/1

q₂

q₁

q₃

q₂

q₁

q₃

q₀

81 of 221

Example:

TOC notes by S P CHARI

Step #2:

q₀becomes a final state

q₀

0/1

q₂

q₁

q₃

q₂

q₁

q₃

q₀

Example:

TOC notes by S P CHARI

Step #8:

Done!

[use table of e-closure]

q₀

0/1

q₂

q₁

q₃

q₂

q₁

q₃

q₀

0/1

88 of 221

TOC notes by S P CHARI

Theorem: Let L be a language. Then there exists an NFA M such that L= L(M) iff there

exists an NFA-ε M’ such that L = L(M’).

Proof:

(if) Suppose there exists an NFA-ε M’ such that L = L(M’). Then by Lemma 2

there exists an NFA M such that L = L(M).

(only if) Suppose there exists an NFA M such that L = L(M). Then by Lemma 1

there exists an NFA-ε M’ such that L = L(M’).

Corollary: The NFA-ε machines define the regular languages.

89 of 221

UNIT-2�REGULAR EXPRESSIONS

TOC notes by S P CHARI

90 of 221

Introduction

TOC notes by S P CHARI

The Language accepted by finite automata are easily described by simple expressions called regular expressions.

The regular expression is the most effective way to represent any language.

The language accepted by some regular expression is known as a regular language.

91 of 221

Regular Expressions

TOC notes by S P CHARI

Any terminal symbol(i.e. an element of ∑), Ꜫ and φ are regular expressions.
The union of two regular expressions R₁and R₂, written as R₁+R₂, is also a regular expression.
The concatenation of two regular expressions R₁and R₂, written as R₁R₂, is also a regular expression.
The iteration(or Closure) of a regular expression R, written as R* is also a regular expression.
If R is a regular expression , then (R) is also a regular expression.

92 of 221

Regular Expressions

Identity Rules :

Let P,Q and R are the regular expressions then the identity rules are as follows:

φ +R = R
φR = Rφ = φ
ꜪR = RꜪ = R
Ꜫ* = Ꜫ and φ *= Ꜫ
R + R = R
R*R* = R*
RR* = R*R
(R*)*=R*
Ꜫ + RR*= R* = Ꜫ +R*R
(PQ)*P = P(QP)*
(P+Q)* = (P*Q*)* = (P*_TOC+_noQ_{tes b}*_y)_{Pallavi Joshi}
(P+Q)R = PR + QR and R(P+Q) = RP + RQ

93 of 221

Regular Expressions

TOC notes by S P CHARI

Exercise 2: Construct the regular expression for the language which accepts all the strings which begin or end with either 00 or 11.

Solution:

The R.E. can be categorized into two subparts.

R = L1 + L2

L1 = The String which begin with 00 or 11. L2 = The String which end with 00 or 11.

Let us find out L1 and L2.

L1 = (00+11)( any number of 0’s and 1’s ) L1 = (00+11)(0+1)*

L2 = ( any number of 0’s and 1’s ) (00+11) L2 = (0+1)*(00+11)

Hence

R = [(00+11)(0+1)*] + [(0+1)*(00+11)]

94 of 221

Algorithm to build R.E. from given DFA

TOC notes by S P CHARI

Let q₁be the initial state.
There are q₂,q₃,q₄,….q_nnumber of states. The final state may be some q_jwhere j<=n.
Let α_jirepresents the transition from qj to qi.
Calculate qi such that q_i= α_ji.q_j

If qi is a start state

q_i= α_ji.q_j+Ꜫ

Similarly compute the final state which ultimately gives the regular expression r.

95 of 221

Regular Expressions

TOC notes by S P CHARI

Exercise 1: Write a regular expression for the set of strings that contains an even number of 1’s over ∑={0,1}. Treat zero 1’s as an even number.

Solution:

The regular expression for the above F.A. is 0*10*1

96 of 221

DFA to RE via State

TOC notes by S P CHARI

1^E.^limStaⁱrⁿti^ang^tiowiⁿth ⁽¹int⁾ermediate states and then moving to accepting states, apply the state elimination process to produce an equivalent automaton with regular expression labels on the edges.

The result will be a one or two state automaton with a start state and

accepting state.

97 of 221

DFA to RE State

TOC notes by S P CHARI

^Elimin2^a. ^tiI^of tⁿhe⁽t²w⁾o states are different, we will have an automaton that looks like the following:

Start

We can describe this automaton as: (R+SU*T)*SU*

98 of 221

DFA to RE State

TOC notes by S P CHARI

^Elimin3^a. ^tIf^iothⁿe s⁽t³ar⁾t state is also an accepting state, then we must also perform a state elimination from the original automaton that gets rid of every state but the start state. This leaves the following:

Start

We can describe this automaton as simply R*.

99 of 221

DFA to RE State Elimination (4)

TOC notes by S P CHARI

4. If there are n accepting states, we must repeat the above steps for each accepting states to get n different regular expressions, R₁, R₂,

… R_n.

5. For each

repeat we turn any other accepting state to non-

accepting.

6. The desired regular expression for the automaton is then the union of each of

the n regular expressions: R₁∪ R₂… ∪ R_N

100 of 221

DFA🡪RE Example

TOC notes by S P CHARI

Convert the following to a RE

First convert the edges to RE’s:

Start

0,1

Start

0+1

101 of 221

DFA 🡪 RE Example (2)

TOC notes by S P CHARI

Eliminate State 1:

Start

0+1

Start

0+10

0+1

Note edge from 3🡪3

Answer: (0+10)*11(0+1)*

102 of 221

Second Example

TOC notes by S P CHARI

Automata that accepts even number of 1’s

Eliminate state 2:

Start

0 0

0+10*1

Start

10*1

103 of 221

Second Example (2)

Two accepting states, turn off state 3 first

Start

1 3

0+10*1

10*1

0+10*1

10*1

Start

This is just 0*; can ignore_Tg_Oo_Ci_nn_otg_est_bo_{y P}s_at_la_lat_ve_{i Jo}3_shs_iince we would “die”

104 of 221

Second Example (3)

Turn off state 1 second:

Start

0+10*1

10*1

This is just 0*10*1(0+10*1)*

Combine from previous slide to get

0* + 0*10*1(0+10*1)*

0+10*1

Start

10*1

TOC notes by S P CHARI

105 of 221

Converting a RE to an Automata

TOC notes by S P CHARI

We have shown we can convert an automata to a RE. To show equivalence we must also go the other direction, convert a RE to an automaton.
We can do this easiest by converting a RE to an ε-NFA

106 of 221

RE to ε-NFA

TOC notes by S P CHARI

Basis:

R=a

R=ε

R=Ø

Next slide: More complex RE’s

107 of 221

TOC notes by S P CHARI

R=S+T

R=ST

R=S^*

108 of 221

RE to ε-NFA Example

TOC notes by S P CHARI

Convert R= (ab+a)* to an NFA

We proceed in stages, starting from simple elements and working our way up

109 of 221

RE to ε-NFA Example (2)

ab+a

(ab+a)*

TOC notes by S P CHARI

110 of 221

CONTEXT FREE GRAMMAR

TOC notes by S P CHARI

111 of 221

Prerequisite

TOC notes by S P CHARI

At very first session we have seen that the Finite Automata and importance of Finite State Machines and the construction of Deterministic Finite Automata and some examples on it.
In previous session we have seen that the construction of Regular Expressions for the given problems and how to generate a Regular expression from the DFA and vice versa.
The above two concepts are required to understand the Context Free Grammar and the applications of CFG.

112 of 221

TOC notes by S P CHARI

DEFINITION

The Context Free Grammar can be defined as a set denoted by G=(V,T,P,S) Where

V- set of non-terminals Ex: { P,Q,R…} T- set of Terminals Ex: {p,q,r …}

P- set of Production rules

Where non-terminal -> non-terminal

non-terminal -> terminal

S- is start symbol.

113 of 221

G=({S},{0,1},P,S)

Where P = { S->0S

TOC notes by S P CHARI

EXAMPLE

S->1S S->Ꜫ }

114 of 221

TOC notes by S P CHARI

Example1 : Construct CFG which consists of all the strings having atleast one occurrence of ‘000’ over input {0,1}.

Solution:

We need to construct the equivalent R.E for the given problem

first.

The R.E=(any number of 0’s or 1’s) 000 (any number of 0’s or 1’s)

Thus R.E = (0+1)*000(0+1)*

Now let us build the production rules to satisfy the above R.E

{ S->ATA

A->0A|1A|Ꜫ T->000 }

115 of 221

Excercise1: Find the CFG for the odd length of strings in{a,b}* with middle symbol a

Excercise2: Generate the a CFG to obtain balanced set of parenthesis(i.e. every left parenthesis should match with corresponding right parenthesis)

TOC notes by S P CHARI

116 of 221

Derivation Trees

TOC notes by S P CHARI

Definition: Derivation tree is a graphical representation for the derivation of the given production rules for a given CFG.
It is the simple way to show how the derivation can be done to obtain some string from given set of production rules.
The derivation tree is also called as Parse tree. Properties of Derivation Tree:-
The root node is always a node indicating start symbol.
The derivation is read from left to right.
The leaf nodes are always terminal nodes.
The interior nodes are always the non-terminal nodes.

117 of 221

Example2: The CFG is given as S->bSb | a | b is a production rule. The S is a start symbol.

TOC notes by S P CHARI

Solution :

The Derivation tree for deriving a string bbabb as follows.

b S b

b S b a

By Simply reading the leaf nodes we can obtain the desired string.

118 of 221

TOC notes by S P CHARI

Exercise : Construct the derivation tree for the string ‘aabbabba’ from the CFG given by

S->aB|bA

A->a|aS|bAA B->b|bS|aBB

119 of 221

Difference between Regular Grammar and

Context Free Gram

mar

S.No.	REGULAR GRAMMAR	CONTEXT FREE GRAMMAR
1	Regular grammar is Defined as G=(V,T,P,S) where V- set of Non-Terminals, T- set of Terminals, P- set of Production rules, S- start Symbol Hence G be a regular Grammar.	The Context free grammar is defined as follows G=(V,T,P,S) where V- set of Non-Terminals, T- set of Terminals, P- set of Production rules, S- start Symbol Hence G be a CFG.
2	Using this grammar we can draw DFA	Using this we can construct Push Down Automata(PDA)
3	Compiler can not make use of this for Parsing.	Compiler makes use of this for Parsing.
4	The grammar accepts small set of languages. Hence it is type-3 grammar.	The grammar accepts a large set of languages. Hence it is type-2 grammar.
5	This form of grammar can be represented by left linear or right linear grammar	This form of grammar can be represented as nonterminal-> set of terminals and non terminals.
6	Every regular grammar is context free grammar.	Every Context Free Grammar can not be Regular
7	For Ex: S->aA\|B A->aA\|a B->bB\|b	For Example: S->aSb\|bSb\|Ꜫ
8	Syntax of any programming language can not be ^{represented completely by regular}TO^gC^rn^ao^mtes^mby^aP^rallavi Jos	Syntax of any programming language can be represented hi^{completely by regular grammar}

120 of 221

MINIMIZATION OF CFG

TOC notes by S P CHARI

The properties of reduced grammar are as follows.
Each variable(i.e. non terminal) and each terminal of G appears in the derivation of some word in L.
There should not be any production as X->Y where X and Y are non terminal.
If Ꜫ is not in the language L then there need not be the production X-> Ꜫ.

121 of 221

The Reduction of Grammar as follows

TOC notes by S P CHARI

Removal of Useless

symbols

Elimination of

Ꜫ Productions

Removal of unit

production

122 of 221

Removal of useless symbols

TOC notes by S P CHARI

Any symbol is not useful when it appears on R.H.S. in the production rule. If no

such derivation exists then it is supposed to be a useless symbol.

For Example:

G= (V,T,P,S) where V={S,T,X} T={0,1} S->0T|1T|X|0|1

T->00

In the above grammar S->X there is no further rule as a definition to X. Hence we can declare X is a useless symbol. Therefore we can remove the S-> X production from the given grammar.

After removal of useless symbol form the given grammar the Grammar should be as follows.

G= (V,T,P,S) where V={S,T} T={0,1}

S->0T|1T|0|1

T->00

123 of 221

Elimination of Ꜫ production

TOC notes by S P CHARI

In CFG if there is Ꜫ production we can remove it ,without changing the meaning of the grammar.

Example:

S->0S|1S| Ꜫ

from the above Grammar we can remove Ꜫ production without changing the meaning of it.

Simply if we place S-> Ꜫ in other rules.

we get S->0 and S->1 Hence we can rewrite the productions as follows

S->0S|1S|0|1

Thus we can remove Ꜫ production from the given grammar

124 of 221

Example2: For the CFG given below remove Ꜫ production S->aSa

S->bSb

S-> Ꜫ

Solution:

According to the replacement procedure we will place Ꜫ at S in the sentential form.

S->aSa if S= Ꜫ S->aa

Similarly if

S->bSb if S= Ꜫ S->bb

Thus finally rules are

S->aSa|bSb|aa|bb

Thus we can remove Ꜫ prod^TOu^{C n}c^ott^esio^byn^Pallaf^vr^{i J}o^osm^hithe given grammar.

125 of 221

Excercise1: Eliminate Ꜫ productions from the CFG given below A->0B1|1B1

B->0B|1B|Ꜫ

TOC notes by S P CHARI

Exercise 2: Construct the CFG without Ꜫ productions form the given Grammar.

S->a|Ab|aBa A->b|Ꜫ

B->b|A

126 of 221

Removal of Unit Productions

S->0A|1B|01

The unit productions are the productions in which one non terminal gives another non terminal.

For example X->Y, Y->Z, Z->X are the unit productions.

Example 1: if CFG is as below

S->0A|1B|C A->0S|00

B->1|A

C->01 then remove the unit productions.

Solution:

Clearly S->C is a unit production. But while removing it we have to consider what C gives. So, we add a rule to S.

TOC notes by S P CHARI

127 of 221

Example 1(2):

TOC notes by S P CHARI

Similarly B->A is also unit production so we can modify it as

B->1|0S|00

Thus finally we can write CFG without unit productions as

S->0A|1B|01

A->0S|00

B->1|0S|00

C->01

128 of 221

TOC notes by S P CHARI

Exercise: Eliminate the unit productions fro the Grammar S->AB

A->a

B->C|b C->D

D->E|bC

E->d|Ab

129 of 221

Solution:

Thus the R.E = (a+b)ⁿa (a+b)ⁿWhere n>=0

As the above given expression doesn’t represent regular language, The DFA can

not be designed for representing such language.

The CFG G= (V,T,P,S)

Where

V={S} T={a,b}

P={ S->aSa|bSb|aSb|bSa|a} S is Start Symbol.

Back

a or b

Length n

TOC notes by S P CHARI

130 of 221

Solution:

TOC notes by S P CHARI

Back

The well formed parenthesis means whenever opening parentheses is present there should be a closing parentheses associated with it.

Any expression must begin with opening bracket .We will consider T={(,[,),]} as set of brackets.

The CFG G=(V,T,P,S)

Where

V={S} T={(,[,),]}

P={ S->SS|()|[]|(S)|[S]

Where S is Start Symbol.

131 of 221

Solution :

TOC notes by S P CHARI

To draw a tree we will first try to

obtain derivation for the string aabbabba.

a aBB aa bS B

aab bA B aabba B aabba bS aabbab bA aabbabba

S->aB B->aBB B->bS S->bA A->a

B->bS S->bA

A->a

Now let us draw a tree

a B

a B B

b S b S

b A b A

Back

132 of 221

Solution:

TOC notes by S P CHARI

Back

We will first write all the productions that are giving terminal symbols. A->a

B->aa

But if we look at start symbol S then S->aS|A|C

There is no production for B from start symbol. Thus B become a useless symbol.

Similarly, S->C->aCb->aaCbb->….

There is no terminating symbol for C, Therefore we will eliminate C. Then the set of rules are

S->aS|A A->a

133 of 221

TOC notes by S P CHARI

Back

•^SC^oon^ls^uide^tr aⁱl^ol thⁿe p^:roductions that are giving terminal symbols

S->a

B->a

D->ddd

Now Consider

S->cC

C->cCD

D->ddd.

We will try to derive the string using production rule C we get ,

S->cC->ccCddd->cccCDddd->….

We will not get any terminal symbol for C. Thus we can eliminate C production . To reach D , Cis the only

production, As C is eliminated there is no point of D production. Hence , D is also eliminated.

Therefore,

S->aA|a|Bb

A->aB

B->a|Aa

Is the reduced Grammar.

134 of 221

Solution:

Now the Ꜫ production is B->Ꜫ we will delete this production, And then add the production having B replaced by Ꜫ

A->0B1 if B=Ꜫ

A->01

Similarly,

A->1B1->11

A->0B1|1B1|01|11

Similarly,

B->0B->0 B->1B->1

B->0B|1B |0|1

Collectively we can write

A->0B1|1B1|01|11 B->0B|1B|0|1

Back

TOC notes by S P CHARI

135 of 221

Solution:

As S->AB and A->a

There is only one rule with A which is giving terminal symbol. Hence there is no Unit production with A.

Now Consider,

B->C

C->D

D->E

It is clear that B,C and D are Unit Productions. As E->d|Ab, we will replace value of D. Then

D->d|Ab|bC

Similarly, a C->D we can write

C->d|Ab|bC

Hence B->d|Ab|bC|b

Thus the grammar after removing unit productions

S->AB

A->a

B->d|Ab|bC|b

C->d|Ab|bC

Back

TOC notes by S P CHARI

136 of 221

UNIT-3�PUSH DOWN AUTOMATA

TOC notes by S P CHARI

137 of 221

Prerequisite

TOC notes by S P CHARI

The basic relationship between CFG and R.E is that the CFG can be constructed for every R.E.

But more than that CFG can also be written for non regular languages like ₀n₁n.

Thus we can say that regular expressions are the subset of CFG.

For every R.E we can be drawn F.A., But F.A. is not sufficient to draw the CFG.

For better understanding of PDA we must know about Stack and its operations.

138 of 221

Push Down Automata

TOC notes by S P CHARI

The PDA will have input tape, finite control and stack.

TOS

Z₀

Stack

The input tape is divided in many cells.
The finite control has some pointer which points to the current symbol which is to be read.
At the end of the input $ or Δ blank symbol is placed to identify the end of input.
Here Stack will be used to store the items temporarily.

A	A	B	B	$	Δ	Δ	Δ

Finite Control

139 of 221

Definition of a PDA

TOC notes by S P CHARI

A pushdown automaton (PDA) is a seven-tuple: M = (Q, Σ, Г, δ, q₀, z₀, F)

Q Σ

A finite set of states

A finite input alphabet Г A finite

stack alphabet

q₀The initial/starting state, q₀is in Q

z₀A starting stack symbol, is in Г // need not always remain at the bottom of stack

A set of final/accepting states, which is a subset of Q

A transition function, where

For DPDA For NDPDA

δ: Q x (Σ U {ε}) x Г –> finite subsets of Q x Г*

δ: Q x (Σ U {ε}) x Г –> finite subsets of 2^{Q x Г*}

140 of 221

Push Down Automata

TOC notes by S P CHARI

Any language which is accepted by a F.A. can also accepted by PDA.
PDA can also accepts the class of languages which are not accepted by F.A., Thus PDA is much more superior to F.A.

Example: Design a PDA for accepting the Language L= {aⁿbⁿ|n>=1}.

Solution:

The above given language is in which equal number of a’s are followed by equal number of b’s.

Logic for PDA:

Push all a’s onto the stack
On reading every single b pop one a from the stack.
If the input string is reached end and the stack is empty then the string is accepted by the PDA.

141 of 221

PDA

TOC notes by S P CHARI

Q₀

Q₁

Q₂

(b,a/Ꜫ)

(a,a/aa)

(a,z₀/az₀)

(b,a/Ꜫ)

($, z₀/z₀)

142 of 221

Example(2)

TOC notes by S P CHARI

The Description for the PDA can be given as follows.

0 0 0 0

δ(q ,a,z )=(q ,az )

δ(q₀,a,a)=(q₀,aa)

δ(q₀,b,a)=(q₁,Ꜫ)

δ(q₁,b,a)=(q₁,Ꜫ)

δ(q₁,$, z₀)=(q₂,Ꜫ)

Where q₀is start state and q₂is accept state.

Simulation of PDA for the input string aaabbb as follows

δ(q₀,aaabbb,z₀) Ͱ(q₀,aabbb$,az₀)

0 0

Ͱ(q ,abbb$,aaz )

Ͱ(q₀,bbb$,aaaz₀) Ͱ(q₁,bb$,aaz₀) Ͱ(q₁,b$,az₀)

Ͱ(q₁,$, z₀)

Ͱ(q₂, Ꜫ)

Accept State

Hence the string is accepted by the PDA.

143 of 221

Excercise1: Design a PDA that accept a string of well formed parenthesis. Consider the parenthesis is a (,),[,],{,}.

TOC notes by S P CHARI

Excercise2: Construct PDA for the language L={aⁿb²ⁿ|n>=1}.

144 of 221

Conversion of CFG to PDA

TOC notes by S P CHARI

For the conversion of CFG to PDA the necessary condition is that the first symbol on R.H.S. production must be a terminal symbol.

Rule1: For non terminal symbol, add following rule

δ(q, Ꜫ,A)=(q,α)

Where the production rule is A-> α

Rule 2: For each terminal symbol, add the following rule.

δ(q, a,a)=(q, Ꜫ) for every terminal symbol.

145 of 221

Example: Construct PDA for the given CFG

S->0BB

TOC notes by S P CHARI

B->0S|1S|0 and test whether 010⁴is accepted by this PDA

Let PDA

A={{q},{0,1},{S,B,0,1}, δ,q,S,F}

The Production rules δ can be written as:

R1: δ(q, Ꜫ,S) = {(q,0BB)}

R2: δ(q, Ꜫ,B) = {(q,0S),(q,1S),(q,0)} R3: δ(q, 0,0) = {(q, Ꜫ)}

R4: δ(q, 1,1) = {(q, Ꜫ)}

146 of 221

Example(2):

TOC notes by S P CHARI

_{Testing 010}4 i.e. 010000 against _PDA

δ(q,010000,S) Ͱ(q,010000,0BB)

Ͱ(q,10000,BB)

Ͱ(q,10000,1SB) Ͱ(q,0000,SB) Ͱ(q,0000,0BBB) Ͱ(q,000,BBB)

Ͱ(q,000,0BB)

Ͱ(q,00,BB)

Ͱ(q,00,0B)

Ͱ(q,0, B)

Ͱ(q,0,0)

Ͱ(q, Ꜫ)

Since R1 Since R3 Since R2

Since R4

Since R1

Since R3

Since R2 Since R3 Since R2 Since R3 Since R2 Since R3

147 of 221

Excercise1: Construct the PDA for the given CFG.

S->0A

TOC notes by S P CHARI

A->0AB|1 B->1

Excercise1: Construct the PDA for the given CFG.

S->aSA|bSB|a|b A->a

B->b

148 of 221

Construction of CFG from PDA

Algorithm for getting production rules of CFG: Rule1: The start symbol production can be

S->[q₀,Z₀,q]

Where q indicates the next state an q0 is a start state.

Rule2: If there exists a move of PDA

δ(q,a,Z)={(q’, Ꜫ)}

then the production rule can be written as:

δ(q,Z,q’)->a

Rule3: If there exists a move of PDA as

δ(q,a,Z)={(q’,Z₁,Z₂,…..Z_n)}

Then the production rule of CFG can be written as

δ(q,a,Z)->a[q

1 1, 2 2 2, 3 3 3, 4 m m, m+1

,Z q ] [q ,Z q ] [q ,Z q ]….. [q ,Z q ].

TOC notes by S P CHARI

149 of 221

Example: The PDA is given below A=({q_0,q₁},{0,1},{S,A},δ, q_0,S,φ)

TOC notes by S P CHARI

Where δ is given as follows:

δ(q_0,1,S)={(q_0,AS)}

δ(q_0,Ꜫ,S)={(q_0,Ꜫ)}

δ(q_0,1,A)={(q_0,AA)}

δ(q_0,0,A)={(q_1,A)}

δ(q_1,1,A)={(q_1,Ꜫ)}

δ(q_1,0,S)={(q_0,S)}

Construct the CFG equivalent to this PDA.

150 of 221

Example(2):

TOC notes by S P CHARI

Solution:

Let we will construct a CFG

G=(V,T,P,S)

Here, T={0,1}, V={S U [q₀,A_,q₀],[q₀,A_,q₁],[q₀,S_,q₀],[q₀,S_,q₁],

[q₁,A_,q₁], [q₁,A_,q₀], [q₁,S_,q₁], [q₁,S_,q₀]}

Now let up build the production rules as:

Using rule1 from the algorithm P1: S->[q₀,S_,q₀]

P2: S->[q₀,S_,q₁]

Using rule3 of algorithm for the δ(q_0,1,S)={(q_0,AS)} we get, P3: [q₀,S_,q₀]->1[q₀,A_,q₀][q₀,S_,q₀]

P4: [q₀,S_,q₀]->1[q₀,A_,q₁][q₁,S_,q₀]

P5: [q₀,S_,q₁]->1[q₀,A_,q₀][q₀,S_,q₁]

P6: [q₀,S_,q₁]->1[q₀,A_,q₁][q₁,S_,q₁]

151 of 221

Example(3

₎N_:ow for δ(q_0,Ꜫ,S)={(q_0,Ꜫ)} using rule2 of algorithm we get

P7: [q₀,S_,q₀]-> Ꜫ

Similarly for δ(q_0,1,A)={(q_0,AA)} Using rule3 we get

P8:[q₀,A_,q₀]->1[q₀,A_,q₀][q₀,A_,q₀]

P9:[q₀,A_,q₀]->1[q₀,A_,q₁][q₁,A_,q₀]

P10:[q₀,A_,q₁]->1[q₀,A_,q₀][q₀,A_,q₁]

P11:[q₀,A_,q₁]->1[q₀,A_,q₁][q₁,A_,q₁]

Similarly for δ(q_0,0,A)={(q_1,A)} gives

P12:[q₀,A_,q₀]->0[q₁,A_,q₀]

P13:[q₀,A_,q₁]->0[q₁,A_,q₁]

δ(q_1,1,A)={(q_1,Ꜫ)} Gives

P14:[q₁,A_,q₁]->1

δ(q_1,0,S)={(q_0,S)}

P15: [q₁,S_,q₀]->0[q₀,S_,q₀]

P16:[q₁,S_,q₁]->0[q₀,S_,q₁^TO]^{C notes by Pallavi Joshi}

152 of 221

Exercise: Find the CFG corresponding to PDA whose transition mapping is as follows:

δ(S_,a , X) = (s_,AX)

δ(S_,b , A) = (s_,AA)

δ(S_,a , A) = (s_,Ꜫ)

TOC notes by S P CHARI

153 of 221

TOC notes by S P CHARI

BACK

Given PDA:

δ(S_,a , X) = (s_,AX)

δ(S_,b , A) = (s_,AA)

δ(S_,a , A) = (s_,Ꜫ)

Solution: Now we will apply conversion algorithm for each transition and obtain

the production rules:

For δ(S_,a , X) = (s_,AX)

P1: (S,X,s)->a(S A S)(S X s ) |a(S A s)(s X s )

P2: (S,X,S)->a(S A S)(S X S ) | a(S A s)(s X S )

For δ(S_,b , A) = (s_,AA)

P3: (S, A, s)->b(S A S)(S A s ) | b(S A s)(s A s)

P4: (S, A, S)->b(S A S)(S A S) | b(S A s)(s A S)

For δ(S_,a , A) = (s_,Ꜫ)

P5: (S, A, s)-> a

Hence , P1,P2,P3,P4 and P5 are productions in CFG.

154 of 221

TOC notes by S P CHARI

BACK

Given PDA:

δ(S_,a , X) = (s_,AX)

δ(S_,b , A) = (s_,AA)

δ(S_,a , A) = (s_,Ꜫ)

Solution: Now we will apply conversion algorithm for each transition and obtain

the production rules:

For δ(S_,a , X) = (s_,AX)

P1: (S,X,s)->a(S A S)(S X s ) |a(S A s)(s X s )

P2: (S,X,S)->a(S A S)(S X S ) | a(S A s)(s X S )

For δ(S_,b , A) = (s_,AA)

P3: (S, A, s)->b(S A S)(S A s ) | b(S A s)(s A s)

P4: (S, A, S)->b(S A S)(S A S) | b(S A s)(s A S)

For δ(S_,a , A) = (s_,Ꜫ)

P5: (S, A, s)-> a

Hence , P1,P2,P3,P4 and P5 are productions in CFG.

155 of 221

TURING MACHINE

TOC notes by S P CHARI

156 of 221

Turing Machines (TM)

TOC notes by S P CHARI

Generalize the class of CFLs:

Recursively Enumerable Languages

Recursive Languages Context-Free Languages

Regular Languages

Non-Recursively Enumerable Languages

157 of 221

Another Part of the Hierarchy:

TOC notes by S P CHARI

Recursively Enumerable Languages

Recursive Languages Context-Sensitive Languages Context-Free Languages - ε

Regular Languages - ε

Non-Recursively Enumerable Languages

158 of 221

TOC notes by S P CHARI

Recursively enumerable languages are also known as type 0 languages.

Context-sensitive languages are also known as type 1 languages. Context-free languages are also known as type 2 languages.

Regular languages are also known as type 3 languages.

159 of 221

TMs model the computing capability of a general purpose computer, which informally can be described as:

Effective procedure

TOC notes by S P CHARI

Finitely describable

Well defined, discrete, “mechanical” steps

Always terminates

Computable function

A function computable by an effective procedure

TMs formalize the above notion.

Church-Turing Thesis: There is an effective procedure for solving a problem if and only if there is a TM that halts for all inputs and solves the problem.

There are many other computing models, but all are equivalent to or subsumed by TMs. There is no more powerful

machine (Technically cannot be proved).

DFAs and PDAs do not model all effective procedures or computable functions, but only a subset.

160 of 221

Deterministic Turing Machine (DTM)

TOC notes by S P CHARI

……..

Two-way, infinite tape, broken into cells, each containing one symbol.

Two-way, read/write tape head.

An input string is placed on the tape, padded to the left and right infinitely with blanks, read/write head is positioned at the left end of input string.

Finite control, i.e., a program, containing the position of the read head, current symbol being

scanned, and the current state.

In one move, depending on the current state and the current symbol being scanned, the TM 1) changes state, 2) prints a symbol over the cell being scanned, and 3) moves its’ tape head one cell left or right.

Many modifications possible, but Church-Turing declares equivalence of all.

Finite Control

B	B	0	1	1	0	……₀..	B	B

161 of 221

Formal Definition of a DTM

TOC notes by S P CHARI

A DTM is a seven-tuple:

M = (Q, Σ, Γ, δ, q₀, B, F)

Q Σ Γ B

q₀

A finite set of states

A finite input alphabet, which is a subset of Γ– {B} A finite tape alphabet, which is a strict superset of Σ A distinguished blank symbol, which is in Γ

The initial/starting state, q₀is in Q

A set of final/accepting states, which is a subset of Q

A next-move function, which is a mapping (i.e., may be undefined) from Q x Γ –> Q x Γ x {L,R}

Intuitively, δ(q,s) specifies the next state, symbol to be written, and the direction of tape head

movement by M after reading symbol s while in state q.

162 of 221

TOC notes by S P CHARI

Example #1: {w | w is in {0,1}* and w ends with a 0}

10110

Not ε

Q = {q₀, q₁, q₂}

Γ = {0, 1, B}

Σ = {0, 1}

F = {q₂}

δ:

->q₀(q₀, 0, R)

(q₀, 1, R)

B (q₁, B, L)

q₁(q₂, 0, R)

q₂^*-

q₀is the start state and the “scan right” state, until hits B

q₁is the verify 0 state q₂is the final state

163 of 221

Example #2: {0ⁿ1ⁿ| n ≥ 1}

->q₀(q₁, X, R) -

(q₃, Y, R)0’s finished -

- (more 0’s)

q₁(q₁, 0, R)ignore1 (q₂, Y, L) q₂(q₂, 0, L) ignore2 -

(q₀, X, R)

- (more 1’s)

(q₁, Y, R) ignore2 (q₂, Y, L) ignore1 (q₃, Y, R) ignore

(q₄, B, R)

q₃-

q₄* -

Sample Computation: (on 0011),

presume state q looks rightward

q₀0011BB.. |— Xq₁011

|— X0q₁11

|— Xq₂0Y1

|— q₂X0Y1

|— Xq₀0Y1

|— XXq₁Y1

|— XXYq₁1

|— XXq₂YY

|— Xq₂XYY

|— XXq₀YY

|— XXYq₃Y B…

|— XXYYq₃BB…

|— XXYYBq₄

TOC notes by S P CHARI

164 of 221

Making a TM for {0ⁿ1ⁿ| n >= 1}

Try n=2 or 3 first.

q0 is on 0, replaces with the character to X, changes state to q1, moves right
q1 sees next 0, ignores (both 0’s and X’s) and keeps moving right
q1 hits a 1, replaces it with Y, state to q2, moves left
q2 sees a Y or 0, ignores, continues left
when q2 sees X, moves right, returns to q0 for looping step 1 through 5
when finished, q0 sees Y (no more 0’s), changes to pre-final state q3
q3 scans over all Y’s to ensure there is no extra 1 at the end (to crash on seeing any 0 or 1)
when q3 sees B, all 0’s matched 1’s, done, changes to final state q4
blank line for final state q4

q₀0011BB.. |— Xq₁011

Try n=1 next.

Make sure unbalanced 0’s and 1’s, or mixture of 0-1’s,

“crashes” in a state not q4, as it should be

|— X0q₁11

|— Xq₂0Y1

|— q₂X0Y1

|— Xq₀0Y1

|— XXq₁Y1

|— XXYq₁1

|— XXq₂YY

|— Xq₂XYY

|— XXq₀YY

|— XXYq₃Y B…

|— XXYYq₃BB…

|— XXYYBq₄

TOC notes by S P CHARI

165 of 221

TOC notes by S P CHARI

Same Example #2: {0ⁿ1ⁿ| n ≥ 1}

Logic:

t to find next

q₀	0 (q₁, X, R)	1 -	X -	Y (q₃, Y, R)	-	B
q₁	(q₁, 0, R)	(q₂, Y, L)	-	(q₁, Y, R)	-
q₂	(q₂, 0, L)	-	(q₀, X, R)	(q₂, Y, L)	-
q₃ q₄	- -	- -	- -	(q₃, Y, R) -	(q₄, B, R)	-
cross 0’s with X’s, scan right to look for corresponding 1, on finding it cross it with Y, and scan lef

leftmost 0, keep iterating until no more 0’s, then scan right looking for B.

The TM matches up 0’s and 1’s

q₁is the “scan right” state, looking for 1 q₂is the “scan left” state, looking for X q₃is “scan right”, looking for B

q₄is the final state

Can you extend the machine to include n=0?

How does the input-tape look like for string epsilon?

Other Examples:

001

000111

011

166 of 221

TOC notes by S P CHARI

Roger Ballard’s TM for Example #2, without any extra Tape Symbol: {0ⁿ1ⁿ| n ≥ 0}

q₀	0 (q₁, B, R)	1	B (q₄, B, R)
q₁	(q₁, 0, R)	(q₁, 1, R)	(q₂, B, L)
q₂	-	(q₃, B, L)	-
q₃	(q₃, 0, L)	(q₃, 1, L)	(q₀, B, R)
q₄^*- - - Keep deleting 0 and corresponding 1 from extreme ends, until none left.

Logic:

q₀deletes a leftmost 0 and let q₁scan through end of string, q₀accepts on epsilon

q₁scans over the string and makes q₂expecting 1 on the left

q₂deletes 1 and let q₃“scan left” looking for the start of current string q₃lets q₀start the next iteration

q₄is the final state

Any bug?

Try on:

001

000111

011

167 of 221

TOC notes by S P CHARI

And his example of a correct TM for the language that goes on infinite loop outside language: {0ⁿ1ⁿ| n ≥ 0}

q₀	0 (q₁, B, R)	1 (q₃, 1, L)	B (q₄, B, R)
q₁	(q₁, 0, R)	(q₁, 1, R)	(q₂, B, L)
q₂	-	(q₃, B, L)	-
q₃	(q₃, 0, L)	(q₃, 1, L)	(q₀, B, R)
q₄^*- - - This machine still works correctly for all strings in the language, but

Logic:

start a string with 1 (not in the language), and it loops on B1 for ever.

168 of 221

TOC notes by S P CHARI

Exercises: Construct a DTM for each of the following.

{w | w is in {0,1}* and w ends in 00}

{w | w is in {0,1}* and w contains at least two 0’s}

{w | w is in {0,1}* and w contains at least one 0 and one 1}

Just about anything else (simple) you can think of

169 of 221

Formal Definitions for DTMs

TOC notes by S P CHARI

Let M = (Q, Σ, Г, δ, q₀, B, F) be a TM.

Definition: An instantaneous description (ID) is a triple α₁qα₂, where:

q, the current state, is in Q

α₁α₂, is in Г*, and is the current tape contents up to the rightmost non-blank symbol, or the symbol to the left of the tape head,

whichever is rightmost

The tape head is currently scanning the first symbol of α₂

At the start of a computation α₁= ε

If α₂= ε then a blank is being scanned

Example: (for TM #1)

q₀0011 Xq₁011

X0q₁11

Xq₂0Y1

q₂X0Y1

Xq₀0Y1 XXq₁Y1	XXYq₁1	XXq₂YY
XXq₀YY XXYq₃Y	XXYYq₃	XXYYBq₄

Xq₂XYY

170 of 221

TOC notes by S P CHARI

Suppose the following is the current ID of a DTM

^x1^x2^…xi-1^qxi^xi+1^…xn Case 1) δ(q, x_i) = (p, y, L)

if i = 1 then qx₁x₂…x_i-1x_ix_i+1…x_n|— pByx₂…x_i-1x_ix_i+1…x_n
else x₁x₂…x_i-1qx_ix_i+1…x_n|— x₁x₂…x_i-2px_i-1yx_i+1…x_n

If any suffix of x_i-1yx_i+1…x_nis blank then it is deleted.

Case 2) δ(q, x_i) = (p, y, R)

x₁x₂…x_i-1qx_ix_i+1…x_n|— x₁x₂…x_i-1ypx_i+1…x_n

If i>n then the ID increases in length by 1 symbol

x₁x₂…x_nq |— x₁x₂…x_nyp

171 of 221

TOC notes by S P CHARI

Definition: Let M = (Q, Σ, Г, δ, q₀, B, F) be a TM, and let w be a string in Σ*. Then w is accepted by M iff

q₀w |—* α₁pα₂

where p is in F and α₁and α₂are in Г*

Definition: Let M = (Q, Σ, Г, δ, q₀, B, F) be a TM. The language accepted by M, denoted L(M), is the set

{w | w is in Σ* and w is accepted by M}

Notes:

In contrast to FA and PDAs, if a TM simply passes through a final state then the string is accepted.

Given the above definition, no final state of a TM need to have any transitions. Henceforth, this is our assumption.

If x is NOT in L(M) then M may enter an infinite loop, or halt in a non-final state.

Some TMs halt on ALL inputs, while others may not. In either case the language defined by TM is still well defined.

172 of 221

TOC notes by S P CHARI

Definition: Let L be a language. Then L is recursively enumerable if there exists a TM M such that L = L(M).

If L is r.e. then L = L(M) for some TM M, and

If x is in L then M halts in a final (accepting) state.

If x is not in L then M may halt in a non-final (non-accepting) state or no transition is available, or loop forever.

Definition: Let L be a language. Then L is recursive if there exists a TM M such that L = L(M) and M halts on all inputs.

If L is recursive then L = L(M) for some TM M, and

If x is in L then M halts in a final (accepting) state.

If x is not in L then M halts in a non-final (non-accepting) state or no transition is available (does not go to infinite loop).

Notes:

The set of all recursive languages is a subset of the set of all recursively enumerable languages

Terminology is easy to confuse: A TM is not recursive or recursively enumerable, rather a language is recursive or recursively

enumerable.

173 of 221

Recall the Hierarchy:

TOC notes by S P CHARI

Recursively Enumerable Languages

Recursive Languages Context-Sensitive Languages

Context-Free Languages - ε

Regular Languages - ε

Non-Recursively Enumerable Languages

174 of 221

TOC notes by S P CHARI

Observation: Let L be an r.e. language. Then there is an infinite list M₀, M₁, … of TMs such that L = L(M_i).

Question: Let L be a recursive language, and M₀, M₁, … a list of all TMs such that L = L(M_i), and choose any i>=0. Does M_ialways halt?

Answer: Maybe, maybe not, but at least one in the list does.

Question: Let L be a recursive enumerable language, and M₀, M₁, … a list of all TMs such that L = L(M_i), and

choose any i>=0. Does M_ialways halt?

Answer: Maybe, maybe not. Depending on L, none might halt or some may halt.

If L is also recursive then L is recursively enumerable, recursive is subset of r.e.

Question: Let L be a r.e. language that is not recursive (L is in r.e. – r), and M₀, M₁, … a list of all TMs such that L = L(M_i), and choose any i>=0. Does M_ialways halt?

Answer: No! If it did, then L would not be in r.e. – r, it would be recursive.

175 of 221

TOC notes by S P CHARI

L is Recursively enumerable:

TM exist: M₀, M₁, …

They accept string in L, and do not accept any string outside L

L is Recursive:

at least one TM halts on L and on ∑*-L, others may or may not

L is Recursively enumerable but not Recursive:

TM exist: M₀, M₁, …

but none halts on all x in ∑*-L

M₀goes on infinite loop on a string p in ∑*-L, while M₁on q in ∑*-L

However, each correct TM accepts each string in L, and none in ∑*-L

L is not R.E:

no TM exists

176 of 221

Let M be a TM.

TOC notes by S P CHARI

Question: Is L(M) r.e.?

Answer: Yes! By definition it is!

Question: Is L(M) recursive?

Answer: Don’t know, we don’t have enough information.

Question: Is L(M) in r.e – r?

Answer: Don’t know, we don’t have enough information.

177 of 221

Let M be a TM that halts on all inputs:

TOC notes by S P CHARI

Question: Is L(M) recursively enumerable?

Answer: Yes! By definition it is!

Question: Is L(M) recursive?

Answer: Yes! By definition it is!

Question: Is L(M) in r.e – r?

Answer: No! It can’t be. Since M always halts, L(M) is recursive.

178 of 221

Let M be a TM.

TOC notes by S P CHARI

As noted previously, L(M) is recursively enumerable, but may or may not be recursive.

Question: Suppose, we know L(M) is recursive. Does that mean M always halts?

Answer: Not necessarily. However, some TM M’ must exist such that L(M’) = L(M) and M’ always halts.

Question: Suppose that L(M) is in r.e. – r. Does M always halt?

Answer: No! If it did then L(M) would be recursive and therefore not in r.e. – r.

179 of 221

Let M be a TM, and suppose that M loops forever on some string x.

Question: Is L(M) recursively enumerable?

Answer: Yes! By definition it is. But, obviously x is not in L(M).

Question: Is L(M) recursive?

Answer: Don’t know. Although M doesn’t always halt, some other TM M’ may exist such that L(M’) = L(M) and M’

always halts.

Question: Is L(M) in r.e. – r?

Answer: Don’t know.

May be another M’ will halt on x, and on all strings! May be no TM for this L(M) does halt on all strings! We just do

not know!

TOC notes by S P CHARI

180 of 221

Modifications of the Basic TM Model

TOC notes by S P CHARI

Other (Extended) TM Models:

One-way infinite tapes Multiple tapes and tape heads Non-Deterministic TMs

Multi-Dimensional TMs (n-dimensional tape)

Multi-Heads

Multiple tracks

All of these extensions are equivalent to the basic DTM model

181 of 221

Closure Properties for Recursive and Recursively Enumerable Languages

TOC notes by S P CHARI

TMs model General Purpose (GP) Computers:

If a TM can do it, so can a GP computer

If a GP computer can do it, then so can a TM

If you want to know if a TM can do X, then some equivalent question are:

Can a general purpose computer do X?

Can a C/C++/Java/etc. program be written to do X?

For example, is a language L recursive?

Can a C/C++/Java/etc. program be written that always halts and accepts L?

182 of 221

TM Block Diagrams:

TOC notes by S P CHARI

If L is a recursive language, then a TM M that accepts L and always halts can be pictorially represented by a “chip” or

“box” that has one input and two outputs.

If L is a recursively enumerable language, then a TM M that accepts L can be pictorially represented by a “box” that has

one output.

yes

Conceivably, M could be p^wrovided with an output^Mfor “no,” but this output cannot be counted on. Consequently, we simply ignore it.

183 of 221

TOC notes by S P CHARI

Theorem 1: The recursive languages are closed with respect to complementation, i.e., if L is a recursive

language, then so is

Proof: Let M be a TM such that L = L(M) and M always halts. Construct TM M’ as follows:

Note That:

M’ accepts iff M does not

M’ always halts since M always halts

From this it follows that the complement of L is recursive. •

Question: How is the construction achieved? Do we simply complement the final states in the TM? No! A string in L could end up in the complement of L.

Suppose q₅is an accepting state in M, but q₀is not.

If we simply complemented the final and non-final states, then q₀would be an accepting state in M’ but q₅would not. Since q₀is an accepting state, by definition all strings are accepted by M’

L = Σ * −L

yes

M’

184 of 221

L = L ∪ L

Theorem 2: The recursive languages are closed with respect to union, i.e., if L₁and L₂are recursive languages, then so is

Proof: Let M₁and M₂be TMs such that L₁= L(M₁) and L₂= L(M₂) and M₁and M₂always halts.

Construct TM M’ as follows:

Note That:

L(M’) = L(M₁) L(M₂)

L(M’) is a subset of L(M₁) U L(M₂) L(M₁) U L(M₂) is a subset of L(M’)

1 2

It follows from this that

is recursive. •

3 1 2

yes

M₁

yes

M₂

start

M’

M’ always halts since M and^∪M always halt

TOCLno₃tes =by PaLllav₁i Jo∪shi L₂

185 of 221

TOC notes by S P CHARI

Theorem 3: The recursive enumerable languages are closed with respect to union, i.e., if L₁and L₂are recursively

enumerable languages, then so is

Proof: Let M₁and M₂be TMs such that L₁= L(M₁) and L₂= L(M₂). Construct M’ as follows:

Note That:

L(M’) = L(M₁) U L(M₂)

L(M’) is a subset of L(M₁) U L(M₂)

L(M₁) U L(M₂) is a subset of L(M’)

M’ halts and accepts iff M₁or M₂halts and accepts

It follows from this that

is recursively enumerable. •

Question: How do you run two TMs in parallel?

L₃= L₁∪ L₂

3 1

L = L ∪ L

yes

M₁

yes

M₂

M’

186 of 221

Suppose, M₁and M₂had outputs for “no” in the previous construction, and these were transferred to the “no” output for M’

TOC notes by S P CHARI

Question: What would happen if w is in L(M₁) but not in L(M₂)?

Answer: You could get two outputs – one “yes” and one “no.”

At least M₁will halt and answer accept, M₂may or may not halt.

As before, for the sake of convenience the “no” output will be ignored.

yes

M₁

yes

M₂

M’

187 of 221

are both _Lrecursively enumerable then L (and therefore

TOC notes by S P CHARI

Theorem 4: If L and

) is recursive.

Proof: Let M₁and M₂be TMs such that L = L(M₁) and

L(M’) is a subset of L L is a subset of L(M’)

M’ is TM for L

M’ always halts since either M₁or M₂halts for any given string

M’ shows that L is recursive

It follows from this that L (and therefore its’ complement) is recursive.

So, is also recursive (we proved it before). •

= L(M₂). Construct M’ as follows:

Note That:

L(M’) = L w

yes

M₁

yes

M₂

M’

188 of 221

Corollary of Thm 4: Let L be a subset of Σ*. Then one of the following must be true:

TOC notes by S P CHARI

Both L and One of L and Neither L nor

are recursive.

is recursively enumerable but not recursive, and the other is not recursively enumerable, or

is recursively enumerable

In other words, it is impossi^Lble to have both L and

r.e. but not recursive

189 of 221

In terms of the hierarchy: (possibility #1)

TOC notes by S P CHARI

Recursive Languages

Recursively Enumerable Languages

Non-Recursively Enumerable Languages

190 of 221

In terms of the hierarchy: (possibility #2)

TOC notes by S P CHARI

Recursive Languages

Recursively Enumerable Languages

Non-Recursively Enumerable Languages

191 of 221

In terms of the hierarchy: (possibility #3)

TOC notes by S P CHARI

Recursive Languages

Recursively Enumerable Languages

Non-Recursively Enumerable Languages

192 of 221

In terms of the hierarchy: (Impossibility #1)

TOC notes by S P CHARI

Recursive Languages

Non-Recursively Enumerable Languages

L L

Recursively Enumerable Languages

193 of 221

In terms of the hierarchy: (Impossibility #2)

TOC notes by S P CHARI

Recursively Enumerable Languages

Recursive Languages

Non-Recursively Enumerable Languages

194 of 221

In terms of the hierarchy: (Impossibility #3)

TOC notes by S P CHARI

Recursive Languages

Recursively Enumerable Languages

Non-Recursively Enumerable Languages

195 of 221

Note: This gives/identifies three approaches to show that a language is not recursive.

TOC notes by S P CHARI

Show that the language’s complement is not recursive, in one of the two ways:

Show that the language’s complement is recursively enumerable but not recursive Show that the language’s complement is not even recursively enumerable

196 of 221

The Halting Problem - Background

TOC notes by S P CHARI

Definition: A decision problem is a problem having a yes/no answer (that one presumably wants to solve with a computer). Typically, there is a list of parameters on which the problem is based.

Given a list of numbers, is that list sorted?

Given a number x, is x even?

Given a C program, does that C program contain any syntax errors?

Given a TM (or C program), does that TM contain an infinite loop?

From a practical perspective, many decision problems do not seem all that interesting. However,

from a theoretical perspective they are for the following two reasons:

Decision problems are more convenient/easier to work with when proving complexity results. Non-decision counter-parts can always be created & are typically at least as difficult to solve.

Notes:

The following terms and phrases are analogous:

Algorithm Decision Problem (un)Decidable

A halting TM program
A language(will show shortly)
(non)Recursive

197 of 221

Statement of the Halting Problem

TOC notes by S P CHARI

Practical Form: (P1)

Input: Program P and input I.

Question: Does P terminate on input I?

Theoretical Form: (P2)

Input: Turing machine M with input alphabet Σ and string w in Σ*.

Question: Does M halt on w?

A Related Problem We Will Consider First: (P3)

Input: Turing machine M with input alphabet Σ and one final state, and string w in Σ*. Question: Is w in L(M)?

Analogy:

Input: DFA M with input alphabet Σ and string w in Σ*.

Question: Is w in L(M)?

Is this problem (regular language) decidable? Yes! DFA always accepts or rejects.

198 of 221

Over-All Approach:

TOC notes by S P CHARI

We will show that a language L_dis not recursively enumerable

From this it will follow that is not recursive

Using this we will show that a language L_uis not recursive

^{From this it will follow that the halting problem}L_d^{is undecidable.}

As We Will See:

P3 will correspond to the language L_u

Proving P3 (un)decidable is equivalent to proving L_u(non)recursive

199 of 221

Converting the Problem to a Language

TOC notes by S P CHARI

Let M = (Q, Σ, Γ, δ, q₁, B, {q_n}) be a TM, where

Q = {q₁, q₂, … , q_n}, order the states from 1 through n Σ = {x₁, x₂} = {0, 1}

Γ = {x₁, x₂, x₃} = {0, 1, B}

Encode each transition:

δ(q_i, x_j) = (q_k, x_l, d_m)

where q_iand q_kare in ordered Q x_jand x_lare in Σ,

and d_mis in {L, R} = {d₁, d₂}

as:

0ⁱ10^j10^k10^l10^mwhere the number of 0’s indicate the corresponding id, and single 1 acts as a barrier

The TM M can then be encoded as:

111code₁11code₂11code₃11 … 11code_r111

where each code_iis one transitions’ encoding, and 11’s are barriers between transitions from the table row-major.

Let this encoding of M be denoted by <M>.

200 of 221

TOC notes by S P CHARI

Less Formally:

Every state, tape symbol, and movement symbol is encoded as a sequence of 0’s:

q₁,	0
q₂,	00
q₃	000
:
0	0
1	00
B	000
L	0
R	00

Note that 1’s are not used to represent the above, since 1 is used as a special separator symbol.

Example:

δ(q₂, 1) = (q₃, 0, R)

Is encoded as:

00100100010100

201 of 221

TOC notes by S P CHARI

q₁	0 (q₁, 0, R)	1 (q₁, 1, R)	B (q₂, B, L)
q₂	(q₃, 0, R)	-	-
q₃	-	-	-

What is the L(M)?

Coding for the above table:

1110101010100110100101001001101000100100010110010100010100111

Are the followings correct encoding of a TM?

01100001110001

111111

202 of 221

Definition:

TOC notes by S P CHARI

L_t= {x | x is in {0, 1}* and x encodes a TM}

Question: Is L_trecursive?

Answer: Yes. [Check only for format, i.e. the order and number of 0’s and 1’s, syntax checking]

Question: Is L_tdecidable:

Answer: Yes (same question).

203 of 221

The Universal Language

TOC notes by S P CHARI

Define the language L_uas follows:

L_u= {x | x is in {0, 1}* and x = <M,w> where M is a TM encoding and w is in L(M)}

Let x be in {0, 1}*. Then either:

x doesn’t have a TM prefix, in which case x is not in L_u
x has a TM prefix, i.e., x = <M,w> and either:

w is not in L(M), in which case x is not in L_u
w is in L(M), in which case x is in L_u

204 of 221

TOC notes by S P CHARI

Recall:
	0	1	B
q₁	(q₁, 0, R)	(q₁, 1, R)	(q₂, B, L)
q₂	(q₃, 0, R)	-	-
q₃	-	-	-

Which of the following are in L_u?

1110101010100110100101001001101000100100010110010100010100111

111010101010011010010100100110100010010001011001010001010011101110

111010101010011010010100100110100010010001011001010001010011100110111

01100001110001

111111

205 of 221

TOC notes by S P CHARI

Compare P3 and L_u:

(P3):

Input: Turing machine M with input alphabet Σ and one final state, and string w in Σ*. Question: Is w in L(M)?

L_u= {x | x is in {0, 1}* and x = <M,w> where M is a TM encoding and w is in L(M)}

Universal TM (UTM) is the machine for L_u

presuming it is r.e.! Can you write a program to accept strings in Lu?

Notes:

L_uis P3 expressed as a language

Asking if L_uis recursive is the same as asking if P3 is decidable.

Can you write a Halting program for accept/reject of strings in Sigma* ?

We will show that L_uis not recursive, and from this it will follow that P3 is un-decidable. From this we can further show that the Halting problem is un-decidable.

=> A general concept: a decision problem ≡ a formal language

206 of 221

TOC notes by S P CHARI

Define another language L_das follows:

[L_d_bar = {self accepting TM encodings}, everything else is L_d]

L_d= {x | x is in {0, 1}* and (a) either x is not a TM,

(b) or x is a TM, call it M, and x is not in L(M)} (1)

Note, there is only one string x

And, the question really is the complement of “does a TM accept its own encoding?” (Ld-bar’s complement)

Let x be in {0, 1}*. Then either:

x is not a TM, in which case x is in L_d
x is a TM, call it M, and either:

x is not in L(M), in which case x is in L_d
x is in L(M), in which case x is not in L_d

207 of 221

TOC notes by S P CHARI

Recall:
	0	1	B
q₁	(q₁, 0, R)	(q₁, 1, R)	(q₂, B, L)
q₂	(q₃, 0, R)	-	-
q₃	-	-	-

Which of the following are in L_d?

11101010101001101001010010011010001000100010110010100010100111

01100001110001

Change above machine to accept strings ending with 1: the encoding will not be in L_d

208 of 221

TOC notes by S P CHARI

Lemma: L_dis not recursively enumerable. [No TM for L_d!!!]

Proof: (by contradiction)

Suppose that L_dis recursively enumerable. In other words, there exists a TM M such that:

L_d= L(M) (2)

Now suppose that w is a string encoding of M. (3) Case 1) w is in L_d(4)

By definition of L_dgiven in (1), either w does not encode a TM, or w does encode a TM, call it M, and w is not in L(M). But we know that w

encodes a TM (3: that’s where it came from). Therefore:

w is not in L(M)	(5)
But then (2) and (5) imply that w is not in L_dcontradicting (4). Case 2) w is not in L_d		(6)
By definition of L_dgiven in (1), w encodes a TM, call it M, and: w is in L(M)	(7)

But then (2) and (7) imply that w is in L_dcontradicting (6).

Since both case 1) and case 2) lead to a contradiction, no TM M can exist such that L_d= L(M). Therefore L_dis not recursively enumerable. •

209 of 221

TOC notes by S P CHARI

Note:

= {x | x is in {0, 1}*, x encodes a TM, call it M, and x is in L(M)}

is not recursive.

Proof: If

L_d

Corollary:

were recursive, then L_dwould be recursive, and therefore recursively enumerable, a contradiction. •

L_d

210 of 221

TOC notes by S P CHARI

Suppose that M’ always halts and L_u= L(M’). It follows that:

M’’ always halts

Theorem: L_uis not recursive.

Proof: (by contradiction)

Suppose that L_uis recursive. Recall that:

L_u= {x | x is in {0, 1}* and x = <M,w> where M is a TM encoding and w is in L(M)}

Suppose that L_u= L(M’) where M’ is a TM that always halts. Construct an algorithm (i.e., a TM that always halts) for

as follows:

L_d

L(M’’) =

would therefore be recursive, a contradiction. •

L_d

Yes

Let M be the TM that w encodes.

M’:

UTM for L_u

Yes

<M,w>

(i.e., <w,w>)

Is w a TM?

L_t

M’’: for L_d-bar

211 of 221

L_u is recursively enumerable

(you may ignore this slide, for now)

TOC notes by S P CHARI

Input the string

Decode the TM prefix, if it doesn't have one then the string is not in Lu

Otherwise, run/simulate the encoded TM on the suffix

If it terminates and accepts then the original string is in Lu.

If a given string is in Lu, then the above algorithm will correctly determine that, halt and say yes. If the given string is not in Lu, then there are three cases:

the string doesn't have a TM as a prefix. In this case the above algo correctly detects this fact, and reports the string is not in Lu.
the string has a TM prefix, and the TM halts and rejects on the suffix. In this case the above algo correctly reports the string is not in Lu.
the string has a TM prefix, but it goes into an infinite loop on the suffix. In this case the above algo also goes into an infinite loop, but that’s ok since the string as a whole is not in Lu anyway, and we are just trying to show there exists a TM for only accepting strings in Lu.

From this proof note that if the prefix TM is a DFA or PDA, then our machine will also halt in the 3^rdcase above, no matter what the suffix is.

-- due to Dr. Bernhard (edited by me)

212 of 221

TOC notes by S P CHARI

The over-all logic of the proof is as follows:

If L_uwere recursive, then so will be
is not recursive, because L_dis not r.e.

3. It follows that L_uis not recursive.

The second point was established by the corollary.

The first point was established by the theorem on a preceding slide.

This type of proof is commonly referred to as a reduction. Specifically, the problem of recognizing

recognizing L_u

was reduced to the problem of

L_d

213 of 221

TOC notes by S P CHARI

Define another language L_h:

L_h= {x | x is in {0, 1}* and x = <M,w> where M is a TM encoding and M halts on w}

Note that L_his P2 expressed as a language:

(P2):

Input: Turing machine M with input alphabet Σ and string w in Σ*.

Question: Does M halt on w?

214 of 221

TOC notes by S P CHARI

Theorem: L_his not recursive.

Proof: (by contradiction)

Suppose that L_his recursive. Recall that:

L_h= {x | x is in {0, 1}* and x = <M,w> where M is a TM encoding and M halts on w}

and

L_u= {x | x is in {0, 1}* and x = <M,w> where M is a TM encoding and w is in L(M)}

Suppose that L_h= L(M’) where M’ is a TM that always halts. Construct an algorithm (i.e., a TM that always halts) for L_u

as follows:

Suppose that M’ always halts and L_h= L(M’). It follows that:

M’’ always halts

L(M’’) = L_u

L_uwould therefore be recursive, a contradiction. •

Yes

Simulate M On w

Yes

M’ for L_h: does M halt on w?

<M,w>

M’’ : UTM for L_u

start

215 of 221

TOC notes by S P CHARI

The over-all logic of the proof is as follows:

If L_his recursive, then so is L_u
L_uis not recursive
It follows that L_his not recursive.

The second point was established previously.

The first point was established by the theorem on the preceding slide.

This proof is also a reduction. Specifically, the problem of recognizing L_uwas reduced to the problem of recognizing L_h.

[L_uand L_hboth are recursively enumerable: for proof see Dr. Shoaff!]

216 of 221

TOC notes by S P CHARI

Examples of non-halting program:

http://cs.fit.edu/~ryan/tju/russell.c http://cs.fit.edu/~ryan/tju/russell.scm http://cs.fit.edu/~ryan/tju/russell.py

217 of 221

TOC notes by S P CHARI

Define another language L_q:

L_q= {x | x is in {0, 1}*, x encodes a TM M, and M does not contain an infinite loop}

Or equivalently:

L_q= {x | x is in {0, 1}*, x encodes a TM M, and there exists no string w in {0, 1}*

such that M does not terminate on w}

Note that:

= {x | x is in {0, 1}*, and either x does not encode a TM, or it does encode a TM, call it M, and there exists a string w in {0, 1}* such that M does not terminate on w}

Note that the above languages correspond to the following problem:

(P0):

Input: Program P.

Question: Does P contain an infinite loop?

Using the techniques discussed, what can we prove about L_qor its’ complement?

L_q

218 of 221

TOC notes by S P CHARI

More examples of non-recursive languages:

L_ne= {x | x is a TM M and L(M) is not empty} is r.e. but not recursive.

L_e= {x | x is a TM M and L(M) is empty} is not r.e.

L_r= {x | x is a TM M and L(M) is recursive} is not r.e.

Note that L_ris not the same as L_h= {x | x is a TM M that always halts}

but L_his in L_r.

L_nr= {x | x is a TM M and L(M) is not recursive} is not r.e.

219 of 221

Ignore this slide

Lemma: L_dis not recursively enumerable: [No TM for L_d!!!]

TOC notes by S P CHARI

Proof: (by contradiction)

Suppose that L_dwere recursively enumerable. In other words, that there existed a TM M such that:

L_d= L(M) (2)

Now suppose that w_jis a string encoding of M. (3) Case 1) w_jis in L_d(4)

By definition of L_dgiven in (1), either w_jdoes not encode a TM, or w_jdoes encode a TM, call it M, and w_jis not in L(M). But we know that

w_jencodes a TM (3: that’s where it came from). Therefore:

w_jis not in L(M) But then (2) and (5) imply that w_jis not in L_dcontradicting (4). Case 2) w_jis not in L_d	(5)	(6)
By definition of L_dgiven in (1), w_jencodes a TM, call it M, and: w_jis in L(M)	(7)

But then (2) and (7) imply that w_jis in L_dcontradicting (6).

Since both case 1) and case 2) lead to a contradiction, no TM M can exist such that L_d= L(M). Therefore L_dis not recursively enumerable. •

220 of 221

TOC notes by S P CHARI

	(	)	B
findPair	(findPair2, "(", R)	-	(final, B, R)
findPair2	(findPair2, "(", R)	(removePair, ")", L)	-
removePair	(fetch, "(", R)	(fetch, ")", R)	(goBack, B, L)
fetch	(retrieve, "(", R)	(retreive, ")", R)	(retreive, B, R)
retreive	(returnOpen, "(", L)	(returnClosed, ")", L)	(returnBlank, B, L)
returnOpen	(writeOpen, "(", L)	(writeOpen, ")", L)	(writeOpen, B, L)
returnClosed	(writeClosed, "(", L)	(writeClosed, ")", L)	(writeClosed, B, L)
returnBlank	(writeBlank "(", L)	(writeBlank, ")", L)	(writeBlank, B, L)
writeOpen	(removePair, "(", R)	(removePair, "(", R)	-
writeClosed	(removePair, ")", R)	(removePair, ")", R)	-
writeBlank	(removePair, B, R)	(removePair, B, R)	-
goBack	-	-	(backAgain, B, L)
backAgain	-	-	(seekFront, B, L)
seekFront	(seekFront, "(", L)	(seekFront, ")", L)	(findPair, B, R)
final*	-	-	-

Roger’s TM for balanced parenthesis:

221 of 221

TOC notes by S P CHARI

On 111 111 as a TM encoding

<Quote> It was ambiguous, in my opinion, based on the definition in the Hopcroft book, i.e., the definition in the Hopcroft book was not clear/precise enought to

account this special case. I don't have the book in front of me right now, but I think this is the example I used in class: Consider the TM that has exactly one state, but no transitions. Perfectly valid TM, and it would give us this encoding (111111). In that case the encoded machine would accept sigma* because the highest numbered state would be q0, the only state, and that would be the final state under the Hopcroft encoding. Now consider the TM that has exactly two states, but no transitions. Also a perfectly valid TM, and it would give us the same encoding. In that case the encoded machine would not accept anything because the final state is q1 (highest numbered state), and there is no way to get to it. I used it only as a way to raise that issue in class, i.e., the the Hopcroft definition is a bit ambiguous in this case.

One way to resolve the ambiguity is to require the encoding to specifically specify the final state (at the end or something). In that case, 111111 isn't even a valid TM, since it doesn't specify the final state. Another related question is, does a TM even have to have any states at all to be a valid TM? The encoding would have to be able to isolate that as a unique string also. <End Quote>

Phil Bernhard