UNIT-1�INTRODUCTION TO
FINITE AUTOMATA��PREPARED BY:SRIRAM
Alphabet
An alphabet is a finite and non empty set of symbols.
The example of alphabet is,
S={a, b, c,……..z}
The elements a, b, c, …….z are alphabets.
If W={0,1}
Here 0 and 1 are alphabets, denoted by W.
String
TOC SP CHARI
For example, if Σ = {a,b} then various strings that can be formed from Σ are {ab, ba, aaa, bbbb, aba, bab, ….}. An infinite number of strings can be generated.
Language
For example: Σ ={Ꜫ,0,00,000,0000,….}.
Here the language L is defined as ‘Any number of Zeros’.
Examples
TOC SP CHARI
This can also be written as {x∈{a,b,c}∗ ||x|ac ≥1},
stating that the set of all strings over {a,b,c} in which the number of occurrences of substring ac is at least 1.
Equivalently, {x∈Σ∗ ||x|a ≡ 0 mod 2}.
of a’s and b’s can be written as {x∈Σ∗ ||x|a =|x|b}.
{x∈Σ∗ | x = xR}, where xR is the string obtained by reversing x.
Finite State Machine
TOC notes by S P CHARI
Definition of Finite Automata
TOC notes by S P CHARI
Σ is input alphabet, indicates input set.
δ is transition function or mapping function. We can determine the next state using this function.
q0 is an initial state and is in Q F is set of final states.
Finite Automata Model
TOC notes by S P CHARI
Input Tape
A | B | A | B | A | B | A | B | A | B |
Finite
Control
Tape
Reader
Out put
Types of Automata
TOC notes by S P CHARI
Finite Automata
Deterministic Finite Automata
Non Deterministic Finite Automata
Types of Automata
It can be represented as follows:
Q is finite set of states, which is non empty.
Σ is input alphabet, indicates input set.
δ is transition function or mapping function. We can determine the next state using this function.
q0 is an initial state and is in Q F is set of final states.
Where δ:Q X Σ -> Q
TOC notes by S P CHARI
KMIT
Types of Automata
It can be represented as follows:
Q is finite set of states, which is non empty.
Σ is input alphabet, indicates input set.
δ is transition function or mapping function. We can determine the next state using this function.
q0 is an initial state and is in Q F is set of final states.
Where δ:Q X Σ -> 2TOQC notes by Pallavi Joshi
Difference between DFA & NFA
TOC notes by S P CHARI
Deterministic Finite Automata | Non Deterministic Finite Automata |
For Every symbol of the alphabet, there is only one state transition in DFA. | We do not need to specify how does the NFA react according to some symbol. |
DFA cannot use Empty String transition. | NFA can use Empty String transition. |
DFA can be understood as one machine. | NFA can be understood as multiple little machines computing at the same time. |
DFA will reject the string if it end at other than accepting state. | If all of the branches of NFA dies or rejects the string, we can say that NFA reject the string. |
Backtracking is allowed in DFA. | Backtracking is not always allowed in NFA. |
DFA is more difficult to construct. | NFA is easier to construct. |
Hierarchy of languages
TOC notes by S P CHARI
Recursively Enumerable Languages
Recursive Languages Context-Free Languages
Regular Languages
Non-Recursively Enumerable Languages
Deterministic Finite State Automata (DFA)
TOC notes by S P CHARI
……..
ken into cells
ead.
One-way, infinite tape, bro One-way, read-only tape h Finite control, i.e.,
finite number of states, and
transition rules between them, i.e.,
a program, containing the position of the read head, current symbol being scanned, and the current “state.”
A string is placed on the tape, read head is positioned at the left end, and the DFA will read the string one symbol at a time until all symbols have been read. The DFA will then either accept or reject the string.
Finite Control
0 | 1 | 1 | 0 | 0 | | | | |
TOC notes by S P CHARI
The finite control can be described by a transition diagram or table:
Example #1:
1
q0 q0 q1
1 1
q0 q0
One state is final/accepting, all others are rejecting.
The above DFA accepts those strings that contain an even number of 0’s, including the
null string, over Sigma = {0,1}
L = {all strings with zero or more 0’s}
Note, the DFA must reject all other strings
q0
0 0
q1
0
q0
1
0
1
Note:
TOC notes by S P CHARI
L
you can call them by any names!
Sigma = {0, 1} ≡ {a, b}
States = {q0, q1} ≡ {u, v}, as long as they have
identical (isomorphic) transition table
M1
M2
….
M-inf
An equivalent machine to the previous example (DFA for even number of 0’s):
TOC notes by S P CHARI
q0 q3 q1 q2 q2 q2
One state is final/accepting, all others are rejecting.
The above DFA accepts those strings that contain an even number of 0’s, including null
string, over Sigma = {0,1}
Can you draw a machine for a language by excluding the null string from the language? L =
{all strings with 2 or more 0’s}
q0
1 0 0 1 1q1
0
0
1
accept string
q2
1
1
0
q3
1
0
Example #2:
TOC notes by S P CHARI
a
c
c
c
b
accepted
q0
q0
q1
q2
q2
q2
a
a
c
rejected
q0 q0 q0 q1
Accepts those strings that contain at least two c’s
q1
q0
q2
a
b
a
b
c
c
a/b/c
a/b/c
q2
a q0
a
q1
b
b
c
c
Inductive Proof (sketch): that the machine correctly accepts strings with at least two c’s
Proof goes over the length of the string.
Base: x a string with |x|=0. state will be q0 => rejected.
Inductive hypothesis: |x|= integer k, & string x is rejected - in state q0 (x must have zero c),
OR, rejected – in state q1 (x must have one c),
OR, accepted – in state q2 (x has already with two c’s)
Inductive steps: Each case for symbol p, for string xp (|xp| = k+1), the last symbol p = a, b or c
| xa | xb | xc |
x ends in q0 | q0 =>reject (still zero c => should reject) | q0 =>reject (still zero c => should reject) | q1 =>reject (still zero c => should reject) |
x ends in q1 | q1 =>reject (still one c => should reject) | q1 =>reject (still one c => should reject) | q2 =>accept (two c now=> should accept) |
x ends in q2 | q2 =>accept (two c already => should accept) TOC notes by Palla | q2 =>accept (two c already => should vi Jaocscheipt) | q2 =>accept (two c already => should accept) |
Formal Definition of a DFA
TOC notes by S P CHARI
A DFA is a five-tuple:
M = (Q, Σ, δ, q0, F)
Q Σ
q0
F
δ
A finite set of states
A finite input alphabet
The initial/starting state, q0 is in Q
A set of final/accepting states, which is a subset of Q
A transition function, which is a total function from Q x Σ to Q
δ: (Q x Σ) –> Q
δ(q,s) = q’
δ is defined for any q in Q and s in Σ, and
is equal to some state q’ in Q, could be q’=q
Intuitively, δ(q,s) is the state entered by M after reading symbol s while in state q.
Revisit example #1:
TOC notes by S P CHARI
Q = {q0, q1}
Σ = {0, 1}
Start state is q0
F = {q0}
δ:
0
q1
1
q0
q0
q1
q0 | q1 |
| |
q
0
q1
0
0
1
1
Revisit example #2:
TOC notes by S P CHARI
Q = {q0, q1, q2}
Σ = {a, b, c} Start state is q0 F = {q2}
δ:
a
q0
b q0
c
q1
q0
q1 | q1 | q2 |
q2 | q2 | q2 |
n, at each st | ep M has | exactly on |
q1
q2
Since δ is a functio e option.
It follows that for a given string, there is exactly one computation.
q1
q0
q2
a
b
a
b
c
c
a/b/c
Extension of δ to Strings
TOC notes by S P CHARI
δ^ : (Q x Σ*) –> Q
δ^(q,w) – The state entered after reading string w having started in state q.
Formally:
δ^(q,wa) = δ (δ^(q,w), a)
Recall Example #1:
TOC notes by S P CHARI
δ^(q0, 011)
= δ (δ^(q0,01), 1)
= δ (δ ( δ^(q0,0), 1), 1)
= δ (δ (δ (δ^(q0, λ), 0), 1), 1)
= δ (δ (δ(q0,0), 1), 1)
= δ (δ (q1, 1), 1)
= δ (q1, 1)
= q1
by rule #2
by rule #2
by rule #2
by rule #1
by definition of δ by definition of δ by definition of δ
Is 011 accepted? No, since δ^(q0, 011) = q1 is not a final state.
q0
q1
0
What is δ^(q0, 011)? Informally, it is the state entered by0 M after processing 011 having started in state q0.
Formally:
1
1
Note that:
TOC notes by S P CHARI
δ^ (q,a) = δ(δ^(q, ε), a)
= δ(q, a)
by definition of δ^, rule #2
by definition of δ^, rule #1
Therefore:
δ^ (q, a1a2…an) = δ(δ(…δ(δ(q, a1), a2)…), an)
However, we will abuse notations, and use δ in place of δ^: δ^(q, a1a2…an) = δ(q, a1a2…an)
Example #3:
TOC notes by S P CHARI
δ(q0, 011)
= δ (δ(q0,01), 1)
= δ (δ (δ(q0,0), 1), 1)
= δ (δ (q1, 1), 1)
= δ (q1, 1)
= q1
by rule #2
by rule #2
by definition of δ by definition of δ by definition of δ
Is 011 accepted? No, since δ(q0, 011) = q1 is not a final state.
Language?
L ={ all strings over {0,1} that has 2 or more 0 symbols}
q
q
0 1
q
2
1
1
0
0
1
What is δ(q0, 011)? Informally, it is the state entered by M after process0ing 011 having
started in state q0.
Formally:
Recall Example #3:
TOC notes by S P CHARI
What is δ(q1, 10)?
δ(q1, 10) | = δ (δ(q1,1), 0) | by rule #2 |
| = δ (q1, 0) | by definition of δ |
| = q2 | by definition of δ |
Is 10 accepted? No, since δ(q0, 10) = q1 is not a final state. The fact that δ(q1, 10) = q2 is
irrelevant, q1 is not the start state!
0
q
1
q
0
q
2
1
1
0
1
0
Definitions related to DFAs
Let M = (Q, Σ, δ,q0,F) be a DFA and let w be in Σ*. Then w is accepted by M iff δ(q0,w) = p for some state p in F.
Let M = (Q, Σ, δ,q0,F) be a DFA. Then the language accepted by M is the set:
L(M) = {w | w is in Σ* and δ(q0,w) is in F} Another equivalent definition:
L(M) = {w | w is in Σ* and w is accepted by M}
Let L be a language. Then L is a regular language iff there exists a DFA M such that L = L(M).
Let M1 = (Q1, Σ1, δ1, q0, F1) and M2 = (Q2, Σ2, δ2, p0, F2) be DFAs. Then M1 and M2 are equivalent iff
L(M1) = L(M2).
TOC notes by S P CHARI
TOC notes by S P CHARI
Notes:
A DFA M = (Q, Σ, δ,q0,F) partitions the set Σ* into two sets: L(M) and
Σ* - L(M).
If L = L(M) then L is a subset of L(M) and L(M) is a subset of L (def. of set equality). Similarly, if L(M1) = L(M2) then L(M1) is a subset of L(M2) and L(M2) is a subset of L(M1). Some languages are regular, others are not. For example, if
Regular: L1 = {x | x is a string of 0's and 1's containing an even
number of 1's} and
Not-regular: L2 = {x | x = 0n1n for some n >= 0}
Can you write a program to “simulate” a given DFA, or any arbitrary input DFA?
Question we will address later:
How do we determine whether or not a given language is regular?
Give a DFA M such that:
TOC notes by S P CHARI
L(M) = {x | x is a string of 0’s and 1’s and |x| >= 2}
Prove this by induction
q1
q0
q2
0/1
0/1
0/1
Give a DFA M such that:
TOC notes by S P CHARI
L(M) = {x | x is a string of (zero or more) a’s, b’s and c’s such that x does not contain the substring aa}
Logic:
In Start state (q0): b’s and c’s: ignore – stay in same state
q0 is also “accept” state
First ‘a’ appears: get ready (q1) to reject
But followed by a ‘b’ or ‘c’: go back to start state q0
When second ‘a’ appears after the “ready” state: go to reject state q2 Ignore everything after getting to the “reject” state q2
a
a/b/c
a
q0 q1 q2
b/c
b/c
Give a DFA M such that:
TOC notes by S P CHARI
L(M) = {x | x is a string of a’s, b’s and c’s such that x contains the substring aba}
a
a/b/c
b
q0 q1
c
Logic: acceptance is straight forward, progressing on each expected symbol
However, rejection needs special cabre/c, in each state (for DFA, we will see this becomes
easier in NFA, non-deterministic machine)
q2 q3
b/c
a
a
Give a DFA M such that:
TOC notes by S P CHARI
q0
b
q7
Remember, you may havqe multiple “final” staqtes, but only one “stqart” state
4 5 6
b
b
b
a
q2
q1
q3
a
L(M) = {x | x is a string of a’s and b’s such that x
contains both aa and bb}
First do, for a language where ‘aa’ comes before ‘bb’
Then do its reverse; and then parallelize them.
a
a
b
a/b
b
a
a
a
b
TOC notes by S P CHARI
Let Σ = {0, 1}. Give DFAs for {}, {ε}, Σ*, and Σ+.
For {}: For {ε}:
For Σ*:
For Σ+:
0/1
q0
0/1
q0
q1
q0
0/1
0/1
0/1
q0
q1
0/1
Problem: Third symbol from last is 1
TOC notes by S P CHARI
0/1
q1
q0
q3
1
0/1
q2
0/1
Is this a DFA?
No, but it is a Non-deterministic Finite Automaton
Nondeterministic Finite State Automata (NFA)
TOC notes by S P CHARI
An NFA is a five-tuple:
M = (Q, Σ, δ, q0, F)
Q Σ
q0
F
δ
A finite set of states
A finite input alphabet
The initial/starting state, q0 is in Q
A set of final/accepting states, which is a subset of Q
A transition function, which is a total function from Q x Σ to 2Q
δ: (Q x Σ) –> 2Q :2Q is the power set of Q, the set of all subsets of Q
:The set of all states p such that there is a transition
labeled s from q to p
δ(q,s)
δ(q,s) is a function from Q x S to 2Q (but not only to Q)
TOC notes by S P CHARI
Example #1: one or more 0’s followed by one or more 1’s
Q = {q0, q1, q2}
Σ = {0, 1}
Start state is q0
F = {q2}
δ:
0
1
q0
q
q
1 {q0, q1} | {} |
2 {} | {q1, q2} |
{q2} | {q2} |
q1
q0
q2
0
1
0
1
0/1
Example #2: pair of 0’s or pair of 1’s as substring
Q = {q0, q1, q2 , q3 , q4}
Σ = {0, 1}
Start state is q0
F = {q2, q4}
δ: 0
1
q0
q1
q2 q3 q4
q0
0/1
0
0
q3
q4
0/1
q1
q2
0/1
Joshi
1
1
{q0, q3} | {q0, q1} |
{} | {q2} |
{q2} | {q2} |
{q4} | {} |
{q4} | {q4} TOC notes by Pallavi |
Notes:
TOC notes by S P CHARI
δ(q,s) may not be defined for some q and s (what does that mean?) δ(q,s) may map to multiple q’s
A string is said to be accepted if there exists a path from q0 to some state in F
A string is rejected if there exist NO path to any state in F
The language accepted by an NFA is the set of all accepted strings
Question: How does an NFA find the correct/accepting path for a given string?
NFAs are a non-intuitive computing model
You may use backtracking to find if there exists a path to a final state (following slide)
Why NFA?
We are primarily interested in NFAs as language defining capability, i.e., do NFAs accept languages that
DFAs do not?
Other secondary questions include practical ones such as whether or not NFA is easier to develop, or how does one implement NFA
Determining if a given NFA (example #2) accepts a given string (001) can be done
algorithmically:
TOC notes by S P CHARI
q0
q0
q0
q3
q3
q4 q4
Each level will have at most n states:
Complexity: O(|x|*n), for running over a string x
accepted
1q0
q1
0
0
q0
0/1
0
q3
1
1
q2
q1
0
0/1
q4
0/1
Another example (010):
TOC notes by S P CHARI
q0
q0
q0
q0
q3
q1
q3
not accepted
All paths have been explored, and none lead to an accepting state.
0
1
0
q0
0/1
0
q3
q4
1
1
q2
q1
0
Question: Why non-determinism is useful?
TOC notes by S P CHARI
Non-determinism = Backtracking Compressed information
Non-determinism hides backtracking
Programming languages, e.g., Prolog, hides backtracking => Easy to program at a higher level: what we
want to do, rather than how to do it
Useful in algorithm complexity study
Is NDA more “powerful” than DFA, i.e., accepts type of languages that any DFA cannot?
TOC notes by S P CHARI
Let Σ = {a, b, c}. Give an NFA M that accepts:
L = {x | x is in Σ* and x contains ab}
Is L a subset of L(M)? Or, does M accepts all string in L?
Is L(M) a subset of L? Or, does M rejects all strings not in L?
Is an NFA necessary? Can you draw a DFA for this L?
Designing NFAs is not as trivial as it seems: easy to create bug accepting string outside
language
q0 q1
q2
a
a/b/c
b
a/b/c
TOC notes by S P CHARI
Let Σ = {a, b}. Give an NFA M that accepts:
L = {x | x is in Σ* and the third to the last symbol in x is b}
a/b
Is L a subset of L(M)?
Is L(M) a subset of L?
Give an equivalent DFA as an exercise.
q1
q0
b
q3
a/b
q2
a/b
Extension of δ to Strings and Sets of States
What we currently have:
δ : (Q x Σ) –> 2Q
q0
0
1
q1
q4
0
1
q2
0
0
1
What we want (why?): δ : (2Q x Σ*) –> 2Q
We will do this in two steps, which will be slightly different from the book, and we will make use of the following NFA.
0
q3
TOC notes by S P CHARI
0
Extension of δ to Strings and Sets of States
TOC notes by S P CHARI
Step #1:
Given δ: (Q x Σ) –> 2Q define δ#: (2Q x Σ) –> 2Q as follows:
for all subsets R of Q, and symbols a in Σ
Note that:
1) δ#(R, a) = δ(q, a)
∪
q∈R
δ#({p},a) = δ(q, a)
= δ(p, a)
by definition of δ#, rule #1 above
∪
q∈{ p}
Hence, we can use δ for δ#
δ({q0, q2}, 0)
δ({q0, q1, q2}, 0)
These now make sense, but previously they did not.
Example:
TOC notes by S P CHARI
δ({q0, q2}, 0)
= δ(q0, 0) U δ(q2, 0)
= {q1, q3} U {q3, q4}
= {q1, q3, q4}
δ({q0, q1, q2}, 1) = δ(q0, 1) U δ(q1, 1) U δ(q2, 1)
= {} U {q2, q3} U {}
= {q2, q3}
Step #2:
TOC notes by S P CHARI
Given δ: (2Q x Σ) –> 2Q define δ^: (2Q x Σ*) –> 2Q as follows:
δ^(R,w) – The set of states M could be in after processing string w, having started from any
state in R.
Formally:
for any subset R of Q
for any w in Σ*, a in Σ, and
subset R of Q
Note that:
δ^(R, a)
= δ(δ^(R, ε), a)
= δ(R, a)
by definition of δ^, rule #3 above by definition of δ^, rule #2 above
Hence, we can use δ for δ^
δ({q0, q2}, 0110)
δ({q0, q1, q2}, 101101)
These now make sense, but previously they did not.
TOC notes by S P CHARI
Example:
What is δ({q0}, 10)?
Informally: The set of states the NFA could be in after processing 10,
having started in state q0, i.e., {q1, q2, q3}.
Formally:
δ({q0}, 10) = δ(δ({q0}, 1), 0)
= δ({q0}, 0)
= {q1, q2, q3}
Is 10 accepted? Yes!
q0
0
1
q1
q3
0
1
q2
1
1
0
Example:
TOC notes by S P CHARI
What is δ({q0, q1}, 1)?
δ({q0 , q1}, 1)
= δ({q0}, 1) ∪ δ({q1}, 1)
= {q0} ∪ {q2, q3}
= {q0, q2, q3}
What is δ({q0, q2}, 10)?
δ({q0 , q2}, 10)
= δ(δ({q0 , q2}, 1), 0)
= δ(δ({q0}, 1) U δ({q2}, 1), 0)
= δ({q0} ∪ {q3}, 0)
= δ({q0,q3}, 0)
= δ({q0}, 0) ∪ δ({q3}, 0)
= {q1, q2, q3} ∪ {}
= {q1, q2, q3}
Example:
TOC notes by S P CHARI
δ({q0}, 101)
= δ(δ({q0}, 10), 1)
= δ(δ(δ({q0}, 1), 0), 1)
= δ(δ({q0}, 0), 1)
= δ({q1 , q2, q3}, 1)
= δ({q1}, 1) U δ({q2}, 1) U δ({q3}, 1)
= {q2, q3} U {q3} U {}
= {q2, q3}
Is 101 accepted? Yes! q3 is a final state.
Definitions for NFAs
TOC notes by S P CHARI
Let M = (Q, Σ, δ,q0,F) be an NFA and let w be in Σ*. Then w is accepted by M iff δ({q0}, w)
contains at least one state in F.
Let M = (Q, Σ, δ,q0,F) be an NFA. Then the language accepted by M is the set: L(M) = {w | w is in Σ* and δ({q0},w) contains at least one state in F}
Another equivalent definition:
L(M) = {w | w is in Σ* and w is accepted by M}
Equivalence of DFAs and NFAs
TOC notes by S P CHARI
Do DFAs and NFAs accept the same class of languages?
Is there a language L that is accepted by a DFA, but not by any NFA? Is there a language L that is accepted by an NFA, but not by any DFA?
Observation: Every DFA is an NFA, DFA is only restricted NFA.
Therefore, if L is a regular language then there exists an NFA M such that L = L(M). It follows that NFAs accept all regular languages.
But do NFAs accept more?
TOC notes by S P CHARI
Consider the following DFA: 2 or more c’s
Q = {q0, q1, q2}
Σ = {a, b, c} Start state is q0 F = {q2}
δ:
a
q0
b q0
c
q1
q0
q1
q2
q1 | q1 | q2 |
q2 | q2 | q2 |
| | |
q1
q0
q2
a
b
a
b
c
c
a/b/c
An Equivalent NFA:
TOC notes by S P CHARI
Q = {q0, q1, q2}
Σ = {a, b, c} Start state is q0 F = {q2}
δ:
a | b | c |
{q0} | {q0} | {q1} |
q0
q1
q2
{q1} | {q1} {q | 2} |
{q2} | {q2} {q | 2} |
| | |
q1
q0
q2
a
b
a
b
c
c
a/b/c
TOC notes by S P CHARI
Lemma 1: Let M be an DFA. Then there exists a NFA M’ such that L(M) = L(M’).
Proof: Every DFA is an NFA. Hence, if we let M’ = M, then it follows that L(M’) = L(M). The above is just a formal statement of the observation from the previous slide.
TOC notes by S P CHARI
Lemma 2: Let M be an NFA. Then there exists a DFA M’ such that L(M) = L(M’).
Proof: (sketch)
Let M = (Q, Σ, δ,q0,F).
Define a DFA M’ = (Q’, Σ, δ’,q’0,F’) as:
Q’ = 2Q
= {Q0, Q1,…,}
Each state in M’ corresponds to a
subset of states from M
where Qu = [qi0, qi1,…qij]
F’ = {Qu | Qu contains at least one state in F} q’0 = [q0]
δ’(Qu, a) = Qv iff δ(Qu, a) = Qv
TOC notes by S P CHARI
Example: empty string or start and end with 0
Q = {q0, q1}
Σ = {0, 1}
Start state is q0
F = {q0}
δ:
0
1
q0
q
1 {q1} | {} |
{q0, q1} | {q1} |
q1
q0
0
0/1
0
Example of creating a DFA out of an NFA (as per the constructive proof):
TOC notes by S P CHARI
δ for DFA:
0
->q0
1
q1
q0 -->q0
q1
0/1
0
0
[q1]{q1}
[ ] write as [q1]
{}
write as [ ]
TOC notes by S P CHARI
δ:
0
1
->q0
[q1 [ ]
[q01
q1
q0
Example of creating a DFA out of an NFA (as per the constructive proof):
0/1
0
0
]{q1} write as [q1] | {} |
{q ,q } ]w 0 e1 s rit a [q01] | {q1} |
TOC notes by S P CHARI
δ:
0
1
->q0
[q1 [ ]
[q01
q1
q0
Example of creating a DFA out of an NFA (as per the constructive proof):
0/1
0
0
]{q1} write as [q1] | {} |
{q ,q } ]w 0 e1 s rit a [q01] | {q1} |
[ ] | [ ] |
TOC notes by S P CHARI
δ:
0
1
->q0
[q1 [ ]
[q01
q1
q0
Example of creating a DFA out of an NFA (as per the constructive proof):
0/1
0
0
]{q1} write as [q1] | {} |
{q ,q } ]w 0 e1 s rit a [q01] | {q1} |
[ ] | [ ] |
[q01] | [q1] |
TOC notes by S P CHARI
Construct DFA M’ as follows:
δ({q0}, 0) = {q1}
δ({q0}, 1) = {}
δ({q1}, 0) = {q0, q1} =>
δ({q1}, 1) = {q1}
δ({q0, q1}, 0) = {q0, q1}
δ({q0, q1}, 1) = {q1} =>
δ({}, 0) = {}
δ({}, 1) = {}
=> δ’([q0], 0) = [q1]
=> δ’([q0], 1) = [ ]
δ’([q1], 0) = [q0q1]
=> δ’([q1], 1) = [q1]
=> δ’([q0q1], 0) = [q0q1]
δ’([q0q1], 1) = [q1]
=>
=>
δ’([ ], 0) = [ ]
δ’([ ], 1) = [ ]
[ ]
1
0
1
[q1]
0/1
[q0]
0
1
[q0q1]
0
TOC notes by S P CHARI
Theorem: Let L be a language. Then there exists an DFA M such that L = L(M) iff there
exists an NFA M’ such that L = L(M’).
Proof:
(if) Suppose there exists an NFA M’ such that L = L(M’). Then by Lemma 2 there
exists an DFA M such that L = L(M).
(only if) Suppose there exists an DFA M such that L = L(M). Then by Lemma 1
there exists an NFA M’ such that L = L(M’).
Corollary: The NFAs define the regular languages.
TOC notes by S P CHARI
Note: Suppose R = {}
δ(R, 0)
= δ(δ(R, ε), 0)
= δ(R, 0)
= δ(q, 0)
= ∪{}
q∈R
Since R = {}
Exercise - Convert the following NFA to a DFA:
Q = {q0, q1, q2}
Σ = {0, 1}
Start state is q0
F = {q0}
δ: 0
q0
1
q1
q
2
{q0, q1} | { } |
{q1} | {q2} |
{q2} | {q2} |
Problem: Third symbol from last is 1
TOC notes by S P CHARI
0/1
q1
q0
q3
1
0/1
q2
0/1
Now, can you convert this NFA to a DFA?
NFAs with ε Moves
TOC notes by S P CHARI
An NFA-ε is a five-tuple:
M = (Q, Σ, δ, q0, F)
Q Σ
q0
F
δ
A finite set of states
A finite input alphabet
The initial/starting state, q0 is in Q
A set of final/accepting states, which is a subset of Q
A transition function, which is a total function from Q x Σ U {ε} to 2Q
δ: (Q x (Σ U {ε})) –> 2Q
δ(q,s)
-The set of all states p such that there is a transition labeled a from q to p, where a is in Σ U {ε}
Sometimes referred to as an NFA-ε other times, simply as an NFA.
Example:
TOC notes by S P CHARI
δ:
0
1
ε
q0
- A string w = w1w2…wn is processed
as w = ε*w1ε*w2ε* … ε*wnε*
q
q
Example: all computations on 00:
0 ε 0
q0 q0 q1 q2
:
q
1 {q0} | { } | {q1} - |
2 {q , q } 1 2 | {q0, q3} | {q2} |
3 {q } 2 | {q2} | { } |
{ } | { } | { } |
q0
ε
0/1
q2
1
0
q1
0
q3
ε
0
1
Informal Definitions
Let M = (Q, Σ, δ,q0,F) be an NFA-ε.
A String w in Σ* is accepted by M iff there exists a path in M from q0 to a state in F labeled by w and zero or more ε transitions.
The language accepted by M is the set of all strings from Σ* that are accepted by M.
TOC notes by S P CHARI
ε-closure
TOC notes by S P CHARI
Define ε-closure(q) to denote the set of all states reachable from q by zero or more ε transitions.
Examples: (for the previous NFA)
ε-closure(q0) = {q0, q1, q2}
ε-closure(q1) = {q1, q2}
ε-closure(q2) = {q2}
ε-closure(q3) = {q3}
ε-closure(q) can be extended to sets of states by defining:
ε-closure(P) = ε-closure(q)
Examples:
1 2 1 2
ε-closure({q , q }) = {q , q }
ε-closure({q0, q3}) = {q0, q1, q2, q3}
∪
q∈P
q0
0/1
q2
ε
1
0
q1
0
q3
ε
0
1
Extension of δ to Strings and Sets of States
TOC notes by S P CHARI
What we currently have:
δ : (Q x (Σ U {ε})) –> 2Q
What we want (why?): δ : (2Q x Σ*) –> 2Q
As before, we will do this in two steps, which will be slightly different from the book, and we will make use of the following NFA.
q0
ε
0/1
q2
1
0
q1
0
ε
0
q3
1
TOC notes by S P CHARI
Step #1:
Given δ: (Q x (Σ U {ε})) –> 2Q define δ#: (2Q x (Σ U {ε})) –> 2Q as follows:
1) δ#(R, a) = δ(q, a) for all subsets R of Q, and symbols a in Σ U {ε}
Note that:
∪
q∈R
δ#({p},a) = δ(q, a) by definition of δ#, rule #1 above
= δ(p, a)
Hence, we can use δ for δ
#
δ({q0, q2}, 0)
δ({q0, q1, q2}, 0)
These now make sense, but previously
they did not.
∪
q∈{ p}
Examples:
TOC notes by S P CHARI
What is δ({q0 , q1, q2}, 1)?
δ({q0 , q1, q2}, 1) = δ(q0, 1) U δ(q1, 1) U δ(q2, 1)
= { } U {q0, q3} U {q2}
= {q0, q2, q3}
What is δ({q0, q1}, 0)?
δ({q0 , q1}, 0)
= δ(q0, 0) U δ(q1, 0)
= {q0} U {q1, q2}
= {q0, q1, q2}
Step #2:
TOC notes by S P CHARI
Given δ: (2Q x (Σ U {ε})) –> 2Q define δ^: (2Q x Σ*) –> 2Q as follows:
δ^(R,w) – The set of states M could be in after processing string w, having starting
from any state in R.
Formally:
2) δ^(R, ε) = ε-closure(R) - for any subset R of Q
3) δ^(R,wa) = ε-closure(δ(δ^(R,w), a))
- for any w in Σ*, a in Σ, and subset R of Q
Can we use δ for δ^?
TOC notes by S P CHARI
Consider the following example:
δ({q0}, 0) = {q0}
δ^({q0}, 0) = ε-closure(δ(δ^({q0}, ε), 0))
= ε-closure(δ(ε-closure({q0}), 0))
= ε-closure(δ({q0, q1, q2}, 0))
= ε-closure(δ(q0, 0) U δ(q1, 0) U δ(q2, 0))
= ε-closure({q0} U {q1, q2} U {q2})
By rule #3
By rule #2
By ε-closure
By rule #1
= ε-closure({q0, q1, q2})
= ε-closure({q0}) U ε-closure({q1}) U ε-closure({q2})
= {q0, q1, q2} U {q1, q2} U {q2}
= {q0, q1, q2}
So what is the difference?
δ(q0, 0) - Processes 0 as a single symbol, without ε transitions.
δ^(q0 , 0) - Processes 0 using as many ε transitions as are possible.
Example:
TOC notes by S P CHARI
δ^({q0}, 01) = ε-closure(δ(δ^({q0}, 0), 1))
= ε-closure(δ({q0, q1, q2}), 1)
= ε-closure(δ(q0, 1) U δ(q1, 1) U δ(q2, 1))
= ε-closure({ } U {q0, q3} U {q2})
= ε-closure({q0, q2, q3})
By rule #3 Previous slide
By rule #1
= ε-closure({q0}) U ε-closure({q2}) U ε-closure({q3})
= {q0, q1, q2} U {q2} U {q3}
= {q0, q1, q2, q3}
Definitions for NFA-ε Machines
Let M = (Q, Σ, δ,q0,F) be an NFA-ε and let w be in Σ*. Then w is accepted by M iff δ^({q0},
w) contains at least one state in F.
Let M = (Q, Σ, δ,q0,F) be an NFA-ε. Then the language accepted by M is the set: L(M) = {w | w is in Σ* and δ^({q0},w) contains at least one state in F}
Another equivalent definition:
L(M) = {w | w is in Σ* and w is accepted by M}
TOC notes by S P CHARI
Equivalence of NFAs and NFA-εs
TOC notes by S P CHARI
Do NFAs and NFA-ε machines accept the same class of languages?
Is there a language L that is accepted by a NFA, but not by any NFA-ε? Is there a language L that is accepted by an NFA-ε, but not by any DFA?
Observation: Every NFA is an NFA-ε.
Therefore, if L is a regular language then there exists an NFA-ε M such that L = L(M). It follows that NFA-ε machines accept all regular languages.
But do NFA-ε machines accept more?
TOC notes by S P CHARI
Lemma 1: Let M be an NFA. Then there exists a NFA-ε M’ such that L(M) = L(M’).
Proof: Every NFA is an NFA-ε. Hence, if we let M’ = M, then it follows that L(M’) = L(M).
The above is just a formal statement of the observation from the previous slide.
Lemma 2: Let M be an NFA-ε. Then there exists a NFA M’ such that L(M) = L(M’).
TOC notes by S P CHARI
Proof: (sketch)
Let M = (Q, Σ, δ,q0,F) be an NFA-ε. Define an NFA M’ = (Q, Σ, δ’,q0,F’) as:
F’ = F U {q} if ε-closure(q) contains at least one state from F
F’ = F otherwise
δ’(q, a) = δ^(q, a) - for all q in Q and a in Σ
Notes:
δ’: (Q x Σ) –> 2Q is a function
M’ has the same state set, the same alphabet, and the same start state as M
M’ has no ε transitions
Example:
TOC notes by S P CHARI
Step #1:
Same state set as M
q0 is the starting state
q0
ε
0/1
q2
1
0
q1
0
ε
0
q3
1
q2
q1
q3
q0
Example:
TOC notes by S P CHARI
Step #2:
q0 becomes a final state
q0
ε
0/1
q2
1
0
q1
0
ε
0
q3
1
q2
q1
q3
q0
Example:
TOC notes by S P CHARI
Step #3:
q0
ε
0/1
q2
1
0
q1
0
ε
0
q3
1
q2
q1
q3
q0
0
0
0
Example:
TOC notes by S P CHARI
Step #4:
q0
ε
0/1
q2
1
0
q1
0
ε
0
q3
1
q2
q1
q3
q0
0/1
0/1
0/1
1
Example:
TOC notes by S P CHARI
Step #5:
q0
ε
0/1
q2
1
0
q1
0
ε
0
q3
1
q2
q1
q3
q0
0/1
0/1
0/1
1
0
0
Example:
TOC notes by S P CHARI
Step #6:
q0
ε
0/1
q2
1
0
q1
0
ε
0
q3
1
q2
q1
q3
q0
0/1
0/1
1
0/1
0/1
0/1
1
1
Example:
TOC notes by S P CHARI
Step #7:
q0
ε
0/1
q2
1
0
q1
0
ε
0
q3
1
q2
q3
q0
0/1
0/1
1
0/1
0/1
0/1
1
1
0
q1
Example:
TOC notes by S P CHARI
Step #8:
Done!
[use table of e-closure]
q0
ε
0/1
q2
1
0
q1
0
ε
0
q3
1
q2
q1
q3
q0
0/1
0/1
1
0/1
0/1
0/1
1
1
0/1
TOC notes by S P CHARI
Theorem: Let L be a language. Then there exists an NFA M such that L= L(M) iff there
exists an NFA-ε M’ such that L = L(M’).
Proof:
(if) Suppose there exists an NFA-ε M’ such that L = L(M’). Then by Lemma 2
there exists an NFA M such that L = L(M).
(only if) Suppose there exists an NFA M such that L = L(M). Then by Lemma 1
there exists an NFA-ε M’ such that L = L(M’).
Corollary: The NFA-ε machines define the regular languages.
UNIT-2�REGULAR EXPRESSIONS
TOC notes by S P CHARI
Introduction
TOC notes by S P CHARI
Regular Expressions
TOC notes by S P CHARI
Regular Expressions
Identity Rules :
Let P,Q and R are the regular expressions then the identity rules are as follows:
Regular Expressions
TOC notes by S P CHARI
Solution:
The R.E. can be categorized into two subparts.
R = L1 + L2
L1 = The String which begin with 00 or 11. L2 = The String which end with 00 or 11.
Let us find out L1 and L2.
L1 = (00+11)( any number of 0’s and 1’s ) L1 = (00+11)(0+1)*
L2 = ( any number of 0’s and 1’s ) (00+11) L2 = (0+1)*(00+11)
Hence
R = [(00+11)(0+1)*] + [(0+1)*(00+11)]
Algorithm to build R.E. from given DFA
TOC notes by S P CHARI
If qi is a start state
qi = αji .qj +Ꜫ
Regular Expressions
TOC notes by S P CHARI
Solution:
Q0
Q1
1
0
0
1
The regular expression for the above F.A. is 0*10*1
DFA to RE via State
TOC notes by S P CHARI
1E.limStairntiangtiowinth (1int)ermediate states and then moving to accepting states, apply the state elimination process to produce an equivalent automaton with regular expression labels on the edges.
accepting state.
DFA to RE State
TOC notes by S P CHARI
Elimin2a. tiIof tnhe(t2w)o states are different, we will have an automaton that looks like the following:
Start
S
R
T
We can describe this automaton as: (R+SU*T)*SU*
U
DFA to RE State
TOC notes by S P CHARI
Elimin3a. tIfiothne s(t3ar)t state is also an accepting state, then we must also perform a state elimination from the original automaton that gets rid of every state but the start state. This leaves the following:
R
Start
We can describe this automaton as simply R*.
DFA to RE State Elimination (4)
TOC notes by S P CHARI
4. If there are n accepting states, we must repeat the above steps for each accepting states to get n different regular expressions, R1, R2,
… Rn.
5. For each
repeat we turn any other accepting state to non-
accepting.
6. The desired regular expression for the automaton is then the union of each of
the n regular expressions: R1∪ R2… ∪ RN
DFA🡪RE Example
TOC notes by S P CHARI
0
3
Start
1
2
1
1
0
0
0,1
3
Start
1
2
1
1
0
0+1
DFA 🡪 RE Example (2)
TOC notes by S P CHARI
Start
0
0
3
1
2
1
1
0+1
3
Start
2
11
0+10
0+1
Note edge from 3🡪3
Answer: (0+10)*11(0+1)*
Second Example
TOC notes by S P CHARI
1
Start
2
3
1
1
0
1
0 0
0+10*1
1
Start
3
0
10*1
Second Example (2)
Start
1 3
0
0+10*1
10*1
0
0+10*1
10*1
3
1
Start
This is just 0*; can ignoreTgOoC innotgestboy Psatlalatvei Jo3shsi ince we would “die”
Second Example (3)
1
Start
3
0
0+10*1
10*1
This is just 0*10*1(0+10*1)*
Combine from previous slide to get
0* + 0*10*1(0+10*1)*
0+10*1
1
Start
3
0
10*1
TOC notes by S P CHARI
Converting a RE to an Automata
TOC notes by S P CHARI
RE to ε-NFA
TOC notes by S P CHARI
R=a
R=ε
ε
R=Ø
a
Next slide: More complex RE’s
TOC notes by S P CHARI
R=S+T
S
T
ε
ε
ε
ε
R=ST
S
T
ε
R=S*
ε
S
ε
ε
ε
RE to ε-NFA Example
TOC notes by S P CHARI
a
a
b
b
ab
a
b
ε
RE to ε-NFA Example (2)
ab+a
(ab+a)*
a
b
ε
a
ε
ε
ε
ε
a
b
ε
a
ε
ε
ε
ε
ε
ε
ε
ε
TOC notes by S P CHARI
CONTEXT FREE GRAMMAR
TOC notes by S P CHARI
Prerequisite
TOC notes by S P CHARI
TOC notes by S P CHARI
DEFINITION
V- set of non-terminals Ex: { P,Q,R…} T- set of Terminals Ex: {p,q,r …}
P- set of Production rules
Where non-terminal -> non-terminal
non-terminal -> terminal
S- is start symbol.
Where P = { S->0S
TOC notes by S P CHARI
EXAMPLE
S->1S S->Ꜫ }
TOC notes by S P CHARI
Example1 : Construct CFG which consists of all the strings having atleast one occurrence of ‘000’ over input {0,1}.
We need to construct the equivalent R.E for the given problem
first.
The R.E=(any number of 0’s or 1’s) 000 (any number of 0’s or 1’s)
Thus R.E = (0+1)*000(0+1)*
Now let us build the production rules to satisfy the above R.E
{ S->ATA
A->0A|1A|Ꜫ T->000 }
Excercise1: Find the CFG for the odd length of strings in{a,b}* with middle symbol a
Excercise2: Generate the a CFG to obtain balanced set of parenthesis(i.e. every left parenthesis should match with corresponding right parenthesis)
TOC notes by S P CHARI
Derivation Trees
TOC notes by S P CHARI
Example2: The CFG is given as S->bSb | a | b is a production rule. The S is a start symbol.
TOC notes by S P CHARI
The Derivation tree for deriving a string bbabb as follows.
S
b S b
b S b a
By Simply reading the leaf nodes we can obtain the desired string.
TOC notes by S P CHARI
Exercise : Construct the derivation tree for the string ‘aabbabba’ from the CFG given by
S->aB|bA
A->a|aS|bAA B->b|bS|aBB
Difference between Regular Grammar and
Context Free Gram
mar
S.No. | REGULAR GRAMMAR | CONTEXT FREE GRAMMAR |
1 | Regular grammar is Defined as G=(V,T,P,S) where V- set of Non-Terminals, T- set of Terminals, P- set of Production rules, S- start Symbol Hence G be a regular Grammar. | The Context free grammar is defined as follows G=(V,T,P,S) where V- set of Non-Terminals, T- set of Terminals, P- set of Production rules, S- start Symbol Hence G be a CFG. |
2 | Using this grammar we can draw DFA | Using this we can construct Push Down Automata(PDA) |
3 | Compiler can not make use of this for Parsing. | Compiler makes use of this for Parsing. |
4 | The grammar accepts small set of languages. Hence it is type-3 grammar. | The grammar accepts a large set of languages. Hence it is type-2 grammar. |
5 | This form of grammar can be represented by left linear or right linear grammar | This form of grammar can be represented as nonterminal-> set of terminals and non terminals. |
6 | Every regular grammar is context free grammar. | Every Context Free Grammar can not be Regular |
7 | For Ex: S->aA|B A->aA|a B->bB|b | For Example: S->aSb|bSb|Ꜫ |
8 | Syntax of any programming language can not be represented completely by regularTOgCrnaomtesmbyaPrallavi Jos | Syntax of any programming language can be represented hicompletely by regular grammar |
MINIMIZATION OF CFG
TOC notes by S P CHARI
The Reduction of Grammar as follows
TOC notes by S P CHARI
Removal of Useless
symbols
Elimination of
Ꜫ Productions
Removal of unit
production
Removal of useless symbols
TOC notes by S P CHARI
such derivation exists then it is supposed to be a useless symbol.
For Example:
G= (V,T,P,S) where V={S,T,X} T={0,1} S->0T|1T|X|0|1
T->00
In the above grammar S->X there is no further rule as a definition to X. Hence we can declare X is a useless symbol. Therefore we can remove the S-> X production from the given grammar.
After removal of useless symbol form the given grammar the Grammar should be as follows.
G= (V,T,P,S) where V={S,T} T={0,1}
S->0T|1T|0|1
T->00
Elimination of Ꜫ production
TOC notes by S P CHARI
Example:
S->0S|1S| Ꜫ
from the above Grammar we can remove Ꜫ production without changing the meaning of it.
Simply if we place S-> Ꜫ in other rules.
we get S->0 and S->1 Hence we can rewrite the productions as follows
S->0S|1S|0|1
Thus we can remove Ꜫ production from the given grammar
Example2: For the CFG given below remove Ꜫ production S->aSa
S->bSb
S-> Ꜫ
According to the replacement procedure we will place Ꜫ at S in the sentential form.
S->aSa if S= Ꜫ S->aa
Similarly if
S->bSb if S= Ꜫ S->bb
Thus finally rules are
S->aSa|bSb|aa|bb
Thus we can remove Ꜫ prodTOuC ncottesioby nPallafvri Joosmhi the given grammar.
Excercise1: Eliminate Ꜫ productions from the CFG given below A->0B1|1B1
B->0B|1B|Ꜫ
TOC notes by S P CHARI
Exercise 2: Construct the CFG without Ꜫ productions form the given Grammar.
S->a|Ab|aBa A->b|Ꜫ
B->b|A
Removal of Unit Productions
S->0A|1B|01
For example X->Y, Y->Z, Z->X are the unit productions.
Example 1: if CFG is as below
S->0A|1B|C A->0S|00
B->1|A
C->01 then remove the unit productions.
Solution:
Clearly S->C is a unit production. But while removing it we have to consider what C gives. So, we add a rule to S.
TOC notes by S P CHARI
Example 1(2):
TOC notes by S P CHARI
B->1|0S|00
Thus finally we can write CFG without unit productions as
S->0A|1B|01
A->0S|00
B->1|0S|00
C->01
TOC notes by S P CHARI
Exercise: Eliminate the unit productions fro the Grammar S->AB
A->a
B->C|b C->D
D->E|bC
E->d|Ab
Solution:
Thus the R.E = (a+b)n a (a+b)n Where n>=0
As the above given expression doesn’t represent regular language, The DFA can
not be designed for representing such language.
The CFG G= (V,T,P,S)
Where
V={S} T={a,b}
P={ S->aSa|bSb|aSb|bSa|a} S is Start Symbol.
Back
a or b
a or b
a
Length n
Length n
TOC notes by S P CHARI
Solution:
TOC notes by S P CHARI
Back
The well formed parenthesis means whenever opening parentheses is present there should be a closing parentheses associated with it.
Any expression must begin with opening bracket .We will consider T={(,[,),]} as set of brackets.
The CFG G=(V,T,P,S)
Where
V={S} T={(,[,),]}
P={ S->SS|()|[]|(S)|[S]
Where S is Start Symbol.
Solution :
TOC notes by S P CHARI
obtain derivation for the string aabbabba.
S
aB
a aBB aa bS B
aab bA B aabba B aabba bS aabbab bA aabbabba
S->aB B->aBB B->bS S->bA A->a
B->bS S->bA
A->a
S
a B
a B B
b S b S
b A b A
a
a
Back
Solution:
TOC notes by S P CHARI
Back
B->aa
But if we look at start symbol S then S->aS|A|C
There is no production for B from start symbol. Thus B become a useless symbol.
Similarly, S->C->aCb->aaCbb->….
There is no terminating symbol for C, Therefore we will eliminate C. Then the set of rules are
S->aS|A A->a
TOC notes by S P CHARI
Back
•SCoonlsuidetr ailol thne p:roductions that are giving terminal symbols
S->a
B->a
D->ddd
Now Consider
S->cC
C->cCD
D->ddd.
We will try to derive the string using production rule C we get ,
S->cC->ccCddd->cccCDddd->….
We will not get any terminal symbol for C. Thus we can eliminate C production . To reach D , Cis the only
production, As C is eliminated there is no point of D production. Hence , D is also eliminated.
Therefore,
S->aA|a|Bb
A->aB
B->a|Aa
Is the reduced Grammar.
Solution:
A->0B1 if B=Ꜫ
A->01
Similarly,
A->1B1->11
A->0B1|1B1|01|11
Similarly,
B->0B->0 B->1B->1
B->0B|1B |0|1
Collectively we can write
A->0B1|1B1|01|11 B->0B|1B|0|1
Back
TOC notes by S P CHARI
Solution:
There is only one rule with A which is giving terminal symbol. Hence there is no Unit production with A.
Now Consider,
B->C
C->D
D->E
It is clear that B,C and D are Unit Productions. As E->d|Ab, we will replace value of D. Then
D->d|Ab|bC
Similarly, a C->D we can write
C->d|Ab|bC
Hence B->d|Ab|bC|b
Thus the grammar after removing unit productions
S->AB
A->a
B->d|Ab|bC|b
C->d|Ab|bC
Back
TOC notes by S P CHARI
UNIT-3�PUSH DOWN AUTOMATA
TOC notes by S P CHARI
Prerequisite
TOC notes by S P CHARI
Push Down Automata
TOC notes by S P CHARI
TOS
Z0
Stack
A | A | B | B | $ | Δ | Δ | Δ |
Finite Control
Definition of a PDA
TOC notes by S P CHARI
Q Σ
A finite set of states
A finite input alphabet Г A finite
stack alphabet
q0 The initial/starting state, q0 is in Q
z0 A starting stack symbol, is in Г // need not always remain at the bottom of stack
F
δ
A set of final/accepting states, which is a subset of Q
A transition function, where
For DPDA For NDPDA
δ: Q x (Σ U {ε}) x Г –> finite subsets of Q x Г*
δ: Q x (Σ U {ε}) x Г –> finite subsets of 2Q x Г*
Push Down Automata
TOC notes by S P CHARI
Example: Design a PDA for accepting the Language L= {anbn|n>=1}.
Solution:
The above given language is in which equal number of a’s are followed by equal number of b’s.
Logic for PDA:
PDA
TOC notes by S P CHARI
Q0
Q1
Q2
(b,a/Ꜫ)
(a,a/aa)
(a,z0/az0)
(b,a/Ꜫ)
($, z0/z0)
Example(2)
TOC notes by S P CHARI
0 0 0 0
δ(q ,a,z )=(q ,az )
δ(q0,a,a)=(q0,aa)
δ(q0,b,a)=(q1,Ꜫ)
δ(q1,b,a)=(q1,Ꜫ)
δ(q1,$, z0)=(q2,Ꜫ)
Where q0 is start state and q2 is accept state.
δ(q0,aaabbb,z0) Ͱ(q0,aabbb$,az0)
0 0
Ͱ(q ,abbb$,aaz )
Ͱ(q0,bbb$,aaaz0) Ͱ(q1,bb$,aaz0) Ͱ(q1,b$,az0)
Ͱ(q1,$, z0)
Ͱ(q2, Ꜫ)
Accept State
Hence the string is accepted by the PDA.
Excercise1: Design a PDA that accept a string of well formed parenthesis. Consider the parenthesis is a (,),[,],{,}.
TOC notes by S P CHARI
Excercise2: Construct PDA for the language L={anb2n|n>=1}.
Conversion of CFG to PDA
TOC notes by S P CHARI
δ(q, Ꜫ,A)=(q,α)
Where the production rule is A-> α
δ(q, a,a)=(q, Ꜫ) for every terminal symbol.
Example: Construct PDA for the given CFG
S->0BB
TOC notes by S P CHARI
B->0S|1S|0 and test whether 0104 is accepted by this PDA
A={{q},{0,1},{S,B,0,1}, δ,q,S,F}
The Production rules δ can be written as:
R1: δ(q, Ꜫ,S) = {(q,0BB)}
R2: δ(q, Ꜫ,B) = {(q,0S),(q,1S),(q,0)} R3: δ(q, 0,0) = {(q, Ꜫ)}
R4: δ(q, 1,1) = {(q, Ꜫ)}
Example(2):
TOC notes by S P CHARI
δ(q,010000,S) Ͱ(q,010000,0BB)
Ͱ(q,10000,BB)
Ͱ(q,10000,1SB) Ͱ(q,0000,SB) Ͱ(q,0000,0BBB) Ͱ(q,000,BBB)
Ͱ(q,000,0BB)
Ͱ(q,00,BB)
Ͱ(q,00,0B)
Ͱ(q,0, B)
Ͱ(q,0,0)
Ͱ(q, Ꜫ)
ACCEPT
Since R1 Since R3 Since R2
Since R4
Since R1
Since R3
Since R2 Since R3 Since R2 Since R3 Since R2 Since R3
Excercise1: Construct the PDA for the given CFG.
S->0A
TOC notes by S P CHARI
A->0AB|1 B->1
Excercise1: Construct the PDA for the given CFG.
S->aSA|bSB|a|b A->a
B->b
Construction of CFG from PDA
S->[q0,Z0,q]
Where q indicates the next state an q0 is a start state.
Rule2: If there exists a move of PDA
δ(q,a,Z)={(q’, Ꜫ)}
then the production rule can be written as:
δ(q,Z,q’)->a
Rule3: If there exists a move of PDA as
δ(q,a,Z)={(q’,Z1,Z2,…..Zn)}
Then the production rule of CFG can be written as
δ(q,a,Z)->a[q
1 1, 2 2 2, 3 3 3, 4 m m, m+1
,Z q ] [q ,Z q ] [q ,Z q ]….. [q ,Z q ].
TOC notes by S P CHARI
Example: The PDA is given below A=({q0, q1 },{0,1},{S,A},δ, q0, S,φ)
TOC notes by S P CHARI
Where δ is given as follows:
δ(q0, 1,S)={(q0, AS)}
δ(q0, Ꜫ,S)={(q0, Ꜫ)}
δ(q0, 1,A)={(q0, AA)}
δ(q0, 0,A)={(q1, A)}
δ(q1, 1,A)={(q1, Ꜫ)}
δ(q1, 0,S)={(q0, S)}
Construct the CFG equivalent to this PDA.
Example(2):
TOC notes by S P CHARI
Let we will construct a CFG
G=(V,T,P,S)
Here, T={0,1}, V={S U [q0,A,q0 ],[q0,A,q1 ],[q0,S,q0 ],[q0,S,q1 ],
[q1,A,q1 ], [q1,A,q0 ], [q1,S,q1 ], [q1,S,q0 ]}
Now let up build the production rules as:
Using rule1 from the algorithm P1: S->[q0,S,q0 ]
P2: S->[q0,S,q1 ]
Using rule3 of algorithm for the δ(q0, 1,S)={(q0, AS)} we get, P3: [q0,S,q0 ]->1[q0,A,q0 ][q0,S,q0 ]
P4: [q0,S,q0 ]->1[q0,A,q1 ][q1,S,q0 ]
P5: [q0,S,q1 ]->1[q0,A,q0 ][q0,S,q1 ]
P6: [q0,S,q1 ]->1[q0,A,q1 ][q1,S,q1 ]
Example(3
)N:ow for δ(q0, Ꜫ,S)={(q0, Ꜫ)} using rule2 of algorithm we get
P7: [q0,S,q0 ]-> Ꜫ
Similarly for δ(q0, 1,A)={(q0, AA)} Using rule3 we get
P8:[q0,A,q0 ]->1[q0,A,q0 ][q0,A,q0 ]
P9:[q0,A,q0 ]->1[q0,A,q1 ][q1,A,q0 ]
P10:[q0,A,q1 ]->1[q0,A,q0 ][q0,A,q1 ]
P11:[q0,A,q1 ]->1[q0,A,q1 ][q1,A,q1 ]
Similarly for δ(q0, 0,A)={(q1, A)} gives
P12:[q0,A,q0 ]->0[q1,A,q0 ]
P13:[q0,A,q1 ]->0[q1,A,q1 ]
δ(q1, 1,A)={(q1, Ꜫ)} Gives
P14:[q1,A,q1 ]->1
δ(q1, 0,S)={(q0, S)}
P15: [q1,S,q0 ]->0[q0,S,q0 ]
P16:[q1,S,q1 ]->0[q0,S,q1TO]C notes by Pallavi Joshi
Exercise: Find the CFG corresponding to PDA whose transition mapping is as follows:
δ(S, a , X) = (s, AX)
δ(S, b , A) = (s, AA)
δ(S, a , A) = (s, Ꜫ)
TOC notes by S P CHARI
TOC notes by S P CHARI
BACK
Given PDA:
δ(S, a , X) = (s, AX)
δ(S, b , A) = (s, AA)
δ(S, a , A) = (s, Ꜫ)
Solution: Now we will apply conversion algorithm for each transition and obtain
the production rules:
For δ(S, a , X) = (s, AX)
P1: (S,X,s)->a(S A S)(S X s ) |a(S A s)(s X s )
P2: (S,X,S)->a(S A S)(S X S ) | a(S A s)(s X S )
For δ(S, b , A) = (s, AA)
P3: (S, A, s)->b(S A S)(S A s ) | b(S A s)(s A s)
P4: (S, A, S)->b(S A S)(S A S) | b(S A s)(s A S)
For δ(S, a , A) = (s, Ꜫ)
P5: (S, A, s)-> a
Hence , P1,P2,P3,P4 and P5 are productions in CFG.
TOC notes by S P CHARI
BACK
Given PDA:
δ(S, a , X) = (s, AX)
δ(S, b , A) = (s, AA)
δ(S, a , A) = (s, Ꜫ)
Solution: Now we will apply conversion algorithm for each transition and obtain
the production rules:
For δ(S, a , X) = (s, AX)
P1: (S,X,s)->a(S A S)(S X s ) |a(S A s)(s X s )
P2: (S,X,S)->a(S A S)(S X S ) | a(S A s)(s X S )
For δ(S, b , A) = (s, AA)
P3: (S, A, s)->b(S A S)(S A s ) | b(S A s)(s A s)
P4: (S, A, S)->b(S A S)(S A S) | b(S A s)(s A S)
For δ(S, a , A) = (s, Ꜫ)
P5: (S, A, s)-> a
Hence , P1,P2,P3,P4 and P5 are productions in CFG.
TURING MACHINE
TOC notes by S P CHARI
Turing Machines (TM)
TOC notes by S P CHARI
Generalize the class of CFLs:
Recursively Enumerable Languages
Recursive Languages Context-Free Languages
Regular Languages
Non-Recursively Enumerable Languages
Another Part of the Hierarchy:
TOC notes by S P CHARI
Recursively Enumerable Languages
Recursive Languages Context-Sensitive Languages Context-Free Languages - ε
Regular Languages - ε
Non-Recursively Enumerable Languages
TOC notes by S P CHARI
Recursively enumerable languages are also known as type 0 languages.
Context-sensitive languages are also known as type 1 languages. Context-free languages are also known as type 2 languages.
Regular languages are also known as type 3 languages.
TMs model the computing capability of a general purpose computer, which informally can be described as:
Effective procedure
TOC notes by S P CHARI
Finitely describable
Well defined, discrete, “mechanical” steps
Always terminates
Computable function
A function computable by an effective procedure
TMs formalize the above notion.
Church-Turing Thesis: There is an effective procedure for solving a problem if and only if there is a TM that halts for all inputs and solves the problem.
There are many other computing models, but all are equivalent to or subsumed by TMs. There is no more powerful
machine (Technically cannot be proved).
DFAs and PDAs do not model all effective procedures or computable functions, but only a subset.
Deterministic Turing Machine (DTM)
TOC notes by S P CHARI
……..
Two-way, infinite tape, broken into cells, each containing one symbol.
Two-way, read/write tape head.
An input string is placed on the tape, padded to the left and right infinitely with blanks, read/write head is positioned at the left end of input string.
Finite control, i.e., a program, containing the position of the read head, current symbol being
scanned, and the current state.
In one move, depending on the current state and the current symbol being scanned, the TM 1) changes state, 2) prints a symbol over the cell being scanned, and 3) moves its’ tape head one cell left or right.
Many modifications possible, but Church-Turing declares equivalence of all.
Finite Control
B | B | 0 | 1 | 1 | 0 | ……0.. | B | B |
Formal Definition of a DTM
TOC notes by S P CHARI
A DTM is a seven-tuple:
M = (Q, Σ, Γ, δ, q0, B, F)
Q Σ Γ B
q0
F
δ
A finite set of states
A finite input alphabet, which is a subset of Γ– {B} A finite tape alphabet, which is a strict superset of Σ A distinguished blank symbol, which is in Γ
The initial/starting state, q0 is in Q
A set of final/accepting states, which is a subset of Q
A next-move function, which is a mapping (i.e., may be undefined) from Q x Γ –> Q x Γ x {L,R}
Intuitively, δ(q,s) specifies the next state, symbol to be written, and the direction of tape head
movement by M after reading symbol s while in state q.
TOC notes by S P CHARI
Example #1: {w | w is in {0,1}* and w ends with a 0}
0
00
10
10110
Not ε
Q = {q0, q1, q2}
Γ = {0, 1, B}
Σ = {0, 1}
F = {q2}
δ:
0
->q0 (q0, 0, R)
1
(q0, 1, R)
B (q1, B, L)
q1 (q2, 0, R)
q2* -
-
-
-
-
q0 is the start state and the “scan right” state, until hits B
q1 is the verify 0 state q2 is the final state
Example #2: {0n1n | n ≥ 1}
0
1
X
B
->q0 (q1, X, R) -
-
Y
(q3, Y, R)0’s finished -
-
- (more 0’s)
q1 (q1, 0, R)ignore1 (q2, Y, L) q2 (q2, 0, L) ignore2 -
(q0, X, R)
-
- (more 1’s)
(q1, Y, R) ignore2 (q2, Y, L) ignore1 (q3, Y, R) ignore
(q4, B, R)
q3 -
q4* -
-
-
-
-
-
Sample Computation: (on 0011),
presume state q looks rightward
q00011BB.. |— Xq1011
|— X0q111
|— Xq20Y1
|— q2X0Y1
|— Xq00Y1
|— XXq1Y1
|— XXYq11
|— XXq2YY
|— Xq2XYY
|— XXq0YY
|— XXYq3Y B…
|— XXYYq3 BB…
|— XXYYBq4
TOC notes by S P CHARI
Making a TM for {0n1n | n >= 1}
Try n=2 or 3 first.
q00011BB.. |— Xq1011
Try n=1 next.
Make sure unbalanced 0’s and 1’s, or mixture of 0-1’s,
“crashes” in a state not q4, as it should be
|— X0q111
|— Xq20Y1
|— q2X0Y1
|— Xq00Y1
|— XXq1Y1
|— XXYq11
|— XXq2YY
|— Xq2XYY
|— XXq0YY
|— XXYq3Y B…
|— XXYYq3 BB…
|— XXYYBq4
TOC notes by S P CHARI
TOC notes by S P CHARI
Same Example #2: {0n1n | n ≥ 1}
Logic:
t to find next
q0 | 0 (q1, X, R) | 1 - | X - | Y (q3, Y, R) | - | B |
q1 | (q1, 0, R) | (q2, Y, L) | - | (q1, Y, R) | - | |
q2 | (q2, 0, L) | - | (q0, X, R) | (q2, Y, L) | - | |
q3 q4 | - - | - - | - - | (q3, Y, R) - | (q4, B, R) | - |
cross 0’s with X’s, scan right to look for corresponding 1, on finding it cross it with Y, and scan lef | ||||||
leftmost 0, keep iterating until no more 0’s, then scan right looking for B.
The TM matches up 0’s and 1’s
q1 is the “scan right” state, looking for 1 q2 is the “scan left” state, looking for X q3 is “scan right”, looking for B
q4 is the final state
Can you extend the machine to include n=0?
How does the input-tape look like for string epsilon?
Other Examples:
00
001
000111
11
011
TOC notes by S P CHARI
Roger Ballard’s TM for Example #2, without any extra Tape Symbol: {0n1n | n ≥ 0}
q0 | 0 (q1, B, R) | 1 | B (q4, B, R) |
q1 | (q1, 0, R) | (q1, 1, R) | (q2, B, L) |
q2 | - | (q3, B, L) | - |
q3 | (q3, 0, L) | (q3, 1, L) | (q0, B, R) |
q4* - - - Keep deleting 0 and corresponding 1 from extreme ends, until none left. | |||
Logic:
q0 deletes a leftmost 0 and let q1 scan through end of string, q0 accepts on epsilon
q1 scans over the string and makes q2 expecting 1 on the left
q2 deletes 1 and let q3 “scan left” looking for the start of current string q3 lets q0 start the next iteration
q4 is the final state
Any bug?
Try on:
00
001
000111
11
011
TOC notes by S P CHARI
And his example of a correct TM for the language that goes on infinite loop outside language: {0n1n | n ≥ 0}
q0 | 0 (q1, B, R) | 1 (q3, 1, L) | B (q4, B, R) |
q1 | (q1, 0, R) | (q1, 1, R) | (q2, B, L) |
q2 | - | (q3, B, L) | - |
q3 | (q3, 0, L) | (q3, 1, L) | (q0, B, R) |
q4* - - - This machine still works correctly for all strings in the language, but | |||
Logic:
start a string with 1 (not in the language), and it loops on B1 for ever.
TOC notes by S P CHARI
Exercises: Construct a DTM for each of the following.
{w | w is in {0,1}* and w ends in 00}
{w | w is in {0,1}* and w contains at least two 0’s}
{w | w is in {0,1}* and w contains at least one 0 and one 1}
Just about anything else (simple) you can think of
Formal Definitions for DTMs
TOC notes by S P CHARI
Let M = (Q, Σ, Г, δ, q0, B, F) be a TM.
Definition: An instantaneous description (ID) is a triple α1qα2, where:
q, the current state, is in Q
α1α2, is in Г*, and is the current tape contents up to the rightmost non-blank symbol, or the symbol to the left of the tape head,
whichever is rightmost
The tape head is currently scanning the first symbol of α2
At the start of a computation α1= ε
If α2= ε then a blank is being scanned
Example: (for TM #1)
q00011 Xq1011
X0q111
Xq20Y1
q2X0Y1
Xq00Y1 XXq1Y1 | XXYq11 | XXq2YY |
XXq0YY XXYq3Y | XXYYq3 | XXYYBq4 |
Xq2XYY
TOC notes by S P CHARI
Suppose the following is the current ID of a DTM
x1x2…xi-1qxixi+1…xn Case 1) δ(q, xi) = (p, y, L)
If any suffix of xi-1yxi+1…xn is blank then it is deleted.
Case 2) δ(q, xi) = (p, y, R)
x1x2…xi-1qxixi+1…xn |— x1x2…xi-1ypxi+1…xn
If i>n then the ID increases in length by 1 symbol
x1x2…xnq |— x1x2…xnyp
TOC notes by S P CHARI
Definition: Let M = (Q, Σ, Г, δ, q0, B, F) be a TM, and let w be a string in Σ*. Then w is accepted by M iff
q0w |—* α1pα2
where p is in F and α1 and α2 are in Г*
Definition: Let M = (Q, Σ, Г, δ, q0, B, F) be a TM. The language accepted by M, denoted L(M), is the set
{w | w is in Σ* and w is accepted by M}
Notes:
In contrast to FA and PDAs, if a TM simply passes through a final state then the string is accepted.
Given the above definition, no final state of a TM need to have any transitions. Henceforth, this is our assumption.
If x is NOT in L(M) then M may enter an infinite loop, or halt in a non-final state.
Some TMs halt on ALL inputs, while others may not. In either case the language defined by TM is still well defined.
TOC notes by S P CHARI
Definition: Let L be a language. Then L is recursively enumerable if there exists a TM M such that L = L(M).
If L is r.e. then L = L(M) for some TM M, and
If x is in L then M halts in a final (accepting) state.
If x is not in L then M may halt in a non-final (non-accepting) state or no transition is available, or loop forever.
Definition: Let L be a language. Then L is recursive if there exists a TM M such that L = L(M) and M halts on all inputs.
If L is recursive then L = L(M) for some TM M, and
If x is in L then M halts in a final (accepting) state.
If x is not in L then M halts in a non-final (non-accepting) state or no transition is available (does not go to infinite loop).
Notes:
The set of all recursive languages is a subset of the set of all recursively enumerable languages
Terminology is easy to confuse: A TM is not recursive or recursively enumerable, rather a language is recursive or recursively
enumerable.
Recall the Hierarchy:
TOC notes by S P CHARI
Recursively Enumerable Languages
Recursive Languages Context-Sensitive Languages
Context-Free Languages - ε
Regular Languages - ε
Non-Recursively Enumerable Languages
TOC notes by S P CHARI
Observation: Let L be an r.e. language. Then there is an infinite list M0, M1, … of TMs such that L = L(Mi).
Question: Let L be a recursive language, and M0, M1, … a list of all TMs such that L = L(Mi), and choose any i>=0. Does Mi always halt?
Answer: Maybe, maybe not, but at least one in the list does.
Question: Let L be a recursive enumerable language, and M0, M1, … a list of all TMs such that L = L(Mi), and
choose any i>=0. Does Mi always halt?
Answer: Maybe, maybe not. Depending on L, none might halt or some may halt.
If L is also recursive then L is recursively enumerable, recursive is subset of r.e.
Question: Let L be a r.e. language that is not recursive (L is in r.e. – r), and M0, M1, … a list of all TMs such that L = L(Mi), and choose any i>=0. Does Mi always halt?
Answer: No! If it did, then L would not be in r.e. – r, it would be recursive.
TOC notes by S P CHARI
L is Recursively enumerable:
TM exist: M0, M1, …
They accept string in L, and do not accept any string outside L
L is Recursive:
at least one TM halts on L and on ∑*-L, others may or may not
L is Recursively enumerable but not Recursive:
TM exist: M0, M1, …
but none halts on all x in ∑*-L
M0 goes on infinite loop on a string p in ∑*-L, while M1 on q in ∑*-L
However, each correct TM accepts each string in L, and none in ∑*-L
L is not R.E:
no TM exists
Let M be a TM.
TOC notes by S P CHARI
Question: Is L(M) r.e.?
Answer: Yes! By definition it is!
Question: Is L(M) recursive?
Answer: Don’t know, we don’t have enough information.
Question: Is L(M) in r.e – r?
Answer: Don’t know, we don’t have enough information.
Let M be a TM that halts on all inputs:
TOC notes by S P CHARI
Question: Is L(M) recursively enumerable?
Answer: Yes! By definition it is!
Question: Is L(M) recursive?
Answer: Yes! By definition it is!
Question: Is L(M) in r.e – r?
Answer: No! It can’t be. Since M always halts, L(M) is recursive.
Let M be a TM.
TOC notes by S P CHARI
As noted previously, L(M) is recursively enumerable, but may or may not be recursive.
Question: Suppose, we know L(M) is recursive. Does that mean M always halts?
Answer: Not necessarily. However, some TM M’ must exist such that L(M’) = L(M) and M’ always halts.
Question: Suppose that L(M) is in r.e. – r. Does M always halt?
Answer: No! If it did then L(M) would be recursive and therefore not in r.e. – r.
Let M be a TM, and suppose that M loops forever on some string x.
Question: Is L(M) recursively enumerable?
Answer: Yes! By definition it is. But, obviously x is not in L(M).
Question: Is L(M) recursive?
Answer: Don’t know. Although M doesn’t always halt, some other TM M’ may exist such that L(M’) = L(M) and M’
always halts.
Question: Is L(M) in r.e. – r?
Answer: Don’t know.
May be another M’ will halt on x, and on all strings! May be no TM for this L(M) does halt on all strings! We just do
not know!
TOC notes by S P CHARI
Modifications of the Basic TM Model
TOC notes by S P CHARI
Other (Extended) TM Models:
One-way infinite tapes Multiple tapes and tape heads Non-Deterministic TMs
Multi-Dimensional TMs (n-dimensional tape)
Multi-Heads
Multiple tracks
All of these extensions are equivalent to the basic DTM model
Closure Properties for Recursive and Recursively Enumerable Languages
TOC notes by S P CHARI
TMs model General Purpose (GP) Computers:
If a TM can do it, so can a GP computer
If a GP computer can do it, then so can a TM
If you want to know if a TM can do X, then some equivalent question are:
Can a general purpose computer do X?
Can a C/C++/Java/etc. program be written to do X?
For example, is a language L recursive?
Can a C/C++/Java/etc. program be written that always halts and accepts L?
TM Block Diagrams:
TOC notes by S P CHARI
If L is a recursive language, then a TM M that accepts L and always halts can be pictorially represented by a “chip” or
“box” that has one input and two outputs.
If L is a recursively enumerable language, then a TM M that accepts L can be pictorially represented by a “box” that has
one output.
w
yes
no
M
yes
Conceivably, M could be pwrovided with an outputMfor “no,” but this output cannot be counted on. Consequently, we simply ignore it.
TOC notes by S P CHARI
Theorem 1: The recursive languages are closed with respect to complementation, i.e., if L is a recursive
language, then so is
Proof: Let M be a TM such that L = L(M) and M always halts. Construct TM M’ as follows:
Note That:
M’ accepts iff M does not
M’ always halts since M always halts
From this it follows that the complement of L is recursive. •
Question: How is the construction achieved? Do we simply complement the final states in the TM? No! A string in L could end up in the complement of L.
Suppose q5 is an accepting state in M, but q0 is not.
If we simply complemented the final and non-final states, then q0 would be an accepting state in M’ but q5 would not. Since q0 is an accepting state, by definition all strings are accepted by M’
L = Σ * −L
w
yes
no
M
yes
no
M’
L = L ∪ L
Theorem 2: The recursive languages are closed with respect to union, i.e., if L1 and L2 are recursive languages, then so is
Proof: Let M1 and M2 be TMs such that L1 = L(M1) and L2 = L(M2) and M1 and M2 always halts.
Construct TM M’ as follows:
Note That:
L(M’) = L(M1) L(M2)
L(M’) is a subset of L(M1) U L(M2) L(M1) U L(M2) is a subset of L(M’)
1 2
It follows from this that
is recursive. •
3 1 2
w
yes
no
M1
yes
no
M2
start
M’
M’ always halts since M and∪M always halt
TOCLno3tes =by PaLllav1i Jo∪shi L2
TOC notes by S P CHARI
Theorem 3: The recursive enumerable languages are closed with respect to union, i.e., if L1 and L2 are recursively
enumerable languages, then so is
Proof: Let M1 and M2 be TMs such that L1 = L(M1) and L2 = L(M2). Construct M’ as follows:
Note That:
L(M’) = L(M1) U L(M2)
L(M’) is a subset of L(M1) U L(M2)
L(M1) U L(M2) is a subset of L(M’)
M’ halts and accepts iff M1 or M2 halts and accepts
It follows from this that
is recursively enumerable. •
Question: How do you run two TMs in parallel?
L3 = L1 ∪ L2
2
3 1
L = L ∪ L
w
yes
M1
yes
yes
M2
M’
Suppose, M1 and M2 had outputs for “no” in the previous construction, and these were transferred to the “no” output for M’
TOC notes by S P CHARI
Question: What would happen if w is in L(M1) but not in L(M2)?
Answer: You could get two outputs – one “yes” and one “no.”
At least M1 will halt and answer accept, M2 may or may not halt.
As before, for the sake of convenience the “no” output will be ignored.
w
yes
M1
yes
yes
M2
M’
no
no
no
are both Lrecursively enumerable then L (and therefore
TOC notes by S P CHARI
Theorem 4: If L and
) is recursive.
Proof: Let M1 and M2 be TMs such that L = L(M1) and
L(M’) is a subset of L L is a subset of L(M’)
M’ is TM for L
M’ always halts since either M1 or M2 halts for any given string
M’ shows that L is recursive
It follows from this that L (and therefore its’ complement) is recursive.
So, is also recursive (we proved it before). •
L
= L(M2). Construct M’ as follows:
L
Note That:
L(M’) = L w
yes
M1
yes
yes
M2
M’
no
L
Corollary of Thm 4: Let L be a subset of Σ*. Then one of the following must be true:
TOC notes by S P CHARI
Both L and One of L and Neither L nor
are recursive.
is recursively enumerable but not recursive, and the other is not recursively enumerable, or
is recursively enumerable
In other words, it is impossiLble to have both L and
r.e. but not recursive
L
L
L
In terms of the hierarchy: (possibility #1)
TOC notes by S P CHARI
Recursive Languages
Recursively Enumerable Languages
Non-Recursively Enumerable Languages
L
L
In terms of the hierarchy: (possibility #2)
TOC notes by S P CHARI
Recursive Languages
Recursively Enumerable Languages
Non-Recursively Enumerable Languages
L
L
In terms of the hierarchy: (possibility #3)
TOC notes by S P CHARI
Recursive Languages
Recursively Enumerable Languages
Non-Recursively Enumerable Languages
L
L
In terms of the hierarchy: (Impossibility #1)
TOC notes by S P CHARI
Recursive Languages
Non-Recursively Enumerable Languages
L L
Recursively Enumerable Languages
In terms of the hierarchy: (Impossibility #2)
TOC notes by S P CHARI
L
Recursively Enumerable Languages
L
Recursive Languages
Non-Recursively Enumerable Languages
In terms of the hierarchy: (Impossibility #3)
TOC notes by S P CHARI
Recursive Languages
Recursively Enumerable Languages
Non-Recursively Enumerable Languages
L
L
Note: This gives/identifies three approaches to show that a language is not recursive.
TOC notes by S P CHARI
Show that the language’s complement is not recursive, in one of the two ways:
Show that the language’s complement is recursively enumerable but not recursive Show that the language’s complement is not even recursively enumerable
The Halting Problem - Background
TOC notes by S P CHARI
Definition: A decision problem is a problem having a yes/no answer (that one presumably wants to solve with a computer). Typically, there is a list of parameters on which the problem is based.
Given a list of numbers, is that list sorted?
Given a number x, is x even?
Given a C program, does that C program contain any syntax errors?
Given a TM (or C program), does that TM contain an infinite loop?
From a practical perspective, many decision problems do not seem all that interesting. However,
from a theoretical perspective they are for the following two reasons:
Decision problems are more convenient/easier to work with when proving complexity results. Non-decision counter-parts can always be created & are typically at least as difficult to solve.
Notes:
The following terms and phrases are analogous:
Algorithm Decision Problem (un)Decidable
Statement of the Halting Problem
TOC notes by S P CHARI
Practical Form: (P1)
Input: Program P and input I.
Question: Does P terminate on input I?
Theoretical Form: (P2)
Input: Turing machine M with input alphabet Σ and string w in Σ*.
Question: Does M halt on w?
A Related Problem We Will Consider First: (P3)
Input: Turing machine M with input alphabet Σ and one final state, and string w in Σ*. Question: Is w in L(M)?
Analogy:
Input: DFA M with input alphabet Σ and string w in Σ*.
Question: Is w in L(M)?
Is this problem (regular language) decidable? Yes! DFA always accepts or rejects.
Over-All Approach:
TOC notes by S P CHARI
We will show that a language Ld is not recursively enumerable
From this it will follow that is not recursive
Using this we will show that a language Lu is not recursive
From this it will follow that the halting problemLdis undecidable.
As We Will See:
P3 will correspond to the language Lu
Proving P3 (un)decidable is equivalent to proving Lu (non)recursive
Converting the Problem to a Language
TOC notes by S P CHARI
Let M = (Q, Σ, Γ, δ, q1, B, {qn}) be a TM, where
Q = {q1, q2, … , qn}, order the states from 1 through n Σ = {x1, x2} = {0, 1}
Γ = {x1, x2, x3} = {0, 1, B}
Encode each transition:
δ(qi, xj) = (qk , xl, dm)
where qi and qk are in ordered Q xj and xl are in Σ,
and dm is in {L, R} = {d1, d2}
as:
0i10j10k10l10m where the number of 0’s indicate the corresponding id, and single 1 acts as a barrier
The TM M can then be encoded as:
111code111code211code311 … 11coder111
where each codei is one transitions’ encoding, and 11’s are barriers between transitions from the table row-major.
Let this encoding of M be denoted by <M>.
TOC notes by S P CHARI
Less Formally:
Every state, tape symbol, and movement symbol is encoded as a sequence of 0’s:
q1, | 0 |
q2, | 00 |
q3 | 000 |
: | |
0 | 0 |
1 | 00 |
B | 000 |
L | 0 |
R | 00 |
Note that 1’s are not used to represent the above, since 1 is used as a special separator symbol.
Example:
δ(q2, 1) = (q3 , 0, R)
Is encoded as:
00100100010100
TOC notes by S P CHARI
q1 | 0 (q1, 0, R) | 1 (q1, 1, R) | B (q2, B, L) |
q2 | (q3, 0, R) | - | - |
q3 | - | - | - |
What is the L(M)?
Coding for the above table:
1110101010100110100101001001101000100100010110010100010100111
Are the followings correct encoding of a TM?
01100001110001
111111
Definition:
TOC notes by S P CHARI
Lt = {x | x is in {0, 1}* and x encodes a TM}
Question: Is Lt recursive?
Answer: Yes. [Check only for format, i.e. the order and number of 0’s and 1’s, syntax checking]
Question: Is Lt decidable:
Answer: Yes (same question).
The Universal Language
TOC notes by S P CHARI
Define the language Lu as follows:
Lu = {x | x is in {0, 1}* and x = <M,w> where M is a TM encoding and w is in L(M)}
Let x be in {0, 1}*. Then either:
TOC notes by S P CHARI
Recall: | | ||
| 0 | 1 | B |
q1 | (q1, 0, R) | (q1, 1, R) | (q2, B, L) |
q2 | (q3, 0, R) | - | - |
q3 | - | - | - |
Which of the following are in Lu?
1110101010100110100101001001101000100100010110010100010100111
111010101010011010010100100110100010010001011001010001010011101110
111010101010011010010100100110100010010001011001010001010011100110111
01100001110001
111111
TOC notes by S P CHARI
Compare P3 and Lu:
(P3):
Input: Turing machine M with input alphabet Σ and one final state, and string w in Σ*. Question: Is w in L(M)?
Lu = {x | x is in {0, 1}* and x = <M,w> where M is a TM encoding and w is in L(M)}
Universal TM (UTM) is the machine for Lu
presuming it is r.e.! Can you write a program to accept strings in Lu?
Notes:
Lu is P3 expressed as a language
Asking if Lu is recursive is the same as asking if P3 is decidable.
Can you write a Halting program for accept/reject of strings in Sigma* ?
We will show that Lu is not recursive, and from this it will follow that P3 is un-decidable. From this we can further show that the Halting problem is un-decidable.
=> A general concept: a decision problem ≡ a formal language
TOC notes by S P CHARI
Define another language Ld as follows:
[Ld_bar = {self accepting TM encodings}, everything else is Ld ]
Ld = {x | x is in {0, 1}* and (a) either x is not a TM,
(b) or x is a TM, call it M, and x is not in L(M)} (1)
Note, there is only one string x
And, the question really is the complement of “does a TM accept its own encoding?” (Ld-bar’s complement)
Let x be in {0, 1}*. Then either:
TOC notes by S P CHARI
Recall: | | ||
| 0 | 1 | B |
q1 | (q1, 0, R) | (q1, 1, R) | (q2, B, L) |
q2 | (q3, 0, R) | - | - |
q3 | - | - | - |
Which of the following are in Ld?
11101010101001101001010010011010001000100010110010100010100111
01100001110001
Change above machine to accept strings ending with 1: the encoding will not be in Ld
TOC notes by S P CHARI
Lemma: Ld is not recursively enumerable. [No TM for Ld!!!]
Proof: (by contradiction)
Suppose that Ld is recursively enumerable. In other words, there exists a TM M such that:
Ld = L(M) (2)
Now suppose that w is a string encoding of M. (3) Case 1) w is in Ld (4)
By definition of Ld given in (1), either w does not encode a TM, or w does encode a TM, call it M, and w is not in L(M). But we know that w
encodes a TM (3: that’s where it came from). Therefore:
w is not in L(M) | (5) | |
But then (2) and (5) imply that w is not in Ld contradicting (4). Case 2) w is not in Ld | | (6) |
By definition of Ld given in (1), w encodes a TM, call it M, and: w is in L(M) | (7) | |
But then (2) and (7) imply that w is in Ld contradicting (6).
Since both case 1) and case 2) lead to a contradiction, no TM M can exist such that Ld = L(M). Therefore Ld is not recursively enumerable. •
TOC notes by S P CHARI
Note:
= {x | x is in {0, 1}*, x encodes a TM, call it M, and x is in L(M)}
is not recursive.
Proof: If
Ld
Corollary:
were recursive, then Ld would be recursive, and therefore recursively enumerable, a contradiction. •
Ld
Ld
TOC notes by S P CHARI
Suppose that M’ always halts and Lu = L(M’). It follows that:
M’’ always halts
Theorem: Lu is not recursive.
Proof: (by contradiction)
Suppose that Lu is recursive. Recall that:
Lu = {x | x is in {0, 1}* and x = <M,w> where M is a TM encoding and w is in L(M)}
Suppose that Lu = L(M’) where M’ is a TM that always halts. Construct an algorithm (i.e., a TM that always halts) for
as follows:
Ld
L(M’’) =
would therefore be recursive, a contradiction. •
Ld
Ld
Yes
No
Yes
No
Let M be the TM that w encodes.
M’:
UTM for Lu
Yes
No
<M,w>
(i.e., <w,w>)
Is w a TM?
Lt
w
M’’: for Ld-bar
L_u is recursively enumerable
(you may ignore this slide, for now)
TOC notes by S P CHARI
Input the string
Decode the TM prefix, if it doesn't have one then the string is not in Lu
Otherwise, run/simulate the encoded TM on the suffix
If it terminates and accepts then the original string is in Lu.
If a given string is in Lu, then the above algorithm will correctly determine that, halt and say yes. If the given string is not in Lu, then there are three cases:
From this proof note that if the prefix TM is a DFA or PDA, then our machine will also halt in the 3rd case above, no matter what the suffix is.
-- due to Dr. Bernhard (edited by me)
TOC notes by S P CHARI
The over-all logic of the proof is as follows:
3. It follows that Lu is not recursive.
The second point was established by the corollary.
The first point was established by the theorem on a preceding slide.
This type of proof is commonly referred to as a reduction. Specifically, the problem of recognizing
recognizing Lu
was reduced to the problem of
Ld
Ld
Ld
TOC notes by S P CHARI
Define another language Lh:
Lh = {x | x is in {0, 1}* and x = <M,w> where M is a TM encoding and M halts on w}
Note that Lh is P2 expressed as a language:
(P2):
Input: Turing machine M with input alphabet Σ and string w in Σ*.
Question: Does M halt on w?
TOC notes by S P CHARI
Theorem: Lh is not recursive.
Proof: (by contradiction)
Suppose that Lh is recursive. Recall that:
Lh = {x | x is in {0, 1}* and x = <M,w> where M is a TM encoding and M halts on w}
and
Lu = {x | x is in {0, 1}* and x = <M,w> where M is a TM encoding and w is in L(M)}
Suppose that Lh = L(M’) where M’ is a TM that always halts. Construct an algorithm (i.e., a TM that always halts) for Lu
as follows:
Suppose that M’ always halts and Lh = L(M’). It follows that:
M’’ always halts
L(M’’) = Lu
Lu would therefore be recursive, a contradiction. •
Yes
No
Yes
No
Simulate M On w
Yes
No
M’ for Lh: does M halt on w?
<M,w>
M’’ : UTM for Lu
start
TOC notes by S P CHARI
The over-all logic of the proof is as follows:
The second point was established previously.
The first point was established by the theorem on the preceding slide.
This proof is also a reduction. Specifically, the problem of recognizing Lu was reduced to the problem of recognizing Lh.
[Lu and Lh both are recursively enumerable: for proof see Dr. Shoaff!]
TOC notes by S P CHARI
Examples of non-halting program:
http://cs.fit.edu/~ryan/tju/russell.c http://cs.fit.edu/~ryan/tju/russell.scm http://cs.fit.edu/~ryan/tju/russell.py
TOC notes by S P CHARI
Define another language Lq:
Lq = {x | x is in {0, 1}*, x encodes a TM M, and M does not contain an infinite loop}
Or equivalently:
Lq = {x | x is in {0, 1}*, x encodes a TM M, and there exists no string w in {0, 1}*
such that M does not terminate on w}
Note that:
= {x | x is in {0, 1}*, and either x does not encode a TM, or it does encode a TM, call it M, and there exists a string w in {0, 1}* such that M does not terminate on w}
Note that the above languages correspond to the following problem:
(P0):
Input: Program P.
Question: Does P contain an infinite loop?
Using the techniques discussed, what can we prove about Lq or its’ complement?
Lq
TOC notes by S P CHARI
Lne = {x | x is a TM M and L(M) is not empty} is r.e. but not recursive.
Le = {x | x is a TM M and L(M) is empty} is not r.e.
Lr = {x | x is a TM M and L(M) is recursive} is not r.e.
Note that Lr is not the same as Lh = {x | x is a TM M that always halts}
but Lh is in Lr.
Lnr = {x | x is a TM M and L(M) is not recursive} is not r.e.
Ignore this slide
Lemma: Ld is not recursively enumerable: [No TM for Ld!!!]
TOC notes by S P CHARI
Proof: (by contradiction)
Suppose that Ld were recursively enumerable. In other words, that there existed a TM M such that:
Ld = L(M) (2)
Now suppose that wj is a string encoding of M. (3) Case 1) wj is in Ld (4)
By definition of Ld given in (1), either wj does not encode a TM, or wj does encode a TM, call it M, and wj is not in L(M). But we know that
wj encodes a TM (3: that’s where it came from). Therefore:
wj is not in L(M) But then (2) and (5) imply that wj is not in Ld contradicting (4). Case 2) wj is not in Ld | (5) | (6) |
By definition of Ld given in (1), wj encodes a TM, call it M, and: wj is in L(M) | (7) | |
But then (2) and (7) imply that wj is in Ld contradicting (6).
Since both case 1) and case 2) lead to a contradiction, no TM M can exist such that Ld = L(M). Therefore Ld is not recursively enumerable. •
TOC notes by S P CHARI
| ( | ) | B |
findPair | (findPair2, "(", R) | - | (final, B, R) |
findPair2 | (findPair2, "(", R) | (removePair, ")", L) | - |
removePair | (fetch, "(", R) | (fetch, ")", R) | (goBack, B, L) |
fetch | (retrieve, "(", R) | (retreive, ")", R) | (retreive, B, R) |
retreive | (returnOpen, "(", L) | (returnClosed, ")", L) | (returnBlank, B, L) |
returnOpen | (writeOpen, "(", L) | (writeOpen, ")", L) | (writeOpen, B, L) |
returnClosed | (writeClosed, "(", L) | (writeClosed, ")", L) | (writeClosed, B, L) |
returnBlank | (writeBlank "(", L) | (writeBlank, ")", L) | (writeBlank, B, L) |
writeOpen | (removePair, "(", R) | (removePair, "(", R) | - |
writeClosed | (removePair, ")", R) | (removePair, ")", R) | - |
writeBlank | (removePair, B, R) | (removePair, B, R) | - |
goBack | - | - | (backAgain, B, L) |
backAgain | - | - | (seekFront, B, L) |
seekFront | (seekFront, "(", L) | (seekFront, ")", L) | (findPair, B, R) |
final* | - | - | - |
Roger’s TM for balanced parenthesis:
TOC notes by S P CHARI
On 111 111 as a TM encoding
<Quote> It was ambiguous, in my opinion, based on the definition in the Hopcroft book, i.e., the definition in the Hopcroft book was not clear/precise enought to
account this special case. I don't have the book in front of me right now, but I think this is the example I used in class: Consider the TM that has exactly one state, but no transitions. Perfectly valid TM, and it would give us this encoding (111111). In that case the encoded machine would accept sigma* because the highest numbered state would be q0, the only state, and that would be the final state under the Hopcroft encoding. Now consider the TM that has exactly two states, but no transitions. Also a perfectly valid TM, and it would give us the same encoding. In that case the encoded machine would not accept anything because the final state is q1 (highest numbered state), and there is no way to get to it. I used it only as a way to raise that issue in class, i.e., the the Hopcroft definition is a bit ambiguous in this case.
One way to resolve the ambiguity is to require the encoding to specifically specify the final state (at the end or something). In that case, 111111 isn't even a valid TM, since it doesn't specify the final state. Another related question is, does a TM even have to have any states at all to be a valid TM? The encoding would have to be able to isolate that as a unique string also. <End Quote>
Phil Bernhard