1 of 138

Introduction to Computability Theory

Finite Automata and Regular Languages

Prof. Amos Israeli

1

2 of 138

Introduction

Computer Science stems from two starting points:

Mathematics: What can be computed?

And what cannot be computed?

Electrical Engineering: How can we build computers?

Not in this course.

2

3 of 138

Introduction

Computability Theory deals with the profound mathematical basis for Computer Science, yet it has some interesting practical ramifications � that I will try to point out sometimes.

The question we will try to answer in this course is:

“What can be computed? What Cannot be computed and where is the line between the two?”

3

4 of 138

Computational Models

A Computational Model is a mathematical object (Defined on paper) that enables us to reason about computation and to study the properties and limitations of computing.

We will deal with Three principal computational models in increasing order of Computational Power.

4

5 of 138

Computational Models

We will deal with three principal models of computations:

Finite Automaton (in short FA).�recognizes Regular Languages .
Push Down Automaton (in short PDA).�recognizes Context Free Languages .
Turing Machines (in short TM).�recognizes Computable Languages .

5

6 of 138

Alan Turing - A Short Detour

Dr. Alan Turing is one of the founders of Computer Science (he was an English Mathematician). Important facts:

“Invented” Turing machines.
“Invented” the Turing Test.
Broke into the German submarine transmission encoding machine “Enigma”.
Was arraigned for being gay and committed suicide soon after.

6

7 of 138

Finite Automata - A Short Example

The control of a washing machine is a very simple example of a finite automaton.
The most simple washing machine accepts quarters and operation does not start until at least 3 quarters were inserted.

8 of 138

Control of a Simple Washing Machine

Accepts quarters.
Operation starts after at least 3 quarters were inserted.
Accepted words: 25,25,25; 25,25,25,25; …

25

9 of 138

Finite Automata - A Short Example

The second washing machine accepts 50 cents coins as well.

Accepted words: 25,25,25; 25,50; …

25

25,50

25

50

10 of 138

Finite Automata - A Short Example

The second washing machine accepts 50 cents coins as well.
The most complex washing machine accepts $1 coins too.

25

25,50,100

25

50

50,100

100

25,50,100

11 of 138

Finite Automaton - An Example

11

0

1

States:

0,1

Initial State:

Final State:

12 of 138

Finite Automaton – An Example

12

,

0

1

0,1

Accepted words:

Transition Function:

Alphabet: .

Note: Each state has all transitions .

13 of 138

Finite Automaton – Formal Definition

A finite automaton is a 5-tupple �where:

is a finite set called the states.
is a finite set called the alphabet.
is the transition function.
is the start state, and
is the set of accept states.

13

,

14 of 138

Observations

Each state has a single transition for each symbol in the alphabet.
Every FA has a computation for every finite string over the alphabet.

14

,

15 of 138

Examples

accepts all words (strings) ending with 1.�The language recognized by , called � satisfies: .

�

15

16 of 138

How to do it

Find some simple examples (short accepted and rejected words)
Think what should each state “remember” (represent).
Draw the states with a proper name.
Draw transitions that preserve the states’ “memory”.
Validate or correct.
Write a correctness argument.

17 of 138

The automaton

Correctness Argument:

The FA’s states encode the last input bit and is the only accepting state. The transition function preserves the states encoding.

1

0

1

0

1

0

18 of 138

Examples

accepts all words (strings) ending with 1.� .
accepts all words ending with 0.

2’. accepts all words ending with 0 and the empty word .

This is the Complement Automaton of .

18

19 of 138

Examples

accepts all words (strings) ending with 1.� .
accepts all words ending with 0.
accepts all strings over alphabet �that start and end with the same symbol .

19

20 of 138

Examples (cont)

accepts all words of the form where are integers and .
accepts all words in .

20

21 of 138

Languages

Definition: A language is a set of strings over some alphabet .
Examples:

�

21

22 of 138

Languages

A language of an FA, , , is the set of words (strings) that accepts.
If we say that recognizes .
If is recognized by some finite automaton , is a Regular Language.

22

,

23 of 138

Some Questions

Q1: How do you prove that a language is regular?

A1: By presenting an FA, , satisfying .

Q2: Why is it important?

A2: Recognition of a regular language requires a controller with bounded Memory.

Q3: How do you prove that a language is not regular?

A3: Hard! to be answered on Week3 of the course.

23

24 of 138

Wrap up

In this talk we:

Motivated the course.
Defined Finite Automata (Latin Pl. form of automaton).
Learned how to deal with construction of automata and how to come up with a correctness argument.

24

25 of 138

Example of an NFA

25

0,1

1

,0

1. A state nay have 0 or more transitions labeled with � the same symbol.

2. transitions are possible.

NFA – Nondeterministic Finite Automaton

26 of 138

Computation of an NFA

When several transitions with the same label exist, an input word may induce several paths.
When no transition is possible a computation is “stuck”.

Q:Which words are accepted and which are not?

A: If word w induces (at least) a single accepting path, the automaton “chooses” this accepting path and w is accepted.

26

27 of 138

Possible Computations

27

DFA

.

. . . .

NFA

At each step of the computation: DFA - A single state is occupied.

NFA - Several states may be � occupied.

28 of 138

Computation of an NFA - Example

28

On the input word w=01011 there exist an accepting path and w is accepted.

Can we characterize (find) the language recognized by this automaton?

Words containing either 101 or 11 as substrings.

0,1

1

,0

29 of 138

Computation tree of 01011

29

0,1

1

,0

0,1

0

1

0

1

Accepting branch

30 of 138

Why do we Care About NFAs?

NFA-s and DFA-s are equivalent (Meaning: They recognize the same set of languages).In other words: Each NFA recognizing language L has an equivalent DFA recognizing L.

(Note: This statement must be proved.)

But usually, the NFA is much simpler.
Enables the proof of theorems. (e.g. about regular operations)

30

31 of 138

Example - A Complicated DFA

Can you guess the language?

1

0

32 of 138

Example – An Equivalent NFA

Can you guess the language?

1

0,1

Bit Strings that have 1 in position third from the end

33 of 138

Example - A Complicated DFA

Can you guess the language?

1

0

Can you verify it now?

34 of 138

DFA – A Formal Definition(Rerun)

A finite automaton is a 5-tupple �where:

is a finite set called the states.
is a finite set called the alphabet.
is the transition function.
is the start state, and
is the set of accepting states.

34

35 of 138

NFA – A Formal Definition

A finite automaton is a 5-tupple �where:

is a finite set called the states.
is a finite set called the alphabet.
is the transition function.� ( )
is the start state, and
is the set of accept states

35

36 of 138

Differences between NFA-s and DFA-s

There are two differences:

1. The range of the transition function is now � . (The set of subsets of the state set )

2. The transition function allows transitions.

36

37 of 138

Computations of NFA-s

In general a computation of an NFA, N, on input w, induces a computation tree.

Each path of the computation tree represents a possible computation of N.

The NFA N accepts w, if its computation tree includes at least one path ending with an accepting state.

37

38 of 138

Computations of NFA-s

There are two ways to look at computations of an NFA:

The first is to say that the NFA N “chooses” the right path on its tree of possible computations.
The second is to say that the NFA N traverses its computation tree “in parallel”.

38

39 of 138

Equivalence Between DFAs and NFAs

Now we prove that the class of NFAs is Equivalent to the class of DFA:

Theorem: For every NFA , there exists a DFA� , such that .

Proof Idea: The proof is Constructive: We assume that we know , and construct a simulating DFA, .

39

40 of 138

Construction Demonstration

40

We start with an NFA with no transitions.

The equivalent DFA is denoted by:

0,1

1

0

41 of 138

Construction Demonstration

The state set of the equivalent DFA, D, should reflect the fact that at each step of the computation, N may occupy several sates, Thus we define the state set of D as �the power-set of the state set of N, namely:

42 of 138

Construction Demonstration

Recall that the transition function of any NFA is defined by .

Do not be confused by the use of as the state set of D . D is a DFA, not the original NFA.

43 of 138

Construction Demonstration

The alphabet of D, , is identical to the alphabet of N.

The starting state of D corresponds to the starting state of N, namely .

The most complex part of the construction is the transition function:

In the sequel it is demonstrated stage by stage:

44 of 138

Construction Demonstration

�

Since the only state reachable from by 0 is ,

we get .

0,1

1

0

45 of 138

Construction Demonstration

Since both and are reachable from by 1,

we get .

0,1

1

0

1

46 of 138

Construction Demonstration

Similarly we get .

0,1

1

0

1

47 of 138

Construction Demonstration

and .

0,1

1

0

1

0

1

48 of 138

Construction Demonstration

Since and ,�we get .

0,1

1

0

1

0

1

49 of 138

Construction Demonstration

Finally we get the following DFA:

0,1

1

0

1

0

1

0

1

50 of 138

Construction Demonstration

And its disconnected section:

1

0

1

0

1

0

1

0

1

51 of 138

Construction Demonstration

Since 1,0 is accepted we get that is an accepting state.

0,1

1

0

1

0

1

0

1

52 of 138

Construction Demonstration

Since 1,0,1 is accepted we get that is also an accepting state.

0,1

1

0

1

0

1

0

1

53 of 138

Wrap Up

In this lecture we:

Introduced NonDeterministic Finite Automatons.
Demonstrated the equivalence between DFA-s and NFA-s.

53

54 of 138

Equivalence Between DFAs and NFAs

Every DFA is (A special case of) an NFA, hence� , but .

Nevertheless, these classes are Equivalent.

This means that for any NFA N there exists a DFA D satisfying: .

54

55 of 138

Equivalence Between DFAs and NFAs

Thus, to prove equivalence of the classes we prove:

Theorem: For every NFA N there exists a DFA D satisfying :

Proof Idea: The proof is Constructive: We assume that we know , and construct a simulating DFA, D .

55

56 of 138

Proof

Let recognizing some language A. The state set of the simulating DFA D, should reflect the fact that at each step of the computation, N may occupy several sates.

Thus we define the state set of D as �the power-set of the state set of N.

56

57 of 138

Proof (cont.)

Let recognizing some language A. First we assume that N has no � - transitions.

Define where .

57

58 of 138

Proof (cont.)

Our next task is to define D’s transition function, : �For any and define

If R is a state of M, then it is a set of states of N. When M in state R processes an input symbol a, M goes to the set of states to which N will go in any of the branches of its computation.

58

59 of 138

Proof (cont.)

An alternative way to write the definition of M’s transition function, is: �For any and define

And the explanation is just the same.

Note: if than

Which is OK since .

59

60 of 138

Proof (cont.)

The initial state of M is: � .

Finally, the final state of M is: �

Since D accepts if N reaches at least one accepting state.

The reader can verify for her/him self that D indeed simulates N .

60

61 of 138

Proof (cont.)

It remains to consider - transitions.

For any state of D define to be the collection of states of R unified with the states reachable from R by - transitions. The old definition of is:

And the new definition is:

61

62 of 138

Proof (end)

In addition, we have to change the definition of � , the initial state of . The previous definition, , is replaced with� .

Once again the reader can verify that the new definition of D satisfies all requirements.

62

63 of 138

Corollary

A language L is regular if and only if there exists an NFA recognizing L .

63

64 of 138

The Regular Operations

Let and be 2 regular languages above the same alphabet, . We define the 3 Regular Operations:

Union: .
Concatenation: .
Star: .

64

65 of 138

Elaboration

Union is straight forward.
Concatenation is the operation in which each word in A is concatenated with every word in B.
Star is a unary operation in which each word in A is concatenated with every other word in A and this happens any finite number of times.

65

66 of 138

The Regular Operations - Examples

.�

66

67 of 138

Motivation for Nondeterminism

We want to use the regular operations for a systematic construction of all regular expressions.

Given a two DFA-s, their product DFA can recognizes their union, but we do not know how to construct a DFA recognizing either concatenation or star.

This can be proved by using NFA-s.

67

68 of 138

Theorem

The class of Regular languages is closed under the all three regular operations.

.

68

69 of 138

Proof for union Using NFA-s

If and are regular, each has its own recognizing automaton and , respectively.

In order to prove that the language is regular we have to construct an FA that accepts exactly the words in .

69

70 of 138

A Pictorial proof

70

F

71 of 138

Proof for union Using NFA-s

Let recognizing A₁, and � recognizing A₂.

Construct to recognize ,

Where , ,

71

72 of 138

Theorem

The class of Regular languages is closed under the concatenation operation. � .

72

73 of 138

Proof idea

Given an input word to be checked whether it belongs to , we may want to run until it reaches an accepting state and then to move to .

73

74 of 138

Proof idea

The problem: Whenever an accepting state is reached, we cannot be sure whether the word of is finished yet.

The idea: Use non-determinism to choose the right point in which the word of is finished and the word of starts.

74

75 of 138

A Pictorial proof

75

F

76 of 138

Proof using NFAs

Let recognizing A₁, and � recognizing A₂.

Construct to recognize ,

Where , ,

76

77 of 138

Theorem

The class of Regular languages is closed under the star operation. � .

77

78 of 138

A Pictorial proof

78

F

79 of 138

Proof using NFAs

Let recognizing A₁.

Construct to recognize

Where , , and

79

80 of 138

Wrap Up

In this lecture we:

Proved equivalence between DFA-s and NFA-s.
Motivated and defined the three Regular Operations.
Proved that the regular operations preserve regularity.

80

81 of 138

Introduction

Regular languages are defined and described by use of finite automata.

In this lecture, we introduce Regular Expressions as an equivalent way, yet more elegant, to describe regular languages.

81

82 of 138

Motivation

If one wants to describe a regular language, La, she can use the a DFA, D or an NFA N, such that that .

This is not always very convenient.

Consider for example the regular expression� describing the language of binary strings containing a single 1.

82

83 of 138

Basic Regular Expressions

A Regular Expression (RE in short) is a string of symbols that describes a regular Language.

Let be an alphabet. For each , the symbol is an RE representing the set .
The symbol is an RE representing the set . (The set containing the empty string).
The symbol is an RE representing the empty set.

83

84 of 138

Inductive Construction

Let and be two regular expressions representing languages and , resp.

The string is a regular expression representing the set .
The string is a regular expression representing the set .
The string is a regular expression representing the set .

84

85 of 138

Inductive Construction - Remarks

Note that in the inductive part of the definition larger RE-s are defined by smaller ones. This ensures that the definition is not circular.

85

86 of 138

Inductive Construction - Remarks

This inductive definition also dictates the way we will prove theorems: For any theorem T.

Stage 1: Prove T correct for all base cases.

Stage 2: Assume T is correct for and . � Prove correctness for , , � and .

86

87 of 138

Some Useful Notation

Let be a regular expression:

The string represents , and it also holds that .
The string represents .�
The string represents .
The Language represented by R is denoted by� .

87

88 of 138

Precedence Rules

The star (*) operation has the highest precedence.
The concatenation ( ) operation is second on the preference order.
The union ( ) operation is the least preferred.
Parentheses can be omitted using these rules.

88

89 of 138

Examples

– .
– .
– .�
– . �
– . �

89

90 of 138

Examples

- all words starting and � ending with the same letter.
- all strings of forms � 1,1,…,1 and 0,1,1,…1 .
- A set concatenated with the � empty set yields the empty set .
- .

90

91 of 138

Equivalence With Finite Automata

Regular expressions and finite automata are equivalent in their descriptive power. This fact is expressed in the following Theorem:

Theorem�A set is regular if and only if it can be described by a regular expression.

The proof is by two Lemmata (Lemmas):

91

,

92 of 138

Lemma ->

If a language L can be described by regular expression then L is regular.

92

,

93 of 138

Proofs Using Inductive Definition

The proof follows the inductive definition of RE-s as follows:

Stage 1: Prove correctness for all base cases.

Stage 2: Assume correctness for and , and show its correctness for , and� .

93

94 of 138

Induction Basis

For any , the expression describes the set , recognized by:
The set represented by�the expression is recognized by:
The set represented by�the expression is recognized by:

94

95 of 138

The Induction Step

Now, we assume that and represent two regular sets and claim that , and represent the corresponding regular sets.

The proof for this claim is straight forward using the constructions given in the proof for the closure of the three regular operations.

95

96 of 138

Examples

Show that the following regular expressions represent regular languages:

.
.

To be demonstrated on the Blackboard.

96

97 of 138

Lemma <-

If a language L is regular then L can be described by some regular expression.

97

98 of 138

Proof Stages

The proof follows the following stages:

Define Generalized Nondeterministic Finite Automaton (GNFA in short).
Show how to convert any DFA to an equivalent GNFA.
Show an algorithm to convert any GNFA to an equivalent GNFA with 2 states.
Convert a 2-state GNFA to an equivalent RE.

98

,

99 of 138

Properties of a Generalized NFA

A GNFA is a finite automaton in which each transition is labeled with a regular expression over the alphabet .
A single initial state with all possible outgoing transitions and no incoming trans.
A single final state without outgoing trans.
A single transition between every two states, including self loops.

99

,

100 of 138

Example of a Generalized NFA

100

101 of 138

A Computation of a GNFA

A computation of a GNFA is similar to a computation of an NFA. except:�In each step, a GNFA consumes a block of symbols that matches the RE on the transition used by the NFA.

101

102 of 138

Example of a GNFA Computation

Consider abba or bb or abbbaaaaabbbbb

102

103 of 138

Converting a DFA (or NFA) to a GNFA

Conversion is done by a very simple process:

Add a new start state with an - transition from the new start state to the old start state.
Add a new accepting state with - transition from every old accepting state to the new accepting state.

103

104 of 138

Converting a DFA to a GNFA (Cont)

Replace any transition with multiple labels by a single transition labeled with the union of all labels.
Add any missing transition, including self transitions; label the added transition by .

104

105 of 138

Stage 1: Convert D to a GNFA

1.0 Start with D

105

106 of 138

Stage 1: Convert D to a GNFA

1.1 Add 2 new states

106

107 of 138

Stage 1: Convert D to a GNFA

1.2 Make the initial state and the final�state.

107

108 of 138

Stage 1: Convert D to a GNFA

1.3 Replace multi label transitions by their union.

108

109 of 138

Stage 1: Convert D to a GNFA

1.4 Add all missing transitions and label them .

109

110 of 138

Ripping a state from a GNFA

The final element needed for the proof is a procedure in which for any GFN G, any state of G, not including and , can be ripped off G, while preserving .�This is demonstrated in the next slide by considering a general state, denoted by , and an arbitrary pair of states, and , as demonstrated in the next slide:

110

111 of 138

Removing a state from a GNFA

111

Before Ripping

After Ripping

Note: This should be done for every pair of outgoing and incoming outgoing .

112 of 138

Ellaboration

Consider the RE ,�representing all strings �that enable transition �from via to .

What we want to do is to augment the Regular expression of transition , namely , so These strings can pass through . This is done by setting it to .

112

113 of 138

Ellaboration

Note: In order to achieve �an equivalent GNFA in �which is disconnected,�this procedure should be�carried out separately, for every�pair of transitions of the form and � . Then can be removed, as demonstrated on the next slide:

113

114 of 138

Elaboration

Assume the following situation:�In order to rip , all pairs�of incoming and outgoing�transitions should be considered �in the way showed on the �previous slide namely consider� � one after the other. After that can be ripped while preserving .

114

115 of 138

In Particular

Replace with .

�

115

116 of 138

A (half?) Formal Proof of Lemma<-

The first step is to formally define a GNFA.

Each transition should be labeled with an RE.

Define the transition function as follows: �

�where denotes all regular expressions over .

Note: The def. of is different then for NFA.

116

117 of 138

Changes in Definition

Note: The definition of as:

�is different than the original definitions (For DFA and NFA).

In this definition we rely on the fact that every 2 states (except and ) are connected in both directions.

117

118 of 138

GNFA – A Formal Definition

A Generalized Finite Automaton is a 5-tupple � where:

is a finite set called the states.
is a finite set called the alphabet.
^* is the transition function.
is the start state, and
is the accept state.

118

119 of 138

GNFA – Defining a Computation

A GNFA accepts a string if �and there exists a sequence of states � , satisfying:

For each , , , where� , or in other words, is the expression on the arrow from to .

119

120 of 138

Procedure CONVERT

Procedure CONVERT takes as input a GNFA G with k states.

If then these 2 states must be and � , and the algorithm returns

.

If , the algorithm converts G to an equivalent G’ with states by use of the ripping procedure described before.

120

121 of 138

Procedure CONVERT

Convert

;
If return ;
;
;
For any and any

�for �return ;

121

122 of 138

Recap

In this lecture we:

Motivated and defined regular expressions as a more concise and elegant method to represent regular Languages.
Proved that FA-s (Deterministic as well as Nondeterministic) and RE-s is identical by:�2.1 Defined GNFA – s.�2.2 Showed how to convert a DFA to a GNFA.�2.3 Showed an algorithm to converted a � GNFA with states to an equivalent � GNFA with states.

122

123 of 138

Introduction and Motivation

In this lecture we ask: Are all languages regular?

The answer is negative.

The simplest example is the language�

Try to think about this language.

123

124 of 138

Introduction and Motivation

If we try to find a DFA that recognizes the language , , it seems that we need an infinite number of states, to “remember” how many a-s we saw so far.

Note: This is not a proof!

Perhaps a DFA recognizing B exists, but we are not clever enough to find it?

124

125 of 138

Introduction and Motivation

The Pumping Lemma is the formal tool we use to prove that the language B (as well as many other languages) is not regular.

125

126 of 138

What is Pumping?

Consider the following NFA, denoted by N:

It accepts all words of the form .

126

127 of 138

What is Pumping?

Consider now the word .

Pumping means that the word can be divided into two parts: 1 and 10, such that for any , the word .

We say that the word 110 can be pumped.

For this is called down pumping.

For this is called up pumping.

127

128 of 138

What is Pumping?

A more general description would be:�A word , can be pumped if and for each , it holds that .

Note: the formal definition is a little more complex than this one.�

128

129 of 138

The Pumping Lemma

Let A be a regular language. There exists a number p such that for every , if � then w may be divided into three parts , satisfying:

for each , it holds that .
.
.

Note: Without req. 2 the Theorem is trivial.

129

130 of 138

Demonstration Continuation

In terms of the previous demonstration we have:

.
For , we get:

.� .� .

130

131 of 138

Proof of the Pumping Lemma

Let D be a DFA recognizing A and let p be the number of states of D. If A has no words whose length is at least p, the theorem holds vacuously. Let be an arbitrary word such that . Denote the symbols of w by where .

131

132 of 138

Proof of the Pumping Lemma

Assume that is the sequence of states that D goes through while computing with input w. For each , , � . Since , .

Since the sequence contains �states and since the number of states of D is p, that there exist two indices , such that .

132

133 of 138

Proof of the Pumping Lemma

Denote , and � .

Note: Under this definition and .

By this definition, the computation of D on � starting from , ends at .

By this definition, the computation of D on � , starting from , ends at which is an accepting state.

133

134 of 138

Proof of the Pumping Lemma

The computation of D on �starting from , ends at . Since , this computation starts and ends at the same state.

Since it is a circular computation, it can repeat itself k times for any .

In other words: for each , .

Q.E.D.

134

135 of 138

Example: .

Lemma: The language is not regular.

Proof: Assume towards a contradiction that L is regular and let p be the pumping length of L.

Let . By the Pumping Lemma there exists a division of w, , such that � , and w can be pumped.

This means that , where .

Since we conclude

135

136 of 138

Example: (Cont.)

This however implies that in , the number of a-s is smaller then the number of b-s.

It also means that for every , the number of a-s in is larger then the number of �b-s. Both cases constitute a contradiction.

Note: Each one of these cases is separately sufficient for the proof.

136

137 of 138

Discussion

This is what we got so far:

�

137

RL-s Ex:

138 of 138

Lecture Recap

Motivated the Pumping Lemma.
Presented and demonstrated the pumping concept.
Presented and proved the Pumping Lemma.
Used the pumping lemma to prove that some languages are not regular.

138