Overview

In this assignment, we explore the idea of language regularity. A language is regular if we can create a DFA that accepts all the strings in the language and rejects all the strings not in the language.

In this assignment, we introduce an equivalent notion of regularity called regular expressions. We will use regular expressions, as well as DFAs, to explore the computational limits of regular languages.

Regular expressions

Like a DFA, regular expressions give us a way to define a language.

Learn about regular expressions

Read this brief tutorial on regular expressions, to learn more about them. You might also want to consider doing the regular-expression warmups, to get more practice with regular expressions.

Note: You may be familiar with regular expressions from a programming language that you know. If so, read the tutorial carefully because the regular expressions described there are probably more constrained than the ones you know. Regular expressions in programming languages tend to be more expressive (and possibly more computationally powerful) than the more formal definition we will be using in this assignment.

Write a regular expression

Using the syntax described in the tutorial, write a regular expression that defines the language of bitstrings that start and end with 0. (Note that the empty string should not be accepted by this regular expression.)

Turn in your regular expression on Gradescope, under the assignment HW 1: Regular expression.

Write a proof about your regular expression

It turns out that regular expressions are equivalent to DFAs: There is an algorithm to convert every regular expression to a DFA that accepts the same language, and vice-versa. We will not study that algorithm in CS 42, but fortunately—because it’s an algorithm—you can use a tool to automatically convert your regular expression to a DFA!

Plug your regular expression from the previous part of the assignment into this online converter, to generate an equivalent DFA. Observe the number of states required by the DFA, then answer the questions on Gradescope (assignment HW 1: Distinguishability proof) to prove that any DFA that accepts bitstrings that start and end with 0 must need at least the number of states defined by your generated DFA.

Here are a couple of important notes:

Make sure you are using the version of the converter that creates a minimum DFA. That version should be the one linked above; but if you are not sure, just ask!
Sometimes, the tool does not generate a deterministic finite automaton: there may not be a transition for 0 and for 1, for each state in the DFA. Double-check the generated automaton. If it is not deterministic then the DFA requires at least N+1 states (where N is the number of states in the generated automaton).^[1]

(Ir)regular languages

On Gradescope (assignment HW 1: Irregularity proof), complete the proof that the following language is not regular:

L = {w | w = zz for some string z }

(In other words, w has even length and the first half is the same as the second half)

For this language, we will assume that 𝚺={0,1} (in other words., bitstrings).

(Note that this language is a little similar to the one in the previous part—both start and end with the same thing—but the language in this part is not regular.)

Before working on this problem, you might want to look over the class materials and the warmups, to become more familiar with this style of proof.

Read and respond

Read over this article. (Note, you will need to log in with your g.hmc.edu address to access the article).

On Gradescope (HW 1: Reading), write a brief response (4–6 sentences) to the following prompt:

The article mentions a "quiz" where you can see if you can tell if snippets were algorithmically or human-authored, but that quiz is no longer available. Instead, take this direct link to an algorithmic-authoring quiz. Try the quiz (which has six poems—some by human, some by computer). How did you do? To what extent do you feel that automatic authoring is a positive (or negative, or mixed) development? .

Note: The quiz may not show up in some browsers or with some ad-blockers. I was able to get it to display in Safari…

Where to go from here

If you are interested in these topics, here are a few ways to explore it deeper. These items aren’t part of the assignment and aren’t meant to interfere with your other work. They’re just here for fun, and you can refer back to them later, whenever you have some free time and want to think more about the themes of the assignment.

This week in class, we explored regularity, Turing Machines, and the expressive power of computational models. Essentially, we answered the question “What does it mean to compute?”. If you are interested in these topics, here are some additional things to explore.

Resources

If you’re interested in DFAs and the kinds of problems they can(’t) solve, here are a couple of books that go more in depth. These books also talk about other concepts related to DFAs, all from the subdiscipline of Computer Science and Math called Theory of Computation.

Automata, Computability and Complexity: Theory and Applications by Elaine Rich.
Introduction to the Theory of Computation by Michael Sipser.

Both these books go into more detail about regularity, including topics such as:

A proof that DFAs and regular expressions are equivalent models of computation.
Algorithms for converting between these models.
An algorithm for minimizing a DFA.
An algorithm for comparing whether two DFAs are equivalent.
Descriptions of Turing Machines.
Descriptions of other models of computation we haven’t examined.

If you liked Turing Machines and the Halting Problem, you might be interested in:

The Annotated Turing by Charles Petzold
A poem about the Halting Problem by Geoffrey K. Pullum

Practice problems for Turing machines

CS 42 doesn’t ask you to build up the skill of creating Turing Machines—that kind of question won’t appear on an assignment or an exam. However, if you’re curious about what it’s like to use a Turing Machine to solve a problem, here are a couple of practice problems. You can draw them with JSFlap!

Exercise 1

Make a Turing machine that accepts the following language:

L = { w | w is a bitstring of the form 0n12n}

i.e., a bitstring that starts with some number of 0s, followed by twice that many 1s. Assume that the read/write head starts on the left-most character of the (possibly empty) input. Remember that there may be zero 0s!

Exercise 2

Make a Turing machine that accepts the following language:

L = {w | w contains only 0s and the length of w is a power of 2}

You can assume that the input will be a single, contiguous block of all 0s. Also, you can assume that if there are any 0s on the Turing machine's tape, that the read/write head starts at the leftmost 0.

[1] We can make the generated automaton deterministic by adding a single “trap” state. Then, for each state in the generated automaton, for each missing transition, we add a transition from the generated state to the new trap state. This trap state is not an accepting state, so the new automaton accepts the same strings as the generated one, while also being deterministic.