Turing Machines

There was a time, not too long ago, when people had not yet precisely defined what computation was. Alan Turing took on this task. Here is what he came up with.

The Backstory

History helps a lot! Why do we have Turing Machines? Why do we care? Why are we better knowing about Turing Machines than not knowing them?

Hear the answers in class. We’ll go through 19th-century advances in logic, the formalists versus the intuitionists, Hilbert’s Challenges, the need to formalize computation, Gödel’s and Church’s approaches, and how Turing’s view of computation was the most mechanistic (as opposed to logical or mathematical) and influenced by the human computers of his day, and how he (Turing) translated all these ideas into simple mathematical constructs that, in some real sense, nailed it!

The Basic Idea

Turing looked at computers of his day and asked “What are these people doing?” He thought about it and realized these folks would write something down, think for a bit, write something else or erase something, think some more, write or erase, and so on. They would move from mental state to mental state as they worked, deciding what to do next based on what mental state they were in and what was currently written.

He realized he could capture this process formally and keep the model very simple: instead of a two-dimensional page, a single one-dimensional tape would do, and instead of randomly jumping around, it would be okay to move one cell at a time. Today we picture the machines like this:

tm.png

The idea is this:

When in state $q$ looking at symbol $\sigma$, write a new symbol $\sigma'$, move either one square left or right, then go to state $q'$.

Here’s the story of the development of the Turing Machine and its impact:

Thirteen Examples

Let’s see some examples before we get formal.

More Examples

If the 13 examples in the video were not enough, we have more.

Binary Decrement

TODO

Reverse a String

This machine reverses any string over the alphabet $\{a, b, c\}$:

tm-reverse.png

Binary Addition

TODO

Parity

TODO

Is Empty?

TODO

Is Nonempty?

TODO

Is Even?

Here’s a recognizer machine to determine if its input, a binary numeral, is even:

tmIsEven.png

Hey, there is a connection to language theory here! The TM just given not only answers the question “Is this number even?” but also recognizes the language $\{ w \in \{0,1\}^* \mid w \textrm{ is an even binary integer} \}$. That’s right, answering yes-no questions is isomorphic to determining membership in a language.
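To make the isomorphism concrete, here is a tiny sketch (in Python, not part of the notes): the yes/no question and the membership test are literally the same predicate.

```python
# A binary numeral is even exactly when its last bit is 0, so the yes/no
# question "is this number even?" is the same predicate as membership in
# the language of binary strings ending in 0 (assumes a nonempty numeral).
def is_even(w: str) -> bool:
    return w.endswith("0")
```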

Is Odd?

tmIsOdd.png

Is (a signed binary integer) Negative?

tmIsNegative.png

Is Palindrome?

Here is a machine to determine if a string over the alphabet $\{a,b,c\}$ is a palindrome:

TODO

Is Divisible by 5?

Take the empty string to be 0. Read the binary string from left to right. Whenever you see a 0, your current number goes from $n$ to $2n$. Whenever you see a 1, it goes from $n$ to $2n+1$. Keep track of the current number modulo 5. Accept if you finish with a number whose value mod 5 is 0.

mod5.png
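The mod-5 bookkeeping described above can be sketched in a few lines of Python (an illustration of the idea, not the TM itself):

```python
# A sketch (not part of the notes) of the mod-5 bookkeeping:
# we never store the whole number, only its value mod 5.
def divisible_by_5(bits: str) -> bool:
    """Accept a binary string iff it denotes a multiple of 5 (empty = 0)."""
    n = 0
    for bit in bits:
        # reading 0 takes n to 2n; reading 1 takes n to 2n+1 (all mod 5)
        n = (2 * n + (1 if bit == "1" else 0)) % 5
    return n == 0
```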

Even More Examples

Want even more examples? Check out this article with six interesting examples.

Formal Definition

Y’all set for a formal definition?

Remember from our earlier notes on Automata Theory that machines can be transducers (producing an output) or recognizers (answering yes or no).

Here is the definition for a transducing TM:

A Turing Machine is a tuple $(Q, \Sigma, \Gamma, b, R, S)$ where:

  • $Q$ is a finite set of states
  • $\Sigma$ is the alphabet of symbols that comprise our inputs and outputs
  • $\Gamma$ is the set of symbols allowed to appear on the tape
  • $b \in \Gamma - \Sigma$ is the blank symbol
  • $R \subseteq (Q \times \Gamma) \times (\Gamma \times \{\textsf{L},\textsf{R}\} \times Q)$ says what can be done in a given state looking at a given symbol
  • $S \in Q$ is the distinguished start state

A configuration $\Psi$ is a tuple $(u,q,v) \in \Gamma^* \times Q \times \Gamma^*$ describing a machine with $uv$ on the tape preceded by, and followed by, an infinite number of blanks, with the machine looking at the first symbol of $v$.

The transition relation $\Longrightarrow$ on configurations relates configurations with the configurations they give rise to after one step of execution. Formally (note $u, v \in \Gamma^*$ and $\sigma, \sigma', \tau \in \Gamma$):

$$ \begin{array}{lcl} \Longrightarrow_R & = & \{ ((u,q,\sigma v),(u\sigma',q',v)) \mid (q,\sigma)\,R\,(\sigma',\textsf{R},q') \} \\ & \cup & \{ ((u\tau,q,\sigma v),(u,q',\tau\sigma'v)) \mid (q,\sigma)\,R\,(\sigma',\textsf{L},q') \} \end{array} $$

The machine $(Q, \Sigma, \Gamma, b, R, S)$ computes $w'$ from $w$ if $w, w' \in \Sigma^*$ and $$\exists\,u\,q\,\sigma\,v. (\varepsilon, S, w) \Longrightarrow^*_R (u, q, \sigma v) \wedge w' = \textsf{trim}(u\sigma v) \wedge \not \exists \Psi. (q, \sigma)\,R\,\Psi$$
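One step of the transition relation can be sketched in Python. This is a hedged sketch, not part of the formal notes: the dictionary-of-rules representation is ours, and for simplicity it assumes at most one rule per (state, symbol) pair, i.e., a deterministic $R$.

```python
# One step of the transition relation on configurations (u, q, v), where the
# head scans the first symbol of v (blanks are conjured as needed). Rules map
# (state, symbol) to (write, move, next_state).
BLANK = "#"

def step(rules, config):
    u, q, v = config
    sigma, rest = (v[0], v[1:]) if v else (BLANK, "")
    if (q, sigma) not in rules:
        return None                          # no applicable rule: halted
    write, move, q2 = rules[(q, sigma)]
    if move == "R":
        return (u + write, q2, rest)         # written cell joins the left part
    left, tau = (u[:-1], u[-1]) if u else ("", BLANK)
    return (left, q2, tau + write + rest)    # head lands on the cell to the left
```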

What about the recognizer flavor? It’s just this:

A Recognizing Turing Machine is a tuple $(Q, \Sigma, \Gamma, b, R, S, A)$ where:

  • $(Q, \Sigma, \Gamma, b, R, S)$ is as defined above
  • $A \subseteq Q$ is a distinguished subset of accept states
  • Configurations and the transition relation are as above

The machine $(Q, \Sigma, \Gamma, b, R, S, A)$ accepts the string $w$ if $w \in \Sigma^*$ and $$\exists u\,q\,\sigma\,v. (\varepsilon, S, w) \Longrightarrow^*_R (u, q, \sigma v) \wedge q \in A \wedge \not \exists \Psi. (q, \sigma)\,R\,\Psi$$

The set of all strings accepted by machine $M$ is said to be $L(M)$, the language recognized by $M$.

Recognizers and Deciders

Note that a recognizer just has to answer YES for all strings in the language it is trying to recognize. For strings not in the language, a recognizer may loop forever. If a recognizer always halts with a YES or NO, it is called a decider.

Exercise: Study the definitions above until you understand everything about them and can regenerate them without looking. Don’t just memorize. Understand.

Example. Let’s use the formal definition to describe a Turing Machine that increments a binary number. Here’s the machine as a picture:

tmIncrementStandard.png

and here is the machine as a table:

| State | Symbol | Write | Move | Next State |
|-------|--------|-------|------|------------|
| $\textsf{S}$ | $0$ | $0$ | R | $\textsf{S}$ |
| $\textsf{S}$ | $1$ | $1$ | R | $\textsf{S}$ |
| $\textsf{S}$ | $\#$ | $\#$ | L | $\textsf{F}$ |
| $\textsf{F}$ | $1$ | $0$ | L | $\textsf{F}$ |
| $\textsf{F}$ | $0$ | $1$ | L | $\textsf{D}$ |
| $\textsf{F}$ | $\#$ | $1$ | L | $\textsf{D}$ |

Let’s identify the parts. (For simplicity, we’ll use a single character for the state names.) Our machine is $(Q, \Sigma, \Gamma, \#, R, \textsf{S})$ where:

$Q = \{ \textsf{S}, \textsf{F}, \textsf{D} \}$
$\Sigma = \{ 0, 1 \}$
$\Gamma = \{ 0, 1, \# \}$
$\begin{array}{l} R = \{ \\ \;\; ((\textsf{S}, 0), (0, \textsf{R}, \textsf{S})), \\ \;\; ((\textsf{S}, 1), (1, \textsf{R}, \textsf{S})), \\ \;\; ((\textsf{S}, \#), (\#, \textsf{L}, \textsf{F})), \\ \;\; ((\textsf{F}, 1), (0, \textsf{L}, \textsf{F})), \\ \;\; ((\textsf{F}, 0), (1, \textsf{L}, \textsf{D})), \\ \;\; ((\textsf{F}, \#), (1, \textsf{L}, \textsf{D})), \\ \} \end{array}$

So there you have it. Any Turing Machine is just a mathematical object. Now we can study them formally.
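The increment machine just defined is small enough to execute. Below is a hedged sketch of a runner (the `run` helper, the fuel limit, and the dictionary encoding of $R$ are ours, not from the notes); it halts when no rule applies and trims surrounding blanks, mirroring the formal definition of computing $w'$ from $w$.

```python
# Simulating the increment machine. RULES is transcribed from the table above;
# states S, F, D and blank '#' are as in the notes.
BLANK = "#"
RULES = {
    ("S", "0"): ("0", "R", "S"),
    ("S", "1"): ("1", "R", "S"),
    ("S", BLANK): (BLANK, "L", "F"),
    ("F", "1"): ("0", "L", "F"),
    ("F", "0"): ("1", "L", "D"),
    ("F", BLANK): ("1", "L", "D"),
}

def run(rules, w, start="S", fuel=10_000):
    """Run to a halt (no applicable rule) and return the trimmed tape contents."""
    u, q, v = "", start, w
    for _ in range(fuel):
        sigma, rest = (v[0], v[1:]) if v else (BLANK, "")
        if (q, sigma) not in rules:
            return (u + sigma + rest).strip(BLANK)  # trim surrounding blanks
        write, move, q = rules[(q, sigma)]
        if move == "R":
            u, v = u + write, rest
        else:
            u, v = u[:-1], (u[-1] if u else BLANK) + write + rest
    raise RuntimeError("ran out of fuel")
```

For example, running the machine on `1011` (eleven) should leave `1100` (twelve) on the tape.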

Encoding

There are many ways to succinctly and unambiguously encode a Turing machine. A common approach lists only the rules, using a single symbol for each state, making the start state appear at the beginning of the first rule, and simply inferring the input and tape alphabets. For example:

    S00RS_S11RS_S##LF_F01LD_F#1LD

Since each rule has exactly five symbols, we don’t really need the underscores:

    S00RSS11RSS##LFF01LDF#1LD

Later when we look at deterministic machines, we will see a different encoding.

Given a machine $M$, we generally use $\langle M\rangle$ to refer to its encoding.

The nice thing about encoding machines is that we can provide them as input to other machines. More on this in a few minutes.

Universality

In principle, we can build Turing Machines for computing all kinds of functions: capitalizing text, finding words in text, computing interest payments, finding documents on the World Wide Web, and so on. But functions themselves can be represented in symbols, whether in mathematical notation, plain-text Python, or any formalism of your choice. Just a few minutes ago we saw that the Turing machine to increment a binary number can itself be represented as a string:

    S00RSS11RSS##LFF01LDF#1LD

Given this, perhaps we can build a machine (call it $U$) such that when you give it ANY function $f$ (as text) and some data $x$, $U$ will produce $f(x)$. That is: $$U(f,x) = f(x)$$

As it turns out, for a certain class of functions, you can construct such a machine. You might take that for granted today, but it was not shown to be true until the 20th century! Really! (As far as anyone knows, that is.) This profound result is due to, you guessed it, Alan Turing, and it pretty much changed the world. This $U$ is called the Universal Turing Machine in his honor. Today we build variations of the original $U$. We call them computers.

Stop.

Think about this.

Woah.

Just woah.

This may be the most profound idea anyone has ever had about computing.

Are you starting to see why Alan Turing is revered as the founder of computer science?

Exercise: If you haven’t watched the video at the top of these notes, do so now. Jade not only gives the backstory of the Turing machine’s creation and works a nontrivial example, but also covers the idea of universality and its importance. This video is so spot on, you need to watch it. Maybe watch it again even if you already did.

You can also start reading about Universal Machines at Wikipedia.

Exercise: Why is it kind of cool that Turing’s work predated real general purpose computers?

Variations on the Turing Machine

Let’s mix things up a bit.

Determinism

We’ve defined our machines with a transition relation that allows multiple outgoing transitions for the same state-symbol pair. But what if we required the machine to have, for every state-symbol pair, one and only one transition? That is, instead of:

$$ R \subseteq (Q \times \Gamma) \times (\Gamma \times \{L, R\} \times Q) $$

we required $R$ to be a function? Doing so would require us to expand the codomain of the function so that the machine could stop. We could add a halting action:

$$ R : (Q \times \Gamma) \rightarrow (\Gamma \times \{L, R\} \times Q) \cup \{ \textsf{HALT} \} $$

This says: if the machine is in some state $q \in Q$, and currently looking at a symbol $\sigma \in \Gamma$, then either:

  1. Replace the symbol with a new symbol $\sigma'$, move one cell, and enter state $q' \in Q$, OR
  2. Halt

A slightly different definition of a deterministic Turing Machine treats “halting” as a kind of state, a halting pseudo-state:

$$ R : (Q \times \Gamma) \rightarrow (\Gamma \times \{L, R\} \times (Q \cup \{ \textsf{HALT} \})) $$

This says: if the machine is in some state $q \in Q$, and currently looking at a symbol $\sigma \in \Gamma$, then:

  1. Replace the symbol with a new symbol $\sigma'$, move one cell, THEN
  2. Either enter state $q' \in Q$ or Halt

Exercise: Study and appreciate the subtle differences between these definitions. Do you feel your mathematical maturity rising?

With deterministic machines, we can write our tables in a different way. Rows are states, columns are tape symbols, and we place the triple (symbol-to-write, direction-to-move, next-state) in the appropriate cell. For our signed binary number negation machine,

tmNegation.png

we get the following table (for the halt-as-action variety, with “—” representing the halt action, and states labeled traditionally as $A, B, C, \ldots$):

|   | $0$ | $1$ | $\#$ |
|---|-----|-----|------|
| A | 0RA | 1RA | #LB  |
| B | 0LB | 1LC | —    |
| C | 1LC | 0LC | —    |

For the halt-as-state variety, the special halt state has no outgoing transitions, so it is never listed in the left column. We can redo our increment machine in this fashion:

tmIncrementHaltState.png

and label the halt state with a Z (or an H, or any other character that does not name a regular state):

|   | $0$ | $1$ | $\#$ |
|---|-----|-----|------|
| A | 0RA | 1RA | #LB  |
| B | 1LZ | 0LB | 1LZ  |

CLASSWORK

How many $n$-state deterministic TMs are there?

For a tape alphabet of $k$ symbols, it turns out that for the halt-as-action variety, there are $(2kn+1)^{kn}$ machines, and for the halt-as-state variety, there are $(2k(n+1))^{kn}$ machines.

For the $k=2$ case, the numbers are $(4n+1)^{2n}$ and $(4(n+1))^{2n}$, respectively.

We will work it out.
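You can sanity-check the two formulas numerically (a sketch; the function names are our own):

```python
# Counting n-state deterministic TMs over a tape alphabet of k symbols.
def halt_as_action(n: int, k: int = 2) -> int:
    # Each of the k*n (state, symbol) pairs maps to one of the
    # k * 2 * n (write, move, next-state) triples, or to HALT.
    return (2 * k * n + 1) ** (k * n)

def halt_as_state(n: int, k: int = 2) -> int:
    # Codomain is Gamma x {L, R} x (Q plus one extra halt state).
    return (2 * k * (n + 1)) ** (k * n)
```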

A fun fact: this simplification does not change the computing power of the model. Anything computed by a deterministic TM can be computed by a general (relation-based) TM, and vice versa.

Exercise: Find a proof of this in any theory textbook.

Stationary “Moves”

Instead of having to go left or right at each step, we can also provide the option to stay put. No change to computing power.

$$R \subseteq (Q \times \Gamma) \times (\Gamma \times \{\textsf{L},\textsf{R},\textsf{X}\} \times Q)$$

Here X means stay on the same cell.
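One way to see why this adds no power: each stay-put rule can be compiled into a right-move into a fresh "bounce" state that immediately moves back left, whatever it sees. A hedged sketch (the tape alphabet and the fresh-state naming scheme are assumptions for illustration):

```python
# Compile away X ("stay") moves: (q, sym) -> (write, X, q2) becomes a
# right-move into a fresh state that steps back left on any symbol.
GAMMA = ["0", "1", "#"]  # assumed tape alphabet for this illustration

def eliminate_stays(rules):
    out, fresh = {}, 0
    for (q, sym), (write, move, q2) in rules.items():
        if move != "X":
            out[(q, sym)] = (write, move, q2)
            continue
        bounce = f"bounce{fresh}->{q2}"
        fresh += 1
        out[(q, sym)] = (write, "R", bounce)  # step right...
        for tau in GAMMA:                     # ...then step back left unchanged
            out[(bounce, tau)] = (tau, "L", q2)
    return out
```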

$\varepsilon$-Moves

In the base model, the machine looks at the current symbol and decides what to do next. We can let it just decide to jump to another state without examining the current symbol. Again, no change in computing power.

$$R \subseteq (Q \times (\Gamma \cup \{ \varepsilon \})) \times (\Gamma \times \{\textsf{L},\textsf{R}\} \times Q)$$

Multidimensional Tapes

How about, instead of a tape that is infinite only in the left and right directions, we make a two dimensional tape that can go up and down also? Or a 3-D tape? Or make a machine whose tape is any finite number of dimensions? All good. These extensions do not increase the computing power of the model.

One-way Infinite Tape

What if the tape had a left-end, and was infinite only to the right? A little messier to program, perhaps, but no change in computing power.

Multiple Tapes

More than one tape? More convenient. But you guessed it, no change in computing power.

Stack and Queue Tapes

If we restrict our tape memory to act like a queue, we get a Queue Automaton, with no change in computing power. But if our tape were stack-like, we’ve reduced our power: we can no longer compute certain functions (recognize certain languages). But if we had two stacks, we’d be back to full power.
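The two-stacks trick can be sketched directly: keep everything to the left of the head on one stack (top = nearest cell), and the head’s cell plus everything to its right on the other. A sketch (the class and method names are our own):

```python
# A tape simulated by two stacks. The right stack's top is the scanned cell;
# blank cells are conjured on demand at either end.
BLANK = "#"

class TwoStackTape:
    def __init__(self, w: str):
        self.left, self.right = [], list(reversed(w))

    def read(self) -> str:
        return self.right[-1] if self.right else BLANK

    def write(self, sym: str) -> None:
        if self.right:
            self.right[-1] = sym
        else:
            self.right.append(sym)

    def move(self, d: str) -> None:
        if d == "R":
            self.left.append(self.right.pop() if self.right else BLANK)
        else:  # "L"
            self.right.append(self.left.pop() if self.left else BLANK)
```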

Read-Only and Write-Only and Append-Only Tapes

Sometimes we might want to have a bunch of tapes and use certain tapes only in certain ways. Maybe we can only read, or only write, or only append. Hmmm, tapes are starting to look flexible. We can imagine a tape restricted in any combination of these ways.

Linear Bounded Automata (LBAs)

What if we bounded the tape on both ends? There’s this thing called a Linear Bounded Automaton, which is a TM with a bounded tape, just big enough to hold the input plus two additional special tape symbols: the left-end marker and the right-end marker. These symbols cannot be overwritten, and you cannot move beyond them. They are super convenient, because how else would you keep from falling off the ends?

lba.png

Any change in computing power? It turns out yes. The computing power is reduced.

Exercise: (Research) Find a language recognizable by a Turing Machine but not by an LBA.

Pushdown Automata (PDAs)

It turns out that the class of context-free languages is exactly the class recognized by a machine with two tapes: (1) a bounded, read-only, right-move-only tape holding exactly the input, and (2) a stack-like, one-way infinite tape for whatever you need. This kind of machine is called a pushdown automaton (PDA). It is useful because stacks are super efficient: $\Theta(1)$ for pushing and popping. The PDA runs until there is no transition to make, or until the machine exhausts its input.

If you want a PDA transducer, add an output tape.

pda.png
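As a taste of what a PDA can do, here is a sketch (ours, not from the notes) of a stack-based decider for the classic context-free language $\{a^n b^n \mid n \ge 0\}$:

```python
# Decide membership in {a^n b^n} PDA-style: read the input once, left to
# right, using only a stack for memory.
def a_n_b_n(w: str) -> bool:
    stack, seen_b = [], False
    for c in w:
        if c == "a" and not seen_b:
            stack.append("A")   # push one marker per leading 'a'
        elif c == "b" and stack:
            seen_b = True
            stack.pop()         # match each 'b' against a marker
        else:
            return False        # out-of-order symbol or unmatched 'b'
    return not stack            # accept iff every 'a' was matched
```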

Finite Automata (FAs)

What if we had no memory at all? We’d have something called a Finite Automaton.

fa.png

A lot of computations fall into this category.

Online Simulators

When writing a Turing machine to carry out a computation or decide a question, don’t just guess! There are a couple of online simulators out there, such as this one and this one, so you can check your work!

Exercise: Write your own simulator.

Summary

We’ve covered:

  • Historical motivation
  • How Turing viewed computation
  • Lots of Examples
  • Formal Definition
  • Variations
  • Universality
  • Busy Beavers