History helps a lot! Why do we have Turing Machines? Why do we care? Why are we better knowing about Turing Machines than not knowing them?
Hear the answers in class. We’ll go through 19th-century advances in logic, the formalists versus the intuitionists, Hilbert’s challenges, the need to formalize computation, Gödel’s and Church’s approaches, how Turing’s view of computation was the most mechanistic (as opposed to logical or mathematical) and was influenced by the human computers of the day, and how he translated all these ideas into simple mathematical constructs that, in some real sense, nailed it!
Turing looked at computers of his day and asked “What are these people doing?” He thought about it and realized these folks would write something down, think for a bit, write something else or erase something, think some more, write or erase, and so on. They would move from mental state to mental state as they worked, deciding what to do next based on what mental state they were in and what was currently written.
He realized he could capture this process formally and keep the model very simple: instead of a two-dimensional page, a single one-dimensional tape would do, and instead of randomly jumping around, it would be okay to move one cell at a time. Today we picture the machines like this:
The idea is this:
When in state $q$ looking at symbol $\sigma$, write a new symbol $\sigma'$, move either one square left or right, then go to state $q'$.
Here’s the story of the development of the Turing Machine and its impact:
Let’s see some examples before we get formal.
If the 13 examples in the video were not enough, we have more.
TODO
This machine reverses any string over the alphabet $\{a, b, c\}$:
TODO
TODO
TODO
TODO
Here’s a recognizer machine to determine if its input, a binary numeral, is even:
Hey, there is a connection to language theory here! The TM just given answers the question “Is this number even?” and, equivalently, recognizes the language $\{ w \in \{0,1\}^* \mid w \textrm{ is an even binary integer} \}$. That’s right: answering yes-no questions is isomorphic to determining membership in a language.
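The two views really are the same thing, which a quick sketch makes plain (the function name and the decision to reject the empty string are assumptions for illustration):

```python
def is_even_binary(w: str) -> bool:
    """Answer the question 'Is this number even?' -- which is exactly
    the same as testing membership in the language of even binary
    numerals: a binary numeral is even iff its last bit is 0."""
    return len(w) > 0 and all(c in "01" for c in w) and w[-1] == "0"
```

The yes/no question and the membership test are one and the same predicate.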
Here is a machine to determine if a string over the alphabet $\{a,b,c\}$ is a palindrome:
TODO
Take the empty string to denote 0. Read the binary string from left to right. Whenever you see a 0, your current number goes from $n$ to $2n$. Whenever you see a 1, it goes to $2n+1$. Keep track of the current number modulo 5, and accept if you finish with a number whose value mod 5 is 0.
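The procedure just described only ever needs five pieces of information (the current remainder), which is why a finite machine suffices. Here is a minimal sketch (function name is mine):

```python
def divisible_by_5(w: str) -> bool:
    """Scan a binary string left to right, tracking the value mod 5.

    The empty string denotes 0. Reading a 0 maps n to 2n; reading a 1
    maps n to 2n+1 -- all taken mod 5, so only the five remainders
    0..4 are ever needed as 'states'."""
    r = 0
    for c in w:
        r = (2 * r + (1 if c == "1" else 0)) % 5
    return r == 0
```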
Want even more examples? Check out this article with six interesting examples, including:
Y’all set for a formal definition?
Remember from our earlier notes on Automata Theory that machines can be transducers (produce an output) or recognizers (answer yes or no).
Here is the definition for a transducing TM:
A Turing Machine is a tuple $(Q, \Sigma, \Gamma, b, R, S)$ where:
A configuration $\Psi$ is a tuple $(u,q,v) \in \Gamma^* \times Q \times \Gamma^*$ describing a machine with $uv$ on the tape preceded by, and followed by, an infinite number of blanks, with the machine looking at the first symbol of $v$.
The transition relation $\Longrightarrow$ on configurations relates configurations with the configurations they give rise to after one step of execution. Formally (note $u, v \in \Gamma^*$ and $\sigma, \sigma' \in \Gamma$):
$$ \begin{array}{lcl} \Longrightarrow_R & = & \{ ((u,q,\sigma v),(u\sigma',q',v)) \mid (q,\sigma)\,R\,(\sigma',\textsf{R},q') \} \\ & \cup & \{ ((u\sigma,q,v),(u,q',\sigma'v)) \mid (q,\sigma)\,R\,(\sigma',\textsf{L},q') \} \end{array} $$The machine $(Q, \Sigma, \Gamma, b, R, S)$ computes $w'$ from $w$ if $w, w' \in \Sigma^*$ and $$\exists\,u\,q\,\sigma\,v. (\varepsilon, S, w) \Longrightarrow^*_R (u, q, \sigma v) \wedge w' = \textsf{trim}(u\sigma v) \wedge \not \exists \Psi. (q, \sigma)\,R\,\Psi$$
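The relation $\Longrightarrow_R$ above can be sketched directly in code, which is a useful sanity check on the definition. This is an illustrative sketch: the names and the choice of `#` as the blank symbol are assumptions, and $R$ is given as a list of (left-hand side, right-hand side) pairs so multiple successors are possible.

```python
BLANK = "#"  # assumed blank symbol

def step(config, R):
    """All configurations one ==>_R step away from (u, q, v).

    (u, q, v) means u+v is on the tape, surrounded by blanks, with
    the head on the first symbol of v (a blank if v is empty)."""
    u, q, v = config
    sigma, rest = (v[0], v[1:]) if v else (BLANK, "")
    succs = []
    for (q1, s1), (s2, move, q2) in R:
        if (q1, s1) != (q, sigma):
            continue
        if move == "R":
            succs.append((u + s2, q2, rest))
        else:  # move == "L": the last symbol of u becomes the head cell
            left, last = (u[:-1], u[-1]) if u else ("", BLANK)
            succs.append((left, q2, last + s2 + rest))
    return succs
```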
What about the recognizer flavor? It’s just this:
A Recognizing Turing Machine is a tuple $(Q, \Sigma, \Gamma, b, R, S, A)$ where:
The set of all strings accepted by machine $M$ is said to be $L(M)$, the language recognized by $M$.
**Recognizers and Deciders.** Note that a recognizer just has to answer YES for all strings in the language it is trying to recognize. For strings not in the language, a recognizer may loop forever. If a recognizer always halts with a YES or NO, it is called a decider.
Example. Let’s use the formal definition to describe a Turing Machine that increments a binary number. Here’s the machine as a picture:
and here is the machine as a table:
State | Symbol | Write | Move | Next State |
---|---|---|---|---|
$\textsf{S}$ | $0$ | 0 | R | $\textsf{S}$ |
$\textsf{S}$ | $1$ | 1 | R | $\textsf{S}$ |
$\textsf{S}$ | $\#$ | $\#$ | L | $\textsf{F}$ |
$\textsf{F}$ | $1$ | 0 | L | $\textsf{F}$ |
$\textsf{F}$ | $0$ | 1 | L | $\textsf{D}$ |
$\textsf{F}$ | $\#$ | 1 | L | $\textsf{D}$ |
Let’s identify the parts. (For simplicity, we’ll use a single character for the state names.) Our machine is $(Q, \Sigma, \Gamma, \#, R, \textsf{S})$ where:
So there you have it. Any Turing Machine is just a mathematical object. Now we can study them formally.
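Since the machine is now a mathematical object, we can also execute it mechanically. Here is a minimal sketch of a runner for the increment machine above, following the formal semantics: take a step while a rule applies, and when none does, trim the surrounding blanks. (The dictionary representation and function names are assumptions for illustration.)

```python
# Rules keyed by (state, symbol-read); values are
# (symbol-to-write, move, next-state). Blank is '#'.
RULES = {
    ("S", "0"): ("0", "R", "S"),
    ("S", "1"): ("1", "R", "S"),
    ("S", "#"): ("#", "L", "F"),
    ("F", "1"): ("0", "L", "F"),
    ("F", "0"): ("1", "L", "D"),
    ("F", "#"): ("1", "L", "D"),
}

def run(w, rules, start="S", blank="#"):
    u, q, v = "", start, w          # start configuration (ε, S, w)
    while True:
        sigma, rest = (v[0], v[1:]) if v else (blank, "")
        if (q, sigma) not in rules:  # no transition: machine halts
            return (u + sigma + rest).strip(blank)  # trim blanks
        s2, move, q = rules[(q, sigma)]
        if move == "R":
            u, v = u + s2, rest
        else:
            u, v = u[:-1], (u[-1] if u else blank) + s2 + rest
```

Running `run("11", RULES)` yields `"100"`: three plus one is four, in binary.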
There are many ways to succinctly and unambiguously encode a Turing machine. A common approach lists only the rules, using a single symbol for each state, making the start state appear at the beginning of the first rule, and leaving the input and tape alphabets to be inferred. For example:
S00RS_S11RS_S##LF_F10LF_F01LD_F#1LD
Since each rule has exactly five symbols, we don’t really need the underscores:
S00RSS11RSS##LFF10LFF01LDF#1LD
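Decoding is mechanical: chop the string into five-symbol chunks. A sketch (assuming the five-symbol layout of state, symbol read, symbol written, move, next state):

```python
def decode(code: str):
    """Split a rule-list encoding into its rules. Each rule is five
    symbols: state, read-symbol, write-symbol, move, next-state."""
    assert len(code) % 5 == 0
    rules = {}
    for i in range(0, len(code), 5):
        q, s, s2, move, q2 = code[i:i + 5]
        rules[(q, s)] = (s2, move, q2)
    start = code[0]  # start state heads the first rule, by convention
    return rules, start
```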
Later when we look at deterministic machines, we will see a different encoding.
Given a machine $M$, we generally use $\langle M\rangle$ to refer to its encoding.
The nice thing about encoding machines is that we can provide them as input to other machines. More on this in a few minutes.
In principle, we can build Turing Machines for computing all kinds of functions, such as capitalizing text, finding words in text, computing interest payments, finding documents on the World Wide Web, etc. But functions themselves can be represented in symbols, whether in mathematical notation, plain-text Python, or any formalism of your choice. Just a few minutes ago we saw that the Turing machine to increment a binary number can itself be represented as a string:
S00RSS11RSS##LFF10LFF01LDF#1LD
Given this, perhaps we can build a machine (call it $U$) such that when you give it ANY function $f$ (as text) and some data $x$, $U$ will produce $f(x)$? That is: $$U(f,x) = f(x)$$
As it turns out, for a certain class of functions, you can construct such a machine. You might take that for granted today, but it was not shown to be true until the 20th century! Really! (As far as anyone knows, that is.) This profound result is due to, you guessed it, Alan Turing, and it pretty much changed the world. This $U$ is called the Universal Turing Machine in his honor. Today we build variations of the original $U$. We call them computers.
Stop.
Think about this.
Woah.
Just woah.
This may be the most profound idea anyone has ever had about computing.
Are you starting to see why Alan Turing is revered as the founder of computer science?
You can also start reading about Universal Machines at Wikipedia.
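To make the idea of $U$ concrete, here is a toy version in the spirit of Turing's construction: it takes a machine as text plus an input, decodes the rules, and simulates them. (Of course, the real point is that $U$ can itself be a Turing machine; this Python sketch, with its assumed five-symbols-per-rule encoding and `#` blank, just illustrates the behavior.)

```python
def U(encoded_machine: str, x: str, blank: str = "#") -> str:
    """Simulate the machine described by encoded_machine on input x."""
    # Decode: five symbols per rule (state, read, write, move, next).
    rules = {}
    for i in range(0, len(encoded_machine), 5):
        q, s, s2, move, q2 = encoded_machine[i:i + 5]
        rules[(q, s)] = (s2, move, q2)
    # Simulate from the start state until no rule applies.
    u, q, v = "", encoded_machine[0], x
    while True:
        sigma, rest = (v[0], v[1:]) if v else (blank, "")
        if (q, sigma) not in rules:
            return (u + sigma + rest).strip(blank)
        s2, move, q = rules[(q, sigma)]
        if move == "R":
            u, v = u + s2, rest
        else:
            u, v = u[:-1], (u[-1] if u else blank) + s2 + rest
```

One program, fed other programs as data: `U` increments when handed the increment machine, and would just as happily run any other rule string.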
Let’s mix things up a bit.
We’ve defined our machines with a transition relation, which may allow multiple outgoing transitions for the same (state, symbol) pair. But what if we required the machine to have exactly one transition for every state and symbol? That is, instead of:
$$ R \subseteq (Q \times \Gamma) \times (\Gamma \times \{L, R\} \times Q) $$we required $R$ to be a function? Doing so would require us to expand the codomain of the function so that the machine could stop. We could add a halting action:
$$ R : (Q \times \Gamma) \rightarrow (\Gamma \times \{L, R\} \times Q) \cup \{ \textsf{HALT} \} $$This says: if the machine is in some state $q \in Q$, and currently looking at a symbol $\sigma \in \Gamma$, then either:
A slightly different definition of a deterministic Turing Machine treats “halting” as a kind of state, a halting pseudo-state:
$$ R : (Q \times \Gamma) \rightarrow (\Gamma \times \{L, R\} \times (Q \cup \{ \textsf{HALT} \})) $$This says: if the machine is in some state $q \in Q$, and currently looking at a symbol $\sigma \in \Gamma$, then:
With deterministic machines, we can write our tables in a different way. We place the triple (current-symbol, symbol-to-write, next-state) in the appropriate cell. For our signed binary number negation machine,
we get the following table (for the halt-as-action variety, with “—” representing the halt action, and states labeled traditionally as $A, B, C, \ldots$):
0 | 1 | # | |
---|---|---|---|
A | 0RA | 1RA | #LB |
B | 0LB | 1LC | — |
C | 1LC | 0LC | — |
For the halt-as-state variety, the special halt state has no outgoing transitions, so it is never listed in the left column. We can redo our increment machine in this fashion:
and label the halt state with a Z (or an H, or any other character that does not name a regular state):
0 | 1 | # | |
---|---|---|---|
A | 0RA | 1RA | #LB |
B | 1LZ | 0LB | 1LZ |
**How many $n$-state deterministic TMs are there?** For a tape alphabet of $k$ symbols, it turns out that for the halt-as-action variety, there are $(2kn+1)^{kn}$ machines, and for the halt-as-state variety, there are $(2k(n+1))^{kn}$ machines.
For the $k=2$ case, the numbers are $(4n+1)^{2n}$ and $(4(n+1))^{2n}$, respectively.
We will work it out.
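As a quick sanity check of the formulas, here is the counting in code (function names are mine): each of the $kn$ (state, symbol) pairs independently picks one of $2kn$ actions plus HALT (halt-as-action), or one of $2k(n+1)$ actions when HALT is an extra target state.

```python
def count_halt_as_action(n: int, k: int = 2) -> int:
    # kn domain entries, each mapped to one of 2kn actions or HALT
    return (2 * k * n + 1) ** (k * n)

def count_halt_as_state(n: int, k: int = 2) -> int:
    # kn domain entries, each mapped to one of 2k(n+1) actions
    return (2 * k * (n + 1)) ** (k * n)
```

For $k=2$, $n=1$: $5^2 = 25$ machines of the first kind and $8^2 = 64$ of the second.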
A fun fact: this restriction does not change the computing power of the model. Anything computed by a deterministic TM can be computed by a TM of the original (relational) kind, and vice versa.
Instead of having to go left or right at each step, you can also provide the option to stay put. No change to computing power.
$$R \subseteq (Q \times \Gamma) \times (\Gamma \times \{\textsf{L},\textsf{R},\textsf{X}\} \times Q)$$Here X means stay on the same cell.
In the base model, the machine looks at the current symbol and decides what to do next. We can let it just decide to jump to another state without examining the current symbol. Again, no change in computing power.
$$R \subseteq (Q \times (\Gamma \cup \{ \varepsilon \})) \times (\Gamma \times \{\textsf{L},\textsf{R}\} \times Q)$$How about, instead of a tape that is infinite only in the left and right directions, we make a two dimensional tape that can go up and down also? Or a 3-D tape? Or make a machine whose tape is any finite number of dimensions? All good. These extensions do not increase the computing power of the model.
What if the tape had a left-end, and was infinite only to the right? A little messier to program, perhaps, but no change in computing power.
More than one tape? More convenient. But you guessed it, no change in computing power.
If we restricted our tape memory to act like a queue, we have a Queue Automaton and no change in computing power. But if our tape was stack-like, we’ve reduced our power: we can no longer compute certain functions (recognize certain languages). But if we had two stacks we are back to full power.
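The two-stacks claim has a pleasant proof idea: two stacks can simulate a tape, with one stack holding everything left of the head and the other holding the head cell and everything to its right. A minimal sketch of that simulation (class and method names are mine):

```python
class TwoStackTape:
    """Simulate an unbounded TM tape with two stacks."""

    def __init__(self, w: str, blank: str = "#"):
        self.blank = blank
        self.left = []                  # cells left of the head
        self.right = list(reversed(w))  # head cell sits on top

    def read(self) -> str:
        return self.right[-1] if self.right else self.blank

    def write(self, s: str) -> None:
        if self.right:
            self.right[-1] = s
        else:
            self.right.append(s)

    def move_left(self) -> None:
        # the nearest left cell (or a fresh blank) becomes the head cell
        self.right.append(self.left.pop() if self.left else self.blank)

    def move_right(self) -> None:
        # the head cell (or a fresh blank) shifts onto the left stack
        self.left.append(self.right.pop() if self.right else self.blank)
```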
Sometimes we might want to have a bunch of tapes and use certain tapes only in certain ways. Maybe we can only read or only write or only append. Hmmm, tapes are starting to look flexible. We can imagine a tape that is:
What if we bounded the tape on both ends? There’s this thing called a Linear Bounded Automaton, which is a TM with a bounded tape, just big enough to hold the input plus two additional special tape symbols, the left-end-marker and the right-end-marker. These symbols cannot be overwritten, and you cannot move beyond them. They are super convenient, because how else would you keep from falling off the ends?
Any change in computing power? It turns out yes. The computing power is reduced.
It turns out that the class of context-free languages can be decided exactly by a machine with two tapes: (1) a bounded, read-only, right-move-only tape holding exactly the input, and (2) a stack-like, one-way infinite tape for whatever you need. This kind of machine is called a push-down automaton (PDA). It is useful because stacks are super efficient, $\Theta(1)$ for pushing and popping. The PDA runs until there is no transition to make, or until the machine exhausts its input.
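To see the input-tape-plus-stack discipline in action, here is a sketch of a PDA-style decider for the classic context-free language $\{a^n b^n \mid n \geq 0\}$ (the function name and marker symbol are assumptions): the input is scanned once, left to right, and the stack is the only scratch memory.

```python
def accepts_anbn(w: str) -> bool:
    """PDA-style decider for {a^n b^n : n >= 0}: one left-to-right
    pass over the input, a stack as the only memory."""
    stack = []
    i = 0
    while i < len(w) and w[i] == "a":  # push a marker for each 'a'
        stack.append("A")
        i += 1
    while i < len(w) and w[i] == "b":  # pop one marker per 'b'
        if not stack:
            return False               # more b's than a's
        stack.pop()
        i += 1
    # accept iff the input is exhausted and the counts matched
    return i == len(w) and not stack
```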
If you want a PDA transducer, add an output tape.
What if we had no memory at all? We’d have something called a Finite Automaton.
A lot of computations fall into this category.
When writing a Turing machine to carry out a computation or decide a question, don’t just guess! There are a couple of online simulators out there, such as this one and this one, so you can check your work!
We’ve covered: