David Hilbert was a master of formal logic, and he wondered about many things. He challenged the mathematicians of his day with many problems.
Thinking there might be a way to algorithmically determine the truth or falsity of any statement in an axiomatic system, he posed the Entscheidungsproblem.
Philip Wadler tells the story:
Computation has limits. Mathematicians and programmers will never go out of business. Computer Science will always be an interesting field. We can dig deeper into this field of computation, and in particular, study what it means to be computable.
We learned from Wadler that all three of these (the lambda calculus, general recursive functions, and Turing machines) were equivalent. It turns out that a bunch of other efforts to formalize computation were also equivalent.
Because all of these formalisms are equivalent, and no one has found something better (other than magic oracles), most folks hypothesize that we really have formally captured our intuitive notion of computability with these things. The Church-Turing Thesis states that what we intuitively understand by “computation” is captured by the Turing Machine.
In retrospect, one wonders why Hilbert suspected that every statement in math (technically, in formal axiomatic systems capable of representing arithmetic and beyond) was decidable! And thus, why all true statements should have proofs, or why there should exist a finite algorithm that could compute every function in $\mathbb{N} \rightarrow \mathbb{N}$ (or even in $\mathbb{N} \rightarrow \mathbb{B}$).
For example, a simple diagonalization proof tells you there are non-computable functions in $\mathbb{N} \rightarrow \mathbb{N}$:
Proof: Programs are finite strings of symbols, so there are only countably many of them. If every function were computable, each would have a program computing it, so there could be only countably many functions. But the functions from $\mathbb{N} \rightarrow \mathbb{N}$ cannot be enumerated: if we could list them all as $f_0, f_1, f_2, \ldots$, then the function $f$ defined by $f(n) = f_n(n)+1$ would differ from every $f_n$ at input $n$, and so could not be on the list. Any such list is incomplete, so there are more functions than programs, and some functions are not computable.
We’re going to write out this diagonalization on the board.
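The diagonal construction itself can be sketched in a few lines of Python. This is an illustration over a finite list of functions (the real argument applies to any purported complete enumeration), and the particular functions in `fs` are made up for the example:

```python
def diagonal(fs):
    """Build g(n) = f_n(n) + 1, which differs from f_i at input i."""
    return lambda n: fs[n](n) + 1

# A finite, toy "enumeration" of functions from N to N:
fs = [lambda n: 0, lambda n: n, lambda n: n * n]

g = diagonal(fs)
# g disagrees with every listed function at the diagonal position,
# so it cannot be anywhere on the list:
assert all(g(i) != fs[i](i) for i in range(len(fs)))
```

However the list is extended, the same construction produces a function missing from it.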
Okay nice, so there are more functions than programs. But are there any realistic functions we can specify that we cannot decide?
Do there exist problems for which all instances have a YES-NO answer but no computer, no matter how powerful, nor any algorithm, in any formalism whatsoever, can compute the answer?
Of course, as a simple counting argument shows. But are there any that we can describe and actually understand? Yes, and here’s the classic first example. It is known as The Halting Problem:
Write a function called `halts`, which accepts a function `f` and an input `x`, and returns `True` if `f(x)` halts, and `False` if `f(x)`, when run, would go into an infinite loop.
Sounds simple enough. Let’s assume we can write it:
```python
def halts(f, x):
    # code goes here
    # eventually deciding the issue
    # and returning True or False
```
But here’s the problem. Let’s consider this perfectly normal function:
```python
def fox(f):
    if halts(f, f):
        while True:
            pass
```
The expression `fox(fox)` leads to an inconsistency. Running `fox(fox)` does this: if `halts(fox, fox)` then go into an infinite loop, otherwise halt. In other words, `fox(fox)` halts if and only if it does not halt. Ugh. Wait. OMG. This cannot be. So our assumption that `halts` could be written must be incorrect. We cannot write `halts` in any programming language. The Halting Problem is undecidable.
Python?
It’s way more fun to use Python in these proofs, but when we do, keep in mind these are isomorphic to language decision problems. In the Halting Problem case, we can re-cast this whole “problem” into one of deciding a particular language, namely: $$ \mathcal{H} = \{ \langle M \rangle \cdot w \mid M \textrm{ halts on } w \}$$where $\langle M \rangle$ is the string encoding of machine $M$.
But guess what: Even though you can’t decide whether a program will halt, you can recognize the halting programs. Just simulate the program: if it halts, you will eventually see it halt; if it doesn’t, you will wait forever without an answer. So the language of halting programs is recognizable but not decidable.
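A minimal sketch of such a recognizer, under the toy assumption that a “program” is modeled as a Python generator that yields once per step and returns when it halts:

```python
def recognize_halts(program):
    """Semi-decision procedure: returns True if the simulated program
    halts; loops forever (never answers) if it does not."""
    while True:
        try:
            next(program)   # simulate one step
        except StopIteration:
            return True     # the program halted

def halts_after(n):
    """Toy program that halts after n steps."""
    for _ in range(n):
        yield

print(recognize_halts(halts_after(3)))  # → True
```

On a halting program the loop always answers `True`; on a non-halting one it never returns, which is exactly the behavior of a recognizer that is not a decider.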
If you can recognize a language, you can enumerate all its utterances. For concreteness, let’s see how this is done with the Halting Problem. Since all programs are strings, we can enumerate a list of all programs $p_0, p_1, p_2, p_3, \ldots$. We can then dovetail: run $p_0$ for one step; then run $p_0$ and $p_1$ for one step each; then $p_0$, $p_1$, and $p_2$ for one step each; and so on. Whenever a $p_i$ halts, add it to your “output list” of halting programs.
This process enumerates all and only the halting programs.
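The dovetailer can be sketched in the same toy model as before (programs as step-at-a-time generators); `make_program` and the fixed round count are illustrative assumptions, since the real enumerator runs forever:

```python
def enumerate_halting(make_program, rounds):
    """Dovetail: in round k, admit program p_k, then run every program
    admitted so far for one more step. Programs observed to halt are
    appended to the output list, in order of discovery."""
    running = {}    # index -> generator still running
    output = []     # indices of programs seen to halt
    for k in range(rounds):
        running[k] = make_program(k)      # admit p_k
        for i in list(running):
            try:
                next(running[i])          # one step of p_i
            except StopIteration:
                output.append(i)          # p_i halted: enumerate it
                del running[i]
    return output

def make_program(i):
    """Toy program list: p_i halts after i steps when i is even,
    and loops forever when i is odd."""
    def prog():
        steps = 0
        while i % 2 != 0 or steps < i:
            yield
            steps += 1
    return prog()

print(enumerate_halting(make_program, 50))  # the even indices 0, 2, ..., 24
```

Dovetailing matters: if we instead ran $p_0$ to completion before starting $p_1$, a single non-halting program would block every program after it from ever being enumerated.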
We have the following fact:
The set of all languages that are Turing-Recognizable is known as RE, namely, the set of Recursively Enumerable Languages.
and also:
The set of all languages that are Turing-Decidable is known as R, namely, the set of Recursive Languages.
But now, what about the language of all strings that encode Turing Machines that never halt (sometimes called $\bar{\mathcal{H}}$)? That language isn’t even recognizable. This will need a proof, but we’ll get there soon.
So, how do we know a language isn’t recognizable?
We can find a first non-recognizable language using diagonalization. The language is called $\mathcal{X}$ and is defined by:
$$ \mathcal{X} = \{ \langle M \rangle \mid M \textrm{ does not recognize } \langle M \rangle \} $$(Informally, it is the set of all machines that don’t recognize their own encodings.) To show this language is not recognizable by any Turing Machine, let’s make a list of all the Turing Machines that can possibly exist. There are countably infinitely many of these; call them $M_0, M_1, M_2, \ldots$. For each machine we will write out whether it recognizes the encoding of each machine:
|  | $\langle M_0 \rangle$ | $\langle M_1 \rangle$ | $\langle M_2 \rangle$ | $\langle M_3 \rangle$ | ... |
|---|---|---|---|---|---|
| $M_0$ | T | F | F | T | ... |
| $M_1$ | T | T | T | T | ... |
| $M_2$ | F | F | F | T | ... |
| $M_3$ | T | T | F | T | ... |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋱ |
The T’s in row $i$ comprise the language recognized by $M_i$. The language $\mathcal{X}$ is the flipped diagonal of this table, and it cannot appear as any row. No matter how we try to enumerate all the Turing Machines, there will never, EVER, be a row which makes $\mathcal{X}$. So $\mathcal{X}$ cannot be recognized by a Turing Machine.
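For a finite slice of the table, the flip-the-diagonal trick is easy to check in code. The values below are the 4×4 corner shown above; of course, this is an illustration, not the real infinite argument:

```python
def flipped_diagonal(table):
    """Entry j of the result is the negation of table[j][j], so the
    result differs from row i at column i, for every i."""
    return [not table[i][i] for i in range(len(table))]

T, F = True, False
table = [            # the 4x4 corner of the table in the text
    [T, F, F, T],    # M_0
    [T, T, T, T],    # M_1
    [F, F, F, T],    # M_2
    [T, T, F, T],    # M_3
]

x = flipped_diagonal(table)            # [F, F, T, F]
assert all(x[i] != table[i][i] for i in range(4))
assert x not in table                  # x is not any machine's row
```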
So the Halting Problem is undecidable. But how do we prove other problems undecidable? One way is to prove: “If problem $P$ were decidable, then the Halting Problem would also be decidable.”
This is done by reducing the Halting Problem to problem $P$.
In general, a reduction of problem $A$ to problem $B$ means we use an algorithm for $B$ to solve $A$. You’ve probably done that a lot in the past, right?
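In that everyday sense, a reduction is just “solve $A$ by calling a solver for $B$.” A hypothetical toy example, reducing median-finding ($A$) to sorting ($B$):

```python
def solve_B(xs):
    """Our assumed algorithm for problem B (sorting)."""
    return sorted(xs)

def solve_A(xs):
    """Problem A (median of an odd-length list), solved by reducing
    it to B: call the B-solver, then read the answer off its result."""
    return solve_B(xs)[len(xs) // 2]

print(solve_A([5, 1, 3]))  # → 3
```

Undecidability proofs use the contrapositive of this logic: if the Halting Problem reduces to $P$, and $P$ were decidable, then the Halting Problem would be decidable too, which it is not.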
See Wikipedia’s article on reducing problems and Wikipedia’s list of undecidable problems.
Here is something amazing.
Let’s say you were given a general YES-NO problem about the run-time behavior of programs (e.g. does the program halt? or does this program ever output more than three symbols?). Rice’s Theorem says that the problem is undecidable unless the property holds for all programs or holds for none.
More technically: All nontrivial semantic properties of programs are undecidable.
Here semantic means run-time and trivial means holds for all or holds for none.
More at Wikipedia.
The “limits” of computing models can define families of languages. For example:
| The languages that can be recognized by | are the ___ languages | which Chomsky called |
|---|---|---|
| Turing Machines | Recognizable, a.k.a. Recursively Enumerable (RE) | Type 0 |
| Turing Machines that always halt | Decidable, a.k.a. Recursive (R) | |
| LBAs | Context-Sensitive (CS) | Type 1 |
| NPDAs | Context-Free (CF) | Type 2 |
| DPDAs | Deterministic Context-Free (LR) | |
| FAs | Regular (REG) | Type 3 |
| FAs with no cycles | Finite (FIN) | |
EACH OF THESE LANGUAGE FAMILIES IS A PROPER SUBSET OF THE FAMILY IN THE ROW ABOVE IT.
Gabe Robins has a beautiful visualization (which mixes in a few classes from complexity theory):
Loosely speaking, an oracle is something you can add to a machine so that, at any time during a computation, you can simply “ask the oracle” a question whose answer might not even be computable. It just magically answers the question and you go about your business.
Read about oracles at Wikipedia.
Computability Theory deals with what can and cannot be computed on a particular computing model. It makes no claims about the number of steps required, or the amount of space required, to do the computation. That is what complexity theory is for.
But you know, maybe we shouldn’t try to separate computability from complexity. These are convenient labels, but they are very inter-related. In fact, Robins’s diagram intentionally places complexity classes in with the standard computability classes. Many schools offer a course called Computability and Complexity. Erik Demaine at MIT gives a brief overview of the limits of computation within his complexity theory lecture in his 6.006 class:
You might also like:
When studying Robins’s notes, focus on the set entitled “Formal languages, automata models, grammars, decidability, and Rice's theorem” (PDF here). There are several examples of problem reductions and of applications of Rice’s Theorem.
We’ve covered: