Type Theory

Type Theory is considered by many to be preferable to Set Theory as an appropriate foundation for Computer Science. Some even say it is a better foundation for mathematics. It’s actually kind of cool, too. Really.

What is Type Theory?

A Type Theory is a system for classifying mathematical and computational objects by the kinds of things they are, and for governing how those objects can be combined and manipulated. Like Set Theory and Category Theory, Type Theory can be used as a foundation for mathematics.

There are many type theories, just like there are many set theories. We’ll see some later. Like modern set theories, type theories are designed to avoid the paradoxes found in what is now called Naïve Set Theory.

The basic idea in type theory is that we bring objects into existence with their type, and define functions that manipulate (i.e., compute with) objects based on their type. Because every object has a type, the typing judgment

$$ x\!: t $$

(read “$x$ inhabits type $t$”) is central. New types are constructed carefully from existing types, avoiding paradoxes. Because functions are an integral feature rather than an afterthought, Type Theory has a naturally computational flavor, which makes it preferable, many say, as a foundation for computer science.

Roughly speaking, when Set Theory is used as a foundation for mathematics, we need a logical system to be in place to do our reasoning; with Type Theory, however, reasoning is done by deriving type judgments and evaluating functions within the system. Sometimes, functions return truth values, so evaluating them sure sounds like a proof. (It is.) We sometimes say “logic emerges from Type Theory.”

Logic emerges from Type Theory.

The Basics

Types are defined by exhaustive rules that create objects that inhabit the types. Let’s learn by example.

The Boolean Type

Here is our first type:

$\dfrac{}{\textsf{true}\!: \textsf{Bool}}$

$\dfrac{}{\textsf{false}\!: \textsf{Bool}}$

This definition states that the type $\textsf{Bool}$ is inhabited by exactly two individuals, $\textsf{true}$ and $\textsf{false}$. No more, no less. That’s what we mean when we say the rules are exhaustive. We have created two and only two inhabitants for this new type. (Exhaustiveness is part of the structural metatheory.)

Numbers

Here is the type of natural numbers:

$\dfrac{}{0\!: \textsf{Nat}}$

$\dfrac{n\!: \textsf{Nat}}{\textsf{s}\,n\!: \textsf{Nat}}$

This says “(1) $0$ is a natural number, (2) For any natural number $n$, $\mathsf{s\,}n$ is a natural number, and (3) nothing else is a natural number.” So the type of natural numbers is inhabited by $0$, $\mathsf{s\,}0$, $\mathsf{s\,s\,}0$, $\mathsf{s\,s\,s\,}0$, $\mathsf{s\,s\,s\,s\,}0$, and so on. We’ll write these as $0$, $1$, $2$, $3$, $4$, and so on. These abbreviations should be familiar to you. Next, here is the type of integers:

$\dfrac{n\!: \textsf{Nat}}{\textsf{pos}\,n\!: \textsf{Int}}$

$\dfrac{n\!: \textsf{Nat}}{\textsf{neg}\,(\textsf{s}\,n)\!: \textsf{Int}}$

This type is inhabited by $\mathsf{pos\,}0$, $\mathsf{neg}\,(\mathsf{s}\,0)$, $\mathsf{pos}\,(\mathsf{s}\,0)$, $\mathsf{neg}\,(\mathsf{s\,s}\,0)$, $\mathsf{pos}\,(\mathsf{s\,s}\,0)$, $\mathsf{neg}\,(\mathsf{s\,s\,s}\,0)$, and so on. We’ll abbreviate these as $0$, $-1$, $1$, $-2$, $2$, $-3$, and so on. But what, then, is the type of $3$? We need to figure it out from context, or write $3_{\textsf{Nat}}$ or $3_{\textsf{Int}}$ to be explicit.

Exercise: How does this definition avoid $-0$?

Here are the rationals:

$\dfrac{n\!: \textsf{Int} \quad d\!: \textsf{Nat}}{\textsf{rat}\,n\,(\textsf{s}\,d)\!: \textsf{Rat}}$

So one-third is $\textsf{rat}\,1\,3$, which we can abbreviate as $\frac{1}{3}$.

Exercise: How does this definition avoid zero in the denominator?

We can build many more numeric types, including a few that are useful in programming languages, for example $\textsf{Int8}$, $\textsf{Int16}$, $\textsf{Int32}$, $\textsf{Int64}$, $\textsf{Int128}$, $\textsf{UInt8}$, $\textsf{UInt16}$, $\textsf{UInt32}$, $\textsf{UInt64}$, $\textsf{UInt128}$, $\textsf{Float16}$, $\textsf{Float32}$, $\textsf{Float64}$, $\textsf{Float128}$, $\textsf{Complex64}$, $\textsf{Complex128}$, and so on. Each of these has a finite number of elements, so, despite requiring a ridiculous number of rules, they can be defined in principle.

Product Types

Now let’s build types from types. If $t_1$ and $t_2$ are types, then $t_1 \times t_2$ is a type, defined like so:

$\dfrac{x\!: t_1 \quad y\!: t_2}{(x, y)\!: t_1 \times t_2}$

As a convention, we’ll take $\times$ to be right-associative, so the type expression $t_1 \times t_2 \times t_3$ sugars $t_1 \times (t_2 \times t_3)$, and, correspondingly, when writing inhabitants, $(a,b,c)$ sugars $(a,(b,c))$.

In general, $(a_1, \ldots, a_n)$, which sugars $(a_1, (a_2, (\ldots, (a_{n-1}, a_n)\ldots)))$, is called an $n$-tuple. Its length is $n$.

Example:
The expression $(2, \mathsf{false}, \mathsf{true}, \frac{3}{5})$
sugars $(2, (\mathsf{false}, (\mathsf{true}, \frac{3}{5})))$
and has type $\mathsf{Nat \times (Bool \times (Bool \times Rat))}$
which we can sugar as $\mathsf{Nat \times Bool \times Bool \times Rat}$.
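The right-nesting convention is mechanical enough to sketch in a few lines of Python. The function name `desugar` is hypothetical, and Python tuples and `Fraction` merely stand in for the inhabitants of our product and $\textsf{Rat}$ types.

```python
from fractions import Fraction


def desugar(*items):
    """Nest components to the right: (a, b, c) becomes (a, (b, c))."""
    if len(items) < 2:
        raise ValueError("a product needs at least two components")
    if len(items) == 2:
        return (items[0], items[1])
    # Keep the first component and nest everything else inside the second.
    return (items[0], desugar(*items[1:]))


nested = desugar(2, False, True, Fraction(3, 5))
```

Here `nested` is the fully parenthesized pair-of-pairs form of the 4-tuple from the example.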

The Unit Type

So we have $2$-tuples and $3$-tuples and can keep going, as long as $n \geq 2$.

Can we go the other way?

Sure...a $1$-tuple would just have one component, which we can treat as the value itself, so $a = (a)$.

A $0$-tuple would look like this: $()$.

Let’s make that a value, and give it a type. Wait, people have already done this. The type is called $\textsf{Unit}$:

$\dfrac{}{()\!: \textsf{Unit}}$

List Types

Next is the type “list of elements of type $t$” which we will write $t^*$:

$\dfrac{}{[\,]\!: t^*}$

$\dfrac{x\!: t \quad y\!: t^*}{(x :: y)\!: t^*}$

We will write ${[x]}$ for $(x\,\textbf{::}\,[\,])$, ${[x,y,z]}$ for $(x\,\textbf{::}\,(y\,\textbf{::}\,(z\,\textbf{::}\,[\,])))$, and so on. For an empty list, we would need to rely on context to infer its type, or be explicit by writing, for example, $[\,]_{\textsf{Int}^*}$ or $[\,]_{(\textsf{Bool}\times\textsf{Unit})^*}$.

The String Type

Imagine what the type $\textsf{Unicode}^*$ is. Got it? That’s right! We’ll use the abbreviation $\textsf{String}$ for this type.

Custom Types

We can make types. Here is one for primary colors:

$\dfrac{}{\textsf{red}\!: \textsf{PrimaryColor}}$

$\dfrac{}{\textsf{green}\!: \textsf{PrimaryColor}}$

$\dfrac{}{\textsf{blue}\!: \textsf{PrimaryColor}}$

You can define the type $\textsf{Unicode}$ whose inhabitants are, you guessed it, the characters of Unicode. That’d be a lot of rules! But it is indeed definable, as there are a finite number of such characters. You can do one rule for each of the 1,111,998 possible characters, something like:

$\dfrac{}{\textsf{U+0000}\!: \textsf{Unicode}}$

$\dfrac{}{\textsf{U+0001}\!: \textsf{Unicode}}$

$\cdots$

$\dfrac{}{\textsf{U+10FFFD}\!: \textsf{Unicode}}$

Feel free to use the official character names instead of code points.

Types whose inhabitants have varying forms are easy to write. Here is a type for some simple two-dimensional shapes:

$\dfrac{r\!: \textsf{Float64}}{\textsf{circle}\,r\!: \textsf{Shape}}$

$\dfrac{w\!: \textsf{Float64}\quad h\!: \textsf{Float64}}{\textsf{rectangle}\,w\,h\!: \textsf{Shape}}$

$\dfrac{a\!: \textsf{Float64}\quad b\!: \textsf{Float64} \quad c\!: \textsf{Float64}}{\textsf{triangle}\,a\,b\,c\!: \textsf{Shape}}$

The idea is that $\mathsf{circle}\,5.0$ is a circle with radius 5.

Here is a type for “binary trees of type $t$”:

$\dfrac{}{\textsf{empty}\!:\mathsf{Bintree}\;t}$

$\dfrac{x\!:t \quad l\!:\mathsf{Bintree}\;t \quad r\!:\mathsf{Bintree}\;t}{(\mathsf{node}\;x\;l\;r)\!:\mathsf{Bintree}\;t}$

Exercise: Draw the tree

$\begin{array}{l}\quad(\mathsf{node}\;3\\ \quad\quad(\mathsf{node}\;2\;\mathsf{empty}\;\mathsf{empty})\\ \quad\quad (\mathsf{node}\;5\;\mathsf{empty}\;\mathsf{empty}))\end{array}$

How about trees with any number of children?

$\dfrac{}{\textsf{empty}\!:\mathsf{Tree}\;t}$

$\dfrac{x\!:t \quad ts\!:(\mathsf{Tree}\;t)^*}{(\textsf{node}\;x\;ts)\!:\mathsf{Tree}\;t}$

Option Types

Now let’s do “option of type $t$” (also called an optional of type $t$) which we will write $t\texttt{?}$:

$\dfrac{}{\textsf{none}\!: t\texttt{?}}$

$\dfrac{x\!:t}{\textsf{some}\;x\!: t\texttt{?}}$

We thus have $\textsf{some}\:3\!:\mathsf{Nat}\texttt{?}$ and $\textsf{some}\:(-3,\textsf{true})\!:(\mathsf{Int \times Bool})\texttt{?}$. When writing $\textsf{none}$, we must rely on context to know which type is being referred to, or use subscripts to be explicit, e.g., $\mathsf{none_{Int?}}$.

Void

Here is something surprisingly useful: the empty type, $\textsf{Void}$, is the type with no inhabitants. As there are no inhabitants, there are no constructors, so there are no rules to write.

Don’t confuse Void and Unit
The type $\textsf{Void}$ has no inhabitants at all. The type $\textsf{Unit}$ has a single inhabitant, namely $()$.

Functions

To serve as a basis for computation, let alone for mathematics, we need to know more than which items inhabit which types. We need to know how the items behave. We specify behavior by defining functions on the types. We were introduced to functions briefly in our notes on mathematical foundations, where we saw this diagram:

$x$ → $f$ → $f\;x$

To define a function, you show how to map each element of one type (the domain) to an element of another type (the codomain). If a function $f$ has domain $t_1$ and codomain $t_2$, then $f$ inhabits the type $t_1 \to t_2$. To apply or invoke the function $f$ on argument $x$, we write $f\,x$ or $f(x)$. To represent a function explicitly, we write:

$$\lambda x_t. e$$

where $x$ is the argument of the function, $t$ is the type of the argument, and $e$ is the expression that computes the output based on $x$. Usually the output type is inferrable, but if it is not, writing $(\lambda x. e)\!: t_1\to t_2$ is totally fine.

We specify functions by showing how to compute the output for each possible input. We base this off the constructors of the type. Remember the Boolean type? It has only two constructors, $\textsf{true}$ and $\textsf{false}$. So we write the function for boolean negation like so:

$\begin{array}{l} \lambda b_{\textsf{Bool}}.\; \textsf{match}\;b \\ \quad\quad \textsf{when}\;\textsf{true} \rightarrow \textsf{false} \\ \quad\quad \textsf{when} \;\textsf{false} \rightarrow \textsf{true} \end{array}$

The type of natural numbers we saw above has two constructors: $0$ and $\textsf{s}$. The “is zero” function on natural numbers is:

$\begin{array}{l} \lambda n_{\textsf{Nat}}.\; \textsf{match}\;n \\ \quad\quad \textsf{when}\;0 \rightarrow \textsf{true} \\ \quad\quad \textsf{when} \;\textsf{s}\,n \rightarrow \textsf{false} \end{array}$

And the “plus two” function on natural numbers is:

$\begin{array}{l} \lambda n_{\textsf{Nat}}. \textsf{match}\;n \\ \quad\quad \textsf{when}\;0 \rightarrow \textsf{s}\,\textsf{s}\,0 \\ \quad\quad \textsf{when} \;\textsf{s}\,n \rightarrow \textsf{s}\,\textsf{s}\,\textsf{s}\,n \end{array}$

Sometimes the match cases collapse into one, as in the plus two function, which can be written more succinctly as:

$\lambda n_{\textsf{Nat}}. \textsf{s}\,\textsf{s}\,n$

We often want to give names to functions to make them easier to refer to, for example:

$\begin{array}{l} \textsf{isZero} =_{\small{\textrm{def}}} \lambda n_{\textsf{Nat}}.\; \textsf{match}\;n \\ \quad\quad \textsf{when}\;0 \rightarrow \textsf{true} \\ \quad\quad \textsf{when} \;\textsf{s}\,n \rightarrow \textsf{false} \end{array}$

We can get fancy by moving the parameter to the left of the equals sign, like this:

$\begin{array}{l} \textsf{isZero}\;n_{\textsf{Nat}} =_{\small{\textrm{def}}} \textsf{match}\;n \\ \quad\quad \textsf{when}\;0 \rightarrow \textsf{true} \\ \quad\quad \textsf{when} \;\textsf{s}\,n \rightarrow \textsf{false} \end{array}$

But it is more common to write a definition by cases. This can be quite readable, especially when we move the typing information to its own line. Here are the three functions we have seen so far:

$\begin{array}{l} \textsf{not}\!: \textsf{Bool} \rightarrow \textsf{Bool} \\ \textsf{not}\;\textsf{true} = \textsf{false} \\ \textsf{not}\;\textsf{false} = \textsf{true} \end{array}$

$\begin{array}{l} \textsf{isZero}\!: \textsf{Nat} \rightarrow \textsf{Bool} \\ \textsf{isZero}\;0 = \textsf{true} \\ \textsf{isZero}\;(\textsf{s}\,n) = \textsf{false} \end{array}$

$\begin{array}{l} \textsf{plusTwo}\!: \textsf{Nat} \rightarrow \textsf{Nat} \\ \textsf{plusTwo}\;0 = \textsf{s}\,\textsf{s}\,0 \\ \textsf{plusTwo}\;(\textsf{s}\,n) = \textsf{s}\,\textsf{s}\,\textsf{s}\,n \end{array}$

Parameters can be tuples:

$\begin{array}{l} \textsf{and}\!:(\textsf{Bool} \times \textsf{Bool}) \rightarrow \textsf{Bool} \\ \textsf{and}\;(\textsf{true}, \textsf{true}) = \textsf{true} \\ \textsf{and}\;(\textsf{true}, \textsf{false}) = \textsf{false} \\ \textsf{and}\;(\textsf{false}, \textsf{true}) = \textsf{false} \\ \textsf{and}\;(\textsf{false}, \textsf{false}) = \textsf{false} \end{array}$

But since functions are objects, we can do without tuple parameters, and define functions more simply. Study this:

$\begin{array}{l} \textsf{and}\!: \textsf{Bool} \rightarrow (\textsf{Bool} \rightarrow \textsf{Bool}) \\ \textsf{and}\;\textsf{true} = \lambda x_{\textsf{Bool}}.\,x \\ \textsf{and}\;\textsf{false} = \lambda x_{\textsf{Bool}}.\,\textsf{false} \end{array}$

Make sure you understand why that works, then move on to this definition:

$\begin{array}{l} \textsf{or}\!: \textsf{Bool} \rightarrow (\textsf{Bool} \rightarrow \textsf{Bool}) \\ \textsf{or}\;\textsf{true} = \lambda x_{\textsf{Bool}}.\,\textsf{true} \\ \textsf{or}\;\textsf{false} = \lambda x_{\textsf{Bool}}.\,x \end{array}$

We can keep moving parameters to the left, squeezing out the lambda expressions:

$\begin{array}{l} \textsf{and}\!: \textsf{Bool} \rightarrow \textsf{Bool} \rightarrow \textsf{Bool} \\ \textsf{and}\;\textsf{true}\;x = x \\ \textsf{and}\;\textsf{false}\;x = \textsf{false} \end{array}$

$\begin{array}{l} \textsf{or}\!: \textsf{Bool} \rightarrow \textsf{Bool} \rightarrow \textsf{Bool} \\ \textsf{or}\;\textsf{true}\;x = \textsf{true} \\ \textsf{or}\;\textsf{false}\;x = x \end{array}$

We slipped in some sugar there: When writing function types, the arrow associates to the right, so $t_1 \rightarrow t_2 \rightarrow t_3$ is the same as $t_1 \rightarrow (t_2 \rightarrow t_3)$. And when writing function calls, application associates to the left, so $f\,x\,y$ is the same as $(f\,x)\,y$.

Here is how to add natural numbers. Make sure to study the definition carefully. It is recursive, but well-defined since the recursive portion operates on a component that is structurally smaller than the argument of the clause in which it appears:

$\begin{array}{l} \textsf{plus}\!: \textsf{Nat} \rightarrow \textsf{Nat} \rightarrow \textsf{Nat} \\ \textsf{plus}\;0\;m = m \\ \textsf{plus}\;(\textsf{s}\,n)\;m = \textsf{s}\,(\textsf{plus}\;n\;m) \end{array}$

Multiplication:

$\begin{array}{l} \textsf{times}\!: \textsf{Nat} \rightarrow \textsf{Nat} \rightarrow \textsf{Nat} \\ \textsf{times}\;0\;m = 0 \\ \textsf{times}\;(\textsf{s}\,n)\;m = \textsf{plus}\;m\;(\textsf{times}\;n\;m) \end{array}$

Exponentiation:

$\begin{array}{l} \textsf{exp}\!: \textsf{Nat} \rightarrow \textsf{Nat} \rightarrow \textsf{Nat} \\ \textsf{exp}\;0\;m = 1 \\ \textsf{exp}\;(\textsf{s}\,n)\;m = \textsf{times}\;m\;(\textsf{exp}\;n\;m) \end{array}$

Less Than:

$\begin{array}{l} \textsf{lt}\!: \textsf{Nat} \rightarrow \textsf{Nat} \rightarrow \textsf{Bool} \\ \textsf{lt}\;0\;0 = \textsf{false} \\ \textsf{lt}\;0\;(\textsf{s}\,m) = \textsf{true} \\ \textsf{lt}\;(\textsf{s}\,n)\;0 = \textsf{false} \\ \textsf{lt}\;(\textsf{s}\,n)\;(\textsf{s}\,m) = \textsf{lt}\;n\;m \end{array}$

And here is something remarkably useful:

$\begin{array}{l} \textsf{cond}\!: \textsf{Bool} \rightarrow t \rightarrow t \rightarrow t \\ \textsf{cond}\;\textsf{true}\;x\;y = x \\ \textsf{cond}\;\textsf{false}\;x\;y = y \end{array}$

Ok hold up! It’s time to introduce some notation to make things a bit more familiar. From now on, we’re going to use some sugar, writing:

$\begin{array}{lll} \neg b & \text{for} & \textsf{not}\;b \\ a \land b & \text{for} & \textsf{and}\;a\;b \\ a \lor b & \text{for} & \textsf{or}\;a\;b \\ \textsf{if}\;b\;\textsf{then}\;x\;\textsf{else}\;y & \text{for} & \textsf{cond}\;b\;x\;y \\ n + 1 & \text{for} & \textsf{s}\;n \\ m + n & \text{for} & \textsf{plus}\;n\;m \\ m \times n & \text{for} & \textsf{times}\;n\;m \\ m^n & \text{for} & \textsf{exp}\;n\;m \\ n \lt m & \text{for} & \textsf{lt}\;n\;m \\ \textsf{let}\;x = e\;\textsf{in}\;e' & \text{for} & (\lambda x.\,e')\;e \end{array}$

Heck we’ll even write $xy$ for $x \times y$ except when it makes things too confusing.

We can in principle define more numeric types and more arithmetic and logical operations. We won’t do so here, but feel free to try. It can be fun! This can take us to places where things start to look convenient and familiar. Here are a couple of functions using the $\textsf{Shape}$ type defined earlier:

$\begin{array}{l} \textsf{area}\!:\textsf{Shape} \rightarrow \textsf{Float64} \\ \textsf{area}\;(\textsf{circle}\;r) = \pi r^2 \\ \textsf{area}\;(\textsf{rectangle}\;w\;h) = wh \\ \textsf{area}\;(\textsf{triangle}\;a\;b\;c) = \textsf{let}\;s = \frac{a + b + c}{2} \;\textsf{in}\; \sqrt{s(s-a)(s-b)(s-c)} \\ \end{array}$

$\begin{array}{l} \textsf{perimeter}\!:\textsf{Shape} \rightarrow \textsf{Float64} \\ \textsf{perimeter}\;(\textsf{circle}\;r) = 2 \pi r \\ \textsf{perimeter}\;(\textsf{rectangle}\;w\;h) = 2 (w + h) \\ \textsf{perimeter}\;(\textsf{triangle}\;a\;b\;c) = a + b + c \\ \end{array}$

Exercise: For fun, rewrite the above two function definitions using $\textsf{match}$ expressions.

Now, some list functions:

$\begin{array}{l} \textsf{length}\!: t^* \rightarrow \textsf{Nat} \\ \textsf{length}\;[\,] = 0 \\ \textsf{length}\;(x\,\textbf{::}\,y) = (\textsf{length}\;y) + 1 \end{array}$

Appending two lists:

$\begin{array}{l} \textsf{append}\!: t^* \rightarrow t^* \rightarrow t^* \\ \textsf{append}\;[\,]\;z = z \\ \textsf{append}\;(x\,\textbf{::}\,y)\;z = x\,\textbf{::}\,(\textsf{append}\;y\;z) \end{array}$

Reversing a list:

$\begin{array}{l} \textsf{reverse}\!: t^* \rightarrow t^* \\ \textsf{reverse}\;[\,] = [\,] \\ \textsf{reverse}\;(x\,\textbf{::}\,y) = \textsf{append}\;(\textsf{reverse}\;y)\;[x] \end{array}$

Fetching the first element (the head) of a list:

$\begin{array}{l} \textsf{head}\!: t^* \rightarrow t\texttt{?} \\ \textsf{head}\;[\,] = \textsf{none} \\ \textsf{head}\;(x\,\textbf{::}\,y) = \textsf{some}\;x \end{array}$

Fetching all but the head of a list (the tail):

$\begin{array}{l} \textsf{tail}\!: t^* \rightarrow t^*\texttt{?} \\ \textsf{tail}\;[\,] = \textsf{none} \\ \textsf{tail}\;(x\,\textbf{::}\,y) = \textsf{some}\;y \end{array}$

Mapping a function over a list (in other words, applying a function to each element of a list):

$\begin{array}{l} \textsf{map}\!: (t_1 \rightarrow t_2) \rightarrow t_1^* \rightarrow t_2^* \\ \textsf{map}\;f\;[\,] = [\,] \\ \textsf{map}\;f\;(x\,\textbf{::}\,y) = (f\;x)\,\textbf{::}\,(\textsf{map}\;f\;y) \end{array}$

Filtering a list:

$\begin{array}{l} \textsf{filter}\!: (t \rightarrow \textsf{Bool}) \rightarrow t^* \rightarrow t^* \\ \textsf{filter}\;p\;[\,] = [\,] \\ \textsf{filter}\;p\;(x\,\textbf{::}\,y) = \textsf{if}\;(p\;x)\;\textsf{then}\;(x\,\textbf{::}\,(\textsf{filter}\;p\;y))\;\textsf{else}\;\textsf{filter}\;p\;y \end{array}$

Sometimes it’s helpful to think of an optional as a sequence of zero or one elements. Seen that way, functions like $\textsf{map}$ make sense on optionals:

$\begin{array}{l} \textsf{map}\!: (t_1 \rightarrow t_2) \rightarrow t_1\texttt{?} \rightarrow t_2\texttt{?} \\ \textsf{map}\;f\;\textsf{none} = \textsf{none} \\ \textsf{map}\;f\;(\textsf{some}\;x) = \textsf{some}\;(f\;x) \end{array}$

This gives us an elegant way to define a function for determining the index of the first occurrence of an element in a list:

$\begin{array}{l} \textsf{indexOf}\!: t \rightarrow t^* \rightarrow \textsf{Nat}\texttt{?} \\ \textsf{indexOf}\;x\;[\,] = \textsf{none} \\ \textsf{indexOf}\;x\;(y\,\textbf{::}\,ys) = \\ \quad \quad \textsf{if}\;x = y\;\textsf{then}\;\textsf{some}\;0\;\textsf{else}\;\textsf{map}\;(\lambda n.\,n+1)\;(\textsf{indexOf}\;x\;ys) \end{array}$

CLASSWORK

We’ll create example invocations for each of these.

Type Inference

When defining functions by cases, the type annotation on the function name gives us enough context to infer the types of the clauses in the definition. When the function stands alone, explicit type annotations are often necessary, especially when using operators that are overloaded across multiple types (e.g., $+$, $\times$, $\bmod$ across $\textsf{Nat}$, $\textsf{Int}$, $\textsf{Float64}$, etc.):

$(\lambda x. x^2 - x + 5)\!:(\mathsf{Int \rightarrow Int})$

$(\lambda x. x^2 - x + 5)\!:(\mathsf{Real \rightarrow Real})$

$(\lambda n. n \log_2 n)\!:(\mathsf{Nat \rightarrow Real})$

$(\lambda x. x \bmod 2 = 0)\!:(\mathsf{Nat \rightarrow Bool})$

$(\lambda x. \lambda y. 5x + 2y - 7)\!:(\mathsf{Real \rightarrow Real \rightarrow Real})$

$(\lambda (x, y). 5x + 2y - 7)\!:(\mathsf{Real \times Real \rightarrow Real})$

$(\lambda \theta. \lambda (x, y). (x \cos\theta - y \sin\theta, x\sin\theta + y\cos\theta))\!:(\mathsf{Real \rightarrow (Real \times Real) \to (Real \times Real)})$

Sometimes, annotating only the parameters is sufficient, since the types of the bodies are often inferrable:

$\lambda x_{\small \textsf{Int}}. (x^2 - x + 5)$

$\lambda x_{\small \textsf{Real}}. (x^2 - x + 5)$

$\lambda n_{\small \textsf{Nat}}. (n \log_2 n)$

$\lambda x_{\small \textsf{Nat}}. (x \bmod 2 = 0)$

$\lambda x_{\small \textsf{Real}}. \lambda y_{\small \textsf{Real}}. (5x + 2y - 7)$

$\lambda (x, y)_{\small \mathsf{Real} \times \mathsf{Real}}. (5x + 2y - 7)$

$\lambda \theta_{\small \mathsf{Real}}. \lambda (x, y)_{\small \mathsf{Real} \times \mathsf{Real}}. (x \cos\theta - y \sin\theta, x\sin\theta + y\cos\theta)$

An alternative to explicitly providing type annotations is to rely on context. Do so at your own risk. It’s generally okay in informal settings, but when programming or using automated proof assistants you can run into trouble, since the inferred type in these languages or systems might not be what you expect.

Polymorphic Functions

Here’s a fun one. Try to infer the type of $\lambda f. \lambda x. f(f(x))$. What do you think it is?

Well, $(\textsf{Int} \rightarrow \textsf{Int}) \rightarrow \textsf{Int} \rightarrow \textsf{Int}$ works. But so does $(\textsf{Float64} \rightarrow \textsf{Float64}) \rightarrow \textsf{Float64} \rightarrow \textsf{Float64}$. And so does $(\textsf{Bool} \rightarrow \textsf{Bool}) \rightarrow \textsf{Bool} \rightarrow \textsf{Bool}$. And so on.

To say that this function works on any type, we introduce type variables. The type is then $\forall \alpha. (\alpha \rightarrow \alpha) \rightarrow \alpha \rightarrow \alpha$. You don’t have to write the $\forall$ though. But it is a nice way to say, “you can substitute any type you like for the type variable $\alpha$.” We say this function is polymorphic.
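Python's type hints can express the same idea with a type variable. This is only a sketch: the name `twice` is ours, and Python does not check the annotations at run time, so they serve as documentation for a separate type checker.

```python
from typing import Callable, TypeVar

A = TypeVar("A")  # plays the role of the type variable alpha


def twice(f: Callable[[A], A]) -> Callable[[A], A]:
    """lambda f. lambda x. f(f(x)), typed (A -> A) -> A -> A."""
    return lambda x: f(f(x))
```

The one definition works at every type: `twice(lambda n: n + 3)` is a function on integers, while `twice(lambda s: s + "!")` is a function on strings.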

We’ve actually seen quite a few polymorphic functions already.

Exercise: Explain the difference between overloading and polymorphism.

Partial Functions

In our notes on Set Theory we encountered the idea of a partial function, a function that does not map every element of its domain to an element of its codomain. Type Theory doesn’t like partial functions. Fortunately, you can always find a corresponding total function. There are three ways to do it: (1) use the option type we saw above for the codomain, (2) use a custom type with an error variant for the codomain, or (3) restrict the domain.

CLASSWORK

Let’s do all three approaches for $\lambda x_{\textsf{Real}}. \frac{1}{x}$ and $\lambda x_{\textsf{Real}}. \sqrt{x}$.

Subtypes

Philosophy question: should a thing inhabit just one type? Some mathematically-oriented theories impose this restriction. Pragmatically, though, it makes sense to allow things to be in more than one type. Like $3$. Why not let it inhabit $\textsf{Nat}$ and $\textsf{Int}$ and maybe even more numeric types? If we allow this, we would notice that every inhabitant of $\textsf{Nat}$ also inhabits $\textsf{Int}$. That is, $\textsf{Nat}$ is a subtype of $\textsf{Int}$, which we write:

$$ \textsf{Nat} \; \texttt{<:} \; \textsf{Int} $$

Conversely, $\textsf{Int}$ is a supertype of $\textsf{Nat}$.

Let’s get technical:

$t_1 \; \texttt{<:} \; t_2$ if and only if for every $v$ such that $v\!: t_1$, we have $v\!: t_2$.

Exercise: $\textsf{Void}$ is a subtype of every type. Why?

Union and Intersection Types

If we allow individuals to inhabit multiple types, then it makes sense to define union types. The union $t_1 \mid t_2$ is inhabited by exactly all the inhabitants of $t_1$ and those of $t_2$. Yes, it’s possible for $t_1$ and $t_2$ to overlap. If we do allow $\textsf{Nat}\;\texttt{<:}\;\textsf{Int}$, then the type $\textsf{Nat} \mid \textsf{Int}$ is just $\textsf{Int}$.

But the type $\textsf{Int} \mid \textsf{String}$ is perhaps interesting. Maybe you can find a use for it?

Exercise: Is $t_1 \mid t_2$ a subtype of $t_1$? Is it a supertype of $t_1$?

In case you were wondering, yes, if you have union types, you can also have intersection types. We write them like this: $t_1\:\texttt{&}\:t_2$.

Exercise: Argue that

$\textsf{Nat} \mid \textsf{Int}$ is $\textsf{Int}$
$\textsf{Nat}\:\texttt{&}\:\textsf{Int}$ is $\textsf{Nat}$
$\textsf{Nat}\:\texttt{&}\:\textsf{String}$ is $\textsf{Void}$
$\textsf{Void} \mid t$ is $t$
$\textsf{Void}\:\texttt{&}\:t$ is $\textsf{Void}$
$t \mid t$ is $t$
$t\:\texttt{&}\:t$ is $t$

where $t$ is any type.

Are you getting set theory vibes?
Please calm down.
Types are not sets.
TYPES ARE NOT SETS.
We’ll talk about sets later. Please ffs do not let any preconceived notions of sets cloud your learning of the beauty of types. Keep. your. focus. on. types. TYPES!

That said, there does exist a body of work that exploits some of the similarity between types and sets. One example is semantic subtyping, which you can read about in this seminal paper by Frisch, Castagna, and Benzaken. Rather than thinking about a type as being defined by its constructors (as $A+B$ is defined by $\mathsf{tag_1}$ and $\mathsf{tag_2}$), a type becomes a set of values, and $A \vee B$ is literally $[\![ A ]\!] \cup [\![ B ]\!]$, and similarly for the intersection type. The approach has several advantages, which you can read about in their paper. Check it out after you finish these notes.

Sums, Products, and Exponents

Here are two types, $C$ (for color, with 3 inhabitants $r$, $g$, and $b$) and $D$ (for direction, with 2 inhabitants $l$ and $r$):

$$ \dfrac{}{r\!: C} \quad\quad\quad \dfrac{}{g\!: C} \quad\quad\quad \dfrac{}{b\!: C} $$ $$ \quad\quad\quad \dfrac{}{l\!: D} \quad\quad\quad \dfrac{}{r\!: D} $$

The inhabitants of $C \times D$ are:

$$ (r,l),\;(r,r),\;(g,l),\;(g,r),\;(b,l),\;(b,r) $$

Did you notice that $C$ has 3 inhabitants and $D$ has 2 inhabitants and $C\times D$ has $3 \times 2 = 6$ inhabitants? See why we call these product types? Right? RIGHT?

Since we have products, maybe we should have sums. We’d like $C + D$ to be a type with $3 + 2 = 5$ inhabitants. You might think a union would work, but wait, it doesn’t because the union ($C \mid D$) only has four elements: $r$, $g$, $b$, and $l$. A true sum type needs to have two distinct $r$’s. We have to tag the “source” of each element of the sum. So the type $C+D$ has these inhabitants:

$$ \mathsf{inl}\,r,\;\mathsf{inl}\,g,\;\mathsf{inl}\,b,\;\mathsf{inr}\,l,\;\mathsf{inr}\,r $$

In general, we define a sum type like so:

$$ \dfrac{x: t_1}{\textsf{inl}\:x\!: t_1 + t_2} \quad\quad \dfrac{x: t_2}{\textsf{inr}\:x\!: t_1 + t_2} $$

Read $\textsf{inl}$ as “inject left” and $\textsf{inr}$ as “inject right”.
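The counting argument is easy to check by brute force. In this Python sketch we list the inhabitants of the two small types as strings and collect them in Python sets purely for counting (no claim that types are sets!). The tags `"inl"` and `"inr"` are encoded as the first component of a pair, which is our own encoding choice.

```python
from itertools import product

C = {"r", "g", "b"}  # colors
D = {"l", "r"}       # directions; note that "r" also appears in C

product_type = set(product(C, D))  # inhabitants of C x D
union_type = C | D                 # inhabitants of C | D
sum_type = {("inl", x) for x in C} | {("inr", y) for y in D}  # C + D

counts = (len(product_type), len(union_type), len(sum_type))
```

The union loses one inhabitant because the two `"r"`s collapse into one, while the tags keep `("inl", "r")` and `("inr", "r")` apart.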

Raw sum types with $\textsf{inl}$ and $\textsf{inr}$ are rarely written directly, because in practice you would create a type with readable, meaningful constructor names, as we did with $\textsf{Shape}$.

Just as extending the notion of product to 0 components yielded the unit type, extending the notion of sum to 0 summands yields the sum over no types, which is...the type $\textsf{Void}$ referred to above!

Does this remind you of basic number theory?
Void is the empty sum type, the sum of no types, and has zero elements. Just like the number 0 is the sum over a list of no elements, the additive identity.
Unit is the empty product type, the product of no types, and has one element. Just like the number 1 is the product over a list of no elements. The multiplicative identity.
In some theories the void type is actually called $\mathbf{0}$ and the unit type is actually called $\mathbf{1}$.

So, we have sums and products, but what about exponential types? Could there be types $C^D$ and $D^C$? There are, and it’s kind of interesting what they turn out to be. Let’s list all the inhabitants of $C \rightarrow D$:

$$ \begin{array}{llll} (000) & r \mapsto l, & g \mapsto l, & b \mapsto l \\ (001) & r \mapsto l, & g \mapsto l, & b \mapsto r \\ (010) & r \mapsto l, & g \mapsto r, & b \mapsto l \\ (011) & r \mapsto l, & g \mapsto r, & b \mapsto r \\ (100) & r \mapsto r, & g \mapsto l, & b \mapsto l \\ (101) & r \mapsto r, & g \mapsto l, & b \mapsto r \\ (110) & r \mapsto r, & g \mapsto r, & b \mapsto l \\ (111) & r \mapsto r, & g \mapsto r, & b \mapsto r \\ \end{array} $$

There are 8 of these, which is not coincidentally $2^3$, exactly the number of inhabitants of $D$ raised to the power of the number of inhabitants of $C$. Therefore, $C \rightarrow D$ is sometimes written $D^C$.

It works the other way, too. $D \rightarrow C$ can be written $C^D$, as it has $3^2 = 9$ inhabitants:

$$ \begin{array}{lll} (00) & l \mapsto r, & r \mapsto r \\ (01) & l \mapsto r, & r \mapsto g \\ (02) & l \mapsto r, & r \mapsto b \\ (10) & l \mapsto g, & r \mapsto r \\ (11) & l \mapsto g, & r \mapsto g \\ (12) & l \mapsto g, & r \mapsto b \\ (20) & l \mapsto b, & r \mapsto r \\ (21) & l \mapsto b, & r \mapsto g \\ (22) & l \mapsto b, & r \mapsto b \\ \end{array} $$

And there you have it: function types are the exponential types. 🤯
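You can let a computer do the enumeration. In this Python sketch, each function is represented by a dictionary from inputs to outputs, and the helper name `all_functions` is hypothetical.

```python
from itertools import product

C = ["r", "g", "b"]
D = ["l", "r"]


def all_functions(domain, codomain):
    """Every total function from domain to codomain, each as a dict."""
    return [
        dict(zip(domain, outputs))
        for outputs in product(codomain, repeat=len(domain))
    ]


c_to_d = all_functions(C, D)  # inhabitants of C -> D
d_to_c = all_functions(D, C)  # inhabitants of D -> C
```

Choosing an output independently for each input is exactly why the count multiplies out to (size of codomain) raised to the (size of domain).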

Sets

Given a function $A$ of type $t \rightarrow \mathsf{Bool}$, you can imagine collecting together, in an object, all of the inhabitants $x$ of type $t$ for which $A\,x = \textsf{true}$. That’s a set. Yes, that’s what a set is in Type Theory. It is that simple.

Type Notation	Set Notation	Explanation
$\lambda x_{\textsf{Nat}}.\,x=0$	$\{0\}$	The function returns true only for $0$
$\lambda x_{\textsf{Nat}}.\,x=2 \vee x=3 \vee x=5$	$\{2, 3, 5\}$	The function returns true only for $2$, $3$, and $5$
$\lambda x_{\textsf{Nat}}.\,\textsf{true}$	$\mathbb{N}$	The function returns true for every natural number
$\lambda x_{\textsf{Nat}}.\,x > 5$	$\{ x \in \mathbb{N} \mid x > 5 \}$	The function returns true exactly for the natural numbers greater than $5$
$\lambda x_{\textsf{Int}}.\,\textsf{true}$	$\mathbb{Z}$	The function returns true for every integer
$\lambda x_t.\,\textsf{false}$	$\varnothing$	The function returns true for no arguments
$(A\,x)_{\textsf{Bool}}$	$x \in A$	The application returns true if and only if $x$ is in the set $A$

That’s right. You can use set notation inside of Type Theory.
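A few rows of the table, written as Python predicates, show how membership becomes plain function application. This is only a sketch: Python's `int` stands in for $\textsf{Nat}$ (so nothing stops you from passing a negative number), and the names are ours.

```python
from typing import Callable

NatSet = Callable[[int], bool]  # a "set" is a function into Bool

singleton_zero: NatSet = lambda x: x == 0
two_three_five: NatSet = lambda x: x == 2 or x == 3 or x == 5
all_nats: NatSet = lambda x: True
above_five: NatSet = lambda x: x > 5
empty: NatSet = lambda x: False


def member(x: int, a: NatSet) -> bool:
    """x is in A exactly when applying A to x gives true."""
    return a(x)


# List the members of a set among the naturals below 10.
below_ten = [x for x in range(10) if member(x, two_three_five)]
```

Notice that a predicate like `above_five` describes infinitely many members without ever storing any of them.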

Exercise: Do you like Type Theory?

How do you think Type Theory deals with relations? Remember, in set theory, relations come first and functions are just special kinds of relations. In Type Theory, with functions coming first, you might think relations are just a kind of function, and you’d be right! But how exactly does that work?

Even in Type Theory, we write sets with curly braces. But understand the braces are sugaring function expressions.

In Type Theory, sets are functions with codomain Bool.

Types as an Algebra

Now let’s package things up in a slightly different, and more abstract, fashion—a useful exercise when building up a theory. As we build our catalog, we’ll add some new important information.

Here are some types shared by almost all constructive type theories:

$\textsf{Void}$, sometimes written $\bot$, or $\textbf{0}$, is a type, called the empty type, the type of no inhabitants.
$\textsf{Unit}$, sometimes written $()$, or $\textbf{1}$, is a type, called the unit type, the type of exactly one inhabitant.
You can inductively define types with exhaustive (so the “nothing else is an inhabitant” is implicit) construction rules, for instance $\textsf{Bool}$, $\textsf{Nat}$, $\textsf{Int}$, $\textsf{Unicode}$, $\textsf{Float64}$, etc. For the rules to make sense, they must be (1) well-founded, i.e., every inhabitant must be buildable by a finite number of applications of the rules, and (2) strictly positive, i.e., no premise may contain the type being defined on the left-hand side of any function arrow.
Why is strict positivity required?
Imagine this illegal rule:
$$ \dfrac{f\!: T \rightarrow \textsf{Bool}}{\textsf{fun}\,f\!:T} $$
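This says that one of the things a $T$ can be is a set of $T$’s: every function $T \rightarrow \textsf{Bool}$ gets its own distinct inhabitant $\textsf{fun}\,f$ of $T$. That would embed $\mathcal{P}(T)$ inside $T$, giving $|\mathcal{P}(T)| \leq |T|$. But Cantor’s Theorem, which we proved back in the Set Theory notes, says $|T| < |\mathcal{P}(T)|$ for every $T$. Contradiction! Strict positivity is the cheap, purely syntactic check that catches this before you ever get to the contradiction.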

Exercise: One thing we’re skipping in these notes is the idea of mutually recursive type definitions. See if there are any issues we need to take into account.
If $t_1$ and $t_2$ are types, then $t_1 \times t_2$ is a type, called a product type. It is the type of tuples $(a, b)$ where $a\!:\!t_1$ and $b\!:\!t_2$.
If $t_1$ and $t_2$ are types, then $t_1 + t_2$ is a type, called a sum type. It is the type of values $\textsf{inl}\,a$ for $a\!:\!t_1$ and $\textsf{inr}\,b$ for $b\!:\!t_2$.
If $t_1$ and $t_2$ are types, then $t_1 \rightarrow t_2$ is a type, known as a function type. It is the type of functions that take an argument of type $t_1$ and return a value of type $t_2$.
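
If you want to poke at these type formers in a proof assistant, here is a minimal Lean 4 sketch. It relies on Lean's built-in names, which differ slightly from ours: `Empty` plays the role of $\textsf{Void}$, and `⊕` (with injections `Sum.inl` and `Sum.inr`) plays the role of $+$.

```lean
-- Unit has exactly one inhabitant.
example : Unit := ()

-- Product: a pair with one component from each type.
example : Bool × Nat := (true, 3)

-- Sum: a tagged value from the left type or from the right type.
example : Bool ⊕ Nat := Sum.inl true
example : Bool ⊕ Nat := Sum.inr 3

-- Function type.
example : Nat → Bool := fun n => n == 0

-- Void (Lean's `Empty`) has no inhabitants, so a function out of it
-- needs no cases at all.
example : Empty → Nat := fun e => nomatch e
```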

The union and intersection types we saw before do not exist in all type theories:

If $t_1$ and $t_2$ are types, then $t_1 \mid t_2$ is a type, called a union type. It is the type of values that are either of type $t_1$ or of type $t_2$.
If $t_1$ and $t_2$ are types, then $t_1\,\texttt{&}\, t_2$ is a type, called an intersection type. It is the type of values that are of both type $t_1$ and type $t_2$.

Each of the types above is either primitive or constructed from constituent types. Nice and clean.

But we can do more.

We can make a type dependent not just on another type, but upon a value. That’s a topic for the next section. It’s wild.

Dependent Types

Time for something cool.

Motivation

Remember our definition of the type “lists of type $t$”:

$$ \dfrac{}{[\,]\!: t^*} \quad\quad\quad \dfrac{x\!: t \quad y\!: t^*}{(x :: y)\!: t^*} $$

It says that the type of lists (of type $t$) is inhabited by lists of any length whatsoever. This means that the $\textsf{head}$ and $\textsf{tail}$ functions must return an option type (like we saw above), or return a custom type with an error variant, or worse, just be undefined or crash on an empty list. Ditto for accessing an element out of bounds. Not only that, but this type gives us no guarantee that reversing a list preserves its length. If we could make types such as “lists of length $5$” or “lists of length $8$”, then all these functions could be made safe without needing complicated machinery or risking a crash.

But making a separate type for each $n$ is infeasible. We need a single expression that can make the type of lists of length $n$ for any given $n$—that is, a type dependent upon a value, not just on other types.

This can be done!

A First Dependent Type

Here is how to define $\mathsf{Vec}\;t\;n$, the type of length-$n$ vectors of $t$ (note that we normally use the term vector instead of list when talking about fixed-length sequences):

$$ \dfrac{}{\mathsf{nil}\!:\mathsf{Vec}\;t\;0} \quad\quad\quad \dfrac{x\!:t \quad\;\; y\!:\mathsf{Vec}\;t\;n}{(x :: y)\!:\mathsf{Vec}\;t\;(n+1)} $$

$\mathsf{Vec}\;t\;n$ is a dependent type, so named because the type “depends on” a term.

This is a crazy powerful idea. Look at everything we get.

The vector length is now part of the type itself. For example $\textsf{Vec}\;\textsf{Nat}\;3$ is the type of all length-$3$ vectors of natural numbers, inhabited by vectors like $(2\!::\!(1\!::\!(3\!::\!\textsf{nil})))$. $\textsf{Vec}\;\textsf{Nat}\;5$ is a completely different type.
Because $\textsf{Vec}\;t\;n$ for each different $n$ is a genuinely different type, $\textsf{Vec}\;t$, all by itself, is a mapping from $\textsf{Nat}$ to types—not a type itself, but a family of types indexed by a natural number.
You cannot even write down a $\textsf{Vec}\;t\;3$ with four elements in it; there’s simply no rule that would let you derive it. This is wild: we’re getting to where we can capture some program correctness rules, that we thought might have to wait until runtime to check, in the compile-time (static) type checker!
The operation to fetch the first element (remember the function we called $\textsf{head}$ above?) can be typed as $\mathsf{Vec}\;t\;(n+1) \rightarrow t$ (more precisely, $\Pi_{n:\textsf{Nat}} (\mathsf{Vec}\;t\;(n+1) \rightarrow t)$, as we’ll see below). The $n+1$ in the domain guarantees the sequence is non-empty at the type level with no runtime check needed, no option type needed for the codomain, and no special custom error type needed. Compare to $t^*$, where nothing stops you from asking for the first element of $[\,]$.

Dependent Types allow a compile-time type checker to find errors one would typically expect to occur at runtime.
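
For the curious, here is roughly how the two rules for $\mathsf{Vec}\;t\;n$ look in Lean 4. This is a sketch under our own naming (`Vec`, `Vec.head`, `v`), not a library definition, and `cons` is written prefix rather than with the infix $::$.

```lean
-- The two construction rules, transcribed directly.
inductive Vec (t : Type) : Nat → Type where
  | nil  : Vec t 0
  | cons {n : Nat} : t → Vec t n → Vec t (n + 1)

-- head only accepts vectors of length n + 1, so no `nil` case is needed.
def Vec.head {t : Type} {n : Nat} : Vec t (n + 1) → t
  | .cons x _ => x

-- A length-3 vector of naturals; the length is checked against the type.
def v : Vec Nat 3 := .cons 2 (.cons 1 (.cons 3 .nil))

example : v.head = 2 := rfl

-- Writing `(Vec.nil : Vec Nat 0).head` would be rejected by the type
-- checker, because 0 does not have the form n + 1.
```

The empty-vector case is ruled out before the program ever runs, which is exactly the compile-time guarantee described above.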

More Examples

Besides $\textsf{Vec}\;t\;n$, here are a few more examples of dependent types.

$\textsf{Fin}\;n$ is the type of natural numbers less than $n$. For example, $\textsf{Fin}\;3$ is inhabited by $0$, $1$, and $2$, but not $3$ or any larger number.
$\textsf{Matrix}\;m\;n$ is the type of $m \times n$ matrices. For example, $\textsf{Matrix}\;2\;3$ is the type of $2$-by-$3$ matrices.
Height-balanced trees of a given height
The type of pairs in which the first element is a natural number $n$, and the type of the second element is $\textsf{Bool}$ if $n$ is even, and $\textsf{Nat}$ if $n$ is odd

A few more, with a slightly different flavor, that are handled differently in practice:

The type of pairs of numbers whose first element is less than the second
The type of pairs of numbers that add up to a specific number $n$
The type of sorted lists
The type of theorems of a logical system

A Definition

Definition time. A dependent type is a family of types $B$ indexed by terms $x\!:\!A$, that is, for each different $a\!:\!A$, $B(a)$ may be a genuinely different type. So $B$ is less of a type than a type family. Think of $B$ as a function from terms of type $A$ to types: for a given value $a$, $B(a)$ is a type.

In the example above, $\textsf{Vec}\;t$ maps natural numbers to types, and is a dependent type. For each specific $n$, $\textsf{Vec}\;t\;n$ is a distinct type representing vectors of length $n$, e.g., $\textsf{Vec}\;t\;3$.

You will frequently see functions that operate on dependent types, and pairs containing elements from a dependent type. These constructs have a special notation: $\Pi$ (dependent functions) will extend $\to$ and $\Sigma$ (dependent pairs) will extend $\times$.

Relax, the notation is not scary at all.

Dependent Function Types ($\Pi$)

$\Pi_{x:A} B$ is the dependent type of functions that, given $a\!:\!A$, return a term whose type is specifically $B[x\mapsto a]$. When $B$ doesn’t actually depend on $x$, $\Pi_{x:A} B$ is just $A \rightarrow B$, so $\Pi$ is a strict generalization of the ordinary function type.

Let’s look at the types of some common vector operations, and see how dependent types make them more precise.

$\textsf{head}: \Pi_{n:\textsf{Nat}} (\mathsf{Vec}\;t\;(n+1) \rightarrow t)$

$\textsf{tail}: \Pi_{n:\textsf{Nat}} (\mathsf{Vec}\;t\;(n+1) \rightarrow \mathsf{Vec}\;t\;n)$

$\textsf{reverse}: \Pi_{n:\textsf{Nat}} (\mathsf{Vec}\;t\;n \rightarrow \mathsf{Vec}\;t\;n)$

$\textsf{append}: \Pi_{m:\textsf{Nat}} \Pi_{n:\textsf{Nat}} (\mathsf{Vec}\;t\;m \rightarrow \mathsf{Vec}\;t\;n \rightarrow \mathsf{Vec}\;t\;(m+n))$

What is the $\Pi$ for?
It’s just the notation that people use to denote dependent function types. It’s a binding operator, just like $\lambda$, $\forall$, and $\exists$ that you may have seen before. Without it, the $n$ would be free and just hanging there. The $\Pi$ makes it clear it’s a bound variable, and the type depends on it in a controlled way.

Ok, so yes, you may ask: “isn’t the $t$ also free?” We probably should have bound it too, something like $\Pi_{t:\textsf{Type}} \Pi_{n:\textsf{Nat}} (\mathsf{Vec}\;t\;(n+1) \rightarrow t)$, or we could use a simple $\forall$ like we saw above in our section on polymorphic types. It’s often elided in practice.
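
In Lean 4, a $\Pi$ type is written with a named binder in front of the arrow. The sketch below redefines our hypothetical `Vec` so it stands alone, then gives `tail` with both $t$ and $n$ bound explicitly.

```lean
inductive Vec (t : Type) : Nat → Type where
  | nil  : Vec t 0
  | cons {n : Nat} : t → Vec t n → Vec t (n + 1)

-- tail takes the element type and the length as ordinary arguments,
-- and the rest of its type mentions both of them.
def Vec.tail (t : Type) (n : Nat) : Vec t (n + 1) → Vec t n
  | .cons _ xs => xs

-- The type of tail, spelled out as a dependent function type:
-- Π (t : Type), Π (n : Nat), Vec t (n + 1) → Vec t n
example : (t : Type) → (n : Nat) → Vec t (n + 1) → Vec t n := Vec.tail
```

In day-to-day Lean code both binders would usually be made implicit with curly braces, which is the eliding mentioned above.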

Dependent Pair Types ($\Sigma$)

$\Sigma_{x:A} B(x)$ is the dependent type of pairs $(a, b)$ where $a\!:\!A$ and $b\!:\!B(a)$, that is, the type of the second component depends on the value of the first. When $B$ doesn’t depend on $x$, $\Sigma_{x:A} B$ is just $A \times B$, so $\Sigma$ generalizes the ordinary pair type the same way $\Pi$ generalizes $\rightarrow$.

The classic example is a pair of a length $n$ and a vector of that exact length: $\Sigma_{n:\textsf{Nat}} (\textsf{Vec}\;t\;n)$.

So: $\Pi$ says “for every $a\!:\!A$, here’s a function that will produce something of the type $B(a)$ specific to that $a$.” $\Sigma$ says “here is some particular $a\!:\!A$, paired with a value of the type $B(a)$ specific to that $a$.” Hey...looks like $\Pi$ gives $\forall$ vibes and $\Sigma$ gives $\exists$ vibes. We’ll see soon that that is more than a coincidence.
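
Here is a small Lean 4 sketch of a dependent pair, again using our own stand-alone `Vec`. Lean writes $\Sigma_{n:\textsf{Nat}} (\textsf{Vec}\;\textsf{Nat}\;n)$ as `(n : Nat) × Vec Nat n`; the name `someVec` is just for illustration.

```lean
inductive Vec (t : Type) : Nat → Type where
  | nil  : Vec t 0
  | cons {n : Nat} : t → Vec t n → Vec t (n + 1)

-- A length packaged together with a vector of exactly that length.
def someVec : (n : Nat) × Vec Nat n :=
  ⟨2, .cons 10 (.cons 20 .nil)⟩

-- The first component is the length we stored.
example : someVec.1 = 2 := rfl

-- Changing the first component to 3 without adding an element would not
-- typecheck: the second component must inhabit Vec Nat 3.
```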

Refinement Types

A refinement type is a subtype formed from its supertype by applying a restriction based on a value. A good example is “the positive naturals” which we write as $\Sigma_{n:\textsf{Nat}} (n > 0)$ — a pair of a number together with a proof that it’s positive. This only makes sense once you allow a proposition ($n > 0$) to appear as a type—something we’ll see how to do soon.
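
Lean 4 has a built-in notation for this pattern, the subtype `{ n : Nat // n > 0 }`, which is a dependent pair whose second component is a proof. The sketch below uses it for the positive naturals; the names `Pos` and `three` are ours.

```lean
-- The positive naturals: a number together with evidence that it is > 0.
abbrev Pos := { n : Nat // n > 0 }

-- To build one, supply the number and a proof; `by decide` asks Lean
-- to check the (decidable) inequality for us.
def three : Pos := ⟨3, by decide⟩

-- Forgetting the proof gives back the underlying number.
example : three.val = 3 := rfl

-- `⟨0, by decide⟩ : Pos` would be rejected, since 0 > 0 has no proof.
```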

In Programming Languages

Think of the correctness and security implications of dependent types! You can encode invariants directly in a function’s type rather than checking them at runtime or just asserting them in a comment. A sort function can be typed to guarantee its output is a permutation of its input, a matrixMultiply function can be typed to reject mismatched dimensions at compile time, and so on.

There’s a deeper connection, though. It has to do with Propositions as Types, which is coming up next.

Propositions as Types

Much of our type theory discussion so far seems like more than just looking for foundations of mathematics and computer science. There’s something that looks pretty practical too. Type systems in programming languages seem to come from this idea. So does logical reasoning. There seems to be a connection here.

Perhaps you’ve heard of the Lean theorem prover / proof assistant / programming language? Or Rocq (formerly known as Coq)? Or HOL, Isabelle, Agda, or Idris? These systems and languages enable proofs that are so complex that they are beyond an individual human’s ability to produce in their lifetime. And they perform formal verification (i.e., full correctness proofs) on systems that are not allowed to fail (think medicine and avionics). Programs in these languages prove theorems. This is starting to get interesting.

These systems are direct applications of Type Theory. Not Set Theory. Specifically, they come from a discovery of a deep relationship between logic, type theory, and computation. A relationship people figured out over time as they noticed some interesting similarities between these fields. Similarities that had to be more than coincidences:

The Logical Formula	which means	is like the type	which is
$F$	Falsity	$\textsf{Void}$	the empty type (sometimes written $\bot$)
$T$	Truth	$\textsf{Unit}$	the type with only one inhabitant $()$
$A \wedge B$	Conjunction (And)	$A \times B$	the pair type
$A \vee B$	Disjunction (Or)	$A + B$	the sum type
$A \supset B$	Implication	$A \rightarrow B$	the function type
$\forall x. ((x \in A) \supset B)$ where $x$ is free in $B$	Universal Quantification	$\Pi_{x: A} B$	the dependent function type
$\exists x. ((x \in A) \wedge B)$ where $x$ is free in $B$	Existential Quantification	$\Sigma_{x: A} B$	the dependent pair type

It turns out that you can prove a formula (in this context formulas and propositions are the same thing) by constructing an inhabitant of the corresponding type. Inferring the typing judgments is exactly the same as working through the logical inferences.
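
To make that concrete, here are a few rows of the table played out in Lean 4. Each `example` states a proposition as a type and then proves it by writing a program of that type. This is only a sketch of the idea; the particular propositions are ones we picked for illustration.

```lean
-- Conjunction is a pair: swap the components.
example (A B : Prop) : A ∧ B → B ∧ A :=
  fun ⟨a, b⟩ => ⟨b, a⟩

-- The very same program, read as ordinary data manipulation.
example (A B : Type) : A × B → B × A :=
  fun (a, b) => (b, a)

-- Disjunction is a sum: do case analysis on which side we were given.
example (A B : Prop) : A ∨ B → B ∨ A :=
  fun h => match h with
    | .inl a => .inr a
    | .inr b => .inl b

-- Implication is a function: modus ponens is function application.
example (A B : Prop) : (A → B) → A → B :=
  fun f a => f a

-- Falsity is the empty type: from an inhabitant of it, anything follows.
example (A : Prop) : False → A :=
  fun h => h.elim

-- Universal quantification is a dependent function.
example : ∀ n : Nat, n + 0 = n :=
  fun _ => rfl

-- Existential quantification is a dependent pair: a witness plus a proof.
example : ∃ n : Nat, n + 2 = 5 :=
  ⟨3, rfl⟩
```

If any of these programs had the wrong type, the type checker would reject it, and that rejection is exactly a failed proof.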

The fact that formulas corresponded directly with types was noticed by Haskell Curry and William Alvin Howard at various times during the 1930s through the 1960s. This observation is now known as the Curry-Howard Correspondence, or the Curry-Howard Isomorphism. Here’s a video showing why it is amazing:

The correspondence shows that not only:

Propositions are Types

but that:

Proofs are Programs

and further that:

Computation is Proof Normalization

Another video, which covers a bit more, including a worked out example:

To really go deeper and to see more technical details of this correspondence, you really need to read Philip Wadler's detailed but very accessible paper on the subject.

Exercise: Read it.

Next, read the Wikipedia page on the Curry-Howard Correspondence, to see not only the extent of the correspondence, but also how much hardcore computer science research has come out of this discovery.

Exercise: Use your reading and study of the Wadler paper and Wikipedia article to build your own table of the Curry-Howard Correspondence, with as many rows as you can find. Note that the correspondence is not limited to formulae and types. Try to get at least 30 rows.

Universes

On the one hand, it seems like values and types are very different things. On the other hand, we’ve been using $t$ in places as if it were a variable that ranges over types. Can types be values of a type?

In our discussion of dependent function types, we mentioned that the free-floating $t$ was really a variable and needed to be bound with something like $\Pi_{t:\textsf{Type}} \Pi_{n:\textsf{Nat}} (\mathsf{Vec}\;t\;(n+1) \rightarrow t)$. For that to typecheck, $\textsf{Type}$ has to be a type whose inhabitants are types (like $\textsf{Bool}$ or $\textsf{Nat}$). Let’s try to define this type:

$\dfrac{}{\textsf{Bool}\!:\textsf{Type}} \quad\quad \dfrac{}{\textsf{Nat}\!:\textsf{Type}}$

$\dfrac{t\!:\textsf{Type}}{t^*\!:\textsf{Type}}$

$\dfrac{t_1\!:\textsf{Type}\quad t_2\!:\textsf{Type}}{t_1 \times t_2\!:\textsf{Type}}$

$\dfrac{t\!:\textsf{Type}\quad n\!:\textsf{Nat}}{\mathsf{Vec}\;t\;n\!:\textsf{Type}}$

Every type we’ve built is an inhabitant of $\textsf{Type}$. So what type is $\textsf{Type}$ itself? You might say:

$\dfrac{}{\textsf{Type}\!:\textsf{Type}}$

Okay, that looks pretty suspicious. It feels like an instance of the same self-reference (“set of all sets”) that doomed Naïve Set Theory. In fact, $\textsf{Type}\!:\!\textsf{Type}$ is inconsistent, as was shown in 1972 by Jean-Yves Girard, who demonstrated that from that assumption, you can find an inhabitant of the empty type $\textsf{Void}$, which by Propositions-as-Types is the same as proving $F$. This is now known as Girard’s paradox. The original work was awesome but complex, and though it was simplified by Antonius Hurkens in 1995, the proof is still too long to show here. As expected, it comes about by cleverly employing self-reference.

Exercise: Look up Girard’s paradox (or Hurkens’ simplified version) and summarize, in your own words, what self-referential type gets constructed and how it leads to a proof of anything.

The way we get consistent type theories is the same way that some folks get consistent set theories: stratification. Instead of one self-swallowing $\textsf{Type}$, build an infinite tower of universes, each classified by the next one up:

$$ \dfrac{}{\textsf{Type}_i\!:\textsf{Type}_{i+1}} $$

So $\textsf{Bool}\!:\!\textsf{Type}_0$, and $\textsf{Type}_0\!:\!\textsf{Type}_1$, and $\textsf{Type}_1\!:\!\textsf{Type}_2$, and so on, forever. No universe is ever a member of itself, so the self-reference of Girard’s Paradox cannot occur.

We need to add one more rule so that values and types can live together in any universe they need to, namely cumulativity:

$$ \dfrac{t\!:\textsf{Type}_i}{t\!:\textsf{Type}_{i+1}} $$

Cumulativity says every universe’s inhabitants are also inhabitants of every universe above it.

Exercise: What massive annoyance would occur if we didn’t have cumulativity? (Hint: think about the type of $\textsf{Vec}$.)

AMBIGUOUS NOTATION ALERT
Writing the subscript every single time is unbearable, especially since you rarely care which level you’re at, so in practice, people just write $\textsf{Type}$, unsubscripted, and mean it ambiguously: “some $\textsf{Type}_i$, for whichever $i$ makes this typecheck.” This is known as typical ambiguity. It looks frightening because you see $\textsf{Type}\!:\!\textsf{Type}$. You just have to remember it abbreviates $\textsf{Type}_i\!:\!\textsf{Type}_{i+1}$ for some $i$.

Modern proof assistants make the convention rigorous instead of merely informal, under the name universe polymorphism: a definition like $\mathsf{Vec}$ is parameterized by an explicit level variable $i$, giving it a type like $\mathsf{Vec} : \Pi_{i:\mathbb{N}}\, \textsf{Type}_i \rightarrow \textsf{Nat} \rightarrow \textsf{Type}_i$, and each use site instantiates $i$ to whatever level is needed, with the typechecker solving for consistent levels behind the scenes.
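
Here is how the tower and universe polymorphism look in Lean 4, as a rough sketch. Lean writes $\textsf{Type}_i$ as `Type i` (with `Type` short for `Type 0`), and our stand-alone `Vec` is declared over a level variable `u`. One caveat worth stating loudly: Lean's universes are not cumulative, so it leans on universe polymorphism rather than on the cumulativity rule above.

```lean
-- Each universe is classified by the next one up.
#check (Nat : Type)
#check (Type : Type 1)
#check (Type 1 : Type 2)

-- A universe-polymorphic Vec: it lives in whatever universe t lives in.
universe u

inductive Vec (t : Type u) : Nat → Type u where
  | nil  : Vec t 0
  | cons {n : Nat} : t → Vec t n → Vec t (n + 1)

-- Here u is instantiated to 0, because Nat : Type 0.
example : Vec Nat 0 := .nil

-- Here u is instantiated to 1, because Type : Type 1.
-- This is a length-1 vector whose single element is the type Nat.
example : Vec Type 1 := .cons Nat .nil
```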

Do be careful with the terminology. A universe is one level (a $\textsf{Type}_i$ for some specific $i$). The cumulative hierarchy is the whole tower, $\textsf{Type}_0, \textsf{Type}_1, \textsf{Type}_2, \ldots$. Unsubscripted $\textsf{Type}$ is a placeholder standing for one unspecified universe in the hierarchy, resolved by context or by the typechecker.

Type Theories Throughout History

Just like there are many set theories, there are quite a few type theories. Here is a catalog placed into a rough timeline:

Russell’s Theory of Types: (1908) Bertrand Russell’s paper Mathematical logic as based on the theory of types introduced a “ramified” theory of types, meaning that the theory stratifies types into levels (based on whether the object is an individual, a property, or a property of properties) and further subdivides them into orders (ramification). He did this to resolve Russell’s Paradox by ensuring a set can never contain itself or be defined in terms of a totality it belongs to. He wanted all definitions to be predicative. But this was too restrictive! He had to introduce the clunky “axiom of reducibility” just to recover ordinary mathematics (like natural numbers), so no one uses it anymore, but it’s the historical root of everything that follows.
Simple Theory of Types: (Ramsey, 1925; later streamlined by Tarski) A simplification of Russell’s theory that drops the orders and keeps only the type hierarchy, eliminating the need for the axiom of reducibility. While it admits impredicativity, it is vastly more useful: one needs only a stack of types $T_0, T_1, T_2, \ldots$ where each type’s members are sets of the type below it. Every subsequent type theory builds on this simplification.
Exercise: Define predicative and impredicative. Why is impredicativity considered more useful in practice? And how do we make it okay?
Church’s Simple Theory of Types (STT): (1940) Alonzo Church combined his untyped lambda calculus with a simple type discipline, partly to avoid the usual paradoxes. He made Type Theory interesting again! Before Church, people found Zermelo’s work on Set Theory to be just too good, so Type Theory had been somewhat neglected. STT was powerful, useful, elegant, and the direct ancestor of every typed programming language! Really!
System F / The Polymorphic Lambda Calculus: (Girard, 1972; independently, Reynolds, 1974) Jean-Yves Girard (for proof theory) and John Reynolds (for programming languages) independently extended simply typed lambda calculus with universal quantification over types, giving parametric polymorphism, which you might sometimes hear called generics. This is the theoretical basis for polymorphism in the ML family of languages (including Haskell), and (in a restricted form) Java’s and C#’s and Rust’s generics.
Martin-Löf Type Theory (MLTT): (Per Martin-Löf, 1972 onward) An elegant and highly influential intuitionistic type theory that takes dependent types ($\Pi_{x:A} B$ and $\Sigma_{x:A} B$) as primitive, intended as a full foundation for constructive mathematics. MLTT is the direct ancestor of dependently typed proof assistants like Agda and, via the Calculus of Constructions, Rocq.
Andrews’ $Q_0$ and $Q_0^{\infty}$: (Peter Andrews, 1986, refined in the 2nd edition, 2002) A simply-typed higher-order logic descending directly from Church’s 1940 simple theory of types, with a more minimal basis—fewer symbols, fewer axioms, and fewer rules—everything is built from equality! Unlike MLTT and CoC, $Q_0$ stays classical (not constructive) and non-dependent, making it a useful counterpoint to intuitionistic and dependent theories which dominate the field.
The Calculus of Constructions (CoC): (Coquand and Huet, 1988) Combines Martin-Löf-style dependent types with System F-style polymorphism into a single, very expressive type theory. It’s the theoretical core of the Rocq (formerly Coq) proof assistant.
Homotopy Type Theory (HoTT): (Developed through the 2000s, crystallizing around 2006–2013, with Vladimir Voevodsky’s univalence axiom as a key turning point) Reinterprets Martin-Löf’s identity types through the lens of Homotopy Theory, where types are spaces, terms are points, and equalities are paths. This is the newest major branch. There is a very well-known book on this topic.

Case Studies

Let’s look at some type theories. We’ll see two main flavors: (1) Church’s STT, Andrews’ $Q_0$ and $Q_0^{\infty}$, and HOL, which are classical and non-dependent, and (2) Martin-Löf Type Theory, the Calculus of Constructions, and Homotopy Type Theory, which are constructive and dependent. The first group are built like logical systems that have types on terms; the latter build up universes according to the ways we’ve seen above.

Church’s Simple Type Theory

Church’s type theory was introduced in this awesome paper from 1940.

In it, Church presents a type theory from which classical logic emerges. Despite an extremely minimal notation, his system allows logic, sets, and numbers to be represented. We won’t go over his theory here, because there’s a direct successor worth looking at.

Exercise: Skim, but don’t skim too lightly, Church’s paper. Then read about the theory at the Stanford Encyclopedia of Philosophy.

Andrews’ $Q_0$

Peter Andrews was a student of Church’s. His $Q_0$ is pretty much the same as Church’s Simple Type Theory, but with a smaller basis. Instead of defining a handful of constants, he builds everything from functions and the notion of equality. Here’s the syntax, modernized a bit so the function type $(\beta\alpha)$ is now $\alpha \to \beta$, and $\iota$ is now $\textsf{the}$.

Syntax of $\mathcal{Q}_0$

A type is exclusively $\imath$, $o$, or $(\alpha\to\beta)$ for types $\alpha$ and $\beta$.
A variable is $x_\alpha, y_\alpha, \ldots$ for every type $\alpha$.
A constant is exclusively $\textsf{Q}_{\alpha\to\alpha\to o}$ (equality) and $\textsf{the}_{(\alpha\to o)\to \alpha}$ (description) for every type $\alpha$.
A term is exclusively a variable, a constant, or, for terms $F_{\alpha\to\beta}$, $A_\alpha$, and $B_\beta$, $(F_{\alpha\to\beta} A_\alpha)_\beta$, or $(\lambda x_\alpha.\,B_\beta)_{\alpha\to\beta}$.

Some vocabulary:

$\imath$ is the type of individuals.
$o$ is the type of truth values.
$(\alpha\to\beta)$ is the type of functions from type $\alpha$ to type $\beta$.
$(F\,A)$ is an application of term $F$ to term $A$.
$(\lambda x. B)$ is an abstraction, representing a function with parameter $x$ and body $B$.

We can express various objects and formulae from our earlier notes on Logic:

$r_\imath$ is Romeo from the type of individuals
$j_\imath$ is Juliet from the type of individuals
$R_o$ is the proposition “It is raining” (note its type!)
$I_{\imath\to o}$ is the predicate “is Italian”—a particular function from individuals to truth values
$L_{\imath\to(\imath\to o)}$ is “likes”
$(L_{\imath\to(\imath\to o)}\,r_\imath)_{\imath\to o}$ is a function that, when given an individual, will produce whether Romeo likes that individual
$((L_{\imath\to(\imath\to o)}\,r_\imath)_{\imath\to o}\,j_\imath)_o$ is the truth value of “Romeo likes Juliet”

Some allowable sugar:

The $\to$ in function types is right associative, so $\alpha \to \beta \to \gamma$ is interpreted as $\alpha \to (\beta \to \gamma)$ rather than $(\alpha \to \beta) \to \gamma$.
Application is left associative, so $A\,B\,C$ is interpreted as $((A\,B)\,C)$ rather than $(A\,(B\,C))$.
You can drop type symbols only when it is absolutely clear and unambiguous from the context what the types should be

Some of the conventional logical operators seem to be missing from the syntax, but that’s only because they are really just abbreviations, or sugar, for other terms. Here are the common ones, defined in order, since later ones may depend on earlier ones.

Term	What it sugars	Intuition
$A_\alpha = B_\alpha$	$\textsf{Q}_{\alpha\to \alpha\to o}A_\alpha B_\alpha$	Equality is primitive
$A_o \equiv B_o$	$A_o = B_o$	Material equivalence is just boolean equality
$T_o$	$\textsf{Q}_{o\to o\to o}=\textsf{Q}_{o\to o\to o}$	$\textsf{Q}$ equals itself is true
$F_o$	$(\lambda x_o.\,T_o) = (\lambda x_o.\,x_o)$	These two functions are not equal
$\forall x_\alpha.\, A_o$	$(\lambda x_\alpha.\,T_o) = (\lambda x_\alpha.\,A_o)$	For these functions to be equal, $A$ must be true for all $x_\alpha$
$A_o \land B_o$	$(\lambda f_{o\to o\to o}.\,f\,T\,T) = (\lambda f_{o\to o\to o}.\,f\,A\,B)$	Only way so far to get “both of these are true”
$A_o \supset B_o$	$A = (A \land B)$	Material implication
$\neg A_o$	$A_o = F_o$	Negation
$A_o \lor B_o$	$\neg(\neg A \land \neg B)$	De Morgan way to get “$A$ or $B$”
$\exists x_\alpha. A_o$	$\neg(\forall x_\alpha. \neg A)$	There exists at least one $x_\alpha$ such that $A$ holds
$A_\alpha \neq B_\alpha$	$\neg(A_\alpha = B_\alpha)$	Not equals
$\exists_1 x_\alpha. A_o$	$\exists x_\alpha.\,\forall y_\alpha.\,(A[x\mapsto y] \equiv (y = x))$	There exists exactly one $x_\alpha$ such that $A$ holds
$\iota x_\alpha. A_o$	$\textsf{the}_{(\alpha\to o)\to \alpha} (\lambda x_\alpha. A)$	The unique element $x_\alpha$ such that $A$ holds, if it exists
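
The propositional rows of the table above can be sanity-checked by brute force. The Python sketch below is our own illustration, not anything from Andrews: it models type $o$ with Python's two booleans and decides function equality extensionally by trying every argument, which only works because the domains involved are finite. The names `BINARY`, `conj`, `implies`, `neg`, and `disj` are hypothetical.

```python
from itertools import product

BOOLS = (True, False)

def curried(table):
    """Turn a truth table {(a, b): out} into a curried function o -> o -> o."""
    return lambda a: lambda b: table[(a, b)]

# All 16 functions of type o -> o -> o, one per possible truth table.
BINARY = [
    curried(dict(zip(product(BOOLS, repeat=2), outs)))
    for outs in product(BOOLS, repeat=4)
]

T = True
# F is (λx. T) = (λx. x): the two agree everywhere only if every x is T.
F = all(T == x for x in BOOLS)

def conj(a, b):
    # (λf. f T T) = (λf. f A B), compared on every f : o -> o -> o
    return all(f(T)(T) == f(a)(b) for f in BINARY)

def implies(a, b):
    # A = (A ∧ B)
    return a == conj(a, b)

def neg(a):
    # A = F
    return a == F

def disj(a, b):
    # ¬(¬A ∧ ¬B)
    return neg(conj(neg(a), neg(b)))

# Print the truth table each definition produces.
for a, b in product(BOOLS, repeat=2):
    print(a, b, conj(a, b), implies(a, b), disj(a, b))
```

The interesting case is `conj`: if $A$ and $B$ are not both true, some $f$ among the sixteen tells the pair $(T, T)$ apart from $(A, B)$, so the two abstractions are unequal.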

Exercise: Write each of the above nonsugared expressions in fully unabbreviated form, fully parenthesized and with all type symbols restored.

Andrews axiomatized this theory in what’s known as the Hilbert style: separating axioms and inference rules, defining both without hypothetical judgments. His axioms are:

Axiom 1. There are exactly two truth values.

$$ g_{o\to o}\,T \wedge g_{o\to o}\,F \;\equiv\; \forall x_o.\,g_{o\to o}\,x_o $$

If a predicate on booleans holds at $T$ and holds at $F$, it holds at every boolean. The logic is bivalent.
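
As a tiny illustration, the Python sketch below checks the axiom against every function $g$ of type $o \to o$ in a two-valued model. The names `UNARY` and `axiom1_instance` are ours. In such a model the check is nearly a tautology, which is the point: the axiom is what pins the model down to two truth values.

```python
from itertools import product

BOOLS = (True, False)

# The four functions g : o -> o, each stored as a lookup table.
UNARY = [dict(zip(BOOLS, outs)) for outs in product(BOOLS, repeat=2)]

def axiom1_instance(g):
    """Check (g T ∧ g F) ≡ ∀x. g x for one particular g."""
    lhs = g[True] and g[False]
    rhs = all(g[x] for x in BOOLS)
    return lhs == rhs

results = [axiom1_instance(g) for g in UNARY]
print(results)
```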

Axiom Schema 2. Equal things have the same properties.

$$ x_\alpha = y_\alpha \;\supset\; (h_{\alpha\to o}\,x_\alpha = h_{\alpha\to o}\,y_\alpha) $$

If $x=y$, then anything you can say about $x$ via a predicate $h$ is equally true of $y$. (This is a schema since there is an instance of this axiom for each type $\alpha$.)

Axiom Schema 3. Function equality is extensional.

$$ f_{\beta\to\alpha} = g_{\beta\to\alpha} \;\equiv\; \forall x_\beta.\,f_{\beta\to\alpha}\,x_\beta = g_{\beta\to\alpha}\,x_\beta $$

Two functions are equal exactly when they agree on every input. There’s simply no other way for functions to differ. (This is a schema since there is an instance of this axiom for each pair of types $\alpha,\beta$.)

Axiom Schema 4. Beta Conversion.

$$ (\lambda x_\alpha.\,B_\beta)\,A_\alpha \;=\; B_\beta[x_\alpha\mapsto A_\alpha] $$

Applying a function is literally substitution of the argument for the parameter in the body. The substitution is the one we defined in our Logic Notes: a proper substitution that does not capture any free variables in $A$, and achieves this by renaming if necessary. This is a schema since there is an instance of this axiom for each $\lambda$-expression and argument.
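
For comparison, here are rough Lean 4 analogues of Axiom Schemas 2 through 4. Lean's logic is not $\mathcal{Q}_0$, so treat these as loose counterparts rather than translations: `congrArg` and `funext` are Lean's own library facts, and beta conversion holds in Lean by computation.

```lean
-- Axiom Schema 2 analogue: equal arguments give equal results.
example {α : Type} (h : α → Prop) (x y : α) (e : x = y) : h x = h y :=
  congrArg h e

-- Axiom Schema 3 analogue (one direction): functions that agree on
-- every input are equal.
example {α β : Type} (f g : β → α) (e : ∀ x, f x = g x) : f = g :=
  funext e

-- Axiom Schema 4 analogue: applying an abstraction substitutes the
-- argument into the body, and Lean checks this by evaluating.
example : (fun x : Nat => x + 1) 2 = 3 := rfl
```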

Axiom 5. Description.

$$ \mathsf{the}\,(\lambda x_\imath.\,x_\imath = y_\imath) \;=\; y_\imath $$

“The individual equal to $y$” is $y$.

He has only one rule of inference:

Rule R: From $C$ and $A_\alpha = B_\alpha$, infer the result of replacing one occurrence of $A_\alpha$ inside $C$ by an occurrence of $B_\alpha$, provided that the occurrence of $A_\alpha$ in $C$ is not (an occurrence of a variable) immediately preceded by $\lambda$.

In his book, Andrews shows how all of the theorems and rules of first-order logic are derived from these five simple axioms and single rule!

Remember why this is interesting
In Set Theory, we assumed first-order logic to already exist (with its own axioms and inference rules), then gave approximately 10 axioms to define what sets were and how to make new sets. Do you remember the axioms? Can you recite them?
In the Type Theory $Q_0$, we need only define what a function is and how it behaves, then we define all of the logical operators as functions (in many cases, just as sugared expressions). Functions first, then logic.

Let’s translate from Andrews’ Hilbert-style presentation of his theory to the Gentzen-style display of inference rules that we’ve encountered before:

Inference Rules for $\mathcal{Q}_0$

The structural rules ASSUME, WEAKEN, CONTRACT, EXCHANGE, CUT, plus:

$$ \frac{}{ \vdash (\lambda x_\alpha.\,B_\beta)\,A_\alpha \;=\; B[x \mapsto A]}\;{\scriptsize \textrm{BETA}} $$

$$ \frac{\mathcal{H}_1 \vdash A_\alpha = B_\alpha \quad\;\; \mathcal{H}_2 \vdash \mathcal{C}[\ldots A_\alpha \ldots]}{\mathcal{H}_1, \mathcal{H}_2 \vdash \mathcal{C}[\ldots B_\alpha \ldots]}\;{\scriptsize \textrm{R}} $$

$$ \frac{\mathcal{H} \vdash A_\beta = B_\beta}{\mathcal{H} \vdash (\lambda x_\alpha.\,A) = (\lambda x_\alpha.\,B)}\;{\scriptsize \textrm{ABS}\;(x\;\textrm{not free in}\;\mathcal{H})} $$

$$ \frac{\mathcal{H} \vdash A_{\alpha\to\beta}\,x = B_{\alpha\to\beta}\,x}{\mathcal{H} \vdash A = B}\;{\scriptsize \textrm{EXT}\;(x\;\textrm{not free in}\;A, B, \mathcal{H})} $$

$$ \frac{\mathcal{H}_1,\, (A_o = T_o) \vdash B \quad\;\; \mathcal{H}_2,\, (A_o = F_o) \vdash B}{\mathcal{H}_1, \mathcal{H}_2 \vdash B}\;{\scriptsize \textrm{CASES}} $$

$$ \frac{}{ \vdash \forall p_{\imath\to o}.\,\left( (\exists_1 x_\imath.\, p\,x) \supset p\,(\textsf{the}\;p) \right)}\;{\scriptsize \textrm{DESC}} $$

The notation $\mathcal{C}[\ldots A_\alpha \ldots]$ in a rule premise denotes a term $\mathcal{C}$ in which a single, distinguished occurrence of the subterm $A_\alpha$ has been singled out. $\mathcal{C}[\ldots B_\alpha \ldots]$ in a rule conclusion means the same term $\mathcal{C}$ with that one occurrence of $A_\alpha$ replaced by $B_\alpha$, provided that no binder binds a free variable of $A$ or captures a free variable of $B$. Any other occurrences of $A_\alpha$ elsewhere in $\mathcal{C}$ are left alone.

Make sure you understand both provisos on replacement notation. When trying to use $A=B$ to replace an $A$ occurrence with a $B$ occurrence:

No free variable of $A$ may be bound at the occurrence being replaced. You cannot use $c=z$ for replacement into $\lambda c.abcd$ since the $c$ in the body is not replaceable: it’s tied to the $\lambda c$.
No free variable of $B$ may be captured. You cannot use $b=cz$ for replacement into $\lambda c.abcd$ since the resulting $\lambda c.a(cz)cd$ means something different.
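The two provisos can be made concrete with a small sketch. The term representation, the `path` convention, and the names `app`, `free_vars`, and `replace_at` are all assumptions made for this illustration; they are not part of Andrews’ presentation. The function replaces exactly one occurrence and refuses when a binder on the way down binds a free variable of either side of the equation.

```python
from functools import reduce

# Terms: a variable is a str, ("app", f, x) is application, ("lam", v, body) is abstraction.
def app(*terms):
    """Left-associated application: app("a", "b", "c") is (a b) c."""
    return reduce(lambda f, x: ("app", f, x), terms)

def free_vars(t):
    if isinstance(t, str):
        return {t}
    if t[0] == "app":
        return free_vars(t[1]) | free_vars(t[2])
    return free_vars(t[2]) - {t[1]}  # ("lam", v, body)

def replace_at(c, path, a, b, binders=frozenset()):
    """Replace the occurrence of `a` found at `path` inside `c` with `b`.

    `path` is a list of steps: "fun" / "arg" into an application, "body" into a lambda.
    Raises ValueError when either proviso fails.
    """
    if not path:
        if c != a:
            raise ValueError("the selected subterm is not A")
        # Both provisos at once: no binder passed on the way down may bind
        # a free variable of A or capture a free variable of B.
        clash = binders & (free_vars(a) | free_vars(b))
        if clash:
            raise ValueError(f"variables bound at this occurrence: {sorted(clash)}")
        return b
    step, rest = path[0], path[1:]
    if isinstance(c, tuple) and c[0] == "app" and step == "fun":
        return ("app", replace_at(c[1], rest, a, b, binders), c[2])
    if isinstance(c, tuple) and c[0] == "app" and step == "arg":
        return ("app", c[1], replace_at(c[2], rest, a, b, binders))
    if isinstance(c, tuple) and c[0] == "lam" and step == "body":
        return ("lam", c[1], replace_at(c[2], rest, a, b, binders | {c[1]}))
    raise ValueError("path does not match the shape of the term")

term = ("lam", "c", app("a", "b", "c", "d"))   # λc. a b c d
to_b = ["body", "fun", "fun", "arg"]           # the occurrence of b
to_c = ["body", "fun", "arg"]                  # the occurrence of c
legal = replace_at(term, to_b, "b", "z")       # λc. a z c d
```

Using $b = cz$ at `to_b`, or $c = z$ at `to_c`, raises `ValueError` in this sketch, because `c` is bound at those occurrences.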

Exercise: The notation $\mathcal{C}[\ldots A_\alpha \ldots]$ is completely different from the notation $A[x]$ introduced in the Logic notes. The latter simply says that the variable $x$ occurs free in $A$. The former names exactly one occurrence of a subterm (or a replacement term) that may be arbitrarily large. We handled the possible capture of variables in variable substitution with renaming. But why can’t renaming help with the substitution mechanism of rule R?

A note on the translation
In converting from the original Hilbert-style to the Gentzen style, we took a few liberties. We’ve (1) added structural inference rules for convenience in managing hypotheses, (2) split Axiom 3, the equation $(f = g) = \forall x_\alpha.\,f\,x = g\,x$, into two directional rules ABS and EXT, (3) replaced Axiom 1 stating there are only two boolean values ($gT_o \land gF_o = \forall x_o.\,gx$) with CASES, and (4) introduced our own R rule as a mashup of his Rule R and his derived rule RR. The resulting system proves the same theorems.
This is a bold statement that we are offering without proof, so you are invited to show the equivalence between the two presentations!

Here’s an explanation for each rule:

$\textbf{BETA}$: Applying a function to an argument yields the value of the function’s body with the argument substituted for the parameter.
$\textbf{R}$: If two terms are equal, then any property that holds of one should hold of the other. This is Leibniz's law.
$\textbf{ABS}$: Equal bodies give equal functions. This is what lets us generalize: it is the only rule that can build an equation between two abstractions whose bound variable actually occurs in the body.
$\textbf{EXT}$: Two functions are equal if they yield equal outputs for all inputs, i.e., extensionally equal.
$\textbf{CASES}$: If a desired conclusion $B$ holds when a proposition $A$ is assumed True, and the same conclusion $B$ holds when $A$ is assumed False, then $B$ holds unconditionally.
$\textbf{DESC}$: If there exists exactly one element that satisfies a property, then that element can be denoted with a description.

All of the rules from propositional and predicate logic above are derived rules in $Q_0$. Let’s go through just a few. When using rule R, we’ll underline the occurrence being replaced. We’ll use dashed lines when all we are doing is sugaring or desugaring.

We’ll begin by deriving rules that allow us to use the basic equality rules: reflexivity, symmetry, and transitivity. Then we’ll continue to some of the familiar rules you know from propositional and predicate logic, introducing a few convenient rules along the way.

REFL $\frac{}{\vdash A_\alpha = A_\alpha}$

$ \begin{prooftree} \AxiomC{} \RightLabel{$\;\scriptsize{\textrm{BETA}}$} \UnaryInfC{$\vdash (\lambda x_\alpha.\,x_\alpha)\,A_\alpha = A_\alpha$} \AxiomC{} \RightLabel{$\;\scriptsize{\textrm{BETA}}$} \UnaryInfC{$\vdash \underline{\smash{(\lambda x_\alpha.\,x_\alpha)\,A_\alpha}} = A_\alpha$} \RightLabel{$\;\scriptsize{\textrm{R}}$} \BinaryInfC{$\vdash A_\alpha = A_\alpha$} \end{prooftree} $

SYM $\frac{\mathcal{H} \vdash A_\alpha = B_\alpha}{\mathcal{H} \vdash B_\alpha = A_\alpha}$

$ \begin{prooftree} \AxiomC{$\mathcal{H} \vdash A_\alpha = B_\alpha$} \AxiomC{} \RightLabel{$\;\scriptsize{\textrm{REFL}}$} \UnaryInfC{$\vdash \underline{\smash{A_\alpha}} = A_\alpha$} \RightLabel{$\;\scriptsize{\textrm{R}}$} \BinaryInfC{$\mathcal{H} \vdash B_\alpha = A_\alpha$} \end{prooftree} $

TRANS $\frac{\mathcal{H_1} \vdash A_\alpha = B_\alpha \quad \mathcal{H_2} \vdash B_\alpha = C_\alpha}{\mathcal{H_1}, \mathcal{H_2} \vdash A_\alpha = C_\alpha}$

$ \begin{prooftree} \AxiomC{$\mathcal{H_2} \vdash B_\alpha = C_\alpha$} \AxiomC{$\mathcal{H_1} \vdash A_\alpha = \underline{\smash{B_\alpha}}$} \RightLabel{$\;\scriptsize{\textrm{R}}$} \BinaryInfC{$\mathcal{H_2}, \mathcal{H_1} \vdash A_\alpha = C_\alpha$} \RightLabel{$\;\scriptsize{\textrm{EXCHANGE*}}$} \UnaryInfC{$\mathcal{H_1}, \mathcal{H_2} \vdash A_\alpha = C_\alpha$} \end{prooftree} $

APPLY $\frac{\mathcal{H} \vdash F = G}{\mathcal{H} \vdash FA = GA}$

$ \begin{prooftree} \AxiomC{$\mathcal{H} \vdash F = G$} \AxiomC{} \RightLabel{$\;\scriptsize{\textrm{REFL}}$} \UnaryInfC{$\vdash F\,A = \underline{\smash{F}}\,A$} \RightLabel{$\;\scriptsize{\textrm{R}}$} \BinaryInfC{$\mathcal{H} \vdash F\,A = G\,A$} \end{prooftree} $
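For comparison, here is how the same three derived rules look in Lean 4, where the rewriting operator `▸` plays roughly the role of rule R: it rewrites along an equation inside a statement. This is only an analogy, offered as a sketch. Lean’s logic is not $\mathcal{Q}_0$, and the primed names are chosen here to avoid clashing with Lean’s own `Eq.symm`, `Eq.trans`, and `congrFun`.

```lean
-- SYM: rewrite along h inside the reflexivity proof.
theorem sym' {α : Type} {a b : α} (h : a = b) : b = a :=
  h ▸ rfl

-- TRANS: rewrite the right-hand side of h₁ along h₂.
theorem trans' {α : Type} {a b c : α} (h₁ : a = b) (h₂ : b = c) : a = c :=
  h₂ ▸ h₁

-- APPLY: rewrite one occurrence of f in `f x = f x`.
theorem apply' {α β : Type} {f g : α → β} (h : f = g) (x : α) : f x = g x :=
  h ▸ rfl
```

As in the proof trees, each rule is obtained from reflexivity (or an existing equation) plus a single replacement step.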

TRUE-INTRO $\frac{}{\vdash T}$

$ \begin{prooftree} \AxiomC{} \RightLabel{$\;\scriptsize{\textrm{REFL}}$} \UnaryInfC{$\vdash \textsf{Q}_{o\to o\to o} = \textsf{Q}_{o\to o\to o}$} \dashedLine\UnaryInfC{$\vdash T$} \end{prooftree} $

NEG-ELIM (FALSE-INTRO) $\frac{\mathcal{H_1} \vdash \neg A \quad \mathcal{H_2} \vdash A}{\mathcal{H_1}, \mathcal{H_2} \vdash F}$

$ \begin{prooftree} \AxiomC{$\mathcal{H_1} \vdash \neg A$} \dashedLine\UnaryInfC{$\mathcal{H_1} \vdash A = F$} \AxiomC{$\mathcal{H_2} \vdash \underline{\smash{A}}$} \RightLabel{$\;\scriptsize{\textrm{R}}$} \BinaryInfC{$\mathcal{H_1}, \mathcal{H_2} \vdash F$} \end{prooftree} $

EQ-T-ELIM $\frac{\mathcal{H} \vdash A = T}{\mathcal{H} \vdash A}$

$ \begin{prooftree} \AxiomC{$\mathcal{H} \vdash A = T$} \RightLabel{$\;\scriptsize{\textrm{SYM}}$} \UnaryInfC{$\mathcal{H} \vdash T = A$} \AxiomC{} \RightLabel{$\;\scriptsize{\textrm{TRUE-INTRO}}$} \UnaryInfC{$\vdash \underline{\smash{T}}$} \RightLabel{$\;\scriptsize{\textrm{R}}$} \BinaryInfC{$\mathcal{H} \vdash A$} \end{prooftree} $

FALSE-ELIM $\frac{\mathcal{H} \vdash F}{\mathcal{H} \vdash A}$

$ \begin{prooftree} \AxiomC{} \RightLabel{$\;\scriptsize{\textrm{BETA}}$} \UnaryInfC{$\vdash (\lambda x_o.\,x)\,A = A$} \AxiomC{$\mathcal{H} \vdash F$} \dashedLine\UnaryInfC{$\mathcal{H} \vdash (\lambda x_o.\,T) = (\lambda x_o.\,x)$} \AxiomC{} \RightLabel{$\;\scriptsize{\textrm{BETA}}$} \UnaryInfC{$\vdash \underline{\smash{(\lambda x_o.\,T)}}\,A = T$} \RightLabel{$\;\scriptsize{\textrm{R}}$} \BinaryInfC{$\mathcal{H} \vdash \underline{\smash{(\lambda x_o.\,x)\,A}} = T$} \RightLabel{$\;\scriptsize{\textrm{R}}$} \BinaryInfC{$\mathcal{H} \vdash A = T$} \RightLabel{$\;\scriptsize{\textrm{EQ-T-ELIM}}$} \UnaryInfC{$\mathcal{H} \vdash A$} \end{prooftree} $

EQ-T-INTRO $\frac{\mathcal{H} \vdash A}{\mathcal{H} \vdash A = T}$

$ \begin{prooftree} \AxiomC{} \RightLabel{$\;\scriptsize{\textrm{ASSUME}}$} \UnaryInfC{$A = T \vdash A = T$} \AxiomC{} \RightLabel{$\;\scriptsize{\textrm{ASSUME}}$} \UnaryInfC{$A = F \vdash A = F$} \AxiomC{$\mathcal{H} \vdash \underline{\smash{A}}$} \RightLabel{$\;\scriptsize{\textrm{R}}$} \BinaryInfC{$\mathcal{H},\,A = F \vdash F$} \RightLabel{$\;\scriptsize{\textrm{FALSE-ELIM}}$} \UnaryInfC{$\mathcal{H},\,A = F \vdash A = T$} \RightLabel{$\;\scriptsize{\textrm{CASES}}$} \BinaryInfC{$\mathcal{H} \vdash A = T$} \end{prooftree} $

EQ-T-INTRO-HYP $\frac{\mathcal{H},\, A \vdash C}{\mathcal{H},\, A = T \vdash C}$

NEG-INTRO $\frac{\mathcal{H},\,A \vdash F}{\mathcal{H} \vdash \neg A}$

$ \begin{prooftree} \AxiomC{} \RightLabel{$\;\scriptsize{\textrm{ASSUME}}$} \UnaryInfC{$A = T \vdash A = T$} \RightLabel{$\;\scriptsize{\textrm{EQ-T-ELIM}}$} \UnaryInfC{$A = T \vdash A$} \AxiomC{$\mathcal{H},\,A \vdash F$} \RightLabel{$\;\scriptsize{\textrm{CUT}}$} \BinaryInfC{$\mathcal{H},\,A = T \vdash F$} \RightLabel{$\;\scriptsize{\textrm{FALSE-ELIM}}$} \UnaryInfC{$\mathcal{H},\,A = T \vdash A = F$} \AxiomC{} \RightLabel{$\;\scriptsize{\textrm{ASSUME}}$} \UnaryInfC{$A = F \vdash A = F$} \RightLabel{$\;\scriptsize{\textrm{CASES}}$} \BinaryInfC{$\mathcal{H} \vdash A = F$} \dashedLine\UnaryInfC{$\mathcal{H} \vdash \neg A$} \end{prooftree} $

CONJ-INTRO $\frac{\mathcal{H_1} \vdash A\quad\mathcal{H_2} \vdash B}{\mathcal{H_1}, \mathcal{H_2} \vdash A \land B}$

$ \begin{prooftree} \AxiomC{$\mathcal{H_1} \vdash A$} \RightLabel{$\;\scriptsize{\textrm{EQ-T-INTRO}}$} \UnaryInfC{$\mathcal{H_1} \vdash A = T$} \RightLabel{$\;\scriptsize{\textrm{SYM}}$} \UnaryInfC{$\mathcal{H_1} \vdash T = A$} \AxiomC{} \RightLabel{$\;\scriptsize{\textrm{REFL}}$} \UnaryInfC{$\vdash \lambda f_{o\to o\to o}. fTT = \lambda f_{o\to o\to o}. f\underline{\smash{T}}T$} \RightLabel{$\;\scriptsize{\textrm{R}}$} \BinaryInfC{$\mathcal{H_1} \vdash \lambda f_{o\to o\to o}. fTT = \lambda f_{o\to o\to o}. fA\underline{\smash{T}}$} \AxiomC{$\mathcal{H_2} \vdash B$} \RightLabel{$\;\scriptsize{\textrm{EQ-T-INTRO}}$} \UnaryInfC{$\mathcal{H_2} \vdash B = T$} \RightLabel{$\;\scriptsize{\textrm{SYM}}$} \UnaryInfC{$\mathcal{H_2} \vdash T = B$} \RightLabel{$\;\scriptsize{\textrm{R}}$} \BinaryInfC{$\mathcal{H_1}, \mathcal{H_2} \vdash \lambda f_{o\to o\to o}. fTT = \lambda f_{o\to o\to o}. fAB$} \dashedLine\UnaryInfC{$\mathcal{H_1}, \mathcal{H_2} \vdash A \land B$} \end{prooftree} $

CONJ-ELIM-1 $\frac{\mathcal{H} \vdash A_o \land B_o}{\mathcal{H} \vdash A_o}$

$ \begin{prooftree} \AxiomC{$\mathcal{H} \vdash A \land B$} \dashedLine\UnaryInfC{$\mathcal{H} \vdash \lambda f_{o\to o\to o}. fTT = \lambda f_{o\to o\to o}. fAB$} \RightLabel{$\;\scriptsize{\textrm{APPLY}}$} \UnaryInfC{$\mathcal{H} \vdash (\lambda f_{o\to o\to o}. fTT)(\lambda x. \lambda y. x) = (\lambda f_{o\to o\to o}. fAB)(\lambda x. \lambda y. x) $} \RightLabel{$\;\scriptsize{\textrm{BETA}}$} \UnaryInfC{$\mathcal{H} \vdash (\lambda x. \lambda y. x)TT = (\lambda x. \lambda y. x)AB$} \RightLabel{$\;\scriptsize{\textrm{BETA}}$} \UnaryInfC{$\mathcal{H} \vdash (\lambda y. T)T = (\lambda y. A)B$} \RightLabel{$\;\scriptsize{\textrm{BETA}}$} \UnaryInfC{$\mathcal{H} \vdash T = A$} \RightLabel{$\;\scriptsize{\textrm{SYM}}$} \UnaryInfC{$\mathcal{H} \vdash A = T$} \RightLabel{$\;\scriptsize{\textrm{EQ-T-ELIM}}$} \UnaryInfC{$\mathcal{H} \vdash A$} \end{prooftree} $

CONJ-ELIM-2 $\frac{\mathcal{H} \vdash A_o \land B_o}{\mathcal{H} \vdash B_o}$

$ \begin{prooftree} \AxiomC{$\mathcal{H} \vdash A \land B$} \dashedLine\UnaryInfC{$\mathcal{H} \vdash \lambda f_{o\to o\to o}. fTT = \lambda f_{o\to o\to o}. fAB$} \RightLabel{$\;\scriptsize{\textrm{APPLY}}$} \UnaryInfC{$\mathcal{H} \vdash (\lambda f_{o\to o\to o}. fTT)(\lambda x. \lambda y. y) = (\lambda f_{o\to o\to o}. fAB)(\lambda x. \lambda y. y) $} \RightLabel{$\;\scriptsize{\textrm{BETA}}$} \UnaryInfC{$\mathcal{H} \vdash (\lambda x. \lambda y. y)TT = (\lambda x. \lambda y. y)AB$} \RightLabel{$\;\scriptsize{\textrm{BETA}}$} \UnaryInfC{$\mathcal{H} \vdash (\lambda y. y)T = (\lambda y. y)B$} \RightLabel{$\;\scriptsize{\textrm{BETA}}$} \UnaryInfC{$\mathcal{H} \vdash T = B$} \RightLabel{$\;\scriptsize{\textrm{SYM}}$} \UnaryInfC{$\mathcal{H} \vdash B = T$} \RightLabel{$\;\scriptsize{\textrm{EQ-T-ELIM}}$} \UnaryInfC{$\mathcal{H} \vdash B$} \end{prooftree} $
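The conjunction proofs all rest on the encoding $A \land B \equiv (\lambda f.\,fTT) = (\lambda f.\,fAB)$. A brute-force check can make it plausible that this really is conjunction. The sketch below models type $o$ with Python booleans and treats function equality extensionally by trying all 16 functions of type $o\to o\to o$. The names `all_binary_functions`, `conj`, `first`, and `second` are invented for this illustration.

```python
from itertools import product

BOOLS = (True, False)

def all_binary_functions():
    """Yield all 16 curried functions of type o -> o -> o, one per truth table."""
    inputs = list(product(BOOLS, repeat=2))
    for outputs in product(BOOLS, repeat=4):
        table = dict(zip(inputs, outputs))
        # Bind `table` as a default argument so each lambda keeps its own table.
        yield lambda x, table=table: (lambda y: table[(x, y)])

def conj(a, b):
    """(λf. f T T) = (λf. f a b), with equality checked on every f."""
    lhs = lambda f: f(True)(True)
    rhs = lambda f: f(a)(b)
    return all(lhs(f) == rhs(f) for f in all_binary_functions())

# The two projections used in CONJ-ELIM-1 and CONJ-ELIM-2.
first = lambda x: lambda y: x
second = lambda x: lambda y: y

truth_table = {(a, b): conj(a, b) for a in BOOLS for b in BOOLS}
```

Applying both sides of the equation to `first` or `second` is exactly what the APPLY step in the elimination proofs does: if the two functions are equal, the projections must agree, which forces $A$ (respectively $B$) to equal $T$.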

DISJ-INTRO-1 $\frac{\mathcal{H} \vdash A}{\mathcal{H} \vdash A \lor B}$

TODO

DISJ-INTRO-2 $\frac{\mathcal{H} \vdash B}{\mathcal{H} \vdash A \lor B}$

TODO

DISJ-ELIM $\frac{\mathcal{H_1} \vdash A \lor B \quad \mathcal{H_2},\,A \vdash C \quad \mathcal{H_3},\,B \vdash C}{\mathcal{H_1}, \mathcal{H_2}, \mathcal{H_3} \vdash C}$

TODO

IMPL-ELIM (MP) $\frac{\mathcal{H_1} \vdash A \supset B \quad \mathcal{H_2} \vdash A}{\mathcal{H_1}, \mathcal{H_2} \vdash B}$

$ \begin{prooftree} \AxiomC{$\mathcal{H_1} \vdash A \supset B$} \dashedLine\UnaryInfC{$\mathcal{H_1} \vdash A = (A \land B)$} \AxiomC{$\mathcal{H_2} \vdash \underline{\smash{A}}$} \RightLabel{$\;\scriptsize{\textrm{R}}$} \BinaryInfC{$\mathcal{H_1}, \mathcal{H_2} \vdash A \land B$} \RightLabel{$\;\scriptsize{\textrm{CONJ-ELIM-2}}$} \UnaryInfC{$\mathcal{H_1}, \mathcal{H_2} \vdash B$} \end{prooftree} $

IMPL-INTRO (CP) $\frac{\mathcal{H},\,A \vdash B}{\mathcal{H} \vdash A \supset B}$

THIS ONE IS NOT WORKING - STILL UNDER CONSTRUCTION

$ \begin{prooftree}
% ----------------------------------------------------
% LEFT MAIN BRANCH: THE TRUE CASE (A = T)
% ----------------------------------------------------
\AxiomC{$\mathcal{H}, A \vdash B$}
\RightLabel{$\;\scriptsize{\textrm{EQ-T-INTRO}}$}
\UnaryInfC{$\mathcal{H}, A \vdash B = T$}
\AxiomC{}
\RightLabel{$\;\scriptsize{\textrm{AND-T-LEFT}}$}
\UnaryInfC{$\vdash (T \land B) = \underline{\smash{B}}$}
\RightLabel{$\;\scriptsize{\textrm{R}}$}
\BinaryInfC{$\mathcal{H}, A \vdash (T \land B) = T$}
\RightLabel{$\;\scriptsize{\textrm{EQ-T-INTRO-HYP}}$}
\UnaryInfC{$\mathcal{H}, A = T \vdash (T \land B) = T$}
\AxiomC{}
\RightLabel{$\;\scriptsize{\textrm{ASSUME}}$}
\UnaryInfC{$A = T \vdash A = T$}
\RightLabel{$\;\scriptsize{\textrm{R}}$}
\BinaryInfC{$\mathcal{H}, A = T, A = T \vdash (\underline{\smash{A}} \land B) = \underline{\smash{A}}$}
\RightLabel{$\;\scriptsize{\textrm{CONTRACT}}$}
\UnaryInfC{$\mathcal{H}, A = T \vdash (A \land B) = A$}
% ----------------------------------------------------
% RIGHT MAIN BRANCH: THE FALSE CASE (A = F)
% ----------------------------------------------------
\AxiomC{}
\RightLabel{$\;\scriptsize{\textrm{AND-F-LEFT}}$}
\UnaryInfC{$\vdash (F \land B) = \underline{\smash{F}}$}
\AxiomC{}
\RightLabel{$\;\scriptsize{\textrm{ASSUME}}$}
\UnaryInfC{$A = F \vdash A = F$}
\RightLabel{$\;\scriptsize{\textrm{SYM}}$}
\UnaryInfC{$A = F \vdash F = A$}
\RightLabel{$\;\scriptsize{\textrm{R}}$}
\BinaryInfC{$A = F \vdash (\underline{\smash{F}} \land B) = A$}
\AxiomC{}
\RightLabel{$\;\scriptsize{\textrm{ASSUME}}$}
\UnaryInfC{$A = F \vdash A = F$}
\RightLabel{$\;\scriptsize{\textrm{SYM}}$}
\UnaryInfC{$A = F \vdash F = A$}
\RightLabel{$\;\scriptsize{\textrm{R}}$}
\BinaryInfC{$A = F, A = F \vdash (A \land B) = A$}
\RightLabel{$\;\scriptsize{\textrm{CONTRACT}}$}
\UnaryInfC{$A = F \vdash (A \land B) = A$}
\RightLabel{$\;\scriptsize{\textrm{WEAKEN}}$}
\UnaryInfC{$\mathcal{H}, A = F \vdash (A \land B) = A$}
% ----------------------------------------------------
% THE SYNTHESIS
% ----------------------------------------------------
\RightLabel{$\;\scriptsize{\textrm{CASES}}$}
\BinaryInfC{$\mathcal{H}, \mathcal{H} \vdash (A \land B) = A$}
\RightLabel{$\;\scriptsize{\textrm{CONTRACT*}}$}
\UnaryInfC{$\mathcal{H} \vdash (A \land B) = A$}
\RightLabel{$\;\scriptsize{\textrm{SUGAR}}$}
\dashedLine\UnaryInfC{$\mathcal{H} \vdash A \supset B$}
\end{prooftree} $

UNIV-ELIM (SPEC) $\frac{\mathcal{H} \vdash \forall x_\alpha.\,A}{\mathcal{H} \vdash A[x_\alpha \mapsto t_\alpha]}$ where $t_\alpha$ is free for $x$ in $A$

$ \begin{prooftree} \AxiomC{} \RightLabel{$\;\scriptsize{\textrm{BETA}}$} \UnaryInfC{$\vdash (\lambda x_\alpha.\,A)\,t = A[x \mapsto t]$} \AxiomC{$\mathcal{H} \vdash \forall x_\alpha.\,A$} \dashedLine\UnaryInfC{$\mathcal{H} \vdash (\lambda x_\alpha.\,T) = (\lambda x_\alpha.\,A)$} \AxiomC{} \RightLabel{$\;\scriptsize{\textrm{BETA}}$} \UnaryInfC{$\vdash \underline{\smash{(\lambda x_\alpha.\,T)\,t}} = T$} \RightLabel{$\;\scriptsize{\textrm{R}}$} \BinaryInfC{$\mathcal{H} \vdash \underline{\smash{(\lambda x_\alpha.\,A)\,t}} = T$} \RightLabel{$\;\scriptsize{\textrm{R}}$} \BinaryInfC{$\mathcal{H} \vdash A[x \mapsto t] = T$} \RightLabel{$\;\scriptsize{\textrm{EQ-T-ELIM}}$} \UnaryInfC{$\mathcal{H} \vdash A[x \mapsto t]$} \end{prooftree} $

UNIV-INTRO (GEN) $\frac{\mathcal{H} \vdash A}{\mathcal{H} \vdash \forall x_\alpha.\,A}$ where $x_\alpha$ is not free in $\mathcal{H}$

$ \begin{prooftree} \AxiomC{$\mathcal{H} \vdash A$} \RightLabel{$\;\scriptsize{\textrm{EQ-T-INTRO}}$} \UnaryInfC{$\mathcal{H} \vdash A = T$} \RightLabel{$\;\scriptsize{\textrm{SYM}}$} \UnaryInfC{$\mathcal{H} \vdash T = A$} \RightLabel{$\;\scriptsize{\textrm{ABS}\;(x\;\textrm{not free in}\;\mathcal{H})}$} \UnaryInfC{$\mathcal{H} \vdash (\lambda x_\alpha.\,T) = (\lambda x_\alpha.\,A)$} \dashedLine\UnaryInfC{$\mathcal{H} \vdash \forall x_\alpha.\,A$} \end{prooftree} $

Did you notice GEN is basically ABS? Do you see why it is? Why it should be?

EXISTS-INTRO $\frac{\mathcal{H} \vdash A[x \mapsto t]}{\mathcal{H} \vdash \exists x_\alpha.\,A}$

TODO: Derive using the definition $\exists x. A \equiv \neg\forall x.\neg A$ and NEG-ELIM/GEN mechanics.

EXISTS-ELIM $\frac{\mathcal{H}_1 \vdash \exists x_\alpha.\,A \quad\;\; \mathcal{H}_2,\,A[x \mapsto y] \vdash C}{\mathcal{H}_1, \mathcal{H}_2 \vdash C}$ where $y$ is not free in $\mathcal{H}_1$, $\mathcal{H}_2$, $C$, or $\exists x_\alpha.\,A$

TODO: Fill out the proof tree by combining the hypothetical subproof with INDIRECT-PROOF.

Not only can we derive the logic rules, but we can define sets, tuples, relations, numbers, and many more kinds of mathematical objects. Sets in type theory are just functions to booleans (the type $o$ in $Q_0$), so our notation for sets is recovered as sugar:

Term	What it sugars	Intuition
$x_\alpha \in A_{\alpha\to o}$	$A_{\alpha\to o}\,x$	$x$ has property $A$. Set membership is function application!
$\{ x_\alpha \mid A_o \}$	$\lambda x_\alpha. A_o$	A set is simply a boolean-valued characteristic function
$A_{\alpha\to o} \cup B_{\alpha\to o}$	$\lambda x_\alpha.\, (x \in A \lor x \in B)$	Set union
$A_{\alpha\to o} \cap B_{\alpha\to o}$	$\lambda x_\alpha.\, (x \in A \land x \in B)$	Set intersection
$A_{\alpha\to o} \subseteq B_{\alpha\to o}$	$\forall x_\alpha.\, (x \in A \supset x \in B)$	Subset
$\mathcal{P}\,A_{\alpha\to o}$	$\{ B_{\alpha\to o} \mid B \subseteq A \}$	Powerset (the set of all subsets) of $A$
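The idea that a set is a boolean-valued function carries over directly to code. The following is a minimal sketch, assuming Python predicates stand in for terms of type $\alpha\to o$. The function names are invented here, and `subset` takes an explicit finite `universe` because a program cannot quantify over an entire type the way $\forall$ does.

```python
def member(x, a):
    """x ∈ A is just the application A x."""
    return a(x)

def union(a, b):
    """A ∪ B is a new characteristic function."""
    return lambda x: a(x) or b(x)

def intersection(a, b):
    """A ∩ B is a new characteristic function."""
    return lambda x: a(x) and b(x)

def subset(a, b, universe):
    """A ⊆ B is a truth value: every x in A is in B (checked over a finite universe)."""
    return all((not a(x)) or b(x) for x in universe)

def powerset(a, universe):
    """P A is the set of sets B with B ⊆ A, again a characteristic function."""
    return lambda b: subset(b, a, universe)

universe = range(6)
evens = lambda x: x % 2 == 0       # { x | x is even }
small = lambda x: x < 3            # { x | x < 3 }
either = union(evens, small)
both = intersection(evens, small)
```

Notice the difference in result types: `union` and `intersection` return sets (functions), while `subset` returns a boolean.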

Exercise: Write each of the above nonsugared expressions in fully unabbreviated form, fully parenthesized and with all type symbols restored. As you do, gain a strong feel for how sets are just functions to booleans.

We have sets, but numbers and arithmetic are still needed to have a Type Theory strong enough to formalize mathematics. Gaining these features requires an extension to $Q_0$, which Andrews calls $Q_0^\infty$. The extension adds a single axiom, an axiom of infinity, and then defines a type $\sigma$ for numbers, the natural numbers, and arithmetic on top of it. The type $\sigma$ is the type of sets of sets of individuals, so that a number is represented as a set of sets of individuals. This is a very different approach from Church’s, which represented numbers as iterators: the numeral for $n$ is the function that applies its argument $n$ times, with type $(\imath\to\imath)\to(\imath\to\imath)$. We’ll visit Church’s approach when we look at the Lambda Calculus.

Andrews’ $Q_0^\infty$

Nothing in $Q_0$ says how many individuals there are, and a model with only three individuals has only finitely many cardinalities to go around. So Andrews adds exactly one thing — an axiom of infinity — and then defines everything else. No new primitive constants, no new axioms for arithmetic.

Axiom 6. Infinity.

$$ \exists r_{\imath\to\imath\to o}.\;\; \forall x_\imath.\,\neg\,r\,x\,x \;\;\wedge\;\; \forall x_\imath \forall y_\imath \forall z_\imath.\,(r\,x\,y \wedge r\,y\,z \supset r\,x\,z) \;\;\wedge\;\; \forall x_\imath \exists y_\imath.\,r\,x\,y $$

There is a relation on individuals that is irreflexive, transitive, and never runs out: every individual has an $r$-successor. Starting anywhere and walking, you can never come back to where you have been, so the type $\imath$ must be infinite. (Andrews gives several equivalent forms of this axiom; this is the one he takes as primitive.)

The numbers themselves are Frege–Russell cardinals: the number 3 is the set of all three-element sets of individuals. That is why $\sigma$ is $(\imath\to o)\to o$ — a set of sets of individuals — and it is why Axiom 6 is needed. Without infinitely many individuals the cardinals eventually collapse into the empty collection and $S$ stops being injective.

Term	What it sugars	Intuition
$\sigma$	$(\imath\to o)\to o$	The type of numbers, represented as sets of sets of individuals
$0_\sigma$	$\lambda p_{\imath\to o}.\,\forall x_\imath.\,\neg\,p\,x$	The number zero: the collection whose only member is the empty set of individuals
$S_{\sigma\to\sigma}$	$\lambda n_\sigma \lambda p_{\imath\to o}.\,\exists x_\imath.\,p\,x \wedge n\,(\lambda y_\imath.\,p\,y \wedge y \neq x)$	The successor function: $p$ counts as $n+1$ when you can pull one element out of $p$ and what is left counts as $n$
$\mathbb{N}_{\sigma\to o}$	$\lambda n_\sigma.\,\forall q_{\sigma\to o}.\,(q\,0 \wedge \forall m_\sigma.\,(q\,m \supset q\,(S\,m))) \supset q\,n$	The set of natural numbers: what is in every collection containing $0$ and closed under $S$ — the intersection of all inductive sets

Look at what that last definition buys: mathematical induction is not an axiom here, it is the definition of $\mathbb{N}$, so the induction principle falls out immediately. The rest of the Peano postulates ($0$ is not a successor, $S$ is injective on $\mathbb{N}$) become theorems, addition and multiplication are defined by an ordinary recursion theorem proved inside the logic, and number theory is simply in.
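To see why infinitely many individuals are needed, it helps to run the definitions of $0$ and $S$ over a small finite universe. The sketch below assumes just three individuals, represents a set of individuals as a `frozenset`, and represents a number as a `frozenset` of such sets. The helper names `subsets`, `zero`, and `succ` are invented for this illustration.

```python
from itertools import combinations

def subsets(universe):
    """Every set of individuals, i.e. every inhabitant of type ι → o."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def zero(universe):
    # λp. ∀x. ¬ p x
    return frozenset(p for p in subsets(universe)
                     if all(x not in p for x in universe))

def succ(n, universe):
    # λn λp. ∃x. p x ∧ n (λy. p y ∧ y ≠ x)
    return frozenset(p for p in subsets(universe)
                     if any(x in p and (p - {x}) in n for x in universe))

individuals = frozenset({"a", "b", "c"})
numbers = [zero(individuals)]
for _ in range(5):
    numbers.append(succ(numbers[-1], individuals))

# For each number, the sizes of the sets it contains.
sizes = [sorted(len(p) for p in n) for n in numbers]
```

With three individuals, `numbers[3]` contains only the full set of individuals, and `numbers[4]` is already the empty collection. From there `succ` keeps returning the empty collection, so `succ(numbers[3])` and `succ(numbers[4])` coincide even though `numbers[3]` and `numbers[4]` differ.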

Exercise: Convince yourself that $S\,0$ is the set of all one-element sets of individuals, by unfolding the definitions above.

Exercise: Suppose there were exactly two individuals. Show that $S$ then fails to be injective. Which of the Peano postulates survives, and which dies?

HOL

HOL is an interactive theorem prover. There is an underlying logic with a syntax, semantics, and inference rules. Unlike Church’s and Andrews’ systems above, HOL’s logic is given with axioms and Gentzen-style inference rules. The inference rules use sequents in which the assumptions are explicitly unordered sets, so no EXCHANGE and CONTRACT rules are needed to manage them, and discharge is handled via set subtraction. The focus here is not on formalizing all of mathematics per se, but on making sure that everything claimed is actually proved, with each step checked by machine.

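The logic is organized into many theories, each a collection of axioms, definitions, and theorems. Proofs are written as functions in the meta language ML which return objects of type thm. The only possible values of type thm are those created via functions that implement the inference rules of the logic. This is known as the LCF approach. It is related to, but not the same as, Propositions as Types: here there is a single abstract type thm whose values are theorems, and it is ML’s type system that guarantees every such value was built by the inference rules.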
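The idea that a theorem can only be produced by the inference rules can be sketched in a few lines. This is a toy, not HOL’s actual kernel: the term representation, the private `_KERNEL_KEY` token, and the three rules shown are assumptions chosen for illustration, and Python can only approximate the protection that ML’s abstract types give for real.

```python
_KERNEL_KEY = object()  # private token; a stand-in for ML's abstract type

class Thm:
    """A sequent `hyps |- concl`. Hypotheses are a set, so order and repeats do not matter."""

    def __init__(self, key, hyps, concl):
        if key is not _KERNEL_KEY:
            raise TypeError("a Thm can only be produced by an inference rule")
        self.hyps = frozenset(hyps)
        self.concl = concl

def eq(lhs, rhs):
    """Build the equation lhs = rhs as a plain tuple."""
    return ("=", lhs, rhs)

def _dest_eq(th):
    c = th.concl
    if not (isinstance(c, tuple) and len(c) == 3 and c[0] == "="):
        raise ValueError("conclusion is not an equation")
    return c[1], c[2]

# The inference rules are the only functions that hold the key.
def ASSUME(p):
    return Thm(_KERNEL_KEY, {p}, p)            # p |- p

def REFL(t):
    return Thm(_KERNEL_KEY, (), eq(t, t))      # |- t = t

def TRANS(th1, th2):
    a, b1 = _dest_eq(th1)
    b2, c = _dest_eq(th2)
    if b1 != b2:
        raise ValueError("middle terms differ")
    return Thm(_KERNEL_KEY, th1.hyps | th2.hyps, eq(a, c))

def prove_a_eq_c():
    """A 'proof' is just a function that returns a Thm."""
    return TRANS(ASSUME(eq("a", "b")), ASSUME(eq("b", "c")))
```

Because hypotheses are stored as a set, combining them in `TRANS` is a plain union, which is why no EXCHANGE or CONTRACT rule is needed.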

HOL is more broadly a family of theorem provers, including HOL4, HOL Light, and Isabelle/HOL.

Exercise: Make a list of significant theorems that have been mechanically proven in HOL and its relatives.

Exercise: Make a list of significant industrial hardware architectures, compilers, drivers, and smart contracts that have been mechanically verified in HOL and its relatives.

Martin-Löf Type Theory (MLTT)

The Swedish philosopher and logician Per Martin-Löf did such great work in developing intuitionistic type theory (as opposed to Church’s and Andrews’ which are classical), that the term is pretty much synonymous with his own branch of the field: Martin-Löf Type Theory (MLTT). There’s a lot of history behind the development of MLTT, including how original versions were found inconsistent due to Girard’s Paradox, how the theory (or theories) have evolved, and how the theory ended up powering many fields of computer science including formal verification, software security, and advanced programming languages.

The core idea of MLTT is propositions-as-types (with the dependent types we saw above), which fits perfectly with intuitionistic logic. As expected, the type theory generates a logic: every (constructive) demonstration of an inhabitant of a type is a proof of the corresponding proposition.

Good overviews of MLTT can be found at Wikipedia, nLab, the Stanford Encyclopedia of Philosophy, and lecture slides from Sergey Goncharov and Jacob Neumann. But here’s a quick summary of its highlights for the impatient:

Type Judgments. You’ll see four primitive judgment forms: $\Gamma \vdash A\;\textsf{type}$, $\Gamma \vdash a : A$, and the definitional equalities $\Gamma \vdash A \equiv B$ and $\Gamma \vdash a \equiv b : A$.
Rules. You’ll see formation rules (when is this a type?), introduction rules (how do you build an inhabitant?), elimination rules (how do you use one?), and computation rules (what does eliminating a constructed thing reduce to?).
Quantifiers emerge from Dependent Types. The dependent function type $\Pi_{x:A}B(x)$ generalizes $A \to B$ and reads as $\forall$; the dependent pair type $\Sigma_{x:A}B(x)$ generalizes $A \times B$ and reads as $\exists$. An inhabitant of a $\Sigma$ is literally a witness paired with evidence, which is how existence proofs are truly constructive and not indirect.
Data types are inductive, and their eliminators are induction. $\mathbb{N}$, lists, trees, and the general W-types are given by constructors. The eliminator for $\mathbb{N}$ is precisely the principle of mathematical induction. Recursion and induction are the same rule read two ways.
Equality is a type. For $a, b : A$ there is an identity type $\textsf{Id}_A(a,b)$, introduced by $\textsf{refl}$ and eliminated by the rule $J$. There’s a distinction made between propositional equality and definitional equality ($\equiv$) which is a big deal.
It is constructive on purpose. Negation is $A \to \mathbf{0}$ (yes that $\mathbf{0}$ is the empty type), and disjunction is the sum type $A + B$. There’s no rule that gives you $A + (A \to \mathbf{0})$. So no excluded middle! You can consistently assume excluded middle, but you lose canonicity: proofs stop being programs that compute, because a closed term can get stuck on the assumed axiom instead of reducing to a value.
A hierarchy of universes. $\textsf{Type}_0 : \textsf{Type}_1 : \textsf{Type}_2 : \cdots$, exactly the cumulative hierarchy from our Universes section above, because $\textsf{Type} : \textsf{Type}$ is inconsistent by Girard’s Paradox.
The theory is a programming language. Terms normalize, type checking is decidable, and a closed term of type $\mathbb{N}$ evaluates to an actual numeral. That is what makes Agda, Rocq, and Lean possible at all: checking a proof is running a type checker.
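Several of these highlights can be seen in a few lines of Lean 4. Lean is in the MLTT tradition but is not MLTT itself, so take this as a hedged sketch of the ideas and not as a rendering of Martin-Löf’s rules. The name `zero_add'` is chosen here to avoid clashing with the library’s own lemma.

```lean
-- Π read as ∀: a proof is a function taking each n to evidence about n.
example : ∀ n : Nat, n + 0 = n := fun _ => rfl

-- Σ read as ∃: an inhabitant is a witness paired with evidence.
example : { n : Nat // n + 2 = 5 } := ⟨3, rfl⟩

-- Induction and recursion are the same thing: this proof is a
-- structurally recursive function on n.
theorem zero_add' : ∀ n : Nat, 0 + n = n
  | 0 => rfl
  | n + 1 => congrArg Nat.succ (zero_add' n)

-- Equality is a type, introduced by rfl; here both sides compute to the same numeral.
example : 2 + 2 = 4 := rfl

-- Negation is a function into the empty proposition.
example (A : Prop) (a : A) : ¬¬A := fun na => na a
```

The first `example` goes through by `rfl` because `n + 0` reduces to `n` definitionally, whereas `0 + n = n` is only a propositional equality and needs the recursive proof. That is the definitional versus propositional distinction in miniature.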

Quite a few proof assistants, provers, and programming languages are based on MLTT, including Agda, Rocq, Lean, Nuprl, and Matita.

Exercise: Generate a summary of systems in the MLTT tradition.

Many new research directions have grown out of Martin-Löf’s work. One of the newer directions is Homotopy Type Theory (HoTT), which you can read about at Wikipedia, nLab, and its very own HoTT website.

Type Theory in Practice

Type systems for real programming languages are actually applied type theories. We won’t go into depth here, but rather just drop a few items of interest.

In languages with static type systems, type checking a program is proving (yes, that thing mathematicians do) it well-typed. That’s a big part of Curry-Howard, remember?
Statically-typed languages with direct support for functional programming (e.g., Haskell, OCaml, Standard ML) are very much influenced by Church’s type theories. These languages are lauded for being secure, and often extremely efficient (see why Jane Street uses OCaml for high-frequency trading).
The power of dependently typed languages (e.g., Idris) and proof systems with dependent types (e.g., Rocq, Agda, Lean, Nuprl, Matita) derives from research in type theory. These proof systems are incredibly important in modern computational research, as people work on proofs far too complex for any human to scribble on a few pages of paper. (Example: The proof that $BB(5)=47176870$ required 20,000 lines of Coq.)

Exercise: How are TypeScript’s narrowing capabilities reflected in Type Theory?

Exercise: What else can you say about the practical applications of Type Theory? Do some research.

By now you’re aware of how important Type Theory is in both Computer Science and Mathematics. It may even be the foundation of the intersection of the two fields:

The video is worth watching because you will see somewhere in it, how to pronounce Martin-Löf’s name correctly. 😀🎉🙌💪

Current Research

Two research threads are particularly active right now: people are (1) working on Homotopy Type Theory and (2) putting dependent types to work in some surprisingly different, very practical places. Very briefly, here are some things to note:

Cubical Type Theory
Quantitative Type Theory
Community-built libraries of formalized mathematics (e.g., Lean’s Mathlib)
Combining provers with LLMs, e.g., AlphaProof

Recall Practice

Here are some questions useful for your spaced repetition learning. Many of the answers are not found on this page. Some will have popped up in lecture. Others will require you to do your own research.

Type Theory is one of three kinds of theories that can be said to have a claim as a theoretical foundation for all of mathematics. What are the other two?
Set Theory and Category Theory
Why was the first type theory created?
To avoid paradoxes in formal logic and set theory, such as Russell’s paradox.
Why do some people find Type Theory more appealing as a foundation for mathematics and computation than Set Theory?
Because Type Theory integrates logic and computation directly into the foundation, allowing for constructive proofs and more direct connections to programming languages.

[TODO: So many more needed]

Summary

We’ve covered:

The Basics
Types as an Algebra
Dependent Types
Propositions as Types
Universes
Type Theories Throughout History
Case Studies
Type Theory in Practice
