A Type Theory is a formal system for classifying mathematical and computational objects by the kinds of things they are, and for governing how those objects can be combined and manipulated. Like Set Theory and Category Theory, Type Theory can be used as a foundation for mathematics.
There are many type theories, just like there are many set theories. We’ll see some later. The first type theory was created to address paradoxes found in what is now called Naïve Set Theory. More on this later.
The basic idea in type theory is that we bring objects into existence with their type, and define functions that manipulate (i.e., compute with) objects based on their type. New types are constructed carefully from existing types, generally avoiding paradoxes. Because functions are an integral feature rather than an afterthought, Type Theory has a naturally computational flavor, which makes it preferable, many say, as a foundation for computer science.
Types are defined by exhaustive rules that create objects that inhabit the types. Let’s learn by example.
Here is our first type:
$$ \dfrac{}{\textsf{true}\!: \textsf{Bool}} \quad\quad\quad \dfrac{}{\textsf{false}\!: \textsf{Bool}} $$This definition states that the type $\textsf{Bool}$ is inhabited by exactly two individuals, $\textsf{true}$ and $\textsf{false}$. No more, no less. That’s what we mean when we say the rules are exhaustive. We have created two and only two inhabitants for this new type.
Now here is the type of natural numbers:
$$ \dfrac{}{0\!: \textsf{Nat}} \quad\quad\quad \dfrac{n\!: \textsf{Nat}}{\textsf{s}\,n\!: \textsf{Nat}} $$This says “(1) $0$ is a natural number, (2) For any natural number $n$, $\mathsf{s\,}n$ is a natural number, and (3) nothing else is a natural number.” So the type of natural numbers is inhabited by $0$, $\mathsf{s\,}0$, $\mathsf{s\,s\,}0$, $\mathsf{s\,s\,s\,}0$, $\mathsf{s\,s\,s\,s\,}0$, and so on. We’ll write these as $0$, $1$, $2$, $3$, $4$, and so on. These abbreviations should be familiar to you.
Now for the integers:
$$ \dfrac{n\!: \textsf{Nat}}{\textsf{pos}\,n\!: \textsf{Int}} \quad\quad\quad \dfrac{n\!: \textsf{Nat}}{\textsf{neg}\,n\!: \textsf{Int}} $$So the type of integers is inhabited by $\mathsf{pos\,}0$, $\mathsf{neg\,}0$, $\mathsf{pos\,s\,}0$, $\mathsf{neg\,s\,}0$, $\mathsf{pos\,s\,s\,}0$, $\mathsf{neg\,s\,s\,}0$, and so on. We’ll abbreviate these as $0$, $-0$, $1$, $-1$, $2$, $-2$, and so on. But what, then, is the type of $3$? We need to figure it out from context, or we can write $3_{\textsf{Nat}}$ or $3_{\textsf{Int}}$ to be explicit.
Here are the rationals:
$$ \dfrac{n\!: \textsf{Int} \quad d\!: \textsf{Nat}}{\textsf{rat}\,n\,d\!: \textsf{Rat}} $$So one-third is $\textsf{rat}\,1\,3$, which we can abbreviate as $\frac{1}{3}$.
We can build lots of numeric types that are useful in programming languages, for example $\textsf{Int8}$, $\textsf{Int16}$, $\textsf{Int32}$, $\textsf{Int64}$, $\textsf{Int128}$, $\textsf{UInt8}$, $\textsf{UInt16}$, $\textsf{UInt32}$, $\textsf{UInt64}$, $\textsf{UInt128}$, $\textsf{Float16}$, $\textsf{Float32}$, $\textsf{Float64}$, $\textsf{Float128}$, $\textsf{Complex64}$, $\textsf{Complex128}$, and so on. Some might require a lot of rules, but they can be defined in principle.
We can make types. Here is one for primary colors:
$$ \dfrac{}{\textsf{red}\!: \textsf{PrimaryColor}} \quad\quad\quad \dfrac{}{\textsf{green}\!: \textsf{PrimaryColor}} \quad\quad\quad \dfrac{}{\textsf{blue}\!: \textsf{PrimaryColor}} $$You can define the type $\textsf{Unicode}$ whose inhabitants are, you guessed it, the characters of Unicode. That’d be a lot of rules! But it is indeed definable, as there are a finite number of such characters.
Now here is a type for some simple two-dimensional shapes:
$$ \dfrac{r\!: \textsf{Float64}}{\textsf{circle}\,r\!: \textsf{Shape}} \quad\quad\quad \dfrac{w\!: \textsf{Float64}\quad h\!: \textsf{Float64}}{\textsf{rectangle}\,w\,h\!: \textsf{Shape}} \quad\quad\quad \dfrac{a\!: \textsf{Float64}\quad b\!: \textsf{Float64} \quad c\!: \textsf{Float64}}{\textsf{triangle}\,a\,b\,c\!: \textsf{Shape}} $$The idea is that $\mathsf{circle}\,5.0$ is a circle with radius 5.
Now let’s build types from types. If $t_1$ and $t_2$ are types, then $t_1 \times t_2$ is a type, defined like so:
$$ \dfrac{v_1\!: t_1 \quad v_2\!: t_2}{(v_1, v_2)\!: t_1 \times t_2} $$The type $t_1 \times t_2$ is a pair type, which is a special case of a product type, also known as a tuple type. Tuples are not limited to two component types:
$$ \dfrac{v_1\!: t_1 \quad \cdots \quad v_n\!: t_n}{(v_1, \ldots, v_n)\!: t_1 \times \cdots \times t_n} $$Example: $(2, \mathsf{false}, (\mathsf{true}, \frac{3}{5}))\!: \mathsf{Int \times Bool \times (Bool \times Rat)}$
Can we have a product type with zero components? We can. In fact, there is only one such type. And it necessarily has only one value. The type is called $\textsf{Unit}$:
$$ \dfrac{}{()\!: \textsf{Unit}} $$Next is the type “list of elements of type $t$” which we will write $t^*$:
$$ \dfrac{}{[\,]\!: t^*} \quad\quad\quad \dfrac{x\!: t \quad y\!: t^*}{(x :: y)\!: t^*} $$We will write ${[x]}$ for $(x\,\textbf{::}\,[\,])$, ${[x,y,z]}$ for $(x\,\textbf{::}\,(y\,\textbf{::}\,(z\,\textbf{::}\,[\,])))$, and so on. For an empty list, we would need to rely on context to infer its type, or be explicit by writing, for example, $[\,]_{\textsf{Int}^*}$ or $[\,]_{(\textsf{Bool}\times\textsf{Unit})^*}$.
Imagine what the type $\textsf{Unicode}^*$ is. Got it? That’s right! We’ll use the abbreviation $\textsf{String}$ for this type.
Now let’s do “option of type $t$” which we will write $t\texttt{?}$:
$$ \dfrac{}{\textsf{none}\!: t\texttt{?}} \quad\quad \dfrac{x\!:t}{\textsf{some}\;x\!: t\texttt{?}} $$We thus have $\textsf{some}\:3\!:\mathsf{Nat}\texttt{?}$ and $\textsf{some}\:(-3,T)\!:(\mathsf{Int \times Bool})\texttt{?}$. When writing $\textsf{none}$, we must rely on context to know which type is being referred to, or you can use subscripts to be explicit, e.g., $\mathsf{none_{Int?}}$.
Finally, we’ll define the type $\textsf{Void}$ to be the type with no inhabitants. As there are no inhabitants, there are no formation rules.
Don’t confuse Void and UnitThe type $\textsf{Void}$ has no inhabitants at all. The type $\textsf{Unit}$ has a single inhabitant, namely $()$.
To serve as a basis for computation, let alone for mathematics, we need to know more than which items inhabit which types. We need to know how the items behave. We specify behavior by defining functions on the types. To define a function, you show how to map each element of one type (the domain) to an element of another type (the codomain). A mapping must exist for each rule that defines the domain type. Here are some examples:
$\begin{array}{l} \textsf{not}\!: \textsf{Bool} \rightarrow \textsf{Bool} \\ \textsf{not}\;\textsf{true} = \textsf{false} \\ \textsf{not}\;\textsf{false} = \textsf{true} \end{array}$
$\begin{array}{l} \textsf{isZero}\!: \textsf{Nat} \rightarrow \textsf{Bool} \\ \textsf{isZero}\;0 = \textsf{true} \\ \textsf{isZero}\;(\textsf{s}\,n) = \textsf{false} \end{array}$
$\begin{array}{l} \textsf{plusTwo}\!: \textsf{Nat} \rightarrow \textsf{Nat} \\ \textsf{plusTwo}\;0 = \textsf{s}\,\textsf{s}\,0 \\ \textsf{plusTwo}\;(\textsf{s}\,n) = \textsf{s}\,\textsf{s}\,\textsf{s}\,n \end{array}$
Functions don’t have to have names. In fact, $\textsf{plusTwo}$ can be written as a lambda expression:
$$ \lambda n_{\textsf{Nat}}.\,\textsf{s}\,\textsf{s}\,n $$Did you notice something? Functions are actually values that inhabit function types. Woah. Maybe you noticed this by the way we wrote “$\textsf{not}\!: \textsf{Bool} \rightarrow \textsf{Bool}$”. Not an accident! $\textsf{Bool} \rightarrow \textsf{Bool}$ is indeed a type. In fact, for any types $t_1$ and $t_2$,
$t_1 \rightarrow t_2$ is a type.
This is great, since this lets us define $\textsf{and}$ and $\textsf{or}$ like so:
$\begin{array}{l} \textsf{and}\!: \textsf{Bool} \rightarrow (\textsf{Bool} \rightarrow \textsf{Bool}) \\ \textsf{and}\;\textsf{true} = \lambda x_{\textsf{Bool}}.\,x \\ \textsf{and}\;\textsf{false} = \lambda x_{\textsf{Bool}}.\,\textsf{false} \end{array}$
$\begin{array}{l} \textsf{or}\!: \textsf{Bool} \rightarrow (\textsf{Bool} \rightarrow \textsf{Bool}) \\ \textsf{or}\;\textsf{true} = \lambda x_{\textsf{Bool}}.\,\textsf{true} \\ \textsf{or}\;\textsf{false} = \lambda x_{\textsf{Bool}}.\,x \end{array}$
Here is an amazing function, known as the conditional. Notice its type. It has a type variable $t$ meaning (at least in this case) that it behaves the same way for all types (meaning you can substitute any real, concrete type for $t$):
$\begin{array}{l} \textsf{cond}\!: \textsf{Bool} \rightarrow (t \rightarrow (t \rightarrow t)) \\ \textsf{cond}\;\textsf{true} = \lambda x_{t}.\,\lambda y_{t}.\,x \\ \textsf{cond}\;\textsf{false} = \lambda x_{t}.\,\lambda y_{t}.\,y \end{array}$
In case that’s hard to read, we can simplify our definition a bit, using fewer $\lambda$ symbols and fewer parentheses:
$\begin{array}{l} \textsf{cond}\!: \textsf{Bool} \rightarrow t \rightarrow t \rightarrow t \\ \textsf{cond}\;\textsf{true}\,x\,y = x \\ \textsf{cond}\;\textsf{false}\,x\,y = y \end{array}$
The convention is that when writing function types, $t_1 \rightarrow t_2 \rightarrow t_3$ is the same as $t_1 \rightarrow (t_2 \rightarrow t_3)$. And when writing function calls, $f\,x\,y$ is the same as $(f\,x)\,y$.
Here is how we add natural numbers:
$\begin{array}{l} \textsf{plus}\!: \textsf{Nat} \rightarrow \textsf{Nat} \rightarrow \textsf{Nat} \\ \textsf{plus}\;0\;m = m \\ \textsf{plus}\;(\textsf{s}\,n)\;m = \textsf{s}\,(\textsf{plus}\;n\;m) \end{array}$
And here is how we multiply natural numbers:
$\begin{array}{l} \textsf{times}\!: \textsf{Nat} \rightarrow \textsf{Nat} \rightarrow \textsf{Nat} \\ \textsf{times}\;0\;m = 0 \\ \textsf{times}\;(\textsf{s}\,n)\;m = \textsf{plus}\;m\;(\textsf{times}\;n\;m) \end{array}$
Ok hold up! It’s time to introduce some notation to make things a bit more familiar. From now on, we’re going to write:
$\begin{array}{lll} \neg b & \text{for} & \textsf{not}\;b \\ a \land b & \text{for} & \textsf{and}\;a\;b \\ a \lor b & \text{for} & \textsf{or}\;a\;b \\ \textsf{if}\;b\;\textsf{then}\;x\;\textsf{else}\;y & \text{for} & \textsf{cond}\;b\;x\;y \\ n + 1 & \text{for} & \textsf{s}\;n \\ n + m & \text{for} & \textsf{plus}\;n\;m \\ n \times m & \text{for} & \textsf{times}\;n\;m \\ \textsf{let}\;x = e\;\textsf{in}\;e' & \text{for} & (\lambda x.\,e')\;e \end{array}$
Relax, these are just abbreviations. The basic low-level notations of our theory still exist.
Back to example function definitions. The length of a list:
$\begin{array}{l} \textsf{length}\!: t^* \rightarrow \textsf{Nat} \\ \textsf{length}\;[\,] = 0 \\ \textsf{length}\;(x\,\textbf{::}\,y) = (\textsf{length}\;y) + 1 \end{array}$
Appending two lists:
$\begin{array}{l} \textsf{append}\!: t^* \rightarrow t^* \rightarrow t^* \\ \textsf{append}\;[\,]\;z = z \\ \textsf{append}\;(x\,\textbf{::}\,y)\;z = x\,\textbf{::}\,(\textsf{append}\;y\;z) \end{array}$
Reversing a list:
$\begin{array}{l} \textsf{reverse}\!: t^* \rightarrow t^* \\ \textsf{reverse}\;[\,] = [\,] \\ \textsf{reverse}\;(x\,\textbf{::}\,y) = \textsf{append}\;(\textsf{reverse}\;y)\;[x] \end{array}$
Mapping a function over a list (in other words, applying a function to each element of a list):
$\begin{array}{l} \textsf{map}\!: (t_1 \rightarrow t_2) \rightarrow t_1^* \rightarrow t_2^* \\ \textsf{map}\;f\;[\,] = [\,] \\ \textsf{map}\;f\;(x\,\textbf{::}\,y) = (f\;x)\,\textbf{::}\,(\textsf{map}\;f\;y) \end{array}$
Filtering a list:
$\begin{array}{l} \textsf{filter}\!: (t \rightarrow \textsf{Bool}) \rightarrow t^* \rightarrow t^* \\ \textsf{filter}\;p\;[\,] = [\,] \\ \textsf{filter}\;p\;(x\,\textbf{::}\,y) = \textsf{if}\;(p\;x)\;\textsf{then}\;(x\,\textbf{::}\,(\textsf{filter}\;p\;y))\;\textsf{else}\;\textsf{filter}\;p\;y \end{array}$
Sometimes it’s helpful to think of an optional as kind of like a sequence of 0 or 1 element. In this case, functions like $\textsf{map}$ make sense to apply to optionals:
$\begin{array}{l} \textsf{map}\!: (t_1 \rightarrow t_2) \rightarrow t_1\texttt{?} \rightarrow t_2\texttt{?} \\ \textsf{map}\;f\;\textsf{none} = \textsf{none} \\ \textsf{map}\;f\;(\textsf{some}\;x) = \textsf{some}\;(f\;x) \end{array}$
This gives us an elegant way to define a function for determining the index of the first occurrence of an element in a list:
$\begin{array}{l} \textsf{indexOf}\!: t \rightarrow t^* \rightarrow \textsf{Nat}\texttt{?} \\ \textsf{indexOf}\;x\;[\,] = \textsf{none} \\ \textsf{indexOf}\;x\;(y\,\textbf{::}\,ys) = \\ \quad \quad \textsf{if}\;x = y\;\textsf{then}\;\textsf{some}\;0\;\textsf{else}\;\textsf{map}\;(\lambda n.\,n+1)\;(\textsf{indexOf}\;x\;ys) \end{array}$
We’ll create example invocations for each of these.
Philosophy question: should a thing inhabit just one type? Some mathematically-oriented theories impose this restriction. Pragmatically, though, it makes sense to allow things to be in more than type. Like $3$. Why not let it inhabit $\textsf{Nat}$ and $\textsf{Int}$ and maybe even more numeric types? If we allow this, we would notice that every inhabitant of $\textsf{Nat}$ also inhabits $\textsf{Int}$. That is, $\textsf{Nat}$ is a subtype of $\textsf{Int}$, which we write:
$$ \textsf{Nat} \; \texttt{<:} \; \textsf{Int} $$Conversely, $\textsf{Int}$ is a supertype of $\textsf{Nat}$.
Let’s get technical:
$t_1 \; \texttt{<:} \; t_2$ if and only if for every $v$ such that $v\!: t_1$, we have $v\!: t_2$.
If we allow individuals to inhabit multiple types, then it makes sense to define union types. The union $t_1 \mid t_2$ is inhabited by exactly all the inhabitants of $t_1$ and those of $t_2$. Yes, it’s possible for $t_1$ and $t_2$ to overlap. So the type $\textsf{Nat} \mid \textsf{Int}$ is just $\textsf{Int}$.
But the type $\textsf{Int} \mid \textsf{String}$ is perhaps interesting. Maybe you can find a use for it?
In case you were wondering, yes, if you have union types, you can also have intersection types, also. We write them like this: $t_1\:\texttt{&}\:t_2$.
Are you getting set theory vibes?Please calm down.
Types are not sets.
TYPES ARE NOT SETS.
We’ll talk about sets later. Please ffs do not let any preconceived notions of sets cloud your learning of the beauty of types. Keep. your. focus. on. types.
Here are two types, $C$ (for color, with 3 inhabitants $r$, $g$, and $b$) and $D$ (for direction, with 2 inhabitants $l$ and $r$):
$$ \dfrac{}{r\!: C} \quad\quad\quad \dfrac{}{g\!: C} \quad\quad\quad \dfrac{}{b\!: C} $$ $$ \quad\quad\quad \dfrac{}{l\!: D} \quad\quad\quad \dfrac{}{r\!: D} $$The inhabitants of $C \times D$ are:
$$ (r,l),\;(r,r),\;(g,l),\;(g,r),\;(b,l),\;(b,r) $$Did you notice that $C$ has 3 inhabitants and $D$ has 2 inhabitants and $C\times D$ has $3 \times 2 = 6$ inhabitants? See why we call these product types? Right? RIGHT?
Since we have products, maybe we should have sums. We’d like $C + D$ to be a type with $3 + 2 = 5$ inhabitants. You might think a union would work, but wait, it doesn’t because the union ($C \mid D$) only has four elements: $r$, $g$, $b$, and $l$. A true sum type needs to have two distinct $r$’s. We have to tag the “source” of each element of the sum. So the type $C+D$ has these inhabitants:
$$ \mathsf{tag_1}r,\;\mathsf{tag_1}g,\;\mathsf{tag_1}b,\;\mathsf{tag_2}l,\;\mathsf{tag_2}r $$In general, we define a sum type like so:
$$ \dfrac{v_1\!: t_1 \quad \cdots \quad v_n\!: t_n }{\mathsf{tag_1}v\!: t_1 + \cdots + t_n \quad\quad \cdots \quad\quad \mathsf{tag_n}v\!: t_1 + \cdots + t_n} $$If $n = 0$ we have the sum type over no types. It’s the type $\textsf{Void}$ referred to above!
Does this remind you of basic number theory?Void is the empty sum type, the sum of no types, and has zero elements. Just like the number 0 is the sum over a list of no elements, the additive identity.
Unit is the empty product type, the product of no types, and has one element. Just like the number 1 is the product over a list of no elements. The multiplicative identity.
In some theories the void type is actually called $\mathbf{0}$ and the unit type is actually called $\mathsf{1}$.
So, we have sums and products, but what about exponential types? Could there be types $C^D$ and $D^C$? There are, and it’s kind of interesting what they turn out to be. Let’s list all the inhabitants of $C \rightarrow D$:
$$ \begin{array}{llll} (000) & r \mapsto l, & g \mapsto l, & b \mapsto l \\ (001) & r \mapsto l, & g \mapsto l, & b \mapsto r \\ (010) & r \mapsto l, & g \mapsto r, & b \mapsto l \\ (011) & r \mapsto l, & g \mapsto r, & b \mapsto r \\ (100) & r \mapsto r, & g \mapsto l, & b \mapsto l \\ (101) & r \mapsto r, & g \mapsto l, & b \mapsto r \\ (110) & r \mapsto r, & g \mapsto r, & b \mapsto l \\ (111) & r \mapsto r, & g \mapsto r, & b \mapsto r \\ \end{array} $$There are 8 of these, which is not coincidentally $2^3$, exactly the number of inhabitants of $D$ raised to the power of the number of inhabitants of $C$. Therefore, $C \rightarrow D$ is sometimes written $D^C$.
It works the other way, too. $D \rightarrow C$ can be written $C^D$, as it has $3^2 = 9$ inhabitants:
$$ \begin{array}{lll} (00) & l \mapsto r, & r \mapsto r \\ (01) & l \mapsto r, & r \mapsto g \\ (02) & l \mapsto r, & r \mapsto b \\ (10) & l \mapsto g, & r \mapsto r \\ (11) & l \mapsto g, & r \mapsto g \\ (12) & l \mapsto g, & r \mapsto b \\ (20) & l \mapsto b, & r \mapsto r \\ (21) & l \mapsto b, & r \mapsto g \\ (22) & l \mapsto b, & r \mapsto b \\ \end{array} $$And there you have it: function types are the exponential types. 🤯
Given a function $A$ of type $t \rightarrow \mathsf{Bool}$, you can imagine collecting together, in an object, all of the inhabitants $x$ of type $t$ for which $A\,x = \textsf{true}$. That’s a set. Yes, that’s what a set is in Type Theory. It is that simple.
It’s conventional to write sets with curly braces.
| Type Notation | Set Notation | Explanation |
|---|---|---|
| $\lambda x_{\textsf{Nat}}.\,x=0$ | $\{0\}$ | The function returns true only for 0 |
| $\lambda x_{\textsf{Nat}}.\,x=2 \vee x=3 \vee x=5$ | $\{2, 3, 5\}$ | The function returns true only for 2, 3, and 5 |
| $\lambda x_{\textsf{Nat}}.\,\textsf{true}$ | $\mathbb{N}$ | The function returns true for all natural numbers |
| $\lambda x_{\textsf{Nat}}.\,x > 5$ | $\{ x \in \mathbb{N} \mid x > 5 \}$ | The function returns true for all natural numbers greater than 5 |
| $\lambda x_{\textsf{Int}}.\,\textsf{true}$ | $\mathbb{Z}$ | The function returns true for all integers |
| $\lambda x_t.\,\texttt{false}$ | $\varnothing$ | The function returns true for no arguments |
| $(A\,x)_{\textsf{Bool}}$ | $x \in A$ | The application returns true if and only if $x$ is in the set $A$ |
Now let’s package things up in a slightly different, and more abstract, fashion—a useful exercise when building up a theory. As we build our catalog, we’ll add some new important information.
Here are some types shared by almost all constructive type theories:
Why is strict positivity required?Imagine this illegal rule:
$$ \dfrac{f\!: T \rightarrow \textsf{Bool}}{\textsf{fun}\,f\!:T} $$This says that one of the things a $T$ can be is a set of $T$’s but Cantor’s Theorem, which we proved back in the Math Foundations notes, says $|T| < |\mathcal{P}(T)|$ for every $T$. Contradiction! Strict positivity is the cheap, purely syntactic check that catches this before you ever get to the contradiction.
The union type we saw before does not exist in all type theories:
Each of the types above are either primitive or constructed from constituent types. But what if we wanted to make a type dependent not just on another type, but upon a value? That’s a topic for the next section. It’s wild.
Time for something cool.
First note (the obvious? fact that) that applying a function of type $A \rightarrow B$ returns something of type $B$ no matter which $a\!:\!A$ you feed it. What if the type of the result could vary depending on the value of the argument? A dependent function type does that.
A dependent type is a family of types $B(x)$ indexed by terms $x\!:\!A$, that is, for each different $a\!:\!A$, $B(a)$ may be a genuinely different type. We can do this for both functions and tuples: $\Pi$ (dependent functions) will extend $\to$ and $\Sigma$ (dependent tuples) will extend $\times$.
Before piling on more notation, let’s see a type family up close. Here’s the smallest possible one, built right out of $\textsf{Bool}$:
$$ B(\textsf{true}) = \textsf{Nat} \quad\quad\quad B(\textsf{false}) = \textsf{Bool} $$Here $B$ maps values of $\textsf{Bool}$ to types. Again, think of $B$ as a family of types, one type for each value of $\textsf{Bool}$. The whole family, including the indexing mechanism, is the dependent type $B$. We’ll use this tiny $B$ as a running example below.
Now for a more useful one. Recall that we earlier defined list type $t^*$ and had to write a separate function, $\textsf{length}\!: t^* \rightarrow \textsf{Nat}$, to find out how long a list was after the fact. What if the length were part of the type itself? Here’s $\textsf{Vec}\,A\,n$, the type of length-$n$ vectors of $A$’s:
$$ \dfrac{}{\textsf{nil}\!: \textsf{Vec}\,A\,0} \quad\quad\quad \dfrac{x\!: A \quad v\!: \textsf{Vec}\,A\,n}{(x :: v)\!: \textsf{Vec}\,A\,(n+1)} $$Look closely: this is the exact shape of the list rules, except now $\textsf{Vec}\,A\,n$ for each different $n$ is a genuinely different type, related to the others only by the fact that prefixing onto a $\textsf{Vec}\,A\,n$ bumps you up to a $\textsf{Vec}\,A\,(n+1)$. So $\textsf{Vec}\,A$, all by itself, is a mapping from $\textsf{Nat}$ to types—a type family, exactly like $B$ above, just with infinitely many cases instead of two. You cannot even write down a $\textsf{Vec}\,A\,3$ with four elements in it; there’s simply no rule that would let you derive it. This is getting interesting: we’re getting to where we can capture some program correctness rules, that we thought might have to wait until runtime to check, in the compile-time (static) type checker!
$\Pi_{x:A} B(x)$ is the type of functions that, given $a\!:\!A$, return a term whose type is specifically $B(a)$ — not some fixed type $B$, but the particular type that depends on which $a$ you passed in. When $B$ doesn’t actually depend on $x$, $\Pi_{x:A} B$ is just $A \rightarrow B$, so $\Pi$ is a strict generalization of the ordinary function type.
For our toy family $B$ above, $\Pi_{b:\textsf{Bool}} B(b)$ is inhabited by functions $f$ such that $f\,\textsf{true}\!:\!\textsf{Nat}$ and $f\,\textsf{false}\!:\!\textsf{Bool}$ — both at once, from the very same function. An ordinary $\textsf{Bool} \rightarrow t$ type can’t express this at all, since it has to commit to one fixed codomain $t$ for every input.
Now for $\textsf{Vec}$: $\Pi_{n:\textsf{Nat}} (\textsf{Vec}\;A\;n \rightarrow \textsf{Vec}\;A\;n)$ is the type of functions that take a vector of any length and return a vector of the same length. Compare this to $\textsf{reverse}\!: t^* \rightarrow t^*$ from before — that signature promises nothing whatsoever about the output’s length, even though we know reversing preserves it. The dependent version actually states the guarantee, right there in the type, before a single line of $\textsf{reverse}$ is written.
Here’s one more, showing off what dependent types are really for: $\Pi_{n:\textsf{Nat}} (\textsf{Vec}\;A\;(n+1) \rightarrow A)$, a "head" function. Notice the domain is $\textsf{Vec}\,A\,(n+1)$, never $\textsf{Vec}\,A\,0$. There is no way to even call this function on an empty vector — the typechecker rejects it before the program runs, no runtime "list is empty" check required.
$\Sigma_{x:A} B(x)$ is the type of pairs $(a, b)$ where $a\!:\!A$ and $b\!:\!B(a)$, that is, the type of the second component depends on the value of the first. When $B$ doesn’t depend on $x$, $\Sigma_{x:A} B$ is just $A \times B$, so $\Sigma$ generalizes the ordinary product type the same way $\Pi$ generalizes $\rightarrow$.
For our toy family, $\Sigma_{b:\textsf{Bool}} B(b)$ is inhabited by pairs like $(\textsf{true}, 5)$ and $(\textsf{false}, \textsf{true})$ — but not $(\textsf{true}, \textsf{true})$, since once the first component is $\textsf{true}$, the second is required to be a $\textsf{Nat}$. Contrast this with an ordinary pair type like $\textsf{Bool} \times t$: there, the type of the second slot is locked in advance and can’t react to which boolean shows up.
The classic example is a pair of a length $n$ and a vector of that exact length: $\Sigma_{n:\textsf{Nat}} (\textsf{Vec}\,A\,n)$. Here’s a nice payoff: this $\Sigma$ type turns out to contain exactly the same information as an ordinary list $A^*$ — pair up a list with its own length and you get a $\Sigma_{n:\textsf{Nat}}(\textsf{Vec}\,A\,n)$; strip the length back off a $\Sigma_{n:\textsf{Nat}}(\textsf{Vec}\,A\,n)$ and you get a plain list. Nothing is gained or lost going either direction. $\Sigma$ is what lets you "uncurry" a length out of the type system and carry it around as ordinary data instead, whenever that’s more convenient than baking it into the type as $\textsf{Vec}$ does.
So: $\Pi$ says "for every $a\!:\!A$, here’s how to produce something of the type $B(a)$ specific to that $a$." $\Sigma$ says "here is some particular $a\!:\!A$, paired with a witness of the type $B(a)$ specific to that $a$." Hold onto that "for every / here is a particular" framing — it’s exactly why $\Pi$ will line up with $\forall$ and $\Sigma$ will line up with $\exists$ when we get to Propositions as Types, next.
A refinement type is a subtype formed from its supertype by applying a restriction based on a value. A good examples is “the positive naturals” which we write as $\Sigma_{n:\textsf{Nat}} (n > 0)$ — a pair of a number together with a proof that it’s positive. This only makes sense once you allow a proposition ($n > 0$) to appear as a type, which is exactly the move we’ll make precise in the next section.
The Power of Dependent Types for Programming LanguagesThe payoff for programming languages is that dependent types let you encode invariants directly in a function’s type rather than checking them at runtime or just asserting them in a comment: a sort function can be typed to guarantee its output is a permutation of its input, a matrix-multiply function can be typed to reject mismatched dimensions at compile time, and so on. This expressive jump — types that talk about values, not just other types — is also what makes type theory able to absorb logic wholesale, which is exactly what "Propositions as Types," next, is about.
[Claude to help with this section]
| Logical Formula | Meaning | Type |
|---|---|---|
| $F$ | Falsity | $\bot$ |
| $T$ | Truth | $()$ |
| $A \wedge B$ | Conjunction (And) | $A \times B$ |
| $A \vee B$ | Disjunction (Or) | $A + B$ |
| $A \supset B$ | Implication | $A \rightarrow B$ |
| $\forall x. ((x \in A) \supset B)$ where $x$ is free in $B$ | Universal Quantification | $\Pi_{x: A} B$ |
| $\exists x. ((x \in A) \wedge B)$ where $x$ is free in $B$ | Existential Quantification | $\Sigma_{x: A} B$ |
Can types be values of a type?
TODO
TODO - something about Univalence and Voevodsky
In our previous look at mathematical foundations, we talked briefly about differences between Set Theory and Type Theory. Now it’s time to go a little deeper.
In Set Theory, you start with the axioms and inference rules of first-order predicate logic. Then you add new axioms and inference rules to describe sets and how they behave. The axioms vary from theory to theory (recall that popular set theories include ZF and ZFC, NBG, NF and NFU, MK, KP, CST). ZFC is a big one. The additional axioms in ZFC (beyond those of first-order predicate logic) are:
In Type Theory, you don’t need a pre-existing logic. In fact, the so-called logical connectives ($\wedge$, $\vee$, $\neg$, and so on) are just functions. Here is just one way to do it (because just as there are many set theories, there are many different type theories.) These definitions come from Peter Andrews’ book on Type Theory. Andrews defines a simple type theory called $Q_0$.
Andrews builds $Q_0$ out of just two primitive types and two primitive constants, which is enough for formalize all of mathematics! The two primitive types are:
It $\alpha$ and $\beta$ are types then so is $\alpha \rightarrow \beta$. Andrews actually writes this the other way around, as the juxtaposed pair $(\beta\alpha)$—don’t let that throw you if you go looking in the book directly.
The two primitive constants are $Q$, equality at every type (we’ll write it infix as $=$, just as Andrews does, so you’ll almost never see a bare $Q$), and a description operator we’ll write $\mathsf{the}$. (Andrews uses the special symbol ℩ for this, which is easy to confuse with the type $\iota$ — they are genuinely different symbols! We’ll just write $\mathsf{the}$ and dodge the issue.) Officially $\mathsf{the}$ is only primitive at type $\iota$, but Andrews proves, as a theorem, that you get a copy of it at every type for free.
That’s the entire primitive vocabulary. Every other symbol you’ll see from here on — $\neg, \wedge, \vee, \supset, \equiv, \forall, \exists$, even $T$ and $F$ themselves — gets built out of nothing but $=$ and $\lambda$.
(Quick heads-up: $T$, $F$, $\wedge$, and $\forall$ show up in Axiom Q1 below as if you already know them. You basically do — but officially they’re abbreviations we build a little further down, out of $=$ and $\lambda$ alone. Feel free to skim ahead to see the wiring before circling back.)
And here’s the truly wild part: $Q_0$ has exactly one rule of inference.
Rule RFrom a theorem $C$ together with a proved equation $A_\alpha = B_\alpha$, infer the result of replacing some occurrence of $A_\alpha$ inside $C$ with $B_\alpha$ — anywhere inside $C$, at any depth, buried under as many $\lambda$’s and connectives as you like. (One restriction: you can’t rewrite a bound variable sitting immediately after the $\lambda$ that binds it — that would be as nonsensical as renaming the $n$ in $\textsf{s}\,n$ without renaming its binder too.)
That’s the whole proof system. Substitute equals for equals, wherever they occur. Did you notice that Axiom Q4 (beta-conversion) is just another equation? That means Rule R alone already knows how to compute — there’s no separate “now go reduce this” step bolted on, the way there often is elsewhere. In $Q_0$, logical reasoning and computation are the very same move: replacing something with something it’s provably equal to.
One more thing worth flagging: because Q1 nails $o$ down to exactly two values, $Q_0$ is thoroughly classical — there’s no room for a third truth value, or for rejecting the law of excluded middle the way intuitionistic systems do (recall the Logic notes!). That’s exactly why we listed $Q_0$, back in the timeline, as a classical counterpoint to MLTT and the Calculus of Constructions. (Andrews also adds an Axiom 6, asserting infinitely many individuals, once you want to build up arithmetic — we won’t need it here.)
Then we define:
(One notational note before we start: we’ll write $:=$ for “is defined to be” below, instead of our usual $\equiv$ — because $\equiv$ is about to become one of the things we’re defining!)
Every single one of these is pure syntactic sugar sitting on top of $Q$ and $\lambda$. There’s no separate “logic module” bolted onto $Q_0$ — the logic is equality, wearing eight different outfits.
Another interesting set of axioms and inference rules comes from Higher Order Logic (HOL). Strictly speaking HOL has no axioms, but only inference rules:
[Claude to help. IIRC there are simply 8 inference rules]
There does exist a body of work that exploits some of the similarity between types and sets. One is semantic subtyping, which you can read about in this seminal paper by Frisch, Castagna, and Benzaken. Rather than a thinking about a type being defined by its constructors (as $A+B$ is defined by $\mathsf{tag_1}$ and $\mathsf{tag_2}$), a type becomes a set of values, and $A \vee B$ is literally $[\![ A ]\!] \cup [\![ B ]\!]$, and similarly for the intersection type. The approach has several advantages, which you can read about in their paper.
Just like there are many Set Theories, there are quite a few Type Theories. Here is a catalog placed into a rough timeline:
Two research threads are particularly active right now: people working on Homotopy Type Theory (see the timeline above), and people are putting dependent types to work in some surprisingly different, very practical places. Very briefly, here are some things to note:.
Type systems for real programming languages are actually applied type theories. We won’t go into depth here, but rather provide a few little tidbits of interest.
Type Theory is too rich of a topic to cover in a single page. Browse the following resources to go more in-depth.
Here are some questions useful for your spaced repetition learning. Many of the answers are not found on this page. Some will have popped up in lecture. Others will require you to do your own research.
We’ve covered: