Operational Semantics

One way to define the meaning of a program is to show how it runs on some machine.

A Tiny Example Language

We’ll start very simply, with a language that just has integers, addition and subtraction, and print statements. Programs look like this:

print(8 - (3 + 21) + 34);
print(55 - 3 + 2);

The concrete syntax can be given by this grammar:

Program = Stmt+
Stmt    = "print" "(" Exp ")" ";"
Exp     = (Exp ("+" | "-"))? Term
Term    = num | "(" Exp ")"
num     = digit+

When defining a semantics we don’t care about concrete syntax, only abstract syntax. So here is the abstract syntax of this tiny language, which is all we care about from here on:

n: Nml
e: Exp = n | e + e | e - e
s: Stm = print e
p: Pro = program s+
Need Practice?

For the example program in this section, give both the parse tree and the abstract syntax tree.

Concrete Operational Semantics

A concrete operational semantics is given to a language by mapping language constructs to operations on a virtual machine—a machine whose operation is “obvious.”

Stack machines are pretty well understood! Let’s make up a stack machine. The machine has a stack for execution and allows you to store integers in named cells. The instructions are:

InstructionDescription
PUSH $n$Push the Constant $n$
ADDPop twice then push first popped value plus second popped value
SUBPop twice then push first popped value minus second popped value
PRINTPrint contents of top of stack to standard output then pop

When doing a concrete operational semantics, we just show the instruction sequence for each abstract syntactic form. That way, every program in our language will be translated to a unique instruction sequence. We assume this machine is sooooooooo simple that the meaning of instruction sequences is well understood. Therefore showing how to translate programs into machine instruction sequences suffices to define the semantics of the language. We will define the translation function, $T$, by cases on the abstract syntax forms:

$\begin{array}{l} T[\![n]\!] = \mathsf{PUSH}\;n\\ T[\![e_1 + e_2]\!] = T\,e_1 \;•\; T\,e_2 \;•\; \mathsf{ADD}\\ T[\![e_1 - e_2]\!] = T\,e_1 \;•\; T\,e_2 \;•\; \mathsf{SUB}\\ T[\![\mathtt{print}\;e]\!] = T\,e \;•\; \mathsf{PRINT}\\ T[\![\mathtt{program}\;s_1 \ldots s_n]\!] = T\,s_1 \;•\; \cdots \;•\; T\,s_n \end{array}$

The cool brackets $[\![$ and $]\!]$ distinguish syntactic forms, such as numerals and plus-expressions, from regular mathematical values, such as numbers.

Let’s see how it works for a trivial program:

$\begin{array}{l} T[\![\mathtt{program}\;\mathtt{print}\;8 + 13]\!] \\ \;\;\;\; = T[\![\mathtt{print}\;8 + 13]\!] \\ \;\;\;\; = T[\![8 + 13]\!] \;•\; \mathsf{PRINT} \\ \;\;\;\; = T[\![8]\!] \;•\; T[\![13]\!] \;•\; \mathsf{ADD} \;•\; \mathsf{PRINT} \\ \;\;\;\; = \mathsf{PUSH}\;8 \;•\; T[\![13]\!] \;•\; \mathsf{ADD} \;•\; \mathsf{PRINT} \\ \;\;\;\; = \mathsf{PUSH}\;8 \;•\; \mathsf{PUSH}\;13 \;•\; \mathsf{ADD} \;•\; \mathsf{PRINT} \\ \;\;\;\; = (\mathsf{PUSH}\;8, \mathsf{PUSH}\;13) \;•\; \mathsf{ADD} \;•\; \mathsf{PRINT} \\ \;\;\;\; = (\mathsf{PUSH}\;8, \mathsf{PUSH}\;13, \mathsf{ADD}) \;•\; \mathsf{PRINT} \\ \;\;\;\; = (\mathsf{PUSH}\;8, \mathsf{PUSH}\;13, \mathsf{ADD}, \mathsf{PRINT}) \end{array}$

The execution of this program on the “machine” unfolds as follows:

                // stack is (), output is ()
PUSH 8          // stack is [8], output is ()
PUSH 13         // stack is [8, 13], output is ()
ADD             // stack is [21], output is ()
PRINT           // stack is (), output is [21]

So the meaning of our program is the one element integer list $[21]$.

The translation of the example program from the beginning of this section is:

// print(8 - (3 + 21) + 34);
// print(55 - 3 + 2);

PUSH 8 
PUSH 3 
PUSH 21
ADD
SUB     
PUSH 34
ADD
PRINT
PUSH 55
PUSH 3 
SUB
PUSH 2 
ADD
PRINT   

CLASSWORK
Execute the program on the virtual machine. The output should be $[18, 54]$.

Structural Operational Semantics

Rather than specifying a machine with its own instruction set, we can simply define rules that show how a computation proceeds without the separate machine. Yep, we are going to abstract away the abstract machine!

For the tiny language, evaluating expressions eventually produces a number, executing statements keeps appending to “standard output,” and the execution of a program involves executing its statements, starting with the empty output to eventually produce its final output.

$$\frac{e_1 \longrightarrow e_1'} {[\![e_1 + e_2]\!] \longrightarrow [\![e_1' + e_2]\!]}$$
$$\frac{e_2 \longrightarrow e_2'} {[\![n + e_2]\!] \longrightarrow [\![n + e_2']\!]}$$
$$\frac{} {[\![n_1 + n_2]\!] \longrightarrow n_1+n_2}$$
$$\frac{e_1 \longrightarrow e_1'} {[\![e_1 - e_2]\!] \longrightarrow [\![e_1' - e_2]\!]}$$
$$\frac{e_2 \longrightarrow e_2'} {[\![n - e_2]\!] \longrightarrow [\![n - e_2']\!]}$$
$$\frac{} {[\![n_1 - n_2]\!] \longrightarrow n_1-n_2}$$
$$\frac{e \longrightarrow e'} {[\![\mathtt{print}\;e]\!], o \longrightarrow [\![\mathtt{print}\;e']\!], o }$$
$$\frac{} { [\![\mathtt{print}\;n]\!], o \longrightarrow o \cdot n }$$
$$\frac{ s_1, o \longrightarrow s'_1, o' } { [\![\mathtt{program}\;s_1, \ldots s_n]\!], o \longrightarrow [\![\mathtt{program}\;s'_1, \ldots s_n]\!], o' }$$
$$\frac{ s_1, o \longrightarrow o' } { [\![\mathtt{program}\;s_1, s_2, \ldots s_n]\!], o \longrightarrow [\![\mathtt{program}\;s_2, \ldots s_n]\!], o }$$
$$\frac{} { [\![\mathtt{program}]\!], o \longrightarrow o }$$

Here is the basic idea:

For each syntactic construct, the “machine” is in some configuration, which is either (1) the construct along with some other supporting data like a memory, input, output, or some such thing, or (2) some kind of “result”. The first kind of configuration is called an intermediate configuration.

In our example language:

CLASSWORK
This is best developed live, in a video or on a whiteboard.

To apply the semantics, we find the $o$ that the “starting configuration” $ [\![\mathtt{program}\;s_1 \cdots s_n]\!], ()$ will end up in. We can show the computation with a proof tree:

$$ \dfrac{ \dfrac{ \dfrac{ \dfrac{ \dfrac{ [\![3]\!]\longrightarrow 3\;\;\;[\![2]\!]\longrightarrow 2 }{ [\![3-2]\!]\longrightarrow 1 } }{ [\![\mathtt{print}\;3-2]\!],()\longrightarrow [\![\mathtt{print}\;1]\!],() } \;\;\;\;\;\;\; \dfrac{}{ [\![\mathtt{print}\;1]\!],()\longrightarrow (1)} }{ [\![\mathtt{print}\;3-2]\!],()\longrightarrow (1) } }{ [\![\mathtt{program\;print}\;3-2]\!], () \longrightarrow [\![\mathtt{program}]\!], (1) }\;\;\;\;\;\;\; \dfrac{}{ [\![\mathtt{program}]\!],(1) \longrightarrow (1) } }{ [\![\mathtt{program\;print}\;3-2]\!], () \longrightarrow (1) } $$

How much work was that?

That was a lot of work for a tiny language. Did you know there is another way to do this stuff, that does not involve nitty gritty computation details? There is. It is called natural semantics. Coming up...right now.

Natural Semantics

The semantics we’ve just seen, structural operational semantics, is often called small-step operational semantics. There’s another approach, known as, you guessed it, big-step operational semantics, also called natural semantics.

A motivation for, and details of, natural semantics, can be found in any good textbook on semantics. But we can learn it by just jumping right in. Here is the natural semantics for the tiny language we’ve been working with:

$$\frac{} {[\![n]\!] \Longrightarrow n}$$
$$\frac{e_1 \Longrightarrow x \;\;\;\; e_2 \Longrightarrow y} {[\![e_1 + e_2]\!] \Longrightarrow x+y}$$
$$\frac{e_1 \Longrightarrow x \;\;\;\; e_2 \Longrightarrow y} {[\![e_1 - e_2]\!] \Longrightarrow x-y}$$
$$\frac{e \Longrightarrow x} { [\![\mathtt{print}\;e]\!], o \Longrightarrow o \cdot x }$$
$$\frac{ s_1, () \Longrightarrow o_1 \;\;\;\; s_2, o_1 \Longrightarrow o_2 \;\;\;\; \cdots \;\;\;\; s_n, o_{n-1} \Longrightarrow o_n} {[\![\mathtt{program}\;s_1, \ldots s_n]\!] \Longrightarrow o_n }$$

CLASSWORK
We’ll derive this in class. Did you find it easier going?
Structural Operational Semantics vs. Natural Semantics

It might take time to really internalize the difference. Here are two things that may help. First, a summary table:
SOSNatural
Break computations down into individual “steps”Concerned only with the final result of evaluating a construct
Aka: “small-step”, “transitional”, “reduction”Aka: “big-step”, “relational”, “evaluation”
Uses $\rightarrow$Uses $\Rightarrow$ or $\Downarrow$
Transitions between configurations, e.g., $ e,m \rightarrow e',m$Complete evaluation of constructs, e.g., $ e,m \Rightarrow x$
Gordon Plotkin, 1981Gilles Kahn, 1987

Second, here is a fantastic and detailed slide deck from a University of Illinois class that does an excellent job of describing both methods.

Operational Semantics for Astro

Let’s scale up just a little bit. Remember Astro? Here is the abstract syntax:

$ \begin{array}{lcl} n: \mathsf{Nml} & & \\ i: \mathsf{Ide} & & \\ e: \mathsf{Exp} & = & n \mid i \mid -e \mid e+e \mid e-e \mid e\,\mathtt{*}\,e \mid e\,/\,e \mid e\,\%\,e \mid e\,\mathtt{**}\,e \mid i\;e*\\ s: \mathsf{Stm} & = & i = e \mid \mathtt{print}\;e\\ p: \mathsf{Pro} & = & s+\\ \end{array}$

Several new features appear in Astro that were not in the trivial language:

Concrete Operational Semantics

We’ll have to scale up the machine a bit to support Astro:

InstructionDescription
PUSH $n$Push the Constant $n$
LOAD $v$Push the value in memory location $v$
STORE $v$Pop into memory location $v$
NEGPop then push the negation of the popped value
SINPop then push the sine of the popped value
COSPop then push the cosine of the popped value
SQRTPop then push the square root of the popped value
ADDPop twice then push first popped value plus second popped value
SUBPop twice then push first popped value minus second popped value
MULPop twice then push first popped value times second popped value
DIVPop twice then push first popped value divided by second popped value
REMPop twice then push the remainder of first popped value divided by second popped value
POWPop twice then push first popped value raised to the power of the second popped value
HYPOTPop twice then push the hypotenuse of the two popped values
PRINTPrint contents of top of stack to standard output then pop
Exercise: Write the translation for Astro into this machine code. You can do it! Just don’t forget to handle errors such as division by zero or loading a variable from a memory location that hasn’t been initialized.

Natural Semantics

To get started, we think through the kinds of reduction relations we need. Since we have identifiers that are bound to entities, we need a place to store the entities. This is usually called a store or a memory. Expressions have to be evaluated in the context of a memory, and will produce a number:

$\Longrightarrow_E\;\subseteq (\mathsf{Exp} \times \mathsf{Mem}) \times \mathbb{R}$

Statements read and modify this memory and the program’s output. We’ll create the type abbreviation $\mathsf{Output}=\mathbb{R}^*$ to make the specification a little more transparent:

$\Longrightarrow_S\;\subseteq (\mathsf{Stm} \times \mathsf{Mem} \times \mathsf{Output}) \times (\mathsf{Mem} \times \mathsf{Output})$

Programs are executed to produce a list of output values:

$\Longrightarrow_P\;\subseteq \mathsf{Pro} \times \mathsf{Output}$

There are four things an identifier can stand for: (1) a numeric variable, (2) a function (something that is called from an expression), or (3) nothing at all. Numeric variables not only have a numeric value, but we also have to track whether they are read-only or read-write. And for functions, we have to keep track of not only their “value” (a lambda expression, woohoo), but how many parameters they are declared with, since we can only call them with the exact same number of arguments! This means that our memory is a function with the following type:

$\begin{array} l\mathsf{Ide} \rightarrow (\\ \;\;\;\; \{\mathsf{UNDEF}\} \;\cup \\ \;\;\;\; (\{\mathsf{NUM}\} \times \mathbb{R} \times \{\mathsf{RO},\mathsf{RW}\}) \;\cup \\ \;\;\;\; (\{\mathsf{FUN}\} \times (\mathbb{R}^* \rightarrow \mathbb{R}) \times \mathbb{N})) \end{array}$

Now we are ready to give the natural semantics. We start slow, with expressions. Some of these are rather detailed, so much class time will be spent in showing where the rules come from.

$$\frac{} {[\![n]\!],m \Longrightarrow n}$$
$$\frac{m(i) = (\mathsf{NUM}, x, \_)} {[\![i]\!],m \Longrightarrow x}$$
$$\frac{ e,m \Longrightarrow x} {[\![\mathsf{-}\;e]\!],m \Longrightarrow -x}$$
$$\frac{\begin{gathered} op \in \{ \mathsf{+}, \mathsf{-}, \mathsf{*}, \mathsf{/}, \mathsf{\%}, \mathtt{**}\} \\ e_1,m \Longrightarrow x \;\;\; e_2,m \Longrightarrow y \end{gathered}} {[\![e_1\;op\;e_2]\!],m \Longrightarrow op(x,y)}$$
$$\frac{( e_i,m \Longrightarrow a_i)_{i=1}^n \;\;\; m(i) = (\mathsf{FUN}, f, n)} {[\![i\;e_1,\ldots,e_n]\!],m \Longrightarrow f(a_1,\ldots,a_n)}$$
$$\frac{ e,m \Longrightarrow x' \;\;\; m(i) = \mathsf{UNDEF} \vee m(i) = (\mathsf{NUM},x,\mathsf{RW})} {[\![i=e]\!],m,o \Longrightarrow (m[(\mathsf{NUM},x',\mathsf{RW})\,/\,i], o)}$$
$$\frac{e, m \Longrightarrow x} {[\![\mathtt{print}\;e]\!],m,o \Longrightarrow (m, o \cdot x)}$$
$$\frac{( s_i, m_{i-1}, o_{i-1} \Longrightarrow (m_i,o_i))_{i=1}^n} {[\![\mathtt{program}\;s_1,\ldots,s_n]\!] \Longrightarrow o_n}$$

where $o_0$, the initial output, is defined to be the empty sequence, and the initial memory $m_0$ is our “standard library”:

$ \begin{array}{l} m_0 = \lambda\,i. \mathsf{UNDEF} \:[\\ \;\;\;\; (\mathsf{NUM}, \pi, \mathsf{RO})\,/\,\mathtt{π}] \:[\\ \;\;\;\; (\mathsf{FUN}, \lambda x.\sqrt{x}, 1)\,/\,\mathtt{sqrt}] \:[\\ \;\;\;\; (\mathsf{FUN}, \lambda x.\sin{x}, 1)\,/\,\mathtt{sin}] \:[\\ \;\;\;\; (\mathsf{FUN}, \lambda x.\cos{x}, 1)\,/\,\mathtt{cos}] \:[\\ \;\;\;\; (\mathsf{FUN}, \lambda (x,y).\sqrt{x^2+y^2}, 2)\,/\,\mathtt{hypot}] \:[\\ \end{array} $
How did we handle errors?

We took a simple approach to error handling here. Programs that are illegal, either because they violate static constraints (e.g., using an identifier bound to the wrong type of entity), or failed at run time (e.g., divide by zero), just simply cannot be derived by the semantic rules.

We”re not that interested, then, in what kind of “error” a program exhibits. What we care about is: if the program is legal, what is its meaning?

How bad was that? Of course, it takes practice to get used to.

Structural Operational Semantics

An exercise for you.

Exercise: Try it!

Operational Semantics for Bella

The programming language Bella has its own formal specification. It adds a bunch of very interesting features to Astro:

Here’s the abstract syntax, copied from the specification:

$ \begin{array}{l} n\!: \mathsf{Numeral} \\ i\!: \mathsf{Identifier} \\ e\!: \mathsf{Expression} = n \;|\; i \;|\; \mathtt{true} \;|\; \mathtt{false} \;|\; \mathit{unaryop} \; e \;|\; e_1 \; \mathit{binop} \; e_2 \;|\; i \; e^* \;|\; e \; \mathtt{?} \; e_1 \; \mathtt{:} \; e_2 \\ s\!: \mathsf{Statement} = \mathtt{let}\;i = e \;|\; \mathtt{func}\;i\;i^*=e \;|\; i = e \;|\; \mathtt{print}\;e \;|\; \mathtt{while}\;e\;b \\ b\!: \mathsf{Block} = \mathtt{block}\; s^* \\ p\!: \mathsf{Program} = \mathtt{program}\; b \end{array} $

Natural Semantics

The natural semantics of Bella is fully explained in the specification, so head over there to continue your operational semantics study.

Structural Operational Semantics

Try it!

Exercise: Write it!

Concrete Operational Semantics

It should be really fun to extend the virtual machine we’ve been playing with above to have new instructions that can handle Bella.

Exercise: Try this for now. Some day I will add it to these notes.

Some Interesting Language Features

Rather than jumping in to a new language, let’s go one feature at a time, and think, how might we extend Bella to have this feature?”

If Statements

The ubiquitous if-statement might look like this in a Bella program:

if x >= 3 {
   print(y);
   y = sqrt(y);
} else {
   y = y ** 2;
}

and in the abstract syntax appear as:

s: Statement = ... | if e b b

The (natural semantics) rules should be straightforward:

$$\frac{ e,m \Longrightarrow x \;\;\;\; x \neq 0 \;\;\;\; b_1,m,o \Longrightarrow (m'o')} {[\![\mathtt{if}\;e\;b_1\;b_2]\!],m,o \Longrightarrow (m',o')}$$
$$\frac{ e,m \Longrightarrow 0 \;\;\;\; b_2,m,o \Longrightarrow (m'o')} {[\![\mathtt{if}\;e\;b_1\;b_2]\!],m,o \Longrightarrow (m',o')}$$

For SOS we need to compute in smaller steps, bringing our tests down to a single number:

$$\frac{ e,m \longrightarrow e',m } {[\![\mathtt{if}\;e\;b_1\;b_2]\!],m,o \longrightarrow [\![\mathtt{if}\;e'\;b_1\;b_2]\!],m,o}$$
$$\frac{x \neq 0} {[\![\mathtt{if}\;x\;b_1\;b_2]\!],m,o \longrightarrow b_1,m,o}$$
$$\frac{} {[\![\mathtt{if}\;0\;b_1\;b_2]\!],m,o \longrightarrow b_2,m,o}$$

Parallel Assignment

Many languages have what they call parallel assignment, which might look like:

x, y = e1, e2

Natural semantics for Bella (in which identifiers must be bound before they can appear anywhere, including assignment statements):

$$\frac{\begin{gathered}e_1, m \Longrightarrow x \;\;\;\; e_2,m \Longrightarrow y \\ m(i) = (\mathsf{NUM}, x, \mathsf{RW}) \;\;\;\; m(j) = (\mathsf{NUM}, y, \mathsf{RW})\end{gathered}} {[\![i,j = e_1, e_2]\!],m,o \Longrightarrow (m[x/i][y/j],o)}$$

Swap

Swapping is similar to parallel assignment, but (in most languages) requires that both identifiers already be bound to read-write variables. Of course, that kind of behavior can vary from language to language, but that’s what we will assume here. The concrete syntax might look like:

x, y = y, x

with the following abstract syntax:

s: Statement = ... | swap i i

Natural semantics:

$$\frac{m(i) = (\mathsf{NUM}, x, \mathsf{RW}) \;\;\;\; m(j) = (\mathsf{NUM}, x, \mathsf{RW})} {[\![\mathsf{swap}\;i\;j]\!],m,o \Longrightarrow (m[x/j][y/i],o)}$$

And SOS:

$$\frac{m(i) = (\mathsf{NUM}, x, \mathsf{RW}) \;\;\;\; m(j) = (\mathsf{NUM}, x, \mathsf{RW})} {[\![\mathsf{swap}\;i\;j]\!],m,o \longrightarrow [\![\mathsf{skip}]\!],m[x/j][y/i],o}$$
Exercise: Study the rules until you fully understand them. Compare and contrast the semantics for both swap and parallel assignment.

Repeat-Until

What about something like this? You know, kinda like the while-statement except the body is executed at least once. And a truthy condition stops the loop.

repeat {
   print(y);
   y = sqrt(y);
} until y < 1;

Abstract syntax:

s: Statement = ... | repeat b e

Semantics:

$$\frac{ b,m,o \Longrightarrow (m', o') \;\;\;\; e,m' \Longrightarrow x \;\;\; x \neq 0} { [\![\mathtt{repeat}\;b\;e]\!],m,o \Longrightarrow (m', o')}$$
$$\frac{\begin{gathered} b,m,o \Longrightarrow (m',o') \;\;\; e,m' \Longrightarrow 0 \\ [\![\mathtt{repeat}\;b\;e]\!],m',o' \Longrightarrow (m'',o'')\;\;\; \end{gathered}} { [\![\mathtt{repeat}\;b\;e]\!],m,o \Longrightarrow (m'',o'')}$$

Repeat-N-Times

More mainstream languages need this beauty:

repeat n*2 {
   print(0);
}

Yep that’s right. Just execute a block a certain number of times, without having to introduce a stupid iterator variable. Abstract syntax:

s: Statement = ... | dotimes e b

For the semantics, we, as language designers, have to determine whether we want to evaluate the number of times to do the loop just once, or re-evaluate it each time. Let’s say we evaluate it only once, before we even think about doing the loop body:

$$\frac{ e,m_1 \Longrightarrow n \;\;\; ( b,m_i,o_i \Longrightarrow (m_{i+1}, o_{i+1}))_{i=1}^n} { [\![\mathtt{dotimes}\;b\;e]\!],m_1,o_1 \Longrightarrow (m_{n+1}, o_{n+1})}$$

For-Loops

Sometimes, your definite iteration will need a iterator variable. Here’s a common form that we can add to Bella:

for i = x to y*8 {
   print(i);
   y = y / 2;
}

with abstract syntax:

s: Statement = ... | for i e e b

Again, as language designers, we have some choices to make:

Let’s choose to evaluate the bounds once at the beginning, and to make the loop variable local to the body and read-only. This gives us the semantic rules:

$$\frac{ e_1,m \Longrightarrow x \;\;\; e_2,m \Longrightarrow y \;\;\; x > y} { [\![\mathtt{for}\;i\;e_1\;e_2\;b]\!],m,o \Longrightarrow (m, o)}$$
$$\frac{ e_1,m_x \Longrightarrow x \;\;\; e_2,m_x \Longrightarrow y \;\;\; x \leq y \;\;\; ( b,m_j[j/i],o_j \Longrightarrow (m_{j+1}, o_{j+1}))_{j=x}^y} { [\![\mathtt{for}\;i\;e_1\;e_2\;b]\!],m_x,o_x \Longrightarrow (m_{y+1}, o_{y+1})}$$

Nondeterminism

Many languages have a select statement, whose job is to randomly pick one of its arms to execute. The simplest form is to have exactly two arms:

select {
   print(0);
} or {
   y = y / 2;
   print(1);
}

with abstract syntax:

s: Statement = ... | select b b

The semantics are simple, but interesting. We have multiple rules that can fire for a given select statement. But that’s just fine. It’s what we want. The program has multiple possible executions. The semantics tells us all of them.

$$\frac{ b_1,m,o \Longrightarrow (m',o')} { [\![\mathtt{select}\;b_1\;b_2]\!],m,o\Longrightarrow (m', o')}$$
$$\frac{ b_2,m,o \Longrightarrow (m',o')} { [\![\mathtt{select}\;b_1\;b_2]\!],m,o\Longrightarrow (m', o')}$$

Concurrency

Concurrency is about things happening “at the same time.” The usual model is interleaved computation. For example:

TODO

can have multiple results:

TODO

Let’s add a new syntactic construct to execute two statements concurrently:

s: Statement = ... | par s s

Here, natural semantics is not sufficient; it can only say that we can execute the first statement and then the second, or the second and then the first. But an SOS can specify interleaved execution quite easily:

$$\frac{s_1,m,o \longrightarrow s_1',m',o'} { [\![\mathtt{par}\;s_1\;s_2]\!],m,o \longrightarrow [\![\mathtt{par}\;s_1'\;s_2]\!],m',o' }$$
$$\frac{s_2,m,o \longrightarrow s_2',m',o'} { [\![\mathtt{par}\;s_1\;s_2]\!],m,o \longrightarrow [\![\mathtt{par}\;s_1\;s_2']\!],m',o' }$$
$$\frac{s_1,m,o \longrightarrow m',o'} { [\![\mathtt{par}\;s_1\;s_2]\!],m,o \longrightarrow s_2,m',o' }$$
$$\frac{s_2,m,o \longrightarrow m',o'} { [\![\mathtt{par}\;s_1\;s_2]\!],m,o \longrightarrow s_1,m',o' }$$

Types

Now for one of biggest topics: types.

In OG Bella, variables could only hold numeric values. “Boolean” expressions like x>=y just evaluate to either 0 or 1, so boolean values did not exist. 0 was “falsy” and 1 was “truthy.” Expressions never produced anything but numbers! This is far from a modern programming language.

Strong Typing

Let’s extend Bella to be much more expressive: we’ll allow expressions to not only evaluate to numbers or booleans, but functions as well. And we’ll add lists. But with added power comes some responsibilities: we should do typechecking so we don’t multiply functions, call numbers, perform a not operation on a number, or other such dreadful things. We want to make it so that any misuse of expressions—that is the use of an expression of one type in a context that requires a different type—generate an error. We want strong typing.

Before we show the semantics, let’s look at lists. They should look like this in Bella programs:

a = [10, 20, 30];
b = [false, true, true, false];
if (b[0) {
  print(a[2]);
}

which in the abstract syntax will admit two new forms:

e: Exp = ... | e[e] | [e*]

So now for the semantics. Expression evaluation in OG Bella always produced a number. Now it will produce a more general value:

$\Longrightarrow_E \; \subseteq \; (\mathsf{Exp} \times \mathsf{Mem}) \times \mathsf{Value}$

Now what are values? There is a value for each kind of type (respectively, number, boolean, list, user-defined function, intrinsic function). So we’re gonna say:

$\begin{array}{l} \mathsf{Value}\;=_{def} \\ \;\;\;\; \mathbb{R} \;\cup \\ \;\;\;\; \mathbb{B} \;\cup \\ \;\;\;\; \mathsf{Value}^* \;\cup \\ \;\;\;\; (\mathsf{Identifier}^* \times \mathsf{Expression}) \;\cup \\ \;\;\;\; ((\mathsf{Value}^* \rightarrow \mathsf{Value}) \times \mathbb{N}) \end{array}$

Now we need a memory to store the entities bound to each identifier:

$ \mathsf{Mem}\;=_{def} \mathsf{Identifier} \rightarrow \mathsf{UNDEF} \;\cup\; (\mathsf{Value} \times \{\mathsf{RO},\mathsf{RW}\}) $

Here’s the semantics:

$$\frac{}{ [\![n]\!],m \Longrightarrow n}$$
$$\frac{}{ [\![\mathtt{true}]\!],m \Longrightarrow T}$$
$$\frac{}{ [\![\mathtt{false}]\!],m \Longrightarrow F}$$
$$\frac{m(i) = (x, \_)} {[\![i]\!],m \Longrightarrow x}$$
$$\frac{ e,m \Longrightarrow x \;\;\; x \in \mathbb{R}} {[\![\mathsf{-}\;e]\!],m \Longrightarrow -x}$$
$$\frac{ e,m \Longrightarrow x \;\;\; x \in \mathbb{B}} {[\![\mathsf{!}\;e]\!],m \Longrightarrow \neg x}$$
$$\frac{( e_i,m \Longrightarrow x_i)_{i=1}^n} {[\![\: [e_1,\ldots,e_n] \:]\!],m \Longrightarrow (x_1,\ldots,x_n)}$$
$$\frac{\begin{gathered} e_1,m \Longrightarrow x \;\;\; e_2,m \Longrightarrow y \\ x \in \mathsf{Value}^* \;\;\; y \in \mathbb{N} \;\;\;\; 0 \leq y < |x| \end{gathered} } {[\![\: e_1[e_2]\: ]\!],m \Longrightarrow x\downarrow y}$$
$$\frac{\begin{gathered} op \in \{ \mathsf{+}, \mathsf{-}, \mathsf{*}, \mathsf{/}, \mathsf{\%}, \mathtt{**}\} \;\;\; op \neq \mathtt{/} \vee y \neq 0 \\ e_1,m \Longrightarrow x \;\;\; e_2,m \Longrightarrow y \;\;\; x,y \in \mathbb{R} \end{gathered}} {[\![e_1\;op\;e_2]\!],m \Longrightarrow op(x,y)}$$
$$\frac{\begin{gathered} op \in \{\mathtt{<}, \mathtt{<=}, \mathtt{==}, \mathtt{!=}, \mathtt{>=}, \mathtt{>} \} \\ e_1,m \Longrightarrow x \;\;\; e_2,m \Longrightarrow y \;\;\; x,y \in \mathbb{R} \end{gathered}} {[\![e_1\;op\;e_2]\!],m \Longrightarrow op(x,y)}$$
$$\frac{ e_1,m \Longrightarrow F} {[\![e_1\;\mathtt{\&\&}\;e_2]\!],m \Longrightarrow F}$$
$$\frac{ e_1,m \Longrightarrow T \;\;\; e_2,m \Longrightarrow y} {[\![e_1\;\mathtt{\&\&}\;e_2]\!],m \Longrightarrow y}$$
$$\frac{ e_1,m \Longrightarrow T} {[\![e_1\;\mathtt{|\,|}\;e_2]\!],m \Longrightarrow T}$$
$$\frac{ e_1,m \Longrightarrow F \;\;\; e_2,m \Longrightarrow y} {[\![e_1\;\mathtt{|\,|}\;e_2]\!],m \Longrightarrow y}$$
$$\frac{ e,m \Longrightarrow T \;\;\; e_1,m \Longrightarrow y} {[\![e \; \mathtt{?} \; e_1 \; \mathtt{:} \; e_2]\!],m \Longrightarrow y}$$
$$\frac{ e,m \Longrightarrow F \;\;\; e_2,m \Longrightarrow z} {[\![e \; \mathtt{?} \; e_1 \; \mathtt{:} \; e_2]\!],m \Longrightarrow z}$$
$$\frac{( e_i,m \Longrightarrow a_i)_{i=1}^n \;\;\; m(i) = (((p_1,\ldots, p_n),e'), \_) \;\;\; e', m[a_i/p_i]_{i=1}^n \Longrightarrow x} { [\![i\;e_1,\ldots,e_n]\!],m \Longrightarrow x}$$
$$\frac{( e_i,m \Longrightarrow a_i)_{i=1}^n \;\;\; m(i) = ((f, n), \_)} { [\![i\;e_1,\ldots,e_n]\!],m \Longrightarrow f(a_1,\ldots,a_n)}$$
$$\frac{ e,m \Longrightarrow x \;\;\;\; m(i) = \mathsf{UNDEF}} { [\![\mathtt{let}\;i=e]\!],m,o \Longrightarrow (m[(x,\mathsf{RW})\,/\,i], o)}$$
$$\frac{m(i) = \mathsf{UNDEF}} { [\![\mathtt{fun}\;i\;p=e]\!],m,o \Longrightarrow (m[((p,e), \mathsf{RO})\,/\,i], o)}$$
$$\frac{ e,m \Longrightarrow x' \;\;\; m(i) = (x,\mathsf{RW})} { [\![i=e]\!],m,o \Longrightarrow (m[(x',\mathsf{RW})\,/\,i], o)}$$
$$\frac{ e,m \Longrightarrow x} { [\![\mathtt{print}\;e]\!],m,o \Longrightarrow (m, o\!\cdot\!(x))}$$
$$\frac{ e,m \Longrightarrow F} { [\![\mathtt{while}\;e\;b]\!],m,o \Longrightarrow (m, o)}$$
$$\frac{\begin{gathered} e,m \Longrightarrow T \;\;\; b,m,o \Longrightarrow (m',o') \\ [\![\mathtt{while}\;e\;b]\!],m',o' \Longrightarrow (m'',o'')\end{gathered}} { [\![\mathtt{while}\;e\;b]\!],m,o \Longrightarrow (m'',o'')}$$
$$\frac{( s_i, m_i, o_i \Longrightarrow (m_{i+1},o_{i+1}))_{i=1}^n} { [\![\mathtt{block}\;s_1,\ldots,s_n]\!],m_1,o_1 \Longrightarrow o_{n+1}}$$
$$\frac{ b, m_0, () \Longrightarrow (m,o)} {[\![\mathtt{program}\;b]\!] \Longrightarrow o}$$
Exercise: Study the rules.

Static Types

The semantics above strongly suggested type checking was done at run time. Now suppose we wanted to extend that extension of Bella so that all type checking could be done at compile time. Many (most?) programmers love languages that check types during compilation since it gives them a bit more confidence about the program’s correctness. Others my not like the lack of flexibility. Whatever your preference may be, static typing is a thing we should explore.

With static types, the language itself should have some changes. We will:

The static semantics will show what each construct “means” relative to a context. A context maps identifiers to things we know about them at compile time, rather than run time. We don’t care about values, then, but we do care about types and access (read-only or read-write). A context is then:

$ \begin{array}{l} \mathsf{Context} =_{def} \mathsf{Ide} \rightarrow \\ \;\;\;\; \mathsf{UNDEF} \;\cup \\ \;\;\;\; (\mathsf{Type} \times \{\mathsf{RO},\mathsf{RW}\}) \\ \\ \mathsf{Type} =_{def} \\ \;\;\;\; \{ \mathsf{num} \} \;\cup \\ \;\;\;\; \{ \mathsf{bool} \} \;\cup \\ \;\;\;\; (\{ \mathsf{list} \} \times \mathsf{Type}) \;\cup \\ \;\;\;\; (\mathsf{Type^*} \rightarrow \mathsf{Type}) \;\cup \\ \;\;\;\; ((\mathsf{Ide} \times \mathsf{Type})^* \times \mathsf{Exp} \times \mathsf{Type}) \end{array} $

When analyzing an expression in a context, we want to generate a type for it. When analyzing statements (which may include declarations, we will produce a new context. Programs simply are ok or not ok. Formally the relation types are:

TODO: explain the relations

A possible static semantics is:

$$\frac{}{ [\![n]\!],c \Longrightarrow \mathsf{num}}$$
$$\frac{}{ [\![\mathtt{true}]\!],c \Longrightarrow \mathsf{bool}}$$
$$\frac{}{ [\![\mathtt{false}]\!],c \Longrightarrow \mathsf{bool}}$$
$$\frac{c(i) = (t, \_)} {[\![i]\!],c \Longrightarrow t}$$
$$\frac{ e,c \Longrightarrow \mathsf{num}} {[\![\mathsf{-}\;e]\!],c \Longrightarrow \mathsf{num}}$$
$$\frac{ e,c \Longrightarrow \mathsf{bool}} {[\![\mathsf{!}\;e]\!],c \Longrightarrow \mathsf{bool}}$$
$$\frac{( e_i,c \Longrightarrow t)_{i=1}^n \;\;\; n \geq 1} {[\![\: [e_1,\ldots,e_n] \:]\!],c \Longrightarrow \mathsf{list}\langle t\rangle}$$
$$\frac{ e_1,c \Longrightarrow \mathsf{list}\langle t\rangle \;\;\; e_2,c \Longrightarrow \mathsf{num}} {[\![\: e_1[e_2]\: ]\!],c \Longrightarrow t}$$
$$\frac{\begin{gathered} op \in \{ \mathsf{+}, \mathsf{-}, \mathsf{*}, \mathsf{/}, \mathsf{\%}, \mathtt{**}\}\\ e_1,c \Longrightarrow \mathsf{num} \;\;\; e_2,c \Longrightarrow \mathsf{num} \end{gathered}} {[\![e_1\;op\;e_2]\!],c \Longrightarrow \mathsf{num}}$$
$$\frac{\begin{gathered} op \in \{\mathtt{<}, \mathtt{<=}, \mathtt{==}, \mathtt{!=}, \mathtt{>=}, \mathtt{>} \} \\ e_1,c \Longrightarrow \mathsf{num} \;\;\; e_2,c \Longrightarrow \mathsf{num}\end{gathered}} {[\![e_1\;op\;e_2]\!],c \Longrightarrow \mathsf{bool}}$$
$$\frac{ e_1,c \Longrightarrow \mathsf{bool} \;\;\; e_2,c \Longrightarrow \mathsf{bool}} {[\![e_1\;\mathtt{\&\&}\;e_2]\!],c \Longrightarrow \mathsf{bool}}$$
$$\frac{ e_1,c \Longrightarrow \mathsf{bool} \;\;\; e_2,c \Longrightarrow \mathsf{bool}} {[\![e_1\;\mathtt{|\,|}\;e_2]\!],c \Longrightarrow \mathsf{bool}}$$
$$\frac{ e,c \Longrightarrow \mathsf{bool} \;\;\; e_1,c \Longrightarrow t \;\;\; e_2,c \Longrightarrow t} {[\![e \; \mathtt{?} \; e_1 \; \mathtt{:} \; e_2]\!],c \Longrightarrow t}$$
$$\frac{( e_i,c \Longrightarrow t_i)_{i=1}^n \;\;\; c(i) = (((p_1,t_1),\ldots,(p_n,t_n)),e',t') \;\;\; e', c[(t_i,\mathsf{RO})/p_i]_{i=1}^n \Longrightarrow t'} { [\![i\;e_1,\ldots,e_n]\!],c \Longrightarrow t'}$$
$$\frac{( e_i,c \Longrightarrow t_i)_{i=1}^n \;\;\; c(i) = (f, (t_1, \ldots, t_n), t')} { [\![i\;e_1,\ldots,e_n]\!],c \Longrightarrow t'}$$
$$\frac{ e,c \Longrightarrow t \;\;\;\; c(i) = \mathsf{UNDEF}} { [\![\mathtt{let}\;i=e]\!],c \Longrightarrow c[(t,\mathsf{RW})\,/\,i]}$$
$$\frac{c(i) = \mathsf{UNDEF}} { [\![\mathtt{fun}\;i\;p\;t=e]\!],c \Longrightarrow c[((p,e,t), \mathsf{RO})\,/\,i]}$$
$$\frac{ e,c \Longrightarrow t \;\;\; c(i) = (t,\mathsf{RW})} { [\![i=e]\!],c \Longrightarrow c}$$
$$\frac{ e,c \Longrightarrow t} { [\![\mathtt{print}\;e]\!],c \Longrightarrow c}$$
$$\frac{ e,c \Longrightarrow \mathsf{bool} \;\;\; b,c \Longrightarrow c'} { [\![\mathtt{while}\;e\;b]\!],c \Longrightarrow c'}$$
$$\frac{( s_i,c_i \Longrightarrow c_{i+1})_{i=1}^n} { [\![\mathtt{block}\;s_1,\ldots,s_n]\!],c_1 \Longrightarrow c_{n+1}}$$
$$\frac{ b, c_0 \Longrightarrow c} {[\![\mathtt{program}\;b]\!] \Longrightarrow \mathsf{OK}}$$

A nice benefit of doing the static semantics only is that we can now write the dynamic semantics under the assumption that the program has already been typechecked. This means our memory needs to only store values (since the read-only and number-of-parameter settings has already been handled in the static semantics. We’ve done that “break a large problem down into smaller ones” thing! Here is the Bella dynamic-semantics-only:

$$\frac{}{ [\![n]\!],m \Longrightarrow n}$$
$$\frac{}{ [\![\mathtt{true}]\!],m \Longrightarrow T}$$
$$\frac{}{ [\![\mathtt{false}]\!],m \Longrightarrow F}$$
$$\frac{m(i) = x} {[\![i]\!],m \Longrightarrow x}$$
$$\frac{ e,m \Longrightarrow x} {[\![\mathsf{-}\;e]\!],m \Longrightarrow -x}$$
$$\frac{ e,m \Longrightarrow x} {[\![\mathsf{!}\;e]\!],m \Longrightarrow \neg x}$$
$$\frac{( e_i,m \Longrightarrow x_i)_{i=1}^n} {[\![\: [e_1,\ldots,e_n] \:]\!],m \Longrightarrow (x_1,\ldots,x_n)}$$
$$\frac{\begin{gathered} e_1,m \Longrightarrow x \;\;\; e_2,m \Longrightarrow y \\ 0 \leq y < |x| \end{gathered} } {[\![\: e_1[e_2]\: ]\!],m \Longrightarrow x\downarrow y}$$
$$\frac{\begin{gathered} op \in \{ \mathsf{+}, \mathsf{-}, \mathsf{*}, \mathsf{/}, \mathsf{\%}, \mathtt{**}\} \;\;\; op \neq \mathtt{/} \vee y \neq 0 \\ e_1,m \Longrightarrow x \;\;\; e_2,m \Longrightarrow y \end{gathered}} {[\![e_1\;op\;e_2]\!],m \Longrightarrow op(x,y)}$$
$$\frac{\begin{gathered} op \in \{\mathtt{<}, \mathtt{<=}, \mathtt{==}, \mathtt{!=}, \mathtt{>=}, \mathtt{>} \} \\ e_1,m \Longrightarrow x \;\;\; e_2,m \Longrightarrow y \end{gathered}} {[\![e_1\;op\;e_2]\!],m \Longrightarrow op(x,y)}$$
$$\frac{ e_1,m \Longrightarrow F} {[\![e_1\;\mathtt{\&\&}\;e_2]\!],m \Longrightarrow F}$$
$$\frac{ e_1,m \Longrightarrow T \;\;\; e_2,m \Longrightarrow y} {[\![e_1\;\mathtt{\&\&}\;e_2]\!],m \Longrightarrow y}$$
$$\frac{ e_1,m \Longrightarrow T} {[\![e_1\;\mathtt{|\,|}\;e_2]\!],m \Longrightarrow T}$$
$$\frac{ e_1,m \Longrightarrow F \;\;\; e_2,m \Longrightarrow y} {[\![e_1\;\mathtt{|\,|}\;e_2]\!],m \Longrightarrow y}$$
$$\frac{ e,m \Longrightarrow T \;\;\; e_1,m \Longrightarrow y} {[\![e \; \mathtt{?} \; e_1 \; \mathtt{:} \; e_2]\!],m \Longrightarrow y}$$
$$\frac{ e,m \Longrightarrow F \;\;\; e_2,m \Longrightarrow z} {[\![e \; \mathtt{?} \; e_1 \; \mathtt{:} \; e_2]\!],m \Longrightarrow z}$$
$$\frac{( e_i,m \Longrightarrow a_i)_{i=1}^n \;\;\; m(i) = ((p_1,\ldots, p_n),e') \;\;\; e', m[a_i/p_i]_{i=1}^n \Longrightarrow x} { [\![i\;e_1,\ldots,e_n]\!],m \Longrightarrow x}$$
$$\frac{( e_i,m \Longrightarrow a_i)_{i=1}^n} { [\![i\;e_1,\ldots,e_n]\!],m \Longrightarrow m(i)(a_1,\ldots,a_n)}$$
$$\frac{ e,m \Longrightarrow x} { [\![\mathtt{let}\;i=e]\!],m,o \Longrightarrow (m[x\,/\,i], o)}$$
$$\frac{} { [\![\mathtt{fun}\;i\;p=e]\!],m,o \Longrightarrow (m[(p,e)\,/\,i], o)}$$
$$\frac{ e,m \Longrightarrow x} { [\![i=e]\!],m,o \Longrightarrow (m[x\,/\,i], o)}$$
$$\frac{ e,m \Longrightarrow x} { [\![\mathtt{print}\;e]\!],m,o \Longrightarrow (m, o\!\cdot\!(x))}$$
$$\frac{ e,m \Longrightarrow F} { [\![\mathtt{while}\;e\;b]\!],m,o \Longrightarrow (m, o)}$$
$$\frac{\begin{gathered} e,m \Longrightarrow T \;\;\; b,m,o \Longrightarrow (m',o') \\ [\![\mathtt{while}\;e\;b]\!],m',o' \Longrightarrow (m'',o'')\end{gathered}} { [\![\mathtt{while}\;e\;b]\!],m,o \Longrightarrow (m'',o'')}$$
$$\frac{( s_i, m_i, o_i \Longrightarrow (m_{i+1},o_{i+1}))_{i=1}^n} { [\![\mathtt{block}\;s_1,\ldots,s_n]\!],m_1,o_1 \Longrightarrow o_{n+1}}$$
$$\frac{ b, m_0, () \Longrightarrow (m,o)} {[\![\mathtt{program}\;b]\!] \Longrightarrow o}$$
Exercise: Think about this two-part semantics. Compare it to the previous “all-in-one” semantics. What are the pros and cons? Which do you like better, at least for Bella? What might you like better if we were giving a semantics of C++, Rust, or JavaScript?

Learning More

Here are some good papers and other references:

Summary

We’ve covered:

  • A motivating example
  • What concrete operational semantics is
  • What structural operational semantics is
  • What natural semantics is
  • Differences between SOS and NS
  • Astro
  • Bella
  • More language features
  • Types
  • Where to learn more