Operational Semantics

One way to define the meaning of a program is to show how it runs on some machine.

A Tiny Example Language

We’ll start very simply, with a language that just has integers, addition and subtraction, and print statements. Programs look like this:

print(8 - (3 + 21) + 34);
print(55 - 3 + 2);

The concrete syntax can be given by this grammar:

$\begin{array}{lcl} \textit{Program} & \longleftarrow & \textit{Stmt}+ \\ \textit{Stmt} & \longleftarrow & \texttt{"print"}\;\texttt{"("}\;\textit{Exp}\;\texttt{")"}\;\texttt{";"} \\ \textit{Exp} & \longleftarrow & (\textit{Exp}\;(\texttt{"+"}\;|\;\texttt{"-"}))?\;\textit{Term} \\ \textit{Term} & \longleftarrow & \textit{num} \;|\; \texttt{"("}\;\textit{Exp}\;\texttt{")"} \\ \textit{num} & \longleftarrow & \textit{digit}+ \end{array}$

When defining a semantics we don’t care about concrete syntax, only abstract syntax. So here is the abstract syntax of this tiny language, which is all we care about from here on:

$\begin{array}{lcl} n: \textsf{Nml} & & \\ e: \textsf{Exp} & = & n \mid e + e \mid e - e \\ s: \textsf{Stm} & = & \mathtt{print}\;e \\ p: \textsf{Pro} & = & s+ \end{array}$

Need Practice?

For the example program in this section, give both the parse tree and the abstract syntax tree.

Concrete Operational Semantics

A concrete operational semantics is given to a language by mapping language constructs to operations on a virtual machine—a machine whose operation is “obvious.”

Stack machines are pretty well understood! Let’s make up a stack machine. The machine has a stack for execution and allows you to store integers in named cells. The instructions are:

InstructionDescription
PUSH $n$Push the value of the literal $n$
ADDPop twice then push first popped value plus second popped value
SUBPop twice then push first popped value minus second popped value
PRINTPrint contents of top of stack to standard output then pop

When doing a concrete operational semantics, we just show the instruction sequence for each abstract syntactic form. That way, every program in our language will be translated to a unique instruction sequence. We assume this machine is sooooooooo simple that the meaning of instruction sequences is well understood. Therefore showing how to translate programs into machine instruction sequences suffices to define the semantics of the language. We will define the translation function, $T$, by cases on the abstract syntax forms:

$\begin{array}{l} T[\![n]\!] = \mathsf{PUSH}\;n\\ T[\![e_1 + e_2]\!] = T\,e_1 \;\boldsymbol{\cdot}\; T\,e_2 \;\boldsymbol{\cdot}\; \mathsf{ADD}\\ T[\![e_1 - e_2]\!] = T\,e_1 \;\boldsymbol{\cdot}\; T\,e_2 \;\boldsymbol{\cdot}\; \mathsf{SUB}\\ T[\![\mathtt{print}\;e]\!] = T\,e \;\boldsymbol{\cdot}\; \mathsf{PRINT}\\ T[\![\mathtt{program}\;s_1 \ldots s_n]\!] = T\,s_1 \;\boldsymbol{\cdot}\; \cdots \;\boldsymbol{\cdot}\; T\,s_n \end{array}$

The cool brackets $[\![$ and $]\!]$ distinguish syntactic forms, such as numerals and plus-expressions, from regular mathematical values, such as numbers.

Let’s see how it works for a trivial program:

$\begin{array}{l} T[\![\mathtt{program}\;\mathtt{print}\;8 + 13]\!] \\ \;\;\;\; = T[\![\mathtt{print}\;8 + 13]\!] \\ \;\;\;\; = T[\![8 + 13]\!] \;\boldsymbol{\cdot}\; \mathsf{PRINT} \\ \;\;\;\; = T[\![8]\!] \;\boldsymbol{\cdot}\; T[\![13]\!] \;\boldsymbol{\cdot}\; \mathsf{ADD} \;\boldsymbol{\cdot}\; \mathsf{PRINT} \\ \;\;\;\; = \mathsf{PUSH}\;8 \;\boldsymbol{\cdot}\; T[\![13]\!] \;\boldsymbol{\cdot}\; \mathsf{ADD} \;\boldsymbol{\cdot}\; \mathsf{PRINT} \\ \;\;\;\; = \mathsf{PUSH}\;8 \;\boldsymbol{\cdot}\; \mathsf{PUSH}\;13 \;\boldsymbol{\cdot}\; \mathsf{ADD} \;\boldsymbol{\cdot}\; \mathsf{PRINT} \\ \;\;\;\; = (\mathsf{PUSH}\;8, \mathsf{PUSH}\;13) \;\boldsymbol{\cdot}\; \mathsf{ADD} \;\boldsymbol{\cdot}\; \mathsf{PRINT} \\ \;\;\;\; = (\mathsf{PUSH}\;8, \mathsf{PUSH}\;13, \mathsf{ADD}) \;\boldsymbol{\cdot}\; \mathsf{PRINT} \\ \;\;\;\; = (\mathsf{PUSH}\;8, \mathsf{PUSH}\;13, \mathsf{ADD}, \mathsf{PRINT}) \end{array}$

The execution of this program on the “machine” unfolds as follows:

                // stack is (), output is ()
PUSH 8          // stack is [8], output is ()
PUSH 13         // stack is [8, 13], output is ()
ADD             // stack is [21], output is ()
PRINT           // stack is (), output is [21]

So the meaning of our program is the one element integer list $[21]$.

The translation of the example program from the beginning of this section is:

// print(8 - (3 + 21) + 34);
// print(55 - 3 + 2);

PUSH 8 
PUSH 3 
PUSH 21
ADD
SUB     
PUSH 34
ADD
PRINT
PUSH 55
PUSH 3 
SUB
PUSH 2 
ADD
PRINT   

CLASSWORK
Execute the program on the virtual machine. The output should be $[18, 54]$.

Structural Operational Semantics

Rather than specifying a machine with its own instruction set, we can simply define rules that show how a computation proceeds without the separate machine. Yep, we are going to abstract away the abstract machine!

For the tiny language, evaluating expressions eventually produces a number, executing statements keeps appending to “standard output,” and the execution of a program involves executing its statements, starting with the empty output to eventually produce its final output.

$$\frac{e_1 \longrightarrow e_1'} {[\![e_1 + e_2]\!] \longrightarrow [\![e_1' + e_2]\!]}$$
$$\frac{e_2 \longrightarrow e_2'} {[\![n + e_2]\!] \longrightarrow [\![n + e_2']\!]}$$
$$\frac{} {[\![n_1 + n_2]\!] \longrightarrow n_1+n_2}$$
$$\frac{e_1 \longrightarrow e_1'} {[\![e_1 - e_2]\!] \longrightarrow [\![e_1' - e_2]\!]}$$
$$\frac{e_2 \longrightarrow e_2'} {[\![n - e_2]\!] \longrightarrow [\![n - e_2']\!]}$$
$$\frac{} {[\![n_1 - n_2]\!] \longrightarrow n_1-n_2}$$
$$\frac{e \longrightarrow e'} {[\![\mathtt{print}\;e]\!], o \longrightarrow [\![\mathtt{print}\;e']\!], o }$$
$$\frac{} { [\![\mathtt{print}\;n]\!], o \longrightarrow o \cdot n }$$
$$\frac{ s_1, o \longrightarrow s'_1, o' } { [\![\mathtt{program}\;s_1, \ldots s_n]\!], o \longrightarrow [\![\mathtt{program}\;s'_1, \ldots s_n]\!], o' }$$
$$\frac{ s_1, o \longrightarrow o' } { [\![\mathtt{program}\;s_1, s_2, \ldots s_n]\!], o \longrightarrow [\![\mathtt{program}\;s_2, \ldots s_n]\!], o' }$$
$$\frac{} { [\![\mathtt{program}]\!], o \longrightarrow o }$$

Here is the basic idea:

In our example language:

CLASSWORK
This is best developed live, in a video or on a whiteboard.

To apply the semantics, we find the $o$ that the “starting configuration” $ [\![\mathtt{program}\;s_1 \cdots s_n]\!], ()$ will end up in. We can show the computation with a proof tree:

$$ \dfrac{ \dfrac{ \dfrac{ \dfrac{ \dfrac{ [\![3]\!]\longrightarrow 3\;\;\;[\![2]\!]\longrightarrow 2 }{ [\![3-2]\!]\longrightarrow 1 } }{ [\![\mathtt{print}\;3-2]\!],()\longrightarrow [\![\mathtt{print}\;1]\!],() } \;\;\;\;\;\;\; \dfrac{}{ [\![\mathtt{print}\;1]\!],()\longrightarrow (1)} }{ [\![\mathtt{print}\;3-2]\!],()\longrightarrow (1) } }{ [\![\mathtt{program\;print}\;3-2]\!], () \longrightarrow [\![\mathtt{program}]\!], (1) }\;\;\;\;\;\;\; \dfrac{}{ [\![\mathtt{program}]\!],(1) \longrightarrow (1) } }{ [\![\mathtt{program\;print}\;3-2]\!], () \longrightarrow (1) } $$

How much work was that?

That was a lot of work for a tiny language. Did you know there is another way to do this stuff, that does not involve nitty gritty computation details? There is. It is called natural semantics. Coming up...right now.

Natural Semantics

The semantics we’ve just seen, structural operational semantics, is often called small-step operational semantics. There’s another approach, known as, you guessed it, big-step operational semantics, also called natural semantics. Where little steps in SOS are given with the relation $\longrightarrow$, natural semantics uses $\Longrightarrow$ or $\Downarrow$ to signify that a construct is completely evaluated or executed to its final configuration.

A motivation for, and details of, natural semantics, can be found in any good textbook on semantics. But we can learn it by just jumping right in. Here is the natural semantics for the tiny language we’ve been working with:

$$\frac{} {[\![n]\!] \Longrightarrow n}$$
$$\frac{e_1 \Longrightarrow x \;\;\;\; e_2 \Longrightarrow y} {[\![e_1 + e_2]\!] \Longrightarrow x+y}$$
$$\frac{e_1 \Longrightarrow x \;\;\;\; e_2 \Longrightarrow y} {[\![e_1 - e_2]\!] \Longrightarrow x-y}$$
$$\frac{e \Longrightarrow x} { [\![\mathtt{print}\;e]\!], o \Longrightarrow o \cdot x }$$
$$\frac{ s_1, () \Longrightarrow o_1 \;\;\;\; s_2, o_1 \Longrightarrow o_2 \;\;\;\; \cdots \;\;\;\; s_n, o_{n-1} \Longrightarrow o_n} {[\![\mathtt{program}\;s_1, \ldots s_n]\!] \Longrightarrow o_n }$$

CLASSWORK
We’ll derive this in class. Did you find it easier going?
Structural Operational Semantics vs. Natural Semantics

It might take time to really internalize the difference. Here are two things that may help. First, a summary table:
SOSNatural
Break computations down into individual “steps”Concerned only with the final result of evaluating a construct
Aka: “small-step”, “transitional”, “reduction”Aka: “big-step”, “relational”, “evaluation”
Uses $\rightarrow$Uses $\Rightarrow$ or $\Downarrow$
Transitions between configurations, e.g., $ e,m \rightarrow e',m$Complete evaluation of constructs, e.g., $ e,m \Rightarrow x$
Gordon Plotkin, 1981Gilles Kahn, 1987

Second, here is a fantastic and detailed slide deck from a University of Illinois class that does an excellent job of describing both methods.

Operational Semantics for Astro

Let’s scale up just a little bit. Remember Astro? Here is the abstract syntax:

$ \begin{array}{lcl} n: \mathsf{Nml} & & \\ i: \mathsf{Ide} & & \\ e: \mathsf{Exp} & = & n \mid i \mid -e \mid e+e \mid e-e \mid e\,\mathtt{*}\,e \mid e\,/\,e \mid e\,\%\,e \mid e\,\mathtt{**}\,e \mid i\;e*\\ s: \mathsf{Stm} & = & i = e \mid \mathtt{print}\;e\\ p: \mathsf{Pro} & = & s+\\ \end{array}$

Several new features appear in Astro that were not in the trivial language:

We are going to look at the three styles of operations semantics for Astro.

Concrete Operational Semantics of Astro

We’ll have to scale up the machine a bit to support Astro:

InstructionDescription
PUSH $n$Push the value of the literal $n$
LOAD $v$Push the value in memory location $v$, crashing if $v$ is not in memory
STORE $v$Pop into memory location $v$
NEGPop then push the negation of the popped value
SINPop then push the sine of the popped value
COSPop then push the cosine of the popped value
SQRTPop then push the square root of the popped value
ADDPop twice then push first popped value plus second popped value
SUBPop twice then push first popped value minus second popped value
MULPop twice then push first popped value times second popped value
DIVPop twice then push first popped value divided by second popped value, crashing if the second popped value is 0
REMPop twice then push the remainder of first popped value divided by second popped value
POWPop twice then push first popped value raised to the power of the second popped value
HYPOTPop twice then push the hypotenuse of the two popped values
PRINTPrint contents of top of stack to standard output then pop

For example, the Astro program:

x = 9 * 35 ** π;
print(3 / hypot(1-cos(x), 13));

is:

PUSH 9
PUSH 35
PUSH π
POW
MUL
STORE x
PUSH 3
PUSH 1
LOAD x
COS
SUB
PUSH 13
HYPOT
DIV
PRINT

For a concrete semantics, we’re don’t actually interpret the program, we compile it down to the little stack language. How can we specify this translation function? One way is to just write it in Python:

astro_concrete.py
import math


def check(condition, message):
    if not condition:
        raise ValueError(message)


initial_env = {
    "π": { "kind": "CONST", "value": math.pi },
    "sqrt":   { "kind": "FUN", "arity": 1 },
    "sin": { "kind": "FUN", "arity": 1 },
    "cos": { "kind": "FUN", "arity": 1 },
    "hypot": { "kind": "FUN", "arity": 2 }}


def t(node, env):
    match node:
        case int(n) | float(n):
            return [f"PUSH {n}"]
        case str(x):
            check(x in env, f"Uninitialized {x}")
            check((kind := env[x]["kind"]) in ("CONST", "VAR"), "Not a variable")
            return [f"LOAD {x}" if kind == "VAR" else f"PUSH {env[x]['value']}"]
        case ("neg", e):
            return t(e, env) + ["NEG"]
        case ("add", e1, e2):
            return t(e1, env) + t(e2, env) + ["ADD"]
        case ("sub", e1, e2):
            return t(e1, env) + t(e2, env) + ["SUB"]
        case ("mul", e1, e2):
            return t(e1, env) + t(e2, env) + ["MUL"]
        case ("div", e1, e2):
            return t(e1, env) + t(e2, env) + ["DIV"]
        case ("rem", e1, e2):
            return t(e1, env) + t(e2, env) + ["REM"]
        case ("pow", e1, e2):
            return t(e1, env) + t(e2, env) + ["POW"]
        case ("call", f, args):
            check(f in env, f"Unknown {f}")
            check(env[f]["kind"] == "FUN", f"{f} is not a function")
            check(len(args) == env[f]["arity"], f"Wrong number of args for {f}")
            return [i for arg in args for i in t(arg, env)] + [f.upper()]
        case ("assign", i, e):
            check(i not in env or env[i]["kind"] == "VAR", f"{i} is not assignable")
            return t(e, env) + [f"STORE {i}"], {**env, i: {"kind": "VAR"}}
        case ("print", exp):
            return t(exp, env) + ["PRINT"], env
        case ("program", statements):
            code, e = [], env
            for s in statements:
                c, e = t(s, e)
                code += c
            return code
        case _:
            raise ValueError("Malformed program")


# Quick check: print the translation of a small program
program = ("program", [
    ("assign", "x", ("mul", 9, ("pow", 35, "π"))),
    ("print", ("div", 3, ("call", "hypot", [
        ("sub", 1, ("call", "cos", ["x"])), 13])))])
print('\n'.join(t(program, initial_env)))

Let’s run it:

$ python3 astro_concrete.py
PUSH 9
PUSH 35
PUSH 3.141592653589793
POW
MUL
STORE x
PUSH 3
PUSH 1
LOAD x
COS
SUB
PUSH 13
HYPOT
DIV
PRINT

Exactly what we expected.

Natural Semantics of Astro

We saw this on our notes on Semantics.

Structural Operational Semantics of Astro

An exercise for you.

Exercise: Try it!

Operational Semantics for Bella

The programming language Bella has its own formal specification. It adds a bunch of very interesting features to Astro:

Here’s the abstract syntax, copied from the specification:

$ \begin{array}{l} n\!: \mathsf{Numeral} \\ i\!: \mathsf{Identifier} \\ e\!: \mathsf{Expression} = n \;|\; i \;|\; \mathtt{true} \;|\; \mathtt{false} \;|\; \mathit{unaryop} \; e \;|\; e_1 \; \mathit{binop} \; e_2 \;|\; i \; e^* \;|\; e \; \mathtt{?} \; e_1 \; \mathtt{:} \; e_2 \\ s\!: \mathsf{Statement} = \mathtt{let}\;i = e \;|\; \mathtt{func}\;i\;i^*=e \;|\; i = e \;|\; \mathtt{print}\;e \;|\; \mathtt{while}\;e\;b \\ b\!: \mathsf{Block} = \mathtt{block}\; s^* \\ p\!: \mathsf{Program} = \mathtt{program}\; b \end{array} $

Natural Semantics

The natural semantics of Bella is fully explained in the specification, so head over there to continue your operational semantics study.

Structural Operational Semantics

Try it!

Exercise: Write it!

Concrete Operational Semantics

It should be really fun to extend the virtual machine we’ve been playing with above to have new instructions that can handle Bella.

Exercise: Try this for now. Some day I will add it to these notes.

Some Interesting Language Features

Rather than jumping in to a new language, let’s go one feature at a time, and think, how might we extend Bella to have this feature?”

If Statements

The ubiquitous if-statement might look like this in a Bella program:

if x >= 3 {
   print(y);
   y = sqrt(y);
} else {
   y = y ** 2;
}

and in the abstract syntax appears as:

$s: \textsf{Statement} = \ldots \mid \texttt{if}\;e\;b\;b$

The (natural semantics) rules should be straightforward:

$$\frac{ e,m \Longrightarrow x \;\;\;\; x \neq 0 \;\;\;\; b_1,m,o \Longrightarrow (m'o')} {[\![\mathtt{if}\;e\;b_1\;b_2]\!],m,o \Longrightarrow (m',o')}$$
$$\frac{ e,m \Longrightarrow 0 \;\;\;\; b_2,m,o \Longrightarrow (m'o')} {[\![\mathtt{if}\;e\;b_1\;b_2]\!],m,o \Longrightarrow (m',o')}$$

For SOS we need to compute in smaller steps, bringing our tests down to a single number:

$$\frac{ e,m \longrightarrow e',m } {[\![\mathtt{if}\;e\;b_1\;b_2]\!],m,o \longrightarrow [\![\mathtt{if}\;e'\;b_1\;b_2]\!],m,o}$$
$$\frac{x \neq 0} {[\![\mathtt{if}\;x\;b_1\;b_2]\!],m,o \longrightarrow b_1,m,o}$$
$$\frac{} {[\![\mathtt{if}\;0\;b_1\;b_2]\!],m,o \longrightarrow b_2,m,o}$$

Parallel Assignment

Many languages have what they call parallel assignment, which might look like:

x, y = e1, e2

with the following abstract syntax:

$s: \textsf{Statement} = \ldots \mid i, i = e, e$

We can add the following rules to the natural semantics for Bella (in which identifiers must be bound before they can appear anywhere, including assignment statements):

$$\frac{\begin{gathered}e_1, m \Longrightarrow x \;\;\;\; e_2,m \Longrightarrow y \\ m(i) = (\mathsf{NUM}, x, \mathsf{RW}) \;\;\;\; m(j) = (\mathsf{NUM}, y, \mathsf{RW})\end{gathered}} {[\![i,j = e_1, e_2]\!],m,o \Longrightarrow (m[x/i][y/j],o)}$$

Swap

Swapping is similar to parallel assignment, but (in most languages) requires that both identifiers already be bound to read-write variables. Of course, that kind of behavior can vary from language to language, but that’s what we will assume here. The concrete syntax might look like:

x, y = y, x

with the following abstract syntax:

$s: \textsf{Statement} = \ldots \mid \texttt{swap}\;i\;i$

Natural semantics:

$$\frac{m(i) = (\mathsf{NUM}, x, \mathsf{RW}) \;\;\;\; m(j) = (\mathsf{NUM}, x, \mathsf{RW})} {[\![\mathsf{swap}\;i\;j]\!],m,o \Longrightarrow (m[x/j][y/i],o)}$$

And SOS:

$$\frac{m(i) = (\mathsf{NUM}, x, \mathsf{RW}) \;\;\;\; m(j) = (\mathsf{NUM}, x, \mathsf{RW})} {[\![\mathsf{swap}\;i\;j]\!],m,o \longrightarrow [\![\mathsf{skip}]\!],m[x/j][y/i],o}$$
Exercise: Study the rules until you fully understand them. Compare and contrast the semantics for both swap and parallel assignment.

Repeat-Until

What about something like this? You know, kinda like the while-statement except the body is executed at least once. And a truthy condition stops the loop.

repeat {
   print(y);
   y = sqrt(y);
} until y < 1;

Abstract syntax:

$s: \textsf{Statement} = \ldots \mid \texttt{repeat}\;b\;e$

Semantics:

$$\frac{ b,m,o \Longrightarrow (m', o') \;\;\;\; e,m' \Longrightarrow x \;\;\; x \neq 0} { [\![\mathtt{repeat}\;b\;e]\!],m,o \Longrightarrow (m', o')}$$
$$\frac{\begin{gathered} b,m,o \Longrightarrow (m',o') \;\;\; e,m' \Longrightarrow 0 \\ [\![\mathtt{repeat}\;b\;e]\!],m',o' \Longrightarrow (m'',o'')\;\;\; \end{gathered}} { [\![\mathtt{repeat}\;b\;e]\!],m,o \Longrightarrow (m'',o'')}$$

Repeat-N-Times

More mainstream languages need this beauty:

repeat n*2 {
   print(0);
}

Yep that’s right. Just execute a block a certain number of times, without having to introduce a stupid iterator variable. Abstract syntax:

$s: \textsf{Statement} = \ldots \mid \texttt{dotimes}\;e\;b$

For the semantics, we, as language designers, have to determine whether we want to evaluate the number of times to do the loop just once, or re-evaluate it each time. Let’s say we evaluate it only once, before we even think about doing the loop body:

$$\frac{ e,m_1 \Longrightarrow n \;\;\; ( b,m_i,o_i \Longrightarrow (m_{i+1}, o_{i+1}))_{i=1}^n} { [\![\mathtt{dotimes}\;b\;e]\!],m_1,o_1 \Longrightarrow (m_{n+1}, o_{n+1})}$$

For-Loops

Sometimes, your definite iteration will need a iterator variable. Here’s a common form that we can add to Bella:

for i = x to y*8 {
   print(i);
   y = y / 2;
}

with abstract syntax:

$s: \textsf{Statement} = \ldots \mid \texttt{for}\;i\;e\;e\;b$

Again, as language designers, we have some choices to make:

Let’s choose to evaluate the bounds once at the beginning, and to make the loop variable local to the body and read-only. This gives us the semantic rules:

$$\frac{ e_1,m \Longrightarrow x \;\;\; e_2,m \Longrightarrow y \;\;\; x > y} { [\![\mathtt{for}\;i\;e_1\;e_2\;b]\!],m,o \Longrightarrow (m, o)}$$
$$\frac{ e_1,m_x \Longrightarrow x \;\;\; e_2,m_x \Longrightarrow y \;\;\; x \leq y \;\;\; ( b,m_j[j/i],o_j \Longrightarrow (m_{j+1}, o_{j+1}))_{j=x}^y} { [\![\mathtt{for}\;i\;e_1\;e_2\;b]\!],m_x,o_x \Longrightarrow (m_{y+1}, o_{y+1})}$$

Nondeterminism

Many languages have a select statement, whose job is to randomly pick one of its arms to execute. The simplest form is to have exactly two arms:

select {
   print(0);
} or {
   y = y / 2;
   print(1);
}

with abstract syntax:

$s: \textsf{Statement} = \ldots \mid \texttt{select}\;b\;b$

The semantics are simple, but interesting. We have multiple rules that can fire for a given select statement. But that’s just fine. It’s what we want. The program has multiple possible executions. The semantics tells us all of them.

$$\frac{ b_1,m,o \Longrightarrow (m',o')} { [\![\mathtt{select}\;b_1\;b_2]\!],m,o\Longrightarrow (m', o')}$$
$$\frac{ b_2,m,o \Longrightarrow (m',o')} { [\![\mathtt{select}\;b_1\;b_2]\!],m,o\Longrightarrow (m', o')}$$

Concurrency

Concurrency is about things happening “at the same time.” The usual model is interleaved computation. For example:

x, y = 1, 2;
par {
  x = y + 5 * y;    // t0:=y, t1:=y, t2:=5*t1, t3:=t0+t2, x:=t3
  y = x + 8;        // t4:=x, t5:=t4+8, y:=t5
}
print x;
print y;

can have multiple results! One of which is [12,20]. Another is [54,9]. There are even more!

CLASSWORK
Determine all of the other possibilities.

Let’s add a new syntactic construct to execute two statements concurrently:

$s: \textsf{Statement} = \ldots \mid \texttt{par}\;s\;s$

Here, natural semantics is not sufficient; it can only say that we can execute the first statement and then the second, or the second and then the first. But an SOS can specify interleaved execution quite easily:

$$\frac{s_1,m,o \longrightarrow s_1',m',o'} { [\![\mathtt{par}\;s_1\;s_2]\!],m,o \longrightarrow [\![\mathtt{par}\;s_1'\;s_2]\!],m',o' }$$
$$\frac{s_2,m,o \longrightarrow s_2',m',o'} { [\![\mathtt{par}\;s_1\;s_2]\!],m,o \longrightarrow [\![\mathtt{par}\;s_1\;s_2']\!],m',o' }$$
$$\frac{s_1,m,o \longrightarrow m',o'} { [\![\mathtt{par}\;s_1\;s_2]\!],m,o \longrightarrow s_2,m',o' }$$
$$\frac{s_2,m,o \longrightarrow m',o'} { [\![\mathtt{par}\;s_1\;s_2]\!],m,o \longrightarrow s_1,m',o' }$$
Exercise: Does this handle all interleavings we came up with in our classwork? If not, how do we fix things?

Types

Now for one of biggest topics: types.

In OG Bella, variables could only hold numeric values. “Boolean” expressions like x>=y just evaluate to either 0 or 1, so boolean values did not exist. 0 was “falsy” and 1 was “truthy.” Expressions never produced anything but numbers! This is far from a modern programming language.

Strong Typing

Let’s extend Bella to be much more expressive: we’ll allow expressions to not only evaluate to numbers or booleans, but functions as well. And we’ll add lists. But with added power comes some responsibilities: we should do typechecking so we don’t multiply functions, call numbers, perform a not operation on a number, or other such dreadful things. We want to make it so that any misuse of expressions—that is the use of an expression of one type in a context that requires a different type—generate an error. We want strong typing.

Before we show the semantics, let’s look at lists. They should look like this in Bella programs:

a = [10, 20, 30];
b = [false, true, true, false];
if (b[0) {
  print(a[2]);
}

which in the abstract syntax will admit two new forms:

$e: \textsf{Exp} = \ldots \mid e \texttt{[} e \texttt{]} \mid \texttt{[} e^* \texttt{]}$

So now for the semantics. Expression evaluation in OG Bella always produced a number. Now it will produce a more general value:

$\Longrightarrow_E \; \subseteq \; (\mathsf{Exp} \times \mathsf{Mem}) \times \mathsf{Value}$

Now what are values? There is a value for each kind of type (respectively, number, boolean, list, user-defined function, intrinsic function). So we’re gonna say:

$\begin{array}{l} \mathsf{Value}\;=_{def} \\ \;\;\;\; \mathsf{Real} \;\cup \\ \;\;\;\; \mathbb{B} \;\cup \\ \;\;\;\; \mathsf{Value}^* \;\cup \\ \;\;\;\; (\mathsf{Identifier}^* \times \mathsf{Exp}) \;\cup \\ \;\;\;\; ((\mathsf{Value}^* \rightarrow \mathsf{Value}) \times \mathbb{N}) \end{array}$
Hold up, that is NOT a set!

That’s not a proper definition of a set called $\mathsf{Value}$, is it? How can a set $A$ contain functions of type $A \rightarrow A$? The latter has a higher cardinality than the former! Informally, this is a type definition, not a set definition. It’s a way of saying, “a value is one of these things.” We’ll worry about why this works later; for now, let’s go with it.

Now we need a memory to store the entities bound to each identifier:

$ \mathsf{Mem}\;=_{def} \mathsf{Identifier} \rightharpoonup (\mathsf{Value} \times \{\mathsf{RO},\mathsf{RW}\}) $

Here’s the semantics:

$$\frac{}{ [\![n]\!],m \Longrightarrow n}$$
$$\frac{}{ [\![\mathtt{true}]\!],m \Longrightarrow T}$$
$$\frac{}{ [\![\mathtt{false}]\!],m \Longrightarrow F}$$
$$\frac{m(i) = (x, \_)} {[\![i]\!],m \Longrightarrow x}$$
$$\frac{ e,m \Longrightarrow x \;\;\; x \in \mathsf{Real}} {[\![\mathsf{-}\;e]\!],m \Longrightarrow -x}$$
$$\frac{ e,m \Longrightarrow x \;\;\; x \in \mathbb{B}} {[\![\mathsf{!}\;e]\!],m \Longrightarrow \neg x}$$
$$\frac{( e_i,m \Longrightarrow x_i)_{i=1}^n} {[\![\: [e_1,\ldots,e_n] \:]\!],m \Longrightarrow (x_1,\ldots,x_n)}$$
$$\frac{\begin{gathered} e_1,m \Longrightarrow x \;\;\; e_2,m \Longrightarrow y \\ x \in \mathsf{Value}^* \;\;\; y \in \mathbb{N} \;\;\;\; 0 \leq y < |x| \end{gathered} } {[\![\: e_1[e_2]\: ]\!],m \Longrightarrow x\downarrow y}$$
$$\frac{\begin{gathered} op \in \{ \mathsf{+}, \mathsf{-}, \mathsf{*}, \mathsf{/}, \mathsf{\%}, \mathtt{**}\} \;\;\; op \neq \mathtt{/} \vee y \neq 0 \\ e_1,m \Longrightarrow x \;\;\; e_2,m \Longrightarrow y \;\;\; x,y \in \mathsf{Real} \end{gathered}} {[\![e_1\;op\;e_2]\!],m \Longrightarrow op(x,y)}$$
$$\frac{\begin{gathered} op \in \{\mathtt{<}, \mathtt{<=}, \mathtt{==}, \mathtt{!=}, \mathtt{>=}, \mathtt{>} \} \\ e_1,m \Longrightarrow x \;\;\; e_2,m \Longrightarrow y \;\;\; x,y \in \mathsf{Real} \end{gathered}} {[\![e_1\;op\;e_2]\!],m \Longrightarrow op(x,y)}$$
$$\frac{ e_1,m \Longrightarrow F} {[\![e_1\;\mathtt{\&\&}\;e_2]\!],m \Longrightarrow F}$$
$$\frac{ e_1,m \Longrightarrow T \;\;\; e_2,m \Longrightarrow y} {[\![e_1\;\mathtt{\&\&}\;e_2]\!],m \Longrightarrow y}$$
$$\frac{ e_1,m \Longrightarrow T} {[\![e_1\;\mathtt{|\,|}\;e_2]\!],m \Longrightarrow T}$$
$$\frac{ e_1,m \Longrightarrow F \;\;\; e_2,m \Longrightarrow y} {[\![e_1\;\mathtt{|\,|}\;e_2]\!],m \Longrightarrow y}$$
$$\frac{ e,m \Longrightarrow T \;\;\; e_1,m \Longrightarrow y} {[\![e \; \mathtt{?} \; e_1 \; \mathtt{:} \; e_2]\!],m \Longrightarrow y}$$
$$\frac{ e,m \Longrightarrow F \;\;\; e_2,m \Longrightarrow z} {[\![e \; \mathtt{?} \; e_1 \; \mathtt{:} \; e_2]\!],m \Longrightarrow z}$$
$$\frac{( e_i,m \Longrightarrow a_i)_{i=1}^n \;\;\; m(i) = (((p_1,\ldots, p_n),e'), \_) \;\;\; e', m[a_i/p_i]_{i=1}^n \Longrightarrow x} { [\![i\;e_1,\ldots,e_n]\!],m \Longrightarrow x}$$
$$\frac{( e_i,m \Longrightarrow a_i)_{i=1}^n \;\;\; m(i) = ((f, n), \_)} { [\![i\;e_1,\ldots,e_n]\!],m \Longrightarrow f(a_1,\ldots,a_n)}$$
$$\frac{ e,m \Longrightarrow x \;\;\;\; m(i) = \bot} { [\![\mathtt{let}\;i=e]\!],m,o \Longrightarrow (m[(x,\mathsf{RW})\,/\,i], o)}$$
$$\frac{m(i) = \bot} { [\![\mathtt{fun}\;i\;p=e]\!],m,o \Longrightarrow (m[((p,e), \mathsf{RO})\,/\,i], o)}$$
$$\frac{ e,m \Longrightarrow x' \;\;\; m(i) = (x,\mathsf{RW})} { [\![i=e]\!],m,o \Longrightarrow (m[(x',\mathsf{RW})\,/\,i], o)}$$
$$\frac{ e,m \Longrightarrow x} { [\![\mathtt{print}\;e]\!],m,o \Longrightarrow (m, o\!\cdot\!(x))}$$
$$\frac{ e,m \Longrightarrow F} { [\![\mathtt{while}\;e\;b]\!],m,o \Longrightarrow (m, o)}$$
$$\frac{\begin{gathered} e,m \Longrightarrow T \;\;\; b,m,o \Longrightarrow (m',o') \\ [\![\mathtt{while}\;e\;b]\!],m',o' \Longrightarrow (m'',o'')\end{gathered}} { [\![\mathtt{while}\;e\;b]\!],m,o \Longrightarrow (m'',o'')}$$
$$\frac{( s_i, m_i, o_i \Longrightarrow (m_{i+1},o_{i+1}))_{i=1}^n} { [\![\mathtt{block}\;s_1,\ldots,s_n]\!],m_1,o_1 \Longrightarrow o_{n+1}}$$
$$\frac{ b, m_0, () \Longrightarrow (m,o)} {[\![\mathtt{program}\;b]\!] \Longrightarrow o}$$
Exercise: Study the rules.

Static Types

The semantics above strongly suggested type checking was done at run time. Now suppose we wanted to extend that extension of Bella so that all type checking could be done at compile time. Many (most?) programmers love languages that check types during compilation since it gives them a bit more confidence about the program’s correctness. Others may not like the lack of flexibility. Whatever your preference may be, static typing is a thing we should explore.

Generally, with static types you will add new keywords or identifiers to your language for types, like number, boolean. But in some cases, you can make a language powerful enough to do complete type inference without ever mentioning types, ever. Can we do this for statically-typed Bella?

Anyway, the approach is to write the semantics in two parts.

The static semantics will show what each construct “means” relative to a context. A context maps identifiers to things we know about them at compile time, rather than run time. We don’t care about values, then, but we do care about types and access (read-only or read-write). A context is then:

$ \begin{array}{l} \mathsf{Context} =_{def} \mathsf{Ide} \rightharpoonup \\ \;\;\;\; (\mathsf{Type} \times \{\mathsf{RO},\mathsf{RW}\}) \\ \\ \mathsf{Type} =_{def} \\ \;\;\;\; \{ \mathsf{num} \} \;\cup \\ \;\;\;\; \{ \mathsf{bool} \} \;\cup \\ \;\;\;\; (\{ \mathsf{list} \} \times \mathsf{Type}) \;\cup \\ \;\;\;\; (\{ \mathsf{stdfun} \} \times \mathsf{Type^*} \times \mathsf{Type}) \;\cup \\ \;\;\;\; (\{ \mathsf{userfun} \} \times (\mathsf{Ide} \times \mathsf{Type})^* \times \mathsf{Exp} \times \mathsf{Type}) \end{array} $

When analyzing an expression in a context, we want to generate a type for it. When analyzing statements (which may include declarations, we will produce a new context. Programs simply are ok or not ok. Formally the relation types are:

TODO: define and explain the relations

A possible static semantics is:

$$\frac{}{ [\![n]\!],c \Longrightarrow \mathsf{num}}$$
$$\frac{}{ [\![\mathtt{true}]\!],c \Longrightarrow \mathsf{bool}}$$
$$\frac{}{ [\![\mathtt{false}]\!],c \Longrightarrow \mathsf{bool}}$$
$$\frac{c(i) = (t, \_) \quad t \in \{ \mathsf{num}, \mathsf{bool}, (\mathsf{list},\_) \}} {[\![i]\!],c \Longrightarrow t}$$
$$\frac{ e,c \Longrightarrow \mathsf{num}} {[\![\mathsf{-}\;e]\!],c \Longrightarrow \mathsf{num}}$$
$$\frac{ e,c \Longrightarrow \mathsf{bool}} {[\![\mathsf{!}\;e]\!],c \Longrightarrow \mathsf{bool}}$$
$$\frac{( e_i,c \Longrightarrow t)_{i=1}^n \;\;\; n \geq 1} {[\![\: [e_1,\ldots,e_n] \:]\!],c \Longrightarrow (\mathsf{list},t)}$$
$$\frac{ e_1,c \Longrightarrow (\mathsf{list}, t) \;\;\; e_2,c \Longrightarrow \mathsf{num}} {[\![\: e_1[e_2]\: ]\!],c \Longrightarrow t}$$
$$\frac{\begin{gathered} op \in \{ \mathsf{+}, \mathsf{-}, \mathsf{*}, \mathsf{/}, \mathsf{\%}, \mathtt{**}\}\\ e_1,c \Longrightarrow \mathsf{num} \;\;\; e_2,c \Longrightarrow \mathsf{num} \end{gathered}} {[\![e_1\;op\;e_2]\!],c \Longrightarrow \mathsf{num}}$$
$$\frac{\begin{gathered} op \in \{\mathtt{<}, \mathtt{<=}, \mathtt{==}, \mathtt{!=}, \mathtt{>=}, \mathtt{>} \} \\ e_1,c \Longrightarrow \mathsf{num} \;\;\; e_2,c \Longrightarrow \mathsf{num}\end{gathered}} {[\![e_1\;op\;e_2]\!],c \Longrightarrow \mathsf{bool}}$$
$$\frac{ e_1,c \Longrightarrow \mathsf{bool} \;\;\; e_2,c \Longrightarrow \mathsf{bool}} {[\![e_1\;\mathtt{\&\&}\;e_2]\!],c \Longrightarrow \mathsf{bool}}$$
$$\frac{ e_1,c \Longrightarrow \mathsf{bool} \;\;\; e_2,c \Longrightarrow \mathsf{bool}} {[\![e_1\;\mathtt{|\,|}\;e_2]\!],c \Longrightarrow \mathsf{bool}}$$
$$\frac{ e,c \Longrightarrow \mathsf{bool} \;\;\; e_1,c \Longrightarrow t \;\;\; e_2,c \Longrightarrow t} {[\![e \; \mathtt{?} \; e_1 \; \mathtt{:} \; e_2]\!],c \Longrightarrow t}$$
$$\frac{( e_i,c \Longrightarrow t_i)_{i=1}^n \;\;\; c(i) = (\mathsf{userfun}, ((p_1,t_1),\ldots,(p_n,t_n)),e',t') \;\;\; e', c[(t_i,\mathsf{RO})/p_i]_{i=1}^n \Longrightarrow t'} { [\![i\;e_1,\ldots,e_n]\!],c \Longrightarrow t'}$$
$$\frac{( e_i,c \Longrightarrow t_i)_{i=1}^n \;\;\; c(i) = (\mathsf{stdfun}, (t_1, \ldots, t_n), t')} { [\![i\;e_1,\ldots,e_n]\!],c \Longrightarrow t'}$$
$$\frac{ e,c \Longrightarrow t \;\;\;\; c(i) = \bot} { [\![\mathtt{let}\;i=e]\!],c \Longrightarrow c[(t,\mathsf{RW})\,/\,i]}$$
$$\frac{c(i) = \bot} { [\![\mathtt{fun}\;i\;p\;t=e]\!],c \Longrightarrow c[(\mathsf{userfun},(p,e,t), \mathsf{RO})\,/\,i]}$$
$$\frac{ e,c \Longrightarrow t \;\;\; c(i) = (t,\mathsf{RW})} { [\![i=e]\!],c \Longrightarrow c}$$
$$\frac{ e,c \Longrightarrow t} { [\![\mathtt{print}\;e]\!],c \Longrightarrow c}$$
$$\frac{ e,c \Longrightarrow \mathsf{bool} \;\;\; b,c \Longrightarrow c'} { [\![\mathtt{while}\;e\;b]\!],c \Longrightarrow c'}$$
$$\frac{( s_i,c_i \Longrightarrow c_{i+1})_{i=1}^n} { [\![\mathtt{block}\;s_1,\ldots,s_n]\!],c_1 \Longrightarrow c_{n+1}}$$
$$\frac{ b, c_0 \Longrightarrow c} {[\![\mathtt{program}\;b]\!] \Longrightarrow \mathsf{OK}}$$

A nice benefit of doing the static semantics only is that we can now write the dynamic semantics under the assumption that the program has already been type checked. This means our memory needs to only store values (since the read-only and number-of-parameter settings has already been handled in the static semantics. We’ve done that “break a large problem down into smaller ones” thing! Here is the Bella dynamic-semantics-only:

$$\frac{}{ [\![n]\!],m \Longrightarrow n}$$
$$\frac{}{ [\![\mathtt{true}]\!],m \Longrightarrow T}$$
$$\frac{}{ [\![\mathtt{false}]\!],m \Longrightarrow F}$$
$$\frac{m(i) = x} {[\![i]\!],m \Longrightarrow x}$$
$$\frac{ e,m \Longrightarrow x} {[\![\mathsf{-}\;e]\!],m \Longrightarrow -x}$$
$$\frac{ e,m \Longrightarrow x} {[\![\mathsf{!}\;e]\!],m \Longrightarrow \neg x}$$
$$\frac{( e_i,m \Longrightarrow x_i)_{i=1}^n} {[\![\: [e_1,\ldots,e_n] \:]\!],m \Longrightarrow (x_1,\ldots,x_n)}$$
$$\frac{\begin{gathered} e_1,m \Longrightarrow x \;\;\; e_2,m \Longrightarrow y \\ 0 \leq y < |x| \end{gathered} } {[\![\: e_1[e_2]\: ]\!],m \Longrightarrow x\downarrow y}$$
$$\frac{\begin{gathered} op \in \{ \mathsf{+}, \mathsf{-}, \mathsf{*}, \mathsf{/}, \mathsf{\%}, \mathtt{**}\} \;\;\; op \neq \mathtt{/} \vee y \neq 0 \\ e_1,m \Longrightarrow x \;\;\; e_2,m \Longrightarrow y \end{gathered}} {[\![e_1\;op\;e_2]\!],m \Longrightarrow op(x,y)}$$
$$\frac{\begin{gathered} op \in \{\mathtt{<}, \mathtt{<=}, \mathtt{==}, \mathtt{!=}, \mathtt{>=}, \mathtt{>} \} \\ e_1,m \Longrightarrow x \;\;\; e_2,m \Longrightarrow y \end{gathered}} {[\![e_1\;op\;e_2]\!],m \Longrightarrow op(x,y)}$$
$$\frac{ e_1,m \Longrightarrow F} {[\![e_1\;\mathtt{\&\&}\;e_2]\!],m \Longrightarrow F}$$
$$\frac{ e_1,m \Longrightarrow T \;\;\; e_2,m \Longrightarrow y} {[\![e_1\;\mathtt{\&\&}\;e_2]\!],m \Longrightarrow y}$$
$$\frac{ e_1,m \Longrightarrow T} {[\![e_1\;\mathtt{|\,|}\;e_2]\!],m \Longrightarrow T}$$
$$\frac{ e_1,m \Longrightarrow F \;\;\; e_2,m \Longrightarrow y} {[\![e_1\;\mathtt{|\,|}\;e_2]\!],m \Longrightarrow y}$$
$$\frac{ e,m \Longrightarrow T \;\;\; e_1,m \Longrightarrow y} {[\![e \; \mathtt{?} \; e_1 \; \mathtt{:} \; e_2]\!],m \Longrightarrow y}$$
$$\frac{ e,m \Longrightarrow F \;\;\; e_2,m \Longrightarrow z} {[\![e \; \mathtt{?} \; e_1 \; \mathtt{:} \; e_2]\!],m \Longrightarrow z}$$
$$\frac{( e_i,m \Longrightarrow a_i)_{i=1}^n \;\;\; m(i) = ((p_1,\ldots, p_n),e') \;\;\; e', m[a_i/p_i]_{i=1}^n \Longrightarrow x} { [\![i\;e_1,\ldots,e_n]\!],m \Longrightarrow x}$$
$$\frac{( e_i,m \Longrightarrow a_i)_{i=1}^n} { [\![i\;e_1,\ldots,e_n]\!],m \Longrightarrow m(i)(a_1,\ldots,a_n)}$$
$$\frac{ e,m \Longrightarrow x} { [\![\mathtt{let}\;i=e]\!],m,o \Longrightarrow (m[x\,/\,i], o)}$$
$$\frac{} { [\![\mathtt{fun}\;i\;p=e]\!],m,o \Longrightarrow (m[(p,e)\,/\,i], o)}$$
$$\frac{ e,m \Longrightarrow x} { [\![i=e]\!],m,o \Longrightarrow (m[x\,/\,i], o)}$$
$$\frac{ e,m \Longrightarrow x} { [\![\mathtt{print}\;e]\!],m,o \Longrightarrow (m, o\!\cdot\!(x))}$$
$$\frac{ e,m \Longrightarrow F} { [\![\mathtt{while}\;e\;b]\!],m,o \Longrightarrow (m, o)}$$
$$\frac{\begin{gathered} e,m \Longrightarrow T \;\;\; b,m,o \Longrightarrow (m',o') \\ [\![\mathtt{while}\;e\;b]\!],m',o' \Longrightarrow (m'',o'')\end{gathered}} { [\![\mathtt{while}\;e\;b]\!],m,o \Longrightarrow (m'',o'')}$$
$$\frac{( s_i, m_i, o_i \Longrightarrow (m_{i+1},o_{i+1}))_{i=1}^n} { [\![\mathtt{block}\;s_1,\ldots,s_n]\!],m_1,o_1 \Longrightarrow o_{n+1}}$$
$$\frac{ b, m_0, () \Longrightarrow (m,o)} {[\![\mathtt{program}\;b]\!] \Longrightarrow o}$$
Exercise: Think about this two-part semantics. Compare it to the previous “all-in-one” semantics. What are the pros and cons? Which do you like better, at least for Bella? What might you like better if we were giving a semantics of C++, Rust, or JavaScript?

Learning More

Here are some good papers and other references:

Recall Practice

Here are some questions useful for your spaced repetition learning. Many of the answers are not found on this page. Some will have popped up in lecture. Others will require you to do your own research.

  1. What is the difference between structural operational semantics and natural semantics?
    Structural operational semantics is a small-step approach, showing each individual step of a computation; natural semantics is a big-step approach, showing the effect of executing whole constructs without regard to intermediate computational states.

Summary

We’ve covered:

  • A motivating example
  • What concrete operational semantics is
  • What structural operational semantics is
  • What natural semantics is
  • Differences between SOS and NS
  • Astro
  • Bella
  • More language features
  • Types
  • Where to learn more