We’ll start very simply, with a language that just has integers, addition and subtraction, and print statements. Programs look like this:
```
print(8 - (3 + 21) + 34);
print(55 - 3 + 2);
```
The concrete syntax can be given by this grammar:
$\begin{array}{lcl} \textit{Program} & \longleftarrow & \textit{Stmt}+ \\ \textit{Stmt} & \longleftarrow & \texttt{"print"}\;\texttt{"("}\;\textit{Exp}\;\texttt{")"}\;\texttt{";"} \\ \textit{Exp} & \longleftarrow & (\textit{Exp}\;(\texttt{"+"}\;|\;\texttt{"-"}))?\;\textit{Term} \\ \textit{Term} & \longleftarrow & \textit{num} \;|\; \texttt{"("}\;\textit{Exp}\;\texttt{")"} \\ \textit{num} & \longleftarrow & \textit{digit}+ \end{array}$
When defining a semantics we don’t care about concrete syntax, only abstract syntax. So here is the abstract syntax of this tiny language, which is all we care about from here on:
$\begin{array}{lcl} n: \textsf{Nml} & & \\ e: \textsf{Exp} & = & n \mid e + e \mid e - e \\ s: \textsf{Stm} & = & \mathtt{print}\;e \\ p: \textsf{Pro} & = & s+ \end{array}$
Need Practice?
For the example program in this section, give both the parse tree and the abstract syntax tree.
A concrete operational semantics is given to a language by mapping language constructs to operations on a virtual machine—a machine whose operation is “obvious.”
Stack machines are pretty well understood! Let’s make up a stack machine. The machine has a stack for execution and allows you to store integers in named cells. The instructions are:
Instruction | Description |
---|---|
PUSH $n$ | Push the value of the literal $n$ |
ADD | Pop twice then push first popped value plus second popped value |
SUB | Pop twice then push the second popped value minus the first popped value |
PRINT | Print contents of top of stack to standard output then pop |
When doing a concrete operational semantics, we just show the instruction sequence for each abstract syntactic form. That way, every program in our language will be translated to a unique instruction sequence. We assume this machine is so simple that the meaning of instruction sequences is well understood. Therefore, showing how to translate programs into machine instruction sequences suffices to define the semantics of the language. We will define the translation function, $T$, by cases on the abstract syntax forms:
The cool brackets $[\![$ and $]\!]$ distinguish syntactic forms, such as numerals and plus-expressions, from regular mathematical values, such as numbers.
Let’s see how it works for a trivial program:
$\begin{array}{l} T[\![\mathtt{program}\;\mathtt{print}\;8 + 13]\!] \\ \;\;\;\; = T[\![\mathtt{print}\;8 + 13]\!] \\ \;\;\;\; = T[\![8 + 13]\!] \;\boldsymbol{\cdot}\; \mathsf{PRINT} \\ \;\;\;\; = T[\![8]\!] \;\boldsymbol{\cdot}\; T[\![13]\!] \;\boldsymbol{\cdot}\; \mathsf{ADD} \;\boldsymbol{\cdot}\; \mathsf{PRINT} \\ \;\;\;\; = \mathsf{PUSH}\;8 \;\boldsymbol{\cdot}\; T[\![13]\!] \;\boldsymbol{\cdot}\; \mathsf{ADD} \;\boldsymbol{\cdot}\; \mathsf{PRINT} \\ \;\;\;\; = \mathsf{PUSH}\;8 \;\boldsymbol{\cdot}\; \mathsf{PUSH}\;13 \;\boldsymbol{\cdot}\; \mathsf{ADD} \;\boldsymbol{\cdot}\; \mathsf{PRINT} \\ \;\;\;\; = (\mathsf{PUSH}\;8, \mathsf{PUSH}\;13) \;\boldsymbol{\cdot}\; \mathsf{ADD} \;\boldsymbol{\cdot}\; \mathsf{PRINT} \\ \;\;\;\; = (\mathsf{PUSH}\;8, \mathsf{PUSH}\;13, \mathsf{ADD}) \;\boldsymbol{\cdot}\; \mathsf{PRINT} \\ \;\;\;\; = (\mathsf{PUSH}\;8, \mathsf{PUSH}\;13, \mathsf{ADD}, \mathsf{PRINT}) \end{array}$
The execution of this program on the “machine” unfolds as follows:
```
        // stack is (), output is ()
PUSH 8  // stack is [8], output is ()
PUSH 13 // stack is [8, 13], output is ()
ADD     // stack is [21], output is ()
PRINT   // stack is (), output is [21]
```
So the meaning of our program is the one element integer list $[21]$.
The translation of the example program from the beginning of this section is:
```
// print(8 - (3 + 21) + 34);
// print(55 - 3 + 2);
PUSH 8
PUSH 3
PUSH 21
ADD
SUB
PUSH 34
ADD
PRINT
PUSH 55
PUSH 3
SUB
PUSH 2
ADD
PRINT
```
Execute the program on the virtual machine. The output should be $[18, 54]$.
Rather than specifying a machine with its own instruction set, we can simply define rules that show how a computation proceeds without the separate machine. Yep, we are going to abstract away the abstract machine!
For the tiny language, evaluating expressions eventually produces a number, executing statements keeps appending to “standard output,” and the execution of a program involves executing its statements, starting with the empty output to eventually produce its final output.
Here is the basic idea:
In our example language:
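The rules themselves were developed live, but as a sketch, small-step rules for this language typically look like the following, using configurations that pair the remaining program with the output so far (the notation here is our reconstruction and may differ from the live version):

$\dfrac{}{n_1 + n_2 \longrightarrow n}\;(n = n_1 + n_2) \qquad \dfrac{e_1 \longrightarrow e_1'}{e_1 + e_2 \longrightarrow e_1' + e_2} \qquad \dfrac{e \longrightarrow e'}{n + e \longrightarrow n + e'}$

$\dfrac{e \longrightarrow e'}{\langle \mathtt{print}\;e;\,p,\;o\rangle \longrightarrow \langle \mathtt{print}\;e';\,p,\;o\rangle} \qquad \dfrac{}{\langle \mathtt{print}\;n;\,p,\;o\rangle \longrightarrow \langle p,\;o \cdot n\rangle}$

The rules for subtraction are analogous.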
This is best developed live, in a video or on a whiteboard.
To apply the semantics, we find the $o$ that the “starting configuration” $ [\![\mathtt{program}\;s_1 \cdots s_n]\!], ()$ will end up in. We can show the computation with a proof tree:
How much work was that? That was a lot of work for a tiny language. Did you know there is another way to do this stuff, one that does not involve nitty-gritty computation details? There is. It is called natural semantics. Coming up... right now.
The semantics we’ve just seen, structural operational semantics, is often called small-step operational semantics. There’s another approach, known as, you guessed it, big-step operational semantics, also called natural semantics. Where little steps in SOS are given with the relation $\longrightarrow$, natural semantics uses $\Longrightarrow$ or $\Downarrow$ to signify that a construct is completely evaluated or executed to its final configuration.
A motivation for, and details of, natural semantics, can be found in any good textbook on semantics. But we can learn it by just jumping right in. Here is the natural semantics for the tiny language we’ve been working with:
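As a sketch of what the big-step rules typically look like for this language (our reconstruction; the notation in the actual rules may differ):

$\dfrac{}{n \Downarrow n} \qquad \dfrac{e_1 \Downarrow n_1 \quad e_2 \Downarrow n_2}{e_1 + e_2 \Downarrow n_1 + n_2} \qquad \dfrac{e_1 \Downarrow n_1 \quad e_2 \Downarrow n_2}{e_1 - e_2 \Downarrow n_1 - n_2}$

$\dfrac{e \Downarrow n}{\langle \mathtt{print}\;e,\;o\rangle \Downarrow o \cdot n} \qquad \dfrac{\langle s_1,\;o_0\rangle \Downarrow o_1 \quad \cdots \quad \langle s_k,\;o_{k-1}\rangle \Downarrow o_k}{\langle s_1 \cdots s_k,\;o_0\rangle \Downarrow o_k}$

Notice how each construct goes straight to its final result, with no intermediate configurations.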
We’ll derive this in class. Did you find it easier going?
Structural Operational Semantics vs. Natural Semantics
It might take time to really internalize the difference. Here are two things that may help. First, a summary table:

| SOS | Natural |
|---|---|
| Breaks computations down into individual “steps” | Concerned only with the final result of evaluating a construct |
| Aka “small-step”, “transitional”, “reduction” | Aka “big-step”, “relational”, “evaluation” |
| Uses $\rightarrow$ | Uses $\Rightarrow$ or $\Downarrow$ |
| Transitions between configurations, e.g., $e,m \rightarrow e',m'$ | Complete evaluation of constructs, e.g., $e,m \Rightarrow x$ |
| Gordon Plotkin, 1981 | Gilles Kahn, 1987 |

Second, here is a fantastic and detailed slide deck from a University of Illinois class that does an excellent job of describing both methods.
Let’s scale up just a little bit. Remember Astro? Here is the abstract syntax:
Several new features appear in Astro that were not in the trivial language:

- Variables, with assignment statements
- More operators: negation, multiplication, division, remainder, and power
- Intrinsic constants (`π`, read-only) and functions (`sin`, `cos`, `sqrt`, and `hypot`)

We are going to look at the three styles of operational semantics for Astro.
We’ll have to scale up the machine a bit to support Astro:
Instruction | Description |
---|---|
PUSH $n$ | Push the value of the literal $n$ |
LOAD $v$ | Push the value in memory location $v$, crashing if $v$ is not in memory |
STORE $v$ | Pop into memory location $v$ |
NEG | Pop then push the negation of the popped value |
SIN | Pop then push the sine of the popped value |
COS | Pop then push the cosine of the popped value |
SQRT | Pop then push the square root of the popped value |
ADD | Pop twice then push first popped value plus second popped value |
SUB | Pop twice then push the second popped value minus the first popped value |
MUL | Pop twice then push first popped value times second popped value |
DIV | Pop twice then push the second popped value divided by the first popped value, crashing if the first popped value is 0 |
REM | Pop twice then push the remainder of the second popped value divided by the first popped value |
POW | Pop twice then push the second popped value raised to the power of the first popped value |
HYPOT | Pop twice then push the hypotenuse of the two popped values |
PRINT | Print contents of top of stack to standard output then pop |
For example, the Astro program:
```
x = 9 * 35 ** π;
print(3 / hypot(1-cos(x), 13));
```
is:
```
PUSH 9
PUSH 35
PUSH π
POW
MUL
STORE x
PUSH 3
PUSH 1
LOAD x
COS
SUB
PUSH 13
HYPOT
DIV
PRINT
```
For a concrete semantics, we don’t actually interpret the program; we compile it down to the little stack language. How can we specify this translation function? One way is to just write it in Python:
```python
import math

def check(condition, message):
    if not condition:
        raise ValueError(message)

initial_env = {
    "π": {"kind": "CONST", "value": math.pi},
    "sqrt": {"kind": "FUN", "arity": 1},
    "sin": {"kind": "FUN", "arity": 1},
    "cos": {"kind": "FUN", "arity": 1},
    "hypot": {"kind": "FUN", "arity": 2}}

def t(node, env):
    match node:
        case int(n) | float(n):
            return [f"PUSH {n}"]
        case str(x):
            check(x in env, f"Uninitialized {x}")
            check((kind := env[x]["kind"]) in ("CONST", "VAR"), "Not a variable")
            return [f"LOAD {x}" if kind == "VAR" else f"PUSH {env[x]['value']}"]
        case ("neg", e):
            return t(e, env) + ["NEG"]
        case ("add", e1, e2):
            return t(e1, env) + t(e2, env) + ["ADD"]
        case ("sub", e1, e2):
            return t(e1, env) + t(e2, env) + ["SUB"]
        case ("mul", e1, e2):
            return t(e1, env) + t(e2, env) + ["MUL"]
        case ("div", e1, e2):
            return t(e1, env) + t(e2, env) + ["DIV"]
        case ("rem", e1, e2):
            return t(e1, env) + t(e2, env) + ["REM"]
        case ("pow", e1, e2):
            return t(e1, env) + t(e2, env) + ["POW"]
        case ("call", f, args):
            check(f in env, f"Unknown {f}")
            check(env[f]["kind"] == "FUN", f"{f} is not a function")
            check(len(args) == env[f]["arity"], f"Wrong number of args for {f}")
            return [i for arg in args for i in t(arg, env)] + [f.upper()]
        case ("assign", i, e):
            check(i not in env or env[i]["kind"] == "VAR", f"{i} is not assignable")
            return t(e, env) + [f"STORE {i}"], {**env, i: {"kind": "VAR"}}
        case ("print", exp):
            return t(exp, env) + ["PRINT"], env
        case ("program", statements):
            code, e = [], env
            for s in statements:
                c, e = t(s, e)
                code += c
            return code
        case _:
            raise ValueError("Malformed program")

# Quick check: print the translation of a small program
program = ("program", [
    ("assign", "x", ("mul", 9, ("pow", 35, "π"))),
    ("print", ("div", 3, ("call", "hypot", [
        ("sub", 1, ("call", "cos", ["x"])), 13])))])
print('\n'.join(t(program, initial_env)))
```
Let’s run it:
```
$ python3 astro_concrete.py
PUSH 9
PUSH 35
PUSH 3.141592653589793
POW
MUL
STORE x
PUSH 3
PUSH 1
LOAD x
COS
SUB
PUSH 13
HYPOT
DIV
PRINT
```
Exactly what we expected.
We saw this in our notes on Semantics.
An exercise for you.
The programming language Bella has its own formal specification. It adds a bunch of very interesting features to Astro:

- Variable declarations (with the `let` keyword). All variables must be declared before being used.
- A `while` statement.

Here’s the abstract syntax, copied from the specification:
The natural semantics of Bella is fully explained in the specification, so head over there to continue your operational semantics study.
Try it!
It should be really fun to extend the virtual machine we’ve been playing with above to have new instructions that can handle Bella.
Rather than jumping into a new language, let’s go one feature at a time and think, “How might we extend Bella to have this feature?”
The ubiquitous if-statement might look like this in a Bella program:
```
if x >= 3 {
  print(y);
  y = sqrt(y);
} else {
  y = y ** 2;
}
```
and in the abstract syntax appears as:
$s: \textsf{Statement} = \ldots \mid \texttt{if}\;e\;b\;b$
The (natural semantics) rules should be straightforward:
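As a sketch, assuming statement configurations of the form $\langle s, m, o\rangle$ relating a memory and output to a final memory and output (the exact judgment forms may differ from those used in class), with 0 falsy and any other number truthy:

$\dfrac{\langle e, m\rangle \Rightarrow x \quad x \neq 0 \quad \langle b_1, m, o\rangle \Rightarrow \langle m', o'\rangle}{\langle \texttt{if}\;e\;b_1\;b_2,\;m,\;o\rangle \Rightarrow \langle m', o'\rangle} \qquad \dfrac{\langle e, m\rangle \Rightarrow 0 \quad \langle b_2, m, o\rangle \Rightarrow \langle m', o'\rangle}{\langle \texttt{if}\;e\;b_1\;b_2,\;m,\;o\rangle \Rightarrow \langle m', o'\rangle}$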
For SOS we need to compute in smaller steps, bringing our tests down to a single number:
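A sketch of such small-step rules, assuming configurations $\langle s, m, o\rangle$ and expression steps $\langle e, m\rangle \longrightarrow \langle e', m\rangle$ (the notation is our reconstruction): first the test is reduced to a single numeral, then a branch is selected:

$\dfrac{\langle e, m\rangle \longrightarrow \langle e', m\rangle}{\langle \texttt{if}\;e\;b_1\;b_2,\;m,\;o\rangle \longrightarrow \langle \texttt{if}\;e'\;b_1\;b_2,\;m,\;o\rangle} \qquad \dfrac{n \neq 0}{\langle \texttt{if}\;n\;b_1\;b_2,\;m,\;o\rangle \longrightarrow \langle b_1,\;m,\;o\rangle} \qquad \dfrac{}{\langle \texttt{if}\;0\;b_1\;b_2,\;m,\;o\rangle \longrightarrow \langle b_2,\;m,\;o\rangle}$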
Many languages have what they call parallel assignment, which might look like:
x, y = e1, e2
with the following abstract syntax:
$s: \textsf{Statement} = \ldots \mid i, i = e, e$
We can add the following rules to the natural semantics for Bella (in which identifiers must be bound before they can appear anywhere, including assignment statements):
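The crux is that both expressions are evaluated in the old memory before either identifier is updated. A sketch of such a rule, assuming statement configurations of the form $\langle s, m, o\rangle$ (the judgment forms in the actual specification may differ):

$\dfrac{i_1 \in \mathrm{dom}\,m \quad i_2 \in \mathrm{dom}\,m \quad \langle e_1, m\rangle \Rightarrow x_1 \quad \langle e_2, m\rangle \Rightarrow x_2}{\langle i_1, i_2 = e_1, e_2,\;m,\;o\rangle \Rightarrow \langle m[i_1 \mapsto x_1][i_2 \mapsto x_2],\;o\rangle}$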
Swapping is similar to parallel assignment, but (in most languages) requires that both identifiers already be bound to read-write variables. Of course, that kind of behavior can vary from language to language, but that’s what we will assume here. The concrete syntax might look like:
x, y = y, x
with the following abstract syntax:
$s: \textsf{Statement} = \ldots \mid \texttt{swap}\;i\;i$
Natural semantics:
And SOS:
What about something like this? You know, kinda like the while-statement except the body is executed at least once. And a truthy condition stops the loop.
```
repeat {
  print(y);
  y = sqrt(y);
} until y < 1;
```
Abstract syntax:
$s: \textsf{Statement} = \ldots \mid \texttt{repeat}\;b\;e$
Semantics:
More mainstream languages need this beauty:
repeat n*2 { print(0); }
Yep that’s right. Just execute a block a certain number of times, without having to introduce a stupid iterator variable. Abstract syntax:
$s: \textsf{Statement} = \ldots \mid \texttt{dotimes}\;e\;b$
For the semantics, we, as language designers, have to determine whether we want to evaluate the number of times to do the loop just once, or re-evaluate it each time. Let’s say we evaluate it only once, before we even think about doing the loop body:
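The evaluate-once choice is easy to see in code. Here is a hypothetical interpreter fragment (the name `exec_dotimes` is ours), where the count expression and loop body are given as Python callables over a memory dictionary:

```python
def exec_dotimes(count_expression, body, memory):
    # Our design choice: evaluate the count ONCE, before the loop begins.
    # Changes the body makes to variables mentioned in the count expression
    # have no effect on the number of iterations.
    n = int(count_expression(memory))
    for _ in range(n):
        body(memory)
    return memory

# The body increments n, but the loop still runs exactly 3 times,
# because the count was fixed before the first iteration.
memory = {"n": 3}
exec_dotimes(lambda m: m["n"],
             lambda m: m.update(n=m["n"] + 1),
             memory)
print(memory)  # → {'n': 6}
```

Had we chosen to re-evaluate the count each time around, this particular loop would never terminate, since the body keeps increasing `n`.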
Sometimes, your definite iteration will need an iterator variable. Here’s a common form that we can add to Bella:
for i = x to y*8 { print(i); y = y / 2; }
with abstract syntax:
$s: \textsf{Statement} = \ldots \mid \texttt{for}\;i\;e\;e\;b$
Again, as language designers, we have some choices to make:
Let’s choose to evaluate the bounds once at the beginning, and to make the loop variable local to the body and read-only. This gives us the semantic rules:
Many languages have a `select` statement, whose job is to randomly pick one of its arms to execute. The simplest form is to have exactly two arms:
```
select {
  print(0);
} or {
  y = y / 2;
  print(1);
}
```
with abstract syntax:
$s: \textsf{Statement} = \ldots \mid \texttt{select}\;b\;b$
The semantics are simple, but interesting. We have multiple rules that can fire for a given select statement. But that’s just fine. It’s what we want. The program has multiple possible executions. The semantics tells us all of them.
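One way to see that either rule can fire is a hypothetical interpreter fragment (names are ours) in which the choice of arm is delegated to a chooser function, random by default, so each run may take either arm:

```python
import random

def exec_select(arm1, arm2, choose=random.choice):
    # Either arm may run; the semantics permits both executions.
    chosen = choose([arm1, arm2])
    chosen()

results = []
exec_select(lambda: results.append(0),
            lambda: results.append(1),
            choose=lambda arms: arms[0])   # force the first arm
exec_select(lambda: results.append(0),
            lambda: results.append(1),
            choose=lambda arms: arms[1])   # force the second arm
print(results)  # → [0, 1]
```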
Concurrency is about things happening “at the same time.” The usual model is interleaved computation. For example:
```
x, y = 1, 2;
par {
  x = y + 5 * y;   // t0:=y, t1:=y, t2:=5*t1, t3:=t0+t2, x:=t3
  y = x + 8;       // t4:=x, t5:=t4+8, y:=t5
}
print x;
print y;
```
can have multiple results! One of them is $[12, 20]$. Another is $[54, 9]$. There are even more!
Determine all of the other possibilities.
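To check your answer, here is a sketch (all names are ours) that enumerates every interleaving of the two threads' atomic steps, using the temporaries shown in the comments above, and collects the possible final values of `x` and `y`:

```python
def interleavings(a, b):
    """Yield every merge of sequences a and b that preserves each one's order."""
    if not a:
        yield b
    elif not b:
        yield a
    else:
        for rest in interleavings(a[1:], b):
            yield [a[0]] + rest
        for rest in interleavings(a, b[1:]):
            yield [b[0]] + rest

# Atomic steps of each thread, as functions on a shared state dictionary.
thread1 = [lambda s: s.update(t0=s["y"]),
           lambda s: s.update(t1=s["y"]),
           lambda s: s.update(t2=5 * s["t1"]),
           lambda s: s.update(t3=s["t0"] + s["t2"]),
           lambda s: s.update(x=s["t3"])]
thread2 = [lambda s: s.update(t4=s["x"]),
           lambda s: s.update(t5=s["t4"] + 8),
           lambda s: s.update(y=s["t5"])]

outcomes = set()
for schedule in interleavings(thread1, thread2):
    state = {"x": 1, "y": 2}
    for step in schedule:
        step(state)
    outcomes.add((state["x"], state["y"]))

print(sorted(outcomes))
```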
Let’s add a new syntactic construct to execute two statements concurrently:
$s: \textsf{Statement} = \ldots \mid \texttt{par}\;s\;s$
Here, natural semantics is not sufficient; it can only say that we can execute the first statement and then the second, or the second and then the first. But an SOS can specify interleaved execution quite easily:
Now for one of the biggest topics: types.
In OG Bella, variables could only hold numeric values. “Boolean” expressions like `x >= y` just evaluated to either 0 or 1, so boolean values did not exist. 0 was “falsy” and 1 was “truthy.” Expressions never produced anything but numbers! This is far from a modern programming language.
Let’s extend Bella to be much more expressive: we’ll allow expressions to evaluate not only to numbers or booleans, but to functions as well. And we’ll add lists. But with added power comes some responsibility: we should do typechecking so we don’t multiply functions, call numbers, perform a `not` operation on a number, or do other such dreadful things. We want any misuse of an expression (that is, the use of an expression of one type in a context that requires a different type) to generate an error. We want strong typing.
Before we show the semantics, let’s look at lists. They should look like this in Bella programs:
```
a = [10, 20, 30];
b = [false, true, true, false];
if b[0] { print(a[2]); }
```
which in the abstract syntax will admit two new forms:
$e: \textsf{Exp} = \ldots \mid e \texttt{[} e \texttt{]} \mid \texttt{[} e^* \texttt{]}$
So now for the semantics. Expression evaluation in OG Bella always produced a number. Now it will produce a more general value:
Now what are values? There is a value for each kind of type (respectively, number, boolean, list, user-defined function, intrinsic function). So we’re gonna say:
Hold up, that is NOT a set! That’s not a proper definition of a set called $\mathsf{Value}$, is it? How can a set $A$ contain functions of type $A \rightarrow A$? The latter has a higher cardinality than the former! Informally, this is a type definition, not a set definition. It’s a way of saying, “a value is one of these things.” We’ll worry about why this works later; for now, let’s go with it.
Now we need a memory to store the entities bound to each identifier:
Here’s the semantics:
The semantics above strongly suggests that type checking is done at run time. Now suppose we wanted to extend this version of Bella so that all type checking could be done at compile time. Many (most?) programmers love languages that check types during compilation, since it gives them a bit more confidence about the program’s correctness. Others may not like the lack of flexibility. Whatever your preference may be, static typing is a thing we should explore.
Generally, with static types you will add new keywords or identifiers to your language for types, like `number` and `boolean`. But in some cases, you can make a language powerful enough to do complete type inference without ever mentioning types, ever. Can we do this for statically-typed Bella?
Anyway, the approach is to write the semantics in two parts.
The static semantics will show what each construct “means” relative to a context. A context maps identifiers to things we know about them at compile time, rather than run time. We don’t care about values, then, but we do care about types and access (read-only or read-write). A context is then:
When analyzing an expression in a context, we want to generate a type for it. When analyzing statements (which may include declarations), we will produce a new context. Programs are simply ok or not ok. Formally, the relation types are:
TODO: define and explain the relations
A possible static semantics is:
A nice benefit of doing the static semantics separately is that we can now write the dynamic semantics under the assumption that the program has already been type checked. This means our memory needs to store only values (since the read-only and number-of-parameters settings have already been handled in the static semantics). We’ve done that “break a large problem down into smaller ones” thing! Here is the dynamic-only semantics for Bella:
Here are some good papers and other references:
Here are some questions useful for your spaced repetition learning. Many of the answers are not found on this page. Some will have popped up in lecture. Others will require you to do your own research.
We’ve covered: