Semantics

What does it all mean?

Syntax vs. Semantics

The semantics of a language assigns a meaning to each utterance of the language. But where does this meaning come from? How is it assigned? It seems an inescapable fact that semantics is driven by the underlying structure of the utterances, not by their individual characters. It is the arrangement of the characters that gives a program its particular meaning. The arrangement. The syntax. Generally, semantics is given by mapping syntactic structures to their meanings.

Why Study Semantics?

I asked ChatGPT in late 2022.

Have there been any advances in the field of programming language semantics (denotational, operational, etc.) in the last 20 years?


There have been many advances in the field of programming language semantics in the last 20 years. Some examples include:

  1. The development of new programming languages that have different semantics, such as functional languages like Haskell and ML, and concurrent languages like Go and Rust.
  2. The development of new tools and techniques for verifying the correctness of programs, such as model checkers and automated theorem provers.
  3. The study of new semantic models, such as game semantics and big-step semantics, which can provide more insight into the behavior of programs.
  4. The application of techniques from category theory, such as monads and arrows, to the study of programming language semantics.
  5. The development of new approaches to type systems, such as dependent types and linear types, which can help ensure the safety and correctness of programs.
  6. The use of semantics to optimize the performance of programs, through techniques such as program analysis and transformation.

Overall, the field of programming language semantics has seen a great deal of activity and innovation in the last 20 years, and continues to be an active area of research and development.

Informal Semantic Descriptions

The traditional way to assign meanings to programs is with natural language. These descriptions, to be fair, try to be super precise. Let’s see an example. We’ll attempt an informal semantics of the language Astro we saw previously, when we covered Syntax. As a refresher, here’s the abstract syntax of the language:

$ \begin{array}{lcl} n: \mathsf{Nml} & & \\ i: \mathsf{Ide} & & \\ e: \mathsf{Exp} & = & n \mid i \mid -e \mid e+e \mid e-e \mid e\,\mathtt{*}\,e \mid e\,/\,e \mid e\,\mathtt{\%}\,e \mid e\,\mathtt{**}\,e \mid i\;e^*\\ s: \mathsf{Stm} & = & i = e \mid \mathtt{print}\;e\\ p: \mathsf{Pro} & = & \mathtt{program}\;s^+\\ \end{array}$

Let’s attempt a semantics in plain English (any other natural language would do just as well), trying to be precise:

Well, that was long-winded, and who knows if we even caught everything?

CLASSWORK
In fact, the informal semantics above are terrible! There is so much imprecision! To get started, what does it mean to be “bound”? What is “previously”? Let’s try to find more problems. There are quite a few!

And can you imagine doing this for a real language? It would take hundreds and hundreds of pages. Right? Right.

Exercise: Browse the ECMAScript Language Specification. How many pages long is it?

Well, it looks like we want a formal semantics. But what would that even look like? How do we formalize meaning?

Approaches to Formal Semantics

Let’s think of some ways we could capture the meaning of a program formally. Some things come to mind:

We can also mix elements of various approaches.

An Operational Approach

Let’s try to build a semantics for Astro, giving a set of rules which may be seen as simulating the execution of a program on some kind of abstract “machine.” For Astro, programs evaluate to the list of numbers they output. For example, the program:

x = 5;
print(sqrt(5 * 13 + 35));
print(3);
x = x + 21;
print(x / 8);

“evaluates to” the sequence $[10, 3, 3.25]$. So we want to define the relation $\Longrightarrow_P$ such that:

$p \Longrightarrow_P o$

means that program $p$ produces output $o$, where $o \in \mathbb{R}^*$. The evaluation relation for programs must be defined in terms of its constituent statements and expressions. But hmmm, we know we want 3+2 to evaluate to $5$, but how do we evaluate x+3? It depends on what x is bound to, so expressions must be evaluated in a context containing a mapping from identifiers to the entities they are bound to. This mapping is called a memory (in the US; in the UK, a store). So we will write the evaluation relation for expressions as:

$e,m \Longrightarrow_E x$

meaning that in the context of memory $m$, expression $e$ evaluates to $x$. For statements, we notice that statements can both read and write memory, and they can append to the output. So:

$s,m,o \Longrightarrow_S (m', o')$

means that, in the context of memory $m$, with output currently $o$, the execution of $s$ produces a new memory $m'$ and new output $o'$. Now we can define the operational semantics of Astro as follows (for brevity, dropping the subscripts on the evaluation relations since they can be inferred from context):

$$\frac{} {[\![n]\!],m \Longrightarrow n}$$
$$\frac{m(i) = (\mathsf{NUM}, x, \_)} {[\![i]\!],m \Longrightarrow x}$$
$$\frac{e,m \Longrightarrow x} {[\![\mathsf{-}\;e]\!],m \Longrightarrow -x}$$
$$\frac{\begin{gathered} op \in \{ \mathsf{+}, \mathsf{-}, \mathsf{*}, \mathsf{/}, \mathsf{\%}, \mathtt{**}\} \\ e_1,m \Longrightarrow x \;\;\; e_2,m \Longrightarrow y \end{gathered}} {[\![e_1\;op\;e_2]\!],m \Longrightarrow op(x,y)}$$
$$\frac{e,m \Longrightarrow x \;\;\; m(i) = \mathsf{UNDEF} \vee m(i) = (\mathsf{NUM},\_,\mathsf{RW})} {[\![i=e]\!],m,o \Longrightarrow (m[(\mathsf{NUM},x,\mathsf{RW})\,/\,i], o)}$$
$$\frac{(e_i,m \Longrightarrow a_i)_{i=1}^n \;\;\; m(i) = (\mathsf{FUN}, f, n)} {[\![i\;e_1,\ldots,e_n]\!],m \Longrightarrow f(a_1,\ldots,a_n)}$$
$$\frac{e, m \Longrightarrow a} {[\![\mathtt{print}\;e]\!],m,o \Longrightarrow (m,\,o\!\cdot\!a)}$$
$$\frac{(s_i, m_{i-1}, o_{i-1} \Longrightarrow (m_i,o_i))_{i=1}^n} {[\![\mathtt{program}\;s_1,\ldots,s_n]\!] \Longrightarrow o_n}$$

where $o_0$, the initial output, is defined to be the empty sequence, and the initial memory $m_0$ is our “standard library”:

$ \begin{array}{l} m_0 = \lambda\,i. \mathsf{UNDEF} \:[\\ \;\;\;\; (\mathsf{NUM}, \pi, \mathsf{RO})\,/\,\mathtt{π}] \:[\\ \;\;\;\; (\mathsf{FUN}, \lambda x.\sqrt{x}, 1)\,/\,\mathtt{sqrt}] \:[\\ \;\;\;\; (\mathsf{FUN}, \lambda x.\sin{x}, 1)\,/\,\mathtt{sin}] \:[\\ \;\;\;\; (\mathsf{FUN}, \lambda x.\cos{x}, 1)\,/\,\mathtt{cos}] \:[\\ \;\;\;\; (\mathsf{FUN}, \lambda (x,y).\sqrt{x^2+y^2}, 2)\,/\,\mathtt{hypot}] \end{array}$
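These rules read like a recipe for an interpreter. As a sanity check, here is a minimal big-step interpreter sketch in Python; the tuple-based AST encoding and all function names are invented here for illustration and are not part of Astro’s definition:

```python
import math

# Memory entries are tagged tuples: ("UNDEF",), ("NUM", value, "RO"|"RW"),
# or ("FUN", function, arity). ASTs are tuples too, e.g. ("+", e1, e2),
# ("call", "sqrt", [e]), ("assign", "x", e), ("print", e).

UNDEF = ("UNDEF",)

def evaluate(e, m):
    """e, m ⟹ x: evaluate expression e in memory m to a number."""
    tag = e[0]
    if tag == "num":
        return e[1]
    if tag == "id":
        entry = m.get(e[1], UNDEF)
        assert entry[0] == "NUM", f"{e[1]} is not bound to a number"
        return entry[1]
    if tag == "neg":
        return -evaluate(e[1], m)
    if tag == "call":
        entry = m.get(e[1], UNDEF)
        assert entry[0] == "FUN" and entry[2] == len(e[2]), "bad call"
        return entry[1](*[evaluate(arg, m) for arg in e[2]])
    ops = {"+": lambda x, y: x + y, "-": lambda x, y: x - y,
           "*": lambda x, y: x * y, "/": lambda x, y: x / y,
           "%": lambda x, y: x % y, "**": lambda x, y: x ** y}
    return ops[tag](evaluate(e[1], m), evaluate(e[2], m))

def execute(s, m, o):
    """s, m, o ⟹ (m', o'): execute one statement."""
    if s[0] == "assign":
        entry = m.get(s[1], UNDEF)
        assert entry == UNDEF or (entry[0] == "NUM" and entry[2] == "RW"), \
            f"{s[1]} is not writable"
        return {**m, s[1]: ("NUM", evaluate(s[2], m), "RW")}, o
    return m, o + [evaluate(s[1], m)]  # print

def interpret(program):
    """program ⟹ o: thread the standard-library memory through all statements."""
    m = {"π": ("NUM", math.pi, "RO"),
         "sqrt": ("FUN", math.sqrt, 1), "sin": ("FUN", math.sin, 1),
         "cos": ("FUN", math.cos, 1), "hypot": ("FUN", math.hypot, 2)}
    o = []
    for s in program:
        m, o = execute(s, m, o)
    return o

# The example program from earlier in this section:
program = [
    ("assign", "x", ("num", 5)),
    ("print", ("call", "sqrt", [("+", ("*", ("num", 5), ("num", 13)), ("num", 35))])),
    ("print", ("num", 3)),
    ("assign", "x", ("+", ("id", "x"), ("num", 21))),
    ("print", ("/", ("id", "x"), ("num", 8))),
]
print(interpret(program))  # → [10.0, 3, 3.25]
```

Note how each branch of `evaluate` and `execute` corresponds to exactly one inference rule, with the rule’s side conditions becoming assertions.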

We’ve been able to capture the notion of the memory formally! In a memory, identifiers are bound to values that have one of four forms:

  • $\mathsf{UNDEF}$, for identifiers not (yet) bound to anything;
  • $(\mathsf{NUM}, x, \mathsf{RO})$, a read-only binding to the number $x$;
  • $(\mathsf{NUM}, x, \mathsf{RW})$, a writable binding to the number $x$;
  • $(\mathsf{FUN}, f, n)$, a binding to the function $f$ of arity $n$.

Notice that a memory is a member of the following set:

$\begin{array}{l} \mathsf{Ide} \rightarrow (\\ \;\;\;\; \{\mathsf{UNDEF}\} \;\cup \\ \;\;\;\; (\{\mathsf{NUM}\} \times \mathbb{R} \times \{\mathsf{RO},\mathsf{RW}\}) \;\cup \\ \;\;\;\; (\{\mathsf{FUN}\} \times (\mathbb{R}^* \rightarrow \mathbb{R}) \times \mathbb{N})) \end{array}$

That looks pretty formal, right? And while we’re at it, the relations themselves are also typed:

$\begin{array}{l} \Longrightarrow_P \subseteq \mathsf{Pro} \times \mathsf{Output} \\ \Longrightarrow_S \subseteq (\mathsf{Stm} \times \mathsf{Mem} \times \mathsf{Output}) \times (\mathsf{Mem} \times \mathsf{Output}) \\ \Longrightarrow_E \subseteq (\mathsf{Exp} \times \mathsf{Mem}) \times \mathbb{R} \end{array}$

where $\mathsf{Output}$ is just $\mathbb{R}^*$.
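These sets translate fairly directly into code. Here is a hedged sketch using Python type aliases (the names are invented, and $\mathsf{UNDEF}$ is modeled as a one-element tagged tuple so that every memory value is a tuple):

```python
from typing import Callable, Literal, Union

# One Python rendering of the memory set above:
Undef = tuple[Literal["UNDEF"]]
Num = tuple[Literal["NUM"], float, Literal["RO", "RW"]]  # {NUM} × ℝ × {RO,RW}
Fun = tuple[Literal["FUN"], Callable[..., float], int]   # {FUN} × (ℝ* → ℝ) × ℕ
Value = Union[Undef, Num, Fun]

Memory = dict[str, Value]   # Ide → Value, total by defaulting to UNDEF
Output = list[float]        # ℝ*

def lookup(m: Memory, i: str) -> Value:
    # A memory is a *total* function on identifiers; absent ones map to UNDEF.
    return m.get(i, ("UNDEF",))
```

The dictionary-with-default models the totality of the function $\lambda i.\mathsf{UNDEF}$ that the standard library $m_0$ is built from.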

Exercise: Explain, in detailed, technical English, why each of the above makes sense.

More about operational semantics

We’ve only given a very brief introduction. There’s so much more. There are many styles of operational semantics, and for the style we’ve presented here, you’ll find both “big-step” and “small-step” flavors. Much more to come.

A Denotational Approach

Let’s try something else. For each abstract syntax form, we’ll give a function mapping the form to its meaning, a mathematical object known as a denotation:

$\mathscr{E}[\![n]\!]\,m = n$

$\mathscr{E}[\![i]\!]\,m = \mathsf{if}\;m(i) = (\mathsf{NUM},x,\_)\;\mathsf{then}\;x\;\mathsf{else}\;\bot$

$\mathscr{E}[\![ -e ]\!]\,m = -\mathscr{E}e\,m$

$\mathscr{E} [\![ e_1 \,\mathtt{+}\, e_2 ]\!]\,m = \mathscr{E}e_1\,m + \mathscr{E}e_2\,m$

$\mathscr{E} [\![ e_1 \,\mathtt{-}\, e_2 ]\!]\,m = \mathscr{E}e_1\,m - \mathscr{E}e_2\,m$

$\mathscr{E} [\![ e_1 \,\mathtt{*}\, e_2 ]\!]\,m = \mathscr{E}e_1\,m \times \mathscr{E}e_2\,m$

$\mathscr{E} [\![ e_1 \,\mathtt{/}\, e_2 ]\!]\,m = \mathscr{E}e_1\,m \div \mathscr{E}e_2\,m$

$\mathscr{E} [\![ e_1 \,\mathtt{\%}\, e_2]\!]\,m = \mathscr{E}e_1\,m \;\mathsf{rem}\; \mathscr{E}e_2\,m$

$\mathscr{E} [\![ e_1 \,\mathtt{**}\, e_2]\!]\,m = (\mathscr{E}e_1\,m)^{\mathscr{E}e_2\,m}$

$\mathscr{E} [\![ i\;e_1\!\ldots\!e_n]\!]\,m = \mathsf{if}\;m(i)=(\mathsf{FUN},f,n)\;\mathsf{then}\; f(\mathscr{E}e_1\,m, \ldots, \mathscr{E}e_n\,m)\;\mathsf{else}\;\bot$

$\mathscr{S} [\![ i=e ]\!](m,o) = \mathsf{if}\;m(i)\in\{\mathsf{UNDEF},(\mathsf{NUM},\_,\mathsf{RW})\} \;\mathsf{then}\; (m[(\mathsf{NUM},\mathscr{E}e\,m,\mathsf{RW})\,/\,i], o) \;\mathsf{else}\;\bot$

$\mathscr{S} [\![ \mathtt{print}\;e]\!](m,o) = (m, o\!\cdot\!\mathscr{E}e\,m)$

$\begin{array}{l} \mathscr{P}[\![\mathtt{program} \; s_1\!\ldots\!s_n]\!] = \mathsf{let}\; (m,o) = \mathscr{S}s_n( \ldots \mathscr{S}s_1(m_0,\;()) \ldots ) \;\mathsf{in}\; o \\ \;\;\;\;\;\;\;\mathsf{where}\;m_0 = \\ \;\;\;\;\;\;\;\;\;\;\;\; \lambda\,i. \mathsf{UNDEF} \:[\\ \;\;\;\;\;\;\;\;\;\;\;\; (\mathsf{NUM}, \pi, \mathsf{RO})\,/\,\mathtt{π}] \:[\\ \;\;\;\;\;\;\;\;\;\;\;\; (\mathsf{FUN}, \lambda x.\sqrt{x}, 1)\,/\,\mathtt{sqrt}] \:[\\ \;\;\;\;\;\;\;\;\;\;\;\; (\mathsf{FUN}, \lambda x.\sin{x}, 1)\,/\,\mathtt{sin}] \:[\\ \;\;\;\;\;\;\;\;\;\;\;\; (\mathsf{FUN}, \lambda x.\cos{x}, 1)\,/\,\mathtt{cos}] \:[\\ \;\;\;\;\;\;\;\;\;\;\;\; (\mathsf{FUN}, \lambda (x,y).\sqrt{x^2+y^2}, 2)\,/\,\mathtt{hypot}] \\ \end{array} $

Think of $\bot$ for now as some kind of error thing. And any function application that has $\bot$ as an argument will automatically return $\bot$. We need to go into some pretty deep and gnarly and difficult mathematics to actually justify it. More on this later in the course. And oh, wait until you see what we need to do to justify recursive definitions. Get your inner mathematician ready.
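One way to convince yourself the equations are compositional is to transcribe them into a functional program, where the denotation of each construct really is built only from the denotations of its parts. Here is a hedged Python sketch (the tuple-based AST encoding and all names are invented for illustration) in which $\bot$ is modeled as None and propagated strictly:

```python
import math
import operator

BOTTOM = None  # stands in for ⊥

def strict(f, *dens):
    # Strict application: if any argument denotes ⊥, so does the result.
    def run(m):
        args = [d(m) for d in dens]
        return BOTTOM if any(a is BOTTOM for a in args) else f(*args)
    return run

def E(e):
    """ℰ⟦e⟧: a function from Memory to ℝ ∪ {⊥}."""
    tag = e[0]
    if tag == "num":
        return lambda m: e[1]
    if tag == "id":
        def lookup(m):
            entry = m.get(e[1], ("UNDEF",))
            return entry[1] if entry[0] == "NUM" else BOTTOM
        return lookup
    if tag == "neg":
        return strict(operator.neg, E(e[1]))
    if tag == "call":
        def call(m):
            entry = m.get(e[1], ("UNDEF",))
            if entry[0] != "FUN" or entry[2] != len(e[2]):
                return BOTTOM
            return strict(entry[1], *[E(arg) for arg in e[2]])(m)
        return call
    ops = {"+": operator.add, "-": operator.sub, "*": operator.mul,
           "/": operator.truediv, "%": operator.mod, "**": operator.pow}
    return strict(ops[tag], E(e[1]), E(e[2]))

def S(s):
    """𝒮⟦s⟧: a transformer of (Memory, Output) states, or ⊥."""
    def transform(state):
        if state is BOTTOM:
            return BOTTOM
        m, o = state
        if s[0] == "assign":
            entry = m.get(s[1], ("UNDEF",))
            x = E(s[2])(m)
            writable = entry[0] == "UNDEF" or (entry[0] == "NUM" and entry[2] == "RW")
            return BOTTOM if x is BOTTOM or not writable \
                else ({**m, s[1]: ("NUM", x, "RW")}, o)
        x = E(s[1])(m)  # print
        return BOTTOM if x is BOTTOM else (m, o + [x])
    return transform

def P(program):
    """𝒫⟦program⟧: fold the statement denotations over the initial state."""
    m0 = {"π": ("NUM", math.pi, "RO"), "sqrt": ("FUN", math.sqrt, 1),
          "sin": ("FUN", math.sin, 1), "cos": ("FUN", math.cos, 1),
          "hypot": ("FUN", math.hypot, 2)}
    state = (m0, [])
    for s in program:
        state = S(s)(state)
    return BOTTOM if state is BOTTOM else state[1]
```

Applied to the five-statement example program from the operational section, `P` yields `[10.0, 3, 3.25]`, while a program that reads an unassigned variable, or assigns to the read-only `π`, denotes ⊥ (None).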

We left a LOT out.

There is a great deal more to denotational semantics than what is shown here, but this gets us started. Entire books have been written on denotational semantics alone. There are many mathematical difficulties involved that need to be handled, particularly surrounding recursion. There is so much to cover.

It can be tough going mathematically, which is probably why operational semantics is a bit more popular.

Methodologies

Here are some methodologies that have evolved in the world of formal semantics:

Operational
  • Concrete Operational Semantics: define an actual abstract machine to execute the program.
  • Structural Operational Semantics (Small-step): a set of inference rules that show when each individual step of a computation can take place and what its effect is.
  • Natural Semantics (Big-step): a set of inference rules that show the effect of executing whole constructs without regard to intermediate computational states.

Denotational
  • Direct Denotational Semantics: define the meaning of each syntactic construct as a mathematical object in terms of its constituent constructs.
  • Continuation Semantics: define the meaning of each syntactic construct, compositionally, as a mathematical object, where meanings are generally transformations modeling the “remainder of the program” (to better handle errors and disruptive control transfers).

Axiomatic
  • Hoare Logic: a kind of axiomatic semantics (where the meaning is inferred from defining a set of invariant properties for each construct) focusing on proving the correctness of programs.
  • Algebraic Semantics: define the semantics via an algebra.

Hybrid
  • Action Semantics: define meaning in a way that is more familiar to programmers, with actions, data, and yielders.

Mathematics of Meaning

In the sneak peeks above you’ve just been given a taste of what formal semantic definitions look like.

How can we be sure they are truly formal, or truly unambiguous? What kinds of mathematical objects work to convey the meaning of programs in a programming language? If your approach to semantics is super algebraic, you might want to check out Category Theory. But mostly, it’s the Lambda Calculus that everyone learns, knows, and uses.

We’ll study the Lambda Calculus pretty soon.

Resources

Where are some places to learn about semantics? Wikipedia is a good place to start:

There are a few books that cover formal semantics:

Also, I have course notes coming up for Operational, Denotational, and Axiomatic semantics. We’ll get to them soon.

Context, Again

Let’s revisit context once again.

What would you say about this Astro program?

print(x);

The grammar permits it, but the contextual constraint requiring all variables to have been previously assigned is violated. The program is illegal. The same goes for similar violations in many popular programming languages, such as:

But are these syntactically illegal or semantically illegal? We saw earlier you can say either:

For the context-free structure:
  • Syntax people say “the context-free syntax” and use CFG/EBNF/PEG.
  • Semantics people say “the syntax” and use CFG/EBNF/PEG.

For the context-sensitive rules:
  • Syntax people say “the context-sensitive syntax” and use attribute grammars or two-level grammars.
  • Semantics people say “the static semantics” and use semantic rules.

For the execution behavior:
  • Syntax people say “the semantics” and use semantic rules.
  • Semantics people say “the dynamic semantics” and use semantic rules.


In the semantics definitions for Astro above, we handled all the contextual rules in the semantics. In fact we completely conflated the static and dynamic semantics, mashing them both together in a single set of rules. In practice, you might see two sets of rules, one that just checks the contextual constraints, and the second defining the run-time behavior, under the assumption that the constraints have already been checked.

The terminology “static semantics” and “dynamic semantics” is in line with the two computer science terms:

Static

Things we can know about a program before it is run, simply by inspecting the source code.

Dynamic

Things we only know about a program while it is running, such as certain effects on the environment, which may be driven by external inputs unknown before execution.

You may have heard the terms “static” and “dynamic” in the context of type systems. Many people speak of a statically-typed language as one in which type checking is done prior to program execution and a dynamically typed language as one in which type checking is done during program execution. In reality, most languages do a little of both, but one or the other usually predominates.

Semantics in Practice

Formal semantics, and formal methods in general (e.g., proofs of correctness), seem to be underused in practice, except in certain life-critical domains. But studying formal semantics leads to much more rigorous approaches to software correctness and security, a very good thing indeed.

Applying semantic rules at compile-time is part of what’s known as static analysis. During static analysis we look for outright errors, of course, but we also look for suspect practices that have the potential for errors or security vulnerabilities. This latter part of static analysis is necessarily outside the scope of formal semantics, but it is an extremely important pragmatic issue that requires the developer’s full attention. To learn more about this area of static analysis, browse the following:

Summary

We’ve covered:

  • Where semantics comes from
  • The drawbacks of an informal, even if precise, semantic specification
  • Broad approaches to formal semantics
  • Where to learn about formal semantics
  • The problem of context
  • Semantics in practice