Formal Descriptions of Programming Languages

How about a case study in formal language definition? We’ll not only define a language, we’ll illustrate several alternative definition styles along the way.

Introduction

Let’s see how one can give a formal programming language definitions by actually writing some. We’ll define a small imperative programming language, and will call it Iki. We’ll look at several different definition styles. Don’t worry, though, each of the definitions define the same language.

Syntax

The source of an Iki program is a Unicode string. Here is a formal definition of Iki syntax using Ohm:

Iki {
  Program     =  Block
  Block       =  (Stmt ";")+
  Stmt        =  var id ":" Type             -- declaration
              |  VarExp "=" Exp              -- assignment
              |  read VarExp ("," VarExp)*   -- read
              |  write Exp ("," Exp)*        -- write
              |  while Exp loop Block endw   -- while
  Type        =  int | bool
  Exp         =  Exp or Exp1                 -- binary
              |  Exp1
  Exp1        =  Exp1 and Exp2               -- binary
              |  Exp2
  Exp2        =  Exp3 relop Exp3             -- binary
              |  Exp3
  Exp3        =  Exp3 addop Exp4             -- binary
              |  Exp4
  Exp4        =  Exp4 mulop Exp5             -- binary
              |  Exp5
  Exp5        =  prefixop Exp6               -- unary
              |  Exp6
  Exp6        =  boollit
              |  intlit
              |  VarExp
              |  "(" Exp ")"                 -- parens
  VarExp      =  id

  var         =  "var" ~idrest
  read        =  "read" ~idrest
  write       =  "write" ~idrest
  while       =  "while" ~idrest
  loop        =  "loop" ~idrest
  endw        =  "end" ~idrest
  int         =  "int" ~idrest
  bool        =  "bool" ~idrest
  true        =  "true" ~idrest
  false       =  "false" ~idrest
  or          =  "or" ~idrest
  and         =  "and" ~idrest
  not         =  "not" ~idrest

  keyword     =  var | int | bool
              |  read | write | while | loop | endw
              |  true | false | or | and | not
  id          =  ~keyword letter idrest*
  idrest      =  "_" | alnum
  intlit      =  digit+
  boollit     =  true | false
  addop       =  "+" | "-"
  relop       =  "<=" | "<" | "==" | "!=" | ">=" | ">"
  mulop       =  "*" | "/" | "%"
  prefixop    =  ~"--" "-" | not

  space      +=  comment
  comment     =  "--" (~"\n" any)* "\n"
}

We can also define the same language using syntax diagrams. While Ohm distinguishes lexical from phrase categories using lower and uppercase starting letters, respectively, syntax diagrams make their distinction with rounded rectangles vs. rectangles. They also assume that a “lexical analysis” or “tokenization” pass was previously done, using a maximal munch process:

Program

Block

Stmt

Type

Exp

Exp1

Exp2

Exp3

Exp4

Exp5

Exp6

VarExp

keyword

intlit

boollit

addop

relop

mulop

prefixop

space

comment

Other forms of syntax definition include BNF, EBNF, and ABNF. There are many others.

Abstract Syntax

A grammar, whether written in Ohm or some other notation, defines a concrete syntax. A semantics will be defined based on an abstract, rather than a concrete, syntax (because punctuation, spacing, and comments just get in the way), so we have to define how to generate abstract syntax trees from concrete ones. In other words, we want to turn this string:

-- An Iki program
var x: int;
var y: int;
while y - 5 == 3 loop
  var y: int;
  read x, y;
  x = 2 * (3 + y);
end;
write 5;

into this tree:

(Program
  (Block
    (Declare x int)
    (Declare y int)
    (While
      (== (- y 5) 3)
      (Block
        (Declare y int)
        (Read x y)
        (= x (* 2 (+ 3 y)))))
    (Write 5)))

There are many ways to formally specify the relationship between the grammar and the abstract syntax tree. The first way shoves a lot of notation into the grammar itself. The idea is that as you parse a string by walking the grammar rules, you emit chunks of trees as you go. Let’s show it via an example:

Program =  Block:b                                        ^{Program b}
Block   =  (Stmt::s ';')+                                 ^{Block s}
Stmt    =  'var' id:i ':' Type:t                          ^{Var i t}
        |  VarExp:v '=' Exp:e                             ^{Assign v e}
        |  'read' VarExp::v (',' VarExp::v)*              ^{Read v}
        |  'write' Exp::e (',' Exp::e)*                   ^{Write e}
        |  'while' Exp:e 'loop' Block:b 'end'             ^{While e b}
Type    =  'int':t                                        ^t
        |  'bool':t                                       ^t
Exp     =  Exp1:e1 ('or':o Exp1:e2 {Binary o e1 e2}:e1)*  ^e1
Exp1    =  Exp2:e1 ('and':o Exp2:e2 {Binary o e1 e2}:e1)* ^e1
Exp2    =  Exp3:e1 (relop:o Exp3:e2 {Binary o e1 e2}:e1)? ^e1
Exp3    =  Exp4:e1 (addop:o Exp4:e2 {Binary o e1 e2}:e1)* ^e1
Exp4    =  Exp5:e1 (mulop:o Exp5:e2 {Binary o e1 e2}:e1)* ^e1
Exp5    =  prefixop:o Exp6:e                              ^{Unary o e}
        |  Exp6:e                                         ^e
Exp6    =  boollit:b                                      ^{Boollit b}
        |  intlit:n                                       ^{Intlit n}
        |  VarExp:v                                       ^v
        |  '(' Exp:e ')'                                  ^e
VarExp  =  id:i                                           ^{VarExp i}

Each grammar rule is augmented with tree productions; you might be able to infer the metalanguage for this, but if not, here are the highlights:

AST nodes are represented as {A B C ...} where A is the root and B, C, ... are the ordered children.
Tree nodes can be simple strings or numbers.
The evaluation of a grammar rules produces a node; we indicate that the node is produced or returned, with the ^ symbol.
During a parse, the evaluation of categories, tokens, and tree nodes can be bound to variables. The : binding overwrites the current value of the variable, if any; the :: binding accumulates values into a list.

A second way to define an AST for a grammar is to use Ohm. Ohm separates operations from the grammar, so things aren’t messy at all. The operations that produce the AST are written in an actual programming language, so they are executable and testable. Here’s the Ohm specification for generating the AST for Iki, using JavaScript:

const astBuilder = grammar.createSemantics().addOperation('ast', {
  Program(b) {
    return new Program(b.ast());
  },
  Block(s, _) {
    return new Block(s.ast());
  },
  Stmt_declaration(_1, id, _2, type) {
    return new VariableDeclaration(id.sourceString, type.ast());
  },
  Stmt_assignment(varexp, _, exp) {
    return new AssignmentStatement(varexp.ast(), exp.ast());
  },
  Stmt_read(_1, v, _2, more) {
    return new ReadStatement([v.ast(), ...more.ast()]);
  },
  Stmt_write(_1, e, _2, more) {
    return new WriteStatement([e.ast(), ...more.ast()]);
  },
  Stmt_while(_1, e, _2, b, _3) {
    return new WhileStatement(e.ast(), b.ast());
  },
  Type(typeName) {
    return typeName.sourceString;
  },
  Exp_binary(e1, _, e2) {
    return new BinaryExpression('or', e1.ast(), e2.ast());
  },
  Exp1_binary(e1, _, e2) {
    return new BinaryExpression('and', e1.ast(), e2.ast());
  },
  Exp2_binary(e1, op, e2) {
    return new BinaryExpression(op.sourceString, e1.ast(), e2.ast());
  },
  Exp3_binary(e1, op, e2) {
    return new BinaryExpression(op.sourceString, e1.ast(), e2.ast());
  },
  Exp4_binary(e1, op, e2) {
    return new BinaryExpression(op.sourceString, e1.ast(), e2.ast());
  },
  Exp5_unary(op, e) {
    return new UnaryExpression(op.sourceString, e.ast());
  },
  Exp6_parens(_1, e, _2) {
    return e.ast();
  },
  boollit(_) {
    return new BooleanLiteral(this.sourceString === 'true');
  },
  intlit(_) {
    return new IntegerLiteral(this.sourceString);
  },
  VarExp(_) {
    return new VariableExpression(this.sourceString);
  },
});

Another way is to use something called an attribute grammar. There are other ways, too.

Semantics

This section defines the semantics of Iki. Semantics generally has a static and a dynamic part; the static semantics describes rules that can be checked at compile time but that are impossible (or ridiculously inconvenient) to put into a context free grammar. The dynamic part defines what the program actually does when it runs. This is done on a macro as well as a micro level.

Static Semantics

The functions below transform an abstract syntax tree into a semantic graph. The semantic graph is almost identical to the abstract syntax tree; the only exceptions (in Iki, anyway) being that:

All Exp nodes have a new field (type)
All VarExp nodes have a new field (referent) that refers to a Var node.

The functions mathematically encode the rules we intuitively know as:

Identifiers must be declared before being used
Inner declarations hide outer ones
Variables local to a block are invisible to sibling and ancestor blocks
Only ints may be read and written
Only ints may appear as operands to arithmetic and relational operators
Only bools may appear as operands to logical operators
Types must match across assignments
Types must match across == and != operators

The definitions make use of environments which are mappings from variable names (strings) to actual variables. At any point in time we have of stack of environments corresponding to the nesting of the block currently being processed. Every time we enter a block we push a new, empty environment on this stack, which will get filled with the declarations of this block. When leaving a block, the topmost environment is popped. When resolving a variable reference, we search the environments from the top of the stack to the bottom.

$\mathsf{type} \; \mathit{environment} = (\mathsf{string} \times \mathsf{Var}) \; \mathsf{map}$
$\mathsf{fun} \; \mathit{find} \; [env_1 ... env_n] \; i =$
$\;\;\;\; \mathsf{if} \; n = 0 \; \mathsf{then} \; \bot_{NOT\_FOUND}$
$\;\;\;\; \mathsf{elsif} \; i \in env_1 \; \mathsf{then} \; env_1[i]$
$\;\;\;\; \mathsf{else} \; \mathit{find} \; [env_2 ... env_n] \; i$

$\mathcal{P}:\;\mathsf{Program} \rightarrow \mathsf{Program}$
$\mathcal{B}:\;\mathsf{Block} \rightarrow \mathit{environment} \; \mathsf{list} \rightarrow \mathsf{Block}$
$\mathcal{S}:\;\mathsf{Stmt} \rightarrow \mathit{environment} \; \mathsf{list} \rightarrow \mathsf{Stmt} \times \mathit{environment} \; \mathsf{list}$
$\mathcal{E}:\;\mathsf{Exp} \rightarrow \mathit{envirnoment} \; \mathsf{list} \rightarrow \mathsf{Exp}$

$\mathcal{P}\,(\mathsf{Program}\;b) = \mathsf{Program}\;(\mathcal{B}\;b\;[])$

$\mathcal{B}\,(\mathsf{Block}\;s_1\;...\;s_n)\,envs =$
$\;\;\;\; \mathsf{let} \; \xi_0 = envs \; \mathsf{and} \; (s_i', \xi_i) \mathop{=}\limits_{i=1}^n \mathcal{S} \; s_i \; \xi_{i-1} \; \mathsf{in} \; (\mathsf{Block}\;s_1'\;...\;s_n')$

$\mathcal{S}\,(\mathsf{Var}\;i\;t)\,(env::envs) =$
$\;\;\;\; \mathsf{if} \; i \in env\; \mathsf{then} \; \bot_{REDECLARATION}$
$\;\;\;\; \mathsf{else} \; \mathsf{let}\;d=(\mathsf{Var}\;i\;t)\;\mathsf{in}\;(d,\;env[d\,/\,i]::envs)$
$\mathcal{S}\;(\mathsf{Assign}\;v\;e)\;envs =$
$\;\;\;\;\mathsf{let}\;v' = \mathcal{E}\;v\;envs\; \mathsf{and} \;e' = \mathcal{E}\;e\;envs\;\mathsf{in}$
$\;\;\;\;\;\;\;\;\mathsf{if}\;v'.type = e'type\;\mathsf{then}\;((\mathsf{Assign}\;v'\;e'),\;envs)\;\mathsf{else}\; \bot_{TYPE\_ERROR}$
$\mathcal{S}\,(\mathsf{Read}\;v_1\;...\;v_n)\,envs =$
$\;\;\;\;\mathsf{let}\;v_i' \mathop{=}\limits_{i=1}^n \mathcal{E}\;v_i\;envs \; \mathsf{in}$
$\;\;\;\;\;\;\;\;\mathsf{if} \; \bigwedge_{i=1}^{n}(v_i'.type = \mathsf{int})\;\mathsf{then}\;((\mathsf{Read}\;v_1'\;...\;v_n'),\;envs) \; \mathsf{else} \; \bot_{TYPE\_ERROR}$
$\mathcal{S}\,(\mathsf{Write}\;e_1\;...\;e_n)\,envs =$
$\;\;\;\;\mathsf{let}\;e_i' \mathop{=}\limits_{i=1}^n \mathcal{E}\;e_i\;envs \; \mathrm{in}$
$\;\;\;\;\;\;\;\;\mathsf{if}\;\bigwedge_{i=1}^{n}(e_i'.type = \mathsf{int}) \;\mathsf{then}\;((\mathsf{Write}\;e_1'\;...\;e_n'),\;envs)\;\mathsf{else}\; \bot_{TYPE\_ERROR}$
$\mathcal{S}\,(\mathsf{While}\;e\;b)\,envs =$
$\;\;\;\;\mathsf{let}\;e'=\mathcal{E}\;e\;envs\;\mathsf{in}$
$\;\;\;\;\;\;\;\;\mathsf{if}\;e'.type = \mathsf{bool}\;\mathsf{then}\;((\mathsf{While}\;e'\;(\mathcal{B}\;b\;([]::envs))),\;envs)\;\mathsf{else}\; \bot_{TYPE\_ERROR}$

$\mathcal{E}\,(\mathsf{VarExp}\;i)\,envs = \mathsf{let}\;v = find\;envs\;i\;\mathsf{in}\;(\mathsf{VarExp}\;i\;v\;v.type)$
$\mathcal{E}\,(\mathsf{Intlit}\;n)\,envs = (\mathsf{Intlit}\;n\;\mathsf{int})$
$\mathcal{E}\,(\mathsf{Boollit}\;b)\,envs = (\mathsf{Boollit}\;b\;\mathsf{bool})$
$\mathcal{E}\,(\mathsf{Binary}\;o\;e_1\;e_2)\,envs =$
$\;\;\;\;\mathsf{let}\;e_1' = \mathcal{E}\;e_1\;envs\;and\;e_2' = \mathcal{E}\;e_2\;envs\;\mathsf{in}$
$\;\;\;\;\;\;\;\;\mathsf{if}\; o \in \{\verb@+@, \verb@-@, \mathtt{*}, \mathtt{/}, \mathtt{\%}, \mathtt{<}, \mathtt{<=}, \mathtt{>}, \mathtt{>=}\} \wedge e_1'.type = \mathsf{int} \wedge e_2'.type = \mathsf{int}\;\mathsf{then}\;(\mathsf{Binary}\;o\;e_1'\;e_2'\;\mathsf{int})$
$\;\;\;\;\;\;\;\;\mathsf{elsif}\;o \in \{\mathsf{and} ,\mathsf{or}\} \wedge e_1'.type = \mathsf{bool} \wedge e_2'.type = \mathsf{bool})\;\mathsf{then}\;(\mathsf{Binary}\;o\;e_1'\;e_2'\;\mathsf{bool})$
$\;\;\;\;\;\;\;\;\mathsf{elsif}\;o \in \{\verb@==@, \verb@!=@\} \wedge e_1'.type = e_2'.type\;\mathsf{then}\;(\mathsf{Binary}\;o\;e_1'\;e_2'\;\mathsf{bool})$
$\;\;\;\;\;\;\;\;\mathsf{else}\;\bot_{TYPE\_ERROR}$
$\mathcal{E}\,(\mathsf{Unary}\;o\;e)\,envs =$
$\;\;\;\;\mathsf{let}\;e' = \mathcal{E}\;e\;envs\;\mathsf{in}$
$\;\;\;\;\;\;\;\;\mathsf{if}\;o = \mathsf{-} \wedge e'.type = \mathsf{int}\;\mathsf{then}\;(\mathsf{Unary}\;o\;e'\;\mathsf{int})$
$\;\;\;\;\;\;\;\;\mathsf{elsif}\;o = \mathsf{not} \wedge e'.type = \mathsf{bool}\;\mathsf{then}\;(\mathsf{Unary}\;o\;e'\;\mathsf{bool})$
$\;\;\;\;\;\;\;\;\mathsf{else}\;\bot_{TYPE\_ERROR}$

This static semantics can be viewed as adding a type field to every expression node, and a referent field to every variable expression node.

Note that the functions may fail to produce a semantic graph for a given abstract syntax trees. In such a case we have a program which is syntactically valid (possesses no syntax errors), but semantically invalid.

Dynamic Semantics

The dynamic meaning of an Iki program is some transformation from an input file to an output file. For example, the meaning of var x: int; read x; write x+1; end is “the function that takes in an arbitrary input file, and returns an output file containing a single element whose value is one more than the first item on the input file.”

Macrosemantics

The meaning is given here by a functions from semantic graphs to their “meanings.” The meaning of a program is a mapping from inputs to outputs. Statements change the state of the computation; the state is the aggregation of (1) its current data store (mapping from its variables to their values), (2) the current input file, and (3) the current output file. The meaning of an expression is the value it produces, relative to a store, of course. Declarations in Iki have no dynamic effect.

type file = int list
type store = Var $\rightarrow$ (int+bool)
type state = (store $\times$ file $\times$ file)

$\mathscr{P}:\;\mathsf{Program} \rightarrow \mathit{file} \rightarrow \mathit{file}$
$\mathscr{B}:\;\mathsf{Block} \rightarrow \mathit{state} \rightarrow \mathit{state}$
$\mathscr{S}:\;\mathsf{Stmt} \rightarrow \mathit{state} \rightarrow \mathit{state}$
$\mathscr{E}:\;\mathsf{Exp} \rightarrow \mathit{store} \rightarrow (\mathsf{int+bool})$

$\mathscr{P}\,(\mathsf{Program}\;b)\,i = (\mathscr{B}\;b\;(\lambda v.\bot,\,i,\,\mathsf{[]}))[2] $
$\mathscr{B}\,(\mathsf{Block}\;s_1\;...\;s_n)\,\psi = \mathsf{if}\;n=1\; \mathsf{then}\;\mathscr{S}s_1\psi\; \mathsf{else}\;\mathscr{B}\,(\mathsf{Block}\;s_2\;...\;s_n) (\mathscr{S}s_1\psi) $
$\mathscr{S}\,(\mathsf{Var}\;i\;t)\,\psi = \psi $
$\mathscr{S}\,(\mathsf{Assign}\;v\;e)(\sigma,i,o) = (\sigma[\mathscr{E}e\sigma\:/\:v.referent],i,o) $
$\mathscr{S}\,(\mathsf{Read}\;v_1\;...\;v_n))(\sigma,i,o) =$
$\;\;\;\;\;\;\mathsf{if} \; length(i) = 0\; \mathsf{then} \;\bot_{END\_ERROR}$
$\;\;\;\;\;\;\mathsf{elsif} \; n=1\; \mathsf{then} \; (\sigma[head(i)/v_1.referent],\;tail(i),\;o)$
$\;\;\;\;\;\;\mathsf{else} \; \mathscr{S}\,(\mathsf{Read}\;v_2\;...\;v_n) \; (\mathscr{S}\,(\mathsf{Read}\;v_1\,(\sigma,i,o))) $
$\mathscr{S}\,(\mathsf{Write}\;e_1\;...\;e_n)(\sigma,i,o) =$
$\;\;\;\;\;\;\mathsf{if} \; n = 1\; \mathsf{then} \; (\sigma,\;i,\;o @ \mathscr{E} e_1 \sigma))$
$\;\;\;\;\;\;\mathsf{else} \; \mathscr{S}\,(\mathsf{Write}\;e_2\;...\;e_n) (\mathscr{S}\,(\mathsf{Write}\;e_1)(\sigma,i,o)) $
$\mathscr{S}\,(\mathsf{While}\;e\;b)\,\psi = \mathsf{if}\;\mathsf{not}\;\mathscr{E}e(\psi[0])\; \mathsf{then}\;\psi\; \mathsf{else}\;\mathscr{S}\,(\mathsf{While}\;e\;b) (\mathscr{S}\;s\;\psi) $
$\mathscr{E}\,(\mathsf{VarExp}\;i\;v)\,\sigma = \sigma v $
$\mathscr{E}\,(\mathsf{Intlit}\;n)\,\sigma = n $
$\mathscr{E}\,(\mathsf{Boollit}\;b)\,\sigma = b $
$\mathscr{E}\,(\mathsf{Binary}\;\mathsf{or}\;e_1\;e_2)\,\sigma = \mathsf{if}\;\mathscr{E}e_1\sigma\; \mathsf{then}\;\mathsf{true}\; \mathsf{else}\;\mathscr{E}e_2\sigma$
$\mathscr{E}\,(\mathsf{Binary}\;\mathsf{and}\;e_1\;e_2)\,\sigma = \mathsf{if}\;\mathscr{E}e_1\sigma\; \mathsf{then}\;\mathscr{E}e_2\sigma\; \mathsf{else}\;\mathsf{false}$
$\mathscr{E}\,(\mathsf{Binary}\;\texttt{<=}\;e_1\;e_2)\,\sigma = \mathscr{E}e_1\sigma \leq \mathscr{E}e_2\sigma$
$\mathscr{E}\,(\mathsf{Binary}\;\texttt{<}\;e_1\;e_2)\,\sigma = \mathscr{E}e_1\sigma < \mathscr{E}e_2\sigma$
$\mathscr{E}\,(\mathsf{Binary}\;\texttt{=}\;e_1\;e_2)\,\sigma = \mathscr{E}e_1\sigma = \mathscr{E}e_2\sigma$
$\mathscr{E}\,(\mathsf{Binary}\;\texttt{!=}\;e_1\;e_2)\,\sigma = \mathscr{E}e_1\sigma \neq \mathscr{E}e_2\sigma$
$\mathscr{E}\,(\mathsf{Binary}\;\texttt{>=}\;e_1\;e_2)\,\sigma = \mathscr{E}e_1\sigma \geq \mathscr{E}e_2\sigma$
$\mathscr{E}\,(\mathsf{Binary}\;\texttt{>}\;e_1\;e_2)\,\sigma = \mathscr{E}e_1\sigma > \mathscr{E}e_2\sigma$
$\mathscr{E}\,(\mathsf{Binary}\;\mathtt{+}\;e_1\;e_2)\,\sigma = \mathscr{E}e_1\sigma + \mathscr{E}e_2\sigma$
$\mathscr{E}\,(\mathsf{Binary}\;\mathtt{-}\;e_1\;e_2)\,\sigma = \mathscr{E}e_1\sigma - \mathscr{E}e_2\sigma$
$\mathscr{E}\,(\mathsf{Binary}\;\mathtt{*}\;e_1\;e_2)\,\sigma = \mathscr{E}e_1\sigma \times \mathscr{E}e_2\sigma$
$\mathscr{E}\,(\mathsf{Binary}\;\mathtt{/}\;e_1\;e_2)\,\sigma = \mathscr{E}e_1\sigma \div \mathscr{E}e_2\sigma$
$\mathscr{E}\,(\mathsf{Unary}\;\mathtt{-}\;e)\,\sigma = -\mathscr{E}e\sigma$
$\mathscr{E}\,(\mathsf{Unary}\;\mathsf{not}\;e)\,\sigma = \mathsf{not}\;\mathscr{E}e\sigma$

The style in which we’ve been writing our semantic descriptions is called a Denotational Semantics. We can also write an Operational Semantics:

$$\frac{(\lambda v.\bot,\,i,\,\mathsf{[]}) \vdash b \Longrightarrow (\sigma,i',o)} {i \vdash \mathsf{Program}\;b \Longrightarrow o}$$

$$\frac{\psi \vdash s_1 \Longrightarrow \psi'}{\psi \vdash \mathsf{Block}\;s_1 \Longrightarrow \psi'}$$

$$\frac{\psi \vdash s_1 \Longrightarrow \psi' \;\;\; \psi' \vdash \mathsf{Block}\;s_2...s_n \Longrightarrow \psi''} {\psi \vdash \mathsf{Block}\;s_1...s_n \Longrightarrow \psi''}$$

$$\frac{}{\psi \vdash \mathsf{Var}\;i\;t \Longrightarrow \psi}$$

$$\frac{\sigma \vdash e \Longrightarrow x}{(\sigma,i,o) \vdash \mathsf{Assign}\;v\;e \Longrightarrow (\sigma[x/v.\mathit{referent}], i, o)}$$

$$\frac{}{(\sigma,\mathsf{[]},o) \vdash \mathsf{Read}\;v_1...v_n \Longrightarrow \bot}$$

$$\frac{}{(\sigma,x::i,o) \vdash \mathsf{Read}\;v \Longrightarrow (\sigma[x/v.\mathit{referent}], i, o)}$$

$$\frac{\psi \vdash \mathsf{Read}\;v_1 \Longrightarrow \psi' \;\;\; \psi' \vdash \mathsf{Read}\;v_2...v_n \Longrightarrow \psi''} {\psi \vdash \mathsf{Read}\;v_1...v_n \Longrightarrow \psi''}$$

$$\frac{\sigma \vdash e \Longrightarrow x}{(\sigma,i,o) \vdash \mathsf{Write}\;e \Longrightarrow (\sigma, i, o@x)}$$

$$\frac{\psi \vdash \mathsf{Write}\;e_1 \Longrightarrow \psi' \;\;\; \psi' \vdash \mathsf{Write}\;e_2...e_n \Longrightarrow \psi''} {\psi \vdash \mathsf{Write}\;e_1...e_n \Longrightarrow \psi''}$$

$$\frac{\sigma \vdash e \Longrightarrow \mathsf{false}}{(\sigma,i,o) \vdash \mathsf{While}\;e\;b \Longrightarrow (\sigma, i, o)}$$

$$\frac{\sigma \vdash e \Longrightarrow \mathsf{true} \;\;\; (\sigma,i,o) \vdash b \Longrightarrow \psi \;\;\; \psi \vdash \mathsf{While}\;e\;b \Longrightarrow \psi'} {(\sigma,i,o) \vdash \mathsf{While}\;e\;b \Longrightarrow \psi'}$$

$$\frac{}{\sigma \vdash \mathsf{VarExp}\;i\;v \Longrightarrow \sigma\,v}$$

$$\frac{}{\sigma \vdash \mathsf{Intlit}\;n \Longrightarrow n}$$

$$\frac{}{\sigma \vdash \mathsf{Boollit}\;b \Longrightarrow b}$$

$$\frac{\sigma \vdash e_1 \Longrightarrow \mathsf{false} \;\;\; \sigma \vdash e_2 \Longrightarrow y} {\sigma \vdash \mathsf{Binary\;or}\;e_1\;e_2 \Longrightarrow y}$$

$$\frac{\sigma \vdash e_1 \Longrightarrow \mathsf{true}} {\sigma \vdash \mathsf{Binary\;or}\;e_1\;e_2 \Longrightarrow \mathsf{true}}$$

$$\frac{\sigma \vdash e_1 \Longrightarrow \mathsf{false}} {\sigma \vdash \mathsf{Binary\;and}\;e_1\;e_2 \Longrightarrow \mathsf{false}}$$

$$\frac{\sigma \vdash e_1 \Longrightarrow \mathsf{true} \;\;\; \sigma \vdash e_2 \Longrightarrow y} {\sigma \vdash \mathsf{Binary\;and}\;e_1\;e_2 \Longrightarrow y}$$

$$\frac{\begin{gathered}op \in \{\mathsf{<}, \mathsf{<=}, \mathsf{==}, \mathsf{!=}, \mathsf{>=}, \mathsf{>}, \mathsf{+}, \mathsf{-}, \mathsf{*}\, \mathsf{/}, \mathsf{\%}\} \\ \sigma \vdash e_1 \Longrightarrow x \;\;\; \sigma \vdash e_2 \Longrightarrow y \end{gathered}} {\sigma \vdash \mathsf{Binary}\;op\;e_1\;e_2 \Longrightarrow op(x,y)}$$

$$\frac{\sigma \vdash e \Longrightarrow x\;\;\; op \in \{\mathsf{-}, \mathsf{not}\}} {\sigma \vdash \mathsf{Unary}\;op\;e \Longrightarrow op(x)}$$

Exercise: Rewrite the static semantics (given earlier on this page) in an operational style.

Microsemantics

The meaning of a program is a mapping from an input file to an output file; these correspond to the environment’s standard input and standard output files respectively. Integers are read from standard input as digit strings optionally prefixed by a single - and separated by whitespace. The first integer in the file may be preceded by whitespace and the last integer may be followed by whitespace. Note that the macrosemantics does allow a program to terminate without reading the entire input file; in this case an implementation must ignore the unread portion of the input file. If an error occurs during reading, the program aborts and must signal its environment in some fashion (e.g. it may set an exit status value, or write a message). Integers are written to the output file as digit strings, prefixed by a single - if negative, separated by a single space. There may be whitespace before the first integer and after the last.

Implementation Permissions

Sometimes there are things left unspecified in a syntax. In Iki, we didn’t mention a couple things, so a definition would probably include language like the following: “An implementation may fix a maximum length for identifiers, and may fix a maximum bit length for integers. Arithmetic may be signed or modular.”