Register Machines

Tape-based automata, like the Turing Machine, are not the only formal models of computation studied in computer science.

Why Registers?

Tape-based automata are quite minimalistic, and great for making a theory from just about the most primitive concepts. Registers feel a little higher-level, and also, let’s face it, real machines have registers.

The Basics

In automata theory, A register machine has a number of registers $r_0, r_1, r_2, \ldots$, each holding a single (unbounded) natural number. The machine’s control is a program, defined as a list of (labeled) instructions. The instructions are normally executed in sequence, but some instructions can cause a “jump” to an arbitrary instruction. The instruction set varies from model to model (of which there exist an infinite number of possibilities).

       ┌────────────────────┐
    r0 │                    │
       ├────────────────────┤
    r1 │                    │
       ├────────────────────┤
    r2 │                    │
       ├────────────────────┤
    r3 │                    │
       ├────────────────────┤
    r4 │                    │
       ├────────────────────┤

Don’t miss the Wikipedia article on register machines.

The simplest kind of register machines are those whose instruction set only allows registers to be incremented or decremented only. These are called counter machines.

Counter Machines

Here’s a sample instruction set (I just made it up) for a counter machine:

Various Counter Machine Instructions
$\textsf{CLEAR }r_i$$r_i := 0$
$\textsf{INC }r_i$$r_i := [r_i] + 1$
$\textsf{DEC }r_i$$r_i := [r_i] - 1$
$\textsf{COPY }r_1, r_j$$r_j := [r_i]$
$\textsf{JZ }r_i, L$If $[r_i] = 0$ then jump to $L$
$\textsf{JE }r_i, r_j, L$If $[r_i] = [r_j]$ then jump to $L$
$\textsf{JZDEC }r_i, L$If $[r_i] = 0$ then jump to $L$ else $r_i := [r_i] - 1$
$\textsf{JUMP } L$Jump to instruction at address $L$

This instruction set is pretty simple; it assumes the input is the contents of the registers before execution, and the output is the contents of $r_0$. The program is a list of instructions and the program ends when you have no more instructions.

For convenience, we can also imagine a machine that has separate streams for input and output, operated on by READ and WRITE instructions, respectively. We might even throw in a convenient instruction to halt the program:

More Instructions
$\textsf{READ }r_i$Consume next value from input stream and store in $r_i$
$\textsf{WRITE }r_i$Append contents of $r_i$ to output stream
$\textsf{HALT}$Stop executing

Because counter machine programs are primitive, like Turing Machines, programs are pretty long. Here’s a program that reads to integers, adds them, and writes their sum.

0 |  READ r0            // x
1 |  READ r1            // y
2 |  JZDEC r1, 5        // if y is 0, we're done
3 |  INC r0             // x++
4 |  JUMP 2             // loop back
5 |  WRITE r0           // there's the output

If you have the right set of instructions, you can show a register machine to be equivalent in computing power to a Turing machine.

Surprisingly, there exists a register machine that is equivalent in computing power to a Turing Machine that has only two instructions:

Simplest Counter Machine
$\textsf{INC }r_i$Increment register $r_i$
$\textsf{JZDEC }r_i, L$If register $r_i$ holds 0, jump to instruction at address $L$, otherwise decrement register $r$
O RLY?

Remember the reads and writes aren’t actually needed. You can still compute by having a convention such that the input of the program starts in register 0 and the output is read from register 1 whenever there are no more instructions to execute. Since each register is an unbounded integer, it can encode whatever the heck you like.

RAMs

A more sophisticated type of register machine is the RAM (Random Access Machine). Instruction sets vary, but usually contain arithmetic, logic, and jumps. Some instruction sets feature three-operands (sometimes called 3AC sets) and might look like this:

Restricted 3AC: Simple Direct Addressing Only
InstructionDescription
$\textsf{LOAD } n, r_k$$r_k := n$
$\textsf{COPY }r_1, r_j$$r_j := [r_i]$
$\textsf{ADD } r_i, r_j, r_k$$r_k := [r_i] + [r_j]$
$\textsf{SUB } r_i, r_j, r_k$$r_k := [r_i] - [r_j]$
... and so on for MUL, DIV, POW, AND, OR, LT, LE, EQ, NE, GE, GT, SHL, SHR
$\textsf{NEG } r_i, r_j$$r_j := - [r_i]$
$\textsf{NOT } r_i, r_j$$r_j := \neg [r_i]$
Can also include JUMP, JZ, JE, READ, WRITE, HALT from above

Now we can improve the model a great deal by adding two powerful ideas, (1) we can let our source operands be constant values (also known as immediate values), and (2) let our registers be indirect. For example:

COPY 5, r8     -- with imm operands, no need for LOAD
COPY 13, r5
COPY r8, r2    -- copy contents of r8 into r2
COPY *r8, r2   -- copy contents of r5 into r2
COPY r8, *r2   -- copy contents of r8 into r13
COPY *r8, *r2  -- copy contents of r5 into r13
NEG *r2, r3    -- copy negation of r13 into r3
ADD *r8, 5, r3
SUB r13, *r13, *r2 

Now sources can be immediates, direct registers, or indirect registers; destinations can be direct or indirect registers (not immediates!). So we can write:

3AC
InstructionDescription
$\textsf{COPY }src, dest$$\langle dest \rangle := \langle src \rangle$
$\textsf{ADD }src_1, src_2, dest$$\langle dest \rangle := \langle src_1 \rangle + \langle src_2 \rangle$
$\textsf{SUB }src_1, src_2, dest$$\langle dest \rangle := \langle src_1 \rangle - \langle src_2 \rangle$
... and so on for MUL, DIV, POW, AND, OR, LT, LE, EQ, NE, GE, GT, SHL, SHR
$\textsf{NEG }src, dest$$\langle dest \rangle := - \langle src \rangle$
$\textsf{NOT }src, dest$$\langle dest \rangle := \neg \langle src \rangle$
$\textsf{JZ }src, L$If $\langle src \rangle = 0$ then jump to $L$
$\textsf{JE }src_1, src_2, L$If $\langle src_1 \rangle = \langle src_2 \rangle$ then jump to $L$
$\textsf{JUMP } L$Jump to instruction at address $L$
$\textsf{READ }dest$$\langle dest \rangle := $ next value from input stream
$\textsf{WRITE }src$Append $\langle src \rangle$ to output stream
$\textsf{HALT}$Stop executing

Here is a GCD program on this machine:

// while y != 0:
//   x, y = y, x % y
// print(x)

0 |  READ r0            // x
1 |  READ r1            // y
2 |  JZ r1, 7           // if y is 0, we're done
3 |  COPY r1, r2        // y -> old_y (to "save" it)
4 |  MOD r0, r2, r1     // x % y -> y
5 |  COPY r2, r0        // old_y -> x
6 |  JUMP 2             // back to top of "loop"
7 |  WRITE r0

Rather than worrying about the position of instructions, let’s use labels:

// while y != 0:
//   x, y = y, x % y
// print(x)

  READ r0            // x will be in r0
  READ r1            // y will be in r1
start:
  JZ r1, wrapup      // if y is 0, we're done
  COPY r1, r2        // y -> old_y
  MOD r0, r2, r1     // x % old_y -> y
  COPY r2, r0        // old_y -> x
  JUMP start         // back to top of "loop"
wrapup:
  WRITE r0           // print x

Another variation requires the target to be one of the sources:

2AC
InstructionDescription
$\textsf{COPY }src, dest$$\langle dest \rangle := \langle src \rangle$
$\textsf{ADD }src, dest$$\langle dest \rangle := \langle dest \rangle + \langle src \rangle$
$\textsf{SUB }src, dest$$\langle dest \rangle := \langle dest \rangle - \langle src \rangle$
... and so on for MUL, DIV, POW, AND, OR, LT, LE, EQ, NE, GE, GT, SHL, SHR
$\textsf{NEG }dest$$\langle dest \rangle := - \langle dest \rangle$
$\textsf{NOT }dest$$\langle dest \rangle := \neg \langle dest \rangle$
$\textsf{JZ }src, L$If $\langle src \rangle = 0$ then jump to $L$
$\textsf{JUMP } L$Jump to instruction at address $L$
$\textsf{READ }dest$$\langle dest \rangle := $ next value from input stream
$\textsf{WRITE }src$Append $\langle src \rangle$ to output stream
$\textsf{HALT}$Stop executing

Code gets a little longer now:

// 2AC GCD PROGRAM

// while y != 0:
//   x, y = y, x % y
// print(x)

  READ r0            // x will be in r0
  READ r1            // y will be in r1
start:
  JZ r1, wrapup      // if y is 0, we're done
  COPY r0, r2        // x -> old_x
  COPY r1, r3        // y -> old_y
  MOD r3, r2         // old_x % old_y  
  COPY r2, r1        // new y for next iteration
  COPY r3, r0        // new x for next iteration
  JUMP start         // back to top of "loop"
wrapup:
  WRITE r0           // print x

We can even make the target register always be implicitly $r_0$, which, since this is always the destination for all the arithmetic operations, we call the accumulator:

1AC (Accumulator Machine)
InstructionDescription
$\textsf{LOAD }src$$r_0 := \langle src \rangle$
$\textsf{STORE }dest$$\langle dest \rangle := [r_0]$
$\textsf{ADD }src$$r_0 := [r_0] + \langle src \rangle$
$\textsf{SUB }src$$r_0 := [r_0] - \langle src \rangle$
... and so on for MUL, DIV, POW, AND, OR, LT, LE, EQ, NE, GE, GT, SHL, SHR
$\textsf{NEG }$$r_0 := - [r_0]$
$\textsf{NOT }$$r_0 := \neg [r_0]$
$\textsf{JZ }L$If $[r_0] = 0$ then jump to $L$
$\textsf{JUMP } L$Jump to instruction at address $L$
$\textsf{READ }$$r_0 := $ next value from input stream
$\textsf{WRITE }$Append $[r_0]$ to output stream
$\textsf{HALT}$Stop executing

Programs start to get a lot longer with this architecture:

// 1AC GCD PROGRAM

// while y != 0:
//   x, y = y, x % y
// print(x)

  READ            // x in r0
  STORE r1        // x in r1
  READ            // y in r0
start:
  JZ wrapup       // if y is 0, we're done
  STORE r2        // old_y saved in r2
  LOAD r1         // x in r0
  STORE r3        // old_x saved in r3
  MOD r2          // x % y in r0
  STORE r1        // new x for next iteration
  LOAD r2         // new y for next iteration
  JUMP start      // back to top of "loop"
wrapup:
  LOAD r1        // Now x is in r0
  WRITE          // So we can write x

We can make all the arguments implicit if we just organize all our registers like a stack:

0AC (Stack Machine)
InstructionDescription
$\textsf{LOAD } n$Push $n$
$\textsf{ADD}$Pop into $y$; pop into $x$; push $x + y$
$\textsf{SUB}$Pop into $y$; pop into $x$; push $x - y$
... and so on for MUL, DIV, POW, AND, OR, LT, LE, EQ, NE, GE, GT, SHL, SHR
$\textsf{NEG}$Pop into $x$; push $-x$
$\textsf{NOT}$Pop into $x$; push $\neg x$
$\textsf{JZ } L$Pop into $x$; If $x= 0$ jump to instruction at address $L$
$\textsf{JUMP } L$Jump to instruction at address $L$
$\textsf{HALT}$Stop executing

Stack machines are cool when executing arithmetic expressions. Work left to right. Example:

// Stack machine code for 7 - 3 * 8 ** 1 / 3

LOAD 7    // 7
LOAD 3    // 7 3
LOAD 8    // 7 3 8
LOAD 1    // 7 3 8 1
POW       // 7 3 (8**1)
MUL       // 7 (3*(8**1))
LOAD 3    // 7 (3*(8**1)) 3
DIV       // 7 ((3*(8**1))/3)
SUB       // 7-((3*(8**1))/3)

To be able to move things around with a stack architecture, we can add DUP and SWAP instructions, or even have a separate stash of labeled registers for storage.

// 0AC GCD PROGRAM

// while y != 0:
//   x, y = y, x % y
// print(x)

// Assuming separate stash of labeled registers for storage,
// BUT all operations can only occur on stack top.
// STORE instruction does a pop.

  READ
  STORE x
  READ
  STORE y
start:
  LOAD y
  JZ wrapup       // if y is 0, we're done
  LOAD x
  LOAD y
  STORE x         // new x is the old y
  LOAD y
  MOD             // x % y
  STORE y         // new y is x % y
  JUMP start      // back to top of "loop"
wrapup:
  LOAD x
  WRITE

RASPs

Another kind is called the RASP (Random Access Stored Program). Read the Wikipedia article for this one.

Exercise: What is the historical importance of the RASP model? Why was it created?

Summary

We’ve covered:

  • What register machines are and why they are interesting
  • Counter machines
  • The simplest counter machine that is Turing equivalent
  • 3AC RAMs
  • 2AC RAMs
  • 1AC RAMs (accumulator machines)
  • 0AC RAMs (stack machines)
  • RASPs