Tuples are instruction-like entities consisting of an operator and zero to three arguments. Arguments are literals, variables, temporaries, subroutines, or labels.
Typical tuples are:
Tuple | Rendered as... | Description |
---|---|---|
(COPY,x,y) | y := x | Simple copy |
(COPY_FROM_DEREF,x,y) | y := *x | Copy the contents of memory at address x into y |
(COPY_TO_DEREF,x,y) | *y := x | Copy x into the memory at address y |
(COPY_FROM_OFS,x,y,z) | z := *(x+y) | Copy the contents of memory at address x+y into z |
(COPY_TO_OFS,x,y,z) | *(y+z) := x | Copy x into the memory at address y+z |
(ADD,x,y,z) | z := x + y | Sum |
(SUB,x,y,z) | z := x - y | Difference |
(MUL,x,y,z) | z := x * y | Product |
(DIV,x,y,z) | z := x / y | Quotient |
(MOD,x,y,z) | z := x mod y | Modulo |
(REM,x,y,z) | z := x rem y | Remainder |
(POWER,x,y,z) | z := pow x, y | Exponentiation |
(SHL,x,y,z) | z := x << y | Left shift |
(SHR,x,y,z) | z := x >> y | Logical right shift |
(SAR,x,y,z) | z := x >>> y | Arithmetic right shift |
(AND,x,y,z) | z := x & y | Bitwise conjunction |
(OR,x,y,z) | z := x | y | Bitwise disjunction |
(XOR,x,y,z) | z := x xor y | Bitwise exclusive or |
(NOT,x,y) | y := !x | Logical complement |
(NEG,x,y) | y := -x | Negation |
(COMP,x,y) | y := ~x | Bitwise complement |
(ABS,x,y) | y := abs x | Absolute value |
(SIN,x,y) | y := sin x | Sine |
(COS,x,y) | y := cos x | Cosine |
(ATAN,x,y,z) | z := atan x,y | Arctangent |
(LN,x,y) | y := ln x | Natural logarithm |
(SQRT,x,y) | y := sqrt x | Square root |
(INC x) | inc x | Increment |
(DEC,x) | dec x | Decrement |
(INC_DEREF x) | inc *x | Increment a memory location given its address |
(DEC_DEREF,x) | dec *x | Decrement a memory location given its address |
(LT,x,y,z) | z := x < y | 1 if x is less than y,0 otherwise |
(LE,x,y,z) | z := x <= y | 1 if x is less than or equal to y; 0 otherwise |
(EQ,x,y,z) | z := x == y | 1 if x is equal to y; 0 otherwise |
(NE,x,y,z) | z := x != y | 1 if x is not equal to y; 0 otherwise |
(GE,x,y,z) | z := x >= y | 1 if x is greater than or equal to y; 0 otherwise |
(GT,x,y,z) | z := x > y | 1 if x is greater than y; 0 otherwise |
(LABEL,L) | L: | Label |
(JUMP,L) | goto L | Unconditional jump to a label |
(JZERO,x,L) | if x == 0 goto L | Jump if zero / Jump if false |
(JNZERO,x,L) | if x != 0 goto L | Jump if zero / Jump if true |
(JLT,x,y,L) | if x < y goto L | Jump if less / Jump if not greater or equal |
(JLE,x,y,L) | if x <= y goto L | Jump if less or equal / Jump if not greater |
(JEQ,x,y,L) | if x == y goto L | Jump if equal |
(JNE,x,y,L) | if x != y goto L | Jump if not equal |
(JGE,x,y,L) | if x >= y goto L | Jump if greater or equal / Jump if not less |
(JGT,x,y,L) | if x > y goto L | Jump if greater / Jump if not less |
(IJE,x,y,L) | if ++x == y goto L | Increment and jump if equal. Useful at end of for-loops that count up to a given value. |
(DJNZ,x,L) | if --x != 0 goto L | Decrement and jump if not zero. Great for loops that count down to zero. |
(STARTCALL,f) | startcall f | Begin the call to function f: a STARTCALL tuple should be followed by zero or more PARAM tuples and then a CALLP or CALLF tuple. This operation is provided for languages that allow nested functions and normally pass static links before parameters. |
(PARAM,x,p) | param x, p | Pass x as an argument to parameter p |
(CALLP,f) | call f | Call procedure (non-value returning function) f |
(CALLF,f,x) | x := call f | Call function f, catching return value in x |
(RETP) | ret | Return from procedure |
(RETF,x) | ret x | Return x from this function |
(INT_TO_STRING,x,y) | y := to_string x | String representation of an integer (one can add a new tuple to take a base) |
(FLOAT_TO_STRING,x,y) | y := to_string x | String representation of a float (one can add a new tuple taking a format string) |
(BOOL_TO_STRING,x,y) | y := to_string x | String representation of a boolean (localized?) |
(CHAR_TO_STRING,x,y) | y := to_string x | String representation of a character |
(ALLOC,x,y) | y := alloc x | Allocate x bytes of memory, returning the address in y (or 0 if the memory could not be allocated). |
(TO_FLOAT,x,y) | y := to_float x | Integer to real |
(NULL_CHECK,x) | assert x != 0 | Do something if x is null |
(ASSERT_POSITIVE,x) | assert x > 0 | Do something if x is not positive |
(BOUND,x,y,z) | assert y <= x < z | Do something if x is not between y (inclusive) and x (inclusive) |
(NO_OP) | nop | Do nothing |
(EXIT) | exit | terminate the program |
Different authors of tuple systems will make different design choices. For example, what happens if an assertion fails? That depends on the source language; therefore, a given translator will fill in these details. How big is an integer, or a float? How are characters encoded? Also left up to the implementation.
Tuple arguments are generally low-level entities, such as:
By non-structured we mean items that fit into a single fixed-size machine unit. Strings, arrays, and structs are always referenced through a pointer.
It is not necessary to specify the kinds or sizes of the value data types. Integers and floating-point values are generally assumed, but a system will generally allow for a several different types of various sizes of these. A specific translation would define these types, and a back end would arrange for type conversions. For example:
(ADD, x, i, y)
where x and y are floating point and i is an integer, is an allowable tuple, and would be turned into backend code that explicitly or implicitly converts i to a floating-point value (of the right size) before doing a floating-point addition.
Similarly, a particular translation may include unsigned integer types (as one would in C). A backend has to take care of these cases. For example, with an x86 backend, a JGE tuple would become a jge
instruction with signed arguments, but a jae
with unsigned arguments.
In practice, tuple arguments can sometimes take arguments that represent sizes in bytes. It is up to a specific translation to compute these values.
In tuple-base representations, source language features like enumeration, character, and boolean literals should be turned into simple integer literals by a compiler front-end. Ditto for arrays and structs: tuples generally maniuplate individual word-like entities; large structures are flattened into smaller things.
Variables represent run-time storage locations. It’s not important, really. if these locations are in static memory, in a stack frame, or the heap, but it’s good to keep track of the static nesting level for a variable, as this is useful for languages with arbitrarily nested subroutines. In languages without subroutine nesting, such as C, global variables would have level 0 and all others would have level 1.
You might often see two types of variables, integer variables and floating point variables, since most hardware makes the same distinction.
Temporaries differ from variables in that variables come from source language variables that are actually declared, while temporaries are the result of evaluating expressions. For example, if a source language featured the expression
x + y * z
the corresponding tuple sequence would be:
(MUL, y, z, r0) (ADD, x, r0, r1)
where r0 and r1 are temporaries. Temporaries come in two flavors: value temporaries and address temporaries. An address temporary would arise from evaluating an expression like:
a[i].x
since the address of the corresponding storage location has to be computed at run-time (assuming the value of i is not known at compile-time).
Value temporaries can be either integer or floating-point, while address temporaries are always themselves integer-valued, since their value is an address, but internally they track whether they are referencing an integer or floating-point object.
It might be helpful to distinguish between subroutines defined in the current translation unit (user subroutines) and those defined elsewhere (external subroutines).
A user subroutine denotes a function in the current translation unit. It contains the following properties:
External subroutines represent top-level functions callable by the current translation unit but defined elsewhere. They are known only by name. Example:
print(x+4); |
add i0, 4, r3 param r3 param s0 call __print |
Labels are used to hold the structured control graph in a linearized list of tuples.
while (x < 3) { y = 10; x = x * 5; } |
L1: if i0 >= 3 goto L2 i1 := 10 r3 := i0 * 5 i0 := r3 goto L1 L2: |
A simple example:
string f(real i) { return "dog" unless i >= 3.3; } string test = f(cos(2.2)); |
p0: ; level=0 parent=null params=[] locals=[i0] f0 := cos 2.2 startcall p1 f1 := cos 2.2 param f1 r1 := call p1 i0 := r1 exit p1: ; level=1 parent=p0 params=[x1] locals=[] r0 := x1 >= 3.3 if r0 != 0 goto L0 ret s0 L0: call __terminate_thread s0: [100, 111, 103] |
An example with heap-allocated arrays and structures:
int[] a; a[6] = 20; real x = a[80]; a = new int[]{4, 2, 1}; struct Point { int x; int y; } Point[] polygon; print(polygon[5].y); |
p0: ; level=0 parent=null params=[] locals=[i0, x1, i1] assert i0 != 0 r0 := *(i0+-4) assert 0 <= 6 < r0 r1 := 6 * 4 r2 := i0 + r1 *r2 := 20 assert i0 != 0 r3 := *(i0+-4) assert 0 <= 80 < r3 r4 := 80 * 4 r5 := i0 + r4 r6 := *r5 x1 := r6 r7 := alloc 16 *r7 := 3 r7 := r7 + 4 *(r7+0) := 4 *(r7+4) := 2 *(r7+8) := 1 i0 := r7 assert i1 != 0 r8 := *(i1+-4) assert 0 <= 5 < r8 r9 := 5 * 4 r10 := i1 + r9 r11 := *r10 assert r11 != 0 r12 := r11 + 4 r14 := *r12 r13 := int_to_str r14 param r13 call __print exit |
An example with threads:
void f() { for (int i = 0; i < 20000; i++) { print (i + " "); } } thread t1 = start f(); thread t2 = start f(); join(t1); join(t2); |
p0: ; level=0 parent=null params=[] locals=[i1, i2] param 0 param p1 r3 := call __create_thread i1 := r3 param 0 param p1 r4 := call __create_thread i2 := r4 param i1 call __join param i2 call __join exit p1: ; level=1 parent=p0 params=[] locals=[i0] i0 := 0 L0: if i0 >= 20000 goto L1 r1 := int_to_str i0 param s0 param r1 r2 := call __concat_string param r2 call __print inc i0 goto L0 L1: ret s0: [32] |