Squid is a little IR I made up myself.

Squid stands for **S**imple **Qu**adruple-based **I**ntermediate **D**escription. It is a fairly useful and extensible intermediate language for imperative, block-structured languages.

A Squid description consists of a collection of subroutines, each of which is either an internal subroutine or an external subroutine. A subroutine has a list of tuples. Tuples consist of an operator and zero to three arguments. Arguments are literals, variables, temporaries, subroutines, or labels.

The set of Squid tuples is not fixed. Anyone can add more if desired. The core set is as follows:

Tuple | Rendered as... | Description |
---|---|---|

(COPY,x,y) | y := x | Simple copy |

(COPY_FROM_DEREF,x,y) | y := *x | Copy the contents of memory at address x into y |

(COPY_TO_DEREF,x,y) | *y := x | Copy x into the memory at address y |

(COPY_FROM_OFS,x,y,z) | z := *(x+y) | Copy the contents of memory at address x+y into z |

(COPY_TO_OFS,x,y,z) | *(y+z) := x | Copy x into the memory at address y+z |

(ADD,x,y,z) | z := x + y | Sum |

(SUB,x,y,z) | z := x - y | Difference |

(MUL,x,y,z) | z := x * y | Product |

(DIV,x,y,z) | z := x / y | Quotient |

(MOD,x,y,z) | z := x mod y | Modulo |

(REM,x,y,z) | z := x rem y | Remainder |

(POWER,x,y,z) | z := pow x, y | Exponentiation |

(SHL,x,y,z) | z := x << y | Left shift |

(SHR,x,y,z) | z := x >> y | Logical right shift |

(SAR,x,y,z) | z := x >>> y | Arithmetic right shift |

(AND,x,y,z) | z := x & y | Bitwise conjunction |

(OR,x,y,z) | z := x | y | Bitwise disjunction |

(XOR,x,y,z) | z := x xor y | Bitwise exclusive or |

(NOT,x,y) | y := !x | Logical complement |

(NEG,x,y) | y := -x | Negation |

(COMP,x,y) | y := ~x | Bitwise complement |

(ABS,x,y) | y := abs x | Absolute value |

(SIN,x,y) | y := sin x | Sine |

(COS,x,y) | y := cos x | Cosine |

(ATAN,x,y,z) | z := atan x,y | Arctangent |

(LN,x,y) | y := ln x | Natural logarithm |

(SQRT,x,y) | y := sqrt x | Square root |

(INC x) | inc x | Increment |

(DEC,x) | dec x | Decrement |

(INC_DEREF x) | inc *x | Increment a memory location given its address |

(DEC_DEREF,x) | dec *x | Decrement a memory location given its address |

(LT,x,y,z) | z := x < y | 1 if x is less than y,0 otherwise |

(LE,x,y,z) | z := x <= y | 1 if x is less than or equal to y; 0 otherwise |

(EQ,x,y,z) | z := x == y | 1 if x is equal to y; 0 otherwise |

(NE,x,y,z) | z := x != y | 1 if x is not equal to y; 0 otherwise |

(GE,x,y,z) | z := x >= y | 1 if x is greater than or equal to y; 0 otherwise |

(GT,x,y,z) | z := x > y | 1 if x is greater than y; 0 otherwise |

(LABEL,L) | L: | Label |

(JUMP,L) | goto L | Unconditional jump to a label |

(JZERO,x,L) | if x == 0 goto L | Jump if zero / Jump if false |

(JNZERO,x,L) | if x != 0 goto L | Jump if zero / Jump if true |

(JLT,x,y,L) | if x < y goto L | Jump if less / Jump if not greater or equal |

(JLE,x,y,L) | if x <= y goto L | Jump if less or equal / Jump if not greater |

(JEQ,x,y,L) | if x == y goto L | Jump if equal |

(JNE,x,y,L) | if x != y goto L | Jump if not equal |

(JGE,x,y,L) | if x >= y goto L | Jump if greater or equal / Jump if not less |

(JGT,x,y,L) | if x > y goto L | Jump if greater / Jump if not less |

(IJE,x,y,L) | if ++x == y goto L | Increment and jump if equal. Useful at end of for-loops that count up to a given value. |

(DJNZ,x,L) | if --x != 0 goto L | Decrement and jump if not zero. Great for loops that count down to zero. |

(STARTCALL,f) | startcall f | Begin the call to function f: a STARTCALL tuple should be followed by zero or more PARAM tuples and then a CALLP or CALLF tuple. This operation is provided for languages that allow nested functions and normally pass static links before parameters. |

(PARAM,x,p) | param x, p | Pass x as an argument to parameter p |

(CALLP,f) | call f | Call procedure (non-value returning function) f |

(CALLF,f,x) | x := call f | Call function f, catching return value in x |

(RETP) | ret | Return from procedure |

(RETF,x) | ret x | Return x from this function |

(INT_TO_STRING,x,y) | y := to_string x | String representation of an integer (one can add a new tuple to take a base) |

(FLOAT_TO_STRING,x,y) | y := to_string x | String representation of a float (one can add a new tuple taking a format string) |

(BOOL_TO_STRING,x,y) | y := to_string x | String representation of a boolean (localized?) |

(CHAR_TO_STRING,x,y) | y := to_string x | String representation of a character |

(ALLOC,x,y) | y := alloc x | Allocate x bytes of memory, returning the address in y (or 0 if the memory could not be allocated). |

(TO_FLOAT,x,y) | y := to_float x | Integer to real |

(NULL_CHECK,x) | assert x != 0 | Do something if x is null |

(ASSERT_POSITIVE,x) | assert x > 0 | Do something if x is not positive |

(BOUND,x,y,z) | assert y <= x < z | Do something if x is not between y (inclusive) and x (inclusive) |

(NO_OP) | nop | Do nothing |

(EXIT) | exit | terminate the program |

Squid is designed to be pretty flexible, so some aspects of this definition are left unspecified. For example, what happens if an assertion fails? That depends on the source language; therefore, a given translator will fill in these details. How big is an integer, or a float? How are characters encoded? Also left up to the implementation.

Tuple arguments in Squid are limited to:

- labels
- subroutines
- simple, non-structured, values (literals, variables, and temporaries)

By non-structured we mean items that fit into a single fixed-size machine unit. Strings, arrays, and structs are always referenced through a pointer.

Squid does not specify the kinds or sizes of the value data types. It assumes, of course, integers and floating-point values, but allows for a several different types of various sizes of these. A specific translation would define these types, and a back end would arrange for type conversions. For example:

(ADD, x, i, y)

where x and y are floating point and i is an integer, is an allowable tuple, and would be turned into backend code that explicitly or implicitly convertes i to a floating-point value (of the right size) before doing a floating-point addition.

Similarly, a particular translation may include unsigned integer types (as one would in C). A backend has to take care of these cases. For example, with an x86 backend, a JGE tuple would become a `jge`

instruction with signed arguments, but a `jae`

with unsigned arguments.

Because the set of data types is not specified in Squid, many tuples take arguments that represent sizes in bytes. It is up to a specific translation to compute these values.

Squid has integer, floating-point, and string literals. Enumeration, character, and boolean literals in a source language should be turned into simple integer literals by a front-end.

Squid has no other literals, such as for arrays or structs.

Squid variables represent run-time storage locations. Squid does not care if these locations are in static memory, in a stack frame, or the heap, but it does keep track of the static nesting level for a variable, which is useful for languages with arbitrarily nested subroutines. In languages without subroutine nesting, such as C, global varaibles would have level 0 and all others would have level 1.

Squid allows two types of variables, integer variables and floating point variables, since most hardware makes the same distinction. As of right now, these are the only two variable types. Any further interpretation is up to the compiler front or back end using Squid.

Temporaries differ from variables in that variables come from source language variables that are actually declared, while temporaries are the result of evaluating expressions. For example, if a source language featured the expression

x + y * z

the corresponding Squid tuple sequence would be:

(MUL, y, z, r0) (ADD, x, r0, r1)

where r0 and r1 are temporaries. Temporaries come in two flavors: value temporaries and address temporaries. An address temporary would arise from evaluating an expression like:

a[i].x

since the address of the corresponding storage location has to be computed at run-time (assuming the value of i is not known at compile-time).

Value temporaries can be either integer or floating-point, while address temporaries are always themselves integer-valued, since their value is an address, but internally they track whether they are referencing an integer or floating-point object.

Squid distinguishes between subroutines defined in the current translation unit (user subroutines) and those defined elsewhere (external subroutines).

A user subroutine denotes a function in the current translation unit. It contains the following properties:

**level**: the static nesting level**parent**: the subroutine in which this subroutine is nested within**locals**: the set of local variables**parameters**: the list of parameters**tuples**: the list of tuples for the body

External subroutines represent top-level functions callable by the current translation unit but defined elsewhere. They are known only by name. Example:

print(x+4); |
add i0, 4, r3 param r3 param s0 call __print |

Labels are used to hold the structured control graph in a linearized list of tuples.

while (x < 3) { y = 10; x = x * 5; } |
L1: if i0 >= 3 goto L2 i1 := 10 r3 := i0 * 5 i0 := r3 goto L1 L2: |

A simple example:

string f(real i) { return "dog" unless i >= 3.3; } string test = f(cos(2.2)); |
p0: ; level=0 parent=null params=[] locals=[i0] f0 := cos 2.2 startcall p1 f1 := cos 2.2 param f1 r1 := call p1 i0 := r1 exit p1: ; level=1 parent=p0 params=[x1] locals=[] r0 := x1 >= 3.3 if r0 != 0 goto L0 ret s0 L0: call __terminate_thread s0: [100, 111, 103] |

An example with heap-allocated arrays and structures:

int[] a; a[6] = 20; real x = a[80]; a = new int[]{4, 2, 1}; struct Point { int x; int y; } Point[] polygon; print(polygon[5].y); |
p0: ; level=0 parent=null params=[] locals=[i0, x1, i1] assert i0 != 0 r0 := *(i0+-4) assert 0 <= 6 < r0 r1 := 6 * 4 r2 := i0 + r1 *r2 := 20 assert i0 != 0 r3 := *(i0+-4) assert 0 <= 80 < r3 r4 := 80 * 4 r5 := i0 + r4 r6 := *r5 x1 := r6 r7 := alloc 16 *r7 := 3 r7 := r7 + 4 *(r7+0) := 4 *(r7+4) := 2 *(r7+8) := 1 i0 := r7 assert i1 != 0 r8 := *(i1+-4) assert 0 <= 5 < r8 r9 := 5 * 4 r10 := i1 + r9 r11 := *r10 assert r11 != 0 r12 := r11 + 4 r14 := *r12 r13 := int_to_str r14 param r13 call __print exit |

An example with threads:

void f() { for (int i = 0; i < 20000; i++) { print (i + " "); } } thread t1 = start f(); thread t2 = start f(); join(t1); join(t2); |
p0: ; level=0 parent=null params=[] locals=[i1, i2] param 0 param p1 r3 := call __create_thread i1 := r3 param 0 param p1 r4 := call __create_thread i2 := r4 param i1 call __join param i2 call __join exit p1: ; level=1 parent=p0 params=[] locals=[i0] i0 := 0 L0: if i0 >= 20000 goto L1 r1 := int_to_str i0 param s0 param r1 r2 := call __concat_string param r2 call __print inc i0 goto L0 L1: ret s0: [32] |