TODO
TODO
Each invocation of a subroutine has an activation record, or frameassociated with it at runtime, holding some or all of the following:
Note that some of the stack frame may be in registers and some of it may be in memory.
Activation records are also called stack frames. When the runtime supports closures or concurrency, something like a spaghetti stack is required.
Calling conventions specify:
TODO
The C-calling convention on the IA-32 is:
int example(int x, int y) { int a, b, c; ... f(y, a, b, b, x); ... }
The prolog/epilog for this function would look like:
push ebp ; must save old ebp mov ebp, esp ; point ebp to this frame sub esp, ___ ; make space for locals ... mov esp, ebp ; clean up locals pop ebp ; restore old ebp ret
with stack frame:
+---------+ ebp-12 | a | +---------+ ebp-8 | b | +---------+ ebp-4 | c | +---------+ ebp | old ebp | +---------+ ebp+4 | retaddr | +---------+ ebp+8 | x | +---------+ ebp+12 | y | +---------+
The SPARC uses register windows. You can read about these at Wikipedia. Let’s start with an example:
! ----------------------------------------------------------------------------
! writeint
!
! A small program that contains a function to write an unsigned integer value
! to standard output.
! ----------------------------------------------------------------------------
.global _start
.text
! writeint(unsigned int n) writes n to standard output as text. It is
! inefficient since it is recursive and invokes a system call to write
! each digit.
!
! It is equivalent to the C function
!
! void writeint(unsigned n) {
! if (n >= 10) writeint(n / 10);
! putchar(n + '0');
! }
writeint:
save %sp, -96, %sp
mov 10, %l0 ! Keep 10 in %l0
cmp %i0, %l0 ! Is it less than 10?
bl writedigit ! If so just write the digit
nop
mov %g0, %y
udiv %i0, %l0, %l1 ! n / 10 -> %l1
umul %l1, %l0, %l2 ! n * (n/10) -> %l2
sub %i0, %l2, %l2 ! n mod 10 -> %l2
call writeint
mov %l1, %o0 ! call writeint(n / 10)
mov %l2, %i0 ! Put digit into %i0 before writing
writedigit: ! Writes the digit in %i0
add 48, %i0, %l1 ! Character value of the digit
set digit, %o1
stb %l1, [%o1] ! Must be in memory to write
mov 4, %g1 ! System call write
mov 1, %o0 ! stdout
mov 1, %o2 ! number of characters to write
ta 0
ret
restore
! A starting point for the program that just makes a couple of calls
! to the writeint function.
_start:
call writeint ! writeint(5)
mov 5, %o0
call writeint ! writeint(1312)
mov 1312, %o0
sethi %hi(7308841), %o0 ! writeint(7308841)
call writeint
or %o0, %lo(7308841), %o0
mov 1, %g1
ta 0
.data
.align 4
digit: .skip 4
TODO - describe this example
A leaf subroutine makes no calls. A good compiler can generate code to make a leaf subroutine's activation use the stack frame of its caller. The SPARC has direct support for this. Example:
! write_stars(n) writes n stars followed by a newline
write_stars:
mov %o0, %l0 ! save the argument in %l0
mov 4, %g1 ! syscall write
mov 1, %o0 ! stdout
set star, %o1 ! message to write
mov 1, %o2 ! one character in message
write_star:
tst %l0 ! no more characters to write
be write_newline ! write terminating newline if so
nop
ta 0
ba write_star ! go see if more to write
dec %l0 ! count off one more char as written
write_newline:
set newline, %o1
retl
ta 0 ! write the newline (before returning)
star:
.ascii "*"
newline:
.ascii "\n"
TODO - describe this example
A compiler inlines a subroutine call when it replaces the call with the body of the called subroutine (with all argument-parameter bindings resolved, of course). If all calls are inlined, there is no need to generate separate code for the subroutine. Inlining should be faster in general, since there is no call/return overhead (which is substantial for short subroutines). But if there are thousands of such calls you can end up with more compiled code.
And compilers have to be careful when inlining recursive subroutines.
Given:
function f(a, b) { var x = 1; function g(a, y) { var z = 3; function h(z) { print x; } function k(c) { var x = c; if (x < 5) k(c+1); else h(1); } if (a == 0) g(1, 0); else k(2); } g(0, 0); }
The call sequence goes f → g → g → k → k → k → k → h. What gets printed? It depends on whether we have static or dynamic scoping.
If dynamically scoped, use a-lists or central reference tables.
If statically scoped, use static links (or displays). In this case we rely on the compiler to generate code that says:
"To find x from the body of k, take two steps down the static chain and look in offset -4".
TODO
TODO
TODO
TODO
We’ve covered: