Each invocation of a subroutine has a record associated with it at runtime, holding some or all of the following:
Note that some of the stack frame may be in registers and some of it may be in memory.
Activation records are also called stack frames. When the runtime supports closures or concurrency, something like a spaghetti stack is required.
Calling conventions specify:
The C-calling convention on the IA-32 is:
int example(int x, int y) { int a, b, c; ... f(y, a, b, b, x); ... }
The prolog/epilog for this function would look like:
push ebp ; must save old ebp mov ebp, esp ; point ebp to this frame sub esp, ___ ; make space for locals ... mov esp, ebp ; clean up locals pop ebp ; restore old ebp ret
with stack frame
+---------+ ebp-12 | a | +---------+ ebp-8 | b | +---------+ ebp-4 | c | +---------+ ebp | old ebp | +---------+ ebp+4 | retaddr | +---------+ ebp+8 | x | +---------+ ebp+12 | y | +---------+
The SPARC uses register windows
! ---------------------------------------------------------------------------- ! writeint ! ! A small program that contains a function to write an unsigned integer value ! to standard output. ! ---------------------------------------------------------------------------- .global _start .text ! writeint(unsigned int n) writes n to standard output as text. It is ! inefficient since it is recursive and invokes a system call to write ! each digit. ! ! It is equivalent to the C function ! ! void writeint(unsigned n) { ! if (n >= 10) writeint(n / 10); ! putchar(n + '0'); ! } writeint: save %sp, -96, %sp mov 10, %l0 ! Keep 10 in %l0 cmp %i0, %l0 ! Is it less than 10? bl writedigit ! If so just write the digit nop mov %g0, %y udiv %i0, %l0, %l1 ! n / 10 --> %l1 umul %l1, %l0, %l2 ! n * (n/10) --> %l2 sub %i0, %l2, %l2 ! n mod 10 --> %l2 call writeint mov %l1, %o0 ! call writeint(n / 10) mov %l2, %i0 ! Put digit into %i0 before writing writedigit: ! Writes the digit in %i0 add 48, %i0, %l1 ! Character value of the digit set digit, %o1 stb %l1, [%o1] ! Must be in memory to write mov 4, %g1 ! System call write mov 1, %o0 ! stdout mov 1, %o2 ! number of characters to write ta 0 ret restore ! A starting point for the program that just makes a couple of calls ! to the writeint function. _start: call writeint ! writeint(5) mov 5, %o0 call writeint ! writeint(1312) mov 1312, %o0 sethi %hi(7308841), %o0 ! writeint(7308841) call writeint or %o0, %lo(7308841), %o0 mov 1, %g1 ta 0 .data .align 4 digit: .skip 4
A leaf subroutine makes no calls. A good compiler can generate code to make a leaf subroutine's activation use the stack frame of its caller. The SPARC has direct support for this.
! write_stars(n) writes n stars followed by a newline write_stars: mov %o0, %l0 ! save the argument in %l0 mov 4, %g1 ! syscall write mov 1, %o0 ! stdout set star, %o1 ! message to write mov 1, %o2 ! one character in message write_star: tst %l0 ! no more characters to write be write_newline ! write terminating newline if so nop ta 0 ba write_star ! go see if more to write dec %l0 ! count off one more char as written write_newline: set newline, %o1 retl ta 0 ! write the newline (before returning) star: .ascii "*" newline: .ascii "\n"
A compiler inlines a subroutine call when it replaces the call with the body of the called subroutine (with all argument-parameter bindings resolved, of course). If all calls are inlined, there is no need to generate separate code for the subroutine. Inlining should be faster in general, since there is no call/return overhead (which is substantial for short subroutines). But if there are thousands of such calls you can end up with more compiled code.
And compilers have to be careful when inlining recursive subroutines....
Given
function f(a, b) { var x = 1; function g(a, y) { var z = 3; function h(z) { print x; } function k(c) { var x = c; if (x < 5) k(c+1); else h(1); } if (a == 0) g(1, 0); else k(2); } g(0, 0); }
The call sequence goes f → g → g → k → k → k → k → h. What gets printed? It depends on whether we have static or dynamic scoping.
If dynamically scoped, use a-lists or central reference tables.
If statically scoped, use static links (or displays). In this case we rely on the compiler to generate code that says:
"To find x from the body of k, take two steps down the static chain and look in offset -4".