Precise Stack Roots

At present, the Go runtime conservatively considers all locations between the stack pointer in the top stack segment and the base of the bottom stack segment to be stack roots.  This is an opportunity for space leaks as locations without live pointers may prevent dead objects from having their storage recycled.

Ideally, only locations with live pointers would be considered stack roots.  In a statically typed language such as Go, the compiler knows the type of every object and it knows which locations are live at any point in the program.  By making this information accessible to the garbage collector, the garbage collector can ignore the locations of dead pointers and non-pointer data.

The information about what stack locations contain live pointers as computed by the compiler is potentially very large and kept in a data structure with an in-memory representation. However, some simplifying assumptions can be made that reduce the data size and make it easily accessible from outside the compiler.

First, only information for locations that are observable by the garbage collector need to be made available.  In Go, this roughly corresponds to call instructions.  Second, while the Go compiler can store values in registers, the calling convention is caller-save, meaning live pointers will only be found on the stack at the time of a call.

Given these properties, a practical representation for liveness data would be a bitmap.  This bitmap must span locations between the return address and the stack pointer and contain a set bit for every live pointer.  A convenient location for this information would be the Func structure as it is serves as a repository for related information.  While it is conceivable that the bitmap data can grow to be very large, empirically, the bitmap data is regular and yields to space saving techniques such as structure sharing and difference encoding.

Implementing precise stack roots might proceed as follows

1) Visit stack frames instead of stack segments to collect stack roots.  This requires improving the precision of the stack walking routines.  Today, these routines have some best effort behaviors.  To be useful to the garbage collector they must be exact.

2) While visiting stack frames, ignore values between the local variables and the callee’s return address.  Live values in this area will be scanned by the callee.  This requires knowing the size of the local variables area.

3) Export liveness information from the register allocator, pass it through the linker, and attach it to the Func object.  This requires an additional pass in the optimizer that collects live-after information at interesting program points.  As an optimizer pass, only functions subjected to register allocation will be annotated with liveness information.  All other functions will be scanned conservatively.

4) Make liveness analysis a required compilation phase so unoptimized code can be annotated with liveness information.

Related issues

1) Pointers that flow into runtime calls need additional consideration.  If a runtime call blocks or invokes the garbage collector its roots must be made known to the garbage collector.  This requires a scheme where pointer values are be passed by reference and re-fetched after a blocking call.

2) Pointers that flow into native code might need to be prohibited.

3) A special debugging mode that writes a known and invalid bit pattern over dead values can be useful for debugging the stack scan.