Alternatives to Dynamic Code Generation in Go

Russ Cox

September 2012

The Go runtime uses dynamic code generation to implement closures. I took this approach primarily out of expedience: it avoided toolchain-wide changes to the representation of function values and to the function calling convention. However, it is clear that in the long term we should not depend on dynamic code generation, since it limits the environments in which Go can run. It also complicates minor parts of the toolchain: the stack trace code has ugly heuristics to handle closures, and the gdb support cannot get past a closure in a stack trace.

The canonical solution is to represent a closure as a pair of pointers, one to static code and one to a dynamic context giving access to the captured variables. This document describes the plan for doing this in the Go runtime and some of its implications. I’d like to do this for Go 1.1.

Func Representation

The representation of a Go func value is currently just a code pointer, so it must change. The only real option is to make it two words, a code pointer and a context pointer. This will affect Go struct and argument list layout and also code generation for copying function values.

The plan is to leave C function pointers as single-word pointers and to introduce a Func struct in the runtime C header, analogous to the Slice and String types:

        typedef struct Func Func;
        struct Func
        {
                void *fn;
                void *ctxt;
        };
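
To make the two-word layout concrete, here is a minimal Go model of it. This is only a sketch: the names funcValue, adderCtxt, addCode, and makeAdder are hypothetical and appear nowhere in the runtime, and the any-typed fields stand in for the two void* words. It shows that two closures can share one piece of static code and differ only in their context.

```go
package main

import "fmt"

// funcValue is a hypothetical model of the two-word Func struct:
// a pointer to static code plus a context pointer.
type funcValue struct {
	fn   func(ctxt any, x int) int // stands in for void *fn
	ctxt any                       // stands in for void *ctxt
}

// call invokes the static code and hands it the context.
func (f funcValue) call(x int) int { return f.fn(f.ctxt, x) }

// adderCtxt holds the variable captured by the closure.
type adderCtxt struct{ n int }

// addCode is static code: nothing is generated at run time.
func addCode(ctxt any, x int) int { return ctxt.(*adderCtxt).n + x }

// makeAdder models func(x int) int { return n + x } as a (code, context) pair.
func makeAdder(n int) funcValue {
	return funcValue{fn: addCode, ctxt: &adderCtxt{n: n}}
}

func main() {
	add3 := makeAdder(3)
	add10 := makeAdder(10)
	// Both values share addCode; only the context word differs.
	fmt.Println(add3.call(4), add10.call(4)) // prints "7 14"
}
```

Creating a closure in this model is an allocation of the context plus two word stores, with no executable memory involved.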

Calling Convention

The calling convention for Go functions must change to make the context pointer available to the call as an additional argument. There are two reasonable options here: use a register or use the stack.

The first option is to pass the context pointer in a fixed register, like AX on the x86 and R0 on the ARM. This has the significant benefit of not changing the on-stack frame layout, thus avoiding compiler changes and recalculation of frame offsets in assembly files. It allows the use of existing C functions as Go functions, since the (non-closure) C function would just ignore the incoming register value. However, it also introduces a new kind of function call, which will likely require changes to the optimizers and certainly make it impossible for C to call a Go closure directly. (It could still do so via an assembly stub.)

The second option is to treat the context pointer as a new first argument to the function. In C terms, a Go func(float64, int64) today is equivalent to a void(*)(float64, int64). This option would change that, so that the same Go func would be a C void(*)(void*, float64, int64): the new void* is the closure context pointer. Non-closure functions would have the slot but not read from it. Direct function calls, which necessarily do not call closures, would leave space for the slot but not initialize it, knowing that the called code will not read it. (Always leaving room for the slot means that any compiled function can be used directly as a code pointer in a func value.) Using the stack has the benefit of making the Go implementation directly expressible in C, avoiding the need for assembly stubs when calling Go functions from C. It also makes it possible to implement a closure-like Go function value as a C function, should the need arise. However, it requires frame layout adjustments in every Go function. Those written in Go or written in C using goc2c will be handled transparently by the Go compiler or by goc2c; those written in plain C will need a new void* argument; and those written in assembly will need all their frame offsets adjusted.
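
The stack option can be modeled in Go itself. The sketch below is an illustration under assumptions, not the compiler's output: the type names code and funcValue and the functions mul, mulScaled, and callIndirect are all hypothetical, and passing nil stands in for leaving the slot uninitialized on a direct call.

```go
package main

import "fmt"

// code is a hypothetical uniform signature: the context slot always comes first,
// mirroring void(*)(void*, float64, int64) with a result added for illustration.
type code func(ctxt *float64, a float64, b int64) float64

// funcValue pairs a code pointer with its context pointer.
type funcValue struct {
	fn   code
	ctxt *float64
}

// mul is an ordinary (non-closure) function: it has the slot but never reads it.
func mul(_ *float64, a float64, b int64) float64 { return a * float64(b) }

// mulScaled is closure code: it reaches its captured variable through the slot.
func mulScaled(ctxt *float64, a float64, b int64) float64 {
	return *ctxt * a * float64(b)
}

// callIndirect models a call through a func value: the context is passed first.
func callIndirect(f funcValue, a float64, b int64) float64 {
	return f.fn(f.ctxt, a, b)
}

func main() {
	scale := 10.0
	// Direct call: room is left for the slot, but its contents do not matter.
	direct := mul(nil, 1.5, 2)
	// Because mul already has the slot, it can be used as a code pointer unchanged.
	plain := callIndirect(funcValue{fn: mul}, 1.5, 2)
	closure := callIndirect(funcValue{fn: mulScaled, ctxt: &scale}, 1.5, 2)
	fmt.Println(direct, plain, closure)
}
```

The point of the uniform signature is visible in funcValue{fn: mul}: no wrapper or stub is needed to turn an ordinary function into a func value.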

The register approach changes almost no existing code but does change the execution model: assumptions made by the optimizer need to be revised, and it will be impossible to call Go directly from C. The stack approach changes nearly all existing (non-Go) code but leaves the execution model alone.

I am inclined toward the stack approach. I chose the current dynamic code generation approach because it changed almost no existing code, but three years later we find ourselves replacing it because we cannot support the changed execution model (namely that code can be generated at runtime). Although the stack approach is more invasive to existing code, leaving the execution model alone seems like the right long term path.

Adjusting the assembly stack offsets seems to me the largest transition cost for the stack approach. This can be automated with a tool, and assembly offsets are changing anyway (on 64-bit, because int is now 64 bits; on 32-bit, because we need to enforce 64-bit alignment of 64-bit values).

Method Values

In discussions a year ago, we reached consensus on what the meaning of method values would be if we added them to the language. They would look like:

        var r io.Reader
        f := r.Read  // f has type func([]byte) (int, error)
        n, err := f(p)

Even once we reached that agreement, I did not bother to send out a spec change, because the implementation of “f := r.Read” would hide the allocation of a closure. It seemed better to force people to write:

        f := func(p []byte) (int, error) { return r.Read(p) }

and make the closure explicit. Also I was lazy and did not want to implement it. But I have always treated it as a “someday we’ll want to do this.”

Moving to a two-word function value representation makes it worth revisiting this idea. The call to Read on an io.Reader interface value differs from the call to a value of f’s type in that it passes the receiver word in a leading argument on the stack. If we use the stack approach for calling func values, then the code pointer used in an interface method table is exactly the same kind of code pointer needed for a func value. The implementation of “f := r.Read” could simply copy the method pointer into the first word of f and the receiver word into the second. That is, it would be no more costly than an ordinary assignment.
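
The two-word copy can be approximated in Go with a method expression, which yields the method's code with the receiver as an explicit leading argument. The sketch below is a model only: readFunc and readVia are hypothetical names, and a concrete *strings.Reader is used in place of the receiver word taken from an interface value.

```go
package main

import (
	"fmt"
	"strings"
)

// readFunc is a hypothetical model of a two-word func value for Read.
type readFunc struct {
	fn   func(recv *strings.Reader, p []byte) (int, error) // method code, receiver first
	recv *strings.Reader                                   // receiver word, used as the context
}

// call passes the stored receiver as the leading argument.
func (f readFunc) call(p []byte) (int, error) { return f.fn(f.recv, p) }

// readVia binds Read to a reader over s and reads up to len(s) bytes through it.
func readVia(s string) (string, error) {
	r := strings.NewReader(s)
	// Binding is two word copies: the method's code and the receiver.
	f := readFunc{fn: (*strings.Reader).Read, recv: r}
	p := make([]byte, len(s))
	n, err := f.call(p)
	return string(p[:n]), err
}

func main() {
	got, err := readVia("hello")
	fmt.Println(got, err) // prints "hello <nil>"
}
```

No closure body is written or allocated here; the method's own code serves as the func value's code pointer.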

In a way this would tie the func value and interface value representations and calling conventions together. But in another way a func value is just a more limited form of an interface value: it’s like an interface with a single unnamed method. The simplicity of the conversion derives directly from using the same approach to implement both.

The fact that “f := r.Read” is disallowed is a common surprise among new programmers. We have an opportunity to make it just work, and cheaply. I think we probably should.

Reflection

In reflection, v.Method(i) returns a Value corresponding to the i’th method of v with the receiver v pre-bound. For example, assuming the i’th method is named F,

        v.Method(i).Call(ValueOf(x), ValueOf(y))

is the reflect equivalent of v.F(x, y). The fact that Method and Call are two different steps means that v.Method(i) by itself must evaluate to something. Today it evaluates to a reflect.Value that can be used in Call and have its Type inspected. However, the Interface method, as in

        v.Method(i).Interface()

panics, because there is no Go value to return in the interface{}.

With the new func value representation, it would be easy to create a method value here with v pre-bound, just as it is easy to create one in the “f := r.Read” case above. So the Interface call would not need to panic anymore. However, I am reluctant to fix this without having method values in the language proper: one of the reasons v.Method(i).Interface() does not today use dynamic code generation to create and return an appropriate closure (besides my own laziness) is that I did not want to give reflect more power than was available in the language proper. That is, I didn’t want to create a situation where you had to use reflect to get something done.

If we introduce method values, we should fix v.Method(i).Interface() too.
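
From the caller's side, the fixed behavior would look like the sketch below. The type T, its method F, and the helper boundF are hypothetical and chosen only for illustration. Current Go releases accept this code, since they provide method values, whereas at the time of writing the Interface call panics.

```go
package main

import (
	"fmt"
	"reflect"
)

// T and its method F are hypothetical, chosen only for illustration.
type T struct{ base int }

// F adds its arguments to the receiver's base.
func (t T) F(x, y int) int { return t.base + x + y }

// boundF uses reflect to obtain F with t pre-bound as an ordinary func value.
func boundF(t T) func(int, int) int {
	v := reflect.ValueOf(t)
	// T has a single exported method, so index 0 is F.
	m := v.Method(0)
	// Interface returns the bound method as a Go value instead of panicking.
	return m.Interface().(func(int, int) int)
}

func main() {
	f := boundF(T{base: 100})
	fmt.Println(f(1, 2)) // same result as T{base: 100}.F(1, 2)
}
```

The returned f behaves like the method value t.F, so reflect gains nothing that the language itself does not already offer.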

Conclusion

We should change closures not to use dynamic code generation. To do that, we must make func values two words. We should make the calling convention insert an extra leading context pointer argument for all Go functions; that will require changes to C and assembly functions called from Go, but it will preserve the execution model we have today. It is also very similar to the handling of interface calls. It is in fact so similar that it makes the implementation of method values trivial. After switching to the new funcs, we should spec and implement method values in both the language and package reflect.