1 of 26

Garbage Collection in GO

Aagosh Aggarwal�Backend Engineer�Merwork Team

1

2 of 26

Types of Memory

2

3 of 26

Which variable goes where?

  • Go needs to figure out which variable/object needs to be created on the stack and which should escape to the heap.�
  • It uses Escape Analysis to do that.

3

4 of 26

Escape Analysis

Go does escape analysis at compile time to figure out which variables should be allocated on the stack and which ones should escape to the Heap.

�From the Golang FAQ:�When possible, the Go compilers will allocate variables that are local to a function in that function's stack frame. However, if the compiler cannot prove that the variable is not referenced after the function returns, then the compiler must allocate the variable on the garbage-collected heap to avoid dangling pointer errors. Also, if a local variable is very large, it might make more sense to store it on the heap rather than the stack.

4

5 of 26

Stack Allocation

package main

func main() {

n := 1

n = increment(n)

println(n)

}

func increment(x int) int {

return x+1

}

  • In this code the variables exist on Stack.

  • The variables do not escape to the Heap as these are not referenced once the function returns.

5

6 of 26

func main()��n = 1

package main

func main() {

n := 1

n = increment(n)

println(n)

}

func increment(x int) int {

return x+1

}

func increment()

x = 2

func main()

n = 2

func main()�n = 2

func println()�n =2

func increment()

x = 2

func increment()

x = 2

Here the stack frames marked red are invalid as the function has returned. Go reclaims these invalid stack frames itself.

6

7 of 26

Heap Allocation

package main

func main() {

n := allocate()

*n++

println(*n)

}

func allocate() *int {

x := 1

return &x

}

  • In this code the variable x escapes to the Heap at compile time.

  • This is because it is referenced after the increment function returns, and if it would have been in the stack frame for that function, it would lead to Dangling Pointer errors.

7

8 of 26

func allocate()��x = 1

package main

func main() {

n := allocate()

*n++

println(*n)

}

func allocate() *int {

x := 1

return &x

}

func main()

n = address of X

func main()

n = address of X

func allocate()

x = 1

This will lead to Dangling pointer error

8

9 of 26

func allocate()�

package main

func main() {

n := allocate()

*n++

println(*n)

}

func allocate() *int {

x := 1

return &x

}

func main()

n = address of X

HEAP�x = 1

9

10 of 26

Garbage Collection

  • Memory Management system used to reclaim the allocated memory which is no longer referenced by the program.

  • Needed to make sure that the program is not using up all the available memory and it does not run out of memory.

  • Only associated with the Heap Memory. Stacks are self cleaning and do not require GC to free up the memory.

10

11 of 26

Go’s Garbage Collector

Go uses a non-generational, non-compacting, concurrent, tricolor mark and sweep Garbage collector.

Non-Generational: Generational GC relies on generational hypothesis: most allocated values are unused quickly, so it is better for the GC to run through the recently allocated objects first. �Because GO uses escape analysis and tries to allocate objects with short life-span on the stack, so it does not really need a Generational Garbage collector.

Non-Compacting: Compacting GC ensures that the spaces between the objects in heap are compacted. This way the heap memory remains contiguous. The allocator used by Go, tcmalloc, reduces fragmentation by itself and so it does not use a compacting GC as of now.

Concurrent: GO’s garbage collector runs concurrently with the application processes. It does not STW (Stop the world) during the entire collection process.

Tricolor Mark and Sweep: This is the process through which collection takes place.

11

12 of 26

Phases of Garbage Collection

  • Mark Setup

  • Marking

  • Mark Termination

12

13 of 26

Mark Setup

  • Mark Setup phase involves turning the Write Barrier ON.

  • Write Barrier ensures that the data integrity during the collection phase is maintained as both the Garbage collection and application processes will be running concurrently.

  • Once the write barrier is turned ON, the pointer changes are tracked. This makes sure that no new Live Objects are garbage collected.

  • To turn the Write Barrier ON, STW (Stop The World) state needs to be achieved. This means that all the running goroutines are stopped during the Mark Setup Phase.

GC Collection Starts

Stop the World

Write Barrier turned ON

Start

the World

13

14 of 26

Stop the World

M

M

M

M

P1

P4

P3

P2

G

G

G

G

14

15 of 26

Marking

  • During the marking phase, Garbage collector takes 25% of the available CPU for itself. This percentage is configurable.

  • This means that the Application runs at a reduced CPU capacity during this time.

  • During the Marking phase, the values that are still in use in the heap memory are marked.

  • All objects are considered white at the beginning of the marking phase.

  • The root objects/pointers are marked grey.

  • Then a grey object is picked and is marked black.

  • All the pointers from this object are checked and the referenced objects are marked grey.

  • The last two steps are repeated till there are no more objects remaining to be colored.

  • At the end of this, all the objects that are marked white are dead and can be garbage collected.

15

16 of 26

Marking

Image source: https://developer.com

16

17 of 26

Marking

M

M

M

M

P1

P4

P3

P2

GC

G

G

G

17

18 of 26

Mark Assist

  • As mentioned earlier, marking is a concurrent activity. The application goroutines run concurrently with the garbage collector.

  • If in case during the collection, collector realizes that the heap allocation from other goroutines will fill up the entire heap before the collector can complete the collection, the collector will recruit more application goroutines to complete the collection work. This is called Mark Assist.

  • Collector will try to recruit the goroutine which is allocating the most.

  • Mark Assists further increase the latency of the application as during mark assists even more application

Goroutines are involved in the collection process.

  • The garbage collector tries to avoid the need for mark assists. For this, it tries to start the subsequent garbage collections earlier. This way it tries to make sure that the mark assists are not required in these subsequent cycles.

18

19 of 26

Mark Assist

M

M

M

M

P1

P4

P3

P2

GC

MA

G

G

19

20 of 26

Mark Termination

  • After the Marking phase completes, Mark Termination phase starts. During this phase write barriers are turned off. There is a STW (Stop the World) during this phase as well.

  • Once the write barriers are turned OFF, next calculation goal is calculated. This determines when the next GC cycle should start.

Stop the World

Write barriers turned off

Start the World

Next GC Goal calculated

20

21 of 26

Sweeping

  • During the garbage collection, the memory blocks that were not in use are marked.

  • Sweeping is the process through which these memory blocks are reclaimed.

  • It is done when the application goroutines try to allocate memory in the heap.

21

22 of 26

Pacing

  • Algorithm through which the GC decides when to run the next collection cycle.

  • Depends on the amount of heap allocations that the application is doing.

  • There is a percentage - GC percentage which determines the ratio of the additional heap memory that can be allocated after a GC cycle completes, before the next GC cycle is kicked in.�
  • For example; if GC percentage is 100% (default value), and the heap memory in use after a cycle was 10 MB, then the next cycle will be initiated when the total heap memory allocated will be 20 MB.

  • Each GC cycle incurs a latency on the application. There are 2 STW latencies and latencies introduced by running the marking phase on application goroutines. All of these latencies impact the application latency.

  • We need to make sure that the application code is GC friendly to make sure that these latencies are not causing unexpected problems.

  • Although reducing the number of GC cycles can be seen as a good option, but that would mean that there are a lot of allocations b/w the cycles and so each cycle will then increase the latency a lot.

  • So we need to find the perfect balance between the number of cycles and the amount of heap memory being allocated.

22

23 of 26

GC Percentage

If heap memory in use after a GC cycle was 10 MB and the GC percentage is 100%, that means that the next GC will be triggered when the heap memory would be 20 MB.

Next Garbage collection initiated when the heap memory is 20 MB.

10 MB

20 MB

23

24 of 26

Writing GC Friendly Code

func allocateSlice() []int {

slice1 := make([]int, 10000000)

for i:=0; i<len(slice1); i++ {

slice1[i] = i

}

// slice2 uses the same underlying array as slice1

slice2 := slice1[1:3]

return slice2

}

func main() {

slice := allocateSlice()

runtime.GC()

slice = append(slice, 5)

runtime.GC()

}

Although we have sliced this, the underlying array wont be garbage collected as the same array is still being referenced.

The underlying array does not get GC here.

The previous underlying array gets GC here as after append the underlying array changes

24

25 of 26

Writing GC Friendly Code

type Reader interface {

Read(p []byte) (n int, err error)

}

io.Reader interface is implemented as above. It would have been clearer if it would have returned the slice read like below:

type Reader interface {

Read(n int) (p []byte, err error)

}

Reason: This way the byte slice is allocated on the stack. If it would have been implemented like the second alternative, the byte slice would be allocated on heap and would have resulted in a lot of garbage on the heap.

25

26 of 26

Conclusion

  • Try to make sure that heap allocations are being done while keeping the GC latencies in mind. Look out for objects that are being referenced but are not needed.

  • If there are unexpected latencies in the application, profile the application and look out for GC latencies. If GC is causing these latencies, try to tune the GC parameters.

  • Check the pacing of the GC using GC traces and try to tune the pacing if required. Normally, the pacing should be fine for the application and this should be done only if there are GC latencies in the app that are quite evident.

26