1 of 49

Five Steps to Make Your Go Code Faster & More Efficient

Bartłomiej Płotka

Senior Software Engineer at Google

4 Feb 2023 | FOSDEM Go Dev Room

2 of 49

Bartłomiej (Bartek) Płotka

Senior Software Engineer @ Google

Google Cloud: Google Managed Prometheus
Open Source, Go, Distributed Systems, Observability

Thanos co-author, maintainer of Prometheus
…and ~20 more projects

Mentor in the CNCF Mentoring Initiatives (~19 times)
CNCF Ambassador & TAG Observability Tech Lead

3 of 49

https://bwplotka.dev/book

4 of 49

A story From Thanos Project

6 of 49

Oops!

8 of 49

Solutions?

Let’s “tune” configuration!

9 of 49

Solutions?

Vertical Scale Up

128 GB

10 of 49

Solutions?

Horizontal Scale Out

11 of 49

Solutions?

Let’s install different OSS system…

12 of 49

Solutions?

Let’s use vendor…

13 of 49

Meanwhile in the code…

@bwplotka

14 of 49

What helped?

Optimizing on Algorithm and Code Level!

15 of 49

https://hmarr.com/blog/go-allocation-hunting/

16 of 49

Software Efficiency Enables Things!

…yet we don’t focus on it with the right mindset!

17 of 49

Expect Quiz!

Quiz!

Chance to win a signed copy of my “Efficient Go” book today. Deadline 4.02.2023 4:00pm (30m)

No rush: Random selection among those who responded with correct answers!

Link at the end of the talk! 🙈

18 of 49

Five Pragmatic Steps towards

More Efficient Go Programs

19 of 49

Step #1: Use TFBO: Test Fix Benchmark Optimize

Use an Efficiency-Aware Development Flow!

20 of 49

Step #1: Use TFBO: Test Fix Benchmark Optimize

TFBO = TDD (test-driven development) wrapped with BDO (benchmark-driven optimizations)

25 of 49

Step #2: Understand current efficiency level!

Benchmark First!

26 of 49

Step #2: Understand current efficiency level!

Micro benchmarks

27 of 49

Step #2: Understand current efficiency level!

Micro benchmarks

$ go test -run '^$' -bench '^BenchmarkCreate$'
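The benchmark body itself is not part of this text, so here is a minimal sketch of what BenchmarkCreate could look like. The create function, its append-based body, and the 1 million element count are assumptions for illustration, not the code from the talk.

```go
package main

import (
	"fmt"
	"testing"
)

// create is a hypothetical stand-in for the function under test: it builds a
// slice of n strings with plain append, letting the slice grow as needed.
func create(n int) []string {
	var a []string
	for i := 0; i < n; i++ {
		a = append(a, "FOSDEM")
	}
	return a
}

// sink keeps the result reachable so the compiler cannot discard the work.
var sink []string

// BenchmarkCreate runs one create call per benchmark iteration.
// In a real project this function lives in a file ending in _test.go.
func BenchmarkCreate(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		sink = create(1_000_000)
	}
}

func main() {
	// testing.Benchmark runs the function without the "go test" command,
	// which keeps this sketch runnable as a single file.
	res := testing.Benchmark(BenchmarkCreate)
	fmt.Println(res.String(), res.MemString())
}
```

When BenchmarkCreate sits in a _test.go file, the go test command above selects it through the -bench regular expression, and main is not needed.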

28 of 49

Step #2: Understand current efficiency level!

Micro benchmarks

$ export ver=v1 && \
    go test -run '^$' -bench '^BenchmarkCreate$' \
    -benchtime 1s -count 6 \
    -cpu 1 -benchmem \
    | tee ${ver}.txt

30 of 49

Step #2: Understand current efficiency level!

Micro benchmarks

https://pkg.go.dev/golang.org/x/perf/cmd/benchstat

$ benchstat v1.txt

goos: linux
goarch: amd64
cpu: AMD EPYC 7B12
         │   v1.txt    │
         │   sec/op    │
Create     86.64m ± 1%

         │    v1.txt    │
         │     B/op     │
Create     83.96Mi ± 0%

         │   v1.txt   │
         │ allocs/op  │
Create     39.00 ± 0%

31 of 49

Step #3: Understand Your Efficiency Requirements

No Expectations?

“Not sure if that’s fast enough… YOLO” 🙈

32 of 49

Step #3: Understand Your Efficiency Requirements

No clear Expectations?

“Program should be fast and use reasonable amount of memory” 🤔

33 of 49

Step #3: Understand Your Efficiency Requirements

RAER: Resource Aware Efficiency Requirements

34 of 49

Step #3: Understand Your Efficiency Requirements

RAER: Resource Aware Efficiency Requirements

API should have:

Runtime Complexity: ~34.4 * N^2 nanoseconds

Space (RAM) Complexity: ~2.3 * N bytes

And I mean literal average numbers on paper for latency and for resource consumption such as memory, ideally as a function of the data set size.

It’s funny: every software engineer has to do this in a system design interview, but we rarely do it in our later work, and that’s pretty sad.

One of the reasons we need clear goals is that many optimizations have to be accepted deliberately, because they are trade-offs. They might sacrifice latency for lower memory usage, or the opposite: when a cache is introduced, we use more memory but improve latency enormously. Without clear goals we can’t make that decision.

Here is the potentially irritating part: I think the lack of clear goals, or procrastination in defining them (because it’s uncomfortable, it’s like setting a condition for failure), are the main reasons why we overspend on infrastructure these days.

That’s why I propose setting RAER, resource-aware efficiency requirements, ideally on paper and with your stakeholders’ signatures. Don’t worry if you can’t meet those requirements: you can always negotiate!

35 of 49

Step #3: Understand Your Efficiency Requirements

RAER: Resource Aware Efficiency Requirements

create() should have:

Runtime Complexity: 1 million * ~30 nanoseconds

Space (RAM) Complexity: ~1 million * ~16 bytes

36 of 49

Step #3: Understand Your Efficiency Requirements

RAER: Resource Aware Efficiency Requirements

create() should have:

Runtime Complexity: 1 million * ~30 nanoseconds = 30ms

Space (RAM) Complexity: ~1 million * ~16 bytes = ~16 MB (~15.3 MiB)

$ benchstat v1.txt

goos: linux
goarch: amd64
cpu: AMD EPYC 7B12
         │   v1.txt    │
         │   sec/op    │
Create     86.64m ± 1%

         │    v1.txt    │
         │     B/op     │
Create     83.96Mi ± 0%

         │   v1.txt   │
         │ allocs/op  │
Create     39.00 ± 0%

37 of 49

Step #4: Focus on the Hot Path

Do Profiling!

38 of 49

Step #4: Focus on the Hot Path

Do Profiling!

$ export ver=v1 && \
    go test -run '^$' -bench '^BenchmarkCreate$' \
    -benchtime 1s -count 6 \
    -cpu 1 -benchmem \
    -memprofile=${ver}.mem.pprof \
    -cpuprofile=${ver}.cpu.pprof \
    | tee ${ver}.txt

39 of 49

Step #4: Focus on the Hot Path

Do Profiling!

go tool pprof -http :8080 v1.cpu.pprof

40 of 49

Step #4: Focus on the Hot Path

Do Profiling!

go tool pprof -http :8080 v1.mem.pprof

41 of 49

Step #5: Try optimizing that part & repeat!

Append (from docs):

If the array is full, resize it.
Add “FOSDEM” after the last element of the array.
Return the new or the same array.
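The append behavior described above suggests one possible optimization: allocate the backing array once, so append never has to resize it. The two functions below are an assumed reconstruction for illustration; the actual create() code from the talk is not in this text.

```go
package main

import "fmt"

// createV1 appends to a nil slice, so append has to allocate a bigger
// backing array and copy the elements every time the current one is full.
func createV1(n int) []string {
	var a []string
	for i := 0; i < n; i++ {
		a = append(a, "FOSDEM")
	}
	return a
}

// createV2 allocates the backing array once with the final capacity,
// so append only writes the next element and never resizes.
func createV2(n int) []string {
	a := make([]string, 0, n)
	for i := 0; i < n; i++ {
		a = append(a, "FOSDEM")
	}
	return a
}

func main() {
	const n = 1_000_000
	v1, v2 := createV1(n), createV2(n)
	fmt.Println(len(v1), len(v2), cap(v2))
}
```

Both versions return the same elements. The difference is only in how many times the backing array is allocated and copied, which is what the B/op and allocs/op columns measure.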

$ export ver=v2 && \
    go test -run '^$' -bench '^BenchmarkCreate$' \
    -benchtime 1s -count 5 \
    -cpu 1 -benchmem \
    | tee ${ver}.txt

46 of 49

Repeat!

$ benchstat v1.txt v2.txt

cpu: AMD EPYC 7B12
         │   v1.txt    │               v2.txt                │
         │   sec/op    │   sec/op     vs base                │
Create     87.71m ± 6%   11.56m ± 3%  -86.82% (p=0.000 n=6+10)

         │    v1.txt    │            v2.txt             │
         │     B/op     │     B/op      vs base         │
Create     83.96Mi ± 0%   15.27Mi ± 0%  -81.82% (n=6+10)

         │   v1.txt    │           v2.txt            │
         │  allocs/op  │ allocs/op   vs base         │
Create     39.000 ± 0%   1.000 ± 0%  -97.44% (n=6+10)

47 of 49

LGTM 😍😍😍!

create() should have:

Runtime Complexity: 1 million * ~30 nanoseconds = 30ms

Space (RAM) Complexity: ~1 million * ~16 bytes = ~16 MB (~15.3 MiB)

48 of 49

Lessons

Optimizing Software Efficiency might be easier than you think! [if done right]

Follow Pragmatic TFBO Flow
Benchmark (go test -bench)
Set Clear Goals (RAER)
Profile (pprof)
Understand what is happening under the hood (tip: usually generic = slow)

49 of 49

Thank You! Questions?

Quiz!

Chance to win a signed copy of my “Efficient Go” book today. Deadline 4.02.2023 4:00pm (30m)

No rush: Random selection among those who responded with correct answers!

Link: https://bwplotka.dev/quiz.html

Book: https://bwplotka.dev/book