“Github Major Service Outage”
Georges Seurat, 1884
Oil on canvas
http://classicprogrammerpaintings.com/post/144953638470
Low level details for high level developers
Balázs Attila-Mihály
About me
Why?
What we’re (not) going to talk about
Those things are important!
“There are three great virtues of a programmer: laziness, impatience and hubris”
-- Larry Wall
Getting the biggest return on investment...
What we’re going to talk about
The Zen of performance
Why should I care?
Not your grandmother's von Neumann machine
From source code to hardware
Ideas
From source code to hardware
perf stat perl -E 'say "Hello World!"'
Performance counter stats for 'perl -E say "Hello World!"':
1,621871 task-clock (msec) # 0,890 CPUs utilized
0 context-switches # 0,000 K/sec
0 cpu-migrations # 0,000 K/sec
200 page-faults # 0,123 M/sec
4.839.322 cycles # 2,984 GHz
<not supported> stalled-cycles-frontend
<not supported> stalled-cycles-backend
4.428.820 instructions # 0,92 insns per cycle
914.978 branches # 564,150 M/sec
37.288 branch-misses # 4,08% of all branches
0,001822859 seconds time elapsed
From source code to hardware
run.c:
int
Perl_runops_standard(pTHX)
{
    OP *op = PL_op;
    PERL_DTRACE_PROBE_OP(op);
    while ((PL_op = op = op->op_ppaddr(aTHX))) {
        PERL_DTRACE_PROBE_OP(op);
    }
    PERL_ASYNC_CHECK();

    TAINT_NOT;
    return 0;
}
Latency Numbers Every Programmer Should Know
Why?
Memory layout matters
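A minimal sketch of why layout matters, not taken from the slides: two functions that sum the same matrix, one walking memory sequentially and one striding across it. The names sum_row_major and sum_col_major and the 512x512 size are made up for illustration. Both return the same total; on typical cached hardware the strided walk tends to be slower, but no timings are claimed here.

```c
#include <stddef.h>

#define ROWS 512   /* illustrative size, not from the talk */
#define COLS 512

/* Row-major walk: consecutive accesses touch neighbouring addresses,
   so every cache line that is fetched gets fully used. */
long sum_row_major(int m[ROWS][COLS])
{
    long total = 0;
    for (size_t r = 0; r < ROWS; r++)
        for (size_t c = 0; c < COLS; c++)
            total += m[r][c];
    return total;
}

/* Column-major walk over the same data: each access lands
   COLS * sizeof(int) bytes after the previous one, so a new
   cache line is usually needed for every element. */
long sum_col_major(int m[ROWS][COLS])
{
    long total = 0;
    for (size_t c = 0; c < COLS; c++)
        for (size_t r = 0; r < ROWS; r++)
            total += m[r][c];
    return total;
}
```

Running each function under perf stat and comparing the counters is one way to see the difference on a given machine.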
Possible solution
RPerl
have $foo = 33
have $bar = 1_932
have $baz = 58.545_454_545_454_5
Performance counter stats for '/tmp/foobar':
1,101880 task-clock (msec) # 0,856 CPUs utilized
…
3.616.659 instructions # 1,10 insns per cycle
637.273 branches # 578,351 M/sec
18.727 branch-misses # 2,94% of all branches
0,001286668 seconds time elapsed
have $foo = 33
have $bar = 1_932
have $baz = 58.545_454_545_454_5
Performance counter stats for 'perl /tmp/foobar.pl':
130,153650 task-clock (msec) # 0,997 CPUs utilized
…
473.391.113 instructions # 1,20 insns per cycle
101.997.452 branches # 783,670 M/sec
3.610.345 branch-misses # 3,54% of all branches
0,130532666 seconds time elapsed
Possible solution
Profiling
http://agentzh.org/misc/flamegraph/perl-vm-test-nginx.svg
Problem
#chr  position   id           ref  alt
1     27259823   rs143970144  C    A
3     134279741  rs570267197  C    T
3     4427096    rs189830239  T    G
4     56396589   rs751646898  A    G
6     103754045  rs188253003  A    G
8     81783139   rs201875105  G    A
9     40999891   rs28602573   G    A
12    55068468   rs3062496    TCA  T
21    37313602   rs145886040  G    T
Solution: pure perl
https://github.com/gpanther/yapc-eu-2016-benchmarks
Solution: perl with encoded key
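The slides do not show how the key is encoded, so the following is only one possible scheme, written in C for illustration. It assumes chromosomes are numeric (as in the sample rows) and that positions fit in 32 bits. The names encode_key, key_chr and key_position are hypothetical.

```c
#include <stdint.h>

/* Hypothetical encoding: chromosome number in the upper 32 bits,
   position in the lower 32 bits of a single 64-bit integer. */
static inline uint64_t encode_key(uint32_t chr, uint32_t position)
{
    return ((uint64_t)chr << 32) | position;
}

/* Recover the chromosome number from an encoded key. */
static inline uint32_t key_chr(uint64_t key)
{
    return (uint32_t)(key >> 32);
}

/* Recover the position from an encoded key. */
static inline uint32_t key_position(uint64_t key)
{
    return (uint32_t)(key & 0xFFFFFFFFu);
}
```

With this layout, comparing two keys as plain integers orders them by chromosome first and position second, and a lookup key is a fixed-size number rather than a string.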
Embracing the OS
Solution: perl with memory mapped file
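For reference, a sketch of the POSIX calls that a memory-mapped read rests on, written in C rather than Perl. It is not the code from the talk: the function name count_lines_mmap is made up, and counting newlines only stands in for the real per-row processing.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Counts '\n' bytes in a file by mapping it read-only.
   Returns -1 if the file cannot be opened, stat'ed or mapped. */
long count_lines_mmap(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {      /* mmap rejects zero-length mappings */
        close(fd);
        return 0;
    }

    size_t len = (size_t)st.st_size;
    const char *data = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                  /* the mapping stays valid after close */
    if (data == MAP_FAILED)
        return -1;

    /* The file contents are now addressable like an ordinary array;
       the kernel pages them in on demand. */
    long lines = 0;
    for (size_t i = 0; i < len; i++)
        if (data[i] == '\n')
            lines++;

    munmap((void *)data, len);
    return lines;
}
```

The point of the mapping is that no explicit read loop or intermediate buffer is needed: the page cache backs the array directly.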
Solution: perl + mmap + Inline::C
Resources
Thank you!
Questions and (possibly) answers