scaling
people, product, systems
hi!
CTO, GoPay
17 years in tech
@rnjn
June, 2019
iC
Pr
TW
GoJek
many hats!
dev
pm
pgm
hr
student
coach
.
.
.
Swiss Army knife
first-time speaker
“… a scaling-up of a software entity is not merely a repetition of the same elements in larger size; it is necessarily an increase in the number of different elements. In most cases, the elements interact with each other in some nonlinear fashion, and the complexity of the whole increases much more than linearly.”
– Frederick P. Brooks, No Silver Bullet, 1986
GoJek journey
january 2015
~15,000 monthly orders
< 5 people
october 2015
~2,500,000 monthly orders
< 40 people
october 2016
~35,000,000 monthly orders
< 100 people
october 2017
~85,000,000 monthly orders
< 250 people
february 2018
6666X
ultra lean: 1 engineer per 600k monthly orders
that’s completed orders, no fine print
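The growth multiples behind the journey above can be checked with a few lines of arithmetic. This is only a sketch: it assumes the "6666X" figure is measured against the january 2015 baseline, which the slides do not state, and the variable names are illustrative.

```python
# Monthly completed orders as quoted on the slides (approximate figures).
monthly_orders = {
    "january 2015": 15_000,
    "october 2015": 2_500_000,
    "october 2016": 35_000_000,
    "october 2017": 85_000_000,
}

baseline = monthly_orders["january 2015"]

# Growth multiple of each milestone relative to january 2015.
growth = {month: orders / baseline for month, orders in monthly_orders.items()}

# Assumption: 6666X is relative to the same baseline, which would imply
# roughly this many monthly orders in february 2018.
implied_february_2018 = baseline * 6666

for month, multiple in growth.items():
    print(f"{month}: {multiple:,.0f}X")
print(f"february 2018 (implied): ~{implied_february_2018:,} monthly orders")
```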
failed at many things, learnt to scale a bit
.
.
.
.
.
still learning
truth
there is no silver bullet solution to scaling
experiment
observe
course correct
ingredients
product
people
teams
engineering
.
.
self
product
Experiments over Features
Statistics over Opinion
Just Enough Polish
Believe in Serendipity & Luck
scale a product to a platform
[charts: completed orders by time of day, for transport, food, and driver]
people
hire right
Doers over Talkers
Quality over Quantity
Learners and Seekers over Egos
No Jerks
fairness
Fairness over Cost
Treat people as adults
Open feedback over closed rooms
Pay for execution: what, not who
favour the fewest levels in the information hierarchy (we have 3)
wellness
Fairness
Unlimited sick leave
No death marches
Learn and Teach compassion
Promote fast
teams
hard problems
distributed development is hard
retention is hard
massive cost of bad teams
controlling toxicity is hard
learnings
frequent and open feedback
rituals
responsibilities over titles
help people grow together
help teams focus on the essentials
craft your teams
Small size
All roles sit together
Clear, measurable goals
Iterations
Autonomy to execute, within bounds
favour the fewest levels in the information hierarchy
engineering
hard problems
distributed systems are hard
system failures are inevitable
programmer discipline is hard
simple is hard
recovering from debt is very hard
steps on scale
patterns
caching
read/write separation
multiple copies and prepared views
app-level sharding
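Two of the patterns above, caching and app-level sharding, can be sketched together. This is a minimal illustration and not GoJek's implementation: `ShardedStore` and `CachedReader` are hypothetical names, and plain dicts stand in for real database instances and a real cache.

```python
import hashlib


class ShardedStore:
    """The application, not the database, decides which shard owns a key."""

    def __init__(self, shard_count):
        # Each dict stands in for a separate database instance (hypothetical).
        self.shards = [dict() for _ in range(shard_count)]

    def shard_for(self, key):
        # A stable hash keeps the key-to-shard mapping identical across processes.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self.shards)

    def put(self, key, value):
        self.shards[self.shard_for(key)][key] = value

    def get(self, key):
        return self.shards[self.shard_for(key)].get(key)


class CachedReader:
    """Cache-aside reads: check the cache first, fall back to the store on a miss."""

    def __init__(self, store):
        self.store = store
        self.cache = {}
        self.misses = 0

    def get(self, key):
        if key in self.cache:
            return self.cache[key]
        self.misses += 1
        value = self.store.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def invalidate(self, key):
        # Call after a write so the next read goes back to the store.
        self.cache.pop(key, None)
```

A fixed modulo is the simplest routing rule; changing the shard count remaps most keys, so a real system would need a resharding plan.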
writes
validate -> act (store, call other parties in order) -> respond
vs
validate -> record -> respond -> broadcast -> act
Once validated, consistency is ensured (eventually)
OMS
1. Create/Update
2. Validate
3. Record
4. Broadcast
5. Respond
6. Act
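The second write path (validate -> record -> respond -> broadcast -> act) can be sketched as below. It is a toy, single-process model under stated assumptions: `OrderService` and its method names are hypothetical, a list stands in for durable storage, and a deque stands in for the message queue.

```python
from collections import deque


class OrderService:
    def __init__(self):
        self.records = []     # stand-in for durable storage
        self.published = 0    # index of the next record to broadcast
        self.queue = deque()  # stand-in for a message queue
        self.acted = []       # side effects performed by downstream workers

    def validate(self, order):
        return bool(order.get("id")) and order.get("amount", 0) > 0

    def handle(self, order):
        """Validate, record, respond. No downstream work happens here."""
        if not self.validate(order):
            return {"status": "rejected"}
        self.records.append(dict(order))
        return {"status": "accepted", "id": order["id"]}

    def broadcast_pending(self):
        """Publish every recorded order that has not been broadcast yet."""
        while self.published < len(self.records):
            self.queue.append(self.records[self.published])
            self.published += 1

    def act(self):
        """Downstream workers consume the broadcast and do the slow work."""
        while self.queue:
            order = self.queue.popleft()
            self.acted.append(order["id"])
```

The caller gets its response as soon as the order is recorded; the slow steps run afterwards, which is why consistency arrives eventually rather than immediately.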
safe zone
safe zone prereqs
idempotency
replayable queues
eventual consistency
isolate from dependency failures
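Idempotency and replayable queues work as a pair: if a consumer ignores events it has already applied, the queue can be replayed from any offset after a failure. A minimal sketch, assuming every event carries a unique id; `ReplayableLog` and `IdempotentConsumer` are hypothetical names and everything is held in memory.

```python
class ReplayableLog:
    """Append-only log that can be re-read from any offset."""

    def __init__(self):
        self.events = []

    def append(self, event_id, amount):
        self.events.append((event_id, amount))

    def read_from(self, offset):
        return self.events[offset:]


class IdempotentConsumer:
    """Applies each event at most once, keyed by event id."""

    def __init__(self):
        self.seen = set()
        self.balance = 0

    def apply(self, event_id, amount):
        if event_id in self.seen:
            return False  # duplicate delivery: safe to ignore
        self.seen.add(event_id)
        self.balance += amount
        return True


log = ReplayableLog()
log.append("evt-1", 100)
log.append("evt-2", 50)

consumer = IdempotentConsumer()
for event_id, amount in log.read_from(0):
    consumer.apply(event_id, amount)

# Replaying the whole log (for example after a crash) does not double-count.
for event_id, amount in log.read_from(0):
    consumer.apply(event_id, amount)
```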
more patterns
discard bad calls ASAP
secure shell
build multiple redundancies
open source over managed
document document document
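"Discard bad calls ASAP" can be read as: run the cheap checks first and reject before touching anything expensive. A minimal sketch under that reading; the request fields, status codes, and `handle_request` are assumptions for illustration only.

```python
def handle_request(request, expensive_backend):
    """Reject malformed requests before any expensive work is started."""
    user_id = request.get("user_id")
    if not isinstance(user_id, str) or not user_id:
        return {"status": 400, "error": "missing user_id"}

    amount = request.get("amount")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(amount, bool) or not isinstance(amount, int) or amount <= 0:
        return {"status": 400, "error": "invalid amount"}

    # Only well-formed requests reach the costly dependency.
    return {"status": 200, "result": expensive_backend(request)}
```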
design for failure
circuit breakers
replayable queues
idempotency across systems (internal and external)
test for chaos
UX - guide users when the system fails
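A circuit breaker, the first item above, can be sketched in a few lines. This is a simplified, single-threaded illustration rather than a production implementation: the class name, thresholds, and the injectable `clock` parameter are assumptions.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker refuses a call without trying the dependency."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after  # seconds to stay open before a trial call
        self.clock = clock
        self.failures = 0
        self.opened_at = None           # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; failing fast")
            # Cool-down elapsed: fall through and allow one trial call.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if self.opened_at is not None or self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        # Any success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

While the circuit is open, callers fail immediately instead of waiting on a dependency that is already down, which is also the moment the UX should guide the user.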
automate
humans are inefficient at repetitive work
humans are costly
human communication problems are unsolved
if you’ve manually done it thrice, write a script
choose the right tools for the job
Languages - JVM (Java, Clojure, JRuby/Rails, Kotlin), Swift, Go
Persistence - PostgreSQL, Redis, Kafka, MongoDB, HBase, files
Practices & rituals - TDD, XP -> pairing, standups, IPMs, retros
Trunk-based development
Automated QA
infrastructure
managed cloud, multiple DCs
cloud-agnostic tooling (no ElasticX)
VMs -> containers
100% automation, phoenix servers
key thoughts
quality never goes out of style
don’t break what you cannot fix
work with people you can trust
build a school, teach how to fish
experiment
observe
course correct
reading