Ronny Kohavi, Alex Deng, Lukas Vermeer
A/B Testing Intuition Busters
Common Misunderstandings in Online Controlled Experiments
@RonnyK
© 2022 Ron Kohavi
Motivation
Tough Paper to Write
I would very much encourage the authors to reread it and tone it down in parts… If not done so, the paper would in fact embarrass KDD for years…
P-Values
Misinterpretation and abuse of statistical tests, confidence intervals, and statistical power have been decried for decades, yet remain rampant. A key problem is that there are no interpretations of these concepts that are at once simple, intuitive, correct, and foolproof.
-- Greenland et al. (2016)
What We Want vs. What We Get
Company / Source | Success Rate | False Positive Risk (FPR) |
Microsoft | 33% | 5.9% |
Avinash Kaushik | 20% | 11.1% |
Bing | 15% | 15.0% |
Booking.com, Google Ads, Netflix | 10% | 22.0% |
Airbnb Search | 8% | 26.4% |
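The FPR column can be reproduced with Bayes' rule. The sketch below is an illustration under stated assumptions, not necessarily the authors' exact calculation: it assumes a two-sided alpha of 0.05 (so only alpha/2 = 0.025 of null experiments show up as stat-sig improvements), 80% power, and treats Microsoft's 33% as one third. The function name `false_positive_risk` is hypothetical.

```python
def false_positive_risk(success_rate, alpha=0.05, power=0.80):
    """P(no real improvement | stat-sig positive result), via Bayes' rule.

    Assumptions (not stated on the slide): two-sided alpha = 0.05,
    so alpha / 2 is the chance a null experiment looks like a win,
    and every real improvement is detected with 80% power.
    """
    false_wins = (alpha / 2) * (1 - success_rate)  # null effect, yet stat-sig positive
    true_wins = power * success_rate               # real improvement, detected
    return false_wins / (false_wins + true_wins)


# Success rates taken from the table; 33% is treated as 1/3 (assumption).
table = [
    ("Microsoft", 1 / 3),
    ("Avinash Kaushik", 0.20),
    ("Bing", 0.15),
    ("Booking.com, Google Ads, Netflix", 0.10),
    ("Airbnb Search", 0.08),
]

for source, rate in table:
    print(f"{source}: success rate {rate:.0%}, FPR {false_positive_risk(rate):.1%}")
```

The lower the prior success rate, the larger the share of stat-sig "wins" that are false positives, which is why the risk climbs from Microsoft's row to Airbnb Search's row.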
Key Point: Surprising Results Require Strong Evidence – Lower P-values
Statistical Power
Example of Power Calculation
80 users in each variant, and it was stat-sig showing 337% improvement
Do NOT trust experiments with low power
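To see why 80 users per variant is far too few, here is a minimal sketch of a power calculation using the common rule of thumb n = 16σ²/δ² (roughly 80% power at a two-sided alpha of 0.05). The 3.7% baseline conversion rate and 5% relative effect are hypothetical inputs chosen for illustration, not figures from this slide. The function names are also hypothetical, and the power estimate uses a normal approximation with equal variance in both variants.

```python
import math


def normal_cdf(x):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def users_per_variant(baseline_rate, relative_mde):
    """Rule of thumb: n = 16 * sigma^2 / delta^2 per variant."""
    variance = baseline_rate * (1 - baseline_rate)  # Bernoulli variance
    delta = baseline_rate * relative_mde            # absolute effect to detect
    return math.ceil(16 * variance / delta ** 2)


def approx_power(baseline_rate, relative_effect, n_per_variant, z_crit=1.96):
    """Approximate two-sided power for a difference in conversion rates."""
    variance = baseline_rate * (1 - baseline_rate)
    delta = baseline_rate * relative_effect
    std_err = math.sqrt(2 * variance / n_per_variant)  # SE of the difference
    shift = delta / std_err
    return normal_cdf(shift - z_crit) + normal_cdf(-shift - z_crit)


# Hypothetical scenario: 3.7% baseline conversion, 5% relative improvement.
needed = users_per_variant(0.037, 0.05)
print(f"Users needed per variant: {needed:,}")
print(f"Power at that size: {approx_power(0.037, 0.05, needed):.0%}")
print(f"Power with 80 users per variant: {approx_power(0.037, 0.05, 80):.1%}")
```

Under these assumptions the required sample is in the hundreds of thousands per variant, and with 80 users the power is barely above the significance level, so a stat-sig result at that size says very little about the true effect.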
Winner’s Curse
Do NOT trust experiments with low power
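The winner's curse can be illustrated with a small simulation. This is a sketch under simplifying assumptions: observed effects are drawn from a normal distribution around a known true effect, and a "winner" is any experiment whose z-score exceeds 1.96. The function name and the effect sizes are hypothetical.

```python
import random
import statistics


def winners_curse(true_effect, std_err, n_experiments=100_000, z_crit=1.96, seed=0):
    """Simulate many experiments and keep only the stat-sig positive ones.

    Returns (exaggeration ratio, share of experiments that were winners),
    where the ratio is the mean observed effect among winners divided by
    the true effect.
    """
    rng = random.Random(seed)
    winners = []
    for _ in range(n_experiments):
        observed = rng.gauss(true_effect, std_err)  # noisy estimate of the effect
        if observed / std_err > z_crit:             # stat-sig improvement
            winners.append(observed)
    ratio = statistics.mean(winners) / true_effect
    return ratio, len(winners) / n_experiments


# Low power: the true effect is half a standard error.
low_ratio, low_share = winners_curse(true_effect=0.5, std_err=1.0)
# About 80% power: the true effect is 2.8 standard errors.
high_ratio, high_share = winners_curse(true_effect=2.8, std_err=1.0)

print(f"Low power:  {low_share:.1%} winners, effect exaggerated {low_ratio:.1f}x")
print(f"High power: {high_share:.1%} winners, effect exaggerated {high_ratio:.2f}x")
```

In the low-power setting only estimates that overshoot the truth by a wide margin clear the significance threshold, so the reported effect among winners is several times the true one. With adequate power the exaggeration shrinks to a small margin.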
Post-hoc Power Calculations are Noisy and Misleading
Minimize Data Processing Options
Statistician: you have already calculated the p-value?
Surgeon: yes, I used multinomial logistic regression.
Statistician: Really? How did you come up with that?
Surgeon: I tried each analysis on the statistical software dropdown menus, and that was the one that gave the smallest p-value.
-- Andrew Vickers (2009)
Beware of Unequal Variants
Summary (1 of 2)
We shared five intuition busters and made recommendations on how to address the issues
Summary (2 of 2)
Q&A
To learn more about A/B tests and controlled experiments, I teach a 10-hour Zoom class (next one Aug 22). See https://bit.ly/ABClassRKLI
Paper at https://bit.ly/ABTestingIntuitionBusters
These slides at https://bit.ly/ABTestingIntuitionBustersTalk