Lecture 14
A/B Testing
Summer 2023
Meme Monday
Announcements
Weekly Goals
Review
P-Values
(In)consistency Based on Tail Area
Testing whether or not Mendel’s model is good:
Conventions About Inconsistency
Definition of the p-value
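The p-value is the chance, under the null hypothesis, that the test statistic is equal to the value observed in the data or is even further in the direction of the alternative.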
Fair, or biased towards tails?
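To make the definition concrete, here is a minimal sketch of an empirical p-value for the coin question. The number of flips, the observed count of tails, and the number of repetitions are hypothetical values chosen for illustration; they do not come from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

num_flips = 100        # hypothetical sample size
observed_tails = 60    # hypothetical observed number of tails
repetitions = 10_000   # number of simulated samples under the null

# Simulate the statistic (number of tails) under the null: the coin is fair.
simulated_tails = rng.binomial(num_flips, 0.5, size=repetitions)

# The alternative is "biased towards tails", so large counts point to it.
# The p-value is the proportion of simulated statistics at least as large
# as the observed one.
p_value = np.mean(simulated_tails >= observed_tails)
print(p_value)
```

The comparison uses `>=` because the definition includes the observed value itself as well as everything further in the direction of the alternative.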
The Cutoff as an Error Probability
Can the Conclusion be Wrong?
Yes.
| | Data consistent with the null | Data point to the alternative |
| Null is true | ✅ | ❌ |
| Alternative is true | ❌ | ✅ |
An Error Probability
Decision Rule and Error Probability
If you test Mendel’s model using a 5% cutoff for the p-value,
for which values of the statistic will you reject the model?
[Figure: the 5% tail area is labeled "Reject the model"; the remaining area is labeled "Consistent with the model".]
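One way to answer the question above is to simulate the statistic under the model and find the value that cuts off the top 5%. The sketch below assumes a setup that is not stated on the slide: the model says 75% of plants are purple-flowering, the sample has 929 plants, and the statistic is the absolute distance between the sample percent and 75. Treat all three as assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed setup (not given on the slide).
model_percent = 75
sample_size = 929
repetitions = 10_000

# Simulate the statistic under the model: |sample percent purple - 75|.
counts = rng.binomial(sample_size, model_percent / 100, size=repetitions)
simulated_stats = np.abs(100 * counts / sample_size - model_percent)

# With a 5% cutoff for the p-value, reject when the observed statistic
# lands in the top 5% of the simulated distribution.
cutoff_value = np.percentile(simulated_stats, 95)

def reject_model(observed_stat):
    """Decision rule: reject the model if the statistic is beyond the cutoff."""
    return observed_stat > cutoff_value

# If the model is true, how often does this rule reject it anyway?
rejection_rate = np.mean(simulated_stats > cutoff_value)
print(cutoff_value, rejection_rate)
```

The last line is the error probability in action: when the model is true, the rule still rejects it in roughly 5% of samples (slightly less here, because the statistic takes only discrete values).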
Origin of the Conventions
Sir Ronald Fisher, 1890-1962
“We have the duty of formulating, of summarizing, and of communicating our conclusions, in intelligible form, in recognition of the right of other free minds to utilize them in making their own decisions.”
Fisher’s Personal Preference
“It is convenient to take this point [5%] as a limit in judging whether a deviation is to be considered significant or not.”
–– Statistical Methods for Research Workers, 1925
“If one in twenty does not seem high enough odds, we may, if we prefer it, draw the line at one in fifty (the 2 percent point), or one in a hundred (the 1 percent point). Personally, the author prefers to set a low standard of significance at the 5 percent point …” –– 1926
A/B Testing
Comparing Two Samples
(Demo)
The Groups and the Question
Hypotheses
Test Statistic
Group B average - Group A average
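A minimal sketch of this statistic in NumPy follows. The function name `difference_of_means` is hypothetical, the choice of which group counts as A and which as B is an assumption, and the five rows are the ones shown on "The Data" slide, paired label-to-weight in the order they appear.

```python
import numpy as np

def difference_of_means(values, labels, group_a, group_b):
    """Return (group B average) - (group A average)."""
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels)
    mean_b = values[labels == group_b].mean()
    mean_a = values[labels == group_a].mean()
    return mean_b - mean_a

# Five rows from the slide; the label/weight pairing is assumed.
labels = ["Non-smoker", "Non-smoker", "Smoker", "Non-smoker", "Smoker"]
weights = [120, 113, 128, 117, 108]  # ounces

# Assumption: Group A = non-smokers, Group B = smokers.
observed = difference_of_means(weights, labels, "Non-smoker", "Smoker")
print(observed)
```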
(Demo)
The Data
...
| Non-smoker | 120 oz |
| Non-smoker | 113 oz |
| Smoker | 128 oz |
| Non-smoker | 117 oz |
| Smoker | 108 oz |
...
Detour: Sampling
Shuffling Labels Under the Null
...
| Non-smoker | 120 oz |
| Non-smoker | 113 oz |
| Smoker | 128 oz |
| Smoker | 117 oz |
| Smoker | 108 oz |
...
Shuffling Rows
Random Permutation
(Demo)
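Shuffling labels can be sketched with a random permutation of the label column while the measurements stay in place. This uses NumPy rather than whatever library the demo used, and the five rows are the ones from the slide with an assumed label/weight pairing.

```python
import numpy as np

rng = np.random.default_rng(0)

labels = np.array(["Non-smoker", "Non-smoker", "Smoker", "Non-smoker", "Smoker"])
weights = np.array([120, 113, 128, 117, 108])  # ounces; order is unchanged

# A random permutation rearranges the labels without replacement, so each
# group keeps its original size; only the assignment to rows changes.
shuffled_labels = rng.permutation(labels)

for label, weight in zip(shuffled_labels, weights):
    print(label, weight)
```

Because the permutation samples without replacement, the shuffled column always contains the same number of each label as the original.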
Simulating Under the Null
(Demo)
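Putting the pieces together, the simulation under the null can be sketched as below: shuffle the labels, recompute the statistic, repeat, then compare the observed value with the simulated ones. The function name `permutation_test`, the data values, and the direction of the alternative (group B's average is lower) are all assumptions made for illustration.

```python
import numpy as np

def permutation_test(values, labels, group_a, group_b, repetitions=5000, seed=0):
    """Simulate (group B mean - group A mean) under the null by shuffling labels."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels)

    def statistic(current_labels):
        mean_b = values[current_labels == group_b].mean()
        mean_a = values[current_labels == group_a].mean()
        return mean_b - mean_a

    observed = statistic(labels)
    simulated = np.array(
        [statistic(rng.permutation(labels)) for _ in range(repetitions)]
    )
    # Assumed alternative: group B's average is lower, so small (negative)
    # values of the statistic point to the alternative.
    p_value = np.mean(simulated <= observed)
    return observed, simulated, p_value

# Hypothetical data, not the lecture's dataset.
values = [120, 113, 117, 125, 119, 122, 108, 105, 110, 112, 104, 115]
labels = ["Non-smoker"] * 6 + ["Smoker"] * 6

observed, simulated, p_value = permutation_test(values, labels, "Non-smoker", "Smoker")
print(observed, p_value)
```

If the alternative pointed the other way, the comparison would flip to `>=`; for a two-sided alternative, the statistic would be the absolute difference instead.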
When to use A/B testing?
What can we conclude from the A/B test we just conducted?
Hypothesis Testing Review
abs(group_a_mean - group_b_mean)
Where is A/B testing used in other applications?
A/B Testing Applications