1 of 12

Lecture 25

A/B Testing

DATA 8

Summer 2017

Slides created by John DeNero (denero@berkeley.edu), Ani Adhikari (adhikari@berkeley.edu), and Sam Lau (samlau95@berkeley.edu)

2 of 12

Announcements

3 of 12

Comparing Samples

4 of 12

A/B Testing

  • Two random samples:
    • Sample A
    • Sample B
  • E.g., Patriots’ and Colts’ footballs (Deflategate)

  • Question: Are they drawn from the same underlying distribution?

  • Answer by A/B testing

5 of 12

The Hypotheses

  • Null:
    • The two samples are drawn from the same underlying population distribution; they look like two random draws from the same set.

  • Alternative:
    • The samples are drawn from different distributions; they don’t look like random draws from the same set.

6 of 12

Permutation Test

  • Null: The two samples are drawn randomly from the same underlying distribution.

  • If the null is true, all rearrangements of the variable values among the two samples are equally likely. So:
    • compute the observed test statistic
    • then shuffle the variable values among the two samples and recompute the statistic; repeat many times; compare the shuffled statistics with the observed statistic
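
The steps above can be sketched in plain NumPy. This is a minimal illustration rather than the code from the lecture demo: the function name permutation_test, the choice of the absolute difference between sample means as the statistic, and the defaults for repetitions and seed are all assumptions made for this example.

```python
import numpy as np

def permutation_test(a, b, repetitions=5000, seed=0):
    """Return (observed statistic, empirical P-value).

    The statistic is |mean(a) - mean(b)|. Names and defaults here are
    hypothetical, chosen only for this sketch.
    """
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    pooled = np.concatenate([a, b])

    # Step 1: the observed test statistic.
    observed = abs(a.mean() - b.mean())

    # Step 2: shuffle the pooled values, split them into groups of the
    # original sizes, and recompute the statistic each time.
    count = 0
    for _ in range(repetitions):
        shuffled = rng.permutation(pooled)
        stat = abs(shuffled[:len(a)].mean() - shuffled[len(a):].mean())
        if stat >= observed:
            count += 1

    # Step 3: the proportion of shuffles at least as extreme as observed.
    return observed, count / repetitions
```

A small P-value means the observed difference would be unusual if the two samples really came from the same distribution.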

7 of 12

The Test Statistic

  • If the samples are categorical, then a natural test statistic is the total variation distance. It measures the difference between the distributions in the two samples.
  • If the samples are numerical, often a simpler statistic is just fine, such as the absolute difference between the two sample means.
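
Both statistics can be written in a few lines. The sketch below assumes the common definition of total variation distance (half the sum of the absolute differences between category proportions); the function names are hypothetical and are not taken from the demo.

```python
import numpy as np

def total_variation_distance(labels_a, labels_b):
    """TVD between the category proportions of two categorical samples."""
    labels_a = list(labels_a)
    labels_b = list(labels_b)
    categories = sorted(set(labels_a) | set(labels_b))
    # Proportion of each category within each sample.
    prop_a = np.array([labels_a.count(c) for c in categories]) / len(labels_a)
    prop_b = np.array([labels_b.count(c) for c in categories]) / len(labels_b)
    return np.abs(prop_a - prop_b).sum() / 2

def abs_difference_of_means(a, b):
    """Absolute difference between the means of two numerical samples."""
    return abs(np.mean(a) - np.mean(b))
```

For example, samples with proportions (0.5, 0.5) and (0.25, 0.75) over the same two categories have a total variation distance of 0.25.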

(Demo)

8 of 12

Attendance

9 of 12

Effect Size

10 of 12

How Big is the Difference?

If you think that the two underlying population means might be different, you’ll want to know how different they are.

  • So instead of just running a “same/different” test, don’t make any hypotheses. Just estimate the difference between the two population means.

  • You can do this by bootstrapping the sample and constructing a confidence interval for the parameter: “difference between the population means”.
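
One way to build such an interval is the percentile bootstrap, sketched below in plain NumPy. Several details are assumptions made for illustration: the function name, resampling each group separately with replacement, the direction of the difference (mean of B minus mean of A), and the 95% default level. The lecture demo may organize the resampling differently.

```python
import numpy as np

def bootstrap_mean_difference_ci(a, b, repetitions=5000, level=95, seed=0):
    """Percentile-bootstrap interval for mean(b) - mean(a).

    Hypothetical helper for this sketch; each group is resampled
    separately, which is an assumption of the example.
    """
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)

    diffs = np.empty(repetitions)
    for i in range(repetitions):
        # Resample each group with replacement, keeping its original size.
        resample_a = rng.choice(a, size=len(a), replace=True)
        resample_b = rng.choice(b, size=len(b), replace=True)
        diffs[i] = resample_b.mean() - resample_a.mean()

    # Take the middle `level` percent of the bootstrapped differences.
    tail = (100 - level) / 2
    return np.percentile(diffs, tail), np.percentile(diffs, 100 - tail)
```

If the resulting interval does not contain 0, the data are consistent with a nonzero difference between the population means, and the endpoints give a sense of how large that difference might be.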

(Demo)

11 of 12

Randomized Controlled Experiments

12 of 12

Causality

  • Sample A: control group
  • Sample B: treatment group

  • If subjects are assigned to the treatment and control groups at random, then you can draw causal conclusions.

  • Any difference in outcomes between the two groups could be due to
    • chance
    • the treatment
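
Random assignment itself is simple to sketch: shuffle the subjects and split the shuffled list. The helper below is hypothetical (its name, the even split, and the seed default are assumptions for this example); it shows only the assignment step, not the analysis of outcomes.

```python
import numpy as np

def randomly_assign(subject_ids, seed=0):
    """Split subjects at random into (control, treatment) groups.

    Hypothetical helper for this sketch; with an odd number of subjects
    the treatment group gets the extra one.
    """
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(subject_ids)  # random order of all subjects
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]
```

Because chance alone decides who lands in each group, the groups should be similar apart from the treatment, which is why a difference too large to attribute to chance can be attributed to the treatment.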

(Demo)