1 of 13

Lecture 22

Comparing Samples — Sit with a partner!

DATA 8

Summer 2017

Slides created by John DeNero (denero@berkeley.edu), Ani Adhikari (adhikari@berkeley.edu), and Sam Lau (samlau95@berkeley.edu)

2 of 13

Announcements

3 of 13

Good Job on the Midterm!

4 of 13

Deflategate

6 of 13

Tom Brady Then

7 of 13

Tom Brady Now

Boston Globe, Sunday 10/9/16

(Demo)

8 of 13

Attendance (new link!)

9 of 13

Comparing Two Samples

10 of 13

Permutation

[Diagram: items A, B, C, D, E with columns X and Y, next to the same items with columns X and ~Y]

11 of 13

Permutation Test

  • Tests whether two samples are drawn randomly from the same underlying distribution
    • Null hypothesis: the attribute distributions are the same for both classes
  • If the null is true, all rearrangements of the attribute values among the two classes are equally likely
  • So compute the observed test statistic
    • Then shuffle the attribute values and recompute the statistic; repeat many times; compare the simulated statistics with the observed statistic
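
The steps above can be sketched in plain NumPy. This is a minimal illustration, not the code from the lecture demo: the function names, the toy data, the choice of difference of group means as the test statistic, and the two-sided p-value are all assumptions made for the example.

```python
import numpy as np


def difference_of_means(values, labels):
    """Test statistic (an assumed choice): mean of class 'A' minus mean of class 'B'."""
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels)
    return values[labels == "A"].mean() - values[labels == "B"].mean()


def permutation_test(values, labels, repetitions=5000, seed=0):
    """Shuffle the attribute values among the class labels and recompute the statistic."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels)

    # Step 1: the statistic on the data as observed.
    observed = difference_of_means(values, labels)

    # Step 2: under the null, every rearrangement of values among the
    # classes is equally likely, so shuffle the values and recompute.
    simulated = np.empty(repetitions)
    for i in range(repetitions):
        shuffled = rng.permutation(values)
        simulated[i] = difference_of_means(shuffled, labels)

    # Step 3: two-sided empirical p-value, the share of simulated
    # statistics at least as extreme as the observed one.
    p_value = np.mean(np.abs(simulated) >= abs(observed))
    return observed, simulated, p_value


# Hypothetical toy data, invented purely for illustration.
values = [1.2, 0.9, 1.4, 1.1, 0.4, 0.5, 0.3, 0.6]
labels = ["A"] * 4 + ["B"] * 4

observed, simulated, p_value = permutation_test(values, labels)
print("observed difference:", observed)
print("empirical p-value:", p_value)
```

Shuffling the values while holding the labels fixed is equivalent to shuffling the labels; either way, each class keeps its original size.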

(Demo)

12 of 13

Official Analysis

"[T]he average pressure drop of the Patriots game balls exceeded the average pressure drop of the Colts balls by 0.45 to 1.02 psi, depending on various possible assumptions regarding the gauges used, and assuming an initial pressure of 12.5 psi for the Patriots balls and 13.0 for the Colts balls."

-- Investigative report commissioned by the NFL regarding the AFC Championship game on January 18, 2015
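
The arithmetic behind a statement like this can be sketched as follows. Only the starting pressures (12.5 psi and 13.0 psi) come from the quote; the halftime readings below are placeholders invented for illustration and are not the report's measurements, and the function name is hypothetical.

```python
import numpy as np

# Starting pressures assumed in the quoted report.
PATRIOTS_START_PSI = 12.5
COLTS_START_PSI = 13.0


def average_pressure_drop(start_psi, halftime_psi):
    """Average of (starting pressure - measured pressure) across a team's balls."""
    halftime_psi = np.asarray(halftime_psi, dtype=float)
    return float(np.mean(start_psi - halftime_psi))


# Placeholder readings, NOT real data from the report.
patriots_halftime = [11.5, 11.0, 11.25]
colts_halftime = [12.5, 12.75]

patriots_drop = average_pressure_drop(PATRIOTS_START_PSI, patriots_halftime)
colts_drop = average_pressure_drop(COLTS_START_PSI, colts_halftime)

# The quantity the report describes: how much the Patriots' average
# drop exceeds the Colts' average drop.
excess_drop = patriots_drop - colts_drop
print("excess average drop (psi):", excess_drop)
```

This difference of average drops is one natural test statistic for comparing the two samples of balls.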

13 of 13

Recap

  • Looking at one sample? Use previous methods.
  • Looking at two?
    • Usually want to see if there is a difference
  • Typical null: The two samples came from the same distribution.
    • Then, use a permutation test: shuffle to simulate the test statistic under the null
  • Calculate p-value as usual.
  • Hooray.