Lecture 11 part 2

Comparing Distributions

Jury Selection in Alameda County

Jury Panels

Section 197 of California's Code of Civil Procedure says, "All persons selected for jury service shall be selected at random, from a source or sources inclusive of a representative cross section of the population of the area served by the court."

[Diagram labels: Eligible jurors in a county; List of eligible residents; Jury panel; Jury]

(Demo)

Two Viewpoints

Model and Alternative

  • Model:
    • The people on the jury panels were selected at random from the eligible population

  • Alternative viewpoint:
    • No, they weren’t

A New Statistic

Distance Between Distributions

  • People on the panels are of multiple ethnicities
  • Distribution of ethnicities is categorical

  • To see whether the distribution of ethnicities on the panels is close to that of the eligible jurors, we have to measure the distance between two categorical distributions

(Demo)

Total Variation Distance

Every distance has a computational recipe

Total Variation Distance (TVD):

  • For each category, compute the difference in proportions between the two distributions
  • Take the absolute value of each difference
  • Sum, and then divide the sum by 2
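
The recipe above can be sketched in a few lines of Python. This is a minimal illustration, not the code from the demo: the function name `total_variation_distance` and the two example distributions are assumptions, and the proportions are made-up placeholders rather than the Alameda County figures.

```python
def total_variation_distance(dist1, dist2):
    """Return the TVD between two categorical distributions.

    Each argument is a list of proportions, one per category,
    with the categories listed in the same order in both lists.
    """
    if len(dist1) != len(dist2):
        raise ValueError("both distributions need the same categories")
    # Difference per category, absolute value, sum, then divide by 2.
    return sum(abs(p - q) for p, q in zip(dist1, dist2)) / 2


# Hypothetical proportions for three categories (placeholders, not real data).
eligible = [0.5, 0.3, 0.2]
panel = [0.6, 0.3, 0.1]

tvd = total_variation_distance(eligible, panel)
print(round(tvd, 4))  # about 0.1
```

The absolute differences here are 0.1, 0 and 0.1, so the sum is 0.2 and the TVD is about 0.1. Dividing by 2 keeps the distance between 0 (identical distributions) and 1 (no overlap at all).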

(Demo)

Summary

Summary of the Method

To assess whether a sample was drawn randomly from a known categorical distribution:

  • Use TVD as the statistic because it measures the distance between categorical distributions
  • Draw a random sample of the same size from the known distribution and compute the TVD between that sample and the distribution; repeat many times
  • Compare:
    • Empirical distribution of simulated TVDs
    • Actual TVD from the sample in the study
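
The whole method can be sketched with the Python standard library as follows. Everything named here is an assumption made for illustration: the helper names, the sample size of 1000, the 2000 repetitions, and especially the proportions, which are made-up placeholders and not the study's data. Reporting the fraction of simulated TVDs at least as large as the observed one is one common way to make the final comparison; it is not necessarily how the demo does it.

```python
import random
from collections import Counter


def total_variation_distance(dist1, dist2):
    """TVD: half the sum of absolute differences in proportions."""
    return sum(abs(p - q) for p, q in zip(dist1, dist2)) / 2


def sample_proportions(population_dist, sample_size, rng):
    """Draw a random sample under the model and return its category proportions."""
    categories = list(range(len(population_dist)))
    draws = rng.choices(categories, weights=population_dist, k=sample_size)
    counts = Counter(draws)
    return [counts[c] / sample_size for c in categories]


def simulate_tvds(population_dist, sample_size, repetitions, seed=0):
    """Simulate the TVD statistic many times under random selection."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [
        total_variation_distance(
            sample_proportions(population_dist, sample_size, rng),
            population_dist,
        )
        for _ in range(repetitions)
    ]


# Hypothetical inputs (placeholders, not the real jury data).
eligible = [0.5, 0.3, 0.2]   # known population distribution
panel = [0.6, 0.3, 0.1]      # distribution observed in the sample
panel_size = 1000            # assumed size of the observed sample

observed_tvd = total_variation_distance(panel, eligible)
simulated_tvds = simulate_tvds(eligible, panel_size, repetitions=2000)

# Compare: how often does random sampling give a TVD this large?
fraction_as_large = sum(t >= observed_tvd for t in simulated_tvds) / len(simulated_tvds)
print("observed TVD:", round(observed_tvd, 4))
print("largest simulated TVD:", round(max(simulated_tvds), 4))
print("fraction of simulations at least as large:", fraction_as_large)
```

If the observed TVD lies far in the right tail of the simulated values, the data are inconsistent with the model of random selection. If it falls within the bulk of the simulated values, the data are consistent with the model.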