Lecture 16
Empirical Distributions
DATA 8
Spring 2022
Announcements
Weekly Goals
Review: Distributions
A Statistic
Inference
Making conclusions based on data in random samples
Use the data to guess the value of an unknown number (fixed)
Create an estimate of the unknown quantity (depends on the random sample)
Terminology
A statistic can be used as an estimate of a parameter
(Demo)
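A minimal sketch of the parameter/statistic distinction, using only the Python standard library. The population below is made up for illustration and is not the data set used in the demo; the helper name `sample_median` is also hypothetical.

```python
import random
import statistics

rng = random.Random(0)

# Hypothetical population of 10,000 made-up values (illustrative only).
population = [rng.randint(0, 100) for _ in range(10_000)]

# Parameter: a fixed number associated with the whole population.
parameter = statistics.median(population)

def sample_median(sample_size):
    """Statistic: the median of one random sample drawn with replacement."""
    sample = rng.choices(population, k=sample_size)
    return statistics.median(sample)

# Each new random sample gives its own estimate of the same fixed parameter.
estimate_1 = sample_median(500)
estimate_2 = sample_median(500)
print("parameter:", parameter)
print("estimates:", estimate_1, estimate_2)
```

The parameter is computed once and never changes; the two estimates usually differ from each other because each depends on the sample that happened to be drawn.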
Probability Distribution of a Statistic
Empirical Distribution of a Statistic
(Demo)
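One way to build an empirical distribution of a statistic is to simulate the statistic many times and tabulate the results. The sketch below assumes a made-up population and a hypothetical helper `one_sample_median`; it is an illustration, not the demo's code.

```python
import random
import statistics
from collections import Counter

rng = random.Random(1)

# Hypothetical population of made-up values (illustrative only).
population = [rng.randint(0, 100) for _ in range(10_000)]

def one_sample_median(sample_size):
    """Draw one random sample with replacement and return its median."""
    return statistics.median(rng.choices(population, k=sample_size))

# Repeat the sampling process and record the statistic each time.
repetitions = 2_000
medians = [one_sample_median(200) for _ in range(repetitions)]

# Empirical distribution: proportion of repetitions at each observed value.
counts = Counter(medians)
empirical_distribution = {
    value: count / repetitions for value, count in sorted(counts.items())
}

for value, proportion in empirical_distribution.items():
    print(value, round(proportion, 3))
```

With more repetitions, the empirical distribution of the statistic tends to look more like its probability distribution.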
Assessing Models
Models
Approach to Assessment
Today’s Examples
Some Goals of Data Science
For example
The skills that you have gained empower you to do this.
First Example
We will study a U.S. Supreme Court case from the 1960s
This case became the foundation of significant reform.
Jury Selection
All eligible jurors in a county
Jury panel: should be representative of the eligible jurors
Jury: chosen by deliberate inclusion or exclusion
US Constitution:
“right to a speedy and public trial, by an impartial jury”
Supreme Court Case
Robert Swain’s Case
Eligible jurors: 26% Black
Jury panel: 8 out of 100 Black (panel should be representative of the eligible jurors)
Jury: 0 Black (chosen by deliberate inclusion or exclusion)
Supreme Court Ruling, 1965
Eligible jurors: 26% Black
Panel should be representative of the eligible jurors
Jury panel: 8 out of 100 Black
About discrepancies like this, the Court wrote:
“the overall percentage disparity has been small”
Discussion Question
Sampling from a Distribution
sample_proportions(sample_size, pop_distribution)
(Demo)
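The idea of sampling from a distribution can be sketched with the standard library alone. The function `simulate_proportions` below is a hypothetical stand-in written for this example, taking the same two kinds of arguments as `sample_proportions(sample_size, pop_distribution)`; it is not the course library's implementation. The model assumed here is that the 100-person panel was drawn at random from eligible jurors who were 26% Black.

```python
import random

rng = random.Random(2)

def simulate_proportions(sample_size, pop_distribution):
    """Hypothetical stand-in: draw sample_size items at random from the
    given category proportions and return the proportions in the sample."""
    categories = list(range(len(pop_distribution)))
    draws = rng.choices(categories, weights=pop_distribution, k=sample_size)
    return [draws.count(c) / sample_size for c in categories]

# Model: panels are selected at random from the eligible jurors.
eligible = [0.26, 0.74]   # [Black, not Black]
panel_size = 100
observed_black = 8        # Black panelists on the actual panel

# Simulate many panels under the model and record the count of Black panelists.
simulated_counts = [
    round(panel_size * simulate_proportions(panel_size, eligible)[0])
    for _ in range(5_000)
]

# How often does random selection give a count as low as the observed one?
as_low = sum(count <= observed_black for count in simulated_counts)
print("average simulated count:", sum(simulated_counts) / len(simulated_counts))
print("panels with", observed_black, "or fewer:", as_low, "of", len(simulated_counts))
```

Comparing the observed count of 8 with the simulated counts gives a way to judge whether the disparity is consistent with random selection.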
Statistical Bias
“only 10 to 15% of … jury panels drawn from the jury box since 1953 have been [Black], there having been only one case in which the percentage was as high as 23%”
A Genetic Model
Gregor Mendel, 1822-1884
A Model
Choosing a Statistic
Distance between the sample percent and the model's 75%:
|sample percent of purple-flowering plants - 75|
(Demo)
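A rough sketch of simulating this statistic under the model that each plant is purple-flowering with chance 75%. The sample size of 1,000 is a hypothetical value chosen for illustration, not a figure from the lecture, and `one_simulated_distance` is a made-up helper name.

```python
import random

rng = random.Random(3)

MODEL_PURPLE_CHANCE = 0.75  # the model: 75% purple-flowering
sample_size = 1_000         # hypothetical sample size (illustrative only)

def one_simulated_distance():
    """Simulate one sample under the model and return
    |sample percent of purple-flowering plants - 75|."""
    purple = sum(rng.random() < MODEL_PURPLE_CHANCE for _ in range(sample_size))
    sample_percent = 100 * purple / sample_size
    return abs(sample_percent - 75)

# Empirical distribution of the statistic under the model.
distances = [one_simulated_distance() for _ in range(2_000)]

print("average simulated distance:", sum(distances) / len(distances))
print("largest simulated distance:", max(distances))
```

Using the distance rather than the raw percent means that both unusually high and unusually low sample percents show up as large values of the statistic.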
Two Viewpoints
Model and Alternative
Steps in Assessing a Model
Next time