Lecture 28
Designing Experiments
DATA 8
Fall 2023
Announcements
Weekly Goals
Review: SD and Bell-Shaped Curves
If a histogram is bell-shaped, then
Distribution of the Average of a Large Sample
CLT with More Details
If the sample is large and drawn at random with replacement:
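Then, regardless of the distribution of the population, the probability distribution of the sample average is roughly normal, with mean equal to the population average and SD = (population SD) / √sample size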
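The claim can be checked by simulation. The sketch below is not from the lecture: it assumes a made-up skewed population (exponential values with mean about 10) and uses only the Python standard library. It draws many large samples with replacement and compares the center and spread of the sample averages with the population mean and with (population SD) / √sample size.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical skewed population (an assumption, not the lecture's data).
population = [random.expovariate(1 / 10) for _ in range(20_000)]
pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)

def sample_average(sample_size):
    """Average of one random sample drawn with replacement."""
    return statistics.fmean(random.choices(population, k=sample_size))

sample_size = 400
averages = [sample_average(sample_size) for _ in range(2_000)]

center = statistics.fmean(averages)              # should be near pop_mean
spread = statistics.pstdev(averages)             # should be near predicted_spread
predicted_spread = pop_sd / sample_size ** 0.5   # (population SD) / sqrt(sample size)

# Fraction of sample averages within 2 predicted SDs of the population mean.
within_2_sds = sum(
    abs(a - pop_mean) <= 2 * predicted_spread for a in averages
) / len(averages)

print(round(pop_mean, 2), round(center, 2))
print(round(predicted_spread, 3), round(spread, 3))
print(within_2_sds)
```

Although the population is far from bell-shaped, the fraction printed last should come out close to 0.95.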
Increasing Sample Size
Three Different SDs
Population of flight delays
Random sample of 100 flights
SD of sample average: 27/sqrt(100) = 2.7
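The arithmetic on this slide can be reproduced in a couple of lines. The values 27 and 100 come from the slide; the variable names are only illustrative.

```python
import math

sd = 27            # SD quoted on the slide for the flight delays
sample_size = 100  # number of flights in the random sample

# SD of the sample average = SD / sqrt(sample size)
sd_of_sample_average = sd / math.sqrt(sample_size)
print(sd_of_sample_average)  # 2.7
```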
Confidence Intervals
Graph of the Distribution
The Key to 95% Confidence
SD of the sample average = (population SD) / √sample size
1 SD above the mean
2 SDs above the mean
Constructing the Interval
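For 95% of all samples, the sample average is within 2 SDs of the population average, where SD means the SD of the sample average. So for 95% of all samples, the interval "sample average ± 2 * SD of the sample average" contains the population average.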
The Interval
(Demo)
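A minimal sketch of the construction, not the lecture's demo code. The sample is made up, and because the population SD is usually unknown, the sketch substitutes the SD of the sample for it; both choices are assumptions made for illustration.

```python
import math
import random
import statistics

random.seed(1)

# Hypothetical sample of 100 values (made-up numbers, not the flight data).
sample = [random.gauss(15, 27) for _ in range(100)]

sample_average = statistics.fmean(sample)
# Assumption: the sample SD stands in for the unknown population SD.
estimated_sd = statistics.pstdev(sample)
sd_of_sample_average = estimated_sd / math.sqrt(len(sample))

# Approximate 95% confidence interval: sample average +/- 2 SDs of the sample average.
lower = sample_average - 2 * sd_of_sample_average
upper = sample_average + 2 * sd_of_sample_average
print(round(lower, 2), round(upper, 2))
```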
Summarizing: construction of intervals
Question
If we can make a 95% confidence interval in this way:
This method works only for averages and sums, because it is based on the CLT. The bootstrap is a more general approach that also works for other statistics, such as the median.
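For a statistic such as the median, where the CLT formula does not apply, a bootstrap percentile interval can be sketched as follows. The sample is made up, and the helper name `bootstrap_medians` is hypothetical; only the Python standard library is used.

```python
import random
import statistics

random.seed(2)

# Hypothetical skewed sample (made-up values, for illustration only).
sample = [random.expovariate(1 / 10) for _ in range(200)]

def bootstrap_medians(data, repetitions):
    """Medians of resamples drawn with replacement, each the size of the data."""
    return [
        statistics.median(random.choices(data, k=len(data)))
        for _ in range(repetitions)
    ]

medians = sorted(bootstrap_medians(sample, 2_000))
# Middle 95% of the bootstrap medians: cut 2.5% from each tail.
lower = medians[int(0.025 * len(medians))]
upper = medians[int(0.975 * len(medians)) - 1]
print(round(lower, 2), round(upper, 2))
```

The same resampling loop works for any statistic by swapping `statistics.median` for another function.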
Width of the Interval
Total width of a 95% confidence interval for the population average
= 4 * SD of the sample average
= 4 * (population SD) / √sample size
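The formula implies that quadrupling the sample size halves the width. A small sketch, reusing the SD of 27 from the flight-delay slide purely as an example value; the function name `ci_width` is hypothetical.

```python
import math

def ci_width(population_sd, sample_size):
    """Total width of an approximate 95% confidence interval for the average."""
    return 4 * population_sd / math.sqrt(sample_size)

# Each step multiplies the sample size by 4, so each width is half the previous one.
for n in (100, 400, 1600):
    print(n, ci_width(27, n))
```

With these inputs the widths are 10.8, 5.4, and 2.7.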
Sample Proportions
Proportions are Averages
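If the population consists of 1’s and 0’s (yes/no answers to a question), then the population average is the proportion of 1’s in the population, and the population SD is √((proportion of 1’s) * (proportion of 0’s)).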
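For a population of 0's and 1's, the average equals the proportion of 1's, and the SD equals √(p * (1 - p)), where p is that proportion. The check below uses a made-up population with 30% yes answers; the numbers are an assumption chosen for illustration.

```python
import math
import statistics

# Hypothetical yes/no population: 1 = yes, 0 = no (30% yes, made up).
population = [1] * 300 + [0] * 700

proportion_of_ones = population.count(1) / len(population)
average = statistics.fmean(population)

# SD computed directly from the data, and from the shortcut sqrt(p * (1 - p)).
sd = statistics.pstdev(population)
sd_from_formula = math.sqrt(proportion_of_ones * (1 - proportion_of_ones))

print(proportion_of_ones, average)
print(round(sd, 4), round(sd_from_formula, 4))
```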
Confidence Interval
Controlling the Width
Total width of a 95% confidence interval for the population proportion
= 4 * (SD of 0/1 population) / √sample size
The Sample Size for a Given Width
0.01 = 4 * (SD of 0/1 population) / √sample size
√sample size = 4 * (SD of 0/1 population) / 0.01
(Demo)
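Solving for the sample size means squaring the right-hand side. A sketch under stated assumptions: the function name `sample_size_for_width` is hypothetical, and the SD of 0.5 passed in the example is a placeholder, since the SD of the 0/1 population is not known in advance.

```python
import math

def sample_size_for_width(width, sd_of_population):
    """Smallest sample size whose approximate 95% interval is no wider than `width`."""
    root_n = 4 * sd_of_population / width  # this is sqrt(sample size)
    return math.ceil(root_n ** 2)

# Target width 0.01, with an assumed 0/1 population SD of 0.5.
print(sample_size_for_width(0.01, 0.5))
```

Under that assumption the required sample size is 40,000.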
“Worst Case” Population SD
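The SD of a 0/1 population is √(p * (1 - p)), where p is the proportion of 1's. A quick scan over possible values of p, sketched below with a hypothetical helper name, suggests that the SD is largest at p = 0.5, where it equals 0.5. Using 0.5 as the SD is therefore a safe choice when p is unknown.

```python
import math

def sd_of_zero_one_population(p):
    """SD of a population in which a proportion p of the values are 1."""
    return math.sqrt(p * (1 - p))

# Try p = 0.00, 0.01, ..., 1.00 and keep the value with the largest SD.
proportions = [i / 100 for i in range(101)]
worst_p = max(proportions, key=sd_of_zero_one_population)
print(worst_p, sd_of_zero_one_population(worst_p))  # 0.5 0.5
```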
Discussion Question
Discussion Question
width = 4 * (0.5) / √1004
width ≈ 0.0631, so margin of error ≈ 0.0316 (about 3.2%)
Discussion Question
Fill in the blank with a decimal:
Discussion Question
width = 4 * (0.5) / √10000
width = 0.02, so margin of error = 0.01
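The two discussion answers use the same calculation with different sample sizes. The sketch below assumes the worst-case SD of 0.5 and takes the margin of error to be half the width; the function name `worst_case_width` is hypothetical.

```python
import math

def worst_case_width(sample_size):
    """Width of an approximate 95% interval for a proportion, using SD = 0.5."""
    return 4 * 0.5 / math.sqrt(sample_size)

for n in (1004, 10000):
    width = worst_case_width(n)
    margin_of_error = width / 2
    print(n, round(width, 4), round(margin_of_error, 4))
```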
Discussion Question
Discussion Question
0.025 = 2 * (0.5) / √sample size
√sample size = 2 * (0.5) / 0.025
sample size = 40**2 = 1600