1 of 17

Lecture 26

The Normal Distribution

DATA 8

Spring 2023

2 of 17

Announcements

  • Homework 8 due Wednesday at 11pm
  • Lab 8 due Friday at 11pm
  • Midterm Regrade Requests are due Friday at 11:59pm
  • My OH Today: 4-6pm @ FSM

3 of 17

Review: Standard Units

  • How many SDs above average?
  • z = (value - average)/SD
    • Negative z: value below average
    • Positive z: value above average
    • z = 0: value equal to average
  • When values are in standard units: average = 0, SD = 1
  • Standard units let us compare and interpret data regardless of the original units
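
As a minimal sketch of the conversion above, the following uses only the Python standard library. The function name `standard_units` and the `heights_cm` values are made up for illustration, and the SD is assumed to be the population SD (dividing by n).

```python
import statistics

def standard_units(values):
    """Convert values to standard units: z = (value - average) / SD."""
    average = statistics.fmean(values)
    sd = statistics.pstdev(values)  # assumption: population SD (divide by n)
    return [(value - average) / sd for value in values]

# Made-up data, purely for illustration
heights_cm = [150, 160, 165, 170, 180, 195]
z = standard_units(heights_cm)

# Values below the average get negative z, values above get positive z
print([round(x, 2) for x in z])

# In standard units the average is 0 and the SD is 1 (up to rounding error)
print(round(statistics.fmean(z), 10), round(statistics.pstdev(z), 10))
```

Because the result has no units, the same function could be applied to heights in inches and give the same z values.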

4 of 17

The SD and the Histogram

  • Usually, it's not easy to estimate the SD by looking at a histogram.

  • But if the histogram has a bell shape, then you can.

(Demo)

5 of 17

The SD and Bell-Shaped Curves

If a histogram is bell-shaped, then

  • the average is at the center

  • the SD is the distance between the average and the points of inflection on either side
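
One way to check the inflection-point claim numerically is to look for where the standard normal curve switches from curving upward to curving downward. This is a rough sketch, not part of the lecture demo; the helper names `phi` and `second_difference` are hypothetical, and the curve is assumed to have average 0 and SD 1.

```python
import math

def phi(z):
    """Height of the standard normal curve (average 0, SD 1) at z."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

def second_difference(f, z, h=1e-3):
    """Numerical estimate of the second derivative of f at z."""
    return (f(z + h) - 2 * f(z) + f(z - h)) / (h * h)

# Scan z from about -3 to 3 in steps of 0.01 and record where the
# curvature changes sign; those points are the points of inflection.
inflection_points = []
for i in range(-300, 299):
    left = (2 * i + 1) / 200
    right = (2 * i + 3) / 200
    if second_difference(phi, left) * second_difference(phi, right) < 0:
        inflection_points.append((i + 1) / 100)  # midpoint of left and right

print(inflection_points)  # prints [-1.0, 1.0]
```

The two sign changes sit one SD on either side of the average, which is why the SD can be read off a bell-shaped histogram.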

6 of 17

Points of Inflection

7 of 17

The Normal Distribution

8 of 17

The Standard Normal Curve

A beautiful formula that we won’t use at all:
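
For reference, the standard normal curve is conventionally written as the following density, where z is measured in standard units. The symbol φ is the usual textbook notation and is an assumption about what the slide showed.

```latex
\phi(z) = \frac{1}{\sqrt{2\pi}} \, e^{-\frac{1}{2} z^{2}}, \qquad -\infty < z < \infty
```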

9 of 17

Bell Curve

10 of 17

Normal Proportions

11 of 17

How Big are Most of the Values?

No matter what the shape of the distribution, the bulk of the data are in the range “average ± a few SDs”.

If a histogram is bell-shaped, then

  • almost all of the data are in the range “average ± 3 SDs”

12 of 17

Bounds and Normal Approximations
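
The “average ± a few SDs” statements can be made concrete by comparing a bound that holds for every distribution with the proportions under a normal curve. The sketch below assumes the bound in question is Chebyshev's bound (at least 1 - 1/z² of the data lie within z SDs of the average), a standard result that these slides do not name. The function names are hypothetical.

```python
from statistics import NormalDist

def chebyshev_lower_bound(z):
    """At least this proportion is within z SDs of average, for any distribution."""
    return max(0.0, 1 - 1 / z**2)

def normal_proportion(z):
    """Proportion within z SDs of average if the distribution is normal."""
    standard_normal = NormalDist()  # average 0, SD 1
    return standard_normal.cdf(z) - standard_normal.cdf(-z)

for z in (1, 2, 3):
    print(
        f"average ± {z} SDs: "
        f"at least {chebyshev_lower_bound(z):.2%} (any shape), "
        f"about {normal_proportion(z):.2%} (normal)"
    )
```

The bound is much weaker than the normal proportion because it has to hold for every possible shape, not just bell-shaped ones.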

13 of 17

A “Central” Area

14 of 17

Central Limit Theorem

15 of 17

Sample Averages

  • The Central Limit Theorem describes how the normal distribution (a bell-shaped curve) is connected to random sample averages.
  • We care about sample averages because they estimate population averages.

(Demo)

16 of 17

Central Limit Theorem

If the sample is

  • large, and
  • drawn at random with replacement,

then, regardless of the distribution of the population, the probability distribution of the sample average is roughly normal.
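
A small simulation can illustrate the theorem. This is a sketch under stated assumptions rather than the lecture demo: the population is assumed to be exponential with average 1 (a heavily skewed shape), independent draws from it stand in for sampling at random with replacement, and the sample size of 400 and the 2,000 repetitions are arbitrary choices.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

def sample_average(sample_size):
    """Average of one random sample from a skewed (exponential) population."""
    sample = [random.expovariate(1.0) for _ in range(sample_size)]
    return statistics.fmean(sample)

sample_size = 400
averages = [sample_average(sample_size) for _ in range(2000)]

center = statistics.fmean(averages)
spread = statistics.pstdev(averages)

# For a roughly normal distribution, about 68% of values fall within
# 1 SD of the center and about 95% fall within 2 SDs.
within_1 = sum(abs(a - center) <= spread for a in averages) / len(averages)
within_2 = sum(abs(a - center) <= 2 * spread for a in averages) / len(averages)

print(f"center of the sample averages: {center:.3f}")
print(f"within 1 SD: {within_1:.1%}, within 2 SDs: {within_2:.1%}")
```

Even though individual values from this population are far from bell-shaped, the sample averages cluster symmetrically around the population average.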

17 of 17

Discussion Question

After rolling 1,000,000 fair 6-sided dice, which of these histograms would you expect to have a bell shape? Check all that apply.

  1. The histogram of outcomes of these million rolls
  2. The histogram that results from computing the average outcome of these million rolls
  3. The histogram that results from splitting the outcomes into 1,000 groups of 1,000 (in the order they occurred) and computing the average outcome of each group
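
One way to explore the question empirically is to simulate the rolls and build each of the three quantities. The sketch below uses only the standard library; the seed, the variable names, and the 0.05-wide bins in the text histogram are arbitrary choices.

```python
import random
import statistics
from collections import Counter

random.seed(8)  # fixed seed so the sketch is reproducible

# Simulate 1,000,000 rolls of a fair 6-sided die
rolls = random.choices(range(1, 7), k=1_000_000)

# 1. Outcomes of the million rolls
outcome_counts = Counter(rolls)
print("Counts of each face:", sorted(outcome_counts.items()))

# 2. Average outcome of the million rolls (a single number)
overall_average = statistics.fmean(rolls)
print("Average of all rolls:", round(overall_average, 4))

# 3. Averages of 1,000 consecutive groups of 1,000 rolls each
group_averages = [
    statistics.fmean(rolls[start:start + 1000])
    for start in range(0, len(rolls), 1000)
]

# Rough text histogram of the group averages, in bins of width 0.05
bins = Counter(round(average * 20) / 20 for average in group_averages)
for bin_center in sorted(bins):
    print(f"{bin_center:.2f} {'#' * (bins[bin_center] // 5)}")
```

Comparing the three printed summaries against the Central Limit Theorem's conditions is a useful check on your answer.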