ECO220: Probability
Simon Halliday
First dice-rolling activity
Second dice-rolling activity
Third dice-rolling activity
Important ideas
How do we get different combinations?
Back to the second dice-rolling activity
Tree diagram: Roll 1 branches into Odd or Even, and each branch splits again into Odd or Even for Roll 2, giving four combinations:
Odd, Odd
Odd, Even
Even, Odd
Even, Even
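The four Odd/Even combinations for two rolls can also be listed by a short script. This is a minimal sketch that assumes a fair six-sided die, so that Odd and Even are equally likely on each roll; the variable names are illustrative.

```python
from itertools import product

# Each roll is classified only as Odd or Even.
roll_outcomes = ["Odd", "Even"]

# All ordered combinations for Roll 1 followed by Roll 2.
combinations = list(product(roll_outcomes, repeat=2))

# Assumption: a fair die, so the four combinations are equally likely.
probability_each = 1 / len(combinations)

for first, second in combinations:
    print(f"{first}, {second}: {probability_each}")
```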
Some basic rules
Addition rule:
P[A or B] = P[A] + P[B] - P[A and B]
Sometimes events are mutually exclusive, in which case this simplifies to:
P[A or B] = P[A] + P[B]
(when events are mutually exclusive, A and B cannot both occur, so P[A and B] = 0).
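A quick check of the addition rule, as a sketch. The events used here (rolling an even number, rolling a number greater than 4) are hypothetical examples on a fair six-sided die, not taken from the class activities.

```python
from fractions import Fraction

faces = range(1, 7)  # assumption: one fair six-sided die


def prob(event):
    """Probability that a single roll satisfies `event`."""
    favourable = [f for f in faces if event(f)]
    return Fraction(len(favourable), 6)


def is_even(f):
    return f % 2 == 0


def is_high(f):
    return f > 4  # hypothetical event: a 5 or a 6


p_a = prob(is_even)
p_b = prob(is_high)
p_a_and_b = prob(lambda f: is_even(f) and is_high(f))  # only the face 6

# Addition rule: P[A or B] = P[A] + P[B] - P[A and B]
p_a_or_b = p_a + p_b - p_a_and_b

# The rule agrees with counting the faces {2, 4, 5, 6} directly.
assert p_a_or_b == prob(lambda f: is_even(f) or is_high(f))

# Mutually exclusive events (rolling a 1, rolling a 2): no overlap to subtract.
p_one_or_two = prob(lambda f: f == 1) + prob(lambda f: f == 2)

print(p_a_or_b, p_one_or_two)
```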
Some basic rules
Subtraction Rule
An event A either occurs or does not occur, so A and Not A are the only two possibilities:
P[A] + P[Not A] = 1
Therefore:
P[Not A] = 1 - P[A]
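The subtraction rule is handy when "Not A" is easier to count than A. The sketch below uses a hypothetical example, the chance of at least one six in two rolls of a fair die, and counts the 36 equally likely pairs directly.

```python
from fractions import Fraction
from itertools import product

# Assumption: two rolls of a fair six-sided die, 36 equally likely pairs.
rolls = list(product(range(1, 7), repeat=2))

# "Not A": neither roll shows a six.
no_six = [pair for pair in rolls if 6 not in pair]
p_no_six = Fraction(len(no_six), len(rolls))

# Subtraction rule: P[A] = 1 - P[Not A]
p_at_least_one_six = 1 - p_no_six

print(p_no_six, p_at_least_one_six)
```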
Some basic rules
Multiplication Rule:
P[A and B] = P[A] ⋅ P[B|A]
E.g. probability of two aces when drawing two cards, without replacement, from a standard 52-card deck = 4/52 ⋅ 3/51
Or, when the events are independent:
P[A and B] = P[A] ⋅ P[B]
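The two-aces example can be worked through with exact fractions. This is a sketch; the independent case uses two fair coin flips as a hypothetical illustration.

```python
from fractions import Fraction

# Dependent events: the second draw depends on the first (no replacement).
p_first_ace = Fraction(4, 52)
p_second_ace_given_first = Fraction(3, 51)
p_two_aces = p_first_ace * p_second_ace_given_first

# Independent events (hypothetical example): two heads in two fair coin flips.
p_heads = Fraction(1, 2)
p_two_heads = p_heads * p_heads

print(p_two_aces, float(p_two_aces))
print(p_two_heads)
```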
Some admin
A couple of things:
We have no lab next week, so lab 1 will be due the week following.
I have to cancel office hours today because I have to go to the dentist and the only appointment I could get conflicts with office hours.
Bayes’ Rule and Posterior Probabilities
Need to be able to calculate how likely something is given prior information
Bayes’ Rule and Posterior Probabilities
What does all of this mean?
Terminology...
Applying Bayes’ Rule
Create a Contingency Table
| | Test Positive | Test Negative | Total |
| Have Cancer | | | |
| Don’t Have Cancer | | | |
| Total | | | 1000 |
Create a Contingency Table
| | Test Positive | Test Negative | Total |
| Have Cancer | 8 | 2 | 10 |
| Don’t Have Cancer | 95 | 895 | 990 |
| Total | 103 | 897 | 1000 |
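False negative: has cancer but tests negative (2 of the 10 people with cancer)
False positive: doesn’t have cancer but tests positive (95 of the 990 people without cancer)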
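Reading the posterior probability off the contingency table takes one division. The sketch below copies the four cell counts from the table; the dictionary layout is just one convenient way to hold them.

```python
# Cell counts copied from the contingency table (1000 people in total).
table = {
    ("Have Cancer", "Test Positive"): 8,
    ("Have Cancer", "Test Negative"): 2,
    ("Don't Have Cancer", "Test Positive"): 95,
    ("Don't Have Cancer", "Test Negative"): 895,
}

# Everyone who tests positive, with or without cancer.
total_positive = sum(
    count for (status, result), count in table.items()
    if result == "Test Positive"
)

# P[Have Cancer | Test Positive] = true positives / all positives
p_cancer_given_positive = table[("Have Cancer", "Test Positive")] / total_positive

print(total_positive, round(p_cancer_given_positive, 3))
```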
Natural Frequencies and Bayes Rule
When we construct a contingency table, we are playing on our understanding of natural frequencies, which people are better at understanding than probabilities:
Try this question
Suppose that there are two kinds of households, the Careless and the Careful; 99 percent of households are Careful and 1 percent are Careless. In any given year, a home inhabited by a Careless household has a 0.010 probability of being destroyed by fire, and a home occupied by a Careful household has a 0.001 probability of being destroyed by fire. If a home is destroyed by fire, what is the probability that it was occupied by a Careless household?
Careless Fires
Let’s take this step-by-step:
Put these together to use Bayes’ Rule!
Careless Fires
The probability that the household was Careless is about 9%: greater than the 1% we started with, but still not the HUGE percentage some people might expect.
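The Careless Fires calculation can be sketched with exact fractions, using only the numbers given in the question. The variable names are illustrative.

```python
from fractions import Fraction

# Prior probabilities of each kind of household.
p_careless = Fraction(1, 100)
p_careful = Fraction(99, 100)

# Probability of a fire in a given year for each kind of household.
p_fire_given_careless = Fraction(10, 1000)   # 0.010
p_fire_given_careful = Fraction(1, 1000)     # 0.001

# Total probability of a fire, adding up both kinds of household.
p_fire = p_careless * p_fire_given_careless + p_careful * p_fire_given_careful

# Bayes' Rule: P[Careless | Fire] = P[Careless] * P[Fire | Careless] / P[Fire]
p_careless_given_fire = p_careless * p_fire_given_careless / p_fire

print(p_careless_given_fire, round(float(p_careless_given_fire), 4))
```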
Create a Contingency Table
| | Fire | No Fire | Total |
| Careless | | | |
| Careful | | | |
| Total | | | 100,000 |
Create a Contingency Table
| | Fire | No Fire | Total |
| Careless | 10 | 990 | 1,000 |
| Careful | 99 | 98,901 | 99,000 |
| Total | 109 | 99,891 | 100,000 |
False negative
False positive
Natural Frequencies and Bayes Rule
Put together the natural frequencies to find the probability: 10 out of 109, or about 9%.
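The natural-frequency version of the same question can be built up from 100,000 imagined households. This is a sketch with illustrative variable names; integer arithmetic keeps every cell a whole number of households.

```python
households = 100_000

# Row totals: 1 percent Careless, the rest Careful.
careless = households * 1 // 100
careful = households - careless

# Fires in each row: 10 per 1,000 Careless homes, 1 per 1,000 Careful homes.
careless_fire = careless * 10 // 1000
careful_fire = careful * 1 // 1000

# Remaining cells and column totals.
careless_no_fire = careless - careless_fire
careful_no_fire = careful - careful_fire
total_fire = careless_fire + careful_fire
total_no_fire = careless_no_fire + careful_no_fire

# Of all homes destroyed by fire, the share that were Careless.
share_careless = careless_fire / total_fire

print(careless_fire, total_fire, round(share_careless, 3))
```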
Midterm 1
Random Variables
Theoretical: reflects the probabilities from the random variable.
Empirical: reflects observations from the actual coin flips.
A simulation of 1000 repetitions of 500 coin flips, with the number of heads out of those 500 on the x-axis. Repetitions with more than 250 heads are highlighted in pink, and those with fewer than 250 in blue.
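A simulation like the one described can be sketched in a few lines. This is not the code behind the figure: it assumes a fair coin, uses an arbitrary seed so the run is repeatable, and only counts the repetitions instead of plotting them.

```python
import random

REPETITIONS = 1000
FLIPS = 500

rng = random.Random(220)  # arbitrary seed, chosen only for repeatability


def count_heads(flips, rng):
    """Number of heads in `flips` tosses of a fair coin."""
    return sum(rng.random() < 0.5 for _ in range(flips))


heads_counts = [count_heads(FLIPS, rng) for _ in range(REPETITIONS)]

# The groups highlighted in the figure, plus the exact ties.
more_than_half = sum(h > 250 for h in heads_counts)
fewer_than_half = sum(h < 250 for h in heads_counts)
exactly_half = sum(h == 250 for h in heads_counts)

mean_heads = sum(heads_counts) / REPETITIONS

print(more_than_half, fewer_than_half, exactly_half, mean_heads)
```

The empirical average should land close to the theoretical value of 250, but the individual counts vary from run to run.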
Expected Value of a random variable
μ (mu) is the expected value, E[X], of a random variable X:
μ = E[X] = Σ xi ⋅ P[xi]
This is the sum of each possible value xi multiplied by that value’s probability.
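As a sketch of the probability-weighted sum, here is the expected value of one roll of a fair six-sided die. The die is a hypothetical example chosen to match the earlier activities.

```python
from fractions import Fraction

# Probability distribution: each face of a fair die has probability 1/6.
distribution = {x: Fraction(1, 6) for x in range(1, 7)}

# A valid distribution has probabilities that sum to 1.
assert sum(distribution.values()) == 1

# Expected value: each value multiplied by its probability, then summed.
expected_value = sum(x * p for x, p in distribution.items())

print(expected_value, float(expected_value))
```

Note that the expected value, 3.5, is not a face that can actually be rolled; it is a long-run average.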
Expected Value: Example
Variance of a random variable
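A minimal sketch, assuming the standard definition of variance as the probability-weighted average of squared deviations from the mean, with the standard deviation as its square root. The fair die is again a hypothetical example.

```python
from fractions import Fraction
from math import sqrt

# Assumption: one roll of a fair six-sided die.
distribution = {x: Fraction(1, 6) for x in range(1, 7)}

mean = sum(x * p for x, p in distribution.items())

# Variance: squared deviation from the mean, weighted by probability.
variance = sum((x - mean) ** 2 * p for x, p in distribution.items())

# Standard deviation: square root of the variance.
standard_deviation = sqrt(variance)

print(variance, round(standard_deviation, 3))
```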
Standardized Variable
The standardized value Z of a random variable X is determined by subtracting the mean of the probability distribution and dividing by the standard deviation:
Observed data can be similarly transformed by subtracting the mean of the data and then dividing by the standard deviation:
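Standardizing observed data can be sketched as below. The data values are made up for illustration, and the sketch assumes the population standard deviation (`statistics.pstdev`); using the sample standard deviation (`statistics.stdev`) would give slightly different Z values.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations

mean = statistics.mean(data)
sd = statistics.pstdev(data)  # assumption: population standard deviation

# Z = (observation - mean) / standard deviation
z_values = [(x - mean) / sd for x in data]

print(mean, sd)
print(z_values)
```

After standardizing, the Z values have mean 0 and standard deviation 1, whatever units the original data were in.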
Standardized Variables
Central Limit Theorem
If Z is the sum of n independent, identically distributed random variables with a finite, non-zero standard deviation, then the probability distribution of Z approaches the normal distribution as n increases.
10 Coin Flips
50 Coin Flips
1000 Coin Flips
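One way to see the distribution of heads getting closer to the normal curve as the number of flips grows is to compare exact coin-flip probabilities with the normal approximation. This is a sketch under the assumption of a fair coin, where the number of heads in n flips has mean n/2 and standard deviation sqrt(n)/2; the function name is illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist


def max_cdf_gap(n):
    """Largest gap between P[heads <= k] and the normal curve's value at k."""
    normal = NormalDist(mu=n / 2, sigma=sqrt(n) / 2)
    total_outcomes = 2 ** n
    cumulative = 0  # running count of outcomes with at most k heads
    gap = 0.0
    for k in range(n + 1):
        cumulative += comb(n, k)
        exact = cumulative / total_outcomes
        gap = max(gap, abs(exact - normal.cdf(k)))
    return gap


gaps = {n: max_cdf_gap(n) for n in (10, 50, 1000)}

for n, gap in gaps.items():
    print(n, round(gap, 4))
```

The gap shrinks as n goes from 10 to 50 to 1000, which is the pattern the Central Limit Theorem describes.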
Central Limit Theorem
Why is the CLT useful?
Central Limit Theorem
What about the assumptions (independent & identically distributed)?
The probability that a specific value of Z will be in a specified interval is given by the corresponding area under the normal curve.
The total area is 1.
Half of the area, on either side of the mean of 0, is 0.5.
Looking at P[0 < Z < 1.25], we compare the areas:
Use the Z tables provided to find the following probabilities:
P[-0.75 < Z < -0.5]
P[1 < Z < 2]
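Table lookups like these can be checked with Python's built-in `statistics.NormalDist`. This is a sketch for checking work, not a replacement for practising with the Z tables; `prob_between` is an illustrative helper name.

```python
from statistics import NormalDist

Z = NormalDist(mu=0, sigma=1)  # the standard normal distribution


def prob_between(low, high):
    """Area under the standard normal curve between `low` and `high`."""
    return Z.cdf(high) - Z.cdf(low)


p_0_to_125 = prob_between(0, 1.25)
p_neg = prob_between(-0.75, -0.5)
p_1_to_2 = prob_between(1, 2)

print(round(p_0_to_125, 4), round(p_neg, 4), round(p_1_to_2, 4))
```

Each answer is a difference of two cumulative areas, which is the same subtraction done by hand with the table.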
Describing a distribution
We have notation for talking about means, standard deviations and a kind of distribution:
Z ~ N(8, 12)
Read as: Z is distributed normally with mean 8 and standard deviation 12.
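A variable described this way can be converted to a standardized value before using the Z tables. The sketch below reads the notation as mean 8 and standard deviation 12, and the cutoff value of 20 is a hypothetical example.

```python
from statistics import NormalDist

# N(8, 12): mean 8, standard deviation 12.
X = NormalDist(mu=8, sigma=12)

x = 20  # hypothetical cutoff value

# Standardize: subtract the mean, divide by the standard deviation.
z = (x - X.mean) / X.stdev

# The probability is the same whether computed directly or via Z.
p_direct = X.cdf(x)
p_via_z = NormalDist().cdf(z)

print(z, round(p_direct, 4), round(p_via_z, 4))
```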
Method for answering questions
For mathematical/formula-based questions in a problem set:
For graphical questions:
Problem 4.15
In an interview for a banking job, a student was asked to calculate the following:
A can contains 20 coins, 19 of which are ordinary coins and 1 of which has heads on both sides. A coin is randomly selected from the can and flipped five times. It lands heads all five times. What is the probability that it is the two-headed coin?
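One way to check an answer to this question is to apply Bayes' Rule with exact fractions. This is a sketch that assumes the 19 ordinary coins are fair; the variable names are illustrative.

```python
from fractions import Fraction

# Prior probabilities of drawing each kind of coin.
p_two_headed = Fraction(1, 20)
p_ordinary = Fraction(19, 20)

# Probability of five heads in five flips for each kind of coin.
p_five_heads_given_two_headed = Fraction(1)
p_five_heads_given_ordinary = Fraction(1, 2) ** 5  # assumption: fair coin

# Total probability of seeing five heads.
p_five_heads = (
    p_two_headed * p_five_heads_given_two_headed
    + p_ordinary * p_five_heads_given_ordinary
)

# Bayes' Rule: P[two-headed | five heads]
posterior = p_two_headed * p_five_heads_given_two_headed / p_five_heads

print(posterior, round(float(posterior), 3))
```

In natural-frequency terms: out of 640 imagined draws, 32 pick the two-headed coin and show five heads, while 19 pick an ordinary coin and still show five heads.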
My guidance:
Problem 3.12
Roll four six-sided dice ten times, each time recording the sum of the four numbers rolled. Calculate the mean and median of your ten rolls. Repeat this experiment 20 times. Which of these two measures seems to be the least stable?
How would you go about doing this?
Think about how we used Google Sheets and what you could do there rather than rolling actual dice.
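The same experiment can be simulated in code instead of a spreadsheet. This is a sketch in Python, not the Google Sheets approach used in class; it assumes fair dice, uses an arbitrary seed for repeatability, and measures "stability" by the standard deviation of each statistic across the 20 experiments, which is one reasonable choice among several.

```python
import random
import statistics

rng = random.Random(312)  # arbitrary seed so the run is repeatable


def one_experiment(rng, rolls=10, dice=4):
    """Roll `dice` dice `rolls` times; return the mean and median of the sums."""
    sums = [sum(rng.randint(1, 6) for _ in range(dice)) for _ in range(rolls)]
    return statistics.mean(sums), statistics.median(sums)


# Repeat the ten-roll experiment 20 times.
results = [one_experiment(rng) for _ in range(20)]
means = [mean for mean, _ in results]
medians = [median for _, median in results]

# How much each statistic moves around from experiment to experiment.
spread_of_means = statistics.stdev(means)
spread_of_medians = statistics.stdev(medians)

print(round(spread_of_means, 3), round(spread_of_medians, 3))
```

The statistic with the larger spread is the less stable one in that run; with only 20 experiments the comparison can change from seed to seed, so it is worth trying several.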