1 of 73

ECO220: Probability

Simon Halliday

2 of 73

3 of 73

4 of 73

First dice-rolling activity

5 of 73

Second dice-rolling activity

6 of 73

Third dice-rolling activity

7 of 73

Important ideas

  • One event vs. a sequence of events
  • The “equally likely” approach
  • One event vs. long-run frequencies

8 of 73

9 of 73

10 of 73

How do we get different combinations?

11 of 73

12 of 73

13 of 73

14 of 73

15 of 73

16 of 73

17 of 73

18 of 73

19 of 73

Back to the second dice-rolling activity

20 of 73

[Tree diagram: Roll 1 is Odd or Even; each branch splits again at Roll 2 into Odd or Even, giving four outcomes: (Odd, Odd), (Odd, Even), (Even, Odd), (Even, Even).]

21 of 73

22 of 73

Some basic rules

Addition rule:

P[A or B] = P[A] + P[B] - P[A and B]

When events are mutually exclusive, this simplifies to:

P[A or B] = P[A] + P[B]

(mutually exclusive events cannot occur together, so P[A and B] = 0).
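
As a quick check of the addition rule, here is a minimal Python sketch that enumerates one roll of a fair six-sided die. The events A (an even number) and B (a number greater than 4) are illustrative choices and do not come from the slides.

```python
from fractions import Fraction

# Sample space for one roll of a fair six-sided die (assumed example).
outcomes = range(1, 7)

def prob(event):
    """Probability of an event under the equally likely approach."""
    favourable = [x for x in outcomes if event(x)]
    return Fraction(len(favourable), len(outcomes))

def is_even(x):       # event A (illustrative)
    return x % 2 == 0

def is_above_4(x):    # event B (illustrative)
    return x > 4

p_a = prob(is_even)
p_b = prob(is_above_4)
p_a_and_b = prob(lambda x: is_even(x) and is_above_4(x))
p_a_or_b = prob(lambda x: is_even(x) or is_above_4(x))

# Addition rule: P[A or B] = P[A] + P[B] - P[A and B]
print(p_a_or_b, p_a + p_b - p_a_and_b)
```

Both printed values agree because the outcome 6 belongs to both events and must be counted only once.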

23 of 73

Some basic rules

Subtraction Rule

An event A either occurs or does not occur, so:

P[A] + P[Not A] = 1

Therefore:

P[Not A] = 1 - P[A]

24 of 73

Some basic rules

Multiplication Rule:

P[A and B] = P[A] ⋅ P[B|A]

E.g. probability of two aces when drawing two cards from a deck = 4/52 ⋅ 3/51

Or, when the events are independent:

P[A and B] = P[A] ⋅ P[B]
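
The two-aces example can be worked through with exact fractions. This is a small sketch; the with-replacement case is an assumed variation added only to contrast dependent and independent draws.

```python
from fractions import Fraction

# Dependent events: drawing two aces without replacement.
p_first_ace = Fraction(4, 52)                # P[A]
p_second_ace_given_first = Fraction(3, 51)   # P[B|A]
p_two_aces = p_first_ace * p_second_ace_given_first

# Independent events: the first card is put back before the second draw
# (an assumed variation, not from the slides).
p_two_aces_with_replacement = Fraction(4, 52) * Fraction(4, 52)

print(p_two_aces, float(p_two_aces))
print(p_two_aces_with_replacement)
```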

25 of 73

Some admin

A couple of things:

We have no lab next week, so lab 1 will be due the week following.

I have to cancel office hours today because I have to go to the dentist and the only appointment I could get conflicts with office hours.

26 of 73

27 of 73

Bayes’ Rule and Posterior Probabilities

Need to be able to calculate how likely something is given prior information

  • A conditional probability
  • One outcome conditional on other information
  • We have prior information or “priors”
  • We need a hypothesis about the eventual outcome
  • We need a posterior probability from our prior probabilities

28 of 73

Bayes’ Rule and Posterior Probabilities

What does all of this mean?

29 of 73

Terminology...

  • The bar above the A means “Not A”.
  • We compare the probability of A and not A.
  • Probability of having cancer (P(A) = 0.01)
  • Probability of not having cancer (P(not A) = 0.99) = 1 - P(A) = 1 - 0.01.
  • Then you have the probability of having a positive mammogram given that a patient has cancer (P(B|A) = 0.8)
  • Then you have the probability of having a positive mammogram given that a patient doesn’t have cancer (P(B|not A) = 0.096)
  • We put all of these together using Bayes’ Rule.
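
One way to put these four numbers together is sketched below. It assumes the usual form of Bayes' Rule, P(A|B) = P(A)P(B|A) / [P(A)P(B|A) + P(not A)P(B|not A)]; the function name `posterior` is a hypothetical helper introduced for illustration.

```python
def posterior(p_a, p_b_given_a, p_b_given_not_a):
    """Bayes' Rule: P(A|B) from the prior and the two conditional probabilities."""
    p_not_a = 1 - p_a
    numerator = p_a * p_b_given_a
    denominator = numerator + p_not_a * p_b_given_not_a
    return numerator / denominator

# Values from the terminology list: A = has cancer, B = positive mammogram.
p_cancer_given_positive = posterior(p_a=0.01, p_b_given_a=0.8, p_b_given_not_a=0.096)
print(round(p_cancer_given_positive, 4))
```

The result is a little under 8%, far lower than the 80% chance of a positive test given cancer.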

30 of 73

31 of 73

Applying Bayes’ Rule

32 of 73

Create a Contingency Table

                    Test Positive   Test Negative   Total
Have Cancer
Don’t Have Cancer
Total                                                1000

33 of 73

Create a Contingency Table

                    Test Positive   Test Negative   Total
Have Cancer                     8               2      10
Don’t Have Cancer              95             895     990
Total                         103             897    1000
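
The counts in the table can be reproduced from the probabilities. The sketch below assumes a population of 1000 and rounds each cell to a whole person; `contingency_counts` is a hypothetical helper name.

```python
def contingency_counts(population, p_a, p_b_given_a, p_b_given_not_a):
    """Turn probabilities into whole-number counts (natural frequencies)."""
    have = round(population * p_a)
    have_not = population - have
    true_positive = round(have * p_b_given_a)
    false_positive = round(have_not * p_b_given_not_a)
    return {
        "true_positive": true_positive,
        "false_negative": have - true_positive,
        "false_positive": false_positive,
        "true_negative": have_not - false_positive,
    }

table = contingency_counts(1000, 0.01, 0.8, 0.096)
positives = table["true_positive"] + table["false_positive"]
share_with_cancer = table["true_positive"] / positives
print(table, round(share_with_cancer, 4))
```

Reading down the Test Positive column, 8 of the 103 positive tests belong to patients with cancer. Rounding the cells to whole people makes this share differ very slightly from the exact probability.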

34 of 73

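                    Test Positive   Test Negative   Total
Have Cancer                     8               2      10
Don’t Have Cancer              95             895     990
Total                         103             897    1000

False negative: the 2 patients who have cancer but test negative.

False positive: the 95 patients who do not have cancer but test positive.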

35 of 73

Natural Frequencies and Bayes Rule

When we construct a contingency table, we are drawing on our understanding of natural frequencies, which people find easier to understand than probabilities.

36 of 73

37 of 73

Try this question

Suppose that there are two kinds of households, the Careless and the Careful; 99 percent of households are Careful and 1 percent are Careless. In any given year, a home inhabited by a Careless household has a 0.010 probability of being destroyed by fire, and a home occupied by a Careful household has a 0.001 probability of being destroyed by fire. If a home is destroyed by fire, what is the probability that it was occupied by a Careless household?

38 of 73

Careless Fires

Let’s take this step-by-step:

  • What is P(A)*P(B|A)?
  • What is P(not A)*P(B|not A)?

Put these together to use Bayes’ Rule!
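
The two steps above can be sketched in Python as follows, taking A to be "the household is Careless" and B to be "the home is destroyed by fire". The variable names are illustrative.

```python
# Priors and conditional probabilities from the question.
p_careless = 0.01              # P(A)
p_careful = 0.99               # P(not A)
p_fire_given_careless = 0.010  # P(B|A)
p_fire_given_careful = 0.001   # P(B|not A)

joint_careless = p_careless * p_fire_given_careless   # P(A)*P(B|A)
joint_careful = p_careful * p_fire_given_careful      # P(not A)*P(B|not A)

# Bayes' Rule: share of all fires that come from Careless households.
p_careless_given_fire = joint_careless / (joint_careless + joint_careful)
print(round(p_careless_given_fire, 4))
```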

39 of 73

Careless Fires

The probability that a burned home was occupied by a Careless household is about 9%. That is higher than the 1% prior, but still far from the large percentage some people might expect.

40 of 73

Create a Contingency Table

            Fire   No Fire     Total
Careless
Careful
Total                        100,000

41 of 73

Create a Contingency Table

            Fire   No Fire     Total
Careless      10       990     1,000
Careful       99    98,901    99,000
Total        109    99,891   100,000

42 of 73

            Fire   No Fire     Total
Careless      10       990     1,000
Careful       99    98,901    99,000
Total        109    99,891   100,000

False negative

False positive

43 of 73

Natural Frequencies and Bayes Rule

Put the natural frequencies together to find the probability: 10 out of 109, or about 9.2%.

44 of 73

Midterm 1

  • In-class midterm
  • Thursday 1 March (moved because we lost a day of labs).
  • Covering up to end of Ch. 4 (probability)
  • Problems similar to what we have done in class and in problem sets
  • I will post solutions to PS1 and PS2 this Friday to help you to prepare for the midterm.
  • Though I can’t ask you to work specifically in Google Sheets, I can ask you what would make sense with a given function, ask you to translate a function into English, and more. The same goes for Stata output.

45 of 73

Random Variables

  • A random variable X is a variable whose numerical value is determined by chance, the outcome of a random phenomenon.
    • e.g. X might be the number of heads when three coins are flipped or the number that comes up when a six-sided die is rolled.
  • A discrete random variable has a countable number of possible outcomes:
    • e.g. a six-sided die can show 1, 2, 3, 4, 5, or 6.
  • A continuous random variable has a continuum of possible values:
    • e.g. distance, time.

46 of 73

This is theoretical and reflects probabilities from the random variable.

47 of 73

This is empirical and reflects observations from the actual coin flips.

48 of 73

A simulation of 1000 repetitions of 500 coin flips, with the number of heads out of 500 on the x-axis. Outcomes with more than 250 heads are highlighted in pink; those with fewer than 250 are in blue.
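
A simulation along these lines could be sketched in Python as below. The seed and the function name `simulate_heads` are arbitrary choices for illustration, so the exact counts will differ from the figure on the slide.

```python
import random

def simulate_heads(repetitions=1000, flips=500, seed=0):
    """Return the number of heads in each repetition of `flips` fair coin flips."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        sum(rng.random() < 0.5 for _ in range(flips))
        for _ in range(repetitions)
    ]

counts = simulate_heads()
above = sum(c > 250 for c in counts)     # the "pink" outcomes
below = sum(c < 250 for c in counts)     # the "blue" outcomes
exactly = sum(c == 250 for c in counts)  # exactly half heads
print(above, below, exactly, sum(counts) / len(counts))
```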

49 of 73

Expected Value of a random variable

Mu (μ) is the expected value E[X] of a random variable X:

μ = E[X] = Σ Xi ⋅ P[Xi]

That is, multiply each possible value Xi by its probability and sum the results.

50 of 73

Expected Value: Example

  • Consider flipping a coin 3 times.
  • Take each possible value (0, 1, 2, or 3 heads)
  • And multiply each by its probability
    • 0.125, 0.375, 0.375, 0.125
  • μ = 0 x 0.125 + 1 x 0.375 + 2 x 0.375 + 3 x 0.125 = 1.5
  • The expected value is not necessarily the most likely value of X.
  • You will never flip 1.5 heads!
  • The expected value is the anticipated long-run average value of X if this experiment is repeated a large number of times.
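
The calculation can be checked with a short Python sketch. It assumes three fair coin flips, which matches the values 0 to 3 heads and the result of 1.5, and uses `math.comb` (Python 3.8 or later) to count the ways of getting each number of heads.

```python
from math import comb

flips = 3
values = range(flips + 1)  # 0, 1, 2, or 3 heads

# P[k heads] = (number of ways to choose k flips) / (number of sequences)
probabilities = [comb(flips, k) / 2**flips for k in values]

# Expected value: each value weighted by its probability.
mu = sum(x * p for x, p in zip(values, probabilities))
print(probabilities, mu)
```

The most likely outcomes are 1 and 2 heads, while the expected value of 1.5 can never be observed on a single run.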

51 of 73

52 of 73

Variance of a random variable

  • The variance σ² of a probability distribution is analogous to the variance s² of observed data.
  • Use probabilities in place of observed frequencies.

  • The variance of a discrete random variable X is a weighted average of the squared deviations of the possible values of X from the mean, using the probability of each X value as weights:

σ² = Σ (Xi − μ)² ⋅ P[Xi]
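
As a sketch of this weighted average, the code below computes the variance for the number of heads in three fair coin flips. It assumes the formula σ² = Σ (Xi − μ)² ⋅ P[Xi], which is the standard reading of the description above.

```python
from math import sqrt

# Number of heads in three fair coin flips and the matching probabilities.
values = [0, 1, 2, 3]
probabilities = [0.125, 0.375, 0.375, 0.125]

mu = sum(x * p for x, p in zip(values, probabilities))

# Squared deviations from the mean, weighted by probability.
variance = sum((x - mu) ** 2 * p for x, p in zip(values, probabilities))
standard_deviation = sqrt(variance)
print(mu, variance, round(standard_deviation, 4))
```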

53 of 73

Standardized Variable

The standardized value Z of a random variable X is determined by subtracting the mean of the probability distribution and dividing by the standard deviation:

Z = (X − μ) / σ

Observed data can be similarly transformed by subtracting the mean of the data and then dividing by the standard deviation:

z = (x − x̄) / s

54 of 73

Standardized Variables

  • A standardized variable Z measures how many standard deviations X is above or below its mean.
  • If X is equal to its mean, Z is equal to 0.
  • If X is one standard deviation above its mean, Z is equal to 1.
  • If X is two standard deviations below its mean, Z is equal to –2.
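
The three cases above can be reproduced with a one-line function. The mean of 100 and standard deviation of 15 are made-up numbers chosen only to keep the arithmetic easy, and `standardize` is a hypothetical helper name.

```python
def standardize(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Illustrative distribution: mean 100, standard deviation 15 (assumed values).
print(standardize(100, 100, 15))  # X equal to its mean
print(standardize(115, 100, 15))  # X one standard deviation above
print(standardize(70, 100, 15))   # X two standard deviations below
```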

55 of 73

56 of 73

57 of 73

Central Limit Theorem

If Z is the sum of n independent, identically distributed random variables with a finite, non-zero standard deviation, then the probability distribution of Z approaches the normal distribution as n increases.

58 of 73

10 Coin Flips

59 of 73

50 Coin Flips

60 of 73

1000 Coin Flips

61 of 73

Central Limit Theorem

Why is the CLT useful?

  • Does a normal distribution only emerge with large n?
  • No! Often appears even when n is quite small
  • Often 10, 20 or 30 observations are enough for the distribution to look approximately normal.
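
To see how quickly the coin-flip distribution approaches the normal curve, the sketch below compares the exact distribution of the number of heads with the normal distribution for 10, 50, and 1000 flips. Measuring the largest gap between the two cumulative distributions is one possible yardstick chosen for illustration; it is not taken from the slides.

```python
from math import comb, erf, sqrt

def normal_cdf(z):
    """P[Z <= z] for a standard normal variable."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def max_gap_to_normal(n):
    """Largest gap between the exact CDF of heads in n fair flips and the normal CDF."""
    mean = n / 2
    sd = sqrt(n) / 2  # standard deviation of the number of heads
    cumulative = 0.0
    gap = 0.0
    for k in range(n + 1):
        cumulative += comb(n, k) / 2**n  # exact P[heads <= k]
        z = (k - mean) / sd              # standardized value of k
        gap = max(gap, abs(cumulative - normal_cdf(z)))
    return gap

gaps = {n: max_gap_to_normal(n) for n in (10, 50, 1000)}
for n, gap in gaps.items():
    print(n, round(gap, 4))
```

The gap shrinks as n grows, in line with the theorem, though it is already modest at small n.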

62 of 73

Central Limit Theorem

What about the assumptions (independent & identically distributed)?

  • Many variables end up normally distributed without strict independence and identical distributions
  • Think strength, height, weight -- all have genetic & social components
  • What about log of wages/income?

63 of 73

The probability that Z falls in a specified interval is given by the corresponding area under the normal curve.

The total area is 1.

The area on each side of the mean (Z = 0) is 0.5.

To find P[0 < Z < 1.25], we compare the areas:

  • Start with 0.5, the area to the right of 0
  • Subtract 0.1056, the area to the right of 1.25
  • This leaves 0.3944 ≈ 0.39 as the probability that 0 < Z < 1.25
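
The table lookup can be cross-checked in Python. This sketch builds the standard normal cumulative distribution from `math.erf`; the helper names `normal_cdf` and `prob_between` are illustrative.

```python
from math import erf, sqrt

def normal_cdf(z):
    """P[Z <= z] for a standard normal variable."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def prob_between(lower, upper):
    """Area under the standard normal curve between lower and upper."""
    return normal_cdf(upper) - normal_cdf(lower)

upper_tail = 1 - normal_cdf(1.25)  # area to the right of 1.25
area = prob_between(0, 1.25)       # same as 0.5 minus the upper tail
print(round(upper_tail, 4), round(area, 4))
```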

64 of 73

Use the Z tables provided to find the following probabilities:

P[-0.75 < Z < -0.5]

and P[1 < Z < 2]

65 of 73

66 of 73

67 of 73

68 of 73

Describing a distribution

We have notation for talking about means, standard deviations and a kind of distribution:

Z ~ N(8, 12)

Read as: Z is distributed (~) normally (N) with mean 8 and standard deviation 12.

69 of 73

Method for answering questions

For mathematical/formula-based questions in a problem set:

  1. State/include the formula
  2. Explain the formula
  3. Illustrate the formula with the problem and substitute in the relevant values

For graph-based questions:

  1. Make sure you have enough space for the figure
  2. Label all axes and relevant lines
  3. Label the points to which you will refer, and preferably give the figure a title (e.g. Figure A or Figure 1)
  4. Explain the figure

70 of 73

Problem 4.15

In an interview for a banking job, a student was asked to calculate the following:

A can contains 20 coins, 19 of which are ordinary coins and 1 of which has heads on both sides. A coin is randomly selected from the can and flipped five times. It lands heads all five times. What is the probability that it is the two-headed coin?

My guidance:

  • What is P(A)? What is P(not A)?
  • What is P(B|A)? What is P(B| not A)?
  • How do we now use Bayes’ Rule to calculate P(A|B)?
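
One way to work through the guidance is sketched below, taking A to be "the two-headed coin was selected" and B to be "five heads in five flips". Exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction

p_a = Fraction(1, 20)                 # P(A): the two-headed coin is drawn
p_not_a = 1 - p_a                     # P(not A): an ordinary coin is drawn
p_b_given_a = Fraction(1)             # P(B|A): two-headed coin always lands heads
p_b_given_not_a = Fraction(1, 2) ** 5 # P(B|not A): five heads with a fair coin

# Bayes' Rule: P(A|B)
p_a_given_b = (p_a * p_b_given_a) / (
    p_a * p_b_given_a + p_not_a * p_b_given_not_a
)
print(p_a_given_b, float(p_a_given_b))
```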

71 of 73

Problem 3.12

Roll four six-sided dice ten times, each time recording the sum of the four numbers rolled. Calculate the mean and median of your ten rolls. Repeat this experiment 20 times. Which of these two measures seems to be the least stable?

How would you go about doing this?

Think about how we used Google Sheets and what you could do there rather than rolling actual dice.
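
The same idea can also be sketched in Python rather than a spreadsheet. The seed is arbitrary, and using the spread (standard deviation) of the 20 means and 20 medians as the measure of stability is an assumption made for illustration.

```python
import random
from statistics import mean, median, pstdev

def one_experiment(rng, rolls=10, dice=4):
    """Roll `dice` six-sided dice `rolls` times; return the mean and median of the sums."""
    sums = [sum(rng.randint(1, 6) for _ in range(dice)) for _ in range(rolls)]
    return mean(sums), median(sums)

rng = random.Random(220)  # arbitrary seed so the run is repeatable
results = [one_experiment(rng) for _ in range(20)]

means = [m for m, _ in results]
medians = [md for _, md in results]

# A larger spread across the 20 experiments suggests a less stable measure.
print(round(pstdev(means), 3), round(pstdev(medians), 3))
```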

73 of 73