4 of 73

First dice-rolling activity

Roll one (1) 6-sided die 10 times.
Report your rolls in the spreadsheet below
https://docs.google.com/spreadsheets/d/1ev61-U_rjIugyiqY9C_VAYik46HMldN5GIaGPXC0PWg/edit?usp=sharing

5 of 73

Second dice-rolling activity

Roll two (2) 6-sided dice (value 1, value 2) 10 times in a row
Report your rolls in the spreadsheet below
https://docs.google.com/spreadsheets/d/1xAu4wi3yg15NtdSUb398a4an9Y0JvR-SDyMl6VM-FNc/edit?usp=sharing

6 of 73

Third dice-rolling activity

We don’t actually have to roll dice.
We can simulate the rolls
https://docs.google.com/spreadsheets/d/1FXNQvveHgeaAgZsw4tcc_zLR7Am5MvxEAe2u9fzMXNo/edit?usp=sharing
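
The linked sheet is not reproduced in these notes. As a rough equivalent, here is a minimal Python sketch of the same idea, assuming a fair die. The function name `roll_die` and the seed value are illustrative choices, not part of the class materials.

```python
import random

def roll_die(n_rolls, sides=6, seed=None):
    """Simulate n_rolls of a fair die with the given number of sides."""
    rng = random.Random(seed)  # passing a seed makes the run repeatable
    return [rng.randint(1, sides) for _ in range(n_rolls)]

# Ten simulated rolls of one six-sided die (first activity).
rolls = roll_die(10, seed=1)
print(rolls)
```

Calling `roll_die(10)` twice with different seeds stands in for the two dice in the second activity.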

7 of 73

Important ideas

One event vs. a sequence of events
The “equally likely” approach
One event vs. long-run frequencies

10 of 73

How do we get different combinations?

19 of 73

Back to the second dice-rolling activity

Roll two (2) 6-sided dice (value 1, value 2) 10 times in a row
Report your rolls in the spreadsheet below
https://docs.google.com/spreadsheets/d/1xAu4wi3yg15NtdSUb398a4an9Y0JvR-SDyMl6VM-FNc/edit?usp=sharing

20 of 73

[Tree diagram: Roll 1 is Odd or Even; from each branch, Roll 2 is Odd or Even. The four resulting combinations are (Odd, Odd), (Odd, Even), (Even, Odd) and (Even, Even).]
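
The odd/even combinations for two rolls can also be counted by listing all 36 equally likely pairs. This is a sketch that assumes two fair six-sided dice; the helper name `parity` is made up for illustration.

```python
from fractions import Fraction
from itertools import product

def parity(value):
    """Label a die value as Odd or Even."""
    return "Odd" if value % 2 else "Even"

# Count how many of the 36 equally likely (value 1, value 2) pairs
# fall into each parity combination.
counts = {}
for v1, v2 in product(range(1, 7), repeat=2):
    key = (parity(v1), parity(v2))
    counts[key] = counts.get(key, 0) + 1

probabilities = {combo: Fraction(n, 36) for combo, n in counts.items()}
for combo, p in probabilities.items():
    print(combo, p)
```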

22 of 73

Some basic rules

Addition rule:

P[A or B] = P[A] + P[B] - P[A and B]

Sometimes events are mutually exclusive, in which case this simplifies to:

P[A or B] = P[A] + P[B]

(when events are mutually exclusive they cannot both occur, so P[A and B] = 0).
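
To check the addition rule on a case where the events are not mutually exclusive, the sketch below enumerates two fair dice. The events chosen (A: first die shows 6, B: second die shows 6) are illustrative assumptions, not taken from the slides.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # 36 equally likely pairs

def prob(event):
    """Probability of an event given as a true/false test on (die1, die2)."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def a(o):
    return o[0] == 6  # A: first die shows 6

def b(o):
    return o[1] == 6  # B: second die shows 6

p_or = prob(lambda o: a(o) or b(o))
p_and = prob(lambda o: a(o) and b(o))

# Both sides of P[A or B] = P[A] + P[B] - P[A and B]
print(p_or, prob(a) + prob(b) - p_and)
```

Without subtracting P[A and B], the pair (6, 6) would be counted twice.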

23 of 73

Some basic rules

Subtraction Rule

Either A happens or it does not, so the two probabilities must sum to 1:

P[A] + P[Not A] = 1

Therefore:

P[Not A] = 1 - P[A]

24 of 73

Some basic rules

Multiplication Rule:

P[A and B] = P[A] ⋅ P[B|A]

E.g. probability of two aces when drawing two cards from a deck = 4/52 ⋅ 3/51

Or, when the events are independent:

P[A and B] = P[A] ⋅ P[B]
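
The two-aces example can be worked exactly with fractions. The with-replacement case is an added illustration of independence (the first card is put back before the second draw), not something from the slide.

```python
from fractions import Fraction

# Without replacement: the second draw depends on the first.
p_first_ace = Fraction(4, 52)
p_second_ace_given_first = Fraction(3, 51)
p_two_aces = p_first_ace * p_second_ace_given_first

# With replacement (assumed variant): the draws are independent,
# so P[A and B] = P[A] * P[B].
p_two_aces_with_replacement = Fraction(4, 52) * Fraction(4, 52)

print(p_two_aces, p_two_aces_with_replacement)
```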

25 of 73

Some admin

A couple of things:

We have no lab next week, so lab 1 will be due the week following.

I have to cancel office hours today because I have to go to the dentist and the only appointment I could get conflicts with office hours.

27 of 73

Bayes’ Rule and Posterior Probabilities

Need to be able to calculate how likely something is given prior information

A conditional probability
One outcome conditional on other information
We have prior information or “priors”
We need a hypothesis about the eventual outcome
We need a posterior probability from our prior probabilities

28 of 73

Bayes’ Rule and Posterior Probabilities

What does all of this mean?

29 of 73

Terminology...

The bar above the A means “Not A”.
We compare the probability of A and not A.
Probability of having cancer: P(A) = 0.01
Probability of not having cancer: P(not A) = 1 - P(A) = 1 - 0.01 = 0.99
Then you have the probability of having a positive mammogram given that a patient has cancer: P(B|A) = 0.8
Then you have the probability of having a positive mammogram given that a patient doesn’t have cancer: P(B|not A) = 0.096
We put all of these together using Bayes’ Rule.
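
The formula itself is not reproduced in this text. The sketch below assumes the standard two-case form of Bayes' Rule, P(A|B) = P(A)P(B|A) / [P(A)P(B|A) + P(not A)P(B|not A)], and the function name `posterior` is an illustrative choice.

```python
def posterior(p_a, p_b_given_a, p_b_given_not_a):
    """P(A|B) from the prior P(A) and the two conditional probabilities."""
    p_not_a = 1 - p_a
    numerator = p_a * p_b_given_a
    denominator = numerator + p_not_a * p_b_given_not_a
    return numerator / denominator

# Mammogram numbers from the slide: P(A) = 0.01, P(B|A) = 0.8, P(B|not A) = 0.096
p_cancer_given_positive = posterior(0.01, 0.8, 0.096)
print(round(p_cancer_given_positive, 4))
```

With these inputs the posterior comes out a little under 8%, far below the 80% that P(B|A) alone might suggest.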

31 of 73

Applying Bayes’ Rule

32 of 73

Create a Contingency Table

	Test Positive	Test Negative	Total
Have Cancer
Don’t Have Cancer
Total			1000

33 of 73

Create a Contingency Table

	Test Positive	Test Negative	Total
Have Cancer	8	2	10
Don’t Have Cancer	95	895	990
Total	103	897	1000
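
The counts in the table follow from the probabilities on the terminology slide applied to 1000 people. A minimal sketch, assuming counts are rounded to whole people; the function name and dictionary keys are made up for illustration.

```python
def contingency_table(population, p_a, p_b_given_a, p_b_given_not_a):
    """Return (positive, negative, total) counts for each row of the table."""
    have = round(population * p_a)
    dont_have = population - have
    have_pos = round(have * p_b_given_a)
    dont_pos = round(dont_have * p_b_given_not_a)
    total_pos = have_pos + dont_pos
    return {
        "have": (have_pos, have - have_pos, have),
        "dont_have": (dont_pos, dont_have - dont_pos, dont_have),
        "total": (total_pos, population - total_pos, population),
    }

table = contingency_table(1000, 0.01, 0.8, 0.096)
for row, cells in table.items():
    print(row, cells)

# Share of positive tests that belong to people who have cancer.
print(round(table["have"][0] / table["total"][0], 3))
```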

34 of 73

	Test Positive	Test Negative	Total
Have Cancer	8	2	10
Don’t Have Cancer	95	895	990
Total	103	897	1000

False negative

False positive

35 of 73

Natural Frequencies and Bayes Rule

When we construct a contingency table, we are playing on our understanding of natural frequencies, which people are better at understanding than probabilities:

37 of 73

Try this question

Suppose that there are two kinds of households, the Careless and the Careful; 99 percent of households are Careful and 1 percent are Careless. In any given year, a home inhabited by a Careless household has a 0.010 probability of being destroyed by fire, and a home occupied by a Careful household has a 0.001 probability of being destroyed by fire. If a home is destroyed by fire, what is the probability that it was occupied by a Careless household?

38 of 73

Careless Fires

Let’s take this step-by-step:

What is P(A)*P(B|A)?
What is P(not A)*P(B|not A)?

Put these together to use Bayes’ Rule!

39 of 73

Careless Fires

A roughly 9% chance that the household was Careless is much greater than the 1% base rate, but it is still far smaller than many people would guess.
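
The two steps from the previous slide can be carried out exactly with fractions. This is a sketch using the numbers in the question; the variable names are illustrative.

```python
from fractions import Fraction

p_careless = Fraction(1, 100)              # P(A)
p_fire_given_careless = Fraction(10, 1000)  # P(B|A) = 0.010
p_fire_given_careful = Fraction(1, 1000)    # P(B|not A) = 0.001

careless_and_fire = p_careless * p_fire_given_careless       # P(A)*P(B|A)
careful_and_fire = (1 - p_careless) * p_fire_given_careful   # P(not A)*P(B|not A)

# Bayes' Rule: share of all fires that come from Careless households.
p_careless_given_fire = careless_and_fire / (careless_and_fire + careful_and_fire)
print(p_careless_given_fire, round(float(p_careless_given_fire), 4))
```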

40 of 73

Create a Contingency Table

	Fire	No Fire	Total
Careless
Careful
Total			100,000

41 of 73

Create a Contingency Table

	Fire	No Fire	Total
Careless	10	990	1000
Careful	99	98,901	99,000
Total	109	99,891	100,000

42 of 73

	Fire	No Fire	Total
Careless	10	990	1000
Careful	99	98,901	99,000
Total	109	99,891	100,000

False negative

False positive

43 of 73

Natural Frequencies and Bayes Rule

Put together the natural frequencies to find the probability: 10 of the 109 fires, or about 9.2%.

44 of 73

Midterm 1

In-class midterm
Thursday 1 March (moved because we lost a day of labs).
Covering up to end of Ch. 4 (probability)
Problems similar to what we have done in class and in problem sets
I will post solutions to PS1 and PS2 this Friday to help you to prepare for the midterm.
Though I can’t ask you to work specifically in Google sheets, I can ask you what would make sense with a given function, ask you to translate a function into English and more. Similarly with Stata output.

45 of 73

Random Variables

A random variable X is a variable whose numerical value is determined by chance, the outcome of a random phenomenon.

e.g. X might be the number of heads when three coins are flipped or the number that comes up when a six-sided die is rolled.

A discrete random variable has a countable number of possible outcomes:

e.g. a six-sided die can be 1, 2, 3, 4, 5, or 6.

A continuous random variable has a continuum of possible values:

e.g. distance, time.

46 of 73

This is theoretical and reflects probabilities from the random variable

47 of 73

This is empirical and reflects observations from the actual coin flips

48 of 73

A simulation of 1000 repetitions of 500 coin flips, with the number of heads out of those 500 listed on the x-axis. Counts above 250 heads are highlighted in pink, counts below 250 in blue.
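
A simulation like the one in the figure can be sketched in a few lines of Python. This assumes a fair coin; the function name `heads_counts` and the seed are illustrative, and the exact counts depend on the seed.

```python
import random

def heads_counts(repetitions=1000, flips=500, seed=0):
    """Number of heads in each repetition of `flips` fair coin flips."""
    rng = random.Random(seed)
    return [
        sum(rng.random() < 0.5 for _ in range(flips))
        for _ in range(repetitions)
    ]

counts = heads_counts()
above = sum(c > 250 for c in counts)     # the pink region in the figure
below = sum(c < 250 for c in counts)     # the blue region in the figure
exactly = sum(c == 250 for c in counts)
print(above, below, exactly)
```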

49 of 73

Expected Value of a random variable

Mu (μ) is the expected value (E[X]) of a random variable X:

μ = E[X] = Σ x_i ⋅ P[x_i]

That is, the sum of each possible value x_i multiplied by that value’s probability.

50 of 73

Expected Value: Example

Consider flipping a coin 3 times.
Take each possible value (0, 1, 2, or 3 heads)
And multiply each by its probability

0.125, 0.375, 0.375, 0.125

μ = 0 x 0.125 + 1 x 0.375 + 2 x 0.375 + 3 x 0.125 = 1.5
The expected value is not necessarily the most likely value of X.
You will never flip 1.5 heads!
The expected value is the anticipated long-run average value of X if this experiment is repeated a large number of times.
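
The probabilities used in this example (0.125, 0.375, 0.375, 0.125 for 0 to 3 heads) are those of three fair coin flips. A minimal sketch that rebuilds them and the expected value, with illustrative function names:

```python
from math import comb

def heads_distribution(n_flips, p_heads=0.5):
    """Probability of each possible number of heads in n_flips flips."""
    return {
        k: comb(n_flips, k) * p_heads**k * (1 - p_heads) ** (n_flips - k)
        for k in range(n_flips + 1)
    }

def expected_value(dist):
    """Sum of each value times its probability."""
    return sum(x * p for x, p in dist.items())

dist = heads_distribution(3)
mu = expected_value(dist)
print(dist, mu)
```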

52 of 73

Variance of a random variable

The variance σ² of a probability distribution is analogous to the variance s² of observed data.
Use probabilities in place of observed frequencies.

The variance of a discrete random variable X is a weighted average of the squared deviations of the possible values of X from the mean, using the probability of each X value as weights:

σ² = Σ (x_i − μ)² ⋅ P[x_i]
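
As a worked case, the sketch below applies this weighted average to the number of heads in three fair coin flips. The choice of example and the variable names are assumptions made for illustration.

```python
from math import sqrt

# Number of heads in three fair coin flips and the probability of each value.
dist = {0: 0.125, 1: 0.375, 2: 0.375, 3: 0.125}

mu = sum(x * p for x, p in dist.items())

# Squared deviations from the mean, weighted by probability.
variance = sum((x - mu) ** 2 * p for x, p in dist.items())
sigma = sqrt(variance)  # standard deviation

print(mu, variance, round(sigma, 3))
```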

53 of 73

Standardized Variable

The standardized value Z of a random variable X is determined by subtracting the mean of the probability distribution and dividing by the standard deviation:

Z = (X − μ) / σ

Observed data can be similarly transformed by subtracting the mean of the data and then dividing by the standard deviation:

z = (x − x̄) / s

54 of 73

Standardized Variables

A standardized variable Z measures how many standard deviations X is above or below its mean.
If X is equal to its mean, Z is equal to 0.
If X is one standard deviation above its mean, Z is equal to 1.
If X is two standard deviations below its mean, Z is equal to –2.
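
The three cases above can be checked with a one-line function. The mean of 100, the standard deviation of 15 and the small data list are hypothetical numbers chosen only for illustration.

```python
from statistics import mean, stdev

def standardize(x, mu, sd):
    """How many standard deviations x lies above (+) or below (-) the mean."""
    return (x - mu) / sd

# Hypothetical distribution with mean 100 and standard deviation 15.
print(standardize(100, 100, 15))  # at the mean
print(standardize(115, 100, 15))  # one standard deviation above
print(standardize(70, 100, 15))   # two standard deviations below

# Observed data are standardized the same way, using the sample mean and sd.
data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations
z_scores = [standardize(x, mean(data), stdev(data)) for x in data]
print([round(z, 2) for z in z_scores])
```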

57 of 73

Central Limit Theorem

If Z is the sum of n independent, identically distributed random variables with a finite, non-zero standard deviation, then the probability distribution of Z approaches the normal distribution as n increases.

58 of 73

10 Coin Flips

59 of 73

50 Coin Flips

60 of 73

1000 Coin Flips
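
The figures for these slides are not reproduced here. As a stand-in, this sketch prints a text histogram of the number of heads for a given number of flips, so the shape can be compared as the number of flips grows. It assumes a fair coin; the function names, the 2000 repetitions and the seed are illustrative.

```python
import random
from collections import Counter

def heads_histogram(flips, repetitions=2000, seed=0):
    """Count how often each number of heads occurs across repetitions."""
    rng = random.Random(seed)
    totals = (
        sum(rng.random() < 0.5 for _ in range(flips))
        for _ in range(repetitions)
    )
    return Counter(totals)

def print_histogram(hist, width=50):
    """Print one bar per possible count, scaled so the tallest bar has `width` marks."""
    peak = max(hist.values())
    for heads in range(min(hist), max(hist) + 1):
        bar = "#" * round(width * hist[heads] / peak)
        print(f"{heads:4d} {bar}")

for flips in (10, 50):
    print(f"{flips} coin flips")
    print_histogram(heads_histogram(flips))
```

Passing 1000 to `heads_histogram` gives the third case; it takes longer to run and prints many more rows.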

61 of 73

Central Limit Theorem

Why is the CLT useful?

Does a normal distribution only emerge with large n?
No! Often appears even when n is quite small
Often 10, 20 or 30 observations give us enough variation.

62 of 73

Central Limit Theorem

What about the assumptions (independent & identically distributed)?

Many variables end up normally distributed without strict independence and identical distributions
Think strength, height, weight -- all have genetic & social components
What about log of wages/income?

63 of 73

The probability that Z will fall in a specified interval is given by the corresponding area under the normal curve.

The total area is 1.

The area on each side of the mean (Z = 0) is 0.5.

To find P[0 < Z < 1.25], we compare the areas:

Start with the 0.5 area to the right of 0
Subtract the 0.1056 area to the right of 1.25
This leaves 0.5 - 0.1056 = 0.3944, so there is about a 39% chance that 0 < Z < 1.25
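
The same areas can be checked without a printed table using `statistics.NormalDist` from the Python standard library. This is offered as a cross-check on the table lookup, not as the method used in class.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

upper_tail = 1 - Z.cdf(1.25)   # area to the right of 1.25
p_interval = 0.5 - upper_tail  # area between 0 and 1.25

print(round(upper_tail, 4), round(p_interval, 4))

# Equivalent: difference of the cumulative areas at the two endpoints.
print(round(Z.cdf(1.25) - Z.cdf(0), 4))
```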

64 of 73

Use the Z tables

Use the tables provided to find these probabilities:

P[-0.75 < Z < -0.5]

And P[1 < Z < 2]

68 of 73

Describing a distribution

We have notation for talking about means, standard deviations and a kind of distribution:

Z ~ N(8, 12)

Read as: Z is distributed (~) normally (N) with mean 8 and standard deviation 12.

69 of 73

Method for answering questions

For mathematical/formula-based questions in a problem set:

State/include the formula
Explain the formula
Illustrate the formula with the problem and substitute in the relevant values

For graph-based questions:

Make sure you have enough space for the figure
Label all axes and relevant lines
Label the points to which you will refer, and preferably give the figure a title (e.g. Figure A or Figure 1)
Explain the figure

70 of 73

Problem 4.15

In an interview for a banking job, a student was asked to calculate the following:

A can contains 20 coins, 19 of which are ordinary coins and 1 of which has heads on both sides. A coin is randomly selected from the can and flipped five times. It lands heads all five times. What is the probability that it is the two-headed coin?

My guidance:

What is P(A)? What is P(not A)?
What is P(B|A)? What is P(B| not A)?
How do we now use Bayes’ Rule to calculate P(A|B)?

71 of 73

Problem 3.12

Roll four six-sided dice ten times, each time recording the sum of the four numbers rolled. Calculate the mean and median of your ten rolls. Repeat this experiment 20 times. Which of these two measures seems to be the least stable?

How would you go about doing this?

Think about how we used Google Sheets and what you could do there rather than rolling actual dice.
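
One possible approach, sketched in Python rather than Google Sheets. It assumes fair dice, and it uses the standard deviation of the 20 means and of the 20 medians as the measure of stability, which is a choice made here and not part of the problem statement.

```python
import random
from statistics import mean, median, stdev

def one_experiment(rng, n_rolls=10, n_dice=4):
    """Roll n_dice dice n_rolls times; return the mean and median of the sums."""
    sums = [
        sum(rng.randint(1, 6) for _ in range(n_dice))
        for _ in range(n_rolls)
    ]
    return mean(sums), median(sums)

rng = random.Random(2)  # illustrative seed so the run is repeatable
results = [one_experiment(rng) for _ in range(20)]
means = [m for m, _ in results]
medians = [md for _, md in results]

# A larger spread across the 20 experiments indicates a less stable measure.
print("spread of means:  ", round(stdev(means), 2))
print("spread of medians:", round(stdev(medians), 2))
```

Rerunning with different seeds shows how much the comparison itself varies from one set of 20 experiments to the next.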
