1 of 19

Is the game fair? Developing the big ideas of inference using simulations

Craig Lazarski

Craig_Lazarski@caryacademy.org

2 of 19

Workshop Goals:

Use a data investigation process framework as a method for conducting a hypothesis test.

Provide you with a simulation tool that allows you and students to explore every aspect of a hypothesis test using a single context.

3 of 19

Data Investigation Process

https://www.fi.ncsu.edu/projects/instep/

4 of 19

Frame the problem

We need to evaluate whether the dice being produced by a company are fair. A fair die is one in which each outcome is equally likely and occurs at the same rate in the long run.

5 of 19

Consider and Gather Data

We need rolls of the dice that are representative. Since each roll is independent, we can roll the dice many times and record the results.

6 of 19

Process Data

We will observe how many times each outcome occurs and create a table that displays the frequency of each outcome.
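
The tabulation step described above can be sketched in a few lines of Python. This is only an illustration, not the code behind the workshop's app: the helper names `roll_die` and `frequency_table`, the 60 rolls, and the seed are all assumptions chosen for the example.

```python
import random
from collections import Counter

def roll_die(n_rolls, weights=(1, 1, 1, 1, 1, 1), seed=None):
    """Simulate n_rolls of a six-sided die; equal weights model a fair die."""
    rng = random.Random(seed)
    return rng.choices(range(1, 7), weights=weights, k=n_rolls)

def frequency_table(rolls):
    """Return {face: count} for faces 1-6, including faces never rolled."""
    counts = Counter(rolls)
    return {face: counts.get(face, 0) for face in range(1, 7)}

# Hypothetical sample: 60 rolls of a fair die (the seed makes it repeatable).
rolls = roll_die(60, seed=1)
table = frequency_table(rolls)

# Print each face with its count and its share of all rolls.
for face, count in table.items():
    print(f"{face}: {count:3d}  ({count / len(rolls):.2f})")
```

Passing unequal `weights` would model a biased die, which is one way to mimic the different companies.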

7 of 19

Explore and Visualize the data

We can create a histogram of the results. We will be interested in observing if the heights of the bars are similar or if they are different. The more they vary, the more evidence we have that the dice might be unfair.

8 of 19

Consider models

Count data such as this can be used to compute a test statistic that is modeled by a chi-square distribution. We can use this distribution to compute a p-value.
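
For reference, the standard goodness-of-fit statistic for a six-sided die can be written as follows. The symbols $O_i$ (observed count for face $i$), $E_i$ (expected count), and $n$ (number of rolls) are notation introduced here, and the counts in the worked line are hypothetical.

```latex
\chi^2 = \sum_{i=1}^{6} \frac{(O_i - E_i)^2}{E_i},
\qquad E_i = \frac{n}{6},
\qquad \text{df} = 6 - 1 = 5

% Hypothetical example: n = 60 rolls with counts (7, 13, 10, 10, 9, 11), so E_i = 10
\chi^2 = \frac{(-3)^2 + 3^2 + 0^2 + 0^2 + (-1)^2 + 1^2}{10} = \frac{20}{10} = 2.0
```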

9 of 19

Communicate and Propose Action

If the p-value is small, we will conclude that the dice produced by the company are not fair. We should keep in mind that the closer a company's dice are to fair, the larger the sample size we will need to detect that they are unfair.

10 of 19

Inference Challenges

Students believe that hypothesis test results are definitive.

Students pay little attention to sample size. If they do, it is often only to check it off as a necessary condition.

Students do not understand that the null hypothesis is an assumption!

Students rely on p-values to make decisions and ignore what a test statistic can tell them.

Students do not understand that Type 1 and Type 2 errors are conditional!

Students find it difficult to meaningfully interpret what the Power of a test means.

11 of 19

Schoolopoly: Are the dice fair or biased?

Background

Suppose your school is planning to create a board game modelled on the classic game of Monopoly. The game is to be called Schoolopoly and, like Monopoly, will be played with dice. Because many copies of the game are expected to be sold, companies are competing for the contract to supply dice for Schoolopoly. Some companies have been accused of making poor-quality dice, and these are to be avoided, since players must believe the dice they are using are actually “fair.” Each company has provided dice for analysis, and you will be assigned one company to investigate.

12 of 19

Investigation tool

https://shiny.mathisawesome.com/app/dice

13 of 19

Task 1: Investigate the companies

Using the Analysis tab (the home page), explore each company and decide which company's dice are fair.

What evidence do you have for your decision?
What are the real-world limitations that might make reaching this conclusion more challenging than using this simulation tool?

Key concepts explored:

What assumptions do we use to make decisions and how do we make them?

We always start with some assumption about the context that determines what behavior we should expect. In the case of the dice, we expect each outcome to occur at the same rate. This means that regardless of the number of times we roll, each outcome should occur about 1/6 of the time. For different contexts, we may need to know more about probability to develop the expected outcomes.

What role does sample size have in making decisions?

If we can roll the dice a very large number of times, we can observe how well the outcomes fit our assumption. The fewer rolls, the more variation in the results and the harder it is to make a decision.

What role does variation have in making decisions?

We can use a simulation to roll the dice more times than we could manually. However, how many rolls would we be able to observe without the simulation? How does the number of rolls impact what we observe? If the dice are fair, what do the results look like when we observe a smaller number of trials? How much variation from our expectation is natural when the dice are fair?
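
One way to get a feel for these questions outside the app is to roll a simulated fair die at several sample sizes and watch how far the observed proportions stray from 1/6. The sketch below is an assumption-laden illustration: the sample sizes, the seed, and the "spread" summary (largest minus smallest proportion) are choices made for this example, not part of the workshop tool.

```python
import random

def face_proportions(n_rolls, rng):
    """Roll a fair die n_rolls times and return the proportion of each face."""
    rolls = [rng.randint(1, 6) for _ in range(n_rolls)]
    return [rolls.count(face) / n_rolls for face in range(1, 7)]

rng = random.Random(2024)  # fixed seed so the run is repeatable
spreads = {}
for n_rolls in (30, 600, 12000):
    props = face_proportions(n_rolls, rng)
    # Spread = gap between the most and least frequent face.
    spreads[n_rolls] = max(props) - min(props)
    print(n_rolls, [round(p, 3) for p in props], "spread:", round(spreads[n_rolls], 3))
```

Even though the die is fair in every run, the proportions at small sample sizes typically sit noticeably further from 1/6 than those at large sample sizes.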

14 of 19

Task 2: Measure the variation

Using the Test Statistic tab, explore how to construct a test statistic and how to interpret it.

What does a test statistic measure?
How can we interpret a test statistic?
What role does the null hypothesis play in the value of a test statistic?
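
A minimal sketch of the chi-square test statistic may help make these questions concrete. The function name and both sets of counts are hypothetical. The expected counts come directly from the null hypothesis that the die is fair, which is why the null hypothesis shapes the value of the statistic.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all outcomes."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

n_rolls = 60
# Under the null hypothesis (fair die), every face is expected n/6 times.
expected = [n_rolls / 6] * 6

close_to_fair = [8, 12, 9, 11, 10, 10]  # hypothetical counts near expectation
lopsided = [4, 6, 8, 10, 12, 20]        # hypothetical counts far from expectation

print(chi_square_statistic(close_to_fair, expected))  # small: little disagreement
print(chi_square_statistic(lopsided, expected))       # large: strong disagreement
```

The statistic is zero only when every observed count equals its expected count, and it grows as the counts move away from what the null hypothesis predicts.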

15 of 19

Task 3: Model the variation

Using the Distribution tab, explore the pattern and frequency with which we observe values of this test statistic (its distribution).

How is the distribution of a test statistic generated?

How can we use the distribution of a test statistic to create a rule that can be used to reach a conclusion?

When using our rule, what mistakes can we make and how often might they happen?
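
These three questions can be explored with a short simulation. The sketch below assumes 60 rolls per sample, 5000 simulated samples, and a rule that flags a die as unfair when its statistic exceeds the 95th percentile of the fair-die statistics. Those choices, and the function names, are illustrative assumptions rather than the app's actual settings.

```python
import random

def chi_square_statistic(counts, n_rolls):
    """Chi-square statistic against the fair-die expectation of n_rolls / 6."""
    expected = n_rolls / 6
    return sum((c - expected) ** 2 / expected for c in counts)

def simulate_statistic(n_rolls, weights, rng):
    """Roll a die with the given face weights and return its statistic."""
    rolls = rng.choices(range(6), weights=weights, k=n_rolls)
    counts = [rolls.count(face) for face in range(6)]
    return chi_square_statistic(counts, n_rolls)

rng = random.Random(7)
FAIR = [1, 1, 1, 1, 1, 1]

# 1. Generate the distribution: many samples from a die we know is fair.
null_stats = sorted(simulate_statistic(60, FAIR, rng) for _ in range(5000))

# 2. Create a rule: call the die unfair if the statistic exceeds the
#    value that about 95% of fair-die samples fall at or below.
cutoff = null_stats[int(0.95 * len(null_stats))]

# 3. Count mistakes: how often does the rule wrongly flag a fair die?
false_alarms = sum(simulate_statistic(60, FAIR, rng) > cutoff for _ in range(5000))
print("cutoff:", round(cutoff, 2), "false alarm rate:", false_alarms / 5000)
```

The false alarm rate estimates the chance of a Type 1 error, which is conditional on the die actually being fair. The simulated cutoff should land near 11.07, the 95th percentile of the chi-square distribution with 5 degrees of freedom.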

16 of 19

Homework

Using the Chi-square Analysis tab, use the decision rule we established to assess the Dice Dice Baby company for the following sample sizes:

The minimum sample size to meet the expected counts condition: 30
A much larger sample size of 600

What role does sample size have in this process?
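
As a rough sketch of why sample size matters here, the example below holds the observed proportions fixed and changes only the number of rolls. It assumes the usual condition that every expected count is at least 5 and uses 11.07 (the 95th percentile of the chi-square distribution with 5 degrees of freedom) as the cutoff. The decision rule established in the workshop may differ, and the counts are hypothetical, not data from Dice Dice Baby.

```python
CRITICAL_VALUE = 11.07  # assumed rule: chi-square 95th percentile, 5 df

def min_sample_size(n_faces=6, min_expected=5):
    """Smallest n for which every expected count n / n_faces is >= min_expected."""
    return n_faces * min_expected

def chi_square_statistic(observed):
    """Chi-square statistic against equal expected counts."""
    n = sum(observed)
    expected = n / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)

small = [3, 4, 5, 5, 6, 7]                 # hypothetical counts, n = 30
large = [count * 20 for count in small]    # same proportions, n = 600

print("minimum sample size:", min_sample_size())
for counts in (small, large):
    stat = chi_square_statistic(counts)
    decision = "reject fairness" if stat > CRITICAL_VALUE else "cannot reject fairness"
    print(f"n = {sum(counts)}: statistic = {stat:.1f} -> {decision}")
```

With identical proportions, the statistic grows in proportion to the sample size, so the same apparent imbalance is unconvincing at 30 rolls and strong evidence at 600.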

17 of 19

Task 4: How good is the test?

Using the Power tab, explore how often we make the correct decision for each of the companies.

What is the relationship between sample size and the accuracy of the decision we make?
What is the relationship between how unfair a die is and the power of the test?
What is the relationship between the power of a test and the confidence level of an interval?
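
The first two questions can be explored by simulation. The sketch below estimates how often the test rejects fairness for dice with different amounts of bias at two sample sizes. The face weights, the 11.07 cutoff (5% level, 5 degrees of freedom), and the number of simulated samples are all assumptions made for illustration and do not describe the companies in the app.

```python
import random

CRITICAL_VALUE = 11.07  # assumed rule: chi-square 95th percentile, 5 df

def rejects_fairness(n_rolls, weights, rng):
    """Roll a weighted die n_rolls times and apply the chi-square decision rule."""
    rolls = rng.choices(range(6), weights=weights, k=n_rolls)
    expected = n_rolls / 6
    stat = sum((rolls.count(face) - expected) ** 2 / expected for face in range(6))
    return stat > CRITICAL_VALUE

def estimate_power(n_rolls, weights, n_sims=1000, seed=0):
    """Fraction of simulated samples in which fairness is rejected."""
    rng = random.Random(seed)
    return sum(rejects_fairness(n_rolls, weights, rng) for _ in range(n_sims)) / n_sims

FAIR = [1, 1, 1, 1, 1, 1]
SLIGHTLY_LOADED = [1, 1, 1, 1, 1, 1.2]  # hypothetical: six is a bit more likely
HEAVILY_LOADED = [1, 1, 1, 1, 1, 2]     # hypothetical: six is twice as likely

for label, weights in (("fair", FAIR),
                       ("slightly loaded", SLIGHTLY_LOADED),
                       ("heavily loaded", HEAVILY_LOADED)):
    for n_rolls in (30, 600):
        print(f"{label:16s} n = {n_rolls:3d}  rejection rate = {estimate_power(n_rolls, weights):.2f}")
```

For the fair die, the rejection rate is the Type 1 error rate. For the loaded dice, it is the power of the test, which depends on both how unfair the die is and how many times it is rolled.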

18 of 19

Classroom Implementation

Day 1:

Tasks 1, 2, and 3 (40 minutes)
Teach the formal Goodness of Fit test using the Dice context (40 minutes)
Homework: Identify the minimum sample size and conduct a GOF test for Dice Dice Baby. Repeat using a sample size of 600. Investigate how sample size impacts the conclusion of the test. (20 minutes)

Day 2:

Discussion about sample size from the homework assignment. (10 minutes)
Investigate the power of the test (20 minutes)

Total time: ~130 minutes

19 of 19

Resources

Contact information: Craig_Lazarski@caryacademy.org

Dice app and other shiny apps: Shiny.mathisawesome.com

Published article using dice app: Statistics Teacher

Github: https://github.com/clazarski

Instep resources: https://www.fi.ncsu.edu/projects/instep/