1 of 33

Modeling with Data for Beginners

WELCOME!

2 of 33

While you are waiting…

As you are getting settled, please take time to answer the modeling questions located at your table and the Mentimeter poll.

https://www.menti.com/alfnqjvoeuqj

3 of 33

Our Vision

THANK YOU to our partners and collaborators

It is the goal of the Data Science Academy and The Science House to encourage the use of data and development of data skills and literacy throughout K-12 education - to include the awareness of and ability to use mathematical modeling in everyday thinking.

4 of 33

Your Vision

  1. Please take a moment to share at your table what you hope data can do for your classroom.
  2. Data Card - During our 90 minutes together, you are encouraged to record your own personal dataset

(Anything you like! Be creative!)

5 of 33

6 of 33

Today’s Modeling with Data Challenge Question:

Is the Marvel Cinematic Universe representative of the population?

  • Create a model that determines how representative the MCU is of the population that can be applied to other major entertainment collections.

7 of 33

Math Modeling vs Word Problems

8 of 33

Math Modeling vs Word Problems

Modeling Problems:

Modeling problems do not provide all of the information and require students to use both mathematics and creativity.

  • Assumptions will have to be made, which will be based on different perspectives.
  • Many possible, yet valid, solutions will be developed.
  • Solutions will also require clear explanations of how the problem was approached, what assumptions were made, and what variables were factored into the model.

9 of 33

Modeling with Data = Data Science

Math Modeling

Algorithms

Machine Learning & AI

Digital Humanities

Statistics

Computing

Coding

Geographic Information Systems (GIS)

Data Visualization

10 of 33

Asking personally or socially relevant data questions

Asking students questions that they care about or that are relevant to their lives and communities not only builds agency and drives engagement, but also teaches them how to start asking their own questions.

Is the Marvel Cinematic Universe representative of the population?

11 of 33

How can we understand students' cultures and interests?

  • Learn student interests through classroom surveys and conversations.
  • Listen not only to what students are talking about, but also to what they are debating.
  • Investigate community-wide issues (environmental or social)
  • Allow students to ask their own questions, either independently or with guidance

12 of 33

Where to Find the Data

Authentic datasets can be found in a number of different places:

Repository

  • Bootstrap
  • GitHub
  • More…

13 of 33

The Modeling Process

Math Modeling is an iterative process, but can be broken down into several basic steps:

  • Defining the Problem Statement
  • Making Assumptions
  • Defining Variables
  • Getting a Solution
  • Analysis & Model Assessment
  • Reporting the Results

14 of 33

Defining the Problem Statement

Task: At your tables, work as a collaborative group to brainstorm all of the ways to interpret the problem and possible variables to consider.

(Feel free to use any type of visual diagram or mind maps to brainstorm your ideas.)

15 of 33

Making Assumptions

Many modeling problems are too complex to solve outright.

  • It may be necessary to simplify the problem by making assumptions, which will reduce the number of variables.

Task: In your group, decide which considerations or variables you can fill in with assumptions.

Try to limit your problem to only 3 variables.

16 of 33

Defining Variables

Once you have defined the problem and thought out a list of assumptions, you should identify the most important aspects of the problem that can be measured. These are the variables.

Independent Variables = measurable inputs into the model

Dependent Variables = measurable outputs from the model

Model Parameters = Constants (unchanging parameters; possibly from some of the previous assumptions made)
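
One way to see the three kinds of variables side by side is a small Python sketch for the challenge question. It is only an illustration, not the workshop's model: the function name `representation_score`, the group labels, and every number are made up, and the population shares are assumed to sum to 1. The score is 1 minus the total variation distance between the cast's shares and the population's shares, so 1 means a perfect match.

```python
def representation_score(cast_counts, population_share):
    """Return a score in [0, 1]; 1 means cast shares match population shares.

    cast_counts      -- independent variable: characters counted per group
    population_share -- model parameter: assumed share of each group (sums to 1)
    """
    total = sum(cast_counts.values())
    if total == 0:
        raise ValueError("cast_counts must contain at least one character")
    groups = set(cast_counts) | set(population_share)
    # Total variation distance: half the sum of absolute share differences.
    distance = 0.5 * sum(
        abs(cast_counts.get(g, 0) / total - population_share.get(g, 0.0))
        for g in groups
    )
    return 1.0 - distance


# Hypothetical numbers for illustration only; not taken from any dataset.
cast_counts = {"group_a": 30, "group_b": 10}         # independent variable
population_share = {"group_a": 0.5, "group_b": 0.5}  # model parameter
score = representation_score(cast_counts, population_share)  # dependent variable
print(round(score, 2))  # 0.75
```

Other choices (for example, a per-group ratio of cast share to population share) are equally valid; the point is that the inputs, constants, and output are each named explicitly.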

17 of 33

Data Analysis

There are a number of platforms that allow you to easily analyze datasets: CODAP, DataClassroom, Tuva

  1. Today we will be using CODAP
  2. Decide which dataset(s) you want to work with.
    • CSV files are typically the easiest to work with when first starting.

Before looking at today’s dataset, first open the following 2 files:

MCU Character DB, mcu_box_office

    • Today we are working with the MCU_DB:
  • What do you notice about these databases?

18 of 33

Data Analysis - Tidy vs Clean Data

Characteristics of Tidy Data:

  1. Every column is a variable.
  2. Every row is an observation.
  3. Every cell is a single value.

Cleaning Data:

Step 1: Remove duplicate, irrelevant, or unwanted observations

Step 2: Fix structural errors - naming conventions, typos, or incorrect capitalization

Step 3: Decide what to do about outliers

Step 4: Handle missing data
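
For anyone who wants to try these steps in code rather than in CODAP, here is a minimal Python sketch. It is an assumption-laden illustration: the column names `character` and `film` and the sample rows are hypothetical, title-casing is a crude fix that can mangle some names, and Step 3 (outliers) is left out because it is a judgment call rather than something to automate.

```python
def clean_rows(rows, required=("character",)):
    """Clean a list of dict rows; `required` names columns that must be present."""
    cleaned, seen = [], set()
    for row in rows:
        # Step 2: fix structural errors (stray spaces, inconsistent capitalization).
        tidy = {
            key: value.strip().title() if isinstance(value, str) else value
            for key, value in row.items()
        }
        # Step 4: mark empty cells as missing, then drop rows lacking a required value.
        tidy = {key: (None if value == "" else value) for key, value in tidy.items()}
        if any(tidy.get(column) is None for column in required):
            continue
        # Step 1: remove duplicates (checked after fixing, so near-duplicates match).
        fingerprint = tuple(sorted(tidy.items()))
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        cleaned.append(tidy)
    return cleaned


# Toy rows for illustration only; the real file may use different columns.
rows = [
    {"character": " iron man ", "film": "Iron Man"},
    {"character": "Iron Man", "film": "iron man"},  # duplicate once tidied
    {"character": "", "film": "Thor"},              # missing required value
    {"character": "Thor", "film": ""},              # missing optional value, kept
]
cleaned = clean_rows(rows)
print(len(cleaned))  # 2
```

Dropping rows with a missing required value is only one option; filling in a placeholder or keeping the row and flagging it may suit a given model better.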

19 of 33

Data Analysis

Today we are working with the MCU_DB

  • Download the CSV file, then open CODAP.

We will bring our dataset into CODAP together.

After you familiarize yourself with CODAP, think about the variables in your model and start exploring!

Look for trends and potential relationships

Consider whether you want or need to change your variables or assumptions (modeling is an iterative process!)

20 of 33

Getting a Solution

Now that you have defined the problem, identified some measurable variables, and simplified the problem with assumptions, you have a basic initial mathematical model.

You will use this model to generate some preliminary answers to the problem.

There are a number of ways that you could calculate your answers - from using calculus or differential equations to using graphs - your mathematical toolbox will determine your next steps.

21 of 33

Getting a Solution

As you decide what approach to take in building your solution, the following considerations may be helpful:

  • Have you seen this type of problem before?
  • Is there a single unknown variable, or many?
  • Is the problem linear or nonlinear?
  • Am I solving a system of equations simultaneously, or can I solve them sequentially?
  • What software or computational tools could be used?
  • Would a graph or other visualization help?

22 of 33

Getting a Solution

Task: In your teams, discuss and design a basic model that you think will help you answer our modeling question.

On the large note paper provided, lay out your basic model.

Include:

  • Your defined problem
  • What major assumptions you made
  • Your variables
  • How you plan to solve the problem (What will your app look like; How will it work; What is your reasoning)

23 of 33

Getting a Solution

Have students rotate and take a look at the other teams’ models and use the sticky notes to share their ideas about each model:

  • What thoughts do you have about the model?
  • What questions do you have?
  • Share any suggestions
  • Share how you would approach solving this model
    • What mathematical or digital tools would you use?

24 of 33

Model Assessment

Does My Answer Make Sense?

Encourage students to assess their model

If something looks off, first check the calculations or formulas. Then determine if changes to the assumptions or the math are needed.

  • Is the sign correct?
  • Is the magnitude of the answer reasonable?
  • Does the model behave as expected?
  • Can you validate the model?

25 of 33

Model Assessment: How Strong Is Our Model?

  • Did we support decisions about which variables to include and which to leave out?
  • Did we include how to use our chosen variables, and if/how to weight them?
  • Did we explain and justify the various choices we made?
  • Did we determine an appropriate way to incorporate non-numerical data, or justify not including these data?
  • Did we recognize and deal with the various units and explain the need for standardization?
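
The questions about weighting and standardization can be made concrete with a short Python sketch. It assumes z-scores as the standardization method (one common choice among several), and the variable names, values, and weights are all hypothetical. Each variable is rescaled to mean 0 and standard deviation 1, so that minutes and millions of dollars can be combined, and the results are then blended with the chosen weights.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Rescale values to mean 0 and standard deviation 1 (unit-free)."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # no spread, so nothing to compare
    return [(v - mu) / sigma for v in values]


def weighted_score(columns, weights):
    """Combine standardized columns into one score per observation."""
    standardized = {name: z_scores(vals) for name, vals in columns.items()}
    n = len(next(iter(columns.values())))
    return [
        sum(weights[name] * standardized[name][i] for name in columns)
        for i in range(n)
    ]


# Hypothetical variables in different units; values are for illustration only.
columns = {
    "screen_minutes": [10, 20, 30],
    "box_office_millions": [100, 200, 300],
}
weights = {"screen_minutes": 0.5, "box_office_millions": 0.5}
scores = weighted_score(columns, weights)
print([round(s, 2) for s in scores])
```

Without the standardization step, the variable with the largest raw numbers would dominate the score regardless of the weights.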

26 of 33

Analysis & Model Assessment

Is My Model Valid?

  • Does the model support the question you’re trying to answer?
  • Does your model show the mathematical relationship between your variables, and is it consistent with your assumptions?
  • Is your model sensitive to changes in the parameters (as in the cats model example)?
    • Do small changes in the parameters lead to significant changes in the output?

Have students change their parameters and note the outputs of their model. Record any thoughts, complications, etc.
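
Changing one parameter at a time and noting the output can also be sketched in Python. The model below is a hypothetical stand-in (the ratio of a group's cast share to its population share), and the baseline values are made up; any model function whose baseline output is not zero could be passed in instead.

```python
def model(cast_share, population_share):
    """Hypothetical one-group model: ratio of cast share to population share."""
    return cast_share / population_share


def sensitivity(model_fn, baseline, param, deltas):
    """Vary one parameter by each relative change in `deltas`.

    Returns a list of (delta, output, relative change in output) tuples.
    Assumes the baseline output is not zero.
    """
    base_out = model_fn(**baseline)
    results = []
    for delta in deltas:
        changed = dict(baseline)
        changed[param] = baseline[param] * (1 + delta)
        out = model_fn(**changed)
        results.append((delta, out, (out - base_out) / base_out))
    return results


# Made-up baseline for illustration only.
baseline = {"cast_share": 0.3, "population_share": 0.5}
results = sensitivity(model, baseline, "population_share", deltas=(-0.1, 0.1))
for delta, out, change in results:
    print(f"{delta:+.0%} parameter change -> output {out:.3f} ({change:+.1%})")
```

If a small change in a parameter produces a much larger relative change in the output, the assumption behind that parameter deserves extra justification in the report.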

27 of 33

Reporting the Results

Communication is a key aspect of data science and therefore students should be given opportunities to report out on their modeling process.

  • Take notes throughout your entire development process
  • Keep track of all assumptions made
  • Be sure to leave enough time to focus on quality writing

28 of 33

Reporting the Results

Key Ideas:

    • Summarize how the problem was approached and what assumptions were made.
    • Include justifications of each assumption
    • Recognize strengths and weaknesses of the model
    • Describe the real-world application of the problem and model

29 of 33

Data Pass - Data Visualization

Pass around each data visualization.

During each round record the following:

Option 1

Round 1: One Understanding about the data

Round 2: One Question the data could answer

Round 3: One Question the data creates

30 of 33

Data Pass - Data Visualization

Pass around each data visualization.

During each round record the following:

Option 2

Round 1: Write something that you

Round 2: Describe any trends that you may see.

Round 3: Is the style of graph the best for conveying this type of information? Why or why not?

31 of 33

Data Visualization

Look back at your personal data card.

  • Design your own visualization to express your data.
  • Include an explanation and key.

32 of 33

33 of 33

Look at you modeling! You’re a natural!