1 of 14

API CAN CODE

Doing Data Science

Final Project 2: Asking Questions

This work was made possible through generous support from the National Science Foundation (Award # 2141655).

2 of 14

Warmup

What do you notice? What do you wonder?

3 of 14

“Doing Data Science” Final Project

For the final project, you will take on the role of a data scientist!

This project will take several class periods, so try to choose a topic you wouldn’t mind working on for a while!

Research Questions

Identify questions about this dataset

Finding Data

Access and evaluate a relevant dataset

Data Refining

Filter, clean and trim the data

Data Visualization

Create data visualizations and interpret

Communicating Results

Share your conclusions and insights

4 of 14

Today’s Focus

Today, we’ll focus on identifying questions about our datasets.

Research Questions

Identify questions about this dataset

Finding Data

Access and evaluate a relevant dataset

Data Refining

Filter, clean and trim the data

Data Visualization

Create data visualizations and interpret

Communicating Results

Share your conclusions and insights

5 of 14

Review of Final Project Exemplars

  • Take a look at some example final projects from previous years:
  • Look at the structure of these project presentations and what kinds of questions they ask!
    • Look at the section headers. What headers are included?
    • Look at the content within each section. What would go in each section in your project?

6 of 14

Types of Data Science Questions

  • Descriptive questions ask about features and statistics of a particular sample or population
    • Example: How many movies are comedies?
    • Example: What are the win rates across NFL teams?
  • Comparative questions compare two samples or populations
    • Example: Does the most popular dog differ by country?
    • Example: Do pop songs get more streams than rock songs?
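
As a rough illustration of these two question types, here is a minimal Python sketch. The `movies` list, its values, and the helper names `count_genre` and `mean_rating` are all made up for this example and do not come from any real dataset or course tool.

```python
# Made-up sample data for illustration only (not a real dataset).
movies = [
    {"title": "Movie A", "genre": "comedy", "rating": 7.1},
    {"title": "Movie B", "genre": "drama", "rating": 8.0},
    {"title": "Movie C", "genre": "comedy", "rating": 6.5},
    {"title": "Movie D", "genre": "drama", "rating": 7.4},
    {"title": "Movie E", "genre": "comedy", "rating": 5.9},
]


def count_genre(rows, genre):
    """Descriptive: count how many rows belong to one genre."""
    return sum(1 for row in rows if row["genre"] == genre)


def mean_rating(rows, genre):
    """Average rating for one genre, used to compare two groups."""
    ratings = [row["rating"] for row in rows if row["genre"] == genre]
    return sum(ratings) / len(ratings)


# Descriptive question: How many movies are comedies?
comedy_count = count_genre(movies, "comedy")
print("Number of comedies:", comedy_count)

# Comparative question: Do dramas get higher ratings than comedies?
comedy_mean = mean_rating(movies, "comedy")
drama_mean = mean_rating(movies, "drama")
print("Average comedy rating:", round(comedy_mean, 2))
print("Average drama rating:", round(drama_mean, 2))
```

A descriptive question needs only one group; a comparative question needs the same statistic computed for two groups so they can be set side by side.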

7 of 14

Types of Data Science Questions

  • Evaluative questions often look for the “best” or most extreme cases in a sample or population.
    • Example: Who is the best soccer player in the USA?
    • Example: Who is the most popular pop music artist?
  • Predictive questions ask what factors can be used to predict outcomes
    • Example: Do heavier dogs live longer?
    • Example: Is there a relationship between roller coaster height and roller coaster speed?
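
These two question types can also be sketched in Python. The `coasters` data below is invented for illustration, and `pearson_r` is a hypothetical helper that computes the standard Pearson correlation coefficient by hand; a value near 1 or -1 suggests a strong linear relationship, and a value near 0 suggests little or none.

```python
# Made-up sample data for illustration only (not real measurements).
coasters = [
    {"name": "Coaster A", "height_ft": 100, "speed_mph": 50},
    {"name": "Coaster B", "height_ft": 200, "speed_mph": 70},
    {"name": "Coaster C", "height_ft": 300, "speed_mph": 90},
    {"name": "Coaster D", "height_ft": 150, "speed_mph": 65},
]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Evaluative question: Which roller coaster is the fastest?
fastest = max(coasters, key=lambda c: c["speed_mph"])
print("Fastest coaster:", fastest["name"])

# Predictive question: Is height related to speed?
heights = [c["height_ft"] for c in coasters]
speeds = [c["speed_mph"] for c in coasters]
r = pearson_r(heights, speeds)
print("Correlation between height and speed:", round(r, 2))
```

Note that a correlation describes a relationship in the data; on its own it does not show that one variable causes the other.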

8 of 14

What questions could you ask?

Using each graph, brainstorm a few questions you could ask about the data.

Consider the four question types we discussed:

  • Descriptive questions ask about features and statistics of a particular sample or population
  • Comparative questions compare two samples or populations
  • Evaluative questions often look for the “best” or most extreme cases in a sample or population
  • Predictive questions ask what factors can be used to predict outcomes

10 of 14

First Draft of Questions

  • Write 4-5 data science questions that you could use your dataset to answer.
  • These questions might be descriptive, comparative, evaluative, or predictive.
    • Try to think of a few different types of questions that you could ask about your dataset!
  • Submit your list of questions on today’s worksheet.

11 of 14

Peer Review of Questions

  • Trade your list of questions with a classmate.
  • Consider their dataset and the questions they want to answer. Offer feedback along the following lines:
    • Do you think these questions are answerable using this dataset? Why or why not?
    • Are these questions descriptive, comparative, evaluative, or predictive?
    • Suggest at least one alternative question that your classmate could ask, particularly in a question category they haven’t written a question for.
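
One quick, partial check on whether a question is answerable is to confirm that the dataset actually contains the variables the question needs. The sketch below assumes the dataset is a list of dictionaries; the `dogs` records and the `missing_fields` helper are hypothetical names invented for this example.

```python
# Made-up records for illustration only (not a real dataset).
dogs = [
    {"breed": "Beagle", "weight_lb": 25, "lifespan_yr": 13},
    {"breed": "Great Dane", "weight_lb": 140, "lifespan_yr": 8},
]


def missing_fields(records, needed_fields):
    """Return the needed fields that are absent from at least one record."""
    return [
        field
        for field in needed_fields
        if any(field not in record for record in records)
    ]


# "Do heavier dogs live longer?" needs weight and lifespan.
print(missing_fields(dogs, ["weight_lb", "lifespan_yr"]))

# "Does the most popular dog differ by country?" needs a country field.
print(missing_fields(dogs, ["breed", "country"]))
```

An empty list means every needed variable is present. That does not guarantee the question is answerable, since the values could still be incomplete or unreliable, but a missing variable is a clear sign the question or the dataset needs to change.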

12 of 14

Considering Additional Data

  • After this Peer Review, you might feel like you need an additional data source to answer some of your questions.
  • Consider searching Kaggle for a CSV you could use to supplement your API data.
  • Once you have a CSV, you can use our CSV converter to turn this file into a JSON file.
  • Then, drop the converted JSON file into the right spot in this EduBlocks program. (Think about what variables you want in the resulting print-out!)
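
The course's CSV converter and EduBlocks program are not reproduced here, so the following is only a sketch of the same idea using Python's standard `csv` and `json` modules. The sample CSV text, its column names, and the function name `csv_text_to_json` are assumptions made up for this example.

```python
import csv
import io
import json


def csv_text_to_json(csv_text):
    """Convert CSV text with a header row into a JSON list of objects."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)  # each row becomes a dict keyed by column name
    return json.dumps(rows, indent=2)


# Made-up CSV content for illustration only.
sample_csv = "breed,weight_lb,lifespan_yr\nBeagle,25,13\nGreat Dane,140,8\n"

json_text = csv_text_to_json(sample_csv)
print(json_text)

# Load the JSON back and print only the variables of interest.
records = json.loads(json_text)
for record in records:
    print(record["breed"], record["lifespan_yr"])
```

Values read from a CSV this way are all strings (for example `"140"` rather than `140`), so numeric columns need to be converted with `int()` or `float()` before doing any math on them.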

13 of 14

Question Revisions

  • Using your classmate’s feedback to guide you, revise your list of questions.
  • Some questions might stay the same; others might change.
  • You also might remove or add questions based on their suggestions!
  • Submit your revised 4-5 data science questions on today’s worksheet.
  • REFLECTION: How did this process improve your questions?
  • How do you know you have good data science questions?
  • What is the main question your project will try to answer, and WHY are you interested in this data in particular? What do you hope to discover/explain to your audience? (NOTE: Your data science questions should address this question using different data types, relationships, and trends.)

14 of 14

Thanks!

apicancode@umd.edu
