1 of 15

API CAN CODE
Computational Foundations of Data Science

Lesson 2.5: Preparing Data for Analysis

This work was made possible through generous support from the National Science Foundation (Award # 2141655).

2 of 15

2.4 Recap

  • We can use for loops to iterate through an entire dataset and quickly print out all the observations, or one or more aspects of all the observations.
  • We can use if statements to filter what is printed down to just the observations we’re interested in.
  • Last lesson, we looked at the Top 100 Movies dataset from IMDB and explored some of our questions about these movies!
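
As a quick reminder of the pattern, here is a minimal sketch of a for loop with an if filter. The `movies` list and its field names (`title`, `year`, `rating`) are made-up stand-ins for illustration, not the actual IMDB dataset.

```python
# Hypothetical sample data; the real Top 100 Movies dataset has different contents.
movies = [
    {"title": "Movie A", "year": 1994, "rating": 9.3},
    {"title": "Movie B", "year": 2008, "rating": 9.0},
    {"title": "Movie C", "year": 1972, "rating": 9.2},
]

# A for loop visits every observation; here we print one aspect of each.
for movie in movies:
    print(movie["title"])

# An if statement inside the loop keeps only the observations we care about.
recent_titles = []
for movie in movies:
    if movie["year"] >= 1990:
        recent_titles.append(movie["title"])
        print(movie["title"], movie["year"])
```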

3 of 15

Warmup

  • Open this program in EduBlocks, which uses the Mario Kart data we’ve seen before.
  • Create chunks of code that filter the data so that:
    • Only Luigi’s data prints out
    • All the acceleration values print out, but nothing else
    • Only Peach’s weight value prints out

4 of 15

Finding the Billboard Hot 100

  1. Go to RapidAPI.
  2. Go to the Billboard API.
  3. The page should look like this (note: we will use the “Billboard Hot 100” endpoint):
  4. Click “Subscribe to Test” and select the free option

5 of 15

Getting Example Code

Once subscribed, go to the right side of the page (look for “Code Snippets”). Make sure the “Target” reads “Python” and the “Client” reads “Requests”.

6 of 15

Request from EduBlocks

  1. Open, clone, save, and rename this program.
  2. Enter the URL and your X-RapidAPI-Key shown in the example code from RapidAPI, along with the date of this week’s Sunday.
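
The three pieces you enter can be sketched in Python as shown here. This is only an illustration under stated assumptions: the URL and key are placeholders to be copied from your own RapidAPI snippet, "this week's Sunday" is taken to mean the Sunday on or before today, and the `date` parameter name and YYYY-MM-DD format are guesses that may differ from the real snippet. Your snippet may also list other headers; copy those as shown there.

```python
import datetime

# Placeholders: copy the real values from the RapidAPI "Code Snippets" panel.
URL = "PASTE_URL_FROM_RAPIDAPI"
API_KEY = "PASTE_YOUR_X_RAPIDAPI_KEY"


def most_recent_sunday(today):
    """Return the Sunday on or before `today` (assumed meaning of "this week's Sunday")."""
    # date.weekday() is 0 for Monday through 6 for Sunday.
    days_since_sunday = (today.weekday() + 1) % 7
    return today - datetime.timedelta(days=days_since_sunday)


def build_request(today):
    """Assemble the URL, headers, and parameters without sending anything."""
    chart_date = most_recent_sunday(today).isoformat()  # for example "2024-05-05"
    headers = {"X-RapidAPI-Key": API_KEY}
    params = {"date": chart_date}  # parameter name is an assumption
    return URL, headers, params


url, headers, params = build_request(datetime.date.today())
print(url, params)

# To actually send it with the Requests client (needs a real URL and key):
#   import requests, json
#   r = requests.get(url, headers=headers, params=params)
#   myJSON = json.loads(r.text)
```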

7 of 15

Preview the Data

  1. Run your completed program and view the output.

What variables are present? What kind of data is stored?

What questions do you have about the data?
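
One way to answer the first two questions is to let Python report the variable names and their types. The sketch below uses a short made-up response string; the real API response may be shaped differently (for example, the list of songs may sit inside another key).

```python
import json

# Hypothetical response text; the real response may be shaped differently.
sample_text = (
    '[{"rank": 1, "title": "Song A", "artist": "Artist X"},'
    ' {"rank": 2, "title": "Song B", "artist": "Artist Y"}]'
)

myJSON = json.loads(sample_text)

# How much data did we get, and what kind of container is it?
print(type(myJSON).__name__, "with", len(myJSON), "entries")

# Look at the first entry: which variables does it have, and what type is each?
first = myJSON[0]
for key, value in first.items():
    print(key, "->", type(value).__name__)

variable_names = sorted(first.keys())
```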

8 of 15

Explore the Data

Modify your code so that it lists just the artists in the dataset. Do you recognize any artists in the list?

How many Taylor Swift songs are in this list? (Write a loop that counts for you!)

Choose an artist and write an if statement to check if there are any songs by that artist in the dataset.
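
If you get stuck, the three tasks above follow the same loop-and-filter pattern. Here is a rough sketch on a tiny made-up list; the `songs` data and the field name `artist` are assumptions, so adjust them to match what your program actually prints.

```python
# Hypothetical sample; replace with the list of songs from your own program.
songs = [
    {"rank": 1, "title": "Song A", "artist": "Taylor Swift"},
    {"rank": 2, "title": "Song B", "artist": "Artist Y"},
    {"rank": 3, "title": "Song C", "artist": "Taylor Swift"},
]

# List just the artists.
for song in songs:
    print(song["artist"])

# Count the songs by one artist with a counter variable.
count = 0
for song in songs:
    if song["artist"] == "Taylor Swift":
        count = count + 1
print("Taylor Swift songs:", count)

# Check whether a chosen artist appears at all.
chosen_artist = "Artist Y"
found = False
for song in songs:
    if song["artist"] == chosen_artist:
        found = True
print(chosen_artist, "is in the list:", found)
```

An exact match with `==` will miss collaborations if the artist field lists several names; checking `"Taylor Swift" in song["artist"]` would catch those too.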

9 of 15

Investigate!

Formulate a question of interest about the Billboard Hot 100 data and answer it! You might use for loops or conditional if statements like we used for the Movie investigation.

You might also want data from another week; think about what statements you could add to your program to draw in data from other weeks!
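
One possible approach to other weeks is to build a list of dates and loop over it, making one request per date. The sketch below only generates the dates; the helper name `previous_sundays` is made up for illustration, and the starting date is just an example.

```python
import datetime


def previous_sundays(start_sunday, how_many):
    """Return `how_many` dates, one week apart, counting back from `start_sunday`."""
    weeks = []
    for i in range(how_many):
        weeks.append(start_sunday - datetime.timedelta(weeks=i))
    return weeks


# Example starting point: Sunday, May 5, 2024.
dates = previous_sundays(datetime.date(2024, 5, 5), 3)
for d in dates:
    print(d.isoformat())
    # One request per date would go here, using the date in place of this week's Sunday.
```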

10 of 15

Share Your Investigation Results!

  1. Tell the class about your findings. How did you do your analysis?
  2. What did you find?
  3. Did you bring in any data from other weeks? What variables did you work with?
  4. What more do you wish you were able to do with this analysis? (Additional variables? Tables? Graphs?)

11 of 15

Exporting Data from EduBlocks

  1. Now, remove all of the code after “myJSON = json.loads(r.text)”.
  2. Attach the chunk of code that starts with “print("rank, title, artist")” to your code.
  3. Run the code. What does the output look like? Is it still in JSON format?
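
For reference, the export chunk likely does something along these lines: print a header row, then one comma-separated line per song. This is a sketch on made-up data, not the exact EduBlocks chunk, and the field names are assumptions based on the header text.

```python
# Hypothetical sample; in your program this comes from myJSON.
songs = [
    {"rank": 1, "title": "Song A", "artist": "Artist X"},
    {"rank": 2, "title": "Song B", "artist": "Artist Y"},
]

# The first line names the columns; every later line is one song.
lines = ["rank, title, artist"]
for song in songs:
    lines.append(str(song["rank"]) + ", " + song["title"] + ", " + song["artist"])

for line in lines:
    print(line)
```

Note that a title or artist containing a comma would split into extra columns with this simple approach; Python's built-in `csv` module can add quoting to handle that.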

12 of 15

Importing Data into CODAP

  1. Your teacher will copy the output from our last program. (You can do this by highlighting all the text and pressing Ctrl+C.)
  2. Now, they will open CODAP, a data analysis program that includes data visualization. Once in CODAP, select “Create New Document.”
  3. Once in this document, select the “Tables” button in the top left, then “--new from clipboard--”.
  4. What form is this data in? What would this program allow you to do with it?

13 of 15

Importing Data into CODAP

  1. Once imported to CODAP, the data is in a nice, neat tabular format.
  2. We can now use it to make visualizations and other data summaries.

14 of 15

Upcoming work with CODAP

  1. In the next unit, we’ll do more work with CODAP!
  2. This program will let us take our processed data from EduBlocks and do further analysis, like transforming variables, creating new ones, and visualizing data through graphs.
  3. RapidAPI lets us draw in live-updated data; EduBlocks processes it for us; CODAP lets us finish our analysis.

15 of 15

Thanks!

apicancode@umd.edu

API Can Code is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) License.