1 of 21

Visualization

John Yi

Winter 2025

2 of 21

Why Data Visualization?

3 of 21

Why Data Visualization?

4 of 21

Why Data Visualization?

[{'date': '2020-01', 'positive': 2}, {'date': '2020-02', 'positive': 16}, {'date': '2020-03', 'positive': 5939}, {'date': '2020-04', 'positive': 9133}, {'date': '2020-05', 'positive': 7621}, {'date': '2020-06', 'positive': 11969}, {'date': '2020-07', 'positive': 24516}, {'date': '2020-08', 'positive': 17284}, {'date': '2020-09', 'positive': 13259}, {'date': '2020-10', 'positive': 21868}, {'date': '2020-11', 'positive': 64133}, {'date': '2020-12', 'positive': 66590}, {'date': '2021-01', 'positive': 69267}, {'date': '2021-02', 'positive': 28176}, {'date': '2021-03', 'positive': 4759}]

  • What trends do you notice?
  • What other questions can you ask about this data?
    • Daily cases
    • New vs cumulative
    • Other states
    • What else might be missing from this data?

5 of 21

Why Data Visualization?

  • What trends do you notice?
  • What other questions can you ask about this data?
    • Daily cases
    • New vs cumulative
    • WA vs other states
    • What else might be missing from this data?

[{'date': '2021-03-07', 'positive': 664}, {'date': '2021-03-06', 'positive': 778}, {'date': '2021-03-05', 'positive': 854}, {'date': '2021-03-04', 'positive': 795}, {'date': '2021-03-03', 'positive': 733}, {'date': '2021-03-02', 'positive': 935}, {'date': '2021-03-01', 'positive': 0}, {'date': '2021-02-28', 'positive': 951}, {'date': '2021-02-27', 'positive': 1169}, {'date': '2021-02-26', 'positive': 1088}, {'date': '2021-02-25', 'positive': 872}, {'date': '2021-02-24', 'positive': 731}, {'date': '2021-02-23', 'positive': 1168}, {'date': '2021-02-22', 'positive': 0}, {'date': '2021-02-21', 'positive': 890}, {'date': '2021-02-20', 'positive': 897}, {'date': '2021-02-19', 'positive': 1200}, {'date': '2021-02-18', 'positive': 1061}, {'date': '2021-02-17', 'positive': 1699}, {'date': '2021-02-16', 'positive': 0}, {'date': '2021-02-15', 'positive': 0}, {'date': '2021-02-14', 'positive': 880}, {'date': '2021-02-13', 'positive': 1008}, {'date': '2021-02-12', 'positive': 1453}, {'date': '2021-02-11', 'positive': 681}, {'date': '2021-02-10', 'positive': 811}, {'date': '2021-02-09', 'positive': 3068}, {'date': '2021-02-08', 'positive': 0}, {'date': '2021-02-07', 'positive': 775}, {'date': '2021-02-06', 'positive': 1493}, {'date': '2021-02-05', 'positive': 1584}, {'date': '2021-02-04', 'positive': 1602}, {'date': '2021-02-03', 'positive': 1357}, {'date': '2021-02-02', 'positive': 1738}, {'date': '2021-02-01', 'positive': 0}, {'date': '2021-01-31', 'positive': 1796}, {'date': '2021-01-30', 'positive': 1992}, {'date': '2021-01-29', 'positive': 2520}, {'date': '2021-01-28', 'positive': 1807}, {'date': '2021-01-27', 'positive': 1341}, {'date': '2021-01-26', 'positive': 1943}, {'date': '2021-01-25', 'positive': 0}, {'date': '2021-01-24', 'positive': 1949}, {'date': '2021-01-23', 'positive': 2162}, {'date': '2021-01-22', 'positive': 2070}, {'date': '2021-01-21', 'positive': 2028}, {'date': '2021-01-20', 'positive': 2050}, {'date': '2021-01-19', 'positive': 0}, {'date': 
'2021-01-18', 'positive': 3969}, {'date': '2021-01-17', 'positive': 0}, {'date': '2021-01-16', 'positive': 2193}, {'date': '2021-01-15', 'positive': 2575}, {'date': '2021-01-14', 'positive': 2658}, {'date': '2021-01-13', 'positive': 1858}, {'date': '2021-01-12', 'positive': 5091}, {'date': '2021-01-11', 'positive': 0}, {'date': '2021-01-10', 'positive': 2988}, {'date': '2021-01-09', 'positive': 4595}, {'date': '2021-01-08', 'positive': 3260}, {'date': '2021-01-07', 'positive': 1985}, {'date': '2021-01-06', 'positive': 2332}, {'date': '2021-01-05', 'positive': 1039}, {'date': '2021-01-04', 'positive': 8644}, {'date': '2021-01-03', 'positive': 0}, {'date': '2021-01-02', 'positive': 0}, {'date': '2021-01-01', 'positive': 4422}, {'date': '2020-12-31', 'positive': 1484}, {'date': '2020-12-30', 'positive': 2174}, {'date': '2020-12-29', 'positive': 1953}, {'date': '2020-12-28', 'positive': 0}, {'date': '2020-12-27', 'positive': 3626}, {'date': '2020-12-26', 'positive': 0}, {'date': '2020-12-25', 'positive': 2891}, {'date': '2020-12-24', 'positive': 2315}, {'date': '2020-12-23', 'positive': 1252}, {'date': '2020-12-22', 'positive': 4035}, {'date': '2020-12-21', 'positive': 0}, {'date': '2020-12-20', 'positive': 2332}, {'date': '2020-12-19', 'positive': 3063}, {'date': '2020-12-18', 'positive': 2940}, {'date': '2020-12-17', 'positive': 61}, {'date': '2020-12-16', 'positive': 667}, {'date': '2020-12-15', 'positive': 652}, {'date': '2020-12-14', 'positive': 1410}, {'date': '2020-12-13', 'positive': 2228}, {'date': '2020-12-12', 'positive': 2421}, {'date': '2020-12-11', 'positive': 2557}, {'date': '2020-12-10', 'positive': 2977}, {'date': '2020-12-09', 'positive': 3650}, {'date': '2020-12-08', 'positive': 1182}, {'date': '2020-12-07', 'positive': 1773}, {'date': '2020-12-06', 'positive': 3080}, {'date': '2020-12-05', 'positive': 3138}, {'date': '2020-12-04', 'positive': 3312}, {'date': '2020-12-03', 'positive': 3401}, {'date': '2020-12-02', 'positive': 4702}, {'date': 
'2020-12-01', 'positive': 1314}, {'date': '2020-11-30', 'positive': 2179}, {'date': '2020-11-29', 'positive': 2624}, {'date': '2020-11-28', 'positive': 530}, {'date': '2020-11-27', 'positive': 2936}, {'date': '2020-11-26', 'positive': 3157}, {'date': '2020-11-25', 'positive': 3678}, {'date': '2020-11-24', 'positive': 1405}, {'date': '2020-11-23', 'positive': 1890}, {'date': '2020-11-22', 'positive': 3193}, {'date': '2020-11-21', 'positive': 3082}, {'date': '2020-11-20', 'positive': 3070}, {'date': '2020-11-19', 'positive': 3216}, {'date': '2020-11-18', 'positive': 3454}, {'date': '2020-11-17', 'positive': 1169}, {'date': '2020-11-16', 'positive': 1881}, {'date': '2020-11-15', 'positive': 2610}, {'date': '2020-11-14', 'positive': 2910}, {'date': '2020-11-13', 'positive': 2163}, {'date': '2020-11-12', 'positive': 2384}, {'date': '2020-11-11', 'positive': 2713}, {'date': '2020-11-10', 'positive': 777}, {'date': '2020-11-09', 'positive': 1407}, {'date': '2020-11-08', 'positive': 1931}, {'date': '2020-11-07', 'positive': 2124}, {'date': '2020-11-06', 'positive': 1872}, {'date': '2020-11-05', 'positive': 1595}, {'date': '2020-11-04', 'positive': 1706}, {'date': '2020-11-03', 'positive': 543}, {'date': '2020-11-02', 'positive': 787}, {'date': '2020-11-01', 'positive': 1147}, {'date': '2020-10-31', 'positive': 1245}, {'date': '2020-10-30', 'positive': 1102}, {'date': '2020-10-29', 'positive': 1075}, {'date': '2020-10-28', 'positive': 1133}, {'date': '2020-10-27', 'positive': 348}, {'date': '2020-10-26', 'positive': 589}, {'date': '2020-10-25', 'positive': 835}, {'date': '2020-10-24', 'positive': 849}, {'date': '2020-10-23', 'positive': 785}, {'date': '2020-10-22', 'positive': 918}, {'date': '2020-10-21', 'positive': 913}, {'date': '2020-10-20', 'positive': 298}, {'date': '2020-10-19', 'positive': 409}, {'date': '2020-10-18', 'positive': 664}, {'date': '2020-10-17', 'positive': 770}, {'date': '2020-10-16', 'positive': 771}, {'date': '2020-10-15', 'positive': 768}, {'date': 
'2020-10-14', 'positive': 867}, {'date': '2020-10-13', 'positive': 253}, {'date': '2020-10-12', 'positive': 383}, {'date': '2020-10-11', 'positive': 701}, {'date': '2020-10-10', 'positive': 726}, {'date': '2020-10-09', 'positive': 723}, {'date': '2020-10-08', 'positive': 723}, {'date': '2020-10-07', 'positive': 906}, {'date': '2020-10-06', 'positive': 252}, {'date': '2020-10-05', 'positive': 403}, {'date': '2020-10-04', 'positive': 628}, {'date': '2020-10-03', 'positive': 580}, {'date': '2020-10-02', 'positive': 618}, {'date': '2020-10-01', 'positive': 633}, {'date': '2020-09-30', 'positive': 778}, {'date': '2020-09-29', 'positive': 200}, {'date': '2020-09-28', 'positive': 328}, {'date': '2020-09-27', 'positive': 521}, {'date': '2020-09-26', 'positive': 588}, {'date': '2020-09-25', 'positive': 595}, {'date': '2020-09-24', 'positive': 599}, {'date': '2020-09-23', 'positive': 669}, {'date': '2020-09-22', 'positive': 196}, {'date': '2020-09-21', 'positive': 323}, {'date': '2020-09-20', 'positive': 535}, {'date': '2020-09-19', 'positive': 478}, {'date': '2020-09-18', 'positive': 471}, {'date': '2020-09-17', 'positive': 482}, {'date': '2020-09-16', 'positive': 511}, {'date': '2020-09-15', 'positive': 138}, {'date': '2020-09-14', 'positive': 239}, {'date': '2020-09-13', 'positive': 478}, {'date': '2020-09-12', 'positive': 475}, {'date': '2020-09-11', 'positive': 536}, {'date': '2020-09-10', 'positive': 564}, {'date': '2020-09-09', 'positive': 219}, {'date': '2020-09-08', 'positive': 163}, {'date': '2020-09-07', 'positive': 304}, {'date': '2020-09-06', 'positive': 507}, {'date': '2020-09-05', 'positive': 527}, {'date': '2020-09-04', 'positive': 528}, {'date': '2020-09-03', 'positive': 549}, {'date': '2020-09-02', 'positive': 596}, {'date': '2020-09-01', 'positive': 162}, {'date': '2020-08-31', 'positive': 325}, {'date': '2020-08-30', 'positive': 499}, {'date': '2020-08-29', 'positive': 533}, {'date': '2020-08-28', 'positive': 598}, {'date': '2020-08-27', 'positive': 609}, 
{'date': '2020-08-26', 'positive': 632}, {'date': '2020-08-25', 'positive': 217}, {'date': '2020-08-24', 'positive': 274}, {'date': '2020-08-23', 'positive': 551}, {'date': '2020-08-22', 'positive': 509}, {'date': '2020-08-21', 'positive': 628}, {'date': '2020-08-20', 'positive': 624}, {'date': '2020-08-19', 'positive': 670}, {'date': '2020-08-18', 'positive': 164}, {'date': '2020-08-17', 'positive': 359}, {'date': '2020-08-16', 'positive': 673}, {'date': '2020-08-15', 'positive': 628}, {'date': '2020-08-14', 'positive': 632}, {'date': '2020-08-13', 'positive': 725}

6 of 21

Why Data Visualization?

  • Easier to read (communicate to a broader audience)
  • Reveals insights about data

7 of 21

Why Data Visualization?

8 of 21

Matplotlib

9 of 21

Matplotlib - data

  • In order to plot something, you have to use some form of data.
    • This can often take the most time in programming!
  • Matplotlib works well with two lists: one for the x (independent) variable and one for the y (dependent) variable.
  • For the purposes of this lecture, assume you will be given these two lists:
    • months = 0, 1, 2, 3...
    • cases = 2, 16, 5939, 9133...
  • You can also create your own lists by calling the provided collect_state_data() function and passing in a two-letter state code (e.g., "WA").
  • Starter code available at the link below or Ed post

https://tinyurl.com/hp4xnwtf
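The two lists can also be built by hand from records like the ones shown earlier. This is a minimal sketch, assuming each record is a dict with 'date' and 'positive' keys as in that data. The name records is made up for illustration, and this is not how the starter code's collect_state_data() works.

```python
# First four monthly WA records, copied from the data shown earlier.
records = [
    {'date': '2020-01', 'positive': 2},
    {'date': '2020-02', 'positive': 16},
    {'date': '2020-03', 'positive': 5939},
    {'date': '2020-04', 'positive': 9133},
]

# x values: number of months since the first record (0, 1, 2, ...).
months = list(range(len(records)))

# y values: the positive-case count taken from each record.
cases = [record['positive'] for record in records]

print(months)  # [0, 1, 2, 3]
print(cases)   # [2, 16, 5939, 9133]
```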

10 of 21

Matplotlib - import + plt.show()

import matplotlib.pyplot as plt

  • Matplotlib is a third-party visualization library, not part of Python's standard library, so it must be installed and then imported before use.
  • Same import as what we used for graphs!

plt.show()

  • Opens a window on your screen showing the graph you created.
  • Once the graph has been shown, the plot is reset, so later plotting calls start from an empty figure.

11 of 21

Matplotlib - plt.plot()

plt.plot(xs, ys)

  • The x values are passed first, followed by the y values.

plt.plot(months, wa_cases)

plt.show()

  • You should now see a graph pop up like the one on the right.
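Put together as a complete script, the steps so far might look like the sketch below. It assumes only the first four months of the WA data shown earlier, so the curve will be shorter than the one on the slide.

```python
import matplotlib.pyplot as plt

# First four months of WA data from the earlier slide.
months = [0, 1, 2, 3]
wa_cases = [2, 16, 5939, 9133]

# x values first, y values second. plt.plot() returns a list of the
# line objects it created (one line here).
lines = plt.plot(months, wa_cases)

# Display the figure.
plt.show()
```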

12 of 21

Matplotlib - plt.title, plt.xlabel, plt.ylabel

plt.title()

  • Places a title above the graph to say what it shows. Use plt.xlabel() and plt.ylabel() to name the independent and dependent variables on the axes.

plt.title("Monthly COVID Cases in WA")

plt.xlabel("Months from Jan, 2020")

plt.ylabel("New Positive Cases")
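As a runnable sketch, the three labeling calls go after plt.plot() and before plt.show(). The data is again just the first four WA months from the earlier slide.

```python
import matplotlib.pyplot as plt

months = [0, 1, 2, 3]
wa_cases = [2, 16, 5939, 9133]

plt.plot(months, wa_cases)

# Title above the plot, then a name for each axis.
plt.title("Monthly COVID Cases in WA")
plt.xlabel("Months from Jan, 2020")
plt.ylabel("New Positive Cases")

plt.show()
```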

13 of 21

Matplotlib - labels, plt.legend()

  • Now let’s say that you wanted to plot another state’s data along with Washington.

plt.plot(months, or_cases, label="OR")

  • You can simply make another call to plt.plot() to add another line. However, it may be hard to tell which line is which, so give each line a label and call plt.legend() to display the labels.
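A small sketch of two labeled lines with a legend follows. The or_cases values are placeholders invented for illustration, not real Oregon counts; in practice they would come from the starter code.

```python
import matplotlib.pyplot as plt

months = [0, 1, 2, 3]
wa_cases = [2, 16, 5939, 9133]   # first four WA months from the earlier slide
or_cases = [0, 10, 1000, 2000]   # PLACEHOLDER values, not real Oregon counts

# Each call adds one line to the same figure; label names it for the legend.
plt.plot(months, wa_cases, label="WA")
plt.plot(months, or_cases, label="OR")

# Without this call the labels are stored but never drawn.
legend = plt.legend()

plt.show()
```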

14 of 21

Matplotlib - plt.savefig(), plt.clf()

  • You may find it more convenient to have the plot saved on your computer as a .png file rather than have it pop open a window every time. In that case, use plt.savefig(filename).
  • Keep in mind that if you don’t call plt.show(), you must call plt.clf() to reset your plot; otherwise, later calls to plt.plot() will draw on top of the earlier lines.
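The save-then-clear pattern can be sketched as follows. The file names are hypothetical, and the or_cases values are placeholders rather than real data.

```python
import matplotlib.pyplot as plt

months = [0, 1, 2, 3]
wa_cases = [2, 16, 5939, 9133]   # first four WA months from the earlier slide
or_cases = [0, 10, 1000, 2000]   # PLACEHOLDER values, not real Oregon counts

plt.plot(months, wa_cases)
plt.savefig("wa_cases.png")      # written to the current working directory
plt.clf()                        # clear the figure so the WA line does not carry over

plt.plot(months, or_cases)
plt.savefig("or_cases.png")      # contains only the second line
plt.clf()
```

Removing the first plt.clf() would leave both lines in or_cases.png.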

15 of 21

Matplotlib - customization

  • Matplotlib has many options to customize how your plots look. Here are just a few:
    • color
    • linewidth
    • marker
    • markersize
    • linestyle
  • Feel free to play around with them!

plt.plot(months, wa_cases, label="WA", color="purple", linewidth=2, marker=".", linestyle="-.")

plt.plot(months, or_cases, label="OR", color="green", linewidth=0.5, marker=8, markersize=5)
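A complete script around the two customized calls might look like this sketch. As before, or_cases holds placeholder values, not real data.

```python
import matplotlib.pyplot as plt

months = [0, 1, 2, 3]
wa_cases = [2, 16, 5939, 9133]   # first four WA months from the earlier slide
or_cases = [0, 10, 1000, 2000]   # PLACEHOLDER values, not real Oregon counts

# plt.plot() returns a list of lines; unpack the single line from each call.
(wa_line,) = plt.plot(months, wa_cases, label="WA", color="purple",
                      linewidth=2, marker=".", linestyle="-.")
(or_line,) = plt.plot(months, or_cases, label="OR", color="green",
                      linewidth=0.5, marker=8, markersize=5)

plt.legend()
plt.show()
```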

16 of 21

Matplotlib - Documentation + more info

For more info on how to use matplotlib, check out its website:

https://matplotlib.org/stable/

  • Consider exploring other types of plots besides line graphs!

If you’re curious where the data came from, it is publicly available here:

https://covidtracking.com/data/download

17 of 21

Application Problems

18 of 21

Application Problems

  • Try to create the following graphs yourself, following the format as closely as you can.
  • Hint: Use a for loop to call plt.plot() multiple times.
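One possible shape for the loop in the hint is sketched below. The dictionary state_cases is a hypothetical container chosen for illustration, and the OR values are placeholders; real lists would come from the starter code.

```python
import matplotlib.pyplot as plt

months = [0, 1, 2, 3]

# Hypothetical container: state code -> list of monthly case counts.
state_cases = {
    "WA": [2, 16, 5939, 9133],   # first four WA months from the earlier slide
    "OR": [0, 10, 1000, 2000],   # PLACEHOLDER values, not real data
}

# One plt.plot() call per state; the state code doubles as the legend label.
for state, cases in state_cases.items():
    plt.plot(months, cases, label=state)

plt.legend()
plt.show()
```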

19 of 21

Application Problems

  • Try to create the following graphs yourself, following the format as closely as you can.
  • Hint: Use a for loop to create a new list of cumulative positive cases. Also, notice how the line width and markers are different.
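The running-total idea in the hint can be sketched without any plotting. This example uses only the first four monthly WA values shown earlier; the variable names are illustrative.

```python
# New positive cases per month (first four WA months from the earlier slide).
monthly_cases = [2, 16, 5939, 9133]

cumulative_cases = []
running_total = 0
for count in monthly_cases:
    running_total += count                  # add this month's new cases
    cumulative_cases.append(running_total)  # record the total so far

print(cumulative_cases)  # [2, 18, 5957, 15090]
```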

20 of 21

Additional Problems

The chart to the left shows the reproduction number (R0) of various infectious diseases. COVID has an R0 of around 3, meaning that each infected person infects, on average, three other people.

  • To your plot of WA cases, add another line that represents how many cases there would be if the disease spread followed an exponential growth curve. Here is what the equation would look like with an R0 of 3.

21 of 21

Application Problems

  • Create a line graph that represents the monthly COVID cases for any number of states of your choosing. As you do so, justify why you chose those states (similar population, geographical location, political leaning, etc.).