1 of 35

Maps

Data 6 Summer 2022

LECTURE 20

Adding another tool to our visualization toolkit.

Developed by students and faculty at UC Berkeley and Tuskegee University

data6.org/su22/syllabus/#acknowledgements-

2 of 35

Week 4

Announcements!

  • Homework 3 has been released and will be due on 7/28 @ 11 PM
  • Homework 2 grades have been released on Gradescope. Regrade requests are due by 7/27 @ 6 PM
  • Remember that you can get 3% extra credit if you have a mid-semester check-in with a course staff member
    • See Ed for more details

3 of 35

Today’s Roadmap

Lecture 20, Data 6 Summer 2022

  1. Motivation
  2. Scatter Plot Maps
  3. Choropleth Maps

4 of 35

Motivation

5 of 35

6 of 35

@TerribleMaps

7 of 35

Today’s Data

Our dataset, from Kaggle, contains over 10,000 fast food restaurants from across the US, spanning over 500 unique fast food chains.

8 of 35

Quick Review

9 of 35

Scatter Plots

The method t.scatter(column_for_x, column_for_y) creates a scatter plot using the specified columns. Both columns must contain numerical values.

Optional arguments, in addition to column_for_x and column_for_y:

  • group (str): points will be colored according to category in this categorical column.
  • labels (str): points will be labeled according to their value in this column.

10 of 35

.group()

The term “group” in data science is most commonly associated with data aggregation and disaggregation.

Aggregation: A process in which information is gathered and expressed in collective or summary form.

Disaggregation (aka disentanglement): A process of taking aggregated data and breaking it down into smaller information units.

The method t.group(column) counts the number of rows for each unique value in column, and returns a two-column table with the results.
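
To make the counting concrete, here is a minimal sketch in plain Python of what t.group(column) computes. The helper group_counts and the sample chain names are hypothetical and exist only for illustration; this is not how the datascience library implements group.

```python
from collections import Counter


def group_counts(values):
    """Return (value, count) pairs, one per unique value, sorted by value."""
    counts = Counter(values)       # tally how many times each value appears
    return sorted(counts.items())  # one "row" per unique value


# Hypothetical column of restaurant names (illustrative data only).
chains = ["Subway", "McDonald's", "Subway", "Taco Bell", "McDonald's", "Subway"]

for name, count in group_counts(chains):
    print(name, count)
```

Each printed line plays the role of one row in the two-column table that group returns: the unique value, then the number of rows that had it.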

11 of 35

Scatter Plot Maps

12 of 35

Latitude and Longitude

Any point on Earth can be described by its latitude and longitude.

  • Latitude can be thought of as the “y” or “vertical” position.
  • Longitude can be thought of as the “x” or “horizontal” position.
  • When describing a location, latitude always comes before longitude.
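
The ordering convention above is easy to trip over, because locations are written (latitude, longitude) while plots expect (x, y). The following is a small sketch of the conversion; the helper to_xy is hypothetical, and the Berkeley coordinates are approximate.

```python
def to_xy(location):
    """Convert a (latitude, longitude) pair into an (x, y) pair for plotting."""
    lat, lon = location  # locations list latitude first
    if not -90 <= lat <= 90:
        raise ValueError("latitude must be between -90 and 90 degrees")
    if not -180 <= lon <= 180:
        raise ValueError("longitude must be between -180 and 180 degrees")
    return (lon, lat)    # longitude is horizontal (x), latitude is vertical (y)


berkeley = (37.87, -122.27)  # approximate (latitude, longitude), for illustration
x, y = to_xy(berkeley)
print(x, y)
```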

13 of 35

What’s Wrong with a Scatter Plot?

14 of 35

Scatter Plot Maps

When we want to visualize the geographic locations of a lot of data points, it's often helpful to start with a scatter plot map.

  • A scatter plot map draws a scatter plot on top of a geographic map.
  • It helps you visualize geographic locations in relation to cities, states, and countries.

Scatter Plot + Map = ❤️

Use px.scatter_geo(df, lat, lon), where df is the data frame and lat and lon name its latitude and longitude columns.

15 of 35

A Side Note

Plotly

For our maps we will use a Python library called Plotly (its Plotly Express module is imported as px in your notebooks). Plotly is a commonly used visualization library and is really useful for creating maps.

  • Plotly can be quite confusing and difficult to understand
  • The documentation is really helpful

We do not expect you to memorize/remember any Plotly syntax. This is purely for fun.

16 of 35

Optional Arguments

We can customize our scatter plot maps by specifying the following arguments:

Argument | Example | Behavior
color | color = 'name' | Points on the map are colored according to their category in the name column of the data frame.
locationmode | locationmode = 'USA-states' | Determines how values in the locations argument are matched to regions on the map. We usually set this to 'USA-states', which matches two-letter state abbreviations to US states.
scope | scope = 'usa' | Specifies the scope of the map (i.e. what is visible). Setting scope to 'usa' zooms the map in to just the US.
title | title = 'My Map' | Sets the title of the map.

17 of 35

Questions?

18 of 35

Example: All Restaurants

Just because we can plot all restaurants in our dataset doesn’t mean we should.

19 of 35

Example: Regional Chains

20 of 35

Choropleth Maps

21 of 35

Choropleth Maps

Choropleth maps are useful for visualizing numerical variables across different states or countries. In this sense they are analogous to bar charts, since they encode one categorical variable (state or country) and one numerical variable.

Aggregation!

Use px.choropleth(df, locations), where df is the data frame and locations names its column of state abbreviations.

22 of 35

Example: Election Mapping

23 of 35

Example: Census Data

Percent Black or African American

Percent Hispanic or Latino

24 of 35

Example: Redlining

LA “Residential Security Map” from The Color of Law (Rothstein)

25 of 35

Plotly Choropleth Maps

We can customize our choropleth maps by specifying the following arguments:

Argument | Example | Behavior
color | color = 'name' | Regions on the map are colored according to their value in the name column of the data frame.
color_discrete_sequence | color_discrete_sequence = px.colors.qualitative.Bold | Specifies the color palette to use for coloring the categories.
locationmode | locationmode = 'USA-states' | Determines how values in the locations argument are matched to regions on the map. We usually set this to 'USA-states', which matches two-letter state abbreviations to US states.
scope | scope = 'usa' | Specifies the scope of the map (i.e. what is visible). Setting scope to 'usa' zooms the map in to just the US.
title | title = 'My Map' | Sets the title of the map.

26 of 35

Colors

There are a lot of options to choose from for color palettes.

px.colors.qualitative.D3

px.colors.qualitative.Set2

What are some considerations we should keep in mind when choosing a color palette?

27 of 35

Example: Favorite Chains

28 of 35

Example: Pizza Chains

29 of 35

Example: McDonald’s vs. Burger King

30 of 35

Example: Local Burger Chains

31 of 35

Questions?

32 of 35

In Conclusion…

33 of 35

Summary

  • Scatter plot maps are useful when we want to visualize the geographic locations of a lot of data points, but it is really easy to overcrowd our maps.
  • Choropleth maps allow us to visualize data aggregated by county, state, or country.
  • Plotly is a very powerful mapping/visualization library, but you’re not expected to be an expert at it.

Map wisely!

34 of 35

Recap

  • Maps
    • Scatter maps
    • Choropleth maps

Next Time

  • Web Development
    • Making your own website!
